Data Engineer

Prospective

London

Why join us

We’re a small team with a clear growth trajectory, so you’ll be getting in on the ground floor of an exciting and dynamic startup. You’ll be working alongside smart and driven computational urban modellers, software engineers, data scientists and domain experts who are passionate about applying technology to solve the biggest problems facing cities.

We’re looking for an experienced software engineer who can help scale our applications for use by hundreds of public and private sector organisations and help magnify the capabilities of our team of urban system scientists.

What you’ll get up to

  • Develop ETL processes for large spatio-temporal datasets of the built environment
  • Contribute to the design, development and maintenance of backend functionalities to increase data availability
  • Develop rich APIs, working with our data scientists and software developers to build them into applications
  • Identify and harvest relevant datasets
  • Lead data quality initiatives
  • Contribute to system architecture design and implementation as appropriate
  • Contribute to the evolution of data handling practices
  • Deploy and maintain data handling applications
  • Develop tools for data analysis and visualisation

Why you’ll win

You are passionate about distributed systems, networking, huge volumes of data, large-scale data processing, complex computational simulation models, and performance monitoring and tuning. You are excited by working in a fast-growing, dynamic team and are an effective communicator. You are keen to share best practices and to keep learning.

You are great at

  • ETL process development
  • SQL technologies
  • Schema design and data modelling
  • Python or another modern scripting language
  • Working with large and complex datasets in formats such as XML, JSON and Avro

You are familiar with

  • Using proprietary (e.g. ArcGIS, MapInfo) and/or open source (e.g. QGIS, OpenJUMP) GIS packages
  • Best practice methodologies for data processing
  • Spatial datasets (SHP, GML, GeoJSON)
  • Spatial algorithms

Big Pluses

  • R
  • Java/C++
  • Machine Learning
  • Docker
  • PostGIS / GDAL
  • NoSQL Technologies
  • Hadoop or equivalent distributed storage and processing platforms
  • Processing datasets using high-performance and cloud computing technology, including GCP
  • Building web services and APIs
  • Unit testing