Our public sector client is currently looking for a Data Wrangler to join their team on a new contract.
Contract Length: 6 months – possibly ongoing
Location: Remote, with possible visits to the office in London or Leeds once normal working resumes.
Rate: Up to £500pd inside IR35
The following activities are required from a Data Wrangler:
1. Curate data from multiple datasets and prepare it for analysis by others via a dashboard presentation
2. Organise a working research structure within the TRE service environment for practical and easy use, supporting users undertaking research
3. Carry out technical validation checks on the linked data sources (e.g. duplicates, linkage errors)
4. Identify appropriate existing code lists and algorithms and apply them to derive a set of priority variables from the linked datasets
5. Write, organise and curate support documentation for the linked data resources (e.g. data dictionaries, variable mapping tables, data access process documentation, Git repositories)
6. Anticipate, communicate and solve any potential problems that may arise with data curation for various research projects and use cases
7. Be the point of contact for researchers and clinicians to address queries about how to work with the linked data resources
Strong experience with Databricks, R / RStudio and Python is an essential requirement for this role. As this role involves working with millions of pieces of data, we require someone who is experienced in, and comfortable with, working with very large datasets and data flows. The Data Wrangler role requires significant data management and manipulation expertise, with a background in one of bioinformatics, biostatistics, computer science, mathematics or statistics, along with knowledge of commonly used terminologies in health data, such as ICD-10 and SNOMED. The successful candidate will be experienced in preparing data extracts for analysis by others and in working closely with end users.