Synthetic data mimic real-world observations but have no direct real-world referents. Their significance lies not in what they represent, but in how effectively they train models that perform well in the real world. In this sense, they invert traditional notions of data provenance and challenge representational thinking. When model performance becomes the main criterion for evaluating training data, the direction of data provenance is reversed: quality is no longer guaranteed by a chain of custody leading back to a source, but judged by the results a dataset produces. What is the status of ground truth if fake data are not just as good as real data, but can be even better?

synthetic and non-representational data

Lead:
Dietmar Offenhuber

Publications:

Offenhuber, Dietmar. 2024. “Shapes and Frictions of Synthetic Data.” Big Data & Society 11 (2): 20539517241249390. https://doi.org/10.1177/20539517241249390.
