Each new column typically stores "features" or "embeddings" that are extracted from the original data, and those features or embeddings are then used in subsequent training and inference workflows.
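
As a rough illustration of how extracted features widen a schema (a toy sketch, not taken from the talk): the table layout, the `toy_embedding` function, and the `embedding_<i>` column naming below are all assumptions made up for this example, and a real pipeline would call an actual model instead.

```python
# Hypothetical raw table: one row per document, two original columns.
rows = [
    {"id": 1, "text": "wide schemas in AI pipelines"},
    {"id": 2, "text": "columns hold extracted features"},
]


def toy_embedding(text: str, dim: int = 4) -> list[float]:
    """Stand-in for a real embedding model: deterministic placeholder numbers."""
    return [float((len(text) * (i + 1)) % 7) for i in range(dim)]


def add_feature_columns(row: dict, dim: int = 4) -> dict:
    """Return a copy of the row with derived feature and embedding columns."""
    out = dict(row)
    # A simple scalar feature derived from the original data.
    out["word_count"] = len(row["text"].split())
    # One new column per embedding dimension; names are generated by code.
    for i, value in enumerate(toy_embedding(row["text"], dim)):
        out[f"embedding_{i}"] = value
    return out


wide_rows = [add_feature_columns(r) for r in rows]
columns = list(wide_rows[0].keys())
print(len(columns), columns)
```

Here two original columns become seven. With an embedding dimension in the hundreds and several extractors, the same pattern would reach thousands of columns, with names produced programmatically rather than typed by a person.
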
I spoke about this topic (and others) in a talk I did a while ago [1][2] (slides 15 and 16). Julien has done less academic versions of a similar talk as well.

Andrew

[1]: https://docs.google.com/presentation/d/19F-XvNJ8sgIpIeIduA3PhbsWp4pC-P632J2eJV1cLG8
[2]: https://www.youtube.com/watch?v=k9uhw7yqPsQ

On Thu, May 7, 2026 at 9:11 AM Andrew Bell <[email protected]> wrote:

> Hi,
>
> Can someone please explain why AI processing generates data with wide
> schemas? It's not an area I work in so I'm behind in trying to understand.
> If you have thousands of columns in a row, are they named? Are they
> expected to be queried by a human?
>
> Thanks,
>
> --
> Andrew Bell
> [email protected]
>
