jorisvandenbossche commented on pull request #8894: URL: https://github.com/apache/arrow/pull/8894#issuecomment-756222590
> > [@pitrou] I'm also curious why it's called "project". It sounds rather imprecise, though it may be the conventional term for this operation?
> >
> > [@bkietz] "project" is the conventional term. I'll move it to a separate header/source.

Although it's clearly related, I personally still find it a somewhat strange name for this specific (user-exposed) function. That said, I am certainly not very familiar with the different contexts where "project" gets used; e.g. in Python/pandas the term is basically never used.

In the Dataset context, we typically speak about projection when e.g. defining a subset of the columns to return, correct? But here, you already have the subset of arrays/scalars and only combine them into a StructArray. Naively, I would expect a project function to e.g. receive a record batch and return a subset of it (with potentially renamed, reordered, etc. fields). So this feels like a level lower than an actual "project" operation.
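
The distinction drawn above can be made concrete with a rough sketch. This is plain Python, not the Arrow API: a "record batch" is modeled as a dict of equal-length columns and a "struct array" as a list of per-row dicts. The names `combine_into_struct` and `project_batch` are hypothetical and exist only for illustration.

```python
# Plain-Python stand-ins, NOT the Arrow API (assumption for illustration):
# a "record batch" is a dict of equal-length columns,
# a "struct array" is a list of per-row dicts.

def combine_into_struct(columns, names):
    """Lower-level operation: the columns are already selected,
    and are only zipped together into one struct column."""
    if len(columns) != len(names):
        raise ValueError("need exactly one name per column")
    return [dict(zip(names, row)) for row in zip(*columns)]


def project_batch(batch, fields):
    """Higher-level operation: take a whole batch and return a subset of it,
    possibly reordered and renamed.
    `fields` maps output name -> input column name."""
    return {out_name: batch[in_name] for out_name, in_name in fields.items()}


batch = {"a": [1, 2], "b": ["x", "y"], "c": [0.5, 1.5]}

# The caller has to pick the columns before calling the first function.
struct = combine_into_struct([batch["a"], batch["b"]], ["a", "b"])

# The second function does the selecting, reordering, and renaming itself.
subset = project_batch(batch, {"b_renamed": "b", "a": "a"})
```

In this sketch the first function never sees the batch, which is the sense in which it sits a level below a projection over a batch.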