jorisvandenbossche commented on pull request #8894:
URL: https://github.com/apache/arrow/pull/8894#issuecomment-756222590


   > > [@pitrou] I'm also curious why it's called "project". It sounds rather 
imprecise, though it may be the conventional term for this operation?
   >
   > [@bkietz] "project" is the conventional term. I'll move it to a separate 
header/source.
   
   Although it's clearly related, I personally still find it a bit of a strange 
name for this specific (user-exposed) function (but I am certainly not very 
familiar with the different contexts where "project" gets used; e.g. in 
Python/pandas this term is basically never used).
   In the Dataset context, we typically speak about projection when e.g. 
defining a subset of the columns to return, correct? But here, you already have 
the subset of arrays/scalars, and only combine them in a StructArray. Naively, I 
would expect that a project function would e.g. receive a record batch and 
return a subset of it (with potentially renamed, reordered, etc. fields). So it 
feels like a level lower than an actual 'project' operation.
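
   To make the distinction concrete, here is a rough sketch in plain Python. It 
does not use the Arrow API: a "record batch" is modelled as a dict of columns, 
a "struct array" as a list of dicts, and the names `project_batch` and 
`make_struct` are made up purely for illustration. Scalars are left out to keep 
it short.

```python
# Plain-Python stand-ins, not Arrow types: a "record batch" is a dict of
# column name -> list of values, and a "struct array" is a list of dicts.


def project_batch(batch, columns):
    """Hypothetical 'project': select, reorder and optionally rename columns.

    `columns` maps output name -> input name; the output keeps that order.
    """
    return {out_name: batch[in_name] for out_name, in_name in columns.items()}


def make_struct(arrays, names):
    """Hypothetical stand-in for the function under discussion: combine
    already-selected, equal-length arrays into one struct-like array."""
    if len(arrays) != len(names):
        raise ValueError("need one name per array")
    return [dict(zip(names, row)) for row in zip(*arrays)]


batch = {"a": [1, 2], "b": ["x", "y"], "c": [0.5, 1.5]}

# What I would naively expect from "project": batch in, narrower batch out.
projected = project_batch(batch, {"c_renamed": "c", "a": "a"})

# What the function here does, as I understand it: the caller has already
# picked the arrays, and they are only zipped together into structs.
combined = make_struct([batch["c"], batch["a"]], ["c_renamed", "a"])
```

   In the first call the selection is part of the operation itself; in the 
second, the selection has already happened before the call, which is why it 
reads to me as one level below a projection.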



