Weston Pace created ARROW-16288:
-----------------------------------

             Summary: [C++] ValueDescr::SCALAR nearly unused and does not work 
for projection
                 Key: ARROW-16288
                 URL: https://issues.apache.org/jira/browse/ARROW-16288
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Weston Pace


First, almost no kernels actually use the {{ValueDescr::SCALAR}} shape.  Only 
the functions "all", "any", "list_element", "mean", "product", "struct_field", 
and "sum" have kernels with this shape.  Most kernels that have special logic 
for scalars handle it by using {{ValueDescr::ANY}}.

Second, when passing an expression to the project node, the expression must be 
bound based on the dataset schema.  Since the binding happens based on a schema 
(and not a batch), the function is always bound to {{ValueDescr::ARRAY}} 
(https://github.com/apache/arrow/blob/a16be6b7b6c8271202ff766b99c199b2e29bdfa8/cpp/src/arrow/compute/exec/expression.cc#L461).

This results in an error if the function has only ValueDescr::SCALAR kernels 
and would likely be a problem even if the function had both types of kernels 
because it would get bound to the wrong kernel.
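
The failure mode can be illustrated with a small self-contained model.  This is 
only a sketch and not Arrow's dispatch code: {{Shape}}, {{Kernel}}, 
{{DispatchExact}}, and {{ShapeWhenBoundToSchema}} are hypothetical names that 
mimic the idea that a kernel declared for SCALAR inputs does not match an 
argument described as ARRAY.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for ValueDescr::Shape; not Arrow's real type.
enum class Shape { ANY, ARRAY, SCALAR };

// Hypothetical kernel record: a name plus the input shape it accepts.
struct Kernel {
  std::string name;
  Shape accepts;
};

// A kernel matches if it accepts ANY shape or exactly the requested one.
bool Matches(const Kernel& kernel, Shape requested) {
  return kernel.accepts == Shape::ANY || kernel.accepts == requested;
}

// Returns the first matching kernel, or std::nullopt if none matches.
std::optional<Kernel> DispatchExact(const std::vector<Kernel>& kernels,
                                    Shape requested) {
  for (const auto& kernel : kernels) {
    if (Matches(kernel, requested)) return kernel;
  }
  return std::nullopt;
}

// Binding against a schema has no batch to inspect, so this model always
// describes the argument as ARRAY, as assumed from the issue text.
Shape ShapeWhenBoundToSchema() { return Shape::ARRAY; }
```

In this model, a function whose only kernel accepts {{Shape::SCALAR}} yields no 
match when dispatched with {{ShapeWhenBoundToSchema()}}, and a function with 
both an ARRAY and a SCALAR kernel always resolves to the ARRAY one, even if the 
value seen at execution time turns out to be a scalar.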

The simplest fix may be to just get rid of ValueDescr and change all kernels 
to ValueDescr::ANY behavior.  If we choose to keep it, we will need to figure 
out how to handle this kind of binding.
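
As a rough illustration of what "ANY behavior" could look like, the sketch 
below uses a single kernel that inspects its input at execution time instead 
of relying on a shape chosen at bind time.  {{Input}} and {{SumAny}} are 
hypothetical names used for illustration only; they are not Arrow types or 
functions.

```cpp
#include <numeric>
#include <variant>
#include <vector>

// Hypothetical stand-in for a datum: either one value or a column of values.
using Input = std::variant<double, std::vector<double>>;

// A single "ANY"-style kernel: the shape is checked when the kernel runs,
// so binding does not need to know it in advance.
double SumAny(const Input& input) {
  if (const double* scalar = std::get_if<double>(&input)) {
    return *scalar;  // scalar path: the sum of one value is the value itself
  }
  const auto& values = std::get<std::vector<double>>(input);
  return std::accumulate(values.begin(), values.end(), 0.0);
}
```

With this approach, binding against a schema no longer has to guess between 
an ARRAY and a SCALAR kernel, because only one kernel exists per function.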



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
