Will Jones created ARROW-16844:
----------------------------------

             Summary: [C++][Python] Implement to/from substrait for Expression
                 Key: ARROW-16844
                 URL: https://issues.apache.org/jira/browse/ARROW-16844
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++, Python
            Reporter: Will Jones


DataFusion can convert between Substrait expressions and its own internal 
expressions. (See: 
[https://github.com/datafusion-contrib/datafusion-substrait].) It would be 
cool if we had a similar conversion for Acero's Expression class.

This might let datafusion-python easily use PyArrow datasets, by using 
Substrait as an intermediate format to pass filters and projections down from 
DataFusion into the scanner. (See early draft here: 
[https://github.com/datafusion-contrib/datafusion-python/pull/21].)

One problem is that it's unclear what type the Python object representing the 
Substrait expression should have. IIUC Python doesn't have direct bindings to 
the Substrait protobuf.
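
One possible answer, sketched below purely as an illustration, is to treat the 
serialized protobuf message as opaque bytes and wrap it in a thin Python type. 
Every name here (SubstraitExpression, to_bytes, from_bytes, producer, consumer) 
is hypothetical and not an existing PyArrow or DataFusion API, and the payload 
is a placeholder rather than a real Substrait message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubstraitExpression:
    """Hypothetical opaque holder for a serialized Substrait message."""

    data: bytes

    def to_bytes(self) -> bytes:
        # Hand the raw serialized message to another library.
        return self.data

    @classmethod
    def from_bytes(cls, data: bytes) -> "SubstraitExpression":
        # Copy into immutable bytes so the wrapper never aliases a mutable buffer.
        return cls(bytes(data))


def producer() -> bytes:
    # Stand-in for a library (e.g. a query engine) that emits a serialized
    # expression. The payload is a placeholder, not valid Substrait.
    return SubstraitExpression(b"placeholder-serialized-expression").to_bytes()


def consumer(payload: bytes) -> SubstraitExpression:
    # Stand-in for a library (e.g. a scanner) that accepts the bytes. A real
    # implementation would decode them into its own expression type here.
    return SubstraitExpression.from_bytes(payload)


round_tripped = consumer(producer())
```

The appeal of this shape is that neither side needs Python bindings to the 
Substrait protobuf; only the C++ or Rust layers would have to parse the bytes.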

 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
