Peter Puhov created HDFS-16353:
----------------------------------

             Summary: hdfs support for pushdown operations
                 Key: HDFS-16353
                 URL: https://issues.apache.org/jira/browse/HDFS-16353
             Project: Hadoop HDFS
          Issue Type: Wish
          Components: hdfs
            Reporter: Peter Puhov


The Spark community has put significant effort into optimizing the logical plan and 
delegating pushdown operations such as projection, filter, and some aggregates to 
DataSource V2 compatible data sources. I would like to explore the possibility 
of sending this pushdown information to HDFS.

HDFS could perform these simple operations on files in known formats such as 
CSV, ORC, Parquet, etc.

This approach would greatly reduce data traffic, since only the selected 
columns and matching rows would have to leave HDFS.
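
To make the idea concrete, here is a minimal sketch of a projection and a filter being applied on the storage side to a CSV file. Everything in it is an assumption for illustration only: `PushdownRequest` and `scan_csv` are hypothetical names, not an existing HDFS or Spark API, and the sample data is made up.

```python
import csv
import io
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PushdownRequest:
    """Hypothetical request shape; not an existing HDFS or Spark API."""
    columns: List[str]                            # projection: columns to return
    predicate: Callable[[Dict[str, str]], bool]   # filter: rows to keep


def scan_csv(data: str, request: PushdownRequest) -> str:
    """Apply the projection and filter where the data lives and
    return only the reduced result, again as CSV text."""
    reader = csv.DictReader(io.StringIO(data))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(request.columns)
    for row in reader:
        if request.predicate(row):
            writer.writerow([row[c] for c in request.columns])
    return out.getvalue()


# Made-up sample file contents.
SAMPLE = (
    "id,name,country,amount\n"
    "1,alice,US,10\n"
    "2,bob,DE,25\n"
    "3,carol,US,40\n"
)

request = PushdownRequest(
    columns=["id", "amount"],
    predicate=lambda row: row["country"] == "US",
)
result = scan_csv(SAMPLE, request)
print(result)
print(len(result), "bytes returned instead of", len(SAMPLE))
```

In a real design the predicate would have to be a serializable expression rather than a Python callable, and columnar formats such as ORC and Parquet could additionally skip whole stripes or row groups using their statistics.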

It could be something similar to AWS select-object-content (S3 Select).

I would like to understand if there is any interest in such functionality.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

