[jira] [Comment Edited] (SPARK-14820) Reduce shuffle data by pushing filter toward storage

Takeshi Yamamuro (JIRA) Fri, 22 Apr 2016 02:25:29 -0700

    [ https://issues.apache.org/jira/browse/SPARK-14820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253608#comment-15253608 ]


Takeshi Yamamuro edited comment on SPARK-14820 at 4/22/16 9:24 AM:
-------------------------------------------------------------------

It seems `Optimizer#PushPredicateThroughJoin` already handles this kind of 
push-down optimization.
Why can't the current implementation apply filter push-down to the query 
described in your PDF?
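
For readers unfamiliar with the idea, here is a minimal toy sketch of pushing a predicate through a join. It is not Catalyst's actual code: the `Scan`, `Filter`, and `Join` classes, the `push_filter_through_join` rule, and the sample tables are all hypothetical names made up for illustration, and the number of rows entering the join is used only as a rough stand-in for shuffle volume.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet, List, Tuple

Row = Dict[str, Any]


@dataclass
class Scan:
    rows: List[Row]
    columns: FrozenSet[str]


@dataclass
class Filter:
    child: Any
    predicate: Callable[[Row], bool]
    references: FrozenSet[str]  # columns the predicate reads


@dataclass
class Join:
    left: Any
    right: Any
    key: str  # inner equi-join on this column


def output_columns(plan) -> FrozenSet[str]:
    if isinstance(plan, Scan):
        return plan.columns
    if isinstance(plan, Filter):
        return output_columns(plan.child)
    return output_columns(plan.left) | output_columns(plan.right)


def push_filter_through_join(plan):
    """Toy rule: rewrite Filter(Join(l, r)) so the filter runs below the
    join when its predicate only reads columns from one side."""
    if isinstance(plan, Filter) and isinstance(plan.child, Join):
        join = plan.child
        if plan.references <= output_columns(join.left):
            pushed = Filter(join.left, plan.predicate, plan.references)
            return Join(pushed, join.right, join.key)
        if plan.references <= output_columns(join.right):
            pushed = Filter(join.right, plan.predicate, plan.references)
            return Join(join.left, pushed, join.key)
    return plan  # nothing to push; leave the plan unchanged


def execute(plan, stats: Dict[str, int]) -> List[Row]:
    if isinstance(plan, Scan):
        return list(plan.rows)
    if isinstance(plan, Filter):
        return [r for r in execute(plan.child, stats) if plan.predicate(r)]
    left = execute(plan.left, stats)
    right = execute(plan.right, stats)
    # Rows entering the join: a rough proxy for data that would be shuffled.
    stats["join_input_rows"] += len(left) + len(right)
    return [{**l, **r} for l in left for r in right
            if l[plan.key] == r[plan.key]]


def run(plan) -> Tuple[List[Row], int]:
    stats = {"join_input_rows": 0}
    rows = execute(plan, stats)
    return rows, stats["join_input_rows"]


# Hypothetical sample data: 100 orders spread over 10 users.
orders = Scan([{"user_id": i % 10, "amount": i} for i in range(100)],
              frozenset({"user_id", "amount"}))
users = Scan([{"user_id": i, "country": "JP" if i % 2 == 0 else "US"}
              for i in range(10)],
             frozenset({"user_id", "country"}))

plan = Filter(Join(orders, users, "user_id"),
              lambda r: r["amount"] >= 90, frozenset({"amount"}))
optimized = push_filter_through_join(plan)

print(run(plan)[1])       # 110 rows reach the join
print(run(optimized)[1])  # 20 rows reach the join
```

Both plans return the same 10 rows; only the amount of data reaching the join differs. A real optimizer also has to handle outer joins, predicates that span both sides, and non-deterministic expressions, which this sketch ignores.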


was (Author: maropu):
Seems `Optimizer#PushPredicateThroughJoin` handles this kind of push-down 
optimization.
Why cannot the current impl. apply filter push-downs into the query described 
in your pdf?

> Reduce shuffle data by pushing filter toward storage
> ----------------------------------------------------
>
>                 Key: SPARK-14820
>                 URL: https://issues.apache.org/jira/browse/SPARK-14820
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.6.1
>            Reporter: Ali Tootoonchian
>            Priority: Trivial
>         Attachments: Reduce Shuffle Data by pushing filter toward storage.pdf
>
>
> The SQL query planner could be made smart enough to push filter operations 
> down toward the storage layer. If the planner is optimized so that IO to 
> storage is reduced at the cost of running multiple filters (i.e., more 
> compute), this should be desirable when the system is IO bound.
> A supporting analysis and a worked example are attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

