[jira] [Commented] (SPARK-18874) First phase: Deferring the correlated predicate pull up to Optimizer phase
[ https://issues.apache.org/jira/browse/SPARK-18874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554431#comment-16554431 ]

Apache Spark commented on SPARK-18874:
--------------------------------------

User 'wangyum' has created a pull request for this issue:
https://github.com/apache/spark/pull/21863

> First phase: Deferring the correlated predicate pull up to Optimizer phase
> --------------------------------------------------------------------------
>
>                 Key: SPARK-18874
>                 URL: https://issues.apache.org/jira/browse/SPARK-18874
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Nattavut Sutyanyong
>            Assignee: Dilip Biswal
>            Priority: Major
>             Fix For: 2.2.0
>
>         Attachments: SPARK-18874-3.pdf
>
>
> This JIRA implements the first phase of SPARK-18455 by deferring the
> correlated predicate pull up from Analyzer to Optimizer. The goal is to
> preserve the current functionality of subquery in Spark 2.0 (if it works, it
> continues to work after this JIRA; if it does not, it won't). The performance
> of subquery processing is expected to be on par with Spark 2.0.
> The representation of the LogicalPlan after Analyzer will differ after
> this JIRA in that it will preserve the original positions of correlated
> predicates in a subquery. This new representation is preparation for
> the second phase, which extends the support of correlated subquery to cases
> Spark 2.0 does not support, such as deep correlation and outer references in
> the SELECT clause.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
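To make the description in the message above concrete, here is a minimal toy sketch of what "deferring the correlated predicate pull up" means. It is not Spark's Catalyst code: the `Predicate` and `Filter` classes, the `outer(...)` naming convention, and `pull_up_correlated` are all hypothetical names invented for illustration. The sketch models a query of the shape `SELECT * FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.k = t1.k AND t2.v = 10)`: the "analyzed" plan keeps the correlated predicate at its original position inside the subquery, and a separate later step pulls it up.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Predicate:
    """An equality predicate `left = right` over column names or literals."""
    left: str
    right: str

    @property
    def is_correlated(self) -> bool:
        # Hypothetical convention: an outer reference is written "outer(<column>)".
        return any(side.startswith("outer(") for side in (self.left, self.right))


@dataclass(frozen=True)
class Filter:
    """A filter over a single table inside the subquery (toy plan node)."""
    predicates: Tuple[Predicate, ...]
    table: str


def pull_up_correlated(subquery: Filter) -> Tuple[Filter, List[Predicate]]:
    """Return the subquery without its correlated predicates, plus those
    predicates, which the outer query could then use as join conditions."""
    local = tuple(p for p in subquery.predicates if not p.is_correlated)
    pulled = [p for p in subquery.predicates if p.is_correlated]
    return Filter(local, subquery.table), pulled


# "Analyzed" plan: the correlated predicate stays at its original position.
analyzed = Filter((Predicate("t2.k", "outer(t1.k)"), Predicate("t2.v", "10")), "t2")

# Later "optimizer" step: the pull-up happens only now.
optimized, join_conditions = pull_up_correlated(analyzed)
print(optimized)
print(join_conditions)
```

In this toy model the `analyzed` value is left untouched by the pull-up, which mirrors the stated goal that the plan produced by the Analyzer preserves the original positions of correlated predicates.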
[jira] [Commented] (SPARK-18874) First phase: Deferring the correlated predicate pull up to Optimizer phase
[ https://issues.apache.org/jira/browse/SPARK-18874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15923944#comment-15923944 ]

Apache Spark commented on SPARK-18874:
--------------------------------------

User 'hvanhovell' has created a pull request for this issue:
https://github.com/apache/spark/pull/17288
[jira] [Commented] (SPARK-18874) First phase: Deferring the correlated predicate pull up to Optimizer phase
[ https://issues.apache.org/jira/browse/SPARK-18874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869562#comment-15869562 ]

Apache Spark commented on SPARK-18874:
--------------------------------------

User 'dilipbiswal' has created a pull request for this issue:
https://github.com/apache/spark/pull/16954
[jira] [Commented] (SPARK-18874) First phase: Deferring the correlated predicate pull up to Optimizer phase
[ https://issues.apache.org/jira/browse/SPARK-18874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15859398#comment-15859398 ]

Nattavut Sutyanyong commented on SPARK-18874:
---------------------------------------------

Thank you, [~hvanhovell][~dkbiswal], for reviewing the design document. I have attached the revised version in PDF format to this JIRA.

@rxin: FYI. We will submit a PR for the code by Tuesday, February 14.

P.S. We will submit the last test PR today (Feb 9). As of now, there are 4 pending test PRs waiting for review. We hope to have all the test PRs merged before the code. This way, when the code is merged, we can verify it by exercising the new code against those test cases.
[jira] [Commented] (SPARK-18874) First phase: Deferring the correlated predicate pull up to Optimizer phase
[ https://issues.apache.org/jira/browse/SPARK-18874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851550#comment-15851550 ]

Nattavut Sutyanyong commented on SPARK-18874:
---------------------------------------------

I have published a design document as a reference when reviewing the code.
https://issues.apache.org/jira/secure/attachment/12850832/SPARK-18874-3.pdf
[jira] [Commented] (SPARK-18874) First phase: Deferring the correlated predicate pull up to Optimizer phase
[ https://issues.apache.org/jira/browse/SPARK-18874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824182#comment-15824182 ]

Nattavut Sutyanyong commented on SPARK-18874:
---------------------------------------------

[~hvanhovell] [~rxin],
Have you had a chance to review the document I posted? Are there any comments? We are wrapping up the code for this JIRA based on the design in the document. How would you like us to move forward? I would like to get your feedback on the design before submitting a PR for the code. I look forward to hearing from you.

Thanks.
[jira] [Commented] (SPARK-18874) First phase: Deferring the correlated predicate pull up to Optimizer phase
[ https://issues.apache.org/jira/browse/SPARK-18874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803108#comment-15803108 ]

Nattavut Sutyanyong commented on SPARK-18874:
---------------------------------------------

Please try accessing the doc from the URL below.
https://docs.google.com/document/d/1QDZ8JwU63RwGFS6KVF54Rjj9ZJyK33d49ZWbjFBaIgU/edit#
[jira] [Commented] (SPARK-18874) First phase: Deferring the correlated predicate pull up to Optimizer phase
[ https://issues.apache.org/jira/browse/SPARK-18874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803057#comment-15803057 ]

Reynold Xin commented on SPARK-18874:
-------------------------------------

Thanks. Where is the doc?
[jira] [Commented] (SPARK-18874) First phase: Deferring the correlated predicate pull up to Optimizer phase
[ https://issues.apache.org/jira/browse/SPARK-18874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801504#comment-15801504 ]

Nattavut Sutyanyong commented on SPARK-18874:
---------------------------------------------

[~rxin], [~hvanhovell], [~smilegator]
Just an FYI that an initial version of the design doc has been posted for public review. Your comments are much appreciated. Thanks!
[jira] [Commented] (SPARK-18874) First phase: Deferring the correlated predicate pull up to Optimizer phase
[ https://issues.apache.org/jira/browse/SPARK-18874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15764597#comment-15764597 ]

Nattavut Sutyanyong commented on SPARK-18874:
---------------------------------------------

Here is an initial version of the detailed design document for this work.
https://docs.google.com/document/d/1QDZ8JwU63RwGFS6KVF54Rjj9ZJyK33d49ZWbjFBaIgU/edit#

Comments and ideas are welcome.