[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68204981 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68204978 [Test build #24851 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24851/consoleFull) for PR 3778 at commit [`527e6ce`](https://gith

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68204880 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68204877 [Test build #24850 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24850/consoleFull) for PR 3778 at commit [`37022d1`](https://gith

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-28 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68204050 Updated. /cc @liancheng, any comments here?

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-28 Thread scwf
Github user scwf commented on a diff in the pull request: https://github.com/apache/spark/pull/3778#discussion_r22297307 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -307,7 +309,29 @@ object BooleanSimplification extends Rule

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-28 Thread scwf
Github user scwf commented on a diff in the pull request: https://github.com/apache/spark/pull/3778#discussion_r22297306 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -316,7 +340,29 @@ object BooleanSimplification extends Rule

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68203564 [Test build #24851 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24851/consoleFull) for PR 3778 at commit [`527e6ce`](https://githu

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68203451 [Test build #24850 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24850/consoleFull) for PR 3778 at commit [`37022d1`](https://githu

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68183128 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68183126 [Test build #24845 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24845/consoleFull) for PR 3778 at commit [`546a82b`](https://gith

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-27 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68180176 [Test build #24845 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24845/consoleFull) for PR 3778 at commit [`546a82b`](https://githu

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-27 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68180010 OK, got it. After discussing with @liancheng offline, I will change this to cover the 4th optimization and some simple numeric comparisons, and will not introduce the extra library sp

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-27 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68178485 @scwf Two comments about your new branch: 1. `ConditionSimplification` is actually a synonym of `BooleanSimplification`. The latter also covers simple cases of

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-27 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68173060 Also, @liancheng, can you give me a floating-point precision test? I want to make sure this is OK with floats.

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-27 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68172996 Hi @liancheng, @chenghao-intel, I refactored this into a cleaner version. Could you have a look at it? I think it is cleaner and more readable. https://github.com/scwf/spark/compare/apache:

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-26 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68169633 I would like to add that the solution based on Spire `Interval` I posted above may suffer from floating-point precision issues. Thus we might want to cast all integral com
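
The precision concern raised in this comment can be illustrated with a small sketch. It is written in Python rather than Catalyst's Scala, and the helper name `less_than_via_double` is hypothetical; it only shows why routing integral comparisons through a 64-bit double can change the answer, not what the PR or the Spire-based snippet actually does.

```python
# A 64-bit double has a 53-bit significand, so not every integer above
# 2**53 is representable. Two distinct integers can collapse to one double.
BIG = 2**53


def less_than_via_double(x: int, y: int) -> bool:
    """Hypothetical helper: compare two integers after casting both to double."""
    return float(x) < float(y)


def less_than_exact(x: int, y: int) -> bool:
    """Compare the same two integers exactly, without leaving the integer domain."""
    return x < y


# BIG and BIG + 1 are different integers, but they map to the same double,
# so the predicate "BIG < BIG + 1" flips from true to false after the cast.
exact_result = less_than_exact(BIG, BIG + 1)
double_result = less_than_via_double(BIG, BIG + 1)
```

An optimizer that folds or merges comparisons on integral columns by way of doubles could therefore rewrite a predicate into one that is not equivalent for large values.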

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-26 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68169507 Actually, I'd strongly suggest breaking this PR into at least two self-contained PRs, which would be much easier to review and merge. Rule sets 1 and 4 can be merged int

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-26 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68140510 Not only that; this PR actually covers the following optimizations: ``` And/Or with same condition a && a => a , a && a && a ... => a a || a => a , a || a || a ... =
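
The first rule listed in this comment (`a && a => a`, `a || a => a`, including longer chains) can be sketched as follows. This is a minimal Python illustration, not the Scala rule from the PR: the `Pred`, `And`, and `Or` classes and the `simplify` function are hypothetical stand-ins for Catalyst expressions.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Pred:
    """Hypothetical leaf predicate, identified only by its name."""
    name: str


@dataclass(frozen=True)
class And:
    children: Tuple["Expr", ...]


@dataclass(frozen=True)
class Or:
    children: Tuple["Expr", ...]


Expr = Union[Pred, And, Or]


def simplify(expr: Expr) -> Expr:
    """Flatten nested And/Or of the same kind and drop duplicate operands."""
    if isinstance(expr, Pred):
        return expr
    node_type = type(expr)
    flat = []
    for child in expr.children:
        child = simplify(child)
        if isinstance(child, node_type):
            # (a && (a && b)) is treated as one flat conjunction a && a && b.
            flat.extend(child.children)
        else:
            flat.append(child)
    # dict.fromkeys keeps the first occurrence of each operand, in order.
    unique = tuple(dict.fromkeys(flat))
    return unique[0] if len(unique) == 1 else node_type(unique)
```

With `a = Pred("a")`, both `simplify(And((a, a)))` and `simplify(Or((a, a, a)))` reduce to `a`, while distinct operands are kept in their original order.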

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-25 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68123618 For numeric comparison optimizations, I did some experiments along the lines of my earlier double interval comparison idea and came up with the following snippet. I haven't even compil
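
The snippet referred to in this comment is cut off in this index, so the following is not a reconstruction of it. It is a small Python sketch of the general interval idea, under the assumption that each comparison against a literal is modelled as an interval and that `AND` corresponds to interval intersection. The `Interval` class and `from_comparison` helper are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical range of values allowed for one column."""
    lo: float = -math.inf
    hi: float = math.inf
    lo_closed: bool = False
    hi_closed: bool = False

    def intersect(self, other: "Interval") -> "Interval":
        # Tighter lower bound: the larger value; on a tie, the open bound.
        lo, lo_open = max((self.lo, not self.lo_closed),
                          (other.lo, not other.lo_closed))
        # Tighter upper bound: the smaller value; on a tie, the open bound.
        hi, hi_closed = min((self.hi, self.hi_closed),
                            (other.hi, other.hi_closed))
        return Interval(lo, hi, not lo_open, hi_closed)

    def is_empty(self) -> bool:
        if self.lo > self.hi:
            return True
        return self.lo == self.hi and not (self.lo_closed and self.hi_closed)


def from_comparison(op: str, value: float) -> Interval:
    """Translate `column <op> value` into the interval of allowed values."""
    if op == ">":
        return Interval(lo=value)
    if op == ">=":
        return Interval(lo=value, lo_closed=True)
    if op == "<":
        return Interval(hi=value)
    if op == "<=":
        return Interval(hi=value, hi_closed=True)
    if op == "=":
        return Interval(value, value, True, True)
    raise ValueError(f"unsupported operator: {op}")
```

Under this model, `a > 1 AND a > 3` intersects to the interval for `a > 3`, and `a > 5 AND a < 2` intersects to an empty interval, which an optimizer could fold to a constant false.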

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-25 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68121946 @scwf Would you mind listing all the optimizations in the PR description first? A concise example for each optimization would be really helpful. Then we

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-25 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/3778#discussion_r22277573 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -112,7 +112,30 @@ case class InSet(value: Expr

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-25 Thread scwf
Github user scwf commented on a diff in the pull request: https://github.com/apache/spark/pull/3778#discussion_r22277360 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -293,6 +295,380 @@ object OptimizeIn extends Rule[LogicalPl

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-25 Thread scwf
Github user scwf commented on a diff in the pull request: https://github.com/apache/spark/pull/3778#discussion_r22277333 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -112,7 +112,30 @@ case class InSet(value: Expression, hs

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-25 Thread chenghao-intel
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68119210 This is a very useful optimization, particularly for machine-generated SQL. It would make more sense if we added unit tests to reflect the expr

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-25 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/3778#discussion_r22277164 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -293,6 +295,380 @@ object OptimizeIn extends Rule

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-25 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/3778#discussion_r22277118 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -293,6 +295,380 @@ object OptimizeIn extends Rule

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-25 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/3778#discussion_r22277105 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -160,6 +183,49 @@ abstract class BinaryCompari

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-25 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/3778#discussion_r22277093 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -112,7 +112,30 @@ case class InSet(value: Expr

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-25 Thread scwf
Github user scwf commented on a diff in the pull request: https://github.com/apache/spark/pull/3778#discussion_r22276984 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -293,6 +295,380 @@ object OptimizeIn extends Rule[LogicalPl

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-25 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/3778#discussion_r22274369 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -293,6 +295,380 @@ object OptimizeIn extends Rule[Logi

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68086250 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68086248 [Test build #24804 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24804/consoleFull) for PR 3778 at commit [`8733027`](https://gith

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68084460 [Test build #24804 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24804/consoleFull) for PR 3778 at commit [`8733027`](https://githu

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68035016 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68035013 [Test build #24773 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24773/consoleFull) for PR 3778 at commit [`8c0316f`](https://gith

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68033865 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68033864 [Test build #24768 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24768/consoleFull) for PR 3778 at commit [`f1a487f`](https://gith

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68032736 [Test build #24773 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24773/consoleFull) for PR 3778 at commit [`8c0316f`](https://githu

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-23 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68031794 [Test build #24768 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24768/consoleFull) for PR 3778 at commit [`f1a487f`](https://githu

[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...

2014-12-23 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68029380 @scwf These optimizations are useful, particularly the one that eliminates common predicates. Thanks for bringing them up! However, the implementation in this PR is rea