Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68204981
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68204978
[Test build #24851 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24851/consoleFull)
for PR 3778 at commit
[`527e6ce`](https://gith
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68204880
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68204877
[Test build #24850 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24850/consoleFull)
for PR 3778 at commit
[`37022d1`](https://gith
Github user scwf commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68204050
Updated. /cc @liancheng, any comments here?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project
Github user scwf commented on a diff in the pull request:
https://github.com/apache/spark/pull/3778#discussion_r22297307
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -307,7 +309,29 @@ object BooleanSimplification extends Rule
Github user scwf commented on a diff in the pull request:
https://github.com/apache/spark/pull/3778#discussion_r22297306
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -316,7 +340,29 @@ object BooleanSimplification extends Rule
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68203564
[Test build #24851 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24851/consoleFull)
for PR 3778 at commit
[`527e6ce`](https://githu
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68203451
[Test build #24850 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24850/consoleFull)
for PR 3778 at commit
[`37022d1`](https://githu
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68183128
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68183126
[Test build #24845 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24845/consoleFull)
for PR 3778 at commit
[`546a82b`](https://gith
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68180176
[Test build #24845 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24845/consoleFull)
for PR 3778 at commit
[`546a82b`](https://githu
Github user scwf commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68180010
OK, got it. After discussing with @liancheng offline, I will change this to
cover the 4th optimization and some simple numeric comparisons, and will not introduce
the extra library sp
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68178485
@scwf Two comments about your new branch:
1. `ConditionSimplification` is actually a synonym of
`BooleanSimplification`. The latter also covers simple cases of
Github user scwf commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68173060
Also, @liancheng, can you give me a floating point precision test? I want to
make sure this is OK with floats.
Github user scwf commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68172996
Hi @liancheng, @chenghao-intel, I refactored this into a cleaner version. Can you have a
look at it? I think it is cleaner and more readable.
https://github.com/scwf/spark/compare/apache:
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68169633
I would like to add that the solution based on Spire `Interval` I posted
above may suffer from floating point precision issues. Thus we might want to
cast all integral com
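To make the precision concern concrete, here is a minimal Scala sketch. It is not the Spire-based snippet from the PR (which is not shown in this listing); the `PrecisionSketch` object and its field names are hypothetical and only illustrate why representing integral bounds as `Double` can be lossy.

```scala
// Hypothetical illustration only; not code from the PR.
object PrecisionSketch {
  // Every integer up to 2^53 is exactly representable as a Double;
  // beyond that, neighbouring Long values can map to the same Double.
  val big: Long = 1L << 53

  // Two distinct Long bounds...
  val lower: Long = big
  val upper: Long = big + 1

  // ...that collapse to the same value once converted to Double.
  val collapsed: Boolean = lower.toDouble == upper.toDouble

  // Decimal fractions are also inexact in binary floating point.
  val sumIsExact: Boolean = 0.1 + 0.2 == 0.3
}
```

Here `collapsed` is `true` and `sumIsExact` is `false`. A range such as `x > lower && x <= upper` is non-empty over `Long`, but would look empty if both bounds were first converted to `Double`, which is the kind of case a precision test would need to cover.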
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68169507
Actually, I'd strongly suggest breaking this PR into at least two self-contained
PRs, which would be much easier to review and merge. Rule sets 1 and 4
can be merged int
Github user scwf commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68140510
Not only that; this PR actually covers the following optimizations:
```
And/Or with same condition
a && a => a , a && a && a ... => a
a || a => a , a || a || a ... =
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68123618
For numeric comparison optimizations, I did some experiments along the lines of my earlier
double interval comparison idea and came up with the following snippet. I
haven't even compil
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68121946
@scwf Would you mind listing all the optimizations in the PR description
first? Some more concise examples coupled with each optimization would be really
helpful. Then we
Github user chenghao-intel commented on a diff in the pull request:
https://github.com/apache/spark/pull/3778#discussion_r22277573
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
---
@@ -112,7 +112,30 @@ case class InSet(value: Expr
Github user scwf commented on a diff in the pull request:
https://github.com/apache/spark/pull/3778#discussion_r22277360
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -293,6 +295,380 @@ object OptimizeIn extends Rule[LogicalPl
Github user scwf commented on a diff in the pull request:
https://github.com/apache/spark/pull/3778#discussion_r22277333
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
---
@@ -112,7 +112,30 @@ case class InSet(value: Expression, hs
Github user chenghao-intel commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68119210
This is a really useful optimization, particularly for machine-generated
SQL. It would make more sense to add unit tests to
reflect the expr
Github user chenghao-intel commented on a diff in the pull request:
https://github.com/apache/spark/pull/3778#discussion_r22277164
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -293,6 +295,380 @@ object OptimizeIn extends Rule
Github user chenghao-intel commented on a diff in the pull request:
https://github.com/apache/spark/pull/3778#discussion_r22277118
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -293,6 +295,380 @@ object OptimizeIn extends Rule
Github user chenghao-intel commented on a diff in the pull request:
https://github.com/apache/spark/pull/3778#discussion_r22277105
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
---
@@ -160,6 +183,49 @@ abstract class BinaryCompari
Github user chenghao-intel commented on a diff in the pull request:
https://github.com/apache/spark/pull/3778#discussion_r22277093
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
---
@@ -112,7 +112,30 @@ case class InSet(value: Expr
Github user scwf commented on a diff in the pull request:
https://github.com/apache/spark/pull/3778#discussion_r22276984
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -293,6 +295,380 @@ object OptimizeIn extends Rule[LogicalPl
Github user liancheng commented on a diff in the pull request:
https://github.com/apache/spark/pull/3778#discussion_r22274369
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -293,6 +295,380 @@ object OptimizeIn extends Rule[Logi
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68086250
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68086248
[Test build #24804 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24804/consoleFull)
for PR 3778 at commit
[`8733027`](https://gith
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68084460
[Test build #24804 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24804/consoleFull)
for PR 3778 at commit
[`8733027`](https://githu
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68035016
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68035013
[Test build #24773 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24773/consoleFull)
for PR 3778 at commit
[`8c0316f`](https://gith
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68033865
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68033864
[Test build #24768 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24768/consoleFull)
for PR 3778 at commit
[`f1a487f`](https://gith
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68032736
[Test build #24773 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24773/consoleFull)
for PR 3778 at commit
[`8c0316f`](https://githu
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68031794
[Test build #24768 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24768/consoleFull)
for PR 3778 at commit
[`f1a487f`](https://githu
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/3778#issuecomment-68029380
@scwf These optimizations are useful, particularly the one that eliminates
common predicates. Thanks for bringing them up! However, the implementation in
this PR is rea
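As a rough illustration of the duplicate-operand rules listed in this thread (`a && a => a`, `a || a => a`), here is a minimal Scala sketch. The `Pred`, `And`, `Or` and `simplify` names are hypothetical stand-ins, not Catalyst's expression classes and not the code in this PR. Whether "common predicates" above refers to exactly these rules is an assumption.

```scala
// Hypothetical mini expression tree; these are NOT Catalyst's Expression classes.
object DedupSketch {
  sealed trait Expr
  final case class Pred(name: String) extends Expr
  final case class And(left: Expr, right: Expr) extends Expr
  final case class Or(left: Expr, right: Expr) extends Expr

  // Flatten a chain of nested And nodes into its operands.
  private def conjuncts(e: Expr): List[Expr] = e match {
    case And(l, r) => conjuncts(l) ++ conjuncts(r)
    case other     => List(other)
  }

  // Flatten a chain of nested Or nodes into its operands.
  private def disjuncts(e: Expr): List[Expr] = e match {
    case Or(l, r) => disjuncts(l) ++ disjuncts(r)
    case other    => List(other)
  }

  // Single pass: simplify each operand, drop structurally equal duplicates,
  // then rebuild a left-nested chain (a && a => a, a || a => a).
  def simplify(e: Expr): Expr = e match {
    case a: And => conjuncts(a).map(simplify).distinct.reduceLeft[Expr](And(_, _))
    case o: Or  => disjuncts(o).map(simplify).distinct.reduceLeft[Expr](Or(_, _))
    case p      => p
  }
}
```

For example, `DedupSketch.simplify(And(And(a, a), a))` with `val a = DedupSketch.Pred("a")` reduces to `a`. The sketch relies on case-class structural equality, so it only detects operands that are syntactically identical.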