Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19952#discussion_r156555044
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/EstimationUtils.scala
---
@@ -147,65 +139,76 @@ object
Github user ron8hu commented on the issue:
https://github.com/apache/spark/pull/19783
retest this please.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h
Github user ron8hu commented on the issue:
https://github.com/apache/spark/pull/19783
For the past two test builds, #84725 and #84732, I checked the test results
on the web. There were actually no failures. See
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84725
Github user ron8hu commented on the issue:
https://github.com/apache/spark/pull/19783
retest this please.
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r155964396
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/EstimationUtils.scala
---
@@ -114,4 +114,171 @@ object
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r155963930
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala
---
@@ -359,7 +371,7 @@ class
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r155963740
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -471,37 +508,47 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r155963622
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -332,8 +332,41 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r155683828
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/EstimationUtils.scala
---
@@ -114,4 +114,144 @@ object
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r155343962
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/EstimationUtils.scala
---
@@ -114,4 +114,171 @@ object
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r155125157
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/EstimationUtils.scala
---
@@ -114,4 +114,194 @@ object
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r155125029
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/EstimationUtils.scala
---
@@ -114,4 +114,194 @@ object
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r154255068
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala
---
@@ -578,6 +590,112 @@ class
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r154254419
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala
---
@@ -359,7 +371,7 @@ class
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r154254145
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -784,11 +879,16 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r154252063
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -513,10 +560,9 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r154250197
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -332,8 +332,45 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r154249995
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/EstimationUtils.scala
---
@@ -114,4 +114,197 @@ object
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r154225069
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/EstimationUtils.scala
---
@@ -114,4 +114,197 @@ object
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r154223769
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/EstimationUtils.scala
---
@@ -114,4 +114,197 @@ object
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r154223705
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/EstimationUtils.scala
---
@@ -114,4 +114,197 @@ object
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19783#discussion_r153902382
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/EstimationUtils.scala
---
@@ -114,4 +114,197 @@ object
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19594#discussion_r153665547
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/JoinEstimationSuite.scala
---
@@ -67,6 +68,205 @@ class
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19594#discussion_r153665092
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/JoinEstimationSuite.scala
---
@@ -67,6 +68,205 @@ class
Github user ron8hu commented on the issue:
https://github.com/apache/spark/pull/19783
cc @sameeragarwal @cloud-fan @gatorsmile @wzhfy
Github user ron8hu commented on the issue:
https://github.com/apache/spark/pull/19357
This pull request was created while several of its dependencies had not yet
been defined. As a result, it has accumulated many conflicts. I
decided to close this pull request and start a clean
Github user ron8hu closed the pull request at:
https://github.com/apache/spark/pull/19357
GitHub user ron8hu opened a pull request:
https://github.com/apache/spark/pull/19783
support histogram in filter cardinality estimation
## What changes were proposed in this pull request?
Histogram is effective in dealing with skewed distribution. After we
generate
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19479#discussion_r148724807
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala
---
@@ -275,6 +327,64 @@ object ColumnStat extends
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19479#discussion_r145840719
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala
---
@@ -216,65 +218,61 @@ object ColumnStat extends
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/19479#discussion_r145828713
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala
---
@@ -216,65 +218,61 @@ object ColumnStat extends
Github user ron8hu commented on the issue:
https://github.com/apache/spark/pull/19357
cc @wzhfy Please review the code first, before I ask the community to
review it. Thanks.
GitHub user ron8hu opened a pull request:
https://github.com/apache/spark/pull/19357
support histogram in filter cardinality estimation
## What changes were proposed in this pull request?
Histogram is effective in dealing with skewed distribution. After we
generate
Github user ron8hu commented on the issue:
https://github.com/apache/spark/pull/17918
I agree that we need to scale down the NDV (number of distinct values) for
all the referenced columns in a query if a filter condition reduces the
number of qualified rows. Do you find this problem when running the TPC-DS
benchmark
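The NDV scaling discussed in the comment above can be sketched as follows.
This is a minimal illustration, assuming the NDV shrinks in proportion to
the fraction of rows that survive the filter; `NdvScaling` and `scaleNdv`
are hypothetical names and are not taken from the pull request.

```scala
object NdvScaling {
  // Hypothetical helper (an assumption, not the PR's code): scale a column's
  // number of distinct values by the fraction of rows kept by a filter.
  def scaleNdv(oldNumRows: BigInt, newNumRows: BigInt, oldNdv: BigInt): BigInt =
    if (oldNumRows > 0 && newNumRows < oldNumRows) {
      val scaled = BigDecimal(oldNdv) * BigDecimal(newNumRows) / BigDecimal(oldNumRows)
      // Round up so a column that still has rows never reports zero distinct values.
      scaled.setScale(0, BigDecimal.RoundingMode.CEILING).toBigInt
    } else {
      // The filter removed no rows (or the table is empty): keep the NDV unchanged.
      oldNdv
    }
}
```

For example, a filter that keeps 100 of 1000 rows would scale an NDV of 50
down to 5 under this assumption.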
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110480494
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -736,6 +736,12 @@ object SQLConf {
.checkValue(weight
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r109476536
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -550,6 +565,221 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r109476505
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -550,6 +565,221 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r109476460
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala
---
@@ -491,7 +599,22 @@ class
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r109471156
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -550,6 +565,220 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r109340165
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -550,6 +565,220 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r109326608
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -550,6 +565,225 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r109323394
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -550,6 +565,225 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r109323397
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -550,6 +565,225 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r109323376
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -550,6 +565,225 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r109293607
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -550,6 +565,220 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r109273331
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -550,6 +565,140 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r109273334
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -550,6 +565,140 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r109273326
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -550,6 +565,140 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r109268076
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -550,6 +565,140 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r109267675
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -550,6 +565,140 @@ case
Github user ron8hu commented on the issue:
https://github.com/apache/spark/pull/17415
cc @sameeragarwal @cloud-fan @gatorsmile @wzhfy After a few rounds of
changes and commits, this PR should be in good shape. If we can include it in
Spark 2.2, then it can help TPC-H queries
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r108754614
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -515,8 +530,138 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r108753830
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -515,8 +530,138 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r108752975
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -515,8 +530,138 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r108751882
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -515,8 +530,138 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r108583109
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -515,8 +530,135 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r108582594
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -515,8 +530,135 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r108582540
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -509,8 +524,131 @@ case
Github user ron8hu commented on the issue:
https://github.com/apache/spark/pull/17446
The logic is straightforward. LGTM.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r108308949
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -509,8 +524,131 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r108307672
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala
---
@@ -381,7 +461,22 @@ class
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r108307490
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala
---
@@ -381,7 +461,22 @@ class
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17415#discussion_r108246637
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -509,8 +524,131 @@ case
Github user ron8hu commented on the issue:
https://github.com/apache/spark/pull/17415
cc @sameeragarwal @cloud-fan @gatorsmile This JIRA is not on the Spark 2.2
blocker list. If time permits, we can include it in Spark 2.2. If not, we can
wait for a maintenance release. Thanks
GitHub user ron8hu opened a pull request:
https://github.com/apache/spark/pull/17415
[SPARK-19408][SQL] filter estimation on two columns of same table
## What changes were proposed in this pull request?
In SQL queries, we also see predicate expressions involving two columns
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106766281
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,347 @@ package
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106285527
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -51,6 +51,11 @@ case class
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106271250
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -51,6 +51,11 @@ case class
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106237955
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -51,6 +51,11 @@ case class
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106084556
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -51,6 +51,11 @@ case class
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17138#discussion_r104760579
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -0,0 +1,274 @@
+/*
+ * Licensed
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17148#discussion_r104557770
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -90,32 +95,43 @@ case class
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17148#discussion_r104557294
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -90,32 +95,43 @@ case class
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17148#discussion_r104527462
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala
---
@@ -254,133 +270,118 @@ class
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17148#discussion_r104267627
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala
---
@@ -157,7 +157,7 @@ class
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17148#discussion_r104267295
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -101,21 +101,23 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17065#discussion_r103090517
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -361,57 +343,52 @@ case
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17065#discussion_r103087483
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -95,15 +84,16 @@ case class
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/17065#discussion_r103087345
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -297,6 +278,8 @@ case class
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r102133390
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala
---
@@ -0,0 +1,389
Github user ron8hu commented on the issue:
https://github.com/apache/spark/pull/16395
@cloud-fan I have updated the code based on your feedback. Please review it
again. Thanks.
Github user ron8hu commented on the issue:
https://github.com/apache/spark/pull/16395
Hi @cloud-fan I revised the code using the latest Range class. Thanks for
reviewing the code.
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r100955392
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,623
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r100954234
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,623
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r100952454
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,623
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r100941831
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,623
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r100940942
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,623
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16696#discussion_r99474494
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/StatsEstimationSuite.scala
---
@@ -18,12 +18,41 @@
package
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16696#discussion_r98776491
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
---
@@ -727,37 +728,18 @@ case class
Github user ron8hu commented on the issue:
https://github.com/apache/spark/pull/16594
To show a very large Long number, there is no need to print out every digit
in the number. We can use an exponent. For example, the number
120,000,000,005,123 can be printed as 1.2*10**14, where 10**14
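As a rough illustration of the exponent idea in the comment above, the
sketch below formats a Long in scientific notation with one fractional
digit. `SciNotation` and `compact` are hypothetical names, and the `E+14`
style comes from the standard `%E` format conversion rather than the
`10**14` spelling used in the comment.

```scala
import java.util.Locale

object SciNotation {
  // Hypothetical helper (an assumption, not code from the PR): render a Long
  // compactly, e.g. 1.2E+14, instead of printing every digit.
  // Locale.ROOT keeps the decimal separator a '.' regardless of system settings.
  def compact(n: Long): String = "%.1E".formatLocal(Locale.ROOT, n.toDouble)
}
```

With this helper, `SciNotation.compact(120000000005123L)` yields `1.2E+14`.
The trade-off is that the low-order digits are dropped, which is usually
acceptable for displaying statistics such as row counts.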
Github user ron8hu commented on the issue:
https://github.com/apache/spark/pull/16395
@wzhfy For the predicate condition d_date >= '2000-01-27', we do not support
it because Spark SQL casts the d_date column to String before the
comparison. For the predicate condition d_date >= cast('2
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96718076
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala
---
@@ -0,0 +1,303
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96325991
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96324552
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,620
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96308583
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala
---
@@ -0,0 +1,309
Github user ron8hu commented on the issue:
https://github.com/apache/spark/pull/16395
cc @rxin @wzhfy
I have updated the code. Please review again. Thanks.
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96128057
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/Range.scala
---
@@ -0,0 +1,75 @@
+/*
+ * Licensed
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r96128021
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/Range.scala
---
@@ -0,0 +1,75 @@
+/*
+ * Licensed
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r95935452
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
---
@@ -116,6 +116,12 @@ case class Filter
Github user ron8hu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16395#discussion_r95916520
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala
---
@@ -0,0 +1,555