[spark] branch master updated (c35427f -> 7adf886)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from c35427f  [SPARK-30355][CORE] Unify isExecutorActive between CoarseGrainedSchedulerBackend and DriverEndpoint
     add 7adf886  [SPARK-30345][SQL] Fix intermittent test failure (ConnectException) on ThriftServerQueryTestSuite/ThriftServerWithSparkContextSuite

No new revisions were added by this update.

Summary of changes:
 ...ContextSuite.scala => SharedThriftServer.scala} | 48 ++---
 .../thriftserver/ThriftServerQueryTestSuite.scala  | 57 +---
 .../ThriftServerWithSparkContextSuite.scala        | 62 +-
 3 files changed, 19 insertions(+), 148 deletions(-)
 copy sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/{ThriftServerWithSparkContextSuite.scala => SharedThriftServer.scala} (71%)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (9c046dc -> c35427f)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 9c046dc  [SPARK-30102][ML][PYSPARK] GMM supports instance weighting
     add c35427f  [SPARK-30355][CORE] Unify isExecutorActive between CoarseGrainedSchedulerBackend and DriverEndpoint

No new revisions were added by this update.

Summary of changes:
 .../cluster/CoarseGrainedSchedulerBackend.scala | 21 +
 1 file changed, 9 insertions(+), 12 deletions(-)
[spark] branch master updated (a3cf9c5 -> 9c046dc)
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from a3cf9c5  [SPARK-30247][PYSPARK][FOLLOWUP] Add Python class MultivariateGaussian
     add 9c046dc  [SPARK-30102][ML][PYSPARK] GMM supports instance weighting

No new revisions were added by this update.

Summary of changes:
 .../spark/ml/clustering/GaussianMixture.scala      | 123 +
 .../spark/ml/clustering/GaussianMixtureSuite.scala |  23 +++-
 python/pyspark/ml/clustering.py                    |  19 +++-
 3 files changed, 108 insertions(+), 57 deletions(-)
[spark] branch master updated: [SPARK-30247][PYSPARK][FOLLOWUP] Add Python class MultivariateGaussian
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new a3cf9c5  [SPARK-30247][PYSPARK][FOLLOWUP] Add Python class MultivariateGaussian
a3cf9c5 is described below

commit a3cf9c564e74effe0f8457eaf9835ca0d3ab8be3
Author: Huaxin Gao
AuthorDate: Fri Dec 27 13:30:18 2019 +0800

    [SPARK-30247][PYSPARK][FOLLOWUP] Add Python class MultivariateGaussian

    ### What changes were proposed in this pull request?
    add a corresponding class MultivariateGaussian containing a vector and a matrix on the py side, so gaussian can be used on the py side.

    ### Does this PR introduce any user-facing change?
    add Python class ```MultivariateGaussian```

    ### How was this patch tested?
    doctest

    Closes #27020 from huaxingao/spark-30247.

    Authored-by: Huaxin Gao
    Signed-off-by: zhengruifeng
---
 python/pyspark/ml/clustering.py | 36 +---
 python/pyspark/ml/stat.py       | 17 +
 2 files changed, 50 insertions(+), 3 deletions(-)

diff --git a/python/pyspark/ml/clustering.py b/python/pyspark/ml/clustering.py
index f784b8f..7295b76 100644
--- a/python/pyspark/ml/clustering.py
+++ b/python/pyspark/ml/clustering.py
@@ -22,7 +22,8 @@ from pyspark import since, keyword_only
 from pyspark.ml.util import *
 from pyspark.ml.wrapper import JavaEstimator, JavaModel, JavaParams, JavaWrapper
 from pyspark.ml.param.shared import *
-from pyspark.ml.common import inherit_doc
+from pyspark.ml.common import inherit_doc, _java2py
+from pyspark.ml.stat import MultivariateGaussian
 from pyspark.sql import DataFrame
 
 __all__ = ['BisectingKMeans', 'BisectingKMeansModel', 'BisectingKMeansSummary',
@@ -161,7 +162,11 @@ class GaussianMixtureModel(JavaModel, _GaussianMixtureParams, JavaMLWritable, Ja
         Array of :py:class:`MultivariateGaussian` where gaussians[i] represents
         the Multivariate Gaussian (Normal) Distribution for Gaussian i
         """
-        return self._call_java("gaussians")
+        sc = SparkContext._active_spark_context
+        jgaussians = self._java_obj.gaussians()
+        return [
+            MultivariateGaussian(_java2py(sc, jgaussian.mean()), _java2py(sc, jgaussian.cov()))
+            for jgaussian in jgaussians]
 
     @property
     @since("2.0.0")
@@ -263,6 +268,21 @@ class GaussianMixture(JavaEstimator, _GaussianMixtureParams, JavaMLWritable, Jav
     >>> gaussians = model.gaussians
     >>> len(gaussians)
     3
+    >>> gaussians[0].mean
+    DenseVector([0.825, 0.8675])
+    >>> gaussians[0].cov.toArray()
+    array([[ 0.005625 , -0.0050625 ],
+           [-0.0050625 , 0.00455625]])
+    >>> gaussians[1].mean
+    DenseVector([-0.4777, -0.4096])
+    >>> gaussians[1].cov.toArray()
+    array([[ 0.1679695 , 0.13181786],
+           [ 0.13181786, 0.10524592]])
+    >>> gaussians[2].mean
+    DenseVector([-0.4473, -0.3853])
+    >>> gaussians[2].cov.toArray()
+    array([[ 0.16730412, 0.13112435],
+           [ 0.13112435, 0.10469614]])
     >>> model.gaussiansDF.select("mean").head()
     Row(mean=DenseVector([0.825, 0.8675]))
     >>> model.gaussiansDF.select("cov").head()
@@ -285,7 +305,17 @@ class GaussianMixture(JavaEstimator, _GaussianMixtureParams, JavaMLWritable, Jav
     False
     >>> model2.weights == model.weights
     True
-    >>> model2.gaussians == model.gaussians
+    >>> model2.gaussians[0].mean == model.gaussians[0].mean
+    True
+    >>> model2.gaussians[0].cov == model.gaussians[0].cov
+    True
+    >>> model2.gaussians[1].mean == model.gaussians[1].mean
+    True
+    >>> model2.gaussians[1].cov == model.gaussians[1].cov
+    True
+    >>> model2.gaussians[2].mean == model.gaussians[2].mean
+    True
+    >>> model2.gaussians[2].cov == model.gaussians[2].cov
     True
     >>> model2.gaussiansDF.select("mean").head()
     Row(mean=DenseVector([0.825, 0.8675]))
diff --git a/python/pyspark/ml/stat.py b/python/pyspark/ml/stat.py
index 8f2eadd..53a57af 100644
--- a/python/pyspark/ml/stat.py
+++ b/python/pyspark/ml/stat.py
@@ -19,6 +19,7 @@ import sys
 
 from pyspark import since, SparkContext
 from pyspark.ml.common import _java2py, _py2java
+from pyspark.ml.linalg import DenseMatrix, Vectors
 from pyspark.ml.wrapper import JavaWrapper, _jvm
 from pyspark.sql.column import Column, _to_seq
 from pyspark.sql.functions import lit
@@ -394,6 +395,22 @@ class SummaryBuilder(JavaWrapper):
         return Column(self._java_obj.summary(featuresCol._jc, weightCol._jc))
 
 
+class MultivariateGaussian(object):
+    """Represents a (mean, cov) tuple
+
+    >>> m = MultivariateGaussian(Vectors.dense([11,12]), DenseMatrix(2, 2, (1.0, 3.0, 5.0, 2.0)))
+    >>> (m.mean, m.cov.toArray())
+    (DenseVector([11.0, 12.0]), array([[ 1., 5.],
+           [ 3., 2.]]))
+
+    .. versionadded:: 3.0.0
+
+
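The doctest in the diff above passes the flat values `(1.0, 3.0, 5.0, 2.0)` to `DenseMatrix(2, 2, ...)` and gets back the rows `[1., 5.]` and `[3., 2.]`, because `DenseMatrix` reads its values in column-major order. The following is only a rough standalone sketch of that layout and of a (mean, cov) holder. It uses plain Python lists instead of PySpark types, and the names `Gaussian` and `column_major_to_rows` are hypothetical, not part of the patch or of PySpark.

```python
from collections import namedtuple

# Hypothetical stand-in for a (mean, cov) holder; NOT the PySpark class.
Gaussian = namedtuple("Gaussian", ["mean", "cov"])


def column_major_to_rows(num_rows, num_cols, values):
    """Turn a flat column-major sequence into a list of rows.

    Element (i, j) of the matrix sits at index i + j * num_rows.
    """
    if len(values) != num_rows * num_cols:
        raise ValueError("expected %d values, got %d"
                         % (num_rows * num_cols, len(values)))
    return [[values[i + j * num_rows] for j in range(num_cols)]
            for i in range(num_rows)]


# Same numbers as the doctest in the diff, with lists in place of
# DenseVector / DenseMatrix.
g = Gaussian(mean=[11.0, 12.0],
             cov=column_major_to_rows(2, 2, (1.0, 3.0, 5.0, 2.0)))
print(g.mean)
print(g.cov)
```

The index arithmetic is the only point of the sketch; code that actually uses PySpark would call `DenseMatrix.toArray()` as the doctest does.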
[spark] branch master updated (a2de20c -> 2acae97)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from a2de20c  [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by
     add 2acae97  [SPARK-30278][SQL][DOC] Update Spark SQL document menu for new changes

No new revisions were added by this update.

Summary of changes:
 docs/_data/menu-sql.yaml       |  8 ++--
 docs/sql-migration-guide.md    |  2 +-
 docs/sql-performance-tuning.md | 26 +++---
 3 files changed, 26 insertions(+), 10 deletions(-)
[spark] branch master updated (8d3eed3 -> a2de20c)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 8d3eed3  [SPARK-29224][ML] Implement Factorization Machines as a ml-pipeline component
     add a2de20c  [SPARK-30036][SQL] Fix: REPARTITION hint does not work with order by

No new revisions were added by this update.

Summary of changes:
 .../execution/exchange/EnsureRequirements.scala    |  2 +
 .../org/apache/spark/sql/ConfigBehaviorSuite.scala |  8 ++--
 .../apache/spark/sql/execution/PlannerSuite.scala  | 46 ++
 3 files changed, 51 insertions(+), 5 deletions(-)
[spark] branch master updated (3584d84 -> 8d3eed3)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 3584d84  [MINOR][CORE] Quiet request executor remove message
     add 8d3eed3  [SPARK-29224][ML] Implement Factorization Machines as a ml-pipeline component

No new revisions were added by this update.

Summary of changes:
 docs/ml-classification-regression.md               | 107 +++
 .../spark/examples/ml/JavaFMClassifierExample.java | 105 +++
 .../spark/examples/ml/JavaFMRegressorExample.java  |  90 +++
 .../src/main/python/ml/fm_classifier_example.py    |  77 ++
 .../src/main/python/ml/fm_regressor_example.py     |  74 ++
 .../spark/examples/ml/FMClassifierExample.scala    |  96 +++
 .../spark/examples/ml/FMRegressorExample.scala     |  84 +++
 .../spark/ml/classification/FMClassifier.scala     | 332 +
 .../apache/spark/ml/regression/FMRegressor.scala   | 815 +
 .../ml/classification/FMClassifierSuite.scala      | 242 ++
 .../spark/ml/regression/FMRegressorSuite.scala     | 240 ++
 python/pyspark/ml/classification.py                | 167 -
 python/pyspark/ml/regression.py                    | 165 -
 13 files changed, 2592 insertions(+), 2 deletions(-)
 create mode 100644 examples/src/main/java/org/apache/spark/examples/ml/JavaFMClassifierExample.java
 create mode 100644 examples/src/main/java/org/apache/spark/examples/ml/JavaFMRegressorExample.java
 create mode 100644 examples/src/main/python/ml/fm_classifier_example.py
 create mode 100644 examples/src/main/python/ml/fm_regressor_example.py
 create mode 100644 examples/src/main/scala/org/apache/spark/examples/ml/FMClassifierExample.scala
 create mode 100644 examples/src/main/scala/org/apache/spark/examples/ml/FMRegressorExample.scala
 create mode 100644 mllib/src/main/scala/org/apache/spark/ml/classification/FMClassifier.scala
 create mode 100644 mllib/src/main/scala/org/apache/spark/ml/regression/FMRegressor.scala
 create mode 100644 mllib/src/test/scala/org/apache/spark/ml/classification/FMClassifierSuite.scala
 create mode 100644 mllib/src/test/scala/org/apache/spark/ml/regression/FMRegressorSuite.scala
[spark] branch master updated (59c014e -> 3584d84)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 59c014e  [SPARK-30350][SQL] Fix ScalaReflection to use an empty array for getting its class object
     add 3584d84  [MINOR][CORE] Quiet request executor remove message

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[spark] branch master updated (d59e719 -> 59c014e)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from d59e719  [SPARK-27986][SQL] Support ANSI SQL filter clause for aggregate expression
     add 59c014e  [SPARK-30350][SQL] Fix ScalaReflection to use an empty array for getting its class object

No new revisions were added by this update.

Summary of changes:
 .../src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[spark] branch master updated (481fb63 -> d59e719)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 481fb63  [MINOR][SQL][SS] Remove TODO comments as var in case class is discouraged but worth breaking it
     add d59e719  [SPARK-27986][SQL] Support ANSI SQL filter clause for aggregate expression

No new revisions were added by this update.

Summary of changes:
 docs/sql-keywords.md                               |   1 +
 .../apache/spark/sql/catalyst/parser/SqlBase.g4    |   5 +-
 .../spark/sql/catalyst/analysis/Analyzer.scala     |  37 +-
 .../sql/catalyst/analysis/CheckAnalysis.scala      |   2 +-
 .../catalyst/analysis/higherOrderFunctions.scala   |   7 +-
 .../spark/sql/catalyst/analysis/unresolved.scala   |  11 +-
 .../expressions/aggregate/interfaces.scala         |  16 +-
 .../spark/sql/catalyst/optimizer/Optimizer.scala   |   4 +-
 .../optimizer/RewriteDistinctAggregates.scala      |  57 ++-
 .../spark/sql/catalyst/optimizer/expressions.scala |   4 +-
 .../spark/sql/catalyst/optimizer/subquery.scala    |   2 +-
 .../spark/sql/catalyst/parser/AstBuilder.scala     |   3 +-
 .../sql/catalyst/analysis/AnalysisErrorSuite.scala |  32 +-
 .../sql/execution/AggregatingAccumulator.scala     |   4 +-
 .../spark/sql/execution/aggregate/AggUtils.scala   |   2 +-
 .../execution/aggregate/AggregationIterator.scala  |  39 +-
 .../execution/aggregate/HashAggregateExec.scala    |   6 +-
 .../aggregate/ObjectAggregationIterator.scala      |   4 +-
 .../aggregate/TungstenAggregationIterator.scala    |   4 +-
 .../sql/execution/window/WindowExecBase.scala      |   2 +-
 .../resources/sql-tests/inputs/group-by-filter.sql | 132 ++
 .../inputs/postgreSQL/aggregates_part3.sql         |   8 +-
 .../sql-tests/inputs/postgreSQL/groupingsets.sql   |   4 +-
 .../sql-tests/inputs/postgreSQL/window_part3.sql   |   2 +-
 .../inputs/udf/postgreSQL/udf-aggregates_part3.sql |   1 -
 .../sql-tests/results/group-by-filter.sql.out      | 464 +
 .../results/postgreSQL/aggregates_part3.sql.out    |  22 +-
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala |  36 +-
 28 files changed, 845 insertions(+), 66 deletions(-)
 create mode 100644 sql/core/src/test/resources/sql-tests/inputs/group-by-filter.sql
 create mode 100644 sql/core/src/test/resources/sql-tests/results/group-by-filter.sql.out