[spark] branch master updated (acfee3c -> 21b7479)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

  from acfee3c  [SPARK-32870][DOCS][SQL] Make sure that all expressions have their ExpressionDescription filled
   add 21b7479  [SPARK-32959][SQL][TEST] Fix an invalid test in DataSourceV2SQLSuite

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/connector/DataSourceV2SQLSuite.scala |  3 ++-
 .../spark/sql/connector/TestV2SessionCatalogBase.scala    | 15 +--
 2 files changed, 11 insertions(+), 7 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (b53da23 -> acfee3c)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

  from b53da23  [MINOR][SQL] Improve examples for `percentile_approx()`
   add acfee3c  [SPARK-32870][DOCS][SQL] Make sure that all expressions have their ExpressionDescription filled

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/analysis/FunctionRegistry.scala    |  2 +-
 .../expressions/CallMethodViaReflection.scala       |  3 +-
 .../spark/sql/catalyst/expressions/Cast.scala       |  3 +-
 .../expressions/MonotonicallyIncreasingID.scala     |  8 +-
 .../catalyst/expressions/SparkPartitionID.scala     |  8 +-
 .../expressions/aggregate/CountMinSketchAgg.scala   |  7 ++
 .../expressions/aggregate/bitwiseAggregates.scala   |  1 +
 .../sql/catalyst/expressions/arithmetic.scala       | 38 ++---
 .../catalyst/expressions/bitwiseExpressions.scala   | 12 ++-
 .../expressions/collectionOperations.scala          | 18 +++--
 .../catalyst/expressions/complexTypeCreator.scala   | 20 +++--
 .../expressions/conditionalExpressions.scala        |  6 +-
 .../catalyst/expressions/datetimeExpressions.scala  | 40 +-
 .../sql/catalyst/expressions/generators.scala       | 12 ++-
 .../spark/sql/catalyst/expressions/hash.scala       | 18 +++--
 .../sql/catalyst/expressions/inputFileBlock.scala   | 27 ++-
 .../sql/catalyst/expressions/jsonExpressions.scala  |  6 +-
 .../spark/sql/catalyst/expressions/misc.scala       | 16 +++-
 .../sql/catalyst/expressions/predicates.scala       | 61 ---
 .../catalyst/expressions/windowExpressions.scala    | 91 ++
 .../spark/sql/catalyst/expressions/xml/xpath.scala  | 24 --
 .../sql-functions/sql-expression-schema.md          | 48 ++--
 .../test/resources/sql-tests/results/cast.sql.out   |  2 +
 .../apache/spark/sql/ExpressionsSchemaSuite.scala   | 10 ++-
 .../spark/sql/execution/command/DDLSuite.scala      |  6 +-
 .../sql/expressions/ExpressionInfoSuite.scala       | 37 -
 26 files changed, 404 insertions(+), 120 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (942f577 -> b53da23)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

  from 942f577  [SPARK-32017][PYTHON][BUILD] Make Pyspark Hadoop 3.2+ Variant available in PyPI
   add b53da23  [MINOR][SQL] Improve examples for `percentile_approx()`

No new revisions were added by this update.

Summary of changes:
 .../catalyst/expressions/aggregate/ApproximatePercentile.scala |  8 
 .../src/test/resources/sql-functions/sql-expression-schema.md  |  4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (779f0a8 -> 942f577)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

  from 779f0a8  [SPARK-32933][PYTHON] Use keyword-only syntax for keyword_only methods
   add 942f577  [SPARK-32017][PYTHON][BUILD] Make Pyspark Hadoop 3.2+ Variant available in PyPI

No new revisions were added by this update.

Summary of changes:
 dev/create-release/release-build.sh            |   3 +
 dev/sparktestsupport/modules.py                |   1 +
 python/docs/source/getting_started/install.rst |  35 +
 python/pyspark/find_spark_home.py              |  11 +-
 python/pyspark/install.py                      | 173 +
 python/pyspark/tests/test_install_spark.py     | 112 
 python/setup.py                                |  47 ++-
 7 files changed, 380 insertions(+), 2 deletions(-)
 create mode 100644 python/pyspark/install.py
 create mode 100644 python/pyspark/tests/test_install_spark.py

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (7c14f17 -> 779f0a8)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

  from 7c14f17  [SPARK-32306][SQL][DOCS] Clarify the result of `percentile_approx()`
   add 779f0a8  [SPARK-32933][PYTHON] Use keyword-only syntax for keyword_only methods

No new revisions were added by this update.

Summary of changes:
 python/pyspark/ml/classification.py |  70 
 python/pyspark/ml/clustering.py     |  40 ++---
 python/pyspark/ml/evaluation.py     |  48 +++---
 python/pyspark/ml/feature.py        | 332 ++--
 python/pyspark/ml/fpm.py            |  16 +-
 python/pyspark/ml/pipeline.py       |   8 +-
 python/pyspark/ml/recommendation.py |  24 +--
 python/pyspark/ml/regression.py     |  64 +++
 python/pyspark/ml/tuning.py         |  24 +--
 python/pyspark/sql/streaming.py     |   2 +-
 10 files changed, 318 insertions(+), 310 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
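[Editor's note: SPARK-32933 above moves PySpark ML constructors from the `@keyword_only` decorator pattern to Python 3's native keyword-only parameter syntax. A minimal sketch of what that syntax enforces, using a hypothetical estimator class rather than actual PySpark code:

```python
# Illustrative only: a hypothetical estimator showing Python's
# keyword-only parameter syntax (PEP 3102), the mechanism the
# SPARK-32933 change adopts for PySpark ML constructors.
class Estimator:
    def __init__(self, *, maxIter=100, regParam=0.0):
        # The bare '*' means every parameter after it must be
        # passed by keyword; positional calls raise TypeError.
        self.maxIter = maxIter
        self.regParam = regParam

est = Estimator(maxIter=10)      # OK: keyword argument
try:
    Estimator(10)                # positional call is rejected
except TypeError as exc:
    error_message = str(exc)
```

The benefit over the decorator approach is that the interpreter itself rejects positional misuse, so the constraint is visible in the signature.]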
[spark] branch master updated (fba5736 -> 7c14f17)
This is an automated email from the ASF dual-hosted git repository.

viirya pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

  from fba5736  [SPARK-32757][SQL][FOLLOWUP] Preserve the attribute name as possible as we scan in SubqueryBroadcastExec
   add 7c14f17  [SPARK-32306][SQL][DOCS] Clarify the result of `percentile_approx()`

No new revisions were added by this update.

Summary of changes:
 R/pkg/R/functions.R                                           |  6 --
 python/pyspark/sql/functions.py                               |  4 +++-
 .../expressions/aggregate/ApproximatePercentile.scala         | 12 +++-
 sql/core/src/main/scala/org/apache/spark/sql/functions.scala  |  5 +++--
 4 files changed, 17 insertions(+), 10 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
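[Editor's note: the SPARK-32306 doc change above clarifies that `percentile_approx()` returns an approximation of the smallest value in the column such that the fraction of values less than or equal to it is at least the given percentage. A plain-Python sketch of that exact quantity, for intuition only (not Spark code, and without Spark's approximation machinery):

```python
def exact_percentile(values, percentage):
    """Exact analogue of what percentile_approx approximates:
    the smallest value v in `values` such that the fraction of
    values <= v is at least `percentage`."""
    ordered = sorted(values)
    n = len(ordered)
    for rank, v in enumerate(ordered, start=1):
        if rank / n >= percentage:
            return v

exact_percentile([10, 20, 30, 40], 0.5)  # -> 20
```

Note the result is always an actual element of the input, which is the point the clarified documentation emphasizes.]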
[spark] branch master updated (dd808457 -> fba5736)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from dd808457  [SPARK-32964][DSTREAMS] Pass all `streaming` module UTs in Scala 2.13
     add fba5736   [SPARK-32757][SQL][FOLLOWUP] Preserve the attribute name as possible as we scan in SubqueryBroadcastExec

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/execution/SubqueryBroadcastExec.scala | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (6145621 -> dd808457)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 6145621   [SPARK-32659][SQL][FOLLOWUP] Broadcast Array instead of Set in InSubqueryExec
     add dd808457  [SPARK-32964][DSTREAMS] Pass all `streaming` module UTs in Scala 2.13

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/streaming/DStreamGraph.scala | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-32659][SQL][FOLLOWUP][3.0] Broadcast Array instead of Set in InSubqueryExec
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 8a481d8  [SPARK-32659][SQL][FOLLOWUP][3.0] Broadcast Array instead of Set in InSubqueryExec

8a481d8 is described below

commit 8a481d8c336360bc5dfa518af70590251e5b61bc
Author: Wenchen Fan
AuthorDate: Tue Sep 22 10:58:33 2020 -0700

    [SPARK-32659][SQL][FOLLOWUP][3.0] Broadcast Array instead of Set in InSubqueryExec

    ### What changes were proposed in this pull request?

    This is a followup of https://github.com/apache/spark/pull/29475. This PR updates the code to broadcast the Array instead of the Set, which was the behavior before #29475.

    ### Why are the changes needed?

    The Set can be much bigger than the Array. It's safer to keep the behavior the same as before and build the set on the executor side.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    Existing tests.

    Closes #29840 from cloud-fan/backport.

    Authored-by: Wenchen Fan
    Signed-off-by: Dongjoon Hyun
---
 .../main/scala/org/apache/spark/sql/execution/subquery.scala | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala
index 9d15c76..48d6210 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala
@@ -23,7 +23,7 @@ import scala.collection.mutable.ArrayBuffer
 import org.apache.spark.broadcast.Broadcast
 import org.apache.spark.sql.SparkSession
 import org.apache.spark.sql.catalyst.{expressions, InternalRow}
-import org.apache.spark.sql.catalyst.expressions.{AttributeSeq, CreateNamedStruct, Expression, ExprId, InSet, ListQuery, Literal, PlanExpression}
+import org.apache.spark.sql.catalyst.expressions.{CreateNamedStruct, Expression, ExprId, InSet, ListQuery, Literal, PlanExpression}
 import org.apache.spark.sql.catalyst.expressions.codegen.{CodegenContext, ExprCode}
 import org.apache.spark.sql.catalyst.rules.Rule
 import org.apache.spark.sql.internal.SQLConf
@@ -114,10 +114,10 @@ case class InSubqueryExec(
     child: Expression,
     plan: BaseSubqueryExec,
     exprId: ExprId,
-    private var resultBroadcast: Broadcast[Set[Any]] = null) extends ExecSubqueryExpression {
+    private var resultBroadcast: Broadcast[Array[Any]] = null) extends ExecSubqueryExpression {

-  @transient private var result: Set[Any] = _
-  @transient private lazy val inSet = InSet(child, result)
+  @transient private var result: Array[Any] = _
+  @transient private lazy val inSet = InSet(child, result.toSet)

   override def dataType: DataType = BooleanType
   override def children: Seq[Expression] = child :: Nil
@@ -132,11 +132,11 @@ case class InSubqueryExec(

   def updateResult(): Unit = {
     val rows = plan.executeCollect()
-    result = rows.map(_.get(0, child.dataType)).toSet
+    result = rows.map(_.get(0, child.dataType))
     resultBroadcast = plan.sqlContext.sparkContext.broadcast(result)
  }

-  def values(): Option[Set[Any]] = Option(resultBroadcast).map(_.value)
+  def values(): Option[Array[Any]] = Option(resultBroadcast).map(_.value)

   private def prepareResult(): Unit = {
     require(resultBroadcast != null, s"$this has not finished")

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
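The pattern the patch restores — ship the values as a compact Array and defer Set construction until the consuming side actually needs membership tests — can be sketched outside of Spark as follows. This is a standalone illustration with hypothetical names (`BroadcastSketch`, `contains`), not the Spark code itself:

```scala
object BroadcastSketch {
  // Stand-in for the driver-side subquery result: kept as an Array,
  // which serializes more compactly than a HashSet would.
  val result: Array[Any] = Array(1, 2, 3, 2, 1)

  // The Set is built lazily, once, on first use — mirroring
  // `lazy val inSet = InSet(child, result.toSet)` in the patch.
  lazy val lookup: Set[Any] = result.toSet

  def contains(v: Any): Boolean = lookup.contains(v)

  def main(args: Array[String]): Unit = {
    assert(contains(2))
    assert(!contains(9))
    println("ok")
  }
}
```

The design choice is the same as in the commit: pay the `toSet` cost once per executor rather than paying a larger serialization cost once per broadcast.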
[spark] branch master updated (790d9ef2d -> 6145621)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 790d9ef2d  [SPARK-32955][DOCS] An item in the navigation bar in the WebUI has a wrong link
     add 6145621    [SPARK-32659][SQL][FOLLOWUP] Broadcast Array instead of Set in InSubqueryExec

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/sql/execution/subquery.scala | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-32898][2.4][CORE] Fix wrong executorRunTime when task killed before real start
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new e1e94ed  [SPARK-32898][2.4][CORE] Fix wrong executorRunTime when task killed before real start

e1e94ed is described below

commit e1e94ed4ef45ef81814f1b920bac0afa52ae06a2
Author: yi.wu
AuthorDate: Mon Sep 21 23:20:18 2020 -0700

    [SPARK-32898][2.4][CORE] Fix wrong executorRunTime when task killed before real start

    ### What changes were proposed in this pull request?

    Only calculate the executorRunTime when taskStartTime > 0. Otherwise, set executorRunTime to 0.

    ### Why are the changes needed?

    Bug fix. It's possible for a task to be killed (e.g., by another successful attempt) before it reaches "taskStartTime = System.currentTimeMillis()". In this case, taskStartTime is still 0 since it hasn't really been initialized, and calculating System.currentTimeMillis() - taskStartTime yields a wrong executorRunTime.

    ### Does this PR introduce _any_ user-facing change?

    Yes, users will see the correct executorRunTime.

    ### How was this patch tested?

    Pass existing tests.

    Closes #29832 from Ngone51/backport-spark-3289.

    Authored-by: yi.wu
    Signed-off-by: Dongjoon Hyun
---
 core/src/main/scala/org/apache/spark/executor/Executor.scala | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/executor/Executor.scala b/core/src/main/scala/org/apache/spark/executor/Executor.scala
index f7ff0b8..fe57b1c 100644
--- a/core/src/main/scala/org/apache/spark/executor/Executor.scala
+++ b/core/src/main/scala/org/apache/spark/executor/Executor.scala
@@ -337,7 +337,10 @@ private[spark] class Executor(
   private def collectAccumulatorsAndResetStatusOnFailure(taskStartTime: Long) = {
     // Report executor runtime and JVM gc time
     Option(task).foreach(t => {
-      t.metrics.setExecutorRunTime(System.currentTimeMillis() - taskStartTime)
+      t.metrics.setExecutorRunTime(
+        // SPARK-32898: it's possible that a task is killed when taskStartTime has the initial
+        // value(=0) still. In this case, the executorRunTime should be considered as 0.
+        if (taskStartTime > 0) System.currentTimeMillis() - taskStartTime else 0)
       t.metrics.setJvmGCTime(computeTotalGcTime() - startGCTime)
     })

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
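The guard this patch adds boils down to: only derive a run time when the task actually recorded a start timestamp, since the sentinel value 0 means the task was killed before initialization. A minimal sketch of that logic in isolation (hypothetical object and method names, not Spark's `Executor`):

```scala
object RunTimeSketch {
  // A taskStartTime of 0 is the uninitialized sentinel: the task was killed
  // before it ever started, so the run time must be reported as 0 rather
  // than the huge (and wrong) value `now - 0` would produce.
  def executorRunTime(taskStartTime: Long, now: Long): Long =
    if (taskStartTime > 0) now - taskStartTime else 0L

  def main(args: Array[String]): Unit = {
    assert(executorRunTime(0L, 1600000000000L) == 0L) // killed before start
    assert(executorRunTime(1000L, 1500L) == 500L)     // normal case
    println("ok")
  }
}
```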
[spark] branch branch-2.4 updated: [SPARK-32898][2.4][CORE] Fix wrong executorRunTime when task killed before real start
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new e1e94ed [SPARK-32898][2.4][CORE] Fix wrong executorRunTime when task killed before real start e1e94ed is described below commit e1e94ed4ef45ef81814f1b920bac0afa52ae06a2 Author: yi.wu AuthorDate: Mon Sep 21 23:20:18 2020 -0700 [SPARK-32898][2.4][CORE] Fix wrong executorRunTime when task killed before real start ### What changes were proposed in this pull request? Only calculate the executorRunTime when taskStartTime > 0. Otherwise, set executorRunTime to 0. ### Why are the changes needed? bug fix. It's possible that a task be killed (e.g., by another successful attempt) before it reaches "taskStartTime = System.currentTimeMillis()". In this case, taskStartTime is still 0 since it hasn't been really initialized. And we will get the wrong executorRunTime by calculating System.currentTimeMillis() - taskStartTime. ### Does this PR introduce _any_ user-facing change? Yes, users will see the correct executorRunTime. ### How was this patch tested? Pass existing tests. Closes #29832 from Ngone51/backport-spark-3289. 
[spark] branch master updated (3118c22 -> 790d9ef2d)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git.

from 3118c22    [SPARK-32949][R][SQL] Add timestamp_seconds to SparkR
 add 790d9ef2d  [SPARK-32955][DOCS] An item in the navigation bar in the WebUI has a wrong link

No new revisions were added by this update.

Summary of changes:
 docs/_layouts/global.html |  2 +-
 docs/api.md               | 27 ---------------------------
 2 files changed, 1 insertion(+), 28 deletions(-)
 delete mode 100644 docs/api.md

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org