[spark] branch master updated (657e39a -> 7fdb571)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

  from 657e39a  [SPARK-32897][PYTHON] Don't show a deprecation warning at SparkSession.builder.getOrCreate
   add 7fdb571  [SPARK-32890][SQL] Pass all `sql/hive` module UTs in Scala 2.13

No new revisions were added by this update.

Summary of changes:
 .../resources/regression-test-SPARK-8489/test-2.13.jar | Bin 0 -> 19579 bytes
 .../spark/sql/hive/HiveSchemaInferenceSuite.scala      |   2 +-
 .../apache/spark/sql/hive/HiveSparkSubmitSuite.scala   |   2 +-
 .../org/apache/spark/sql/hive/StatisticsSuite.scala    |   2 +-
 .../apache/spark/sql/hive/execution/HiveDDLSuite.scala |   2 +-
 5 files changed, 4 insertions(+), 4 deletions(-)
 create mode 100644 sql/hive/src/test/resources/regression-test-SPARK-8489/test-2.13.jar

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (316242b -> 6f36db1)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

  from 316242b  [SPARK-32874][SQL][TEST] Enhance result set meta data check for execute statement operation with thrift server
   add 6f36db1  [SPARK-31448][PYTHON] Fix storage level used in persist() in dataframe.py

No new revisions were added by this update.

Summary of changes:
 python/pyspark/sql/dataframe.py | 7 ++++---
 python/pyspark/storagelevel.py  | 1 +
 2 files changed, 5 insertions(+), 3 deletions(-)
[spark] branch master updated (bbbd907 -> 3be552c)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

  from bbbd907  [SPARK-32804][LAUNCHER] Fix run-example command builder bug
   add 3be552c  [SPARK-30090][SHELL] Adapt Spark REPL to Scala 2.13

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/repl/Main.scala               |   0
 .../org/apache/spark/repl/SparkILoop.scala         |   0
 .../org/apache/spark/repl/Main.scala               |  22 ++-
 .../org/apache/spark/repl/SparkILoop.scala         | 149 ++
 .../org/apache/spark/repl/Repl2Suite.scala         |  58 +++
 .../apache/spark/repl/SingletonRepl2Suite.scala    | 171 +
 .../org/apache/spark/repl/Repl2Suite.scala         |  53 +++
 .../apache/spark/repl/SingletonRepl2Suite.scala    | 171 +
 .../scala/org/apache/spark/repl/ReplSuite.scala    |  27
 .../org/apache/spark/repl/SingletonReplSuite.scala |  61
 .../sql/catalyst/util/CaseInsensitiveMap.scala     |   2 +-
 11 files changed, 618 insertions(+), 96 deletions(-)
 copy repl/src/main/{scala => scala-2.12}/org/apache/spark/repl/Main.scala (100%)
 rename repl/src/main/{scala => scala-2.12}/org/apache/spark/repl/SparkILoop.scala (100%)
 rename repl/src/main/{scala => scala-2.13}/org/apache/spark/repl/Main.scala (89%)
 create mode 100644 repl/src/main/scala-2.13/org/apache/spark/repl/SparkILoop.scala
 create mode 100644 repl/src/test/scala-2.12/org/apache/spark/repl/Repl2Suite.scala
 create mode 100644 repl/src/test/scala-2.12/org/apache/spark/repl/SingletonRepl2Suite.scala
 create mode 100644 repl/src/test/scala-2.13/org/apache/spark/repl/Repl2Suite.scala
 create mode 100644 repl/src/test/scala-2.13/org/apache/spark/repl/SingletonRepl2Suite.scala
[spark] branch master updated (2009f95 -> bbbd907)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

  from 2009f95  [SPARK-32779][SQL][FOLLOW-UP] Delete Unused code
   add bbbd907  [SPARK-32804][LAUNCHER] Fix run-example command builder bug

No new revisions were added by this update.

Summary of changes:
 .../spark/launcher/SparkSubmitCommandBuilder.java      | 15 +--
 .../spark/launcher/SparkSubmitCommandBuilderSuite.java | 18 ++
 2 files changed, 31 insertions(+), 2 deletions(-)
[spark] branch master updated (94cac59 -> f6322d1)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

  from 94cac59  [SPARK-32730][SQL][FOLLOW-UP] Improve LeftAnti SortMergeJoin right side buffering
   add f6322d1  [SPARK-32180][PYTHON][DOCS] Installation page of Getting Started in PySpark documentation

No new revisions were added by this update.

Summary of changes:
 python/docs/source/getting_started/index.rst     |   3 +
 .../docs/source/getting_started/installation.rst | 114 +
 2 files changed, 117 insertions(+)
 create mode 100644 python/docs/source/getting_started/installation.rst
[spark] branch master updated (328d81a -> fe2ab25)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

  from 328d81a  [SPARK-32677][SQL][DOCS][MINOR] Improve code comment in CreateFunctionCommand
   add fe2ab25  [MINOR][SQL] Fix a typo at 'spark.sql.sources.fileCompressionFactor' error message in SQLConf

No new revisions were added by this update.

Summary of changes:
 sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[spark] branch master updated (794b48c -> 513d51a)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

  from 794b48c  [SPARK-32204][SPARK-32182][DOCS][FOLLOW-UP] Use IPython instead of ipython to check if installed in dev/lint-python
   add 513d51a  [SPARK-32808][SQL] Fix some test cases of `sql/core` module in scala 2.13

No new revisions were added by this update.

Summary of changes:
 .../storage/ShuffleBlockFetcherIterator.scala      |  2 +-
 .../sql/catalyst/expressions/objects/objects.scala |  4 +-
 .../spark/sql/catalyst/plans/QueryPlan.scala       |  4 +-
 .../spark/sql/catalyst/util/GenericArrayData.scala |  8 +++-
 .../scala/org/apache/spark/sql/types/Decimal.scala |  4 +-
 .../spark/sql/RelationalGroupedDataset.scala       |  6 ++-
 .../apache/spark/sql/execution/GenerateExec.scala  |  2 +-
 .../sql-functions/sql-expression-schema.md         | 46 +++---
 .../org/apache/spark/sql/DataFrameStatSuite.scala  |  4 +-
 .../apache/spark/sql/ExpressionsSchemaSuite.scala  |  4 +-
 .../test/scala/org/apache/spark/sql/UDFSuite.scala |  2 +-
 .../execution/datasources/orc/OrcQuerySuite.scala  |  6 +-
 12 files changed, 51 insertions(+), 41 deletions(-)
[spark] branch master updated: [SPARK-32808][SQL] Fix some test cases of `sql/core` module in scala 2.13
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
new 513d51a [SPARK-32808][SQL] Fix some test cases of `sql/core` module in Scala 2.13

513d51a is described below

commit 513d51a2c5dd2c7ff2c2fadc26ec122883372be1
Author: yangjie01
AuthorDate: Wed Sep 9 08:53:44 2020 -0500

[SPARK-32808][SQL] Fix some test cases of `sql/core` module in Scala 2.13

### What changes were proposed in this pull request?

This PR partially resolves [SPARK-32808](https://issues.apache.org/jira/browse/SPARK-32808): a total of 26 failing test cases are fixed, in the following suites:

- `StreamingAggregationSuite` (2 FAILED -> Pass)
- `GeneratorFunctionSuite` (2 FAILED -> Pass)
- `UDFSuite` (2 FAILED -> Pass)
- `SQLQueryTestSuite` (5 FAILED -> Pass)
- `WholeStageCodegenSuite` (1 FAILED -> Pass)
- `DataFrameSuite` (3 FAILED -> Pass)
- `OrcV1QuerySuite`/`OrcV2QuerySuite` (4 FAILED -> Pass)
- `ExpressionsSchemaSuite` (1 FAILED -> Pass)
- `DataFrameStatSuite` (1 FAILED -> Pass)
- `JsonV1Suite`/`JsonV2Suite`/`JsonLegacyTimeParserSuite` (6 FAILED -> Pass)

The main changes in this PR are:

- Fix Scala 2.13 compilation problems in `ShuffleBlockFetcherIterator` and `Analyzer`.
- Use `scala.collection.Seq` explicitly in `objects.scala` and `GenericArrayData`, because the sequence used internally may be a `mutable.ArraySeq` on which it is not easy to call `.toSeq`.
- Use `scala.collection.Seq` explicitly when calling `Row.getAs[Seq]` or `Row.get(i).asInstanceOf[Seq]`, because the underlying data may be a `mutable.ArraySeq` while `Seq` means `immutable.Seq` in Scala 2.13.
- Use a compatible implementation of the `+` and `-` methods of `Decimal` so that they behave the same in Scala 2.12 and Scala 2.13.
- Call `toList` in `RelationalGroupedDataset.toDF` when `groupingExprs` is a `Stream`, because `Stream` cannot be serialized in Scala 2.13.
- Add a manual sort to `classFunsMap` in `ExpressionsSchemaSuite`, because `Iterable.groupBy` in Scala 2.13 returns results in a different order than `TraversableLike.groupBy` in Scala 2.12.

### Why are the changes needed?

We need to support a Scala 2.13 build.

### Does this PR introduce _any_ user-facing change?

Yes: callers must use `scala.collection.Seq` explicitly when calling `Row.getAs[Seq]` or `Row.get(i).asInstanceOf[Seq]`, because the data may be a `mutable.ArraySeq` while `Seq` is `immutable.Seq` in Scala 2.13.

### How was this patch tested?

- Scala 2.12: Pass the Jenkins or GitHub Actions build.
- Scala 2.13: run the following:

```
dev/change-scala-version.sh 2.13
mvn clean install -DskipTests -pl sql/core -Pscala-2.13 -am
mvn test -pl sql/core -Pscala-2.13
```

**Before**

```
Tests: succeeded 8166, failed 319, canceled 1, ignored 52, pending 0
*** 319 TESTS FAILED ***
```

**After**

```
Tests: succeeded 8204, failed 286, canceled 1, ignored 52, pending 0
*** 286 TESTS FAILED ***
```

Closes #29660 from LuciferYang/SPARK-32808.
Authored-by: yangjie01
Signed-off-by: Sean Owen
---
 .../storage/ShuffleBlockFetcherIterator.scala      |  2 +-
 .../sql/catalyst/expressions/objects/objects.scala |  4 +-
 .../spark/sql/catalyst/plans/QueryPlan.scala       |  4 +-
 .../spark/sql/catalyst/util/GenericArrayData.scala |  8 +++-
 .../scala/org/apache/spark/sql/types/Decimal.scala |  4 +-
 .../spark/sql/RelationalGroupedDataset.scala       |  6 ++-
 .../apache/spark/sql/execution/GenerateExec.scala  |  2 +-
 .../sql-functions/sql-expression-schema.md         | 46 +++---
 .../org/apache/spark/sql/DataFrameStatSuite.scala  |  4 +-
 .../apache/spark/sql/ExpressionsSchemaSuite.scala  |  4 +-
 .../test/scala/org/apache/spark/sql/UDFSuite.scala |  2 +-
 .../execution/datasources/orc/OrcQuerySuite.scala  |  6 +--
 12 files changed, 51 insertions(+), 41 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala b/core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
index 57b6a38..e3b3fc5 100644
--- a/core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
+++ b/core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
@@ -495,7 +495,7 @@ final class ShuffleBlockFetcherIterator(
       hostLocalDirManager.getHostLocalDirs(host, port, bmIds.map(_.executorId)) {
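The `scala.collection.Seq` pattern this commit message describes can be illustrated outside Spark. The sketch below uses hypothetical names (`SeqCompatSketch` is not from the PR) and stands in for `Row.get(i).asInstanceOf[Seq]`: under Scala 2.13 the bare `Seq` alias means `immutable.Seq`, so casting a `mutable.ArraySeq` to it fails at runtime, while casting to `scala.collection.Seq` works on both 2.12 and 2.13.

```scala
import scala.collection.mutable

object SeqCompatSketch {
  // A value whose static type has been lost, as with Row.get(i) in Spark.
  // The runtime class is mutable.ArraySeq, which is what Spark may hand back.
  val erased: Any = mutable.ArraySeq(1, 2, 3)

  // Under Scala 2.13, `erased.asInstanceOf[Seq[Int]]` would throw a
  // ClassCastException, because `Seq` resolves to immutable.Seq there.
  // `scala.collection.Seq` is the common supertype of mutable and
  // immutable sequences in both 2.12 and 2.13, so this cast is portable.
  val portable: scala.collection.Seq[Int] =
    erased.asInstanceOf[scala.collection.Seq[Int]]

  def sum: Int = portable.sum
}
```

Sequence equality in Scala is element-wise across implementations, so `portable` still compares equal to an immutable `Seq(1, 2, 3)` after the cast.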
[spark-website] branch asf-site updated: Update doc related to gpg key exports
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/spark-website.git

The following commit(s) were added to refs/heads/asf-site by this push:
new be6e744 Update doc related to gpg key exports

be6e744 is described below

commit be6e744d336bef26beb7c22da2e01a18f19587db
Author: zhengruifeng
AuthorDate: Fri Sep 4 08:22:13 2020 -0500

Update doc related to gpg key exports

When preparing for 3.0.1-rc, I encountered issues related to gpg keys:
1. Locally, I generated keys and used `gpg --export` to export them.
2. On an AWS EC2 instance, I imported the keys with `gpg --import` and then ran `do-release-docker.sh`. The script could not find the key.

That is because, according to [export-secret-key](https://infra.apache.org/openpgp.html#export-secret-key):

> To ensure that you do not accidentally expose private keys, the GnuPG --export operation exports only public keys.

`gpg --export` only exports **public** keys, while `do-release-docker.sh` needs a **secret/private** key. So we should use `gpg --export-secret-keys` instead of `gpg --export`.

![image](https://user-images.githubusercontent.com/7322292/92091702-afcd4780-ee03-11ea-87cf-8edcf0889215.png)

Author: zhengruifeng

Closes #288 from zhengruifeng/fix_gpg_exports.

---
 release-process.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/release-process.md b/release-process.md
index 2b38b0b..db40a50 100644
--- a/release-process.md
+++ b/release-process.md
@@ -43,8 +43,8 @@ After generating the gpg key, you need to upload your key to a public key server
 See https://www.apache.org/dev/openpgp.html#generate-key for details.
-If you want to do the release on another machine, you can transfer your gpg key to that machine
-via the `gpg --export` and `gpg --import` commands.
+If you want to do the release on another machine, you can transfer your secret key to that machine
+via the `gpg --export-secret-keys` and `gpg --import` commands.
 The last step is to update the KEYS file with your code signing key; see
 https://www.apache.org/dev/openpgp.html#export-public-key

- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (3cde392 -> 7511e43)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git.

from 3cde392 [SPARK-31831][SQL][FOLLOWUP] Make the GetCatalogsOperationMock for HiveSessionImplSuite compile with the proper Hive version
add 7511e43 [SPARK-32756][SQL] Fix CaseInsensitiveMap usage for Scala 2.13

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/catalyst/util/CaseInsensitiveMap.scala | 2 ++
 .../src/main/scala/org/apache/spark/sql/DataFrameReader.scala   | 2 +-
 .../src/main/scala/org/apache/spark/sql/DataFrameWriter.scala   | 9 +
 .../spark/sql/execution/datasources/orc/OrcFiltersBase.scala    | 2 +-
 .../spark/sql/execution/datasources/v2/FileDataSourceV2.scala   | 2 +-
 5 files changed, 10 insertions(+), 7 deletions(-)
[spark-website] branch asf-site updated: Adds Kotlin to the list of third-party language bindings
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/spark-website.git

The following commit(s) were added to refs/heads/asf-site by this push:
new 36e19d7 Adds Kotlin to the list of third-party language bindings

36e19d7 is described below

commit 36e19d72f38a806b3086636ec1266bbffb8dfaf2
Author: MKhalusova
AuthorDate: Thu Aug 27 17:47:46 2020 -0500

Adds Kotlin to the list of third-party language bindings

This PR adds a link to the Kotlin for Apache Spark repo as a third-party language binding.

Author: MKhalusova

Closes #287 from MKhalusova/kotlin-third-party.

---
 site/third-party-projects.html | 6 ++++++
 third-party-projects.md        | 4 ++++
 2 files changed, 10 insertions(+)

diff --git a/site/third-party-projects.html b/site/third-party-projects.html
index 5bd1524..bed5d61 100644
--- a/site/third-party-projects.html
+++ b/site/third-party-projects.html
@@ -298,6 +298,12 @@ transforming, and analyzing genomic data using Apache Spark
 <a href="https://github.com/dfdx/Spark.jl">Spark.jl</a>
+Kotlin
+
+<a href="https://github.com/JetBrains/kotlin-spark-api">Kotlin for Apache Spark</a>

diff --git a/third-party-projects.md b/third-party-projects.md
index 6176ff6..8f29bbb 100644
--- a/third-party-projects.md
+++ b/third-party-projects.md
@@ -90,3 +90,7 @@ transforming, and analyzing genomic data using Apache Spark
 Julia

 - <a href="https://github.com/dfdx/Spark.jl">Spark.jl</a>
+
+Kotlin
+
+- <a href="https://github.com/JetBrains/kotlin-spark-api">Kotlin for Apache Spark</a>
[spark] branch branch-3.0 updated: [SPARK-32701][CORE][DOCS] mapreduce.fileoutputcommitter.algorithm.version default value
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
new 60f4856 [SPARK-32701][CORE][DOCS] mapreduce.fileoutputcommitter.algorithm.version default value

60f4856 is described below

commit 60f485671a07a93ae8a8506ed2c0999cfe6ded7b
Author: waleedfateem
AuthorDate: Thu Aug 27 09:05:50 2020 -0500

[SPARK-32701][CORE][DOCS] mapreduce.fileoutputcommitter.algorithm.version default value

The current documentation states that the default value of spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version is 1. That is not entirely accurate, since this configuration isn't set anywhere in Spark; it is inherited from the Hadoop FileOutputCommitter class.

### What changes were proposed in this pull request?

This change clarifies that the default value depends entirely on the Hadoop version of the runtime environment.

### Why are the changes needed?

An application would end up using algorithm version 1 in some environments, but without any changes the same application will use version 2 in environments running Hadoop 3.0 and later. This can have bad consequences in certain scenarios; for example, two tasks can partially overwrite their output if speculation is enabled. See also https://issues.apache.org/jira/browse/MAPREDUCE-7282.

### Does this PR introduce _any_ user-facing change?

Yes. The configuration page previously stated that the default version of the FileOutputCommitter algorithm was v1; it now reads "Dependent on environment", with additional information in the description column.

### How was this patch tested?

Checked the changes locally in a browser.

Closes #29541 from waleedfateem/SPARK-32701.
Authored-by: waleedfateem
Signed-off-by: Sean Owen
(cherry picked from commit 8749b2b6fae5ee0ce7b48aae6d859ed71e98491d)
Signed-off-by: Sean Owen
---
 docs/configuration.md | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/docs/configuration.md b/docs/configuration.md
index 2701fdb..95ff282 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1761,11 +1761,16 @@ Apart from these, the following properties are also available, and may be useful
 spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version
-1
+Dependent on environment
 The file output committer algorithm version, valid algorithm version number: 1 or 2. Version 2 may have better performance, but version 1 may handle failures better in certain situations, as per <a href="https://issues.apache.org/jira/browse/MAPREDUCE-4815">MAPREDUCE-4815</a>.
+The default value depends on the Hadoop version used in an environment:
+1 for Hadoop versions lower than 3.0
+2 for Hadoop versions 3.0 and higher
+It's important to note that this can change back to 1 again in the future once <a href="https://issues.apache.org/jira/browse/MAPREDUCE-7282">MAPREDUCE-7282</a> is fixed and merged.
 2.2.0
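Since the default now varies with the Hadoop version, users who need deterministic committer behavior can pin the algorithm explicitly. A minimal sketch, not part of this commit, assuming a Spark build on the classpath (the app name is arbitrary):

```scala
import org.apache.spark.sql.SparkSession

object PinCommitterVersion {
  def main(args: Array[String]): Unit = {
    // Pin the committer algorithm explicitly rather than relying on the
    // Hadoop-version-dependent default (v1 before Hadoop 3.0, v2 from 3.0 on).
    // Version 1 handles failures better, e.g. with speculative execution.
    val spark = SparkSession.builder()
      .appName("pin-committer-version")
      .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "1")
      .getOrCreate()

    // ... run jobs that write file output ...

    spark.stop()
  }
}
```

The same setting can equally be passed on the command line via `--conf`, since any `spark.hadoop.*` key is forwarded to the Hadoop configuration.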
[spark] branch master updated (ed51a7f -> 8749b2b)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git.

from ed51a7f [SPARK-30654] Bootstrap4 docs upgrade
add 8749b2b [SPARK-32701][CORE][DOCS] mapreduce.fileoutputcommitter.algorithm.version default value

No new revisions were added by this update.

Summary of changes:
 docs/configuration.md | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)
[spark] branch master updated (f14f374 -> ed51a7f)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git.

from f14f374 [SPARK-32696][SQL][TEST-HIVE1.2][TEST-HADOOP2.7] Get columns operation should handle interval column properly
add ed51a7f [SPARK-30654] Bootstrap4 docs upgrade

No new revisions were added by this update.

Summary of changes:
 docs/_layouts/global.html                        |  128 +-
 docs/css/bootstrap-responsive.css                | 1040 
 docs/css/bootstrap-responsive.min.css            |    9 -
 docs/css/bootstrap.css                           | 5624 
 docs/css/bootstrap.min.css                       |   14 +-
 .../ui/static => docs/css}/bootstrap.min.css.map |    0
 docs/css/main.css                                |  150 +-
 docs/js/main.js                                  |   34 +-
 .../js/vendor}/bootstrap.bundle.min.js           |    0
 .../js/vendor}/bootstrap.bundle.min.js.map       |    0
 docs/js/vendor/bootstrap.js                      | 2027 ---
 docs/js/vendor/bootstrap.min.js                  |    6 -
 12 files changed, 222 insertions(+), 8810 deletions(-)
 delete mode 100644 docs/css/bootstrap-responsive.css
 delete mode 100644 docs/css/bootstrap-responsive.min.css
 delete mode 100644 docs/css/bootstrap.css
 copy {core/src/main/resources/org/apache/spark/ui/static => docs/css}/bootstrap.min.css.map (100%)
 copy {core/src/main/resources/org/apache/spark/ui/static => docs/js/vendor}/bootstrap.bundle.min.js (100%)
 copy {core/src/main/resources/org/apache/spark/ui/static => docs/js/vendor}/bootstrap.bundle.min.js.map (100%)
 delete mode 100755 docs/js/vendor/bootstrap.js
 delete mode 100755 docs/js/vendor/bootstrap.min.js
[spark] branch branch-3.0 updated: [SPARK-32701][CORE][DOCS] mapreduce.fileoutputcommitter.algorithm.version default value
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 60f4856 [SPARK-32701][CORE][DOCS] mapreduce.fileoutputcommitter.algorithm.version default value 60f4856 is described below commit 60f485671a07a93ae8a8506ed2c0999cfe6ded7b Author: waleedfateem AuthorDate: Thu Aug 27 09:05:50 2020 -0500 [SPARK-32701][CORE][DOCS] mapreduce.fileoutputcommitter.algorithm.version default value The current documentation states that the default value of spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version is 1 which is not entirely true since this configuration isn't set anywhere in Spark but rather inherited from the Hadoop FileOutputCommitter class. ### What changes were proposed in this pull request? I'm submitting this change, to clarify that the default value will entirely depend on the Hadoop version of the runtime environment. ### Why are the changes needed? An application would end up using algorithm version 1 on certain environments but without any changes the same exact application will use version 2 on environments running Hadoop 3.0 and later. This can have pretty bad consequences in certain scenarios, for example, two tasks can partially overwrite their output if speculation is enabled. Also, please refer to the following JIRA: https://issues.apache.org/jira/browse/MAPREDUCE-7282 ### Does this PR introduce _any_ user-facing change? Yes. Configuration page content was modified where previously we explicitly highlighted that the default version for the FileOutputCommitter algorithm was v1, this now has changed to "Dependent on environment" with additional information in the description column to elaborate. ### How was this patch tested? Checked changes locally in browser Closes #29541 from waleedfateem/SPARK-32701. 
Authored-by: waleedfateem Signed-off-by: Sean Owen (cherry picked from commit 8749b2b6fae5ee0ce7b48aae6d859ed71e98491d) Signed-off-by: Sean Owen --- docs/configuration.md | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/docs/configuration.md b/docs/configuration.md index 2701fdb..95ff282 100644 --- a/docs/configuration.md +++ b/docs/configuration.md @@ -1761,11 +1761,16 @@ Apart from these, the following properties are also available, and may be useful spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version - 1 + Dependent on environment The file output committer algorithm version, valid algorithm version number: 1 or 2. Version 2 may have better performance, but version 1 may handle failures better in certain situations, as per https://issues.apache.org/jira/browse/MAPREDUCE-4815;>MAPREDUCE-4815. +The default value depends on the Hadoop version used in an environment: +1 for Hadoop versions lower than 3.0 +2 for Hadoop versions 3.0 and higher +It's important to note that this can change back to 1 again in the future once https://issues.apache.org/jira/browse/MAPREDUCE-7282;>MAPREDUCE-7282 +is fixed and merged. 2.2.0 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (ed51a7f -> 8749b2b)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from ed51a7f [SPARK-30654] Bootstrap4 docs upgrade add 8749b2b [SPARK-32701][CORE][DOCS] mapreduce.fileoutputcommitter.algorithm.version default value No new revisions were added by this update. Summary of changes: docs/configuration.md | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (f14f374 -> ed51a7f)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from f14f374 [SPARK-32696][SQL][TEST-HIVE1.2][TEST-HADOOP2.7] Get columns operation should handle interval column properly add ed51a7f [SPARK-30654] Bootstrap4 docs upgrade No new revisions were added by this update. Summary of changes: docs/_layouts/global.html | 128 +- docs/css/bootstrap-responsive.css | 1040 docs/css/bootstrap-responsive.min.css |9 - docs/css/bootstrap.css | 5624 docs/css/bootstrap.min.css | 14 +- .../ui/static => docs/css}/bootstrap.min.css.map |0 docs/css/main.css | 150 +- docs/js/main.js| 34 +- .../js/vendor}/bootstrap.bundle.min.js |0 .../js/vendor}/bootstrap.bundle.min.js.map |0 docs/js/vendor/bootstrap.js| 2027 --- docs/js/vendor/bootstrap.min.js|6 - 12 files changed, 222 insertions(+), 8810 deletions(-) delete mode 100644 docs/css/bootstrap-responsive.css delete mode 100644 docs/css/bootstrap-responsive.min.css delete mode 100644 docs/css/bootstrap.css copy {core/src/main/resources/org/apache/spark/ui/static => docs/css}/bootstrap.min.css.map (100%) copy {core/src/main/resources/org/apache/spark/ui/static => docs/js/vendor}/bootstrap.bundle.min.js (100%) copy {core/src/main/resources/org/apache/spark/ui/static => docs/js/vendor}/bootstrap.bundle.min.js.map (100%) delete mode 100755 docs/js/vendor/bootstrap.js delete mode 100755 docs/js/vendor/bootstrap.min.js - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-32701][CORE][DOCS] mapreduce.fileoutputcommitter.algorithm.version default value
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 60f4856 [SPARK-32701][CORE][DOCS] mapreduce.fileoutputcommitter.algorithm.version default value 60f4856 is described below commit 60f485671a07a93ae8a8506ed2c0999cfe6ded7b Author: waleedfateem AuthorDate: Thu Aug 27 09:05:50 2020 -0500 [SPARK-32701][CORE][DOCS] mapreduce.fileoutputcommitter.algorithm.version default value The current documentation states that the default value of spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version is 1 which is not entirely true since this configuration isn't set anywhere in Spark but rather inherited from the Hadoop FileOutputCommitter class. ### What changes were proposed in this pull request? I'm submitting this change, to clarify that the default value will entirely depend on the Hadoop version of the runtime environment. ### Why are the changes needed? An application would end up using algorithm version 1 on certain environments but without any changes the same exact application will use version 2 on environments running Hadoop 3.0 and later. This can have pretty bad consequences in certain scenarios, for example, two tasks can partially overwrite their output if speculation is enabled. Also, please refer to the following JIRA: https://issues.apache.org/jira/browse/MAPREDUCE-7282 ### Does this PR introduce _any_ user-facing change? Yes. Configuration page content was modified where previously we explicitly highlighted that the default version for the FileOutputCommitter algorithm was v1, this now has changed to "Dependent on environment" with additional information in the description column to elaborate. ### How was this patch tested? Checked changes locally in browser Closes #29541 from waleedfateem/SPARK-32701. 
Authored-by: waleedfateem Signed-off-by: Sean Owen (cherry picked from commit 8749b2b6fae5ee0ce7b48aae6d859ed71e98491d) Signed-off-by: Sean Owen --- docs/configuration.md | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/docs/configuration.md b/docs/configuration.md index 2701fdb..95ff282 100644 --- a/docs/configuration.md +++ b/docs/configuration.md @@ -1761,11 +1761,16 @@ Apart from these, the following properties are also available, and may be useful spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version - 1 + Dependent on environment The file output committer algorithm version, valid algorithm version number: 1 or 2. Version 2 may have better performance, but version 1 may handle failures better in certain situations, as per https://issues.apache.org/jira/browse/MAPREDUCE-4815;>MAPREDUCE-4815. +The default value depends on the Hadoop version used in an environment: +1 for Hadoop versions lower than 3.0 +2 for Hadoop versions 3.0 and higher +It's important to note that this can change back to 1 again in the future once https://issues.apache.org/jira/browse/MAPREDUCE-7282;>MAPREDUCE-7282 +is fixed and merged. 2.2.0 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (ed51a7f -> 8749b2b)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from ed51a7f [SPARK-30654] Bootstrap4 docs upgrade add 8749b2b [SPARK-32701][CORE][DOCS] mapreduce.fileoutputcommitter.algorithm.version default value No new revisions were added by this update. Summary of changes: docs/configuration.md | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (f14f374 -> ed51a7f)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from f14f374 [SPARK-32696][SQL][TEST-HIVE1.2][TEST-HADOOP2.7] Get columns operation should handle interval column properly add ed51a7f [SPARK-30654] Bootstrap4 docs upgrade No new revisions were added by this update. Summary of changes: docs/_layouts/global.html | 128 +- docs/css/bootstrap-responsive.css | 1040 docs/css/bootstrap-responsive.min.css |9 - docs/css/bootstrap.css | 5624 docs/css/bootstrap.min.css | 14 +- .../ui/static => docs/css}/bootstrap.min.css.map |0 docs/css/main.css | 150 +- docs/js/main.js| 34 +- .../js/vendor}/bootstrap.bundle.min.js |0 .../js/vendor}/bootstrap.bundle.min.js.map |0 docs/js/vendor/bootstrap.js| 2027 --- docs/js/vendor/bootstrap.min.js|6 - 12 files changed, 222 insertions(+), 8810 deletions(-) delete mode 100644 docs/css/bootstrap-responsive.css delete mode 100644 docs/css/bootstrap-responsive.min.css delete mode 100644 docs/css/bootstrap.css copy {core/src/main/resources/org/apache/spark/ui/static => docs/css}/bootstrap.min.css.map (100%) copy {core/src/main/resources/org/apache/spark/ui/static => docs/js/vendor}/bootstrap.bundle.min.js (100%) copy {core/src/main/resources/org/apache/spark/ui/static => docs/js/vendor}/bootstrap.bundle.min.js.map (100%) delete mode 100755 docs/js/vendor/bootstrap.js delete mode 100755 docs/js/vendor/bootstrap.min.js - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-32701][CORE][DOCS] mapreduce.fileoutputcommitter.algorithm.version default value
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 60f4856 [SPARK-32701][CORE][DOCS] mapreduce.fileoutputcommitter.algorithm.version default value 60f4856 is described below commit 60f485671a07a93ae8a8506ed2c0999cfe6ded7b Author: waleedfateem AuthorDate: Thu Aug 27 09:05:50 2020 -0500 [SPARK-32701][CORE][DOCS] mapreduce.fileoutputcommitter.algorithm.version default value The current documentation states that the default value of spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version is 1, which is not entirely true since this configuration isn't set anywhere in Spark but rather inherited from the Hadoop FileOutputCommitter class. ### What changes were proposed in this pull request? I'm submitting this change to clarify that the default value depends entirely on the Hadoop version of the runtime environment. ### Why are the changes needed? An application would end up using algorithm version 1 in certain environments, but without any changes the same exact application will use version 2 in environments running Hadoop 3.0 and later. This can have serious consequences in certain scenarios; for example, two tasks can partially overwrite their output if speculation is enabled. Also, please refer to the following JIRA: https://issues.apache.org/jira/browse/MAPREDUCE-7282 ### Does this PR introduce _any_ user-facing change? Yes. Configuration page content was modified: previously we explicitly highlighted that the default version for the FileOutputCommitter algorithm was v1; this has changed to "Dependent on environment", with additional information in the description column to elaborate. ### How was this patch tested? Checked changes locally in a browser Closes #29541 from waleedfateem/SPARK-32701. 
Authored-by: waleedfateem Signed-off-by: Sean Owen (cherry picked from commit 8749b2b6fae5ee0ce7b48aae6d859ed71e98491d) Signed-off-by: Sean Owen --- docs/configuration.md | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/docs/configuration.md b/docs/configuration.md index 2701fdb..95ff282 100644 --- a/docs/configuration.md +++ b/docs/configuration.md @@ -1761,11 +1761,16 @@ Apart from these, the following properties are also available, and may be useful spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version - 1 + Dependent on environment The file output committer algorithm version, valid algorithm version number: 1 or 2. Version 2 may have better performance, but version 1 may handle failures better in certain situations, as per <a href="https://issues.apache.org/jira/browse/MAPREDUCE-4815">MAPREDUCE-4815</a>. +The default value depends on the Hadoop version used in an environment: +1 for Hadoop versions lower than 3.0 +2 for Hadoop versions 3.0 and higher +It's important to note that this can change back to 1 again in the future once <a href="https://issues.apache.org/jira/browse/MAPREDUCE-7282">MAPREDUCE-7282</a> +is fixed and merged. 2.2.0 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
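Because the effective default now varies with the Hadoop version on the cluster, jobs that depend on a particular committer behavior may want to pin the algorithm explicitly rather than inherit it from the environment. A minimal, hypothetical `spark-defaults.conf` fragment (the property name is the one documented above; choosing version 1 here is only an example):

```properties
# Pin the file output committer algorithm instead of inheriting the
# Hadoop-version-dependent default (1 before Hadoop 3.0, 2 afterwards).
# Version 1 trades some performance for better failure handling.
spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version  1
```

The same property can equally be set per job via `--conf` on `spark-submit` or on the session builder.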
[spark] branch branch-3.0 updated (8aa644e -> 4a67f1e)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from 8aa644e [SPARK-32092][ML][PYSPARK][3.0] Removed foldCol related code add 4a67f1e [SPARK-32588][CORE][TEST] Fix SizeEstimator initialization in tests No new revisions were added by this update. Summary of changes: .../apache/spark/storage/BlockManagerSuite.scala | 26 +++-- .../apache/spark/storage/MemoryStoreSuite.scala| 29 +-- .../org/apache/spark/util/SizeEstimatorSuite.scala | 43 ++ 3 files changed, 75 insertions(+), 23 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (08b951b -> bc23bb7)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 08b951b [SPARK-32649][SQL] Optimize BHJ/SHJ inner/semi join with empty hashed relation add bc23bb7 [SPARK-32588][CORE][TEST] Fix SizeEstimator initialization in tests No new revisions were added by this update. Summary of changes: .../apache/spark/storage/BlockManagerSuite.scala | 26 +++-- .../apache/spark/storage/MemoryStoreSuite.scala| 29 +-- .../org/apache/spark/util/SizeEstimatorSuite.scala | 43 ++ 3 files changed, 75 insertions(+), 23 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (1c798f9 -> ac520d4)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 1c798f9 [SPARK-32594][SQL][FOLLOWUP][TEST-HADOOP2.7][TEST-HIVE1.2] Override `get()` and use Julian days in `DaysWritable` add ac520d4 [SPARK-32676][3.0][ML] Fix double caching in KMeans/BiKMeans No new revisions were added by this update. Summary of changes: .../spark/ml/clustering/BisectingKMeans.scala | 33 ++- .../org/apache/spark/ml/clustering/KMeans.scala| 33 ++- .../spark/mllib/clustering/BisectingKMeans.scala | 47 ++ .../org/apache/spark/mllib/clustering/KMeans.scala | 29 +++-- 4 files changed, 59 insertions(+), 83 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-32676][3.0][ML] Fix double caching in KMeans/BiKMeans
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new ac520d4 [SPARK-32676][3.0][ML] Fix double caching in KMeans/BiKMeans ac520d4 is described below commit ac520d4a7c40a1d67358ee64af26e7f73face448 Author: zhengruifeng AuthorDate: Sun Aug 23 17:14:40 2020 -0500 [SPARK-32676][3.0][ML] Fix double caching in KMeans/BiKMeans ### What changes were proposed in this pull request? Fix double caching in KMeans/BiKMeans: 1, let the callers of `runWithWeight` pass whether `handlePersistence` is needed; 2, persist and unpersist inside `runWithWeight`; 3, persist the `norms` if needed, according to the comments; ### Why are the changes needed? Avoid double caching. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Existing test suites. Closes #29501 from zhengruifeng/kmeans_handlePersistence. 
Authored-by: zhengruifeng Signed-off-by: Sean Owen --- .../spark/ml/clustering/BisectingKMeans.scala | 33 ++- .../org/apache/spark/ml/clustering/KMeans.scala| 33 ++- .../spark/mllib/clustering/BisectingKMeans.scala | 47 ++ .../org/apache/spark/mllib/clustering/KMeans.scala | 29 +++-- 4 files changed, 59 insertions(+), 83 deletions(-) diff --git a/mllib/src/main/scala/org/apache/spark/ml/clustering/BisectingKMeans.scala b/mllib/src/main/scala/org/apache/spark/ml/clustering/BisectingKMeans.scala index 5a60bed..061091c 100644 --- a/mllib/src/main/scala/org/apache/spark/ml/clustering/BisectingKMeans.scala +++ b/mllib/src/main/scala/org/apache/spark/ml/clustering/BisectingKMeans.scala @@ -29,9 +29,8 @@ import org.apache.spark.ml.util._ import org.apache.spark.ml.util.Instrumentation.instrumented import org.apache.spark.mllib.clustering.{BisectingKMeans => MLlibBisectingKMeans, BisectingKMeansModel => MLlibBisectingKMeansModel} -import org.apache.spark.mllib.linalg.{Vector => OldVector, Vectors => OldVectors} +import org.apache.spark.mllib.linalg.{Vectors => OldVectors} import org.apache.spark.mllib.linalg.VectorImplicits._ -import org.apache.spark.rdd.RDD import org.apache.spark.sql.{DataFrame, Dataset, Row} import org.apache.spark.sql.functions._ import org.apache.spark.sql.types.{DoubleType, IntegerType, StructType} @@ -276,21 +275,6 @@ class BisectingKMeans @Since("2.0.0") ( override def fit(dataset: Dataset[_]): BisectingKMeansModel = instrumented { instr => transformSchema(dataset.schema, logging = true) -val handlePersistence = dataset.storageLevel == StorageLevel.NONE -val w = if (isDefined(weightCol) && $(weightCol).nonEmpty) { - checkNonNegativeWeight(col($(weightCol)).cast(DoubleType)) -} else { - lit(1.0) -} - -val instances: RDD[(OldVector, Double)] = dataset - .select(DatasetUtils.columnToVector(dataset, getFeaturesCol), w).rdd.map { - case Row(point: Vector, weight: Double) => (OldVectors.fromML(point), weight) -} -if (handlePersistence) { - 
instances.persist(StorageLevel.MEMORY_AND_DISK) -} - instr.logPipelineStage(this) instr.logDataset(dataset) instr.logParams(this, featuresCol, predictionCol, k, maxIter, seed, @@ -302,11 +286,18 @@ class BisectingKMeans @Since("2.0.0") ( .setMinDivisibleClusterSize($(minDivisibleClusterSize)) .setSeed($(seed)) .setDistanceMeasure($(distanceMeasure)) -val parentModel = bkm.runWithWeight(instances, Some(instr)) -val model = copyValues(new BisectingKMeansModel(uid, parentModel).setParent(this)) -if (handlePersistence) { - instances.unpersist() + +val w = if (isDefined(weightCol) && $(weightCol).nonEmpty) { + checkNonNegativeWeight(col($(weightCol)).cast(DoubleType)) +} else { + lit(1.0) } +val instances = dataset.select(DatasetUtils.columnToVector(dataset, getFeaturesCol), w) + .rdd.map { case Row(point: Vector, weight: Double) => (OldVectors.fromML(point), weight) } + +val handlePersistence = dataset.storageLevel == StorageLevel.NONE +val parentModel = bkm.runWithWeight(instances, handlePersistence, Some(instr)) +val model = copyValues(new BisectingKMeansModel(uid, parentModel).setParent(this)) val summary = new BisectingKMeansSummary( model.transform(dataset), diff --git a/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala b/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala index 5c06973..f6f6eb7 100644 --- a/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala +++ b/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala @@ -32,7 +32,6 @@ import org.apache.spark.ml.util.Instrument
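The essence of the fix in the diff above — the caller computes `handlePersistence` from the dataset's current storage level, and `runWithWeight` itself persists and unpersists — can be sketched with a toy stand-in. This is plain Python with hypothetical classes, not the actual Spark API:

```python
# Toy stand-in for the SPARK-32676 pattern: the fitting routine persists the
# input only if the caller's data is not already cached, and releases it
# itself, so the data is never cached twice.

class ToyDataset:
    def __init__(self):
        self.storage_level = "NONE"   # mirrors dataset.storageLevel

    def persist(self):
        self.storage_level = "MEMORY_AND_DISK"

    def unpersist(self):
        self.storage_level = "NONE"

def run_with_weight(instances, handle_persistence):
    """Persist/unpersist inside the algorithm, as the fix does."""
    if handle_persistence:
        instances.persist()
    try:
        # Placeholder for the real iterative work over `instances`.
        result = instances.storage_level
    finally:
        if handle_persistence:
            instances.unpersist()
    return result

data = ToyDataset()
# Caller decides: only ask the algorithm to cache if data isn't cached already.
handle_persistence = data.storage_level == "NONE"
level_during_fit = run_with_weight(data, handle_persistence)
print(level_during_fit)     # MEMORY_AND_DISK
print(data.storage_level)   # NONE -- released after fitting
```

If the caller had already persisted the data, `handle_persistence` would be `False` and the routine would leave caching entirely to the caller — which is exactly what prevents the double cache.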
[spark] branch master updated (d9eb06e -> a4d785d)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from d9eb06e [SPARK-32092][ML][PYSPARK] Fix parameters not being copied in CrossValidatorModel.copy(), read() and write() add a4d785d [MINOR] Typo in ShuffleMapStage.scala No new revisions were added by this update. Summary of changes: core/src/main/scala/org/apache/spark/scheduler/ShuffleMapStage.scala | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated (a6df16b -> 85c9e8c)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from a6df16b [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations add 85c9e8c [SPARK-32092][ML][PYSPARK] Fix parameters not being copied in CrossValidatorModel.copy(), read() and write() No new revisions were added by this update. Summary of changes: python/pyspark/ml/tests/test_tuning.py | 131 ++--- python/pyspark/ml/tuning.py| 67 + 2 files changed, 172 insertions(+), 26 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (25c7d0f -> d9eb06e)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 25c7d0f [SPARK-32526][SQL] Pass all test of sql/catalyst module in Scala 2.13 add d9eb06e [SPARK-32092][ML][PYSPARK] Fix parameters not being copied in CrossValidatorModel.copy(), read() and write() No new revisions were added by this update. Summary of changes: python/pyspark/ml/tests/test_tuning.py | 131 ++--- python/pyspark/ml/tuning.py| 67 + 2 files changed, 172 insertions(+), 26 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-32092][ML][PYSPARK] Fix parameters not being copied in CrossValidatorModel.copy(), read() and write()
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 85c9e8c [SPARK-32092][ML][PYSPARK] Fix parameters not being copied in CrossValidatorModel.copy(), read() and write() 85c9e8c is described below commit 85c9e8c54c30b69c39075e97cd3cac295be09303 Author: Louiszr AuthorDate: Sat Aug 22 09:27:31 2020 -0500 [SPARK-32092][ML][PYSPARK] Fix parameters not being copied in CrossValidatorModel.copy(), read() and write() ### What changes were proposed in this pull request? Changed the definitions of `CrossValidatorModel.copy()/_to_java()/_from_java()` so that exposed parameters (i.e. parameters with `get()` methods) are copied in these methods. ### Why are the changes needed? Parameters are copied in the respective Scala interface for `CrossValidatorModel.copy()`. It fits the semantics to persist parameters when calling `CrossValidatorModel.save()` and `CrossValidatorModel.load()`, so that the user gets the same model back after saving and loading it. Not copying across `numFolds` also causes bugs like array-index-out-of-bounds errors and lost sub-models, because this parameter will always default to 3 (as described in the JIRA ticket). ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Tests for `CrossValidatorModel.copy()` and `save()`/`load()` are updated so that they check parameters before and after function calls. Closes #29445 from Louiszr/master. 
Authored-by: Louiszr Signed-off-by: Sean Owen (cherry picked from commit d9eb06ea37cab185f1e49c641313be9707270252) Signed-off-by: Sean Owen --- python/pyspark/ml/tests/test_tuning.py | 131 ++--- python/pyspark/ml/tuning.py| 67 + 2 files changed, 172 insertions(+), 26 deletions(-) diff --git a/python/pyspark/ml/tests/test_tuning.py b/python/pyspark/ml/tests/test_tuning.py index 6bcc3f9..b250740 100644 --- a/python/pyspark/ml/tests/test_tuning.py +++ b/python/pyspark/ml/tests/test_tuning.py @@ -89,15 +89,50 @@ class CrossValidatorTests(SparkSessionTestCase): grid = (ParamGridBuilder() .addGrid(iee.inducedError, [100.0, 0.0, 1.0]) .build()) -cv = CrossValidator(estimator=iee, estimatorParamMaps=grid, evaluator=evaluator) +cv = CrossValidator( +estimator=iee, +estimatorParamMaps=grid, +evaluator=evaluator, +collectSubModels=True, +numFolds=2 +) cvCopied = cv.copy() -self.assertEqual(cv.getEstimator().uid, cvCopied.getEstimator().uid) +for param in [ +lambda x: x.getEstimator().uid, +# SPARK-32092: CrossValidator.copy() needs to copy all existing params +lambda x: x.getNumFolds(), +lambda x: x.getFoldCol(), +lambda x: x.getCollectSubModels(), +lambda x: x.getParallelism(), +lambda x: x.getSeed() +]: +self.assertEqual(param(cv), param(cvCopied)) cvModel = cv.fit(dataset) cvModelCopied = cvModel.copy() for index in range(len(cvModel.avgMetrics)): self.assertTrue(abs(cvModel.avgMetrics[index] - cvModelCopied.avgMetrics[index]) < 0.0001) +# SPARK-32092: CrossValidatorModel.copy() needs to copy all existing params +for param in [ +lambda x: x.getNumFolds(), +lambda x: x.getFoldCol(), +lambda x: x.getSeed() +]: +self.assertEqual(param(cvModel), param(cvModelCopied)) + +cvModel.avgMetrics[0] = 'foo' +self.assertNotEqual( +cvModelCopied.avgMetrics[0], +'foo', +"Changing the original avgMetrics should not affect the copied model" +) +cvModel.subModels[0] = 'foo' +self.assertNotEqual( +cvModelCopied.subModels[0], +'foo', +"Changing the original subModels should not affect the 
copied model" +) def test_fit_minimize_metric(self): dataset = self.spark.createDataFrame([ @@ -166,16 +201,39 @@ class CrossValidatorTests(SparkSessionTestCase): lr = LogisticRegression() grid = ParamGridBuilder().addGrid(lr.maxIter, [0, 1]).build() evaluator = BinaryClassificationEvaluator() -cv = CrossValidator(estimator=lr, estimatorParamMaps=grid, evaluator=evaluator) +cv = CrossValidator( +estimator=lr, +estimatorParamMaps=grid, +evaluator=evaluator, +collectSubModels=True, +numFolds=4, +seed=42 +) c
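The copy semantics this fix enforces can be illustrated without Spark at all: a `copy()` must carry over every explicitly set parameter, and mutable members such as `avgMetrics` and `subModels` must be copied deeply enough that mutating the original does not leak into the clone. Below is a minimal, hypothetical pure-Python sketch of that contract (`MiniCVModel` is an illustrative stand-in, not PySpark's actual `CrossValidatorModel`):

```python
import copy


class MiniCVModel:
    """Toy stand-in for CrossValidatorModel, illustrating the SPARK-32092 contract."""

    def __init__(self, num_folds=3, avg_metrics=None, sub_models=None):
        self.num_folds = num_folds                  # an exposed param; the bug reset it to 3
        self.avg_metrics = list(avg_metrics or [])  # per-param-map metric values
        self.sub_models = list(sub_models or [])    # collected sub-models

    def copy(self):
        # Carry over the exposed param AND deep-copy the mutable members,
        # so the clone is fully independent of the original.
        return MiniCVModel(
            num_folds=self.num_folds,
            avg_metrics=copy.deepcopy(self.avg_metrics),
            sub_models=copy.deepcopy(self.sub_models),
        )


m = MiniCVModel(num_folds=4, avg_metrics=[0.8, 0.9], sub_models=["m1", "m2"])
c = m.copy()
assert c.num_folds == 4          # the param survives the copy
m.avg_metrics[0] = "foo"
assert c.avg_metrics[0] == 0.8   # mutating the original leaves the copy intact
```

The two assertions mirror the checks the updated `test_tuning.py` performs on the real model.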
[spark] branch master updated (25c7d0f -> d9eb06e)

This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 25c7d0f  [SPARK-32526][SQL] Pass all test of sql/catalyst module in Scala 2.13
 add d9eb06e  [SPARK-32092][ML][PYSPARK] Fix parameters not being copied in CrossValidatorModel.copy(), read() and write()

No new revisions were added by this update.

Summary of changes:
 python/pyspark/ml/tests/test_tuning.py | 131 ++---
 python/pyspark/ml/tuning.py            |  67 +
 2 files changed, 172 insertions(+), 26 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (8b26c69 -> 25c7d0f)

This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 8b26c69  [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations
 add 25c7d0f  [SPARK-32526][SQL] Pass all test of sql/catalyst module in Scala 2.13

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/ExecutorAllocationManager.scala   | 2 +-
 sql/catalyst/src/main/scala/org/apache/spark/sql/Row.scala        | 2 +-
 .../org/apache/spark/sql/catalyst/CatalystTypeConverters.scala    | 2 +-
 .../scala/org/apache/spark/sql/catalyst/ScalaReflection.scala     | 7 +++
 .../scala/org/apache/spark/sql/catalyst/encoders/RowEncoder.scala | 4 +++-
 .../apache/spark/sql/catalyst/expressions/objects/objects.scala   | 8
 .../org/apache/spark/sql/catalyst/util/ArrayBasedMapData.scala    | 3 ++-
 .../sql/catalyst/optimizer/StarJoinCostBasedReorderSuite.scala    | 6 --
 8 files changed, 19 insertions(+), 15 deletions(-)
[spark] branch master updated: [SPARK-32526][SQL] Pass all test of sql/catalyst module in Scala 2.13

This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 25c7d0f  [SPARK-32526][SQL] Pass all test of sql/catalyst module in Scala 2.13
25c7d0f is described below

commit 25c7d0fe6ae20a4c1c42e0cd0b448c08ab03f3fb
Author:     yangjie01
AuthorDate: Sat Aug 22 09:24:16 2020 -0500

    [SPARK-32526][SQL] Pass all test of sql/catalyst module in Scala 2.13

    ### What changes were proposed in this pull request?

    The purpose of this PR is to resolve [SPARK-32526](https://issues.apache.org/jira/browse/SPARK-32526); all remaining failed cases are fixed. The main changes of this PR are as follows:

    - Change `ExecutorAllocationManager.scala` so the core module compiles in Scala 2.13; it's a blocking problem
    - Change `Seq[_]` to `scala.collection.Seq[_]` in the failed cases
    - Added a different expected plan for `Test 4: Star with several branches` of StarJoinCostBasedReorderSuite for Scala 2.13, because the candidate plans:

    ```
    Join Inner, (d1_pk#5 = f1_fk1#0)
    :- Join Inner, (f1_fk2#1 = d2_pk#8)
    :  :- Join Inner, (f1_fk3#2 = d3_pk#11)
    ```

    and

    ```
    Join Inner, (f1_fk2#1 = d2_pk#8)
    :- Join Inner, (d1_pk#5 = f1_fk1#0)
    :  :- Join Inner, (f1_fk3#2 = d3_pk#11)
    ```

    have the same cost `Cost(200,9200)`, but `HashMap` was rewritten in Scala 2.13 and the iteration order leads to different results.

    This PR fixes the following test cases:

    - LiteralExpressionSuite (1 FAILED -> PASS)
    - StarJoinCostBasedReorderSuite (1 FAILED -> PASS)
    - ObjectExpressionsSuite (2 FAILED -> PASS)
    - ScalaReflectionSuite (1 FAILED -> PASS)
    - RowEncoderSuite (10 FAILED -> PASS)
    - ExpressionEncoderSuite (ABORTED -> PASS)

    ### Why are the changes needed?

    We need to support a Scala 2.13 build.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?
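The StarJoinCostBasedReorderSuite issue above is a tie-breaking effect: when two candidate plans have exactly equal cost, the plan a cost-based reorderer emits depends on the iteration order of the collection holding the candidates, and Scala 2.13's rewritten `HashMap` iterates in a different order than 2.12's. The same effect is easy to demonstrate in Python with ordinary dicts (the plan names and cost tuples below are illustrative, not Spark's):

```python
def best_plan(candidates):
    # min() keeps the FIRST candidate with the minimal key, so among
    # equal-cost plans the winner is decided purely by iteration order.
    return min(candidates, key=candidates.get)

# Same two plans, same Cost(rows, size)-style tuples, different insertion order.
a = {"plan_d1_first": (200, 9200), "plan_d2_first": (200, 9200)}
b = {"plan_d2_first": (200, 9200), "plan_d1_first": (200, 9200)}

assert best_plan(a) == "plan_d1_first"
assert best_plan(b) == "plan_d2_first"  # equal costs, different winner
```

This is why the fix adds a second expected plan for Scala 2.13 rather than changing the optimizer: both answers are equally correct under the cost model.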
[spark] branch branch-3.0 updated: [SPARK-32610][DOCS] Fix the link to metrics.dropwizard.io in monitoring.md to refer the proper version

This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new c4807ce  [SPARK-32610][DOCS] Fix the link to metrics.dropwizard.io in monitoring.md to refer the proper version
c4807ce is described below

commit c4807ced3913a4d524892dc7bab502250687a43c
Author:     Kousuke Saruta
AuthorDate: Sun Aug 16 12:07:37 2020 -0500

    [SPARK-32610][DOCS] Fix the link to metrics.dropwizard.io in monitoring.md to refer the proper version

    ### What changes were proposed in this pull request?

    This PR fixes the links to metrics.dropwizard.io in monitoring.md to refer to the proper version of the library.

    ### Why are the changes needed?

    There are links to metrics.dropwizard.io in monitoring.md, but the link targets refer to version 3.1.0, while we use 4.1.1. Now that users can create their own metrics using the Dropwizard library, it's better to fix the links to refer to the proper version.

    ### Does this PR introduce _any_ user-facing change?

    Yes. The modified links refer to version 4.1.1.

    ### How was this patch tested?

    Build the docs and visit all the modified links.

    Closes #29426 from sarutak/fix-dropwizard-url.

    Authored-by: Kousuke Saruta
    Signed-off-by: Sean Owen
    (cherry picked from commit 9a79bbc8b6e426e7b29a9f4867beb396014d8046)
    Signed-off-by: Sean Owen
---
 docs/monitoring.md | 8 
 pom.xml            | 4 
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/docs/monitoring.md b/docs/monitoring.md
index 1808167..4608a4e 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -718,7 +718,7 @@ The JSON end point is exposed at: `/applications/[app-id]/executors`, and the Pr
 The Prometheus endpoint is experimental and conditional to a configuration parameter: `spark.ui.prometheus.enabled=true` (the default is `false`).
 In addition, aggregated per-stage peak values of the executor memory metrics are written to the event log if `spark.eventLog.logStageExecutorMetrics` is true.
-Executor memory metrics are also exposed via the Spark metrics system based on the Dropwizard metrics library.
+Executor memory metrics are also exposed via the Spark metrics system based on the [Dropwizard metrics library](http://metrics.dropwizard.io/4.1.1).
 A list of the available metrics, with a short description:
@@ -922,7 +922,7 @@ keep the paths consistent in both modes.
 # Metrics
 Spark has a configurable metrics system based on the
-[Dropwizard Metrics Library](http://metrics.dropwizard.io/).
+[Dropwizard Metrics Library](http://metrics.dropwizard.io/4.1.1).
 This allows users to report Spark metrics to a variety of sinks including HTTP, JMX, and CSV files. The metrics are generated by sources embedded in the Spark code base. They provide instrumentation for specific activities and Spark components.
@@ -1016,7 +1016,7 @@ activates the JVM source:
 ## List of available metrics providers
 Metrics used by Spark are of multiple types: gauge, counter, histogram, meter and timer,
-see [Dropwizard library documentation for details](https://metrics.dropwizard.io/3.1.0/getting-started/).
+see [Dropwizard library documentation for details](https://metrics.dropwizard.io/4.1.1/getting-started.html).
 The following list of components and metrics reports the name and some details about the available metrics, grouped per component instance and source namespace. The most common time of metrics used in Spark instrumentation are gauges and counters.
@@ -1244,7 +1244,7 @@ Notes: `spark.metrics.staticSources.enabled` (default is true) - This source is available for driver and executor instances and is also available for other instances.
 - This source provides information on JVM metrics using the
-  [Dropwizard/Codahale Metric Sets for JVM instrumentation](https://metrics.dropwizard.io/3.1.0/manual/jvm/)
+  [Dropwizard/Codahale Metric Sets for JVM instrumentation](https://metrics.dropwizard.io/4.1.1/manual/jvm.html)
   and in particular the metric sets BufferPoolMetricSet, GarbageCollectorMetricSet and MemoryUsageGaugeSet.

 ### Component instance = applicationMaster

diff --git a/pom.xml b/pom.xml
index e9ae204..1bf5de0 100644
--- a/pom.xml
+++ b/pom.xml
@@ -145,6 +145,10 @@
     0.9.5
     2.4.0
     2.0.8
+    4.1.1
     1.8.2
     hadoop2
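The root cause of this docs fix is version drift: links hard-code a library version that no longer matches the version pinned in the build. A small check over the docs can catch this before release; the sketch below is a hypothetical helper (not part of the Spark build) that flags dropwizard.io links whose embedded version differs from the expected one:

```python
import re

EXPECTED_VERSION = "4.1.1"  # would be read from the build's version property


def stale_dropwizard_links(markdown: str, expected: str = EXPECTED_VERSION):
    """Return the versions embedded in metrics.dropwizard.io links that differ
    from the expected library version."""
    versions = re.findall(r"https?://metrics\.dropwizard\.io/(\d[\d.]*)", markdown)
    return [v for v in versions if v != expected]


doc = """
see [Dropwizard docs](https://metrics.dropwizard.io/3.1.0/getting-started/)
and [JVM metric sets](https://metrics.dropwizard.io/4.1.1/manual/jvm.html).
"""
assert stale_dropwizard_links(doc) == ["3.1.0"]  # only the outdated link is flagged
```

The same pattern generalizes to any externally versioned link in a docs tree: extract the version component with a regex and compare it against the single source of truth in the build file.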
[spark] branch master updated (c280c7f -> 9a79bbc)

This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from c280c7f  [SPARK-32625][SQL] Log error message when falling back to interpreter mode
 add 9a79bbc  [SPARK-32610][DOCS] Fix the link to metrics.dropwizard.io in monitoring.md to refer the proper version

No new revisions were added by this update.

Summary of changes:
 docs/monitoring.md | 8 
 pom.xml            | 4 
 2 files changed, 8 insertions(+), 4 deletions(-)
[spark] branch branch-3.0 updated: [SPARK-32610][DOCS] Fix the link to metrics.dropwizard.io in monitoring.md to refer the proper version
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new c4807ce [SPARK-32610][DOCS] Fix the link to metrics.dropwizard.io in monitoring.md to refer the proper version c4807ce is described below commit c4807ced3913a4d524892dc7bab502250687a43c Author: Kousuke Saruta AuthorDate: Sun Aug 16 12:07:37 2020 -0500 [SPARK-32610][DOCS] Fix the link to metrics.dropwizard.io in monitoring.md to refer the proper version ### What changes were proposed in this pull request? This PR fixes the link to metrics.dropwizard.io in monitoring.md to refer the proper version of the library. ### Why are the changes needed? There are links to metrics.dropwizard.io in monitoring.md but the link targets refer the version 3.1.0, while we use 4.1.1. Now that users can create their own metrics using the dropwizard library, it's better to fix the links to refer the proper version. ### Does this PR introduce _any_ user-facing change? Yes. The modified links refer the version 4.1.1. ### How was this patch tested? Build the docs and visit all the modified links. Closes #29426 from sarutak/fix-dropwizard-url. Authored-by: Kousuke Saruta Signed-off-by: Sean Owen (cherry picked from commit 9a79bbc8b6e426e7b29a9f4867beb396014d8046) Signed-off-by: Sean Owen --- docs/monitoring.md | 8 pom.xml| 4 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/docs/monitoring.md b/docs/monitoring.md index 1808167..4608a4e 100644 --- a/docs/monitoring.md +++ b/docs/monitoring.md @@ -718,7 +718,7 @@ The JSON end point is exposed at: `/applications/[app-id]/executors`, and the Pr The Prometheus endpoint is experimental and conditional to a configuration parameter: `spark.ui.prometheus.enabled=true` (the default is `false`). 
In addition, aggregated per-stage peak values of the executor memory metrics are written to the event log if `spark.eventLog.logStageExecutorMetrics` is true. -Executor memory metrics are also exposed via the Spark metrics system based on the Dropwizard metrics library. +Executor memory metrics are also exposed via the Spark metrics system based on the [Dropwizard metrics library](http://metrics.dropwizard.io/4.1.1). A list of the available metrics, with a short description: @@ -922,7 +922,7 @@ keep the paths consistent in both modes. # Metrics Spark has a configurable metrics system based on the -[Dropwizard Metrics Library](http://metrics.dropwizard.io/). +[Dropwizard Metrics Library](http://metrics.dropwizard.io/4.1.1). This allows users to report Spark metrics to a variety of sinks including HTTP, JMX, and CSV files. The metrics are generated by sources embedded in the Spark code base. They provide instrumentation for specific activities and Spark components. @@ -1016,7 +1016,7 @@ activates the JVM source: ## List of available metrics providers Metrics used by Spark are of multiple types: gauge, counter, histogram, meter and timer, -see [Dropwizard library documentation for details](https://metrics.dropwizard.io/3.1.0/getting-started/). +see [Dropwizard library documentation for details](https://metrics.dropwizard.io/4.1.1/getting-started.html). The following list of components and metrics reports the name and some details about the available metrics, grouped per component instance and source namespace. The most common time of metrics used in Spark instrumentation are gauges and counters. @@ -1244,7 +1244,7 @@ Notes: `spark.metrics.staticSources.enabled` (default is true) - This source is available for driver and executor instances and is also available for other instances. 
- This source provides information on JVM metrics using the - [Dropwizard/Codahale Metric Sets for JVM instrumentation](https://metrics.dropwizard.io/3.1.0/manual/jvm/) + [Dropwizard/Codahale Metric Sets for JVM instrumentation](https://metrics.dropwizard.io/4.1.1/manual/jvm.html) and in particular the metric sets BufferPoolMetricSet, GarbageCollectorMetricSet and MemoryUsageGaugeSet. ### Component instance = applicationMaster diff --git a/pom.xml b/pom.xml index e9ae204..1bf5de0 100644 --- a/pom.xml +++ b/pom.xml @@ -145,6 +145,10 @@ 0.9.5 2.4.0 2.0.8 + 4.1.1 1.8.2 hadoop2 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (c280c7f -> 9a79bbc)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from c280c7f [SPARK-32625][SQL] Log error message when falling back to interpreter mode add 9a79bbc [SPARK-32610][DOCS] Fix the link to metrics.dropwizard.io in monitoring.md to refer the proper version No new revisions were added by this update. Summary of changes: docs/monitoring.md | 8 pom.xml| 4 2 files changed, 8 insertions(+), 4 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-32610][DOCS] Fix the link to metrics.dropwizard.io in monitoring.md to refer the proper version
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 9a79bbc [SPARK-32610][DOCS] Fix the link to metrics.dropwizard.io in monitoring.md to refer the proper version 9a79bbc is described below commit 9a79bbc8b6e426e7b29a9f4867beb396014d8046 Author: Kousuke Saruta AuthorDate: Sun Aug 16 12:07:37 2020 -0500 [SPARK-32610][DOCS] Fix the link to metrics.dropwizard.io in monitoring.md to refer the proper version ### What changes were proposed in this pull request? This PR fixes the link to metrics.dropwizard.io in monitoring.md to refer the proper version of the library. ### Why are the changes needed? There are links to metrics.dropwizard.io in monitoring.md but the link targets refer the version 3.1.0, while we use 4.1.1. Now that users can create their own metrics using the dropwizard library, it's better to fix the links to refer the proper version. ### Does this PR introduce _any_ user-facing change? Yes. The modified links refer the version 4.1.1. ### How was this patch tested? Build the docs and visit all the modified links. Closes #29426 from sarutak/fix-dropwizard-url. Authored-by: Kousuke Saruta Signed-off-by: Sean Owen --- docs/monitoring.md | 8 pom.xml| 4 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/docs/monitoring.md b/docs/monitoring.md index 5fdf308..31fc160 100644 --- a/docs/monitoring.md +++ b/docs/monitoring.md @@ -758,7 +758,7 @@ The JSON end point is exposed at: `/applications/[app-id]/executors`, and the Pr The Prometheus endpoint is experimental and conditional to a configuration parameter: `spark.ui.prometheus.enabled=true` (the default is `false`). In addition, aggregated per-stage peak values of the executor memory metrics are written to the event log if `spark.eventLog.logStageExecutorMetrics` is true. 
-Executor memory metrics are also exposed via the Spark metrics system based on the Dropwizard metrics library. +Executor memory metrics are also exposed via the Spark metrics system based on the [Dropwizard metrics library](http://metrics.dropwizard.io/4.1.1). A list of the available metrics, with a short description: @@ -962,7 +962,7 @@ keep the paths consistent in both modes. # Metrics Spark has a configurable metrics system based on the -[Dropwizard Metrics Library](http://metrics.dropwizard.io/). +[Dropwizard Metrics Library](http://metrics.dropwizard.io/4.1.1). This allows users to report Spark metrics to a variety of sinks including HTTP, JMX, and CSV files. The metrics are generated by sources embedded in the Spark code base. They provide instrumentation for specific activities and Spark components. @@ -1056,7 +1056,7 @@ activates the JVM source: ## List of available metrics providers Metrics used by Spark are of multiple types: gauge, counter, histogram, meter and timer, -see [Dropwizard library documentation for details](https://metrics.dropwizard.io/3.1.0/getting-started/). +see [Dropwizard library documentation for details](https://metrics.dropwizard.io/4.1.1/getting-started.html). The following list of components and metrics reports the name and some details about the available metrics, grouped per component instance and source namespace. The most common time of metrics used in Spark instrumentation are gauges and counters. @@ -1284,7 +1284,7 @@ Notes: `spark.metrics.staticSources.enabled` (default is true) - This source is available for driver and executor instances and is also available for other instances. 
- This source provides information on JVM metrics using the - [Dropwizard/Codahale Metric Sets for JVM instrumentation](https://metrics.dropwizard.io/3.1.0/manual/jvm/) + [Dropwizard/Codahale Metric Sets for JVM instrumentation](https://metrics.dropwizard.io/4.1.1/manual/jvm.html) and in particular the metric sets BufferPoolMetricSet, GarbageCollectorMetricSet and MemoryUsageGaugeSet. ### Component instance = applicationMaster diff --git a/pom.xml b/pom.xml index e414835..23de569 100644 --- a/pom.xml +++ b/pom.xml @@ -145,6 +145,10 @@ 0.9.5 2.4.0 2.0.8 + 4.1.1 1.8.2 hadoop2 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
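The monitoring.md text in the diff above names the Dropwizard/Codahale JVM metric sets (BufferPoolMetricSet, GarbageCollectorMetricSet, MemoryUsageGaugeSet). Those sets are thin wrappers over the standard `java.lang.management` MXBeans. As a rough illustration only (this is neither Spark nor Dropwizard code), the same underlying values can be read with just the JDK:

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class JvmMetricsSketch {
    public static void main(String[] args) {
        // MemoryUsageGaugeSet exposes gauges backed by MemoryMXBean:
        // heap and non-heap used/committed/max sizes.
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println("heap.used = " + memory.getHeapMemoryUsage().getUsed());
        System.out.println("non-heap.used = " + memory.getNonHeapMemoryUsage().getUsed());

        // GarbageCollectorMetricSet exposes per-collector gauges backed by
        // GarbageCollectorMXBean: cumulative collection count and time (ms).
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ".count = " + gc.getCollectionCount());
            System.out.println(gc.getName() + ".time = " + gc.getCollectionTime());
        }

        // BufferPoolMetricSet exposes gauges backed by BufferPoolMXBean:
        // direct/mapped buffer counts and memory used.
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.println(pool.getName() + ".memoryUsed = " + pool.getMemoryUsed());
        }
    }
}
```

Dropwizard's metric sets register these same values as named gauges in a `MetricRegistry`, which Spark's metrics system then routes to the configured sinks.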
[spark] branch master updated (0c850c7 -> 6ae2cb2)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 0c850c7 [SPARK-32511][SQL] Add dropFields method to Column class add 6ae2cb2 [SPARK-32526][SQL] Fix some test cases of `sql/catalyst` module in scala 2.13 No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/expressions/ExpressionSet.scala | 8 +++- .../apache/spark/sql/catalyst/analysis/Analyzer.scala | 6 +++--- .../sql/catalyst/expressions/collectionOperations.scala | 17 + .../sql/catalyst/expressions/higherOrderFunctions.scala | 2 +- .../sql/catalyst/expressions/stringExpressions.scala| 4 ++-- .../apache/spark/sql/catalyst/json/JacksonParser.scala | 2 +- .../apache/spark/sql/catalyst/optimizer/Optimizer.scala | 2 +- .../apache/spark/sql/catalyst/parser/AstBuilder.scala | 5 +++-- .../org/apache/spark/sql/catalyst/trees/TreeNode.scala | 8 ++-- .../scala/org/apache/spark/sql/types/Metadata.scala | 4 +++- .../scala/org/apache/spark/sql/util/SchemaUtils.scala | 2 +- .../org/apache/spark/sql/RandomDataGenerator.scala | 2 +- .../org/apache/spark/sql/util/SchemaUtilsSuite.scala| 2 +- 13 files changed, 39 insertions(+), 25 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org