[spark] branch master updated (31a16fb -> 688d016)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 31a16fb  [SPARK-32714][PYTHON] Initial pyspark-stubs port
 add 688d016  [SPARK-32982][BUILD] Remove hive-1.2 profiles in PIP installation option

No new revisions were added by this update.

Summary of changes:
 dev/create-release/release-build.sh | 2 +-
 python/docs/source/getting_started/install.rst | 24 ++--
 python/pyspark/install.py | 16
 python/pyspark/tests/test_install_spark.py | 13 +
 python/setup.py | 2 ++
 5 files changed, 18 insertions(+), 39 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
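SPARK-32982 touches `python/pyspark/install.py`, the helper behind pip-install-time selection of the bundled Spark distribution, removing the hive-1.2 option. As a rough, hypothetical sketch of what dropping a profile amounts to (the function name and version sets below are illustrative, not copied from the real `install.py`):

```python
# Hypothetical sketch of install-time version validation; the supported
# version sets are illustrative, not copied from pyspark/install.py.
SUPPORTED_HADOOP_VERSIONS = {"2.7", "3.2"}
SUPPORTED_HIVE_VERSIONS = {"2.3"}  # with hive-1.2 removed, one Hive flavor remains

def checked_versions(hadoop_version: str, hive_version: str):
    """Reject unsupported Hadoop/Hive combinations before downloading Spark."""
    if hadoop_version not in SUPPORTED_HADOOP_VERSIONS:
        raise RuntimeError("Unsupported Hadoop version: %s" % hadoop_version)
    if hive_version not in SUPPORTED_HIVE_VERSIONS:
        raise RuntimeError("Unsupported Hive version: %s" % hive_version)
    return hadoop_version, hive_version
```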
[spark] branch master updated (0bc0e91 -> 31a16fb)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 0bc0e91  [SPARK-32971][K8S][FOLLOWUP] Add `.toSeq` for Scala 2.13 compilation
 add 31a16fb  [SPARK-32714][PYTHON] Initial pyspark-stubs port

No new revisions were added by this update.

Summary of changes:
 dev/.rat-excludes | 1 +
 dev/tox.ini | 2 +-
 .../ml/estimator_transformer_param_example.py | 8 +-
 .../src/main/python/ml/fm_classifier_example.py | 6 +-
 .../src/main/python/ml/fm_regressor_example.py | 6 +-
 examples/src/main/python/ml/pipeline_example.py | 8 +-
 examples/src/main/python/sql/arrow.py | 4 +-
 python/MANIFEST.in | 1 +
 python/mypy.ini | 36 +
 python/pyspark/__init__.pyi | 73 +
 python/pyspark/_globals.pyi | 27 +
 python/pyspark/_typing.pyi | 33 +
 python/pyspark/accumulators.pyi | 71 +
 python/pyspark/broadcast.pyi | 46 +
 python/pyspark/conf.pyi | 44 +
 python/pyspark/context.pyi | 176 +++
 python/pyspark/daemon.pyi | 29 +
 python/pyspark/files.pyi | 24 +
 python/pyspark/find_spark_home.pyi | 17 +
 python/pyspark/java_gateway.pyi | 24 +
 python/pyspark/join.pyi | 50 +
 python/pyspark/ml/__init__.pyi | 45 +
 python/pyspark/ml/_typing.pyi | 76 +
 python/pyspark/ml/base.pyi | 103 ++
 python/pyspark/ml/classification.pyi | 922 +++
 python/pyspark/ml/clustering.pyi | 437 ++
 python/pyspark/ml/common.pyi | 20 +
 python/pyspark/ml/evaluation.pyi | 281
 python/pyspark/ml/feature.pyi | 1629
 python/pyspark/ml/fpm.pyi | 109 ++
 python/pyspark/ml/functions.pyi | 22 +
 python/pyspark/ml/image.pyi | 40 +
 python/pyspark/ml/linalg/__init__.pyi | 255 +++
 python/pyspark/ml/param/__init__.pyi | 96 ++
 .../pyspark/ml/param/_shared_params_code_gen.pyi | 19 +
 python/pyspark/ml/param/shared.pyi | 187 +++
 python/pyspark/ml/pipeline.pyi | 97 ++
 python/pyspark/ml/recommendation.pyi | 152 ++
 python/pyspark/ml/regression.pyi | 825 ++
 python/pyspark/ml/stat.pyi | 89 ++
 python/pyspark/ml/tests/test_algorithms.py | 2 +-
 python/pyspark/ml/tests/test_base.py | 2 +-
 python/pyspark/ml/tests/test_evaluation.py | 2 +-
 python/pyspark/ml/tests/test_feature.py | 2 +-
 python/pyspark/ml/tests/test_image.py | 2 +-
 python/pyspark/ml/tests/test_linalg.py | 2 +-
 python/pyspark/ml/tests/test_param.py | 2 +-
 python/pyspark/ml/tests/test_persistence.py | 2 +-
 python/pyspark/ml/tests/test_pipeline.py | 2 +-
 python/pyspark/ml/tests/test_stat.py | 2 +-
 python/pyspark/ml/tests/test_training_summary.py | 2 +-
 python/pyspark/ml/tests/test_tuning.py | 2 +-
 python/pyspark/ml/tests/test_wrapper.py | 6 +-
 python/pyspark/ml/tree.pyi | 112 ++
 python/pyspark/ml/tuning.pyi | 185 +++
 python/pyspark/ml/util.pyi | 128 ++
 python/pyspark/ml/wrapper.pyi | 48 +
 python/pyspark/mllib/__init__.pyi | 32 +
 python/pyspark/mllib/_typing.pyi | 23 +
 python/pyspark/mllib/classification.pyi | 151 ++
 python/pyspark/mllib/clustering.pyi | 196 +++
 python/pyspark/mllib/common.pyi | 27 +
 python/pyspark/mllib/evaluation.pyi | 94 ++
 python/pyspark/mllib/feature.pyi | 167 ++
 python/pyspark/mllib/fpm.pyi | 57 +
 python/pyspark/mllib/linalg/__init__.pyi | 273
 python/pyspark/mllib/linalg/distributed.pyi | 147 ++
 python/pyspark/mllib/random.pyi | 126 ++
 python/pyspark/mllib/recommendation.pyi | 75 +
 python/pyspark/mllib/regression.pyi | 155 ++
 python/pyspark/mllib/stat/KernelDensity.pyi | 27 +
 python/pyspark/mllib/stat/__init__.pyi | 29 +
 python/pyspark/mllib/stat/_statistics.pyi | 69 +
 python/pyspark/mllib/stat/distribution.pyi | 25
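The diffstat above is dominated by new `.pyi` stub files, one per pyspark module, plus a `mypy.ini` configuration. A stub file declares only annotated signatures, with `...` in place of every body. As a minimal illustrative sketch (signatures simplified and hypothetical, not copied from the actual stubs), an entry such as `python/pyspark/conf.pyi` has roughly this shape:

```python
# Illustrative sketch of the shape of a .pyi type stub; signatures are
# simplified/hypothetical, not taken from the real pyspark stubs.
# In a stub file, every body is just `...` -- types only, no logic.
from typing import Optional

class SparkConf:
    def __init__(self, loadDefaults: bool = ...) -> None: ...
    def set(self, key: str, value: str) -> "SparkConf": ...
    def get(self, key: str, defaultValue: Optional[str] = ...) -> Optional[str]: ...
```

At runtime such a file does nothing useful; a type checker like mypy reads it alongside the implementation to validate caller code.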
[spark] branch master updated (b3f0087 -> 0bc0e91)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from b3f0087  [SPARK-32977][SQL][DOCS] Fix JavaDoc on Default Save Mode
 add 0bc0e91  [SPARK-32971][K8S][FOLLOWUP] Add `.toSeq` for Scala 2.13 compilation

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/deploy/k8s/features/MountVolumesFeatureStep.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[spark] branch branch-3.0 updated: [SPARK-32977][SQL][DOCS] Fix JavaDoc on Default Save Mode
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 21b6b69  [SPARK-32977][SQL][DOCS] Fix JavaDoc on Default Save Mode
21b6b69 is described below

commit 21b6b6988e666b994839cb403e6409956c7f
Author: Russell Spitzer
AuthorDate: Wed Sep 23 20:02:20 2020 -0700

    [SPARK-32977][SQL][DOCS] Fix JavaDoc on Default Save Mode

    ### What changes were proposed in this pull request?
    The default is always `ErrorIfExists`, regardless of DataSource version. Fixing the JavaDoc to reflect this.

    ### Why are the changes needed?
    To fix documentation.

    ### Does this PR introduce _any_ user-facing change?
    Doc change.

    ### How was this patch tested?
    Manual.

    Closes #29853 from RussellSpitzer/SPARK-32977.

    Authored-by: Russell Spitzer
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit b3f0087e39c8ad69cf1e53145d62eb73df48efd5)
    Signed-off-by: Dongjoon Hyun
---
 sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala b/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
index f463166..d16404f 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
@@ -60,8 +60,7 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
    * `SaveMode.ErrorIfExists`: throw an exception at runtime.
    *
    *
-   * When writing to data source v1, the default option is `ErrorIfExists`. When writing to data
-   * source v2, the default option is `Append`.
+   * The default option is `ErrorIfExists`.
    *
    * @since 1.4.0
    */
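The corrected JavaDoc states that `DataFrameWriter`'s default save mode is `ErrorIfExists` for both data source v1 and v2. As a behavioral sketch in plain Python (standing in for Spark's writer, not actual Spark code), the save modes act roughly like this:

```python
# Behavioral sketch of the SaveMode semantics described in the JavaDoc;
# plain Python standing in for DataFrameWriter, not actual Spark code.
import os
import shutil

def save(data: str, path: str, mode: str = "errorifexists") -> None:
    exists = os.path.exists(path)
    if exists:
        if mode == "errorifexists":
            # The default: refuse to clobber existing output.
            raise FileExistsError(path)
        if mode == "ignore":
            return  # silently keep the existing contents
        if mode == "overwrite":
            shutil.rmtree(path)  # replace the existing output
    os.makedirs(path, exist_ok=True)
    # "append" (and any fresh write) adds a new part file
    part = os.path.join(path, "part-%05d" % len(os.listdir(path)))
    with open(part, "w") as f:
        f.write(data)
```

Calling `save(data, path)` twice with the default mode raises on the second call, which is exactly the behavior the fixed JavaDoc now documents for both data source versions.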
[spark] branch master updated (527cd3f -> b3f0087)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 527cd3f  [SPARK-32971][K8S] Support dynamic PVC creation/deletion for K8s executors
 add b3f0087  [SPARK-32977][SQL][DOCS] Fix JavaDoc on Default Save Mode

No new revisions were added by this update.

Summary of changes:
 sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
[spark] branch master updated (b3f0087 -> 0bc0e91)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from b3f0087 [SPARK-32977][SQL][DOCS] Fix JavaDoc on Default Save Mode add 0bc0e91 [SPARK-32971][K8S][FOLLOWUP] Add `.toSeq` for Scala 2.13 compilation No new revisions were added by this update. Summary of changes: .../org/apache/spark/deploy/k8s/features/MountVolumesFeatureStep.scala | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-32977][SQL][DOCS] Fix JavaDoc on Default Save Mode
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 21b6b69 [SPARK-32977][SQL][DOCS] Fix JavaDoc on Default Save Mode 21b6b69 is described below commit 21b6b6988e666b994839cb403e6409956c7f Author: Russell Spitzer AuthorDate: Wed Sep 23 20:02:20 2020 -0700 [SPARK-32977][SQL][DOCS] Fix JavaDoc on Default Save Mode ### What changes were proposed in this pull request? The default is always ErrorsOnExist regardless of DataSource version. Fixing the JavaDoc to reflect this. ### Why are the changes needed? To fix documentation ### Does this PR introduce _any_ user-facing change? Doc change. ### How was this patch tested? Manual. Closes #29853 from RussellSpitzer/SPARK-32977. Authored-by: Russell Spitzer Signed-off-by: Dongjoon Hyun (cherry picked from commit b3f0087e39c8ad69cf1e53145d62eb73df48efd5) Signed-off-by: Dongjoon Hyun --- sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala b/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala index f463166..d16404f 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala @@ -60,8 +60,7 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) { * `SaveMode.ErrorIfExists`: throw an exception at runtime. * * - * When writing to data source v1, the default option is `ErrorIfExists`. When writing to data - * source v2, the default option is `Append`. + * The default option is `ErrorIfExists`. * * @since 1.4.0 */ - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (527cd3f -> b3f0087)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 527cd3f [SPARK-32971][K8S] Support dynamic PVC creation/deletion for K8s executors add b3f0087 [SPARK-32977][SQL][DOCS] Fix JavaDoc on Default Save Mode No new revisions were added by this update. Summary of changes: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (27f6b5a -> 527cd3f)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 27f6b5a [SPARK-32937][SPARK-32980][K8S] Fix decom & launcher tests and add some comments to reduce chance of breakage add 527cd3f [SPARK-32971][K8S] Support dynamic PVC creation/deletion for K8s executors No new revisions were added by this update. Summary of changes: .../scala/org/apache/spark/deploy/k8s/Config.scala | 1 + ...iverSpec.scala => KubernetesExecutorSpec.scala} | 5 +-- .../apache/spark/deploy/k8s/KubernetesUtils.scala | 22 ++- .../spark/deploy/k8s/KubernetesVolumeSpec.scala| 5 ++- .../spark/deploy/k8s/KubernetesVolumeUtils.scala | 8 +++- .../k8s/features/MountVolumesFeatureStep.scala | 45 -- .../k8s/submit/KubernetesClientApplication.scala | 26 +++-- .../cluster/k8s/ExecutorPodsAllocator.scala| 32 --- .../cluster/k8s/KubernetesExecutorBuilder.scala| 14 ++- .../spark/deploy/k8s/KubernetesTestConf.scala | 7 +++- .../features/MountVolumesFeatureStepSuite.scala| 17 .../cluster/k8s/ExecutorPodsAllocatorSuite.scala | 6 +-- .../k8s/KubernetesExecutorBuilderSuite.scala | 2 +- 13 files changed, 146 insertions(+), 44 deletions(-) copy resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/{KubernetesDriverSpec.scala => KubernetesExecutorSpec.scala} (86%) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (3c97665 -> 27f6b5a)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 3c97665 [SPARK-32981][BUILD] Remove hive-1.2/hadoop-2.7 from Apache Spark 3.1 distribution add 27f6b5a [SPARK-32937][SPARK-32980][K8S] Fix decom & launcher tests and add some comments to reduce chance of breakage No new revisions were added by this update. Summary of changes: .../main/scala/org/apache/spark/scheduler/ExecutorLossReason.scala | 1 + .../spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala| 5 - .../spark/deploy/k8s/integrationtest/DecommissionSuite.scala | 3 ++- .../deploy/k8s/integrationtest/backend/minikube/Minikube.scala | 7 +-- 4 files changed, 12 insertions(+), 4 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (faeb71b -> 3c97665)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from faeb71b [SPARK-32950][SQL] Remove unnecessary big-endian code paths add 3c97665 [SPARK-32981][BUILD] Remove hive-1.2/hadoop-2.7 from Apache Spark 3.1 distribution No new revisions were added by this update. Summary of changes: dev/create-release/release-build.sh | 1 - 1 file changed, 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (383bb4a -> faeb71b)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 383bb4a [SPARK-32892][CORE][SQL] Fix hash functions on big-endian platforms add faeb71b [SPARK-32950][SQL] Remove unnecessary big-endian code paths No new revisions were added by this update. Summary of changes: .../execution/vectorized/OffHeapColumnVector.java | 24 -- .../execution/vectorized/OnHeapColumnVector.java | 22 2 files changed, 8 insertions(+), 38 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (432afac -> 383bb4a)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 432afac [SPARK-32907][ML] adaptively blockify instances - revert blockify gmm add 383bb4a [SPARK-32892][CORE][SQL] Fix hash functions on big-endian platforms No new revisions were added by this update. Summary of changes: .../apache/spark/util/sketch/Murmur3_x86_32.java | 10 ++- .../apache/spark/unsafe/hash/Murmur3_x86_32.java | 10 ++- .../spark/sql/catalyst/expressions/XXH64.java | 43 ++ .../spark/sql/catalyst/expressions/XXH64Suite.java | 91 +- 4 files changed, 98 insertions(+), 56 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-32306][SQL][DOCS][3.0] Clarify the result of `percentile_approx()`
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 542dc97 [SPARK-32306][SQL][DOCS][3.0] Clarify the result of `percentile_approx()` 542dc97 is described below commit 542dc97525860e67e3ddcd543cecc8654b19715d Author: Max Gekk AuthorDate: Wed Sep 23 20:15:52 2020 +0900 [SPARK-32306][SQL][DOCS][3.0] Clarify the result of `percentile_approx()` ### What changes were proposed in this pull request? More precise description of the result of the `percentile_approx()` function and its synonym `approx_percentile()`. The proposed sentence clarifies that the function returns **one of elements** (or array of elements) from the input column. ### Why are the changes needed? To improve Spark docs and avoid misunderstanding of the function behavior. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? `./dev/scalastyle` Authored-by: Max Gekk Signed-off-by: Liang-Chi Hsieh (cherry picked from commit 7c14f177eb5b52d491f41b217926cc8ca5f0ce4c) Signed-off-by: Max Gekk Closes #29845 from MaxGekk/doc-percentile_approx-3.0. 
Authored-by: Max Gekk Signed-off-by: HyukjinKwon --- .../expressions/aggregate/ApproximatePercentile.scala| 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala index 32f21fc..3327f4c 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala @@ -49,11 +49,13 @@ import org.apache.spark.sql.types._ */ @ExpressionDescription( usage = """ -_FUNC_(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric - column `col` at the given percentage. The value of percentage must be between 0.0 - and 1.0. The `accuracy` parameter (default: 1) is a positive numeric literal which - controls approximation accuracy at the cost of memory. Higher value of `accuracy` yields - better accuracy, `1.0/accuracy` is the relative error of the approximation. +_FUNC_(col, percentage [, accuracy]) - Returns the approximate `percentile` of the numeric + column `col` which is the smallest value in the ordered `col` values (sorted from least to + greatest) such that no more than `percentage` of `col` values is less than the value + or equal to that value. The value of percentage must be between 0.0 and 1.0. The `accuracy` + parameter (default: 1) is a positive numeric literal which controls approximation accuracy + at the cost of memory. Higher value of `accuracy` yields better accuracy, `1.0/accuracy` is + the relative error of the approximation. When `percentage` is an array, each value of the percentage array must be between 0.0 and 1.0. In this case, returns the approximate percentile array of column `col` at the given percentage array. 
- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
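The clarified wording above says the function returns one of the input elements: the value at 1-based rank `ceil(percentage * n)` in the sorted column, never an interpolated value. This can be checked outside Spark with a minimal Python sketch (a hypothetical helper, not Spark's QuantileSummaries-based implementation; on small inputs with a generous `accuracy`, `percentile_approx()` returns the same element as this exact computation).

```python
import math

def exact_percentile(values, percentage):
    """Exact counterpart of the documented percentile_approx() result:
    return the element of `values` whose 1-based rank in sorted order is
    ceil(percentage * n) -- always one of the input elements."""
    if not 0.0 <= percentage <= 1.0:
        raise ValueError("percentage must be between 0.0 and 1.0")
    ordered = sorted(values)
    rank = max(1, math.ceil(percentage * len(ordered)))  # 1-based rank
    return ordered[rank - 1]

# Mirrors the examples later in this thread: percentages (0.5, 0.4, 0.1)
# over the column (0), (1), (2), (10) select the elements [1, 1, 0].
print([exact_percentile([0, 1, 2, 10], p) for p in (0.5, 0.4, 0.1)])
```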
[spark] branch branch-2.4 updated (d204795 -> 1366443)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from d204795  [SPARK-32306][SQL][DOCS][2.4] Clarify the result of `percentile_approx()`
  add 1366443  [MINOR][SQL][2.4] Improve examples for `percentile_approx()`

No new revisions were added by this update.

Summary of changes:
 .../catalyst/expressions/aggregate/ApproximatePercentile.scala | 8
 1 file changed, 4 insertions(+), 4 deletions(-)
[spark] branch branch-3.0 updated: [MINOR][SQL][3.0] Improve examples for `percentile_approx()`
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 58124bd  [MINOR][SQL][3.0] Improve examples for `percentile_approx()`
58124bd is described below

commit 58124bd4e5ab2cfdfdc0a6b7c553c25678258c20
Author: Max Gekk
AuthorDate: Wed Sep 23 20:14:12 2020 +0900

    [MINOR][SQL][3.0] Improve examples for `percentile_approx()`

    ### What changes were proposed in this pull request?
    In the PR, I propose to replace the current examples for `percentile_approx()`, which use **only one** input value, with an example **with multiple values** in the input column.

    ### Why are the changes needed?
    The current examples are pretty trivial and don't demonstrate the function's behaviour on a sequence of values.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    - by running `ExpressionInfoSuite`
    - `./dev/scalastyle`

    Authored-by: Max Gekk
    Signed-off-by: HyukjinKwon
    (cherry picked from commit b53da23a28fe149cc75d593c5c36f7020a8a2752)
    Signed-off-by: Max Gekk

    Closes #29848 from MaxGekk/example-percentile_approx-3.0.
Authored-by: Max Gekk
Signed-off-by: HyukjinKwon
---
 .../catalyst/expressions/aggregate/ApproximatePercentile.scala | 8
 .../src/test/resources/sql-functions/sql-expression-schema.md  | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
index d06..32f21fc 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
@@ -60,10 +60,10 @@ import org.apache.spark.sql.types._
   """,
   examples = """
     Examples:
-      > SELECT _FUNC_(10.0, array(0.5, 0.4, 0.1), 100);
-       [10.0,10.0,10.0]
-      > SELECT _FUNC_(10.0, 0.5, 100);
-       10.0
+      > SELECT _FUNC_(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col);
+       [1,1,0]
+      > SELECT _FUNC_(col, 0.5, 100) FROM VALUES (0), (6), (7), (9), (10) AS tab(col);
+       7
   """,
   group = "agg_funcs",
   since = "2.1.0")

diff --git a/sql/core/src/test/resources/sql-functions/sql-expression-schema.md b/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
index 070a6f3..b84abe5 100644
--- a/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
+++ b/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
@@ -285,8 +285,8 @@
 | org.apache.spark.sql.catalyst.expressions.XxHash64 | xxhash64 | SELECT xxhash64('Spark', array(123), 2) | struct |
 | org.apache.spark.sql.catalyst.expressions.Year | year | SELECT year('2016-07-30') | struct |
 | org.apache.spark.sql.catalyst.expressions.ZipWith | zip_with | SELECT zip_with(array(1, 2, 3), array('a', 'b', 'c'), (x, y) -> (y, x)) | struct>> |
-| org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | approx_percentile | SELECT approx_percentile(10.0, array(0.5, 0.4, 0.1), 100) | struct> |
-| org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | percentile_approx | SELECT percentile_approx(10.0, array(0.5, 0.4, 0.1), 100) | struct> |
+| org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | approx_percentile | SELECT approx_percentile(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col) | struct> |
+| org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | percentile_approx | SELECT percentile_approx(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col) | struct> |
 | org.apache.spark.sql.catalyst.expressions.aggregate.Average | avg | SELECT avg(col) FROM VALUES (1), (2), (3) AS tab(col) | struct |
 | org.apache.spark.sql.catalyst.expressions.aggregate.Average | mean | SELECT mean(col) FROM VALUES (1), (2), (3) AS tab(col) | struct |
 | org.apache.spark.sql.catalyst.expressions.aggregate.BitAndAgg | bit_and | SELECT bit_and(col) FROM VALUES (3), (5) AS tab(col) | struct |
[spark] branch branch-2.4 updated (e1e94ed -> d204795)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from e1e94ed  [SPARK-32898][2.4][CORE] Fix wrong executorRunTime when task killed before real start
  add d204795  [SPARK-32306][SQL][DOCS][2.4] Clarify the result of `percentile_approx()`

No new revisions were added by this update.

Summary of changes:
 .../expressions/aggregate/ApproximatePercentile.scala | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)
[spark] branch branch-2.4 updated: [MINOR][SQL][2.4] Improve examples for `percentile_approx()`
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 1366443  [MINOR][SQL][2.4] Improve examples for `percentile_approx()`
1366443 is described below

commit 13664434387e338a5029e73a4388943f34e3fc07
Author: Max Gekk
AuthorDate: Wed Sep 23 20:14:47 2020 +0900

    [MINOR][SQL][2.4] Improve examples for `percentile_approx()`

    ### What changes were proposed in this pull request?
    In the PR, I propose to replace the current examples for `percentile_approx()`, which use **only one** input value, with an example **with multiple values** in the input column.

    ### Why are the changes needed?
    The current examples are pretty trivial and don't demonstrate the function's behaviour on a sequence of values.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    `./dev/scalastyle`

    Authored-by: Max Gekk
    Signed-off-by: HyukjinKwon
    (cherry picked from commit b53da23a28fe149cc75d593c5c36f7020a8a2752)
    Signed-off-by: Max Gekk

    Closes #29849 from MaxGekk/example-percentile_approx-2.4.
Authored-by: Max Gekk
Signed-off-by: HyukjinKwon
---
 .../catalyst/expressions/aggregate/ApproximatePercentile.scala | 8
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
index 4ccde96..fd5d679 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
@@ -62,10 +62,10 @@ import org.apache.spark.sql.types._
   """,
   examples = """
     Examples:
-      > SELECT _FUNC_(10.0, array(0.5, 0.4, 0.1), 100);
-       [10.0,10.0,10.0]
-      > SELECT _FUNC_(10.0, 0.5, 100);
-       10.0
+      > SELECT _FUNC_(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col);
+       [1,1,0]
+      > SELECT _FUNC_(col, 0.5, 100) FROM VALUES (0), (6), (7), (9), (10) AS tab(col);
+       7
   """)
 case class ApproximatePercentile(
     child: Expression,
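The updated examples run against a tiny inline table (`FROM VALUES ... AS tab(col)`). The same rank-based selection can be reproduced in plain SQL without a Spark session; the sketch below uses SQLite from the Python standard library as a stand-in and computes the percentile exactly rather than approximately (an assumption that is harmless on inputs this small, where the approximate and exact answers coincide).

```python
import math
import sqlite3

def sql_percentile(values, percentage):
    """Rank-based percentile in plain SQL: order the column and pick the
    row at 1-based rank ceil(percentage * n). SQLite stands in for Spark
    SQL here; this computes the exact (not approximate) percentile."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tab(col INTEGER)")
    con.executemany("INSERT INTO tab VALUES (?)", [(v,) for v in values])
    (n,) = con.execute("SELECT COUNT(*) FROM tab").fetchone()
    offset = max(1, math.ceil(percentage * n)) - 1  # rows to skip
    (result,) = con.execute(
        "SELECT col FROM tab ORDER BY col LIMIT 1 OFFSET ?", (offset,)
    ).fetchone()
    con.close()
    return result

# Mirrors: SELECT percentile_approx(col, 0.5, 100)
#          FROM VALUES (0), (6), (7), (9), (10) AS tab(col)
print(sql_percentile([0, 6, 7, 9, 10], 0.5))  # -> 7
```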
[spark] branch branch-3.0 updated: [MINOR][SQL][3.0] Improve examples for `percentile_approx()`
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 58124bd  [MINOR][SQL][3.0] Improve examples for `percentile_approx()`
58124bd is described below

commit 58124bd4e5ab2cfdfdc0a6b7c553c25678258c20
Author: Max Gekk
AuthorDate: Wed Sep 23 20:14:12 2020 +0900

    [MINOR][SQL][3.0] Improve examples for `percentile_approx()`

    ### What changes were proposed in this pull request?
    In the PR, I propose to replace the current examples for `percentile_approx()`, which use **only one** input value, with examples that use **multiple values** in the input column.

    ### Why are the changes needed?
    The current examples are pretty trivial and don't demonstrate the function's behaviour on a sequence of values.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    - by running `ExpressionInfoSuite`
    - `./dev/scalastyle`

    Authored-by: Max Gekk
    Signed-off-by: HyukjinKwon
    (cherry picked from commit b53da23a28fe149cc75d593c5c36f7020a8a2752)
    Signed-off-by: Max Gekk

    Closes #29848 from MaxGekk/example-percentile_approx-3.0.

    Authored-by: Max Gekk
    Signed-off-by: HyukjinKwon
---
 .../catalyst/expressions/aggregate/ApproximatePercentile.scala | 8 ++++----
 .../src/test/resources/sql-functions/sql-expression-schema.md  | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
index d06..32f21fc 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
@@ -60,10 +60,10 @@ import org.apache.spark.sql.types._
     """,
   examples = """
     Examples:
-      > SELECT _FUNC_(10.0, array(0.5, 0.4, 0.1), 100);
-       [10.0,10.0,10.0]
-      > SELECT _FUNC_(10.0, 0.5, 100);
-       10.0
+      > SELECT _FUNC_(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col);
+       [1,1,0]
+      > SELECT _FUNC_(col, 0.5, 100) FROM VALUES (0), (6), (7), (9), (10) AS tab(col);
+       7
     """,
   group = "agg_funcs",
   since = "2.1.0")
diff --git a/sql/core/src/test/resources/sql-functions/sql-expression-schema.md b/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
index 070a6f3..b84abe5 100644
--- a/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
+++ b/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
@@ -285,8 +285,8 @@
 | org.apache.spark.sql.catalyst.expressions.XxHash64 | xxhash64 | SELECT xxhash64('Spark', array(123), 2) | struct |
 | org.apache.spark.sql.catalyst.expressions.Year | year | SELECT year('2016-07-30') | struct |
 | org.apache.spark.sql.catalyst.expressions.ZipWith | zip_with | SELECT zip_with(array(1, 2, 3), array('a', 'b', 'c'), (x, y) -> (y, x)) | struct>> |
-| org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | approx_percentile | SELECT approx_percentile(10.0, array(0.5, 0.4, 0.1), 100) | struct> |
-| org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | percentile_approx | SELECT percentile_approx(10.0, array(0.5, 0.4, 0.1), 100) | struct> |
+| org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | approx_percentile | SELECT approx_percentile(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col) | struct> |
+| org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | percentile_approx | SELECT percentile_approx(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col) | struct> |
 | org.apache.spark.sql.catalyst.expressions.aggregate.Average | avg | SELECT avg(col) FROM VALUES (1), (2), (3) AS tab(col) | struct |
 | org.apache.spark.sql.catalyst.expressions.aggregate.Average | mean | SELECT mean(col) FROM VALUES (1), (2), (3) AS tab(col) | struct |
 | org.apache.spark.sql.catalyst.expressions.aggregate.BitAndAgg | bit_and | SELECT bit_and(col) FROM VALUES (3), (5) AS tab(col) | struct |

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
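The outputs in the updated examples follow from the rank rule that `percentile_approx` estimates. A minimal sketch of the exact computation in plain Python (illustrative only, not Spark code; `exact_percentile` is a hypothetical helper name):

```python
import math

def exact_percentile(values, percentage):
    # Smallest value v in `values` such that at least
    # `percentage * n` of the n values are <= v (ties included).
    # This is the exact quantity that approx_percentile estimates.
    ordered = sorted(values)
    k = max(math.ceil(percentage * len(ordered)) - 1, 0)
    return ordered[k]

# Reproduces the outputs shown in the updated examples:
print([exact_percentile([0, 1, 2, 10], p) for p in (0.5, 0.4, 0.1)])  # [1, 1, 0]
print(exact_percentile([0, 6, 7, 9, 10], 0.5))                        # 7
```

With `accuracy` as high as 100 relative to these tiny inputs, the approximate answers in the examples coincide with this exact computation.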
[spark] branch branch-2.4 updated: [SPARK-32306][SQL][DOCS][2.4] Clarify the result of `percentile_approx()`
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new d204795  [SPARK-32306][SQL][DOCS][2.4] Clarify the result of `percentile_approx()`
d204795 is described below

commit d2047957ff16d322314af50f86d0d36ac7199bf6
Author: Max Gekk
AuthorDate: Wed Sep 23 20:13:33 2020 +0900

    [SPARK-32306][SQL][DOCS][2.4] Clarify the result of `percentile_approx()`

    ### What changes were proposed in this pull request?
    A more precise description of the result of the `percentile_approx()` function and its synonym `approx_percentile()`. The proposed sentence clarifies that the function returns **one of the elements** (or an array of elements) from the input column.

    ### Why are the changes needed?
    To improve the Spark docs and avoid misunderstanding of the function's behavior.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    `./dev/scalastyle`

    Authored-by: Max Gekk
    Signed-off-by: Liang-Chi Hsieh
    (cherry picked from commit 7c14f177eb5b52d491f41b217926cc8ca5f0ce4c)
    Signed-off-by: Max Gekk

    Closes #29847 from MaxGekk/doc-percentile_approx-2.4.

    Authored-by: Max Gekk
    Signed-off-by: HyukjinKwon
---
 .../expressions/aggregate/ApproximatePercentile.scala | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
index c790d87..4ccde96 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
@@ -49,11 +49,13 @@ import org.apache.spark.sql.types._
  */
 @ExpressionDescription(
   usage = """
-    _FUNC_(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric
-      column `col` at the given percentage. The value of percentage must be between 0.0
-      and 1.0. The `accuracy` parameter (default: 1) is a positive numeric literal which
-      controls approximation accuracy at the cost of memory. Higher value of `accuracy` yields
-      better accuracy, `1.0/accuracy` is the relative error of the approximation.
+    _FUNC_(col, percentage [, accuracy]) - Returns the approximate `percentile` of the numeric
+      column `col` which is the smallest value in the ordered `col` values (sorted from least to
+      greatest) such that no more than `percentage` of `col` values is less than the value
+      or equal to that value. The value of percentage must be between 0.0 and 1.0. The `accuracy`
+      parameter (default: 1) is a positive numeric literal which controls approximation accuracy
+      at the cost of memory. Higher value of `accuracy` yields better accuracy, `1.0/accuracy` is
+      the relative error of the approximation.
       When `percentage` is an array, each value of the percentage array must be between 0.0
       and 1.0. In this case, returns the approximate percentile array of column `col` at the
       given percentage array.
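The `1.0/accuracy` relative error described in the clarified wording above is a bound on *rank*, not on value: the returned element's rank among the inputs lies within `n/accuracy` of the target rank `percentage * n`. A small sketch of that check under this reading (`rank_within_error` is a hypothetical helper for illustration, not a Spark API):

```python
def rank_within_error(values, percentage, accuracy, answer):
    # True if `answer` is admissible under the documented guarantee:
    # its rank range among `values` (accounting for ties) overlaps
    # [target - eps*n, target + eps*n] where eps = 1.0/accuracy.
    n = len(values)
    eps = 1.0 / accuracy
    target = percentage * n
    lo = sum(1 for x in values if x < answer)    # rank before any ties
    hi = sum(1 for x in values if x <= answer)   # rank after all ties
    return lo - eps * n <= target <= hi + eps * n

# The answer from the docs' example satisfies the bound...
print(rank_within_error([0, 6, 7, 9, 10], 0.5, 100, 7))    # True
# ...while a far-off element does not:
print(rank_within_error([0, 6, 7, 9, 10], 0.5, 100, 10))   # False
```

This also makes the clarified sentence concrete: since ranks are defined only at actual elements, the function always returns one of the input elements (or an array of them), never an interpolated value.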
[spark] branch branch-3.0 updated: [SPARK-32306][SQL][DOCS][3.0] Clarify the result of `percentile_approx()`
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 542dc97  [SPARK-32306][SQL][DOCS][3.0] Clarify the result of `percentile_approx()`
542dc97 is described below

commit 542dc97525860e67e3ddcd543cecc8654b19715d
Author: Max Gekk
AuthorDate: Wed Sep 23 20:15:52 2020 +0900

    [SPARK-32306][SQL][DOCS][3.0] Clarify the result of `percentile_approx()`

    ### What changes were proposed in this pull request?
    A more precise description of the result of the `percentile_approx()` function and its synonym `approx_percentile()`. The proposed sentence clarifies that the function returns **one of the elements** (or an array of elements) from the input column.

    ### Why are the changes needed?
    To improve the Spark docs and avoid misunderstanding of the function's behavior.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    `./dev/scalastyle`

    Authored-by: Max Gekk
    Signed-off-by: Liang-Chi Hsieh
    (cherry picked from commit 7c14f177eb5b52d491f41b217926cc8ca5f0ce4c)
    Signed-off-by: Max Gekk

    Closes #29845 from MaxGekk/doc-percentile_approx-3.0.

    Authored-by: Max Gekk
    Signed-off-by: HyukjinKwon
---
 .../expressions/aggregate/ApproximatePercentile.scala | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
index 32f21fc..3327f4c 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
@@ -49,11 +49,13 @@ import org.apache.spark.sql.types._
  */
 @ExpressionDescription(
   usage = """
-    _FUNC_(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric
-      column `col` at the given percentage. The value of percentage must be between 0.0
-      and 1.0. The `accuracy` parameter (default: 1) is a positive numeric literal which
-      controls approximation accuracy at the cost of memory. Higher value of `accuracy` yields
-      better accuracy, `1.0/accuracy` is the relative error of the approximation.
+    _FUNC_(col, percentage [, accuracy]) - Returns the approximate `percentile` of the numeric
+      column `col` which is the smallest value in the ordered `col` values (sorted from least to
+      greatest) such that no more than `percentage` of `col` values is less than the value
+      or equal to that value. The value of percentage must be between 0.0 and 1.0. The `accuracy`
+      parameter (default: 1) is a positive numeric literal which controls approximation accuracy
+      at the cost of memory. Higher value of `accuracy` yields better accuracy, `1.0/accuracy` is
+      the relative error of the approximation.
       When `percentage` is an array, each value of the percentage array must be between 0.0
       and 1.0. In this case, returns the approximate percentile array of column `col` at the
       given percentage array.
[spark] branch branch-2.4 updated: [MINOR][SQL][2.4] Improve examples for `percentile_approx()`
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 1366443  [MINOR][SQL][2.4] Improve examples for `percentile_approx()`
1366443 is described below

commit 13664434387e338a5029e73a4388943f34e3fc07
Author: Max Gekk
AuthorDate: Wed Sep 23 20:14:47 2020 +0900

    [MINOR][SQL][2.4] Improve examples for `percentile_approx()`

    ### What changes were proposed in this pull request?
    In the PR, I propose to replace the current examples for `percentile_approx()`, which use **only one** input value, with examples that use **multiple values** in the input column.

    ### Why are the changes needed?
    The current examples are pretty trivial and don't demonstrate the function's behaviour on a sequence of values.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    `./dev/scalastyle`

    Authored-by: Max Gekk
    Signed-off-by: HyukjinKwon
    (cherry picked from commit b53da23a28fe149cc75d593c5c36f7020a8a2752)
    Signed-off-by: Max Gekk

    Closes #29849 from MaxGekk/example-percentile_approx-2.4.

    Authored-by: Max Gekk
    Signed-off-by: HyukjinKwon
---
 .../catalyst/expressions/aggregate/ApproximatePercentile.scala | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
index 4ccde96..fd5d679 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
@@ -62,10 +62,10 @@ import org.apache.spark.sql.types._
     """,
   examples = """
     Examples:
-      > SELECT _FUNC_(10.0, array(0.5, 0.4, 0.1), 100);
-       [10.0,10.0,10.0]
-      > SELECT _FUNC_(10.0, 0.5, 100);
-       10.0
+      > SELECT _FUNC_(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col);
+       [1,1,0]
+      > SELECT _FUNC_(col, 0.5, 100) FROM VALUES (0), (6), (7), (9), (10) AS tab(col);
+       7
     """)
 case class ApproximatePercentile(
     child: Expression,
[spark] branch master updated (21b7479 -> 432afac)
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 21b7479  [SPARK-32959][SQL][TEST] Fix an invalid test in DataSourceV2SQLSuite
     add 432afac  [SPARK-32907][ML] adaptively blockify instances - revert blockify gmm

No new revisions were added by this update.

Summary of changes:
 .../stat/distribution/MultivariateGaussian.scala   |  32 +--
 .../distribution/MultivariateGaussianSuite.scala   |  10 -
 .../spark/ml/clustering/GaussianMixture.scala      | 235 +
 .../spark/ml/clustering/GaussianMixtureSuite.scala |  11 -
 python/pyspark/ml/clustering.py                    |  26 +--
 5 files changed, 20 insertions(+), 294 deletions(-)