[spark] branch branch-2.4 updated (e52ae4e -> 6ac3659)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from e52ae4e  [SPARK-30450][INFRA][FOLLOWUP][2.4] Fix git folder regex for windows file separator
     add 6ac3659  [SPARK-30410][SQL][2.4] Calculating size of table with large number of partitions causes flooding logs

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/execution/command/CommandUtils.scala | 10 +++---
 .../spark/sql/execution/datasources/InMemoryFileIndex.scala | 6 +-
 2 files changed, 12 insertions(+), 4 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (5c71304 -> dcdc9a8)
gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 5c71304  [SPARK-30450][INFRA][FOLLOWUP] Fix git folder regex for windows file separator
     add dcdc9a8  [SPARK-28198][PYTHON][FOLLOW-UP] Run the tests of MAP ITER UDF in Jenkins

No new revisions were added by this update.

Summary of changes:
 dev/sparktestsupport/modules.py | 1 +
 1 file changed, 1 insertion(+)
[spark] branch master updated (c373123 -> 5c71304)
dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from c373123  [SPARK-30183][SQL] Disallow to specify reserved properties in CREATE/ALTER NAMESPACE syntax
     add 5c71304  [SPARK-30450][INFRA][FOLLOWUP] Fix git folder regex for windows file separator

No new revisions were added by this update.

Summary of changes:
 dev/lint-python | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[spark] branch master updated (92a0877 -> c373123)
wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 92a0877  [SPARK-30464][PYTHON][DOCS] Explicitly note that we don't add "pandas compatible" aliases
     add c373123  [SPARK-30183][SQL] Disallow to specify reserved properties in CREATE/ALTER NAMESPACE syntax

No new revisions were added by this update.

Summary of changes:
 docs/sql-migration-guide.md | 2 +
 .../spark/sql/catalyst/parser/AstBuilder.scala | 30 ++--
 .../org/apache/spark/sql/internal/SQLConf.scala | 9
 .../spark/sql/connector/DataSourceV2SQLSuite.scala | 55 ++
 4 files changed, 93 insertions(+), 3 deletions(-)
[spark] branch master updated (ee8d661 -> 92a0877)
gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from ee8d661  [SPARK-30434][PYTHON][SQL] Move pandas related functionalities into 'pandas' sub-package
     add 92a0877  [SPARK-30464][PYTHON][DOCS] Explicitly note that we don't add "pandas compatible" aliases

No new revisions were added by this update.

Summary of changes:
 python/pyspark/sql/dataframe.py | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)
[spark] branch master updated (18daa37 -> ee8d661)
gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 18daa37  [SPARK-30440][CORE][TESTS] Avoid race condition in TaskSetManagerSuite by not using resourceOffer
     add ee8d661  [SPARK-30434][PYTHON][SQL] Move pandas related functionalities into 'pandas' sub-package

No new revisions were added by this update.

Summary of changes:
 dev/sparktestsupport/modules.py | 7 +
 examples/src/main/python/sql/arrow.py | 2 +-
 python/docs/pyspark.sql.rst | 1 +
 python/pyspark/serializers.py | 242 -
 python/pyspark/sql/__init__.py | 4 +-
 python/pyspark/sql/dataframe.py | 224 +
 python/pyspark/sql/functions.py | 495 +--
 python/pyspark/sql/group.py | 68 +--
 .../sql/pandas/__init__.py} | 7 +-
 python/pyspark/sql/pandas/conversion.py | 431
 python/pyspark/sql/pandas/functions.py | 539 +
 .../sql/{cogroup.py => pandas/group_ops.py} | 86 +++-
 python/pyspark/sql/pandas/map_ops.py | 96
 python/pyspark/sql/pandas/serializers.py | 281 +++
 python/pyspark/sql/pandas/types.py | 284 +++
 python/pyspark/sql/pandas/utils.py | 60 +++
 python/pyspark/sql/session.py | 182 +--
 python/pyspark/sql/tests/test_arrow.py | 4 +-
 python/pyspark/sql/types.py | 261 --
 python/pyspark/sql/udf.py | 6 +-
 python/pyspark/sql/utils.py | 44 --
 python/pyspark/testing/sqlutils.py | 4 +-
 python/pyspark/worker.py | 6 +-
 python/setup.py | 2 +
 .../apache/spark/sql/IntegratedUDFTestUtils.scala | 4 +-
 25 files changed, 1822 insertions(+), 1518 deletions(-)
 copy python/{test_support/SimpleHTTPServer.py => pyspark/sql/pandas/__init__.py} (82%)
 create mode 100644 python/pyspark/sql/pandas/conversion.py
 create mode 100644 python/pyspark/sql/pandas/functions.py
 rename python/pyspark/sql/{cogroup.py => pandas/group_ops.py} (63%)
 create mode 100644 python/pyspark/sql/pandas/map_ops.py
 create mode 100644 python/pyspark/sql/pandas/serializers.py
 create mode 100644 python/pyspark/sql/pandas/types.py
 create mode 100644 python/pyspark/sql/pandas/utils.py
[spark] branch branch-2.4 updated: [SPARK-30450][INFRA][FOLLOWUP][2.4] Fix git folder regex for windows file separator
dongjoon pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new e52ae4e  [SPARK-30450][INFRA][FOLLOWUP][2.4] Fix git folder regex for windows file separator
e52ae4e is described below

commit e52ae4e4551284be7f6757ee7fe4296a7efe0cc6
Author: Eric Chang
AuthorDate: Wed Jan 8 16:38:19 2020 -0800

    [SPARK-30450][INFRA][FOLLOWUP][2.4] Fix git folder regex for windows file separator

    ### What changes were proposed in this pull request?
    The regex is to exclude the .git folder for the python linter, but bash escaping caused only one forward slash to be included. This adds the necessary second slash.

    ### Why are the changes needed?
    This is necessary to properly match the file separator character.

    ### Does this PR introduce any user-facing change?
    No.

    ### How was this patch tested?
    Manually. Added File dev/something.git.py and ran `dev/lint-python`

    ```
    dev/lint-python
    pycodestyle checks failed.
    *** Error compiling './dev/something.git.py'...
      File "./dev/something.git.py", line 1
        mport asdf2
        ^
    SyntaxError: invalid syntax
    ```

    Closes #27139 from ericfchang/SPARK-30450.

    Authored-by: Eric Chang
    Signed-off-by: Dongjoon Hyun
---
 dev/lint-python | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dev/lint-python b/dev/lint-python
index 9453f86..caffb73 100755
--- a/dev/lint-python
+++ b/dev/lint-python
@@ -30,7 +30,7 @@ SPHINX_REPORT_PATH="$SPARK_ROOT_DIR/dev/sphinx-report.txt"
 cd "$SPARK_ROOT_DIR"
 
 # compileall: https://docs.python.org/2/library/compileall.html
-python -B -m compileall -q -l -x "[/\\][.]git" $PATHS_TO_CHECK > "$PYCODESTYLE_REPORT_PATH"
+python -B -m compileall -q -l -x "[/][.]git" $PATHS_TO_CHECK > "$PYCODESTYLE_REPORT_PATH"
 compile_status="${PIPESTATUS[0]}"
 
 # Get pycodestyle at runtime so that we don't rely on it being installed on the build server.
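The escaping issue described in the commit above can be illustrated with a small Python sketch; it is not part of the commit. `compileall -x` skips a file when the regex is found anywhere in its path, which the hypothetical helper `is_excluded` imitates with `re.search`. Both patterns are assumptions, because the archived diff appears to have lost backslashes: `OLD_PATTERN` is what Python would receive from the double-quoted bash string `"[/\\][.]git"`, and `NEW_PATTERN` assumes the fix doubles the escape so that Python receives `[/\\][.]git`. The sample paths are made up for illustration.

```python
import re

# Assumption: inside bash double quotes, each "\\" reaches Python as a single "\".
OLD_PATTERN = r"[/\][.]git"   # a single class: '/', ']', '[' or '.', followed by "git"
NEW_PATTERN = r"[/\\][.]git"  # '/' or '\', then a literal '.', then "git"


def is_excluded(pattern, path):
    """Imitate compileall's -x option: a file is skipped if the regex is found in its path."""
    return re.search(pattern, path) is not None


# Hypothetical paths, chosen to cover both separators and a look-alike file name.
SAMPLE_PATHS = [
    "./.git/hooks/pre-commit.py",     # .git folder with a POSIX separator
    ".\\.git\\hooks\\pre-commit.py",  # .git folder with a Windows separator
    "./dev/something.git.py",         # ordinary file whose name merely contains ".git"
]


def report():
    """Map each sample path to (excluded by old pattern, excluded by new pattern)."""
    return {
        path: (is_excluded(OLD_PATTERN, path), is_excluded(NEW_PATTERN, path))
        for path in SAMPLE_PATHS
    }


if __name__ == "__main__":
    for path, (old, new) in report().items():
        print(f"{path!r}: old={old} new={new}")
```

Under these assumptions the old pattern also skips `./dev/something.git.py`, while the new one only skips paths inside a `.git` folder. That is consistent with the manual test in the commit message, where the linter reports the syntax error in `dev/something.git.py` once the file is no longer excluded.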
[spark] branch master updated (af2d3d0 -> 18daa37)
dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from af2d3d0  [SPARK-30315][SQL] Add adaptive execution context
     add 18daa37  [SPARK-30440][CORE][TESTS] Avoid race condition in TaskSetManagerSuite by not using resourceOffer

No new revisions were added by this update.

Summary of changes:
 .../spark/scheduler/TaskSetManagerSuite.scala | 32 ++
 1 file changed, 21 insertions(+), 11 deletions(-)
[spark] branch master updated (c49abf8 -> af2d3d0)
lixiao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from c49abf8  [SPARK-30417][CORE] Task speculation numTaskThreshold should be greater than 0 even EXECUTOR_CORES is not set under Standalone mode
     add af2d3d0  [SPARK-30315][SQL] Add adaptive execution context

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/execution/QueryExecution.scala | 4 +--
 .../execution/adaptive/AdaptiveSparkPlanExec.scala | 42 +++---
 .../adaptive/InsertAdaptiveSparkPlan.scala | 20 ---
 3 files changed, 38 insertions(+), 28 deletions(-)
[spark] branch master updated (0a72dba -> bd7510b)
vanzin pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 0a72dba  [SPARK-30445][CORE] Accelerator aware scheduling handle setting configs to 0
     add bd7510b  [SPARK-30281][SS] Consider partitioned/recursive option while verifying archive path on FileStreamSource

No new revisions were added by this update.

Summary of changes:
 docs/structured-streaming-programming-guide.md | 3 +-
 .../sql/execution/streaming/FileStreamSource.scala | 73 +-
 .../sql/streaming/FileStreamSourceSuite.scala | 26 ++--
 3 files changed, 80 insertions(+), 22 deletions(-)
[spark] branch master updated (a93b996 -> 0a72dba)
dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from a93b996  [MINOR][ML][INT] Array.fill(0) -> Array.ofDim; Array.empty -> Array.emptyIntArray
     add 0a72dba  [SPARK-30445][CORE] Accelerator aware scheduling handle setting configs to 0

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/resource/ResourceUtils.scala | 23 +++---
 .../scala/org/apache/spark/SparkConfSuite.scala | 13
 .../apache/spark/resource/ResourceUtilsSuite.scala | 21
 3 files changed, 50 insertions(+), 7 deletions(-)
[spark] branch master updated (b3c2d73 -> a93b996)
gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from b3c2d73  [MINOR][CORE] Process bar should print new line to avoid polluting logs
     add a93b996  [MINOR][ML][INT] Array.fill(0) -> Array.ofDim; Array.empty -> Array.emptyIntArray

No new revisions were added by this update.

Summary of changes:
 .../src/main/scala/org/apache/spark/api/python/PythonRunner.scala | 2 +-
 core/src/main/scala/org/apache/spark/util/SizeEstimator.scala | 2 +-
 .../org/apache/spark/ml/classification/LogisticRegression.scala | 6 +++---
 .../scala/org/apache/spark/ml/classification/NaiveBayes.scala | 2 +-
 .../main/scala/org/apache/spark/ml/classification/OneVsRest.scala | 4 ++--
 .../scala/org/apache/spark/ml/clustering/ClusteringSummary.scala | 2 +-
 .../main/scala/org/apache/spark/ml/feature/OneHotEncoder.scala | 2 +-
 mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala | 3 +--
 .../src/main/scala/org/apache/spark/ml/feature/VectorSlicer.scala | 2 +-
 .../org/apache/spark/ml/regression/DecisionTreeRegressor.scala | 2 --
 .../main/scala/org/apache/spark/ml/regression/FMRegressor.scala | 6 +++---
 .../main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala | 4 ++--
 .../src/main/scala/org/apache/spark/ml/tree/impl/TreePoint.scala | 2 +-
 mllib/src/main/scala/org/apache/spark/ml/tree/treeModels.scala | 2 +-
 .../main/scala/org/apache/spark/ml/tuning/CrossValidator.scala | 8
 .../scala/org/apache/spark/ml/tuning/TrainValidationSplit.scala | 4 ++--
 mllib/src/main/scala/org/apache/spark/ml/util/DatasetUtils.scala | 2 +-
 .../scala/org/apache/spark/mllib/classification/NaiveBayes.scala | 4 ++--
 mllib/src/main/scala/org/apache/spark/mllib/clustering/LDA.scala | 2 +-
 .../scala/org/apache/spark/mllib/clustering/LocalKMeans.scala | 2 +-
 mllib/src/main/scala/org/apache/spark/mllib/linalg/Matrices.scala | 4 ++--
 .../org/apache/spark/mllib/linalg/distributed/BlockMatrix.scala | 4 ++--
 .../main/scala/org/apache/spark/mllib/stat/test/ChiSqTest.scala | 2 +-
 .../scala/org/apache/spark/mllib/util/LinearDataGenerator.scala | 2 +-
 .../catalyst/expressions/aggregate/ApproximatePercentile.scala | 2 +-
 .../src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala | 2 +-
 .../org/apache/spark/sql/execution/python/PythonUDFRunner.scala | 2 +-
 .../org/apache/spark/sql/execution/benchmark/SortBenchmark.scala | 2 +-
 28 files changed, 40 insertions(+), 43 deletions(-)
[spark] branch master updated (fa36966 -> b3c2d73)
srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from fa36966  [SPARK-30410][SQL] Calculating size of table with large number of partitions causes flooding logs
     add b3c2d73  [MINOR][CORE] Process bar should print new line to avoid polluting logs

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/ui/ConsoleProgressBar.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[spark] branch master updated (047bff0 -> fa36966)
srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 047bff0  [SPARK-30215][SQL] Remove PrunedInMemoryFileIndex and merge its functionality into InMemoryFileIndex
     add fa36966  [SPARK-30410][SQL] Calculating size of table with large number of partitions causes flooding logs

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/execution/command/CommandUtils.scala | 10 +++---
 .../spark/sql/execution/datasources/InMemoryFileIndex.scala | 6 +-
 2 files changed, 12 insertions(+), 4 deletions(-)
[spark] branch master updated (b2ed6d0 -> 047bff0)
wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from b2ed6d0  [SPARK-30214][SQL][FOLLOWUP] Remove statement logical plans for namespace commands
     add 047bff0  [SPARK-30215][SQL] Remove PrunedInMemoryFileIndex and merge its functionality into InMemoryFileIndex

No new revisions were added by this update.

Summary of changes:
 .../execution/datasources/CatalogFileIndex.scala | 31 ++
 .../execution/datasources/InMemoryFileIndex.scala | 10 +--
 .../scala/org/apache/spark/sql/ExplainSuite.scala | 4 +--
 3 files changed, 18 insertions(+), 27 deletions(-)
[spark] branch master updated (0d589f4 -> b2ed6d0)
wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 0d589f4  [SPARK-30267][SQL][FOLLOWUP] Use while loop in Avro Array Deserializer
     add b2ed6d0  [SPARK-30214][SQL][FOLLOWUP] Remove statement logical plans for namespace commands

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/analysis/Analyzer.scala | 2 +-
 .../sql/catalyst/analysis/ResolveCatalogs.scala | 30
 .../spark/sql/catalyst/parser/AstBuilder.scala | 24 +-
 .../sql/catalyst/plans/logical/statements.scala | 55 --
 .../sql/catalyst/plans/logical/v2Commands.scala | 38 ++-
 .../spark/sql/catalyst/parser/DDLParserSuite.scala | 52 ++--
 .../catalyst/analysis/ResolveSessionCatalog.scala | 13 ++---
 .../datasources/v2/DataSourceV2Strategy.scala | 36 --
 8 files changed, 91 insertions(+), 159 deletions(-)