[spark] branch master updated (7a613ec -> 54b11fa)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 7a613ec  [SPARK-38100][SQL] Remove unused private method in `Decimal`
     add 54b11fa  [MINOR] Remove unnecessary null check for exception cause

No new revisions were added by this update.

Summary of changes:
 .../src/main/java/org/apache/spark/network/shuffle/ErrorHandler.java      | 4 ++--
 .../org/apache/spark/network/shuffle/RetryingBlockTransferor.java         | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[GitHub] [spark-website] khalidmammadov commented on pull request #378: Contribution guide to document actual guide for pull requests
khalidmammadov commented on pull request #378:
URL: https://github.com/apache/spark-website/pull/378#issuecomment-1030411355

   I think all done, anything left to do here?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
[spark-website] branch asf-site updated: Contribution guide to document actual guide for pull requests
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new 991df19  Contribution guide to document actual guide for pull requests
991df19 is described below

commit 991df1959e2381dfd32dadce39cbfa2be80ec0c6
Author: khalidmammadov
AuthorDate: Fri Feb 4 17:07:55 2022 -0600

    Contribution guide to document actual guide for pull requests

    Currently the contribution guide does not reflect the actual flow for raising a new PR,
    so it is not clear (for new contributors) what exactly needs to be done to open a PR
    against the Spark repository and test it as expected. This PR addresses that as follows:

    - It describes the actual procedure in the "Pull request" section of the Contributing
      page and takes a contributor through a step-by-step process.
    - It removes the optional "Running tests in your forked repository" section on the
      Developer Tools page, which is obsolete and no longer reflects reality: it says tests
      can be run by clicking the "Run workflow" button, which is no longer available because
      the workflow no longer uses the "workflow_dispatch" event trigger; that trigger was
      removed in https://github.com/apache/spark/pull/32092
    - Instead, it documents the new procedure that the above PR introduced: contributors need
      to use their own free GitHub Actions credits to test the changes they are proposing,
      and a Spark Actions workflow expects that to be completed before marking the PR as
      ready for review.
    - Some general wording was copied from the "Running tests in your forked repository"
      section on the Developer Tools page, but the main content was rewritten to meet the
      objective.
    - Also fixed the URL to developer-tools.html so that it is resolved by the parser (which
      converts it into a relative URI) instead of using a hard-coded absolute URL.
    Tested empirically with `bundle exec jekyll serve`; static files were generated with the
    `bundle exec jekyll build` command.

    This closes https://issues.apache.org/jira/browse/SPARK-37996

    Author: khalidmammadov

    Closes #378 from khalidmammadov/fix_contribution_workflow_guide.
---
 contributing.md                                    |  21 +++--
 developer-tools.md                                 |  17 -
 images/running-tests-using-github-actions.png      | Bin 312696 -> 0 bytes
 site/contributing.html                             |  18 +-
 site/developer-tools.html                          |  19 ---
 site/images/running-tests-using-github-actions.png | Bin 312696 -> 0 bytes
 6 files changed, 28 insertions(+), 47 deletions(-)

diff --git a/contributing.md b/contributing.md
index d5f0142..b127afe 100644
--- a/contributing.md
+++ b/contributing.md
@@ -322,9 +322,16 @@ Example: `Fix typos in Foo scaladoc`

 Pull request

+Before creating a pull request in Apache Spark, it is important to check whether tests pass on
+your branch, because our GitHub Actions workflows automatically run tests for your pull request
+and its following commits, and every run burdens the limited GitHub Actions resources of the
+Apache Spark repository. The steps below take you through the process.
+
 1. <a href="https://help.github.com/articles/fork-a-repo/">Fork</a> the GitHub repository at
    <a href="https://github.com/apache/spark">https://github.com/apache/spark</a> if you haven't already
-1. Clone your fork, create a new branch, push commits to the branch.
+1. Go to the "Actions" tab on your forked repository and enable the "Build and test" and
+   "Report test results" workflows
+1. Clone your fork and create a new branch
 1. Consider whether documentation or tests need to be added or updated as part of the change,
    and add them as needed.
 1. When you add tests, make sure the tests are self-descriptive.
@@ -355,14 +362,16 @@ and add them as needed.
     ...
     ```
 1. Consider whether benchmark results should be added or updated as part of the change, and add
    them as needed by
-   <a href="https://spark.apache.org/developer-tools.html#github-workflow-benchmarks">Running benchmarks in your forked repository</a>
+   <a href="developer-tools.html#github-workflow-benchmarks">Running benchmarks in your forked repository</a>
    to generate benchmark results.
 1. Run all tests with `./dev/run-tests` to verify that the code still compiles, passes tests, and
-   passes style checks. Alternatively you can run the tests via GitHub Actions workflow by
-   <a href="https://spark.apache.org/developer-tools.html#github-workflow-tests">Running tests in your forked repository</a>.
+   passes style checks. If style checks fail, review the Code Style Guide below.
+1. Push commits to your branch. This will trigger the "Build and test" and "Report test results"
+   workflows on your forked repository and start testing and validating your changes.
 1. <a href="https://help.github.com/articles/using-pull-requests/">Open a pull request</a> against
[GitHub] [spark-website] srowen closed pull request #378: Contribution guide to document actual guide for pull requests
srowen closed pull request #378:
URL: https://github.com/apache/spark-website/pull/378
[spark] branch master updated (54b11fa -> 973ea0f)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 54b11fa  [MINOR] Remove unnecessary null check for exception cause
     add 973ea0f  [SPARK-36837][BUILD] Upgrade Kafka to 3.1.0

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/kafka010/KafkaTestUtils.scala |  7 +++
 .../spark/streaming/kafka010/KafkaRDDSuite.scala   | 20 ++--
 .../spark/streaming/kafka010/KafkaTestUtils.scala  |  3 ++-
 pom.xml                                            |  2 +-
 4 files changed, 20 insertions(+), 12 deletions(-)
[spark] branch master updated: [SPARK-38082][PYTHON] Update minimum numpy version to 1.15
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 49f215a  [SPARK-38082][PYTHON] Update minimum numpy version to 1.15
49f215a is described below

commit 49f215a5ae64a50e889ae5cf94421cdeb0eacf09
Author: zero323
AuthorDate: Fri Feb 4 20:05:35 2022 -0800

    [SPARK-38082][PYTHON] Update minimum numpy version to 1.15

    ### What changes were proposed in this pull request?

    This PR changes the minimum required numpy version to 1.15. Additionally, it replaces
    calls to the deprecated `tostring` method.

    ### Why are the changes needed?

    The current lower bound is ancient and no longer supported by the rest of our
    dependencies. Additionally, supporting it requires the use of long-deprecated methods,
    creating unnecessary gaps in our type checker coverage.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Existing tests.

    Closes #35398 from zero323/SPARK-38082.
    Authored-by: zero323
    Signed-off-by: Dongjoon Hyun
---
 python/pyspark/ml/linalg/__init__.py    | 12 ++--
 python/pyspark/mllib/linalg/__init__.py | 14 +++---
 python/setup.py                         |  6 +++---
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/python/pyspark/ml/linalg/__init__.py b/python/pyspark/ml/linalg/__init__.py
index b361925..03e63e9 100644
--- a/python/pyspark/ml/linalg/__init__.py
+++ b/python/pyspark/ml/linalg/__init__.py
@@ -303,7 +303,7 @@ class DenseVector(Vector):
         self.array = ar

     def __reduce__(self):
-        return DenseVector, (self.array.tostring(),)
+        return DenseVector, (self.array.tobytes(),)

     def numNonzeros(self):
         """
@@ -591,7 +591,7 @@ class SparseVector(Vector):
         return np.linalg.norm(self.values, p)

     def __reduce__(self):
-        return (SparseVector, (self.size, self.indices.tostring(), self.values.tostring()))
+        return (SparseVector, (self.size, self.indices.tobytes(), self.values.tobytes()))

     def dot(self, other):
         """
@@ -949,7 +949,7 @@ class DenseMatrix(Matrix):
         return DenseMatrix, (
             self.numRows,
             self.numCols,
-            self.values.tostring(),
+            self.values.tobytes(),
             int(self.isTransposed),
         )
@@ -1160,9 +1160,9 @@ class SparseMatrix(Matrix):
         return SparseMatrix, (
             self.numRows,
             self.numCols,
-            self.colPtrs.tostring(),
-            self.rowIndices.tostring(),
-            self.values.tostring(),
+            self.colPtrs.tobytes(),
+            self.rowIndices.tobytes(),
+            self.values.tobytes(),
             int(self.isTransposed),
         )

diff --git a/python/pyspark/mllib/linalg/__init__.py b/python/pyspark/mllib/linalg/__init__.py
index 30fa84c..b9c391e 100644
--- a/python/pyspark/mllib/linalg/__init__.py
+++ b/python/pyspark/mllib/linalg/__init__.py
@@ -390,7 +390,7 @@ class DenseVector(Vector):
         return DenseVector(values)

     def __reduce__(self) -> Tuple[Type["DenseVector"], Tuple[bytes]]:
-        return DenseVector, (self.array.tostring(),)  # type: ignore[attr-defined]
+        return DenseVector, (self.array.tobytes(),)

     def numNonzeros(self) -> int:
         """
@@ -712,8 +712,8 @@ class SparseVector(Vector):
         return (
             SparseVector,
             (
                 self.size,
-                self.indices.tostring(),  # type: ignore[attr-defined]
-                self.values.tostring(),  # type: ignore[attr-defined]
+                self.indices.tobytes(),
+                self.values.tobytes(),
             ),
         )
@@ -1256,7 +1256,7 @@ class DenseMatrix(Matrix):
         return DenseMatrix, (
             self.numRows,
             self.numCols,
-            self.values.tostring(),  # type: ignore[attr-defined]
+            self.values.tobytes(),
             int(self.isTransposed),
         )
@@ -1489,9 +1489,9 @@ class SparseMatrix(Matrix):
         return SparseMatrix, (
             self.numRows,
             self.numCols,
-            self.colPtrs.tostring(),  # type: ignore[attr-defined]
-            self.rowIndices.tostring(),  # type: ignore[attr-defined]
-            self.values.tostring(),  # type: ignore[attr-defined]
+            self.colPtrs.tobytes(),
+            self.rowIndices.tobytes(),
+            self.values.tobytes(),
             int(self.isTransposed),
         )

diff --git a/python/setup.py b/python/setup.py
index 4ff495c..673b146 100755
--- a/python/setup.py
+++ b/python/setup.py
@@ -260,8 +260,8 @@ try:
         # if you're updating the versions or dependencies.
         install_requires=['py4j==0.10.9.3'],
         extras_require={
-            'ml': ['numpy>=1.7'],
-
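The `tostring` -> `tobytes` change above is a pure rename: `tostring()` was a long-deprecated alias that already returned raw bytes, not a string. A minimal sketch of the round-trip that the `__reduce__` methods rely on for pickling, using the stdlib `array` module as a stand-in so no NumPy install is assumed (`numpy.ndarray` went through the same rename):

```python
from array import array

# A buffer of doubles, standing in for DenseVector's backing ndarray.
values = array("d", [1.0, 2.0, 3.0])

# tobytes() returns the raw machine-format buffer (3 doubles = 24 bytes);
# the deprecated tostring() did exactly the same thing under a misleading name.
raw = values.tobytes()

# Round-trip from bytes, as unpickling a DenseVector-like object does.
restored = array("d")
restored.frombytes(raw)
assert restored == values
```

The same swap works unchanged on `numpy.ndarray`, whose `tostring()` alias was deprecated in NumPy 1.19 and later removed.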
[spark] branch master updated (49f215a -> 3e0d489)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 49f215a  [SPARK-38082][PYTHON] Update minimum numpy version to 1.15
     add 3e0d489  [SPARK-38073][PYTHON] Update atexit function to avoid issues with late binding

No new revisions were added by this update.

Summary of changes:
 python/pyspark/shell.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[spark] branch branch-3.2 updated: [SPARK-38073][PYTHON] Update atexit function to avoid issues with late binding
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.2 by this push:
     new 2e382c8  [SPARK-38073][PYTHON] Update atexit function to avoid issues with late binding
2e382c8 is described below

commit 2e382c8bff2d0c3733b9b525168254971ca1175e
Author: zero323
AuthorDate: Fri Feb 4 20:21:02 2022 -0800

    [SPARK-38073][PYTHON] Update atexit function to avoid issues with late binding

    ### What changes were proposed in this pull request?

    This PR updates the function registered with `atexit` in the PySpark shell to capture the
    `SparkContext` directly instead of depending on the surrounding context.

    **Note**

    A simpler approach

    ```python
    atexit.register(sc.stop)
    ```

    is possible, but won't work properly for contexts with monkey-patched `stop` methods
    (for example, [pyspark-asyncactions](https://github.com/zero323/pyspark-asyncactions)).

    I also considered using `_active_spark_context`

    ```python
    atexit.register(lambda: (
        SparkContext._active_spark_context.stop()
        if SparkContext._active_spark_context
        else None
    ))
    ```

    but `SparkContext` is also out of scope, so that doesn't work without introducing a
    standard function within the scope.

    ### Why are the changes needed?

    When using `ipython` as a driver with Python 3.8, `sc` goes out of scope before the
    `atexit` function is called, which leads to a `NameError` on exit. This is a mild
    annoyance and likely a bug in ipython (there are quite a few with similar behavior), but
    it is easy to address on our side without causing regressions for users of earlier
    Python versions.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manual testing to confirm that:

    - `NameError` is no longer thrown on exit with ipython and Python 3.8 or later.
    - `stop` is indeed invoked on exit with both the plain interpreter and ipython shells.
    Closes #35396 from zero323/SPARK-38073.

    Authored-by: zero323
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 3e0d4899dcb3be226a120cbeec8df78ff7fb00ba)
    Signed-off-by: Dongjoon Hyun
---
 python/pyspark/shell.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/pyspark/shell.py b/python/pyspark/shell.py
index 25aadb1..0c6a608 100644
--- a/python/pyspark/shell.py
+++ b/python/pyspark/shell.py
@@ -45,7 +45,7 @@ except Exception:
 sc = spark.sparkContext
 sql = spark.sql

-atexit.register(lambda: sc.stop())
+atexit.register((lambda sc: lambda: sc.stop())(sc))

 # for compatibility
 sqlContext = spark._wrapped
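The late-binding pitfall this patch fixes can be sketched without Spark at all. `FakeContext` below is a hypothetical stand-in for `SparkContext` (only `stop()` matters here); deleting the name `sc` simulates the shell namespace being torn down before `atexit` callbacks fire:

```python
class FakeContext:
    """Hypothetical stand-in for SparkContext; only stop() is relevant."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


sc = FakeContext()

# `lambda: sc.stop()` resolves the name `sc` only when the callback runs;
# if the namespace is gone by then (the ipython behavior), it raises NameError.
late_bound = lambda: sc.stop()

# The fix: an immediately-applied outer lambda binds the context object *now*,
# so the inner callback no longer depends on the name `sc` existing later.
early_bound = (lambda sc: lambda: sc.stop())(sc)

saved = sc
del sc  # simulate the namespace teardown that happens before atexit fires

early_bound()  # still works: the context was captured by value
assert saved.stopped

try:
    late_bound()
    raised = False
except NameError:
    raised = True  # `sc` no longer exists -- the failure seen at exit
assert raised
```

In the real patch the fixed callback is handed to `atexit.register`; the double-lambda also keeps working when `stop` has been monkey-patched on the instance, which `atexit.register(sc.stop)` would bypass.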