Release _branches_ are tested as commits arrive on the branch, yes. That's what you see at https://github.com/apache/spark/actions. Released versions are fixed; they don't change, and they were also manually tested before release, so no, they are not re-tested; there is no need.
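The branch-vs-tag distinction Sean is drawing can be seen with a throwaway repo (a sketch; the repo, file, and tag names here are hypothetical): a release tag keeps pointing at the release commit even as the branch advances past it.

```python
import os
import subprocess
import tempfile
from pathlib import Path

def git(*args, cwd):
    """Run a git command and return its stripped stdout."""
    out = subprocess.run(["git", *args], cwd=cwd, check=True,
                         capture_output=True, text=True)
    return out.stdout.strip()

repo = os.path.join(tempfile.mkdtemp(), "demo")
os.makedirs(repo)
git("init", "-q", cwd=repo)
git("config", "user.email", "dev@example.com", cwd=repo)
git("config", "user.name", "dev", cwd=repo)

# First commit plays the role of a release; tag it like a Spark release tag.
Path(repo, "f").write_text("v1")
git("add", "f", cwd=repo)
git("commit", "-qm", "release commit", cwd=repo)
git("tag", "v3.2.3", cwd=repo)

# The branch keeps moving after the release...
Path(repo, "f").write_text("v2")
git("commit", "-qam", "post-release commit", cwd=repo)

# ...but the tag still points at the fixed release snapshot.
print(git("log", "-1", "--format=%s", "v3.2.3", cwd=repo))  # release commit
print(git("log", "-1", "--format=%s", "HEAD", cwd=repo))    # post-release commit
```

This is why `git clone --branch branch-3.2` gives a moving head, while checking out the `v3.2.3` tag gives exactly the tested release source.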
You presumably have some local env issue, because the source of Spark 3.2.3 was passing CI/CD at the time of release, as well as the manual tests of the PMC.

On Wed, Jan 18, 2023 at 5:24 PM Adam Chhina <amanschh...@gmail.com> wrote:

> Hi Sean,
>
> That's fair with regard to 3.3.x being the current release branch. I'm not
> familiar with the testing schedule, but I had assumed all currently
> supported release versions would have some nightly/weekly tests run; is
> that not the case? I only ask because, when I saw these test failures, I
> assumed they were either known or unknown from some recurring testing
> pipeline.
>
> Also, unfortunately using v3.2.3 also had the same test failures.
>
> > git clone --branch v3.2.3 https://github.com/apache/spark.git
>
> I've posted the traceback below for one of the tests run. At the end it
> says to check the logs (`see logs`). However, I wasn't sure whether that
> just meant the traceback or some more detailed logs elsewhere? I wasn't
> able to see any files that looked relevant running `find . -name "*logs*"`
> afterwards. Sorry if I'm missing something obvious.
>
> ```
> test_broadcast_no_encryption (pyspark.tests.test_broadcast.BroadcastTest) ... ERROR
> test_broadcast_value_against_gc (pyspark.tests.test_broadcast.BroadcastTest) ... ERROR
> test_broadcast_value_driver_encryption (pyspark.tests.test_broadcast.BroadcastTest) ... ERROR
> test_broadcast_value_driver_no_encryption (pyspark.tests.test_broadcast.BroadcastTest) ... ERROR
> test_broadcast_with_encryption (pyspark.tests.test_broadcast.BroadcastTest) ... ERROR
>
> ======================================================================
> ERROR: test_broadcast_with_encryption (pyspark.tests.test_broadcast.BroadcastTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "$path/spark/python/pyspark/tests/test_broadcast.py", line 67, in test_broadcast_with_encryption
>     self._test_multiple_broadcasts(("spark.io.encryption.enabled", "true"))
>   File "$path/spark/python/pyspark/tests/test_broadcast.py", line 58, in _test_multiple_broadcasts
>     conf = SparkConf()
>   File "$path/spark/python/pyspark/conf.py", line 120, in __init__
>     self._jconf = _jvm.SparkConf(loadDefaults)
>   File "$path/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1709, in __getattr__
>     answer = self._gateway_client.send_command(
>   File "$path/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1036, in send_command
>     connection = self._get_connection()
>   File "$path/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/clientserver.py", line 284, in _get_connection
>     connection = self._create_new_connection()
>   File "$path/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/clientserver.py", line 291, in _create_new_connection
>     connection.connect_to_java_server()
>   File "$path/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/clientserver.py", line 438, in connect_to_java_server
>     self.socket.connect((self.java_address, self.java_port))
> ConnectionRefusedError: [Errno 61] Connection refused
>
> ----------------------------------------------------------------------
> Ran 7 tests in 12.950s
>
> FAILED (errors=7)
> sys:1: ResourceWarning: unclosed file <_io.BufferedWriter name=4>
>
> Had test failures in pyspark.tests.test_broadcast with /usr/local/bin/python3; see logs.
> ```
>
> Best,
>
> Adam Chhina
>
> On Jan 18, 2023, at 5:03 PM, Sean Owen <sro...@gmail.com> wrote:
>
> That isn't the released version either, but rather the head of the 3.2
> branch (which is beyond 3.2.3).
> You may want to check out the v3.2.3 tag instead:
> https://github.com/apache/spark/tree/v3.2.3
> ... instead of 3.2.1.
> But note, of course, that 3.3.x is the current release branch anyway.
>
> Hard to say what the error is without seeing more of the error log.
>
> That final warning is fine; it just means you are using Java 11+.
>
> On Wed, Jan 18, 2023 at 3:59 PM Adam Chhina <amanschh...@gmail.com> wrote:
>
>> Oh, whoops, didn't realize that wasn't the release version, thanks!
>>
>> > git clone --branch branch-3.2 https://github.com/apache/spark.git
>>
>> Ah, so the old failing tests are passing now, but I am seeing failures in
>> `pyspark.tests.test_broadcast`, such as `test_broadcast_value_against_gc`,
>> with a majority of them failing due to `ConnectionRefusedError: [Errno 61]
>> Connection refused`. Maybe these tests are not meant to be run locally,
>> and only in the pipeline?
>>
>> Also, I see this warning that mentions to notify the maintainers here:
>>
>> ```
>> Starting test(/usr/local/bin/python3): pyspark.tests.test_broadcast
>> WARNING: An illegal reflective access operation has occurred
>> WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform
>> (file:/$path/spark/common/unsafe/target/scala-2.12/classes/) to constructor
>> java.nio.DirectByteBuffer(long,int)
>> ```
>>
>> FWIW, not sure if this matters, but the python executable used for running
>> these tests is `Python 3.10.9` under `/usr/local/bin/python3`.
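The `ConnectionRefusedError` in the traceback above is py4j's Python client failing to reach the JVM gateway it expects Spark to have launched: if the Java side never starts listening, the plain TCP connect fails. A minimal sketch of the same failure, with no Spark involved (the port is chosen by the OS and assumed free after we close it):

```python
import socket

# Bind a listener to an OS-chosen free port, then close it so that nothing
# is listening there anymore.
probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
probe.bind(("127.0.0.1", 0))
port = probe.getsockname()[1]
probe.close()

# Connecting to a port with no listener raises ConnectionRefusedError --
# the same exception py4j surfaces when the Spark JVM gateway never came up.
result = "connected"
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    client.connect(("127.0.0.1", port))
except ConnectionRefusedError:
    result = "refused"
finally:
    client.close()
print(result)
```

So the broadcast tests themselves are likely fine; the question is why the JVM side of the gateway is not starting in this local environment.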
>>
>> Best,
>>
>> Adam Chhina
>>
>> On Jan 18, 2023, at 3:05 PM, Bjørn Jørgensen <bjornjorgen...@gmail.com> wrote:
>>
>> Replace
>> > > git clone g...@github.com:apache/spark.git
>> > > git checkout -b spark-321 v3.2.1
>> with
>> git clone --branch branch-3.2 https://github.com/apache/spark.git
>> This will give you branch-3.2 as of today, which is what I suppose you
>> call upstream: https://github.com/apache/spark/commits/branch-3.2
>> And right now all tests in GitHub Actions are passing :)
>>
>> On Wed, Jan 18, 2023 at 6:07 PM Sean Owen <sro...@gmail.com> wrote:
>>
>>> Never seen those, but it's probably a difference in pandas/numpy
>>> versions. You can see the current CI/CD test results in GitHub Actions.
>>> But you want to use release versions, not an RC. 3.2.1 is not the latest
>>> version, and it's possible the tests were actually failing in the RC.
>>>
>>> On Wed, Jan 18, 2023, 10:57 AM Adam Chhina <amanschh...@gmail.com> wrote:
>>>
>>>> Bump,
>>>>
>>>> Just trying to see where I can find which tests are known to be failing
>>>> for a particular release, to ensure I'm building upstream correctly
>>>> following the build docs. I figured this would be the best place to ask,
>>>> as it pertains to building and testing upstream (also more than happy to
>>>> provide a PR for any docs if required afterwards); however, if there is
>>>> a more appropriate place, please let me know.
>>>>
>>>> Best,
>>>>
>>>> Adam Chhina
>>>>
>>>> > On Dec 27, 2022, at 11:37 AM, Adam Chhina <amanschh...@gmail.com> wrote:
>>>> >
>>>> > As part of an upgrade I was looking to run upstream PySpark unit tests
>>>> > on `v3.2.1-rc2` before I applied some downstream patches and tested
>>>> > those. However, I'm running into some issues with failing unit tests,
>>>> > which I'm not sure are failing upstream or due to some step I missed
>>>> > in the build.
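Sean's point about pandas/numpy version differences is quick to check locally; a small sketch that prints the interpreter and library versions to compare against whatever the branch's GitHub Actions workflow installs:

```python
import sys

# Print the Python-side versions that most often cause local PySpark test
# drift relative to CI; compare these against the CI workflow's pins.
print("python:", sys.version.split()[0])
for mod in ("pandas", "numpy"):
    try:
        m = __import__(mod)
        print(mod + ":", m.__version__)
    except ImportError:
        print(mod + ": not installed")
```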
>>>> >
>>>> > The current failing tests (at least so far, since I believe the
>>>> > python script exits on test failure):
>>>> > ```
>>>> > ======================================================================
>>>> > FAIL: test_train_prediction (pyspark.mllib.tests.test_streaming_algorithms.StreamingLinearRegressionWithTests)
>>>> > Test that error on test data improves as model is trained.
>>>> > ----------------------------------------------------------------------
>>>> > Traceback (most recent call last):
>>>> >   File "/Users/adam/OSS/spark/python/pyspark/mllib/tests/test_streaming_algorithms.py", line 474, in test_train_prediction
>>>> >     eventually(condition, timeout=180.0)
>>>> >   File "/Users/adam/OSS/spark/python/pyspark/testing/utils.py", line 86, in eventually
>>>> >     lastValue = condition()
>>>> >   File "/Users/adam/OSS/spark/python/pyspark/mllib/tests/test_streaming_algorithms.py", line 469, in condition
>>>> >     self.assertGreater(errors[1] - errors[-1], 2)
>>>> > AssertionError: 1.8960983527735014 not greater than 2
>>>> >
>>>> > ======================================================================
>>>> > FAIL: test_parameter_accuracy (pyspark.mllib.tests.test_streaming_algorithms.StreamingLogisticRegressionWithSGDTests)
>>>> > Test that the final value of weights is close to the desired value.
>>>> > ----------------------------------------------------------------------
>>>> > Traceback (most recent call last):
>>>> >   File "/Users/adam/OSS/spark/python/pyspark/mllib/tests/test_streaming_algorithms.py", line 229, in test_parameter_accuracy
>>>> >     eventually(condition, timeout=60.0, catch_assertions=True)
>>>> >   File "/Users/adam/OSS/spark/python/pyspark/testing/utils.py", line 91, in eventually
>>>> >     raise lastValue
>>>> >   File "/Users/adam/OSS/spark/python/pyspark/testing/utils.py", line 82, in eventually
>>>> >     lastValue = condition()
>>>> >   File "/Users/adam/OSS/spark/python/pyspark/mllib/tests/test_streaming_algorithms.py", line 226, in condition
>>>> >     self.assertAlmostEqual(rel, 0.1, 1)
>>>> > AssertionError: 0.23052813480829393 != 0.1 within 1 places (0.13052813480829392 difference)
>>>> >
>>>> > ======================================================================
>>>> > FAIL: test_training_and_prediction (pyspark.mllib.tests.test_streaming_algorithms.StreamingLogisticRegressionWithSGDTests)
>>>> > Test that the model improves on toy data with no. of batches
>>>> > ----------------------------------------------------------------------
>>>> > Traceback (most recent call last):
>>>> >   File "/Users/adam/OSS/spark/python/pyspark/mllib/tests/test_streaming_algorithms.py", line 334, in test_training_and_prediction
>>>> >     eventually(condition, timeout=180.0)
>>>> >   File "/Users/adam/OSS/spark/python/pyspark/testing/utils.py", line 93, in eventually
>>>> >     raise AssertionError(
>>>> > AssertionError: Test failed due to timeout after 180 sec, with last
>>>> > condition returning: Latest errors: 0.67, 0.71, 0.78, 0.7, 0.75, 0.74,
>>>> > 0.73, 0.69, 0.62, 0.71, 0.69, 0.75, 0.72, 0.77, 0.71, 0.74, 0.76, 0.78,
>>>> > 0.7, 0.78, 0.8, 0.74, 0.77, 0.75, 0.76, 0.76, 0.75, 0.78, 0.74, 0.64,
>>>> > 0.64, 0.71, 0.78, 0.76, 0.64, 0.68, 0.69, 0.72, 0.77
>>>> >
>>>> > ----------------------------------------------------------------------
>>>> > Ran 13 tests in 661.536s
>>>> >
>>>> > FAILED (failures=3, skipped=1)
>>>> >
>>>> > Had test failures in pyspark.mllib.tests.test_streaming_algorithms with /usr/local/bin/python3; see logs.
>>>> > ```
>>>> >
>>>> > Here's how I'm currently building Spark; I was using the
>>>> > [building-spark](https://spark.apache.org/docs/3..1/building-spark.html)
>>>> > docs as a reference.
>>>> > ```
>>>> > > git clone g...@github.com:apache/spark.git
>>>> > > git checkout -b spark-321 v3.2.1
>>>> > > ./build/mvn -DskipTests clean package -Phive
>>>> > > export JAVA_HOME=$(path/to/jdk/11)
>>>> > > ./python/run-tests
>>>> > ```
>>>> >
>>>> > Current Java version:
>>>> > ```
>>>> > java -version
>>>> > openjdk version "11.0.17" 2022-10-18
>>>> > OpenJDK Runtime Environment Homebrew (build 11.0.17+0)
>>>> > OpenJDK 64-Bit Server VM Homebrew (build 11.0.17+0, mixed mode)
>>>> > ```
>>>> >
>>>> > Alternatively, I've also tried simply building Spark, creating a
>>>> > python=3.9 venv, installing the requirements with `pip install -r
>>>> > dev/requirements.txt`, and using that as the interpreter to run the
>>>> > tests. However, I was running into some failing pandas tests, which
>>>> > seemed to me to come from a pandas version difference, as
>>>> > `requirements.txt` didn't specify a version.
>>>> >
>>>> > I suppose I have a couple of questions in regards to this:
>>>> > 1. Am I missing a build step to build Spark and run PySpark unit tests?
>>>> > 2. Where could I find whether an upstream test is failing for a specific release?
>>>> > 3. Would it be possible to configure the `run-tests` script to run all tests regardless of test failures?
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>>
>>
>> --
>> Bjørn Jørgensen
>> Vestre Aspehaug 4, 6010 Ålesund
>> Norge
>>
>> +47 480 94 297
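All three streaming-mllib failures quoted earlier funnel through `eventually` in `pyspark/testing/utils.py`, which retries a flaky condition until it passes or a timeout expires. A rough sketch of that retry pattern (the parameter names match the calls in the tracebacks, but the body is illustrative, not Spark's actual code):

```python
import time

def eventually(condition, timeout=30.0, catch_assertions=False, interval=0.01):
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Sketch of the retry pattern seen in the tracebacks; on timeout it
    raises an AssertionError describing the last observed value, which is
    the "Test failed due to timeout after ... sec" message in the logs.
    """
    deadline = time.monotonic() + timeout
    last_value = None
    while time.monotonic() < deadline:
        if catch_assertions:
            # Capture assertion failures so the last one can be re-raised
            # at timeout instead of aborting the first retry.
            try:
                last_value = condition()
            except AssertionError as exc:
                last_value = exc
        else:
            last_value = condition()
        if last_value is True:
            return
        time.sleep(interval)
    if isinstance(last_value, AssertionError):
        raise last_value
    raise AssertionError(
        f"Test failed due to timeout after {timeout} sec, "
        f"with last condition returning: {last_value}")
```

Seen this way, the three failures are convergence conditions that never became true within the window, which is consistent with an environment-dependent (or simply flaky) numeric threshold rather than a broken build.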