[ https://issues.apache.org/jira/browse/SPARK-35834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-35834. ---------------------------------- Fix Version/s: 3.2.0 Assignee: Hyukjin Kwon Resolution: Fixed Fixed in https://github.com/apache/spark/pull/32989 > Use the same cleanup logic as Py4J in inheritable thread API > ------------------------------------------------------------ > > Key: SPARK-35834 > URL: https://issues.apache.org/jira/browse/SPARK-35834 > Project: Spark > Issue Type: Bug > Components: PySpark > Affects Versions: 3.2.0 > Reporter: Hyukjin Kwon > Assignee: Hyukjin Kwon > Priority: Major > Fix For: 3.2.0 > > > After > https://github.com/apache/spark/commit/6d309914df422d9f0c96edfd37924ecb8f29e3a9, > the test became flaky: > {code} > ====================================================================== > ERROR [71.813s]: test_save_load_pipeline_estimator > (pyspark.ml.tests.test_tuning.CrossValidatorTests) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/__w/spark/spark/python/pyspark/ml/tests/test_tuning.py", line 589, > in test_save_load_pipeline_estimator > self._run_test_save_load_pipeline_estimator(DummyLogisticRegression) > File "/__w/spark/spark/python/pyspark/ml/tests/test_tuning.py", line 572, > in _run_test_save_load_pipeline_estimator > cvModel2 = crossval2.fit(training) > File "/__w/spark/spark/python/pyspark/ml/base.py", line 161, in fit > return self._fit(dataset) > File "/__w/spark/spark/python/pyspark/ml/tuning.py", line 747, in _fit > bestModel = est.fit(dataset, epm[bestIndex]) > File "/__w/spark/spark/python/pyspark/ml/base.py", line 159, in fit > return self.copy(params)._fit(dataset) > File "/__w/spark/spark/python/pyspark/ml/pipeline.py", line 114, in _fit > model = stage.fit(dataset) > File "/__w/spark/spark/python/pyspark/ml/base.py", line 161, in fit > return self._fit(dataset) > File "/__w/spark/spark/python/pyspark/ml/pipeline.py", line 114, in _fit > model = stage.fit(dataset) > File "/__w/spark/spark/python/pyspark/ml/base.py", line 161, in fit > return self._fit(dataset) > File "/__w/spark/spark/python/pyspark/ml/classification.py", line 2924, in > _fit > models = pool.map(inheritable_thread_target(trainSingleClass), > range(numClasses)) > File "/__t/Python/3.6.13/x64/lib/python3.6/multiprocessing/pool.py", line > 266, in map > return self._map_async(func, iterable, mapstar, chunksize).get() > File "/__t/Python/3.6.13/x64/lib/python3.6/multiprocessing/pool.py", line > 644, in get > raise self._value > File "/__t/Python/3.6.13/x64/lib/python3.6/multiprocessing/pool.py", line > 119, in worker > result = (True, func(*args, **kwds)) > File "/__t/Python/3.6.13/x64/lib/python3.6/multiprocessing/pool.py", line > 44, in mapstar > return list(map(*args)) > File "/__w/spark/spark/python/pyspark/util.py", line 324, in wrapped > InheritableThread._clean_py4j_conn_for_current_thread() > File "/__w/spark/spark/python/pyspark/util.py", line 389, in > _clean_py4j_conn_for_current_thread > del connections[i] > IndexError: deque index out of range > ---------------------------------------------------------------------- > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org