Sean Owen created SPARK-28877:
---------------------------------

             Summary: Investigate/fix JAXB failure running Pyspark tests on JDK 11
                 Key: SPARK-28877
                 URL: https://issues.apache.org/jira/browse/SPARK-28877
             Project: Spark
          Issue Type: Sub-task
          Components: Build, PySpark
    Affects Versions: 3.0.0
            Reporter: Sean Owen

It looks like we might have a test failure in Pyspark with JDK 11:

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109686/console

{code}
======================================================================
ERROR: test_linear_regression_pmml_basic (pyspark.ml.tests.test_persistence.PersistenceTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/ml/tests/test_persistence.py", line 69, in test_linear_regression_pmml_basic
    model.write().format("pmml").save(lr_path)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/ml/util.py", line 175, in save
    self._jwrite.save(path)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1286, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/utils.py", line 89, in deco
    return f(*a, **kw)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.10.8.1-src.zip/py4j/protocol.py", line 328, in get_return_value
    format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling o529.save.
: javax.xml.bind.JAXBException
 - with linked exception:
[java.lang.ClassNotFoundException: com.sun.xml.internal.bind.v2.ContextFactory]
	at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:241)
	at javax.xml.bind.ContextFinder.find(ContextFinder.java:477)
	at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:656)
	at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:599)
	at org.jpmml.model.JAXBUtil.getContext(JAXBUtil.java:103)
	at org.jpmml.model.JAXBUtil.createMarshaller(JAXBUtil.java:132)
	at org.jpmml.model.JAXBUtil.marshal(JAXBUtil.java:77)
	at org.jpmml.model.JAXBUtil.marshalPMML(JAXBUtil.java:67)
	at org.apache.spark.mllib.pmml.PMMLExportable.toPMML(PMMLExportable.scala:44)
	at org.apache.spark.mllib.pmml.PMMLExportable.toPMML(PMMLExportable.scala:78)
...
{code}

The error is typical of other JDK 11-related incompatibilities: the built-in JAXB implementation from Sun was deprecated in Java 9 and removed in Java 11. It appears that something on the classpath is trying to load the 'old' JAXB implementation.

It's curious because the JVM-based tests appear to pass. This suggests the problem may be more about how the Pyspark test classpath is constructed, and perhaps there is an old dependency or something selecting this implementation via a META-INF/MANIFEST.MF entry.

It's also curious because we seemed to observe Pyspark tests passing with JDK 11 during earlier testing. This is likely to be more related to how Pyspark tests are run, but it still needs a reproduction and an answer.
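
As a possible starting point for the reproduction, here is a minimal sketch that scans a directory of jars for entries that ship or select a JAXB implementation. The helper name find_jaxb_jars and the list of marker entries are assumptions made for illustration (taken from the class named in the stack trace and the usual JAXB service-file convention); they are not part of Spark or its test scripts, and the list may need extending, for example to cover jaxb.properties files.

```python
import zipfile
from pathlib import Path

# Jar entries that suggest a jar ships or selects a JAXB implementation.
# This list is an assumption for the sketch, not an exhaustive catalogue.
JAXB_MARKERS = (
    "META-INF/services/javax.xml.bind.JAXBContext",
    "com/sun/xml/bind/v2/ContextFactory.class",
    "com/sun/xml/internal/bind/v2/ContextFactory.class",
)


def find_jaxb_jars(jars_dir):
    """Return {jar file name: [marker entries found]} for *.jar in jars_dir."""
    hits = {}
    for jar in sorted(Path(jars_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                names = set(zf.namelist())
        except zipfile.BadZipFile:
            # Skip files that are not readable zip archives.
            continue
        found = [marker for marker in JAXB_MARKERS if marker in names]
        if found:
            hits[jar.name] = found
    return hits
```

Running find_jaxb_jars against the jar directory used by the JVM tests and against the one used by the Pyspark tests, then comparing the two results, might show whether a JAXB runtime is present on one classpath but missing from the other.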