Sean Owen created SPARK-28877:
---------------------------------

             Summary: Investigate/fix JAXB failure running Pyspark tests on JDK 11
                 Key: SPARK-28877
                 URL: https://issues.apache.org/jira/browse/SPARK-28877
             Project: Spark
          Issue Type: Sub-task
          Components: Build, PySpark
    Affects Versions: 3.0.0
            Reporter: Sean Owen


It looks like we might have a test failure in PySpark with JDK 11:

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109686/console

{code}
======================================================================
ERROR: test_linear_regression_pmml_basic (pyspark.ml.tests.test_persistence.PersistenceTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/ml/tests/test_persistence.py", line 69, in test_linear_regression_pmml_basic
    model.write().format("pmml").save(lr_path)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/ml/util.py", line 175, in save
    self._jwrite.save(path)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1286, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/utils.py", line 89, in deco
    return f(*a, **kw)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.10.8.1-src.zip/py4j/protocol.py", line 328, in get_return_value
    format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling o529.save.
: javax.xml.bind.JAXBException
 - with linked exception:
[java.lang.ClassNotFoundException: com.sun.xml.internal.bind.v2.ContextFactory]
        at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:241)
        at javax.xml.bind.ContextFinder.find(ContextFinder.java:477)
        at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:656)
        at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:599)
        at org.jpmml.model.JAXBUtil.getContext(JAXBUtil.java:103)
        at org.jpmml.model.JAXBUtil.createMarshaller(JAXBUtil.java:132)
        at org.jpmml.model.JAXBUtil.marshal(JAXBUtil.java:77)
        at org.jpmml.model.JAXBUtil.marshalPMML(JAXBUtil.java:67)
        at org.apache.spark.mllib.pmml.PMMLExportable.toPMML(PMMLExportable.scala:44)
        at org.apache.spark.mllib.pmml.PMMLExportable.toPMML(PMMLExportable.scala:78)
...
{code}

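The error is typical of other JDK 11-related incompatibilities: the JAXB 
module (java.xml.bind) was deprecated in Java 9 and removed in Java 11, so the 
built-in JAXB implementation from Sun is no longer available. It appears that 
something on the classpath is still trying to load the 'old' JDK-internal 
implementation (com.sun.xml.internal.bind.v2.ContextFactory).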

It's curious because the JVM-based tests appear to pass. This suggests the 
problem is in how the PySpark test classpath is constructed; perhaps an old 
dependency, or something selecting this implementation via a 
META-INF/MANIFEST.MF entry, is involved.
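
One way to test the classpath hypothesis is to scan the jars the tests 
actually load for anything that provides or selects a JAXB implementation. The 
following is a minimal sketch, not part of Spark: the function name and the 
marker list are assumptions chosen for illustration. It looks for the JAXB 2.x 
service-provider file and for the JDK-internal and standalone ContextFactory 
classes.

```python
import zipfile
from pathlib import Path

# Jar entries that suggest a JAXB provider or a provider selection.
# This list is an assumption for illustration and may not be exhaustive.
JAXB_MARKERS = (
    "META-INF/services/javax.xml.bind.JAXBContext",       # service-provider file
    "com/sun/xml/internal/bind/v2/ContextFactory.class",  # JDK-internal impl
    "com/sun/xml/bind/v2/ContextFactory.class",           # standalone impl
)


def find_jaxb_markers(jar_dir):
    """Map each jar file name under jar_dir to the JAXB markers it contains."""
    hits = {}
    for jar in sorted(Path(jar_dir).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                names = set(zf.namelist())
        except zipfile.BadZipFile:
            continue  # skip files that are not valid zip archives
        found = [marker for marker in JAXB_MARKERS if marker in names]
        if found:
            hits[jar.name] = found
    return hits
```

Calling find_jaxb_markers() on the directory that holds the assembled jars 
returns a dict of jar names to matching entries. An empty result would suggest 
that no JAXB implementation is on that classpath at all, rather than a stale 
one being selected.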

It's also curious because we seemed to observe PySpark tests passing with JDK 
11 during earlier testing. This is likely related to how the PySpark tests are 
run, but it still needs a reproduction and an answer.
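
Since the JVM-based tests pass and the PySpark test does not, comparing the 
two classpaths may narrow things down. This is a minimal sketch under the 
assumption that both classpaths have been captured as strings (for example 
from the JVM's java.class.path system property in each run); the helper names 
are hypothetical.

```python
import os


def jar_names(classpath, sep=os.pathsep):
    """Return the set of jar file names found on a classpath string."""
    return {
        os.path.basename(entry.strip())
        for entry in classpath.split(sep)
        if entry.strip().endswith(".jar")
    }


def diff_classpaths(jvm_cp, pyspark_cp, sep=os.pathsep):
    """Return (only_in_jvm, only_in_pyspark) as sorted lists of jar names."""
    jvm = jar_names(jvm_cp, sep)
    pyspark = jar_names(pyspark_cp, sep)
    return sorted(jvm - pyspark), sorted(pyspark - jvm)
```

A JAXB-related jar that appears only on the JVM side, or an older dependency 
that appears only on the PySpark side, would be a natural first suspect.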



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
