[ 
https://issues.apache.org/jira/browse/SPARK-37913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alana Young updated SPARK-37913:
--------------------------------
    Description: 
I am trying to create and persist an ML pipeline model using a custom Spark 
transformer that I created based on the [Unary Transformer 
example|https://github.com/apache/spark/blob/v3.1.2/examples/src/main/scala/org/apache/spark/examples/ml/UnaryTransformerExample.scala]
 provided by Spark. I am able to save and load the transformer on its own. 
However, when I include the custom transformer as a stage in a pipeline model, 
I can save the model but am unable to load it. Here is the stack trace of the 
exception:

 
{code:java}
01-14-2022 03:49:52 PM ERROR Instrumentation: java.lang.NullPointerException
    at java.base/java.lang.reflect.Method.invoke(Method.java:559)
    at org.apache.spark.ml.util.DefaultParamsReader$.loadParamsInstanceReader(ReadWrite.scala:631)
    at org.apache.spark.ml.Pipeline$SharedReadWrite$.$anonfun$load$4(Pipeline.scala:276)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
    at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
    at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
    at scala.collection.TraversableLike.map(TraversableLike.scala:238)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:198)
    at org.apache.spark.ml.Pipeline$SharedReadWrite$.$anonfun$load$3(Pipeline.scala:274)
    at org.apache.spark.ml.util.Instrumentation$.$anonfun$instrumented$1(Instrumentation.scala:191)
    at scala.util.Try$.apply(Try.scala:213)
    at org.apache.spark.ml.util.Instrumentation$.instrumented(Instrumentation.scala:191)
    at org.apache.spark.ml.Pipeline$SharedReadWrite$.load(Pipeline.scala:268)
    at org.apache.spark.ml.PipelineModel$PipelineModelReader.$anonfun$load$7(Pipeline.scala:356)
    at org.apache.spark.ml.MLEvents.withLoadInstanceEvent(events.scala:160)
    at org.apache.spark.ml.MLEvents.withLoadInstanceEvent$(events.scala:155)
    at org.apache.spark.ml.util.Instrumentation.withLoadInstanceEvent(Instrumentation.scala:42)
    at org.apache.spark.ml.PipelineModel$PipelineModelReader.$anonfun$load$6(Pipeline.scala:355)
    at org.apache.spark.ml.util.Instrumentation$.$anonfun$instrumented$1(Instrumentation.scala:191)
    at scala.util.Try$.apply(Try.scala:213)
    at org.apache.spark.ml.util.Instrumentation$.instrumented(Instrumentation.scala:191)
    at org.apache.spark.ml.PipelineModel$PipelineModelReader.load(Pipeline.scala:355)
    at org.apache.spark.ml.PipelineModel$PipelineModelReader.load(Pipeline.scala:349)
    at org.apache.spark.ml.util.MLReadable.load(ReadWrite.scala:355)
    at org.apache.spark.ml.util.MLReadable.load$(ReadWrite.scala:355)
    at org.apache.spark.ml.PipelineModel$.load(Pipeline.scala:337)
    at com.dtech.scala.pipeline.PipelineProcess.process(PipelineProcess.scala:122)
    at com.dtech.scala.pipeline.PipelineProcess$.main(PipelineProcess.scala:448)
    at com.dtech.scala.pipeline.PipelineProcess.main(PipelineProcess.scala)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:65)
    at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala){code}
 

*Source Code*

[Unary 
Transformer|https://gist.github.com/ally1221/ff10ec50f7ef98fb6cd365172195bfd5]

[Persist Unary Transformer & Pipeline 
Model|https://gist.github.com/ally1221/42473cdc818a8cf795ac78d65d48ee14]
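
For context, the general shape of a persistable custom unary transformer is 
sketched below. This is a minimal illustration of the pattern from the Unary 
Transformer example, not the actual code from the gists above; the class and 
UID names are made up. The load path resolves each stage's reader by 
reflection via the class name recorded in the stage metadata, which is why the 
class must be top-level and must have a companion object extending 
DefaultParamsReadable:

{code:java}
import org.apache.spark.ml.UnaryTransformer
import org.apache.spark.ml.util.{DefaultParamsReadable, DefaultParamsWritable, Identifiable}
import org.apache.spark.sql.types.{DataType, DoubleType}

// Illustrative transformer: adds 1.0 to a Double column.
// DefaultParamsWritable provides save support for the stage.
class MyUnaryTransformer(override val uid: String)
  extends UnaryTransformer[Double, Double, MyUnaryTransformer]
  with DefaultParamsWritable {

  def this() = this(Identifiable.randomUID("myUnary"))

  override protected def createTransformFunc: Double => Double = _ + 1.0

  override protected def outputDataType: DataType = DoubleType
}

// The companion object extending DefaultParamsReadable is what
// DefaultParamsReader.loadParamsInstanceReader looks up reflectively
// when PipelineModel.load deserializes this stage.
object MyUnaryTransformer extends DefaultParamsReadable[MyUnaryTransformer] {
  override def load(path: String): MyUnaryTransformer = super.load(path)
}{code}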


> Null Pointer Exception when Loading ML Pipeline Model with Custom Transformer
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-37913
>                 URL: https://issues.apache.org/jira/browse/SPARK-37913
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.1.2
>         Environment: Spark 3.1.2, Scala 2.12, Java 11
>            Reporter: Alana Young
>            Priority: Critical
>              Labels: MLPipelineModels, MLPipelines
>



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
