[ https://issues.apache.org/jira/browse/SPARK-40582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610161#comment-17610161 ]

Yang Jie commented on SPARK-40582:
----------------------------------

Do you use Scala 2.13? [~garretwilson]

> NullPointerException: Cannot invoke invalidateSerializedMapOutputStatusCache() because "shuffleStatus" is null
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-40582
>                 URL: https://issues.apache.org/jira/browse/SPARK-40582
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.3.0
>            Reporter: Garret Wilson
>            Priority: Critical
>
> I'm running a simple little Spark 3.3.0 pipeline on Windows 10 using Java 17
> and UDFs. I hardly do anything interesting, and now when I run the pipeline
> on only 30,000 records I'm getting this:
> {noformat}
> [ERROR] Error in removing shuffle 2
> java.lang.NullPointerException: Cannot invoke "org.apache.spark.ShuffleStatus.invalidateSerializedMapOutputStatusCache()" because "shuffleStatus" is null
>         at org.apache.spark.MapOutputTrackerMaster.$anonfun$unregisterShuffle$1(MapOutputTracker.scala:882)
>         at org.apache.spark.MapOutputTrackerMaster.$anonfun$unregisterShuffle$1$adapted(MapOutputTracker.scala:881)
>         at scala.Option.foreach(Option.scala:437)
>         at org.apache.spark.MapOutputTrackerMaster.unregisterShuffle(MapOutputTracker.scala:881)
>         at org.apache.spark.storage.BlockManagerStorageEndpoint$$anonfun$receiveAndReply$1.$anonfun$applyOrElse$3(BlockManagerStorageEndpoint.scala:59)
>         at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.scala:17)
>         at org.apache.spark.storage.BlockManagerStorageEndpoint.$anonfun$doAsync$1(BlockManagerStorageEndpoint.scala:89)
>         at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:678)
>         at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:467)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>         at java.base/java.lang.Thread.run(Thread.java:833)
> {noformat}
> I searched and couldn't find any of the principal terms in the error message.
> Disconcerting that Spark is breaking at what seems to be a fundamental part
> of processing, and with a {{NullPointerException}} at that.
> I have already asked this question on [Stack
> Overflow|https://stackoverflow.com/q/73732970], and even posted a bounty,
> with no solutions. (The only answer so far is from someone who doesn't even
> use Spark and just posted links.)
> _Update:_ Now it just happened with only 1000 records. But then I reran the
> pipeline immediately with no changes, and it succeeded. So this
> {{NullPointerException}} bug is nondeterministic. Not good at all.
> _Update:_ Now it just happened with only 10 records. But then as before I
> reran the pipeline immediately with no changes, and it succeeded.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org