[ https://issues.apache.org/jira/browse/SPARK-31923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17319622#comment-17319622 ]

Jean-Yves STEPHAN commented on SPARK-31923:
-------------------------------------------

Hi [~zsxwing] :) 



Despite your patch, we're running into the same issue while using Spark 3.0.1. 
The stack trace below is informative (the line numbers refer to this file: 
https://github.com/apache/spark/blob/v3.0.1/core/src/main/scala/org/apache/spark/util/JsonProtocol.scala#L354).

The problem is that we're given a value of class 
java.util.Collections$SynchronizedSet that enters the branch
{code:java}
case v: java.util.List[_] =>
{code}
but the cast on the next line, v.asScala.toList, fails.
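
For context, here is a minimal sketch (run in spark-shell) of how we believe a synchronized Set can end up on that code path. The SetAccumulator class and the metric name are hypothetical stand-ins for what our jobs register, not our actual code:
{code:scala}
import java.util.Collections
import org.apache.spark.util.AccumulatorV2

// Hypothetical accumulator whose value is a synchronized Set rather than a List.
class SetAccumulator[T] extends AccumulatorV2[T, java.util.Set[T]] {
  private val _set = Collections.synchronizedSet(new java.util.HashSet[T]())
  override def isZero: Boolean = _set.isEmpty
  override def copy(): SetAccumulator[T] = {
    val acc = new SetAccumulator[T]
    acc._set.addAll(_set)
    acc
  }
  override def reset(): Unit = _set.clear()
  override def add(v: T): Unit = _set.add(v)
  override def merge(other: AccumulatorV2[T, java.util.Set[T]]): Unit =
    _set.addAll(other.value)
  override def value: java.util.Set[T] = _set
}

val acc = new SetAccumulator[String]
// The "internal.metrics." prefix routes the value through the strict
// internal-accumulator branch of JsonProtocol.accumValueToJson.
sc.register(acc, "internal.metrics.someCustomMetric")
sc.parallelize(1 to 1, 1).foreach(_ => acc.add("foo"))
// On task end, event-log serialization fails with:
// java.lang.ClassCastException: java.util.Collections$SynchronizedSet
//   cannot be cast to java.util.List
{code}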

 

{code}
21/04/12 15:11:40 ERROR AsyncEventQueue: Listener EventLoggingListener threw an exception
java.lang.ClassCastException: java.util.Collections$SynchronizedSet cannot be cast to java.util.List
      at org.apache.spark.util.JsonProtocol$.accumValueToJson(JsonProtocol.scala:355)
      at org.apache.spark.util.JsonProtocol$.$anonfun$accumulableInfoToJson$4(JsonProtocol.scala:331)
      at scala.Option.map(Option.scala:230)
      at org.apache.spark.util.JsonProtocol$.accumulableInfoToJson(JsonProtocol.scala:331)
      at org.apache.spark.util.JsonProtocol$.$anonfun$accumulablesToJson$3(JsonProtocol.scala:324)
      at scala.collection.immutable.List.map(List.scala:290)
      at org.apache.spark.util.JsonProtocol$.accumulablesToJson(JsonProtocol.scala:324)
      at org.apache.spark.util.JsonProtocol$.taskInfoToJson(JsonProtocol.scala:316)
      at org.apache.spark.util.JsonProtocol$.taskEndToJson(JsonProtocol.scala:151)
      at org.apache.spark.util.JsonProtocol$.sparkEventToJson(JsonProtocol.scala:79)
      at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:97)
      at org.apache.spark.scheduler.EventLoggingListener.onTaskEnd(EventLoggingListener.scala:119)
      at org.apache.spark.scheduler.SparkListenerBus.doPostEvent(SparkListenerBus.scala:45)
      at org.apache.spark.scheduler.SparkListenerBus.doPostEvent$(SparkListenerBus.scala:28)
      at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
      at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
      at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:115)
      at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:99)
      at org.apache.spark.scheduler.AsyncEventQueue.super$postToAll(AsyncEventQueue.scala:105)
      at org.apache.spark.scheduler.AsyncEventQueue.$anonfun$dispatch$1(AsyncEventQueue.scala:105)
      at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
      at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
      at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:100)
      at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.$anonfun$run$1(AsyncEventQueue.scala:96)
      at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1319)
      at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.run(AsyncEventQueue.scala:96)
{code}
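
In the spirit of your fix, a more defensive branch could serialize any java.util.Collection and fall back to JNull otherwise. This is only a standalone sketch under our assumptions (the function name is made up, and it omits the (BlockId, BlockStatus) handling the real List branch performs):
{code:scala}
import scala.collection.JavaConverters._
import org.json4s.JsonAST._

// Sketch of a defensive value serializer for internal accumulators:
// any java.util.Collection (List, synchronized Set, ...) is serialized
// element by element, and unrecognized types become JNull instead of
// throwing and killing the listener.
def safeAccumValueToJson(value: Any): JValue = value match {
  case v: Int => JInt(v)
  case v: Long => JInt(v)
  case v: java.util.Collection[_] =>
    JArray(v.asScala.toList.map(e => JString(e.toString)))
  case _ => JNull
}
{code}
The point is only that an unexpected carrier type should degrade to JNull rather than drop the whole event.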

 

> Event log cannot be generated when some internal accumulators use unexpected types
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-31923
>                 URL: https://issues.apache.org/jira/browse/SPARK-31923
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.0, 2.4.1, 2.4.2, 2.4.3, 2.4.4, 2.4.5, 2.4.6
>            Reporter: Shixiong Zhu
>            Assignee: Shixiong Zhu
>            Priority: Major
>             Fix For: 2.4.7, 3.0.1, 3.1.0
>
>
> A user may create internal accumulators by adding the "internal.metrics." 
> prefix to the accumulator name in order to hide sensitive information from 
> the UI (all accumulators except internal ones are shown in the Spark UI).
> However, *org.apache.spark.util.JsonProtocol.accumValueToJson* assumes an 
> internal accumulator has only 3 possible types: int, long, and 
> java.util.List[(BlockId, BlockStatus)]. When an internal accumulator uses an 
> unexpected type, it crashes.
> An event that contains such an accumulator will be dropped because it cannot 
> be converted to JSON, which causes weird UI issues when rendering in the 
> Spark History Server. For example, if `SparkListenerTaskEnd` is dropped 
> because of this issue, the user will see the task as still running even 
> after it has finished.
> It's better to make *accumValueToJson* more robust.
> ----
> How to reproduce it:
> - Enable the Spark event log
> - Run the following commands:
> {code}
> scala> val accu = sc.doubleAccumulator("internal.metrics.foo")
> accu: org.apache.spark.util.DoubleAccumulator = DoubleAccumulator(id: 0, name: Some(internal.metrics.foo), value: 0.0)
> scala> sc.parallelize(1 to 1, 1).foreach { _ => accu.add(1.0) }
> 20/06/06 16:11:27 ERROR AsyncEventQueue: Listener EventLoggingListener threw an exception
> java.lang.ClassCastException: java.lang.Double cannot be cast to java.util.List
>       at org.apache.spark.util.JsonProtocol$.accumValueToJson(JsonProtocol.scala:330)
>       at org.apache.spark.util.JsonProtocol$$anonfun$accumulableInfoToJson$3.apply(JsonProtocol.scala:306)
>       at org.apache.spark.util.JsonProtocol$$anonfun$accumulableInfoToJson$3.apply(JsonProtocol.scala:306)
>       at scala.Option.map(Option.scala:146)
>       at org.apache.spark.util.JsonProtocol$.accumulableInfoToJson(JsonProtocol.scala:306)
>       at org.apache.spark.util.JsonProtocol$$anonfun$accumulablesToJson$2.apply(JsonProtocol.scala:299)
>       at org.apache.spark.util.JsonProtocol$$anonfun$accumulablesToJson$2.apply(JsonProtocol.scala:299)
>       at scala.collection.immutable.List.map(List.scala:284)
>       at org.apache.spark.util.JsonProtocol$.accumulablesToJson(JsonProtocol.scala:299)
>       at org.apache.spark.util.JsonProtocol$.taskInfoToJson(JsonProtocol.scala:291)
>       at org.apache.spark.util.JsonProtocol$.taskEndToJson(JsonProtocol.scala:145)
>       at org.apache.spark.util.JsonProtocol$.sparkEventToJson(JsonProtocol.scala:76)
>       at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:138)
>       at org.apache.spark.scheduler.EventLoggingListener.onTaskEnd(EventLoggingListener.scala:158)
>       at org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:45)
>       at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
>       at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
>       at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:91)
>       at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$super$postToAll(AsyncEventQueue.scala:92)
>       at org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply$mcJ$sp(AsyncEventQueue.scala:92)
>       at org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply(AsyncEventQueue.scala:87)
>       at org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply(AsyncEventQueue.scala:87)
>       at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
>       at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:87)
>       at org.apache.spark.scheduler.AsyncEventQueue$$anon$1$$anonfun$run$1.apply$mcV$sp(AsyncEventQueue.scala:83)
>       at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1302)
>       at org.apache.spark.scheduler.AsyncEventQueue$$anon$1.run(AsyncEventQueue.scala:82)
> {code}
>  


