Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10061#discussion_r63822163
  
    --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
    @@ -96,6 +100,7 @@ private[spark] object JsonProtocol {
             executorMetricsUpdateToJson(metricsUpdate)
           case blockUpdated: SparkListenerBlockUpdated =>
             throw new MatchError(blockUpdated)  // TODO(ekl) implement this
    +      case _ => parse(mapper.writeValueAsString(event))
    --- End diff --
    
    > Events are a public API, and they should be carefully crafted, since 
changing them affects user applications (including event logs). If there is 
unnecessary information in the event, then it's a bug in the event definition, 
not here.
    
    Yea. I totally agree. However, my concern is that having this line at here 
will make the developer harder to spot issues during the development. Since the 
serialization works automatically, we are not making a self-review on what will 
be serialized and what methods will be called during serialization a mandatory 
step, which makes the auditing work much harder. Although it introduces more 
work to the developer to make every event explicitly handled, when we review 
the pull request, we can clearly know what will be serialized and how a event 
is serialized when a pull request is submitted. What do you think?
    
    btw, if I am missing any context, please let me know :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to