Github user mridulm commented on the issue:

    https://github.com/apache/spark/pull/17596
  
    The approach I took for this was slightly different.
    * Create a bitmask indicating which accumulators in TaskMetrics are
required, that is, have non-zero values, and emit this first.
    * Instead of relying on default serialization, do custom serialization
for all internal accumulators: directly emit the longs (the bitmask decides
which ones are written and read).
    * Encode longs/ints so that they take fewer than 8/4 bytes (the code for
this currently sits inside graphx iirc; it is essentially the same as kryo's
optimizePositive encoding).
    
    On 1.6, this brought the average size down from around 1.6k to 200+
bytes.
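
The three steps above can be illustrated with a minimal sketch. It is not the code from the PR or from graphx: the function names (`write_varint`, `read_varint`, `encode_metrics`, `decode_metrics`) are hypothetical, the metrics are assumed to be a fixed-order list of non-negative integers, and the variable-length encoding is a generic 7-bits-per-byte varint, similar in spirit to kryo's optimizePositive encoding but not byte-for-byte identical to it.

```python
import io


def write_varint(out, value):
    """Write a non-negative int using 7 data bits per byte, low bits first."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    while value >= 0x80:
        # High bit set means "more bytes follow".
        out.write(bytes([(value & 0x7F) | 0x80]))
        value >>= 7
    out.write(bytes([value]))


def read_varint(inp):
    """Read a value written by write_varint."""
    result = 0
    shift = 0
    while True:
        chunk = inp.read(1)
        if not chunk:
            raise EOFError("truncated varint")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result
        shift += 7


def encode_metrics(values):
    """Emit a bitmask of the non-zero slots, then only the non-zero values."""
    bitmask = 0
    for i, v in enumerate(values):
        if v != 0:
            bitmask |= 1 << i
    out = io.BytesIO()
    write_varint(out, bitmask)
    for v in values:
        if v != 0:
            write_varint(out, v)
    return out.getvalue()


def decode_metrics(data, count):
    """Rebuild the full list; slots not set in the bitmask are zero."""
    inp = io.BytesIO(data)
    bitmask = read_varint(inp)
    return [read_varint(inp) if bitmask & (1 << i) else 0 for i in range(count)]
```

With five metrics of which only two are non-zero, for example `[1500, 0, 0, 0, 300]`, this sketch produces 5 bytes (one for the bitmask, two per value) where five fixed-width longs would take 40. The saving comes from the same two effects described above: zero-valued accumulators cost nothing beyond their bit in the mask, and small values need fewer than 8 bytes.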



