[GitHub] spark issue #22894: [SPARK-25885][Core][Minor] HighlyCompressedMapStatus des...

srowen Wed, 31 Oct 2018 07:12:48 -0700

Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/22894
  
    I also would not expect updating an immutable data structure to be faster. 
Building a map once from tuples at the end seems better than rebuilding a map 
on each iteration. That said, under the hood the immutable map is going to be a 
HashTrieMap (a map of smaller optimized immutable maps), and its updated0 
method copies only the path to the changed entry rather than recreating the 
whole map.
    
    But, yeah, why immutable here to begin with? It ought to be better still to 
update a mutable Map. Even then, I am still not sure why it would be faster to 
maintain the map's invariants on every iteration of this loop rather than build 
the map once at the end, with its size known ahead of time.
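
    For reference, a minimal sketch of the three options being compared. This 
is not the code in the PR: the object name, method names, and the (Int, Byte) 
entry type are hypothetical, chosen only to make the comparison concrete.

```scala
import scala.collection.mutable

// Hypothetical helper, not Spark code: three ways to turn a sequence of
// (key, value) entries into a map.
object MapBuildSketch {

  // Option 1: update an immutable Map on every iteration.
  // Each `updated` call returns a new Map that shares structure with the old one.
  def viaImmutableUpdates(entries: Seq[(Int, Byte)]): Map[Int, Byte] = {
    var m = Map.empty[Int, Byte]
    entries.foreach { case (k, v) => m = m.updated(k, v) }
    m
  }

  // Option 2: collect tuples in a buffer sized up front, build the Map once.
  def viaTuples(entries: Seq[(Int, Byte)]): Map[Int, Byte] = {
    val buf = mutable.ArrayBuffer.empty[(Int, Byte)]
    buf.sizeHint(entries.length) // size is known ahead of time
    entries.foreach(e => buf += e)
    buf.toMap
  }

  // Option 3: update a mutable Map in place.
  def viaMutable(entries: Seq[(Int, Byte)]): mutable.Map[Int, Byte] = {
    val m = mutable.HashMap.empty[Int, Byte]
    entries.foreach { case (k, v) => m.update(k, v) }
    m
  }
}
```

    All three produce the same contents; which one is fastest for the sizes 
Spark actually sees is what a benchmark would need to show.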
    
    Benchmarks are good evidence, but we need to make sure that the difference 
is material for how this code is actually used in Spark. It may well be.



