[GitHub] spark issue #22371: [SPARK-25386][CORE] Don't need to synchronize the IndexS...

ConeyLiu Mon, 10 Sep 2018 05:29:32 -0700

Github user ConeyLiu commented on the issue:

    https://github.com/apache/spark/pull/22371
  
    Thanks @felixcheung, @srowen, @cloud-fan for your time. There is only one 
instance of `IndexShuffleBlockResolver` per executor, and the synchronization 
is there to keep the update safe when different attempts of the same task 
commit at the same time. For most tasks the synchronization is unnecessary, 
and the update itself is very simple.
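
    To illustrate the lock-granularity point, here is a minimal sketch. It is 
not Spark's code and not necessarily what this PR does: `GlobalLockResolver`, 
`PerKeyResolver`, `commit` and `winner` are hypothetical names, and the real 
commit path also touches index and data files, which this sketch leaves out. 
It only shows a single shared lock next to a per-map-task atomic update where 
the first attempt wins.

```scala
import scala.collection.concurrent.TrieMap
import scala.collection.mutable

// Hypothetical stand-in for a per-executor resolver (NOT Spark's class).
// Every commit takes the same lock, even for unrelated map tasks.
final class GlobalLockResolver {
  private val committed = mutable.Map.empty[Int, Int]

  // Returns true if this attempt is the first to commit for mapId.
  def commit(mapId: Int, attemptId: Int): Boolean = synchronized {
    if (committed.contains(mapId)) false
    else { committed(mapId) = attemptId; true }
  }

  def winner(mapId: Int): Option[Int] = synchronized { committed.get(mapId) }
}

// Hypothetical alternative: the update is atomic per key, so only
// attempts of the same map task compete with each other.
final class PerKeyResolver {
  private val committed = TrieMap.empty[Int, Int]

  // putIfAbsent returns None when no earlier attempt had committed.
  def commit(mapId: Int, attemptId: Int): Boolean =
    committed.putIfAbsent(mapId, attemptId).isEmpty

  def winner(mapId: Int): Option[Int] = committed.get(mapId)
}

object Demo {
  def main(args: Array[String]): Unit = {
    val resolver = new PerKeyResolver
    // Four attempts of the same (hypothetical) map task race to commit.
    val threads = (0 until 4).map { attempt =>
      new Thread(() => { resolver.commit(7, attempt); () })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
    // Exactly one attempt has been recorded as the winner.
    println(resolver.winner(7).isDefined)
  }
}
```

    In both variants exactly one attempt per map task wins; the difference is 
only whether unrelated map tasks wait on each other.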
    
    I have tested this locally; the results are as follows. I admit that this 
change brings little improvement for complex tasks, but it does not cause 
performance degradation.
    
    `./spark-shell --master local[20] --driver-memory 40g`
    `spark.range(0, 10000000, 1, 100).repartition(200).count()`
    
    before: 
    
    map | reduce
    ---- | ---
    2s | 0.4s
    0.8s |  0.2s
    0.7s |  0.2s
    
    after:
    
    map | reduce
    ---- | ---
    0.8s | 0.2s
    0.5s |  0.4s
    0.5s |  0.2s


