[ 
https://issues.apache.org/jira/browse/SPARK-22540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yucai updated SPARK-22540:
--------------------------
    Description: 
The calculation of HighlyCompressedMapStatus's avgSize is incorrect.
Currently, it is computed as "sum of small blocks / count of all non-empty blocks".
The count of all non-empty blocks includes huge blocks as well as small blocks,
but the divisor should be the count of small blocks only.

  was:
The calculation of HighlyCompressedMapStatus's avgSize is incorrect.
Currently, it is computed as "sum of small blocks / count of all non-empty blocks".
The count of all non-empty blocks also includes huge blocks, whereas we need the
count of small blocks only.


> HighlyCompressedMapStatus's avgSize is incorrect
> ------------------------------------------------
>
>                 Key: SPARK-22540
>                 URL: https://issues.apache.org/jira/browse/SPARK-22540
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.3.0
>            Reporter: yucai
>
> The calculation of HighlyCompressedMapStatus's avgSize is incorrect.
> Currently, it is computed as "sum of small blocks / count of all non-empty
> blocks". The count of all non-empty blocks includes huge blocks as well as
> small blocks, but the divisor should be the count of small blocks only.
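
The arithmetic described above can be illustrated with a small sketch. This is not Spark's actual code: the function names, the `threshold` parameter that separates "small" from "huge" blocks, and the sample sizes are all hypothetical and chosen only to show how the two divisors differ.

```python
def avg_size_reported(block_sizes, threshold):
    """Average as described in the report: sum of small blocks divided by
    the count of ALL non-empty blocks (small and huge)."""
    small = [s for s in block_sizes if 0 < s < threshold]
    non_empty = [s for s in block_sizes if s > 0]
    return sum(small) // len(non_empty) if non_empty else 0


def avg_size_proposed(block_sizes, threshold):
    """Average as the report proposes: sum of small blocks divided by
    the count of small blocks only."""
    small = [s for s in block_sizes if 0 < s < threshold]
    return sum(small) // len(small) if small else 0


# Hypothetical data: three small blocks, one empty block, one huge block.
sizes = [10, 20, 30, 0, 1000]
threshold = 100  # assumed cut-off between small and huge blocks

print(avg_size_reported(sizes, threshold))  # 60 // 4 = 15
print(avg_size_proposed(sizes, threshold))  # 60 // 3 = 20
```

With these numbers the huge block adds nothing to the numerator but still increases the divisor, so the reported formula underestimates the average small-block size.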



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

