>> *Sender* : Josh Rosen
>>
>> *Date* : 2015-01-05 06:14 (GMT+09:00)
>>
>> *Title* : Re: Shuffle write increases in spark 1.2
>>
>>
>> If you have a small reproduction for this issue, can you open a ticket at
>> https://issues.apache.org/jira
I think Xuefeng Wu's suggestion is likely correct. This difference is more
likely explained by the compression library changing versions than by sort vs.
hash shuffle (which should not affect output size significantly). Others
have reported that switching to lz4 fixed their issue.
We should document th
I double-checked the 1.2 feature list and found out that the new sort-based
shuffle manager has nothing to do with HashPartitioner :-< Sorry for the
misinformation.
On the other hand, this may explain the increase in shuffle spill as a side
effect of the new shuffle manager, let me revert spark.shuffle.ma
Same problem here: shuffle write increased from 10G to over 64G. Since I'm
running on Amazon EC2, this always causes the temporary folder to consume all
the disk space. Still looking for a solution.
BTW, the 64G shuffle write is encountered when shuffling a pairRDD with
HashPartitioner, so it's not related
It looks like this is caused by a different snappy version: if you disable
compression or switch to lz4, the size is no different.
Yours respectfully, Xuefeng Wu 吴雪峰
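For anyone who wants to try the two workarounds mentioned above, here is a minimal sketch of how they might be passed to spark-submit. The property names spark.io.compression.codec and spark.shuffle.compress come from the Spark configuration documentation, not from this thread; conf_flags and your_job.py are hypothetical names used only for illustration.

```python
# Hypothetical helper: turn a dict of Spark properties into spark-submit flags.
def conf_flags(props):
    flags = []
    for key, value in sorted(props.items()):
        flags += ["--conf", f"{key}={value}"]
    return flags

# Workaround 1: keep compression on, but switch the codec from snappy to lz4.
use_lz4 = {"spark.io.compression.codec": "lz4"}

# Workaround 2: turn off compression of shuffle output (useful for comparing sizes).
no_compress = {"spark.shuffle.compress": "false"}

# your_job.py is a placeholder for the actual application.
command = ["spark-submit"] + conf_flags(use_lz4) + ["your_job.py"]
print(" ".join(command))
```

Running the same job once per setting and comparing the shuffle write shown in the web UI should indicate whether the codec accounts for the size difference.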
> On February 10, 2015, at 6:13 PM, chris wrote:
>
> Hello,
>
> as the original message from Kevin Jung never got accepted to the
> mailing list, I quote it here com
Hello,
as the original message from Kevin Jung never got accepted to the
mailing list, I quote it here completely:
Kevin Jung wrote
> Hi all,
> The size of shuffle write showing in spark web UI is much different when I
> execute same spark job on same input data(100GB) in both spark 1.1 and
> spa
Hello,
as the original message never got accepted to the mailing list, I quote it
here completely:
Kevin Jung wrote
> Hi all,
> The size of shuffle write showing in spark web UI is much different when I
> execute same spark job on same input data(100GB) in both spark 1.1 and
> spark 1.2.
> At the
081
>
>
>
> --- *Original Message* ---
>
> *Sender* : Josh Rosen
>
> *Date* : 2015-01-05 06:14 (GMT+09:00)
>
> *Title* : Re: Shuffle write increases in spark 1.2
>
>
> If you have a small reproduction for this issue, can you open a ticket at
> https://iss
Sure, here is a ticket. https://issues.apache.org/jira/browse/SPARK-5081
--- Original Message ---
Sender : Josh Rosen
Date : 2015-01-05 06:14 (GMT+09:00)
Title : Re: Shuffle write increases in spark 1.2
If you have a small reproduction for this issue, can you open a
If you have a small reproduction for this issue, can you open a ticket at
https://issues.apache.org/jira/browse/SPARK ?
On December 29, 2014 at 7:10:02 PM, Kevin Jung (itsjb.j...@samsung.com) wrote:
Hi all,
The size of shuffle write showing in spark web UI is much different when I
execute