Hi Kevin,

We seem to be facing the same problem. Were you able to find anything further? The ticket does not seem to have progressed.
Regards,
Anubhav

On 5 January 2015 at 10:37, 정재부 <itsjb.j...@samsung.com> wrote:

> Sure, here is a ticket. https://issues.apache.org/jira/browse/SPARK-5081
>
> ------- *Original Message* -------
> *Sender* : Josh Rosen<rosenvi...@gmail.com>
> *Date* : 2015-01-05 06:14 (GMT+09:00)
> *Title* : Re: Shuffle write increases in spark 1.2
>
> If you have a small reproduction for this issue, can you open a ticket at
> https://issues.apache.org/jira/browse/SPARK ?
>
> On December 29, 2014 at 7:10:02 PM, Kevin Jung (itsjb.j...@samsung.com) wrote:
>
> Hi all,
> The size of shuffle write shown in the spark web UI is much different when I
> execute the same spark job on the same input data (100GB) in both spark 1.1 and
> spark 1.2.
> At the same sortBy stage, the size of shuffle write is 39.7GB in spark 1.1
> but 91.0GB in spark 1.2.
> I set the spark.shuffle.manager option to hash because its default value has
> changed, but spark 1.2 still writes a larger file than spark 1.1.
> Can anyone tell me why this happened?
>
> Thanks
> Kevin
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Shuffle-write-increases-in-spark-1-2-tp20894.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org