Re: Cache Shuffle Based Operation Before Sort

2016-05-12 Thread Takeshi Yamamuro
mage site.
>
> On Sun, May 8, 2016 at 5:17 PM, Ali Tootoonchian <a...@levyx.com> wrote:
>
>> Thanks for your comment.
>> Which image or chart are you pointing?
>>
>> --
>> View this message in context:
>> http://apache-spark-d

Re: Cache Shuffle Based Operation Before Sort

2016-05-08 Thread Ted Yu
> Thanks for your comment.
> Which image or chart are you pointing?
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/Cache-Shuffle-Based-Operation-Before-Sort-tp17331p17438.html
> Sent from the Apache Spark Dev

Re: Cache Shuffle Based Operation Before Sort

2016-05-08 Thread Ali Tootoonchian
Thanks for your comment. Which image or chart are you pointing to?
--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Cache-Shuffle-Based-Operation-Before-Sort-tp17331p17438.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com

Re: Cache Shuffle Based Operation Before Sort

2016-04-25 Thread Ted Yu
> val joinDf = sqlContext.sql(tpchQuery).cache
> val queryRes = joinDf.sort("o_orderdate")
>
> Let’s look at details of execution for 10 and 100 scale factor input
>
> By comparing stage 4, 9, 10 and 15, 20, 21 of two approaches, you can find
> out that amount of dat

Cache Shuffle Based Operation Before Sort

2016-04-25 Thread Ali Tootoonchian
By comparing stages 4, 9, 10 and 15, 20, 21 of the two approaches, you can see that the amount of data read during the sort can be reduced by a factor of 2.
--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Cache-Shuffle-Based-Operation-Before-Sort-tp17331.