Re: Cache Shuffle Based Operation Before Sort

Takeshi Yamamuro Thu, 12 May 2016 08:47:50 -0700

Hi,

Looks interesting.
This optimisation also seems effective when simply loading and
sorting a DataFrame:

val df = sqlCtx.read.load(path)
df.cache.sort("some column")

How large an effect does this optimisation have on actual performance?
If it is significant, it would be better to open a JIRA ticket.
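
One rough way to put a number on it is to time the same sort with and
without the cache. This is only a sketch, assuming the Spark 1.6
SQLContext API; `CacheBeforeSortTiming`, `timed` and `compareSort` are
hypothetical names, not part of Spark. The cached run also pays for
materialising the cache, and the first run may warm the file-system
cache, so the two figures are indicative at best.

```scala
import org.apache.spark.sql.SQLContext

// Hypothetical helper object, for illustration only.
object CacheBeforeSortTiming {

  // Runs `body` once and returns its result with the elapsed wall-clock time in ms.
  def timed[T](body: => T): (T, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1000000L)
  }

  // Returns (ms without cache, ms with cache) for a full sort on `column`.
  def compareSort(sqlCtx: SQLContext, path: String, column: String): (Long, Long) = {
    // Baseline: load and sort, forcing every sorted partition to be computed.
    val plain = sqlCtx.read.load(path)
    val (_, plainMs) = timed { plain.sort(column).rdd.foreach(_ => ()) }

    // Same work, but with the loaded DataFrame marked for caching first.
    val cached = sqlCtx.read.load(path).cache()
    val (_, cachedMs) = timed { cached.sort(column).rdd.foreach(_ => ()) }
    cached.unpersist()

    (plainMs, cachedMs)
  }
}
```

Calling `CacheBeforeSortTiming.compareSort(sqlCtx, path, "some column")`
a few times and comparing the two values should indicate whether the
difference is large enough to justify a ticket.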

// maropu

On Mon, May 9, 2016 at 2:21 PM, Ted Yu <[email protected]> wrote:

> I assume there were supposed to be images following this line (which I
> don't see in the email thread):
>
> bq. Let’s look at details of execution for 10 and 100 scale factor input
>
> Consider using a third-party image hosting site.
>
> On Sun, May 8, 2016 at 5:17 PM, Ali Tootoonchian <[email protected]> wrote:
>
>> Thanks for your comment.
>> Which image or chart are you referring to?
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-developers-list.1001551.n3.nabble.com/Cache-Shuffle-Based-Operation-Before-Sort-tp17331p17438.html
>> Sent from the Apache Spark Developers List mailing list archive at
>> Nabble.com.
>


-- 
---
Takeshi Yamamuro
