Hi,

Looks interesting.
This optimisation also seems to be effective when simply loading and
sorting a df:
val df = sqlCtx.read.load(path)
df.cache.sort("some column")

How much effect does this optimisation have on actual performance?
If the effect is significant, it'd be better to open a JIRA ticket.
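
One rough way to get a first number might be to time the sort with and
without the cache step. The following is only a sketch meant to be pasted
into spark-shell: `timeMs` and `compareSort` are hypothetical helpers, and
`sqlCtx`, `path`, and `column` are assumed to be supplied by the caller. A
single wall-clock run like this is noisy, so it is not a proper benchmark.

```scala
import org.apache.spark.sql.{DataFrame, SQLContext}

// Hypothetical helper: runs `body` once and returns elapsed wall-clock milliseconds.
def timeMs(body: => Unit): Long = {
  val start = System.nanoTime()
  body
  (System.nanoTime() - start) / 1000000L
}

// Hypothetical comparison; returns (plain sort ms, cache-then-sort ms).
def compareSort(sqlCtx: SQLContext, path: String, column: String): (Long, Long) = {
  val df: DataFrame = sqlCtx.read.load(path)

  // Baseline: sort directly on the loaded data.
  // Converting to an RDD and counting forces the whole plan, including the sort, to run.
  val plain = timeMs { df.sort(column).rdd.count() }

  // Variant from the snippet above: cache first, then sort.
  // Note that this first action also pays for materializing the cache.
  val cached = df.cache()
  val withCache = timeMs { cached.sort(column).rdd.count() }

  cached.unpersist()
  (plain, withCache)
}
```

Because the baseline runs first, file-system caching may favour the second
measurement; repeating the runs and swapping their order would give a fairer
picture.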

// maropu

On Mon, May 9, 2016 at 2:21 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> I assume there were supposed to be images following this line (which I
> don't see in the email thread):
>
> bq. Let’s look at details of execution for 10 and 100 scale factor input
>
> Consider using a 3rd-party image site.
>
> On Sun, May 8, 2016 at 5:17 PM, Ali Tootoonchian <a...@levyx.com> wrote:
>
>> Thanks for your comment.
>> Which image or chart are you pointing to?
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-developers-list.1001551.n3.nabble.com/Cache-Shuffle-Based-Operation-Before-Sort-tp17331p17438.html
>> Sent from the Apache Spark Developers List mailing list archive at
>> Nabble.com.
>


-- 
---
Takeshi Yamamuro
