Hi Koert,
Try spark.shuffle.reduceLocality.enabled=false
This is an undocumented configuration.
See:
https://github.com/apache/spark/pull/8280
https://issues.apache.org/jira/browse/SPARK-10567

It solved the problem for me, both with and without legacy memory mode (spark.memory.useLegacyMode).
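
For reference, a minimal sketch of how this could be set, assuming the job reads conf/spark-defaults.conf. The property name is the one given above. The comment about the default is my assumption, not something taken from the documentation, since the setting is undocumented. The same key/value pair can also be passed on the command line with spark-submit's --conf flag.

```
# Undocumented setting (see SPARK-10567); assumed to default to true.
# Setting it to false is meant to stop the scheduler from preferring
# particular executors for reduce tasks based on map output location.
spark.shuffle.reduceLocality.enabled   false
```

After changing it, the executor distribution of the cached RDD can be checked again on the Storage tab of the Spark UI.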


On Sun, Feb 28, 2016 at 11:16 PM, Koert Kuipers <ko...@tresata.com> wrote:

> i find it particularly confusing that a new memory management module would
> change the locations. it's not like the hash partitioner got replaced. i can
> switch back and forth between legacy and "new" memory management and see
> the distribution change... fully reproducible
>
> On Sun, Feb 28, 2016 at 11:24 AM, Lior Chaga <lio...@taboola.com> wrote:
>
>> Hi,
>> I've experienced a similar problem upgrading from spark 1.4 to spark 1.6.
>> The data is not evenly distributed across executors, but in my case it
>> also reproduced with legacy mode.
>> Also tried 1.6.1 rc-1, with same results.
>>
>> Still looking for resolution.
>>
>> Lior
>>
>> On Fri, Feb 19, 2016 at 2:01 AM, Koert Kuipers <ko...@tresata.com> wrote:
>>
>>> looking at the cached rdd i see a similar story:
>>> with useLegacyMode = true the cached rdd is spread out across 10
>>> executors, but with useLegacyMode = false the data for the cached rdd sits
>>> on only 3 executors (the rest all show 0s). my cached RDD is a key-value
>>> RDD that got partitioned (hash partitioner, 50 partitions) before being
>>> cached.
>>>
>>> On Thu, Feb 18, 2016 at 6:51 PM, Koert Kuipers <ko...@tresata.com>
>>> wrote:
>>>
>>>> hello all,
>>>> we are just testing a semi-realtime application (it should return
>>>> results in less than 20 seconds from cached RDDs) on spark 1.6.0. before
>>>> this it used to run on spark 1.5.1
>>>>
>>>> in spark 1.6.0 the performance is similar to 1.5.1 if i set
>>>> spark.memory.useLegacyMode = true, however if i switch to
>>>> spark.memory.useLegacyMode = false the queries take about 50% to 100% more
>>>> time.
>>>>
>>>> the issue becomes clear when i focus on a single stage: the individual
>>>> tasks are not slower at all, but they run on fewer executors.
>>>> in my test query i have 50 tasks and 10 executors. both with
>>>> useLegacyMode = true and useLegacyMode = false the tasks finish in about 3
>>>> seconds and show as running PROCESS_LOCAL. however when useLegacyMode =
>>>> false the tasks run on just 3 executors out of 10, while with useLegacyMode
>>>> = true they spread out across 10 executors. all the tasks running on just a
>>>> few executors leads to the slower results.
>>>>
>>>> any idea why this would happen?
>>>> thanks! koert
>>>>
>>>>
>>>>
>>>
>>
>
