No reference. I opened a ticket about the missing documentation for it, and was answered by Sean Owen that it is not meant for Spark users. I explained that this is an issue, but no news so far.
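Since the flag is undocumented, here is a minimal sketch of how one might pass it at submit time. This is an illustration only: `my_app.jar` is a placeholder, and the helper function is hypothetical, not part of any Spark API.

```python
# Minimal sketch (no cluster needed): building the spark-submit arguments
# that disable the undocumented reduce-locality flag (SPARK-10567 / PR #8280).
# "my_app.jar" is a placeholder name, not something from this thread.
conf = {
    "spark.shuffle.reduceLocality.enabled": "false",
}

def submit_args(jar: str, conf: dict) -> list:
    """Turn a conf dict into spark-submit command-line arguments."""
    args = ["spark-submit"]
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args + [jar]

print(" ".join(submit_args("my_app.jar", conf)))
# spark-submit --conf spark.shuffle.reduceLocality.enabled=false my_app.jar
```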
As for the memory management, I'm not experienced with it, but I suggest you read:
http://0x0fff.com/spark-memory-management/
http://0x0fff.com/spark-architecture/
It could be that the effective default storage memory in Spark 1.6 is a bit lower than in Spark 1.5, and your application can't borrow from the execution memory.

On Thu, Mar 3, 2016 at 2:35 AM, Koert Kuipers <ko...@tresata.com> wrote:

> with the locality issue resolved, i am still struggling with the new
> memory management.
>
> i am seeing tasks on tiny amounts of data take 15 seconds, of which 14 are
> spent in GC. with the legacy memory management (spark.memory.useLegacyMode
> = true) they complete in 1-2 seconds.
>
> since we are permanently caching a very large number of RDDs, my suspicion
> is that with the new memory management these cached RDDs happily gobble up
> all the memory, and need to be evicted to run my small job, leading to the
> slowness.
>
> i can revert to legacy memory management mode, so this is not an issue,
> but i am worried that at some point the legacy memory management will be
> deprecated and then i am stuck with this performance issue.
>
> On Mon, Feb 29, 2016 at 12:47 PM, Koert Kuipers <ko...@tresata.com> wrote:
>
>> setting spark.shuffle.reduceLocality.enabled=false worked for me, thanks
>>
>> is there any reference to the benefits of setting reduceLocality to true?
>> i am tempted to disable it across the board.
>>
>> On Mon, Feb 29, 2016 at 9:51 AM, Yin Yang <yy201...@gmail.com> wrote:
>>
>>> The default value for spark.shuffle.reduceLocality.enabled is true.
>>>
>>> To reduce surprise to users of 1.5 and earlier releases, should the
>>> default value be set to false?
>>>
>>> On Mon, Feb 29, 2016 at 5:38 AM, Lior Chaga <lio...@taboola.com> wrote:
>>>
>>>> Hi Koert,
>>>> Try spark.shuffle.reduceLocality.enabled=false
>>>> This is an undocumented configuration.
>>>> See:
>>>> https://github.com/apache/spark/pull/8280
>>>> https://issues.apache.org/jira/browse/SPARK-10567
>>>>
>>>> It solved the problem for me (both with and without legacy memory mode).
>>>>
>>>> On Sun, Feb 28, 2016 at 11:16 PM, Koert Kuipers <ko...@tresata.com> wrote:
>>>>
>>>>> i find it particularly confusing that a new memory management module
>>>>> would change the locations. it's not like the hash partitioner got
>>>>> replaced. i can switch back and forth between legacy and "new" memory
>>>>> management and see the distribution change... fully reproducible
>>>>>
>>>>> On Sun, Feb 28, 2016 at 11:24 AM, Lior Chaga <lio...@taboola.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I've experienced a similar problem upgrading from Spark 1.4 to Spark 1.6.
>>>>>> The data is not evenly distributed across executors, but in my case
>>>>>> it also reproduced with legacy mode.
>>>>>> Also tried 1.6.1 RC-1, with the same results.
>>>>>>
>>>>>> Still looking for a resolution.
>>>>>>
>>>>>> Lior
>>>>>>
>>>>>> On Fri, Feb 19, 2016 at 2:01 AM, Koert Kuipers <ko...@tresata.com> wrote:
>>>>>>
>>>>>>> looking at the cached rdd i see a similar story:
>>>>>>> with useLegacyMode = true the cached rdd is spread out across 10
>>>>>>> executors, but with useLegacyMode = false the data for the cached rdd
>>>>>>> sits on only 3 executors (the rest all show 0s). my cached RDD is a
>>>>>>> key-value RDD that got partitioned (hash partitioner, 50 partitions)
>>>>>>> before being cached.
>>>>>>>
>>>>>>> On Thu, Feb 18, 2016 at 6:51 PM, Koert Kuipers <ko...@tresata.com> wrote:
>>>>>>>
>>>>>>>> hello all,
>>>>>>>> we are just testing a semi-realtime application (it should return
>>>>>>>> results in less than 20 seconds from cached RDDs) on spark 1.6.0.
>>>>>>>> before this it used to run on spark 1.5.1.
>>>>>>>>
>>>>>>>> in spark 1.6.0 the performance is similar to 1.5.1 if i set
>>>>>>>> spark.memory.useLegacyMode = true, however if i switch to
>>>>>>>> spark.memory.useLegacyMode = false the queries take about 50% to 100%
>>>>>>>> more time.
>>>>>>>>
>>>>>>>> the issue becomes clear when i focus on a single stage: the
>>>>>>>> individual tasks are not slower at all, but they run on fewer
>>>>>>>> executors. in my test query i have 50 tasks and 10 executors. both
>>>>>>>> with useLegacyMode = true and useLegacyMode = false the tasks finish
>>>>>>>> in about 3 seconds and show as running PROCESS_LOCAL. however when
>>>>>>>> useLegacyMode = false the tasks run on just 3 executors out of 10,
>>>>>>>> while with useLegacyMode = true they spread out across 10 executors.
>>>>>>>> all the tasks running on just a few executors leads to the slower
>>>>>>>> results.
>>>>>>>>
>>>>>>>> any idea why this would happen?
>>>>>>>> thanks! koert
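For context on the suggestion that the effective default storage memory shrank in 1.6, the rough arithmetic can be sketched as follows. This is a back-of-the-envelope sketch for a 1 GiB executor heap; the fractions and the ~300 MiB reserved size are assumed defaults taken from the linked articles, not numbers stated in this thread.

```python
# Back-of-the-envelope comparison of storage memory for a 1 GiB executor heap.
# Assumed defaults (per the linked 0x0fff articles, not from this thread):
#   legacy:  spark.storage.memoryFraction = 0.6, spark.storage.safetyFraction = 0.9
#   unified: spark.memory.fraction = 0.75, spark.memory.storageFraction = 0.5,
#            with roughly 300 MiB reserved by the system.
HEAP_MIB = 1024
RESERVED_MIB = 300

# Legacy model (spark.memory.useLegacyMode = true): a fixed storage region.
legacy_storage = HEAP_MIB * 0.6 * 0.9             # ~553 MiB, never shared

# Unified model (the 1.6 default): one pool shared by storage and execution.
unified_pool = (HEAP_MIB - RESERVED_MIB) * 0.75   # ~543 MiB total
protected_storage = unified_pool * 0.5            # ~272 MiB safe from eviction

print(f"legacy storage region: {legacy_storage:.0f} MiB")
print(f"unified shared pool:   {unified_pool:.0f} MiB")
print(f"eviction-protected:    {protected_storage:.0f} MiB")
```

If this arithmetic applies, cached blocks above the protected half of the pool can be evicted whenever execution needs memory, which would match the suspicion above that permanently cached RDDs get evicted to run small jobs; raising spark.memory.storageFraction (or reverting to useLegacyMode) would be the obvious knobs to try.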