Hi,

One option to find what could be taking the memory is to use jmap on the running task. The steps I followed are:
- I ran a sleep job (which comes in the examples jar of the distribution - it effectively does nothing in the mapper / reducer).
- From the JobTracker UI, I looked at a map task attempt ID.
- Then, on the machine where the map task is running, I got the PID of the running task: ps -ef | grep <task attempt id>
- On the same machine, I executed: jmap -histo <pid>

This will give you an idea of the count and size of the objects allocated. jmap also has options to take a heap dump, which contains more information, but this should help get you started with debugging. For my sleep job task, I saw allocations worth roughly 130 MB.

Thanks
hemanth

On Mon, Mar 25, 2013 at 6:43 PM, Nagarjuna Kanamarlapudi <nagarjuna.kanamarlap...@gmail.com> wrote:

> I have a lookup file which I need in the mapper. So I am trying to read
> the whole file and load it into a list in the mapper.
>
> For each and every record, I look it up in this file, which I got from the
> distributed cache.
>
> —
> Sent from iPhone
>
>
> On Mon, Mar 25, 2013 at 6:39 PM, Hemanth Yamijala <yhema...@thoughtworks.com> wrote:
>
>> Hmm. How are you loading the file into memory? Is it some sort of memory
>> mapping etc.? Are they being read as records? Some details of the app will
>> help.
>>
>>
>> On Mon, Mar 25, 2013 at 2:14 PM, nagarjuna kanamarlapudi <nagarjuna.kanamarlap...@gmail.com> wrote:
>>
>>> Hi Hemanth,
>>>
>>> I tried out your suggestion of loading the 420 MB file into memory. It
>>> threw a Java heap space error.
>>>
>>> I am not sure where this 1.6 GB of configured heap went.
>>>
>>>
>>> On Mon, Mar 25, 2013 at 12:01 PM, Hemanth Yamijala <yhema...@thoughtworks.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> The free memory might be low just because GC hasn't reclaimed what it
>>>> can. Can you just try reading in the data you want to read and see if
>>>> that works?
>>>>
>>>> Thanks
>>>> Hemanth
>>>>
>>>>
>>>> On Mon, Mar 25, 2013 at 10:32 AM, nagarjuna kanamarlapudi <nagarjuna.kanamarlap...@gmail.com> wrote:
>>>>
>>>>> io.sort.mb = 256 MB
>>>>>
>>>>>
>>>>> On Monday, March 25, 2013, Harsh J wrote:
>>>>>
>>>>>> The MapTask may consume some memory of its own as well. What is your
>>>>>> io.sort.mb (MR1) or mapreduce.task.io.sort.mb (MR2) set to?
>>>>>>
>>>>>> On Sun, Mar 24, 2013 at 3:40 PM, nagarjuna kanamarlapudi
>>>>>> <nagarjuna.kanamarlap...@gmail.com> wrote:
>>>>>> > Hi,
>>>>>> >
>>>>>> > I configured my child JVM heap to 2 GB. So I thought I could really
>>>>>> > read 1.5 GB of data and store it in memory (mapper/reducer).
>>>>>> >
>>>>>> > I wanted to confirm the same, so I wrote the following piece of code
>>>>>> > in the configure method of the mapper:
>>>>>> >
>>>>>> > @Override
>>>>>> > public void configure(JobConf job) {
>>>>>> >   System.out.println("FREE MEMORY -- "
>>>>>> >       + Runtime.getRuntime().freeMemory());
>>>>>> >   System.out.println("MAX MEMORY ---"
>>>>>> >       + Runtime.getRuntime().maxMemory());
>>>>>> > }
>>>>>> >
>>>>>> > Surprisingly, the output was:
>>>>>> >
>>>>>> > FREE MEMORY -- 341854864 = 320 MB
>>>>>> > MAX MEMORY ---1908932608 = 1.9 GB
>>>>>> >
>>>>>> > I am just wondering what processes are taking up that extra 1.6 GB
>>>>>> > of heap which I configured for the child JVM.
>>>>>> >
>>>>>> > Appreciate your help in understanding the scenario.
>>>>>> >
>>>>>> > Regards
>>>>>> > Nagarjuna K
>>>>>>
>>>>>> --
>>>>>> Harsh J
>>>>>
>>>>> --
>>>>> Sent from iPhone
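[Editor's note] One source of the confusion in this thread is that Runtime.freeMemory() reports free space within the *currently committed* heap (totalMemory()), which starts small and grows lazily toward the -Xmx ceiling (maxMemory()). So a low freeMemory() reading does not mean the heap is nearly exhausted. A minimal sketch of the distinction (the class name HeapCheck is my own, not from the thread):

```java
public class HeapCheck {

    // Approximate heap headroom: bytes the JVM can still allocate
    // before hitting the -Xmx ceiling.
    static long availableHeap() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory(); // bytes actually in use
        return rt.maxMemory() - used;                   // bytes still allocatable
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // freeMemory() is free space inside the committed heap only,
        // so it can look small even when maxMemory() is far away.
        System.out.println("FREE (committed heap) -- " + rt.freeMemory());
        System.out.println("MAX (-Xmx)            -- " + rt.maxMemory());
        System.out.println("AVAILABLE (approx)    -- " + availableHeap());
    }
}
```

With this in mind, the 320 MB "FREE MEMORY" reading in the original post is consistent with a mostly empty 1.9 GB heap whose committed portion simply had not grown yet.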