Hi,

>> "Dumping heap to ./heapdump.hprof"

>> File myheapdump.hprof does not exist.

The file names don't match: the JVM wrote ./heapdump.hprof, but your script tries to upload myheapdump.hprof. Can you check your script and command-line args?
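
A minimal sketch of a dump.sh whose file name agrees with the JVM option, assuming the task runs with -XX:HeapDumpPath=./heapdump.hprof as your log shows. The DUMP_FILE and DEST_DIR variables and the dest_name helper are just my naming, not anything Hadoop defines; the destination directory is the one from your script. It uses tr rather than ${PWD//\//_} so that it does not depend on /bin/sh being bash.

```shell
#!/bin/sh
# Sketch only: DUMP_FILE must be the same name that is passed to -XX:HeapDumpPath.
DUMP_FILE="${DUMP_FILE:-heapdump.hprof}"
# Destination directory on HDFS, taken from the script quoted in this thread.
DEST_DIR="${DEST_DIR:-/tmp/myheapdump_ims}"

# Turn a directory path into a flat file name: every "/" becomes "_".
dest_name() {
    printf '%s.hprof\n' "$(printf '%s' "$1" | tr '/' '_')"
}

if [ -f "$DUMP_FILE" ]; then
    # Upload the dump under a name derived from this task attempt's directory.
    hadoop dfs -put "$DUMP_FILE" "$DEST_DIR/$(dest_name "$PWD")"
else
    # Report the mismatch in the task logs instead of failing inside "put".
    echo "dump.sh: $DUMP_FILE not found in $PWD" >&2
fi
```

If the two names drift apart again, the else branch should make that visible in the task's stderr.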

Thanks
hemanth


On Wed, Mar 27, 2013 at 3:21 PM, nagarjuna kanamarlapudi <
nagarjuna.kanamarlap...@gmail.com> wrote:

> Hi Hemanth,
>
> Nice to see this. I did not know about this till now.
>
> But one more issue: the dump file did not get created. The
> following are the logs:
>
>
>
> attempt_201302211510_81218_m_000000_0: /data/1/mapred/local/taskTracker/distcache/8776089957260881514_-363500746_715125253/cmp111wcd/user/ims-b/nagarjuna/AddressId_Extractor/Numbers
> attempt_201302211510_81218_m_000000_0: java.lang.OutOfMemoryError: Java heap space
> attempt_201302211510_81218_m_000000_0: Dumping heap to ./heapdump.hprof ...
> attempt_201302211510_81218_m_000000_0: Heap dump file created [210641441 bytes in 3.778 secs]
> attempt_201302211510_81218_m_000000_0: #
> attempt_201302211510_81218_m_000000_0: # java.lang.OutOfMemoryError: Java heap space
> attempt_201302211510_81218_m_000000_0: # -XX:OnOutOfMemoryError="./dump.sh"
> attempt_201302211510_81218_m_000000_0: #   Executing /bin/sh -c "./dump.sh"...
> attempt_201302211510_81218_m_000000_0: put: File myheapdump.hprof does not exist.
> attempt_201302211510_81218_m_000000_0: log4j:WARN No appenders could be found for logger (org.apache.hadoop.hdfs.DFSClient).
>
>
>
>
>
> On Wed, Mar 27, 2013 at 2:29 PM, Hemanth Yamijala <
> yhema...@thoughtworks.com> wrote:
>
>> Couple of things to check:
>>
>> Does your class com.hadoop.publicationMrPOC.Launcher implement the Tool
>> interface ? You can look at an example at (
>> http://hadoop.apache.org/docs/r1.0.4/mapred_tutorial.html#Source+Code-N110D0).
>> That's what accepts the -D params on command line. Alternatively, you can
>> also set the same in the configuration object like this, in your launcher
>> code:
>>
>> Configuration conf = new Configuration();
>> conf.set("mapred.create.symlink", "yes");
>> conf.set("mapred.cache.files",
>>     "hdfs:///user/hemanty/scripts/copy_dump.sh#copy_dump.sh");
>> conf.set("mapred.child.java.opts",
>>     "-Xmx200m -XX:+HeapDumpOnOutOfMemoryError "
>>     + "-XX:HeapDumpPath=./heapdump.hprof -XX:OnOutOfMemoryError=./copy_dump.sh");
>>
>>
>> Second, the position of the arguments matters: the -D options have to come
>> after the jar and the class name, and before the job's own arguments. I
>> think the command should be
>>
>> hadoop jar LL.jar com.hadoop.publicationMrPOC.Launcher
>> -Dmapred.create.symlink=yes
>> -Dmapred.cache.files=hdfs:///user/ims-b/dump.sh#dump.sh
>> -Dmapred.reduce.child.java.opts='-Xmx2048m -XX:+HeapDumpOnOutOfMemoryError
>> -XX:HeapDumpPath=./myheapdump.hprof -XX:OnOutOfMemoryError=./dump.sh'
>> Fudan\ Univ
>>
>> Thanks
>> Hemanth
>>
>>
>> On Wed, Mar 27, 2013 at 1:58 PM, nagarjuna kanamarlapudi <
>> nagarjuna.kanamarlap...@gmail.com> wrote:
>>
>>> Hi Hemanth/Koji,
>>>
>>> It seems the above script doesn't work for me. Can you look into the
>>> following and suggest what more I can do?
>>>
>>>
>>>  hadoop fs -cat /user/ims-b/dump.sh
>>> #!/bin/sh
>>> hadoop dfs -put myheapdump.hprof /tmp/myheapdump_ims/${PWD//\//_}.hprof
>>>
>>>
>>> hadoop jar LL.jar com.hadoop.publicationMrPOC.Launcher  Fudan\ Univ
>>>  -Dmapred.create.symlink=yes
>>> -Dmapred.cache.files=hdfs:///user/ims-b/dump.sh#dump.sh
>>> -Dmapred.reduce.child.java.opts='-Xmx2048m -XX:+HeapDumpOnOutOfMemoryError
>>> -XX:HeapDumpPath=./myheapdump.hprof -XX:OnOutOfMemoryError=./dump.sh'
>>>
>>>
>>> I am not able to see the heap dump at  /tmp/myheapdump_ims
>>>
>>>
>>>
>>> Error in the mapper:
>>>
>>> Caused by: java.lang.reflect.InvocationTargetException
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>>>     ... 17 more
>>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>>     at java.util.Arrays.copyOf(Arrays.java:2734)
>>>     at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
>>>     at java.util.ArrayList.add(ArrayList.java:351)
>>>     at com.hadoop.publicationMrPOC.PublicationMapper.configure(PublicationMapper.java:59)
>>>     ... 22 more
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Mar 27, 2013 at 10:16 AM, Hemanth Yamijala <
>>> yhema...@thoughtworks.com> wrote:
>>>
>>>> Koji,
>>>>
>>>> Works beautifully. Thanks a lot. I learnt at least 3 different things
>>>> with your script today !
>>>>
>>>> Hemanth
>>>>
>>>>
>>>> On Tue, Mar 26, 2013 at 9:41 PM, Koji Noguchi
>>>> <knogu...@yahoo-inc.com> wrote:
>>>>
>>>>> Create a dump.sh on hdfs.
>>>>>
>>>>> $ hadoop dfs -cat /user/knoguchi/dump.sh
>>>>> #!/bin/sh
>>>>> hadoop dfs -put myheapdump.hprof /tmp/myheapdump_knoguchi/${PWD//\//_}.hprof
>>>>>
>>>>> Run your job with
>>>>>
>>>>> -Dmapred.create.symlink=yes
>>>>> -Dmapred.cache.files=hdfs:///user/knoguchi/dump.sh#dump.sh
>>>>> -Dmapred.reduce.child.java.opts='-Xmx2048m
>>>>> -XX:+HeapDumpOnOutOfMemoryError
>>>>> -XX:HeapDumpPath=./myheapdump.hprof -XX:OnOutOfMemoryError=./dump.sh'
>>>>>
>>>>> This should create the heap dump on hdfs at /tmp/myheapdump_knoguchi.
>>>>>
>>>>> Koji
>>>>>
>>>>>
>>>>> On Mar 26, 2013, at 11:53 AM, Hemanth Yamijala wrote:
>>>>>
>>>>> > Hi,
>>>>> >
>>>>> > I tried -XX:+HeapDumpOnOutOfMemoryError. Unfortunately, as I
>>>>> > suspected, the dump goes to the current working directory of the task
>>>>> > attempt as it executes on the cluster. This directory is cleaned up once
>>>>> > the task is done. There are options to keep failed task files or task
>>>>> > files matching a pattern. However, these do NOT retain the current
>>>>> > working directory. Hence, there is no way to get the dump from a
>>>>> > cluster, AFAIK.
>>>>> >
>>>>> > You are effectively left with the jmap option on a pseudo-distributed
>>>>> > cluster, I think.
>>>>> >
>>>>> > Thanks
>>>>> > Hemanth
>>>>> >
>>>>> >
>>>>> > On Tue, Mar 26, 2013 at 11:37 AM, Hemanth Yamijala <
>>>>> yhema...@thoughtworks.com> wrote:
>>>>> > If your task is running out of memory, you could add the option
>>>>> > -XX:+HeapDumpOnOutOfMemoryError to mapred.child.java.opts (along with
>>>>> > the heap size). However, I am not sure where it stores the dump. You
>>>>> > might need to experiment a little with it. I will try and send out the
>>>>> > info if I get time to try it out.
>>>>> >
>>>>> >
>>>>> > Thanks
>>>>> > Hemanth
>>>>> >
>>>>> >
>>>>> > On Tue, Mar 26, 2013 at 10:23 AM, nagarjuna kanamarlapudi <
>>>>> nagarjuna.kanamarlap...@gmail.com> wrote:
>>>>> > Hi hemanth,
>>>>> >
>>>>> > This sounds interesting; I will try that out on the pseudo cluster.
>>>>> > But the real problem for me is that the cluster is maintained by a
>>>>> > third party. I only have an edge node through which I can submit the
>>>>> > jobs.
>>>>> >
>>>>> > Is there any other way of getting the dump, instead of physically
>>>>> > going to that machine and checking it out?
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tue, Mar 26, 2013 at 10:12 AM, Hemanth Yamijala <
>>>>> yhema...@thoughtworks.com> wrote:
>>>>> > Hi,
>>>>> >
>>>>> > One option to find what could be taking the memory is to use jmap on
>>>>> the running task. The steps I followed are:
>>>>> >
>>>>> > - I ran a sleep job (which comes in the examples jar of the
>>>>> distribution - effectively does nothing in the mapper / reducer).
>>>>> > - From the JobTracker UI looked at a map task attempt ID.
>>>>> > - Then on the machine where the map task is running, got the PID of
>>>>> the running task - ps -ef | grep <task attempt id>
>>>>> > - On the same machine executed jmap -histo <pid>
>>>>> >
>>>>> > This will give you an idea of the count of objects allocated and
>>>>> size. Jmap also has options to get a dump, that will contain more
>>>>> information, but this should help to get you started with debugging.
>>>>> >
>>>>> > For my sleep job task - I saw allocations worth roughly 130 MB.
>>>>> >
>>>>> > Thanks
>>>>> > hemanth
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Mon, Mar 25, 2013 at 6:43 PM, Nagarjuna Kanamarlapudi <
>>>>> nagarjuna.kanamarlap...@gmail.com> wrote:
>>>>> > I have a lookup file which I need in the mapper. So I am trying to
>>>>> > read the whole file and load it into a list in the mapper.
>>>>> >
>>>>> >
>>>>> > For each and every record, I look up this file, which I got from the
>>>>> > distributed cache.
>>>>> >
>>>>> > —
>>>>> > Sent from iPhone
>>>>> >
>>>>> >
>>>>> > On Mon, Mar 25, 2013 at 6:39 PM, Hemanth Yamijala <
>>>>> yhema...@thoughtworks.com> wrote:
>>>>> >
>>>>> > Hmm. How are you loading the file into memory ? Is it some sort of
>>>>> memory mapping etc ? Are they being read as records ? Some details of the
>>>>> app will help
>>>>> >
>>>>> >
>>>>> > On Mon, Mar 25, 2013 at 2:14 PM, nagarjuna kanamarlapudi <
>>>>> nagarjuna.kanamarlap...@gmail.com> wrote:
>>>>> > Hi Hemanth,
>>>>> >
>>>>> > I tried out your suggestion loading 420 MB file into memory. It
>>>>> threw java heap space error.
>>>>> >
>>>>> > I am not sure where this 1.6 GB of configured heap went to ?
>>>>> >
>>>>> >
>>>>> > On Mon, Mar 25, 2013 at 12:01 PM, Hemanth Yamijala <
>>>>> yhema...@thoughtworks.com> wrote:
>>>>> > Hi,
>>>>> >
>>>>> > The free memory might be low, just because GC hasn't reclaimed what
>>>>> it can. Can you just try reading in the data you want to read and see if
>>>>> that works ?
>>>>> >
>>>>> > Thanks
>>>>> > Hemanth
>>>>> >
>>>>> >
>>>>> > On Mon, Mar 25, 2013 at 10:32 AM, nagarjuna kanamarlapudi <
>>>>> nagarjuna.kanamarlap...@gmail.com> wrote:
>>>>> > io.sort.mb = 256 MB
>>>>> >
>>>>> >
>>>>> > On Monday, March 25, 2013, Harsh J wrote:
>>>>> > The MapTask may consume some memory of its own as well. What is your
>>>>> > io.sort.mb (MR1) or mapreduce.task.io.sort.mb (MR2) set to?
>>>>> >
>>>>> > On Sun, Mar 24, 2013 at 3:40 PM, nagarjuna kanamarlapudi
>>>>> > <nagarjuna.kanamarlap...@gmail.com> wrote:
>>>>> > > Hi,
>>>>> > >
>>>>> > > I configured  my child jvm heap to 2 GB. So, I thought I could
>>>>> really read
>>>>> > > 1.5GB of data and store it in memory (mapper/reducer).
>>>>> > >
>>>>> > > I wanted to confirm the same and wrote the following piece of code
>>>>> in the
>>>>> > > configure method of mapper.
>>>>> > >
>>>>> > > @Override
>>>>> > > public void configure(JobConf job) {
>>>>> > >     System.out.println("FREE MEMORY -- "
>>>>> > >         + Runtime.getRuntime().freeMemory());
>>>>> > >     System.out.println("MAX MEMORY ---"
>>>>> > >         + Runtime.getRuntime().maxMemory());
>>>>> > > }
>>>>> > >
>>>>> > >
>>>>> > > Surprisingly the output was
>>>>> > >
>>>>> > >
>>>>> > > FREE MEMORY -- 341854864  = 320 MB
>>>>> > > MAX MEMORY ---1908932608  = 1.9 GB
>>>>> > >
>>>>> > >
>>>>> > > I am just wondering what processes are taking up that extra 1.6GB
>>>>> of heap
>>>>> > > which I configured for the child jvm heap.
>>>>> > >
>>>>> > >
>>>>> > > I would appreciate help in understanding the scenario.
>>>>> > >
>>>>> > >
>>>>> > >
>>>>> > > Regards
>>>>> > >
>>>>> > > Nagarjuna K
>>>>> > >
>>>>> > >
>>>>> > >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> > Harsh J
>>>>> >
>>>>> >
>>>>> > --
>>>>> > Sent from iPhone
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>>
>>>>>
>>>>
>>>
>>
>
