Re: Hadoop Jobtracker heap size calculation and OOME

2013-10-14 Thread Viswanathan J
Hi,

Not yet updated in the production environment. Will keep you posted once it
is done.

In which Apache Hadoop release will this issue be fixed? Or is it already
fixed in hadoop-1.2.1, as per the link below?

https://issues.apache.org/jira/browse/MAPREDUCE-5351

Please confirm.

Thanks,
On Oct 15, 2013 3:43 AM, "Antwnis"  wrote:

> Viswana,
>
> please confirm :) whether the issue was fixed - for future readers of this
> thread
>
> with this configuration, after restarting the JobTracker you should see on
> the JobTracker page that memory usage remains low over time
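>
> one way to watch that from a shell, as a sketch (assuming the JDK tools and
> pgrep are available on the JobTracker host):
>
> # find the JobTracker JVM and print heap/GC utilisation every 60 seconds
> JT_PID=$(pgrep -f 'org.apache.hadoop.mapred.JobTracker' | head -1)
> jstat -gcutil "$JT_PID" 60000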
>
> Antonios
>
>
> On Mon, Oct 14, 2013 at 10:56 AM, Antwnis  wrote:
>
>> After changing mapred-site.xml, you will have to restart the JobTracker
>> for the changes to take effect
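>>
>> A sketch of how that restart might look on the JobTracker host with a
>> Hadoop 1.x tarball install (assuming $HADOOP_HOME points at it):
>>
>> $HADOOP_HOME/bin/hadoop-daemon.sh stop jobtracker
>> $HADOOP_HOME/bin/hadoop-daemon.sh start jobtracker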
>>
>>
>> On Mon, Oct 14, 2013 at 10:37 AM, Viswanathan J <
>> jayamviswanat...@gmail.com> wrote:
>>
>>> Thanks a lot, Antonio.
>>>
>>> I'm using Apache Hadoop; I hope this issue will be resolved in
>>> upcoming Apache Hadoop releases.
>>>
>>> Do I need to restart the whole cluster after changing the mapred-site
>>> conf as you mentioned?
>>>
>>> What is the following bug ID?
>>>
>>>
>>> https://issues.apache.org/jira/browse/MAPREDUCE-5351
>>>
>>> Is this issue different from the OOME? They mention that the issue is
>>> fixed.
>>>
>>> Thanks,
>>> Viswa.J
>>>  On Oct 14, 2013 2:44 PM, "Antonios Chalkiopoulos" 
>>> wrote:
>>>
In *mapred-site.xml* you need the following snippet:

<property>
  <name>mapreduce.jobtracker.retiredjobs.cache.size</name>
  <value>100</value>
</property>
<property>
  <name>keep.failed.task.files</name>
  <value>true</value>
</property>
<property>
  <name>keep.task.files.pattern</name>
  <value>shouldnevereverevermatch</value>
</property>


This will fix the memory leak issue (the official fix, I think, is
available in Cloudera's CDH 4.6 distribution).
It will cause another issue - the .staging files are no longer removed
from the /user/*/.staging/ location.


To overcome this, use a daily Jenkins job (or cron) such as:

#!/bin/bash
# delete /user/*/.staging directories older than 7 days
# (strftime/mktime require GNU awk)
LAST_DATE=$(date -ud '-7days' +%s)
hdfs dfs -ls /user/*/.staging | awk '/^d/ {
  m_date=$6; gsub("-"," ",m_date);
  ep_date=strftime("%s", mktime(m_date" 00 00 00"));
  if (ep_date <= l_date) print $8
}' l_date=$LAST_DATE | xargs -P 2 --verbose hdfs dfs -rm -r -skipTrash


The above will remove all directories that were created more than 7
days ago, and will keep your HDFS clean.
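
If you want to be cautious, do a dry run first - same listing assumptions
as the script above, but the delete replaced with an echo so it only
prints the candidate paths:

hdfs dfs -ls /user/*/.staging | awk '/^d/ {
  m_date=$6; gsub("-"," ",m_date);
  ep_date=strftime("%s", mktime(m_date" 00 00 00"));
  if (ep_date <= l_date) print $8
}' l_date=$(date -ud '-7days' +%s) | xargs --verbose echo

And if you go with cron rather than Jenkins, a daily entry could look like
this (clean_staging.sh being a hypothetical wrapper around the script
above):

0 2 * * * /usr/local/bin/clean_staging.sh >> /var/log/clean_staging.log 2>&1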



 On Monday, 14 October 2013 09:52:41 UTC+1, Viswanathan J wrote:
>
> Hi guys,
>
> Appreciate your response.
>
> Thanks,
> Viswa.J
> On Oct 12, 2013 11:29 PM, "Viswanathan J" 
> wrote:
>
>> Hi Guys,
>>
>> But I can see the JobTracker OOME issue listed as fixed in the Hadoop
>> 1.2.1 release notes, as below.
>>
>> Please check this URL:
>>
>> https://issues.apache.org/jira/browse/MAPREDUCE-5351
>>
>> How come the issue still persists? Am I asking a valid thing?
>>
>> Do I need to configure anything, or am I missing anything?
>>
>> Please help. Appreciate your response.
>>
>> Thanks,
>> Viswa.J
>> On Oct 12, 2013 7:57 PM, "Viswanathan J" 
>> wrote:
>>
>>> Thanks Antonio, I hope the memory leak issue will be resolved. It's
>>> really a nightmare every week.
>>>
>>> In which release will this issue be resolved?
>>>
>>> How do we solve this issue? Please help, because we are facing it in a
>>> production environment.
>>>
>>> Please share the configuration and cron to do that cleanup process.
>>>
>>> Thanks,
>>> Viswa
>>> On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos" 
>>> wrote:
>>>
 "After restart the JT, within a week getting OOME."

Viswa, we were having the same issue in our cluster as well -
getting an OOME roughly every 5-7 days.
The heap size of the JobTracker was constantly increasing due to a
memory leak that will hopefully be fixed in newer releases.

There is a configuration change in the JobTracker that disables the
cleanup of staging files, i.e.
/user/build/.staging/* - but that means that you will have to
handle the staging files yourself through a cron / Jenkins task.

 I'll get you the configuration on Monday..

 On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:
>
> Hi,
>
> I'm running a 14-node Hadoop cluster with datanodes and tasktrackers
> running on all nodes.
>
> *Apache Hadoop:* 1.2.1
>
> It shows the heap size currently as follows: *Cluster Summary (Heap
> Size is 5.7/8.89 GB)* ...

Re: Hadoop Jobtracker heap size calculation and OOME

2013-10-14 Thread Viswanathan J
Thanks a lot, Antonio.

I'm using Apache Hadoop; I hope this issue will be resolved in upcoming
Apache Hadoop releases.

Do I need to restart the whole cluster after changing the mapred-site conf
as you mentioned?

What is the following bug ID?

https://issues.apache.org/jira/browse/MAPREDUCE-5351

Is this issue different from the OOME? They mention that the issue is
fixed.

Thanks,
Viswa.J
 On Oct 14, 2013 2:44 PM, "Antonios Chalkiopoulos" 
wrote:

> In *mapred-site.xml* you need the following snippet:
>
> <property>
>   <name>mapreduce.jobtracker.retiredjobs.cache.size</name>
>   <value>100</value>
> </property>
> <property>
>   <name>keep.failed.task.files</name>
>   <value>true</value>
> </property>
> <property>
>   <name>keep.task.files.pattern</name>
>   <value>shouldnevereverevermatch</value>
> </property>
>
>
> This will fix the memory leak issue (the official fix, I think, is
> available in Cloudera's CDH 4.6 distribution).
> It will cause another issue - the .staging files are no longer removed
> from the /user/*/.staging/ location.
>
>
> To overcome this, use a daily Jenkins job (or cron) such as:
>
> #!/bin/bash
> # delete /user/*/.staging directories older than 7 days
> # (strftime/mktime require GNU awk)
> LAST_DATE=$(date -ud '-7days' +%s)
> hdfs dfs -ls /user/*/.staging | awk '/^d/ {
>   m_date=$6; gsub("-"," ",m_date);
>   ep_date=strftime("%s", mktime(m_date" 00 00 00"));
>   if (ep_date <= l_date) print $8
> }' l_date=$LAST_DATE | xargs -P 2 --verbose hdfs dfs -rm -r -skipTrash
>
>
> The above will remove all directories that were created more than 7 days
> ago, and will keep your HDFS clean.
>
>
>
> On Monday, 14 October 2013 09:52:41 UTC+1, Viswanathan J wrote:
>>
>> Hi guys,
>>
>> Appreciate your response.
>>
>> Thanks,
>> Viswa.J
>> On Oct 12, 2013 11:29 PM, "Viswanathan J"  wrote:
>>
>>> Hi Guys,
>>>
>>> But I can see the JobTracker OOME issue listed as fixed in the Hadoop
>>> 1.2.1 release notes, as below.
>>>
>>> Please check this URL:
>>>
>>> https://issues.apache.org/jira/browse/MAPREDUCE-5351
>>>
>>> How come the issue still persists? Am I asking a valid thing?
>>>
>>> Do I need to configure anything, or am I missing anything?
>>>
>>> Please help. Appreciate your response.
>>>
>>> Thanks,
>>> Viswa.J
>>> On Oct 12, 2013 7:57 PM, "Viswanathan J"  wrote:
>>>
Thanks Antonio, I hope the memory leak issue will be resolved. It's really
a nightmare every week.

In which release will this issue be resolved?

How do we solve this issue? Please help, because we are facing it in a
production environment.

 Please share the configuration and cron to do that cleanup process.

 Thanks,
 Viswa
 On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos" 
 wrote:

> "After restart the JT, within a week getting OOME."
>
> Viswa, we were having the same issue in our cluster as well - getting an
> OOME roughly every 5-7 days.
> The heap size of the JobTracker was constantly increasing due to a
> memory leak that will hopefully be fixed in newer releases.
>
> There is a configuration change in the JobTracker that disables the
> cleanup of staging files, i.e.
> /user/build/.staging/* - but that means that you will have to handle
> the staging files yourself through a cron / Jenkins task.
>
> I'll get you the configuration on Monday..
>
> On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:
>>
>> Hi,
>>
>> I'm running a 14-node Hadoop cluster with datanodes and tasktrackers
>> running on all nodes.
>>
>> *Apache Hadoop :* 1.2.1
>>
>> It shows the heap size currently as follows:
>>
>> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>>
>> In the above summary, what does the *8.89* GB represent? Is *8.89* the
>> maximum heap size for the JobTracker? If yes, how is it calculated?
>>
>> I assume *5.7* is the heap size currently used by running jobs; how is
>> it calculated?
>>
>> I have set the JobTracker default memory size in hadoop-env.sh:
>>
>> *HADOOP_HEAPSIZE="1024"*
>> I have set the mapred.child.java.opts value in mapred-site.xml as:
>>
>> <property>
>>   <name>mapred.child.java.opts</name>
>>   <value>-Xmx2048m</value>
>> </property>
>>
>> Even after setting the above property, I am getting the JobTracker OOME
>> issue. The JobTracker memory is gradually increasing; after restarting
>> the JT, I get an OOME within a week.
>>
>> How do I resolve this? It is in production and critical. Please help.
>> Thanks in advance.
>>
>> --
>> Regards,
>> Viswa.J
>>

Re: Hadoop Jobtracker heap size calculation and OOME

2013-10-14 Thread Viswanathan J
Hi guys,

Appreciate your response.

Thanks,
Viswa.J
On Oct 12, 2013 11:29 PM, "Viswanathan J" 
wrote:

> Hi Guys,
>
> But I can see the JobTracker OOME issue listed as fixed in the Hadoop
> 1.2.1 release notes, as below.
>
> Please check this URL:
>
> https://issues.apache.org/jira/browse/MAPREDUCE-5351
>
> How come the issue still persists? Am I asking a valid thing?
>
> Do I need to configure anything, or am I missing anything?
>
> Please help. Appreciate your response.
>
> Thanks,
> Viswa.J
> On Oct 12, 2013 7:57 PM, "Viswanathan J" 
> wrote:
>
>> Thanks Antonio, I hope the memory leak issue will be resolved. It's
>> really a nightmare every week.
>>
>> In which release will this issue be resolved?
>>
>> How do we solve this issue? Please help, because we are facing it in a
>> production environment.
>>
>> Please share the configuration and cron to do that cleanup process.
>>
>> Thanks,
>> Viswa
>> On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos" 
>> wrote:
>>
>>> "After restart the JT, within a week getting OOME."
>>>
>>> Viswa, we were having the same issue in our cluster as well - getting
>>> an OOME roughly every 5-7 days.
>>> The heap size of the JobTracker was constantly increasing due to a
>>> memory leak that will hopefully be fixed in newer releases.
>>>
>>> There is a configuration change in the JobTracker that disables the
>>> cleanup of staging files, i.e.
>>> /user/build/.staging/* - but that means that you will have to handle
>>> the staging files yourself through a cron / Jenkins task.
>>>
>>> I'll get you the configuration on Monday..
>>>
>>> On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:

 Hi,

I'm running a 14-node Hadoop cluster with datanodes and tasktrackers
running on all nodes.

 *Apache Hadoop :* 1.2.1

 It shows the heap size currently as follows:

*Cluster Summary (Heap Size is 5.7/8.89 GB)*

In the above summary, what does the *8.89* GB represent? Is *8.89* the
maximum heap size for the JobTracker? If yes, how is it calculated?

I assume *5.7* is the heap size currently used by running jobs; how is it
calculated?

I have set the JobTracker default memory size in hadoop-env.sh:

*HADOOP_HEAPSIZE="1024"*
I have set the mapred.child.java.opts value in mapred-site.xml as:

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx2048m</value>
</property>

Even after setting the above property, I am getting the JobTracker OOME
issue. The JobTracker memory is gradually increasing; after restarting the
JT, I get an OOME within a week.

How do I resolve this? It is in production and critical. Please help.
 Thanks in advance.

 --
 Regards,
 Viswa.J



Re: Hadoop Jobtracker heap size calculation and OOME

2013-10-12 Thread Viswanathan J
Hi Guys,

But I can see the JobTracker OOME issue listed as fixed in the Hadoop 1.2.1
release notes, as below.

Please check this URL:

https://issues.apache.org/jira/browse/MAPREDUCE-5351

How come the issue still persists? Am I asking a valid thing?

Do I need to configure anything, or am I missing anything?

Please help. Appreciate your response.

Thanks,
Viswa.J
On Oct 12, 2013 7:57 PM, "Viswanathan J"  wrote:

> Thanks Antonio, I hope the memory leak issue will be resolved. It's really
> a nightmare every week.
>
> In which release will this issue be resolved?
>
> How do we solve this issue? Please help, because we are facing it in a
> production environment.
>
> Please share the configuration and cron to do that cleanup process.
>
> Thanks,
> Viswa
> On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos" 
> wrote:
>
>> "After restart the JT, within a week getting OOME."
>>
>> Viswa, we were having the same issue in our cluster as well - getting an
>> OOME roughly every 5-7 days.
>> The heap size of the JobTracker was constantly increasing due to a
>> memory leak that will hopefully be fixed in newer releases.
>>
>> There is a configuration change in the JobTracker that disables the
>> cleanup of staging files, i.e.
>> /user/build/.staging/* - but that means that you will have to handle the
>> staging files yourself through a cron / Jenkins task.
>>
>> I'll get you the configuration on Monday..
>>
>> On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:
>>>
>>> Hi,
>>>
>>> I'm running a 14-node Hadoop cluster with datanodes and tasktrackers
>>> running on all nodes.
>>>
>>> *Apache Hadoop :* 1.2.1
>>>
>>> It shows the heap size currently as follows:
>>>
>>> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>>>
>>> In the above summary, what does the *8.89* GB represent? Is *8.89* the
>>> maximum heap size for the JobTracker? If yes, how is it calculated?
>>>
>>> I assume *5.7* is the heap size currently used by running jobs; how is
>>> it calculated?
>>>
>>> I have set the JobTracker default memory size in hadoop-env.sh:
>>>
>>> *HADOOP_HEAPSIZE="1024"*
>>> I have set the mapred.child.java.opts value in mapred-site.xml as:
>>>
>>> <property>
>>>   <name>mapred.child.java.opts</name>
>>>   <value>-Xmx2048m</value>
>>> </property>
>>>
>>> Even after setting the above property, I am getting the JobTracker OOME
>>> issue. The JobTracker memory is gradually increasing; after restarting
>>> the JT, I get an OOME within a week.
>>>
>>> How do I resolve this? It is in production and critical. Please help.
>>> Thanks in advance.
>>>
>>> --
>>> Regards,
>>> Viswa.J
>>>


Re: Hadoop Jobtracker heap size calculation and OOME

2013-10-12 Thread Viswanathan J
Thanks Antonio, I hope the memory leak issue will be resolved. It's really
a nightmare every week.

In which release will this issue be resolved?

How do we solve this issue? Please help, because we are facing it in a
production environment.

Please share the configuration and cron to do that cleanup process.

Thanks,
Viswa
On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos"  wrote:

> "After restart the JT, within a week getting OOME."
>
> Viswa, we were having the same issue in our cluster as well - getting an
> OOME roughly every 5-7 days.
> The heap size of the JobTracker was constantly increasing due to a memory
> leak that will hopefully be fixed in newer releases.
>
> There is a configuration change in the JobTracker that disables the
> cleanup of staging files, i.e.
> /user/build/.staging/* - but that means that you will have to handle the
> staging files yourself through a cron / Jenkins task.
>
> I'll get you the configuration on Monday..
>
> On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:
>>
>> Hi,
>>
>> I'm running a 14-node Hadoop cluster with datanodes and tasktrackers
>> running on all nodes.
>>
>> *Apache Hadoop :* 1.2.1
>>
>> It shows the heap size currently as follows:
>>
>> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>>
>> In the above summary, what does the *8.89* GB represent? Is *8.89* the
>> maximum heap size for the JobTracker? If yes, how is it calculated?
>>
>> I assume *5.7* is the heap size currently used by running jobs; how is
>> it calculated?
>>
>> I have set the JobTracker default memory size in hadoop-env.sh:
>>
>> *HADOOP_HEAPSIZE="1024"*
>> I have set the mapred.child.java.opts value in mapred-site.xml as:
>>
>> <property>
>>   <name>mapred.child.java.opts</name>
>>   <value>-Xmx2048m</value>
>> </property>
>>
>> Even after setting the above property, I am getting the JobTracker OOME
>> issue. The JobTracker memory is gradually increasing; after restarting
>> the JT, I get an OOME within a week.
>>
>> How do I resolve this? It is in production and critical. Please help.
>> Thanks in advance.
>>
>> --
>> Regards,
>> Viswa.J
>>


Re: Hadoop Jobtracker heap size calculation and OOME

2013-10-12 Thread Viswanathan J
Hi Harsh,

Appreciate the response.

Thanks Reyane.

Thanks,
Viswa.J
On Oct 12, 2013 5:04 AM, "Reyane Oukpedjo"  wrote:

> Hi there,
> I had a similar issue with hadoop-1.2.0: the JobTracker kept crashing until
> I set HADOOP_HEAPSIZE="2048". I did not have this kind of issue with
> previous versions. You can try this if you have the memory, and see. In my
> case the issue was gone after I set it as above.
>
> Thanks
>
>
> Reyane OUKPEDJO
>
>
> On 11 October 2013 13:08, Viswanathan J wrote:
>
>> Hi,
>>
>> I'm running a 14-node Hadoop cluster with datanodes and tasktrackers
>> running on all nodes.
>>
>> *Apache Hadoop :* 1.2.1
>>
>> It shows the heap size currently as follows:
>>
>> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>>
>> In the above summary, what does the *8.89* GB represent? Is *8.89* the
>> maximum heap size for the JobTracker? If yes, how is it calculated?
>>
>> I assume *5.7* is the heap size currently used by running jobs; how is
>> it calculated?
>>
>> I have set the JobTracker default memory size in hadoop-env.sh:
>>
>> *HADOOP_HEAPSIZE="1024"*
>> I have set the mapred.child.java.opts value in mapred-site.xml as:
>>
>> <property>
>>   <name>mapred.child.java.opts</name>
>>   <value>-Xmx2048m</value>
>> </property>
>>
>> Even after setting the above property, I am getting the JobTracker OOME
>> issue. The JobTracker memory is gradually increasing; after restarting
>> the JT, I get an OOME within a week.
>>
>> How do I resolve this? It is in production and critical. Please help.
>> Thanks in advance.
>>
>> --
>> Regards,
>> Viswa.J
>>
>
>


Re: Hadoop Jobtracker heap size calculation and OOME

2013-10-11 Thread Reyane Oukpedjo
Hi there,
I had a similar issue with hadoop-1.2.0: the JobTracker kept crashing until
I set HADOOP_HEAPSIZE="2048". I did not have this kind of issue with
previous versions. You can try this if you have the memory, and see. In my
case the issue was gone after I set it as above.
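
For example, in hadoop-env.sh - a sketch, noting that HADOOP_HEAPSIZE
raises the heap for every daemon started from that file, while
HADOOP_JOBTRACKER_OPTS targets only the JobTracker:

# raise the default daemon heap (in MB)
export HADOOP_HEAPSIZE=2048
# alternatively, raise only the JobTracker's heap
export HADOOP_JOBTRACKER_OPTS="$HADOOP_JOBTRACKER_OPTS -Xmx2048m"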

Thanks


Reyane OUKPEDJO


On 11 October 2013 13:08, Viswanathan J  wrote:

> Hi,
>
> I'm running a 14-node Hadoop cluster with datanodes and tasktrackers
> running on all nodes.
>
> *Apache Hadoop :* 1.2.1
>
> It shows the heap size currently as follows:
>
> *Cluster Summary (Heap Size is 5.7/8.89 GB)*
>
> In the above summary, what does the *8.89* GB represent? Is *8.89* the
> maximum heap size for the JobTracker? If yes, how is it calculated?
>
> I assume *5.7* is the heap size currently used by running jobs; how is it
> calculated?
>
> I have set the JobTracker default memory size in hadoop-env.sh:
>
> *HADOOP_HEAPSIZE="1024"*
> I have set the mapred.child.java.opts value in mapred-site.xml as:
>
> <property>
>   <name>mapred.child.java.opts</name>
>   <value>-Xmx2048m</value>
> </property>
>
> Even after setting the above property, I am getting the JobTracker OOME
> issue. The JobTracker memory is gradually increasing; after restarting
> the JT, I get an OOME within a week.
>
> How do I resolve this? It is in production and critical. Please help.
> Thanks in advance.
>
> --
> Regards,
> Viswa.J
>


Hadoop Jobtracker heap size calculation and OOME

2013-10-11 Thread Viswanathan J
Hi,

I'm running a 14-node Hadoop cluster with datanodes and tasktrackers
running on all nodes.

*Apache Hadoop:* 1.2.1

It shows the heap size currently as follows:

*Cluster Summary (Heap Size is 5.7/8.89 GB)*

In the above summary, what does the *8.89* GB represent? Is *8.89* the
maximum heap size for the JobTracker? If yes, how is it calculated?

I assume *5.7* is the heap size currently used by running jobs; how is it
calculated?

I have set the JobTracker default memory size in hadoop-env.sh:

*HADOOP_HEAPSIZE="1024"*
I have set the mapred.child.java.opts value in mapred-site.xml as:

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx2048m</value>
</property>

Even after setting the above property, I am getting the JobTracker OOME
issue. The JobTracker memory is gradually increasing; after restarting the
JT, I get an OOME within a week.

How do I resolve this? It is in production and critical. Please help.
Thanks in advance.
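
(For reference, a sketch of how the JobTracker's actual heap settings can
be checked, assuming the JDK's jps and jmap are available on the
JobTracker host.)

# find the JobTracker process and print its max heap and current usage
JT_PID=$(jps | awk '/JobTracker/ {print $1}')
jmap -heap "$JT_PID" | grep -E 'MaxHeapSize|used'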

-- 
Regards,
Viswa.J