Re: java.io.IOException: Task process exit with nonzero status of 1

2012-05-11 Thread Prashant Kommireddi
You might be running out of disk space. Check for that on your cluster
nodes.
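For example, a quick check on each node might look roughly like this (the
paths are only guesses - substitute whatever your dfs.data.dir,
mapred.local.dir and hadoop.tmp.dir actually point to):

df -h                                            # free space per filesystem
du -sh /var/log/hadoop* 2>/dev/null              # daemon and task logs
du -sh /tmp/hadoop-*/mapred/local 2>/dev/null    # per-task spill/local dirs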

-Prashant

On Fri, May 11, 2012 at 12:21 AM, JunYong Li lij...@gmail.com wrote:

 Are there errors in the task output file?
 On jobtracker.jsp, click the Jobid link - tasks link - Taskid link -
 Task logs link

 2012/5/11 Mohit Kundra mohit@gmail.com

  Hi ,
 
  I am new user to hadoop . I have installed hadoop0.19.1 on single windows
  machine.
  Its http://localhost:50030/jobtracker.jsp and
  http://localhost:50070/dfshealth.jsp pages are working fine but when i
 am
  executing  bin/hadoop jar hadoop-0.19.1-examples.jar pi 5 100
 
  It is showing below
 
  $ bin/hadoop jar hadoop-0.19.1-examples.jar pi 5 100
  cygpath: cannot create short name of D:\hadoop-0.19.1\logs
  Number of Maps = 5 Samples per Map = 100
  Wrote input for Map #0
  Wrote input for Map #1
  Wrote input for Map #2
  Wrote input for Map #3
  Wrote input for Map #4
  Starting Job
  12/05/11 12:07:26 INFO mapred.JobClient:
  Running job: job_20120513_0002
  12/05/11 12:07:27 INFO mapred.JobClient:  map 0% reduce 0%
  12/05/11 12:07:35 INFO mapred.JobClient: Task Id :
  attempt_20120513_0002_m_06_ 0, Status : FAILED
  java.io.IOException: Task process exit with nonzero status of 1.
  at org.apache.hadoop.mapred.TaskRunner.run (TaskRunner.java:425)
 
 
 
  Please tell me what is the root cause
 
  regards ,
  Mohit
 
 


 --
 Regards
 Junyong



Re: java.io.IOException: Task process exit with nonzero status of 1

2012-05-11 Thread Harsh J
Mohit,

Why are you using Hadoop-0.19, a version released many years ago?
Please download the latest stable available at
http://hadoop.apache.org/common/releases.html#Download instead.

On Fri, May 11, 2012 at 12:26 PM, Mohit Kundra mohit@gmail.com wrote:
 Hi ,

 I am new user to hadoop . I have installed hadoop0.19.1 on single windows
 machine.
 Its http://localhost:50030/jobtracker.jsp and
 http://localhost:50070/dfshealth.jsp pages are working fine but when i am
 executing  bin/hadoop jar hadoop-0.19.1-examples.jar pi 5 100

 It is showing below

 $ bin/hadoop jar hadoop-0.19.1-examples.jar pi 5 100
 cygpath: cannot create short name of D:\hadoop-0.19.1\logs
 Number of Maps = 5 Samples per Map = 100
 Wrote input for Map #0
 Wrote input for Map #1
 Wrote input for Map #2
 Wrote input for Map #3
 Wrote input for Map #4
 Starting Job
 12/05/11 12:07:26 INFO mapred.JobClient:
 Running job: job_20120513_0002
 12/05/11 12:07:27 INFO mapred.JobClient:  map 0% reduce 0%
 12/05/11 12:07:35 INFO mapred.JobClient: Task Id :
 attempt_20120513_0002_m_06_ 0, Status : FAILED
 java.io.IOException: Task process exit with nonzero status of 1.
 at org.apache.hadoop.mapred.TaskRunner.run (TaskRunner.java:425)



 Please tell me what is the root cause

 regards ,
 Mohit




-- 
Harsh J


Re: Monitoring Hadoop Cluster

2012-05-11 Thread Lance Norskog
Zabbix does monitoring, archiving, graphing, and alerts.

It has a JMX bean monitor system. If Hadoop exposes JMX beans, or you can add
them easily, you have a great monitor. Also, check out 'Starfish'.
It's a little old, but I got it running and it was really cool.
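As a rough sketch, remote JMX for the Hadoop daemons can usually be switched
on in hadoop-env.sh along these lines (the ports are arbitrary, and auth/SSL
are disabled here only to keep the example short):

export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=8004 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=8005 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false $HADOOP_DATANODE_OPTS"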

On Thu, May 10, 2012 at 11:24 PM, Manu S manupk...@gmail.com wrote:
 Thanks a lot Junyong

 On Fri, May 11, 2012 at 11:15 AM, JunYong Li lij...@gmail.com wrote:

 Each has its own merits.
 http://developer.yahoo.com/hadoop/tutorial/module7.html#monitoring

 2012/5/11 Manu S manupk...@gmail.com

  Hi All,
 
  Which is the best monitoring tool for Hadoop cluster monitoring? Ganglia
 or
  Nagios?
 
  Thanks,
  Manu S
 



 --
 Regards
 Junyong




-- 
Lance Norskog
goks...@gmail.com


Re: Monitoring Hadoop Cluster

2012-05-11 Thread Stu Teasdale
I've helped out linking hadoop to munin using JMX querying in the past;
there's a writeup at:

http://www.cs.huji.ac.il/wikis/MediaWiki/lawa/index.php/Munin_for_Hadoop

Stu
On Fri, May 11, 2012 at 02:15:16AM -0700, Lance Norskog wrote:
 zabbix does monitoring, archiving and graphing, and alerts.
 
 It has a JMX bean monitor system. If Hadoop has these, or you can add
 them easily, you have a great monitor. Also, check out 'Starfish'.
 It's a little old, but I got it running and it was really cool.
 
 On Thu, May 10, 2012 at 11:24 PM, Manu S manupk...@gmail.com wrote:
  Thanks a lot Junyong
 
  On Fri, May 11, 2012 at 11:15 AM, JunYong Li lij...@gmail.com wrote:
 
  Each has its own merits.
  http://developer.yahoo.com/hadoop/tutorial/module7.html#monitoring
 
  2012/5/11 Manu S manupk...@gmail.com
 
   Hi All,
  
   Which is the best monitoring tool for Hadoop cluster monitoring? Ganglia
  or
   Nagios?
  
   Thanks,
   Manu S
  
 
 
 
  --
  Regards
  Junyong
 
 
 
 
 -- 
 Lance Norskog
 goks...@gmail.com

-- 
From the prompt of Stu Teasdale

Happiness is a hard disk.


Re: High load on datanode startup

2012-05-11 Thread Darrell Taylor
On Thu, May 10, 2012 at 5:58 PM, Raj Vishwanathan rajv...@yahoo.com wrote:

 Darrell

 Are the new dn,nn and mapred directories on the same physical disk?
 Nothing on NFS , correct?


Yes, that's correct



 Could you be having some hardware issue? Any clue in /var/log/messages or
 dmesg?


Hardware is good, all logs are clean.



 A non responsive system indicates a CPU that is really busy either doing
 something or waiting for something and the fact that it happens only on
 some nodes indicates a local problem.


Yes, it was a very strange problem, which I seem to have solved (for now).
Yesterday I upgraded the cluster to cdh4 and found that some of the nodes
started to display similar behaviour, but I was able to catch them early
enough to do something about it. The solution was to remove the
hadoop-env.sh that I had copied over from the cdh3 install; the only thing
I had added to that file, to get pig/hbase talking, was the following:

export HADOOP_CLASSPATH=`/usr/bin/hbase classpath`:$HADOOP_CLASSPATH

What I saw on the machine was thousands of recursive processes in ps of the
form 'bash /usr/bin/hbase classpath...'. Stopping everything didn't clean
the processes up, so I had to kill them manually with some grep/xargs foo.
Once this was all cleaned up and the hadoop-env.sh file removed, the nodes
seem to be happy again.
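The grep/xargs foo was roughly the following (a sketch only - check what the
pattern actually matches before killing anything):

ps aux | grep '[h]base classpath' | wc -l                     # how many are left
pgrep -f 'bash /usr/bin/hbase classpath' | xargs -r kill -9   # kill them off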

Darrell.



 Raj



 
  From: Darrell Taylor darrell.tay...@gmail.com
 To: common-user@hadoop.apache.org
 Cc: Raj Vishwanathan rajv...@yahoo.com
 Sent: Thursday, May 10, 2012 3:57 AM
 Subject: Re: High load on datanode startup
 
 On Thu, May 10, 2012 at 9:33 AM, Todd Lipcon t...@cloudera.com wrote:
 
  That's real weird..
 
  If you can reproduce this after a reboot, I'd recommend letting the DN
  run for a minute, and then capturing a jstack pid of dn as well as
  the output of top -H -p pid of dn -b -n 5 and send it to the list.
 
 
 What I did after the reboot this morning was to move the my dn, nn, and
 mapred directories out of the the way, create a new one, formatted it, and
 restarted the node, it's now happy.
 
 I'll try moving the directories back later and do the jstack as you
 suggest.
 
 
 
  What JVM/JDK are you using? What OS version?
 
 
 root@pl446:/# dpkg --get-selections | grep java
 java-common install
 libjaxp1.3-java install
 libjaxp1.3-java-gcj install
 libmysql-java   install
 libxerces2-java install
 libxerces2-java-gcj install
 sun-java6-bin   install
 sun-java6-javadbinstall
 sun-java6-jdk   install
 sun-java6-jre   install
 
 root@pl446:/# java -version
 java version 1.6.0_26
 Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
 
 root@pl446:/# cat /etc/issue
 Debian GNU/Linux 6.0 \n \l
 
 
 
 
  -Todd
 
 
  On Wed, May 9, 2012 at 11:57 PM, Darrell Taylor
  darrell.tay...@gmail.com wrote:
   On Wed, May 9, 2012 at 10:52 PM, Raj Vishwanathan rajv...@yahoo.com
  wrote:
  
   The picture either too small or too pixelated for my eyes :-)
  
  
   There should be a zoom option in the top right of the page that allows
  you
   to view it full size
  
  
  
   Can you login to the box and send the output of top? If the system is
   unresponsive, it has to be something more than an unbalanced hdfs
  cluster,
   methinks.
  
  
   Sorry, I'm unable to login to the box, it's completely unresponsive.
  
  
  
   Raj
  
  
  
   
From: Darrell Taylor darrell.tay...@gmail.com
   To: common-user@hadoop.apache.org; Raj Vishwanathan 
 rajv...@yahoo.com
  
   Sent: Wednesday, May 9, 2012 2:40 PM
   Subject: Re: High load on datanode startup
   
   On Wed, May 9, 2012 at 10:23 PM, Raj Vishwanathan 
 rajv...@yahoo.com
   wrote:
   
When you say 'load', what do you mean? CPU load or something else?
   
   
   I mean in the unix sense of load average, i.e. top would show a
 load of
   (currently) 376.
   
   Looking at Ganglia stats for the box it's not CPU load as such, the
  graphs
   shows actual CPU usage as 30%, but the number of running processes
 is
   simply growing in a linear manner - screen shot of ganglia page
 here :
   
   
  
 
 https://picasaweb.google.com/lh/photo/Q0uFSzyLiriDuDnvyRUikXVR0iWwMibMfH0upnTwi28?feat=directlink
   
   
   
   
Raj
   
   
   

 From: Darrell Taylor darrell.tay...@gmail.com
To: common-user@hadoop.apache.org
Sent: Wednesday, May 9, 2012 9:52 AM
Subject: High load on datanode startup

Hi,

I wonder if someone could give some pointers with a problem I'm
  having?

I have a 7 machine cluster setup for 

Re: High load on datanode startup

2012-05-11 Thread Todd Lipcon
On Fri, May 11, 2012 at 2:29 AM, Darrell Taylor
darrell.tay...@gmail.com wrote:

 What I saw on the machine was thousands of recursive processes in ps of the
 form 'bash /usr/bin/hbase classpath...',  Stopping everything didn't clean
 the processes up so had to kill them manually with some grep/xargs foo.
  Once this was all cleaned up and the hadoop-env.sh file removed the nodes
 seem to be happy again.

Ah -- maybe the issue is that... my guess is that hbase classpath is
now trying to include the Hadoop dependencies using hadoop
classpath. But hadoop classpath was recursing right back because of
that setting in hadoop-env. Basically you made a fork bomb - that
explains the shape of the graph in Ganglia perfectly.
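If that guess is right, the loop looks roughly like this (a sketch of the
call chain, not the actual script contents):

# hadoop-env.sh: export HADOOP_CLASSPATH=`/usr/bin/hbase classpath`:$HADOOP_CLASSPATH
#   -> every `hadoop ...` invocation runs `hbase classpath`
#   -> `hbase classpath` shells out to `hadoop classpath` for the Hadoop jars
#   -> that `hadoop` invocation sources hadoop-env.sh again
#   -> which runs `hbase classpath` again, each level forking another bash,
#      until the box runs out of processes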

-Todd


 Darrell.



 Raj



 
  From: Darrell Taylor darrell.tay...@gmail.com
 To: common-user@hadoop.apache.org
 Cc: Raj Vishwanathan rajv...@yahoo.com
 Sent: Thursday, May 10, 2012 3:57 AM
 Subject: Re: High load on datanode startup
 
 On Thu, May 10, 2012 at 9:33 AM, Todd Lipcon t...@cloudera.com wrote:
 
  That's real weird..
 
  If you can reproduce this after a reboot, I'd recommend letting the DN
  run for a minute, and then capturing a jstack pid of dn as well as
  the output of top -H -p pid of dn -b -n 5 and send it to the list.
 
 
 What I did after the reboot this morning was to move the my dn, nn, and
 mapred directories out of the the way, create a new one, formatted it, and
 restarted the node, it's now happy.
 
 I'll try moving the directories back later and do the jstack as you
 suggest.
 
 
 
  What JVM/JDK are you using? What OS version?
 
 
 root@pl446:/# dpkg --get-selections | grep java
 java-common                                     install
 libjaxp1.3-java                                 install
 libjaxp1.3-java-gcj                             install
 libmysql-java                                   install
 libxerces2-java                                 install
 libxerces2-java-gcj                             install
 sun-java6-bin                                   install
 sun-java6-javadb                                install
 sun-java6-jdk                                   install
 sun-java6-jre                                   install
 
 root@pl446:/# java -version
 java version 1.6.0_26
 Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
 
 root@pl446:/# cat /etc/issue
 Debian GNU/Linux 6.0 \n \l
 
 
 
 
  -Todd
 
 
  On Wed, May 9, 2012 at 11:57 PM, Darrell Taylor
  darrell.tay...@gmail.com wrote:
   On Wed, May 9, 2012 at 10:52 PM, Raj Vishwanathan rajv...@yahoo.com
  wrote:
  
   The picture either too small or too pixelated for my eyes :-)
  
  
   There should be a zoom option in the top right of the page that allows
  you
   to view it full size
  
  
  
   Can you login to the box and send the output of top? If the system is
   unresponsive, it has to be something more than an unbalanced hdfs
  cluster,
   methinks.
  
  
   Sorry, I'm unable to login to the box, it's completely unresponsive.
  
  
  
   Raj
  
  
  
   
From: Darrell Taylor darrell.tay...@gmail.com
   To: common-user@hadoop.apache.org; Raj Vishwanathan 
 rajv...@yahoo.com
  
   Sent: Wednesday, May 9, 2012 2:40 PM
   Subject: Re: High load on datanode startup
   
   On Wed, May 9, 2012 at 10:23 PM, Raj Vishwanathan 
 rajv...@yahoo.com
   wrote:
   
When you say 'load', what do you mean? CPU load or something else?
   
   
   I mean in the unix sense of load average, i.e. top would show a
 load of
   (currently) 376.
   
   Looking at Ganglia stats for the box it's not CPU load as such, the
  graphs
   shows actual CPU usage as 30%, but the number of running processes
 is
   simply growing in a linear manner - screen shot of ganglia page
 here :
   
   
  
 
 https://picasaweb.google.com/lh/photo/Q0uFSzyLiriDuDnvyRUikXVR0iWwMibMfH0upnTwi28?feat=directlink
   
   
   
   
Raj
   
   
   

 From: Darrell Taylor darrell.tay...@gmail.com
To: common-user@hadoop.apache.org
Sent: Wednesday, May 9, 2012 9:52 AM
Subject: High load on datanode startup

Hi,

I wonder if someone could give some pointers with a problem I'm
  having?

I have a 7 machine cluster setup for testing and we have been
  pouring
   data
into it for a week without issue, have learnt several thing along
  the
   way
and solved all the problems up to now by searching online, but
 now
  I'm
stuck.  One of the data nodes decided to have a load of 70+ this
   morning,
stopping datanode and tasktracker brought it back to normal, but
  every
time
I start the datanode again the load shoots through the roof, and
  all I
   get
in the logs is :

STARTUP_MSG: Starting DataNode


STARTUP_MSG:   host = pl464/10.20.16.64


STARTUP_MSG:   args = []



Re: High load on datanode startup

2012-05-11 Thread Harsh J
Doesn't look like the $HBASE_HOME/bin/hbase script runs
$HADOOP_HOME/bin/hadoop classpath directly. Its classpath builder
seems to add $HADOOP_HOME items manually via listing, etc. Perhaps if
hbase-env.sh has a HBASE_CLASSPATH that imports `hadoop classpath`,
and hadoop-env.sh has an `hbase classpath`, this issue could happen.

I do know that `hbase classpath` may take a very long time and/or hang on
network calls if there's a target/build directory inside of
$HBASE_HOME, which causes it to use Maven to generate a classpath
instead of using a cached file/local gen. Generally doing mvn clean
clears that up for me, whenever it happens on my installs.
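If the HBase jars really are needed on the Hadoop side, one way to keep the
two scripts from re-entering each other is to guard the import with an
exported marker variable - just an idea/sketch, not something the stock
scripts do:

# hadoop-env.sh (sketch): expand `hbase classpath` at most once per process tree
if [ -z "$HBASE_CP_ADDED" ]; then
  export HBASE_CP_ADDED=1
  export HADOOP_CLASSPATH="$(/usr/bin/hbase classpath):$HADOOP_CLASSPATH"
fi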

On Fri, May 11, 2012 at 3:02 PM, Todd Lipcon t...@cloudera.com wrote:
 On Fri, May 11, 2012 at 2:29 AM, Darrell Taylor
 darrell.tay...@gmail.com wrote:

 What I saw on the machine was thousands of recursive processes in ps of the
 form 'bash /usr/bin/hbase classpath...',  Stopping everything didn't clean
 the processes up so had to kill them manually with some grep/xargs foo.
  Once this was all cleaned up and the hadoop-env.sh file removed the nodes
 seem to be happy again.

 Ah -- maybe the issue is that... my guess is that hbase classpath is
 now trying to include the Hadoop dependencies using hadoop
 classpath. But hadoop classpath was recursing right back because of
 that setting in hadoop-env. Basically you made a fork bomb - that
 explains the shape of the graph in Ganglia perfectly.

 -Todd


 Darrell.



 Raj



 
  From: Darrell Taylor darrell.tay...@gmail.com
 To: common-user@hadoop.apache.org
 Cc: Raj Vishwanathan rajv...@yahoo.com
 Sent: Thursday, May 10, 2012 3:57 AM
 Subject: Re: High load on datanode startup
 
 On Thu, May 10, 2012 at 9:33 AM, Todd Lipcon t...@cloudera.com wrote:
 
  That's real weird..
 
  If you can reproduce this after a reboot, I'd recommend letting the DN
  run for a minute, and then capturing a jstack pid of dn as well as
  the output of top -H -p pid of dn -b -n 5 and send it to the list.
 
 
 What I did after the reboot this morning was to move the my dn, nn, and
 mapred directories out of the the way, create a new one, formatted it, and
 restarted the node, it's now happy.
 
 I'll try moving the directories back later and do the jstack as you
 suggest.
 
 
 
  What JVM/JDK are you using? What OS version?
 
 
 root@pl446:/# dpkg --get-selections | grep java
 java-common                                     install
 libjaxp1.3-java                                 install
 libjaxp1.3-java-gcj                             install
 libmysql-java                                   install
 libxerces2-java                                 install
 libxerces2-java-gcj                             install
 sun-java6-bin                                   install
 sun-java6-javadb                                install
 sun-java6-jdk                                   install
 sun-java6-jre                                   install
 
 root@pl446:/# java -version
 java version 1.6.0_26
 Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
 
 root@pl446:/# cat /etc/issue
 Debian GNU/Linux 6.0 \n \l
 
 
 
 
  -Todd
 
 
  On Wed, May 9, 2012 at 11:57 PM, Darrell Taylor
  darrell.tay...@gmail.com wrote:
   On Wed, May 9, 2012 at 10:52 PM, Raj Vishwanathan rajv...@yahoo.com
  wrote:
  
   The picture either too small or too pixelated for my eyes :-)
  
  
   There should be a zoom option in the top right of the page that allows
  you
   to view it full size
  
  
  
   Can you login to the box and send the output of top? If the system is
   unresponsive, it has to be something more than an unbalanced hdfs
  cluster,
   methinks.
  
  
   Sorry, I'm unable to login to the box, it's completely unresponsive.
  
  
  
   Raj
  
  
  
   
From: Darrell Taylor darrell.tay...@gmail.com
   To: common-user@hadoop.apache.org; Raj Vishwanathan 
 rajv...@yahoo.com
  
   Sent: Wednesday, May 9, 2012 2:40 PM
   Subject: Re: High load on datanode startup
   
   On Wed, May 9, 2012 at 10:23 PM, Raj Vishwanathan 
 rajv...@yahoo.com
   wrote:
   
When you say 'load', what do you mean? CPU load or something else?
   
   
   I mean in the unix sense of load average, i.e. top would show a
 load of
   (currently) 376.
   
   Looking at Ganglia stats for the box it's not CPU load as such, the
  graphs
   shows actual CPU usage as 30%, but the number of running processes
 is
   simply growing in a linear manner - screen shot of ganglia page
 here :
   
   
  
 
 https://picasaweb.google.com/lh/photo/Q0uFSzyLiriDuDnvyRUikXVR0iWwMibMfH0upnTwi28?feat=directlink
   
   
   
   
Raj
   
   
   

 From: Darrell Taylor darrell.tay...@gmail.com
To: common-user@hadoop.apache.org
Sent: Wednesday, May 9, 2012 9:52 AM
Subject: High load on datanode startup

Hi,

I wonder if 

Re: freeze a mapreduce job

2012-05-11 Thread Harsh J
I do not know about the per-host slot control (that is most likely not
supported, or not yet anyway - and perhaps feels wrong to do), but the
rest of what you need can be done if you use schedulers and
queues/pools.

If you use FairScheduler (FS), ensure that this job always goes to a
special pool and when you want to freeze the pool simply set the
pool's maxMaps and maxReduces to 0. Likewise, control max simultaneous
tasks as you wish, to constrict instead of freeze. When you make
changes to the FairScheduler configs, you do not need to restart the
JT, and you may simply wait a few seconds for FairScheduler to refresh
its own configs.

More on FS at http://hadoop.apache.org/common/docs/current/fair_scheduler.html
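As a sketch, a pool entry in the allocation file (usually
conf/fair-scheduler.xml, whatever mapred.fairscheduler.allocation.file points
to; the pool name here is made up) that freezes such a job would look
something like:

<?xml version="1.0"?>
<allocations>
  <pool name="slowjob">
    <maxMaps>0</maxMaps>
    <maxReduces>0</maxReduces>
  </pool>
</allocations>

Point the job at that pool (via mapred.fairscheduler.pool if your FS version
has it, or whatever your poolnameproperty maps to) and FS picks the change up
on its own after a few seconds.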

If you use CapacityScheduler (CS), then I believe you can do this by
again making sure the job goes to a specific queue, and when needed to
freeze it, simply set the queue's maximum-capacity to 0 (percentage)
or to constrict it, choose a lower, positive percentage value as you
need. You can also refresh CS to pick up config changes by refreshing
queues via mradmin.

More on CS at 
http://hadoop.apache.org/common/docs/current/capacity_scheduler.html
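Again as a sketch (the queue name is made up; the property names follow the
mapred.capacity-scheduler.queue.<name>.* pattern, and the exact mradmin flag
may differ across versions):

<property>
  <name>mapred.capacity-scheduler.queue.slowqueue.maximum-capacity</name>
  <value>0</value>  <!-- percent of the cluster; 0 freezes, a small value constricts -->
</property>

hadoop mradmin -refreshQueues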

Either approach will not freeze/constrict the job immediately, but
should certainly prevent it from progressing. Meaning, their existing
running tasks during the time of changes made to scheduler config will
continue to run till completion but further tasks scheduling from
those jobs shall begin seeing effect of the changes made.

P.s. A better solution would be to make your job not take as many
days, somehow? :-)

On Fri, May 11, 2012 at 4:13 PM, Rita rmorgan...@gmail.com wrote:
 I have a rather large map reduce job which takes few days. I was wondering
 if its possible for me to freeze the job or make the job less intensive. Is
 it possible to reduce the number of slots per host and then I can increase
 them overnight?


 tia

 --
 --- Get your facts first, then you can distort them as you please.--



-- 
Harsh J


Re: freeze a mapreduce job

2012-05-11 Thread Michael Segel
Just a quick note...

If your task is currently occupying a slot,  the only way to release the slot 
is to kill the specific task.
If you are using FS, you can move the task to another queue and/or you can 
lower the job's priority which will cause new tasks to spawn  slower than other 
jobs so you will eventually free up the cluster. 
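The relevant CLI bits (the job and attempt ids below are placeholders) are
roughly:

hadoop job -list                                             # find the job id
hadoop job -set-priority job_201205110001_0001 VERY_LOW      # new tasks schedule slower
hadoop job -kill-task attempt_201205110001_0001_m_000007_0   # frees that slot; the attempt is rescheduled later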

There isn't a way to 'freeze' or stop a job mid state. 

Is the issue that the job has a large number of slots, or is it an issue of the 
individual tasks taking a  long time to complete? 

If its the latter, you will probably want to go to a capacity scheduler over 
the fair scheduler. 

HTH

-Mike

On May 11, 2012, at 6:08 AM, Harsh J wrote:

 I do not know about the per-host slot control (that is most likely not
 supported, or not yet anyway - and perhaps feels wrong to do), but the
 rest of the needs can be doable if you use schedulers and
 queues/pools.
 
 If you use FairScheduler (FS), ensure that this job always goes to a
 special pool and when you want to freeze the pool simply set the
 pool's maxMaps and maxReduces to 0. Likewise, control max simultaneous
 tasks as you wish, to constrict instead of freeze. When you make
 changes to the FairScheduler configs, you do not need to restart the
 JT, and you may simply wait a few seconds for FairScheduler to refresh
 its own configs.
 
 More on FS at http://hadoop.apache.org/common/docs/current/fair_scheduler.html
 
 If you use CapacityScheduler (CS), then I believe you can do this by
 again making sure the job goes to a specific queue, and when needed to
 freeze it, simply set the queue's maximum-capacity to 0 (percentage)
 or to constrict it, choose a lower, positive percentage value as you
 need. You can also refresh CS to pick up config changes by refreshing
 queues via mradmin.
 
 More on CS at 
 http://hadoop.apache.org/common/docs/current/capacity_scheduler.html
 
 Either approach will not freeze/constrict the job immediately, but
 should certainly prevent it from progressing. Meaning, their existing
 running tasks during the time of changes made to scheduler config will
 continue to run till completion but further tasks scheduling from
 those jobs shall begin seeing effect of the changes made.
 
 P.s. A better solution would be to make your job not take as many
 days, somehow? :-)
 
 On Fri, May 11, 2012 at 4:13 PM, Rita rmorgan...@gmail.com wrote:
 I have a rather large map reduce job which takes few days. I was wondering
 if its possible for me to freeze the job or make the job less intensive. Is
 it possible to reduce the number of slots per host and then I can increase
 them overnight?
 
 
 tia
 
 --
 --- Get your facts first, then you can distort them as you please.--
 
 
 
 -- 
 Harsh J
 



Re: freeze a mapreduce job

2012-05-11 Thread Rita
thanks.  I think I will investigate capacity scheduler.


On Fri, May 11, 2012 at 7:26 AM, Michael Segel michael_se...@hotmail.comwrote:

 Just a quick note...

 If your task is currently occupying a slot,  the only way to release the
 slot is to kill the specific task.
 If you are using FS, you can move the task to another queue and/or you can
 lower the job's priority which will cause new tasks to spawn  slower than
 other jobs so you will eventually free up the cluster.

 There isn't a way to 'freeze' or stop a job mid state.

 Is the issue that the job has a large number of slots, or is it an issue
 of the individual tasks taking a  long time to complete?

 If its the latter, you will probably want to go to a capacity scheduler
 over the fair scheduler.

 HTH

 -Mike

 On May 11, 2012, at 6:08 AM, Harsh J wrote:

  I do not know about the per-host slot control (that is most likely not
  supported, or not yet anyway - and perhaps feels wrong to do), but the
  rest of the needs can be doable if you use schedulers and
  queues/pools.
 
  If you use FairScheduler (FS), ensure that this job always goes to a
  special pool and when you want to freeze the pool simply set the
  pool's maxMaps and maxReduces to 0. Likewise, control max simultaneous
  tasks as you wish, to constrict instead of freeze. When you make
  changes to the FairScheduler configs, you do not need to restart the
  JT, and you may simply wait a few seconds for FairScheduler to refresh
  its own configs.
 
  More on FS at
 http://hadoop.apache.org/common/docs/current/fair_scheduler.html
 
  If you use CapacityScheduler (CS), then I believe you can do this by
  again making sure the job goes to a specific queue, and when needed to
  freeze it, simply set the queue's maximum-capacity to 0 (percentage)
  or to constrict it, choose a lower, positive percentage value as you
  need. You can also refresh CS to pick up config changes by refreshing
  queues via mradmin.
 
  More on CS at
 http://hadoop.apache.org/common/docs/current/capacity_scheduler.html
 
  Either approach will not freeze/constrict the job immediately, but
  should certainly prevent it from progressing. Meaning, their existing
  running tasks during the time of changes made to scheduler config will
  continue to run till completion but further tasks scheduling from
  those jobs shall begin seeing effect of the changes made.
 
  P.s. A better solution would be to make your job not take as many
  days, somehow? :-)
 
  On Fri, May 11, 2012 at 4:13 PM, Rita rmorgan...@gmail.com wrote:
  I have a rather large map reduce job which takes few days. I was
 wondering
  if its possible for me to freeze the job or make the job less
 intensive. Is
  it possible to reduce the number of slots per host and then I can
 increase
  them overnight?
 
 
  tia
 
  --
  --- Get your facts first, then you can distort them as you please.--
 
 
 
  --
  Harsh J
 




-- 
--- Get your facts first, then you can distort them as you please.--


How to maintain record boundaries

2012-05-11 Thread Shreya.Pal
Hi

When we store data into HDFS, it gets broken into small pieces and distributed 
across the cluster based on Block size for the file.
While processing the data using MR program I want a particular record as a 
whole without it being split across nodes, but the data has already been split 
and stored in HDFS when I loaded the data.
How would I make sure that my record doesn't get split, how would my Input 
format make a difference now ?

Regards
Shreya

This e-mail and any files transmitted with it are for the sole use of the 
intended recipient(s) and may contain confidential and privileged information. 
If you are not the intended recipient(s), please reply to the sender and 
destroy all copies of the original message. Any unauthorized review, use, 
disclosure, dissemination, forwarding, printing or copying of this email, 
and/or any action taken in reliance on the contents of this e-mail is strictly 
prohibited and may be unlawful.


Re: How to maintain record boundaries

2012-05-11 Thread Harsh J
Shreya,

This has been asked several times before, and the way it is handled by
TextInputFormats (for one example) is explained at
http://wiki.apache.org/hadoop/HadoopMapReduce in the Map section. If
you are writing a custom reader, feel free to follow the same steps -
you basically need to seek over to next blocks for an end-record
marker and not limit yourself to just one-block reads.

All input formats provided in MR handle this already for you, and you
needn't worry about this unless you're implementing a whole new reader
from scratch.

On Fri, May 11, 2012 at 5:45 PM,  shreya@cognizant.com wrote:
 Hi

 When we store data into HDFS, it gets broken into small pieces and 
 distributed across the cluster based on Block size for the file.
 While processing the data using MR program I want a particular record as a 
 whole without it being split across nodes, but the data has already been 
 split and stored in HDFS when I loaded the data.
 How would I make sure that my record doesn't get split, how would my Input 
 format make a difference now ?

 Regards
 Shreya

 This e-mail and any files transmitted with it are for the sole use of the 
 intended recipient(s) and may contain confidential and privileged 
 information. If you are not the intended recipient(s), please reply to the 
 sender and destroy all copies of the original message. Any unauthorized 
 review, use, disclosure, dissemination, forwarding, printing or copying of 
 this email, and/or any action taken in reliance on the contents of this 
 e-mail is strictly prohibited and may be unlawful.



-- 
Harsh J


Re: java.io.IOException: Task process exit with nonzero status of 1

2012-05-11 Thread samir das mohapatra
Hi Mohit,

 1) Hadoop is more portable on Linux, Ubuntu, or any non-DOS file system,
but you are running Hadoop on Windows, which could be the problem, because
Hadoop generates some partial output files for temporary use.
 2) Another thing is that you are running Hadoop version 0.19; I think
upgrading the version will solve your problem, because the example you are
using has some problems with file reads and writes on Windows.

3) Check your input file data, because I can see your map progress is also 0%.
4) If everything else in the scenario looks right, please share your
logs under hadoopversion/logs;
there itself we can trace it very clearly.
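For example, the per-attempt logs usually sit under the log directory like
this (the attempt id is only a placeholder - use the failed one shown on your
job page):

ls $HADOOP_HOME/logs/userlogs/attempt_201205110001_0002_m_000006_0/
cat $HADOOP_HOME/logs/userlogs/attempt_201205110001_0002_m_000006_0/stderr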

Thanks
 SAMIR






On Fri, May 11, 2012 at 12:26 PM, Mohit Kundra mohit@gmail.com wrote:

 Hi ,

 I am new user to hadoop . I have installed hadoop0.19.1 on single windows
 machine.
 Its http://localhost:50030/jobtracker.jsp and
 http://localhost:50070/dfshealth.jsp pages are working fine but when i am
 executing  bin/hadoop jar hadoop-0.19.1-examples.jar pi 5 100
 It is showing below

 $ bin/hadoop jar hadoop-0.19.1-examples.jar pi 5 100
 cygpath: cannot create short name of D:\hadoop-0.19.1\logs
 Number of Maps = 5 Samples per Map = 100
 Wrote input for Map #0
 Wrote input for Map #1
 Wrote input for Map #2
 Wrote input for Map #3
 Wrote input for Map #4
 Starting Job
 12/05/11 12:07:26 INFO mapred.JobClient:
 Running job: job_20120513_0002
 12/05/11 12:07:27 INFO mapred.JobClient:  map 0% reduce 0%
 12/05/11 12:07:35 INFO mapred.JobClient: Task Id :
 attempt_20120513_0002_m_06_ 0, Status : FAILED
 java.io.IOException: Task process exit with nonzero status of 1.
 at org.apache.hadoop.mapred.TaskRunner.run (TaskRunner.java:425)



 Please tell me what is the root cause

 regards ,
 Mohit




transferring between HDFS which reside in different subnet

2012-05-11 Thread Arindam Choudhury
Hi,

I have a question to the hadoop experts:

I have two HDFS, in different subnet.

HDFS1 : 192.168.*.*
HDFS2:  10.10.*.*

the namenode of HDFS2 has two NIC. One connected to 192.168.*.* and another
to 10.10.*.*.

So, is it possible to transfer data from HDFS1 to HDFS2 and vice versa.

Regards,
Arindam


Re: transferring between HDFS which reside in different subnet

2012-05-11 Thread Shi Yu
If you could cross-access HDFS from both name nodes, then it should be
transferable using the distcp command.
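Something like this, run from a node that can reach both namenodes (the
hostnames and ports below are placeholders):

hadoop distcp hdfs://nn-hdfs1:8020/data/src hdfs://nn-hdfs2:8020/data/dst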


Shi
On 5/11/2012 8:45 AM, Arindam Choudhury wrote:

Hi,

I have a question to the hadoop experts:

I have two HDFS, in different subnet.

HDFS1 : 192.168.*.*
HDFS2:  10.10.*.*

the namenode of HDFS2 has two NIC. One connected to 192.168.*.* and another
to 10.10.*.*.

So, is it possible to transfer data from HDFS1 to HDFS2 and vice versa.

Regards,
Arindam





Re: transferring between HDFS which reside in different subnet

2012-05-11 Thread Arindam Choudhury
I can not cross access HDFS. Though HDFS2 has two NIC the HDFS is running
on the other subnet.

On Fri, May 11, 2012 at 3:57 PM, Shi Yu sh...@uchicago.edu wrote:

 If you could cross-access HDFS from both name nodes, then it should be
 transferable using /distcp /command.

 Shi *
 *

 On 5/11/2012 8:45 AM, Arindam Choudhury wrote:

 Hi,

 I have a question to the hadoop experts:

 I have two HDFS, in different subnet.

 HDFS1 : 192.168.*.*
 HDFS2:  10.10.*.*

 the namenode of HDFS2 has two NIC. One connected to 192.168.*.* and
 another
 to 10.10.*.*.

 So, is it possible to transfer data from HDFS1 to HDFS2 and vice versa.

 Regards,
 Arindam





Re: freeze a mapreduce job

2012-05-11 Thread Shi Yu
Is there any risk in suppressing a job for too long in FS? I guess there are
some parameters to control the waiting time of a job (such as a timeout,
etc.). For example, if a job is kept idle for more than 24 hours, is there a
configuration that decides whether to kill or keep that job?


Shi

On 5/11/2012 6:52 AM, Rita wrote:

thanks.  I think I will investigate capacity scheduler.


On Fri, May 11, 2012 at 7:26 AM, Michael Segelmichael_se...@hotmail.comwrote:


Just a quick note...

If your task is currently occupying a slot,  the only way to release the
slot is to kill the specific task.
If you are using FS, you can move the task to another queue and/or you can
lower the job's priority which will cause new tasks to spawn  slower than
other jobs so you will eventually free up the cluster.

There isn't a way to 'freeze' or stop a job mid state.

Is the issue that the job has a large number of slots, or is it an issue
of the individual tasks taking a  long time to complete?

If its the latter, you will probably want to go to a capacity scheduler
over the fair scheduler.

HTH

-Mike

On May 11, 2012, at 6:08 AM, Harsh J wrote:


I do not know about the per-host slot control (that is most likely not
supported, or not yet anyway - and perhaps feels wrong to do), but the
rest of the needs can be doable if you use schedulers and
queues/pools.

If you use FairScheduler (FS), ensure that this job always goes to a
special pool and when you want to freeze the pool simply set the
pool's maxMaps and maxReduces to 0. Likewise, control max simultaneous
tasks as you wish, to constrict instead of freeze. When you make
changes to the FairScheduler configs, you do not need to restart the
JT, and you may simply wait a few seconds for FairScheduler to refresh
its own configs.

More on FS at

http://hadoop.apache.org/common/docs/current/fair_scheduler.html

If you use CapacityScheduler (CS), then I believe you can do this by
again making sure the job goes to a specific queue, and when needed to
freeze it, simply set the queue's maximum-capacity to 0 (percentage)
or to constrict it, choose a lower, positive percentage value as you
need. You can also refresh CS to pick up config changes by refreshing
queues via mradmin.

More on CS at

http://hadoop.apache.org/common/docs/current/capacity_scheduler.html

Either approach will not freeze/constrict the job immediately, but
should certainly prevent it from progressing. Meaning, their existing
running tasks during the time of changes made to scheduler config will
continue to run till completion but further tasks scheduling from
those jobs shall begin seeing effect of the changes made.

P.s. A better solution would be to make your job not take as many
days, somehow? :-)

On Fri, May 11, 2012 at 4:13 PM, Ritarmorgan...@gmail.com  wrote:

I have a rather large map reduce job which takes few days. I was

wondering

if its possible for me to freeze the job or make the job less

intensive. Is

it possible to reduce the number of slots per host and then I can

increase

them overnight?


tia

--
--- Get your facts first, then you can distort them as you please.--



--
Harsh J









Re: How to maintain record boundaries

2012-05-11 Thread Shi Yu
Here is some quick code for you (based on Tom's book). You can
override the TextInputFormat isSplitable method to avoid splitting,
which is pretty important and useful when processing sequence data.


//Old API (org.apache.hadoop.mapred)

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.TextInputFormat;

public class NonSplittableTextInputFormat extends TextInputFormat {

    // Returning false keeps each input file in a single split, so one map
    // task reads the whole file and no record can straddle a split boundary.
    @Override
    protected boolean isSplitable(FileSystem fs, Path file) {
        return false;
    }
}

//New API (org.apache.hadoop.mapreduce)

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class NonSplittableTextInputFormatNewAPI extends TextInputFormat {

    // Same idea for the new API; set it with job.setInputFormatClass(...).
    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        return false;
    }
}


On 5/11/2012 7:19 AM, Harsh J wrote:

Shreya,

This has been asked several times before, and the way it is handled by
TextInputFormats (for one example) is explained at
http://wiki.apache.org/hadoop/HadoopMapReduce in the Map section. If
you are writing a custom reader, feel free to follow the same steps -
you basically need to seek over to next blocks for an end-record
marker and not limit yourself to just one-block reads.

All input formats provided in MR handle this already for you, and you
needn't worry about this unless you're implementing a whole new reader
from scratch.

On Fri, May 11, 2012 at 5:45 PM,shreya@cognizant.com  wrote:

Hi

When we store data into HDFS, it gets broken into small pieces and distributed 
across the cluster based on Block size for the file.
While processing the data using MR program I want a particular record as a 
whole without it being split across nodes, but the data has already been split 
and stored in HDFS when I loaded the data.
How would I make sure that my record doesn't get split, how would my Input 
format make a difference now ?

Regards
Shreya

This e-mail and any files transmitted with it are for the sole use of the 
intended recipient(s) and may contain confidential and privileged information. 
If you are not the intended recipient(s), please reply to the sender and 
destroy all copies of the original message. Any unauthorized review, use, 
disclosure, dissemination, forwarding, printing or copying of this email, 
and/or any action taken in reliance on the contents of this e-mail is strictly 
prohibited and may be unlawful.







Re: transferring between HDFS which reside in different subnet

2012-05-11 Thread Shi Yu
It seems that in your case HDFS2 can access HDFS1, so you should be able to
transfer HDFS1 data to HDFS2.

If you want to cross-transfer, you don't need to run distcp on the cluster
nodes; if any client node (it need not be a namenode, datanode,
secondary namenode, etc.) can access both HDFSs, then run the transfer
command on that client node.




On 5/11/2012 9:03 AM, Arindam Choudhury wrote:

I can not cross access HDFS. Though HDFS2 has two NIC the HDFS is running
on the other subnet.

On Fri, May 11, 2012 at 3:57 PM, Shi Yush...@uchicago.edu  wrote:


If you could cross-access HDFS from both name nodes, then it should be
transferable using /distcp /command.

Shi *
*

On 5/11/2012 8:45 AM, Arindam Choudhury wrote:


Hi,

I have a question to the hadoop experts:

I have two HDFS, in different subnet.

HDFS1 : 192.168.*.*
HDFS2:  10.10.*.*

the namenode of HDFS2 has two NIC. One connected to 192.168.*.* and
another
to 10.10.*.*.

So, is it possible to transfer data from HDFS1 to HDFS2 and vice versa.

Regards,
Arindam






Re: transferring between HDFS which reside in different subnet

2012-05-11 Thread Rajesh Sai T
Looks like both are private subnets, so you have to route via a public
default gateway. Try adding a route using the route command if you're on
Linux (for Windows I have no idea). Just a thought; I haven't tried it though.
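For example (the gateway address below is a placeholder - use whatever router
actually links the two subnets):

ip route add 10.10.0.0/16 via 192.168.0.1
# or with the older tool:
route add -net 10.10.0.0 netmask 255.255.0.0 gw 192.168.0.1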

Thanks,
Rajesh

Typed from mobile, please bear with typos.
On May 11, 2012 10:03 AM, Arindam Choudhury arindamchoudhu...@gmail.com
wrote:

 I can not cross access HDFS. Though HDFS2 has two NIC the HDFS is running
 on the other subnet.

 On Fri, May 11, 2012 at 3:57 PM, Shi Yu sh...@uchicago.edu wrote:

  If you could cross-access HDFS from both name nodes, then it should be
  transferable using /distcp /command.
 
  Shi *
  *
 
  On 5/11/2012 8:45 AM, Arindam Choudhury wrote:
 
  Hi,
 
  I have a question to the hadoop experts:
 
  I have two HDFS, in different subnet.
 
  HDFS1 : 192.168.*.*
  HDFS2:  10.10.*.*
 
  the namenode of HDFS2 has two NIC. One connected to 192.168.*.* and
  another
  to 10.10.*.*.
 
  So, is it possible to transfer data from HDFS1 to HDFS2 and vice versa.
 
  Regards,
  Arindam
 
 
 



Re: transferring between HDFS which reside in different subnet

2012-05-11 Thread Arindam Choudhury
So,

hadoop dfs -cp hdfs:// hdfs://...

this will work.

On Fri, May 11, 2012 at 4:14 PM, Rajesh Sai T tsairaj...@gmail.com wrote:

 Looks like both are private subnets, so you got to route via a public
 default gateway. Try adding route using route command if your in
 linux(windows i have no idea). Just a thought i havent tried it though.

 Thanks,
 Rajesh

 Typed from mobile, please bear with typos.
 On May 11, 2012 10:03 AM, Arindam Choudhury arindamchoudhu...@gmail.com
 
 wrote:

  I can not cross access HDFS. Though HDFS2 has two NIC the HDFS is running
  on the other subnet.
 
  On Fri, May 11, 2012 at 3:57 PM, Shi Yu sh...@uchicago.edu wrote:
 
   If you could cross-access HDFS from both name nodes, then it should be
   transferable using /distcp /command.
  
   Shi *
   *
  
   On 5/11/2012 8:45 AM, Arindam Choudhury wrote:
  
   Hi,
  
   I have a question to the hadoop experts:
  
   I have two HDFS, in different subnet.
  
   HDFS1 : 192.168.*.*
   HDFS2:  10.10.*.*
  
   the namenode of HDFS2 has two NIC. One connected to 192.168.*.* and
   another
   to 10.10.*.*.
  
   So, is it possible to transfer data from HDFS1 to HDFS2 and vice
 versa.
  
   Regards,
   Arindam
  
  
  
 



Re: freeze a mapreduce job

2012-05-11 Thread Michael Segel
I haven't seen any.

Haven't really had to test that...

On May 11, 2012, at 9:03 AM, Shi Yu wrote:

 Is there any risk to suppress a job too long in FS?I guess there are some 
 parameters to control the waiting time of a job (such as timeout ,etc.),   
 for example, if a job is kept idle for more than 24 hours is there a 
 configuration deciding kill/keep that job?
 
 Shi
 
 On 5/11/2012 6:52 AM, Rita wrote:
 thanks.  I think I will investigate capacity scheduler.
 
 
 On Fri, May 11, 2012 at 7:26 AM, Michael 
 Segelmichael_se...@hotmail.comwrote:
 
 Just a quick note...
 
 If your task is currently occupying a slot,  the only way to release the
 slot is to kill the specific task.
 If you are using FS, you can move the task to another queue and/or you can
 lower the job's priority which will cause new tasks to spawn  slower than
 other jobs so you will eventually free up the cluster.
 
 There isn't a way to 'freeze' or stop a job mid state.
 
 Is the issue that the job has a large number of slots, or is it an issue
 of the individual tasks taking a  long time to complete?
 
 If its the latter, you will probably want to go to a capacity scheduler
 over the fair scheduler.
 
 HTH
 
 -Mike
 
 On May 11, 2012, at 6:08 AM, Harsh J wrote:
 
 I do not know about the per-host slot control (that is most likely not
 supported, or not yet anyway - and perhaps feels wrong to do), but the
 rest of the needs can be doable if you use schedulers and
 queues/pools.
 
 If you use FairScheduler (FS), ensure that this job always goes to a
 special pool and when you want to freeze the pool simply set the
 pool's maxMaps and maxReduces to 0. Likewise, control max simultaneous
 tasks as you wish, to constrict instead of freeze. When you make
 changes to the FairScheduler configs, you do not need to restart the
 JT, and you may simply wait a few seconds for FairScheduler to refresh
 its own configs.
 
 More on FS at
 http://hadoop.apache.org/common/docs/current/fair_scheduler.html
 If you use CapacityScheduler (CS), then I believe you can do this by
 again making sure the job goes to a specific queue, and when needed to
 freeze it, simply set the queue's maximum-capacity to 0 (percentage)
 or to constrict it, choose a lower, positive percentage value as you
 need. You can also refresh CS to pick up config changes by refreshing
 queues via mradmin.
 
 More on CS at
 http://hadoop.apache.org/common/docs/current/capacity_scheduler.html
 Either approach will not freeze/constrict the job immediately, but
 should certainly prevent it from progressing. Meaning, their existing
 running tasks during the time of changes made to scheduler config will
 continue to run till completion but further tasks scheduling from
 those jobs shall begin seeing effect of the changes made.
 
 P.s. A better solution would be to make your job not take as many
 days, somehow? :-)
 
 On Fri, May 11, 2012 at 4:13 PM, Ritarmorgan...@gmail.com  wrote:
 I have a rather large map reduce job which takes few days. I was
 wondering
 if its possible for me to freeze the job or make the job less
 intensive. Is
 it possible to reduce the number of slots per host and then I can
 increase
 them overnight?
 
 
 tia
 
 --
 --- Get your facts first, then you can distort them as you please.--
 
 
 --
 Harsh J
 
 
 
 
 



Re: freeze a mapreduce job

2012-05-11 Thread Harsh J
Am not aware of a job-level timeout or idle monitor.

On Fri, May 11, 2012 at 7:33 PM, Shi Yu sh...@uchicago.edu wrote:
 Is there any risk to suppress a job too long in FS?    I guess there are
 some parameters to control the waiting time of a job (such as timeout
 ,etc.),   for example, if a job is kept idle for more than 24 hours is there
 a configuration deciding kill/keep that job?

 Shi


 On 5/11/2012 6:52 AM, Rita wrote:

 thanks.  I think I will investigate capacity scheduler.


 On Fri, May 11, 2012 at 7:26 AM, Michael
 Segelmichael_se...@hotmail.comwrote:

 Just a quick note...

 If your task is currently occupying a slot,  the only way to release the
 slot is to kill the specific task.
 If you are using FS, you can move the task to another queue and/or you
 can
 lower the job's priority which will cause new tasks to spawn  slower than
 other jobs so you will eventually free up the cluster.

 There isn't a way to 'freeze' or stop a job mid state.

 Is the issue that the job has a large number of slots, or is it an issue
 of the individual tasks taking a  long time to complete?

 If its the latter, you will probably want to go to a capacity scheduler
 over the fair scheduler.

 HTH

 -Mike

 On May 11, 2012, at 6:08 AM, Harsh J wrote:

 I do not know about the per-host slot control (that is most likely not
 supported, or not yet anyway - and perhaps feels wrong to do), but the
 rest of the needs can be doable if you use schedulers and
 queues/pools.

 If you use FairScheduler (FS), ensure that this job always goes to a
 special pool and when you want to freeze the pool simply set the
 pool's maxMaps and maxReduces to 0. Likewise, control max simultaneous
 tasks as you wish, to constrict instead of freeze. When you make
 changes to the FairScheduler configs, you do not need to restart the
 JT, and you may simply wait a few seconds for FairScheduler to refresh
 its own configs.

 More on FS at

 http://hadoop.apache.org/common/docs/current/fair_scheduler.html

 If you use CapacityScheduler (CS), then I believe you can do this by
 again making sure the job goes to a specific queue, and when needed to
 freeze it, simply set the queue's maximum-capacity to 0 (percentage)
 or to constrict it, choose a lower, positive percentage value as you
 need. You can also refresh CS to pick up config changes by refreshing
 queues via mradmin.

 More on CS at

 http://hadoop.apache.org/common/docs/current/capacity_scheduler.html

 Either approach will not freeze/constrict the job immediately, but
 should certainly prevent it from progressing. Meaning, their existing
 running tasks during the time of changes made to scheduler config will
 continue to run till completion but further tasks scheduling from
 those jobs shall begin seeing effect of the changes made.

 P.s. A better solution would be to make your job not take as many
 days, somehow? :-)

 On Fri, May 11, 2012 at 4:13 PM, Ritarmorgan...@gmail.com  wrote:

 I have a rather large map reduce job which takes few days. I was

 wondering

 if its possible for me to freeze the job or make the job less

 intensive. Is

 it possible to reduce the number of slots per host and then I can

 increase

 them overnight?


 tia

 --
 --- Get your facts first, then you can distort them as you please.--



 --
 Harsh J







-- 
Harsh J


Re: freeze a mapreduce job

2012-05-11 Thread Robert Evans
There is an idle timeout for map/reduce tasks.  If a task makes no progress for
10 min (default) the AM will kill it on 2.0 and the JT will kill it on 1.0.
But I don't know of anything associated with a Job, other than that in 0.23, if
the AM does not heartbeat back in for too long, I believe the RM may kill it
and retry, but I don't know for sure.
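On 1.0 that per-task idle timeout is mapred.task.timeout (milliseconds); the
default is the 10 minutes mentioned above, e.g. in mapred-site.xml:

<property>
  <name>mapred.task.timeout</name>
  <value>600000</value>
</property>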

--Bobby Evans

On 5/11/12 10:53 AM, Harsh J ha...@cloudera.com wrote:

Am not aware of a job-level timeout or idle monitor.

On Fri, May 11, 2012 at 7:33 PM, Shi Yu sh...@uchicago.edu wrote:
 Is there any risk to suppress a job too long in FS?I guess there are
 some parameters to control the waiting time of a job (such as timeout
 ,etc.),   for example, if a job is kept idle for more than 24 hours is there
 a configuration deciding kill/keep that job?

 Shi


 On 5/11/2012 6:52 AM, Rita wrote:

 thanks.  I think I will investigate capacity scheduler.


 On Fri, May 11, 2012 at 7:26 AM, Michael
 Segelmichael_se...@hotmail.comwrote:

 Just a quick note...

 If your task is currently occupying a slot,  the only way to release the
 slot is to kill the specific task.
 If you are using FS, you can move the task to another queue and/or you
 can
 lower the job's priority which will cause new tasks to spawn  slower than
 other jobs so you will eventually free up the cluster.

 There isn't a way to 'freeze' or stop a job mid state.

 Is the issue that the job has a large number of slots, or is it an issue
 of the individual tasks taking a  long time to complete?

 If its the latter, you will probably want to go to a capacity scheduler
 over the fair scheduler.

 HTH

 -Mike

 On May 11, 2012, at 6:08 AM, Harsh J wrote:

 I do not know about the per-host slot control (that is most likely not
 supported, or not yet anyway - and perhaps feels wrong to do), but the
 rest of the needs can be doable if you use schedulers and
 queues/pools.

 If you use FairScheduler (FS), ensure that this job always goes to a
 special pool and when you want to freeze the pool simply set the
 pool's maxMaps and maxReduces to 0. Likewise, control max simultaneous
 tasks as you wish, to constrict instead of freeze. When you make
 changes to the FairScheduler configs, you do not need to restart the
 JT, and you may simply wait a few seconds for FairScheduler to refresh
 its own configs.

 More on FS at

 http://hadoop.apache.org/common/docs/current/fair_scheduler.html

 If you use CapacityScheduler (CS), then I believe you can do this by
 again making sure the job goes to a specific queue, and when needed to
 freeze it, simply set the queue's maximum-capacity to 0 (percentage)
 or to constrict it, choose a lower, positive percentage value as you
 need. You can also refresh CS to pick up config changes by refreshing
 queues via mradmin.

 More on CS at

 http://hadoop.apache.org/common/docs/current/capacity_scheduler.html

 Either approach will not freeze/constrict the job immediately, but
 should certainly prevent it from progressing. Meaning, their existing
 running tasks during the time of changes made to scheduler config will
 continue to run till completion but further tasks scheduling from
 those jobs shall begin seeing effect of the changes made.

 P.s. A better solution would be to make your job not take as many
 days, somehow? :-)

 On Fri, May 11, 2012 at 4:13 PM, Ritarmorgan...@gmail.com  wrote:

 I have a rather large map reduce job which takes few days. I was

 wondering

 if its possible for me to freeze the job or make the job less

 intensive. Is

 it possible to reduce the number of slots per host and then I can

 increase

 them overnight?


 tia

 --
 --- Get your facts first, then you can distort them as you please.--



 --
 Harsh J







--
Harsh J



Question on MapReduce

2012-05-11 Thread Satheesh Kumar
Hi,

I am a newbie on Hadoop and have a quick question on optimal compute vs.
storage resources for MapReduce.

If I have a multiprocessor node with 4 processors, will Hadoop schedule
higher number of Map or Reduce tasks on the system than on a uni-processor
system? In other words, does Hadoop detect denser systems and schedule
denser tasks on multiprocessor systems?

If yes, will that imply that it makes sense to attach higher capacity
storage to store more number of blocks on systems with dense compute?

Any insights will be very useful.

Thanks,
Satheesh


Re: DatanodeRegistration, socketTImeOutException

2012-05-11 Thread sulabh choudhury
I have set dfs.datanode.max.xcievers=4096 and have swapping turned off,
Regionserver Heap = 24 GB
Datanode Heap = 1 GB
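For reference, the hdfs-site.xml pieces involved look like this (the timeout
value shown is just the usual default, which is where the 480000 ms in the
logs below comes from):

<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>480000</value>
</property>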
On Fri, May 11, 2012 at 9:55 AM, sulabh choudhury sula...@gmail.com wrote:

 I have spent a lot of time trying to find a solution to this issue, but
 have had no luck. I think this is because of HBase's read/write
 pattern, but I do not see any related errors in the HBase logs.
 Does not look like it is because of a GC pause, but seeing several 480000
 ms timeouts certainly suggests something is really slowing down the *writes*
 (I do see this only in the write channel).

 In my dataNode logs I see tonnes of
 2012-05-11 09:34:30,953 WARN
 org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
 10.10.2.102:50010,
 storageID=DS-1494937024-10.10.2.102-50010-1305755343443, infoPort=50075,
 ipcPort=50020):Got exception while serving
 blk_-5331817573170456741_12784653 to /10.10.2.102:
 java.net.SocketTimeoutException: 480000 millis timeout while waiting for
 channel to be ready for *write*. ch :
 java.nio.channels.SocketChannel[connected local=/10.10.2.102:50010remote=/
 10.10.2.102:46752]
  at
 org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
 at
 org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
  at
 org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
 at
 org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:397)
  at
 org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:493)
 at
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:267)
  at
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:163)

 2012-05-11 09:34:30,953 ERROR
 org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(
 10.10.2.102:50010,
 storageID=DS-1494937024-10.10.2.102-50010-1305755343443, infoPort=50075,
 ipcPort=50020):DataXceiver
 java.net.SocketTimeoutException: 480000 millis timeout while waiting for
 channel to be ready for write. ch :
 java.nio.channels.SocketChannel[connected local=/10.10.2.102:50010remote=/
 10.10.2.102:46752]
  at
 org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
 at
 org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
  at
 org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
 at
 org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:397)
  at
 org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:493)
 at
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:267)
  at
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:163)


 This block is mapped to a Hbase region, from NN logs :-

 2012-05-10 15:46:35,117 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
 NameSystem.allocateBlock:
 /hbase/table1/5a84f3844b7fd049c73a78b78ba6c2cf/.tmp/1639371300072460962.
 blk_4283960240517860151_12781124
 2012-05-10 15:47:18,000 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
 NameSystem.addStoredBlock: blockMap updated: 10.10.2.103:50010 is added
 to blk_4283960240517860151_12781124 size 134217728
 2012-05-10 15:47:18,000 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
 NameSystem.addStoredBlock: blockMap updated: 10.10.2.102:50010 is added
 to blk_4283960240517860151_12781124 size 134217728



 I am running hbase-0.90.4-cdh3u3 on hadoop-0.20.2-cdh3u3




-- 

-- 
Thanks and Regards,
Sulabh Choudhury


RE: Question on MapReduce

2012-05-11 Thread Leo Leung
Nope, you must tune the config on that specific super node to give it more M/R
slots (this is for 1.0.x).
This does not mean the JobTracker will be eager to stuff that super node with
all the M/R jobs at hand.

It still goes through the scheduler; the Capacity Scheduler is most likely what
you have (check your config).

IMO, if the data locality is not going to be there, your cluster is going to
suffer from network I/O.
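
To make the split of responsibilities concrete, here is a minimal, hedged
sketch (old 1.0.x API; the class name, job name, and values are hypothetical):
the per-node slot maxima are daemon-side settings that each TaskTracker reads
from its own mapred-site.xml at start-up, while a job submitter only controls
things like how many reduce tasks the job requests.

// Hedged sketch, not a drop-in fix: shows which knobs are job-side vs. node-side.
import org.apache.hadoop.mapred.JobConf;

public class SlotKnobsSketch {                      // hypothetical class name
  public static void main(String[] args) {
    JobConf conf = new JobConf(SlotKnobsSketch.class);
    conf.setJobName("slot-knobs-sketch");           // hypothetical job name

    // Job-side: how many reduce tasks this job asks for.
    conf.setNumReduceTasks(8);

    // Node-side (NOT settable here): the slot maxima below live in the
    // super node's own mapred-site.xml and take effect only after its
    // TaskTracker restarts. The scheduler (e.g. the Capacity Scheduler)
    // still decides which of those slots actually receive tasks.
    //   mapred.tasktracker.map.tasks.maximum
    //   mapred.tasktracker.reduce.tasks.maximum
  }
}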


-Original Message-
From: Satheesh Kumar [mailto:nks...@gmail.com] 
Sent: Friday, May 11, 2012 9:51 AM
To: common-user@hadoop.apache.org
Subject: Question on MapReduce

Hi,

I am a newbie on Hadoop and have a quick question on optimal compute vs.
storage resources for MapReduce.

If I have a multiprocessor node with 4 processors, will Hadoop schedule higher 
number of Map or Reduce tasks on the system than on a uni-processor system? In 
other words, does Hadoop detect denser systems and schedule denser tasks on 
multiprocessor systems?

If yes, will that imply that it makes sense to attach higher capacity storage 
to store more number of blocks on systems with dense compute?

Any insights will be very useful.

Thanks,
Satheesh


Re: Question on MapReduce

2012-05-11 Thread Satheesh Kumar
Thanks, Leo. What is the config of a typical data node in a Hadoop cluster
- cores, storage capacity, and connectivity (SATA?)? How many task slots are
scheduled per core in general?

Is there a best practices guide somewhere?

Thanks,
Satheesh

On Fri, May 11, 2012 at 10:48 AM, Leo Leung lle...@ddn.com wrote:

 Nope, you must tune the config on that specific super node to have more
 M/R slots (this is for 1.0.x)
 This does not mean the JobTracker will be eager to stuff that super node
 with all the M/R jobs at hand.

 It still goes through the scheduler,  Capacity Scheduler is most likely
 what you have.  (check your config)

 IMO, If the data locality is not going to be there, your cluster is going
 to suffer from Network I/O.


 -Original Message-
 From: Satheesh Kumar [mailto:nks...@gmail.com]
 Sent: Friday, May 11, 2012 9:51 AM
 To: common-user@hadoop.apache.org
 Subject: Question on MapReduce

 Hi,

 I am a newbie on Hadoop and have a quick question on optimal compute vs.
 storage resources for MapReduce.

 If I have a multiprocessor node with 4 processors, will Hadoop schedule
 higher number of Map or Reduce tasks on the system than on a uni-processor
 system? In other words, does Hadoop detect denser systems and schedule
 denser tasks on multiprocessor systems?

 If yes, will that imply that it makes sense to attach higher capacity
 storage to store more number of blocks on systems with dense compute?

 Any insights will be very useful.

 Thanks,
 Satheesh



RE: Question on MapReduce

2012-05-11 Thread Leo Leung

This may be dated material.

Cloudera and HDP folks please correct with updates :)

http://www.cloudera.com/blog/2010/03/clouderas-support-team-shares-some-basic-hardware-recommendations/
http://www.cloudera.com/blog/2010/08/hadoophbase-capacity-planning/

http://hortonworks.com/blog/best-practices-for-selecting-apache-hadoop-hardware/

Hope this helps.



-Original Message-
From: Satheesh Kumar [mailto:nks...@gmail.com] 
Sent: Friday, May 11, 2012 12:48 PM
To: common-user@hadoop.apache.org
Subject: Re: Question on MapReduce

Thanks, Leo. What is the config of a typical data node in a Hadoop cluster
- cores, storage capacity, and connectivity (SATA?).? How many tasktrackers 
scheduled per core in general?

Is there a best practices guide somewhere?

Thanks,
Satheesh

On Fri, May 11, 2012 at 10:48 AM, Leo Leung lle...@ddn.com wrote:

 Nope, you must tune the config on that specific super node to have 
 more M/R slots (this is for 1.0.x) This does not mean the JobTracker 
 will be eager to stuff that super node with all the M/R jobs at hand.

 It still goes through the scheduler,  Capacity Scheduler is most 
 likely what you have.  (check your config)

 IMO, If the data locality is not going to be there, your cluster is 
 going to suffer from Network I/O.


 -Original Message-
 From: Satheesh Kumar [mailto:nks...@gmail.com]
 Sent: Friday, May 11, 2012 9:51 AM
 To: common-user@hadoop.apache.org
 Subject: Question on MapReduce

 Hi,

 I am a newbie on Hadoop and have a quick question on optimal compute vs.
 storage resources for MapReduce.

 If I have a multiprocessor node with 4 processors, will Hadoop 
 schedule higher number of Map or Reduce tasks on the system than on a 
 uni-processor system? In other words, does Hadoop detect denser 
 systems and schedule denser tasks on multiprocessor systems?

 If yes, will that imply that it makes sense to attach higher capacity 
 storage to store more number of blocks on systems with dense compute?

 Any insights will be very useful.

 Thanks,
 Satheesh



Re: How to maintain record boundaries

2012-05-11 Thread Ankur C. Goel
Record reader implementations are typically written to honor record
boundaries. This means that while reading a split's data they will continue
reading past the end of the split if the end of the current record has not
yet been encountered.
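
As a rough illustration of that behaviour, here is a condensed sketch in the
spirit of LineRecordReader (not its actual source; construction and field
set-up are omitted, and the class name is hypothetical):

// Sketch of a line-oriented reader that honors record boundaries: it will not
// START a record beyond the split end, but it WILL finish a record that begins
// inside the split and runs past it into the next block.
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.util.LineReader;

public class BoundaryAwareReaderSketch {
  private LineReader in;      // assumed to be opened and positioned at the split start
  private long pos;           // current byte offset in the file
  private long end;           // byte offset where this split ends
  private final LongWritable key = new LongWritable();
  private final Text value = new Text();

  public boolean nextKeyValue() throws IOException {
    if (pos > end) {
      return false;           // never start a new record outside the split
    }
    key.set(pos);
    int bytesRead = in.readLine(value);   // may read past 'end' to finish the line
    if (bytesRead == 0) {
      return false;           // end of file
    }
    pos += bytesRead;
    return true;
  }
}

The complementary convention is that every split except the first skips the
partial record at its start, because the reader of the previous split has
already consumed it; that is how a custom InputFormat/RecordReader pair keeps
records whole without any special block placement.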

-@nkur

On 5/11/12 5:15 AM, shreya@cognizant.com shreya@cognizant.com
wrote:

Hi

When we store data into HDFS, it gets broken into small pieces and
distributed across the cluster based on the block size for the file.
While processing the data with an MR program I want a particular record as a
whole, without it being split across nodes, but the data has already been
split and stored in HDFS when I loaded it.
How would I make sure that my record doesn't get split, and how would my
InputFormat make a difference now?

Regards
Shreya




Resource underutilization / final reduce tasks only uses half of cluster ( tasktracker map/reduce slots )

2012-05-11 Thread Jeremy Davis

I see mapred.tasktracker.reduce.tasks.maximum and 
mapred.tasktracker.map.tasks.maximum, but I'm wondering if there isn't another 
tuning parameter I need to look at.

I can tune the task tracker so that when I have many jobs running, with many
simultaneous maps and reduces, I utilize 95% of CPU and memory.

Inevitably, though, I end up with a huge final reduce task that only uses half
of my cluster, because I have reserved the other half for mapping.

Is there a way around this problem?

It seems like there should also be a separate maximum number of reducers for
when no map tasks are running.

-JD

Moving files from JBoss server to HDFS

2012-05-11 Thread financeturd financeturd
Hello,

We have a large number of 
custom-generated files (not just web logs) that we need to move from our JBoss 
servers to HDFS.  Our first implementation ran a cron job every 5 minutes to 
move our files from the output directory to HDFS.

Is this recommended?  We are being told by our IT team that our JBoss servers 
should not have access to HDFS for security reasons.  The files must be 
pulled into HDFS by other servers that do not accept traffic 
from the outside.  In essence, they are asking for a layer of 
indirection.  Instead of:
{JBoss server} -- {HDFS}
it's being requested that it look like:
{Separate server} -- {JBoss server}
and then
{Separate server} -- {HDFS}


While I understand in principle what is being said, the security of having 
processes on JBoss servers writing files to HDFS doesn't seem any worse than 
having Tomcat servers access a central database, which they do.

Can anyone comment on what a recommended approach would be?  Should our JBoss 
servers push their data to HDFS or should the data be pulled by another server 
and then placed into HDFS?
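
For what it's worth, here is a minimal sketch of the HDFS-loading half of the
pull option, run on the separate server after it has already fetched the JBoss
output into a local staging directory (the class name and both paths are
hypothetical):

// Hedged sketch only: load locally staged files into HDFS.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StagingToHdfsSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();       // picks up core-site.xml from the classpath
    FileSystem fs = FileSystem.get(conf);

    Path localStaging = new Path("/var/staging/jboss-output");  // hypothetical local dir
    Path hdfsTarget   = new Path("/data/incoming/jboss");       // hypothetical HDFS dir

    // delSrc=true removes the local copy once it is in HDFS, so a cron-style
    // run every few minutes does not re-upload the same files.
    fs.copyFromLocalFile(true, localStaging, hdfsTarget);
    fs.close();
  }
}

The push variant would run the same kind of code on the JBoss hosts instead,
which is exactly what our IT team objects to.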

Thank you!
FT