Too many fetch-failures ERROR
Hi all. The error I am encountering seems to be a very common one, but after two weeks of searching and trying every solution I could find, I am still stuck on it. So, I hope that someone can help me overcome this issue :)

First, I use Ubuntu 9.04 x86_64 and hadoop-0.20.2. I successfully set up a single-node installation based on the instructions of Michael G. Noll. Second, I set up Hadoop for multiple nodes, again following Noll's instructions, and ran into the error. These are my config files:

/etc/hosts

127.0.0.1       localhost
127.0.1.1       thailong-desktop
#192.168.1.2    localhost
#192.168.1.2    thailong-desktop

# The following lines are desirable for IPv6 capable hosts
#::1     localhost ip6-localhost ip6-loopback
#fe00::0 ip6-localnet
#ff00::0 ip6-mcastprefix
#ff02::1 ip6-allnodes
#ff02::2 ip6-allrouters
#ff02::3 ip6-allhosts

192.168.1.4     node1
192.168.1.2     master

core-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop-datastore/hadoop-${user.name}</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
    <description>The name of the default file system. A URI whose scheme
    and authority determine the FileSystem implementation. The uri's scheme
    determines the config property (fs.SCHEME.impl) naming the FileSystem
    implementation class. The uri's authority is used to determine the host,
    port, etc. for a filesystem.</description>
  </property>
</configuration>

mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <!-- In: conf/mapred-site.xml -->
  <property>
    <name>mapred.job.tracker</name>
    <value>master:54311</value>
    <description>The host and port that the MapReduce job tracker runs at.
    If "local", then jobs are run in-process as a single map and reduce task.
    </description>
  </property>
</configuration>

hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <!-- In: conf/hdfs-site.xml -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication. The actual number of replications
    can be specified when the file is created. The default is used if
    replication is not specified in create time.
    </description>
  </property>
</configuration>

I tried setting up Hadoop on a single node again, but this time, instead of using localhost, I set all the values to master, which is the hostname of the local machine, and the error is still there. It seems that there is a problem in mapred-site.xml: if I change mapred.job.tracker to localhost, or change the IP address of master in /etc/hosts to 127.0.1.1, the system runs without error. Is there something that I missed? This problem has haunted me for weeks; any help from you is precious to me.

Regards
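A minimal sketch of an alternative /etc/hosts for this two-node setup, assuming the fetch failures are caused by the machine's own hostname resolving to a loopback address (a commonly reported cause of "Too many fetch-failures"); the hostnames and addresses below are the ones from the message above, not a verified fix:

# /etc/hosts (sketch) - same on master (192.168.1.2) and node1 (192.168.1.4)
127.0.0.1       localhost
# Avoid also mapping the machine's own hostname (thailong-desktop / master)
# to 127.0.1.1: a TaskTracker whose hostname resolves to loopback can report
# an address that other nodes cannot fetch map output from.
192.168.1.2     master
192.168.1.4     node1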
Re: cluster under-utilization with Hadoop Fair Scheduler
Hi Abhishek,

This behavior is improved by MAPREDUCE-706, I believe (I'm not certain that that's the JIRA, but I know it's fixed in the trunk fair scheduler). These patches are included in CDH3 (currently in beta): http://archive.cloudera.com/cdh/3/

In general, though, map tasks that are so short are not going to be very efficient - even with fast assignment there is some constant overhead per task.

Thanks
-Todd

On Sun, Apr 11, 2010 at 11:42 AM, abhishek sharma absha...@usc.edu wrote:

Hi all,

I have been using the Hadoop Fair Scheduler for some experiments on a 100-node cluster with 2 map slots per node (hence, a total of 200 map slots). In one of my experiments, all the map tasks finish within a heartbeat interval of 3 seconds. I noticed that the maximum number of concurrently active map slots on my cluster never exceeds 100, and hence the cluster utilization during my experiments never exceeds 50%, even when large jobs with more than 1000 maps are being executed.

A look at the Fair Scheduler code (in particular, the assignTasks function) revealed the reason. As per my understanding, with the implementation in Hadoop 0.20.0, a TaskTracker is not assigned more than 1 map and 1 reduce task per heartbeat. In my experiments, in every heartbeat, each TT has 2 free map slots but is assigned only 1 map task, and hence the utilization never goes beyond 50%.

Of course, this (degenerate) case does not arise when map tasks take more than one heartbeat interval to finish. For example, I repeated the experiments with map tasks taking close to 15 s to finish and noticed close to 100% utilization when large jobs were executing.

Why does the Fair Scheduler not assign more than one map task to a TT per heartbeat? Is this done to spread the load uniformly across the cluster? I looked at the assignTasks function in the default Hadoop scheduler (JobQueueTaskScheduler.java), and it does assign more than 1 map task per heartbeat to a TT.

It will be easy to change the Fair Scheduler to assign more than 1 map task to a TT per heartbeat (I did that and achieved 100% utilization even with small map tasks). But I am wondering if doing so will violate some fairness properties.

Thanks,
Abhishek

--
Todd Lipcon
Software Engineer, Cloudera
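A minimal, self-contained sketch of the difference being discussed, using invented class and method names (this is not the real FairScheduler or JobQueueTaskScheduler API): one policy hands out at most one map task per heartbeat, the other keeps assigning until the TaskTracker's free map slots are used up.

// Illustrative only: models the two per-heartbeat assignment policies with
// made-up names; it does not use the real Hadoop scheduler classes.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HeartbeatAssignmentSketch {

    // One-map-per-heartbeat policy (the 0.20 fair scheduler behavior described above).
    static List<String> assignOnePerHeartbeat(int freeMapSlots, List<String> pendingMaps) {
        List<String> assigned = new ArrayList<>();
        if (freeMapSlots > 0 && !pendingMaps.isEmpty()) {
            assigned.add(pendingMaps.remove(0)); // at most one task, even with 2 free slots
        }
        return assigned;
    }

    // Fill-every-free-slot policy (what JobQueueTaskScheduler does, per the thread,
    // and what the local modification to the Fair Scheduler achieved).
    static List<String> assignUpToFreeSlots(int freeMapSlots, List<String> pendingMaps) {
        List<String> assigned = new ArrayList<>();
        while (assigned.size() < freeMapSlots && !pendingMaps.isEmpty()) {
            assigned.add(pendingMaps.remove(0)); // keep going until slots or pending tasks run out
        }
        return assigned;
    }

    public static void main(String[] args) {
        // A TaskTracker with 2 free map slots, as in the experiment above.
        List<String> pending = Arrays.asList("map-0", "map-1", "map-2", "map-3");
        System.out.println(assignOnePerHeartbeat(2, new ArrayList<>(pending))); // [map-0]        -> 1 of 2 slots used
        System.out.println(assignUpToFreeSlots(2, new ArrayList<>(pending)));   // [map-0, map-1] -> both slots used
    }
}

With map tasks that finish inside one heartbeat, the first policy leaves half of the slots idle each round, which matches the 50% utilization observed.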
Re: Too many fetch-failures ERROR
Hi,

I followed Michael G. Noll's blog post to set up a single-node installation on my laptop. I sometimes encountered this error too. I just used to restart Hadoop, and that would fix it, but I don't know the exact reason behind it.

Regards,
Raghava.

On Sun, Apr 11, 2010 at 6:05 AM, long thai thaithanhlong2...@gmail.com wrote:
Re: cluster under-utilization with Hadoop Fair Scheduler
Reading assignTasks() in 0.20.2 reveals that the number of map tasks assigned is not limited to 1 per heartbeat.

Cheers

On Sun, Apr 11, 2010 at 12:30 PM, Todd Lipcon t...@cloudera.com wrote:
Re: Announce: Karmasphere Studio for Hadoop 1.2.0
On Apr 10, 2010, at 7:10 PM, Shevek wrote:

* Full cross-platform support - Job submission, HDFS and S3 browsing from Windows, MacOS or Linux.

If you list three OSes, that isn't cross platform. :)
Re: Hadoop 0.20.3 hangs
On Apr 10, 2010, at 5:36 PM, rishi kapoor wrote:

With Hadoop 0.20.3 the command for accessing dfs just hangs (bin/hadoop dfs -ls, bin/hadoop dfs -put).

Hadoop 0.20.3 hasn't been released, so you'll need to be more explicit about what you are actually running.
How many nodes are there in the largest hadoop cluster worldwide?
Hi all,

I'm writing an article related to Hadoop and want to know how many nodes there are in the largest Hadoop cluster worldwide.

Regards
Re: Too many fetch-failures ERROR
Hi. For a single-node installation using localhost in the config files, Hadoop runs very well. However, if I change localhost to the hostname that is assigned to the local machine in the /etc/hosts file (in my case it is master), I receive the "Too many fetch-failures" error. I think there is a problem with transferring data to the mapred process. Am I right? Is there any way to solve it?

Regards.

On Mon, Apr 12, 2010 at 2:40 AM, Raghava Mutharaju m.vijayaragh...@gmail.com wrote:
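One generic way to check what the hostname used in mapred.job.tracker actually resolves to on each machine is a quick lookup like the sketch below. This is an illustrative helper, not something from the thread; a hostname that resolves to a loopback address such as 127.0.1.1 on one node, while the other node expects 192.168.1.2, is a commonly reported cause of fetch failures.

// ResolveCheck.java - prints what a hostname resolves to on this machine.
// Hypothetical helper for illustration; run it with the hostname used in
// mapred.job.tracker (e.g. "master") on both master and node1.
import java.net.InetAddress;

public class ResolveCheck {
    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "master"; // hostname to look up
        InetAddress addr = InetAddress.getByName(host);
        // If this prints 127.0.1.1 or 127.0.0.1 while other machines expect the
        // LAN address (192.168.1.2 here), reducers on those machines cannot
        // fetch map output from this TaskTracker.
        System.out.println(host + " -> " + addr.getHostAddress());
    }
}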