RE: Jobtracker could only be replicated to 0 nodes instead of 1
I had the same issue. Can you try disabling the firewall on both the datanode and the resourcemanager using sudo /etc/init.d/iptables stop? Regards, Smita

From: Sindhu Hosamane [mailto:sindh...@gmail.com]
Sent: Saturday, August 16, 2014 2:09 AM
To: user@hadoop.apache.org
Subject: Re: Jobtracker could only be replicated to 0 nodes instead of 1

When I checked hadoop dfsadmin -report I could see that 1 datanode is up, so I assume the datanode is working. I also see all 5 daemons running when I run the jps command. The only error I saw in the namenode logs is that jobtracker.info could only be replicated to 0 nodes instead of 1. In the other logs I found no errors, just some EOF exception warnings. I have attached the logs for reference. Please point me to what should be corrected.
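A minimal sketch of those checks as commands, assuming an init.d-based system (the firewall command comes from Smita's suggestion; the other two are the checks Sindhu describes):

    # On each node: stop the firewall so the datanode can reach the master
    sudo /etc/init.d/iptables stop
    # Confirm all daemons are running
    jps
    # Confirm the namenode actually sees a live datanode
    hadoop dfsadmin -report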
Jobtracker could only be replicated to 0 nodes instead of 1
Hello friends, I got the above error: jobtracker.info could only be replicated to 0 nodes instead of 1. I tried different solutions found on the web:
* Formatted the namenode
* Removed the tmp folder
* Cleaned unnecessary logs just to have more space
But still no success. What other solutions could there be? Your advice would be appreciated. Thank you. Regards, shosaman
Re: Jobtracker could only be replicated to 0 nodes instead of 1
You have set the replication factor to 1, so I am assuming this is a single-node cluster. I would recommend checking the datanode logs to see whether it was able to connect to the namenode successfully. On Fri, Aug 15, 2014 at 1:58 PM, sindhu hosamane sindh...@gmail.com wrote: [...] -- Nitin Pawar
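A sketch of the check Nitin describes, assuming the default log directory under $HADOOP_HOME/logs and the standard log file naming (both vary by install):

    # Scan the datanode log for trouble reaching the namenode
    tail -n 200 $HADOOP_HOME/logs/hadoop-*-datanode-*.log
    # Repeated "Retrying connect to server" lines mean the DN cannot register with the NN
    grep -i "retrying connect" $HADOOP_HOME/logs/hadoop-*-datanode-*.log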
could only be replicated to 0 nodes, instead of 1
I've been running up against the good old fashioned "replicated to 0 nodes" gremlin quite a bit recently. My system (a set of processes interacting with hadoop, and of course hadoop itself) runs for a while (a day or so) and then I get plagued with these errors. This is a very simple system: a single node running pseudo-distributed. Obviously, the replication factor is implicitly 1 and the datanode is the same machine as the namenode. None of the typical culprits seem to explain the situation and I'm not sure what to do. I'm also not sure how I'm getting around it so far; I fiddle desperately for a few hours and things start running again, but that's not really a solution. I've tried stopping and restarting hdfs, but that doesn't seem to improve things. So, to go through the common suspects one by one, as quoted on http://wiki.apache.org/hadoop/CouldOnlyBeReplicatedTo:

• No DataNode instances being up and running. Action: look at the servers, see if the processes are running. I can interact with hdfs through the command line (doing directory listings, for example). Furthermore, I can see that the relevant java processes are all running (NameNode, SecondaryNameNode, DataNode, JobTracker, TaskTracker).

• The DataNode instances cannot talk to the server, through networking or Hadoop configuration problems. Action: look at the logs of one of the DataNodes. Obviously irrelevant in a single-node scenario. Anyway, like I said, I can perform basic hdfs listings, I just can't upload new data.

• Your DataNode instances have no hard disk space in their configured data directories. Action: look at the dfs.data.dir list in the node configurations, verify that at least one of the directories exists and is writable by the user running the Hadoop processes. Then look at the logs. There's plenty of space, at least 50GB.

• Your DataNode instances have run out of space. Look at the disk capacity via the Namenode web pages. Delete old files. Compress under-used files. Buy more disks for existing servers (if there is room), upgrade the existing servers to bigger drives, or add some more servers. Nope, 50GB free, and I'm only uploading a few KB at a time, maybe a few MB.

• The reserved space for a DN (as set in dfs.datanode.du.reserved) is greater than the remaining free space, so the DN thinks it has no free space. I grepped all the files in the conf directory and couldn't find this parameter, so I don't really know anything about it. At any rate, it seems rather esoteric and I doubt it is related to my problem. Any thoughts on this?

• You may also get this message due to permissions, e.g. if the JT cannot create jobtracker.info on startup. Meh, like I said, the system basically works... and then stops working.

The only explanation that would really make sense in that context is running out of space... which isn't happening. If this were a permission error, or a configuration error, or anything weird like that, then the whole system would never get up and running in the first place. Why would a properly running hadoop system start exhibiting this error without running out of disk space? THAT's the real question on the table here. Any ideas?

Keith Wiley kwi...@keithwiley.com keithwiley.com music.keithwiley.com
Yet mark his perfect self-contentment, and hence learn his lesson, that to be self-contented is to be vile and ignorant, and that to aspire is better than to be blindly and impotently happy. -- Edwin A. Abbott, Flatland
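That walk through the wiki checklist compresses into a few commands; a sketch only, with paths that depend on the install (the conf directory location and the dfs.data.dir mount are assumptions here):

    # Suspect 1: are the daemons running?
    jps
    # Suspects 1 and 2: can we talk to HDFS at all?
    hadoop fs -ls /
    # Suspects 3 and 4: free space on the dfs.data.dir volume (substitute your own path)
    df -h /var/lib/hadoop-0.20/cache
    # Suspect 5: is a datanode space reservation configured anywhere?
    grep -r "dfs.datanode.du.reserved" conf/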
Re: could only be replicated to 0 nodes, instead of 1
- A datanode is typically kept free with up to 5 blocks' (HDFS block size) worth of space.
- Disk space is also used by mapreduce jobs to store temporary shuffle spills. This is what dfs.datanode.du.reserved is used to configure. The configuration is available in hdfs-site.xml. If you have not configured it, then the reserved space is 0. Besides mapreduce, other files might also take up disk space.

When these errors are thrown, please send the namenode web UI information. It has storage-related information in the cluster summary. That will help debug. On Tue, Sep 4, 2012 at 9:41 AM, Keith Wiley kwi...@keithwiley.com wrote: [...]

--
http://hortonworks.com/download/
Re: could only be replicated to 0 nodes, instead of 1
On Sep 4, 2012, at 10:05, Suresh Srinivas wrote: When these errors are thrown, please send the namenode web UI information. It has storage related information in the cluster summary. That will help debug.

Sure thing. Thanks. Here's what I currently see. It looks like the problem isn't the datanode, but rather the namenode. Would you agree with that assessment?

NameNode 'localhost:9000'
Started: Tue Sep 04 10:06:52 PDT 2012
Version: 0.20.2-cdh3u3, 03b655719d13929bd68bb2c2f9cee615b389cea9
Compiled: Thu Jan 26 11:55:16 PST 2012 by root from Unknown
Upgrades: There are no upgrades in progress.

Cluster Summary
Safe mode is ON. Resources are low on NN. Safe mode must be turned off manually.
1639 files and directories, 585 blocks = 2224 total. Heap Size is 39.55 MB / 888.94 MB (4%)
Configured Capacity: 49.21 GB
DFS Used: 9.9 MB
Non DFS Used: 2.68 GB
DFS Remaining: 46.53 GB
DFS Used%: 0.02 %
DFS Remaining%: 94.54 %
Live Nodes: 1
Dead Nodes: 0
Decommissioning Nodes: 0
Number of Under-Replicated Blocks: 5

NameNode Storage:
Storage Directory: /var/lib/hadoop-0.20/cache/hadoop/dfs/name   Type: IMAGE_AND_EDITS   State: Active

Keith Wiley kwi...@keithwiley.com keithwiley.com music.keithwiley.com
And what if we picked the wrong religion? Every week, we're just making God madder and madder! -- Homer Simpson
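The "Safe mode must be turned off manually" line in that summary is the operative symptom. A sketch of the corresponding command-line checks (0.20-era dfsadmin syntax):

    # Report whether the namenode is in safe mode
    hadoop dfsadmin -safemode get
    # Turn it off manually; only sensible once the low-resource condition is fixed
    hadoop dfsadmin -safemode leave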
Re: could only be replicated to 0 nodes, instead of 1
Keith, assuming that you were seeing the problem when you captured the namenode web UI info, it is not related to what I suspected. This might be a good question for the CDH forums, given this is not an Apache release. Regards, Suresh On Tue, Sep 4, 2012 at 10:20 AM, Keith Wiley kwi...@keithwiley.com wrote: [...]

--
http://hortonworks.com/download/
Re: could only be replicated to 0 nodes, instead of 1
Keith, the NameNode has a resource-checker thread in it by design, to help prevent on-disk metadata corruption in the event of filled-up dfs.namenode.name.dir disks, etc. By default, an NN will lock itself up if the free disk space (among its configured metadata mounts) falls below 100 MB, controlled by dfs.namenode.resource.du.reserved. You can probably set that to 0 if you do not want such an automatic preventive measure. It's not exactly a need, just a check to help avoid accidental data loss due to non-monitoring of disk space. On Tue, Sep 4, 2012 at 11:33 PM, Keith Wiley kwi...@keithwiley.com wrote: I had moved the data directory to the larger disk but left the namenode directory on the smaller disk, figuring it didn't need much room. Moving that to the larger disk seems to have improved the situation... although I'm still surprised the NN needed so much room. Problem is solved for now. Thanks. Keith Wiley kwi...@keithwiley.com keithwiley.com music.keithwiley.com I used to be with it, but then they changed what it was. Now, what I'm with isn't it, and what's it seems weird and scary to me. -- Abe (Grandpa) Simpson -- Harsh J
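For reference, a sketch of how that knob would appear in hdfs-site.xml. The property name and semantics come from Harsh's message; the value of 0 follows his suggestion to disable the check (the default reservation is 100 MB, and the value is given in bytes):

    <property>
      <name>dfs.namenode.resource.du.reserved</name>
      <!-- 0 disables the NN free-space check; the default reserves 100 MB -->
      <value>0</value>
    </property>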
Re: could only be replicated to 0 nodes, instead of 1
Harsh, thanks, you are right. The problem stems from the tmp directory space not being large enough. After changing the tmp dir to another place, the problem goes away. But I remember one (default) block size in hdfs is 64m, so shouldn't it at least allow one file, whose actual size on the local disk is smaller than 1k, to be uploaded? Thanks again for the advice. On Fri, Jul 15, 2011 at 7:49 PM, Harsh J ha...@cloudera.com wrote: [...]
Re: could only be replicated to 0 nodes, instead of 1
The actual check is done to see if 5 blocks' worth of space remains available. With the default 64 MB block size, that works out to 5 x 64 MB = 320 MB per datanode, which is why a 101 MB /tmp tmpfs rejects even a sub-1k file. On Sat, Jul 16, 2011 at 1:52 PM, Thomas Anderson t.dt.aander...@gmail.com wrote: Harsh, Thanks, you are right. The problem stems from the tmp directory space not being large enough. [...] But I remember one block size (default) in hdfs is 64m, so shouldn't it at least allow one file, whose actual size in local disk is smaller than 1k, to be uploaded? [...]
Re: could only be replicated to 0 nodes, instead of 1
Before posting the question I did indeed follow the wiki pages http://wiki.apache.org/hadoop/FAQ#What_does_.22file_could_only_be_replicated_to_0_nodes.2C_instead_of_1.22_mean.3F and http://wiki.apache.org/hadoop/HowToSetupYourDevelopmentEnvironment, checking local disk space (*), etc., but none of those suggestions worked. So I would appreciate any advice.

*: df -kh shows
/dev/sda1 9.4G 2.3G 6.7G 25% /
i.e. only 25% of the disk space is used.

On Fri, Jul 15, 2011 at 11:39 AM, Thomas Anderson t.dt.aander...@gmail.com wrote: [...]
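Local df output can be misleading for this error: what matters is the free space each datanode reports to the namenode, which can be near zero even when the root filesystem looks mostly empty. A sketch of how to see the namenode's view (output format varies by version):

    # Per-datanode configured capacity and remaining space, as the NameNode sees it
    hadoop dfsadmin -report
    # The same numbers appear in the Live Nodes table of the NN web UI,
    # e.g. http://server01:50070/dfshealth.jsp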
Re: could only be replicated to 0 nodes, instead of 1
1.) The disk usage (with df -kh) on the namenode (server01):
/dev/sda1 9.4G 2.3G 6.7G 25% /
and the datanodes (server02 ~ server05):
/dev/sda1 9.4G 2.2G 6.8G 25% /

2.) How can I make sure that the datanode is busy? The environment is only for testing, so no other user processes are running at that moment. It is also a fresh installation, so only hadoop-required packages are installed, such as hadoop and the jdk.

3.) fs.block.size is not set in hdfs-site.xml on either the datanodes or the namenode, because the purpose is testing. I thought it would use the default value, which should be 512?

4.) What might be a good way to quickly check whether the network is unstable? I check the health page, e.g. server01:50070/dfshealth.jsp, where live nodes are up and Last Contact varies when checking the page.

Node      Last Contact  Admin State  Configured Capacity (GB)  Used (GB)  Non DFS Used (GB)  Remaining (GB)  Used (%)  Remaining (%)  Blocks
server02  2             In Service   0.1                       0          0                  0.1             0.01      99.96          0
server03  0             In Service   0.1                       0          0                  0.1             0.01      99.96          0
server04  1             In Service   0.1                       0          0                  0.1             0.01      99.96          0
server05  2             In Service   0.1                       0          0                  0.1             0.01      99.96          0

5.) Only the command `hadoop fs -put /tmp/testfile test` is issued, as it is just to test whether the installation works. The file, e.g. testfile, is removed first (hadoop fs -rm test/testfile), then uploaded again with the hadoop put command. The logs are listed below:
namenode: server01: http://pastebin.com/TLpDmmPx
datanodes: server02: http://pastebin.com/pdE5XKfi server03: http://pastebin.com/4aV7ECCV server04: http://pastebin.com/tF7HiRZj server05: http://pastebin.com/5qwSPrvU
Please let me know if more information needs to be provided. I really appreciate your suggestions. Thank you.

On Fri, Jul 15, 2011 at 4:54 PM, Brahma Reddy brahmared...@huawei.com wrote:
Hi, seeing this exception (could only be replicated to 0 nodes, instead of 1) means a datanode is not available to the Name Node. These are the cases in which a Data Node may not be available to the Name Node:
1) The Data Node disk is full
2) The Data Node is busy with block reports and block scanning
3) The block size is a negative value (dfs.block.size in hdfs-site.xml)
4) A primary datanode goes down while a write is in progress (any network fluctuations between the Name Node and Data Node machines)
5) Whenever we append a partial chunk and call sync, for subsequent partial-chunk appends the client should keep the previous data in its buffer. For example, after appending "a" I call sync, and when I then try to append, the buffer should hold "ab". On the server side, when the chunk is not a multiple of 512, it will compare the CRC for the data present in the block file with the CRC present in the metafile, but while constructing the CRC for the data in the block it always compares up to the initial offset.
For more analysis, please see the data node logs. Warm Regards, Brahma Reddy

-Original Message- From: Thomas Anderson [mailto:t.dt.aander...@gmail.com] Sent: Friday, July 15, 2011 9:09 AM To: hdfs-user@hadoop.apache.org Subject: could only be replicated to 0 nodes, instead of 1 [...]
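A sketch for quickly ruling out causes 3 and 4 from Brahma's list; the host names (server02 ~ server05) match this thread's setup, and the conf path is an assumption about the install layout:

    # Cause 3: verify no negative block size has been configured
    grep -A1 "dfs.block.size" conf/hdfs-site.xml
    # Cause 4: rough check for network flakiness between the namenode and each datanode
    for h in server02 server03 server04 server05; do ping -c 3 $h; done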
Re: could only be replicated to 0 nodes, instead of 1
Thomas, is your /tmp mount point also under / or is it separate? Your dfs.data.dir is /tmp/hadoop-user/dfs/data on all DNs, and if they are separately mounted, then what's the available space on that? (It's a bad idea in production to keep things default on /tmp, though, like dfs.name.dir and dfs.data.dir; reconfigure and restart as necessary.) On Fri, Jul 15, 2011 at 3:47 PM, Thomas Anderson t.dt.aander...@gmail.com wrote: [...]
Re: could only be replicated to 0 nodes, instead of 1
(P.s. I asked that because if you look at your NN's Live Nodes table, the reported space is all 0.) What's the output of: du -sk /tmp/hadoop-user/dfs on all your DNs? On Fri, Jul 15, 2011 at 4:01 PM, Harsh J ha...@cloudera.com wrote: [...]
Re: could only be replicated to 0 nodes, instead of 1
Thomas, your problem might lie simply with the virtual node DNs using /tmp and tmpfs being used for that, which somehow causes the reported free space to go to 0 in reports to the NN (master):

tmpfs 101M 44K 101M 1% /tmp

This causes your trouble: the NN can't choose a suitable DN to write to, because it determines that none has at least a block size worth of space (64MB default) available for writes. You can resolve it as follows:
1. Stop DFS completely.
2. Create a directory under root somewhere (I use Cloudera's distro, and its default configured location for data files comes along as /var/lib/hadoop-0.20/cache/, if you need an idea for a location) and set it as your hadoop.tmp.dir in core-site.xml on all the nodes.
3. Reformat your NameNode (hadoop namenode -format, say Y) and restart DFS.
Things _should_ be OK now. Config example (core-site.xml):

<property>
  <name>hadoop.tmp.dir</name>
  <value>/var/lib/hadoop-0.20/cache/</value>
</property>

Let us know if this still doesn't get your dev cluster up and running for action :)

On Fri, Jul 15, 2011 at 4:40 PM, Thomas Anderson t.dt.aander...@gmail.com wrote: When doing the partitioning, I remember only / and swap were specified for all nodes during creation. So I think /tmp is also mounted under /, which should have a size of around 9G. The total size of the hard disk specified is 10G. df -kh shows:

server01:
/dev/sda1 9.4G 2.3G 6.7G 25% /
tmpfs 5.0M 4.0K 5.0M 1% /lib/init/rw
tmpfs 5.0M 0 5.0M 0% /var/run/lock
tmpfs 101M 132K 101M 1% /tmp
udev 247M 0 247M 0% /dev
tmpfs 101M 0 101M 0% /var/run/shm
tmpfs 51M 176K 51M 1% /var/run

server02 ~ server05 (identical on all four):
/dev/sda1 9.4G 2.2G 6.8G 25% /
tmpfs 5.0M 4.0K 5.0M 1% /lib/init/rw
tmpfs 5.0M 0 5.0M 0% /var/run/lock
tmpfs 101M 44K 101M 1% /tmp
udev 247M 0 247M 0% /dev
tmpfs 101M 0 101M 0% /var/run/shm
tmpfs 51M 176K 51M 1% /var/run

In addition, the output of du -sk /tmp/hadoop-user/dfs is:
server02: 8 /tmp/hadoop-user/dfs/
server03: 8 /tmp/hadoop-user/dfs/
server04: 8 /tmp/hadoop-user/dfs/
server05: 8 /tmp/hadoop-user/dfs/

On Fri, Jul 15, 2011 at 7:01 PM, Harsh J ha...@cloudera.com wrote: [...]
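Harsh's three steps as commands; a sketch assuming a 0.20-style install with the scripts in bin/, that core-site.xml has already been edited on every node as shown above, and that the daemon user/group is user:supergroup as elsewhere in this thread (note the reformat wipes any existing HDFS data):

    # 1. Stop DFS completely
    bin/stop-dfs.sh
    # 2. Create the new storage directory on every node, owned by the daemon user
    sudo mkdir -p /var/lib/hadoop-0.20/cache
    sudo chown -R user:supergroup /var/lib/hadoop-0.20/cache   # adjust to your daemon user
    # 3. Reformat the NameNode and restart DFS
    bin/hadoop namenode -format
    bin/start-dfs.sh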
could only be replicated to 0 nodes, instead of 1
I have a fresh hadoop 0.20.2 installed on virtualbox 4.0.8 with jdk 1.6.0_26. The problem is that when trying to put a file to hdfs, it throws the error `org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /path/to/file could only be replicated to 0 nodes, instead of 1'; however, there is no problem creating a folder, as the command ls prints the result:

Found 1 items
drwxr-xr-x - user supergroup 0 2011-07-15 11:09 /user/user/test

I also tried flushing the firewall (removing all iptables restrictions), but the error message is still thrown when uploading (hadoop fs -put /tmp/x test) a file from the local fs. The name node log shows:

2011-07-15 10:42:43,491 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from aaa.bbb.ccc.ddd.22:50010 storage DS-929017105-aaa.bbb.ccc.22-50010-1310697763488
2011-07-15 10:42:43,495 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/aaa.bbb.ccc.22:50010
2011-07-15 10:42:44,169 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from aaa.bbb.ccc.35:50010 storage DS-884574392-aaa.bbb.ccc.35-50010-1310697764164
2011-07-15 10:42:44,170 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/aaa.bbb.ccc.35:50010
2011-07-15 10:42:44,507 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from aaa.bbb.ccc.ddd.11:50010 storage DS-1537583073-aaa.bbb.ccc.11-50010-1310697764488
2011-07-15 10:42:44,507 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/aaa.bbb.ccc.11:50010
2011-07-15 10:42:45,796 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 140.127.220.25:50010 storage DS-1500589162-aaa.bbb.ccc.25-50010-1310697765386
2011-07-15 10:42:45,797 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/aaa.bbb.ccc.25:50010

And all datanodes have messages similar to the ones below:

2011-07-15 10:42:46,562 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: using BLOCKREPORT_INTERVAL of 360msec Initial delay: 0msec
2011-07-15 10:42:47,163 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0 blocks got processed in 3 msecs
2011-07-15 10:42:47,187 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting Periodic block scanner.
2011-07-15 11:19:42,931 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0 blocks got processed in 1 msecs

The command `hadoop fsck /` displays:

Status: HEALTHY
 Total size: 0 B
 Total dirs: 3
 Total files: 0 (Files currently being written: 1)
 Total blocks (validated): 0
 Minimally replicated blocks: 0
 Over-replicated blocks: 0
 Under-replicated blocks: 0
 Mis-replicated blocks: 0
 Default replication factor: 3
 Average block replication: 0.0
 Corrupt blocks: 0
 Missing replicas: 0
 Number of data-nodes: 4

The settings in conf include:

Master node, core-site.xml:
<property>
  <name>fs.default.name</name>
  <value>hdfs://lab01:9000/</value>
</property>
hdfs-site.xml:
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>

Slave nodes, core-site.xml:
<property>
  <name>fs.default.name</name>
  <value>hdfs://lab01:9000/</value>
</property>
hdfs-site.xml:
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>

Am I missing any configuration? Or is there any other place I can check? Thanks.
jobtracker.info could only be replicated to 0 nodes, instead of 1
Dear Hadoop Users, I am very new to hadoop; I am just trying to run the tutorials. Currently I am trying to run the Pseudo-Distributed Operation (http://hadoop.apache.org/common/docs/stable/single_node_setup.html). I have found that other users have had this same problem, but none of the workarounds worked for me. I will copy a fragment of the namenode LOG:

2011-07-08 09:34:38,422 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK* NameSystem.processReport: from 127.0.0.1:50010, blocks: 0, processing time: 1 msecs
2011-07-08 09:38:41,887 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0
2011-07-08 09:38:42,100 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place enough replicas, still in need of 1
2011-07-08 09:38:42,101 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9000, call addBlock(/tmp/hadoop-gpabon/mapred/system/jobtracker.info, DFSClient_383801114) from 127.0.0.1:60003: error: java.io.IOException: File /tmp/hadoop-gpabon/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1

Note that the timestamps of the logs start at 09:34 and the IOException was raised at 09:38. This is because one workaround suggests starting dfs first, waiting some minutes, and then starting the mapred service.

Hadoop Version: 0.20.203.0
Java: jdk1.6.0_26
OS: Linux Debian on VirtualBox, kernel 2.6.39-2-686-pae

Thank you very much in advance for your valuable help. Best regards, Gustavo P.
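A scripted form of that workaround; a sketch (0.20 script names) that replaces the fixed wait with a command that blocks until HDFS actually leaves safe mode, so jobtracker.info can be written:

    # Start HDFS first
    bin/start-dfs.sh
    # Block until the namenode leaves safe mode, instead of sleeping a fixed time
    bin/hadoop dfsadmin -safemode wait
    # Only then start the mapred daemons
    bin/start-mapred.sh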
Question on Error : could only be replicated to 0 nodes, instead of 1
Greetings all, I get the following error at seemingly irregular intervals when I'm trying to do the following:

hadoop fs -put /scratch1/tdp/data/* input

(The data is a few hundred files of wikistats data, about 75GB in total.)

11/03/04 15:55:05 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/pedersen/input/pagecounts-20110129-020001 could only be replicated to 0 nodes, instead of 1

I've searched around on the error message, and have actually found a lot of postings, but they seem to be as irregular as the error itself (both in terms of explanations and fixes):
http://www.mail-archive.com/common-user@hadoop.apache.org/msg00407.html
http://wiki.apache.org/hadoop/HowToSetupYourDevelopmentEnvironment
http://www.phacai.com/hadoop-error-could-only-be-replicated-to-0-nodes-instead-of-1
http://permalink.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/20198
Is there a best currently understood explanation for this error, and a preferred way to resolve it? We are running in fully distributed mode here... Thanks! Ted -- Ted Pedersen http://www.d.umn.edu/~tpederse
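One arithmetic check worth doing for a bulk load like this: if the replication factor is 3 (the HDFS default; not stated in the message), 75GB of input needs roughly 3 x 75 = 225GB of aggregate DFS space, and the put fails as soon as no datanode can accept another block. A sketch of the corresponding checks (report format varies by version, and the conf path is an assumption):

    # Configured replication factor that the estimate depends on
    grep -A1 "dfs.replication" conf/hdfs-site.xml
    # Watch per-datanode remaining space while the put runs
    hadoop dfsadmin -report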
could only be replicated to 0 nodes, instead of 1
Hello everyone :) Any idea how to troubleshoot this please? I don't get where this comes from.

2010-07-01 08:29:02,721 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /acd/Dispatch/data/user100/historyreport/020920360E55A807CECDFC34621E90C5/last could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.GeneratedMethodAccessor68.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
        at org.apache.hadoop.ipc.Client.call(Client.java:740)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy1.addBlock(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy1.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2937)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2819)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)

Thank you.
--
http://www.neko-consulting.com
Ego sum quis ego servo
Je suis ce que je protège
I am what I protect
Re: could only be replicated to 0 nodes, instead of 1
Did you check if your Datanode(s) is/are up? The Namenode's web interface reports statistics about live/dead datanodes. The NN is trying to replicate your data to N (in your case, 1) datanodes but it can find none.

On Thu, Jul 1, 2010 at 1:48 PM, Pierre ANCELOT pierre...@gmail.com wrote:
Hello everyone :) Any idea how to troubleshoot this, please? I don't get where this comes from.
2010-07-01 08:29:02,721 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /acd/Dispatch/data/user100/historyreport/020920360E55A807CECDFC34621E90C5/last could only be replicated to 0 nodes, instead of 1
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
 ...
Thank you.
--
http://www.neko-consulting.com
Ego sum quis ego servo
Je suis ce que je protège
I am what I protect

--
Harsh J
www.harshj.com
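For reference, the same live/dead information is available from the command line; a quick sketch (in Hadoop versions of this era the report includes a "Datanodes available" line near the top, though the exact wording varies by version):

    $ bin/hadoop dfsadmin -report

If the live datanode count there is 0, the NN has no registered datanodes to place replicas on, which is exactly the condition behind "could only be replicated to 0 nodes".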
Re: could only be replicated to 0 nodes, instead of 1
Hi,
Well, actually, I'm configured to have 3 replicas, the default. So this is a first issue. And yes, the datanode is up and responding; this problem only happens from time to time.
Thanks :)

On Thu, Jul 1, 2010 at 10:49 AM, Harsh J qwertyman...@gmail.com wrote:
Did you check if your Datanode(s) is/are up? The Namenode's web interface reports statistics about live/dead datanodes. The NN is trying to replicate your data to N (in your case, 1) datanodes but it can find none.

On Thu, Jul 1, 2010 at 1:48 PM, Pierre ANCELOT pierre...@gmail.com wrote:
Hello everyone :) Any idea how to troubleshoot this, please? I don't get where this comes from.
2010-07-01 08:29:02,721 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /acd/Dispatch/data/user100/historyreport/020920360E55A807CECDFC34621E90C5/last could only be replicated to 0 nodes, instead of 1
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
 ...
Thank you.

--
Harsh J
www.harshj.com

--
http://www.neko-consulting.com
Ego sum quis ego servo
Je suis ce que je protège
I am what I protect
Re: Does error could only be replicated to 0 nodes, instead of 1 mean no datanodes available?
Hello, here is the output of hadoop fsck /:

Status: HEALTHY
 Total size: 0 B
 Total dirs: 2
 Total files: 0 (Files currently being written: 1)
 Total blocks (validated): 0
 Minimally replicated blocks: 0
 Over-replicated blocks: 0
 Under-replicated blocks: 0
 Mis-replicated blocks: 0
 Default replication factor: 3
 Average block replication: 0.0
 Corrupt blocks: 0
 Missing replicas: 0
 Number of data-nodes: 4
 Number of racks: 1

Currently, I have set the dfs.http.address configuration property in hdfs-site.xml; all other errors are gone, except this error on the primary namenode:

10/05/27 14:21:06 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/alex/check_ssh.sh could only be replicated to 0 nodes, instead of 1
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
 ...
10/05/27 14:21:06 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
10/05/27 14:21:06 WARN hdfs.DFSClient: Could not get block locations. Source file /user/alex/check_ssh.sh - Aborting...
put: java.io.IOException: File /user/alex/check_ssh.sh could only be replicated to 0 nodes, instead of 1
10/05/27 14:21:06 ERROR hdfs.DFSClient: Exception closing file /user/alex/check_ssh.sh : org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/alex/check_ssh.sh could only be replicated to 0 nodes, instead of 1
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
 ...
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/alex/check_ssh.sh could only be replicated to 0 nodes, instead of 1
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
 ...

On 05/27/2010 12:23 AM, Eric Sammer wrote:
Alex:
From the data node / secondary NN exceptions, it appears that nothing can talk to your name node. Take a look in the name node logs and look for where data node registration happens. Is it possible the NN disk is full? My guess is that there's something odd happening with the state on the name node. What does hadoop fsck / look like?

On Wed, May 26, 2010 at 6:53 AM, Alex Luya alexander.l...@gmail.com wrote:
Hello:
I got this error when putting files into hdfs; it seems to be an old issue, and I followed the solution from this link:
http://adityadesai.wordpress.com/2009/02/26/another-problem-with-hadoop-jobjar-could-only-be-replicated-to-0-nodes-instead-of-1io-exception/
but the problem still exists, so I tried to figure it out through the source code:

org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock()

// choose targets for the new block to be allocated.
DatanodeDescriptor targets[] = replicator.chooseTarget(replication, clientNode, null, blockSize);
if (targets.length < this.minReplication) {
  throw new IOException("File " + src + " could only be replicated to " + targets.length + " nodes, instead of " + minReplication);
}

I think DatanodeDescriptor represents a datanode, so here targets.length means the number of datanodes; clearly, it is 0 -- in other words, no datanode is available. But in the web interface (localhost:50070) I can see 4 live nodes (I have 4 nodes only), and hadoop dfsadmin -report shows 4 nodes as well. That is strange. And I got this error message in the secondary namenode:
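As a follow-up on fsck: it can also show where (if anywhere) blocks are actually being placed. A quick sketch using flags available in this era of Hadoop (the report layout varies by version):

    $ bin/hadoop fsck / -files -blocks -locations

This lists each file with its blocks and the datanodes holding each replica, which helps distinguish "no datanodes registered" from "datanodes registered but rejected as write targets".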
Does error could only be replicated to 0 nodes, instead of 1 mean no datanodes available?
Hello:
I got this error when putting files into hdfs; it seems to be an old issue, and I followed the solution from this link:
http://adityadesai.wordpress.com/2009/02/26/another-problem-with-hadoop-jobjar-could-only-be-replicated-to-0-nodes-instead-of-1io-exception/
but the problem still exists, so I tried to figure it out through the source code:

org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock()

// choose targets for the new block to be allocated.
DatanodeDescriptor targets[] = replicator.chooseTarget(replication, clientNode, null, blockSize);
if (targets.length < this.minReplication) {
  throw new IOException("File " + src + " could only be replicated to " + targets.length + " nodes, instead of " + minReplication);
}

I think DatanodeDescriptor represents a datanode, so here targets.length means the number of datanodes; clearly, it is 0 -- in other words, no datanode is available. But in the web interface (localhost:50070) I can see 4 live nodes (I have 4 nodes only), and hadoop dfsadmin -report shows 4 nodes as well. That is strange. And I got this error message in the secondary namenode:

2010-05-26 16:26:39,588 INFO org.apache.hadoop.hdfs.server.common.Storage: Recovering storage directory /home/alex/tmp/dfs/namesecondary from failed checkpoint.
2010-05-26 16:26:39,593 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint:
2010-05-26 16:26:39,594 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.net.ConnectException: Connection refused
 at java.net.PlainSocketImpl.socketConnect(Native Method)
 at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
 at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:193)
 at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
 at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
 ..

and this error message in the datanode:

2010-05-26 16:07:49,039 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.1.3:50010, storageID=DS-1180479012-192.168.1.3-50010-1274799233678, infoPort=50075, ipcPort=50020):DataXceiver java.io.IOException: Connection reset by peer
 at sun.nio.ch.FileDispatcher.read0(Native Method)
 at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
 at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
 at sun.nio.ch.IOUtil.read(IOUtil.java:206)
 .

It seems like the network ports aren't open, but after scanning with nmap I can confirm that all network ports on the relevant nodes are open. After two days of effort, the result is zero. Can anybody help me troubleshoot? Thank you.

(Following is the relevant info: my cluster configuration, the content of the conf files, the output of hadoop dfsadmin -report, and the Java error message stack.)

My configuration is: ubuntu 10.04 64 bit + jdk1.6.0_20 + hadoop 0.20.2

core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://AlexLuya</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/alex/tmp</value>
hadoop start error - could only be replicated to 0 nodes, instead of 1
Hi,
I am trying to set up a Hadoop cluster. I have a server machine, and I installed XenServer on it. I installed 3 Debian Lenny VMs on the XenServer, and on every VM I installed sun-java6-jre, openssh-server, rsync, and hadoop. VM jack is the namenode; VMs kim and lynne are datanodes. Everything was fine until I ran bin/start-all.sh. I attached three configuration files and the hadoop-cs-namenode-jack.log file, which printed a lot of errors. Two important errors are:

java.lang.IllegalArgumentException: Duplicate metricsName:getProtocolVersion
java.io.IOException: File /home/cs/HadoopInstall/tmp/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1

Thanks.
Dennis

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/cs/HadoopInstall/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://jack:9000</value>
    <description>The name of the default file system. Either the literal string local or a host:port for DFS.</description>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/cr/HadoopInstall/filesystem/name</value>
    <description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/cr/HadoopInstall/filesystem/data</value>
    <description>Determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
  </property>
</configuration>

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
  </property>
</configuration>

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hdfs://jack:9001</value>
    <description>The host and port that the MapReduce job tracker runs at. If local, then jobs are run in-process as a single map and reduce task.</description>
  </property>
</configuration>
hadoop start error - could only be replicated to 0 nodes, instead of 1
Sorry, forgot to attach the log file.
could only be replicated to 0 nodes, instead of 1 (java.io.EOFException)
Hi all,
I have just done a fresh install of hadoop-0.20.1 on a small cluster and can't get it to start up. Could someone please help me diagnose where I might be going wrong? Below are snippets of logs from the namenode, a datanode, and a tasktracker. I have successfully formatted the namenode:

09/10/13 15:18:51 INFO common.Storage: Storage directory /hadoop/name has been successfully formatted.

Any advice is greatly appreciated, and please let me know if there is more info I can provide.
Thanks
Tim

The namenode is reporting:
---
2009-10-13 15:00:24,758 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 8020, call addBlock(/hadoop/mapred/system/jobtracker.info, DFSClient_-262825200) from 192.38.28.30:49642: error: java.io.IOException: File /hadoop/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /hadoop/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
---

And the datanodes are reporting repeatedly:
---
2009-10-13 15:20:40,773 INFO org.apache.hadoop.ipc.RPC: Server at hdfs-master.local/169.254.97.194:8020 not available yet, Z...
2009-10-13 15:20:42,774 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hdfs-master.local/169.254.97.194:8020. Already tried 0 time(s).
---

The task trackers are reporting:
---
2009-10-13 15:06:27,034 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.io.IOException: Call to hdfs-master.local/169.254.97.194:50070 failed on local exception: java.io.EOFException
 at org.apache.hadoop.ipc.Client.wrapException(Client.java:774)
 at org.apache.hadoop.ipc.Client.call(Client.java:742)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
 at org.apache.hadoop.mapred.$Proxy4.getProtocolVersion(Unknown Source)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:346)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:383)
 at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:314)
 at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:291)
 at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:514)
 at org.apache.hadoop.mapred.TaskTracker.init(TaskTracker.java:934)
 at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2833)
Caused by: java.io.EOFException
 at java.io.DataInputStream.readInt(DataInputStream.java:375)
 at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:508)
 at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
---
could only be replicated to 0 nodes, instead of 1
Hi, All
I just started to use Hadoop a few days ago. I get the error message

WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/count/count/temp1 could only be replicated to 0 nodes, instead of 1

while trying to copy data files to DFS after Hadoop is started. I did all the settings according to the Running_Hadoop_On_Ubuntu_Linux_(Single-Node_Cluster) instructions, and I don't know what's wrong. Besides, during the process, no error message is written to the log files. Also, according to http://localhost.localdomain:50070/dfshealth.jsp, I have one live namenode. Through the browser, I can even see that the first data file is created in DFS, but its size is 0.

Things I've tried:
1. Stop hadoop, re-format DFS, and start hadoop again.
2. Change localhost to 127.0.0.1

But neither of them works. Could anyone help me or give me a hint? Thanks.
Anthony
Re: could only be replicated to 0 nodes, instead of 1
The full error message is:

09/07/02 16:28:09 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /user/hadoop/count/count/temp1 retries left 1
09/07/02 16:28:12 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/count/count/temp1 could only be replicated to 0 nodes, instead of 1
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1280)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)
 at org.apache.hadoop.ipc.Client.call(Client.java:697)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
 at $Proxy0.addBlock(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
 at $Proxy0.addBlock(Unknown Source)
 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2814)
 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2696)
 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1996)
 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)
Hadoop error help- file system closed, could only be replicated to 0 nodes, instead of 1
Hi,
I am extremely new to Hadoop and have come across a few errors that I'm not sure how to fix. I am running Hadoop version 0.19.0 from an image through ElasticFox and S3. I am on Windows and use PuTTY as my ssh client. I am trying to run a wordcount with 5 slaves. This is what I do so far:

1. boot up the instance through ElasticFox
2. cd /usr/local/hadoop-0.19.0
3. bin/hadoop namenode -format
4. bin/start-all.sh
5. jps (shows jps, jobtracker, secondarynamenode)
6. bin/stop-all.sh
7. ant examples
8. bin/start-all.sh
9. bin/hadoop jar build/hadoop-0.19.0-examples.jar pi 0 100

Then I get this error trace:

Number of Maps = 0 Samples per Map = 100
Starting Job
09/06/18 17:31:25 INFO hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /mnt/hadoop/mapred/system/job_200906181730_0001/job.jar could only be replicated to 0 nodes, instead of 1
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1270)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:892)
 at org.apache.hadoop.ipc.Client.call(Client.java:696)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
 at $Proxy0.addBlock(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
 at $Proxy0.addBlock(Unknown Source)
 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2815)
 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2697)
 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1997)
 at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2183)
09/06/18 17:31:25 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /mnt/hadoop/mapred/system/job_200906181730_0001/job.jar retries left 4
09/06/18 17:31:25 INFO hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /mnt/hadoop/mapred/system/job_200906181730_0001/job.jar could only be replicated to 0 nodes, instead of 1
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1270)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
 ...
09/06/18 17:31:25 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /mnt/hadoop/mapred/system/job_200906181730_0001/job.jar retries left 3
09/06/18
Re: Hadoop error help- file system closed, could only be replicated to 0 nodes, instead of 1
Hi,
What seems likely from your details is that the datanode is not running. Can you run bin/hadoop dfsadmin -report and find out whether your datanodes are up? Then post your observation, and it would be better if you also post your hadoop-site.xml file details.
Regards,
Ashish.

On Fri, Jun 19, 2009 at 3:16 AM, terrianne.erick...@accenture.com wrote:
Hi, I am extremely new to Hadoop and have come across a few errors that I'm not sure how to fix. I am running Hadoop version 0.19.0 from an image through ElasticFox and S3. I am on Windows and use PuTTY as my ssh client. I am trying to run a wordcount with 5 slaves.
Then I get this error trace:
09/06/18 17:31:25 INFO hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /mnt/hadoop/mapred/system/job_200906181730_0001/job.jar could only be replicated to 0 nodes, instead of 1
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1270)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
 ...
09/06/18 17:31:25 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /mnt/hadoop/mapred/system/job_200906181730_0001/job.jar retries left 4
 ...
Could only be replicated to 0 nodes, instead of 1
Hi.
I'm testing Hadoop in our lab, and started getting the following message when trying to copy a file:
Could only be replicated to 0 nodes, instead of 1

I have the following setup:
* 3 machines, 2 of them with only 80GB of space, and 1 with 1.5GB
* Two clients are copying files all the time (one of them is the 1.5GB machine)
* The replication is set to 2
* I let the space on the 2 smaller machines run out, to test the behavior

Now, one of the clients (the one located on the 1.5GB machine) works fine, and the other one (the external client) is unable to copy and displays the error + the exception below.

Any idea if this is expected in my scenario? Or how it can be solved?
Thanks in advance.

09/05/21 10:51:03 WARN dfs.DFSClient: NotReplicatedYetException sleeping /test/test.bin retries left 1
09/05/21 10:51:06 WARN dfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /test/test.bin could only be replicated to 0 nodes, instead of 1
 at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1123)
 at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
 at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:890)
 at org.apache.hadoop.ipc.Client.call(Client.java:716)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
 at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
 at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
 at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2450)
 at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2333)
 at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1745)
 at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1922)
09/05/21 10:51:06 WARN dfs.DFSClient: Error Recovery for block null bad datanode[0]
java.io.IOException: Could not get block locations. Aborting...
 at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2153)
 at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1745)
 at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1899)
Re: Could only be replicated to 0 nodes, instead of 1
Hi,
I have two suggestions:
i) Choose the right version (Hadoop 0.18 is good).
ii) Replication should be 3, as you have 3 nodes. (Indirectly, see to it that your configuration is correct!)
I'm just suggesting this, as I am also new to Hadoop.
Ashish Pareek

On Thu, May 21, 2009 at 2:41 PM, Stas Oskin stas.os...@gmail.com wrote:
Hi.
I'm testing Hadoop in our lab, and started getting the following message when trying to copy a file:
Could only be replicated to 0 nodes, instead of 1
I have the following setup:
* 3 machines, 2 of them with only 80GB of space, and 1 with 1.5GB
* Two clients are copying files all the time (one of them is the 1.5GB machine)
* The replication is set to 2
* I let the space on the 2 smaller machines run out, to test the behavior
Now, one of the clients (the one located on the 1.5GB machine) works fine, and the other one (the external client) is unable to copy and displays the error + the exception below.
Any idea if this is expected in my scenario? Or how it can be solved?
Thanks in advance.
09/05/21 10:51:03 WARN dfs.DFSClient: NotReplicatedYetException sleeping /test/test.bin retries left 1
09/05/21 10:51:06 WARN dfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /test/test.bin could only be replicated to 0 nodes, instead of 1
 at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1123)
 at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
 ...
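For reference, the replication factor being discussed is the dfs.replication property in the site configuration (it can also be overridden per file at create time). A minimal sketch for a 3-node cluster, reusing the stock property description:

    <property>
      <name>dfs.replication</name>
      <value>3</value>
      <description>Default block replication. The actual number of
      replications can be specified when the file is created.</description>
    </property>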
Re: Could only be replicated to 0 nodes, instead of 1
It does not appear that any datanodes have connected to your namenode. On the datanode machines, look in the hadoop logs directory at the datanode log files. There should be some information there that helps you diagnose the problem. Chapter 4 of my book provides some detail on working with this problem.

On Thu, May 21, 2009 at 4:29 AM, ashish pareek pareek...@gmail.com wrote:
Hi,
I have two suggestions: i) Choose the right version (Hadoop 0.18 is good). ii) Replication should be 3, as you have 3 nodes. (Indirectly, see to it that your configuration is correct!)
I'm just suggesting this, as I am also new to Hadoop.
Ashish Pareek

On Thu, May 21, 2009 at 2:41 PM, Stas Oskin stas.os...@gmail.com wrote:
Hi.
I'm testing Hadoop in our lab, and started getting the following message when trying to copy a file:
Could only be replicated to 0 nodes, instead of 1
I have the following setup:
* 3 machines, 2 of them with only 80GB of space, and 1 with 1.5GB
* Two clients are copying files all the time (one of them is the 1.5GB machine)
* The replication is set to 2
* I let the space on the 2 smaller machines run out, to test the behavior
Now, one of the clients (the one located on the 1.5GB machine) works fine, and the other one (the external client) is unable to copy and displays the error + the exception below.
Any idea if this is expected in my scenario? Or how it can be solved?
Thanks in advance.
09/05/21 10:51:03 WARN dfs.DFSClient: NotReplicatedYetException sleeping /test/test.bin retries left 1
09/05/21 10:51:06 WARN dfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /test/test.bin could only be replicated to 0 nodes, instead of 1
 at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1123)
 at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
 ...
09/05/21 10:51:06 WARN dfs.DFSClient: Error Recovery for block null bad datanode[0]
java.io.IOException: Could not get block locations. Aborting...
 ...

--
Alpha Chapters of my book on Hadoop are available http://www.apress.com/book/view/9781430219422
www.prohadoopbook.com a community for Hadoop Professionals
Re: Could only be replicated to 0 nodes, instead of 1
Hi.

i) Choose the right version (Hadoop 0.18 is good).

I'm using 0.18.3.

ii) Replication should be 3, as you have 3 nodes. (Indirectly, see to it that your configuration is correct!)

Actually, I'm testing 2x replication on any number of DNs, to see how reliable it is.

I'm just suggesting this, as I am also new to Hadoop.
Ashish Pareek

On Thu, May 21, 2009 at 2:41 PM, Stas Oskin stas.os...@gmail.com wrote:
Hi.
I'm testing Hadoop in our lab, and started getting the following message when trying to copy a file:
Could only be replicated to 0 nodes, instead of 1
I have the following setup:
* 3 machines, 2 of them with only 80GB of space, and 1 with 1.5GB
* Two clients are copying files all the time (one of them is the 1.5GB machine)
* The replication is set to 2
* I let the space on the 2 smaller machines run out, to test the behavior
Now, one of the clients (the one located on the 1.5GB machine) works fine, and the other one (the external client) is unable to copy and displays the error + the exception below.
Any idea if this is expected in my scenario? Or how it can be solved?
Thanks in advance.
09/05/21 10:51:03 WARN dfs.DFSClient: NotReplicatedYetException sleeping /test/test.bin retries left 1
09/05/21 10:51:06 WARN dfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /test/test.bin could only be replicated to 0 nodes, instead of 1
 at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1123)
 at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
 ...
Re: Could only be replicated to 0 nodes, instead of 1
I think you should file a jira on this. Most likely this is what is happening:

* Two out of 3 DNs cannot take any more blocks.
* While picking nodes for a new block, the NN mostly skips the third DN as well, since the '# active writes' on it is larger than '2 * avg'.
* Even if there is only one other block being written on the 3rd, that is still greater than (2 * 1/3).

To test this: if you write just one block to an idle cluster, it should succeed. Writing from the client on the 3rd DN succeeds since the local node is always favored.

This particular problem is not that severe on a large cluster, but HDFS should do the sensible thing.

Raghu.

Stas Oskin wrote:
Hi.
I'm testing Hadoop in our lab, and started getting the following message when trying to copy a file:
Could only be replicated to 0 nodes, instead of 1
I have the following setup:
* 3 machines, 2 of them with only 80GB of space, and 1 with 1.5GB
* Two clients are copying files all the time (one of them is the 1.5GB machine)
* The replication is set to 2
* I let the space on the 2 smaller machines run out, to test the behavior
Now, one of the clients (the one located on the 1.5GB machine) works fine, and the other one (the external client) is unable to copy and displays the error + the exception below.
Any idea if this is expected in my scenario? Or how it can be solved?
Thanks in advance.
09/05/21 10:51:03 WARN dfs.DFSClient: NotReplicatedYetException sleeping /test/test.bin retries left 1
09/05/21 10:51:06 WARN dfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /test/test.bin could only be replicated to 0 nodes, instead of 1
 at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1123)
 at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
 ...
09/05/21 10:51:06 WARN dfs.DFSClient: Error Recovery for block null bad datanode[0]
java.io.IOException: Could not get block locations. Aborting...
 ...
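To make the arithmetic above concrete, here is a toy sketch of the load check Raghu describes (illustrative only, with hypothetical names; this is not the actual Hadoop source):

    // A self-contained illustration of the target-selection load check:
    // reject a datanode whose active-writer count is more than twice
    // the cluster-wide average.
    public class WriteLoadCheck {
        static boolean isGoodTarget(int activeWritesOnNode,
                                    int totalActiveWrites,
                                    int numDatanodes) {
            double avgLoad = (double) totalActiveWrites / numDatanodes;
            return activeWritesOnNode <= 2 * avgLoad;
        }

        public static void main(String[] args) {
            // The scenario from this thread: 3 datanodes, one write in
            // flight, and it sits on the only node with free space.
            // avg = 1/3, threshold = 2/3, the node's load of 1 exceeds it.
            System.out.println(isGoodTarget(1, 1, 3)); // prints "false"
        }
    }

This is why even the last datanode with free space can be skipped: with almost no other traffic in a tiny cluster, its single active writer already exceeds twice the average.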
Re: Could only be replicated to 0 nodes, instead of 1
On May 21, 2009, at 2:01 PM, Raghu Angadi wrote:
I think you should file a jira on this. Most likely this is what is happening:
* Two out of 3 DNs cannot take any more blocks.
* While picking nodes for a new block, the NN mostly skips the third DN as well, since the '# active writes' on it is larger than '2 * avg'.
* Even if there is only one other block being written on the 3rd, that is still greater than (2 * 1/3).
To test this: if you write just one block to an idle cluster, it should succeed. Writing from the client on the 3rd DN succeeds since the local node is always favored.
This particular problem is not that severe on a large cluster, but HDFS should do the sensible thing.

Hey Raghu,

If this analysis is right, I would add that it can happen even on large clusters! I've seen this error on our cluster when we're very full (97%) and very few nodes have any empty space. This usually happens because we have two very large nodes (10x bigger than the rest of the cluster), and HDFS tends to distribute writes randomly, meaning the smaller nodes fill up quickly until the balancer can catch up.

Brian

Raghu.
Stas Oskin wrote:
Hi.
I'm testing Hadoop in our lab, and started getting the following message when trying to copy a file:
Could only be replicated to 0 nodes, instead of 1
I have the following setup:
* 3 machines, 2 of them with only 80GB of space, and 1 with 1.5GB
* Two clients are copying files all the time (one of them is the 1.5GB machine)
* The replication is set to 2
* I let the space on the 2 smaller machines run out, to test the behavior
Now, one of the clients (the one located on the 1.5GB machine) works fine, and the other one (the external client) is unable to copy and displays the error + the exception below.
Any idea if this is expected in my scenario? Or how it can be solved?
Thanks in advance.
09/05/21 10:51:03 WARN dfs.DFSClient: NotReplicatedYetException sleeping /test/test.bin retries left 1
09/05/21 10:51:06 WARN dfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /test/test.bin could only be replicated to 0 nodes, instead of 1
 ...
09/05/21 10:51:06 WARN dfs.DFSClient: Error Recovery for block null bad datanode[0]
java.io.IOException: Could not get block locations. Aborting...
 ...
Re: Could only be replicated to 0 nodes, instead of 1
Brian Bockelman wrote:
On May 21, 2009, at 2:01 PM, Raghu Angadi wrote:
I think you should file a jira on this. Most likely this is what is happening: ...
Hey Raghu,
If this analysis is right, I would add that it can happen even on large clusters! I've seen this error on our cluster when we're very full (97%) and very few nodes have any empty space. This usually happens because we have two very large nodes (10x bigger than the rest of the cluster), and HDFS tends to distribute writes randomly, meaning the smaller nodes fill up quickly until the balancer can catch up.

Yes. This would bite whenever a large portion of the nodes cannot accept blocks. In general, it can happen whenever less than half the nodes have any space left.

Raghu.
Re: Could only be replicated to 0 nodes, instead of 1
Hi.

I think you should file a jira on this. Most likely this is what is happening:

Will do - this goes to the DFS section, correct?

* Two out of 3 DNs cannot take any more blocks.
* While picking nodes for a new block, the NN mostly skips the third DN as well, since the '# active writes' on it is larger than '2 * avg'.
* Even if there is only one other block being written on the 3rd, that is still greater than (2 * 1/3).

Frankly, I'm not familiar enough with Hadoop's inner workings to understand this completely, but from what I can digest, the NN doesn't like the 3rd DN because there are too many blocks being written on it, compared to the other servers?

To test this: if you write just one block to an idle cluster, it should succeed.

What exactly is an idle cluster? One that nothing is being written to (including the 3rd DN)?

Writing from the client on the 3rd DN succeeds since the local node is always favored.

Makes sense.

This particular problem is not that severe on a large cluster, but HDFS should do the sensible thing.

Yes, I agree that this is a non-standard situation, but IMHO the best course of action would be to write anyway, but throw a warning. There is one already that appears when there is not enough space for replication, and it explains the matter quite well. A similar one would be great.
Re: Could only be replicated to 0 nodes, instead of 1
Hi. If this analysis is right, I would add that it can happen even on large clusters! I've seen this error on our cluster when we're very full (97%) and very few nodes have any empty space. This usually happens because we have two very large nodes (10x bigger than the rest of the cluster), and HDFS tends to distribute writes randomly -- meaning the smaller nodes fill up quickly, until the balancer can catch up. A bit off topic: do you run the balancer manually, or do you have some scheduler that does it?
Re: Could only be replicated to 0 nodes, instead of 1
On May 21, 2009, at 3:10 PM, Stas Oskin wrote: Hi. If this analysis is right, I would add that it can happen even on large clusters! I've seen this error on our cluster when we're very full (97%) and very few nodes have any empty space. This usually happens because we have two very large nodes (10x bigger than the rest of the cluster), and HDFS tends to distribute writes randomly -- meaning the smaller nodes fill up quickly, until the balancer can catch up. A bit off topic: do you run the balancer manually, or do you have some scheduler that does it? crontab does it for us, once an hour. We're always importing data, so the cluster is always out of balance. If the previous balancer hasn't exited, the new one will simply exit. The real trick has been to make sure the balancer doesn't get stuck -- a Nagios plugin makes sure that the stdout has been printed to in the last hour or so; otherwise it kills the running balancer. Stuck balancers have been an issue in the past. Brian
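[Editor's note: in case it helps anyone building a similar safety net, here is a tiny illustrative watchdog along the lines Brian describes. The log and pid-file paths are invented conventions, not details from his setup: if the balancer's stdout file has not been touched for over an hour, it kills the recorded process so the next cron run can start fresh.]

    import java.io.File;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class BalancerWatchdog {
        public static void main(String[] args) throws IOException {
            File log = new File("/var/log/hadoop/balancer.out");  // assumed log path
            long oneHourMs = 60L * 60L * 1000L;
            if (log.exists() && System.currentTimeMillis() - log.lastModified() > oneHourMs) {
                // Assumed pid-file convention; a Unix "kill" is sent to the balancer.
                String pid = new String(
                        Files.readAllBytes(Paths.get("/var/run/hadoop/balancer.pid"))).trim();
                Runtime.getRuntime().exec(new String[]{"kill", pid});
                System.err.println("Killed apparently stuck balancer, pid " + pid);
            }
        }
    }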
Re: Could only be replicated to 0 nodes, instead of 1
The real trick has been to make sure the balancer doesn't get stuck -- a Nagios plugin makes sure that the stdout has been printed to in the last hour or so; otherwise it kills the running balancer. Stuck balancers have been an issue in the past. Thanks for the advice.
Re: Could only be replicated to 0 nodes, instead of 1
I think you should file a jira on this. Most likely this is what is happening: Here it is - hope it's OK: https://issues.apache.org/jira/browse/HADOOP-5886
Re: Could only be replicated to 0 nodes, instead of 1
Stas Oskin wrote: I think you should file a jira on this. Most likely this is what is happening: Here it is - hope it's OK: https://issues.apache.org/jira/browse/HADOOP-5886 Looks good. I will add my earlier post as a comment. You can update the jira with any more tests. Next time, it would be better to include larger stack traces, logs, etc. in subsequent comments rather than in the description. Thanks, Raghu.
Re: could only be replicated to 0 nodes, instead of 1
Hi, I ran into a very similar problem when trying to configure HDFS. The solution was configuring a smaller block size. I wanted to install HDFS for testing purposes only, so I decided to have ~300 MB of storage space on each machine. The block size was set to 128 MB (I used the Cloudera configuration tool). After changing the block size to 1 MB (it could be bigger, but this is not a production environment), everything started to work fine! Regards, Piotr Praczyk
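[Editor's note: for anyone who wants the same workaround per-file rather than cluster-wide, a sketch like the following should work against the FileSystem API of that era, which accepts an explicit block size in create(). The path, buffer size, and replication factor here are illustrative assumptions; cluster-wide you would instead set dfs.block.size in hadoop-site.xml.]

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SmallBlockWrite {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            // 1 MB blocks so each block fits easily in a tiny test datanode;
            // must stay a multiple of the 512-byte checksum chunk.
            long blockSize = 1024L * 1024L;
            FSDataOutputStream out = fs.create(
                    new Path("/test/small-blocks.bin"),  // illustrative path
                    true,       // overwrite
                    4096,       // io buffer size
                    (short) 1,  // replication factor for a single-node setup
                    blockSize);
            out.write(new byte[]{1, 2, 3});
            out.close();
            fs.close();
        }
    }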
Re: could only be replicated to 0 nodes, instead of 1
Hi, if you are getting this in a Windows environment (2003, 64-bit): we faced the same problem. We tried the following steps and it started working. 1) Install Cygwin and ssh. 2) Download the stable Hadoop version - hadoop-0.17.2.1.tar.gz as of 13/Nov/2008. 3) Untar it via Cygwin (tar xvfz hadoop-0.17.2.1.tar.gz); please DO NOT use WinZip to untar. 4) Run the pseudo-distributed example provided in the quickstart (http://hadoop.apache.org/core/docs/current/quickstart.html) - it worked for us. Thanks, Arul and Limin, eBay Inc. jerrro wrote: I am trying to install/configure Hadoop on a cluster with several computers. I followed exactly the instructions on the Hadoop website for configuring multiple slaves, and when I run start-all.sh I get no errors - both datanode and tasktracker are reported to be running (doing ps awux | grep hadoop on the slave nodes returns two java processes). Also, the log files are empty - nothing is printed there. Still, when I try to use bin/hadoop dfs -put, I get the following error: # bin/hadoop dfs -put w.txt w.txt put: java.io.IOException: File /user/scohen/w4.txt could only be replicated to 0 nodes, instead of 1 and a file of size 0 is created on the DFS (bin/hadoop dfs -ls shows it). I couldn't find much information about this error, but I did manage to see somewhere it might mean that there are no datanodes running. But as I said, start-all does not give any errors. Any ideas what could be the problem? Thanks. Jerr.
Re: could only be replicated to 0 nodes, instead of 1
I get the same error when doing a put, and my cluster is running OK, i.e. it has capacity and all nodes are live. The error message is:
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /test/test.txt could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1127)
    at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:312)
    at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:409)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:901)
    at org.apache.hadoop.ipc.Client.call(Client.java:512)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:198)
    at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2074)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:1967)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1500(DFSClient.java:1487)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1601)
I would appreciate any help/suggestions. Thanks. jerrro wrote: I am trying to install/configure Hadoop on a cluster with several computers. I followed exactly the instructions on the Hadoop website for configuring multiple slaves, and when I run start-all.sh I get no errors - both datanode and tasktracker are reported to be running (doing ps awux | grep hadoop on the slave nodes returns two java processes). Also, the log files are empty - nothing is printed there. Still, when I try to use bin/hadoop dfs -put, I get the following error: # bin/hadoop dfs -put w.txt w.txt put: java.io.IOException: File /user/scohen/w4.txt could only be replicated to 0 nodes, instead of 1 and a file of size 0 is created on the DFS (bin/hadoop dfs -ls shows it). I couldn't find much information about this error, but I did manage to see somewhere it might mean that there are no datanodes running. But as I said, start-all does not give any errors. Any ideas what could be the problem? Thanks. Jerr.
Re: could only be replicated to 0 nodes, instead of 1
Could you please go to the DFS web UI and check how many datanodes are up and how much available space each has? Hairong On 5/8/08 3:30 AM, jasongs [EMAIL PROTECTED] wrote: I get the same error when doing a put, and my cluster is running OK, i.e. it has capacity and all nodes are live. The error message is:
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /test/test.txt could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1127)
    at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:312)
    at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:409)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:901)
    at org.apache.hadoop.ipc.Client.call(Client.java:512)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:198)
    at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2074)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:1967)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1500(DFSClient.java:1487)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1601)
I would appreciate any help/suggestions. Thanks. jerrro wrote: I am trying to install/configure Hadoop on a cluster with several computers. I followed exactly the instructions on the Hadoop website for configuring multiple slaves, and when I run start-all.sh I get no errors - both datanode and tasktracker are reported to be running (doing ps awux | grep hadoop on the slave nodes returns two java processes). Also, the log files are empty - nothing is printed there. Still, when I try to use bin/hadoop dfs -put, I get the following error: # bin/hadoop dfs -put w.txt w.txt put: java.io.IOException: File /user/scohen/w4.txt could only be replicated to 0 nodes, instead of 1 and a file of size 0 is created on the DFS (bin/hadoop dfs -ls shows it). I couldn't find much information about this error, but I did manage to see somewhere it might mean that there are no datanodes running. But as I said, start-all does not give any errors. Any ideas what could be the problem? Thanks. Jerr.
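[Editor's note: for anyone who would rather script that check than click through the web UI, here is a rough programmatic equivalent of hadoop dfsadmin -report. It is a sketch that assumes a release of that era, where the DistributedFileSystem class in org.apache.hadoop.dfs exposes getDataNodeStats(); adjust the package and method names for your version.]

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.dfs.DatanodeInfo;
    import org.apache.hadoop.dfs.DistributedFileSystem;
    import org.apache.hadoop.fs.FileSystem;

    public class DatanodeReport {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            if (!(fs instanceof DistributedFileSystem)) {
                System.err.println("Not talking to HDFS -- check fs.default.name");
                return;
            }
            DistributedFileSystem dfs = (DistributedFileSystem) fs;
            // One entry per datanode the namenode currently knows about; an empty
            // array here matches the "replicated to 0 nodes" situation exactly.
            for (DatanodeInfo dn : dfs.getDataNodeStats()) {
                System.out.println(dn.getName() + ": " + dn.getRemaining()
                        + " bytes free of " + dn.getCapacity());
            }
        }
    }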
Re: Job.jar could only be replicated to 0 nodes, instead of 1(IO Exception)
Sridhar Raman wrote: I am trying to run K-Means using Hadoop. I first wanted to test it within a single-node cluster. And this was the error I got. What could be the problem?
$ bin/hadoop jar clustering.jar com.company.analytics.clustering.mr.core.KMeansDriver
Iteration 0
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /WORK/temp/hadoop/workspace/hadoop-user/mapred/system/job_200804291904_0001/job.jar could only be replicated to 0 nodes, instead of 1
Check if your datanode is up or not. Amar
    at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1003)
    at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:293)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
    at org.apache.hadoop.ipc.Client.call(Client.java:482)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
    at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:1554)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:1500)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1626)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.close(DFSClient.java:1733)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
    at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:55)
    at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:83)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:140)
    at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:827)
    at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:815)
    at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:796)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:493)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:753)
    at com.company.analytics.clustering.mr.core.KMeansDriver.runIteration(KMeansDriver.java:136)
    at com.company.analytics.clustering.mr.core.KMeansDriver.runJob(KMeansDriver.java:88)
    at com.company.analytics.clustering.mr.core.KMeansDriver.main(KMeansDriver.java:34)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
Re: could only be replicated to 0 nodes, instead of 1
I had the same error message... Can you describe when and how this error occurs? Jayant Durgad wrote: I am faced with the exact same problem described here; does anybody know how to resolve this?
Re: could only be replicated to 0 nodes, instead of 1
Can you check the datanode and namenode logs and see if all are up and running? I am assuming you are running this on a single host, hence the replication factor of 1. Thanks, Lohit - Original Message From: John Menzer [EMAIL PROTECTED] To: core-user@hadoop.apache.org Sent: Saturday, April 12, 2008 2:04:00 PM Subject: Re: could only be replicated to 0 nodes, instead of 1 I had the same error message... Can you describe when and how this error occurs? Jayant Durgad wrote: I am faced with the exact same problem described here; does anybody know how to resolve this?
Re: could only be replicated to 0 nodes, instead of 1
jerrro wrote: I couldn't find much information about this error, but I did manage to see somewhere that it might mean there are no datanodes running. But as I said, start-all does not give any errors. Any ideas what could be the problem? A clean start-all return does not mean the datanodes are OK. Did you check whether any datanodes are alive? You can check from http://namenode:50070/. Raghu.
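[Editor's note: a minimal probe of that page, for anyone who wants to automate the check. localhost, the default port 50070, and the dfshealth.jsp page name reflect the defaults of that era; the string matching is a crude assumption you may need to adjust for your version.]

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class NameNodeUiProbe {
        public static void main(String[] args) throws Exception {
            // Default namenode web UI address; change host/port for your cluster.
            URL url = new URL("http://localhost:50070/dfshealth.jsp");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            System.out.println("HTTP " + conn.getResponseCode());
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()));
            String line;
            while ((line = in.readLine()) != null) {
                // Crude scrape: surface the live/dead node counts the status
                // page of that era reported.
                if (line.contains("Live Nodes") || line.contains("Dead Nodes")) {
                    System.out.println(line.trim());
                }
            }
            in.close();
        }
    }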
Re: could only be replicated to 0 nodes, instead of 1
I am faced with the exact same problem described here; does anybody know how to resolve this?