Yes, we have observed these problems.
They are common in 0.18.2 and 0.19.0, exactly as you
described, when data-nodes become unstable.
There were several related issues; please take a look at:
HADOOP-4997 workaround for tmp file handling on DataNodes
HADOOP-4663 links to other related issues
HADOOP-4810 Data lost at cluster startup
HADOOP-4702 Failed block replication leaves an incomplete block
....
We run 0.18.3 now and it does not have these problems.
0.19.1 should be the same.
Thanks,
--Konstantin
Zak, Richard [USA] wrote:
It happens right after the MR job (though once or twice it's happened
during one). I am not using EBS, just HDFS between the machines. As for tasks,
there are 4 mappers and 0 reducers.
Richard J. Zak
-----Original Message-----
From: jdcry...@gmail.com [mailto:jdcry...@gmail.com] On Behalf Of
Jean-Daniel Cryans
Sent: Friday, January 23, 2009 13:24
To: core-user@hadoop.apache.org
Subject: Re: HDFS loosing blocks or connection error
xlarge is good. Does it normally happen during an MR job? If so, how many
tasks do you have running at the same time overall? Also, is your data
stored on EBS?
Thx,
J-D
On Fri, Jan 23, 2009 at 12:55 PM, Zak, Richard [USA]
<zak_rich...@bah.com> wrote:
4 slaves, 1 master, all are the m1.xlarge instance type.
Richard J. Zak
-----Original Message-----
From: jdcry...@gmail.com [mailto:jdcry...@gmail.com] On Behalf Of
Jean-Daniel Cryans
Sent: Friday, January 23, 2009 12:34
To: core-user@hadoop.apache.org
Subject: Re: HDFS loosing blocks or connection error
Richard,
This happens when the datanodes are too slow and eventually all
replicas for a single block are tagged as "bad". What kind of
instances are you using?
How many of them?
J-D
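To make the failure mode described above concrete, here is a rough sketch of the client-side retry loop. It is a simplified model in Python, not the actual DFSClient code: the names read_block and try_read are hypothetical, and the retry budget of 3 and the 3-second pause are assumptions inferred from the spacing of the log messages in this thread.

```python
import time

# Assumed values, inferred from the quoted log (three INFO lines, 3 s apart).
MAX_BLOCK_ACQUIRE_FAILURES = 3
RETRY_SLEEP_SECONDS = 3


def read_block(block_id, replicas, try_read, sleep=time.sleep, log=print):
    """Try each replica of a block; give up once every replica has failed
    MAX_BLOCK_ACQUIRE_FAILURES + 1 rounds in a row.

    try_read(node, block_id) is a hypothetical callback that returns True
    when the node served the block and False when the read failed.
    """
    dead = set()   # nodes that failed during the current round
    failures = 0   # rounds in which no live replica was left
    while True:
        live = [node for node in replicas if node not in dead]
        if not live:
            if failures >= MAX_BLOCK_ACQUIRE_FAILURES:
                raise IOError("Could not obtain block: %s" % block_id)
            log("Could not obtain block %s from any node: "
                "No live nodes contain current block" % block_id)
            sleep(RETRY_SLEEP_SECONDS)
            dead.clear()   # give every replica another chance
            failures += 1
            continue
        node = live[0]
        if try_read(node, block_id):
            return node
        dead.add(node)     # slow or failed node is skipped for this round
```

If every datanode holding a replica is too slow to answer, each round ends with an empty live list, and after the retry budget is spent the read fails even though the block data may still be on disk.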
On Fri, Jan 23, 2009 at 12:13 PM, Zak, Richard [USA]
<zak_rich...@bah.com> wrote:
Is there a reason why this seems to happen routinely to
me when using Hadoop 0.19.0 on Amazon EC2?
09/01/23 11:45:52 INFO hdfs.DFSClient: Could not obtain block
blk_-1757733438820764312_6736 from any node: java.io.IOException:
No live nodes contain current block
09/01/23 11:45:55 INFO hdfs.DFSClient: Could not obtain block
blk_-1757733438820764312_6736 from any node: java.io.IOException:
No live nodes contain current block
09/01/23 11:45:58 INFO hdfs.DFSClient: Could not obtain block
blk_-1757733438820764312_6736 from any node: java.io.IOException:
No live nodes contain current block
09/01/23 11:46:01 WARN hdfs.DFSClient: DFS Read: java.io.IOException:
Could not obtain block: blk_-1757733438820764312_6736
file=/stats.txt
It seems HDFS isn't as robust or reliable as the
website says, and/or I have a configuration issue.
Richard J. Zak
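When comparing logs like the one above across runs, it can help to count how often each block is reported as unobtainable. The following is a minimal sketch in Python; the function name count_block_failures is hypothetical, and the pattern assumes the message wording shown in this thread.

```python
import re
from collections import Counter

# Matches the block ID in both message forms quoted in this thread:
#   "Could not obtain block blk_..._NNNN from any node"
#   "Could not obtain block: blk_..._NNNN"
BLOCK_RE = re.compile(r"Could not obtain block:?\s+(blk_-?\d+_\d+)")


def count_block_failures(log_text):
    """Return a Counter mapping block ID to the number of failure messages."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return Counter(BLOCK_RE.findall(flat))
```

Run against the four messages quoted above, this reports a single block, blk_-1757733438820764312_6736, with a count of 4 (three INFO retries plus the final WARN).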