Hi,
I have 4 identical nodes in a Hadoop cluster (all functioning as DNs). One of
the 4 nodes is a new node that I recently added. I ran the balancer a few
times, and it did move some blocks from the other 3 nodes to the new
node. However, the 4 nodes are still not 100% balanced.
Oh, and on top of the above, I just observed that even though bin/hadoop
balancer exits immediately and reports the cluster is fully balanced, I do see
a *very* small number of blocks (1-2 per node) getting moved every time I run
the balancer. It feels as if the balancer does actually find some blocks
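One thing worth checking (an assumption on my part, based on how the balancer decides it is done): the balancer treats any datanode whose utilization is within a threshold of the cluster-wide average as already balanced, and the default threshold is 10%. That would explain both the quick exit and the tiny trickle of block moves. Tightening the threshold should make it move more data before declaring the cluster balanced:

```shell
# Run the HDFS balancer with a tighter threshold than the 10% default,
# so a datanode must be within 2% of the cluster's average utilization
# before the balancer considers it balanced. Verify the exact flag name
# against the usage message on your Hadoop version.
bin/hadoop balancer -threshold 2
```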
Just upgraded to 0.16.4, and tried a distcp to s3. I'm seeing many
errors - according to the jobtracker, 8,008 files were copied, but
5,880 were skipped. I assume that the number of skipped files needs
to be 0 for a successful copy. And 56 maps failed (log file given
below).
Is there
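For what it's worth, one thing to try (a sketch, assuming the failed maps were transient S3 errors rather than bad source data; the URIs below are placeholders, not the poster's actual paths): distcp's -i flag tells it to keep going past individual copy failures, and -log records per-file outcomes you can inspect afterwards:

```shell
# Re-run the copy, ignoring individual file failures (-i) and writing
# per-file results to an HDFS log directory for later inspection.
# Source and destination URIs here are hypothetical placeholders.
bin/hadoop distcp -i -log /tmp/distcp-logs \
    hdfs://namenode:9000/data s3://mybucket/data
```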
Hi All:
We had a primary node failure over the weekend. When we brought the node
back up and ran Hadoop fsck, it reported that the file system is corrupt. I'm unsure
how best to proceed. Any advice is greatly appreciated. If I've missed a
Wiki page or documentation somewhere please feel free
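In case it helps others hitting the same situation, a rough sequence to try (flag names taken from the fsck usage message, so double-check them on your release): run a read-only report first, and reach for -move or -delete only after deciding the affected files are expendable:

```shell
# Read-only report: list files, their blocks, and the datanodes
# holding each replica, to see exactly what is corrupt or missing.
bin/hadoop fsck / -files -blocks -locations

# Quarantine corrupted files under /lost+found for later inspection...
bin/hadoop fsck / -move

# ...or delete them outright (irreversible; use with care).
bin/hadoop fsck / -delete
```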
Did one datanode fail or did the namenode fail? By fail do you mean
that the system was rebooted or was there a bad disk that caused the
problem?
thanks,
dhruba
On Sun, May 11, 2008 at 7:23 PM, C G [EMAIL PROTECTED] wrote:
Hi All:
We had a primary node failure over the weekend. When we
You bring up an interesting point. A big chunk of the code in the
Namenode executes inside a global lock, although there are pieces
(e.g. a portion of the code that chooses datanodes for a newly allocated
block) that do execute outside this lock. But it is probably the case
that the namenode does
The system hosting the namenode experienced an OS panic and shut down, and we
subsequently rebooted it. Currently we don't believe there is/was a bad disk
or other hardware problem.
Something interesting: I've run fsck twice; the first time it gave the
result I posted. The second time I
Is it possible that new files were being created by running
applications between the first and second fsck runs?
thanks,
dhruba
On Sun, May 11, 2008 at 8:55 PM, C G [EMAIL PROTECTED] wrote:
The system hosting the namenode experienced an OS panic and shut down, and we
subsequently rebooted it.
Yes, several of our logging apps had accumulated backlogs of data and were
eager to write to HDFS.
Dhruba Borthakur [EMAIL PROTECTED] wrote: Is it possible that new files were
being created by running
applications between the first and second fsck runs?
thanks,
dhruba
On Sun, May 11, 2008