Filed HADOOP-5798.

On Wed, May 6, 2009 at 9:53 PM, Raghu Angadi wrote:
> Tamir Kamara wrote:
>> The thread you posted is my original post, written when this problem first
>> happened on my cluster. I can file a JIRA, but I wouldn't be able to
>> provide information other than [...]
>
> yes.
Tamir Kamara wrote:
> Hi Raghu,
> The thread you posted is my original post, written when this problem first
> happened on my cluster. I can file a JIRA, but I wouldn't be able to provide
> information other than what I already posted, and I don't have the logs from
> that time. Should I still file?

yes.
Hi.

Yes, this was probably it.
The strangest part is that HDFS somehow worked even with all the files in the
NN directory empty.
Go figure...

Regards.

2009/5/5 Raghu Angadi
> the image is stored in two files: fsimage and edits
> (under namenode-directory/current/).
Hi Raghu,

The thread you posted is my original post, written when this problem first
happened on my cluster. I can file a JIRA, but I wouldn't be able to provide
information other than what I already posted, and I don't have the logs from
that time. Should I still file?

Thanks,
Tamir
The image is stored in two files: fsimage and edits
(under namenode-directory/current/).

Stas Oskin wrote:
> Well, it definitely caused the SecondaryNameNode to crash, and it also seems
> to have triggered some strange issues today.
> By the way, how is the image file named?
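For reference, the name directory layout Raghu describes looks roughly like the sketch below. The real path comes from your dfs.name.dir setting and the files are created by the namenode itself; the snippet only simulates the layout with a scratch directory so the commands are safe to paste:

```shell
# Simulate the usual contents of ${dfs.name.dir}/current/ -- in a live
# cluster these files are written by the namenode, not created by hand.
NAME_DIR=$(mktemp -d)               # stands in for one dfs.name.dir entry
mkdir -p "$NAME_DIR/current"
touch "$NAME_DIR/current/fsimage"   # checkpointed namespace image
touch "$NAME_DIR/current/edits"     # journal of changes since the image
touch "$NAME_DIR/current/fstime"    # timestamp of the last checkpoint
touch "$NAME_DIR/current/VERSION"   # layout version / namespace ID
ls "$NAME_DIR/current"
```

On a healthy namenode, a plain `ls` of that directory should show at least fsimage and edits; if either is missing or zero-length, that is the first thing to check.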
Hi.

2009/5/5 Raghu Angadi
> Stas Oskin wrote:
>> Actually, we discovered today an annoying bug in our test-app, which might
>> have moved some of the HDFS files to the cluster, including the metadata
>> files.
>
> oops! presumably it could have removed the image file itself.
Stas Oskin wrote:
> Actually, we discovered today an annoying bug in our test-app, which might
> have moved some of the HDFS files to the cluster, including the metadata
> files.
> I presume it could be the possible reason for such behavior? :)

oops! presumably it could have removed the image file itself.
Actually, we discovered today an annoying bug in our test-app, which might
have moved some of the HDFS files to the cluster, including the metadata
files.

I presume it could be the possible reason for such behavior? :)

2009/5/5 Stas Oskin
> Hi Raghu.
>
> The only lead I have is that my root mount has filled up completely.
Hi Raghu.

The only lead I have is that my root mount has filled up completely.

This in itself should not have caused the metadata corruption, as the metadata
is stored on another mount point, which had plenty of space.

But perhaps the fact that the NameNode/SecondaryNameNode didn't have enough
space for logs [...]
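One standard guard against exactly this failure mode, for anyone reading along: dfs.name.dir accepts a comma-separated list of directories, and the namenode writes the image and edits to every one of them, so a copy survives if a single mount fills up or fails. A sketch for a 0.19-era hadoop-site.xml (the paths are examples, not a recommendation):

```xml
<!-- hadoop-site.xml: put each name directory on a different mount -->
<property>
  <name>dfs.name.dir</name>
  <value>/mnt/disk1/dfs/name,/mnt/disk2/dfs/name</value>
</property>
```

Pointing one of the entries at an NFS mount is a common variant, since it keeps a copy of the metadata off the namenode host entirely.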
Tamir,

Please file a jira on the problem you are seeing with 'saveLeases'. In
the past there have been multiple fixes in this area (HADOOP-3418,
HADOOP-3724, and more mentioned in HADOOP-3724).

Also refer to the thread you started:
http://www.mail-archive.com/core-user@hadoop.apache.org/msg09397

Stas,

This is indeed a serious issue.

Did you happen to store the corrupt image? Can this be reproduced
using the image?

Usually you can recover manually from a corrupt or truncated image. But
more importantly, we want to find out how it got into this state.

Raghu.
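For anyone who lands here with the same symptom, a rough outline of the manual route Raghu mentions, under two stated assumptions: the paths are placeholders, and the -importCheckpoint namenode option behaves as documented for the 0.19 line. The golden rule is to snapshot the name directory before touching anything; the snippet simulates that step with scratch directories so it is copy-paste safe:

```shell
# 1) Back up the entire name directory before any recovery attempt.
#    (Simulated with scratch dirs; substitute your real dfs.name.dir.)
NAME_DIR=$(mktemp -d)                  # stands in for dfs.name.dir
mkdir -p "$NAME_DIR/current"
touch "$NAME_DIR/current/fsimage"      # pretend (possibly corrupt) image
BACKUP="$(mktemp -d)/name-backup"
cp -a "$NAME_DIR" "$BACKUP"            # preserve a pristine copy
ls "$BACKUP/current"

# 2) If the secondary's checkpoint directory (fs.checkpoint.dir) is intact,
#    one option is to start the namenode with:
#      bin/hadoop namenode -importCheckpoint
#    which loads the namespace from the checkpoint instead of dfs.name.dir.
```

Keeping the untouched backup matters because a failed recovery attempt can otherwise destroy the only remaining evidence of how the image got corrupted.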
Hi.

This is quite a worrisome issue.
Can anyone advise on this? I'm really concerned it could appear in
production and cause a huge data loss.
Is there any way to recover from this?

Regards.

2009/5/5 Tamir Kamara
> I didn't have a space problem which led to it (I think). The corruption
> started after I bounced the cluster.
I didn't have a space problem which led to it (I think). The corruption
started after I bounced the cluster.

At the time, I tried to investigate what led to the corruption, but didn't
find anything useful in the logs besides this line:

saveLeases found path /tmp/temp623789763/tmp659456056/_temporary
Hi.

Same conditions, where the space ran out and the fs got corrupted? Or did it
get corrupted by itself (which is even more worrying)?

Regards.

2009/5/4 Tamir Kamara
> I had the same problem a couple of weeks ago with 0.19.1. Had to reformat
> the cluster too...
I had the same problem a couple of weeks ago with 0.19.1. Had to reformat
the cluster too...

On Mon, May 4, 2009 at 3:50 PM, Stas Oskin wrote:
> Hi.
>
> After rebooting the NameNode server, I found out the NameNode doesn't start
> anymore.
>
> The logs contained this error:
> "FSNamesystem initialization failed"
Hi.

After rebooting the NameNode server, I found out the NameNode doesn't start
anymore.

The logs contained this error:
"FSNamesystem initialization failed"

I suspected filesystem corruption, so I tried to recover from the
SecondaryNameNode. The problem is, it was completely empty!