Yes, what you did is right. One last check.
In the secondary namenode log you should see the timestamp of the last checkpoint (or 
download of edits). Just make sure those timestamps are from before you ran the delete 
command. 
Basically, we are trying to make sure your delete command isn't in the edits. (Another way 
would have been to open the edits file in a hex editor or similar and check), but this 
should work.
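For example, assuming the default daemon log naming and location (adjust the path 
for your install), something like

    grep -i checkpoint logs/hadoop-*-secondarynamenode-*.log | tail

should show the most recent checkpoint messages and their timestamps.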
Once done, you could start.
Thanks,
Lohit



----- Original Message ----
From: Sagar Naik <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Sent: Friday, November 14, 2008 1:59:04 PM
Subject: Re: Recovery of files in hadoop 18

I had a secondary namenode running on the namenode machine.
I deleted the dfs.name.dir, then ran bin/hadoop namenode -importCheckpoint, and 
restarted the dfs.

I guess the deletion of name.dir will also have deleted the edit logs.
Can you please confirm that this will not lead to the delete transactions being 
replayed?

Thanks for help/advice


-Sagar

lohit wrote:
> NameNode would not come out of safe mode as it is still waiting for the datanodes 
> to report the blocks it expects. 
> I should have added: try to get a full output of fsck,
> fsck <path> -openforwrite -files -blocks -locations.
> -openforwrite should tell you which files were open during the 
> checkpoint; you might want to double check that is the case, i.e. that those files 
> were being written at that moment. Maybe by looking at the filenames you 
> could tell whether they were part of a job that was running.
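> (A concrete invocation, assuming it is run from the Hadoop install directory; the 
> output filename is just illustrative:
>     bin/hadoop fsck / -openforwrite -files -blocks -locations > fsck-full.txt
> )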
>
> For any missing block, you might also want to cross-verify on the datanodes to 
> see if it is really missing.
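> (To spot-check a specific block on a datanode, assuming dfs.data.dir is 
> /path/to/dfs/data and blk_<id> is the block in question, something like
>     find /path/to/dfs/data -name 'blk_<id>*'
> will turn up the block file if it is still on disk.)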
>
> Once you are convinced that those are the only corrupt files and you can 
> live with losing them, start the datanodes. 
> The namenode would still not come out of safemode since you have missing blocks; 
> leave it for a while, run fsck, look around, and if everything is OK, bring the 
> namenode out of safemode.
> I hope you started this namenode with the old image and empty edits. You do 
> not want your latest edits to be replayed, since they contain your delete transactions.
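> (When you do decide to lift safemode manually, the command is 
> bin/hadoop dfsadmin -safemode leave.)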
>
> Thanks,
> Lohit
>
>
>
> ----- Original Message ----
> From: Sagar Naik <[EMAIL PROTECTED]>
> To: core-user@hadoop.apache.org
> Sent: Friday, November 14, 2008 12:11:46 PM
> Subject: Re: Recovery of files in hadoop 18
>
> Hey Lohit,
>
> Thanks for your help.
> I did as you suggested and imported from the secondary namenode.
> We have some corrupted files.
>
> But for some reason, the namenode is still in safe_mode. It has been an hour 
> or so.
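> (bin/hadoop dfsadmin -safemode get reports whether it is still in safe mode.)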
> The fsck report is :
>
> Total size:    6954466496842 B (Total open files size: 543469222 B)
> Total dirs:    1159
> Total files:   1354155 (Files currently being written: 7673)
> Total blocks (validated):      1375725 (avg. block size 5055128 B) (Total 
> open file blocks (not validated): 50)
> ********************************
> CORRUPT FILES:        1574
> MISSING BLOCKS:       1574
> MISSING SIZE:         1165735334 B
> CORRUPT BLOCKS:       1574
> ********************************
> Minimally replicated blocks:   1374151 (99.88559 %)
> Over-replicated blocks:        0 (0.0 %)
> Under-replicated blocks:       26619 (1.9349071 %)
> Mis-replicated blocks:         0 (0.0 %)
> Default replication factor:    3
> Average block replication:     2.977127
> Corrupt blocks:                1574
> Missing replicas:              26752 (0.65317154 %)
>
>
> Do you think I should manually override safemode, delete all the 
> corrupted files, and restart?
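> (fsck also has a -delete option, e.g. bin/hadoop fsck / -delete, that removes the 
> corrupt files it finds once the namenode is out of safemode.)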
>
> -Sagar
>
>
> lohit wrote:
>  
>> If you have trash enabled, the files should have been moved to the trash folder 
>> rather than permanently deleted, and you can restore them from there. (I hope you 
>> have fs.trash.interval set.)
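>> (fs.trash.interval is set in hadoop-site.xml, in minutes; for example, to keep 
>> deleted files around for a day:
>>
>>     <property>
>>       <name>fs.trash.interval</name>
>>       <value>1440</value>
>>     </property>
>> )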
>>
>> If not, shut down the cluster.
>> Take a backup of your name/metadata directories (dfs.name.dir on the namenode and 
>> the checkpoint directory on the secondary namenode).
>>
>> The secondary namenode should have the last updated image; try to start the 
>> namenode from that image, and don't use the edits from the namenode yet. Try doing 
>> an importCheckpoint as explained here: 
>> https://issues.apache.org/jira/browse/HADOOP-2585?focusedCommentId=12558173#action_12558173.
>> Start only the namenode and run fsck -files. It will throw a lot of messages 
>> saying you are missing blocks, but that's fine since you haven't started the 
>> datanodes yet. But if it shows your files, that means they haven't been 
>> deleted yet. This will give you a view of the system as of the last checkpoint. Then 
>> start the datanodes. Once everything is up, try running fsck and check the 
>> consistency of the system. You would lose all changes that have happened since 
>> the last checkpoint. 
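>> (In outline, and adjusting paths to your install, the sequence would look roughly 
>> like this; the commands below are only a sketch:
>>
>>     bin/stop-dfs.sh                          # make sure nothing is running
>>     bin/hadoop namenode -importCheckpoint    # start the namenode from the secondary's checkpoint
>>     bin/hadoop fsck / -files                 # in another shell, with only the namenode up
>> )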
>>
>> Hope that helps,
>> Lohit
>>
>>
>>
>> ----- Original Message ----
>> From: Sagar Naik <[EMAIL PROTECTED]>
>> To: core-user@hadoop.apache.org
>> Sent: Friday, November 14, 2008 10:38:45 AM
>> Subject: Recovery of files in hadoop 18
>>
>> Hi,
>> I accidentally deleted the root folder in our hdfs.
>> I have stopped the hdfs
>>
>> Is there any way to recover the files from secondary namenode
>>
>> Please help.
>>
>>
>> -Sagar
>>  
>>    
