Yes. That's what I've done to fit ITL. You can also export the data dir you 
backed up over Samba / NFS (via fuse-hdfs), so people have an easier way to 
restore their files themselves. For SMB I wrote an article on my blog. 
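
A minimal sketch of that setup, assuming the hadoop-fuse-dfs wrapper (as 
packaged e.g. in CDH); host names, paths and the network range are only 
placeholders:

  # mount HDFS through FUSE (namenode host/port are examples)
  hadoop-fuse-dfs dfs://namenode:8020 /export/hdfs

  # /etc/exports - re-export the mount read-only over NFS
  # (an fsid is required when re-exporting a FUSE filesystem)
  /export/hdfs  10.0.0.0/24(ro,fsid=1,sync,no_subtree_check)

  # smb.conf - share the same mount read-only over Samba
  [hdfs-backup]
     path = /export/hdfs
     read only = yes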

The copy to another cluster has the charm that lost files can be restored 
quickly in the first step of your backup concept. 
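
Restoring a lost directory from the backup cluster could then be as simple as 
a distcp back to production, for example (cluster names and paths are made up):

  hadoop distcp hdfs://backup-nn:8020/backup/project/reports \
                hdfs://prod-nn:8020/user/project/reports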

 - Alex

sent via my mobile device

On Jan 3, 2012, at 1:31 PM, Mac Noland <mcdonaldnol...@yahoo.com> wrote:

> 
> 
> Thanks for the reply Alex.  To make sure I understand:
> 
> 1) "park" the data by sending it  over to a different cluster on a schedule 
> (e.g. nightly is what we offer today on most things).
> 2) then from this secondary cluster, which is sitting idle after the distcp, 
> do a copy local to a NFS mount pointed at SAN or NAS.
> 3) Then with some type of coordination (so you're not copying local when the 
> backup happens), have the SAN or NAS device snap the data for backup.
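> 
> A rough sketch of steps 1 and 2, with hypothetical host names, paths and 
> mount points:
> 
>   # step 1: nightly distcp from the production to the backup cluster
>   hadoop distcp -update hdfs://prod-nn:8020/data/project \
>                         hdfs://backup-nn:8020/backup/project
> 
>   # step 2: on the backup cluster, copy to the NFS mount that the
>   # SAN/NAS snapshots (only outside the snapshot window)
>   hadoop fs -copyToLocal /backup/project /mnt/nas/hdfs-backup/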
> 
> A simple restore process would then be to allow users read access to the NFS-
> mounted storage so they can pick and choose what they want to recover via the 
> SAN or NAS's snapshot feature - or, for our older systems, after the support 
> folks have completed a "restore" to the local file system.
> 
> 
> Is that about right?
> 
> Mac
> 
> 
> 
> ________________________________
> From: alo alt <wget.n...@googlemail.com>
> To: "hdfs-user@hadoop.apache.org" <hdfs-user@hadoop.apache.org>; Mac Noland 
> <mcdonaldnol...@yahoo.com> 
> Sent: Tuesday, January 3, 2012 3:10 PM
> Subject: Re: Hadoop HDFS Backup/Restore Solutions
> 
> 
> Hi Mac,
> 
> HDFS has at the moment no solution for a complete backup and restore 
> process like ITL or ISO 9000 would require. One strategy could be to "park" 
> the data from HDFS that you want to back up to tape: copy it with "distcp" to 
> another backup cluster and snapshot it from there with the SAN's mechanisms. 
> For that, the DN store has to be located on the SAN box. 
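> 
> On the backup cluster that could mean, for example, pointing dfs.data.dir at 
> a SAN-backed mount in hdfs-site.xml (the path is only an example):
> 
>   <property>
>     <name>dfs.data.dir</name>
>     <value>/san/hdfs/data</value>
>   </property>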
> 
> - Alex
> 
> On Tuesday, January 3, 2012, Mac Noland <mcdonaldnol...@yahoo.com> wrote:
>> Good day,
>>  
>> I’m guessing this question has been asked a myriad of times, but
>> we’re about to get serious with some of our Hadoop implementations so I 
>> wanted
>> to re-ask to see if I’m missing anything, or if others happen to know if 
>> this might
>> be on a future road map.
>>  
>> For our current storage offerings (e.g. NAS or SAN), we give
>> businesses the opportunity to choose 7, 14, or 45 day “backups” for their
>> storage.   The purpose of the backup isn’t
>> so much that they are worried about losing their current data (we’re RAID’ed
>> and have some stuff mirrored to remote
>> datacenters), but more so if they were to delete some data today, they can
>> recover from yesterday’s backup.  Or the
>> day before’s backup, or the day before that, etc.  And to be honest, 
>> business units buy a good portion of their backups to make people feel 
>> better and fulfill custom contracts.
>> 
>>  
>> So far with HDFS we haven’t found too many formalized
>> offerings for this specific feature.  While I haven’t done a ton of 
>> research, the best solution I’ve found is an
>> idea where we’d schedule a job to pull the data locally to a mount that is
>> backed up via our traditional methods.  See Michael Segel’s first post on 
>> this site http://lucene.472066.n3.nabble.com/Backing-up-HDFS-td1019184.html
>>  
>> Though we’d have to work through the details of what this
>> would look like for our support folks, it looks like something that could
>> potentially fit into our current model.  We’d basically need to allocate the 
>> same amount of SAN or NAS disk as we
>> have for HDFS, then coordinate a snap on the SAN or NAS via our 
>> traditional
>> methods.  Not sure what a restore would
>> look like, other than we could give the end users read access to the NAS or 
>> SAN
>> mounts so they can pick through what they need to recover and let them figure
>> out how to get it back into HDFS.
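>> Presumably that is just a "hadoop fs -put" of the recovered files back into
>> HDFS, e.g. (with made-up paths):
>> 
>>   hadoop fs -put /mnt/nas/restore/report.csv /user/project/reports/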
>>  
>> For use cases like ours where we’d need multi-day backups to
>> fulfill business needs, is this kind of what people are thinking or doing?  
>> Moreover, are there any things in the Hadoop
>> HDFS road map for providing, for lack of a better word, an “enterprise”
>> backup/restore solution?
>>  
>> Thanks in advance,
>> 
>> Mac Noland – Thomson Reuters
>> 
> 
> -- 
> 
> Alexander Lorenz
> http://mapredit.blogspot.com
> 
> Think of the environment: please don't print this email unless you really 
> need to.
