Hi Andy,

We do run a version of HDFS RAID that is backported from Apache trunk to a 0.20-based release. Our code is at
https://github.com/facebook/hadoop-20-warehouse/tree/master/src/contrib/raid
but I do not have an elegant way to contribute this code to Apache 0.20.2xx.x.

thanks,
dhruba

On Sat, Sep 17, 2011 at 9:16 AM, Andrew Purtell <apurt...@apache.org> wrote:

> Hi Dhruba,
>
> Would you consider a contribution of this to branch-0.20-security aka
> 0.20.2xx.x?
>
> If I am mistaken and you do not have a 0.22-ish HDFS RAID backported to an
> 0.20-ish platform, please disregard.
>
> Best regards,
>
> - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
> ------------------------------
> *From:* Dhruba Borthakur <dhr...@gmail.com>
> *To:* hdfs-user@hadoop.apache.org; Andrew Purtell <apurt...@apache.org>
> *Sent:* Thursday, September 15, 2011 10:14 AM
> *Subject:* Re: Need help regarding HDFS-RAID
>
> That's right Andy. 0.22+. We are running an HDFS-RAID code base that is
> pretty close to what is available in Apache hdfs trunk.
>
> -dhruba
>
> On Thu, Sep 15, 2011 at 10:08 AM, Andrew Purtell <apurt...@apache.org> wrote:
>
> But that is the HDFS RAID effectively in 0.22+, not 0.21, right Dhruba?
>
> Best regards,
>
> - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
> ------------------------------
> *From:* Dhruba Borthakur <dhr...@gmail.com>
> *To:* hdfs-user@hadoop.apache.org
> *Sent:* Thursday, September 15, 2011 10:06 AM
> *Subject:* Re: Need help regarding HDFS-RAID
>
> We use HDFS RAID in a big way. Data older than 12 days are RAIDED using XOR
> encoding (effective replication of 2.5). Data older than a few months are
> raided using ReedSolomon (effective observed replication factor of 1.5).
> This has been running on our 60 PB cluster for about a year now.
>
> thanks
> dhruba
>
> On Thu, Sep 15, 2011 at 5:31 AM, Ajit Ratnaparkhi
> <ajit.ratnapar...@gmail.com> wrote:
>
> Hi,
>
> We were planning to use it for archiving past data (instead of moving it to
> an archival store).
> Archiving it in HDFS has the advantage of keeping it easily available for
> processing whenever required.
>
> Is there any archival solution in the Hadoop ecosystem?
>
> thanks,
> Ajit.
>
> On Thu, Sep 15, 2011 at 5:05 PM, Harsh J <ha...@cloudera.com> wrote:
>
> Hey Ajit,
>
> HDFS-RAID was never part of the 0.20 release. It made its debut in the
> 0.21 release [1]. I know that Facebook uses it (and also developed
> it), but I am unsure of users beyond Facebook.
>
> While 0.21 overall is not entirely deemed production-usable yet
> (and is, in fact, possibly abandoned in favor of efforts on 0.22+), you can
> give that release a whirl on a test cluster and see for yourself if
> your need beats the stability.
>
> Just curious though - why are you looking to use this specifically?
>
> [1] -
> http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.21/mapreduce/src/contrib/raid/
>
> On Thu, Sep 15, 2011 at 4:37 PM, Ajit Ratnaparkhi
> <ajit.ratnapar...@gmail.com> wrote:
> > Hi,
> > We want to use HDFS-RAID in our production cluster.
> > (http://wiki.apache.org/hadoop/HDFS-RAID)
> > I am not able to find source/binaries/configs for this in the official
> > Hadoop distribution from Apache (checked in 0.20.1 and 0.20.2).
> > Can somebody please tell me where I can find that, and the installation
> > procedure?
> > Also, is the HDFS-RAID implementation stable enough to use in production?
> > thanks,
> > Ajit.
>
> --
> Harsh J

--
Connect to me at http://www.facebook.com/dhruba