RE: Using HBase on other file systems

2010-05-16 Thread Gibbon, Robert, VF-Group
...@cloudera.com] Sent: Sun 5/16/2010 12:27 AM To: hbase-user@hadoop.apache.org Subject: Re: Using HBase on other file systems On Sat, May 15, 2010 at 1:19 PM, Gibbon, Robert, VF-Group < robert.gib...@vodafone.com> wrote: > > Todd thanks for replying. 4x 7200 spindles and no RAID = app

Re: Using HBase on other file systems

2010-05-15 Thread Todd Lipcon
ant to put 4G of data in memory! -Todd > -Original Message- > From: Todd Lipcon [mailto:t...@cloudera.com] > Sent: Sat 5/15/2010 3:51 AM > To: hbase-user@hadoop.apache.org > Subject: Re: Using HBase on other file systems > > On Fri, May 14, 2010 at 2:15 PM, Gibbon, Ro

RE: Using HBase on other file systems

2010-05-15 Thread Andrew Purtell
No, Todd was not specifying some kind of minimum. The point was the more spindles, the better for an I/O parallel architecture like HDFS and BigTable. Have you read the BigTable paper? - Andy > From: Gibbon, Robert, VF-Group > Subject: RE: Using HBase on other file systems > >

Re: Using HBase on other file systems

2010-05-15 Thread baleksan
-Original Message- From: "Gibbon, Robert, VF-Group" Date: Sat, 15 May 2010 22:19:57 To: Subject: RE: Using HBase on other file systems Todd thanks for replying. 4x 7200 spindles and no RAID = approx 360 IOPS to/from the backe
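A rough sketch of where a figure like "approx 360 IOPS" for four 7200 rpm spindles can come from. Only the spindle count, the rotational speed, and the 360 IOPS total are from the thread. The 7 ms average seek time and the function names are assumptions made for illustration, and the model ignores caching, queueing, and sequential access.

```python
def disk_iops(rpm: float, avg_seek_ms: float) -> float:
    """Estimate random IOPS for one spinning disk.

    Service time per random I/O is modeled as the average seek time plus
    the average rotational latency (half of one full revolution).
    """
    rotation_ms = 60_000.0 / rpm            # time for one full revolution
    avg_rotational_latency_ms = rotation_ms / 2.0
    service_time_ms = avg_seek_ms + avg_rotational_latency_ms
    return 1000.0 / service_time_ms


def aggregate_iops(spindles: int, rpm: float, avg_seek_ms: float) -> float:
    """Total random IOPS for independent disks (no RAID), assuming the
    load spreads evenly across all spindles."""
    return spindles * disk_iops(rpm, avg_seek_ms)


if __name__ == "__main__":
    # ASSUMPTION: 7 ms average seek, a plausible value for a 7200 rpm drive.
    per_disk = disk_iops(rpm=7200, avg_seek_ms=7.0)
    total = aggregate_iops(spindles=4, rpm=7200, avg_seek_ms=7.0)
    print(f"per disk: {per_disk:.1f} IOPS, 4 spindles: {total:.1f} IOPS")
```

Under these assumptions each disk lands near 90 random IOPS, so four independent spindles give a total close to the 360 quoted above. Adding spindles raises the aggregate linearly in this model.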

RE: Using HBase on other file systems

2010-05-15 Thread Gibbon, Robert, VF-Group
@hadoop.apache.org Subject: Re: Using HBase on other file systems On Fri, May 14, 2010 at 2:15 PM, Gibbon, Robert, VF-Group < robert.gib...@vodafone.com> wrote: > Hmm. What level of IOPS does HBase need in order to support a reasonably > responsive level of service? How much latency in trans

Re: Using HBase on other file systems

2010-05-14 Thread Todd Lipcon
riad other > single-node filesystems that exist. > > -Todd > > > > > > > > -Original Message- > > From: Andrew Purtell [mailto:apurt...@apache.org] > > Sent: Thu 5/13/2010 11:54 PM > > To: hbase-user@hadoop.apache.org > > Subject: RE:

RE: Using HBase on other file systems

2010-05-14 Thread Gibbon, Robert, VF-Group
antics/performance between ext3,ext4,xfs,ufs, myriad other single-node filesystems that exist. -Todd > > > -Original Message- > From: Andrew Purtell [mailto:apurt...@apache.org] > Sent: Thu 5/13/2010 11:54 PM > To: hbase-user@hadoop.apache.org > Subject: RE: Using HBa

Re: Using HBase on other file systems

2010-05-14 Thread Todd Lipcon
on't have to deal with varying semantics/performance between ext3,ext4,xfs,ufs, myriad other single-node filesystems that exist. -Todd > > > -Original Message- > From: Andrew Purtell [mailto:apurt...@apache.org] > Sent: Thu 5/13/2010 11:54 PM > To: hbase-user@hadoop

RE: Using HBase on other file systems

2010-05-14 Thread Gibbon, Robert, VF-Group
:apurt...@apache.org] Sent: Thu 5/13/2010 11:54 PM To: hbase-user@hadoop.apache.org Subject: RE: Using HBase on other file systems You really want to run HBase backed by Eucalyptus' Walrus? What do you have behind that? > From: Gibbon, Robert, VF-Group > Subject: RE: Using HBase on other file s

RE: Using HBase on other file systems

2010-05-13 Thread Andrew Purtell
You really want to run HBase backed by Eucalyptus' Walrus? What do you have behind that? > From: Gibbon, Robert, VF-Group > Subject: RE: Using HBase on other file systems [...] > NB. I checked out running HBase over Walrus (an AWS S3 > clone): bork - you want me to file a Jira on that?

Re: Using HBase on other file systems

2010-05-13 Thread Ryan Rawson
nning HBase over Walrus (an AWS S3 clone): bork - you > want me to file a Jira on that? > > > -Original Message- > From: Ryan Rawson [mailto:ryano...@gmail.com] > Sent: Thu 5/13/2010 9:46 PM > To: hbase-user@hadoop.apache.org > Subject: Re: Using HBase on other file syst

RE: Using HBase on other file systems

2010-05-13 Thread Gibbon, Robert, VF-Group
PM To: hbase-user@hadoop.apache.org Subject: Re: Using HBase on other file systems Hey, I think one of the key features of HDFS is its ability to be run on standard hardware and integrate nicely in a standardized datacenter environment. I never would have got my project off the ground if I h

Re: Using HBase on other file systems

2010-05-13 Thread Ryan Rawson
p a test cluster and >> then >> > > randomly kill bricks. >> > > >> > > Also as pointed out in another mail, you'll want to colocate >> TaskTrackers >> > > on Gluster bricks to get I/O locality, yet there is no way for Gluster >>

Re: Using HBase on other file systems

2010-05-13 Thread Edward Capriolo
ll want to colocate > TaskTrackers > > > on Gluster bricks to get I/O locality, yet there is no way for Gluster > to > > > export stripe locations back to Hadoop. > > > > > > It seems a poor choice. > > > > > > - Andy > > > > >

Re: Using HBase on other file systems

2010-05-12 Thread Jeff Hammerbacher
tripe locations back to Hadoop. > > > > It seems a poor choice. > > > > - Andy > > > > > From: Edward Capriolo > > > Subject: Re: Using HBase on other file systems > > > To: "hbase-user@hadoop.apache.org" > > > D

Re: Using HBase on other file systems

2010-05-12 Thread Edward Capriolo
there is no way for Gluster to > export stripe locations back to Hadoop. > > It seems a poor choice. > > - Andy > > > From: Edward Capriolo > > Subject: Re: Using HBase on other file systems > > To: "hbase-user@hadoop.apache.org" > > D

Re: Using HBase on other file systems

2010-05-12 Thread Andrew Purtell
seems a poor choice. - Andy > From: Edward Capriolo > Subject: Re: Using HBase on other file systems > To: "hbase-user@hadoop.apache.org" > Date: Wednesday, May 12, 2010, 6:38 AM > On Tuesday, May 11, 2010, Jeff > Hammerbacher > wrote: > > Hey Edwa

Re: Using HBase on other file systems

2010-05-12 Thread Edward Capriolo
On Tuesday, May 11, 2010, Jeff Hammerbacher wrote: > Hey Edward, > > I do think that if you compare GoogleFS to HDFS, GFS looks more full >> featured. >> > > What features are you missing? Multi-writer append was explicitly called out > by Sean Quinlan as a bad idea, and rolled back. From internal

Re: Using HBase on other file systems

2010-05-11 Thread Kevin Apte
-Original Message- > From: Jeff Hammerbacher [mailto:ham...@cloudera.com] > Sent: Tuesday, May 11, 2010 3:29 PM > To: hbase-user@hadoop.apache.org > Subject: Re: Using HBase on other file systems > > Hey Edward, > > I do think that if you compare GoogleFS to HDFS, GFS loo

RE: Using HBase on other file systems

2010-05-11 Thread Buttler, David
: Tuesday, May 11, 2010 3:29 PM To: hbase-user@hadoop.apache.org Subject: Re: Using HBase on other file systems Hey Edward, I do think that if you compare GoogleFS to HDFS, GFS looks more full > featured. > What features are you missing? Multi-writer append was explicitly called out by Sean Quin

Re: Using HBase on other file systems

2010-05-11 Thread Jeff Hammerbacher
Hey Edward, I do think that if you compare GoogleFS to HDFS, GFS looks more full > featured. > What features are you missing? Multi-writer append was explicitly called out by Sean Quinlan as a bad idea, and rolled back. From internal conversations with Google engineers, erasure coding of blocks s

Re: Using HBase on other file systems

2010-05-11 Thread Edward Capriolo
On Tue, May 11, 2010 at 5:40 PM, Jeff Hammerbacher wrote: > Okay, the assertion that HBase is only interesting if you need HDFS is > continuing to rankle for me. On the surface, it sounds reasonable, but it's > just so wrong. The specifics cited (caching, HFile, and compaction) are > actually all

Re: Using HBase on other file systems

2010-05-11 Thread Jeff Hammerbacher
Okay, the assertion that HBase is only interesting if you need HDFS is continuing to rankle for me. On the surface, it sounds reasonable, but it's just so wrong. The specifics cited (caching, HFile, and compaction) are actually all advantages of the HBase design. 1) Caching: any data store which t

Re: Using HBase on other file systems

2010-05-11 Thread Jeff Hammerbacher
Hey Edward, Database systems have been built for decades against a storage medium (spinning magnetic platters) which has the same characteristics you point out in HDFS. In the interim, they've managed to service a large number of low latency workloads in a reasonable fashion. There's a reason the

Re: Using HBase on other file systems

2010-05-11 Thread Edward Capriolo
On Tue, May 11, 2010 at 3:51 PM, Jeff Hammerbacher wrote: > Hey, > > Thanks for the evaluation, Andrew. Ceph certainly is elegant in design; > HDFS, similar to GFS [1], was purpose-built to get into production quickly, > so its current incarnation lacks some of the same elegance. On the other > ha

Re: Using HBase on other file systems

2010-05-11 Thread Jeff Hammerbacher
Hey, Thanks for the evaluation, Andrew. Ceph certainly is elegant in design; HDFS, similar to GFS [1], was purpose-built to get into production quickly, so its current incarnation lacks some of the same elegance. On the other hand, there are many techniques for making the metadata servers scalable

Re: Using HBase on other file systems

2010-05-09 Thread Andrew Purtell
> or you'll need to > extend the FileSystem class to write a client that Hadoop > Core can use. There is one: https://issues.apache.org/jira/browse/HADOOP-6253 It even exports stripe locations in a way useful for distributing MR task placement, but provides only one host per "block". - And

Re: Using HBase on other file systems

2010-05-09 Thread Amandeep Khurana
I have HBase running over Ceph on a small cluster here at UC Santa Cruz and am evaluating its performance as compared to HDFS. You'll see some numbers soon. Theoretically, HBase can work on any filesystem. It should either have a POSIX client that you can mount so that HBase can use it as a raw filesys

Re: Using HBase on other file systems

2010-05-09 Thread Andrew Purtell
Our experience with Gluster 2 is that self-heal when a brick drops off the network is very painful. The high performance impact lasts for a long time. I'm not sure, but I think Gluster 3 may only re-replicate missing sections instead of entire files. On the other hand I would not trust Gluster 3 t