Re: Blob storage

2011-03-09 Thread Steven Noels
On Tue, Mar 8, 2011 at 7:33 PM, Buttler, David buttl...@llnl.gov wrote:
> Has someone implemented a complementary blob storage mechanism, or is this something still on the roadmap?
Well, yes. :-)
> The Lily blob storage looks very interesting, but also tightly integrated into their CMS
We

Re: Blob storage

2011-03-09 Thread Chris Tarnas
When I get a chance to catch my breath I'll see about writing up something on our experiences. One thing I will say: don't skimp on the nodes; you do not want to run out of RAM when using the large values. When running my dev environment in pseudo-distributed mode on a laptop the system can

Re: Blob storage

2011-03-09 Thread Jean-Daniel Cryans
Yeah there's definitely something better we could do there, see "Too easy to OOME a RS": https://issues.apache.org/jira/browse/HBASE-2506
J-D
On Wed, Mar 9, 2011 at 11:09 AM, Chris Tarnas c...@email.com wrote:
> When I get a chance to catch my breath I'll see about writing up something on our

Re: Blob storage

2011-03-08 Thread Jean-Daniel Cryans
How big are those blobs on average? I know of a few people that store objects in the hundreds of KBs in HBase without too much tuning. AFAIK the offline blob storage described by Todd wasn't implemented.
J-D
On Tue, Mar 8, 2011 at 10:33 AM, Buttler, David buttl...@llnl.gov wrote:
> Hi all, I

RE: Blob storage

2011-03-08 Thread Buttler, David
...@gmail.com] On Behalf Of Jean-Daniel Cryans
Sent: Tuesday, March 08, 2011 10:39 AM
To: user@hbase.apache.org
Subject: Re: Blob storage
> How big are those blobs on average? I know of a few people that store objects in the hundreds of KBs in HBase without too much tuning. AFAIK the offline blob storage

Re: Blob storage

2011-03-08 Thread Jean-Daniel Cryans
> The blobs vary in size from smallish (10K) to largish (20MB).
20MB is quite large, but could be harmless if most of the rows are under 1MB.
> They are too small to put into individual files in HDFS, but if I have too many largish rows in a region, I think I would suffer.
Yeah, need more info

Re: Blob storage

2011-03-08 Thread Chris Tarnas
Just as a point of reference, in one of our systems we have 500+ million rows that have a cell in its own column family that is usually about 100 bytes, but in about 10,000 of the rows the cell can get to 300 MB (average is probably about 30 MB for the larger data). The jumbo-sized data gets

Re: Blob storage

2011-03-08 Thread Jean-Daniel Cryans
On Tue, Mar 8, 2011 at 11:04 AM, Chris Tarnas c...@email.com wrote:
> Just as a point of reference, in one of our systems we have 500+ million rows that have a cell in its own column family that is usually about 100 bytes, but in about 10,000 of the rows the cell can get to 300 MB (average is

Re: Blob storage

2011-03-08 Thread Ryan Rawson
Probably the soft limit flushes, eh?
On Mar 8, 2011 11:15 AM, Jean-Daniel Cryans jdcry...@apache.org wrote:
> On Tue, Mar 8, 2011 at 11:04 AM, Chris Tarnas c...@email.com wrote:
>> Just as a point of reference, in one of our systems we have 500+ million rows that have a cell in its own column family

Re: Blob storage

2011-03-08 Thread Chris Tarnas
Yes, HBASE-3483 fixed the majority of our pauses, but not all - as JD points out, we do experience issues related to inserting into several column families. Luckily, inserts that have the really imbalanced column family sizes (MB vs KB) are few and far between, relatively speaking. We are also

Re: Blob storage

2011-03-08 Thread Jean-Daniel Cryans
That's pretty good stuff Chris! You know, you could be my new BFF if you wrote a blog post about your current HBase setup, experiences, etc :)
J-D
On Tue, Mar 8, 2011 at 11:25 AM, Chris Tarnas c...@email.com wrote:
> Yes, HBASE-3483 fixed the majority of our pauses, but not all - as JD points