RE: Bulk Loads and Updates

2012-10-03 Thread Ramkrishna.S.Vasudevan
Regards, Ram > -----Original Message----- > From: Eugeny Morozov [mailto:emoro...@griddynamics.com] > Sent: Thursday, October 04, 2012 2:01 AM > To: user@hbase.apache.org > Subject: Re: Bulk Loads and Updates > > Hi! > > Sure, you do, but don't forget to sort all

Re: Bulk Loads and Updates

2012-10-03 Thread Doug Meil
Hi there- re: "All 20 versions will get loaded but the 10 oldest will be deleted during the next major compaction." Yep, that's what is expected to happen. For information on KeyValue structure and compaction algorithm, see http://hbase.apache.org/book.html#regions.arch. For info on bulk l
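The retention behaviour quoted above can be pictured with a small stand-alone model. This is only a sketch in plain Python, not HBase code: the function name major_compact, the dict-based cell store, and the max_versions value of 10 are all assumptions chosen to match the "20 loaded, 10 oldest deleted" example.

```python
from collections import defaultdict

def major_compact(cells, max_versions):
    """Toy model: keep only the newest `max_versions` timestamps
    for each (rowkey, column) pair. Not the real HBase algorithm."""
    by_column = defaultdict(list)
    for (rowkey, column, ts), value in cells.items():
        by_column[(rowkey, column)].append((ts, value))

    kept = {}
    for (rowkey, column), versions in by_column.items():
        # Newest timestamp first, then truncate to the version limit.
        versions.sort(key=lambda v: v[0], reverse=True)
        for ts, value in versions[:max_versions]:
            kept[(rowkey, column, ts)] = value
    return kept

# 20 versions of the same cell, timestamps 1..20 (hypothetical data).
loaded = {(b"row1", b"cf:a", ts): str(ts).encode() for ts in range(1, 21)}
compacted = major_compact(loaded, max_versions=10)
```

In this model all 20 versions exist right after the load, and only the 10 with the highest timestamps remain after the compaction step.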

Re: Bulk Loads and Updates

2012-10-03 Thread Eugeny Morozov
Hi! Sure, you do, but don't forget to sort all KV pairs before putting them into the context, or else you'd get an "unsorted" exception. If they are completely identical and you need to reduce the number of duplicate lines, you could use a Combiner, but its behavior is not deterministic, so basically there i
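As a rough illustration of the sorting step, here is a plain-Python sketch that does not use the HBase API. The KV tuple and sort_kvs helper are hypothetical names; the ordering assumes rowkey ascending, then column ascending, then timestamp descending (newest first), and it collapses family and qualifier into one "column" field for brevity.

```python
from typing import List, NamedTuple

class KV(NamedTuple):
    """Hypothetical stand-in for an HBase KeyValue."""
    row: bytes
    column: bytes
    ts: int
    value: bytes

def sort_kvs(kvs: List[KV]) -> List[KV]:
    # Assumed order: row ascending, column ascending, newest timestamp first.
    return sorted(kvs, key=lambda kv: (kv.row, kv.column, -kv.ts))

unsorted_kvs = [
    KV(b"row2", b"cf:a", 100, b"c"),
    KV(b"row1", b"cf:b", 100, b"b"),
    KV(b"row1", b"cf:a", 100, b"old"),
    KV(b"row1", b"cf:a", 200, b"new"),
]
sorted_kvs = sort_kvs(unsorted_kvs)
```

A reducer that collects all KVs for one rowkey could apply a sort like this before emitting them, so that the writer never sees an out-of-order key.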

Re: Bulk Loads and Updates

2012-10-03 Thread gordoslocos
Thank you Paul. I was just thinking that I could add a reducer to the step that prepares the data to build custom logic around having multiple entries which produce the same rowkey. What do you think? On 03/10/2012, at 17:12, Paul Mackles wrote: > Keys in hbase are a co

Re: Bulk Loads and Updates

2012-10-03 Thread Paul Mackles
Keys in HBase are a combination of rowkey/column/timestamp. Two records with the same rowkey but different columns will result in two different cells with the same rowkey, which is probably what you expect. For two records with the same rowkey and same column, the timestamp will normally differenti
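The key model described here can be sketched with an ordinary dictionary keyed by (rowkey, column, timestamp). This is a simplified model rather than HBase itself; the load function and the sample records are made up for illustration.

```python
from typing import Dict, Iterable, Tuple

# A cell key is modelled as (rowkey, column, timestamp).
CellKey = Tuple[bytes, bytes, int]

def load(records: Iterable[Tuple[bytes, bytes, int, bytes]]) -> Dict[CellKey, bytes]:
    """Toy model of a table: every distinct key is a separate cell."""
    cells: Dict[CellKey, bytes] = {}
    for rowkey, column, ts, value in records:
        cells[(rowkey, column, ts)] = value
    return cells

records = [
    (b"row1", b"cf:a", 100, b"x"),  # first cell
    (b"row1", b"cf:b", 100, b"y"),  # same rowkey, different column
    (b"row1", b"cf:a", 200, b"z"),  # same rowkey and column, later timestamp
]
cells = load(records)
```

In this model the three records produce three distinct entries: two columns under one rowkey, plus a second version of the first column.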

Bulk Loads and Updates

2012-10-03 Thread Juan P.
Hi guys, I've been reading up on bulk load using MapReduce jobs and I wanted to validate something. If the input I wanted to load into HBase produced the same key for several lines, how would HBase handle that? I understand the MapReduce job will create StoreFiles which the region servers just p