Re: Insert into tall table 50% faster than wide table

2010-12-23 Thread Bryan Keller
en Bryan is going to randomly pick a customer >>> id, create an order and insert the order into HBase. (randomly pick a >>> number between 1 and N where N represents the number of customers who >>> haven't placed 600 orders, and then count the number of orders a

Re: Insert into tall table 50% faster than wide table

2010-12-23 Thread Bryan Keller
ustomer with 600 orders from the list) >> >> So this really wouldn't be a bulk load app, but a simulation of multiple >> clients hitting HBase and its relative performance. >> >> If this is the case, I don't know if you want to use the HFileOutput >

Re: Insert into tall table 50% faster than wide table

2010-12-23 Thread Ryan Rawson
iple > clients hitting HBase and its relative performance. > > If this is the case, I don't know if you want to use the HFileOutput format... > > With respect to the 'wide' row, I'd hash the key. (You wouldn't want to do > this in the 'tall' tab

RE: Insert into tall table 50% faster than wide table

2010-12-23 Thread Michael Segel
nd its relative performance. If this is the case, I don't know if you want to use the HFileOutput format... With respect to the 'wide' row, I'd hash the key. (You wouldn't want to do this in the 'tall' table because you want each customer's orders to b
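
The suggestion to hash the key of the wide (per-customer) row could look roughly like the sketch below. The thread does not say which hash to use, so the MD5 prefix, its length of four hex characters, and the function name `hashed_row_key` are all assumptions made for illustration.

```python
import hashlib

def hashed_row_key(customer_id: str, prefix_len: int = 4) -> str:
    """Prefix the customer id with a short hash so sequential ids are spread out.

    Hypothetical helper: MD5 and a 4-character prefix are illustrative choices,
    not something specified in the thread.
    """
    digest = hashlib.md5(customer_id.encode("utf-8")).hexdigest()
    # Keep the original id in the key so the row can still be looked up directly.
    return f"{digest[:prefix_len]}-{customer_id}"

key = hashed_row_key("customer-000123")
```

Because the prefix is derived from the id itself, a reader can recompute the key for a known customer. As the message notes, this suits the wide table; in the tall table it would scatter the rows that belong to one customer.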

Re: Insert into tall table 50% faster than wide table

2010-12-23 Thread Lars George
Writing data only hits the WAL and MemStore, so performance should be the same for both models. One thing that Mike mentioned is how you distribute the load. How many servers are you using? How are you inserting your data (sequential or random)? Why do you use a Put since this sounds like a

Re: Insert into tall table 50% faster than wide table

2010-12-23 Thread Andrey Stepachev
2010/12/23 Ted Dunning > But the tall table is FASTER than the wide table. > Oops. :) Maybe you put in more data? Are you using compression? (With prefixed qualifiers you get more data, since the UUID can be comparable in length to an order row.) > > On Wed, Dec 22, 2010 at 11:14 PM, Andrey Stepac

Re: Insert into tall table 50% faster than wide table

2010-12-23 Thread Ted Dunning
But the tall table is FASTER than the wide table. On Wed, Dec 22, 2010 at 11:14 PM, Andrey Stepachev wrote: > I think row locks slow things down here. Each row you insert tries to acquire > a lock and then release it. The wide table has significantly fewer rows, and far > fewer locks are acquired during insert.

Re: Insert into tall table 50% faster than wide table

2010-12-22 Thread Andrey Stepachev
I think row locks slow things down here. Each row you insert tries to acquire a lock and then release it. The wide table has significantly fewer rows, and far fewer locks are acquired during insert. 2010/12/23 Bryan Keller > I have been testing a couple of different approaches to storing customer > orders. One

Re: Insert into tall table 50% faster than wide table

2010-12-22 Thread Bryan Keller
you >>> write one column for each order and you have to figure out how you represent >>> your columns in the order. >>> (An example... your order of 10 items is represented by a string with a >>> 'special character' used as a column separator in the o

RE: Insert into tall table 50% faster than wide table

2010-12-22 Thread Michael Segel
e 10 columns as individual columns. But that's just me. :-) -Mike > Date: Wed, 22 Dec 2010 19:00:25 -0800 > Subject: Re: Insert into tall table 50% faster than wide table > From: yuzhih...@gmail.com > To: user@hbase.apache.org > > > Each column is the order so you wr

Re: Insert into tall table 50% faster than wide table

2010-12-22 Thread Ted Yu
> question. The much slower write performance will cause problems for me > unless I can resolve that. > > >> > > >> On Dec 22, 2010, at 3:52 PM, Peter Haidinyak wrote: > > >> > > >>> Interesting, do you know what the time difference would be on the

RE: Insert into tall table 50% faster than wide table

2010-12-22 Thread Michael Segel
able. There are a couple other unknowns. Are you hashing your keys? I mean are you getting a bit of 'randomness' in your keys? So what am I missing? -Mike > Subject: Re: Insert into tall table 50% faster than wide table > From: brya...@gmail.com > Date: Wed, 22 Dec 2010 18:24:05

Re: Insert into tall table 50% faster than wide table

2010-12-22 Thread Bryan Keller
>> Interesting, do you know what the time difference would be on the other >>> side, doing a lookup/scan? >>> >>> Thanks >>> >>> -Pete >>> >>> -Original Message- >>> From: Bryan Keller [mailto:brya...@gmail.com] >

Re: Insert into tall table 50% faster than wide table

2010-12-22 Thread Bryan Keller
ler [mailto:brya...@gmail.com] >> Sent: Wednesday, December 22, 2010 3:41 PM >> To: user@hbase.apache.org >> Subject: Insert into tall table 50% faster than wide table >> >> I have been testing a couple of different approaches to storing customer >> orders

Re: Insert into tall table 50% faster than wide table

2010-12-22 Thread Bryan Keller
ide, > doing a lookup/scan? > > Thanks > > -Pete > > -Original Message- > From: Bryan Keller [mailto:brya...@gmail.com] > Sent: Wednesday, December 22, 2010 3:41 PM > To: user@hbase.apache.org > Subject: Insert into tall table 50% faster than wide table

RE: Insert into tall table 50% faster than wide table

2010-12-22 Thread Peter Haidinyak
Interesting, do you know what the time difference would be on the other side, doing a lookup/scan? Thanks -Pete -Original Message- From: Bryan Keller [mailto:brya...@gmail.com] Sent: Wednesday, December 22, 2010 3:41 PM To: user@hbase.apache.org Subject: Insert into tall table 50

Insert into tall table 50% faster than wide table

2010-12-22 Thread Bryan Keller
I have been testing a couple of different approaches to storing customer orders. One is a tall table, where each order is a row. The other is a wide table where each customer is a row, and orders are columns in the row. I am finding that inserts into the tall table, i.e. adding rows for every or
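
The two layouts described in the original post can be sketched without an HBase client, as plain (row key, column qualifier, value) cells. This is only an illustration of the schema difference: the `<customer>-<order>` key format, the `data` and `order:` qualifiers, and the function names are assumptions, not details given in the thread.

```python
def tall_cells(orders):
    """Tall layout: one row per order; the row key combines customer and order id."""
    # The "<customer>-<order>" key format is an assumption for illustration.
    return [(f"{cust}-{oid}", "data", payload) for cust, oid, payload in orders]

def wide_cells(orders):
    """Wide layout: one row per customer; each order becomes a column in that row."""
    return [(cust, f"order:{oid}", payload) for cust, oid, payload in orders]

def row_count(cells):
    """Number of distinct row keys touched by a list of cells."""
    return len({row for row, _, _ in cells})

orders = [("c1", "o1", "a"), ("c1", "o2", "b"), ("c2", "o1", "c")]
tall = tall_cells(orders)
wide = wide_cells(orders)
print(row_count(tall), row_count(wide))  # prints "3 2"
```

Both layouts store the same number of cells; they differ only in how many distinct rows those cells are spread across, which is the variable the replies about row locks and key distribution focus on.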