RE: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Chad Walters
> -ryan > > On Wed, Dec 15, 2010 at 11:55 AM, Chad Walters > > wrote: > > Why not run multiple JVMs per machine? > > > > Chad > > > > -Original Message- > > From: Ryan Rawson [mailto:ryano...@gmail.com] > > Sent: Wednesday,

Re: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Ed Kohlwey
Along the lines of Terracotta big memory, apparently what they are actually doing is just using the DirectByteBuffer class (see this forum post: http://forums.terracotta.org/forums/posts/list/4304.page) which is basically the same as using malloc - it gives you non-gc access to a giant pool of memo
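For readers unfamiliar with the approach mentioned above, here is a minimal sketch of what "non-gc access to a pool of memory" looks like with the standard java.nio API. It only illustrates ByteBuffer.allocateDirect; it is not Terracotta's or HBase's actual code. The class name OffHeapSketch and the helpers allocateOffHeap, putBlock and getBlock are hypothetical names made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not taken from Terracotta or HBase.
public class OffHeapSketch {

    // Reserve `capacity` bytes of native memory. The buffer's contents live
    // outside the Java heap, so they are not bounded by -Xmx and are not
    // scanned or moved by the garbage collector.
    public static ByteBuffer allocateOffHeap(int capacity) {
        return ByteBuffer.allocateDirect(capacity);
    }

    // Copy a block into the pool at the given byte offset.
    public static void putBlock(ByteBuffer pool, int offset, byte[] block) {
        ByteBuffer view = pool.duplicate(); // own position, shared bytes
        view.position(offset);
        view.put(block);
    }

    // Copy `length` bytes out of the pool, starting at the given offset.
    public static byte[] getBlock(ByteBuffer pool, int offset, int length) {
        byte[] out = new byte[length];
        ByteBuffer view = pool.duplicate();
        view.position(offset);
        view.get(out);
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer pool = allocateOffHeap(1024 * 1024); // 1 MB, arbitrary size
        byte[] block = "cached block".getBytes(StandardCharsets.UTF_8);
        putBlock(pool, 4096, block);
        byte[] copy = getBlock(pool, 4096, block.length);
        System.out.println(new String(copy, StandardCharsets.UTF_8));
    }
}
```

As with malloc, the caller has to track offsets and lengths itself, and every read back into a byte[] is a copy onto the heap. Direct memory is capped separately from the heap (on HotSpot via -XX:MaxDirectMemorySize), so it still has to be planned for alongside Xmx.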

Re: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Andrew Purtell
> From: Ryan Rawson > Purtell has more, but he told me "no longer crashes, but minor pauses > between 50-250 ms". From 1.6_23. That's right. On EC2 m1.xlarge so that's a big caveat... per-test-iteration variance on EC2 in general is ~20%, and EC2 hardware is 2? generations back. Someone with

Re: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Andrew Purtell
kloads. Best regards, - Andy --- On Wed, 12/15/10, Ted Dunning wrote: > From: Ted Dunning > Subject: Re: Hypertable claiming upto >900% random-read throughput vs HBase > To: dev@hbase.apache.org > Date: Wednesday, December 15, 2010, 11:31 AM > > Does anybody have a re

Re: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Ted Dunning
r per VM seems like a reasonable stopgap until the GC > works better. > > Chad > > -Original Message- > From: Ryan Rawson [mailto:ryano...@gmail.com] > Sent: Wednesday, December 15, 2010 11:58 AM > To: dev@hbase.apache.org > Subject: Re: Hypertable claiming u

Re: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Ryan Rawson
IQ, www.carrieriq.com >> e-mail: vrodio...@carrieriq.com >> >> ________________ >> From: Ryan Rawson [ryano...@gmail.com] >> Sent: Wednesday, December 15, 2010 11:52 AM >> To: dev@hbase.apache.org >> Subject: Re: Hypertable claiming upto >

Re: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Todd Lipcon
Wednesday, December 15, 2010 11:52 AM > To: dev@hbase.apache.org > Subject: Re: Hypertable claiming upto >900% random-read throughput vs HBase > > The malloc thing was pointing out that we have to contend with Xmx and > GC.  So it makes it harder for us to maximally use all the availabl

RE: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Vladimir Rodionov
From: Ryan Rawson [ryano...@gmail.com] Sent: Wednesday, December 15, 2010 11:52 AM To: dev@hbase.apache.org Subject: Re: Hypertable claiming upto >900% random-read throughput vs HBase The malloc thing was pointing out that we have to contend with Xmx and GC. So it makes it harder for us

RE: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Chad Walters
[mailto:ryano...@gmail.com] Sent: Wednesday, December 15, 2010 11:58 AM To: dev@hbase.apache.org Subject: Re: Hypertable claiming upto >900% random-read throughput vs HBase Why do that? You reduce the cache effectiveness and up the logistical complexity. As a stopgap maybe, but not as a l

Re: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Ryan Rawson
> Chad > > -Original Message- > From: Ryan Rawson [mailto:ryano...@gmail.com] > Sent: Wednesday, December 15, 2010 11:52 AM > To: dev@hbase.apache.org > Subject: Re: Hypertable claiming upto >900% random-read throughput vs HBase > > The malloc thing was pointing

RE: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Chad Walters
Why not run multiple JVMs per machine? Chad -Original Message- From: Ryan Rawson [mailto:ryano...@gmail.com] Sent: Wednesday, December 15, 2010 11:52 AM To: dev@hbase.apache.org Subject: Re: Hypertable claiming upto >900% random-read throughput vs HBase The malloc thing was point

Re: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Ryan Rawson
The malloc thing was pointing out that we have to contend with Xmx and GC. So it makes it harder for us to maximally use all the available ram for block cache in the regionserver. Which you may or may not want to do for alternative reasons. At least with Xmx you can plan and control your deploym

Re: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Todd Lipcon
On Wed, Dec 15, 2010 at 11:44 AM, Gaurav Sharma wrote: > Thanks Ryan and Ted. I also think if they were using tcmalloc, it would have > given them a further advantage but as you said, not much is known about the > test source code. I think Hypertable does use tcmalloc or jemalloc (forget which)

Re: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Gaurav Sharma
Thanks Ryan and Ted. I also think if they were using tcmalloc, it would have given them a further advantage but as you said, not much is known about the test source code. On Wed, Dec 15, 2010 at 2:22 PM, Ryan Rawson wrote: > So if that is the case, I'm not sure how that is a fair test. One > sy

Re: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Ryan Rawson
Purtell has more, but he told me "no longer crashes, but minor pauses between 50-250 ms". From 1.6_23. Still not usable in a latency sensitive prod setting. Maybe in other settings? -ryan On Wed, Dec 15, 2010 at 11:31 AM, Ted Dunning wrote: > Does anybody have a recent report about how G1 is

Re: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Ted Dunning
Does anybody have a recent report about how G1 is coming along? On Wed, Dec 15, 2010 at 11:22 AM, Ryan Rawson wrote: > As G1 GC improves, I expect our ability to use larger and larger heaps > would blunt the advantage of a C++ program using malloc. >

Re: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Ryan Rawson
So if that is the case, I'm not sure how that is a fair test. One system reads from RAM, the other from disk. The results are as expected. Why not test one system with SSDs and the other without? It's really hard to get an apples-to-apples comparison. Even if you are doing the same workloads on 2 divers

Re: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Ted Dunning
From the small comments I have heard, the RAM versus disk difference is mostly what I have heard they were testing. On Wed, Dec 15, 2010 at 11:11 AM, Ryan Rawson wrote: > We don't have the test source code, so it isn't very objective. However > I believe there are 2 things which help them: > - T

Re: Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Ryan Rawson
Hi, We don't have the test source code, so it isn't very objective. However, I believe there are 2 things which help them: - They are able to harness larger amounts of RAM, so they are really just testing that vs HBase - There have been substantial performance improvements in HBase since the version

Hypertable claiming upto >900% random-read throughput vs HBase

2010-12-15 Thread Gaurav Sharma
Folks, my apologies if this has been discussed here before, but can someone please shed some light on how Hypertable is claiming up to 900% higher throughput on random reads and up to 1000% on sequential reads in their performance evaluation vs HBase (modeled after the perf-eval test in section 7