Re: Optimizing table scans

2012-09-17 Thread Alex Baranau
column value at a time FYI > > -Anoop- > > From: Amit Sela [am...@infolinks.com] > Sent: Saturday, September 15, 2012 2:41 PM > To: user@hbase.apache.org > Subject: Re: Optimizing table scans > > So just to get it straight. The reason the s

RE: Optimizing table scans

2012-09-16 Thread Anoop Sam John
[am...@infolinks.com] Sent: Saturday, September 15, 2012 2:41 PM To: user@hbase.apache.org Subject: Re: Optimizing table scans So just to get it straight. The reason the scan with setBatch(1) is much, much faster is because it returns only the value for the first column? On Wed, Sep 12, 2012 a

Re: Optimizing table scans

2012-09-15 Thread Amit Sela
So just to get it straight. The reason the scan with setBatch(1) is much, much faster is because it returns only the value for the first column? On Wed, Sep 12, 2012 at 5:37 PM, Doug Meil wrote: > > Hi there, > > See this for info on the block cache in the RegionServer.. > > http://hbase.apac
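
For readers following the setBatch(1) question: as far as I understand it, the batch setting caps how many column values go into each Result, so a row with more columns than the batch size comes back as several Results rather than being cut off after the first column. Below is a minimal plain-Java sketch of that splitting. It is not HBase code; the class BatchSplitSketch, the method split, and the column names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: models how a batch limit splits one row's
// columns into several Result-like chunks. No HBase classes are used.
class BatchSplitSketch {

    // Splits the columns of a single row into chunks of at most `batch` columns.
    // Each chunk stands in for one Result handed back by the scanner.
    static List<List<String>> split(List<String> rowColumns, int batch) {
        if (batch < 1) {
            throw new IllegalArgumentException("batch must be >= 1");
        }
        List<List<String>> results = new ArrayList<>();
        for (int i = 0; i < rowColumns.size(); i += batch) {
            int end = Math.min(i + batch, rowColumns.size());
            results.add(new ArrayList<>(rowColumns.subList(i, end)));
        }
        return results;
    }

    public static void main(String[] args) {
        // Assumed example row with three columns.
        List<String> row = Arrays.asList("cf:a", "cf:b", "cf:c");
        // With batch = 1 every column still arrives, one per chunk.
        System.out.println(split(row, 1));
        // With batch = 2 the row arrives as two chunks.
        System.out.println(split(row, 2));
    }
}
```

In this model, code that reads only the first Result for a row would see a single column when batch is 1, which is worth checking for when comparing timings.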

Re: Optimizing table scans

2012-09-12 Thread Doug Meil
Hi there, See this for info on the block cache in the RegionServer.. http://hbase.apache.org/book.html 9.6.4. Block Cache ... and see this for "batching" on the scan parameter... http://hbase.apache.org/book.html#perf.reading 11.8.1. Scan Caching On 9/12/12 9:55 AM, "Amit Sela" wrote: >

Re: Optimizing table scans

2012-09-12 Thread Amit Sela
I allocate 10GB per RegionServer. An average row size is ~200 Bytes. The network is 1GB. It would be great if anyone could elaborate on the difference between Cache and Batch parameters. Thanks. On Wed, Sep 12, 2012 at 4:04 PM, Michael Segel wrote: > How much memory do you have? > What's the si
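
On the Cache versus Batch question, a rough way to think about it: caching sets how many Results the client fetches per round trip to the RegionServer, while batch sets how many columns go into each Result. The plain-Java sketch below estimates Result and RPC counts under that simplified model. It is not HBase code; the class ScanRpcModel, its method names, and the "+1" for a final call that finds the scan exhausted are assumptions made for illustration, and real numbers will vary with version and region layout.

```java
// Hypothetical back-of-the-envelope model, not an HBase API.
class ScanRpcModel {

    // Number of Result objects: each row is split into
    // ceil(columnsPerRow / batch) Results. Use Integer.MAX_VALUE as
    // the batch value to model "no batching" (whole rows).
    static long results(long rows, int columnsPerRow, int batch) {
        if (columnsPerRow < 1 || batch < 1) {
            throw new IllegalArgumentException("columnsPerRow and batch must be >= 1");
        }
        int cellsPerResult = Math.min(columnsPerRow, batch);
        long resultsPerRow = (columnsPerRow + cellsPerResult - 1) / cellsPerResult;
        return rows * resultsPerRow;
    }

    // Estimated RPCs: one per `caching` Results, plus one assumed
    // final call that discovers there is nothing left to return.
    static long rpcs(long rows, int columnsPerRow, int caching, int batch) {
        if (caching < 1) {
            throw new IllegalArgumentException("caching must be >= 1");
        }
        long r = results(rows, columnsPerRow, batch);
        return (r + caching - 1) / caching + 1;
    }

    public static void main(String[] args) {
        // Assumed table shape: 100 rows with 20 columns each.
        long rows = 100;
        int cols = 20;
        int[][] settings = { {1, 1}, {10, 1}, {10, 20}, {100, 20} };
        for (int[] s : settings) {
            System.out.println("caching=" + s[0] + " batch=" + s[1]
                    + " results=" + results(rows, cols, s[1])
                    + " rpcs=" + rpcs(rows, cols, s[0], s[1]));
        }
    }
}
```

Under this model, raising caching cuts round trips, while lowering batch increases the number of Results (and so round trips, for a fixed caching value) but bounds how much of a wide row is held in memory at once.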

Re: Optimizing table scans

2012-09-12 Thread Michael Segel
How much memory do you have? What's the size of the underlying row? What does your network look like? 1GBe or 10GBe? There's more to it, and I think that you'll find that YMMV on what is an optimum scan size... HTH -Mike On Sep 12, 2012, at 7:57 AM, Amit Sela wrote: > Hi all, > > I'm tryi

Optimizing table scans

2012-09-12 Thread Amit Sela
Hi all, I'm trying to find the sweet spot for the cache size and batch size Scan() parameters. I'm scanning one table using HTable.getScanner() and iterating over the ResultScanner retrieved. I did some testing and got the following results: For scanning *100* rows. * Cache Batch Total