Thanks for the explanation; it makes perfect sense now. That would incur a
huge write overhead, so I see why we don't keep the counts.
~Jeff
On 3/16/2011 2:59 PM, Matt Corgan wrote:
Jeff,
The problem is that when hbase receives a put or delete, it doesn't know if
the put is overwriting an existing row or inserting a new one, and it
doesn't know whether the requested row was there to delete. This isn't
known until read or compaction time.
So to keep the counter up to date on every insert, it would have to check
all of the region's storefiles, which would slow down your inserts a lot.
Matt
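To make the point above concrete, here is a minimal toy sketch in Python. It is not HBase code: the class ToyRegion and all of its methods are hypothetical names invented for this illustration, and it ignores timestamps, versions, and column families. It only assumes that writes are appended without reading older data, and shows how a counter bumped on every write drifts from the count a full merge-read would give.

```python
class ToyRegion:
    """Hypothetical toy model of a region; writes never read older data."""

    def __init__(self):
        self.memstore = {}     # row key -> True (put) or False (delete marker)
        self.storefiles = []   # flushed snapshots, oldest first
        self.naive_count = 0   # counter updated on each write, with no read

    def put(self, row):
        self.memstore[row] = True
        self.naive_count += 1  # cannot tell an insert from an overwrite

    def delete(self, row):
        self.memstore[row] = False
        self.naive_count -= 1  # cannot tell whether the row ever existed

    def flush(self):
        # Move the in-memory writes into a new immutable "storefile".
        self.storefiles.append(self.memstore)
        self.memstore = {}

    def scan_count(self):
        # Accurate count: merge every layer, newest wins, skip deleted rows.
        merged = {}
        for layer in self.storefiles + [self.memstore]:
            merged.update(layer)
        return sum(1 for live in merged.values() if live)


region = ToyRegion()
region.put("row1")
region.flush()
region.put("row1")     # overwrite of a row that now lives in a storefile
region.put("row1")     # another overwrite
region.delete("row9")  # delete of a row that was never written

print(region.naive_count, region.scan_count())  # prints "2 1"
```

Keeping naive_count correct would mean calling something like scan_count() (or at least a lookup across every storefile) on each put and delete, which is the per-write read cost described above.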
On Wed, Mar 16, 2011 at 4:52 PM, Ted Yu<yuzhih...@gmail.com> wrote:
Since we have lived so long without this information, I guess we can hold
for longer :-)
Another issue I am working on is to reduce memory footprint. See the
following discussion thread:
One of the regionserver aborted, then the master shut down itself
We have to bear in mind that there would be around 10K regions or more in
production.
Cheers
On Wed, Mar 16, 2011 at 1:46 PM, Jeff Whiting<je...@qualtrics.com> wrote:
Just a random thought. What about keeping a per-region row count? Then if
you needed to get a row count for a table you'd just have to query each
region once and sum. Seems like it wouldn't be too expensive because you'd
just have a row counter variable. It may be more complicated than I'm making
it out to be though...
~Jeff
On 3/16/2011 2:40 PM, Stack wrote:
On Wed, Mar 16, 2011 at 1:35 PM, Vivek Krishna <vivekris...@gmail.com> wrote:
1. How do I count rows fast in hbase?
First I tried count 'test', which takes ages.
Saw that I could use RowCounter, but looks like it is deprecated.
It is not. Make sure you are using the one from the mapreduce package as
opposed to the mapred package.
I just need to verify the total counts. Is it possible to see somewhere in
the web interface or ganglia or by any other means?
We don't keep a current count on a table. Too expensive. Run the
rowcounter MR job. This page may be of help:
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#package_description
Good luck,
St.Ack
--
Jeff Whiting
Qualtrics Senior Software Engineer
je...@qualtrics.com