Re: Scan performance on a big table as combination of multiple logic tables

M. C. Srivas Thu, 23 Feb 2012 22:35:15 -0800

On Tue, Feb 21, 2012 at 9:58 PM, Stack <[email protected]> wrote:

> On Tue, Feb 21, 2012 at 9:29 PM, M. C. Srivas <[email protected]> wrote:
> > Yes, that was my thinking --- to do a major compaction the region-server
> > would have to load all the flushed files for that region, merge them, and
> > then write out the new region. If the region-file was 20g in size, the
> > region-server would require well over 20g of heap space to do this work.
> > Am I completely off?
> >
>
> You are a little off.  We open all hfiles and then stream through each
> of them, doing a merge sort and streaming the output to the new
> compacted file.
>


Doh!  Seems obvious once you mention it. Sorry about that.
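
For anyone else reading the thread, the streaming merge described above can be sketched roughly as follows. This is only an illustration of a k-way merge over already-sorted inputs, not HBase's actual compaction code: the class name StreamingMerge, the Cursor helper, and the merge method are all hypothetical, and plain strings stand in for sorted key-values. The point is that only one pending entry per input file is held in memory at a time, so heap usage grows with the number of files rather than with their total size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.Consumer;

// Hypothetical illustration; none of these names come from HBase.
public class StreamingMerge {

    // One cursor per sorted input: its current head plus the rest of the stream.
    private static final class Cursor implements Comparable<Cursor> {
        String head;
        final Iterator<String> rest;

        Cursor(String head, Iterator<String> rest) {
            this.head = head;
            this.rest = rest;
        }

        @Override
        public int compareTo(Cursor other) {
            return head.compareTo(other.head);
        }
    }

    // Merges inputs that are each already sorted, pushing entries to the sink
    // in global sorted order. The heap never holds more than one entry per input.
    public static void merge(List<Iterator<String>> sortedInputs, Consumer<String> sink) {
        PriorityQueue<Cursor> heap = new PriorityQueue<>();
        for (Iterator<String> in : sortedInputs) {
            if (in.hasNext()) {
                heap.add(new Cursor(in.next(), in));
            }
        }
        while (!heap.isEmpty()) {
            Cursor smallest = heap.poll();
            sink.accept(smallest.head);          // "write" to the output file
            if (smallest.rest.hasNext()) {
                smallest.head = smallest.rest.next();
                heap.add(smallest);              // re-queue with its next entry
            }
        }
    }

    public static void main(String[] args) {
        // Three small in-memory lists stand in for three sorted store files.
        List<Iterator<String>> files = Arrays.asList(
            Arrays.asList("a", "d", "g").iterator(),
            Arrays.asList("b", "e").iterator(),
            Arrays.asList("c", "f", "h").iterator());
        List<String> out = new ArrayList<>();
        merge(files, out::add);
        System.out.println(out); // prints [a, b, c, d, e, f, g, h]
    }
}
```

In a real compaction the iterators would be scanners reading from disk and the sink would be a file writer, so neither the inputs nor the output would ever need to fit in heap.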



>
> Here is where we open a scanner on all the files to compact and then
> as we inch through, we figure what to write to the output:
>
> http://hbase.apache.org/xref/org/apache/hadoop/hbase/regionserver/Store.html#1393
>
> (It's a bit hard to follow what's going on -- file selection is done
> already higher up in the call chain).
>
> St.Ack
>
