Thanks! That completely answers my question and really helps with my schema planning.

-chris

On Mar 1, 2010, at 12:27 AM, Ryan Rawson wrote:

> Yes/no.  During the read process we load one block of the hfile at a
> time.  We only retain the cells that match your scan/get query
> specification, so columns you are not interested in are not retained.
> But if you have huge values intermixed with tiny values, then yeah, we
> will scan and read all of the blocks to find your little values hiding
> between the huge ones.  Column families let you separate data to
> improve performance - one option for tuning performance in hbase
> through simple schema adjustments.
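> 
> To make that concrete, here is a rough sketch using the Java client
> (table, family, and qualifier names are made up, and API details vary
> a bit between HBase versions):
> 
>     import java.io.IOException;
>     import org.apache.hadoop.hbase.HBaseConfiguration;
>     import org.apache.hadoop.hbase.client.Get;
>     import org.apache.hadoop.hbase.client.HTable;
>     import org.apache.hadoop.hbase.client.Result;
>     import org.apache.hadoop.hbase.util.Bytes;
> 
>     public class SingleCellGet {
>       public static void main(String[] args) throws IOException {
>         // Hypothetical table with two families: "small" for tiny
>         // values, "blob" for huge ones.  Each family has its own
>         // store files, so a read against "small" never touches the
>         // blocks holding the blobs.
>         HTable table = new HTable(HBaseConfiguration.create(), "mytable");
> 
>         // Ask for exactly one cell; only the matching cell is
>         // retained from the blocks that are read.
>         Get get = new Get(Bytes.toBytes("row-1"));
>         get.addColumn(Bytes.toBytes("small"), Bytes.toBytes("q1"));
>         Result result = table.get(get);
>         byte[] value = result.getValue(Bytes.toBytes("small"),
>             Bytes.toBytes("q1"));
>         System.out.println(Bytes.toString(value));
>       }
>     }
> 
> If the huge values lived in the same family as the tiny ones, that
> same Get would still read through the blocks containing them.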
> 
> On Wed, Feb 24, 2010 at 11:00 AM, Chris Tarnas <c...@email.com> wrote:
>> This brings up a question: when retrieving only one cell, is the whole
>> row still read into memory from disk? I recall reading that this
>> happened in the past.
>> 
>> thanks!
>> -chris
>> 
>> On Feb 23, 2010, at 12:18 PM, Stack wrote:
>> 
>>> On Tue, Feb 23, 2010 at 10:40 AM, Bluemetrix Development
>>> <bmdevelopm...@gmail.com> wrote:
>>>> 
>>>> If this is the case tho, how big is too big?
>>> 
>>> Each cell and its coordinates are read into memory.  If there is not
>>> enough memory, you get an OOME.
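>>> 
>>> (For very wide rows, one mitigation worth knowing about is batching
>>> scan results.  A minimal sketch with a hypothetical family name;
>>> setBatch caps the number of cells per Result, so a huge row comes
>>> back in chunks instead of all at once:)
>>> 
>>>     import org.apache.hadoop.hbase.client.Scan;
>>>     import org.apache.hadoop.hbase.util.Bytes;
>>> 
>>>     // Return at most 1000 cells per Result; a row with millions of
>>>     // qualifiers is then delivered in slices rather than in one
>>>     // shot.
>>>     Scan scan = new Scan();
>>>     scan.addFamily(Bytes.toBytes("f"));  // hypothetical family
>>>     scan.setBatch(1000);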
>>> 
>>>> Or does it depend on my disk/memory resources?
>>>> I'm currently using dynamic column qualifiers, so some rows may have
>>>> reached tens of millions of unique column qualifiers each.
>>> 
>>> This should be fine as long as you are on a recent hbase.
>>> 
>>> I'd say it was a big cell, or many big cells read concurrently, that
>>> caused the OOME.
>>> 
>>> 
>>>> Or, with other tables that use timestamps as another dimension of
>>>> the data, rows could therefore be reaching tens of millions of
>>>> versions.
>>>> (I was trying to get HBase back up so I could count these numbers.)
>>>> 
>>>> What limits should I use for the time being on the number of
>>>> qualifiers and the number of timestamps/versions?
>>> 
>>> Shouldn't be an issue.
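>>> 
>>> (If you want a guardrail anyway, versions can be capped per column
>>> family at schema time.  A hedged sketch with made-up names; the
>>> admin API details vary across versions:)
>>> 
>>>     import org.apache.hadoop.hbase.HBaseConfiguration;
>>>     import org.apache.hadoop.hbase.HColumnDescriptor;
>>>     import org.apache.hadoop.hbase.HTableDescriptor;
>>>     import org.apache.hadoop.hbase.client.HBaseAdmin;
>>> 
>>>     // Keep at most 100 versions per cell in family "d"; older
>>>     // versions are dropped at major compaction time.
>>>     HTableDescriptor desc = new HTableDescriptor("mytable");
>>>     HColumnDescriptor family = new HColumnDescriptor("d");
>>>     family.setMaxVersions(100);
>>>     desc.addFamily(family);
>>>     new HBaseAdmin(HBaseConfiguration.create()).createTable(desc);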
>>> 
>>> St.Ack
>> 
>> 
