Yes, this is spot on. When HBase scans, it reads a block, iterates through
the keys in that block, then goes to the next block. We try to be as
efficient as possible, but the inescapable fact remains that we must read
all the intervening data. We can do tricks (in 0.90) to use the block index
to skip some blocks, but it is not always possible.
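
To make the cost concrete, here is a rough toy model in Python. It is not
HBase code: the names (pack_blocks, scan), the 100-byte and 10 KB value
sizes, and the "meta"/"blob" columns are all made up for illustration, and
it ignores key sizes, compression, and the block index. It only shows that a
block-by-block scan passes over every cell stored alongside the ones you
want.

```python
BLOCK_SIZE = 64 * 1024  # bytes; the block size mentioned in this thread


def pack_blocks(cells, block_size=BLOCK_SIZE):
    """Pack (key, value_size) cells, already sorted by key, into blocks."""
    blocks, current, used = [], [], 0
    for key, size in cells:
        # Start a new block once the current one would overflow.
        if current and used + size > block_size:
            blocks.append(current)
            current, used = [], 0
        current.append((key, size))
        used += size
    if current:
        blocks.append(current)
    return blocks


def scan(blocks, wanted):
    """Walk every block in order; return (matching keys, bytes passed over)."""
    hits, bytes_read = [], 0
    for block in blocks:
        for key, size in block:
            bytes_read += size  # every cell in the block is read, wanted or not
            if wanted(key):
                hits.append(key)
    return hits, bytes_read


def is_meta(key):
    return key[1] == "meta"


rows = [f"row{i:04d}" for i in range(1000)]

# One family holding both a small "meta" column and a large "blob" column.
mixed = sorted(
    [((r, "blob"), 10_000) for r in rows] + [((r, "meta"), 100) for r in rows]
)
# A separate family holding only the small column.
small_only = [((r, "meta"), 100) for r in rows]

hits_mixed, read_mixed = scan(pack_blocks(mixed), is_meta)
hits_small, read_small = scan(pack_blocks(small_only), is_meta)

print(f"mixed family:    {read_mixed:>10,} bytes for {len(hits_mixed)} cells")
print(f"separate family: {read_small:>10,} bytes for {len(hits_small)} cells")
```

Both scans return the same 1000 small cells, but in this model the mixed
layout passes over about 100x more data to get them.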
On Oct 11, 2010 5:01 PM, "Sean Bigdatafun" <sean.bigdata...@gmail.com>
wrote:
> I think this is a good suggestion too.
>
> HBase linearly scans through the 64KB that is brought into memory. If a big
> data payload (unused in a query/scan) is mixed with a small data payload, it
> will be rather inefficient, I think?
>
> On Mon, Oct 11, 2010 at 9:43 AM, Ryan Rawson <ryano...@gmail.com> wrote:
>
>> The reason I talk about value size is that one area where multiple
>> families are good is when you have really large values in one column and
>> smaller values in different columns. So if you want to just read the
>> small values without scanning through the big values, you can use
>> separate column families.
>>
>> -ryan
>>
>> On Mon, Oct 11, 2010 at 9:32 AM, Jean-Daniel Cryans <jdcry...@apache.org>
>> wrote:
>> >> Yes, I agree. OOME is unlikely. I misinterpreted my current problem.
>> >> I found that this (gc timeout) on my 0.89-stumbleupon hbase occurs
>> >> only if writeToWAL=false. My RS eats all available memory (5GB), but
>> >> doesn't get an OOME. I am trying to figure out what is going on.
>> >
>> > Long GC pauses happen for many different reasons; first make sure
>> > that your IO, CPU, and RAM aren't overcommitted and that there's no
>> > swap.
>> >
>> >> Hm.. How can I flush a family from the client side? I don't see any
>> >> API for it in 0.20.x. Is it a 0.89 API change? (I haven't dug into
>> >> 0.89 yet.)
>> >>
>> >
>> > You can't, I was talking about a possible fix in the code.
>> >
>> >>
>> >> Sorry for the wrong information.
>> >
>> > No problem :)
>> >
>> > J-D
>> >
>>
