Thanks! That completely answers my question and really helps with my schema planning.
-chris

On Mar 1, 2010, at 12:27 AM, Ryan Rawson wrote:

> Yes/no. During the read process we load a block of the hfile in at a
> time. We only retain the cells which are picked by your scan/get
> query specification. So columns you are not interested in are not
> retained. But if you have huge values intermixed with tiny values,
> yeah we will scan and read all of the blocks to find your little
> values hiding between the huge ones. Column families let you separate
> data to improve performance - one option to adjust performance in
> hbase by simple schema adjustments.
>
> On Wed, Feb 24, 2010 at 11:00 AM, Chris Tarnas <c...@email.com> wrote:
>> This brings up a question: when retrieving only one cell, is the whole row
>> still read into memory from disk? I recall reading that happened in the past.
>>
>> thanks!
>> -chris
>>
>> On Feb 23, 2010, at 12:18 PM, Stack wrote:
>>
>>> On Tue, Feb 23, 2010 at 10:40 AM, Bluemetrix Development
>>> <bmdevelopm...@gmail.com> wrote:
>>>>
>>>> If this is the case tho, how big is too big?
>>>
>>> Each cell and its coordinates is read into memory. If there is not
>>> enough memory, then OOME.
>>>
>>>> Or does it depend on my disk/memory resources?
>>>> I'm currently using dynamic column qualifiers, so I could have been
>>>> reaching rows with 10s of millions of unique column qualifiers each.
>>>
>>> This should be fine as long as you are on a recent hbase.
>>>
>>> I'd say it was a big cell, or many big cells read concurrently, that
>>> caused the OOME.
>>>
>>>> Or, with other tables using timestamps as another dimension to the
>>>> data, and therefore reaching 10s of millions of versions.
>>>> (I was trying to get HBase back up so I could count these numbers.)
>>>>
>>>> What limits should I use for the time being for number of qualifiers
>>>> and number of timestamps/versions?
>>>
>>> Shouldn't be an issue.
>>>
>>> St.Ack
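Ryan's point about tiny values hiding between huge ones can be sketched with a toy model: cells are packed into fixed-size blocks, a scan loads one block at a time, and only cells in the requested columns are retained. This is an illustration of the behavior described in the thread, not HBase code; the block size and all names here are made up for the sketch.

```python
# Toy model of the read path described above: blocks are loaded one at a
# time, and only cells matching the query's columns are retained.
# All names are illustrative, not HBase APIs.

BLOCK_SIZE = 64 * 1024  # bytes per block, loosely mirroring HBase's 64 KB default


def build_blocks(cells):
    """Pack (column, value) cells into blocks of roughly BLOCK_SIZE bytes."""
    blocks, current, used = [], [], 0
    for column, value in cells:
        current.append((column, value))
        used += len(value)
        if used >= BLOCK_SIZE:
            blocks.append(current)
            current, used = [], 0
    if current:
        blocks.append(current)
    return blocks


def scan(blocks, wanted_columns):
    """Read every block, but retain only cells in the requested columns."""
    blocks_read, retained = 0, []
    for block in blocks:
        blocks_read += 1  # the whole block is loaded from disk...
        for column, value in block:
            if column in wanted_columns:  # ...but unwanted cells are dropped
                retained.append((column, value))
    return blocks_read, retained


# Tiny cells interleaved with huge ones: every block must still be read
# to find the small values, even though only they are retained.
cells = []
for i in range(100):
    cells.append(("big:blob%d" % i, b"x" * 32 * 1024))  # 32 KB values
    cells.append(("small:flag%d" % i, b"y"))            # 1-byte values

blocks = build_blocks(cells)
wanted = {"small:flag%d" % i for i in range(100)}
blocks_read, retained = scan(blocks, wanted)
print("blocks read:", blocks_read, "of", len(blocks), "- cells kept:", len(retained))
```

Every block is read even though only the 1-byte cells are kept, which is the cost Ryan is warning about. The column-family advice is the fix: putting the huge values in their own family (e.g. `create 'events', {NAME => 'big'}, {NAME => 'small'}` in the HBase shell) stores them in separate files, so scans of the small family never touch the big blocks at all.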