Probably the soft limit flushes, eh?
On Mar 8, 2011 11:15 AM, "Jean-Daniel Cryans" <jdcry...@apache.org> wrote:
> On Tue, Mar 8, 2011 at 11:04 AM, Chris Tarnas <c...@email.com> wrote:
>> Just as a point of reference, in one of our systems we have 500+
>> million rows that have a cell in its own column family that is
>> usually about 100 bytes, but in about 10,000 rows the cell can get
>> to 300 MB (the average is probably about 30 MB for the larger data).
>> The jumbo-sized data gets loaded separately from the smaller data,
>> although it all goes through the same pipeline. We are using cdh3b45
>> (0.90.1), GZ compression, a region size of 1 GB, and a max value size
>> of 500 MB. So far we have had no problems with the larger values.
>>
>> Our largest problem was performance related to inserting into several
>> column families for the small-sized value loads, and pauses when
>> flushing the memstores. 0.90.1 helped quite a bit with that.
>
> Flushing is done without blocking. Were the pauses you were seeing
> related to the "too many stores" issue or to the global memstore
> size?
>
> In general, inserting into many families is a bad idea unless the sizes
> are the same. The worst case is inserting a few KB in one and a few
> MB in the other. The reason is described in:
> https://issues.apache.org/jira/browse/HBASE-3149
>
> J-D
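
To make the mixed-size case concrete, here is a toy model of the effect described above. It assumes a flush is triggered by the region's combined memstore size and writes out every family at once, which is my reading of what HBASE-3149 sets out to change. This is not HBase code: the function names, the 64 MB threshold, and the 2 KB / 2 MB put sizes are all made up for illustration.

```python
# Toy model only, not HBase code. All names and numbers are assumptions.
FLUSH_THRESHOLD = 64 * 1024 * 1024  # assumed per-region flush size (64 MB)


def simulate(writes, threshold=FLUSH_THRESHOLD):
    """Return one dict per flush, mapping family -> bytes written out.

    writes: iterable of (family, nbytes) puts against a single region.
    A flush fires when the combined size of all families reaches the
    threshold, and it flushes every family together.
    """
    memstore = {}
    flushes = []
    for family, nbytes in writes:
        memstore[family] = memstore.get(family, 0) + nbytes
        if sum(memstore.values()) >= threshold:
            flushes.append(dict(memstore))  # every family gets a file
            memstore.clear()
    return flushes


def demo():
    # Each row puts about 2 KB into "small" and about 2 MB into "big".
    row = [("small", 2 * 1024), ("big", 2 * 1024 * 1024)]
    return simulate(row * 100)


if __name__ == "__main__":
    for i, flush in enumerate(demo(), start=1):
        print(i, {fam: f"{size / 1024:.0f} KB" for fam, size in flush.items()})
```

In this model the big family reaches the threshold on its own, and each time it does the small family is flushed along with it, so the small family ends up as a series of tiny files that later have to be compacted.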
