If you are game for deploying an instrumented jar, we could log client lookups in .META. and try to figure out whether it is profligate. St.Ack
On Wed, Nov 24, 2010 at 9:25 AM, Jack Levin <magn...@gmail.com> wrote:
> Yes, but that does not alleviate CPU contention should there be too
> many queries to a single region server. On a separate topic, is
> 'compression' in the works for the REST gateway? Similar to
> mysql_client_compression? We plan to drop 500K or more queries at
> a time into REST, and it would be interesting to see the
> performance gain against uncompressed data.
>
> Thanks.
>
> -Jack
>
> On Wed, Nov 24, 2010 at 9:04 AM, Andrew Purtell <apurt...@apache.org> wrote:
>> The REST gateway (Stargate) is a long-lived client. :-)
>>
>> It uses HTablePool internally, so this will keep some warm table references
>> around in addition to the region location caching that HConnectionManager
>> does behind the scenes. (10 references, but this could be made configurable.)
>>
>> Best regards,
>>
>>   - Andy
>>
>> --- On Tue, 11/23/10, Jack Levin <magn...@gmail.com> wrote:
>>
>>> From: Jack Levin <magn...@gmail.com>
>>> Subject: Re: question about meta data query intensity
>>> To: user@hbase.apache.org
>>> Date: Tuesday, November 23, 2010, 11:06 AM
>>> It's REST, and generally no long-lived clients. Yes, caching of regions
>>> helps; however, we expect long-tail hits that will be uncached, which
>>> may stress the meta region. That being said, is it possible to create
>>> affinity and nail the meta region to a beefy server or set of beefy
>>> servers?
>>>
>>> -Jack
>>>
>>> On Tue, Nov 23, 2010 at 10:58 AM, Jonathan Gray <jg...@fb.com> wrote:
>>> > Are you going to have long-lived clients? How are you accessing
>>> > HBase? REST or Thrift gateways? Caching of region locations should
>>> > help significantly, so that it's only a bottleneck right at the
>>> > startup of the cluster/gateways/clients.
>>> >
>>> >> -----Original Message-----
>>> >> From: Jack Levin [mailto:magn...@gmail.com]
>>> >> Sent: Tuesday, November 23, 2010 10:53 AM
>>> >> To: user@hbase.apache.org
>>> >> Subject: Re: question about meta data query intensity
>>> >>
>>> >> My concern is that we plan to have 120 regionservers with 1000
>>> >> regions each, so the hits to meta could be quite intense. (Why so
>>> >> many regions? We are storing 1 petabyte of image data in HBase.)
>>> >>
>>> >> -Jack
>>> >>
>>> >> On Tue, Nov 23, 2010 at 9:50 AM, Jonathan Gray <jg...@fb.com> wrote:
>>> >> > It is possible that it could be a bottleneck, but usually it is not.
>>> >> > Generally, production HBase installations have long-lived clients, so
>>> >> > the client-side caching is sufficient to reduce the amount of load on
>>> >> > META (virtually 0 when a clean cluster is at steady-state / no region
>>> >> > movement).
>>> >> >
>>> >> > For MapReduce, you do make new clients but generally only need to
>>> >> > query for one region per task.
>>> >> >
>>> >> > It is not currently possible to split META. We hard-coded some stuff
>>> >> > a while back to make things easier and in the name of correctness.
>>> >> >
>>> >> > HBASE-3171 is about removing the ROOT region and putting the META
>>> >> > region(s) locations into ZK directly. When we make that change, we
>>> >> > could probably also re-enable the splitting of META.
>>> >> >
>>> >> > JG
>>> >> >
>>> >> >> -----Original Message-----
>>> >> >> From: Jack Levin [mailto:magn...@gmail.com]
>>> >> >> Sent: Tuesday, November 23, 2010 9:31 AM
>>> >> >> To: user@hbase.apache.org
>>> >> >> Subject: question about meta data query intensity
>>> >> >>
>>> >> >> Hello, I am curious if there is a potential bottleneck in .META.
>>> >> >> ownership by a single region server. Is it possible (safe) to split
>>> >> >> the meta region into several?
>>> >> >>
>>> >> >> -Jack
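[Editor's note: to make the caching discussion in this thread concrete, here is a toy sketch of why long-lived clients put almost no load on .META. This is not HBase code; the class, fields, and the fixed 10-key region layout are all invented for illustration. The real client caches region locations (start key, end key, server) learned from .META. and resolves rows locally with a floor lookup, going back to .META. only on a cache miss.]

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of client-side region location caching (names invented,
// not the HBase API). Mimics how HConnectionManager caches .META. rows.
public class RegionCacheSketch {
    record Region(String endKey, String server) {}

    // Sorted map from region start key -> region, like the cached META view.
    private final TreeMap<String, Region> cache = new TreeMap<>();
    int metaLookups = 0; // how many times we had to go to .META.

    // Pretend .META. scan: each region covers 10 keys, hosted on "rs-<bucket>".
    private Region lookupInMeta(String row) {
        metaLookups++;
        int bucket = Integer.parseInt(row) / 10;
        Region r = new Region(String.format("%03d", (bucket + 1) * 10), "rs-" + bucket);
        cache.put(String.format("%03d", bucket * 10), r);
        return r;
    }

    // Resolve a row to a server: floor lookup against cached start keys,
    // accepting the hit only if the row falls before the region's end key.
    String locate(String row) {
        Map.Entry<String, Region> hit = cache.floorEntry(row);
        if (hit != null && row.compareTo(hit.getValue().endKey()) < 0) {
            return hit.getValue().server(); // cache hit: no META traffic
        }
        return lookupInMeta(row).server(); // cache miss: one META lookup
    }

    public static void main(String[] args) {
        RegionCacheSketch client = new RegionCacheSketch();
        // 1000 reads spread over rows 000..099 touch only 10 regions,
        // so a long-lived client pays for just 10 META lookups total.
        for (int i = 0; i < 1000; i++) {
            client.locate(String.format("%03d", i % 100));
        }
        System.out.println("META lookups: " + client.metaLookups); // prints 10
    }
}
```

This is the steady-state effect Jonathan describes: after warm-up, META load is virtually zero until regions move. It is also why Jack's long-tail, short-lived-client workload is the bad case: every cold client repays the miss cost against the single server hosting .META.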