I forgot to mention in the mail that the table will be salted. Thanks Vladimir and James for pointing it out.
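For reference, here is a rough sketch of what the salted DDL could look like, using the rowkey columns from the original mail. The table name, payload column, bucket count, and encoding/compression choices are illustrative, not what we actually deployed:

```sql
-- Illustrative sketch only. SALT_BUCKETS prepends a salt byte to the row key,
-- spreading the DATELOGGED-leading keys across regions to avoid hotspotting.
CREATE TABLE event_log (
    datelogged  DATE    NOT NULL,
    orgname     VARCHAR NOT NULL,
    instanceid  VARCHAR NOT NULL,
    txid        BIGINT  NOT NULL,   -- sequence number
    payload     VARCHAR
    CONSTRAINT pk PRIMARY KEY (datelogged, orgname, instanceid, txid)
) SALT_BUCKETS = 16,
  DATA_BLOCK_ENCODING = 'FAST_DIFF',  -- encodes the large, repetitive row keys
  COMPRESSION = 'SNAPPY';             -- on-disk compression, per Vladimir's advice
```

A bucket count somewhere around the number of region servers is the usual starting point; the salted-table page linked below discusses the trade-offs.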
Also, thanks for confirming that block encoding is the right approach for dealing with large row keys with small column values: "Block encoding in your case is a MUST."

--Uni

On Tue, Apr 15, 2014 at 12:01 PM, Sean Huo <[email protected]> wrote:

> Speaking of which, is there a way to reset DATA_BLOCK_ENCODING on an
> existing Phoenix table? I tried Phoenix's ALTER TABLE without much
> success. The HBase shell does support that.
>
>
> On Mon, Apr 14, 2014 at 7:35 PM, James Taylor <[email protected]> wrote:
>
>> I agree with Vladimir wrt hotspotting - take a look here at how to salt
>> your table: http://phoenix.incubator.apache.org/salted.html
>>
>> In general, we find salted tables perform better for both reads and
>> writes.
>>
>> James
>>
>>
>> On Monday, April 14, 2014, Vladimir Rodionov <[email protected]>
>> wrote:
>>
>>> Your rowkey schema is not efficient; it will result in region
>>> hotspotting (unless you salt your table).
>>> DATA_BLOCK_ENCODING is beneficial for both in-memory and on-disk
>>> storage. Compression is a must as well.
>>> It is hard to say how block encoding + compression will affect query
>>> performance in your case, but common sense tells us that the more data
>>> you keep in the in-memory block cache, the better your overall
>>> performance. Block encoding in your case is a MUST; as for compression,
>>> YMMV, but usually Snappy, LZO, or LZ4 improves performance as well.
>>> The block cache in 0.94.8 does not support compression, but it does
>>> support block encoding.
>>>
>>> -Vladimir Rodionov
>>>
>>>
>>> On Mon, Apr 14, 2014 at 6:04 PM, James Taylor <[email protected]> wrote:
>>>
>>>> Hi,
>>>> Take a look at Doug Meil's HBase blog here:
>>>> http://blogs.apache.org/hbase/ as I think that's pretty relevant for
>>>> Phoenix as well. Also, Mujtaba may be able to provide you with some
>>>> good guidance.
>>>> Thanks,
>>>> James
>>>>
>>>>
>>>> On Tue, Apr 8, 2014 at 2:24 PM, universal localhost <
>>>> [email protected]> wrote:
>>>>
>>>>> Hey All,
>>>>>
>>>>> Can someone please suggest optimizations in Phoenix or HBase that I
>>>>> can benefit from for the case where *rowkeys are much larger than
>>>>> the column values*.
>>>>>
>>>>> In my case, rowkeys contain a timestamp.
>>>>>
>>>>> RowKey schema: *DATELOGGED, ORGNAME,* INSTANCEID, TXID
>>>>> The TXID column is a sequence number.
>>>>>
>>>>> I have read a little about DATA_BLOCK_ENCODING and learned that it
>>>>> can provide in-cache key compression.
>>>>>
>>>>> I am hoping that by using this compression I can get away with large
>>>>> rowkeys...
>>>>> Any suggestions on how it will affect query performance?
>>>>>
>>>>> --Uni
>>>>>
>>>>
>>>>
>>>
>
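On Sean's question about changing the encoding of an existing Phoenix table: since the HBase shell supports it, something along these lines should work. The table/column-family names below are assumptions (Phoenix's default column family is typically '0'); note that in 0.94 the table must be disabled first, and existing HFiles only pick up the new encoding after a rewrite:

```shell
# Sketch, not verified against our cluster. Adjust table and CF names.
hbase shell <<'EOF'
disable 'EVENT_LOG'
alter 'EVENT_LOG', { NAME => '0',
                     DATA_BLOCK_ENCODING => 'FAST_DIFF',
                     COMPRESSION => 'SNAPPY' }
enable 'EVENT_LOG'
# Rewrite existing store files so old data is re-encoded too:
major_compact 'EVENT_LOG'
EOF
```

Without the major compaction, only newly flushed files get the new encoding; the old blocks stay as they were written.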
