As a matter of fact, it looks like Phoenix tables are created by default with DATA_BLOCK_ENCODING = 'FAST_DIFF', which is a feature I just found out about.
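For anyone who wants to verify this on their own cluster, the column family attributes (including DATA_BLOCK_ENCODING) can be checked from the HBase shell; the table name below is just a placeholder, and note that Phoenix names its default column family '0':

```
hbase> describe 'MY_PHOENIX_TABLE'
```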
On Tue, Apr 15, 2014 at 12:52 PM, Vladimir Rodionov <[email protected]> wrote:

> Why not use the HBase shell if it supports this? I think this will
> affect only new data; the old data will still be encoded until the next
> compaction.
>
> -Vladimir Rodionov
>
>
> On Tue, Apr 15, 2014 at 12:01 PM, Sean Huo <[email protected]> wrote:
>
>> Speaking of which, is there a way to reset the DATA_BLOCK_ENCODING of an
>> existing Phoenix table?
>> I tried using Phoenix's ALTER TABLE without much success. The HBase shell
>> does support that.
>>
>>
>> On Mon, Apr 14, 2014 at 7:35 PM, James Taylor <[email protected]> wrote:
>>
>>> I agree with Vladimir wrt hotspotting - take a look here at how to salt
>>> your table: http://phoenix.incubator.apache.org/salted.html
>>>
>>> In general, we find salted tables perform better for both reads and
>>> writes.
>>>
>>> James
>>>
>>>
>>> On Monday, April 14, 2014, Vladimir Rodionov <[email protected]> wrote:
>>>
>>>> Your rowkey schema is not efficient; it will result in region hot
>>>> spotting (unless you salt your table).
>>>> DATA_BLOCK_ENCODING is beneficial for both in-memory and on-disk
>>>> storage. Compression is a must as well.
>>>> It is hard to say how block encoding + compression will affect query
>>>> performance in your case, but common sense
>>>> tells us that the more data you keep in the in-memory block cache, the
>>>> better your overall performance. Block encoding
>>>> in your case is a MUST; as for compression - YMMV, but usually Snappy,
>>>> LZO, or LZ4 improves performance as well.
>>>> The block cache in 0.94.8 does not support compression but does support
>>>> block encoding.
>>>>
>>>> -Vladimir Rodionov
>>>>
>>>>
>>>>
>>>> On Mon, Apr 14, 2014 at 6:04 PM, James Taylor <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>> Take a look at Doug Meil's HBase blog here:
>>>>> http://blogs.apache.org/hbase/ as I think that's pretty relevant for
>>>>> Phoenix as well. Also, Mujtaba may be able to provide you with some good
>>>>> guidance.
>>>>> Thanks,
>>>>> James
>>>>>
>>>>>
>>>>> On Tue, Apr 8, 2014 at 2:24 PM, universal localhost <[email protected]> wrote:
>>>>>
>>>>>> Hey All,
>>>>>>
>>>>>> Can someone please suggest optimizations in Phoenix or HBase
>>>>>> that I can benefit from
>>>>>> for the case where *rowkeys are much larger compared to the
>>>>>> column values*.
>>>>>>
>>>>>> In my case, rowkeys contain a timestamp.
>>>>>>
>>>>>> RowKey schema: *DATELOGGED, ORGNAME,* INSTANCEID, TXID
>>>>>> Column TXID is a sequence number.
>>>>>>
>>>>>> I have read a little about DATA_BLOCK_ENCODING and learned that it
>>>>>> can benefit in-cache key compression.
>>>>>>
>>>>>> I am hoping that by using this compression I can get away with large
>>>>>> rowkeys...
>>>>>> Any suggestions on how it will affect query performance?
>>>>>>
>>>>>> --Uni
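For the record, the salting James links to is just a table option at creation time. A minimal sketch against the rowkey schema from the original question - the table name, column types, and bucket count here are assumptions, not anything stated in the thread:

```sql
-- Leading the key with DATELOGGED sends sequential writes to a single
-- region; SALT_BUCKETS prepends a one-byte hash to spread them out.
CREATE TABLE EVENT_LOG (
    DATELOGGED  DATE    NOT NULL,
    ORGNAME     VARCHAR NOT NULL,
    INSTANCEID  VARCHAR NOT NULL,
    TXID        BIGINT  NOT NULL
    CONSTRAINT PK PRIMARY KEY (DATELOGGED, ORGNAME, INSTANCEID, TXID)
) SALT_BUCKETS = 16;
```

A reasonable starting point is one bucket per region server; going much higher fragments scans without adding parallelism.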

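To Sean's question about resetting the encoding on an existing table: per Vladimir's suggestion, a sketch of how this might look from the HBase shell (the table name is a placeholder; '0' is Phoenix's default column family, so substitute your own family name if it differs):

```
hbase> disable 'MY_PHOENIX_TABLE'
hbase> alter 'MY_PHOENIX_TABLE', {NAME => '0', DATA_BLOCK_ENCODING => 'NONE'}
hbase> enable 'MY_PHOENIX_TABLE'
hbase> major_compact 'MY_PHOENIX_TABLE'
```

On 0.94 the table generally has to be disabled before altering unless online schema change is enabled. The major compaction is what actually rewrites the existing HFiles with the new encoding; until it runs, only newly flushed data picks up the change, as Vladimir notes above.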