Your rowkey schema is not efficient; it will result in region hotspotting (unless you salt your table). DATA_BLOCK_ENCODING is beneficial for both in-memory and on-disk storage. Compression is a must as well. It is hard to say how block encoding + compression will affect query performance in your case, but common sense tells us that the more data you keep in the in-memory block cache, the better overall performance you get. Block encoding in your case is a MUST; as for compression, YMMV, but usually Snappy, LZO, or LZ4 improves performance as well. Note that the block cache in 0.94.8 does not support compression, but it does support block encoding.
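As a rough sketch, in Phoenix DDL all three of these could be combined like the example below. The table and column names just follow the schema you described; SALT_BUCKETS=16 is an arbitrary illustration (pick something close to your number of region servers), and FAST_DIFF is only one of the available encodings, though it tends to work well for long, repetitive keys like yours:

    CREATE TABLE event_log (
        datelogged  DATE NOT NULL,
        orgname     VARCHAR NOT NULL,
        instanceid  VARCHAR NOT NULL,
        txid        BIGINT NOT NULL,
        CONSTRAINT pk PRIMARY KEY (datelogged, orgname, instanceid, txid)
    )
    SALT_BUCKETS = 16,                  -- spreads the time-leading key across regions
    DATA_BLOCK_ENCODING = 'FAST_DIFF',  -- delta-encodes the repetitive key prefixes
    COMPRESSION = 'SNAPPY';             -- on-disk compression of data blocks

Salting prepends a hash byte to each key, so writes with a leading timestamp no longer all land on one region; the trade-off is that range scans have to fan out across the buckets, which Phoenix handles for you.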
-Vladimir Rodionov

On Mon, Apr 14, 2014 at 6:04 PM, James Taylor <[email protected]> wrote:

> Hi,
> Take a look at Doug Meil's HBase blog here: http://blogs.apache.org/hbase/
> as I think that's pretty relevant for Phoenix as well. Also, Mujtaba may
> be able to provide you with some good guidance.
> Thanks,
> James
>
> On Tue, Apr 8, 2014 at 2:24 PM, universal localhost
> <[email protected]> wrote:
>
>> Hey All,
>>
>> Can someone please suggest optimizations in Phoenix or HBase that I can
>> benefit from for the case where *rowkeys are much larger than the
>> column values*.
>>
>> In my case, rowkeys contain a timestamp.
>>
>> Rowkey schema: *DATELOGGED, ORGNAME,* INSTANCEID, TXID
>> Column TXID is a sequence number.
>>
>> I have read a little about DATA_BLOCK_ENCODING and learned that it can
>> benefit in-cache key compression.
>>
>> I am hoping that by using this compression I can get away with large
>> rowkeys...
>> Any suggestions on how it will affect query performance?
>>
>> --Uni
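P.S. If the table already exists (e.g. it was created outside Phoenix), roughly the same settings can be applied per column family from the HBase shell. A sketch only: 'mytable' and 'f' are placeholder names, and on 0.94 you usually have to disable the table first unless online schema updates are enabled on your cluster:

    hbase> disable 'mytable'
    hbase> alter 'mytable', {NAME => 'f', DATA_BLOCK_ENCODING => 'FAST_DIFF', COMPRESSION => 'SNAPPY'}
    hbase> enable 'mytable'
    hbase> major_compact 'mytable'  # rewrites existing HFiles with the new encoding/compression

Until a major compaction runs, only newly flushed HFiles pick up the new encoding and compression settings.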
