Your rowkey schema is not efficient; it will result in region hot spotting
(unless you salt your table).
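If you do go the salting route, Phoenix can handle it for you at table-creation
time via SALT_BUCKETS, which prepends a salt byte so monotonically increasing
keys spread across regions. A minimal sketch using the column names from your
schema (the types and bucket count are illustrative assumptions, not a
recommendation):

```sql
-- Hypothetical DDL: SALT_BUCKETS tells Phoenix to prepend a salt byte
-- to every rowkey, spreading sequential DATELOGGED-leading keys across
-- up to 16 regions instead of hammering one.
CREATE TABLE TX_LOG (
    DATELOGGED  DATE NOT NULL,
    ORGNAME     VARCHAR NOT NULL,
    INSTANCEID  VARCHAR NOT NULL,
    TXID        BIGINT NOT NULL
    CONSTRAINT PK PRIMARY KEY (DATELOGGED, ORGNAME, INSTANCEID, TXID)
) SALT_BUCKETS = 16;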
DATA_BLOCK_ENCODING is beneficial for both in-memory and on-disk storage.
Compression is a must as well.
It is hard to say how block encoding + compression will affect query
performance in your case, but common sense
tells us that the more data you keep in the in-memory block cache, the better
your overall performance. Block encoding
in your case is a MUST; as for compression, YMMV, but usually SNAPPY, LZO,
or LZ4 improves performance as well.
The block cache in 0.94.8 does not support compression but does support block
encoding.
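For reference, both attributes are set per column family; from the HBase shell
it would look something like this (table name, family name, and the choice of
FAST_DIFF/SNAPPY are placeholders, and SNAPPY requires the native codec to be
installed on the region servers):

```
alter 'TX_LOG', { NAME => 'cf', DATA_BLOCK_ENCODING => 'FAST_DIFF', COMPRESSION => 'SNAPPY' }
```

After altering, run a major compaction so existing store files get rewritten
with the new encoding and compression.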

-Vladimir Rodionov



On Mon, Apr 14, 2014 at 6:04 PM, James Taylor <[email protected]> wrote:

> Hi,
> Take a look at Doug Meil's HBase blog here: http://blogs.apache.org/hbase/ as
> I think that's pretty relevant for Phoenix as well. Also, Mujtaba may be
> able to provide you with some good guidance.
> Thanks,
> James
>
>
> On Tue, Apr 8, 2014 at 2:24 PM, universal localhost <
> [email protected]> wrote:
>
>> Hey All,
>>
>> Can someone please suggest optimizations in Phoenix or HBase that I can
>> benefit from
>> for the case where *rowkeys are much larger than the column
>> values*.
>>
>> In my case, Rowkeys have timestamp.
>>
>> RowKey schema:  *DATELOGGED, ORGNAME,* INSTANCEID, TXID
>> Column TXID is a sequence number.
>>
>> I have read a little about DATA_BLOCK_ENCODING and learned that it can
>> benefit in-cache key compression.
>>
>> I am hoping that by using this compression I can get away with large
>> rowkeys...
>> Any suggestions on how it will affect the query performance?
>>
>> --Uni
>>
>
>
