Hi Hef,
Have you had a chance to read our Tuning Guide [1] yet? There's a lot of
good, general guidance there. There are some optimizations for write
performance that depend on how you expect/allow your data and schema to
change:
1) Is your data write-once? If so, make sure to declare your table with the
IMMUTABLE_ROWS=true property [2]. That lowers the overhead of secondary
indexes, since Phoenix no longer needs to read the data row (to get the old
value) before writing it.
2) Does your schema only change in an append-only manner? For example, are
columns only added, but never removed? If so, you can declare your table as
APPEND_ONLY_SCHEMA as described here [2].
3) Does your schema never change, or change only rarely at known times? If
so, you can declare an UPDATE_CACHE_FREQUENCY property as described here [2]
to reduce the RPC traffic from metadata lookups.
4) Can you bulk load data [3] and then add or rebuild the index afterwards?
5) Have you investigated using local indexes [4]? They're optimized for
write speed since they ensure that the index data is on the same region
server as the data (i.e. all writes are local to the region server, with no
cross-region-server calls, though there's some overhead at read time).
6) Have you considered not using secondary indexes and just letting your
less common queries be slower?
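Taken together, options 1-3 and 5 are just DDL properties. Here's a minimal
sketch using a hypothetical metrics table (the table name, columns, and the
15-minute cache frequency are illustrative, not something from your setup):

```sql
CREATE TABLE metrics (
    host VARCHAR NOT NULL,
    ts   DATE NOT NULL,
    val  DOUBLE
    CONSTRAINT pk PRIMARY KEY (host, ts))
    IMMUTABLE_ROWS=true,            -- (1) data is write-once
    APPEND_ONLY_SCHEMA=true,        -- (2) columns are only ever added
    UPDATE_CACHE_FREQUENCY=900000;  -- (3) re-check metadata at most every 15 min

-- (5) a local index keeps the index data on the same region server as the data
CREATE LOCAL INDEX metrics_val_idx ON metrics (val);
```

You'd only set the properties that actually match your workload, of course.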

Keep in mind that with secondary indexes you're essentially writing your
data twice, so you should expect your write performance to drop. As usual,
there's a set of tradeoffs that you need to understand and weigh against
your requirements.

Thanks,
James

[1] https://phoenix.apache.org/tuning_guide.html
[2] https://phoenix.apache.org/language/index.html#options
[3] https://phoenix.apache.org/bulk_dataload.html
[4] https://phoenix.apache.org/secondary_indexing.html#Local_Indexes

On Tue, Sep 5, 2017 at 11:48 AM, Josh Elser <els...@apache.org> wrote:

> 500 writes/second seems very low to me. On my wimpy laptop, I can easily
> see over 10K writes/second, depending on the schema.
>
> The first check is to make sure that you have autocommit disabled.
> Otherwise, every update you make via JDBC will trigger an HBase RPC.
> Batching of RPCs to HBase is key to optimal performance via Phoenix.
>
> Regarding #2, unless you have intimate knowledge with how Phoenix writes
> data to HBase, do not investigate this approach.
>
>
> On 9/5/17 5:56 AM, Hef wrote:
>
>> Hi guys,
>> I'm evaluating using Phoenix to replace MySQL for better scalability.
>> The version I'm evaluating is 4.11-HBase-1.2, with some dependencies
>> modified to match CDH5.9 which we are using.
>>
>> The problem I'm having is that write performance to Phoenix via JDBC is
>> too poor, only 500 writes/second, while our data's throughput is almost
>> 50,000/s. My questions are:
>> 1. Is 500/s a normal TPS? How fast can you get in your production
>> environment?
>> 2. Could I write directly into HBase with the mutation API and read from
>> Phoenix? That could be fast, but I don't see the secondary index being
>> created automatically in that case.
>>
>> Regards,
>> Hef
>>
>
