I'm using the same solution Samarth suggested (commit batching). It
brings the latency per single-row upsert down from 50 ms to 5 ms (averaged
per row after batching).
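
For reference, the commit-batching pattern quoted below could be fleshed out
roughly as follows. This is only a sketch under stated assumptions, not the
exact code used in this thread: the class name CommitBatching, the helper
upsertInBatches, and its parameters are hypothetical, and it relies on nothing
beyond the standard java.sql interfaces.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper class; the name is illustrative only.
class CommitBatching {

    /**
     * Upserts every row and commits once per commitSize rows.
     * Returns the number of commits issued.
     */
    public static int upsertInBatches(Connection conn, String upsertSql,
                                      List<Object[]> rows, int commitSize)
            throws SQLException {
        conn.setAutoCommit(false); // keep rows uncommitted until commit() is called
        int commits = 0;
        int pending = 0;
        try (PreparedStatement stmt = conn.prepareStatement(upsertSql)) {
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    stmt.setObject(i + 1, row[i]); // JDBC parameters are 1-based
                }
                stmt.executeUpdate();
                pending++;
                if (pending == commitSize) {
                    conn.commit(); // flush the current batch
                    commits++;
                    pending = 0;
                }
            }
            if (pending > 0) {
                conn.commit(); // commit the last, partial batch
                commits++;
            }
        }
        return commits;
    }
}
```

A call might look like
upsertInBatches(conn, "UPSERT INTO MY_TABLE (ID, VAL) VALUES (?, ?)", rows, 1000),
where MY_TABLE and its columns are placeholders. As noted in the quoted reply,
commitSize should stay moderate because uncommitted rows are held in client
memory.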

On Wed, Aug 19, 2015 at 7:11 PM, Samarth Jain <[email protected]>
wrote:

> You can do this via Phoenix with something like this:
>
> try (Connection conn = DriverManager.getConnection(url)) {
>     conn.setAutoCommit(false);
>     int batchSize = 0;
>     // Number of rows to commit per batch; change this value according to your needs.
>     int commitSize = 1000;
>     while (there are records to upsert) {
>         stmt.executeUpdate();
>         batchSize++;
>         if (batchSize % commitSize == 0) {
>             conn.commit();
>         }
>     }
>     conn.commit(); // commit the last batch of records
> }
>
> You don't want commitSize to be too large, since the Phoenix client keeps
> uncommitted rows in memory until they are sent over to HBase.
>
>
>
> On Wed, Aug 19, 2015 at 3:05 PM, Serega Sheypak <[email protected]>
> wrote:
>
>> I would suggest using
>>
>> https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/BufferedMutator.html
>> instead of a list of Puts, and sharing the BufferedMutator across threads
>> (it's thread-safe). I reduced my response time from 30-40 ms to 4 ms by
>> using BufferedMutator. It also sends mutations asynchronously. :)
>>
>> I ran into the same problem: I can't force Phoenix to buffer upserts on the
>> client side and then send them to HBase in small batches.
>>
>> 2015-08-19 19:40 GMT+02:00 jeremy p <[email protected]>:
>>
>>> Hello all,
>>>
>>> I need to do true batch updates to a Phoenix table.  By this, I mean
>>> sending a bunch of updates to HBase as part of a single request.  The HBase
>>> API offers this behavior with the Table.put(List<Put> puts) method.  I
>>> noticed PhoenixStatement exposes an executeBatch() method.  However, this
>>> method just executes the batched statements one by one, so it will not
>>> deliver the performance that the HBase API offers through its batch put
>>> method.
>>>
>>> What is the best way for me to do true batch updates to a Phoenix
>>> table?  I need to do this programmatically, so I cannot use the command
>>> line bulk insert utility.
>>>
>>> --Jeremy
>>>
>>
>>
>
