yes

On Mon, Sep 7, 2015 at 4:40 AM, Serega Sheypak <[email protected]> wrote:

> Hi, so you hold a Phoenix java.sql.Connection for each thread as a
> thread-local variable and don't get any problems, correct?
>
> 2015-09-07 6:43 GMT+02:00 ALEX K <[email protected]>:
>
>> Serega,
>>
>> I haven't seen any issues so far with this approach (keeping connections
>> open in thread-locals, one connection per thread).
>>
>> We send a stream of Kafka messages to HBase in the following way:
>> - Each "saver" thread initializes two connections: one for upserts and
>> one for ALTER statements (for multitenancy, each thread keeps a local
>> Map<String, Connection> with one connection per tenant).
>> - Get a message from the inbound Kafka queue, deserialize it (Avro), and
>> save it to HBase with a java.sql.PreparedStatement and a Phoenix UPSERT
>> or CREATE/ALTER.*
>> - connection.commit() every 1000 or so upserts, or per "transaction",
>> with connection autocommit set to false (this brings latency per saved
>> row down to single-digit milliseconds).
>> - In case of exceptions, retry the save; if it still fails, check/refresh
>> the connections.
>>
>> * We pause upserts in all threads (Thread.yield on detecting a lock) if
>> there is an incoming message with a metadata change (CREATE/ALTER table);
>> otherwise Phoenix would throw an exception.
>>
>> Alex
>>
>> On Sun, Sep 6, 2015 at 12:36 PM, Serega Sheypak <[email protected]> wrote:
>>
>>> Hi, the approach above doesn't fit a web app. Multiple simultaneous
>>> upserts come from different threads, so the only option is to put the
>>> connection into a thread-local and use one connection per thread.
>>>
>>> try (Connection conn = DriverManager.getConnection(url)) {
>>>     conn.setAutoCommit(false);
>>>     int batchSize = 0;
>>>     int commitSize = 1000; // rows to commit per batch; tune to your needs
>>>     try (PreparedStatement stmt = conn.prepareStatement(upsert)) {
>>>         stmt.set ...
>>>         while (there are records to upsert) {
>>>             stmt.executeUpdate();
>>>             batchSize++;
>>>             if (batchSize % commitSize == 0) {
>>>                 conn.commit();
>>>             }
>>>         }
>>>     }
>>>     conn.commit(); // commit the last batch of records
>>> }
>>>
>>> 2015-09-03 22:28 GMT+02:00 Jaime Solano <[email protected]>:
>>>
>>>> Thanks for the explanation, Samarth!
>>>>
>>>> On Thu, Sep 3, 2015 at 4:11 PM, Samarth Jain <[email protected]> wrote:
>>>>
>>>>> Hi Jaime,
>>>>>
>>>>> Phoenix is an embedded JDBC driver, i.e. the connections are not meant
>>>>> to be pooled. Creating Phoenix connections is a lightweight operation,
>>>>> because all Phoenix connections to a cluster from the Phoenix JDBC
>>>>> driver use the same underlying HConnection.
>>>>>
>>>>> If you end up pooling Phoenix connections, you will likely run into
>>>>> unexpected issues: a request may leave a Phoenix connection in a state
>>>>> that is not kosher for the next request to use.
>>>>>
>>>>> On Thu, Sep 3, 2015 at 12:58 PM, Jaime Solano <[email protected]> wrote:
>>>>>
>>>>>> Hi Samarth,
>>>>>> In our project we were thinking about using the Tomcat JDBC
>>>>>> Connection Pool to handle Phoenix connections. You're saying this
>>>>>> type of approach should be avoided? What other approach should be
>>>>>> followed if, in the scenario Serega described, you experience a heavy
>>>>>> load of users all trying to upsert at the same time (and therefore a
>>>>>> lot of connections)?
>>>>>> Also, can you expand a little more on the implications of having a
>>>>>> pooling mechanism for Phoenix connections?
>>>>>> Thanks in advance!
>>>>>> -Jaime
>>>>>>
>>>>>> On Thu, Sep 3, 2015 at 3:35 PM, Samarth Jain <[email protected]> wrote:
>>>>>>
>>>>>>> Yes. PhoenixConnection implements java.sql.Connection.
>>>>>>>
>>>>>>> On Thu, Sep 3, 2015 at 12:34 PM, Serega Sheypak <[email protected]> wrote:
>>>>>>>
>>>>>>>> > Phoenix doesn't cache connections. You shouldn't pool them and
>>>>>>>> > you shouldn't share them with multiple threads.
>>>>>>>> We are talking about java.sql.Connection, right?
>>>>>>>>
>>>>>>>> 2015-09-03 21:26 GMT+02:00 Samarth Jain <[email protected]>:
>>>>>>>>
>>>>>>>>> Your pattern is correct.
>>>>>>>>>
>>>>>>>>> Phoenix doesn't cache connections. You shouldn't pool them and
>>>>>>>>> you shouldn't share them with multiple threads.
>>>>>>>>>
>>>>>>>>> For batching upserts, you can do something like this via Phoenix:
>>>>>>>>>
>>>>>>>>> try (Connection conn = DriverManager.getConnection(url)) {
>>>>>>>>>     conn.setAutoCommit(false);
>>>>>>>>>     int batchSize = 0;
>>>>>>>>>     int commitSize = 1000; // rows to commit per batch; tune to your needs
>>>>>>>>>     try (PreparedStatement stmt = conn.prepareStatement(upsert)) {
>>>>>>>>>         stmt.set ...
>>>>>>>>>         while (there are records to upsert) {
>>>>>>>>>             stmt.executeUpdate();
>>>>>>>>>             batchSize++;
>>>>>>>>>             if (batchSize % commitSize == 0) {
>>>>>>>>>                 conn.commit();
>>>>>>>>>             }
>>>>>>>>>         }
>>>>>>>>>     }
>>>>>>>>>     conn.commit(); // commit the last batch of records
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> You don't want commitSize to be too large, since the Phoenix
>>>>>>>>> client keeps the uncommitted rows in memory until they are sent
>>>>>>>>> over to HBase.
>>>>>>>>>
>>>>>>>>> On Thu, Sep 3, 2015 at 12:19 PM, Serega Sheypak <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi, I'm using Phoenix in a Java web application. The app does
>>>>>>>>>> upserts or selects by primary key.
>>>>>>>>>> What is the right pattern to do it?
>>>>>>>>>> - Create a new connection for each request
>>>>>>>>>> - Prepare and execute the statement
>>>>>>>>>> - Close the statement
>>>>>>>>>> - Close the connection
>>>>>>>>>>
>>>>>>>>>> Does Phoenix cache connections internally? What is the right way
>>>>>>>>>> to batch upserts in this case?
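The batching sketch in the thread can be filled out as compilable Java. Everything below that is not in the original messages is an assumption: the table and columns are hypothetical, and the Connection is passed in by the caller (in a real web app you would obtain it per request or per thread from DriverManager.getConnection("jdbc:phoenix:<zk-quorum>")) so the commit cadence can be exercised without a live Phoenix cluster.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class BatchUpsert {
    // Hypothetical schema, for illustration only.
    static final String UPSERT = "UPSERT INTO metrics (id, val) VALUES (?, ?)";

    // Upsert all records, committing every commitSize rows as suggested in
    // the thread. The caller supplies the Connection.
    static void save(Connection conn, List<long[]> records, int commitSize)
            throws SQLException {
        conn.setAutoCommit(false);             // commit manually, in batches
        int batchSize = 0;
        try (PreparedStatement stmt = conn.prepareStatement(UPSERT)) {
            for (long[] rec : records) {
                stmt.setLong(1, rec[0]);
                stmt.setLong(2, rec[1]);
                stmt.executeUpdate();          // buffered client-side until commit
                if (++batchSize % commitSize == 0) {
                    conn.commit();             // flush a full batch to HBase
                }
            }
        }
        conn.commit();                         // commit the trailing partial batch
    }
}
```

Per Samarth's note, keep commitSize moderate: the Phoenix client holds uncommitted rows in memory until they are sent to HBase.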
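The per-thread connection pattern the thread converges on (one connection per thread via a ThreadLocal, no pooling, no sharing) can be sketched as below. The class name is made up, and the connection factory is injected as an assumption so the reuse behavior can be checked without a Phoenix cluster; in a web app you would pass () -> DriverManager.getConnection(url).

```java
import java.sql.Connection;
import java.util.concurrent.Callable;

// Hypothetical helper: keeps one open Connection per thread, as described in
// the thread, instead of pooling or sharing connections across threads.
class PerThreadConnections {
    private final Callable<Connection> factory; // e.g. () -> DriverManager.getConnection(url)
    private final ThreadLocal<Connection> local = new ThreadLocal<>();

    PerThreadConnections(Callable<Connection> factory) {
        this.factory = factory;
    }

    // Returns this thread's connection, (re)opening it if missing or closed.
    Connection get() throws Exception {
        Connection c = local.get();
        if (c == null || c.isClosed()) {
            c = factory.call();         // cheap in Phoenix: shares one HConnection
            c.setAutoCommit(false);     // commit in batches, per the thread
            local.set(c);
        }
        return c;
    }
}
```

Each thread that calls get() receives its own connection; on exceptions a caller can close the broken connection and call get() again to refresh it, matching Alex's check/refresh step.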
