Hi, so you hold a Phoenix java.sql.Connection for each thread as a thread-local variable and don't get any problems, correct?
2015-09-07 6:43 GMT+02:00 ALEX K <[email protected]>:

> Serega,
>
> I haven't seen any issues so far with this approach (keeping connections
> open in thread-locals, one connection per thread).
>
> We send a stream of Kafka messages to HBase in the following way:
> - each "saver" thread initializes two connections - one for upserts and
>   one for ALTER statements (for multitenancy, each thread keeps a local
>   Map<String, Connection> with one connection per tenant)
> - get a message from the inbound Kafka queue, deserialize it (Avro) and
>   save it to HBase with a java.sql.PreparedStatement and a Phoenix UPSERT
>   or CREATE/ALTER*
> - connection.commit() every 1000 or so upserts, or per "transaction",
>   with connection autocommit set to false (this brings latency per row
>   saved down to single-digit milliseconds)
> - in case of exceptions, retry the save; if it still fails, check/refresh
>   the connections
>
> * We pause upserts in all threads (Thread.yield on detecting a lock) if
> there is an incoming message with a metadata change (CREATE/ALTER TABLE);
> otherwise Phoenix would throw an exception.
>
> Alex
>
> On Sun, Sep 6, 2015 at 12:36 PM, Serega Sheypak <[email protected]>
> wrote:
>
>> Hi, the approach above doesn't fit a web app. Multiple simultaneous
>> upserts come from different threads.
>> So the only option is to put the connection into a thread-local and use
>> one connection per thread.
>>
>> try (Connection conn = DriverManager.getConnection(url)) {
>>     conn.setAutoCommit(false);
>>     int batchSize = 0;
>>     int commitSize = 1000; // number of rows to commit per batch;
>>                            // change this value according to your needs
>>     try (PreparedStatement stmt = conn.prepareStatement(upsert)) {
>>         while (there are records to upsert) {
>>             stmt.set...; // bind this record's values
>>             stmt.executeUpdate();
>>             batchSize++;
>>             if (batchSize % commitSize == 0) {
>>                 conn.commit();
>>             }
>>         }
>>         conn.commit(); // commit the last batch of records
>>     }
>> }
>>
>> 2015-09-03 22:28 GMT+02:00 Jaime Solano <[email protected]>:
>>
>>> Thanks for the explanation, Samarth!
>>>
>>> On Thu, Sep 3, 2015 at 4:11 PM, Samarth Jain <[email protected]> wrote:
>>>
>>>> Hi Jaime,
>>>>
>>>> Phoenix is an embedded JDBC driver, i.e. the connections are not meant
>>>> to be pooled. Creating Phoenix connections is a lightweight operation,
>>>> because all Phoenix connections to a cluster from the Phoenix JDBC
>>>> driver use the same underlying HConnection.
>>>>
>>>> If you end up pooling Phoenix connections, you will likely run into
>>>> unexpected issues: a request may leave a Phoenix connection in a state
>>>> that is not kosher for the next request to use.
>>>>
>>>> On Thu, Sep 3, 2015 at 12:58 PM, Jaime Solano <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Samarth,
>>>>> In our project we were thinking about using the Tomcat JDBC Connection
>>>>> Pool to handle Phoenix connections. You're saying this type of
>>>>> approach should be avoided? What other approach should be followed if,
>>>>> in the scenario Serega described, you experience a heavy load of
>>>>> users, all trying to upsert at the same time (and therefore a lot of
>>>>> connections)?
>>>>> Also, can you expand a little more on the implications of having a
>>>>> pooling mechanism for Phoenix connections?
>>>>> Thanks in advance!
>>>>> -Jaime
>>>>>
>>>>> On Thu, Sep 3, 2015 at 3:35 PM, Samarth Jain <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Yes. PhoenixConnection implements java.sql.Connection.
>>>>>>
>>>>>> On Thu, Sep 3, 2015 at 12:34 PM, Serega Sheypak <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> > Phoenix doesn't cache connections. You shouldn't pool them and
>>>>>>> > you shouldn't share them with multiple threads.
>>>>>>> We are talking about java.sql.Connection, right?
>>>>>>>
>>>>>>> 2015-09-03 21:26 GMT+02:00 Samarth Jain <[email protected]>:
>>>>>>>
>>>>>>>> Your pattern is correct.
>>>>>>>>
>>>>>>>> Phoenix doesn't cache connections. You shouldn't pool them and you
>>>>>>>> shouldn't share them with multiple threads.
>>>>>>>>
>>>>>>>> For batching upserts, you can do something like this via Phoenix:
>>>>>>>>
>>>>>>>> try (Connection conn = DriverManager.getConnection(url)) {
>>>>>>>>     conn.setAutoCommit(false);
>>>>>>>>     int batchSize = 0;
>>>>>>>>     int commitSize = 1000; // number of rows to commit per batch;
>>>>>>>>                            // change this value to your needs
>>>>>>>>     try (PreparedStatement stmt = conn.prepareStatement(upsert)) {
>>>>>>>>         while (there are records to upsert) {
>>>>>>>>             stmt.set...; // bind this record's values
>>>>>>>>             stmt.executeUpdate();
>>>>>>>>             batchSize++;
>>>>>>>>             if (batchSize % commitSize == 0) {
>>>>>>>>                 conn.commit();
>>>>>>>>             }
>>>>>>>>         }
>>>>>>>>         conn.commit(); // commit the last batch of records
>>>>>>>>     }
>>>>>>>> }
>>>>>>>>
>>>>>>>> You don't want commitSize to be too large, since the Phoenix client
>>>>>>>> keeps the uncommitted rows in memory until they are sent over to
>>>>>>>> HBase.
>>>>>>>>
>>>>>>>> On Thu, Sep 3, 2015 at 12:19 PM, Serega Sheypak <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi, I'm using Phoenix in a Java web application. The app does
>>>>>>>>> upserts or selects by primary key.
>>>>>>>>> What is the right pattern for this?
>>>>>>>>> - I create a new connection for each request
>>>>>>>>> - prepare and execute the statement
>>>>>>>>> - close the statement
>>>>>>>>> - close the connection
>>>>>>>>>
>>>>>>>>> Does Phoenix cache connections internally? What is the right way
>>>>>>>>> to batch upserts in this case?
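Below is a minimal sketch of the thread-local, one-connection-per-thread pattern Alex describes, including his per-tenant Map<String, Connection>. The class name, the use of Phoenix's "TenantId" connection property, and the refresh() helper are illustrative assumptions, not taken from the thread:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;

    // Hypothetical holder: one Phoenix connection per thread and per tenant,
    // never pooled and never shared across threads.
    public class PhoenixConnectionHolder {

        private final String url; // e.g. "jdbc:phoenix:zk-host:2181" (made-up host)

        // Mirrors the per-tenant Map<String, Connection> each saver thread keeps.
        private final ThreadLocal<Map<String, Connection>> perThread =
                ThreadLocal.withInitial(HashMap::new);

        public PhoenixConnectionHolder(String url) {
            this.url = url;
        }

        public Connection get(String tenantId) throws SQLException {
            Map<String, Connection> conns = perThread.get();
            Connection conn = conns.get(tenantId);
            if (conn == null || conn.isClosed()) {
                // Assumption: tenants are isolated via Phoenix's "TenantId"
                // connection property; the thread doesn't say how this was done.
                Properties props = new Properties();
                props.setProperty("TenantId", tenantId);
                conn = DriverManager.getConnection(url, props);
                conn.setAutoCommit(false); // commit in batches, as advised above
                conns.put(tenantId, conn);
            }
            return conn;
        }

        // The "check/refresh connections" step after a failed retry: drop the
        // connection so the next get() re-creates it.
        public void refresh(String tenantId) throws SQLException {
            Connection conn = perThread.get().remove(tenantId);
            if (conn != null && !conn.isClosed()) {
                conn.close();
            }
        }
    }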

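And a self-contained version of Samarth's batching snippet with the pseudocode filled in; the EVENTS table, its columns, and the Event record are made up for illustration:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.List;

    public class BatchUpsertExample {

        // Hypothetical row type standing in for "records to upsert".
        public record Event(long id, String payload) {}

        // Assumes a table like:
        // CREATE TABLE EVENTS (ID BIGINT PRIMARY KEY, PAYLOAD VARCHAR)
        public static void save(String url, List<Event> events) throws SQLException {
            String upsert = "UPSERT INTO EVENTS (ID, PAYLOAD) VALUES (?, ?)";
            try (Connection conn = DriverManager.getConnection(url)) {
                conn.setAutoCommit(false);
                int commitSize = 1000; // keep this modest: uncommitted rows are
                                       // buffered in client memory until commit()
                int batchSize = 0;
                try (PreparedStatement stmt = conn.prepareStatement(upsert)) {
                    for (Event e : events) {
                        stmt.setLong(1, e.id());
                        stmt.setString(2, e.payload());
                        stmt.executeUpdate();
                        batchSize++;
                        if (batchSize % commitSize == 0) {
                            conn.commit();
                        }
                    }
                }
                conn.commit(); // commit the final partial batch
            }
        }
    }

Each web request or saver thread would call save() with its own connection; per the thread, creating the connection itself is cheap, so there is no need to pool it.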