Hi, so you hold a Phoenix java.sql.Connection for each thread as a thread-local variable and don't get any problems, correct?
2015-09-07 6:43 GMT+02:00 ALEX K <[email protected]>:

> Serega,
>
> I haven't seen any issues so far with this approach (keeping connections
> open in thread-locals, one connection per thread).
>
> We send a stream of Kafka messages to HBase in the following way:
> - each "saver" thread initializes two connections - one for upserts and
>   one for ALTER statements (for multitenancy, each thread keeps a local
>   Map<String, Connection> with one connection per tenant)
> - get a message from the inbound Kafka queue, deserialize it (Avro) and
>   save it to HBase with a java.sql.PreparedStatement and a Phoenix UPSERT
>   or CREATE/ALTER*
> - connection.commit() every 1000 or so upserts, or per "transaction",
>   with connection autocommit set to false (this brings latency per row
>   saved down to single-digit milliseconds)
> - in case of exceptions, retry the save; if it still fails, check/refresh
>   the connections
>
> * We pause upserts in all threads (Thread.yield on detecting a lock) if
> there is an incoming message with a metadata change (CREATE/ALTER TABLE);
> otherwise Phoenix would throw an exception.
>
> Alex
>
> On Sun, Sep 6, 2015 at 12:36 PM, Serega Sheypak <[email protected]>
> wrote:
>
>> Hi, the approach above doesn't fit a web app. Multiple simultaneous
>> upserts come from different threads.
>> So the only option is to put the connection into a thread-local and use
>> one connection per thread.
>>
>> try (Connection conn = DriverManager.getConnection(url)) {
>>     conn.setAutoCommit(false);
>>     int batchSize = 0;
>>     int commitSize = 1000; // number of rows to commit per batch;
>>                            // change this value according to your needs
>>     try (PreparedStatement stmt = conn.prepareStatement(upsert)) {
>>         while (there are records to upsert) {
>>             stmt.set...; // bind this record's values
>>             stmt.executeUpdate();
>>             batchSize++;
>>             if (batchSize % commitSize == 0) {
>>                 conn.commit();
>>             }
>>         }
>>         conn.commit(); // commit the last batch of records
>>     }
>> }
>>
>> 2015-09-03 22:28 GMT+02:00 Jaime Solano <[email protected]>:
>>
>>> Thanks for the explanation, Samarth!
>>>
>>> On Thu, Sep 3, 2015 at 4:11 PM, Samarth Jain <[email protected]> wrote:
>>>
>>>> Hi Jaime,
>>>>
>>>> Phoenix is an embedded JDBC driver, i.e. the connections are not meant
>>>> to be pooled. Creating Phoenix connections is a lightweight operation,
>>>> because all Phoenix connections to a cluster from the Phoenix JDBC
>>>> driver use the same underlying HConnection.
>>>>
>>>> If you end up pooling Phoenix connections, you will likely run into
>>>> unexpected issues: a request may leave a Phoenix connection in a state
>>>> that is not kosher for the next request to use.
>>>>
>>>> On Thu, Sep 3, 2015 at 12:58 PM, Jaime Solano <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Samarth,
>>>>> In our project we were thinking about using the Tomcat JDBC Connection
>>>>> Pool to handle Phoenix connections. You're saying this type of
>>>>> approach should be avoided? What other approach should be followed if,
>>>>> in the scenario Serega described, you experience a heavy load of
>>>>> users, all trying to upsert at the same time (and therefore a lot of
>>>>> connections)?
>>>>> Also, can you expand a little more on the implications of having a
>>>>> pooling mechanism for Phoenix connections?
>>>>> Thanks in advance!
>>>>> -Jaime
>>>>>
>>>>> On Thu, Sep 3, 2015 at 3:35 PM, Samarth Jain <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Yes. PhoenixConnection implements java.sql.Connection.
>>>>>>
>>>>>> On Thu, Sep 3, 2015 at 12:34 PM, Serega Sheypak <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> > Phoenix doesn't cache connections. You shouldn't pool them and
>>>>>>> > you shouldn't share them with multiple threads.
>>>>>>> We are talking about java.sql.Connection, right?
>>>>>>>
>>>>>>> 2015-09-03 21:26 GMT+02:00 Samarth Jain <[email protected]>:
>>>>>>>
>>>>>>>> Your pattern is correct.
>>>>>>>>
>>>>>>>> Phoenix doesn't cache connections. You shouldn't pool them and you
>>>>>>>> shouldn't share them with multiple threads.
>>>>>>>>
>>>>>>>> For batching upserts, you can do something like this via Phoenix:
>>>>>>>>
>>>>>>>> try (Connection conn = DriverManager.getConnection(url)) {
>>>>>>>>     conn.setAutoCommit(false);
>>>>>>>>     int batchSize = 0;
>>>>>>>>     int commitSize = 1000; // number of rows to commit per batch;
>>>>>>>>                            // change this value to your needs
>>>>>>>>     try (PreparedStatement stmt = conn.prepareStatement(upsert)) {
>>>>>>>>         while (there are records to upsert) {
>>>>>>>>             stmt.set...; // bind this record's values
>>>>>>>>             stmt.executeUpdate();
>>>>>>>>             batchSize++;
>>>>>>>>             if (batchSize % commitSize == 0) {
>>>>>>>>                 conn.commit();
>>>>>>>>             }
>>>>>>>>         }
>>>>>>>>         conn.commit(); // commit the last batch of records
>>>>>>>>     }
>>>>>>>> }
>>>>>>>>
>>>>>>>> You don't want commitSize to be too large, since the Phoenix client
>>>>>>>> keeps the uncommitted rows in memory until they are sent over to
>>>>>>>> HBase.
>>>>>>>>
>>>>>>>> On Thu, Sep 3, 2015 at 12:19 PM, Serega Sheypak <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi, I'm using Phoenix in a Java web application. The app does
>>>>>>>>> upserts or selects by primary key.
>>>>>>>>> What is the right pattern for this?
>>>>>>>>> - I create a new connection for each request
>>>>>>>>> - prepare and execute the statement
>>>>>>>>> - close the statement
>>>>>>>>> - close the connection
>>>>>>>>>
>>>>>>>>> Does Phoenix cache connections internally? What is the right way
>>>>>>>>> to batch upserts in this case?
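Below is a minimal sketch of the thread-local, one-connection-per-thread pattern Alex describes, including his per-tenant Map<String, Connection>. The class name, the use of Phoenix's "TenantId" connection property, and the refresh() helper are illustrative assumptions, not taken from the thread:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;

    // Hypothetical holder: one Phoenix connection per thread and per tenant,
    // never pooled and never shared across threads.
    public class PhoenixConnectionHolder {

        private final String url; // e.g. "jdbc:phoenix:zk-host:2181" (made-up host)

        // Mirrors the per-tenant Map<String, Connection> each saver thread keeps.
        private final ThreadLocal<Map<String, Connection>> perThread =
                ThreadLocal.withInitial(HashMap::new);

        public PhoenixConnectionHolder(String url) {
            this.url = url;
        }

        public Connection get(String tenantId) throws SQLException {
            Map<String, Connection> conns = perThread.get();
            Connection conn = conns.get(tenantId);
            if (conn == null || conn.isClosed()) {
                // Assumption: tenants are isolated via Phoenix's "TenantId"
                // connection property; the thread doesn't say how this was done.
                Properties props = new Properties();
                props.setProperty("TenantId", tenantId);
                conn = DriverManager.getConnection(url, props);
                conn.setAutoCommit(false); // commit in batches, as advised above
                conns.put(tenantId, conn);
            }
            return conn;
        }

        // The "check/refresh connections" step after a failed retry: drop the
        // connection so the next get() re-creates it.
        public void refresh(String tenantId) throws SQLException {
            Connection conn = perThread.get().remove(tenantId);
            if (conn != null && !conn.isClosed()) {
                conn.close();
            }
        }
    }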

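And a self-contained version of Samarth's batching snippet with the pseudocode filled in; the EVENTS table, its columns, and the Event record are made up for illustration:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.List;

    public class BatchUpsertExample {

        // Hypothetical row type standing in for "records to upsert".
        public record Event(long id, String payload) {}

        // Assumes a table like:
        // CREATE TABLE EVENTS (ID BIGINT PRIMARY KEY, PAYLOAD VARCHAR)
        public static void save(String url, List<Event> events) throws SQLException {
            String upsert = "UPSERT INTO EVENTS (ID, PAYLOAD) VALUES (?, ?)";
            try (Connection conn = DriverManager.getConnection(url)) {
                conn.setAutoCommit(false);
                int commitSize = 1000; // keep this modest: uncommitted rows are
                                       // buffered in client memory until commit()
                int batchSize = 0;
                try (PreparedStatement stmt = conn.prepareStatement(upsert)) {
                    for (Event e : events) {
                        stmt.setLong(1, e.id());
                        stmt.setString(2, e.payload());
                        stmt.executeUpdate();
                        batchSize++;
                        if (batchSize % commitSize == 0) {
                            conn.commit();
                        }
                    }
                }
                conn.commit(); // commit the final partial batch
            }
        }
    }

Each web request or saver thread would call save() with its own connection; per the thread, creating the connection itself is cheap, so there is no need to pool it.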