Thanks for the quick response, James. I'll try out some stuff.

On Sun, Oct 2, 2016 at 5:00 PM, James Taylor <jamestay...@apache.org> wrote:

> Option #2 is fine. Connections are cheap in Phoenix.
>
> On Sunday, October 2, 2016, anil gupta <anilgupt...@gmail.com> wrote:
>
>> Hi James,
>>
>> There is a high possibility that we are sharing a connection among
>> multiple threads. This MR job is fairly complicated because we spin up a
>> separate thread (to kill the download if it doesn't complete in a
>> prespecified time) within the Mapper to perform image downloads from the
>> internet.
>>
>> #1. One solution would be to make the method that does the upsert
>> synchronize on the connection (or the PreparedStatement object). We don't
>> write to the Phoenix table at a very high throughput because it only logs
>> errors. What do you think about that?
>>
>> #2. I will need to create a connection in every thread. But, AFAIK,
>> creating a connection every time and then the PreparedStatement is
>> expensive. Right?
>>
>> Please let me know if there is any better approach that I am missing.
>>
>> On Sun, Oct 2, 2016 at 3:50 PM, James Taylor <jamestay...@apache.org>
>> wrote:
>>
>>> Hi Anil,
>>> Make sure you're not sharing the same Connection between multiple
>>> threads, as it's not thread safe.
>>> Thanks,
>>> James
>>>
>>> On Sunday, October 2, 2016, anil gupta <anilgupt...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> We are running HDP 2.3.4 (HBase 1.1 and Phoenix 4.4). I have a MapReduce
>>>> job that's writing data to a very simple Phoenix table. We intermittently
>>>> get the following exception, and because of it our job fails:
>>>>
>>>> java.util.ConcurrentModificationException
>>>>   at java.util.HashMap$HashIterator.remove(HashMap.java:944)
>>>>   at org.apache.phoenix.execute.MutationState.commit(MutationState.java:472)
>>>>   at org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:461)
>>>>   at org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:458)
>>>>   at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>>>>   at org.apache.phoenix.jdbc.PhoenixConnection.commit(PhoenixConnection.java:458)
>>>>   at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:308)
>>>>   at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:297)
>>>>   at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>>>>   at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:295)
>>>>   at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:200)
>>>>
>>>> We are running these upserts as part of code in a Mapper that executes as
>>>> part of ChainReducer. One problem I noticed is that we were instantiating
>>>> a PreparedStatement every time we did an upsert (conn == Connection
>>>> object):
>>>>
>>>> conn.prepareStatement(TcErrorWritable.buildUpsertNewRowStatement(TC_DOWNLOAD_ERRORS_TABLE));
>>>>
>>>> This is the only line that seems awkward to me in that code. We have
>>>> other projects writing to Phoenix at a much higher throughput and volume
>>>> of data, but we never ran into this problem. Can anyone provide more
>>>> details on why we are getting a ConcurrentModificationException while
>>>> doing upserts?
>>>>
>>>> --
>>>> Thanks & Regards,
>>>> Anil Gupta
>>
>> --
>> Thanks & Regards,
>> Anil Gupta

--
Thanks & Regards,
Anil Gupta
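
For anyone who finds this thread later: the top frame of the trace (HashMap$HashIterator.remove) is what you would expect when one thread adds to a HashMap while another is partway through iterating over it and removing entries. Below is a minimal, single-threaded sketch of that interleaving using only the JDK. It is not Phoenix code. The class name SharedMapDemo, the method commitThrows, and the row keys are made up for illustration, and the plain HashMap is only assumed to stand in for the per-connection state that MutationState.commit walks over.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Phoenix.
class SharedMapDemo {

    // Simulates a "commit" that iterates pending rows and removes them,
    // while a second writer adds a row in the middle of that iteration.
    // Returns true if the iterator's remove() throws
    // ConcurrentModificationException.
    static boolean commitThrows(boolean shareState) {
        Map<String, String> committerState = new HashMap<>();
        // A shared connection means both writers touch the same map;
        // a connection per thread means each writer has its own map.
        Map<String, String> writerState =
                shareState ? committerState : new HashMap<>();

        committerState.put("row1", "err1");
        Iterator<Map.Entry<String, String>> it =
                committerState.entrySet().iterator();
        it.next(); // the commit has started walking the pending rows

        // Stands in for the other thread's upsert landing mid-commit.
        writerState.put("row2", "err2");

        try {
            it.remove(); // same call as the top frame of the trace
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("shared state throws:   " + commitThrows(true));
        System.out.println("separate state throws: " + commitThrows(false));
    }
}
```

With shared state the remove() fails because the map was structurally modified after the iterator was created. With separate state it succeeds, which is the same effect as giving each thread its own Connection (option #2 above). In the real job the interleaving depends on thread timing, which would explain why the failure is intermittent.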