Hi James,

There is a high possibility that we are sharing a connection among multiple threads. This MR job is fairly complicated because, within the Mapper, we spin up a separate thread to perform image downloads from the internet (so that we can kill the download if it doesn't complete within a prespecified time).

#1. One solution would be to make the method that does the upsert synchronize on the connection (or the PreparedStatement object). We don't write at a very high throughput to the Phoenix table because it only logs errors. What do you think about that?
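
A minimal sketch of what #1 could look like, using only standard JDBC calls. The class name SynchronizedErrorLogger, the method logError, and the two bind parameters are hypothetical and chosen for illustration; they are not taken from our actual job, and the real upsert statement may have a different column list.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/**
 * Hypothetical helper (not from the real job): serializes every upsert and
 * commit on one shared Connection so that only one thread touches it at a time.
 */
final class SynchronizedErrorLogger {
    private final Connection conn;   // shared by the mapper thread and the download threads
    private final String upsertSql;  // assumed to contain exactly two '?' parameters

    SynchronizedErrorLogger(Connection conn, String upsertSql) {
        this.conn = conn;
        this.upsertSql = upsertSql;
    }

    /** Writes one error row; callers on different threads are serialized. */
    void logError(String rowKey, String message) throws SQLException {
        // Lock on the connection itself so the statement execution and the
        // commit happen as one critical section.
        synchronized (conn) {
            try (PreparedStatement ps = conn.prepareStatement(upsertSql)) {
                ps.setString(1, rowKey);
                ps.setString(2, message);
                ps.executeUpdate();
                // With auto-commit on, executeUpdate already commits.
                if (!conn.getAutoCommit()) {
                    conn.commit();
                }
            }
        }
    }
}
```

This only helps if every thread, including the download thread, goes through logError; any other direct use of the same Connection would bypass the lock.
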

#2. The other option is to create a connection in every thread. But, AFAIK, creating a connection every time and then the PreparedStatement is expensive. Right?

Please let me know if there is a better approach that I am missing.

On Sun, Oct 2, 2016 at 3:50 PM, James Taylor <jamestay...@apache.org> wrote:

> Hi Anil,
> Make sure you're not sharing the same Connection between multiple threads
> as it's not thread safe.
> Thanks,
> James
>
> On Sunday, October 2, 2016, anil gupta <anilgupt...@gmail.com> wrote:
>
>> Hi,
>>
>> We are running HDP2.3.4 (HBase 1.1 and Phoenix 4.4). I have a MapReduce
>> job that's writing data to a very simple Phoenix table. We intermittently
>> get the following exception, and due to this our job fails:
>>
>> java.util.ConcurrentModificationException
>>   at java.util.HashMap$HashIterator.remove(HashMap.java:944)
>>   at org.apache.phoenix.execute.MutationState.commit(MutationState.java:472)
>>   at org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:461)
>>   at org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:458)
>>   at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>>   at org.apache.phoenix.jdbc.PhoenixConnection.commit(PhoenixConnection.java:458)
>>   at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:308)
>>   at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:297)
>>   at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>>   at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:295)
>>   at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:200)
>>
>> We are running these upserts as part of code in a Mapper that executes as
>> part of a ChainReducer. One problem I noticed is that we were instantiating
>> a PreparedStatement every time we were doing an upsert (conn == Connection
>> object):
>>
>> conn.prepareStatement(TcErrorWritable.buildUpsertNewRowStatement(TC_DOWNLOAD_ERRORS_TABLE));
>>
>> This is the only line that seems awkward to me in that code. We have
>> other projects writing to Phoenix at a much higher throughput and volume of
>> data, but we never ran into this problem. Can anyone provide more details
>> on why we are getting a ConcurrentModificationException while doing
>> upserts?
>>
>> --
>> Thanks & Regards,
>> Anil Gupta
>

--
Thanks & Regards,
Anil Gupta