Hi James,

There is a high possibility that we are sharing a connection among
multiple threads. This MR job is fairly complicated because, within the
Mapper, we spin up a separate thread to perform image downloads from the
internet (so that we can kill a download if it doesn't complete within a
prespecified time).
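
For context, the download part of the Mapper works roughly like the sketch
below; downloadImage(), the executor, and the timeout value are illustrative
placeholders rather than our actual code:

    import java.util.concurrent.*;

    // Run each download on a worker thread and enforce a deadline.
    byte[] downloadWithTimeout(ExecutorService executor, String url, long timeoutSeconds)
            throws InterruptedException, ExecutionException {
        Future<byte[]> download = executor.submit(() -> downloadImage(url));  // hypothetical helper
        try {
            return download.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            download.cancel(true);  // kill the download that overran its deadline
            return null;
        }
    }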
#1. One solution would be to make the method that does the upsert
synchronize on the Connection (or the PreparedStatement) object. We don't
write to this Phoenix table at a very high throughput because it only logs
errors. What do you think about that?
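
Something like this minimal sketch is what I have in mind for #1; bindTo()
is a hypothetical helper, while the upsert builder and table constant are the
ones already in our job:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    // Serialize all access to the shared connection during an upsert.
    void upsertError(Connection conn, TcErrorWritable error) throws SQLException {
        synchronized (conn) {
            try (PreparedStatement ps = conn.prepareStatement(
                    TcErrorWritable.buildUpsertNewRowStatement(TC_DOWNLOAD_ERRORS_TABLE))) {
                error.bindTo(ps);   // hypothetical: sets the bind parameters
                ps.executeUpdate();
                conn.commit();
            }
        }
    }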

#2. Alternatively, I could create a connection in every thread. But, AFAIK,
creating a connection every time and then the PreparedStatement on top of it
is expensive. Right?
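
A rough sketch of what I mean by #2, assuming a ThreadLocal holds the
per-thread connection; the JDBC URL below is a placeholder, and the real
quorum would come from our job configuration:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    // One Phoenix connection per thread, created lazily on first use.
    static final ThreadLocal<Connection> PHOENIX_CONN = ThreadLocal.withInitial(() -> {
        try {
            return DriverManager.getConnection("jdbc:phoenix:zk-host:2181");  // placeholder quorum
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    });

    void upsertError(TcErrorWritable error) throws SQLException {
        Connection conn = PHOENIX_CONN.get();
        try (PreparedStatement ps = conn.prepareStatement(
                TcErrorWritable.buildUpsertNewRowStatement(TC_DOWNLOAD_ERRORS_TABLE))) {
            error.bindTo(ps);   // same hypothetical binder as above
            ps.executeUpdate();
            conn.commit();
        }
    }

That said, if I recall the Phoenix FAQ correctly, Phoenix connections are
intentionally thin objects that are cheap to create (pooling them is not
recommended), so the per-connection cost may be smaller than I assumed.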

Please let me know if there is a better approach that I am missing.

On Sun, Oct 2, 2016 at 3:50 PM, James Taylor <jamestay...@apache.org> wrote:

> Hi Anil,
> Make sure you're not sharing the same Connection between multiple threads
> as it's not thread safe.
> Thanks,
> James
>
>
> On Sunday, October 2, 2016, anil gupta <anilgupt...@gmail.com> wrote:
>
>> Hi,
>>
>> We are running HDP 2.3.4 (HBase 1.1 and Phoenix 4.4). I have a MapReduce
>> job that writes data to a very simple Phoenix table. We intermittently
>> get the exception below, and because of it our job fails:
>> java.util.ConcurrentModificationException
>>     at java.util.HashMap$HashIterator.remove(HashMap.java:944)
>>     at org.apache.phoenix.execute.MutationState.commit(MutationState.java:472)
>>     at org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:461)
>>     at org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:458)
>>     at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>>     at org.apache.phoenix.jdbc.PhoenixConnection.commit(PhoenixConnection.java:458)
>>     at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:308)
>>     at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:297)
>>     at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>>     at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:295)
>>     at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:200)
>>
>> We are running these upserts as part of Mapper code that executes as part
>> of a ChainReducer. One problem I noticed is that we were instantiating a
>> PreparedStatement every time we did an upsert (conn is the Connection
>> object):
>>
>> conn.prepareStatement(TcErrorWritable.buildUpsertNewRowStatement(TC_DOWNLOAD_ERRORS_TABLE));
>>
>> This is the only line in that code that seems awkward to me. We have
>> other projects writing to Phoenix at a much higher throughput and volume
>> of data, but we never ran into this problem. Can anyone provide more
>> details on why we are getting a ConcurrentModificationException while
>> doing upserts?
>>
>> --
>> Thanks & Regards,
>> Anil Gupta
>>
>


-- 
Thanks & Regards,
Anil Gupta
