That was my first thought, as I said, but I am 100% sure the issue is not in the SerDe. To confirm this, I removed the reader and writer from the SerDe and created a new reader/writer instance on every call to serialize or deserialize, just to see whether the problem would go away.
The problem didn't go away; I still had the same issue. That is why I know for sure it is not the SerDe. Don't waste any more time in that direction.

~Abdullah.

Amoudi, Abdullah.

On Wed, Nov 11, 2015 at 10:54 PM, Jianfeng Jia <[email protected]> wrote:

> Here are my findings and thoughts.
> I think I’ve checked all the direct use cases of UTF8SerDer. However, I
> missed some indirect static/shared use cases of UTF8SerDer.
>
> One big suspect is the RecordDescriptor, which has the
> ISerializerDeserializers inside and is always passed into the factory
> method and shared by the ThreadMethod (usually NodePushable).
> E.g., in the ResultWriterOperatorDescriptor, the outRecordDesc is passed
> to the createPushRuntime() factory method to create the “resultSerializer”,
> and it is shared by the thread object
> AbstractUnaryInputSinkOperatorNodePushable. This pushable object will
> directly get the deserializer from the shared
> recordDescriptor.getFields()[i]. It explains issue 1164.
>
> I guess in your case there must be some deserializers given by a shared
> RecordDescriptor. Then it will get into a race condition if there are
> some UTF8StringSerDer involved.
>
> Given that the SerDers are stored in the shared RecordDescriptor, I think
> the very initial design was to make all the SerDers thread-safe. And it
> may be that some other data structures store the SerDers and are
> passed/used in the same way. Then I’d have to propose rolling back the
> UTF8SerDer to the stateless version (at the expense of creating an
> intermediate buffer array per record).
>
> Any opinions?
>
> > On Nov 11, 2015, at 10:54 AM, abdullah alamoudi <[email protected]> wrote:
> >
> > That was my first thought and so I changed it. The issue is still there.
> > I am also using the UTF8StringSerializerDeserializer to deserialize the
> > strings and they always serialize it correctly.
> >
> > I am thinking maybe it is related to the UTF8StringPointable but I am not
> > sure how that could be.
> > I am looking at this as well,
> > Abdullah.
> >
> > Amoudi, Abdullah.
> >
> > On Wed, Nov 11, 2015 at 8:05 PM, Jianfeng Jia <[email protected]> wrote:
> >
> >> The possible race condition could be that the
> >> UTF8StringSerializerDeserializer is no longer a singleton. It
> >> was implemented to reuse the byte[] used to serialize/deserialize the
> >> string object. Let me look into this issue.
> >>
> >>> On Nov 11, 2015, at 8:37 AM, abdullah alamoudi <[email protected]> wrote:
> >>>
> >>> Highly probable.
> >>> Please, let's fix this soon.
> >>>
> >>> Amoudi, Abdullah.
> >>>
> >>> On Wed, Nov 11, 2015 at 7:32 PM, Till Westmann <[email protected]> wrote:
> >>>
> >>>> https://issues.apache.org/jira/browse/ASTERIXDB-1164
> >>>> might be related.
> >>>>
> >>>> Cheers,
> >>>> Till
> >>>>
> >>>> On 11 Nov 2015, at 8:25, abdullah alamoudi wrote:
> >>>>
> >>>>> Hi all,
> >>>>> I am having a hard time figuring this out. Here are the symptoms I am
> >>>>> seeing, in case anyone has an idea what this could be.
> >>>>>
> >>>>> I have a feed running, ingesting data into a dataset. Sporadically, I
> >>>>> get duplicate key exception errors (the key is of a string type), and
> >>>>> I am 100% sure that I don't have duplicate records.
> >>>>>
> >>>>> Moreover, I am printing the content of the frames about to be inserted
> >>>>> into the primary index and there are no duplicate records.
> >>>>>
> >>>>> There are three reasons why I am suspecting the String implementation:
> >>>>> 1. It is a fairly recent change.
> >>>>> 2. When I run on a single node, or run one thread at a time, I never
> >>>>> get this exception.
> >>>>> 3. The key is a String.
> >>>>>
> >>>>> I have looked at the change trying to figure out where a race condition
> >>>>> might take place, but it is well hidden (if it exists at all).
> >>>>>
> >>>>> Let me know if you have seen something similar.
> >>>>>
> >>>>> Cheers,
> >>>>> Abdullah.
> >>>>
> >>
> >> Best,
> >>
> >> Jianfeng Jia
> >> PhD Candidate of Computer Science
> >> University of California, Irvine
> >>
> >
>
> Best,
>
> Jianfeng Jia
> PhD Candidate of Computer Science
> University of California, Irvine
>
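
For reference, the failure mode hypothesized in the thread (a serializer that reuses an internal byte[] and is shared across threads, for example through a shared RecordDescriptor) can be sketched as below. This is only an illustration under assumptions: StatefulSerDe, StatelessSerDe, and countCorrupted are hypothetical names, not AsterixDB or Hyracks classes, and the sketch does not show that this is the cause of the duplicate key errors reported above. Each thread serializes its own distinct key. If another thread overwrites the shared scratch buffer between the write and the copy-out, a key comes back as a different key, which is how a spurious duplicate could arise.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class SerDeRaceSketch {

    /** Hypothetical stateful serializer: reuses one scratch buffer across calls. */
    static final class StatefulSerDe {
        private final byte[] scratch = new byte[64];

        byte[] serialize(String s) {
            byte[] src = s.getBytes(StandardCharsets.UTF_8);
            // Unsynchronized write into shared state; a concurrent caller
            // may overwrite these bytes before they are copied out below.
            System.arraycopy(src, 0, scratch, 0, src.length);
            return Arrays.copyOf(scratch, src.length);
        }
    }

    /** Hypothetical stateless serializer: allocates a fresh array per call. */
    static final class StatelessSerDe {
        byte[] serialize(String s) {
            return s.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Runs the serializer from several threads and counts wrong round trips. */
    static int countCorrupted(Function<String, byte[]> serializer,
                              int threads, int iterations) {
        AtomicInteger corrupted = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final String key = "key-" + t; // each thread has its own distinct key
            workers[t] = new Thread(() -> {
                for (int i = 0; i < iterations; i++) {
                    byte[] bytes = serializer.apply(key);
                    String roundTrip = new String(bytes, StandardCharsets.UTF_8);
                    if (!roundTrip.equals(key)) {
                        corrupted.incrementAndGet();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return corrupted.get();
    }

    public static void main(String[] args) {
        StatefulSerDe shared = new StatefulSerDe();
        // The shared stateful count is timing dependent and may or may not be zero.
        System.out.println("shared stateful: "
                + countCorrupted(shared::serialize, 4, 200_000) + " corrupted");
        System.out.println("stateless:       "
                + countCorrupted(new StatelessSerDe()::serialize, 4, 200_000) + " corrupted");
    }
}
```

With a single thread, the stateful version round-trips correctly, which matches the observation that the exception never shows up when running one thread at a time. The stateless version pays for one array allocation per call, which is the trade-off mentioned in the rollback proposal.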
