So here is some updates about this:

I have added some debug statement so that whenever the value of a
UTF8StringPointable is reset, it prints the following:
1. Its own object Id
2. whether the value changed from the previous value
3. if the value was changed, it prints the change

Here is some output when doing inserts

My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 WARNIN: UTF8StringPointble: prev =
submit-5.chtc.wisc.edu#3946933.27#1438705332 NOT equal current =
submit-5.chtc.wisc.edu#3946939.19#1438705343
My Id = 967207753 WARNIN: UTF8StringPointble: prev =
submit-5.chtc.wisc.edu#3946939.19#1438705343 NOT equal current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332
My Id = 967207753 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946933.27#1438705332

As can be seen, for some reason, the value of this pointable changed
briefly and immediately after that, I got the duplicate key exception. The
same thing can be seen in some other places too:

My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 WARNIN: UTF8StringPointble: prev =
submit-5.chtc.wisc.edu#3946943.44#1438705345 NOT equal current =
submit-5.chtc.wisc.edu#3946946.79#1438705351
My Id = 1829631708 WARNIN: UTF8StringPointble: prev =
submit-5.chtc.wisc.edu#3946946.79#1438705351 NOT equal current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345
My Id = 1829631708 NORMAL: UTF8StringPointble: prev = current =
submit-5.chtc.wisc.edu#3946943.44#1438705345

Not sure if I am on the right track but this is more promising than the
SerDe. will be back with more updates soon.
~Abdullah.

Amoudi, Abdullah.

On Thu, Nov 12, 2015 at 3:24 AM, Jianfeng Jia <[email protected]>
wrote:

> You are correct. I was looking at one implementation of
> ResultWriterOperator which was only used in hyracks test. The normal
> ResultWriter is using the IPrinterFactory as what Yingyi said.
>
> There must be a different reason for this racing condition. Let me check
> 1164 again.
>
> > On Nov 11, 2015, at 12:34 PM, abdullah alamoudi <[email protected]>
> wrote:
> >
> > I definitely re-built both and I am 100% sure that the
> > AsterixIndexInsertDeleteNodePushable has the bug. where? not sure but
> most
> > likely hidden somewhere in the storage layer.
> >
> > Tomorrow, I am going to check each of the components in that operator 1
> by
> > 1 until I can isolate the source of the bug.
> >
> > Cheers,
> > Abdullah.
> >
> > Amoudi, Abdullah.
> >
> > On Wed, Nov 11, 2015 at 11:27 PM, Jianfeng Jia <[email protected]>
> > wrote:
> >
> >> Then that will be two different issues.
> >> Just want to make sure that you’ve rebuilt the hyracks (not only
> >> asterixdb) before test your code, cause those changes are in hyracks.
> >> And could you send the logic plan and the hyrack job so that we can lock
> >> which hyracks operators that get involved?
> >>
> >>> On Nov 11, 2015, at 12:10 PM, abdullah alamoudi <[email protected]>
> >> wrote:
> >>>
> >>> That was my first thought as I said but I am 100% sure the issue is not
> >> in
> >>> the SerDe. To confirm this, I removed the reader and writer from the
> >> serde
> >>> and created a new instance of reader/writer in every call to serialize
> or
> >>> deserialize just to determine if the problem is gone.
> >>>
> >>> The problem didn't go away and I still had the same issue. That is why
> I
> >>> know for sure it is not the SerDe.
> >>>
> >>> Don't waste any more time in that direction.
> >>> ~Abdullah.
> >>>
> >>> Amoudi, Abdullah.
> >>>
> >>> On Wed, Nov 11, 2015 at 10:54 PM, Jianfeng Jia <[email protected]
> >
> >>> wrote:
> >>>
> >>>> Here is my finding and thoughts.
> >>>> I think I’ve checked all the direct use case of UTF8SerDer. However, I
> >>>> missed some indirect static/shared use case of UTF8SerDer.
> >>>>
> >>>> One big suspect is the RecordDescriptor which has the
> >>>> ISerializerDeserializers inside and is always passed into the Factory
> >>>> method and shared by the ThreadMethod (usually NodePushable).
> >>>> E.g., in the ResultWriterOperatorDescriptor, the outRecordDesc is
> passed
> >>>> to the createPushRuntime() factory method to create the
> >> “resultSerializer”,
> >>>> and it is shared by the thread object
> >>>> AbstractUnaryInputSinkOperatorNodePushable. This pushable object will
> >>>> directly get the deserializer from the shared
> >>>> recordDescpitor.getFields()[i]. It explains the issue-1164.
> >>>>
> >>>> I guess in your case there must be some deserializers given by shared
> >>>> RecordDescriptor. Then it will get into the racing condition if there
> >> are
> >>>> some UTF8StringSerDer involved.
> >>>>
> >>>> Given that the SerDers are stored in the shared RecordDescriptor, I
> >> think
> >>>> the very initial design was to make the all the SerDers thread-safe.
> >> And it
> >>>> maybe some other data structures stores the SerDers and are
> passed/used
> >> in
> >>>> a same way. Then I’d have to propose to roll back the UTF8SerDer into
> >> the
> >>>> state-less version (at the expense of creating intermediate buffer
> array
> >>>> per record).
> >>>>
> >>>> Any opinions?
> >>>>
> >>>>
> >>>>> On Nov 11, 2015, at 10:54 AM, abdullah alamoudi <[email protected]>
> >>>> wrote:
> >>>>>
> >>>>> That was my first thought and so I changed it. The issue is still
> >> there.
> >>>>> I am also using the UTF8StringSerializerDeserializer to deserialize
> the
> >>>>> strings and they always serialize it correctly.
> >>>>>
> >>>>> I am thinking maybe it is related to the UTF8StringPointable but I am
> >> not
> >>>>> sure how that could be.
> >>>>> I am looking at this as well,
> >>>>> Abdullah.
> >>>>>
> >>>>> Amoudi, Abdullah.
> >>>>>
> >>>>> On Wed, Nov 11, 2015 at 8:05 PM, Jianfeng Jia <
> [email protected]>
> >>>>> wrote:
> >>>>>
> >>>>>> The possible racing condition could be that the
> >>>>>> UTF8StringSerializerDeserializer now is not a singleton method any
> >>>> more. It
> >>>>>> was implemented to reuse the byte[] that serialize/deserialize the
> >>>> string
> >>>>>> object. Let me look into this issue.
> >>>>>>
> >>>>>>> On Nov 11, 2015, at 8:37 AM, abdullah alamoudi <[email protected]
> >
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> Highly probable.
> >>>>>>> Please, let's fix this soon.
> >>>>>>>
> >>>>>>> Amoudi, Abdullah.
> >>>>>>>
> >>>>>>> On Wed, Nov 11, 2015 at 7:32 PM, Till Westmann <[email protected]>
> >>>> wrote:
> >>>>>>>
> >>>>>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1164
> >>>>>>>> might be related.
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> Till
> >>>>>>>>
> >>>>>>>> On 11 Nov 2015, at 8:25, abdullah alamoudi wrote:
> >>>>>>>>
> >>>>>>>>> Hi all,
> >>>>>>>>> I am having a hard time figuring this out. Here are the symptoms
> I
> >> am
> >>>>>>>>> seeing in case one has an idea what this could be.
> >>>>>>>>>
> >>>>>>>>> I have a feed running ingesting data into a dataset.
> sporadically,
> >> I
> >>>>>> get
> >>>>>>>>> duplicate key exception errors (The key is of a string type) and
> I
> >> am
> >>>>>>>> 100%
> >>>>>>>>> sure that I don't have duplicate records.
> >>>>>>>>>
> >>>>>>>>> Moreover, I am printing the content of the frames about to be
> >>>> inserted
> >>>>>>>> into
> >>>>>>>>> the primary index and there are no duplicate records.
> >>>>>>>>>
> >>>>>>>>> There are three reasons why I am suspecting the String
> >>>> implementation:
> >>>>>>>>> 1. It is fairly recent change.
> >>>>>>>>> 2. When I run on a single node, or run one thread at a time, I
> >> never
> >>>>>> get
> >>>>>>>>> this exception.
> >>>>>>>>> 3. the key is a String.
> >>>>>>>>>
> >>>>>>>>> I have looked at the change trying to figure out where a race
> >>>> condition
> >>>>>>>>> might take place but it is well hidden (if it is true at all.).
> >>>>>>>>>
> >>>>>>>>> Let me know if you have seen something similar.
> >>>>>>>>>
> >>>>>>>>> Cheers,
> >>>>>>>>> Abdullah.
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Best,
> >>>>>>
> >>>>>> Jianfeng Jia
> >>>>>> PhD Candidate of Computer Science
> >>>>>> University of California, Irvine
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>>>
> >>>> Best,
> >>>>
> >>>> Jianfeng Jia
> >>>> PhD Candidate of Computer Science
> >>>> University of California, Irvine
> >>>>
> >>>>
> >>
> >>
> >>
> >> Best,
> >>
> >> Jianfeng Jia
> >> PhD Candidate of Computer Science
> >> University of California, Irvine
> >>
> >>
>
>
>
> Best,
>
> Jianfeng Jia
> PhD Candidate of Computer Science
> University of California, Irvine
>
>

Reply via email to