It would be good to file a ticket to keep track of this bug.

Chen

On Fri, Feb 19, 2016 at 3:40 PM, Young-Seok Kim <[email protected]> wrote:

> Please let us know how it goes with the change.
>
> Young-Seok
>
> On Fri, Feb 19, 2016 at 3:23 PM, Yiran Wang <[email protected]> wrote:
>
>> Young-Seok,
>>
>> I will just go ahead and change the duplicated keys in my original
>> file. That should solve my loading problem. I was describing what's going
>> on in case it is relevant for understanding why a lot of the files still
>> got loaded into the dataset.
>>
>> Thanks!
>> Yiran
>>
>> On Fri, Feb 19, 2016 at 3:19 PM, Young-Seok Kim <[email protected]>
>> wrote:
>>
>>> Yiran,
>>>
>>> Could you share all of the AQL statements involved in the loading,
>>> indicating which file contains the duplicated primary keys?
>>> Then we may better understand what's going on and hopefully arrive at
>>> a solution.
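>>>
>>> For reference, a multi-file load usually looks something along these
>>> lines (the dataverse, dataset, and file paths below are placeholders,
>>> not your actual names):
>>>
>>> use dataverse MyDataverse;
>>>
>>> load dataset MyDataset using localfs
>>> (("path"="localhost:///data/file01.adm,localhost:///data/file02.adm"),
>>>  ("format"="adm"));
>>>
>>> Seeing the actual statement would also tell us the order in which the
>>> files are listed.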
>>>
>>> On Fri, Feb 19, 2016 at 2:58 PM, Yiran Wang <[email protected]> wrote:
>>>
>>>> Young-Seok,
>>>>
>>>> Thank you for your feedback. You are right; there are some duplicated
>>>> primary keys. It took me some time, but I did locate the file that the
>>>> duplicated primary keys come from.
>>>>
>>>> If the load function loads files in the sequence in which they are
>>>> written in the query, the problematic file is located towards the end.
>>>> Maybe that is why many instances still got loaded into the dataset
>>>> before it hit the problematic file?
>>>>
>>>> Thanks again,
>>>> Yiran
>>>>
>>>> On Fri, Feb 19, 2016 at 10:53 AM, Young-Seok Kim <[email protected]>
>>>> wrote:
>>>>
>>>>> From a quick look at the log, there seem to be duplicated primary
>>>>> keys in the files to be loaded.
>>>>> That seems to be the root cause of the problem.
>>>>> But I'm not sure why the load query continues trying to load further
>>>>> data instead of stopping when the duplication was found.
>>>>> This unexpected behavior seems to have triggered the "Cannot load an
>>>>> index that is not empty" exception.
>>>>>
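>>>>> If it would help to confirm exactly which key values are duplicated,
>>>>> one option might be to load the suspect files into a scratch dataset
>>>>> keyed on a different field, so the duplicates are not rejected, and
>>>>> then group on the intended key. A rough sketch, with placeholder
>>>>> dataset and field names:
>>>>>
>>>>> for $r in dataset ScratchDataset
>>>>> group by $key := $r.suspectKey with $r
>>>>> where count($r) > 1
>>>>> return { "key": $key, "copies": count($r) };
>>>>>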
>>>>> The following is a snippet of the exceptions that appeared in the
>>>>> attached log file.
>>>>>
>>>>> ---------------------------------------
>>>>> SEVERE: Setting uncaught exception handler
>>>>> edu.uci.ics.hyracks.api.lifecycle.LifeCycleComponentManager@46844c3d
>>>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>>>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>>>>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>>>>> Input stream given to BTree bulk load has duplicates.
>>>>> Caused by: edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>>>>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>>>>> Input stream given to BTree bulk load has duplicates.
>>>>> Caused by:
>>>>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>>>>> Input stream given to BTree bulk load has duplicates.
>>>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>>>>> edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException: Cannot load
>>>>> an index that is not empty
>>>>> Caused by:
>>>>> edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException: Cannot load
>>>>> an index that is not empty
>>>>>
>>>>> Best,
>>>>> Young-Seok
>>>>>
>>>>> On Fri, Feb 19, 2016 at 10:31 AM, Yiran Wang <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Abdullah,
>>>>>>
>>>>>> Here is the log attached. Thank you all very much for looking into
>>>>>> this.
>>>>>>
>>>>>> Ian - I have two query questions besides this loading issue. I was
>>>>>> wondering if I could meet briefly with you (or discuss over email)
>>>>>> regarding them.
>>>>>>
>>>>>> Thanks!
>>>>>> Yiran
>>>>>>
>>>>>> On Fri, Feb 19, 2016 at 9:38 AM, Mike Carey <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Maybe Ian can visit the cluster with Yiran later today?
>>>>>>> On Feb 19, 2016 1:31 AM, "abdullah alamoudi" <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Yiran,
>>>>>>>> Can you share the logs? It would help us identify the actual
>>>>>>>> cause of this failure much faster.
>>>>>>>>
>>>>>>>> I am pretty sure you know this, but in case you don't, you can get
>>>>>>>> the logs using
>>>>>>>> >managix log -n <instance-name>
>>>>>>>>
>>>>>>>> Also, it would be nice if someone from the team had access to the
>>>>>>>> cluster so we could work with it directly.
>>>>>>>> Cheers,
>>>>>>>> Abdullah.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Feb 19, 2016 at 9:40 AM, Yiran Wang <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Steven,
>>>>>>>>>
>>>>>>>>> Thanks for getting back to me so quickly! I wasn't clear earlier.
>>>>>>>>> Here is what happened:
>>>>>>>>>
>>>>>>>>> I test-loaded the first 32 files with no problem. I then deleted
>>>>>>>>> the dataset, created a new one, and tried to load all 76 files
>>>>>>>>> into the newly created (hence empty) dataset.
>>>>>>>>>
>>>>>>>>> It took about 2 minutes after executing the query for the error
>>>>>>>>> message to show up. There are currently 31710406 rows of data in
>>>>>>>>> the dataset, despite the error message (so it looks like it did
>>>>>>>>> load).
>>>>>>>>>
>>>>>>>>> So my questions are: 1) why did I still get that error message
>>>>>>>>> when I was loading into an empty dataset; and 2) I'm not sure
>>>>>>>>> whether all the data from the 76 files was fully loaded. Is there
>>>>>>>>> another way to check, besides trying to load it again and hoping
>>>>>>>>> that this time I don't get the error?
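>>>>>>>>>
>>>>>>>>> For instance, would comparing a record count against the expected
>>>>>>>>> total help? I'm thinking of something along these lines (the
>>>>>>>>> dataset name here is just a placeholder):
>>>>>>>>>
>>>>>>>>> count(for $r in dataset MyDataset return $r);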
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>> Yiran
>>>>>>>>>
>>>>>>>>> On Thu, Feb 18, 2016 at 10:29 PM, Steven Jacobs <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>> Welcome! We are an Apache incubator project now, so I added the
>>>>>>>>>> correct mailing list. Our "load" statement only works on an empty
>>>>>>>>>> dataset. Subsequent data needs to be added with an insert or a
>>>>>>>>>> feed. You should be able to load all 76 files at once (starting
>>>>>>>>>> from empty), though.
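>>>>>>>>>>
>>>>>>>>>> For the later additions, an insert looks something like this
>>>>>>>>>> sketch (the dataverse, dataset, and record are placeholders):
>>>>>>>>>>
>>>>>>>>>> use dataverse MyDataverse;
>>>>>>>>>>
>>>>>>>>>> insert into dataset MyDataset (
>>>>>>>>>>   { "id": 123, "value": "a new record" }
>>>>>>>>>> );
>>>>>>>>>>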
>>>>>>>>>> Steven
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thursday, February 18, 2016, Yiran Wang <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Asterix team!
>>>>>>>>>>>
>>>>>>>>>>> I've come across this error while trying to load 76 files into
>>>>>>>>>>> a dataset. When I test-loaded the first 32 files, there was no
>>>>>>>>>>> such error. All 76 files are in the same data format.
>>>>>>>>>>>
>>>>>>>>>>> Can you help interpret what this error message means?
>>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>> Yiran
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Best,
>>>>>>>>>>> Yiran
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best,
>>>>>>>>> Yiran
>>>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best,
>>>>>> Yiran
>>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Best,
>>>> Yiran
>>>>
>>
>>
>>
>> --
>> Best,
>> Yiran
>>