It would be good to file a ticket to keep track of this bug.

Chen
On Fri, Feb 19, 2016 at 3:40 PM, Young-Seok Kim <[email protected]> wrote:

Please let us know how it goes with the change.

Young-Seok

On Fri, Feb 19, 2016 at 3:23 PM, Yiran Wang <[email protected]> wrote:

Young-Seok,

I will just go ahead and change the duplicated keys I have in my original file. That should solve my loading problem. I was describing what's going on in case it helps you understand why a lot of files still got loaded into the dataset.

Thanks!
Yiran

On Fri, Feb 19, 2016 at 3:19 PM, Young-Seok Kim <[email protected]> wrote:

Yiran,

Could you show all the AQL involved in the loading, indicating the problematic file that includes the duplicated primary keys? Then we may better understand what's going on and hopefully find a solution.

On Fri, Feb 19, 2016 at 2:58 PM, Yiran Wang <[email protected]> wrote:

Young-Seok,

Thank you for your feedback. You are right, there are some duplicated primary keys. It took me some time, but I did locate the file the duplicated primary keys come from.

If the load function loads files in the sequence written in the query, the problematic file is located towards the end. Maybe that is why many instances still got loaded into the dataset before it hit the problematic file?

Thanks again,
Yiran

On Fri, Feb 19, 2016 at 10:53 AM, Young-Seok Kim <[email protected]> wrote:

From a quick look at the log, there seem to be duplicated primary keys in the files to be loaded. That seems to be the root cause of the problem. But I'm not sure why the load query continued trying to load data instead of stopping when the duplication was found. This unexpected behavior seems to have introduced the "Cannot load an index that is not empty" exception.
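[Editorial note: duplicate primary keys can be caught before loading. The sketch below is not AsterixDB-specific; it assumes newline-delimited JSON source files and a primary key field named `id`, both of which are assumptions since the actual schema is not shown in this thread. It scans files in load order and reports which file first repeats an already-seen key.]

```python
import json

def find_duplicate_keys(paths, key_field="id"):
    """Scan JSON-lines files in load order; return (path, key) pairs
    whose primary key was already seen in an earlier record."""
    seen = set()
    duplicates = []
    for path in paths:
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue  # skip blank lines
                key = json.loads(line)[key_field]
                if key in seen:
                    duplicates.append((path, key))
                else:
                    seen.add(key)
    return duplicates
```

Run against the 76 files in the same order as the load statement, this would point directly at the offending file and key.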
The following snippet shows the exceptions that appear in the attached log file:

---------------------------------------
SEVERE: Setting uncaught exception handler edu.uci.ics.hyracks.api.lifecycle.LifeCycleComponentManager@46844c3d
edu.uci.ics.hyracks.api.exceptions.HyracksDataException: edu.uci.ics.hyracks.api.exceptions.HyracksDataException: edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException: Input stream given to BTree bulk load has duplicates.
Caused by: edu.uci.ics.hyracks.api.exceptions.HyracksDataException: edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException: Input stream given to BTree bulk load has duplicates.
Caused by: edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException: Input stream given to BTree bulk load has duplicates.
edu.uci.ics.hyracks.api.exceptions.HyracksDataException: edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException: Cannot load an index that is not empty
Caused by: edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException: Cannot load an index that is not empty
---------------------------------------

Best,
Young-Seok

On Fri, Feb 19, 2016 at 10:31 AM, Yiran Wang <[email protected]> wrote:

Abdullah,

Here is the log, attached. Thank you all very much for looking into this.

Ian - I have two query questions besides this loading issue. I was wondering if I could meet briefly with you (or over email) regarding that.

Thanks!
Yiran

On Fri, Feb 19, 2016 at 9:38 AM, Mike Carey <[email protected]> wrote:

Maybe Ian can visit the cluster with Yiran later today?

On Feb 19, 2016 1:31 AM, "abdullah alamoudi" <[email protected]> wrote:

Yiran,
Can you share the logs?
It would help us identify the actual cause of this failure much faster.

I am pretty sure you know this, but in case you didn't, you can get the logs using:

managix log -n <instance-name>

Also, it would be nice if someone from the team had access to the cluster so we can work with it directly.
Cheers,
Abdullah.

On Fri, Feb 19, 2016 at 9:40 AM, Yiran Wang <[email protected]> wrote:

Steven,

Thanks for getting back to me so quickly! I wasn't clear. Here is what happened:

I test-loaded the first 32 files, no problem. I deleted the dataset, created a new one, and tried to load the entire 76 files into the newly created (hence empty) dataset.

It took about 2 minutes after executing the query for the error message to show up. There are currently 31710406 rows of data in the dataset, despite the error message (so it looks like it did load).

So my questions are: 1) why did I still get that error message when I was loading into an empty dataset; and 2) I'm not sure whether all the data from the 76 files is fully loaded. Is there another way to check, besides trying to load it again and hoping this time I don't get the error?

Thanks!
Yiran

On Thu, Feb 18, 2016 at 10:29 PM, Steven Jacobs <[email protected]> wrote:

Hi,
Welcome! We are an Apache incubator project now, so I added the correct mailing list. Our "load" statement only works on an empty dataset. Subsequent data needs to be added with an insert or a feed. You should be able to load all 76 files at once, though (starting from empty).
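[Editorial note: for question 2), one low-tech sanity check is to count the records in the 76 source files and compare the total with the dataset's reported row count. The sketch below assumes one record per line in the source files, which may not match the actual file format.]

```python
def count_records(paths):
    """Count non-empty lines across files, assuming one record per line.
    Compare the total against the dataset's row count to check the load."""
    total = 0
    for path in paths:
        with open(path) as f:
            total += sum(1 for line in f if line.strip())
    return total
```

If the total matches the 31710406 rows observed in the dataset, the load most likely completed despite the error message; if it is higher, the difference should equal the number of records skipped (for example, duplicates).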
Steven

On Thursday, February 18, 2016, Yiran Wang <[email protected]> wrote:

Hi Asterix team!

I've come across this error when trying to load 76 files into a dataset. When I test-loaded the first 32 files, there wasn't such an error. All 76 files are of the same data format.

Can you help interpret what this error message means?

Thanks!
Yiran

--
Best,
Yiran

--
You received this message because you are subscribed to the Google Groups "asterixdb-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.
