Please let us know how it goes with the change.

Young-Seok
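One way to locate the duplicated primary keys before re-running the load is to scan the input files offline. The following is only a minimal sketch in Python and is not part of AsterixDB. It assumes each file holds one record per line, that every line parses as JSON, and that the primary key is a field named "id". The names find_duplicate_keys and json_id_key, the field name, and the file layout are all hypothetical; ADM files that use non-JSON constructs would need a different key extractor.

```python
import json
from collections import defaultdict


def find_duplicate_keys(paths, key_of):
    """Return {key: [(path, line_number), ...]} for every key seen more than once.

    `paths` is a list of input files; `key_of` maps one text line to its
    primary key (both are assumptions about how the data is laid out).
    """
    seen = defaultdict(list)
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            for line_number, line in enumerate(handle, start=1):
                if not line.strip():
                    continue  # skip blank lines
                seen[key_of(line)].append((path, line_number))
    # Keep only the keys that occur in more than one place.
    return {key: places for key, places in seen.items() if len(places) > 1}


def json_id_key(line):
    # Hypothetical extractor: one JSON object per line, primary key field "id".
    return json.loads(line)["id"]
```

Calling find_duplicate_keys(list_of_files, json_id_key) would report each duplicated key together with the files and line numbers where it occurs. Note that this sketch keeps every key in memory, so for a dataset on the order of the 31710406 rows mentioned below, a sort-based or per-file approach may be more practical.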
On Fri, Feb 19, 2016 at 3:23 PM, Yiran Wang <[email protected]> wrote:
> Young-Seok,
>
> I will just go ahead and change the duplicated keys I have in my original
> file. That should solve my loading problem. I was describing what's going
> on in case that is relevant for you to understand why a lot of files still
> got loaded into the dataset.
>
> Thanks!
> Yiran
>
> On Fri, Feb 19, 2016 at 3:19 PM, Young-Seok Kim <[email protected]> wrote:
>
>> Yiran,
>>
>> Could you show all AQLs involved in the loading with indicating the
>> problematic file which includes the duplicated primary keys?
>> Then, we may better understand what's going on and may get the solution
>> hopefully.
>>
>> On Fri, Feb 19, 2016 at 2:58 PM, Yiran Wang <[email protected]> wrote:
>>
>>> Young-Seok,
>>>
>>> Thank you for your feedback. You are right there are some duplicated
>>> primary keys. It took me some time, but I did locate the file where the
>>> duplicated primary keys are from.
>>>
>>> If the load function loads files in sequence as written in the query,
>>> the problematic file is located towards the end. Maybe that is why there
>>> are still many instances got loaded into the dataset before it hit the
>>> problematic file?
>>>
>>> Thanks again,
>>> Yiran
>>>
>>> On Fri, Feb 19, 2016 at 10:53 AM, Young-Seok Kim <[email protected]> wrote:
>>>
>>>> By quickly looking at the log, there seems to exist duplicated primary
>>>> keys in the files to be loaded.
>>>> That seems the first cause of the problem.
>>>> But I'm not sure why the load query continues trying to load data
>>>> further instead of stop when the duplication was found.
>>>> This unexpected behavior seems to have introduced the "Cannot load an
>>>> index that is not empty" exception.
>>>>
>>>> The following shows the snippet of the exceptions appeared in the log
>>>> file attached.
>>>>
>>>> ---------------------------------------
>>>> SEVERE: Setting uncaught exception handler
>>>> edu.uci.ics.hyracks.api.lifecycle.LifeCycleComponentManager@46844c3d
>>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>>>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>>>> Input stream given to BTree bulk load has duplicates.
>>>> Caused by: edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>>>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>>>> Input stream given to BTree bulk load has duplicates.
>>>> Caused by:
>>>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
>>>> Input stream given to BTree bulk load has duplicates.
>>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
>>>> edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException: Cannot load
>>>> an index that is not empty
>>>> Caused by:
>>>> edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException: Cannot load
>>>> an index that is not empty
>>>>
>>>> Best,
>>>> Young-Seok
>>>>
>>>> On Fri, Feb 19, 2016 at 10:31 AM, Yiran Wang <[email protected]> wrote:
>>>>
>>>>> Abdullah,
>>>>>
>>>>> Here is the log attached. Thank you all very much for looking into
>>>>> this.
>>>>>
>>>>> Ian - I have two query questions besides this loading issue. I was
>>>>> wondering if I can meet briefly with you (or over email) regarding that.
>>>>>
>>>>> Thanks!
>>>>> Yiran
>>>>>
>>>>> On Fri, Feb 19, 2016 at 9:38 AM, Mike Carey <[email protected]> wrote:
>>>>>
>>>>>> Maybe Ian can visit the cluster with Yiran later today?
>>>>>> On Feb 19, 2016 1:31 AM, "abdullah alamoudi" <[email protected]> wrote:
>>>>>>
>>>>>>> Yiran,
>>>>>>> Can you share the logs? It would help us identifying the actual
>>>>>>> cause of this failure much faster.
>>>>>>>
>>>>>>> I am pretty sure you know this but in case you didn't, you can get
>>>>>>> the logs using
>>>>>>> >managix log -n <instance-name>
>>>>>>>
>>>>>>> Also, it would be nice if someone from the team has access to the
>>>>>>> cluster so we can work with it directly.
>>>>>>> Cheers,
>>>>>>> Abdullah.
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Feb 19, 2016 at 9:40 AM, Yiran Wang <[email protected]> wrote:
>>>>>>>
>>>>>>>> Steven,
>>>>>>>>
>>>>>>>> Thanks for getting back to me so quickly! I wasn't clear. Here is
>>>>>>>> what happened:
>>>>>>>>
>>>>>>>> I test-loaded the first 32 files, no problem. I deleted the
>>>>>>>> dataset, created a new one, and tried to load the entire 76 files into
>>>>>>>> the newly created (hence empty) dataset.
>>>>>>>>
>>>>>>>> It took about 2mins after executing the query for the error message
>>>>>>>> to show up. There are currently 31710406 rows of data in the dataset,
>>>>>>>> despite the error message (so it looks like it did load).
>>>>>>>>
>>>>>>>> So my questions are: 1) why did I still get that error message when
>>>>>>>> I was loading to an empty dataset; and 2) I'm not sure if all the data
>>>>>>>> from the 76 file are fully loaded. Is there other ways to check, besides
>>>>>>>> trying to load it again and hope this time I don't get the error?
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>> Yiran
>>>>>>>>
>>>>>>>> On Thu, Feb 18, 2016 at 10:29 PM, Steven Jacobs <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>> Welcome! We are an Apache incubator project now so I added the
>>>>>>>>> correct mailing list. Our "load" statement only works on an empty
>>>>>>>>> dataset.
>>>>>>>>> Subsequent data needs to be added with an insert or a feed. You
>>>>>>>>> should be able to load all 76 files at once though (starting from empty).
>>>>>>>>> Steven
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thursday, February 18, 2016, Yiran Wang <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Asterix team!
>>>>>>>>>>
>>>>>>>>>> I've come across this error when I was trying to load 76 files
>>>>>>>>>> into a dataset. When I test-loaded the first 32 files, there wasn't
>>>>>>>>>> such an error. All 76 files are of the same data format.
>>>>>>>>>>
>>>>>>>>>> Can you help interpret what this error message means?
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>> Yiran
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Best,
>>>>>>>>>> Yiran
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>> Google Groups "asterixdb-dev" group.
>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>>>> send an email to [email protected].
>>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>>> Groups "asterixdb-users" group.
>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>>> send an email to [email protected].
>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
