Done. Abdullah.
On Sun, Feb 21, 2016 at 8:52 AM, Till Westmann <[email protected]> wrote:

> Sounds like a good candidate for a JIRA issue, so we won't forget. :)
>
> Cheers,
> Till
>
> > On Feb 20, 2016, at 21:44, abdullah alamoudi <[email protected]> wrote:
> >
> > Totally agree. Probably better make sure it works nicely with that many
> > tasks and then fix the number of readers.
> >
> > Cheers,
> > Abdullah.
> >
> >> On Sun, Feb 21, 2016 at 2:04 AM, Mike Carey <[email protected]> wrote:
> >>
> >> Sounds like the load job parallelism needs a redo - it probably shouldn't
> >> be more than the number of target partitions IMO...?
> >>> On Feb 20, 2016 12:41 PM, "abdullah alamoudi" <[email protected]> wrote:
> >>>
> >>> I have an idea that might explain why such a strange behavior happened. I
> >>> believe it could be due to the number of task partitions being very high
> >>> assuming each of the 76 files is being read in a separate task.
> >>> This could potentially lead to some corner cases that we didn't consider
> >>> before considering the number of threads in the tasks thread pool is less
> >>> than 76, some tasks will not be able to start until others have completed
> >>> execution.
> >>>
> >>> Just a thought,
> >>> Abdullah.
> >>>
> >>> On Fri, Feb 19, 2016 at 9:43 PM, abdullah alamoudi <[email protected]> wrote:
> >>>
> >>>> Yiran,
> >>>> Here is one problem causing a failure:
> >>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
> >>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
> >>>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
> >>>> Input stream given to BTree bulk load has duplicates.
> >>>>
> >>>> which tells us that Input stream given to BTree bulk load has duplicates.
> >>>> The question is why this was not returned as the error message? We need to
> >>>> look into that.
> >>>>
> >>>> I will continue looking at the log file to see if there were other issues.
> >>>>
> >>>> Can you share with us the load statement you're using? I would like to see
> >>>> how you're loading all the files. we might be able to suggest a way to make
> >>>> it work better.
> >>>>
> >>>> Cheers,
> >>>> Abdullah.
> >>>>
> >>>>> On Fri, Feb 19, 2016 at 9:31 PM, Yiran Wang <[email protected]> wrote:
> >>>>>
> >>>>> Abdullah,
> >>>>>
> >>>>> Here is the log attached. Thank you all very much for looking into this.
> >>>>>
> >>>>> Ian - I have two query questions besides this loading issue. I was
> >>>>> wondering if I can meet briefly with you (or over email) regarding that.
> >>>>>
> >>>>> Thanks!
> >>>>> Yiran
> >>>>>
> >>>>> On Fri, Feb 19, 2016 at 9:38 AM, Mike Carey <[email protected]> wrote:
> >>>>>
> >>>>>> Maybe Ian can visit the cluster with Yiran later today?
> >>>>>> On Feb 19, 2016 1:31 AM, "abdullah alamoudi" <[email protected]> wrote:
> >>>>>>
> >>>>>>> Yiran,
> >>>>>>> Can you share the logs? It would help us identifying the actual cause
> >>>>>>> of this failure much faster.
> >>>>>>>
> >>>>>>> I am pretty sure you know this but in case you didn't, you can get the
> >>>>>>> logs using
> >>>>>>>> managix log -n <instance-name>
> >>>>>>>
> >>>>>>> Also, it would be nice if someone from the team has access to the
> >>>>>>> cluster so we can work with it directly.
> >>>>>>> Cheers,
> >>>>>>> Abdullah.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Fri, Feb 19, 2016 at 9:40 AM, Yiran Wang <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> Steven,
> >>>>>>>>
> >>>>>>>> Thanks for getting back to me so quickly! I wasn't clear. Here is what
> >>>>>>>> happened:
> >>>>>>>>
> >>>>>>>> I test-loaded the first 32 files, no problem. I deleted the dataset,
> >>>>>>>> created a new one, and tried to load the entire 76 files into the newly
> >>>>>>>> created (hence empty) dataset.
> >>>>>>>>
> >>>>>>>> It took about 2mins after executing the query for the error message to
> >>>>>>>> show up. There are currently 31710406 rows of data in the dataset, despite
> >>>>>>>> the error message (so it looks like it did load).
> >>>>>>>>
> >>>>>>>> So my questions are: 1) why did I still get that error message when I
> >>>>>>>> was loading to an empty dataset; and 2) I'm not sure if all the data from
> >>>>>>>> the 76 file are fully loaded. Is there other ways to check, besides trying
> >>>>>>>> to load it again and hope this time I don't get the error?
> >>>>>>>>
> >>>>>>>> Thanks!
> >>>>>>>> Yiran
> >>>>>>>>
> >>>>>>>> On Thu, Feb 18, 2016 at 10:29 PM, Steven Jacobs <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>> Welcome! We are an Apache incubator project now so I added the
> >>>>>>>>> correct mailing list. Our "load" statement only works on an empty dataset.
> >>>>>>>>> Subsequent data needs to be added with an insert or a feed. You should be
> >>>>>>>>> able to load all 76 files at once though (starting from empty).
> >>>>>>>>> Steven
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Thursday, February 18, 2016, Yiran Wang <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Asterix team!
> >>>>>>>>>>
> >>>>>>>>>> I've come across this error when I was trying to load 76 files into
> >>>>>>>>>> a dataset. When I test-loaded the first 32 files, there wasn't such an
> >>>>>>>>>> error. All 76 files are of the same data format.
> >>>>>>>>>>
> >>>>>>>>>> Can you help interpret what this error message means?
> >>>>>>>>>>
> >>>>>>>>>> Thanks!
> >>>>>>>>>> Yiran
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Best,
> >>>>>>>>>> Yiran
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> You received this message because you are subscribed to the Google
> >>>>>>>>>> Groups "asterixdb-dev" group.
> >>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
> >>>>>>>>>> send an email to [email protected].
> >>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
> >>>>>>>>> --
> >>>>>>>>> You received this message because you are subscribed to the Google
> >>>>>>>>> Groups "asterixdb-users" group.
> >>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
> >>>>>>>>> send an email to [email protected].
> >>>>>>>>> For more options, visit https://groups.google.com/d/optout.
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Best,
> >>>>>>>> Yiran
> >>>>>
> >>>>> --
> >>>>> Best,
> >>>>> Yiran
