Done.

Abdullah.

On Sun, Feb 21, 2016 at 8:52 AM, Till Westmann <[email protected]> wrote:

> Sounds like a good candidate for a JIRA issue, so we won't forget. :)
>
> Cheers,
> Till
>
> > On Feb 20, 2016, at 21:44, abdullah alamoudi <[email protected]> wrote:
> >
> > Totally agree. It's probably better to make sure it works nicely with that
> > many tasks and then fix the number of readers.
> >
> > Cheers,
> > Abdullah.
> >
> >> On Sun, Feb 21, 2016 at 2:04 AM, Mike Carey <[email protected]> wrote:
> >>
> >> Sounds like the load job parallelism needs a redo - it probably shouldn't
> >> be more than the number of target partitions IMO...?
> >>> On Feb 20, 2016 12:41 PM, "abdullah alamoudi" <[email protected]> wrote:
> >>>
> >>> I have an idea that might explain why such strange behavior happened. I
> >>> believe it could be due to the number of task partitions being very high,
> >>> assuming each of the 76 files is being read in a separate task.
> >>> This could lead to corner cases that we didn't consider before: since the
> >>> number of threads in the task thread pool is less than 76, some tasks
> >>> will not be able to start until others have completed execution.
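> >>>
> >>> To illustrate that corner case with a generic sketch (plain Java, not the
> >>> actual Hyracks task scheduler; the pool size of 8 is an assumed number),
> >>> a fixed-size thread pool queues any tasks beyond its thread count until a
> >>> worker becomes free:
> >>>
> >>> import java.util.concurrent.ExecutorService;
> >>> import java.util.concurrent.Executors;
> >>> import java.util.concurrent.TimeUnit;
> >>>
> >>> public class TaskBacklogDemo {
> >>>     public static void main(String[] args) throws InterruptedException {
> >>>         // Assumed setup: fewer worker threads (8) than file-reading tasks (76).
> >>>         ExecutorService pool = Executors.newFixedThreadPool(8);
> >>>         for (int i = 0; i < 76; i++) {
> >>>             final int taskId = i;
> >>>             pool.submit(() -> {
> >>>                 // Tasks beyond the first 8 wait in the queue until earlier ones finish.
> >>>                 System.out.println("task " + taskId + " on "
> >>>                         + Thread.currentThread().getName());
> >>>                 try {
> >>>                     Thread.sleep(100); // stand-in for reading one file
> >>>                 } catch (InterruptedException e) {
> >>>                     Thread.currentThread().interrupt();
> >>>                 }
> >>>             });
> >>>         }
> >>>         pool.shutdown();
> >>>         pool.awaitTermination(1, TimeUnit.MINUTES);
> >>>     }
> >>> }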
> >>>
> >>> Just a thought,
> >>> Abdullah.
> >>>
> >>> On Fri, Feb 19, 2016 at 9:43 PM, abdullah alamoudi <[email protected]> wrote:
> >>>
> >>>> Yiran,
> >>>> Here is one problem causing a failure:
> >>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
> >>>> edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
> >>>> edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
> >>>> Input stream given to BTree bulk load has duplicates.
> >>>>
> >>>> which tells us that the input stream given to the BTree bulk load has
> >>>> duplicates. The question is: why was this not returned as the error
> >>>> message? We need to look into that.
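> >>>>
> >>>> One way to surface the root cause rather than the outer wrapper (a
> >>>> hypothetical helper sketched in plain Java, not existing AsterixDB code)
> >>>> is to walk the exception's cause chain before building the reported
> >>>> message:
> >>>>
> >>>> // Hypothetical utility: report the deepest cause's message, e.g. the
> >>>> // TreeIndexDuplicateKeyException text rather than the HyracksDataException
> >>>> // wrappers around it.
> >>>> public final class RootCause {
> >>>>     private RootCause() {
> >>>>     }
> >>>>
> >>>>     public static String rootMessage(Throwable t) {
> >>>>         Throwable current = t;
> >>>>         while (current.getCause() != null && current.getCause() != current) {
> >>>>             current = current.getCause();
> >>>>         }
> >>>>         return current.getMessage();
> >>>>     }
> >>>> }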
> >>>>
> >>>> I will continue looking at the log file to see if there were other
> >>>> issues.
> >>>>
> >>>> Can you share with us the load statement you're using? I would like to
> >>>> see how you're loading all the files. We might be able to suggest a way
> >>>> to make it work better.
> >>>>
> >>>> Cheers,
> >>>> Abdullah.
> >>>>
> >>>>> On Fri, Feb 19, 2016 at 9:31 PM, Yiran Wang <[email protected]> wrote:
> >>>>>
> >>>>> Abdullah,
> >>>>>
> >>>>> Here is the log attached. Thank you all very much for looking into
> >>>>> this.
> >>>>>
> >>>>> Ian - I have two query questions besides this loading issue. I was
> >>>>> wondering if I could meet briefly with you (or over email) regarding
> >>>>> that.
> >>>>>
> >>>>> Thanks!
> >>>>> Yiran
> >>>>>
> >>>>>> On Fri, Feb 19, 2016 at 9:38 AM, Mike Carey <[email protected]> wrote:
> >>>>>
> >>>>>> Maybe Ian can visit the cluster with Yiran later today?
> >>>>>> On Feb 19, 2016 1:31 AM, "abdullah alamoudi" <[email protected]> wrote:
> >>>>>>
> >>>>>>> Yiran,
> >>>>>>> Can you share the logs? It would help us identify the actual cause
> >>>>>>> of this failure much faster.
> >>>>>>>
> >>>>>>> I am pretty sure you know this, but in case you didn't, you can get
> >>>>>>> the logs using:
> >>>>>>>> managix log -n <instance-name>
> >>>>>>>
> >>>>>>> Also, it would be nice if someone from the team had access to the
> >>>>>>> cluster so we could work with it directly.
> >>>>>>> Cheers,
> >>>>>>> Abdullah.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Fri, Feb 19, 2016 at 9:40 AM, Yiran Wang <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> Steven,
> >>>>>>>>
> >>>>>>>> Thanks for getting back to me so quickly! I wasn't clear. Here is
> >>>>>>>> what happened:
> >>>>>>>>
> >>>>>>>> I test-loaded the first 32 files with no problem. I deleted the
> >>>>>>>> dataset, created a new one, and tried to load all 76 files into the
> >>>>>>>> newly created (hence empty) dataset.
> >>>>>>>>
> >>>>>>>> It took about 2 minutes after executing the query for the error
> >>>>>>>> message to show up. There are currently 31710406 rows of data in the
> >>>>>>>> dataset, despite the error message (so it looks like it did load).
> >>>>>>>>
> >>>>>>>> So my questions are: 1) why did I still get that error message when
> >>>>>>>> I was loading into an empty dataset; and 2) I'm not sure whether all
> >>>>>>>> the data from the 76 files was fully loaded. Are there other ways to
> >>>>>>>> check, besides trying to load it again and hoping that this time I
> >>>>>>>> don't get the error?
> >>>>>>>>
> >>>>>>>> Thanks!
> >>>>>>>> Yiran
> >>>>>>>>
> >>>>>>>> On Thu, Feb 18, 2016 at 10:29 PM, Steven Jacobs <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>> Welcome! We are an Apache incubator project now so I added the
> >>>>>>>>> correct mailing list. Our "load" statement only works on an empty
> >>>>>>>>> dataset. Subsequent data needs to be added with an insert or a feed.
> >>>>>>>>> You should be able to load all 76 files at once though (starting
> >>>>>>>>> from empty).
> >>>>>>>>> Steven
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Thursday, February 18, 2016, Yiran Wang <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Asterix team!
> >>>>>>>>>>
> >>>>>>>>>> I came across this error while trying to load 76 files into a
> >>>>>>>>>> dataset. When I test-loaded the first 32 files, there was no such
> >>>>>>>>>> error. All 76 files are in the same data format.
> >>>>>>>>>>
> >>>>>>>>>> Can you help interpret what this error message means?
> >>>>>>>>>>
> >>>>>>>>>> Thanks!
> >>>>>>>>>> Yiran
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Best,
> >>>>>>>>>> Yiran
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Best,
> >>>>>>>> Yiran
> >>>>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Best,
> >>>>> Yiran
> >>>>>