Re: Cannot load an index that is not empty [TreeIndexException]

abdullah alamoudi Sat, 20 Feb 2016 21:50:26 -0800

Totally agree. It is probably better to first make sure the load works well
with that many tasks, and then fix the number of readers.


Cheers,
Abdullah.

On Sun, Feb 21, 2016 at 2:04 AM, Mike Carey <[email protected]> wrote:

> Sounds like the load job parallelism needs a redo - it probably shouldn't
> be more than the number of target partitions IMO...?
> On Feb 20, 2016 12:41 PM, "abdullah alamoudi" <[email protected]> wrote:
>
> > I have an idea that might explain this strange behavior. I believe it
> > could be due to the number of task partitions being very high, assuming
> > each of the 76 files is read in a separate task.
> > This could lead to corner cases that we did not consider before. Given
> > that the number of threads in the task thread pool is less than 76, some
> > tasks will not be able to start until others have completed execution.
> >
> > Just a thought,
> > Abdullah.
> >
> > On Fri, Feb 19, 2016 at 9:43 PM, abdullah alamoudi <[email protected]>
> > wrote:
> >
> > > Yiran,
> > > Here is one problem causing a failure:
> > > edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
> > > edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
> > > edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException:
> > > Input stream given to BTree bulk load has duplicates.
> > >
> > > which tells us that Input stream given to BTree bulk load has
> > > duplicates.
> > > The question is why this was not returned as the error message? We
> > > need to look into that.
> > >
> > > I will continue looking at the log file to see if there were other
> > > issues.
> > >
> > > Can you share with us the load statement you're using? I would like to
> > > see how you're loading all the files. We might be able to suggest a way
> > > to make it work better.
> > >
> > > Cheers,
> > > Abdullah.
> > >
> > > On Fri, Feb 19, 2016 at 9:31 PM, Yiran Wang <[email protected]> wrote:
> > >
> > >> Abdullah,
> > >>
> > >> Here is the log attached. Thank you all very much for looking into
> > >> this.
> > >>
> > >> Ian - I have two query questions besides this loading issue. I was
> > >> wondering if I can meet briefly with you (or over email) regarding
> > >> that.
> > >>
> > >> Thanks!
> > >> Yiran
> > >>
> > >> On Fri, Feb 19, 2016 at 9:38 AM, Mike Carey <[email protected]> wrote:
> > >>
> > >>> Maybe Ian can visit the cluster with Yiran later today?
> > >>> On Feb 19, 2016 1:31 AM, "abdullah alamoudi" <[email protected]> wrote:
> > >>>
> > >>>> Yiran,
> > >>>> Can you share the logs? It would help us identify the actual cause
> > >>>> of this failure much faster.
> > >>>>
> > >>>> I am pretty sure you know this, but in case you don't, you can get
> > >>>> the logs using
> > >>>> >managix log -n <instance-name>
> > >>>>
> > >>>> Also, it would be nice if someone from the team has access to the
> > >>>> cluster so we can work with it directly.
> > >>>> Cheers,
> > >>>> Abdullah.
> > >>>>
> > >>>>
> > >>>> On Fri, Feb 19, 2016 at 9:40 AM, Yiran Wang <[email protected]> wrote:
> > >>>>
> > >>>>> Steven,
> > >>>>>
> > >>>>> Thanks for getting back to me so quickly! I wasn't clear. Here is
> > >>>>> what happened:
> > >>>>>
> > >>>>> I test-loaded the first 32 files, no problem. I deleted the dataset,
> > >>>>> created a new one, and tried to load the entire 76 files into the
> > >>>>> newly created (hence empty) dataset.
> > >>>>>
> > >>>>> It took about 2 minutes after executing the query for the error
> > >>>>> message to show up. There are currently 31710406 rows of data in the
> > >>>>> dataset, despite the error message (so it looks like it did load).
> > >>>>>
> > >>>>> So my questions are: 1) why did I still get that error message when
> > >>>>> I was loading into an empty dataset; and 2) I'm not sure if all the
> > >>>>> data from the 76 files is fully loaded. Are there other ways to
> > >>>>> check, besides trying to load it again and hoping that this time I
> > >>>>> don't get the error?
> > >>>>>
> > >>>>> Thanks!
> > >>>>> Yiran
> > >>>>>
> > >>>>> On Thu, Feb 18, 2016 at 10:29 PM, Steven Jacobs <[email protected]>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Hi,
> > >>>>>> Welcome! We are an Apache incubator project now so I added the
> > >>>>>> correct mailing list. Our "load" statement only works on an empty
> > >>>>>> dataset. Subsequent data needs to be added with an insert or a
> > >>>>>> feed. You should be able to load all 76 files at once though
> > >>>>>> (starting from empty).
> > >>>>>> Steven
> > >>>>>>
> > >>>>>>
> > >>>>>> On Thursday, February 18, 2016, Yiran Wang <[email protected]> wrote:
> > >>>>>>
> > >>>>>>> Hi Asterix team!
> > >>>>>>>
> > >>>>>>> I came across this error while trying to load 76 files into a
> > >>>>>>> dataset. When I test-loaded the first 32 files, there was no
> > >>>>>>> such error. All 76 files are of the same data format.
> > >>>>>>>
> > >>>>>>> Can you help interpret what this error message means?
> > >>>>>>>
> > >>>>>>> Thanks!
> > >>>>>>> Yiran
> > >>>>>>>
> > >>>>>>> --
> > >>>>>>> Best,
> > >>>>>>> Yiran
> > >>>>>>>
> > >>>>>>> --
> > >>>>>>> You received this message because you are subscribed to the Google
> > >>>>>>> Groups "asterixdb-dev" group.
> > >>>>>>> To unsubscribe from this group and stop receiving emails from it,
> > >>>>>>> send an email to [email protected].
> > >>>>>>> For more options, visit https://groups.google.com/d/optout.
> > >>>>>>>
> > >>>>>> --
> > >>>>>> You received this message because you are subscribed to the Google
> > >>>>>> Groups "asterixdb-users" group.
> > >>>>>> To unsubscribe from this group and stop receiving emails from it,
> > >>>>>> send an email to [email protected].
> > >>>>>> For more options, visit https://groups.google.com/d/optout.
> > >>>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> --
> > >>>>> Best,
> > >>>>> Yiran
> > >>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>
> > >>
> > >>
> > >>
> > >> --
> > >> Best,
> > >> Yiran
> > >>
> > >>
> > >
> > >
> >
>
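
A note on the exception quoted in this thread, which reports that the input
stream given to the BTree bulk load has duplicates. Assuming the bulk load
consumes keys in sorted order and rejects a key equal to its predecessor (an
assumption about the behavior, not taken from the Hyracks source), a check of
that kind might look like the Python sketch below. The function name
`first_duplicate` and the sample keys are hypothetical.

```python
def first_duplicate(sorted_keys):
    """Return the first key that equals its predecessor, or None.

    Assumes sorted_keys is already sorted, so equal keys are adjacent.
    """
    have_previous = False
    previous = None
    for key in sorted_keys:
        if have_previous and key == previous:
            return key
        previous = key
        have_previous = True
    return None


# Hypothetical primary keys gathered from two input files.
keys_file_a = [101, 102, 105]
keys_file_b = [103, 105, 107]

duplicate = first_duplicate(sorted(keys_file_a + keys_file_b))
print(duplicate)  # prints 105
```

Running a check like this over the primary keys of all 76 files, outside the
system, would show whether the same key appears in more than one file. The
thread does not confirm that this was the cause of the failure.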

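The thread-pool hypothesis discussed in this thread (76 reader tasks, fewer
worker threads) can be illustrated in isolation. The following Python is a
minimal sketch, not Hyracks code: `run_tasks`, the pool size of 8, and the
sleep that stands in for reading a file are all assumptions made for
illustration. It shows that with a fixed-size pool the number of tasks running
at once never exceeds the pool size, so the remaining tasks wait until a
worker becomes free.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_tasks(num_tasks, pool_size, work_seconds=0.01):
    """Run num_tasks on a fixed-size pool; return the peak number running at once."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def task(_):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(work_seconds)  # stand-in for reading one input file
        with lock:
            running -= 1

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        # list() forces every task to finish and surfaces any exception.
        list(pool.map(task, range(num_tasks)))
    return peak


# 76 tasks as in the thread; the pool size of 8 is a made-up value.
peak = run_tasks(num_tasks=76, pool_size=8)
print(peak)  # at most 8
```

If tasks in such a pool also wait on output from tasks that have not started
yet, the queued tasks can stall the running ones, which is the kind of corner
case the hypothesis points at. The sketch does not model that dependency.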