Thanks, everyone, for the quick investigation and resolution today - awesome! (Looks like we have some error-case user experience work to do to handle such cases more cleanly? :-))
Cheers,
Mike

On 2/19/16 4:38 PM, Yiran Wang wrote:
I'm not sure the loading issue was a bug. It looks like it was caused by duplicated primary keys in my dataset, and it was resolved once I updated the primary keys. I wasn't sure what the error message meant when I first started this email thread.
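
For anyone hitting the same error, here is a minimal sketch of checking input files for duplicated primary keys before running a load. It assumes one JSON record per line and a key field named `id`; both are assumptions for illustration, not details from this thread, and `find_duplicate_keys` is a hypothetical helper, not part of AsterixDB.

```python
import json
from collections import defaultdict


def find_duplicate_keys(paths, key_field="id"):
    """Return {key: [(path, line_number), ...]} for keys seen more than once.

    Assumes one JSON record per line and a key field named `key_field`
    (both hypothetical; adapt to the real file format and schema).
    """
    seen = defaultdict(list)
    for path in paths:
        with open(path, encoding="utf-8") as f:
            for line_number, line in enumerate(f, start=1):
                line = line.strip()
                if not line:
                    continue  # skip blank lines
                record = json.loads(line)
                seen[record[key_field]].append((path, line_number))
    # Keep only the keys that occur in more than one place.
    return {key: locs for key, locs in seen.items() if len(locs) > 1}
```

The result maps each duplicated key to every file and line where it occurs, which points directly at the problematic file. All keys are held in memory, so for tens of millions of rows a sort-based check may be a better fit.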

Thanks all for jumping on this quickly!
Yiran


On Fri, Feb 19, 2016 at 4:25 PM, Chen Li <[email protected] <mailto:[email protected]>> wrote:

    It would be good to file a ticket to keep track of this bug.

    Chen

    On Fri, Feb 19, 2016 at 3:40 PM, Young-Seok Kim <[email protected]
    <mailto:[email protected]>> wrote:

        Please let us know how it goes with the change.

        Young-Seok

        On Fri, Feb 19, 2016 at 3:23 PM, Yiran Wang <[email protected]
        <mailto:[email protected]>> wrote:

            Young-Seok,

            I will just go ahead and change the duplicated keys I have
            in my original file. That should solve my loading problem.
            I was describing what's going on in case that is relevant
            for you to understand why a lot of files still got loaded
            into the dataset.

            Thanks!
            Yiran

            On Fri, Feb 19, 2016 at 3:19 PM, Young-Seok Kim
            <[email protected] <mailto:[email protected]>> wrote:

                Yiran,

                Could you show all of the AQL statements involved in the
                load and indicate which file contains the duplicated
                primary keys?
                Then we can better understand what's going on and,
                hopefully, find a solution.

                On Fri, Feb 19, 2016 at 2:58 PM, Yiran Wang
                <[email protected] <mailto:[email protected]>> wrote:

                    Young-Seok,

                    Thank you for your feedback. You are right: there
                    are some duplicated primary keys. It took me some
                    time, but I did locate the file that the
                    duplicated primary keys come from.

                    If the load function loads files in the order they
                    are written in the query, the problematic file is
                    located towards the end. Maybe that is why many
                    instances still got loaded into the dataset before
                    it hit the problematic file?

                    Thanks again,
                    Yiran

                    On Fri, Feb 19, 2016 at 10:53 AM, Young-Seok Kim
                    <[email protected] <mailto:[email protected]>> wrote:

                        From a quick look at the log, there seem to be
                        duplicated primary keys in the files to be
                        loaded.
                        That seems to be the first cause of the problem.
                        But I'm not sure why the load query continued
                        trying to load data instead of stopping when
                        the duplication was found.
                        This unexpected behavior seems to have led to
                        the "Cannot load an index that is not empty"
                        exception.

                        The following is a snippet of the exceptions
                        that appear in the attached log file.

                        ---------------------------------------
                        SEVERE: Setting uncaught exception handler edu.uci.ics.hyracks.api.lifecycle.LifeCycleComponentManager@46844c3d
                        edu.uci.ics.hyracks.api.exceptions.HyracksDataException: edu.uci.ics.hyracks.api.exceptions.HyracksDataException: edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException: Input stream given to BTree bulk load has duplicates.
                        Caused by: edu.uci.ics.hyracks.api.exceptions.HyracksDataException: edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException: Input stream given to BTree bulk load has duplicates.
                        Caused by: edu.uci.ics.hyracks.storage.am.common.exceptions.TreeIndexDuplicateKeyException: Input stream given to BTree bulk load has duplicates.
                        edu.uci.ics.hyracks.api.exceptions.HyracksDataException: edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException: Cannot load an index that is not empty
                        Caused by: edu.uci.ics.hyracks.storage.am.common.api.TreeIndexException: Cannot load an index that is not empty

                        Best,
                        Young-Seok

                        On Fri, Feb 19, 2016 at 10:31 AM, Yiran Wang
                        <[email protected] <mailto:[email protected]>>
                        wrote:

                            Abdullah,

                            The log is attached. Thank you all
                            very much for looking into this.

                            Ian - I have two query questions besides
                            this loading issue. I was wondering if I
                            could meet with you briefly (or ask over
                            email) about them.

                            Thanks!
                            Yiran

                            On Fri, Feb 19, 2016 at 9:38 AM, Mike
                            Carey <[email protected]
                            <mailto:[email protected]>> wrote:

                                Maybe Ian can visit the cluster with
                                Yiran later today?

                                On Feb 19, 2016 1:31 AM, "abdullah
                                alamoudi" <[email protected]
                                <mailto:[email protected]>> wrote:

                                    Yiran,
                                    Can you share the logs? It would
                                    help us identify the actual
                                    cause of this failure much faster.

                                    I am pretty sure you know this, but
                                    in case you don't, you can get
                                    the logs using
                                    >managix log -n <instance-name>

                                    Also, it would be nice if someone
                                    from the team had access to the
                                    cluster so we could work with it
                                    directly.
                                    Cheers,
                                    Abdullah.


                                    On Fri, Feb 19, 2016 at 9:40 AM,
                                    Yiran Wang <[email protected]
                                    <mailto:[email protected]>> wrote:

                                        Steven,

                                        Thanks for getting back to me
                                        so quickly! I wasn't clear.
                                        Here is what happened:

                                        I test-loaded the first 32
                                        files, no problem. I deleted
                                        the dataset, created a new
                                        one, and tried to load all
                                        76 files into the newly
                                        created (hence empty) dataset.

                                        It took about 2 minutes after
                                        executing the query for the
                                        error message to show up.
                                        There are currently 31710406
                                        rows of data in the dataset,
                                        despite the error message (so
                                        it looks like it did load).

                                        So my questions are: 1) why
                                        did I still get that error
                                        message when I was loading into
                                        an empty dataset; and 2) I'm
                                        not sure whether all the data
                                        from the 76 files was fully
                                        loaded. Are there other ways to
                                        check, besides trying to load
                                        it again and hoping that this
                                        time I don't get the error?

                                        Thanks!
                                        Yiran

                                        On Thu, Feb 18, 2016 at 10:29
                                        PM, Steven Jacobs
                                        <[email protected]
                                        <mailto:[email protected]>> wrote:

                                            Hi,
                                            Welcome! We are an Apache
                                            incubator project now so I
                                            added the correct mailing
                                            list. Our "load" statement
                                            only works on an empty
                                            dataset. Subsequent data
                                            needs to be added with an
                                            insert or a feed. You
                                            should be able to load all
                                            76 files at once though
                                            (starting from empty).
                                            Steven


                                            On Thursday, February 18,
                                            2016, Yiran Wang
                                            <[email protected]
                                            <mailto:[email protected]>> wrote:

                                                Hi Asterix team!

                                                I came across this
                                                error while trying
                                                to load 76 files
                                                into a dataset.
                                                When I test-loaded the
                                                first 32 files, there
                                                wasn't such an error.
                                                All 76 files are of
                                                the same data format.

                                                Can you help interpret
                                                what this error
                                                message means?

                                                Thanks!
                                                Yiran

                                                --
                                                Best,
                                                Yiran
                                                --
                                                You received this
                                                message because you
                                                are subscribed to the
                                                Google Groups
                                                "asterixdb-dev" group.
                                                To unsubscribe from
                                                this group and stop
                                                receiving emails from
                                                it, send an email to
                                                [email protected].
                                                For more options,
                                                visit
                                                https://groups.google.com/d/optout.

                                            --
                                            You received this message
                                            because you are subscribed
                                            to the Google Groups
                                            "asterixdb-users" group.
                                            To unsubscribe from this
                                            group and stop receiving
                                            emails from it, send an
                                            email to
                                            [email protected]
                                            <mailto:[email protected]>.
                                            For more options, visit
                                            https://groups.google.com/d/optout.






















