Re: Question about open indexes

Mike Carey Fri, 25 Sep 2015 20:28:17 -0700

Seems like less work, I'd think, to fix Hyracks first?
On Sep 25, 2015 5:34 PM, "Chen Li" <[email protected]> wrote:


> I vote for including this fix in the next Asterix/Hyracks release, not this
> one.
>
> Chen
>
> On Fri, Sep 25, 2015 at 4:23 PM, Ildar Absalyamov <
> [email protected]> wrote:
>
> > It did not really occur to me during today's meeting, but Preston
> > pointed out that the secondary index delete fix that I proposed spans
> > both the Hyracks & Asterix codebases. Thus we will either have to release
> > Hyracks once again, or bite the bullet, sign the RC without fixing
> > this issue, and create bug-fix releases for both Hyracks & Asterix right
> > after.
> >
> > > On Sep 22, 2015, at 22:27, Mike Carey <[email protected]> wrote:
> > >
> > > Ah - that makes sense now.  Thx.  (And welcome back. :-))
> > >
> > > On 9/22/15 10:02 PM, Ildar Absalyamov wrote:
> > >> Sorry for the confusion, my initial answer was not correct enough; I
> > >> probably should have waited some time after I drove 1500 miles from
> > >> Seattle :)
> > >> The casting in the insert pipeline, which Abdullah mentioned, is needed
> > >> only for secondary index insert. The reasoning behind this casting is to
> > >> ensure that the record is equivalent, thus it is safe to create an open
> > >> index. It is true that we can get <Pk, Sk> pairs out of the original
> > >> record using get-field-by-name\index, but the cast operator is introduced
> > >> merely to kill the pipeline if the dataset input is not correct.
> > >> Thus the records in primary are never touched or modified, no matter
> > >> what indexes were created.
> > >> I am not sure, however, what the second cast in Abdullah’s plan is, and
> > >> where it comes from.
> > >>
> > >> @Taewoo, so the scan-delete-btree-secondary-index-open test does not
> > >> actually delete data from the secondary index? I have checked the plan
> > >> and it has the delete operator. Maybe it is initialized with the wrong
> > >> parameters; I’ll have a closer look.
> > >>
> > >>> On Sep 22, 2015, at 18:33, Mike Carey <[email protected]> wrote:
> > >>>
> > >>> Sounds kinda bad!  Also, I wonder what happens when the compiler
> > >>> encounters records in the dataset - whose type in the catalog doesn't
> > >>> claim to have a given (but now indexed) open field - e.g., during a
> > >>> data scan or an access via some other path?  Can Bad Things Happen due
> > >>> to the compiler not properly anticipating the casted form of the
> > >>> records?  (Maybe I am misunderstanding something, but we should
> > >>> probably take a careful look at the test cases - and make sure we do
> > >>> things like add a bunch of records, then add such an index, then add
> > >>> some more records, then stress-test type-related things that come at
> > >>> the dataset (i) thru the index, (ii) thru a primary dataset scan, and
> > >>> (iii) thru some other index.)
> > >>>
> > >>> On 9/22/15 4:06 PM, Taewoo Kim wrote:
> > >>>> I think this issue:
> > >>>> https://issues.apache.org/jira/browse/ASTERIXDB-1109 is
> > >>>> related. Currently, index entries (SK, PK) are not deleted on an
> > >>>> open-type secondary index during a deletion. This issue was not
> > >>>> surfaced due to the fact that every search after a secondary index
> > >>>> search had to go through the primary index lookup.
> > >>>>
> > >>>> Best,
> > >>>> Taewoo
> > >>>>
> > >>>> On Tue, Sep 22, 2015 at 12:04 AM, Ildar Absalyamov <
> > >>>> [email protected]> wrote:
> > >>>>
> > >>>>> Abdullah,
> > >>>>>
> > >>>>> If I remember correctly, whenever a secondary open index is created,
> > >>>>> all existing records are cast to the proper type to ensure that the
> > >>>>> index creation is valid.
> > >>>>> As for the overall correctness of the casting operation, semantically
> > >>>>> creating an open index is the same thing as altering the dataset type.
> > >>>>> The current implementation allows only one open index of a particular
> > >>>>> type to be created on a single field. If we had “alter datatype”
> > >>>>> functionality, open indexing would not be required at all.
> > >>>>>
> > >>>>>> On Sep 21, 2015, at 23:25, abdullah alamoudi <[email protected]> wrote:
> > >>>>>>
> > >>>>>> More thoughts:
> > >>>>>> I assume the intention of the cast was just to make sure that, if the
> > >>>>>> open field exists, it is of the specified type. Moreover, the
> > >>>>>> un-casted record should be inserted into the index.
> > >>>>>> If my assumptions are not correct, please let me know ASAP.
> > >>>>>>
> > >>>>>> I have two thoughts on this:
> > >>>>>> 1. Actually, insert plans show that the record being inserted into
> > >>>>>> the primary index is the casted record, creating the issue described
> > >>>>>> above.
> > >>>>>>
> > >>>>>> 2. I don't believe this is the right way to ensure that the open
> > >>>>>> field, if it exists, is of the right type. Why not extract the field
> > >>>>>> using the field-access-by-name function and then verify the type
> > >>>>>> using the field tag?
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> On Tue, Sep 22, 2015 at 9:11 AM, abdullah alamoudi <[email protected]>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> Hi Dev, @Ildar,
> > >>>>>>>
> > >>>>>>> In the insert pipeline for datasets with open indexes, we introduce
> > >>>>>>> a cast function before the insert, and so one would expect the
> > >>>>>>> records to look like the casted record type, which I assume has
> > >>>>>>> {{the closed fields + a nullable field}}.
> > >>>>>>>
> > >>>>>>> The question is, what happens to the previously existing records,
> > >>>>>>> since now the index has both records of the original type and
> > >>>>>>> records of the casted type?
> > >>>>>>>
> > >>>>>>> Thanks,
> > >>>>>>> Abdullah.
> > >>>>>>>
> > >>>>> Best regards,
> > >>>>> Ildar
> > >>>>>
> > >>>>>
> > >> Best regards,
> > >> Ildar
> > >>
> > >>
> > >
> >
> > Best regards,
> > Ildar
> >
> >
>
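
A minimal sketch of two points from the quoted thread, written as a toy Python model rather than actual AsterixDB or Hyracks code. It assumes an open index can be modeled as a set of (SK, PK) pairs next to a primary map; ToyDataset, open_field, expected_type and clean_secondary are hypothetical names introduced only for illustration. It shows (a) a type check on the open field that aborts the insert while leaving the stored record unmodified, and (b) how a delete that skips the secondary index leaves a stale entry that stays invisible as long as every search goes through the primary lookup.

```python
# Hypothetical toy model; these names are NOT AsterixDB or Hyracks APIs.

class ToyDataset:
    def __init__(self, open_field, expected_type):
        self.open_field = open_field        # name of the indexed open field
        self.expected_type = expected_type  # type the open index enforces
        self.primary = {}                   # pk -> record, stored as given
        self.secondary = set()              # (sk, pk) entries of the index

    def _checked_key(self, record):
        # Extract the field by name and verify its type; a mismatch aborts
        # the insert (the role the thread attributes to the cast operator).
        value = record.get(self.open_field)
        if value is not None and not isinstance(value, self.expected_type):
            raise TypeError(f"{self.open_field!r} has the wrong type")
        return value

    def insert(self, pk, record):
        sk = self._checked_key(record)
        self.primary[pk] = record           # the record itself is untouched
        if sk is not None:                  # absent open field: no entry
            self.secondary.add((sk, pk))

    def delete(self, pk, clean_secondary):
        record = self.primary.pop(pk)
        sk = record.get(self.open_field)
        if clean_secondary and sk is not None:
            self.secondary.discard((sk, pk))

    def search(self, sk):
        # Secondary index search followed by a primary index lookup.
        hits = [pk for (key, pk) in self.secondary if key == sk]
        return [self.primary[pk] for pk in sorted(hits) if pk in self.primary]


ds = ToyDataset("age", int)
ds.insert(1, {"id": 1, "age": 30})
ds.insert(2, {"id": 2})                     # open field absent: allowed
ds.delete(1, clean_secondary=False)         # models the reported bug
print(ds.search(30))                        # primary lookup hides the entry
print((30, 1) in ds.secondary)              # but the stale entry remains
```

Calling delete with clean_secondary=True removes the (SK, PK) pair as well, which is the behavior the proposed fix is after; how that is wired through the real delete operator is not something this sketch tries to represent.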
