Seems like less work, I'd think, to fix Hyracks first?

On Sep 25, 2015 5:34 PM, "Chen Li" <[email protected]> wrote:
> I vote for including this fix in the next Asterix/Hyracks release, not this
> one.
>
> Chen
>
> On Fri, Sep 25, 2015 at 4:23 PM, Ildar Absalyamov <[email protected]> wrote:
>
> > It did not really occur to me during today's meeting, but Preston
> > pointed out that the secondary index delete fix that I proposed spans
> > both the Hyracks and Asterix codebases. Thus we will either have to release
> > Hyracks once again, or bite the bullet, sign the RC without fixing
> > this issue, and create bug-fix releases for both Hyracks and Asterix right
> > after.
> >
> > > On Sep 22, 2015, at 22:27, Mike Carey <[email protected]> wrote:
> > >
> > > Ah - that makes sense now. Thx. (And welcome back. :-))
> > >
> > > On 9/22/15 10:02 PM, Ildar Absalyamov wrote:
> > >> Sorry for the confusion; my initial answer was not correct enough. I
> > >> probably should have waited some time after I drove 1500 miles from
> > >> Seattle :)
> > >> The casting in the insert pipeline, which Abdullah mentioned, is needed
> > >> only for secondary index insert. The reasoning behind this casting is to
> > >> ensure that the record is equivalent, thus it is safe to create an open
> > >> index. It is true that we can get <Pk, Sk> pairs out of the original
> > >> record using get-field-by-name\index, but the cast operator is introduced
> > >> merely to kill the pipeline if the dataset input is not correct.
> > >> Thus the records in the primary index are never touched or modified, no
> > >> matter what indexes were created.
> > >> I am not sure, however, what the second cast in Abdullah’s plan is, and
> > >> where it comes from.
> > >>
> > >> @Taewoo, so the scan-delete-btree-secondary-index-open test does not
> > >> actually delete data from the secondary index? I have checked the plan
> > >> and it has the delete operator. Maybe it is initialized with the wrong
> > >> parameters; I’ll have a close look.
> > >>
> > >>> On Sep 22, 2015, at 18:33, Mike Carey <[email protected]> wrote:
> > >>>
> > >>> Sounds kinda bad! Also, I wonder what happens when the compiler
> > >>> encounters records in the dataset - whose type in the catalog doesn't
> > >>> claim to have a given (but now indexed) open field - e.g., during a
> > >>> data scan or an access via some other path? Can Bad Things Happen due
> > >>> to the compiler not properly anticipating the casted form of the
> > >>> records? (Maybe I am misunderstanding something, but we should probably
> > >>> take a careful look at the test cases - and make sure we do things like
> > >>> add a bunch of records, then add such an index, then add some more
> > >>> records, then stress-test type-related things that come at the dataset
> > >>> (i) thru the index, (ii) thru a primary dataset scan, and (iii) thru
> > >>> some other index.)
> > >>>
> > >>> On 9/22/15 4:06 PM, Taewoo Kim wrote:
> > >>>> I think this issue:
> > >>>> https://issues.apache.org/jira/browse/ASTERIXDB-1109 is
> > >>>> related. Currently, index entries (SK, PK) are not deleted on an
> > >>>> open-type secondary index during a deletion. This issue was not
> > >>>> surfaced due to the fact that every search after a secondary index
> > >>>> search had to go through the primary index lookup.
> > >>>>
> > >>>> Best,
> > >>>> Taewoo
> > >>>>
> > >>>> On Tue, Sep 22, 2015 at 12:04 AM, Ildar Absalyamov <
> > >>>> [email protected]> wrote:
> > >>>>
> > >>>>> Abdullah,
> > >>>>>
> > >>>>> If I remember correctly, whenever a secondary open index is created,
> > >>>>> all existing records are cast to a proper type to ensure that the
> > >>>>> index creation is valid.
> > >>>>> As for the overall correctness of the casting operation, semantically
> > >>>>> creating an open index is the same thing as altering the dataset
> > >>>>> type. The current implementation allows only one open index of a
> > >>>>> particular type to be created on a single field. If we had “alter
> > >>>>> datatype” functionality, open indexing would not be required at all.
> > >>>>>
> > >>>>>> On Sep 21, 2015, at 23:25, abdullah alamoudi <[email protected]>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>> More thoughts:
> > >>>>>> I assume the intention of the cast was just to make sure that, if the
> > >>>>>> open field exists, it is of the specified type. Moreover, the
> > >>>>>> un-casted record should be inserted into the index.
> > >>>>>> If my assumptions are not correct, please let me know ASAP.
> > >>>>>>
> > >>>>>> I have two thoughts on this:
> > >>>>>> 1. Insert plans show that the record being inserted into the primary
> > >>>>>> index is actually the casted record, creating the issue described
> > >>>>>> above.
> > >>>>>>
> > >>>>>> 2. I don't believe this is the right way to ensure that the open
> > >>>>>> field, if it exists, is of the right type. Why not extract the field
> > >>>>>> using the field-access-by-name function and then verify the type
> > >>>>>> using the field tag?
> > >>>>>>
> > >>>>>> On Tue, Sep 22, 2015 at 9:11 AM, abdullah alamoudi <
> > >>>>>> [email protected]> wrote:
> > >>>>>>
> > >>>>>>> Hi Dev, @Ildar,
> > >>>>>>>
> > >>>>>>> In the insert pipeline for datasets with open indexes, we introduce
> > >>>>>>> a cast function before the insert, and so one would expect the
> > >>>>>>> records to look like the casted record type, which I assume has
> > >>>>>>> {{the closed fields + a nullable field}}.
> > >>>>>>>
> > >>>>>>> The question is, what happens to the previously existing records,
> > >>>>>>> since the index now has both records of the original type and
> > >>>>>>> records of the casted type?
> > >>>>>>>
> > >>>>>>> Thanks,
> > >>>>>>> Abdullah.
> > >>>>>>>
> > >>>>> Best regards,
> > >>>>> Ildar
> > >>>>>
> > >> Best regards,
> > >> Ildar
> > >>
> >
> > Best regards,
> > Ildar
> >
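
For readers following the ASTERIXDB-1109 point in Taewoo's message, here is a minimal toy sketch of why a missing secondary-index delete can go unnoticed when every secondary hit is resolved through the primary index. It is an illustrative model only: the class, method, and field names (ToyDataset, delete_buggy, delete_fixed, "age") are hypothetical and are not AsterixDB or Hyracks code.

```python
# Toy model (hypothetical names; not AsterixDB/Hyracks code):
# a primary index keyed by PK and a secondary index of (SK, PK) pairs.

class ToyDataset:
    def __init__(self, indexed_field="age"):
        self.indexed_field = indexed_field  # assumed open field being indexed
        self.primary = {}                   # pk -> record
        self.secondary = set()              # (sk, pk) entries

    def insert(self, pk, record):
        self.primary[pk] = record
        if self.indexed_field in record:
            self.secondary.add((record[self.indexed_field], pk))

    def delete_buggy(self, pk):
        # Models the reported behaviour: the primary entry is removed,
        # but the (SK, PK) entry is left behind in the secondary index.
        self.primary.pop(pk, None)

    def delete_fixed(self, pk):
        # Removes the record and its matching (SK, PK) entry.
        record = self.primary.pop(pk, None)
        if record is not None and self.indexed_field in record:
            self.secondary.discard((record[self.indexed_field], pk))

    def search_by_secondary(self, sk):
        # Each secondary hit is looked up in the primary index, so a stale
        # (SK, PK) entry whose PK no longer exists is silently dropped.
        return [
            self.primary[pk]
            for (s, pk) in sorted(self.secondary)
            if s == sk and pk in self.primary
        ]
```

In this model a query through the secondary index returns correct results even after delete_buggy, while the stale entry stays in the secondary set; only inspecting the index directly, or skipping the primary lookup, would expose it.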
