It did not really occur to me during today's meeting, but Preston pointed out that the secondary index delete fix I proposed spans both the Hyracks & Asterix codebases. Thus we will either have to release Hyracks once again, or bite the bullet, sign the RC without fixing this issue, and create bug-fix releases for both Hyracks & Asterix right after.
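For anyone who has not looked at ASTERIXDB-1109 recently, the scenario the fix targets is roughly the one below. This is only an illustrative sketch with made-up names (Orders, priceIdx), not the exact DDL from the scan-delete-btree-secondary-index-open test, and the AQL is written from memory, so the exact syntax may differ:

  drop dataverse test if exists;
  create dataverse test;
  use dataverse test;

  /* "price" is not part of the closed type; it only ever appears as an open field */
  create type OrderType as open {
    oid: int32
  };
  create dataset Orders(OrderType) primary key oid;

  /* enforced open index on the open field "price" */
  create index priceIdx on Orders(price: int32) enforced;

  insert into dataset Orders({"oid": 1, "price": 100});

  /* with the current code the <SK, PK> entry <100, 1> can be left behind
     in priceIdx after this delete, which is exactly what the fix addresses */
  delete $o from dataset Orders where $o.oid = 1;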
> On Sep 22, 2015, at 22:27, Mike Carey <[email protected]> wrote:
>
> Ah - that makes sense now. Thx. (And welcome back. :-))
>
> On 9/22/15 10:02 PM, Ildar Absalyamov wrote:
>> Sorry for the confusion, my initial answer was not correct enough; I probably
>> should have waited some time after driving 1500 miles from Seattle :)
>> The casting in the insert pipeline, which Abdullah mentioned, is needed only
>> for the secondary index insert. The reasoning behind this casting is to ensure
>> that the record is type-equivalent, so that it is safe to create an open index.
>> It is true that we can get <Pk, Sk> pairs out of the original record using
>> get-field-by-name\index, but the cast operator is introduced merely to kill
>> the pipeline if the dataset input is not correct.
>> Thus the records in the primary index are never touched or modified, no matter
>> what indexes were created.
>> I am not sure, however, what the second cast in Abdullah's plan is, and where
>> it comes from.
>>
>> @Taewoo, so the scan-delete-btree-secondary-index-open test does not actually
>> delete data from the secondary index? I have checked the plan and it has the
>> delete operator. Maybe it is initialized with wrong parameters; I'll have a
>> closer look.
>>
>>> On Sep 22, 2015, at 18:33, Mike Carey <[email protected]> wrote:
>>>
>>> Sounds kinda bad! Also, I wonder what happens when the compiler encounters
>>> records in the dataset whose type in the catalog doesn't claim to have a
>>> given (but now indexed) open field - e.g., during a data scan or an access
>>> via some other path? Can Bad Things Happen due to the compiler not
>>> properly anticipating the casted form of the records? (Maybe I am
>>> misunderstanding something, but we should probably take a careful look at
>>> the test cases - and make sure we do things like add a bunch of records,
>>> then add such an index, then add some more records, then stress-test
>>> type-related things that come at the dataset (i) thru the index, (ii) thru
>>> a primary dataset scan, and (iii) thru some other index.)
>>>
>>> On 9/22/15 4:06 PM, Taewoo Kim wrote:
>>>> I think this issue: https://issues.apache.org/jira/browse/ASTERIXDB-1109
>>>> is related. Currently, index entries (SK, PK) are not deleted from an
>>>> open-type secondary index during a deletion. This issue had not surfaced
>>>> because every search through a secondary index had to go through a
>>>> primary index lookup afterwards.
>>>>
>>>> Best,
>>>> Taewoo
>>>>
>>>> On Tue, Sep 22, 2015 at 12:04 AM, Ildar Absalyamov <
>>>> [email protected]> wrote:
>>>>
>>>>> Abdullah,
>>>>>
>>>>> If I remember correctly, whenever a secondary open index is created, all
>>>>> existing records are casted to the proper type to ensure that the index
>>>>> creation is valid.
>>>>> As for the overall correctness of the casting operation: semantically,
>>>>> creating an open index is the same thing as altering the dataset type.
>>>>> The current implementation allows only one open index of a particular
>>>>> type to be created on a single field. If we had “alter datatype”
>>>>> functionality, open indexing would not be required at all.
>>>>>
>>>>>> On Sep 21, 2015, at 23:25, abdullah alamoudi <[email protected]> wrote:
>>>>>>
>>>>>> More thoughts:
>>>>>> I assume the intention of the cast was just to make sure that if the
>>>>>> open field exists, it is of the specified type. Moreover, the un-casted
>>>>>> record should be inserted into the index.
>>>>>> If my assumptions are not correct, please let me know ASAP.
>>>>>>
>>>>>> I have two thoughts on this:
>>>>>> 1. Actually, the insert plans show that the record being inserted into
>>>>>> the primary index is in fact the casted record, which creates the issue
>>>>>> described above.
>>>>>>
>>>>>> 2. I don't believe this is the right way to ensure that the open field,
>>>>>> if it exists, is of the right type. Why not extract the field using a
>>>>>> field-access-by-name function and then verify the type using the field
>>>>>> tag?
>>>>>>
>>>>>> On Tue, Sep 22, 2015 at 9:11 AM, abdullah alamoudi <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Dev, @Ildar,
>>>>>>>
>>>>>>> In the insert pipeline for datasets with open indexes, we introduce a
>>>>>>> cast function before the insert, and so one would expect the records to
>>>>>>> look like the casted record type, which I assume has {{the closed
>>>>>>> fields + a nullable field}}.
>>>>>>>
>>>>>>> The question is, what happens to the previously existing records, since
>>>>>>> the index now has both records of the original type and records of the
>>>>>>> casted type?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Abdullah.
>>>>>>>
>>>>> Best regards,
>>>>> Ildar
>>>>>
>>>>>
>> Best regards,
>> Ildar
>>
>>
>
Best regards,
Ildar
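P.S. To make the casting discussion above concrete, my understanding of the intended enforced-index semantics, continuing the toy Orders/priceIdx sketch from the top of this mail (again written from memory, so treat it as a sketch rather than a spec), is:

  use dataverse test;

  /* a record that simply lacks the indexed open field is legal;
     it just contributes no entry to priceIdx */
  insert into dataset Orders({"oid": 2});

  /* a record whose open field contradicts the enforced type should be
     rejected; the cast in the insert pipeline is what "kills the pipeline" */
  insert into dataset Orders({"oid": 3, "price": "not an int32"});

Whether the casted or the original record ends up in the primary index (Abdullah's point 1 above) is the part where we seem to read the plans differently, so that is worth double-checking as well.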
