I’ve added the “0.8.7-blocker” to https://issues.apache.org/jira/browse/ASTERIXDB-1109 (which I believe covers this issue).
Is this what we agree on right now?

Also, do we already have a review for this?

Thanks,
Till

On 26 Sep 2015, at 1:37, abdullah alamoudi wrote:

I agree with Chen, especially since the system is not yet production ready.
It seems that going through with the release is more important.

Cheers,


Amoudi, Abdullah.

On Sat, Sep 26, 2015 at 3:33 AM, Chen Li <[email protected]> wrote:

I vote for including this fix in the next Asterix/Hyracks release, not this
one.

Chen

On Fri, Sep 25, 2015 at 4:23 PM, Ildar Absalyamov <
[email protected]> wrote:

It did not really occur to me today during the meeting, but Preston
pointed out that the secondary index delete fix I proposed spans both the
Hyracks & Asterix codebases. Thus we will either have to release Hyracks
once again, or bite the bullet, sign the RC without fixing this issue, and
create bug-fix releases for both Hyracks & Asterix right after.

On Sep 22, 2015, at 22:27, Mike Carey <[email protected]> wrote:

Ah - that makes sense now.  Thx.  (And welcome back. :-))

On 9/22/15 10:02 PM, Ildar Absalyamov wrote:
Sorry for the confusion, my initial answer was not correct enough; I
probably should have waited some time after driving 1500 miles from
Seattle :)
The casting in the insert pipeline, which Abdullah mentioned, is needed
only for the secondary index insert. The reasoning behind this casting is
to ensure that the record is equivalent, so that it is safe to create an
open index. It is true that we can get <PK, SK> pairs out of the original
record using get-field-by-name\index, but the cast operator is introduced
merely to kill the pipeline if the dataset input is not correct.
Thus the records in the primary index are never touched or modified, no
matter what indexes were created.
I am not sure, however, what the second cast in Abdullah’s plan is, and
where it comes from.

@Taewoo, so the scan-delete-btree-secondary-index-open test does not
actually delete data from the secondary index? I have checked the plan and
it has the delete operator. Maybe it is initialized with the wrong
parameters; I’ll have a closer look.

On Sep 22, 2015, at 18:33, Mike Carey <[email protected]> wrote:

Sounds kinda bad!  Also, I wonder what happens when the compiler
encounters records in the dataset whose type in the catalog doesn't claim
to have a given (but now indexed) open field - e.g., during a data scan or
an access via some other path? Can Bad Things Happen due to the compiler
not properly anticipating the casted form of the records? (Maybe I am
misunderstanding something, but we should probably take a careful look at
the test cases - and make sure we do things like add a bunch of records,
then add such an index, then add some more records, then stress-test
type-related things that come at the dataset (i) thru the index, (ii) thru
a primary dataset scan, and (iii) thru some other index.)
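
Very roughly, something like the following AQL sequence is what I have in
mind - the dataverse, type, and field names are made up, and I'm writing
the enforced-index syntax from memory, so treat it as a sketch rather than
a ready-to-run test:

drop dataverse test if exists;
create dataverse test;
use dataverse test;

create type UserType as open { uid: int64, name: string };
create dataset Users(UserType) primary key uid;
create index nameIdx on Users(name);

/* add a bunch of records before the open index exists */
insert into dataset Users({"uid": 1, "name": "a", "age": 21});

/* then add such an index (enforced, on an undeclared open field) */
create index ageIdx on Users(age: int64) enforced;

/* then add some more records */
insert into dataset Users({"uid": 2, "name": "b", "age": 35});
insert into dataset Users({"uid": 3, "name": "c"});

/* (i) thru the open index */
for $u in dataset Users where $u.age > 30 return $u;

/* (ii) thru a primary dataset scan */
for $u in dataset Users return $u.age;

/* (iii) thru some other index */
for $u in dataset Users where $u.name = "c" return $u;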

On 9/22/15 4:06 PM, Taewoo Kim wrote:
I think this issue:
https://issues.apache.org/jira/browse/ASTERIXDB-1109 is related.
Currently, index entries (SK, PK) are not deleted from an open-type
secondary index during a deletion. This issue did not surface so far
because every secondary index search has to go through the primary index
lookup afterwards.
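
A minimal AQL sketch of the scenario (the dataverse, dataset, and field
names are made up for illustration, and the enforced-index syntax is from
memory):

drop dataverse test if exists;
create dataverse test;
use dataverse test;

create type UserType as open { uid: int64 };
create dataset Users(UserType) primary key uid;

/* enforced (open-type) secondary index on an undeclared field */
create index ageIdx on Users(age: int64) enforced;

insert into dataset Users({"uid": 1, "age": 25});

/* the primary entry is removed, but the <SK, PK> entry in ageIdx is
   (per the issue above) left behind */
delete $u from dataset Users where $u.uid = 1;

/* this predicate can be answered via ageIdx; the stale entry is only
   filtered out because of the subsequent primary index lookup */
for $u in dataset Users where $u.age = 25 return $u;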

Best,
Taewoo

On Tue, Sep 22, 2015 at 12:04 AM, Ildar Absalyamov <
[email protected]> wrote:

Abdullah,

If I remember correctly, whenever a secondary open index is created, all
existing records are casted to the proper type to ensure that the index
creation is valid.
As for the overall correctness of the casting operation: semantically,
creating an open index is the same thing as altering the dataset type. The
current implementation allows only one open index of a particular type to
be created on a single field. If we had “alter datatype” functionality,
open indexing would not be required at all.
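
For illustration, a rough AQL sketch of what I mean (names are made up,
and I'm writing the enforced-index syntax from memory):

create type CustomerType as open {
  cid: int64,
  name: string
};
create dataset Customers(CustomerType) primary key cid;

/* enforced open index on a field the datatype does not declare */
create index ageIdx on Customers(age: int64) enforced;

/* semantically this is close to having altered the datatype to:
   create type CustomerType as open {
     cid: int64,
     name: string,
     age: int64?
   };
*/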

On Sep 21, 2015, at 23:25, abdullah alamoudi <[email protected]>
wrote:

More thoughts:
I assume the intention of the cast was just to make sure that if the open
field exists, it is of the specified type. Moreover, the un-casted record
should be inserted into the index.
If my assumptions are not correct, please let me know ASAP.

I have two thoughts on this:
1. Actually, the insert plans show that the record being inserted into the
primary index is actually the casted record, creating the issue described
above.

2. I don't believe this is the right way to ensure that the open field, if
it exists, is of the right type. Why not extract the field using the
field-access-by-name function and then verify the type using the field's
type tag?



On Tue, Sep 22, 2015 at 9:11 AM, abdullah alamoudi <
[email protected]>
wrote:

Hi Dev, @Ildar,

In the insert pipeline for datasets with open indexes, we introduce a cast
function before the insert, so one would expect the records to look like
the casted record type, which I assume has {{the closed fields + a
nullable field}}.

The question is: what happens to the previously existing records, since
now the index has both records of the original type and records of the
casted type?

Thanks,
Abdullah.

Best regards,
Ildar


