Re: [HACKERS] Pushing ScalarArrayOpExpr support into the btree index AM
On Oct16, 2011, at 20:26 , Tom Lane wrote: > Florian Pflug writes: >> On Oct16, 2011, at 19:09 , Tom Lane wrote: >>> That doesn't seem like the same thing at all, because the indexed column >>> is on different sides of the expression in the two cases. The situation >>> that I'm worried about is "indexedcolumn op ANY(arrayconstant)", and >>> what you're bringing up is "constant op ANY(indexedarraycolumn)". > >> Couldn't we teach the main executor to push a ScalarArrayOpExpr down >> into the index AM if the operator belongs to the index's opclass, one >> side is indexed, and the other is constant? > > Well, no, unless you're proposing to somehow implement the "constant op > ANY(indexedarraycolumn)" case in all the AMs. I don't see any practical > way to do it in btree, for one. Hm, true. Also, the "operator belongs to the index's opclass" part of the push-down condition I proposed was wrong anyway, because the "=" operator in e.g. 3 = ANY(indexed_int_array_column) isn't in the opclass int_array_ops. From that, it seems that what would be needed to make the planner use a GIN index to evaluate the qual above is a a way for it to realize that there's a connection between some GIN indices (like *_array_ops) and btree opclasses on the GIN index's storage type. Which would be nice, but I see now that it has little to do with your proposal, which is only concerned with operators from the index's opclass. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pushing ScalarArrayOpExpr support into the btree index AM
Florian Pflug writes: > On Oct16, 2011, at 19:09 , Tom Lane wrote: >> That doesn't seem like the same thing at all, because the indexed column >> is on different sides of the expression in the two cases. The situation >> that I'm worried about is "indexedcolumn op ANY(arrayconstant)", and >> what you're bringing up is "constant op ANY(indexedarraycolumn)". > Couldn't we teach the main executor to push a ScalarArrayOpExpr down > into the index AM if the operator belongs to the index's opclass, one > side is indexed, and the other is constant? Well, no, unless you're proposing to somehow implement the "constant op ANY(indexedarraycolumn)" case in all the AMs. I don't see any practical way to do it in btree, for one. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pushing ScalarArrayOpExpr support into the btree index AM
On Oct16, 2011, at 19:09 , Tom Lane wrote: > Florian Pflug writes: >> On Oct15, 2011, at 20:58 , Tom Lane wrote: >>> So, at least as far as btrees are concerned, it seems like I implemented >>> the ScalarArrayOpExpr logic at the wrong level and it ought to be pushed >>> down into the index AM. The above rules seem pretty btree-specific, so >>> I don't think that we ought to have the main executor doing any of this. >>> I envision doing this by marking btree as supporting ScalarArrayOpExpr >>> scankeys directly, so that the executor does nothing special with them, >>> and the planner treats them the same as regular scalar indexquals. > >> Hm, would this make it possible to teach the array GIN ops to also handle >> ScalarArrayOpExpr? > > Hmm, maybe. In principle the index AM can always do this at least as > efficiently as the executor can, and maybe there's actual wins to be had > in GIST and GIN. So another route to getting rid of the executor-level > support would be to implement ScalarArrayOpExpr in all the AMs. I'm not > personally volunteering to do that though. Hm, that sounds like we ought to leave the existing infrastructure in the main executor in place until we have GIN and GIST support. >> I've recently had to put >> ARRAY[$1] <@ $2 AND $1 = ANY($2) >> into an (inlineable) SQL function to make it use a (btree) index if >> $1 is a scalar-values field (and $1 constant array) and a GIN index if $2 >> is a GIN-indexed array-values field (and $2 a constant array). Which of >> course sucks from an efficiency POV. > > That doesn't seem like the same thing at all, because the indexed column > is on different sides of the expression in the two cases. The situation > that I'm worried about is "indexedcolumn op ANY(arrayconstant)", and > what you're bringing up is "constant op ANY(indexedarraycolumn)". Hm, true > To fit the latter into the existing opclass infrastructure, we'd have to > somehow teach the planner that "constant op ANY(indexedarraycolumn)" > is interchangeable with "indexedarraycolumn @> constant", for pairs of > operators where that's indeed the case. Seems like that'd be a lot > messier than the use-case warrants. That exactly was what convinced me previously that there's no easy way to do this. I had hoped that with your patch only the index AMs, instead of the planner, need to know about these equivalences. Couldn't we teach the main executor to push a ScalarArrayOpExpr down into the index AM if the operator belongs to the index's opclass, one side is indexed, and the other is constant? best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pushing ScalarArrayOpExpr support into the btree index AM
Florian Pflug writes: > On Oct15, 2011, at 20:58 , Tom Lane wrote: >> So, at least as far as btrees are concerned, it seems like I implemented >> the ScalarArrayOpExpr logic at the wrong level and it ought to be pushed >> down into the index AM. The above rules seem pretty btree-specific, so >> I don't think that we ought to have the main executor doing any of this. >> I envision doing this by marking btree as supporting ScalarArrayOpExpr >> scankeys directly, so that the executor does nothing special with them, >> and the planner treats them the same as regular scalar indexquals. > Hm, would this make it possible to teach the array GIN ops to also handle > ScalarArrayOpExpr? Hmm, maybe. In principle the index AM can always do this at least as efficiently as the executor can, and maybe there's actual wins to be had in GIST and GIN. So another route to getting rid of the executor-level support would be to implement ScalarArrayOpExpr in all the AMs. I'm not personally volunteering to do that though. > I've recently had to put > ARRAY[$1] <@ $2 AND $1 = ANY($2) > into an (inlineable) SQL function to make it use a (btree) index if > $1 is a scalar-values field (and $1 constant array) and a GIN index if $2 > is a GIN-indexed array-values field (and $2 a constant array). Which of > course sucks from an efficiency POV. That doesn't seem like the same thing at all, because the indexed column is on different sides of the expression in the two cases. The situation that I'm worried about is "indexedcolumn op ANY(arrayconstant)", and what you're bringing up is "constant op ANY(indexedarraycolumn)". To fit the latter into the existing opclass infrastructure, we'd have to somehow teach the planner that "constant op ANY(indexedarraycolumn)" is interchangeable with "indexedarraycolumn @> constant", for pairs of operators where that's indeed the case. Seems like that'd be a lot messier than the use-case warrants. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pushing ScalarArrayOpExpr support into the btree index AM
On Oct15, 2011, at 20:58 , Tom Lane wrote: > So, at least as far as btrees are concerned, it seems like I implemented > the ScalarArrayOpExpr logic at the wrong level and it ought to be pushed > down into the index AM. The above rules seem pretty btree-specific, so > I don't think that we ought to have the main executor doing any of this. > I envision doing this by marking btree as supporting ScalarArrayOpExpr > scankeys directly, so that the executor does nothing special with them, > and the planner treats them the same as regular scalar indexquals. Hm, would this make it possible to teach the array GIN ops to also handle ScalarArrayOpExpr? I've recently had to put ARRAY[$1] <@ $2 AND $1 = ANY($2) into an (inlineable) SQL function to make it use a (btree) index if $1 is a scalar-values field (and $1 constant array) and a GIN index if $2 is a GIN-indexed array-values field (and $2 a constant array). Which of course sucks from an efficiency POV. At the time I didn't see a way to easily teach GIN to support ANY, but with your proposal it seems entirely doable. Unless I'm missing something, that is. > In principle somebody could be doing something like > WHERE pointcol <@ ANY (ARRAY[list of box values]) > and expecting that to generate a bitmap indexscan on a GIST index, but > is it likely that anyone is doing that? (As opposed to the equivalent > formulation with "pointcol <@ box1 OR pointcol <@ box2 ...", which would > still work to generate OR'd bitmap scans even if we took out the > ScalarArrayOpExpr logic.) Hm, if the number of box values isn't fixed, the ANY form plays much nicer nicer with parametrized statements than the OR form. So I think we shouldn't take that away from people. OTOH it seems that, depending on the actual list of box values, a bitmap indexscan isn't the smartest way to do this. If the boxes overlap (or are close enough to each other), using the bounding box to search the index and then filtering the remaining rows ought to be more efficient. So maybe we should remove the support for ScalarArrayOpExpr from the main executor, but add support for it to GIST? best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pushing ScalarArrayOpExpr support into the btree index AM
Noah Misch writes: > On Sat, Oct 15, 2011 at 02:58:45PM -0400, Tom Lane wrote: >> [algorithm for a regular index scan satisfying "key IN (...)"] > Sounds sensible. The algorithm applies to more than ScalarArrayOpExpr; is it > not the ability to handle an OR'ed list of ScanKey instead of an AND'ed one? No, because it's restricted to all the elements having the same operator and same index column; it's not a generic OR. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pushing ScalarArrayOpExpr support into the btree index AM
On Sat, Oct 15, 2011 at 02:58:45PM -0400, Tom Lane wrote: > [algorithm for a regular index scan satisfying "key IN (...)"] > So, at least as far as btrees are concerned, it seems like I implemented > the ScalarArrayOpExpr logic at the wrong level and it ought to be pushed > down into the index AM. The above rules seem pretty btree-specific, so > I don't think that we ought to have the main executor doing any of this. > I envision doing this by marking btree as supporting ScalarArrayOpExpr > scankeys directly, so that the executor does nothing special with them, > and the planner treats them the same as regular scalar indexquals. Sounds sensible. The algorithm applies to more than ScalarArrayOpExpr; is it not the ability to handle an OR'ed list of ScanKey instead of an AND'ed one? Would it be worth exposing the capability along those lines instead, even if the only initial consumer is ScalarArrayOpExpr? > In principle somebody could be doing something like > WHERE pointcol <@ ANY (ARRAY[list of box values]) > and expecting that to generate a bitmap indexscan on a GIST index, but > is it likely that anyone is doing that? (As opposed to the equivalent > formulation with "pointcol <@ box1 OR pointcol <@ box2 ...", which would > still work to generate OR'd bitmap scans even if we took out the > ScalarArrayOpExpr logic.) Removing the "key IN (...)" optimization for hash indexes would add one more barrier to making that access method practical. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pushing ScalarArrayOpExpr support into the btree index AM
On Sat, Oct 15, 2011 at 2:58 PM, Tom Lane wrote: > In principle somebody could be doing something like > WHERE pointcol <@ ANY (ARRAY[list of box values]) > and expecting that to generate a bitmap indexscan on a GIST index, but > is it likely that anyone is doing that? (As opposed to the equivalent > formulation with "pointcol <@ box1 OR pointcol <@ box2 ...", which would > still work to generate OR'd bitmap scans even if we took out the > ScalarArrayOpExpr logic.) That seems like a pretty natural formulation to me, so I would be rather reluctant to assume nobody's doing it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Pushing ScalarArrayOpExpr support into the btree index AM
Almost immediately after we committed index-only scans, there were complaints that it didn't work with "indexkey IN (...)" conditions, that is ScalarArrayOpExpr index quals. That's because the current code only supports ScalarArrayOpExpr as a bitmap indexscan qual, not a regular indexscan. The executor performs a separate indexscan for each element of the array, relying on the bitmap mechanism to eliminate duplicate hits on the same heap tuple. In principle it's not hard to see how ScalarArrayOpExpr could be supported as a regular indexqual for btree indexes. If the comparison operator has <, <=, >=, or > semantics, then the array condition is equivalent to a simple scalar condition using the greatest or least array element respectively. Otherwise (i.e., the operator is =): 1. Sort the array elements into the same order as the btree, eliminating duplicates and NULL entries. (If there are no non-null elements, the qual condition is unsatisfiable and we can skip the indexscan.) 2. Perform an indexscan for each remaining array element, in sequence. Since we eliminated equal values in step 1, there can be no two matches to the same index entry, so there's no need for duplicate elimination. What's more, because we sorted the array to match the index order, index entries will be matched in index order, so the results of the scan can still be expected to come out in sorted order, a critical expectation for btree indexscans. If we have more than one ScalarArrayOpExpr then we have to be a bit careful about how we perform the multiple indexscans to ensure that output ordering is preserved, but it's certainly doable. So, at least as far as btrees are concerned, it seems like I implemented the ScalarArrayOpExpr logic at the wrong level and it ought to be pushed down into the index AM. The above rules seem pretty btree-specific, so I don't think that we ought to have the main executor doing any of this. I envision doing this by marking btree as supporting ScalarArrayOpExpr scankeys directly, so that the executor does nothing special with them, and the planner treats them the same as regular scalar indexquals. So that leaves me wondering whether the existing executor-level implementation of ScalarArrayOpExpr indexquals ought to be ripped out entirely. It would not save all that much code in the executor proper (basically ExecIndexEvalArrayKeys, ExecIndexAdvanceArrayKeys, and a few lines of supporting logic). However, there's a fair amount of cruft in the planner to deal with the concept that ScalarArrayOpExpr is supported as a bitmap indexscan qual but not a plain indexscan qual. If that code is only going to get exercised for non-btree index types, it's likely to be under-tested and hence a continuing source of bugs. In principle somebody could be doing something like WHERE pointcol <@ ANY (ARRAY[list of box values]) and expecting that to generate a bitmap indexscan on a GIST index, but is it likely that anyone is doing that? (As opposed to the equivalent formulation with "pointcol <@ box1 OR pointcol <@ box2 ...", which would still work to generate OR'd bitmap scans even if we took out the ScalarArrayOpExpr logic.) Thoughts? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers