Re: [HACKERS] partial indexes and bitmap scans
Stephen Frost writes: > * Tom Lane (t...@sss.pgh.pa.us) wrote: >> I do notice that createplan.c makes some effort to get rid of filter >> conditions that are provably true when the index conditions are. >> Doesn't look like it considers the reverse direction. Not sure if >> that's missing a bet. > That actually strikes me as a less likely condition to have in the > general case, isn't it? Wouldn't that only happen if the index > predicate and the user predicate are identical, because otherwise we > either can't use the index or we must keep the user's predicate because > it will only be satisfied in a subset of cases? Well, remember that the planner's idea of an ideal situation is to not have any filter conditions, not to not have any index (a/k/a recheck) conditions. It's going to try to put as much load as it can on the index condition side of things, and that gives rise to the need for rechecks. It seems like there might be some mileage to be gained by reversing the proof direction here, and having it get rid of recheck conditions that are implied by filter conditions rather than vice versa. I'm not quite convinced though, and I'm also not sure how hard it would be to mechanize. A lot of that code is shared between the bitmap and plain-indexscan cases, which could make it tricky to not break the plain-indexscan case. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] partial indexes and bitmap scans
Tom, * Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frost writes: > > We have already figured out that the user's predicate results in a > > subset of the index's or we wouldn't be able to use that index though, > > right? Do we really need to spend cycles re-discovering that? Are > > there cases where we actually need the index's predicate to ever be > > included for correctness..? > > In a bitmap indexscan, yes, because the entire point of the recheck > condition is that we're going to scan a whole page and possibly see > tuples that don't satisfy the index predicate at all. I understand the point of the recheck condition being that we might see tuples that don't satisfy either the index predicate or the user's predicate, but that isn't the point- to even consider using that index we must have already realized that the user's predicate will only be satisfied in a subset of cases when the index predicate predicate will be satisfied, making any check of the index predicate necessairly always true. > Another point > that's mentioned in the comments in createplan.c is that if you're > considering the result of a BitmapOr or BitmapAnd, there's not necessarily > a single index involved so it's much harder to reason about which part > of the qual is an index predicate. We must be pulling the index's predicate to be able to put it into the Recheck condition in the first place. What I'm arguing is that once we've decided that we're able to use an index X, because the values which the user's predicate will satisfy is a subset of those which the index's predicate will satisfy, then we can entirely forget about the index's predicate as being redundant to the user's. I don't see anything about a BitmapAnd or BitmapOr being relevant to that. We will always need to check the user's predicate against the tuples being returned from the Bitmap Heap Scan, unless by chance it matches the index's predicate exactly *and* we have an entirely exact match bitmap without any lossy parts. > I do notice that createplan.c makes some effort to get rid of filter > conditions that are provably true when the index conditions are. > Doesn't look like it considers the reverse direction. Not sure if > that's missing a bet. That actually strikes me as a less likely condition to have in the general case, isn't it? Wouldn't that only happen if the index predicate and the user predicate are identical, because otherwise we either can't use the index or we must keep the user's predicate because it will only be satisfied in a subset of cases? Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] partial indexes and bitmap scans
Stephen Frost writes: > We have already figured out that the user's predicate results in a > subset of the index's or we wouldn't be able to use that index though, > right? Do we really need to spend cycles re-discovering that? Are > there cases where we actually need the index's predicate to ever be > included for correctness..? In a bitmap indexscan, yes, because the entire point of the recheck condition is that we're going to scan a whole page and possibly see tuples that don't satisfy the index predicate at all. Another point that's mentioned in the comments in createplan.c is that if you're considering the result of a BitmapOr or BitmapAnd, there's not necessarily a single index involved so it's much harder to reason about which part of the qual is an index predicate. I do notice that createplan.c makes some effort to get rid of filter conditions that are provably true when the index conditions are. Doesn't look like it considers the reverse direction. Not sure if that's missing a bet. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] partial indexes and bitmap scans
Tom, * Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frost writes: > > Perhaps I'm missing something obvious, but isn't it a bit redundant to > > have both a Recheck condition (which is the predicate of the index) and > > a Filter condition (which is the user's predicate) when we've already > > decided that the user's predicate must result in a subset of the > > index's, as, otherwise, we wouldn't be able to use the index in the > > first place? > > Yeah, I think this is just something that the planner doesn't see fit > to expend cycles on detecting. We have already figured out that the user's predicate results in a subset of the index's or we wouldn't be able to use that index though, right? Do we really need to spend cycles re-discovering that? Are there cases where we actually need the index's predicate to ever be included for correctness..? This seems like we're going out of our way to add in an additional check for something that we've already determined must always be true and that strikes me as odd. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] partial indexes and bitmap scans
Stephen Frost writes: > Perhaps I'm missing something obvious, but isn't it a bit redundant to > have both a Recheck condition (which is the predicate of the index) and > a Filter condition (which is the user's predicate) when we've already > decided that the user's predicate must result in a subset of the > index's, as, otherwise, we wouldn't be able to use the index in the > first place? Yeah, I think this is just something that the planner doesn't see fit to expend cycles on detecting. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] partial indexes and bitmap scans
Greetings, Consider this: create table t10 (c1 int, c2 int); create index on t10 (c1) where c2 > 5; \d t10 Table "sfrost.t10" Column | Type | Modifiers +-+--- c1 | integer | c2 | integer | Indexes: "t10_c1_idx" btree (c1) WHERE c2 > 5 insert into t10 select * from generate_series(1,1) a, generate_series(1,10) b; (repeat a bunch of times, if desired) vacuum analyze t10; set work_mem = '64kB'; set enable_indexscan = false; set enable_seqscan = false; =*> explain analyze select * from t10 where c2 > 6; QUERY PLAN -- Bitmap Heap Scan on t10 (cost=6496.49..15037.50 rows=318653 width=8) (actual time=34.682..116.236 rows=32 loops=1) Recheck Cond: (c2 > 5) Rows Removed by Index Recheck: 327502 Filter: (c2 > 6) Rows Removed by Filter: 8 Heap Blocks: exact=642 lossy=2898 -> Bitmap Index Scan on t10_c1_idx (cost=0.00..6416.83 rows=400081 width=0) (actual time=34.601..34.601 rows=40 loops=1) Planning time: 0.087 ms Execution time: 124.229 ms (9 rows) Perhaps I'm missing something obvious, but isn't it a bit redundant to have both a Recheck condition (which is the predicate of the index) and a Filter condition (which is the user's predicate) when we've already decided that the user's predicate must result in a subset of the index's, as, otherwise, we wouldn't be able to use the index in the first place? In other words, it seems like we shouldn't need a Filter in the above Bitmap Heap Scan, instead we should just make the Recheck be (c2 > 6). I've not looked into the code side of this at all and there may be reasons why this is hard to do, but it seems like a worthwhile improvement to consider doing, though perhaps I'm missing some reason why we need both the Recheck and the Filter in such cases for correctness. Thanks! Stephen signature.asc Description: Digital signature