Re: [HACKERS] Highly Variable Planning Times
On 19 April 2017 at 22:39, Michael Malis wrote:
>> *At best*, you're doing substantial work in the
>> planner to avoid the first tree descent step or two in a single
>> non-partial index.

FWIW, in addition to replacing the first few levels of the descent with
planning-time work, there's also an advantage due to the smaller keys.
Effectively these partial indexes are emulating prefix compression in the
btree.

> While the specific example I gave in the post could be replaced with a
> non-partial index, most of the partial indexes contain predicates that
> are not straightforward to index with non-partial indexes. About 85%
> of the partial indexes contain a regular expression in them for CSS
> selector matching. I've tried using a trigram GIN index, but it wound
> up not working too well due to the data being highly redundant (almost
> every CSS hierarchy contains 'div' in it). Additionally, most of the
> predicates for partial indexes are extremely specific, making the
> partial indexes very small. The sum total size of all of the partial
> indexes is still much smaller than if we were to index every necessary
> field with regular indexes.

I wonder if you could implement an FTS parser that tokenized the HTML into
just the tokens representing the matching criteria. A GIN index using such
a parser would actually be very similar to what you have, as GIN indexes
are basically a collection of btrees...

The operational problem with that is that I think it would be even harder
to update a parser than to add a new partial index. I don't think you can
effectively upgrade a parser to include new tokens without rebuilding any
indexes using it. If you wanted to add new selector criteria live, you
would probably end up deploying the new parser, building a new index with
CREATE INDEX CONCURRENTLY using the new parser, and then dropping the old
index.

I'm not sure whether it's possible to write an FTS parser that handles
arbitrary CSS selectors, but if you managed it, that would be a very
valuable addition to Postgres, IMHO.

--
greg
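To make the "deploy the new parser, build a new index, drop the old one"
step concrete, the swap might look like the sketch below. The text search
configuration 'css_selectors' and the css_hierarchy column are hypothetical
(no such CSS-aware parser exists today); only the CREATE/DROP INDEX
CONCURRENTLY mechanics are the point.

    -- Hypothetical swap after deploying a new parser version. The
    -- configuration name and column name are placeholders.
    CREATE INDEX CONCURRENTLY events_css_fts_v2_idx
        ON events_1171738
        USING gin (to_tsvector('css_selectors', css_hierarchy));

    DROP INDEX CONCURRENTLY events_css_fts_v1_idx;  -- index built with the old parser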
Re: [HACKERS] Highly Variable Planning Times
Thanks for the help, Andres.
Re: [HACKERS] Highly Variable Planning Times
On 2017-04-19 16:43:07 -0700, Michael Malis wrote:
> > The profile seems to confirm that this is largely due to cache misses.
>
> Can you elaborate? Cache misses of what? Why is the planning time so
> variable?

Postgres uses caches for a lot of the metadata of tables, indexes, ...
Some actions, like vacuum, DDL, and table size changes, trigger cache
invalidations for other backends so that the caches stay coherent. What
you're likely seeing is that the cached cases are quick enough, and that
the slow planning cycles are the ones where the cache has been
invalidated.

- Andres
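A minimal way to observe this from psql, assuming only the table name from
the original report (the pattern, not the exact timings, is the point):
time the same EXPLAIN with warm caches, force an invalidation with
ANALYZE, and time it again.

    \timing on

    EXPLAIN SELECT * FROM events_1171738;   -- warm caches: planning is fast

    -- ANALYZE updates pg_class and sends invalidation messages, so the next
    -- planner run has to rebuild the cached metadata for all 1300 indexes.
    ANALYZE events_1171738;

    EXPLAIN SELECT * FROM events_1171738;   -- cold caches: planning is slow again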
Re: [HACKERS] Highly Variable Planning Times
> The profile seems to confirm that this is largely due to cache misses.

Can you elaborate? Cache misses of what? Why is the planning time so
variable?
Re: [HACKERS] Highly Variable Planning Times
On 2017-04-19 14:14:54 -0700, Michael Malis wrote:
> > Could you also get a profile using perf record -g? The strace isn't
> > telling us all that much.
>
> I've attached the perf.data file from `perf record -g -p <pid>`. I'm not
> that familiar with perf, so if there is more information you need let me
> know. This was on Ubuntu 16.04.

The profile seems to confirm that this is largely due to cache misses.

Greetings,

Andres Freund
Re: [HACKERS] Highly Variable Planning Times
On 2017-04-19 14:14:54 -0700, Michael Malis wrote:
> > Could you also get a profile using perf record -g? The strace isn't
> > telling us all that much.
>
> I've attached the perf.data file from `perf record -g -p <pid>`. I'm not
> that familiar with perf, so if there is more information you need let me
> know. This was on Ubuntu 16.04.

You'd have to do a `perf report > outfile.txt` too; without access to your
system we can't analyze the perf.data.

https://wiki.postgresql.org/wiki/Profiling_with_perf

- Andres
Re: [HACKERS] Highly Variable Planning Times
On Wed, Apr 19, 2017 at 2:39 PM, Michael Malis wrote:
> > TBH, I see no convincing explanation in that article of why 1300 partial
> > indexes are a good idea.
>
> I don't like it much either. I've been trying to eliminate most of the
> need for the partial indexes, but this is the current state of things.
>
> > *At best*, you're doing substantial work in the
> > planner to avoid the first tree descent step or two in a single
> > non-partial index.
>
> While the specific example I gave in the post could be replaced with a
> non-partial index, most of the partial indexes contain predicates that
> are not straightforward to index with non-partial indexes. About 85%
> of the partial indexes contain a regular expression in them for CSS
> selector matching. I've tried using a trigram GIN index, but it wound
> up not working too well due to the data being highly redundant (almost
> every CSS hierarchy contains 'div' in it).

pg_trgm 1.2 or higher has code to help with very common trigrams. But to
try it, you would need to upgrade PostgreSQL to 9.6.

Cheers,

Jeff
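For reference, the single trigram index being discussed might look roughly
like the sketch below; the css_hierarchy column name is a placeholder, not
something from the thread. On 9.6 with pg_trgm 1.2, regular-expression
matches (~) can use such an index, and very frequent trigrams like 'div'
are handled more gracefully.

    -- Sketch only: css_hierarchy is a hypothetical text column.
    CREATE EXTENSION IF NOT EXISTS pg_trgm;

    CREATE INDEX CONCURRENTLY events_css_trgm_idx
        ON events_1171738
        USING gin (css_hierarchy gin_trgm_ops);

    -- Regex predicates can then use this one index instead of one partial
    -- index per selector pattern.
    EXPLAIN SELECT * FROM events_1171738
     WHERE css_hierarchy ~ 'div\.signup-button';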
Re: [HACKERS] Highly Variable Planning Times
> TBH, I see no convincing explanation in that article of why 1300 partial
> indexes are a good idea.

I don't like it much either. I've been trying to eliminate most of the
need for the partial indexes, but this is the current state of things.

> *At best*, you're doing substantial work in the
> planner to avoid the first tree descent step or two in a single
> non-partial index.

While the specific example I gave in the post could be replaced with a
non-partial index, most of the partial indexes contain predicates that
are not straightforward to index with non-partial indexes. About 85%
of the partial indexes contain a regular expression in them for CSS
selector matching. I've tried using a trigram GIN index, but it wound
up not working too well due to the data being highly redundant (almost
every CSS hierarchy contains 'div' in it). Additionally, most of the
predicates for partial indexes are extremely specific, making the
partial indexes very small. The sum total size of all of the partial
indexes is still much smaller than if we were to index every necessary
field with regular indexes.

Thanks,
Michael
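For readers without the blog post at hand, one of these regex-based partial
indexes would presumably look something like the sketch below; the column
names and the selector pattern are invented for illustration.

    -- Hypothetical example of one of the ~1300 partial indexes; column
    -- names and the CSS selector are illustrative, not from the thread.
    CREATE INDEX events_signup_click_idx
        ON events_1171738 (user_id, event_time)
        WHERE event_type = 'click'
          AND css_hierarchy ~ 'div\.signup-button$';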
Re: [HACKERS] Highly Variable Planning Times
Michael Malis writes:
> I've been encountering highly variable planning time on PG 9.5.4.
> Running `EXPLAIN SELECT * FROM events_1171738` will take anywhere from
> 200ms to 4s. This likely has to do with the table having 1300 partial
> indexes on it (for reasons elaborated on in
> https://blog.heapanalytics.com/running-10-million-postgresql-indexes-in-production/).

TBH, I see no convincing explanation in that article of why 1300 partial
indexes are a good idea. *At best*, you're doing substantial work in the
planner to avoid the first tree descent step or two in a single
non-partial index. At worst, all your savings go out the window during a
relcache flush, which is likely to happen whenever a vacuum or analyze
gets done on the underlying table.

regards, tom lane
Re: [HACKERS] Highly Variable Planning Times
Michael Malis wrote:
> Hi,
>
> I've been encountering highly variable planning time on PG 9.5.4.
> Running `EXPLAIN SELECT * FROM events_1171738` will take anywhere from
> 200ms to 4s. This likely has to do with the table having 1300 partial
> indexes on it (for reasons elaborated on in
> https://blog.heapanalytics.com/running-10-million-postgresql-indexes-in-production/).
> I'm trying to figure out if there is something I can do to eliminate
> the slow planning from happening.

I wonder if some BRIN index could help this use case, reducing the number
of indexes.

--
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
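For context, a BRIN index stores only per-block-range summaries (e.g.
min/max values), so it stays tiny even on a very large events table and
works well when the indexed column correlates with physical insertion
order, as event timestamps usually do. A sketch, assuming a hypothetical
event_time column:

    -- Sketch only: event_time and pages_per_range are assumptions, not
    -- details from the thread. BRIN is available as of 9.5.
    CREATE INDEX events_time_brin_idx
        ON events_1171738
        USING brin (event_time)
        WITH (pages_per_range = 64);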
Re: [HACKERS] Highly Variable Planning Times
Hi,

On 2017-04-19 13:39:40 -0700, Michael Malis wrote:
> I've been encountering highly variable planning time on PG 9.5.4.
> Running `EXPLAIN SELECT * FROM events_1171738` will take anywhere from
> 200ms to 4s. This likely has to do with the table having 1300 partial
> indexes on it (for reasons elaborated on in
> https://blog.heapanalytics.com/running-10-million-postgresql-indexes-in-production/).
> I'm trying to figure out if there is something I can do to eliminate
> the slow planning from happening.

I'd suspect that that's triggered by cache rebuilds. If there's any
concurrent relcache invalidation (e.g. by a concurrent vacuum, create
index, alter table, relation extension, ...), a lot of metadata for all
those indexes will have to be rebuilt.

TBH, I don't think we're particularly likely to optimize hugely for
workloads with 1300 indexes on individual tables - such an effort seems
not unlikely to hurt more common cases.

> When I used `strace -c` on the backend process, I found that the
> number of `semop` system calls is much higher for the slower queries
> than the faster ones. After digging in a bit more I found that when
> the query was slow, there would be a single semaphore which would make
> up most of the `semop` calls. Here is a snippet of the output when I
> ran `strace -r` on a query with high planning time:

Could you also get a profile using perf record -g? The strace isn't
telling us all that much.

Greetings,

Andres Freund
[HACKERS] Highly Variable Planning Times
Hi,

I've been encountering highly variable planning time on PG 9.5.4.
Running `EXPLAIN SELECT * FROM events_1171738` will take anywhere from
200ms to 4s. This likely has to do with the table having 1300 partial
indexes on it (for reasons elaborated on in
https://blog.heapanalytics.com/running-10-million-postgresql-indexes-in-production/).
I'm trying to figure out if there is something I can do to eliminate
the slow planning from happening.

When I used `strace -c` on the backend process, I found that the number
of `semop` system calls is much higher for the slower queries than for
the faster ones. After digging in a bit more, I found that when the
query was slow, there would be a single semaphore which would make up
most of the `semop` calls. Here is a snippet of the output when I ran
`strace -r` on a query with high planning time:

    0.000225 open("base/16384/13236647", O_RDWR) = 219
    0.000110 lseek(219, 0, SEEK_END) = 16384
    0.000122 semop(1736711, [{9, -1, 0}], 1) = 0
    0.004110 semop(1769480, [{10, 1, 0}], 1) = 0
    0.000197 semop(1736711, [{9, -1, 0}], 1) = 0
    0.010096 semop(1540097, [{1, 1, 0}], 1) = 0
    0.000176 semop(1736711, [{9, -1, 0}], 1) = 0
    0.004069 semop(1638404, [{15, 1, 0}], 1) = 0
    0.000244 open("base/16384/13236648", O_RDWR) = 220
    0.000131 lseek(220, 0, SEEK_END) = 16384
    0.000212 semop(1736711, [{9, -1, 0}], 1) = 0
    0.007851 semop(1736711, [{8, 1, 0}], 1) = 0
    0.000130 semop(1736711, [{9, -1, 0}], 1) = 0
    0.000101 semop(1769480, [{5, 1, 0}], 1) = 0
    0.000450 open("base/16384/13236653", O_RDWR) = 221
    0.000151 lseek(221, 0, SEEK_END) = 180224
    0.000141 semop(1736711, [{9, -1, 0}], 1) = 0
    0.000239 semop(1736711, [{9, -1, 0}], 1) = 0
    0.000419 semop(1736711, [{10, 1, 0}], 1) = 0
    0.000127 semop(1736711, [{9, -1, 0}], 1) = 0
    0.001493 semop(1703942, [{3, 1, 0}], 1) = 0

You can see that most of the semop calls are waits on semaphore #9 in
semaphore set 1736711. I looked up the relations for the files being
opened and they all seem to be indexes on the table. Based on some flame
graphs I generated, I believe most of the time is spent inside of the
function get_relation_info. I think the open/lseek calls are coming from
the estimate_rel_size call in get_relation_info, but I haven't been able
to figure out where the semop calls are coming from. It seems that the
semop calls are at least a symptom of the high planning time.

I'm looking for help in finding the root cause of the high planning time
and for figuring out a way to eliminate it.

Thanks,
Michael
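As a side note for anyone reproducing this: a catalog query along the
following lines shows how many indexes (and how many partial ones) the
planner has to consider for the table, plus their combined on-disk size.
This is a sketch that only assumes the table name above; it uses the
standard pg_index columns and works on 9.5.

    -- Count the indexes the planner must examine, how many are partial,
    -- and their combined size.
    SELECT count(*)                                          AS total_indexes,
           count(*) FILTER (WHERE indpred IS NOT NULL)       AS partial_indexes,
           pg_size_pretty(sum(pg_relation_size(indexrelid))) AS total_index_size
      FROM pg_index
     WHERE indrelid = 'events_1171738'::regclass;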