Sorry, by accident I sent this one out twice. -- David Geier (ServiceNow)
On Wed, Jul 27, 2022 at 2:42 PM David Geier <geidav...@gmail.com> wrote: > Hi hackers, > > We came across a slowdown in planning, where queries use tables with many > indexes. In setups with wide tables it is not uncommon to have easily > 10-100 indexes on a single table. The slowdown is already visible in serial > workloads with just a handful of indexes, but gets drastically amplified > when running queries with more indexes in parallel at high throughput. > > We measured the TPS and planning time of running parallel streams of > simple point look-up queries on a single empty table with 60 columns and 60 > indexes. The query used is 'SELECT * FROM synth_table WHERE col5 = 42'. No > rows are returned because the table is empty. We used a machine with 64 > physical CPU cores. The schema and sysbench script to reproduce these > numbers are attached. We used the TPS as reported by sysbench and obtained > planning time by running 'EXPLAIN ANALYZE' on the same query in a > separately opened connection. We averaged the planning time of 3 successive > 'EXPLAIN ANALYZE' runs. sysbench ran on the same machine with varying > numbers of threads using the following command line: > > sysbench repro.lua --db-driver=pgsql --pgsql-host=localhost > --pgsql-db=postgres --pgsql-port=? --pgsql-user=? --pgsql-password=? > --report-interval=1 --threads=64 run > > The following table shows the results. It is clearly visible that the TPS > flatten out already at 8 parallel streams, while the planning time is > increasing drastically. > > Parallel streams | TPS (before) | Planning time (before) > -----------------|--------------|----------------------- > 1 | 5,486 | 0.13 ms > 2 | 8,098 | 0.22 ms > 4 | 15,013 | 0.19 ms > 8 | 27,107 | 0.29 ms > 16 | 30,938 | 0.43 ms > 32 | 26,330 | 1.68 ms > 64 | 24,314 | 2.48 ms > > We tracked down the root cause of this slowdown to lock contention in > 'get_relation_info()'. The index lock of every single index of every single > table used in that query is acquired. We attempted a fix by pre-filtering > out all indexes that anyways cannot be used with a certain query, without > taking the index locks (credits to Luc Vlaming for idea and > implementation). The patch does so by caching the columns present in every > index, inside 'struct Relation', similarly to 'rd_indexlist'. Then, before > opening (= locking) the indexes in 'get_relation_info()', we check if the > index can actually contribute to the query and if not it is discarded right > away. Caching the index info saves considerable work for every query run > subsequently, because less indexes must be inspected and thereby locked. > This way we also save cycles in any code that later on goes over all > relation indexes. > > The work-in-progress version of the patch is attached. It is still fairly > rough (e.g. uses a global variable, selects the best index in scans without > restrictions by column count instead of physical column size, is missing > some renaming, etc.), but shows the principle. > > The following table shows the TPS, planning time and speed-ups after > applying the patch and rerunning above described benchmark. Now, the > planning time remains roughly constant and TPS roughly doubles each time > the number of parallel streams is doubled. The higher the stream count the > more severe the lock contention is and the more pronounced the gained > speed-up gets. Interestingly, even for a single query stream the speed-up > in planning time is already very significant. This applies also for lower > index counts. For example just with 10 indexes the TPS for a single query > stream goes from 9,159 to 12,558. We can do more measurements if there is > interest in details for a lower number of indexes. > > Parallel streams | TPS (after) | Planning time (after) | Speed-up TPS | > Speed-up planning > > -----------------|-------------|-----------------------|--------------|------------------ > 1 | 10,344 | 0.046 | 1.9x | > 2.8x > 2 | 20,140 | 0.045 ms | 2.5x | > 4.9x > 4 | 40,349 | 0.047 ms | 2.7x | > 4.0x > 8 | 80,121 | 0.046 ms | 3.0x | > 6.3x > 16 | 152,632 | 0.051 ms | 4.9x | > 8.4x > 32 | 301,359 | 0.052 ms | 11.4x | > 32.3x > 64 | 525,115 | 0.062 ms | 21.6x | > 40.0x > > We are happy to receive your feedback and polish up the patch. > > -- > David Geier > (ServiceNow) >