[PERFORM] query not using index
I would appreciate some help optimising the following query:

WITH subject_journals AS (
    SELECT A.sq
    FROM isi.rissue A,
         isi.rsc_joern_link C
    WHERE C.sc_id IN ('d0963875-e438-4923-b3fa-f462e8975221',
                      '04e14284-09c8-421a-b1ad-c8238051601a',
                      '04e2189f-cd2a-44f0-b98d-52f6bb5dcd78',
                      'f5521c65-ec49-408a-9a42-8a69d47703cd',
                      '2e47ae2f-2c4d-433e-8bdf-9983eeeafc42',
                      '5d3639b1-04c2-4d94-a99a-5323277fd2b7')
      AND C.rj_id = A.uuid),
subject_articles AS (
    SELECT B.article_id AS art_id
    FROM isi.isi_l1_publication B,
         subject_journals A,
         isi.ritem C
    WHERE A.sq = B.journal_id
      AND B.publication_year <= '2012'
      AND B.publication_year >= '2000'
      AND C.ut = B.article_id
      AND C.dt IN ('@ Article', 'Review')),
country_articles AS (
    SELECT A.art_id
    FROM isi.art_country_link A
    WHERE A.countrycode = 'ZA')
SELECT art_id FROM subject_articles
INTERSECT
SELECT art_id FROM country_articles;

EXPLAIN ANALYZE shows that it is not using the indexes on either isi.isi_l1_publication or isi.ritem (both tables have more than 43 million records):

HashSetOp Intersect  (cost=10778065.50..11227099.44 rows=200 width=48) (actual time=263120.868..263279.467 rows=4000 loops=1)
  Output: "*SELECT* 1".art_id, (0)
  Buffers: shared hit=627401 read=4347235, temp read=234498 written=234492
  CTE subject_journals
    ->  Hash Join  (cost=12846.55..17503.27 rows=28818 width=8) (actual time=99.762..142.439 rows=30291 loops=1)
          Output: a.sq
          Hash Cond: ((c.rj_id)::text = (a.uuid)::text)
          Buffers: shared hit=12232
          ->  Bitmap Heap Scan on isi.rsc_joern_link c  (cost=1020.92..5029.23 rows=28818 width=37) (actual time=4.238..15.806 rows=30291 loops=1)
                Output: c.id, c.rj_id, c.sc_id
                Recheck Cond: ((c.sc_id)::text = ANY ('{d0963875-e438-4923-b3fa-f462e8975221,04e14284-09c8-421a-b1ad-c8238051601a,04e2189f-cd2a-44f0-b98d-52f6bb5dcd78,f5521c65-ec49-408a-9a42-8a69d47703cd,2e47ae2f-2c4d-433e-8bdf-9983eeeafc42,5d3639b1-04c2-4 (...)
                Buffers: shared hit=3516
                ->  Bitmap Index Scan on rsc_joern_link_sc_id_idx  (cost=0.00..1013.72 rows=28818 width=0) (actual time=3.722..3.722 rows=30291 loops=1)
                      Index Cond: ((c.sc_id)::text = ANY ('{d0963875-e438-4923-b3fa-f462e8975221,04e14284-09c8-421a-b1ad-c8238051601a,04e2189f-cd2a-44f0-b98d-52f6bb5dcd78,f5521c65-ec49-408a-9a42-8a69d47703cd,2e47ae2f-2c4d-433e-8bdf-9983eeeafc42,5d3639b1-04 (...)
                      Buffers: shared hit=237
          ->  Hash  (cost=10098.06..10098.06 rows=138206 width=45) (actual time=95.495..95.495 rows=138206 loops=1)
                Output: a.sq, a.uuid
                Buckets: 16384  Batches: 1  Memory Usage: 10393kB
                Buffers: shared hit=8716
                ->  Seq Scan on isi.rissue a  (cost=0.00..10098.06 rows=138206 width=45) (actual time=0.005..58.225 rows=138206 loops=1)
                      Output: a.sq, a.uuid
                      Buffers: shared hit=8716
  CTE subject_articles
    ->  Merge Join  (cost=9660996.21..9896868.27 rows=13571895 width=16) (actual time=229449.020..259557.073 rows=2513896 loops=1)
          Output: b.article_id
          Merge Cond: ((a.sq)::text = (b.journal_id)::text)
          Buffers: shared hit=519891 read=4347235, temp read=234498 written=234492
          ->  Sort  (cost=2711.01..2783.05 rows=28818 width=32) (actual time=224.901..230.615 rows=30288 loops=1)
                Output: a.sq
                Sort Key: a.sq
                Sort Method: quicksort  Memory: 2188kB
                Buffers: shared hit=12232
                ->  CTE Scan on subject_journals a  (cost=0.00..576.36 rows=28818 width=32) (actual time=99.764..152.459 rows=30291 loops=1)
                      Output: a.sq
                      Buffers: shared hit=12232
          ->  Materialize  (cost=9658285.21..9722584.29 rows=12859816 width=24) (actual time=229223.851..253191.308 rows=14664245 loops=1)
                Output: b.article_id, b.journal_id
                Buffers: shared hit=507659 read=4347235, temp read=234498 written=234492
                ->  Sort  (cost=9658285.21..9690434.75 rows=12859816 width=24) (actual time=229223.846..251142.167 rows=14072645 loops=1)
                      Output: b.article_id, b.journal_id
                      Sort Key: b.journal_id
                      Sort Method: external merge  Disk: 467704kB
                      (...)
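For reference, the same result set can also be expressed without the CTEs and the INTERSECT, using explicit JOINs and an EXISTS (a sketch only, untested here and not necessarily faster; the tables and columns are the same as in the query above):

SELECT DISTINCT b.article_id AS art_id
FROM isi.isi_l1_publication b
JOIN isi.rissue a          ON a.sq = b.journal_id
JOIN isi.rsc_joern_link c  ON c.rj_id = a.uuid
JOIN isi.ritem r           ON r.ut = b.article_id
WHERE c.sc_id IN ('d0963875-e438-4923-b3fa-f462e8975221',
                  '04e14284-09c8-421a-b1ad-c8238051601a',
                  '04e2189f-cd2a-44f0-b98d-52f6bb5dcd78',
                  'f5521c65-ec49-408a-9a42-8a69d47703cd',
                  '2e47ae2f-2c4d-433e-8bdf-9983eeeafc42',
                  '5d3639b1-04c2-4d94-a99a-5323277fd2b7')
  AND b.publication_year BETWEEN '2000' AND '2012'
  AND r.dt IN ('@ Article', 'Review')
  AND EXISTS (SELECT 1
              FROM isi.art_country_link l
              WHERE l.countrycode = 'ZA'
                AND l.art_id = b.article_id);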
Re: [PERFORM] query not using index
On 19 December 2013 16:48, Tom Lane wrote:
> Johann Spies writes:
> > I would appreciate some help optimising the following query:
>
> It's a mistake to imagine that indexes are going to help much with
> a join of this size.  Hash or merge join is going to be a lot better
> than nestloop.  What you need to do is make sure those will perform
> as well as possible, and to that end, it'd likely help to raise
> work_mem.  I'm not sure if you can sanely put it high enough to
> make the query operate totally in memory --- it looks like you'd
> need work_mem of 500MB or more to prevent any of the sorts or
> hashes from spilling to disk, and keep in mind that this query
> is going to use several times work_mem because there are multiple
> sorts/hashes going on.  But if you can transiently dedicate a lot
> of RAM to this query, that should help some.  I'd suggest increasing
> work_mem via a SET command in the particular session running this
> query --- you don't want such a high value to be the global default.

Thanks Tom. Raising work_mem from 384MB to 512MB made a significant difference.

You said "hash or merge join is going to be a lot better than nestloop". Is that purely in the hands of the query planner, or what can I do to get the planner to use those options apart from raising work_mem?

Regards
Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
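For reference, the per-session setting Tom suggests looks like this (a sketch; the value is only an example and has to be sized to the machine and to the number of concurrent sorts/hashes):

-- Raise work_mem only for the current session, not globally.
SET work_mem = '512MB';

-- Or limit the change to a single transaction:
BEGIN;
SET LOCAL work_mem = '512MB';
-- ... run the large reporting query here ...
COMMIT;

The choice between nestloop, hash and merge join itself stays with the planner; planner switches such as enable_nestloop can be turned off for testing, but they are normally left at their defaults and the planner is steered indirectly through work_mem, statistics and indexes.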
[PERFORM] The same query - much different runtimes
While waiting for a query (activated through a web interface) to finish, I ran the same query using psql over an ssh connection and got a very different runtime. I have configured the server to log queries taking more than five seconds, and in the log the query I was waiting for appears as:

2014-04-07 12:01:38 SAST LOG:  duration: 466754.684 ms  plan:
  Query Text: SELECT isi_alt_names.code
              FROM rresearch, isi_alt_names
              WHERE ((((UPPER(rresearch.ny) = 'GUANGZHOU')
                  AND (UPPER(rresearch.nu) = 'PEOPLES R CHINA'))
                  AND (isi_alt_names.rsc_id = rresearch.id))
                  AND (isi_alt_names.code IS NOT NULL))
              ORDER BY rresearch.id, isi_alt_names.id
              LIMIT 2 OFFSET 0;
  Limit  (cost=384216.93..384216.94 rows=2 width=15)
    ->  Sort  (cost=384216.93..384244.77 rows=11137 width=15)
          Sort Key: rresearch.id, isi_alt_names.id
          ->  Nested Loop  (cost=138757.99..384105.56 rows=11137 width=15)
                ->  Bitmap Heap Scan on rresearch  (cost=138757.99..161224.50 rows=11337 width=4)
                      Recheck Cond: ((upper((ny)::text) = 'GUANGZHOU'::text) AND (upper((nu)::text) = 'PEOPLES R CHINA'::text))
                      ->  BitmapAnd  (cost=138757.99..138757.99 rows=11337 width=0)
                            ->  Bitmap Index Scan on rresearch_ny_idx  (cost=0.00..4930.62 rows=215233 width=0)
                                  Index Cond: (upper((ny)::text) = 'GUANGZHOU'::text)
                            ->  Bitmap Index Scan on rresearch_nu_idx  (cost=0.00..133821.46 rows=6229156 width=0)
                                  Index Cond: (upper((nu)::text) = 'PEOPLES R CHINA'::text)
                ->  Index Scan using isi_alt_countrynames_rsc_id_idx on isi_alt_names  (cost=0.00..19.65 rows=1 width=15)
                      Index Cond: (rsc_id = rresearch.id)
                      Filter: (code IS NOT NULL)

While this was going on, I only changed the query to include the schema (the web-based query used search_path) and ran it. EXPLAIN ANALYZE said:

Limit  (cost=384288.35..384288.36 rows=2 width=15) (actual time=2945.338..2945.340 rows=2 loops=1)
  Output: isi_alt_names.code, rresearch.id, isi_alt_names.id
  Buffers: shared hit=1408146
  ->  Sort  (cost=384288.35..384316.20 rows=11137 width=15) (actual time=2945.338..2945.338 rows=2 loops=1)
        Output: isi_alt_names.code, rresearch.id, isi_alt_names.id
        Sort Key: rresearch.id, isi_alt_names.id
        Sort Method: top-N heapsort  Memory: 25kB
        Buffers: shared hit=1408146
        ->  Nested Loop  (cost=138757.99..384176.98 rows=11137 width=15) (actual time=1530.875..2876.376 rows=241920 loops=1)
              Output: isi_alt_names.code, rresearch.id, isi_alt_names.id
              Buffers: shared hit=1408146
              ->  Bitmap Heap Scan on isi.rresearch  (cost=138757.99..161224.50 rows=11337 width=4) (actual time=1530.848..1750.169 rows=241337 loops=1)
                    Output: rresearch.id, rresearch.cn, rresearch.nf, rresearch.nc, rresearch.nd, rresearch.nn, rresearch.ny, rresearch.np, rresearch.nu, rresearch.nz, rresearch.uuid, rresearch.tsv
                    Recheck Cond: ((upper((rresearch.ny)::text) = 'GUANGZHOU'::text) AND (upper((rresearch.nu)::text) = 'PEOPLES R CHINA'::text))
                    Buffers: shared hit=195242
                    ->  BitmapAnd  (cost=138757.99..138757.99 rows=11337 width=0) (actual time=1484.363..1484.363 rows=0 loops=1)
                          Buffers: shared hit=31173
                          ->  Bitmap Index Scan on rresearch_ny_idx  (cost=0.00..4930.62 rows=215233 width=0) (actual time=60.997..60.997 rows=241354 loops=1)
                                Index Cond: (upper((rresearch.ny)::text) = 'GUANGZHOU'::text)
                                Buffers: shared hit=1124
                          ->  Bitmap Index Scan on rresearch_nu_idx  (cost=0.00..133821.46 rows=6229156 width=0) (actual time=1350.819..1350.819 rows=6434248 loops=1)
                                Index Cond: (upper((rresearch.nu)::text) = 'PEOPLES R CHINA'::text)
                                Buffers: shared hit=30049
              ->  Index Scan using isi_alt_countrynames_rsc_id_idx on isi.isi_alt_names  (cost=0.00..19.65 rows=1 width=15) (actual time=0.003..0.004 rows=1 loops=241337)
                    Output: isi_alt_names.rsc_id, isi_alt_names.code, isi_alt_names.id, isi_alt_names.institution
                    Index Cond: (isi_alt_names.rsc_id = rresearch.id)
                    Filter: (isi_alt_names.code IS NOT NULL)
                    Buffers: shared hit=1212904
Total runtime: 2945.400 ms

I then ran the query itself and the result was produced in about the same time (2945 ms). What can cause such a huge discrepancy? I have checked and there was no other process blocking the query.

Regards
Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
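One way to put the two runs on an equal footing is to capture buffer statistics in both sessions with the same command (a sketch; the query is the fully schema-qualified version of the one above):

-- "shared read" counts pages that had to come from disk or the OS cache,
-- "shared hit" counts pages already in shared_buffers; a cold first run
-- shows large read counts, a warm repeat shows almost only hits.
EXPLAIN (ANALYZE, BUFFERS)
SELECT isi_alt_names.code
FROM isi.rresearch, isi.isi_alt_names
WHERE upper(rresearch.ny) = 'GUANGZHOU'
  AND upper(rresearch.nu) = 'PEOPLES R CHINA'
  AND isi_alt_names.rsc_id = rresearch.id
  AND isi_alt_names.code IS NOT NULL
ORDER BY rresearch.id, isi_alt_names.id
LIMIT 2 OFFSET 0;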
[PERFORM] Specifications for a new server
I am busy reading Gregory Smith's PostgreSQL 9.0 High Performance, and when the book was written he seemed to me a bit sceptical about SSDs. I suspect the reliability of SSDs has improved significantly since then.

Our present server (128GB RAM, 2.5TB disk space and 12 CPU cores - RAID 10) will become a development server and we are going to buy a new server. At the moment the 'base' directory uses 1.5TB of disk space and there is still more data to come. The database contains bibliometric data that receives updates on a weekly basis, but apart from that not much changes except for cleaning of data by a few persons. Some of the queries can take many hours to finish.

On our present system there are sometimes more than 300GB in temporary files, which I suspect will not be the case on the new system with its much larger RAM. Analysis of the SAR logs showed that there was too much iowait in the CPUs on the old system, which has a lower-spec CPU than the ones considered for the new system.

We are possibly looking at the following hardware:

CPU: 2 x Ivy Bridge 8C E5-2667V2 3.3G 25M 8GT/s QPI - 16 cores
RAM: 24 x 32GB DDR3-1866 2Rx4 LP ECC REG RoHS - 768GB

with enough disk space - about 4.8TB on RAID 10.

My question is about the possible advantage and usage of SSD disks in the new server. At the moment I am considering using 2 x 200GB SSDs for a separate partition for temporary files and 2 x 100GB for the operating system.

So my questions:

1. Will the SSDs in this case be worth the cost?
2. What will be the best way to utilize them in the system?

Regards
Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
Re: [PERFORM] Specifications for a new server
On 6 May 2014 13:07, Michael Stone wrote:
> On Tue, May 06, 2014 at 11:13:42AM +0200, Johann Spies wrote:
>> Analysis of the SAR logs showed that there was too much iowait in the
>> CPUs on the old system which has a lower-spec CPU than the ones
>> considered for the new system.
>
> iowait means the cpu is doing nothing but waiting for data from the disk.
> buying faster cpus means that they will be able to spend more time waiting
> for data from the disk. you'd probably get much better bang for the buck
> upgrading the storage subsystem than throwing more money at cpus.

In that case I apologise for making the wrong assumption. People who are more experienced than me analyzed the logs and told me that, to their surprise, the CPUs were under pressure. I just assumed that the iowait was the problem, having looked at the logs myself.

> If you're talking about SSDs for the OS, that's a complete waste; there is
> essentially no I/O relating to the OS once you've booted.

I also thought this might be overkill, but I was not sure.

>> So my questions:
>>
>> 1. Will the SSDs in this case be worth the cost?
>> 2. What will be the best way to utilize them in the system?
>
> The best way to utilize them would probably be to spend less on the CPU
> and RAM and more on the storage, and use SSD either for all of the storage
> or for specific items that have a high level of I/O (such as the indexes).
> Can't be more specific than that without a lot more information about the
> database, how it is utilized, and what's actually slow.

I understand your remark about the CPU in the light of my wrong assumption earlier, but I do not understand your remark about the RAM. The fact that temporary files of up to 250GB are created at times during complex queries is to me an indication of too little RAM.

Question: How do I dedicate a partition to indexes? Where do I configure PostgreSQL to write them to a particular area?

Regards
Johann
Re: [PERFORM] Specifications for a new server
On 8 May 2014 10:11, Johann Spies wrote:
> Question: How do I dedicate a partition to indexes? Where do I configure
> PostgreSQL to write them to a particular area?

I just discovered TABLESPACE, which answered my question.

Regards
Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
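For the archives, the commands involved look roughly like this (a sketch only; the mount point and the second index name are made-up examples):

-- The directory must already exist, live on the desired partition and be
-- owned by the postgres OS user.
CREATE TABLESPACE index_space LOCATION '/ssd/pg_indexes';

-- Put a new index there ...
CREATE INDEX rauthor_rart_id_idx2 ON isi.rauthor (rart_id) TABLESPACE index_space;

-- ... or move an existing one (this rewrites the index and locks it while doing so).
ALTER INDEX isi.rauthor_rart_id_idx SET TABLESPACE index_space;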
Re: [PERFORM] pgtune + configurations with 9.3
I have done some tests using pgbench-tools with different configurations on our new server with 768GB RAM, and it seems that for our purpose 32GB shared_buffers would give the best results.

Regards
Johann

On 17 November 2014 at 07:17, Stuart Bishop wrote:
> On 15 November 2014 06:00, Mark Kirkwood wrote:
>
>> It is probably time to revisit this 8GB limit with some benchmarking. We
>> don't really have a hard and fast rule that is known to be correct, and that
>> makes Alexey's job really difficult. Informally folk (including myself at
>> times) have suggested:
>>
>> min(ram/4, 8GB)
>>
>> as the 'rule of thumb' for setting shared_buffers. However I was recently
>
> It would be nice to have more benchmarking and improve the rule of
> thumb. I do, however, believe this is orthogonal to fixing pgtune
> which I think should be using the current rule of thumb (which is
> overwhelmingly min(ram/4, 8GB) as you suggest).
>
>> benchmarking a machine with a lot of ram (1TB) and entirely SSD storage [1],
>> and that seemed quite happy with 50GB of shared buffers (better performance
>> than with 8GB). Now shared_buffers was not the variable we were
>> concentrating on so I didn't get too carried away and try much bigger than
>> about 100GB - but this seems like a good thing to come out with some numbers
>> for i.e pgbench read write and read only tps vs shared_buffers 1 -> 100 GB
>> in size.
>
> I've always thought the shared_buffers setting would need to factor in
> things like CPU speed and memory access, since the rational for the
> 8GB cap has always been the cost to scan the data structures. And the
> kernel would factor in too, since the PG specific algorithms are in
> competition with the generic OS algorithms. And size of the hot set,
> since this gets pinned in shared_buffers. Urgh, so many variables.
>
> --
> Stuart Bishop
> http://www.stuartbishop.net/

--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
Re: [PERFORM] pgtune + configurations with 9.3
Hello Greame,

> It's probably helpful if everyone sharing this information can post their
> measurement process / settings and the results as completely as possible,
> for comparison and reference.

Apologies. I have only changed one parameter in postgresql.conf for the tests and that was shared_buffers:

shared_buffers = 32GB                     # min 128k
shared_preload_libraries = 'auto_explain' # (change requires restart)
vacuum_cost_delay = 5                     # 0-100 milliseconds
wal_sync_method = open_sync               # the default is the first option
wal_buffers = -1                          # min 32kB, -1 sets based on shared_buffers
checkpoint_completion_target = 0.9        # checkpoint target duration, 0.0 - 1.0
checkpoint_warning = 30s                  # 0 disables
default_statistics_target = 100           # range 1-1
log_line_prefix = '%t '                   # special values:
log_statement = 'all'                     # none, ddl, mod, all
log_timezone = 'localtime'
autovacuum_vacuum_scale_factor = 0.1      # fraction of table size before vacuum
autovacuum_vacuum_cost_delay = 5ms        # default vacuum cost delay for
datestyle = 'iso, dmy'
timezone = 'localtime'
lc_messages = 'en_ZA.UTF-8'               # locale for system error message
lc_monetary = 'en_ZA.UTF-8'               # locale for monetary formatting
lc_numeric = 'en_ZA.UTF-8'                # locale for number formatting
lc_time = 'en_ZA.UTF-8'                   # locale for time formatting
default_text_search_config = 'pg_catalog.english'
auto_explain.log_min_duration = '6s'      # Gregory Smith page 180
effective_cache_size = 512GB              # pgtune wizard 2014-09-25
work_mem = 4608MB                         # pgtune wizard 2014-09-25
checkpoint_segments = 16                  # pgtune wizard 2014-09-25
max_connections = 80                      # pgtune wizard 2014-09-25

And pgbench-tools - the default configuration:

BASEDIR=`pwd`
PGBENCHBIN=`which pgbench`
TESTDIR="tests"
SKIPINIT=0
TABBED=0
OSDATA=1
TESTHOST=localhost
TESTUSER=`whoami`
TESTPORT=5432
TESTDB=pgbench
RESULTHOST="$TESTHOST"
RESULTUSER="$TESTUSER"
RESULTPORT="$TESTPORT"
RESULTDB=results
MAX_WORKERS=""
SCRIPT="select.sql"
SCALES="1 10 100 1000"
SETCLIENTS="1 2 4 8 16 32"
SETTIMES=3
RUNTIME=60
TOTTRANS=""
SETRATES=""

The server:

# See Gregory Smith: High Performance PostgreSQL 9.0, pages 81,82 for the next lines
vm.swappiness=0
vm.overcommit_memory=2
vm.dirty_ratio = 2
vm.dirty_background_ratio=1
# Maximum shared segment size in bytes
kernel.shmmax = 406622322688
# Maximum number of shared memory segments in pages
kernel.shmall = 99273028

$ free
             total       used       free     shared    buffers     cached
Mem:     794184164  792406416    1777748          0     123676  788079892
-/+ buffers/cache:    4202848  789981316
Swap:      7906300          0    7906300

I have attached the resulting graphs.

Regards
Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
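To confirm which of these values actually took effect for a given test run, the live settings can be read from psql (a sketch; the list of names is arbitrary):

SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'effective_cache_size',
               'checkpoint_segments', 'wal_buffers', 'max_connections');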
Re: [PERFORM] pgtune + configurations with 9.3
Another apology: my pg_version is 9.3 and here are more up-to-date PNGs.

On 26 November 2014 at 15:34, Johann Spies wrote:
> Hello Greame,
>
>> It's probably helpful if everyone sharing this information can post their
>> measurement process / settings and the results as completely as possible,
>> for comparison and reference.
>
> Apologies. I have only changed one parameter in postgresql.conf for the
> tests and that was shared_buffers:
>
> [postgresql.conf, pgbench-tools and server settings quoted in full in the previous message]
>
> I have attached the resulting graphs.
>
> Regards
> Johann

--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
[PERFORM] Slow query - lots of temporary files.
I have stopped this query after about 16 hours. At the same time I ran an 'explain analyze' on the same query to find out why it took so long. These two processes generated temporary files of 173GB in /var/lib/postgresql/9.4/main/base/pgsql_tmp.

COPY (SELECT A.ut,
             B.go AS funding_org,
             B.gn AS grant_no,
             C.gt AS thanks,
             D.au
      FROM isi.funding_text C,
           isi.rauthor D,
           isi.africa_uts A
           LEFT JOIN isi.funding_org B ON (B.ut = A.ut)
      WHERE (C.ut IS NOT NULL OR B.ut IS NOT NULL)
        AND D.rart_id = C.ut
        AND C.ut = B.ut
      GROUP BY A.ut, GO, gn, gt, au
      ORDER BY funding_org)
TO '/tmp/africafunding2.csv' WITH csv quote '"' DELIMITER ',';

A modified version of this query finished in 1 min 27 sec:

COPY (SELECT 'UT'||A.ut,
             B.go AS funding_org,
             B.gn AS grant_no,
             C.gt AS thanks
      FROM isi.africa_uts A
           LEFT JOIN isi.funding_org B ON (B.ut = A.ut)
           LEFT JOIN isi.funding_text C ON (A.ut = C.ut)
      WHERE (C.ut IS NOT NULL OR B.ut IS NOT NULL)
      GROUP BY A.ut, GO, gn, gt)
TO '/tmp/africafunding.csv' WITH csv quote '"' DELIMITER ',';

As I said, the 'explain analyze' of the problematic query contributed to the 173GB of temporary files and did not finish in about 16 hours.

Just an explain of the query produces this:

Sort  (cost=4781458203.46..4798118612.44 rows=6664163593 width=390)
  Output: a.ut, b.go, b.gn, c.gt, (array_to_string(array_agg(d.au), ';'::text)), b.go, b.gn, d.au
  Sort Key: b.go
  ->  GroupAggregate  (cost=2293037801.73..2509623118.51 rows=6664163593 width=390)
        Output: a.ut, b.go, b.gn, c.gt, array_to_string(array_agg(d.au), ';'::text), b.go, b.gn, d.au
        Group Key: a.ut, b.go, b.gn, c.gt, d.au
        ->  Sort  (cost=2293037801.73..2309698210.72 rows=6664163593 width=390)
              Output: a.ut, c.gt, b.go, b.gn, d.au
              Sort Key: a.ut, b.go, b.gn, c.gt, d.au
              ->  Merge Join  (cost=4384310.92..21202716.78 rows=6664163593 width=390)
                    Output: a.ut, c.gt, b.go, b.gn, d.au
                    Merge Cond: ((c.ut)::text = (d.rart_id)::text)
                    ->  Merge Join  (cost=635890.84..1675389.41 rows=6069238 width=412)
                          Output: c.gt, c.ut, a.ut, b.go, b.gn, b.ut
                          Merge Cond: ((c.ut)::text = (b.ut)::text)
                          Join Filter: ((c.ut IS NOT NULL) OR (b.ut IS NOT NULL))
                          ->  Merge Join  (cost=635476.30..675071.77 rows=1150354 width=348)
                                Output: c.gt, c.ut, a.ut
                                Merge Cond: ((a.ut)::text = (c.ut)::text)
                                ->  Index Only Scan using africa_ut_idx on isi.africa_uts a  (cost=0.42..19130.19 rows=628918 width=16)
                                      Output: a.ut
                                ->  Sort  (cost=632211.00..640735.23 rows=3409691 width=332)
                                      Output: c.gt, c.ut
                                      Sort Key: c.ut
                                      ->  Seq Scan on isi.funding_text c  (cost=0.00..262238.91 rows=3409691 width=332)
                                            Output: c.gt, c.ut
                          ->  Index Scan using funding_org_ut_idx on isi.funding_org b  (cost=0.56..912582.50 rows=9835492 width=64)
                                Output: b.id, b.ut, b.go, b.gn
                    ->  Materialize  (cost=0.57..17914892.46 rows=159086560 width=26)
                          Output: d.id, d.rart_id, d.au, d.ro, d.ln, d.af, d.ras, d.ad, d.aa, d.em, d.ag, d.tsv
                          ->  Index Scan using rauthor_rart_id_idx on isi.rauthor d  (cost=0.57..17517176.06 rows=159086560 width=26)
                                Output: d.id, d.rart_id, d.au, d.ro, d.ln, d.af, d.ras, d.ad, d.aa, d.em, d.ag, d.tsv

Any idea why adding the rauthor table to the query is so problematic?

My system:
768 GB RAM
shared_buffers: 32GB
work_mem: 4608MB

Regards
Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
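As a side note on spotting this kind of problem while a query is still running, temporary-file usage can be watched from the server side (a sketch; log_temp_files needs superuser rights and the threshold is only an example):

-- Log every temporary file larger than ~100MB when it is released.
SET log_temp_files = '100MB';

-- Cumulative temporary-file counters per database since the last stats reset.
SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_written
FROM pg_stat_database
ORDER BY temp_bytes DESC;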
Re: [PERFORM] Slow query - lots of temporary files.
On 10 June 2015 at 15:02, Claudio Freire wrote:
> The joins are different on both versions, and the most likely culprit
> is the join against D. It's probably wrong, and the first query is
> building a cartesian product.
>
> Without more information about the schema it's difficult to be sure though.

Thanks for your reply. I will experiment further with different joins.

Here is the schema of the involved tables:

nkb=# \d isi.funding_text
                              Table "isi.funding_text"
 Column |         Type          |                           Modifiers
--------+-----------------------+----------------------------------------------------------------
 id     | integer               | not null default nextval('isi.funding_text_id_seq'::regclass)
 ut     | character varying(15) |
 gt     | citext                |
Indexes:
    "funding_text_pkey" PRIMARY KEY, btree (id)
    "funding_text_ut_idx" btree (ut)
Foreign-key constraints:
    "funding_text_ut_fkey" FOREIGN KEY (ut) REFERENCES isi.ritem(ut)

nkb=# \d isi.funding_org
                              Table "isi.funding_org"
 Column |         Type          |                          Modifiers
--------+-----------------------+---------------------------------------------------------------
 id     | integer               | not null default nextval('isi.funding_org_id_seq'::regclass)
 ut     | character varying(15) |
 go     | citext                |
 gn     | character varying     |
Indexes:
    "funding_org_pkey" PRIMARY KEY, btree (id)
    "funding_org_ut_idx" btree (ut)
Foreign-key constraints:
    "funding_org_ut_fkey" FOREIGN KEY (ut) REFERENCES isi.ritem(ut)

                              Table "isi.africa_uts"
 Column |         Type          |                          Modifiers
--------+-----------------------+--------------------------------------------------------------
 ut     | character varying(15) |
 id     | integer               | not null default nextval('isi.africa_uts_id_seq'::regclass)
Indexes:
    "africa_uts_pkey" PRIMARY KEY, btree (id)
    "africa_ut_idx" btree (ut)
Foreign-key constraints:
    "africa_uts_ut_fkey" FOREIGN KEY (ut) REFERENCES isi.ritem(ut)

                              Table "isi.rauthor"
 Column  |          Type          |                        Modifiers
---------+------------------------+-----------------------------------------------------------
 id      | integer                | not null default nextval('isi.rauthor_id_seq'::regclass)
 rart_id | character varying(15)  |
 au      | character varying(75)  |
 ro      | character varying(30)  |
 ln      | character varying(200) |
 af      | character varying(200) |
 ras     | character varying(4)   |
 ad      | integer                |
 aa      | text                   |
 em      | character varying(250) |
 ag      | character varying(75)  |
 tsv     | tsvector               |
Indexes:
    "rauthor_pkey" PRIMARY KEY, btree (id) CLUSTER
    "rauthor_ad_idx" btree (ad)
    "rauthor_au_idx" btree (au)
    "rauthor_lower_idx" btree (lower(au::text))
    "rauthor_lower_lower1_idx" btree (lower(ln::text), lower(af::text))
    "rauthor_rart_id_idx" btree (rart_id)
    "rauthor_tsv_idx" gin (tsv)
Referenced by:
    TABLE "level1.person" CONSTRAINT "person_auth_id_fkey" FOREIGN KEY (auth_id) REFERENCES isi.rauthor(id) ON DELETE CASCADE
Triggers:
    tsvectorupdate_for_rauthor BEFORE INSERT OR UPDATE ON isi.rauthor FOR EACH ROW EXECUTE PROCEDURE isi.update_rauthor_tsv()

Regards
Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
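For the experiments with different joins, one fully spelled-out form to compare against is the fast query with isi.rauthor added as a further LEFT JOIN and the authors aggregated per article (a sketch only - it assumes that one author list per article/funding row is the intended result, which the original query does not make explicit, and it is untested here):

COPY (
  SELECT 'UT' || a.ut,
         b.go AS funding_org,
         b.gn AS grant_no,
         c.gt AS thanks,
         array_to_string(array_agg(DISTINCT d.au), ';') AS authors
  FROM isi.africa_uts a
  LEFT JOIN isi.funding_org  b ON b.ut = a.ut
  LEFT JOIN isi.funding_text c ON c.ut = a.ut
  LEFT JOIN isi.rauthor      d ON d.rart_id = a.ut
  WHERE c.ut IS NOT NULL OR b.ut IS NOT NULL
  GROUP BY a.ut, b.go, b.gn, c.gt
  ORDER BY funding_org
) TO '/tmp/africafunding2.csv' WITH csv quote '"' DELIMITER ',';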
Re: [PERFORM] Slow query - lots of temporary files.
On 10 June 2015 at 16:50, Tomas Vondra wrote:
> The problematic piece of the explain plan is this:
>
>     ->  Merge Join  (cost=4384310.92..21202716.78 rows=6664163593 width=390)
>           Output: a.ut, c.gt, b.go, b.gn, d.au
>           Merge Cond: ((c.ut)::text = (d.rart_id)::text)
>
> That is, the planner expects ~6.7 billion rows, each ~390B wide. That's
> ~2.5TB of data that needs to be stored to disk (so that the sort can
> process it).
>
> The way the schema is designed might be one of the issues - ISTM the 'ut'
> column is somehow universal, mixing values referencing different columns in
> multiple tables. Not only that's utterly misleading for the planner (and
> may easily cause issues with huge intermediate results), but it also makes
> formulating the queries very difficult. And of course, the casting between
> text and int is not very good either.
>
> Fix the schema to follow relational best practices - separate the values
> into multiple columns, and most of this will go away.

Thanks for your reply, Tomas.

I do not understand what the problem with the 'ut' column is. It is a unique identifier in the first table (africa_uts), it is used in the other tables to establish joins, and it has the same type definition in all the tables. Is the problem the similar name? The 'ut' columns of the different tables all refer to the same data.

I do no casting of integers into text in this case. I don't know why the planner is doing it. The field 'rart_id' in isi.rauthor is just another name for 'ut' in the other tables and has the same datatype.

I do not understand your remark "separate the values into multiple columns". I cannot see which values can be separated into different columns in the schema. Do you mean in the query? Why?

Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
[PERFORM] Long running query: How to monitor the progress
Working with 9.4. We are in the process of unpacking complicated XML data into tables. The XML data is already in a table with two fields (id, xml) - 47+ million records. Some of our queries to extract the data and insert it into other tables run for days, and in one case we have created a table with 758 million unique records.

Now my question: is there a way to monitor the progress of a long-running query like this?

I have recently read that it is probably better for processes like this to copy the result of the query to a csv file and then import it again with COPY as an insert. Next time I will try that.

The following query has been running for 6 days now and is still running (I have anonymized it a little bit) on a server with 768 GB RAM. It has created 44 temporary files so far:

INSERT INTO table_a_link (uid, gn_id)
WITH p AS (
    SELECT ARRAY[ARRAY['t','some_xpath']] AS some_xpath),
q AS (
    SELECT (xpath('//t:UID/text()', xml, some_xpath))[1] uid,
           unnest(xpath('//t:grant', xml, some_xpath)) AS gr
    FROM source.xml_data a, p
    WHERE xpath_exists('//t:grant/t:grant_agency', xml, some_xpath)),
r AS (
    SELECT CASE
             WHEN xpath_exists('//t:grant_ids', gr, some_xpath)
             THEN unnest(xpath('//t:grant_ids', gr, some_xpath))
             ELSE NULL
           END AS gr_ids
    FROM q, p),
y AS (
    SELECT A.UUID AS fo_id,
           /* unnest(xpath('//t:grant_agency/text()', GR, ns))::citext AS agency, */
           CASE
             WHEN xpath_exists('//t:grant_id', gr_ids, some_xpath)
             THEN unnest(xpath('//t:grant_id/text()', gr_ids, some_xpath))::citext
             ELSE NULL
           END grant_no,
           uid::varchar(19)
    FROM WOS.FUNDING_ORG A, p, q
         LEFT JOIN r
           ON (xpath('//t:grant/t:grant_ids/t:grant_id/text()', gr, ARRAY[ARRAY['t','some_xpath']])::citext
               = xpath('//t:grant_id/text()', gr_ids, ARRAY[ARRAY['t','some_xpath']])::citext)
    WHERE A.FUNDING_ORG = (xpath('//t:grant_agency/text()', GR, some_xpath))[1]::citext)
SELECT DISTINCT y.uid, B.uuid AS gn_id
FROM y, table_b B
WHERE y.fo_id = B.fo_id
  AND y.grant_no IS NOT DISTINCT FROM b.grant_no;

Regards.
Johann
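There is no built-in per-query progress counter in 9.4, but the age and state of the backend can at least be watched while it runs (a sketch using the standard statistics view):

-- How long each active backend has been running its current statement.
SELECT pid,
       now() - query_start AS runtime,
       state,
       left(query, 60) AS current_query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY runtime DESC;

Growth of the files under base/pgsql_tmp, and the temp_files/temp_bytes counters in pg_stat_database, give a rough feel for how much the query is spilling to disk.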
[PERFORM] Delete, foreign key, index usage
While updating our database, which includes a lot of deletions where a lot of foreign key references are involved, we found that in the case of two tables the indexes are ignored and that slows down the process a lot.

Here are stats for those two tables:

 relname     | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | n_mod_since_analyze
-------------+----------+--------------+----------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------
 belongs_to  |      227 |  52539487559 |        0 |             0 |       771 |         0 |      1459 |             0 |        125 |       1459 |                2230
 publication |      229 |  11502854612 |        0 |             0 |       254 |         0 |       229 |             0 |         60 |        229 |                 483

Publication (more than 50 million records) has a foreign key (ut) that references the top of the chain of references. This field (ut) is also the primary key of publication. In the case of belongs_to (about 231393000 records), which references the same table (article), ut has an index.

All other tables in this dependency chain report 100% or near 100% usage of the indexes, e.g. citation_2010_2014: seq_scan 0, seq_tup_read 0, idx_scan 226, idx_tup_fetch 1882, n_tup_ins 2510, n_tup_upd 0, n_tup_del 1910, n_tup_hot_upd 0, n_live_tup 816, n_dead_tup 1910, n_mod_since_analyze 4420.

The indexes are on an SSD and we have set random_page_cost to 1 for those queries.

The definition of belongs_to:

CREATE TABLE wos_2017_1.belongs_to
(
  suborg_id uuid,
  organisation_id uuid,
  address_id uuid,
  ut citext,
  uuid uuid NOT NULL,
  id integer NOT NULL DEFAULT nextval('wos_2017_1.belongs2_id_seq'::regclass),
  pref_name_id uuid,
  addr_no smallint,
  reprint_addr_no smallint,
  CONSTRAINT belongs2_pkey PRIMARY KEY (uuid),
  CONSTRAINT belongs_to_address_id_fkey FOREIGN KEY (address_id)
      REFERENCES wos_2017_1.address (uuid) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE SET NULL,
  CONSTRAINT belongs_to_pref_name_id_fkey FOREIGN KEY (pref_name_id)
      REFERENCES wos_2017_1.org_pref_name (uuid) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE SET NULL,
  CONSTRAINT belongs_to_suborg_id_fkey FOREIGN KEY (suborg_id)
      REFERENCES wos_2017_1.suborg (uuid) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE SET NULL,
  CONSTRAINT belongs_to_ut_fkey FOREIGN KEY (ut)
      REFERENCES wos_2017_1.article (ut) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE CASCADE,
  CONSTRAINT belongs2_id_key UNIQUE (id),
  CONSTRAINT belongs2_ut_suborg_id_organisation_id_address_id_addr_no_re_key
      UNIQUE (ut, suborg_id, organisation_id, address_id, addr_no, reprint_addr_no, pref_name_id)
)
WITH (
  OIDS=FALSE
);

with indexes on address_id, organisation_id, pref_name_id and ut.

I have also tried setting enable_seqscan to false for these queries, but there is still no usage of the indexes.

Why would that be?

Regards
Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
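For what it is worth, the time spent in each foreign-key check can be made visible by explaining a single delete inside a transaction and rolling it back (a sketch; the ut value is a made-up placeholder):

BEGIN;

-- EXPLAIN ANALYZE really executes the DELETE, so keep it inside a
-- transaction and roll back.  The output ends with one
-- "Trigger for constraint ..." line per foreign key, showing where
-- the time goes.
EXPLAIN (ANALYZE, BUFFERS)
DELETE FROM wos_2017_1.article
WHERE ut = 'WOS:000000000000001';  -- placeholder value

ROLLBACK;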
Re: [PERFORM] Delete, foreign key, index usage
On 4 April 2017 at 14:07, Johann Spies wrote:
> Why would that be?

To answer my own question: after experimenting a lot we found that 9.6 uses a parallel seq scan that is actually a lot faster than using the index on these large tables. This was, to us, a surprise!

Regards
Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
Re: [PERFORM] Delete, foreign key, index usage
> On Wed, Apr 5, 2017 at 6:40 AM, Johann Spies wrote:
>> On 4 April 2017 at 14:07, Johann Spies wrote:
>>
>>> Why would that be?
>>
>> To answer my own question: after experimenting a lot we found that
>> 9.6 uses a parallel seq scan that is actually a lot faster than using
>> the index on these large tables.

Further experimenting resulted in a solution which we do not understand:

The table 'publication' had the field 'ut' as primary key and the ut index was not used. So we built an additional btree index (ut) on publication, which was ignored as well. Then we built a gin index (ut) on publication and now it is being used. The same happened on the other table (belongs_to), where the btree index was ignored by the planner but the gin index is used.

As a result our deletes run between 25 and 60 times faster than earlier, with a maximum of about 20 records per hour in comparison with a maximum of 4500 earlier.

In the case of both tables ut has a foreign key reference to another table (article).

Why would the planner prefer to use the gin index and not the btree index in this case?

Regards
Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
Re: [PERFORM] Delete, foreign key, index usage
On 24 April 2017 at 15:17, Tomas Vondra wrote:
> On 04/24/2017 08:48 AM, Johann Spies wrote:
>>
>> Why would the planner prefer to use the gin index and not the btree
>> index in this case?
>
> You'll need to show what queries are you running - that's a quite important
> piece of information, and I don't see it anywhere in this thread. Seeing
> explain plans would also be helpful.

It is a simple "delete from wos_2017_1.article;" which causes a domino effect of deletes due to foreign keys.

In the case of one table with more than 50 million records, where the primary key was also the foreign key, the process only started to use the index when we built a gin index. In the case of the "belongs_to" table (shown in my first email) we first built a btree index on the foreign key - and it was ignored. Only after the gin index was created did it use the index.

Regards.
Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
Re: [PERFORM] Delete, foreign key, index usage
On 4 April 2017 at 14:07, Johann Spies wrote:
> While updating our database which includes a lot of deletions where a lot
> of foreign key references are involved we found that in the case of two
> tables the indexes are ignored and it slows down the process a lot.
> ...
> Why would that be?

After a long time we found the problem: the primary/foreign key fields had different types, varchar and citext. In the case of the two tables where the indexes were ignored, the indexes were built on the 'citext' type while the queries assumed varchar, as was the case in the other tables using the same field.

Lesson learnt: check your types in every field in every table - and we have many tables.

Regards
Johann
--
Because experiencing your loyal love is better than life itself,
my lips will praise you. (Psalm 63:3)
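A quick way to audit this across a large schema is to list the declared type of the key column in every table (a sketch; 'ut' is the column in question here):

-- A row whose data_type/udt_name differs from the rest points at a
-- mismatched key column that can defeat index use in FK checks and joins.
-- citext columns show up as data_type 'USER-DEFINED', udt_name 'citext'.
SELECT table_schema, table_name, column_name, data_type, udt_name
FROM information_schema.columns
WHERE column_name = 'ut'
ORDER BY table_schema, table_name;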