Re: [PERFORM] Workaround: Planner preference for tsquery filter vs. GIN index in fast text search
On 04/20/2014 07:46 AM, Oleg Bartunov wrote:
> btw, 9.4 should be wiser in case of rare+common terms, thanks to GIN
> fast scan feature.

Indeed, although we didn't actually do anything to the planner to make it understand when fast scan helps. Doing something about cost estimation is still on the 9.4 Open Items list, but I don't have any ideas on what to do about it, and I haven't heard anything from Alexander about that either.

That means that the cost estimation issue Laurence saw is going to be even worse in 9.4, because GIN is going to be faster than a seq scan in more cases than before and the planner doesn't know about it.

- Heikki

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] tsearch2, large data and indexes
On 04/20/2014 02:15 AM, Ivan Voras wrote:
> Hello,
>
> If a table contains simple fields as well as large (hundreds of KiB)
> text fields, will accessing only the simple fields cause the entire
> record data, including the large fields, to be read and unpacked?
> (e.g. SELECT int_field FROM table_with_large_text)

No.

> More details: after thinking about it some more, it might have something
> to do with tsearch2 and indexes: the large data in this case is a
> tsvector, indexed with GIN, and the query plan involves a re-check
> condition. The query is of the form: SELECT simple_fields FROM table
> WHERE fts @@ to_tsquery('...'). Does the re-check condition mean that
> the original tsvector data is always read from the table in addition to
> the index?

Yes, if the re-check condition involves the fts column. I don't see why you would have a re-check condition with a query like that, though. Are there some other WHERE-conditions that you didn't show us?

The large fields are stored in the toast table. You can check if the toast table is accessed with a query like this:

select * from pg_stat_all_tables
where relid = (select reltoastrelid from pg_class where relname = 'table');

Run that before and after your query, and see if the numbers change.

- Heikki
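For convenience, a slightly narrower variant of that check - the table name ('documents', used elsewhere in this thread) is an assumption; substitute your own:

```sql
-- Only the counters worth comparing before and after the query under test.
SELECT relname, seq_scan, seq_tup_read, idx_scan, idx_tup_fetch
FROM pg_stat_all_tables
WHERE relid = (SELECT reltoastrelid FROM pg_class
               WHERE relname = 'documents');
```

If idx_scan / idx_tup_fetch on the toast relation grow after running the query, the large fields were fetched and detoasted from the toast table.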
Re: [PERFORM] Workaround: Planner preference for tsquery filter vs. GIN index in fast text search
On Tue, Apr 22, 2014 at 10:28 AM, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:
> On 04/20/2014 07:46 AM, Oleg Bartunov wrote:
>> btw, 9.4 should be wiser in case of rare+common terms, thanks to GIN
>> fast scan feature.
>
> Indeed, although we didn't actually do anything to the planner to make
> it understand when fast scan helps. Doing something about cost
> estimation is still on the 9.4 Open Items list, but I don't have any
> ideas on what to do about it, and I haven't heard anything from
> Alexander about that either.
>
> That means that the cost estimation issue Laurence saw is going to be
> even worse in 9.4, because GIN is going to be faster than a seq scan in
> more cases than before and the planner doesn't know about it.
>
> - Heikki

You are right, we should return to that topic.
Re: [PERFORM] tsearch2, large data and indexes
On 22 April 2014 08:40, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:
> On 04/20/2014 02:15 AM, Ivan Voras wrote:
>> More details: after thinking about it some more, it might have
>> something to do with tsearch2 and indexes: the large data in this case
>> is a tsvector, indexed with GIN, and the query plan involves a re-check
>> condition. The query is of the form: SELECT simple_fields FROM table
>> WHERE fts @@ to_tsquery('...'). Does the re-check condition mean that
>> the original tsvector data is always read from the table in addition
>> to the index?
>
> Yes, if the re-check condition involves the fts column. I don't see why
> you would have a re-check condition with a query like that, though. Are
> there some other WHERE-conditions that you didn't show us?

Yes, I've read about tsearch2 and GIN indexes and there shouldn't be a recheck condition - but there is. This is the query:

SELECT documents.id, title, raw_data, q, ts_rank(fts_data, q, 4) AS rank, html_filename
FROM documents, to_tsquery('document') AS q
WHERE fts_data @@ q
ORDER BY rank DESC
LIMIT 25;

And here is the explain analyze: http://explain.depesz.com/s/4xm

It clearly shows a bitmap index scan operation is immediately followed by a recheck operation AND that the recheck operation actually does something, because it reduces the number of records from 61 to 58 (!!!).

This is the table structure:

nn=# \d documents
                              Table "public.documents"
    Column     |   Type   |                       Modifiers
---------------+----------+--------------------------------------------------------
 id            | integer  | not null default nextval('documents_id_seq'::regclass)
 ctime         | integer  | not null default unix_ts(now())
 dtime         | integer  | not null
 title         | text     | not null
 html_filename | text     | not null
 raw_data      | text     |
 fts_data      | tsvector | not null
 tags          | text[]   |
 dtype         | integer  | not null default 0
 flags         | integer  | not null default 0
Indexes:
    "documents_pkey" PRIMARY KEY, btree (id)
    "documents_html_filename" UNIQUE, btree (html_filename)
    "documents_dtime" btree (dtime)
    "documents_fts_data" gin (fts_data)
    "documents_tags" gin (tags)

> The large fields are stored in the toast table. You can check if the
> toast table is accessed with a query like this:
>
> select * from pg_stat_all_tables where relid = (select reltoastrelid
> from pg_class where relname = 'table');
>
> Run that before and after your query, and see if the numbers change.

Before:

relid|schemaname|relname|seq_scan|seq_tup_read|idx_scan|idx_tup_fetch|n_tup_ins|n_tup_upd|n_tup_del|n_tup_hot_upd|n_live_tup|n_dead_tup|last_vacuum|last_autovacuum|last_analyze|last_autoanalyze|vacuum_count|autovacuum_count|analyze_count|autoanalyze_count
27290|pg_toast|pg_toast_27283|3|0|2481289|10631453|993194|0|266306|0|147931|2514||2014-04-18 00:49:11.066443+02|||0|11|0|0

After:

27290|pg_toast|pg_toast_27283|3|0|2481347|10632814|993194|0|266306|0|147931|2514||2014-04-18 00:49:11.066443+02|||0|11|0|0

idx_scan has changed from 2481289 to 2481347 (+58)
idx_tup_fetch has changed from 10631453 to 10632814 (+1361)

The number 58 corresponds to the number of rows found by the index, as seen in the EXPLAIN output; I don't know where 1361 comes from.

I'm also surprised by the amount of memory used for sorting (23 kB), since the data actually returned by my query (all the tuples from all 58 rows) amounts to around 2 kB - but this is not an actual problem.
[PERFORM] Query on partitioned table not using index
Hi,

I'm working on a strange behaviour of the planner.

PostgreSQL version: 8.4

Stats / vacuum state: just done; the table is never changed after creation (CREATE TABLE AS ...).

Here's my query:

SELECT cabmnt___rfovsnide::varchar
FROM zcub_258
WHERE cabmnt___rfovsnide > '201301_reel'
ORDER BY cabmnt___rfovsnide
LIMIT 1

Here's the table. It is partitioned by the column cabmnt___rfovsnide, and there are 24 partitions.

CREATE TABLE zcub_258 (
    dwhinvyea character varying(32),
    dwhinvmon text,
    dwhinvmonl character varying(32),
    dwhinvday text,
    mnt_2_rfodst0 character varying,
    mnt_2_rfodst1 character varying,
    mnt_2_rfodst2 character varying,
    mnt_2_rfodst3 character varying,
    mnt_2_rfodst4 character varying,
    nivmnt_2_rfodst integer,
    mnt___rfontr0 character varying,
    mnt___rfontr1 character varying,
    mnt___rfontr2 character varying,
    mnt___rfontr3 character varying,
    mnt___rfontr4 character varying,
    mnt___rfontr5 character varying,
    mnt___rfontr6 character varying,
    mnt___rfontr7 character varying,
    mnt___rfontr8 character varying,
    mnt___rfontr9 character varying,
    nivmnt___rfontr integer,
    cabmnt___rfovsnide character varying(32),  -- partitioning column
    cabmnt___rteprcide character varying(32),
    cabmnt___rtestdide character varying(32),
    key1 integer,
    key2 integer,
    key3 integer,
    q0 numeric,
    nothing integer,
    libmnt_2_rfodst0 character varying(32),
    liblmnt_2_rfodst0 character varying(100),
    libmnt_2_rfodst1 character varying(32),
    liblmnt_2_rfodst1 character varying(100),
    libmnt_2_rfodst2 character varying(32),
    liblmnt_2_rfodst2 character varying(100),
    libmnt_2_rfodst3 character varying(32),
    liblmnt_2_rfodst3 character varying(100),
    libmnt_2_rfodst4 character varying(32),
    liblmnt_2_rfodst4 character varying(100),
    libmnt___rfontr0 character varying(32),
    liblmnt___rfontr0 character varying(100),
    libmnt___rfontr1 character varying(32),
    liblmnt___rfontr1 character varying(100),
    libmnt___rfontr2 character varying(32),
    liblmnt___rfontr2 character varying(100),
    libmnt___rfontr3 character varying(32),
    liblmnt___rfontr3 character varying(100),
    libmnt___rfontr4 character varying(32),
    liblmnt___rfontr4 character varying(100),
    libmnt___rfontr5 character varying(32),
    liblmnt___rfontr5 character varying(100),
    libmnt___rfontr6 character varying(32),
    liblmnt___rfontr6 character varying(100),
    libmnt___rfontr7 character varying(32),
    liblmnt___rfontr7 character varying(100),
    libmnt___rfontr8 character varying(32),
    liblmnt___rfontr8 character varying(100),
    libmnt___rfontr9 character varying(32),
    liblmnt___rfontr9 character varying(100)
)

The plan is:

Limit  (cost=1572842.00..1572842.00 rows=1 width=13)
  ->  Sort  (cost=1572842.00..1619836.83 rows=18797933 width=13)
        Sort Key: public.zcub_143.cabmnt___rfovsnide
        ->  Result  (cost=0.00..1478852.33 rows=18797933 width=13)
              ->  Append  (cost=0.00..1478852.33 rows=18797933 width=13)
                    ->  Seq Scan on zcub_143  (cost=0.00..67.91 rows=3591 width=82)
                    ->  Seq Scan on zcub_143_0 zcub_143  (cost=0.00..21941.36 rows=265936 width=11)
                    ->  Seq Scan on zcub_143_1 zcub_143  (cost=0.00..695.37 rows=8637 width=15)
                    ->  Seq Scan on zcub_143_2 zcub_143  (cost=0.00..36902.82 rows=454482 width=12)
                    ->  Seq Scan on zcub_143_3 zcub_143  (cost=0.00..116775.60 rows=1475460 width=15)
                    ->  Seq Scan on zcub_143_4 zcub_143  (cost=0.00..170064.21 rows=2111521 width=15)
                    ->  Seq Scan on zcub_143_5 zcub_143  (cost=0.00..44583.32 rows=559332 width=12)
                    ->  Seq Scan on zcub_143_6 zcub_143  (cost=0.00..48501.54 rows=608454 width=12)
                    ->  Seq Scan on zcub_143_7 zcub_143  (cost=0.00..53600.30 rows=687630 width=12)
                    ->  Seq Scan on zcub_143_8 zcub_143  (cost=0.00..57048.78 rows=731078 width=12)
                    ->  Seq Scan on zcub_143_9 zcub_143  (cost=0.00..60401.80 rows=773880 width=12)
                    ->  Seq Scan on zcub_143_10 zcub_143  (cost=0.00..64455.42 rows=828942 width=12)
                    ->  Seq Scan on zcub_143_11 zcub_143  (cost=0.00..67903.80 rows=872480 width=12)
                    ->  Seq Scan on zcub_143_12 zcub_143  (cost=0.00..71341.55 rows=915955 width=12)
                    ->  Seq Scan on zcub_143_13 zcub_143  (cost=0.00..74761.82 rows=959182 width=12)
                    ->  Seq Scan on zcub_143_14 zcub_143  (cost=0.00..78838.92 rows=1014292 width=12)
                    ->  Seq Scan on zcub_143_15 zcub_143  (cost=0.00..82330.08 rows=1058208 width=12)
                    ->  Seq Scan on zcub_143_16 zcub_143  (cost=0.00..168486.12 rows=2149712 width=15)
                    ->  Seq Scan on zcub_143_17 zcub_143  (cost=0.00..86700.75 rows=1112575 width=12)
                    ->  Seq Scan on zcub_143_18 zcub_143  (cost=0.00..25063.32 rows=302332 width=14)
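Not from the original post, but worth checking in this situation: on 8.4 the planner cannot push an ORDER BY ... LIMIT down into individual partitions (that needs the Merge Append machinery added in 9.1), so every partition is sequentially scanned and fed into one big sort. What it can do is skip partitions entirely via constraint exclusion, provided each partition carries a CHECK constraint on the partitioning column. A hedged sketch, with made-up partition names and boundary values:

```sql
-- Illustrative only: partition names and range boundaries are assumptions.
ALTER TABLE zcub_258_0
    ADD CONSTRAINT zcub_258_0_rfovsnide_check
    CHECK (cabmnt___rfovsnide >= '201200_reel'
       AND cabmnt___rfovsnide <  '201300_reel');

-- Allow the planner to use those constraints for pruning:
SET constraint_exclusion = partition;

-- An index on the partitioning column inside each partition also helps
-- once the scan is confined to a few partitions:
CREATE INDEX zcub_258_0_rfovsnide_idx ON zcub_258_0 (cabmnt___rfovsnide);
```

With the constraints in place, EXPLAIN should list only the partitions whose CHECK ranges overlap the WHERE condition, instead of all 24.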
Re: [PERFORM] Workaround: Planner preference for tsquery filter vs. GIN index in fast text search
Heikki Linnakangas <hlinnakan...@vmware.com> writes:
> On 04/20/2014 07:46 AM, Oleg Bartunov wrote:
>> btw, 9.4 should be wiser in case of rare+common terms, thanks to GIN
>> fast scan feature.
>
> Indeed, although we didn't actually do anything to the planner to make
> it understand when fast scan helps.

The given query has nothing to do with rare+common terms, since there is only one term in the search --- and what's more, the planner's estimate for that term is spot on already (755 estimated matches vs 752 actual). It looks to me like the complaint is more probably about inappropriate choice of join order; but since we've been allowed to see only some small portion of either the query or the plan, speculating about the root cause is a fool's errand.

			regards, tom lane
Re: [PERFORM] Stalls on PGSemaphoreLock
Hi all - I am a little delayed in reporting back on this issue, but it was indeed the hugepage defrag setting that was the cause of my issue. One item that we noticed as we were testing, and that I wanted to report back to the forum, is that these settings:

cat /sys/kernel/mm/transparent_hugepage/defrag
always [never]
cat /sys/kernel/mm/transparent_hugepage/enabled
always [never]

were not sticky on reboot for my version of CentOS, which probably explains why I thought this was disabled already, only to have it crop back up. Anyway, I wanted to report back these findings to close the loop on this and to thank the community again for their support.

Best,
Matt

From: Pavy Philippe [philippe.p...@worldline.com]
Sent: Tuesday, March 25, 2014 4:10 PM
To: Matthew Spilich; pgsql-performance@postgresql.org
Subject: RE: [PERFORM] Stalls on PGSemaphoreLock

Here, we had transparent hugepage always active:

cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
[always] never

We changed it to:

cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
always [never]

For the semaphores, our initial configuration was:

cat /proc/sys/kernel/sem
250 32000 32 128

And we changed it to:

cat /proc/sys/kernel/sem
5010641280 5010128

-----Original Message-----
From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Matthew Spilich
Sent: Tuesday, March 25, 2014 19:38
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Stalls on PGSemaphoreLock

Thanks all:

Ray: Thanks, we started to look at the hardware/firmware, but didn't get to that level of detail or to running sar. I will probably collect more detail in this area if I continue to see issues.

Pavy - I hope that you are right that the hugepage setting is the issue. I was under the impression that I had it disabled already, because this has been a known issue for us in the past, but it turns out this was not the case for the server in question. I have disabled it at this time, but it will take a few days of running without issue before I am comfortable declaring that this is the solution. Can you elaborate on the change you mention to upgrade the semaphore configuration? I think this is not something I have looked at before.

Ashutosh - Thanks for the reply; I started to do that at first. I turned on log_statement=all for a few hours and generated a few GB of log file, and I didn't want to leave it running in that state for too long because the issue happens every few days and not on any regular schedule, so I reverted that after collecting a few GB of detail in the pg log. What I'm doing now - sampling every few seconds - is, I think, giving me a decent picture of what is going on when the incident occurs, and is a level of data collection that I am more comfortable will not impact operations. I am also logging at the 'mod' level and all statements running longer than 500 ms. I don't see that large write operations are a contributing factor leading up to these incidents. I'm hoping that disabling the hugepage setting will be the solution to this. I'll check back in a day or two with feedback.

Thanks,
Matt

From: Pavy Philippe [philippe.p...@worldline.com]
Sent: Tuesday, March 25, 2014 1:45 PM
To: Ray Stell; Matthew Spilich
Cc: pgsql-performance@postgresql.org
Subject: RE: [PERFORM] Stalls on PGSemaphoreLock

Hello

Recently I had a similar problem. The first symptom was a freeze of connections and 100% system CPU for between 2 and 10 minutes, 1 or 2 times per day. Connections were impossible, queries slow. An strace on one backend showed a very long system call on semop(). We have a node with 48 cores and 128 GB of memory. We disabled hugepages and upgraded the semaphore configuration, and since then we no longer have any freezes on our instance. Can you check the hugepage and semaphore configuration on your node? I am interested in this case, so do not hesitate to get back to me. Thanks.

Excuse me for my bad English!

From: pgsql-performance-ow...@postgresql.org [pgsql-performance-ow...@postgresql.org] on behalf of Ray Stell [ste...@vt.edu]
Sent: Tuesday, March 25, 2014 18:17
To: Matthew Spilich
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Stalls on PGSemaphoreLock

On Mar 25, 2014, at 8:46 AM, Matthew Spilich wrote:
> The symptom: The database machine (running postgres 9.1.9 on CentOS 6.4)
> is running at low utilization most of the time, but once every day or
> two it will appear to slow down to the point where queries back up and
> clients are unable to connect. Once this event occurs, there are lots of
> concurrent queries, I see slow queries appear in the logs, but there
> doesn't appear to be anything abnormal that I have been able to
Re: [PERFORM] Stalls on PGSemaphoreLock
On Tue, Apr 22, 2014 at 12:12 PM, Matthew Spilich <mspil...@tripadvisor.com> wrote:
> Hi all - I am a little delayed in reporting back on this issue, but it
> was indeed the hugepage defrag setting that was the cause of my issue.

The transparent huge pages feature seems so bogus for database workloads that it is one of the first things I disable on new servers (I have tried leaving it enabled a few times, but every time the system behaved better with it disabled).

> One item that we noticed as we were testing this issue that I wanted to
> report back to the forum is that these settings ... were not sticky on
> reboot for my version of CentOS, which probably explains why I thought
> this was disabled already only to have it crop back up. Anyway, I wanted
> to report back these findings to close the loop on this and to thank the
> community again for their support.

Just changing files under /sys/ is not permanent, so I recommend adding these commands to your /etc/rc.local file:

test -f /sys/kernel/mm/transparent_hugepage/enabled && echo never > /sys/kernel/mm/transparent_hugepage/enabled
test -f /sys/kernel/mm/transparent_hugepage/defrag && echo never > /sys/kernel/mm/transparent_hugepage/defrag

The tests are just to make sure the file exists, as its location changes depending on the distro you are using and may also change on kernel upgrades. It is also possible to add transparent_hugepage=never to the grub.conf file, but I personally dislike that option.

Regards,
--
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres
Re: [PERFORM] tsearch2, large data and indexes
On Tue, Apr 22, 2014 at 12:57 AM, Ivan Voras <ivo...@freebsd.org> wrote:
> On 22 April 2014 08:40, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:
>> On 04/20/2014 02:15 AM, Ivan Voras wrote:
>>> More details: after thinking about it some more, it might have
>>> something to do with tsearch2 and indexes: the large data in this case
>>> is a tsvector, indexed with GIN, and the query plan involves a
>>> re-check condition.

I think bitmap scans always insert a recheck, due to the possibility of bitmap overflow. But that doesn't mean that it ever got triggered. In 9.4, EXPLAIN (ANALYZE) will report on the overflows.

> Yes, I've read about tsearch2 and GIN indexes and there shouldn't be a
> recheck condition - but there is. This is the query:
>
> SELECT documents.id, title, raw_data, q, ts_rank(fts_data, q, 4) AS
> rank, html_filename
> FROM documents, to_tsquery('document') AS q
> WHERE fts_data @@ q
> ORDER BY rank DESC LIMIT 25;
>
> And here is the explain analyze: http://explain.depesz.com/s/4xm
>
> It clearly shows a bitmap index scan operation is immediately followed
> by a recheck operation AND that the recheck operation actually does
> something, because it reduces the number of records from 61 to 58 (!!!).

That could be ordinary visibility checking, not qual rechecking.

Cheers,

Jeff
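A hedged illustration of the point above (not from the thread itself): starting with 9.4, EXPLAIN (ANALYZE, BUFFERS) on a bitmap heap scan reports exact vs. lossy heap blocks, which shows whether the bitmap overflowed and the recheck actually had to re-evaluate the qual:

```sql
-- Table and column names follow the earlier messages in this thread.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM documents
WHERE fts_data @@ to_tsquery('document');

-- In the output, look at the Bitmap Heap Scan node for a line like:
--   Heap Blocks: exact=...
-- If a "lossy=N" figure appears, the bitmap exceeded work_mem and the
-- recheck condition really was applied against the original tsvector data.
```

On 9.1-9.3 this extra detail is not printed, so the Recheck Cond line alone does not tell you whether any rechecking happened.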
[PERFORM] Help on migrating data from MSSQL2008R2 to PostgreSQL 9.3
To whom it may concern,

My question and problem are posted on the site below:

http://stackoverflow.com/questions/23147724/postgresql-9-3-on-ubuntu-server-12-04-v-s-ms-sql-server-2008-r2-on-windows-7-ul

Would you please help me solve my problem? Thank you for your help.

Regards,
Alex
[PERFORM] Best practices for update timestamp with/without triggers
Hi,

I am going to add a new modify_date column to a table; it needs to be updated every time the table is updated. Is it better to just update the application code to set modify_date to the current time, or to create a BEFORE UPDATE trigger on the table that sets the modify_date column to current_timestamp when the table is updated?

I also have Slony in place, so the trigger will need to be on both master and slave. Slony will take care of suppressing it on the slave and enabling it in the event of a switchover, but that is additional overhead and validation to make sure nothing failed on switchover.

So, considering that we have Slony, is it better to use application code to update modify_date, or to use a trigger? Is a trigger essentially 2 updates to the table? Are there any other risks in using the trigger?

Thanks,
Riya Verghese
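For reference, a minimal sketch of the trigger approach being asked about - the table name (my_table) and the timestamptz type of modify_date are assumptions for illustration:

```sql
-- Hypothetical table name. The trigger fires BEFORE UPDATE, so it changes
-- the row while it is being written rather than issuing a second UPDATE.
CREATE OR REPLACE FUNCTION set_modify_date() RETURNS trigger AS $$
BEGIN
    NEW.modify_date := current_timestamp;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER my_table_set_modify_date
    BEFORE UPDATE ON my_table
    FOR EACH ROW
    EXECUTE PROCEDURE set_modify_date();
```

Because the value is assigned to NEW before the row version is written, this is a single write, not two updates; the trigger's cost is one extra function call per updated row.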