[GENERAL] Log message last_statrequest ... is later than collector's time - what does it mean?
Hi, I upgraded our test database from PostgreSQL 8.4.8 to 9.0.4 via pg_dumpall. The database seems to work fine, but now the logfile of the new database is flooded with log messages like these: 2011-09-16 13:48:32 CEST: LOG: database system was shut down at 2011-09-16 13:48:29 CEST 2011-09-16 13:48:32 CEST: LOG: autovacuum launcher started 2011-09-16 13:48:32 CEST: LOG: database system is ready to accept connections 2011-09-16 13:48:54 CEST: LOG: last_statrequest 2011-09-16 13:48:55.890743+02 is later than collector's time 2011-09-16 13:48:54.614476+02 2011-09-16 13:48:54 CEST: LOG: last_statrequest 2011-09-16 13:48:55.890743+02 is later than collector's time 2011-09-16 13:48:54.626524+02 2011-09-16 13:48:54 CEST: LOG: last_statrequest 2011-09-16 13:48:55.890743+02 is later than collector's time 2011-09-16 13:48:54.638498+02 2011-09-16 13:48:54 CEST: LOG: last_statrequest 2011-09-16 13:48:55.890743+02 is later than collector's time 2011-09-16 13:48:54.650502+02 2011-09-16 13:48:54 CEST: LOG: last_statrequest 2011-09-16 13:48:55.890743+02 is later than collector's time 2011-09-16 13:48:54.662507+02 2011-09-16 13:48:54 CEST: LOG: last_statrequest 2011-09-16 13:48:55.890743+02 is later than collector's time 2011-09-16 13:48:54.674506+02 2011-09-16 13:48:54 CEST: LOG: last_statrequest 2011-09-16 13:48:55.890743+02 is later than collector's time 2011-09-16 13:48:54.686507+02 2011-09-16 13:48:54 CEST: LOG: last_statrequest 2011-09-16 13:48:55.890743+02 is later than collector's time 2011-09-16 13:48:54.698498+02 2011-09-16 13:48:54 CEST: LOG: last_statrequest 2011-09-16 13:48:55.890743+02 is later than collector's time 2011-09-16 13:48:54.710517+02 2011-09-16 13:48:54 CEST: LOG: last_statrequest 2011-09-16 13:48:55.890743+02 is later than collector's time 2011-09-16 13:48:54.722504+02 2011-09-16 13:48:54 CEST: LOG: last_statrequest 2011-09-16 13:48:55.890743+02 is later than collector's time 2011-09-16 13:48:54.734495+02 2011-09-16 13:48:54 CEST: LOG: last_statrequest 2011-09-16 13:48:55.890743+02 is later than collector's time 2011-09-16 13:48:54.746498+02 2011-09-16 13:48:54 CEST: LOG: last_statrequest 2011-09-16 13:48:55.890743+02 is later than collector's time 2011-09-16 13:48:54.758503+02 2011-09-16 13:48:54 CEST: LOG: last_statrequest 2011-09-16 13:48:55.890743+02 is later than collector's time 2011-09-16 13:48:54.770501+02 2011-09-16 13:48:54 CEST: LOG: last_statrequest 2011-09-16 13:48:55.890743+02 is later than collector's time 2011-09-16 13:48:54.782509+02 2011-09-16 13:48:54 CEST: LOG: last_statrequest 2011-09-16 13:48:55.890743+02 is later than collector's time 2011-09-16 13:48:54.794496+02 2011-09-16 13:49:07 CEST: LOG: last_statrequest 2011-09-16 13:49:08.176573+02 is later than collector's time 2011-09-16 13:49:07.500691+02 2011-09-16 13:49:07 CEST: LOG: last_statrequest 2011-09-16 13:49:08.176573+02 is later than collector's time 2011-09-16 13:49:07.51259+02 2011-09-16 13:49:07 CEST: LOG: last_statrequest 2011-09-16 13:49:08.176573+02 is later than collector's time 2011-09-16 13:49:07.524583+02 2011-09-16 13:49:07 CEST: LOG: last_statrequest 2011-09-16 13:49:08.176573+02 is later than collector's time 2011-09-16 13:49:07.536605+02 2011-09-16 13:49:07 CEST: LOG: last_statrequest 2011-09-16 13:49:08.176573+02 is later than collector's time 2011-09-16 13:49:07.548601+02 2011-09-16 13:49:07 CEST: LOG: last_statrequest 2011-09-16 13:49:08.176573+02 is later than collector's time 2011-09-16 13:49:07.560569+02 2011-09-16 13:49:07 CEST: LOG: last_statrequest 2011-09-16 13:49:08.176573+02 is later than collector's time 2011-09-16 13:49:07.572571+02 2011-09-16 13:49:07 CEST: LOG: last_statrequest 2011-09-16 13:49:08.176573+02 is later than collector's time 2011-09-16 13:49:07.584579+02 Configuration: - SUSE Linux Enterprise Server 10 (i586) - uname -a: Linux 2.6.16.21-0.8-bigsmp #1 SMP Mon Jul 3 18:25:39 UTC 2006 i686 athlon i386 GNU/Linux - PostgreSQL 9.0.4, installed with RPM: postgresql-9.0.4-54.5 What do these messages mean and what can I do to avoid them? Cheers, Stephan
Re: [GENERAL] Log message last_statrequest ... is later than collector's time - what does it mean?
On Fri, Sep 16, 2011 at 19:16, Tom Lane t...@sss.pgh.pa.us wrote: Stephan Vollmer vollmer.step...@googlemail.com writes: I upgraded our test database from PostgreSQL 8.4.8 to 9.0.4 via pg_dumpall. The database seems to work fine, but now the logfile of the new database is flooded with log messages like these: 2011-09-16 13:48:54 CEST: LOG: last_statrequest 2011-09-16 13:48:55.890743+02 is later than collector's time 2011-09-16 13:48:54.614476+02 Wow. AFAIK this is an indication of major system clock problems, as in there's at least one backend process that is seeing gettimeofday() results significantly later than what the stats collector process is seeing. We have seen small processor-to-processor skews before, but you've apparently got skews that are more than a second. It was presumably happening before too, but pre-9.0 the stats collector doesn't bleat about it --- we added that logging to try to diagnose such problems. Now it's relatively harmless so far as this particular issue goes (I think the log bleating is the only real consequence); but it's not hard to envision very serious problems elsewhere, for instance gmake failing to rebuild things because file timestamps are in the future compared to what it thinks the time is. You want to get that fixed. Thank you for this information! I agree that this problem should be fixed, but unfortunately I'm not the server admin and there are other applications running on this machine (even MySQL...). In the meantime, is there a way to filter these log messages other than setting log_min_messages to fatal which is not really desireable? Configuration: - SUSE Linux Enterprise Server 10 (i586) - uname -a: Linux 2.6.16.21-0.8-bigsmp #1 SMP Mon Jul 3 18:25:39 UTC 2006 i686 athlon i386 GNU/Linux I'd file a bug report with SUSE. But probably the first thing they'd say is you should be using a less ancient kernel, so maybe upgrade that first and see if the issue goes away. I'll try to persuade the admin to install at least the latest patch version of this old kernel. Thanks for your help and kind regards, Stephan -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Error: could not read from statistics collector pipe
Hello! In my PostgreSQL logfile, I see that the following entries occur very frequently: 2006-01-27 10:37:29 FATAL could not read from statistics collector pipe: No error 2006-01-27 10:37:30 LOGstatistics collector process (PID 5940) was terminated by signal 1 (PostgreSQL 8.1.2, WinXP SP2, Athlon64 X2 3800, 2 GB RAM) These errors occur every 1-2 minutes when the DB is active. What do these messages mean and how can I stop them from appearing? I also noticed that during inserting lots of records, the DB becomes increasingly slower the more records were inserted. For example, the first 10 records take 15 minutes, but records 30-40 take 3 hours. Could this be related to the messages above? I tried to execute ANALYZE every 10 records, but this did not seem to help. Thanks in advance for your help! - Stephan signature.asc Description: OpenPGP digital signature
Re: [GENERAL] Creation of tsearch2 index is very slow
Tom Lane wrote: The problem seems to be mostly tsearch2's fault rather than the general GIST code. I've applied a partial fix to 8.1 and HEAD branches, which you can find here if you're in a hurry for it: http://archives.postgresql.org/pgsql-committers/2006-01/msg00283.php (the gistidx.c change is all you need for tsearch2) Thanks for all your time and work you and the other guys are spending on this matter! I'll look into the new version, but a.p.o seems to be unreachable at the moment. There is some followup discussion in the pgsql-performance list. It seems possible that we can get another factor of 10 or better with a smarter picksplit algorithm --- but that patch will probably be too large to be considered for back-patching into the stable branches. I've already been following the discussion on pgsql-perform, although I have to admit that don't understand every detail of the tsearch2 implementation. :-) Thus, I'm sorry that I won't be able to help directly on that problem. But it is interesting to read anyway. Best regards, - Stephan signature.asc Description: OpenPGP digital signature
[GENERAL] Creation of tsearch2 index is very slow
Hello! I noticed that the creation of a GIST index for tsearch2 takes very long - about 20 minutes. CPU utilization is 100 %, the resulting index file size is ~25 MB. Is this behaviour normal? Full text columns: title author_list tsearch2 word lists: fti_title fti_author_list tsearch2 indexes:idx_fti_title idx_fti_author_list The table has 700,000 records. When I create a normal B-Tree index on the same column for testing purposes, it works quite fast (approx. 30 seconds). The columns that should be indexed are small, only about 10 words on average. System specs: Athlon64 X2 3800+, 2 GB RAM PostgreSQL 8.1.2, Windows XP SP2 I've never noticed this problem before, so could it probably be related to v8.1.2? Or is something in my configuration or table definition that causes this sluggishness? Thanks very much in advance for your help! - Stephan This is the table definition: - CREATE TABLE publications ( id int4 NOT NULL DEFAULT nextval('publications_id_seq'::regclass), publication_type_id int4 NOT NULL DEFAULT 0, keyword text NOT NULL, mdate date, year date, title text, fti_title tsvector, author_list text, fti_author_list tsvector, overview_timestamp timestamp, overview_xml text, CONSTRAINT publications_pkey PRIMARY KEY (keyword) USING INDEX TABLESPACE dblp_index, CONSTRAINT publications_publication_type_id_fkey FOREIGN KEY (publication_type_id) REFERENCES publication_types (id) MATCH SIMPLE ON UPDATE RESTRICT ON DELETE RESTRICT, CONSTRAINT publications_year_check CHECK (date_part('month'::text, year) = 1::double precision AND date_part('day'::text, year) = 1::double precision) ) WITHOUT OIDS; CREATE INDEX fki_publications_publication_type_id ON publications USING btree (publication_type_id) TABLESPACE dblp_index; CREATE INDEX idx_fti_author_list ON publications USING gist (fti_author_list) TABLESPACE dblp_index; CREATE INDEX idx_fti_title ON publications USING gist (fti_title) TABLESPACE dblp_index; CREATE INDEX idx_publications_year ON publications USING btree (year) TABLESPACE dblp_index; CREATE INDEX idx_publications_year_part ON publications USING btree (date_part('year'::text, year)) TABLESPACE dblp_index; CREATE TRIGGER tsvectorupdate_all BEFORE INSERT OR UPDATE ON publications FOR EACH ROW EXECUTE PROCEDURE multi_tsearch2(); signature.asc Description: OpenPGP digital signature
Re: [GENERAL] Creation of tsearch2 index is very slow
PS: What I forgot to mention was that inserting records into the table is also about 2-3 times slower than before (most likely due to the slow index update operations). I dropped the whole database and restored the dumpfile, but the result it the same. When the index is recreated after COPYing the data, it takes more than 20 minutes for _each_ of both tsearch2 indexes. So the total time to restore this table is more than 45 minutes! - Stephan signature.asc Description: OpenPGP digital signature
Re: [GENERAL] Creation of tsearch2 index is very slow
Tom Lane wrote: Stephan Vollmer [EMAIL PROTECTED] writes: I noticed that the creation of a GIST index for tsearch2 takes very long - about 20 minutes. CPU utilization is 100 %, the resulting index file size is ~25 MB. Is this behaviour normal? This has been complained of before. GIST is always going to be slower at index-building than btree; in the btree case there's a simple optimal strategy for making an index (ie, sort the keys) but for GIST we don't know anything better to do than insert the keys one at a time. Ah, ok. That explains a lot, although I wonder why it is so much slower. However, I'm not sure that anyone's tried to do any performance optimization on the GIST insert code ... there might be some low-hanging fruit there. It'd be interesting to look at a gprof profile of what the backend is doing during the index build. Do you have the ability to do that, or would you be willing to give your data to someone else to investigate with? Unfortunately, I'm not able to investigate it further myself as I'm quite a Postgres newbie. But I could provide someone else with the example table. Maybe someone else could find out why it is so slow. I dropped all unnecessary columns and trimmed the table down to 235,000 rows. The dumped table (compressed with RAR) is 7,1 MB. I don't have a website to upload it but I could send it to someone via e-mail. With this 235,000 row table, index creation times are: - GIST347063 ms - B-Tree2515 ms Thanks for your help! - Stephan signature.asc Description: OpenPGP digital signature
Re: [GENERAL] tsearch2: How to use different configurations for two
Andrew J. Kopciuch wrote: You could write a one trigger for the table to handle both. Something like this : [...] CREATE OR REPLACE FUNCTION multi_tsearch2() RETURNS TRIGGER AS ' [...] I tested the function today and it's exactly what I had in mind. Works perfectly! Thanks, Stephan signature.asc Description: OpenPGP digital signature
[GENERAL] tsearch2: How to use different configurations for two columns?
Hi all, is it possible to index 2 columns in a table with tsearch2 using different configurations for each column index? I have a table publications that is defined as follows (simplified): CREATE TABLE publications ( title text, author_list text, fti_title tsvector, fti_author_list tsvector, ) WITHOUT OIDS; CREATE INDEX idx_fti_author_list ON publications USING gist (fti_author_list); CREATE INDEX idx_fti_title ON publications USING gist (fti_title); CREATE TRIGGER tsvectorupdate_title BEFORE INSERT OR UPDATE ON publications FOR EACH ROW EXECUTE PROCEDURE tsearch2('fti_title', 'title'); CREATE TRIGGER tsvectorupdate_author_list BEFORE INSERT OR UPDATE ON publications FOR EACH ROW EXECUTE PROCEDURE tsearch2('fti_author_list', 'author_list'); The column 'author_list' contains names of authors with many abbreviated first names, e. g. S. Vollmer Michael F. Smith. These abbreviated first names S. and F. shouldn't be indexed. To do this, I created a tsearch2 configuration 'authors' that uses a stopword list with a...z as stopwords. The configuration seems to work fine, but I can't get tsearch2 to use the 'author' config for column 'author_list' and 'default' config for column 'title'. Is there any way to accomplish this? If the solution is not possible or too complicated, as a workaround I could use a function that deletes the abbreviated first names before the column is indexed - similar to the function dropatsymbol() mentioned in the tsearch2 documentation. I tried to use replace(), but I'd need a function with regexps. Thanks in advance, - Stephan signature.asc Description: OpenPGP digital signature
Re: [GENERAL] tsearch2: How to use different configurations for two
[Andrew, sorry for the PM - I pressed the wrong button.] Andrew J. Kopciuch wrote: You could write a one trigger for the table to handle both. Something like this : [...] You can modify the function to be slightly more configurable with parameters taking the column names, and config names and make it a little more reusable. You can accomplish what you want though. I'm a total beginner in regards to pgsql functions, so thanks a lot for your example. When I have some time tomorrow, I'll try it out and also take a look at the PL/pgSQL docs. Thanks for your help, - Stephan signature.asc Description: OpenPGP digital signature
Re: [GENERAL] ILIKE '%term%' and Performance
Hi! CSN wrote: I'm thinking of enabling searches that use queries like select * from items where title ilike '%term%'. The items table has tens of thousands of rows. Is it worth worrying about the performance of such a query (since, if I'm not mistaken, it will never use indices). If it is, what's the best option - use tsearch? How does tsearch (or whatever else) compare performance-wise to not using it, or to typical index-based queries for that matter? This is my first post to this list and I'm quite a PostgreSQL newbie. But I had the same question as you a few days ago so maybe my answer is helpful to you. In my experience, there was a huge difference between using tsearch2 and normal ILIKE. This is the query time on a table with about 236000 publication records when searching for a title: 1. Without tsearch2: SELECT * FROM publications WHERE title ILIKE '%quicksort%'; Total runtime: 1673.439 ms 2. With tsearch2: SELECT * FROM publications WHERE idx_fti @@ to_tsquery('default', 'quicksort'); Total runtime: 11.707 ms So for my setup, tsearch2 was (as I expected) much faster, but of course results will vary between different setups. But maybe it is helpful to you. Greetings, - Stephan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq