[GENERAL] Log message last_statrequest ... is later than collector's time - what does it mean?

2011-09-16 Thread Stephan Vollmer
Hi,

I upgraded our test database from PostgreSQL 8.4.8 to 9.0.4 via pg_dumpall.
The database seems to work fine, but now the logfile of the new database is
flooded with log messages like these:

2011-09-16 13:48:32 CEST: LOG:  database system was shut down at 2011-09-16
13:48:29 CEST
2011-09-16 13:48:32 CEST: LOG:  autovacuum launcher started
2011-09-16 13:48:32 CEST: LOG:  database system is ready to accept
connections
2011-09-16 13:48:54 CEST: LOG:  last_statrequest 2011-09-16
13:48:55.890743+02 is later than collector's time 2011-09-16
13:48:54.614476+02
2011-09-16 13:48:54 CEST: LOG:  last_statrequest 2011-09-16
13:48:55.890743+02 is later than collector's time 2011-09-16
13:48:54.626524+02
2011-09-16 13:48:54 CEST: LOG:  last_statrequest 2011-09-16
13:48:55.890743+02 is later than collector's time 2011-09-16
13:48:54.638498+02
2011-09-16 13:48:54 CEST: LOG:  last_statrequest 2011-09-16
13:48:55.890743+02 is later than collector's time 2011-09-16
13:48:54.650502+02
2011-09-16 13:48:54 CEST: LOG:  last_statrequest 2011-09-16
13:48:55.890743+02 is later than collector's time 2011-09-16
13:48:54.662507+02
2011-09-16 13:48:54 CEST: LOG:  last_statrequest 2011-09-16
13:48:55.890743+02 is later than collector's time 2011-09-16
13:48:54.674506+02
2011-09-16 13:48:54 CEST: LOG:  last_statrequest 2011-09-16
13:48:55.890743+02 is later than collector's time 2011-09-16
13:48:54.686507+02
2011-09-16 13:48:54 CEST: LOG:  last_statrequest 2011-09-16
13:48:55.890743+02 is later than collector's time 2011-09-16
13:48:54.698498+02
2011-09-16 13:48:54 CEST: LOG:  last_statrequest 2011-09-16
13:48:55.890743+02 is later than collector's time 2011-09-16
13:48:54.710517+02
2011-09-16 13:48:54 CEST: LOG:  last_statrequest 2011-09-16
13:48:55.890743+02 is later than collector's time 2011-09-16
13:48:54.722504+02
2011-09-16 13:48:54 CEST: LOG:  last_statrequest 2011-09-16
13:48:55.890743+02 is later than collector's time 2011-09-16
13:48:54.734495+02
2011-09-16 13:48:54 CEST: LOG:  last_statrequest 2011-09-16
13:48:55.890743+02 is later than collector's time 2011-09-16
13:48:54.746498+02
2011-09-16 13:48:54 CEST: LOG:  last_statrequest 2011-09-16
13:48:55.890743+02 is later than collector's time 2011-09-16
13:48:54.758503+02
2011-09-16 13:48:54 CEST: LOG:  last_statrequest 2011-09-16
13:48:55.890743+02 is later than collector's time 2011-09-16
13:48:54.770501+02
2011-09-16 13:48:54 CEST: LOG:  last_statrequest 2011-09-16
13:48:55.890743+02 is later than collector's time 2011-09-16
13:48:54.782509+02
2011-09-16 13:48:54 CEST: LOG:  last_statrequest 2011-09-16
13:48:55.890743+02 is later than collector's time 2011-09-16
13:48:54.794496+02
2011-09-16 13:49:07 CEST: LOG:  last_statrequest 2011-09-16
13:49:08.176573+02 is later than collector's time 2011-09-16
13:49:07.500691+02
2011-09-16 13:49:07 CEST: LOG:  last_statrequest 2011-09-16
13:49:08.176573+02 is later than collector's time 2011-09-16
13:49:07.51259+02
2011-09-16 13:49:07 CEST: LOG:  last_statrequest 2011-09-16
13:49:08.176573+02 is later than collector's time 2011-09-16
13:49:07.524583+02
2011-09-16 13:49:07 CEST: LOG:  last_statrequest 2011-09-16
13:49:08.176573+02 is later than collector's time 2011-09-16
13:49:07.536605+02
2011-09-16 13:49:07 CEST: LOG:  last_statrequest 2011-09-16
13:49:08.176573+02 is later than collector's time 2011-09-16
13:49:07.548601+02
2011-09-16 13:49:07 CEST: LOG:  last_statrequest 2011-09-16
13:49:08.176573+02 is later than collector's time 2011-09-16
13:49:07.560569+02
2011-09-16 13:49:07 CEST: LOG:  last_statrequest 2011-09-16
13:49:08.176573+02 is later than collector's time 2011-09-16
13:49:07.572571+02
2011-09-16 13:49:07 CEST: LOG:  last_statrequest 2011-09-16
13:49:08.176573+02 is later than collector's time 2011-09-16
13:49:07.584579+02

Configuration:
- SUSE Linux Enterprise Server 10 (i586)
- uname -a: Linux 2.6.16.21-0.8-bigsmp #1 SMP Mon Jul 3 18:25:39 UTC 2006
i686 athlon i386 GNU/Linux
- PostgreSQL 9.0.4, installed with RPM: postgresql-9.0.4-54.5

What do these messages mean and what can I do to avoid them?

Cheers,
Stephan


Re: [GENERAL] Log message last_statrequest ... is later than collector's time - what does it mean?

2011-09-16 Thread Stephan Vollmer
On Fri, Sep 16, 2011 at 19:16, Tom Lane t...@sss.pgh.pa.us wrote:

 Stephan Vollmer vollmer.step...@googlemail.com writes:
  I upgraded our test database from PostgreSQL 8.4.8 to 9.0.4 via pg_dumpall.
  The database seems to work fine, but now the logfile of the new database is
  flooded with log messages like these:

  2011-09-16 13:48:54 CEST: LOG:  last_statrequest 2011-09-16
  13:48:55.890743+02 is later than collector's time 2011-09-16
  13:48:54.614476+02

 Wow.  AFAIK this is an indication of major system clock problems,
 as in there's at least one backend process that is seeing gettimeofday()
 results significantly later than what the stats collector process is
 seeing.  We have seen small processor-to-processor skews before, but
 you've apparently got skews that are more than a second.  It was
 presumably happening before too, but pre-9.0 the stats collector doesn't
 bleat about it --- we added that logging to try to diagnose such
 problems.

 Now it's relatively harmless so far as this particular issue goes (I
 think the log bleating is the only real consequence); but it's not hard
 to envision very serious problems elsewhere, for instance gmake failing
 to rebuild things because file timestamps are in the future compared to
 what it thinks the time is.  You want to get that fixed.

Thank you for this information! I agree that this problem should be
fixed, but unfortunately I'm not the server admin and there are other
applications running on this machine (even MySQL...). In the meantime,
is there a way to filter these log messages other than setting
log_min_messages to fatal which is not really desireable?


  Configuration:
  - SUSE Linux Enterprise Server 10 (i586)
  - uname -a: Linux 2.6.16.21-0.8-bigsmp #1 SMP Mon Jul 3 18:25:39 UTC 2006
  i686 athlon i386 GNU/Linux

 I'd file a bug report with SUSE.  But probably the first thing they'd
 say is you should be using a less ancient kernel, so maybe upgrade that
 first and see if the issue goes away.

I'll try to persuade the admin to install at least the latest patch
version of this old kernel.

Thanks for your help and kind regards,
Stephan

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Error: could not read from statistics collector pipe

2006-01-27 Thread Stephan Vollmer
Hello!

In my PostgreSQL logfile, I see that the following entries occur
very frequently:

2006-01-27 10:37:29 FATAL  could not read from statistics collector
   pipe: No error
2006-01-27 10:37:30 LOGstatistics collector process (PID 5940)
   was terminated by signal 1

(PostgreSQL 8.1.2, WinXP SP2, Athlon64 X2 3800, 2 GB RAM)

These errors occur every 1-2 minutes when the DB is active. What do
these messages mean and how can I stop them from appearing?

I also noticed that during inserting lots of records, the DB becomes
increasingly slower the more records were inserted. For example, the
first 10 records take 15 minutes, but records 30-40 take
3 hours. Could this be related to the messages above?

I tried to execute ANALYZE every 10 records, but this did not
seem to help.

Thanks in advance for your help!

- Stephan



signature.asc
Description: OpenPGP digital signature


Re: [GENERAL] Creation of tsearch2 index is very slow

2006-01-21 Thread Stephan Vollmer
Tom Lane wrote:

 The problem seems to be mostly tsearch2's fault rather than the general
 GIST code.  I've applied a partial fix to 8.1 and HEAD branches, which
 you can find here if you're in a hurry for it:
 http://archives.postgresql.org/pgsql-committers/2006-01/msg00283.php
 (the gistidx.c change is all you need for tsearch2)

Thanks for all your time and work you and the other guys are
spending on this matter! I'll look into the new version, but a.p.o
seems to be unreachable at the moment.


 There is some followup discussion in the pgsql-performance list.  It
 seems possible that we can get another factor of 10 or better with a
 smarter picksplit algorithm --- but that patch will probably be too
 large to be considered for back-patching into the stable branches.

I've already been following the discussion on pgsql-perform,
although I have to admit that don't understand every detail of the
tsearch2 implementation. :-)  Thus, I'm sorry that I won't be able
to help directly on that problem. But it is interesting to read anyway.

Best regards,

- Stephan



signature.asc
Description: OpenPGP digital signature


[GENERAL] Creation of tsearch2 index is very slow

2006-01-20 Thread Stephan Vollmer
Hello!

I noticed that the creation of a GIST index for tsearch2 takes very
long - about 20 minutes. CPU utilization is 100 %, the resulting
index file size is ~25 MB. Is this behaviour normal?

Full text columns:   title  author_list
tsearch2 word lists: fti_title  fti_author_list
tsearch2 indexes:idx_fti_title  idx_fti_author_list

The table has 700,000 records. When I create a normal B-Tree index
on the same column for testing purposes, it works quite fast
(approx. 30 seconds).

The columns that should be indexed are small, only about 10 words on
average.

System specs:
Athlon64 X2 3800+, 2 GB RAM
PostgreSQL 8.1.2, Windows XP SP2

I've never noticed this problem before, so could it probably be
related to v8.1.2? Or is something in my configuration or table
definition that causes this sluggishness?

Thanks very much in advance for your help!

- Stephan



This is the table definition:
-
CREATE TABLE publications
(
  id int4 NOT NULL DEFAULT nextval('publications_id_seq'::regclass),
  publication_type_id int4 NOT NULL DEFAULT 0,
  keyword text NOT NULL,
  mdate date,
  year date,
  title text,
  fti_title tsvector,
  author_list text,
  fti_author_list tsvector,
  overview_timestamp timestamp,
  overview_xml text,
  CONSTRAINT publications_pkey PRIMARY KEY (keyword) USING INDEX
  TABLESPACE dblp_index,
  CONSTRAINT publications_publication_type_id_fkey FOREIGN KEY
 (publication_type_id)
  REFERENCES publication_types (id) MATCH SIMPLE
  ON UPDATE RESTRICT ON DELETE RESTRICT,
  CONSTRAINT publications_year_check CHECK (date_part('month'::text,
year) = 1::double precision AND date_part('day'::text, year) =
1::double precision)
)
WITHOUT OIDS;

CREATE INDEX fki_publications_publication_type_id
  ON publications
  USING btree
  (publication_type_id)
  TABLESPACE dblp_index;

CREATE INDEX idx_fti_author_list
  ON publications
  USING gist
  (fti_author_list)
  TABLESPACE dblp_index;

CREATE INDEX idx_fti_title
  ON publications
  USING gist
  (fti_title)
  TABLESPACE dblp_index;

CREATE INDEX idx_publications_year
  ON publications
  USING btree
  (year)
  TABLESPACE dblp_index;

CREATE INDEX idx_publications_year_part
  ON publications
  USING btree
  (date_part('year'::text, year))
  TABLESPACE dblp_index;


CREATE TRIGGER tsvectorupdate_all
  BEFORE INSERT OR UPDATE
  ON publications
  FOR EACH ROW
  EXECUTE PROCEDURE multi_tsearch2();




signature.asc
Description: OpenPGP digital signature


Re: [GENERAL] Creation of tsearch2 index is very slow

2006-01-20 Thread Stephan Vollmer
PS:

What I forgot to mention was that inserting records into the table
is also about 2-3 times slower than before (most likely due to the
slow index update operations).

I dropped the whole database and restored the dumpfile, but the
result it the same. When the index is recreated after COPYing the
data, it takes more than 20 minutes for _each_ of both tsearch2
indexes. So the total time to restore this table is more than 45
minutes!

- Stephan



signature.asc
Description: OpenPGP digital signature


Re: [GENERAL] Creation of tsearch2 index is very slow

2006-01-20 Thread Stephan Vollmer
Tom Lane wrote:
 Stephan Vollmer [EMAIL PROTECTED] writes:
 I noticed that the creation of a GIST index for tsearch2 takes very
 long - about 20 minutes. CPU utilization is 100 %, the resulting
 index file size is ~25 MB. Is this behaviour normal?
 
 This has been complained of before.  GIST is always going to be slower
 at index-building than btree; in the btree case there's a simple optimal
 strategy for making an index (ie, sort the keys) but for GIST we don't
 know anything better to do than insert the keys one at a time.

Ah, ok. That explains a lot, although I wonder why it is so much slower.


 However, I'm not sure that anyone's tried to do any performance
 optimization on the GIST insert code ... there might be some low-hanging
 fruit there.  It'd be interesting to look at a gprof profile of what the
 backend is doing during the index build.  Do you have the ability to do
 that, or would you be willing to give your data to someone else to
 investigate with?

Unfortunately, I'm not able to investigate it further myself as I'm
quite a Postgres newbie. But I could provide someone else with the
example table. Maybe someone else could find out why it is so slow.

I dropped all unnecessary columns and trimmed the table down to
235,000 rows. The dumped table (compressed with RAR) is 7,1 MB. I
don't have a website to upload it but I could send it to someone via
e-mail.

With this 235,000 row table, index creation times are:
- GIST347063 ms
- B-Tree2515 ms


Thanks for your help!

- Stephan




signature.asc
Description: OpenPGP digital signature


Re: [GENERAL] tsearch2: How to use different configurations for two

2005-12-10 Thread Stephan Vollmer
Andrew J. Kopciuch wrote:
You could write a one trigger for the table to handle both.
 
 Something like this :
[...]
 CREATE OR REPLACE FUNCTION multi_tsearch2() RETURNS TRIGGER AS '
[...]
I tested the function today and it's exactly what I had in mind.
Works perfectly!

Thanks, Stephan


signature.asc
Description: OpenPGP digital signature


[GENERAL] tsearch2: How to use different configurations for two columns?

2005-12-09 Thread Stephan Vollmer
Hi all,

is it possible to index 2 columns in a table with tsearch2 using
different configurations for each column index?

I have a table publications that is defined as follows (simplified):

CREATE TABLE publications
(
  title   text,
  author_list text,
  fti_title   tsvector,
  fti_author_list tsvector,
)
WITHOUT OIDS;

CREATE INDEX idx_fti_author_list ON publications
  USING gist (fti_author_list);

CREATE INDEX idx_fti_title ON publications
  USING gist (fti_title);

CREATE TRIGGER tsvectorupdate_title
  BEFORE INSERT OR UPDATE ON publications
  FOR EACH ROW
  EXECUTE PROCEDURE tsearch2('fti_title', 'title');

CREATE TRIGGER tsvectorupdate_author_list
  BEFORE INSERT OR UPDATE ON publications
  FOR EACH ROW
  EXECUTE PROCEDURE tsearch2('fti_author_list', 'author_list');

The column 'author_list' contains names of authors with many
abbreviated first names, e. g. S. Vollmer  Michael F. Smith. These
abbreviated first names S. and F. shouldn't be indexed. To do
this, I created a tsearch2 configuration 'authors' that uses a
stopword list with a...z as stopwords.

The configuration seems to work fine, but I can't get tsearch2 to
use the 'author' config for column 'author_list' and 'default'
config for column 'title'. Is there any way to accomplish this?


If the solution is not possible or too complicated, as a workaround
I could use a function that deletes the abbreviated first names
before the column is indexed - similar to the function
dropatsymbol() mentioned in the tsearch2 documentation. I tried to
use replace(), but I'd need a function with regexps.

Thanks in advance,

- Stephan


signature.asc
Description: OpenPGP digital signature


Re: [GENERAL] tsearch2: How to use different configurations for two

2005-12-09 Thread Stephan Vollmer
[Andrew, sorry for the PM - I pressed the wrong button.]

Andrew J. Kopciuch wrote:
You could write a one trigger for the table to handle both.
 
 Something like this :

[...]

 You can modify the function to be slightly more configurable with parameters 
 taking the column names, and config names and make it a little more reusable. 
  
 You can accomplish what you want though.

I'm a total beginner in regards to pgsql functions, so thanks a lot
for your example. When I have some time tomorrow, I'll try it out
and also take a look at the PL/pgSQL docs.

Thanks for your help,

- Stephan


signature.asc
Description: OpenPGP digital signature


Re: [GENERAL] ILIKE '%term%' and Performance

2005-12-05 Thread Stephan Vollmer
Hi!

CSN wrote:
 I'm thinking of enabling searches that use queries like select *
 from items where title ilike '%term%'. The items table has tens
 of thousands of rows. Is it worth worrying about the performance
 of such a query (since, if I'm not mistaken, it will never use
 indices). If it is, what's the best option - use tsearch? How
 does tsearch (or whatever else) compare performance-wise to not
 using it, or to typical index-based queries for that matter?

This is my first post to this list and I'm quite a PostgreSQL
newbie. But I had the same question as you a few days ago so maybe
my answer is helpful to you.

In my experience, there was a huge difference between using tsearch2
and normal ILIKE. This is the query time on a table with about
236000 publication records when searching for a title:

1. Without tsearch2:
   SELECT * FROM publications WHERE title ILIKE '%quicksort%';
   Total runtime: 1673.439 ms

2. With tsearch2:
   SELECT * FROM publications
   WHERE idx_fti @@ to_tsquery('default', 'quicksort');
   Total runtime: 11.707 ms

So for my setup, tsearch2 was (as I expected) much faster, but of
course results will vary between different setups. But maybe it is
helpful to you.

Greetings,

- Stephan

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq