Re: [PERFORM] Key/Value reference table generation: INSERT/UPDATE performance

2007-05-22 Thread cedric
Le mardi 22 mai 2007, Richard Huxton a écrit :
 valgog wrote:
  I found several post about INSERT/UPDATE performance in this group,
  but actually it was not really what I am searching an answer for...
 
  I have a simple reference table WORD_COUNTS that contains the count of
  words that appear in a word array storage in another table.

 I think this is the root of your problem, I'm afraid. You're trying to
 count individual words when you're storing an array of words. I don't
 think any of the Gist/GIN indexes will help you with this either.

 However, since you don't want to start from here isn't very useful
 here and now:

 1. See what the performance (explain analyse) of the select
 distinct...generate_series() statement is. I think you're right and
 it's going to be slow.
 2. You're looping through each row of word_storage and counting
 separately. Write it as one query if possible.
 3. As Peter says, don't insert then update, start with an empty table
 and just insert totals for the lot (see #2).

 I'd probably write the query in plperl/python or something else that
 supports hash/dictionary structures. Then just process the whole
 word_storage into the hash - assuming you only have a few thousand
 distinct words that shouldn't take up too much memory.
+1
I made something very similar, and using PL/pgsql is very slow, when using 
perl is very quick. 

I have also use partioning because of cost of update (copy last partition to 
the new, adding the new count, so there is only insert, and drop old table if 
you want)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PERFORM] unsubscribe

2007-03-16 Thread cedric


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] compact flash disks?

2007-03-07 Thread cedric
Le mardi 6 mars 2007 23:18, James Mansion a écrit :
 I see that one can now get compact flash to SATA connectors.
I can suggest you to have a look at Gigabyte i-ram .
We use it on a website with higth traffic with lot of succes. Unfortunely, I 
can not provide any benchmark...

 If I were to use a filesystem with noatime etc and little non-sql traffic,
 does the physical update pattern tend to have hot sectors that will tend to
 wear out CF?

 I'm wondering about a RAID5 with data on CF drives and RAID1 for teh WAL on
 a fast SATA or SAS drive pair.  I'm thhinking that this would tend to have
 good performance because the seek time for the data is very low, even if
 the actual write speed can be slower than state of the art.  2GB CF isn't
 so pricey any more.

 Just wondering.

 James

 --
 No virus found in this outgoing message.
 Checked by AVG Free Edition.
 Version: 7.5.446 / Virus Database: 268.18.7/711 - Release Date: 05/03/2007
 09:41


 ---(end of broadcast)---
 TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly

---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] archive wal's failure and load increase.

2006-09-28 Thread Cedric Boudin
Honorable members of the list,


I would like to share with you a side effect that I discovered today on
our postgresql 8.1 server.
We ve been running this instance with PITR for now 2 months without any
problems.
The wal's are copied to a remote machine with the pg_archive_command and
locally to some other directory.
For some independant reasons we made the remote machine unreachable for
some hours. The pg_archive_command returned as expected a failure value.

Now to what puzzles me:
the load on the box that normally is kept between 0.7 and 1.5 did
suddenly rise to 4.5 -5.5 and the processes responsiveness  got bad.
The dir pg_xlog has plenty of space to keep several day of wal's.
there was no unfinished backup's or whatever that could have apparently
slowed the machine that much.

So the question is: is there a correlation between not getting the wal's
archived and this massive load growth?
In my understanding, as the pgl engine has nothing more to do with the
filled up log except just to make sure it's archived correctly ther
should not be any significant load increase for this reason. Looking at
the logs the engine tried approx. every 3 minutes to archive the wal's.
Is this behaviour expected, If it is then is it reasonnable to burden
the engine that is already in a inexpected situation with some IMHO
unecessary load increase.

your thougths are welcome

Cedric



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings