Re: [HACKERS] [PATCHES] Proposed patch for operator lookup caching

2007-11-27 Thread Guillaume Smet
On Nov 27, 2007 6:34 AM, Bruce Momjian [EMAIL PROTECTED] wrote:
 And an additional beta might encourage more testing too.

I'm not that sure of this point. I'm really worried about the lack of
people testing 8.3 at the moment. We have really too little feedback.
Perhaps they didn't meet any problem but even that could be good to
know.

That said, if this patch is applied, another beta is the reasonable
way to go. Not sure it's worth it though.

--
Guillaume

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Replacement Selection

2007-11-27 Thread mac_man2005

Hi to all.

It seems a previous mail of mine with following body hasn't been sent.
Sorry for possibly getting it twice.

Actually I have now modified that body, so it's worth to read it once again.

Thanks for your attention.
Regards.


PREVIOUS MAIL--
Well, the refinements are the followings:

Using 2 heaps instead of just one:
one heap creating a descending run and the
other one creating an ascending run.
Both associated to the same logical run.

Suppose we want the input elements to be finally sorted in an ascending
order. To do this we could QuickSort the first M initialization elements 
into RAM

and then divide it into 2 parts.
Suppose the first heap creates the following run:
10
9
8

And suppose the second heap creates the following run:
3
5
7

Those two runs can be seen as just one by mergesort... since they could be
physically merged into one single run: at first we could write the elements
3,5,7 and then the elements of the other run, red upside down.

Possible advantages:
Having two heaps of that kinds lets RS better adapt to local variations of 
the input trend.

This technique can be called Two Ways Replacement Selection (2WRS) just
because of those 2 heaps.
As an extreme example, we can say that having the input already sort in 
reverse order
no more leads us to the worst case: with 2WRS no matter the input is already 
sort
in ascending/descending order... in this case we'll produce just one run 
instead
of producing the maximum number of runs as in RS worst case (input in 
reverse order).
Moreover it lets us to grow the current run in 2 ways: just imagine we would 
output runs
in a regular file. With 2WRS this could be seen as start outputting elements 
from the middle
of such a regular file, the descending heap outputting elements from the 
middle upwards
while the ascending one outputting from the middle downward. This could 
imply getting
a smaller number of dead records (as I said in previous mails, a dear 
record is an element

that won't form part of the current run) and so having longer runs.

Others optimizations, for example, can be done with the virtual 
concatenation technique:
storing a cache of couples (first_element,last_element) for each created 
run. This
could be useful in case we can find 2 couples (first_element_1, 
last_element_1) and

(first_element_2, last_element_2) with   last_element_1 = first_element_2.
In this case, those runs too can be seen as belonging to the same logical 
run

(actually they are 2 RS different physical runs, or even 4 in 2WRS
but can be seen as just one by mergesort). Of course, once those 2 (or 4) 
runs are
logically merged into that only one, this last one in turn could be merged 
to other runs.


What does all that imply? Mergesort would actually consider a smaller number 
of runs
(since it should just work on logical runs). This means less jumps between 
runs on disk.


Now... to test those refinements I should integrate my code into
PostgreSQL... but it's not that easy for me...

Thanks for your attention.
PREVIOUS MAIL--  



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] maintenance_work_mem memory constraint?

2007-11-27 Thread Bernd Helmle

--On Montag, November 26, 2007 21:41:33 +0100 I wrote:


--On Montag, November 26, 2007 13:02:14 -0500 Tom Lane
[EMAIL PROTECTED] wrote:


Bernd Helmle [EMAIL PROTECTED] writes:

... But isn't it worth to special case the
code in grow_memtuples() (and maybe other places where sort is likely to
use more RAM), so that we can remove this constraint on 64-Bit systems
with  many RAM built in? Or am I missing something very important?.


AFAICS this patch can increase the number of sortable tuples by at most
2X (less one).  That doesn't seem worth getting very worked up about ...

regards, tom lane


That's true.

Well, i haven't meant the diff as a discussable patch at all. It's just
what i've done to understand why we have this limit for tuplesort.
afaics, the main constraint here is MaxAllocSize, and i just wonder if
that doesn't introduce unnecessary limits on systems which can use many
RAM for index creation and wether we can be more generous here. So one
idea could be to allow larger allocation requests during sorting on
systems where we know that this is likely to work.


And, to complete my concerns, if i can afford to give maintenance_work_mem 
10GB and the system just uses 2GB this is somewhere near a bug.


--
 Thanks

   Bernd

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] String encoding during connection handshake

2007-11-27 Thread sulfinu
Hi all.

I have read the documentation, searched the mailing lists and inspected the 
code JDBC driver code. I do need to address this question to actual 
developers.

Simply put, what is the client encoding that the server assumes BEFORE the 
client connection is established, that is, during the authentication phase? I 
know there's a client_encoding setting on the server side that indicates 
the encoding used in the communication stream, but its default value is the  
database's encoding. Which is not known before the user gets authenticated 
and the logical connection is actually made.

I'm asking this so that I can fix the JDBC driver that wrongly assumes that 
user name, password and database name are made up of ASCII characters only. 
This issue has come up before, but no action has been carried out. See 
http://archives.postgresql.org/pgsql-jdbc/2007-10/msg00128.php
I also need a vital information regarding the MD5 hash that is computed is 
some authentication scenarios. This hash is based on char[] (String) values 
that must be converted into byte[] before being handed over to the hash 
algorithm. What is the encoding used by the server to make this conversion 
during the authentication phase (in order to verify the submitted password)?

I also saw that the JDBC driver sends right away this pair to the server: 
{client_encoding, UNICODE}. Does that mean that the client is requesting 
the server to interpret the communication stream as encoded in UTF8?

Thanks.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Replacement Selection

2007-11-27 Thread Simon Riggs
On Tue, 2007-11-27 at 09:25 +0100, [EMAIL PROTECTED] wrote:

 Others optimizations, for example, can be done with the virtual 
 concatenation technique:
 storing a cache of couples (first_element,last_element) for each created 
 run. This
 could be useful in case we can find 2 couples (first_element_1, 
 last_element_1) and
 (first_element_2, last_element_2) with   last_element_1 = first_element_2.
 In this case, those runs too can be seen as belonging to the same logical 
 run
 (actually they are 2 RS different physical runs, or even 4 in 2WRS
 but can be seen as just one by mergesort). Of course, once those 2 (or 4) 
 runs are
 logically merged into that only one, this last one in turn could be merged 
 to other runs.
 
 What does all that imply? Mergesort would actually consider a smaller number 
 of runs
 (since it should just work on logical runs). This means less jumps between 
 runs on disk.

That's actually a refinement of an idea I've been working on for
optimizing sort. I'll post those separately.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


[HACKERS] PostGreSQL and recursive queries...

2007-11-27 Thread Hubert FONGARNAND
_
Ce message et les éventuels documents joints peuvent contenir des informations 
confidentielles.
Au cas où il ne vous serait pas destiné, nous vous remercions de bien vouloir 
le supprimer et en aviser immédiatement l'expéditeur. Toute utilisation de ce 
message non conforme à sa destination, toute diffusion ou publication, totale 
ou partielle et quel qu'en soit le moyen est formellement interdite.
Les communications sur internet n'étant pas sécurisées, l'intégrité de ce 
message n'est pas assurée et la société émettrice ne peut être tenue pour 
responsable de son contenu.
---BeginMessage---
Hi,

We are using the CONNECT BY patch made by Evgen Potemkin on
PostGreSQL 8.2... It works like a charm with very high performances.

But now, we are looking for the 8.3 release... Evgen Potemkin has
stopped to answer about this patch (it's quite normal, he's working at
mysql now...). I've tried to port the patch to the 8.3 postgresql
version... It compiles but it segfault. Many data structures have
changed between 8.3 and 8.2 and i'm not aware enough of postgresql
internals...

So, now the solutions : 

  * using the connectby C function... which is min 10x slower
than the patch (we may improve it a bit, but i doubt it'd beat
the patch...)
  * Waiting for the WITH RECURSIVE support for the 8.4 (but i
don't expect anything, because this is on the todo list since
many years, and i'ven't seen any code/patch since)
  * Someone help me to get the patch working on the 8.3
  * moving to oracle. :- or another hierarchical aware
database.


It's hard to explain to our manager that if we move to the next version
of postgresql there will be a performance drop.

What is the best solution
Please help me!!!


Hubert FONGARNAND
Fiducial
---End Message---

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] PostGreSQL and recursive queries...

2007-11-27 Thread Hubert FONGARNAND
_
Ce message et les éventuels documents joints peuvent contenir des informations 
confidentielles.
Au cas où il ne vous serait pas destiné, nous vous remercions de bien vouloir 
le supprimer et en aviser immédiatement l'expéditeur. Toute utilisation de ce 
message non conforme à sa destination, toute diffusion ou publication, totale 
ou partielle et quel qu'en soit le moyen est formellement interdite.
Les communications sur internet n'étant pas sécurisées, l'intégrité de ce 
message n'est pas assurée et la société émettrice ne peut être tenue pour 
responsable de son contenu.
---BeginMessage---
Hi,

We are using the CONNECT BY patch made by Evgen Potemkin on
PostGreSQL 8.2... It works like a charm with very high performances.

But now, we are looking for the 8.3 release... Evgen Potemkin has
stopped to answer about this patch (it's quite normal, he's working at
mysql now...). I've tried to port the patch to the 8.3 postgresql
version... It compiles but it segfault. Many data structures have
changed between 8.3 and 8.2 and i'm not aware enough of postgresql
internals...

So, now the solutions : 

  * using the connectby C function... which is min 10x slower
than the patch (we may improve it a bit, but i doubt it'd beat
the patch...) 
  * Waiting for the WITH RECURSIVE support for the 8.4 (but i
don't expect anything, because this is on the todo list since
many years, and i'ven't seen any code/patch since) 
  * Someone help me to get the patch working on the 8.3 
  * moving to oracle. :- or another hierarchical aware
database.


It's hard to explain to our manager that if we move to the next version
of postgresql there will be a performance drop.

What is the best solution
Please help me!!!


Hubert FONGARNAND
Fiducial
---End Message---

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] 8.3devel slower than 8.2 under read-only load

2007-11-27 Thread Guillaume Smet
Tom,

On Nov 27, 2007 3:58 AM, Tom Lane [EMAIL PROTECTED] wrote:
 Please try this patch on your real app (not the dumbed-down test case)
 and see what it does for you ...

If I disable the cache in the application, the most visited page
generates 175 SQL queries, mix of simple and more complicated queries
so it's quite real life (it's a page which aggregates quite a lot of
data - available in production here:
http://www.fra.cityvox.fr/guide_lyon/AccueilVille ).

The average of 5 runs for this page gives:
1.108 second for 8.2,
1.054 second for 8.3devel with your cache lookup patch.

Which gives us 8.3devel+cache patch a bit faster (~5%) than 8.2 on a
quite representative set of queries, even with no concurrency at all.
That's very good news.

Without your last patch but with the set of patches you commited these
last days (basically latest CVS tip), 8.3devel is nearly as fast as
8.2 to generate this page but consistently a bit slower.

For historical purposes, here are the results of my simple and dumb
test case (pgbench -c 10 -n 1 with the custom queries I mentioned
earlier):
8.2: 1480 tps
2007-11-01: 1200 tps
2007-11-27: 1420 tps (after your set of patches to simplify the
planner for simple queries)
2007-11-27 + cache lookup patch: 2260 tps

That said, I checked nearly each of the 175 queries and I still find a
few queries a bit slower. I don't know if it's worth digging but as
you already found a couple of problems, I prefer mentioning it. For
example, after simplification of a query with a few joins which is
slower in 8.3devel+patch, I have the following results (consistent
over several runs of the query):
** 8.2 **
= SELECT NP.codejour FROM newsplanning NP WHERE NP.codelang = 'FRA'
AND date_trunc('day',NP.dfin)= date_trunc('day',now()) AND
date_trunc('day',NP.ddeb) = date_trunc('day',now()) AND NP.codeth =
'THE' and NP.niveau = 1;
snip 4 rows
Time: 5.659 ms

= EXPLAIN ANALYZE SELECT NP.codejour FROM newsplanning NP WHERE
NP.codelang = 'FRA' AND date_trunc('day',NP.dfin)=
date_trunc('day',now()) AND date_trunc('day',NP.ddeb) =
date_trunc('day',now()) AND NP.codeth = 'THE' and NP.niveau = 1;

   QUERY PLAN
--
 Index Scan using
idx_newsplanning_numnews_codelang_dfin_ddeb_codeth_niveau on
newsplanning np  (cost=0.00..752.95 rows=50 width=4) (actual
time=4.478..5.029 rows=4 loops=1)
   Index Cond: (((codelang)::text = 'FRA'::text) AND ((codeth)::text =
'THE'::text) AND (niveau = 1))
   Filter: ((date_trunc('day'::text, dfin) = date_trunc('day'::text,
now())) AND (date_trunc('day'::text, ddeb) = date_trunc('day'::text,
now(
 Total runtime: 5.064 ms

** 8.3devel with cache lookup **
=# SELECT NP.codejour FROM newsplanning NP WHERE NP.codelang = 'FRA'
AND date_trunc('day',NP.dfin)= date_trunc('day',now()) AND
date_trunc('day',NP.ddeb) = date_trunc('day',now()) AND NP.codeth =
'THE' and NP.niveau = 1;
snip 4 rows
Time: 5.932 ms

EXPLAIN ANALYZE SELECT NP.codejour FROM newsplanning NP WHERE
NP.codelang = 'FRA' AND date_trunc('day',NP.dfin)=
date_trunc('day',now()) AND date_trunc('day',NP.ddeb) =
date_trunc('day',now()) AND NP.codeth = 'THE' and NP.niveau = 1;

   QUERY PLAN
--
 Index Scan using
idx_newsplanning_numnews_codelang_dfin_ddeb_codeth_niveau on
newsplanning np  (cost=0.00..679.35 rows=46 width=4) (actual
time=4.884..5.467 rows=4 loops=1)
   Index Cond: (((codelang)::text = 'FRA'::text) AND ((codeth)::text =
'THE'::text) AND (niveau = 1))
   Filter: ((date_trunc('day'::text, dfin) = date_trunc('day'::text,
now())) AND (date_trunc('day'::text, ddeb) = date_trunc('day'::text,
now(
 Total runtime: 5.501 ms
(4 rows)

The schema of the table is the following:
= \d newsplanning
   Table public.newsplanning
Column |Type | Modifiers
---+-+
 idplanning| integer | not null
 numnews   | integer | not null
 ddeb  | timestamp without time zone | not null
 dfin  | timestamp without time zone | not null
 codeth| character varying(3)| not null
 niveau| integer | not null
 ordre | integer |
 codelang  | character varying(3)| not null
 ddermodif | timestamp without time zone |
 logindermodif | character varying(20)   |
 codejour  | integer | not null default 1
Indexes:
pk_newsplanning PRIMARY KEY, btree (idplanning)
idx_newsplanning_ddeb_dfin_numnews_niveau_codelang btree (ddeb,
dfin, numnews, niveau, codelang)

Re: [HACKERS] String encoding during connection handshake

2007-11-27 Thread Martijn van Oosterhout
On Tue, Nov 27, 2007 at 02:51:32PM +0200, [EMAIL PROTECTED] wrote:
 Simply put, what is the client encoding that the server assumes BEFORE the 
 client connection is established, that is, during the authentication phase? I 
 know there's a client_encoding setting on the server side that indicates 
 the encoding used in the communication stream, but its default value is the  
 database's encoding. Which is not known before the user gets authenticated 
 and the logical connection is actually made.

I was under the impression that the username/password, had no encoding,
they are Just a Bunch of Bits, i.e. byte[]. Hence it is not relevent
what encoding the database is, it depends what encoding the DB admin
was using when the user was created. That solves your md5 problem.

Looking at it another way, the encoding is part of the password. The
correctly entered password in the wrong encoding is also wrong, because
the matching is done at the byte level.

So I suppose the answer is: whatever encoding you would like it to
be/what the DB admin uses.

This is all AIUI,

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Those who make peaceful revolution impossible will make violent revolution 
 inevitable.
  -- John F Kennedy


signature.asc
Description: Digital signature


Re: [HACKERS] PostGreSQL and recursive queries...

2007-11-27 Thread Andrew Dunstan



Hubert FONGARNAND wrote:

We are using the CONNECT BY patch made by Evgen Potemkin on
PostGreSQL 8.2... It works like a charm with very high performances.

But now, we are looking for the 8.3 release... Evgen Potemkin has
stopped to answer about this patch (it's quite normal, he's working at
mysql now...). I've tried to port the patch to the 8.3 postgresql
version... It compiles but it segfault. Many data structures have
changed between 8.3 and 8.2 and i'm not aware enough of postgresql
internals...

So, now the solutions : 


  * using the connectby C function... which is min 10x slower
than the patch (we may improve it a bit, but i doubt it'd beat
the patch...)
  * Waiting for the WITH RECURSIVE support for the 8.4 (but i
don't expect anything, because this is on the todo list since
many years, and i'ven't seen any code/patch since)
  * Someone help me to get the patch working on the 8.3
  * moving to oracle. :- or another hierarchical aware
database.


It's hard to explain to our manager that if we move to the next version
of postgresql there will be a performance drop.

What is the best solution
Please help me!!!


  


If you are unable to get the patch working with 8.3 yourself, you can 
sponsor or employ someone to do it for you, or to get WITH RECURSIVE 
done for 8.4, which would solve the problem forever, or maybe even both 
would be worth doing.


cheers

andrew



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] PostGreSQL and recursive queries...

2007-11-27 Thread Gregory Stark
Hubert FONGARNAND [EMAIL PROTECTED] writes:

 Ce message et les éventuels documents joints peuvent contenir des
 informations confidentielles. Au cas où il ne vous serait pas destiné, nous
 vous remercions de bien vouloir le supprimer et en aviser immédiatement
 l'expéditeur. Toute utilisation de ce message non conforme à sa destination,
 toute diffusion ou publication, totale ou partielle et quel qu'en soit le
 moyen est formellement interdite. Les communications sur internet n'étant
 pas sécurisées, l'intégrité de ce message n'est pas assurée et la société
 émettrice ne peut être tenue pour responsable de son contenu.

I started working on WITH RECURSIVE a while back and still intend to get back
to it. But there's no guarantee that what I turn up will be to the liking of
everyone else.

I also think the connectby() patch should be possible to port forward but I
haven't looked too much into it. I know it's a big patch, just the sheer
amount of code that has to be gone through carefully to port it forward might
make it kind of hard.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] PG 8.3beta3 Segmentation Fault during Database Restore

2007-11-27 Thread Rudolf van der Leeden

Hi folks,

I've been trying to test a backup/restore of our production database  
(26GB on disk) using PG 8.2.4 as backup and PG 8.3beta3 for the restore.


FIRST TRY:
pg_dump (v8.3beta3)  --format=cthe PG 8.2.4 database   OK
pg_restore  into a brandnew PG 8.3beta3 database    
Segmentation fault after ~10min

From the serverlog:
2007-11-27 11:03:27 CET [7133] LOG:  server process (PID 7337)  
was terminated by signal 11: Segmentation fault
2007-11-27 11:03:27 CET [7235] CONTEXT:  COPY login_session,  
line 9210986


SECOND TRY:
Increased the loglevel to DEBUG1
pg_dump (v8.2.4)  --format=pthe PG 8.2.4 database into an  
ASCII file (31 GB)   OK
psql-restore into a brandnew PG 8.3beta3 database    
Segmentation fault after ~2hours

From the serverlog:
2007-11-27 15:56:38 CET [15833] STATEMENT:  CREATE INDEX  
login_session_promotion_id ON login_session USING btree (promotion_id);

2007-11-27 15:56:38 CET [15833] ERROR:  concurrent insert in progress
2007-11-27 15:56:38 CET [15833] STATEMENT:  CREATE INDEX  
login_session_web_site_id ON login_session USING btree (web_site_id);
2007-11-27 15:56:50 CET [21670] DEBUG:  autovacuum: processing  
database gaia
2007-11-27 15:57:58 CET [15726] LOG:  server process (PID 15833) was  
terminated by signal 11: Segmentation fault
2007-11-27 15:57:58 CET [15726] LOG:  terminating any other active  
server processes


What could be the cause of this problem? Is it a bug or my fault?
The postgres.crash.log is enclosed.

Thanks,
Rudolf VanderLeeden
Logicunited GmbH
Germany
[EMAIL PROTECTED]





postgres.crash.log
Description: Binary data






---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] PostGreSQL and recursive queries...

2007-11-27 Thread Hubert FONGARNAND
Le mardi 27 novembre 2007 à 10:00 -0500, Andrew Dunstan a écrit :

 
 Hubert FONGARNAND wrote:
  We are using the CONNECT BY patch made by Evgen Potemkin on
  PostGreSQL 8.2... It works like a charm with very high performances.
 
  But now, we are looking for the 8.3 release... Evgen Potemkin has
  stopped to answer about this patch (it's quite normal, he's working at
  mysql now...). I've tried to port the patch to the 8.3 postgresql
  version... It compiles but it segfault. Many data structures have
  changed between 8.3 and 8.2 and i'm not aware enough of postgresql
  internals...
 
  So, now the solutions : 
 
* using the connectby C function... which is min 10x slower
  than the patch (we may improve it a bit, but i doubt it'd beat
  the patch...)
* Waiting for the WITH RECURSIVE support for the 8.4 (but i
  don't expect anything, because this is on the todo list since
  many years, and i'ven't seen any code/patch since)
* Someone help me to get the patch working on the 8.3
* moving to oracle. :- or another hierarchical aware
  database.
 
 
  It's hard to explain to our manager that if we move to the next version
  of postgresql there will be a performance drop.
 
  What is the best solution
  Please help me!!!
 
 

 
 If you are unable to get the patch working with 8.3 yourself, you can 
 sponsor or employ someone to do it for you,


I think such people is hard to find... but if someone is interested, he
can contact me!


  or to get WITH RECURSIVE 
 done for 8.4, which would solve the problem forever, or maybe even both 
 would be worth doing.
 cheers
 
 andrew
 
 

_
Ce message et les éventuels documents joints peuvent contenir des informations 
confidentielles.
Au cas où il ne vous serait pas destiné, nous vous remercions de bien vouloir 
le supprimer et en aviser immédiatement l'expéditeur. Toute utilisation de ce 
message non conforme à sa destination, toute diffusion ou publication, totale 
ou partielle et quel qu'en soit le moyen est formellement interdite.
Les communications sur internet n'étant pas sécurisées, l'intégrité de ce 
message n'est pas assurée et la société émettrice ne peut être tenue pour 
responsable de son contenu.



Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]

2007-11-27 Thread Tom Lane
Brendan Jurd [EMAIL PROTECTED] writes:
 So far I've only considered the '::' cast syntax suggested in the
 original proposal, e.g.:

 ARRAY[]::text[]

 I wonder whether we are also interested in catching CAST(), e.g.:

 CAST(ARRAY[] AS text[])

I think you'll find that it's just about impossible to not handle both,
because they look the same after the grammar gets done.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] PG 8.3beta3 Segmentation Fault during Database Restore

2007-11-27 Thread Tom Lane
Rudolf van der Leeden [EMAIL PROTECTED] writes:
 What could be the cause of this problem? Is it a bug or my fault?

It looks like a corrupted-data kind of problem.  Can you extract
a reproducible test case?

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] String encoding during connection handshake

2007-11-27 Thread sulfinu
On Tuesday 27 November 2007, Martijn van Oosterhout wrote:
 I was under the impression that the username/password, had no encoding,
 they are Just a Bunch of Bits, i.e. byte[]. 
I cannot agree to that, simply because Postgres supports (or at least claims 
to) multi-byte characters. And user names, passwords and database names are 
character strings.

 Looking at it another way, the encoding is part of the password. The
 correctly entered password in the wrong encoding is also wrong, because
 the matching is done at the byte level.
I'm afraid that is true to some extent, that's why I'm asking in the first 
place. A user should be able to authenticate as long as he/she is able to 
write the password, regardless of the OS's locale setting.

 This is all AIUI,
Thanks fot the input, I'm waiting for others, too. Or point me to the relevant 
source files.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]

2007-11-27 Thread Brendan Jurd
So far I've only considered the '::' cast syntax suggested in the
original proposal, e.g.:

ARRAY[]::text[]

I wonder whether we are also interested in catching CAST(), e.g.:

CAST(ARRAY[] AS text[])

I'm personally okay with leaving it at support for '::', but
admittedly I am heavily biased towards this syntax (I find CAST very
ugly).  I suppose supporting CAST as well would be the more
predictable behaviour; I think people might be surprised if we
supported one form of casting but not the other.

Comments?

Regards,
BJ

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] PG 8.3beta3 Segmentation Fault during Database Restore

2007-11-27 Thread Andrew Dunstan





Rudolf van der Leeden wrote:

Hi folks,

I've been trying to test a backup/restore of our production database 
(26GB on disk) using PG 8.2.4 as backup and PG 8.3beta3 for the restore.


FIRST TRY:
pg_dump (v8.3beta3)  --format=cthe PG 8.2.4 database   OK
pg_restore  into a brandnew PG 8.3beta3 database   
Segmentation fault after ~10min

From the serverlog:
2007-11-27 11:03:27 CET [7133] LOG:  server process (PID 7337) was 
terminated by signal 11: Segmentation fault
2007-11-27 11:03:27 CET [7235] CONTEXT:  COPY login_session, line 
9210986


SECOND TRY:
Increased the loglevel to DEBUG1
pg_dump (v8.2.4)  --format=pthe PG 8.2.4 database into an 
ASCII file (31 GB)   OK
psql-restore into a brandnew PG 8.3beta3 database   
Segmentation fault after ~2hours

From the serverlog:
2007-11-27 15:56:38 CET [15833] STATEMENT:  CREATE INDEX 
login_session_promotion_id ON login_session USING btree (promotion_id);

2007-11-27 15:56:38 CET [15833] ERROR:  concurrent insert in progress
2007-11-27 15:56:38 CET [15833] STATEMENT:  CREATE INDEX 
login_session_web_site_id ON login_session USING btree (web_site_id);
2007-11-27 15:56:50 CET [21670] DEBUG:  autovacuum: processing 
database gaia
2007-11-27 15:57:58 CET [15726] LOG:  server process (PID 15833) was 
terminated by signal 11: Segmentation fault
2007-11-27 15:57:58 CET [15726] LOG:  terminating any other active 
server processes


What could be the cause of this problem? Is it a bug or my fault?
The postgres.crash.log is enclosed.




The general rule is: use pg_dump from the target version. So your first 
attempt was more correct.


Just curious: what happens if you turn autovacuum off before starting 
the restore?


cheers

andrew

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]

2007-11-27 Thread Brendan Jurd
On Nov 28, 2007 2:56 AM, Tom Lane [EMAIL PROTECTED] wrote:
  I wonder whether we are also interested in catching CAST(), e.g.:

  CAST(ARRAY[] AS text[])

 I think you'll find that it's just about impossible to not handle both,
 because they look the same after the grammar gets done.

Thanks Tom ... your comment makes me suspect I've been barking up the
wrong tree.

My original intent was to modify the grammar rules to catch an array
expression followed by a typecast, and put the target typename of the
cast directly into the A_ArrayExpr struct.  That notion came from
looking at the way that TypeName gets put into A_Const --
makeStringConst() takes an optional TypeName argument.

Looking at the code in the context of your comment, that was probably
a bad approach.  I may've taken the A_Const analogy too far.

Now I'm thinking I leave the grammar rules alone (apart from making it
legal to specify an empty list of elements), and instead push the
typename down into the child node from makeTypeCast(), if the child is
an A_ArrayExpr.  Does that work better?

Regards,
BJ

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Replacement Selection

2007-11-27 Thread mac_man2005
Any comment about Two Ways Replacement Selection (two heaps instead of just 
one) ?



--
From: Simon Riggs [EMAIL PROTECTED]
Sent: Tuesday, November 27, 2007 1:03 PM
To: [EMAIL PROTECTED]
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Replacement Selection


On Tue, 2007-11-27 at 09:25 +0100, [EMAIL PROTECTED] wrote:


Others optimizations, for example, can be done with the virtual
concatenation technique:
storing a cache of couples (first_element,last_element) for each created
run. This
could be useful in case we can find 2 couples (first_element_1,
last_element_1) and
(first_element_2, last_element_2) with   last_element_1 = 
first_element_2.
In this case, those runs too can be seen as belonging to the same 
logical

run
(actually they are 2 RS different physical runs, or even 4 in 2WRS
but can be seen as just one by mergesort). Of course, once those 2 (or 4)
runs are
logically merged into that only one, this last one in turn could be 
merged

to other runs.

What does all that imply? Mergesort would actually consider a smaller 
number

of runs
(since it should just work on logical runs). This means less jumps 
between

runs on disk.


That's actually a refinement of an idea I've been working on for
optimizing sort. I'll post those separately.

--
 Simon Riggs
 2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] Help with release note items

2007-11-27 Thread Bruce Momjian
I need help understanding the following two release note items (see XXX):

 listitem
  para
   Create a general mechanism that supports casts to and from the
   standard string types (typeTEXT/type, typeVARCHAR/type,
   typeCHAR/type) for emphasisevery/emphasis datatype, by
   invoking the datatype's I/O functions (Tom)  XXX examples?
  /para

  para
   These new casts are assignment-only in the to-string direction,
   explicit-only in the other direction, and therefore should create no
   surprising behavior.  Various
   datatype-specific casting functions that were equivalent to this
   behavior were removed.
  /para
 /listitem

 listitem
  para
   Allow limited hashing when using two different data types (Tom)
  /para

  para
   This allows hash joins, hash indexes, hashed subplans, and hash
   aggregation to be used in situations involving cross-data-type
   comparisons, if the data types have compatible hash functions.
   Currently, cross-data-type hashing support exists for
   typesmallint/type/typeinteger/type/typebigint/type,
   and for typefloat4/type/typefloat8/type.  XXX how do you do
   hash aggregate with two different data types
  /para
 /listitem


-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]

2007-11-27 Thread Tom Lane
Brendan Jurd [EMAIL PROTECTED] writes:
 Now I'm thinking I leave the grammar rules alone (apart from making it
 legal to specify an empty list of elements), and instead push the
 typename down into the child node from makeTypeCast(), if the child is
 an A_ArrayExpr.  Does that work better?

Actually, if you do that you might as well forego the separate node type
(which requires a nontrivial amount of infrastructure).  I think it
would work just about as well to have transformExpr check whether the
argument of a TypeCast is an ArrayExpr, and if so call
transformArrayExpr directly from there, passing the TypeName as an
additional argument.  Kinda ugly, but not really any worse than the way
A_Const is handled in that same routine.  (In fact, we could use the
same technique to get rid of the typename field in A_Const ... might
be worth doing?)

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] Quality and Performance

2007-11-27 Thread Simon Riggs

Every release we seem to have the same debates about performance issues.

In 8.0 we shipped knowing that bgwriter had serious deficiencies, plus
had no way of logging SQL statements for performance tuning. In 8.2 we
even ended up tweaking the planner *after* release. 

What I don't understand is all the words about quality, yet we don't
seem to include performance as part of that. Performance always seems to
be a feature that can be left until the next release and it's never
the right time to fix it.

I would hope to persuade all that Performance is an integral part of
Quality, not a hindrance to it.

I've never worked on a software project where either the Users or the
Sponsors said don't worry about performance, it can wait, but I really
love the way you coded that. Quality is very, very high with Postgres,
but we also need to include performance as one of the Top Level concerns
*and* do that without dropping the ball on other concerns. That clearly
takes time and effort to balance those concerns.

We obviously need a performance build farm and I think everyone accepts
that. We just need to do it, so that's a given and is something I hope
to be involved in.

What I would really like to persuade everybody is that performance needs
specific attention. Once we've finished integrating the code, we're in
Beta and changes seem to be more difficult then. We must give time and
attention both to measuring performance and to fixing the things we
find. Sure we've done a lot of that, and I've been very happy with that,
but recent events make me think we have lapsed back into thinking that
performance is a threat to quality. I'd love to hear people say loud and
clear that performance matters and we can't ship when we know about
fixable performance holes.

Please can we clear some space in the next release schedule for
performance, plus give some credence to the thought that performance
issues rate our attention just as much as other kinds of bugs?

Maybe we should give each Beta a name, such as Initial Beta,
Performance Beta, Usability Beta as a way of encouraging folk to
focus onto particular aspects of quality at what we consider to be
appropriate times to do so. Not sure whether thats a good idea, but I'd
love to hear about ways to include performance as one of the essential
behaviours of PostgreSQL.

Your thoughts are welcome,

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] String encoding during connection handshake

2007-11-27 Thread Usama Munir
Martin is actually right. No assumption is made about the encoding of the 
password. The password is recieved as a set of bytes over the wire-level 
protocol and then processed accordingly as per your pg_hba settings. please 
refer to auth.c method recv_password_packet(Port *port). The comment on the 
last line of the method might be of your intrest, and i quote
 
Return the received string, Note we do not attempt to do any character set 
conversion on it; since we don't know the client's encoding, there woudn't be 
much point
 
/ Usama



From: [EMAIL PROTECTED] on behalf of [EMAIL PROTECTED]
Sent: Tue 11/27/2007 8:55 PM
To: Martijn van Oosterhout
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] String encoding during connection handshake



On Tuesday 27 November 2007, Martijn van Oosterhout wrote:
 I was under the impression that the username/password, had no encoding,
 they are Just a Bunch of Bits, i.e. byte[].
I cannot agree to that, simply because Postgres supports (or at least claims
to) multi-byte characters. And user names, passwords and database names are
character strings.

 Looking at it another way, the encoding is part of the password. The
 correctly entered password in the wrong encoding is also wrong, because
 the matching is done at the byte level.
I'm afraid that is true to some extent, that's why I'm asking in the first
place. A user should be able to authenticate as long as he/she is able to
write the password, regardless of the OS's locale setting.

 This is all AIUI,
Thanks fot the input, I'm waiting for others, too. Or point me to the relevant
source files.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq




[HACKERS] Sorting Improvements for 8.4

2007-11-27 Thread Simon Riggs
Just wanted to review a few thoughts and ideas around improving external
sorts, as recently encouraged to do by Jim Nasby. 

Current issues/opportunities are these:

ISSUES

a) Memory is always in short supply, so using what we have more
effectively is going to be welcome.

b) Heap sort has a reasonably strong anti-memory effect, meaning that
there is an optimum amount of memory for any sort. This shows itself
with the CPU time increasing during run forming, making this stage of
the sort CPU bound.

c) Many sorts are performed prior to aggregation. It might be possible
to aggregate prior to writing to disk, as a way of reducing the overall
I/O cost. Benefit would occur when the total CPU cost was same no matter
when aggregation occurred; that would not apply in all cases, so we
would need to sense when benefit was possible.

d) Generally reducing the I/O cost of sorting may help the merging
stages of a sort.


SOLUTIONS

The ideas that Greg Stark, Jim Nasby, Heikki and myself have discussed
to date were the following:

1. Sort I/O Compression
2. Aggregation during Sort
3. Memory Pools
4. Dynamic Heap Management
5. Dynamic Run Handling

I've added (5) to the list as well, which hasn't yet been discussed.

1. SORT I/O COMPRESSION

This idea is not dead yet, it just needs a full set of tests to confirm
that there is benefit in all cases. If there's not benefit in all cases,
we may be able to work out which cases those are, so we know when to use
it.


2. AGGREGATION DURING SORT

Many sorts are preliminary steps before aggregation. Aggregation during
run forming would potentially reduce size of heap and reduce number of
comparisons. For many types of aggregate this would not theoretically
increase the number of ops since sum(), avg(), min(), max() are all
commutative according to their inputs. We would probably need to add
another option to Aggregate Functions to indicate the possibility of
calculating the aggregate in this way, since some aggregates might rely
on the current situation that they expect all their inputs at once in
sorted order. (Windowed aggregates are unlikely to be this way).


3. MEMORY POOLS

Solving a) could be done by sensible management and allocation of
resources. Discussed before, so not rehashed here.


4. DYNAMIC HEAP MANAGEMENT

The size of the active heap required to produce the fewest number of
runs varies as the sort progresses. For example, sorting an already
sorted input needs a trivial heap size. 

Larger heap sizes simply avoid forming more runs, which is not
necessarily a bad thing. More runs only become bad things when we go
beyond our ability to perform a single final merge (see Dynamic Run
Handling below).

Smaller heap sizes reduce the number of comparisons required, plus
increase the L2+ cache efficiencies. Those two things are the cause of
the anti-memory effect.

Because of b), optimising the size of the heap could potentially be a
good thing. This can make a considerable difference for nearly sorted
data (measurements required...).

When we have M amount of memory available to us, we don't start by using
it all. We start with m memory and only increase up to M if required.
Runs are built with memory set at m. If a tuple arrives that would force
the formation of a new run we assess

i) do we care if another run is formed? Use our knowledge of the likely
amount of data coming our way, compared with number of runs formed so
far and see if we really care. If we don't care, allow the new run to be
formed and carry on with just heap size of m. (see Dynamic Run Handling
later).

ii) if we do care about number of runs, then allow the heap to grow by
increments up to the full size of M. Increments would be at least x2 and
possibly x4. That way we always have work space to rearrange the heap.

All of this dances too cleverly around the exact technique and potential
costs of rearranging the heap. That is not to be ignored and is the next
task in evaluating and accepting/dismissing this potential technique.

In combination with memory pooling this technique might also allow
memory to be better distributed to other users.


5. DYNAMIC RUN HANDLING (in Final Merge)

Another way of addressing a) is to simply make better use of memory
itself. Let's look at that in more detail:

Number of runs that can be merged at once is currently fixed, based upon
available memory. This has the underlying assumption that all runs will
be concurrently active during final merging, which may not always be
true.

If we have random data then almost all runs will overlap with all other
runs, i.e. the min and max values are sufficiently wide that the runs do
all overlap. In many cases, data arrives in somewhat sorted order, e.g.
financial data is fairly regular with some late payers but not many, and
those trail off with a fairly tight decay. In the somewhat sorted case
we find that the actual overlap is less than total, so there are many
later runs that don't overlap the earlier ones. In the best case 

Re: [HACKERS] psql -f doesn't complain about directories

2007-11-27 Thread Peter Eisentraut
Am Donnerstag, 15. November 2007 schrieb Tom Lane:
 Peter Eisentraut [EMAIL PROTECTED] writes:
  Am Donnerstag, 15. November 2007 schrieb Tom Lane:
  This seems too far removed from the scene of the crime
 
  Yeah, my zeroth attempt was to place this in gets_fromFile(), but there
  you don't have any opportunity to report failure to the main loop.  We'd
  need to change the function signature to be able to pass that around. 
  Maybe that's better overall.

 Well, you could still handle that the same as in your patch: on NULL
 return, check ferror.  It's just that I don't trust errno to stay
 unchanged for very long.

This should do better:

diff -ur ../cvs-pgsql/src/bin/psql/input.c ./src/bin/psql/input.c
--- ../cvs-pgsql/src/bin/psql/input.c   2007-01-12 10:22:42.0 +0100
+++ ./src/bin/psql/input.c  2007-11-27 18:46:34.0 +0100
@@ -179,9 +179,16 @@
/* Disable SIGINT again */
sigint_interrupt_enabled = false;

-   /* EOF? */
+   /* EOF or error? */
if (result == NULL)
+   {
+   if (ferror(source))
+   {
+   psql_error(could not read from input file: 
%s\n, strerror(errno));
+   return NULL;
+   }
break;
+   }

appendPQExpBufferStr(buffer, line);

diff -ur ../cvs-pgsql/src/bin/psql/mainloop.c ./src/bin/psql/mainloop.c
--- ../cvs-pgsql/src/bin/psql/mainloop.c2007-01-12 10:22:42.0 
+0100
+++ ./src/bin/psql/mainloop.c   2007-11-27 18:30:13.0 +0100
@@ -129,7 +129,11 @@
line = gets_interactive(get_prompt(prompt_status));
}
else
+   {
line = gets_fromFile(source);
+   if (!line  ferror(source))
+   successResult = EXIT_FAILURE;
+   }

/*
 * query_buf holds query already accumulated.  line is the 
malloc'd

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Andrew Sullivan
On Tue, Nov 27, 2007 at 05:32:49PM +, Simon Riggs wrote:
 What I would really like to persuade everybody is that performance needs
 specific attention. 

[. . .]

 Your thoughts are welcome,

Well, one thing that might help is something of the specifics you mention.

I remember mentioning to Jan not long after he started at Afilias that we
occasionally saw strange behaviour that looked like lock up.  He was
slightly incredulous, and I didn't have time to build a repeatable test
case.  So it was in the context of testing Slony that he discovered the dual
pains of buffer shuffling and checkpoint storms; this is part of what led
him to work on those problems in 8.0.

The key was to state, at the outset, Here is the problem I want to fix.
By stating precisely and specifically what is to be fixed, the issue moves
from performance needs to a feature that can be implemented.

Perhaps now is the time to list some specific performance areas you want to
fix up?

A

-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Tue, 27 Nov 2007 17:32:49 +
Simon Riggs [EMAIL PROTECTED] wrote:


 Maybe we should give each Beta a name, such as Initial Beta,
 Performance Beta, Usability Beta as a way of encouraging folk to
 focus onto particular aspects of quality at what we consider to be
 appropriate times to do so. Not sure whether thats a good idea, but
 I'd love to hear about ways to include performance as one of the
 essential behaviours of PostgreSQL.
 
 Your thoughts are welcome,

Well I think that we do take performance into account. I agree
that we should *never* have a regression in performance from release
to release, which is what I believe has inspired this thread.

However if you look at a lot of the items that have gone into this
release they were all about performance:

HOT -- has a direct correlation to performance because without it
we vacuum more. Not only more, but in a lot of cases it resolves a
serious PostgreSQL limitation for high velocity tables.

Phantom Xid - less vacuuming

The sequential scan thing Jeff Davis did

The reduced size of the tuple headers

Sincerely,

Joshua D. Drake
  


- -- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHTF0XATb/zqfZUUQRAul2AKCfKToFWWkdNWYZTCLmjJJeDQpysQCfTKnI
kURZp3SdAoyRLScxG5PizDo=
=PTcF
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Tue, 27 Nov 2007 18:18:52 +
Simon Riggs [EMAIL PROTECTED] wrote:

 On Tue, 2007-11-27 at 10:08 -0800, Joshua D. Drake wrote:
  Simon Riggs [EMAIL PROTECTED] wrote:

 Agreed. I either initiated or assisted with most of those items; but
 that's not really my point, however because those were planned
 performance features.
 
 My thinking was about how we handle the last minute attention to
 detail that ensures we have performance everywhere, not just on the
 main features.

Well that is certainly a valid point. Unfortunately that is also a lot
of work :). We would have to have some benchmark that actually
attributes itself to common real world load (even the very simple real
world load like was just fixed).

I would also be happy to put some resources behind this.

Sincerely,

Joshua D. Drake 


- -- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHTGEoATb/zqfZUUQRAgk8AKCLNFPpkEqKF44PyXI9D7k5ynZDSgCfbNNm
5JcjMi1+ZxWX4CzQn4y0Bdk=
=hvTR
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Andrew Dunstan



Simon Riggs wrote:


We obviously need a performance build farm and I think everyone accepts
that. We just need to do it, so that's a given and is something I hope
to be involved in.


  


It's on my list ... Had I but world enough and time ...

Performance testing can be bolted onto the exiting buildfarm as an 
option. However, performance test machines have some requirements that 
pure functional/build test machines don't have: especially stability. A 
standard buildfarm client can be put on almost any machine and run 
happily. My main workstation runs four buildfarm members including three 
in a VM, and I never notice any impact.  But a performance test machine 
probably needs to be dedicated to just that function. And at least some 
members of the performance test machines would need to be higher end 
machines. The number of people who can afford such resources is much 
lower than those who can run a relatively low impact simple buildfarm 
member.


Maybe we also need to talk about running clients elsewhere for 
performance testing too.


We also need to talk about what would be a good set of tests to run.

One useful thing this would buy us is a time series of test results so 
we could easily see sudden degradations in performance. It must have 
been annoying trying to triangulate performance dropoff recently.


cheers

andrew


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Alvaro Herrera
Joshua D. Drake wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 On Tue, 27 Nov 2007 17:32:49 +
 Simon Riggs [EMAIL PROTECTED] wrote:
 
 
  Maybe we should give each Beta a name, such as Initial Beta,
  Performance Beta, Usability Beta as a way of encouraging folk to
  focus onto particular aspects of quality at what we consider to be
  appropriate times to do so. Not sure whether thats a good idea, but
  I'd love to hear about ways to include performance as one of the
  essential behaviours of PostgreSQL.
 
 Well I think that we do take performance into account. I agree
 that we should *never* have a regression in performance from release
 to release, which is what I believe has inspired this thread.

Hmm.  I have developed several features that have driven performance
down.  Autovacuum enabled by default for one.  IIRC the SELECT FOR SHARE
stuff (tuple-level share locks) also hurt performance.  Savepoints
required enlarging tuple headers, which also hurt performance.

In all cases we have gotten some other benefit, be it reduced
administrative pain, or reduced lock contention, or a new feature.

(In fact I think most performance drops have been my fault.)

-- 
Alvaro Herrerahttp://www.advogato.org/person/alvherre
Some men are heterosexual, and some are bisexual, and some
men don't think about sex at all... they become lawyers (Woody Allen)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Simon Riggs
On Tue, 2007-11-27 at 10:08 -0800, Joshua D. Drake wrote:
 Simon Riggs [EMAIL PROTECTED] wrote:

  Maybe we should give each Beta a name, such as Initial Beta,
  Performance Beta, Usability Beta as a way of encouraging folk to
  focus onto particular aspects of quality at what we consider to be
  appropriate times to do so. Not sure whether thats a good idea, but
  I'd love to hear about ways to include performance as one of the
  essential behaviours of PostgreSQL.
  
  Your thoughts are welcome,
 
 Well I think that we do take performance into account. I agree
 that we should *never* have a regression in performance from release
 to release, which is what I believe has inspired this thread.
 
 However if you look at a lot of the items that have gone into this
 release they were all about performance:

Agreed. I either initiated or assisted with most of those items; but
that's not really my point, however because those were planned
performance features.

My thinking was about how we handle the last minute attention to detail
that ensures we have performance everywhere, not just on the main
features.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Joshua D. Drake wrote:
 Well I think that we do take performance into account. I agree
 that we should *never* have a regression in performance from release
 to release, which is what I believe has inspired this thread.

 Hmm.  I have developed several features that have driven performance
 down.

Even changes that are not feature additions but intended solely to
improve performance may have corner cases where they are losses rather
than wins.  I think *never* have a regression in performance is not
only pie-in-the-sky but would be a bad policy to adopt, because it
would mean for instance that we couldn't intentionally optimize common
cases at the expense of uncommon ones.

However, I think everybody agrees that getting blindsided by unexpected
performance dropoffs is a bad thing.  We really need to reinstitute
the sort of daily (or near-daily) performance tracking that Mark Wong
used to be doing, and extend it to cover a wider variety of test cases
than just DBT-2.  As an example, I'll bet that this issue of operator
lookup speed would never have been visible at all in DBT-2.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] PG 7.3 is five years old today

2007-11-27 Thread Tom Lane
By chance I happened to notice in the release notes

Release 7.3
Release date: 2002-11-27

Man, it feels like a long time since that came out...

There has been some discussion of making a project policy of dropping
support for old releases after five years.  Should we consider formally
instituting that?

I see that there are two or three minor bug fixes in the REL7_3_STABLE
branch since 7.3.20.  Rather than just leaving those to rot, maybe the
actual policy should be only one more update after 8.3 comes out.

Comments, opinions?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] PG 7.3 is five years old today

2007-11-27 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Tue, 27 Nov 2007 14:02:24 -0500
Tom Lane [EMAIL PROTECTED] wrote:

 By chance I happened to notice in the release notes
 
 Release 7.3
 Release date: 2002-11-27
 
 Man, it feels like a long time since that came out...

5 years was a long time ago :)

 
 There has been some discussion of making a project policy of dropping
 support for old releases after five years.  Should we consider
 formally instituting that?

Yes.

 
 I see that there are two or three minor bug fixes in the REL7_3_STABLE
 branch since 7.3.20.  Rather than just leaving those to rot, maybe the
 actual policy should be only one more update after 8.3 comes out.
 
 Comments, opinions?

Release 7.3.21 with and EOL addendum :). E.g; this is the last release
of 7.3 and 7.3 is now considered unsupported.

Sincerely,

Joshua D. Drake

 
   regards, tom lane
 
 ---(end of
 broadcast)--- TIP 6: explain analyze is your
 friend
 


- -- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHTGtNATb/zqfZUUQRAl88AKCpMx0tfZpU8T8raSIMciB7qxdN5QCfdvOJ
gbZY1k844q+xjqwGdntkoaY=
=+cMu
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] PG 7.3 is five years old today

2007-11-27 Thread Dave Page


 --- Original Message ---
 From: Tom Lane [EMAIL PROTECTED]
 To: pgsql-hackers@postgresql.org
 Sent: 27/11/07, 19:02:24
 Subject: [HACKERS] PG 7.3 is five years old today
 
 I see that there are two or three minor bug fixes in the REL7_3_STABLE
 branch since 7.3.20.  Rather than just leaving those to rot, maybe the
 actual policy should be only one more update after 8.3 comes out.
 

I assume you no longer need  to maintain it for Redhat then? If that's the 
case, I'm for dropping it given it's age.

/D

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] PG 7.3 is five years old today

2007-11-27 Thread Tom Lane
Dave Page [EMAIL PROTECTED] writes:
 From: Tom Lane [EMAIL PROTECTED]
 I see that there are two or three minor bug fixes in the REL7_3_STABLE
 branch since 7.3.20.  Rather than just leaving those to rot, maybe the
 actual policy should be only one more update after 8.3 comes out.

 I assume you no longer need  to maintain it for Redhat then?

Well, I still do, nominally, but RHEL-3 is in maintenance mode (meaning
no more scheduled updates).  It would take a fairly serious bug to get
Red Hat's attention to the point that they'd want to turn the package.
If something like that came up, very possibly we'd want to put out a
fix too.  What I'm thinking is more along the lines of not bothering
with back-patching non-catastrophic bugs, and not automatically
including 7.3 in the set of branches we make back-branch releases for.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] PG 7.3 is five years old today

2007-11-27 Thread Gevik Babakhani

 At some point back, I seem to recall the reason for bothering 
 to backpatch to 7.3 is that it had to be maintained for 
 RedHat anyway, so things might as well be backpatched? If 
 that requirements is gone, I think it's time to drop it.

+1 

 And +1 on pushing out one final end of the tree release 
 since there's stuff there.
 

+1


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] PG 7.3 is five years old today

2007-11-27 Thread Magnus Hagander

On Tue, 2007-11-27 at 14:02 -0500, Tom Lane wrote:
 By chance I happened to notice in the release notes
 
 Release 7.3
 Release date: 2002-11-27
 
 Man, it feels like a long time since that came out...
 
 There has been some discussion of making a project policy of dropping
 support for old releases after five years.  Should we consider formally
 instituting that?
 
 I see that there are two or three minor bug fixes in the REL7_3_STABLE
 branch since 7.3.20.  Rather than just leaving those to rot, maybe the
 actual policy should be only one more update after 8.3 comes out.
 
 Comments, opinions?

At some point back, I seem to recall the reason for bothering to
backpatch to 7.3 is that it had to be maintained for RedHat anyway, so
things might as well be backpatched? If that requirements is gone, I
think it's time to drop it.

And +1 on pushing out one final end of the tree release since there's
stuff there.

//Magnus


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] psql -f doesn't complain about directories

2007-11-27 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 This should do better:

Looks good to me, though I'd suggest updating gets_fromFile's header comment:

- * The result is a malloc'd string.
+ * The result is a malloc'd string, or NULL on EOF or input error.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] PG 7.3 is five years old today

2007-11-27 Thread Dave Page
Tom Lane wrote:
 Dave Page [EMAIL PROTECTED] writes:
 From: Tom Lane [EMAIL PROTECTED]
 I see that there are two or three minor bug fixes in the REL7_3_STABLE
 branch since 7.3.20.  Rather than just leaving those to rot, maybe the
 actual policy should be only one more update after 8.3 comes out.
 
 I assume you no longer need  to maintain it for Redhat then?
 
 Well, I still do, nominally, but RHEL-3 is in maintenance mode (meaning
 no more scheduled updates).  It would take a fairly serious bug to get
 Red Hat's attention to the point that they'd want to turn the package.
 If something like that came up, very possibly we'd want to put out a
 fix too.  What I'm thinking is more along the lines of not bothering
 with back-patching non-catastrophic bugs, and not automatically
 including 7.3 in the set of branches we make back-branch releases for.

OK, well +1 for dropping it from me then.

/D

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] PG 7.3 is five years old today

2007-11-27 Thread Simon Riggs
On Tue, 2007-11-27 at 14:02 -0500, Tom Lane wrote:

 There has been some discussion of making a project policy of dropping
 support for old releases after five years.  Should we consider formally
 instituting that?
 
 I see that there are two or three minor bug fixes in the REL7_3_STABLE
 branch since 7.3.20.  Rather than just leaving those to rot, maybe the
 actual policy should be only one more update after 8.3 comes out.

Well, I agree that it shouldn't be your responsibility to do that. We
need to reduce the things you have to worry about to allow you to focus
on later releases.

One of the good things about open source is the ability for software to
remain supported for many years longer than closed source software.

Perhaps we should ask for volunteers to maintain that branch? If we had
a maintenance release manager, then they can take responsibility for
passing down any appropriate bug fixes. We could also create a new list
for people discussing older releases, so we don't get pinged all the
time. 

That way anybody with an application at older release levels can either
step up to the plate or lose support.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] PG 7.3 is five years old today

2007-11-27 Thread Andreas 'ads' Scherbaum
On Tue, 27 Nov 2007 11:08:58 -0800 Joshua D. Drake wrote:

 Release 7.3.21 with and EOL addendum :). E.g; this is the last release
 of 7.3 and 7.3 is now considered unsupported.

I know at least one customer who is using RHEL-3 and PG 7.3 on dozens
machines worldwide. Yes, they are moving to 8.2 but this will require
some more month and eventually not all machines can just be updated to
a newer OS/DB version.

So i'm also for stopping support for 7.3 but not the way you proposed.
If we have supported 7.3 up to now, there should be an official notice
with a date, when support ends. This date should not be the next and
final release some days after the notice ;-)


Kind regards

-- 
Andreas 'ads' Scherbaum
German PostgreSQL User Group

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Replacement Selection

2007-11-27 Thread Simon Riggs
On Tue, 2007-11-27 at 17:49 +0100, [EMAIL PROTECTED] wrote:
 Any comment about Two Ways Replacement Selection (two heaps instead of just 
 one) ?

It might allow dynamic heap size management more easily than with a
single heap.

If you really think it will be better, try it. You'll learn loads, right
or wrong. It's difficult to forecast ahead of time what's a good idea
and what's a bad idea. The real truth of these things is that you need
to pop the hood and start tinkering and its's quite hard to make a plan
for that. If you have a bad idea, just move on to the next one; they're
just ideas.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] PG 7.3 is five years old today

2007-11-27 Thread Josh Berkus
Tom,

 There has been some discussion of making a project policy of dropping
 support for old releases after five years.  Should we consider formally
 instituting that?

The community consensus I recall was three versions only.  Anything beyond 
that would be up to the vendors.

Mind you, I don't know what EDB guarentees but the Sun folks could end up 
patching everything back to 8.1 for the next 5 years depending on customer 
demand.  So I think 5 years will be a reality for us for the conceivable 
future.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] PG 7.3 is five years old today

2007-11-27 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes:
 There has been some discussion of making a project policy of dropping
 support for old releases after five years.  Should we consider formally
 instituting that?

 The community consensus I recall was three versions only.  Anything beyond 
 that would be up to the vendors.

Yeah, but some of us are also the vendors ;-).  I still figure that if
I have to maintain branch X for Red Hat, I might as well put those fixes
in the community CVS.  I should think that Sun, EDB, et al would also
find it expedient to not need to maintain private patch sets.  So it
seems to me that the vendor EOL horizons are legitimate to consider
while deciding what the community wants to support.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Simon Riggs
On Tue, 2007-11-27 at 13:54 -0500, Tom Lane wrote:

 However, I think everybody agrees that getting blindsided by unexpected
 performance dropoffs is a bad thing.  We really need to reinstitute
 the sort of daily (or near-daily) performance tracking that Mark Wong
 used to be doing, and extend it to cover a wider variety of test cases
 than just DBT-2.  As an example, I'll bet that this issue of operator
 lookup speed would never have been visible at all in DBT-2.

Yeh, we need multiple large benchmarks run on a regular basis.

My understanding is the community has two 8-core servers to run
benchmarks on, but I'd quite like to have some details on where these
are at. One is likely to be running RHEL, one Solaris.

We also need performance regression tests, which is a slightly different
thing even if they do sound similar.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Tue, 27 Nov 2007 20:32:57 +
Simon Riggs [EMAIL PROTECTED] wrote:

 On Tue, 2007-11-27 at 13:54 -0500, Tom Lane wrote:
 
  However, I think everybody agrees that getting blindsided by
  unexpected performance dropoffs is a bad thing.  We really need to
  reinstitute the sort of daily (or near-daily) performance tracking
  that Mark Wong used to be doing, and extend it to cover a wider
  variety of test cases than just DBT-2.  As an example, I'll bet
  that this issue of operator lookup speed would never have been
  visible at all in DBT-2.
 
 Yeh, we need multiple large benchmarks run on a regular basis.
 
 My understanding is the community has two 8-core servers to run
 benchmarks on, but I'd quite like to have some details on where these
 are at. One is likely to be running RHEL, one Solaris.

The RHEL one as I know it, is the MyYearbook donated one. We are
currently unaware of the status of that machine except to say it is
currently running Gentoo.

I don't know the status of the Solaris machine except that I think we
had IO issues with it.

Sincerely,

Joshua D. Drake


 
 We also need performance regression tests, which is a slightly
 different thing even if they do sound similar.
 


- -- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHTH+7ATb/zqfZUUQRAhlYAJ0QqG5CzDmQfi2ynj/i9xJoe1nWUACeOyJQ
gSOnZhVhp61HbOlfRjfDzvM=
=MEIW
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] PG 7.3 is five years old today

2007-11-27 Thread Tom Lane
Andreas 'ads' Scherbaum [EMAIL PROTECTED] writes:
 On Tue, 27 Nov 2007 11:08:58 -0800 Joshua D. Drake wrote:
 Release 7.3.21 with and EOL addendum :). E.g; this is the last release
 of 7.3 and 7.3 is now considered unsupported.

 I know at least one customer who is using RHEL-3 and PG 7.3 on dozens
 machines worldwide.

Are they running 7.3.20?  Will they update to 7.3.21 promptly when we
ship it?  Or are they using whatever Red Hat includes in RHEL-3?
(which is still 7.3.19 I believe)

One of the reasons for losing interest in frequent updates is that
it seems most of the people we hear from who are running 7.3.x are
running a pretty obsolete x.  If we produce an update and no one
actually installs it, we're just wasting time with make-work.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Simon Riggs
On Tue, 2007-11-27 at 13:32 -0500, Andrew Dunstan wrote:

 We also need to talk about what would be a good set of tests to run.

I think we should develop a series of performance regression tests that
can be run as an option on the buildfarm. We'd want a separate page for
that with graphs etc, as you suggest.

My vision for that is a set of tests that test very specific aspects of
code, much the same way as the regression tests attempt feature
coverage. Examples would be
- 1 INSERTs
- 1 INSERTs using multi-VALUEs clauses
- 10 rows inserted by COPY
- 10 rows inserted by CTAS

We would need a way to compare results between releases, so we can see
which aspects have regressed/improved, just as we have with the
buildfarm. That will also be food for release notes, where we can
mention all actions that are 5% faster, or anything we must regrettably
report as being slower.

Sounds like it's waiting on somebody to make the first move, so maybe I
should do that, then let everybody else chip into the framework.

Should we do this as part of core, or as a separate pgfoundry project?

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Simon Riggs
On Tue, 2007-11-27 at 12:36 -0800, Joshua D. Drake wrote:

 The RHEL one as I know it, is the MyYearbook donated one. We are
 currently unaware of the status of that machine except to say it is
 currently running Gentoo.
 
 I don't know the status of the Solaris machine except that I think we
 had IO issues with it.

Might I enquire who has these machines, so I can ask them how can I get
access to them? Are they really Community systems? In what sense?

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Tue, 27 Nov 2007 21:00:03 +
Simon Riggs [EMAIL PROTECTED] wrote:

 On Tue, 2007-11-27 at 12:36 -0800, Joshua D. Drake wrote:
 
  The RHEL one as I know it, is the MyYearbook donated one. We are
  currently unaware of the status of that machine except to say it is
  currently running Gentoo.
  
  I don't know the status of the Solaris machine except that I think
  we had IO issues with it.
 
 Might I enquire who has these machines, so I can ask them how can I
 get access to them? Are they really Community systems? In what sense?


They are community systems in the sense that JoshB told the community
Sun donated a box and MyYearbook said they donated a box. MyYearbook
had stated that we (cmd) were going to manage the MyYearbook box but
nothing has come of it as of yet. I haven't heard anything on the Sun
machine in a *long* time.

My guess is that everyone is busy.

So I have a question :)... The community does have money. We have IO
available, what we don't have is a machine. Would the community be
interested in purchasing a box? 

We could fairly reasonably get an HP or Dell (the new Dell's aren't
nearly as bad as they used to be) or Sun 8 core box with reasonable ram
and just a raid 1 for the OS. Then use the external IO we have available
with that machine.

Joshua D. Drake
 


- -- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHTIh5ATb/zqfZUUQRArJ1AJ4jf6S12CY4duUTYzlSQlF+YTYRvwCgmu/i
Dj36cknbGDX2Bzh6rmIkNw8=
=jl1/
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Simon Riggs
On Tue, 2007-11-27 at 15:33 -0300, Alvaro Herrera wrote:
 Joshua D. Drake wrote:
  I agree
  that we should *never* have a regression in performance from release
  to release, which is what I believe has inspired this thread.
 
 Hmm.  I have developed several features that have driven performance
 down.  

I think performance reductions as a result of additional functionality
are acceptable, but we should aim to minimise them. 

It's the small things that crop up along the way that must be fixed,
with the same vigour we fix other bugs.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]

2007-11-27 Thread Brendan Jurd
On Nov 28, 2007 4:19 AM, Tom Lane [EMAIL PROTECTED] wrote:
 Brendan Jurd [EMAIL PROTECTED] writes:
  Now I'm thinking I leave the grammar rules alone (apart from making it
  legal to specify an empty list of elements), and instead push the
  typename down into the child node from makeTypeCast(), if the child is
  an A_ArrayExpr.  Does that work better?

 Actually, if you do that you might as well forego the separate node type
 (which requires a nontrivial amount of infrastructure).  I think it
 would work just about as well to have transformExpr check whether the
 argument of a TypeCast is an ArrayExpr, and if so call
 transformArrayExpr directly from there, passing the TypeName as an
 additional argument.

I actually thought that A_ArrayExpr would be a good addition even if
you ignore the matter of typecasting.  It always seemed weird to me
that the parser generates an ArrayExpr directly.  ArrayExpr has a
bunch of members that are only set by the transform; all the parser
does is set the 'elements' member.  And then the transform creates a
brand new ArrayExpr and populates it based on what's in the 'elements'
member of the otherwise-empty ArrayExpr passed to it.

So my feeling is that an A_ArrayExpr is a better fit for the parser
output than ArrayExpr, and more in keeping with how the rest of the
code does things.

Mind you I'm also okay with your suggestion to let transformExpr take
care of it.  But I'm not adverse to putting in the legwork to set up
the infrastructure for A_ArrayExpr, if it's a nice outcome.

 Kinda ugly, but not really any worse than the way
 A_Const is handled in that same routine.  (In fact, we could use the
 same technique to get rid of the typename field in A_Const ... might
 be worth doing?)

I had a bit of a dig into this.  A_Const-typename gets set directly
by the parse paths for INTERVAL [(int)] string [interval range].  In
fact, as far as I can tell that's the _only_ place A_Const-typename
gets used at all.  And all the transform does with that piece of
information is treat the node like a typecast.

I'm not seeing a huge amount of value in this special treatment.  Why
not just have the parser build this as an A_Const inside a TypeCast
and then let the transform deal with it in the usual way?  I found the
following comment at parsenodes.h:244

* NOTE: for mostly historical reasons, A_Const parsenodes contain
* room for a TypeName; we only generate a separate TypeCast node if the
* argument to be casted is not a constant.  In theory either representation
* would work, but the combined representation saves a bit of code in many
* productions in gram.y.

However, this is no longer the case.  makeTypeCast() doesn't care
about whether its argument is a constant anymore:

* Earlier we would determine whether an A_Const would
* be acceptable, however Domains require coerce_type()
* to process them -- applying constraints as required.

And in many productions in gram.y, many == 2.  Currently the
combined representation requires more code than it saves.

So, I get the impression the use-case for A_Const-typename has become
extinct.  I think it could be removed with a minimum of fuss, and I'd
be happy to include same with my patch (or, submit it as a separate
patch; let me know your preference).

Regards,
BJ

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Poorly named support routines for GIN tsearch index opclasses

2007-11-27 Thread Tom Lane
I've just noticed that tsearch includes a couple of support functions
with rather vague names:
gin_extract_query(internal,internal,smallint)
gin_ts_consistent(internal,smallint,internal)
These are, in fact, specialized to the case of operating on tsquery
inputs, but you'd never guess that from either the name or the
declared argument types.  The first one seems particularly likely
to conflict with other peoples' GIN opclasses, down the road sometime.

Since we've already forced initdb for the next beta, it would be free
to rename these things now.  I don't believe this would cause any
compatibility problems for applications, since SQL queries never call
these things directly.  I'm thinking gin_extract_tsquery and
gin_tsquery_consistent would be suitable names.

Another possibility would be to change the declared signatures to show
tsquery rather than internal at the places where a tsquery argument
is expected.  I'm less excited about that part though.

Comments, objections?

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] Still a NOTICE in dict_thesaurus.c

2007-11-27 Thread Tom Lane
Wasn't this patch
http://archives.postgresql.org/pgsql-committers/2007-11/msg00170.php
supposed to fix things so that we wouldn't be throwing NOTICEs out of
tsearch dictionary init functions?  It seems to have gotten only
one of the two elog(NOTICE) calls in there.  The other one is for a
stopword in the replacement phrase.  Is there any objection to making
that an ERROR too?

Also, several of the elog's in this file ought to be ereport's since
they are user-caused error conditions.  I think we are nominally past
string freeze, but does anyone have an objection to changing them to
ereport's?  If they don't get translated, the worst that will happen
is that they come out in English, which is what's happening now.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] PG 7.3 is five years old today

2007-11-27 Thread Andrew Dunstan



Josh Berkus wrote:

Tom,

  

There has been some discussion of making a project policy of dropping
support for old releases after five years.  Should we consider formally
instituting that?



The community consensus I recall was three versions only.  Anything beyond 
that would be up to the vendors.


Mind you, I don't know what EDB guarentees but the Sun folks could end up 
patching everything back to 8.1 for the next 5 years depending on customer 
demand.  So I think 5 years will be a reality for us for the conceivable 
future.


  


I don't know that we came up with a highly specific policy. My 
recollection was something like Support would be maintained for n years 
(or possibly releases), after which we could discontinue support at any 
time if bugs were unpatchable.


The burden of maintaining back releases isn't really all that great, ISTM.

I have no objection to cutting a release and declaring it final (with a 
possible exception for security fixes).


cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Andrew Dunstan



Simon Riggs wrote:

On Tue, 2007-11-27 at 13:32 -0500, Andrew Dunstan wrote:

  

We also need to talk about what would be a good set of tests to run.



Sounds like it's waiting on somebody to make the first move, so maybe I
should do that, then let everybody else chip into the framework.
  


If you start with a set of tests and send it to me I will start work on 
a benchmarking step in the buildfarm client.




Should we do this as part of core, or as a separate pgfoundry project?

  


Core, please. This is mainline -hackers material.

cheers

andrew

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]

2007-11-27 Thread Tom Lane
Brendan Jurd [EMAIL PROTECTED] writes:
 I actually thought that A_ArrayExpr would be a good addition even if
 you ignore the matter of typecasting.  It always seemed weird to me
 that the parser generates an ArrayExpr directly.  ArrayExpr has a
 bunch of members that are only set by the transform; all the parser
 does is set the 'elements' member.

Well, that's a reasonable argument.  And now that I think about it,
a parser-only node type doesn't have nearly the support overhead that
a full-fledged executable node does.  So no objection to A_ArrayExpr
if you want to do that.

 I had a bit of a dig into this.  A_Const-typename gets set directly
 by the parse paths for INTERVAL [(int)] string [interval range].  In
 fact, as far as I can tell that's the _only_ place A_Const-typename
 gets used at all.

Uh, you missed quite a lot of others ... see CURRENT_DATE and a lot of
other productions.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Simon Riggs wrote:
 Should we do this as part of core, or as a separate pgfoundry project?

 Core, please. This is mainline -hackers material.

Huh?  The buildfarm isn't in core, why would a performfarm be?

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Tue, 27 Nov 2007 21:00:03 +
Simon Riggs [EMAIL PROTECTED] wrote:

 On Tue, 2007-11-27 at 12:36 -0800, Joshua D. Drake wrote:
 
  The RHEL one as I know it, is the MyYearbook donated one. We are
  currently unaware of the status of that machine except to say it is
  currently running Gentoo.
  
  I don't know the status of the Solaris machine except that I think
  we had IO issues with it.
 
 Might I enquire who has these machines, so I can ask them how can I
 get access to them? Are they really Community systems? In what sense?
 

We just spoke with MyYearbook and my suspicions were correct. Everyone
is just busy. Next ETA on that machine is 3 weeks (on the outside).. so
expect the beginning of the year.

Joshua D. Drake

- -- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHTKN4ATb/zqfZUUQRAoTiAKCC6+RNPFihWyos4s7vCmwt2K3C1gCfdFkV
z0ht4nKhelQ+UUwrHnMLXkA=
=2JWW
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Still a NOTICE in dict_thesaurus.c

2007-11-27 Thread Bruce Momjian

Thanks, working on this now.  Because the stop words can be any stop
word I didn't imagine it could be in the subsitution.  What stop word
will it use?  Anyway, coding it now to match the left-hand-side.

---

Tom Lane wrote:
 Wasn't this patch
 http://archives.postgresql.org/pgsql-committers/2007-11/msg00170.php
 supposed to fix things so that we wouldn't be throwing NOTICEs out of
 tsearch dictionary init functions?  It seems to have gotten only
 one of the two elog(NOTICE) calls in there.  The other one is for a
 stopword in the replacement phrase.  Is there any objection to making
 that an ERROR too?
 
 Also, several of the elog's in this file ought to be ereport's since
 they are user-caused error conditions.  I think we are nominally past
 string freeze, but does anyone have an objection to changing them to
 ereport's?  If they don't get translated, the worst that will happen
 is that they come out in English, which is what's happening now.
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 3: Have you checked our extensive FAQ?
 
http://www.postgresql.org/docs/faq

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Still a NOTICE in dict_thesaurus.c

2007-11-27 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Thanks, working on this now.  Because the stop words can be any stop
 word I didn't imagine it could be in the subsitution.  What stop word
 will it use?  Anyway, coding it now to match the left-hand-side.

Why would you do that?  There is not any corresponding idea of leaving
holes in the pattern.  AFAICS the only change needed is NOTICE-ERROR.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Still a NOTICE in dict_thesaurus.c

2007-11-27 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Thanks, working on this now.  Because the stop words can be any stop
  word I didn't imagine it could be in the subsitution.  What stop word
  will it use?  Anyway, coding it now to match the left-hand-side.
 
 Why would you do that?  There is not any corresponding idea of leaving
 holes in the pattern.  AFAICS the only change needed is NOTICE-ERROR.

OK, I see now.  The right-hand side just ignores a stop word, rather
than accepting any stop word.  Got it.

So, for the left-hand-side we have '?' as any stop word and a stop word
as an error, and you want to make a stop word as an error on the
right-hand-size.  No need for '?'.  Agreed.  Want me to do it?

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Guillaume Smet
On Nov 27, 2007 11:45 PM, Andrew Dunstan [EMAIL PROTECTED] wrote:
 If you start with a set of tests and send it to me I will start work on
 a benchmarking step in the buildfarm client.

Are you sure it shouldn't be a separate client? I don't think neither
the prerequisites nor the results wanted have something in common with
the build farm.

Another idea is that we should find a way to let people run their
specific benchmarks and use the client to report their results. Some
people may want to run read only benchmarks, some I/O heavy
benchmarks, and others benchmarks which runs for more than a day.

--
Guillaume

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Andrew Dunstan



Tom Lane wrote:

Andrew Dunstan [EMAIL PROTECTED] writes:
  

Simon Riggs wrote:


Should we do this as part of core, or as a separate pgfoundry project?
  


  

Core, please. This is mainline -hackers material.



Huh?  The buildfarm isn't in core, why would a performfarm be?


  


It's the tests I think belong in core, not the farm software. Currently 
buildfarm performs functionality tests that are also in core.


cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Guillaume Smet
On Nov 27, 2007 7:32 PM, Andrew Dunstan [EMAIL PROTECTED] wrote:
 But a performance test machine
 probably needs to be dedicated to just that function. And at least some
 members of the performance test machines would need to be higher end
 machines. The number of people who can afford such resources is much
 lower than those who can run a relatively low impact simple buildfarm
 member.

As I already indicated it in another thread, we will dedicate 7
servers donated by Continuent to exactly that in a near future (they
are waiting in their boxes atm). So we will have dedicated boxes to
run benchmarks night  day. They will be accessible to the community
to set up as much benchmarks as necessary and I'll help to set up them
and make sure everything is OK.

They are not high end servers as the MyYearBook or Sun ones but they
are pretty decent. They are all i386 so we'll have to get other
clients but I think we can set up a variety of benchmarks to get them
busy and produce valuable results.

If we can get a benchmark client/results server infrastructure by the
end of january, it will be right on time.

I'm pretty sure all vendors of PostgreSQL are aware that a bench farm
is really important right now so I'm pretty sure we'll get more
servers if we can get something as easy to setup as the build farm.

--
Guillaume

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [GENERAL] Empty arrays with ARRAY[]

2007-11-27 Thread Brendan Jurd
On Nov 28, 2007 9:49 AM, Tom Lane [EMAIL PROTECTED] wrote:
  I had a bit of a dig into this.  A_Const-typename gets set directly
  by the parse paths for INTERVAL [(int)] string [interval range].  In
  fact, as far as I can tell that's the _only_ place A_Const-typename
  gets used at all.

 Uh, you missed quite a lot of others ... see CURRENT_DATE and a lot of
 other productions.


Thanks again.  I missed those because they don't use
makeStringConst().  Looking again, it turns out many productions is
more like 15.

That's a bigger number, certainly, but it's still manageable.  It
wouldn't be hard to convert them to generate a const-in-a-cast.  In
fact with the addition of a makeCastStringConst(), I think the code
saving from A_Const-typename would be cancelled out.

If the only reason for keeping A_Const-typename around is the alleged
code saving (as indicated by the code comments), my offer to do away
with it is still on the table.

Regards,
BJ

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Josh Berkus
Andrew,

 It's the tests I think belong in core, not the farm software. Currently
 buildfarm performs functionality tests that are also in core.

Jignesh and I were talking about writing a Pole Position-style test which 
measures peformance on each of a couple dozen specific operations.  There 
are limits to what we could ship with PostgreSQL ... DW operations aren't 
really testable without 18 hours to generate data ... but we could test a 
lot of things.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Poorly designed tsearch NOTICEs

2007-11-27 Thread Tom Lane
Last month I complained:
 regression=# SELECT plainto_tsquery('the any'); 
 NOTICE:  query contains only stopword(s) or doesn't contain lexeme(s), ignored
  plainto_tsquery 
 -
 
 (1 row)

 regression=# select ''::tsquery;
 NOTICE:  tsearch query doesn't contain lexeme(s): 
  tsquery 
 -
 
 (1 row)

 IMHO, it's really bad design to have this sort of NOTICE emitted by
 tsquery input.  Even if an application uses numnode() or querytree() or
 something similar to detect bogus queries, it's going to have its logs
 cluttered with these notices.

 I could see having the @@ operator emit the notice if the query is
 actually used for searching --- though I'm not quite sure how to get it
 to come out only once per query ... maybe we could put it into the index
 consistent() functions somehow?

I experimented with this and found out that it works all right for GIN
indexes, if the NOTICE is put into gin_extract_query(); that seems to be
called just once per GIN index search.  However, the only possible place
to put it in GIST tsearch support would be in the consistent() routines,
and that's no good because those will be called once per entry on the
index's root page --- so you get multiple copies of the NOTICE.

So it seems that the practical alternatives are:

1. Leave these notices where they are.  Expect complaints from people
who would rather not have their logs cluttered with 'em.

2. Remove the notices altogether.  Expect complaints from people who
get no matches on queries that they don't realize are all-stopwords.

3. Remove the notices from the input routines, and put one into
gin_extract_query only.  We'll still get complaints as in #2, but
only from people using GIST indexes or no index at all for searching.

None of these are really terribly attractive, but I'm kinda leaning
to #2 myself.  I'm not convinced that it's the province of the DB to be
issuing messages like this.  In a lot of common scenarios, NOTICEs
aren't going to be seen by the actual person entering the query anyway,
because there are layers of software between him and the DB.  All they
will accomplish is to bloat some logs somewhere.

Comments?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] Time to update list of contributors

2007-11-27 Thread Josh Berkus
All,

Time for the annual update of this list:
http://www.postgresql.org/developer/bios

Here's the list of people I gleaned from the release notes (btw, if people 
have countries for the folks who aren't attributed, I'd appreciate them).  
Of course, there are many contributors to essential non-core code who 
should be listed as well:

Tom Lane, USA
Peter Eisentraut, Germany
Bruce Momjian, USA
Simon Riggs, England
Pavan Deolasee, India
Itagaki Takahiro, Japan
Greg Smith, USA
David Fetter, USA
Pavel Stehule, Czech
Greg Stark, USA
Heikki Linnakangas
Oleg Bartunov, Russia
Florian Pflug
Jeff Davis, USA
Trevor Hardcastle
Nikhil S
Holdger Schurig
D'Arcy Cain, Canada
Gevik Babakhani, Netherlands
Teodor Sigaev, Russia
Alvaro Herrera, Chile
Mark Kirkwood, New Zealand
Joachim Wieland
Henry Hotz, USA
Magnus Haeglander, Sweden
Tatsuo Ishii, Japan
Victor Wagner
Bill Moran, USA
Andrew Dunstan, USA
Arul Shaji
Nickolay Samokhvalov, Russia
Neil Conway, Canada
Marc Fournier, Canada
Jaime Casanova, Venezuala
Albert Cervera
Bernd Helmle
Glen Parker
Jan Wieck, USA
Steve Marshall
Paul Bayer
Doug Knight
Greg Sabino Mullane, USA
Chad Wagner
Brendan Jurd
Euler Taviera de Oliveira, Brazil
Joe Conway, USA
Michael M., Germany
Guillaume Smet, France
Mark Cotner
Chris Marcellino, Italy
Dave Cramer, Canada
Devrim Gunduz, Turkey
Jeremy Drake
Marko Kreen, Estonia
Kris Jurka, Finland
Tom Dunstan, USA

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [pgsql-www] Time to update list of contributors

2007-11-27 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Tue, 27 Nov 2007 16:08:36 -0800
Josh Berkus [EMAIL PROTECTED] wrote:

 All,
 
 Time for the annual update of this list:
 http://www.postgresql.org/developer/bios
 
 Here's the list of people I gleaned from the release notes (btw, if
 people have countries for the folks who aren't attributed, I'd
 appreciate them). Of course, there are many contributors to essential
 non-core code who should be listed as well:

It would be useful to know the requirements to be listed...

Joshua D. Drake

 
 Tom Lane, USA
 Peter Eisentraut, Germany
 Bruce Momjian, USA
 Simon Riggs, England
 Pavan Deolasee, India
 Itagaki Takahiro, Japan
 Greg Smith, USA
 David Fetter, USA
 Pavel Stehule, Czech
 Greg Stark, USA
 Heikki Linnakangas
 Oleg Bartunov, Russia
 Florian Pflug
 Jeff Davis, USA
 Trevor Hardcastle
 Nikhil S
 Holdger Schurig
 D'Arcy Cain, Canada
 Gevik Babakhani, Netherlands
 Teodor Sigaev, Russia
 Alvaro Herrera, Chile
 Mark Kirkwood, New Zealand
 Joachim Wieland
 Henry Hotz, USA
 Magnus Haeglander, Sweden
 Tatsuo Ishii, Japan
 Victor Wagner
 Bill Moran, USA
 Andrew Dunstan, USA
 Arul Shaji
 Nickolay Samokhvalov, Russia
 Neil Conway, Canada
 Marc Fournier, Canada
 Jaime Casanova, Venezuala
 Albert Cervera
 Bernd Helmle
 Glen Parker
 Jan Wieck, USA
 Steve Marshall
 Paul Bayer
 Doug Knight
 Greg Sabino Mullane, USA
 Chad Wagner
 Brendan Jurd
 Euler Taviera de Oliveira, Brazil
 Joe Conway, USA
 Michael M., Germany
 Guillaume Smet, France
 Mark Cotner
 Chris Marcellino, Italy
 Dave Cramer, Canada
 Devrim Gunduz, Turkey
 Jeremy Drake
 Marko Kreen, Estonia
 Kris Jurka, Finland
 Tom Dunstan, USA
 


- -- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHTLMdATb/zqfZUUQRApmWAKCn35/NMsxIPbb0zVOEOIPwBftthQCdE5bH
0KRAB/zHwywwy0hVZx2f1zg=
=7QdJ
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Time to update list of contributors

2007-11-27 Thread Kris Jurka



On Tue, 27 Nov 2007, Josh Berkus wrote:


Kris Jurka, Finland


USA actually.

Kris Jurka

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Poorly named support routines for GIN tsearch index opclasses

2007-11-27 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 Another possibility would be to change the declared signatures to show
 tsquery rather than internal at the places where a tsquery argument
 is expected.  I'm less excited about that part though.

The use of internal arguments has always been the part of GIN/GIST which
bothered me the most. Most of those instances are actually quite necessary but
if there are some that aren't I'm all for removing them.

The only thing is that this has a semantic effect. It means users will be able
to call these functions from SQL directly. Are they safe to allow this? Is
this useful?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Time to update list of contributors

2007-11-27 Thread Gregory Stark

Josh Berkus [EMAIL PROTECTED] writes:

 All,

 Time for the annual update of this list:
...
 Greg Stark, USA

I'm not sure what the countries are supposed to signify but that's neither the
country I hail from nor where I'm currently living.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Andrew Dunstan



Josh Berkus wrote:

Andrew,

  

It's the tests I think belong in core, not the farm software. Currently
buildfarm performs functionality tests that are also in core.



Jignesh and I were talking about writing a Pole Position-style test which 
measures peformance on each of a couple dozen specific operations.  There 
are limits to what we could ship with PostgreSQL ... DW operations aren't 
really testable without 18 hours to generate data ... but we could test a 
lot of things.


  


I think we're going to need several sets of tests, and some settable 
scaling factors.


Performance isn't just about humungous DW apps. We need to have testing 
on small to medium sized apps/machinery as well.



cheers

andrew

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Still a NOTICE in dict_thesaurus.c

2007-11-27 Thread Bruce Momjian
Bruce Momjian wrote:
 Tom Lane wrote:
  Bruce Momjian [EMAIL PROTECTED] writes:
   Thanks, working on this now.  Because the stop words can be any stop
   word I didn't imagine it could be in the subsitution.  What stop word
   will it use?  Anyway, coding it now to match the left-hand-side.
  
  Why would you do that?  There is not any corresponding idea of leaving
  holes in the pattern.  AFAICS the only change needed is NOTICE-ERROR.
 
 OK, I see now.  The right-hand side just ignores a stop word, rather
 than accepting any stop word.  Got it.
 
 So, for the left-hand-side we have '?' as any stop word and a stop word
 as an error, and you want to make a stop word as an error on the
 right-hand-size.  No need for '?'.  Agreed.  Want me to do it?

I have applied the attached patch to change the NOTICE to ERROR.  Thanks
for spotting this.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/backend/tsearch/dict_thesaurus.c
===
RCS file: /cvsroot/pgsql/src/backend/tsearch/dict_thesaurus.c,v
retrieving revision 1.8
diff -c -c -r1.8 dict_thesaurus.c
*** src/backend/tsearch/dict_thesaurus.c	15 Nov 2007 22:25:16 -	1.8
--- src/backend/tsearch/dict_thesaurus.c	28 Nov 2007 04:22:08 -
***
*** 570,576 
  			}
  			else if (lexized)
  			{
! elog(NOTICE, thesaurus word \%s\ in substitution is a stop-word, ignored (rule %d), inptr-lexeme, i + 1);
  			}
  			else
  			{
--- 570,576 
  			}
  			else if (lexized)
  			{
! elog(ERROR, thesaurus word \%s\ in substitution is a stop-word (rule %d), inptr-lexeme, i + 1);
  			}
  			else
  			{

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Poorly designed tsearch NOTICEs

2007-11-27 Thread Robert Treat
On Tuesday 27 November 2007 19:03, Tom Lane wrote:
 Last month I complained:
  regression=# SELECT plainto_tsquery('the any');
  NOTICE:  query contains only stopword(s) or doesn't contain lexeme(s),
  ignored plainto_tsquery
  -
 
  (1 row)
 
  regression=# select ''::tsquery;
  NOTICE:  tsearch query doesn't contain lexeme(s): 
   tsquery
  -
 
  (1 row)
 
  IMHO, it's really bad design to have this sort of NOTICE emitted by
  tsquery input.  Even if an application uses numnode() or querytree() or
  something similar to detect bogus queries, it's going to have its logs
  cluttered with these notices.
 
  I could see having the @@ operator emit the notice if the query is
  actually used for searching --- though I'm not quite sure how to get it
  to come out only once per query ... maybe we could put it into the index
  consistent() functions somehow?

 I experimented with this and found out that it works all right for GIN
 indexes, if the NOTICE is put into gin_extract_query(); that seems to be
 called just once per GIN index search.  However, the only possible place
 to put it in GIST tsearch support would be in the consistent() routines,
 and that's no good because those will be called once per entry on the
 index's root page --- so you get multiple copies of the NOTICE.

 So it seems that the practical alternatives are:

 1. Leave these notices where they are.  Expect complaints from people
 who would rather not have their logs cluttered with 'em.

 2. Remove the notices altogether.  Expect complaints from people who
 get no matches on queries that they don't realize are all-stopwords.

 3. Remove the notices from the input routines, and put one into
 gin_extract_query only.  We'll still get complaints as in #2, but
 only from people using GIST indexes or no index at all for searching.

 None of these are really terribly attractive, but I'm kinda leaning
 to #2 myself.  I'm not convinced that it's the province of the DB to be
 issuing messages like this.  In a lot of common scenarios, NOTICEs
 aren't going to be seen by the actual person entering the query anyway,
 because there are layers of software between him and the DB.  All they
 will accomplish is to bloat some logs somewhere.

 Comments?

I would lean toward #1 since it seems to be closest to the behavior from 
previous releases. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Poorly named support routines for GIN tsearch index opclasses

2007-11-27 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 Another possibility would be to change the declared signatures to show
 tsquery rather than internal at the places where a tsquery argument
 is expected.  I'm less excited about that part though.

 The only thing is that this has a semantic effect. It means users will
 be able to call these functions from SQL directly. Are they safe to
 allow this? Is this useful?

No, no, and no, because there will still be at least one internal
argument.  I'm just suggesting that the argument positions that do
correspond to ordinary SQL types should be declared that way, as an
extra way of distinguishing these support functions from others for
other opclasses.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] PG 7.3 is five years old today

2007-11-27 Thread Robert Treat
On Tuesday 27 November 2007 15:07, Simon Riggs wrote:
 On Tue, 2007-11-27 at 14:02 -0500, Tom Lane wrote:
  There has been some discussion of making a project policy of dropping
  support for old releases after five years.  Should we consider formally
  instituting that?
 
  I see that there are two or three minor bug fixes in the REL7_3_STABLE
  branch since 7.3.20.  Rather than just leaving those to rot, maybe the
  actual policy should be only one more update after 8.3 comes out.

 Well, I agree that it shouldn't be your responsibility to do that. We
 need to reduce the things you have to worry about to allow you to focus
 on later releases.

 One of the good things about open source is the ability for software to
 remain supported for many years longer than closed source software.

 Perhaps we should ask for volunteers to maintain that branch? If we had
 a maintenance release manager, then they can take responsibility for
 passing down any appropriate bug fixes. We could also create a new list
 for people discussing older releases, so we don't get pinged all the
 time.

 That way anybody with an application at older release levels can either
 step up to the plate or lose support.

+1 to see if anyone else wants to take over management of the branch. I also 
think we should be a bit more generous on the EOL notice. Saying one more 
update after 8.3 is akin to giving a 1 month EOL notice; not friendly at all 
imo. Set it for July 2008 and I think you have given plenty of notice (and 
given the lack of back patches, should be too much of a burden in that time 
either)

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Time to update list of contributors

2007-11-27 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Josh Berkus [EMAIL PROTECTED] writes:
 Time for the annual update of this list:
 Greg Stark, USA

 I'm not sure what the countries are supposed to signify but that's
 neither the country I hail from nor where I'm currently living.

Just tell us how you want to be listed ...

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Time to update list of contributors

2007-11-27 Thread Jaime Casanova
On Nov 27, 2007 7:08 PM, Josh Berkus [EMAIL PROTECTED] wrote:

 Jaime Casanova, Venezuala

Ecuador

 Bernd Helmle

and he's from germany
http://www.oopsware.de/private/bernd.html

-- 
regards,
Jaime Casanova

Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning.
   Richard Cook

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Josh Berkus wrote:
 ... DW operations aren't 
 really testable without 18 hours to generate data ... but we could test a 
 lot of things.

 Performance isn't just about humungous DW apps.

Indeed.  I think the real take-home lesson from these past few days'
discussion is that *any* particular view of performance is going to
miss things that don't affect that case, but do affect somebody else.

What I find most worrisome about the notion of setting up a
performance-farm is that it will encourage us to optimize with blinkers
on --- that is, that we will consider only the specific cases measured
by whatever tests are included in the farm, and will happily pessimize
other cases.  We can ameliorate that a bit if we can get a sufficiently
wide variety of test cases, but it will always be a concern.  And
dogmatic positions like only cases involving terabytes of data are
worth testing are definitely not going to help.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Joshua D. Drake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Wed, 28 Nov 2007 00:15:48 -0500
Tom Lane [EMAIL PROTECTED] wrote:

 Andrew Dunstan [EMAIL PROTECTED] writes:
  Josh Berkus wrote:
  ... DW operations aren't 
  really testable without 18 hours to generate data ... but we could
  test a lot of things.
 
  Performance isn't just about humungous DW apps.
 
 Indeed.  I think the real take-home lesson from these past few days'
 discussion is that *any* particular view of performance is going to
 miss things that don't affect that case, but do affect somebody else.
 
 What I find most worrisome about the notion of setting up a
 performance-farm is that it will encourage us to optimize with
 blinkers on --- that is, that we will consider only the specific
 cases measured by whatever tests are included in the farm, and will
 happily pessimize other cases.  We can ameliorate that a bit if we
 can get a sufficiently wide variety of test cases, but it will always
 be a concern.  And dogmatic positions like only cases involving
 terabytes of data are worth testing are definitely not going to help.


Well I certainly agree with that, especially considering that although
we do have installations with that type of data, the percentage is
microscopic compared to those that don't.

I think it may be interested to host a series of different test from
pgbench, odbcbench, dbt2, custom scripts etc... to provide trending for
a particular host. That way if one host does 50tps continuously and
then changes one way or the other, there is at least a flag...

Sincerely,

Joshua D. Drake



- -- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHTP3iATb/zqfZUUQRAgQ3AJ9N71Xgl+O4H4dr/IFagzjxu0HvAwCfZVkB
ReQmGxacCqilRstLGhoqBU4=
=03BG
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Time to update list of contributors

2007-11-27 Thread Pavan Deolasee
On Nov 28, 2007 5:38 AM, Josh Berkus [EMAIL PROTECTED] wrote:


 Nikhil S


Nikhil is from India, EnterpriseDB.


Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] Quality and Performance

2007-11-27 Thread Stefan Kaltenbrunner
Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
 Josh Berkus wrote:
 ... DW operations aren't 
 really testable without 18 hours to generate data ... but we could test a 
 lot of things.
 
 Performance isn't just about humungous DW apps.
 
 Indeed.  I think the real take-home lesson from these past few days'
 discussion is that *any* particular view of performance is going to
 miss things that don't affect that case, but do affect somebody else.

yep - but the do not affect somebody else might be one that could be
actually catched by something like a benchfarm.

 
 What I find most worrisome about the notion of setting up a
 performance-farm is that it will encourage us to optimize with blinkers
 on --- that is, that we will consider only the specific cases measured
 by whatever tests are included in the farm, and will happily pessimize
 other cases.  We can ameliorate that a bit if we can get a sufficiently
 wide variety of test cases, but it will always be a concern.  And
 dogmatic positions like only cases involving terabytes of data are
 worth testing are definitely not going to help.

agreed - I don't think having the tests itself in core (at least
initially) is such a good idea(neither am I sure tacking it on top of
the buildfarm really is).
There are a LOT of things we could do with such a farm/infrastructure
but it will take time to exactly figure out what we can reasonably do on
an automated/regular base and in a common framework.


Stefan

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] PG 7.3 is five years old today

2007-11-27 Thread Devrim GÜNDÜZ
Hi,

On Tue, 2007-11-27 at 23:53 -0500, Robert Treat wrote:
 I also  think we should be a bit more generous on the EOL notice.
 Saying one more update after 8.3 is akin to giving a 1 month EOL
 notice; not friendly at all imo. Set it for July 2008 and I think you
 have given plenty of notice (and given the lack of back patches,
 should be too much of a burden in that time either)

+1 for this.

Regards,
-- 
Devrim GÜNDÜZ , RHCE
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/


signature.asc
Description: This is a digitally signed message part