Re: [HACKERS] patch: preload dictionary new version

2010-07-20 Thread Itagaki Takahiro
2010/7/14 Pavel Stehule pavel.steh...@gmail.com:
 this patch significantly reduces the original patch. It doesn't propose
 a simple allocator - it just eliminates the high memory usage for the
 ispell dictionary.

I don't think introducing new methods is a good idea. If you want a
simple allocator, MemoryContextMethods layer seems better for me.

The original purpose of the patch -- preloading dictionaries --
also needs to be redesigned with a precompiler approach.
I'll move the proposal to "Returned with Feedback" status.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Query optimization problem

2010-07-20 Thread Zotov

*I wrote to
  pgsql-b...@postgresql.org
and they told me to write to
  pgsql-performa...@postgresql.org
and they told me to write here.*

*I don't want to know how to optimize the query myself (I know how); I
think the planner should do it.*


I have a query:

 SELECT d1.ID, d2.ID
 FROM DocPrimary d1
   JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
 WHERE (d1.ID=234409763) or (d2.ID=234409763)

I think the QO (Query Optimizer) could make it faster (right now it does a
seq scan, and on a million records it takes 7 seconds).
This query is very fast (it uses indexes) and is easy to derive from the
first one:

 SELECT d1.ID, d2.ID
 FROM DocPrimary d1
   JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
 WHERE (d2.BasedOn=234409763) or (d2.ID=234409763)

The plans below were created on a table without the million rows of data,
so don't look at the execution times.


 --
 Slow Query
 --
 test=# EXPLAIN (ANALYZE on, VERBOSE on, COSTS on, BUFFERS off) SELECT d1.ID, d2.ID
 test-# FROM DocPrimary d1
 test-#   JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
 test-# WHERE (d1.ID=234409763) or (d2.ID=234409763);
                                 QUERY PLAN
 --------------------------------------------------------------------------
  Hash Join  (cost=58.15..132.35 rows=2 width=8) (actual time=0.007..0.007 rows=0 loops=1)
    Output: d1.id, d2.id
    Hash Cond: (d2.basedon = d1.id)
    Join Filter: ((d1.id = 234409763) OR (d2.id = 234409763))
    ->  Seq Scan on public.docprimary d2  (cost=0.00..31.40 rows=2140 width=8) (actual time=0.002..0.002 rows=0 loops=1)
          Output: d2.id, d2.basedon
    ->  Hash  (cost=31.40..31.40 rows=2140 width=4) (never executed)
          Output: d1.id
          ->  Seq Scan on public.docprimary d1  (cost=0.00..31.40 rows=2140 width=4) (never executed)
                Output: d1.id

 --
 Fast Query
 --
 test=# EXPLAIN (ANALYZE on, VERBOSE on, COSTS on, BUFFERS off) SELECT d1.ID, d2.ID
 test-# FROM DocPrimary d1
 test-#   JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
 test-# WHERE (d2.BasedOn=234409763) or (d2.ID=234409763);
                                 QUERY PLAN
 --------------------------------------------------------------------------
  Nested Loop  (cost=8.60..58.67 rows=12 width=8) (actual time=0.026..0.026 rows=0 loops=1)
    Output: d1.id, d2.id
    ->  Bitmap Heap Scan on public.docprimary d2  (cost=8.60..19.31 rows=12 width=8) (actual time=0.023..0.023 rows=0 loops=1)
          Output: d2.id, d2.basedon
          Recheck Cond: ((d2.basedon = 234409763) OR (d2.id = 234409763))
          ->  BitmapOr  (cost=8.60..8.60 rows=12 width=0) (actual time=0.018..0.018 rows=0 loops=1)
                ->  Bitmap Index Scan on basedon_idx  (cost=0.00..4.33 rows=11 width=0) (actual time=0.008..0.008 rows=0 loops=1)
                      Index Cond: (d2.basedon = 234409763)
                ->  Bitmap Index Scan on id_pk  (cost=0.00..4.26 rows=1 width=0) (actual time=0.003..0.003 rows=0 loops=1)
                      Index Cond: (d2.id = 234409763)
    ->  Index Scan using id_pk on public.docprimary d1  (cost=0.00..3.27 rows=1 width=4) (never executed)
          Output: d1.id, d1.basedon
          Index Cond: (d1.id = d2.basedon)



PGver: PostgreSQL 9.0b x86
OS: Win7 x64

-
Create table query:
-

CREATE TABLE docprimary
(
  id integer NOT NULL,
  basedon integer,
  CONSTRAINT id_pk PRIMARY KEY (id)
);
CREATE INDEX basedon_idx
  ON docprimary
  USING btree
  (basedon);

--
Best regards,
Roman Vladimirovich Zotov
Head of the Tooling Department
ZAO NPO Konsultant
Ivanovo, ul. Palekhskaya 10
tel./fax: (4932) 41-01-21
mailto: zo...@oe-it.ru



Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread Simon Riggs
On Mon, 2010-07-19 at 23:40 -0400, Robert Haas wrote:

 Since it has been over a month since this review was posted and no new
 version of the patch has appeared, I think we should mark this patch
 as Returned with Feedback.

Mark posted a new patch 6 days ago, AFAICS.

Not sure I see any benefit in doing as you propose anyway, or at least
not without warning since it just confuses authors who may believe they
have time while the commitfest is still on.

Commitfests were created to help authors. We must continue to remember
that 99% of patch authors are not full time and likely to find smooth
responsiveness difficult.

We should say "Will there be any further action on this patch?"

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multibyte charater set in levenshtein function

2010-07-20 Thread Itagaki Takahiro
2010/7/13 Alexander Korotkov aekorot...@gmail.com:
 Anyway I think that overhead is not ignorable. That's why I have split
 levenshtein_internal into levenshtein_internal and levenshtein_internal_mb,
 and levenshtein_less_equal_internal into levenshtein_less_equal_internal and
 levenshtein_less_equal_internal_mb.

Thank you for the good measurements! Then it's reasonable to have multiple
implementations. It also has documentation. I'll change the status of the
patch to "Ready for Committer".

The patch is good enough except argument types for some functions.
For example:
  - char* vs. const char*
  - text* vs. const char* + length
I hope committers would check whether there are better types for them.
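
For context, the function under discussion comes from the fuzzystrmatch
module; multibyte input is what exercises the separate _mb code path.
A quick illustration (the second call merely shows multibyte UTF-8 input):

  -- classic example: three edits
  SELECT levenshtein('kitten', 'sitting');
  -- multibyte input takes the multibyte-aware path
  SELECT levenshtein('größe', 'grosse');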

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] (9.1) btree_gist support for searching on not equals

2010-07-20 Thread Itagaki Takahiro
2010/7/16 Jeff Davis pg...@j-davis.com:
 I'd like to ask you to write additional documentation about btree_gist [1]
 so that the module will be more useful when used together with exclusion
 constraints. Without documentation, no users will find these usages.

| Example using an Exclusion Constraint to enforce the constraint
| that a cage at a zoo can contain only one kind of animal:

Very interesting example :-)
The patch will be applied immediately.
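
For readers curious about the quoted example, the constraint would look
roughly like this (a sketch based on the documentation snippet above; the
table and column names are illustrative, btree_gist must be installed, and
the <> opclass support is exactly what this patch adds):

  CREATE TABLE zoo (
    cage   integer,
    animal text,
    EXCLUDE USING gist (cage WITH =, animal WITH <>)
  );

  INSERT INTO zoo VALUES (1, 'zebra');  -- ok
  INSERT INTO zoo VALUES (1, 'zebra');  -- ok, same kind of animal
  INSERT INTO zoo VALUES (1, 'lion');   -- fails: different animal, same cage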

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] string_to_array has to be stable?

2010-07-20 Thread Pavel Stehule
Hello

I am working on the to_array and to_string functions and I am looking at
the string_to_array function. I am surprised that this function is marked
as immutable:

postgres=# select array_to_string(array[current_date],',');
 array_to_string
------------------
 2010-07-20
(1 row)

postgres=# set datestyle to German ;
SET
postgres=# select array_to_string(array[current_date],',');
 array_to_string
------------------
 20.07.2010
(1 row)

This is probably wrong.
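
For the record, the current markings can be checked in the catalogs
('i' = immutable, 's' = stable, 'v' = volatile):

  SELECT proname, provolatile
  FROM pg_proc
  WHERE proname IN ('string_to_array', 'array_to_string');

Since array_to_string applies the element type's output function, which
can depend on settings such as datestyle, stable looks like the correct
marking here.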

Regards

Pavel Stehule

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sql/med review - problems with patching

2010-07-20 Thread Pavel Stehule
2010/7/20 Itagaki Takahiro itagaki.takah...@gmail.com:
 2010/7/14 Pavel Stehule pavel.steh...@gmail.com:
 please, can you refresh the patch?

 Updated patch attached. The latest version is always in the git repo.
 http://repo.or.cz/w/pgsql-fdw.git   (branch: fdw)
 I'm developing the patch on postgres' git repo. So, the regression test
 for dblink might fail because of an out-of-sync issue between cvs and git.

 When I looked at the documentation I missed a tutorial for foreign
 tables. There is only a reference. I miss a paragraph that clearly and
 simply specifies what is possible now and what isn't. The enhancement
 of dblink isn't documented.

 Sure. I'll start to write documentation when we agree on the design of FDW.

 In the function pgIterate(ForeignScanState *scanstate) you iterate
 over the pg result. I think using a cursor and fetching multiple
 rows would be preferable.

 Sure, but I'm thinking that it will be improved after libpq supports
 protocol-level cursors. The libpq improvement will apply to many
 more applications, including postgresql_fdw.


Is there some time frame for this task - or a ToDo item? At a minimum it
has to be documented, because it can be an issue on larger result sets -
speed, memory usage. I am worried about the speed of queries like:

select * from large_dblink_tab limit 100;
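
For what it's worth, a batch-wise workaround is already possible today
with dblink's cursor API, so the whole result need not be materialized
(the connection string, cursor name and column list are illustrative):

  SELECT dblink_connect('conn', 'dbname=remote');
  SELECT dblink_open('conn', 'cur', 'SELECT * FROM large_dblink_tab');
  -- fetch only the first 100 rows
  SELECT * FROM dblink_fetch('conn', 'cur', 100) AS t(id int, payload text);
  SELECT dblink_close('conn', 'cur');
  SELECT dblink_disconnect('conn');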

Regards

Pavel

 --
 Itagaki Takahiro


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] string_to_array has to be stable?

2010-07-20 Thread Thom Brown
On 20 July 2010 10:31, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hello

 I am working on the to_array and to_string functions and I am looking at
 the string_to_array function. I am surprised that this function is marked
 as immutable:

 postgres=# select array_to_string(array[current_date],',');
  array_to_string
 ------------------
  2010-07-20
 (1 row)

 postgres=# set datestyle to German ;
 SET
 postgres=# select array_to_string(array[current_date],',');
  array_to_string
 ------------------
  20.07.2010
 (1 row)

 This is probably wrong.

 Regards

 Pavel Stehule


Yeah, that looks wrong.

Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Query results differ depending on operating system (using GIN)

2010-07-20 Thread Artur Dabrowski

Hello,

I have been redirected here from pg-general.

I tested full text search using a GIN index and it turned out that the
results depend on the operating system. Not all the rows are found when
executing some of the queries on a pg server installed on Win XP SP3 or
CentOS 5.4, while everything seems to be fine on Ubuntu 4.4.1.

More details and tested queries are described here:
http://old.nabble.com/Incorrect-FTS-results-with-GIN-index-ts29172750.html

I hope you can help with this weird problem.

Regards
Artur
-- 
View this message in context: 
http://old.nabble.com/Query-results-differ-depending-on-operating-system-%28using-GIN%29-tp29213082p29213082.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Query results differ depending on operating system (using GIN)

2010-07-20 Thread Thom Brown
On 20 July 2010 10:41, Artur Dabrowski a...@astec.com.pl wrote:

 Hello,

 I have been redirected here from pg-general.

 I tested full text search using a GIN index and it turned out that the
 results depend on the operating system. Not all the rows are found when
 executing some of the queries on a pg server installed on Win XP SP3 or
 CentOS 5.4, while everything seems to be fine on Ubuntu 4.4.1.

 More details and tested queries are described here:
 http://old.nabble.com/Incorrect-FTS-results-with-GIN-index-ts29172750.html

 I hope you can help with this weird problem.

 Regards
 Artur
 --

Could you please clarify what you mean by "Ubuntu 4.4.1"?

Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Query results differ depending on operating system (using GIN)

2010-07-20 Thread Artur Dabrowski

Thom,

please have a look at this message from Oleg Bartunov for details about
this operating system:
http://old.nabble.com/Incorrect-FTS-results-with-GIN-index-tc29172750.html#a29212846

Artur



Thom Brown wrote:
 
 
 Could you please clarify what you mean by Ubuntu 4.4.1?
 
 Thom
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers
 
 

-- 
View this message in context: 
http://old.nabble.com/Query-results-differ-depending-on-operating-system-%28using-GIN%29-tp29213082p29213340.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Query results differ depending on operating system (using GIN)

2010-07-20 Thread Thom Brown
On 20 July 2010 11:14, Artur Dabrowski a...@astec.com.pl wrote:

 Thom,

 please have a look at this message of Oleg Bartunov for details of this
 operating system:
 http://old.nabble.com/Incorrect-FTS-results-with-GIN-index-tc29172750.html#a29212846

 Artur


Ah, so gcc 4.4.1 on Ubuntu or a derivative thereof.

Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] standard_conforming_strings

2010-07-20 Thread Marko Kreen
On 7/20/10, Peter Eisentraut pete...@gmx.net wrote:
 On sön, 2010-07-18 at 09:42 -0700, David E. Wheeler wrote:
   On Jul 18, 2010, at 1:35 AM, Peter Eisentraut wrote:
  
I think there are two ways we can do this, seeing that most appear to be
in favor of doing it in the first place:  Either we just flip the
default, make a note in the release notes, and see what happens.  Or we
spend some time now and make, say, a list of driver versions and
application versions that work with standard_conforming_strings = on,
and then decide based on that, and also make that list a public resource
for packagers etc.
  
   Do both. Turn them on, then make a list and inform driver maintainers who 
 need to update. They've got a year, after all.


 Here we go then:
  http://wiki.postgresql.org/wiki/Standard_conforming_strings

There are two sorts of support:

1. Detect stdstr on startup and use that setting.

2. Detect online changes to stdstr and follow them.

AFAICS psycopg does not support 2).  Should we care about that?

There are probably other drivers with partial support.
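
As a reminder of what drivers have to track, the same literal is parsed
differently depending on the setting (a minimal illustration):

  SET standard_conforming_strings = off;
  SELECT 'a\nb';  -- backslash acts as an escape: the result contains a newline
  SET standard_conforming_strings = on;
  SELECT 'a\nb';  -- backslash is a literal character: the result is a\nb

A driver that only checks the setting at connection start (case 1) will
quote strings incorrectly after a mid-session SET like the one above.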

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 3:20 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Mon, 2010-07-19 at 23:40 -0400, Robert Haas wrote:

 Since it has been over a month since this review was posted and no new
 version of the patch has appeared, I think we should mark this patch
 as Returned with Feedback.

 Mark posted a new patch 6 days ago, AFAICS.

Hmm.  I can't find it in my mail, in my archives, or linked off the
CommitFest application.  Was it posted under a different subject line?
 Do you have a link to the archives?

 Not sure I see any benefit in doing as you propose anyway, or at least
 not without warning since it just confuses authors who may believe they
 have time while the commitfest is still on.

 Commitfests were created to help authors. We must continue to remember
 that 99% of patch authors are not full time and likely to find smooth
 responsiveness difficult.

 We should say "Will there be any further action on this patch?"

It isn't the purpose of CommitFests to provide patch authors with an
unlimited right to have their patches reviewed.  They exist, on the
one hand, to make sure that patches don't linger forever without
getting a review; and on the other hand, to create a defined time for
each member of the community to set aside their own work and
review/commit other people's patches.  It is important that we have
them, and it is also important that they end, so that reviews,
committers, commitfest managers, etc. can stop working on the CF at
some point and get back to their own work.  In other words,
CommitFests need to represent a balance between the needs of authors
and the needs of patch reviewers and committers.

Of course, anyone is always welcome to review a patch that has been
posted, and a committer can decide to work on a patch at any time
also.  But if patch authors are entitled to resubmit a previously
reviewed patch up until the very last CommitFest are *guaranteed* a
possible review and commit even then, then the CommitFest will not end
on time.  Even if the CommitFest does end on time, more than 50% of
the time between now and 9.1 feature freeze is CF-time - that is, time
I'm supposed to be putting aside the work I'd like to get done to help
other people get the work they'd like to do get done.  I'm not really
willing to increase that percentage much further.  I feel it's
important that we give everyone a fair shake, and I have devoted and
will continue to devote a LOT of time to making sure that happens -
but I want (and if I care to still be employed, need) some time to do
my own work, too.

To me, the definition of a fair shake is that people get 4-5 days to
respond to review comments.  This patch has had 33.  It's not unfair
to anyone to say, you know, since you didn't get around to updating
this patch for over a month, you'll need to resubmit the updated
version to the next CommitFest.  If we have the resources to review
and commit a late resubmission of some particular patch, that is
great.  But as of today, we still have 32 patches that need to be
reviewed, many of which do not have a reviewer assigned or which have
a reviewer assigned but have not yet had an initial review.  Since
there are 26 days left in the CommitFest and many of those patches
will need multiple rounds of review and much discussion before we
decide whether to commit them, send them back for rework, or reject
them outright, that's pretty scary.  To me, that's where we should be
focusing our effort.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: preload dictionary new version

2010-07-20 Thread Pavel Stehule
Hello

2010/7/20 Itagaki Takahiro itagaki.takah...@gmail.com:
 2010/7/14 Pavel Stehule pavel.steh...@gmail.com:
 this patch significantly reduces the original patch. It doesn't propose
 a simple allocator - it just eliminates the high memory usage for the
 ispell dictionary.

 I don't think introducing new methods is a good idea. If you want a
 simple allocator, MemoryContextMethods layer seems better for me.


can you explain it, please?

 The original purpose of the patch -- preloading dictionaries --
 also needs to be redesigned with a precompiler approach.
 I'll move the proposal to "Returned with Feedback" status.


ok.

thank you very much

Pavel Stehule

 --
 Itagaki Takahiro


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread Simon Riggs
On Tue, 2010-07-20 at 07:06 -0400, Robert Haas wrote:
 To me, the definition of a fair shake is that people get 4-5 days to
 respond to review comments.  This patch has had 33.  It's not unfair
 to anyone to say, you know, since you didn't get around to updating
 this patch for over a month, you'll need to resubmit the updated
 version to the next CommitFest.  If we have the resources to review
 and commit a late resubmission of some particular patch, that is
 great.  But as of today, we still have 32 patches that need to be
 reviewed, many of which do not have a reviewer assigned or which have
 a reviewer assigned but have not yet had an initial review.  Since
 there are 26 days left in the CommitFest and many of those patches
 will need multiple rounds of review and much discussion before we
 decide whether to commit them, send them back for rework, or reject
 them outright, that's pretty scary.  To me, that's where we should be
 focusing our effort. 

So focus your effort by leaving this alone until the end of the CF.
Actively terminating things early doesn't help at all with the review
work you mention above, but it looks good if we are measuring cases
resolved per day. Are we measuring that? If so, why? Who cares?

Just leave them, and if no action at end of CF, boot them then. Saves
loads of time on chasing up on people, interpreting what they say,
worrying about it and minor admin.

Closing early gains us nothing, though might close the door on useful
work in progress. Net expected benefit is negative from acting early.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: to_string, to_array functions

2010-07-20 Thread Pavel Stehule
Hello

here is a new version - now these functions are not strict, and the
function to_string is marked as stable.

Both functions share code with the older version.
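
For clarity, the agreed NULL-handling semantics from the discussion
quoted below would behave like this (illustrative only: to_array and
to_string are the functions from this patch, not yet in core, and the
results shown are my reading of the agreement):

  SELECT to_array('a,b,c', NULL);          -- separator NULL: {"a,b,c"}
  SELECT to_string(ARRAY['a','b'], NULL);  -- separator NULL: raises an error
  SELECT to_array(NULL, ',');              -- first argument NULL: returns NULL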

Regards

Pavel

2010/7/16 Brendan Jurd dire...@gmail.com:
 On 17 July 2010 04:52, Pavel Stehule pavel.steh...@gmail.com wrote:
 2010/7/16 Brendan Jurd dire...@gmail.com:
 Also, if we're going to make the function non-strict, we need to
 consider how to respond when the user specifies NULL for the other
 arguments.  If the field separator is NULL, bearing in mind that NULL
 can't match any string, I would expect that to_array would return the
 undivided string as a single array element, and that to_string would
 throw an error:


 ok, that makes sense.

 Another question is an empty string as the separator - but I think it can
 have the same behaviour as the string_to_array and array_to_string functions.


 Agreed.  Those behaviours seem sensible.

 If the first argument is NULL for either function, I think it would be
 reasonable to return NULL.  But I could be convinced that we should
 throw an error in that case too.


 I agree - I prefer a NULL

 Thank You very much

 No worries; I will await a revised patch from you which updates these
 behaviours -- please incorporate the doc/comment changes I posted
 earlier -- I will then do a further review before handing off to a
 committer.

 Cheers,
 BJ



arraytext.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] lock_timeout GUC patch - Review

2010-07-20 Thread Marc Cousin
Hi, I've been reviewing this patch for the last few days. Here it is:

* Submission review
  * Is the patch in context diff format?
Yes

  * Does it apply cleanly to the current CVS HEAD?
Yes

  * Does it include reasonable tests, necessary doc patches, etc?
Doc patches are there.
There are no regression tests. Should there be?

* Usability review
  * Does the patch actually implement that?
Yes

  * Do we want that?
I think so. At least I'd like to have this feature :)

  * Do we already have it?
No

  * Does it follow SQL spec, or the community-agreed behavior?
I didn't see a clear conclusion from the -hackers thread on this (GUC
vs SQL statement extension)

  * Does it include pg_dump support (if applicable)?
Not applicable. Or should pg_dump and/or pg_restore set lock_timeout to 0?

  * Are there dangers?
As it is a GUC, it could be set globally; is that a danger?

  * Have all the bases been covered?
I honestly don't know. It touches alarm signal handling.

* Apply the patch, compile it and test:
  * Does the feature work as advertised?
I only tested it with Linux. The code is very OS-dependent. It works
as advertised with Linux. I found only one corner case (see below)

  * Are there corner cases the author has failed to consider?
The feature almost works as advertised: it fails when lock_timeout =
deadlock_timeout. Then the lock timeout isn't detected. I think this
case isn't considered in handle_sig_alarm(). (See the usage sketch at
the end of this review.)

  * Are there any assertion failures or crashes?
No


* Performance review
  * Does the patch slow down simple tests?
No

  * If it claims to improve performance, does it?
Not applicable

* Does it slow down other things?
No. Maybe alarm signal handling and enabling will be slower, as there
is more work done there (for instance, a GetCurrentTimestamp call that
until now was only done when log_lock_waits was activated). But I
couldn't measure a slowdown.

* Read the changes to the code in detail and consider:
  * Does it follow the project coding guidelines?
I think so

  * Are there portability issues?
It seems to have already been addressed, per the previous discussion
in the thread.

  * Will it work on Windows/BSD etc?
It should. I couldn't test it though. Infrastructure is there.

* Are the comments sufficient and accurate?
Yes

* Does it do what it says, correctly?
Yes

* Does it produce compiler warnings?
No

* Can you make it crash?
No

* Consider the changes to the code in the context of the project as a whole:
  * Is everything done in a way that fits together coherently with
other features/modules?
I have a feeling that
enable_sig_alarm/enable_sig_alarm_for_lock_timeout tries to solve a
very specific problem, and it gets complicated because there is no
infrastructure in the code to handle several timeouts at the same time
with sigalarm, so each timeout has its own dedicated and intertwined
code. But I'm still discovering this part of the code.

  * Are there interdependencies that can cause problems?
I don't think so.
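
To make the reviewed behaviour concrete, intended usage looks roughly
like this (a sketch only; the GUC name and millisecond units follow the
patch under review and are not committed syntax):

  SET lock_timeout = 2000;               -- give up waiting for a lock after 2s
  BEGIN;
  SELECT * FROM some_table FOR UPDATE;   -- errors out if blocked longer than 2s
  COMMIT;

Note the corner case found above: with lock_timeout equal to
deadlock_timeout (1000 ms by default), the timeout is not detected.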

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 7:41 AM, Simon Riggs si...@2ndquadrant.com wrote:
 So focus your effort by leaving this alone until the end of the CF.
 Actively terminating things early doesn't help at all with the review
 work you mention above, but it looks good if we are measuring cases
 resolved per day. Are we measuring that? If so, why? Who cares?

We don't formally measure that, but yeah, I definitely keep an eye on
it.  I've found that if you don't keep a sharp eye on that, you end up
not done when the CommitFest is supposed to be over.  I'd much rather
boot patches with reasonable justification throughout the CommitFest
than boot everything at the end whether there's a justification or
not.

 Closing early gains us nothing, though might close the door on useful
 work in progress.

IMHO, closing early LOSES us nothing.  People are free to work on
their patches whenever they'd like, and hopefully will.  But
pretending we're going to review them all no matter when they get
resubmitted just makes people grumpy when they find out that we're not
magical and can't.  A further point is that it's very difficult to
keep track of progress if the CF page reflects a whole bunch of
supposedly "Waiting on Author" patches that are really quite
thoroughly dead.

On the other hand, if this patch was really resubmitted already and I
missed it, as you suggested, that's a whole different situation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread Simon Riggs
On Tue, 2010-07-20 at 07:49 -0400, Robert Haas wrote:
 A further point is that it's very difficult to
 keep track of progress if the CF page reflects a whole bunch of
 supposedly "Waiting on Author" patches that are really quite
 thoroughly dead. 

True, but the point under discussion is what to do if no reply is
received from an author. That is something entirely different from a
patch hitting a brick wall.

We gain nothing by moving early on author-delay situations, so I suggest
we don't.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: psql: edit function, show function commands patch

2010-07-20 Thread Pavel Stehule
Hello

2010/7/16 Jan Urbański wulc...@wulczer.org:
 Hi,

 here's a review of the \sf and \ef [num] patch from
 http://archives.postgresql.org/message-id/162867791003290927y3ca44051p80e697bc6b19d...@mail.gmail.com

 == Formatting ==

 The patch has some small tabs/spaces and whitespace issues and it applies
 with some offsets; I ran pgindent and rebased against HEAD, attaching the
 resulting patch for your convenience.

 == Functionality ==

 The patch adds the following features:
  * \e file.txt num - starts an editor for the current query buffer and
 puts the cursor on line [num]
  * \ef func num - starts an editor for a function and puts the cursor on
 line [num]
  * \sf func - shows the full CREATE FUNCTION statement for the function
  * \sf+ func - the same, but with line numbers
  * \sf[+] func num - the same, but only from line num onward
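
(For concreteness, typical invocations of the new commands would look
like this - the function and file names are illustrative:

  \sf+ my_func        [print the numbered definition of my_func]
  \ef my_func 42      [open the editor on line 42 of my_func]
  \e script.sql 10    [open the editor on line 10 of the file]
)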

 It only touches psql, so no performance or backend stability worries.

 In my humble opinion, only the \sf[+] is interesting, because it gives you a
 copy/pasteable version of the function definition without opening up an
 editor, and I can find that useful (OTOH: you can set PSQL_EDITOR to cat and
 get the same effect with \ef... ok, just joking). Line numbers are an extra
 touch, personally it does not thrill me too much, but I've nothing against
 it.

 The number variants of \e and \ef work by simply executing $EDITOR +num
 file. I tried with some editors that came to my mind, and not all of them
 support it (most do, though):

  * emacs and emacsclient work
  * vi works
  * nano works
  * pico works
  * mcedit works
  * kwrite does not work
  * kedit does not work

 not sure what other people (or for instance Windows people) use. Apart from
 no universal support from editors, it does not save that many keystrokes -
 at most a couple. In the end you can usually easily jump to the line you
 want once you are inside your dream editor.


I disagree. When I work on my customers' servers there is some default
configuration - the default editor is usually vi or vim. I cannot change
to my preferred editor there, and \ef n can help me very much (I know
only one command for vi - :q :)). I looked at KDE; there the usual
parameter is --line.

I propose the following solution - add a system variable
PSQL_EDITOR_GOTOLINE_COMMAND that will contain the command for editors
without the general +n navigation.

So you can set PSQL_EDITOR_GOTOLINE_COMMAND='--line %d',

and this string will be used only when it is not empty; there is no default.


 My recommendation would be to only integrate the \sf[+] part of the patch,
 which will have the additional benefit of making it much smaller and cleaner
 (will avoid the grotty splitting of the number from the function name, for
 instance). But I'm just another user out there, maybe others will find uses
 for the other cases.

 I would personally not add the leading and trailing newlines to \sf output,
 but that's a question of taste.

Maybe it is better to use a title - so the source code will use a format
like a standard result set.

I'll look into it.


 Docs could use some small grammar fixes, but other than that they're fine.

 == Code ==

 In the \sf code there is just a strncmp, so this works:
 \sfblablabla funcname

Will be fixed.


 The error for an empty \sf is not great, it should probably look more like
 \sf: missing required argument
 following the examples of \pset, \copy or \prompt.

 Why is lnptr always being passed as a pointer? Looks like an unnecessary
 complication and one more variable to care about. Can't we just pass lineno?


Because I did not want to use a magic constant: when lnptr is NULL,
lineno is undefined.

Thank you very much

Pavel Stehule

 == End ==

 Cheers,
 Jan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 8:21 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Tue, 2010-07-20 at 07:49 -0400, Robert Haas wrote:
 A further point is that it's very difficult to
 keep track of progress if the CF page reflects a whole bunch of
 supposedly "Waiting on Author" patches that are really quite
 thoroughly dead.

 True, but the point under discussion is what to do if no reply is
 received from an author. That is something entirely different from a
 patch hitting a brick wall.

 We gain nothing by moving early on author-delay situations, so I suggest
 we don't.

No, we gain something quite specific and tangible, namely, the
expectation that patch authors will stay on top of their patches if
they want them reviewed by the community.  If that expectation doesn't
seem important to you, feel free to try running a CommitFest without
it.  If you can make it work, I'll happily sign on.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Some git conversion issues

2010-07-20 Thread Magnus Hagander
Working on the git conversion with keywords, and I've noticed a couple of
strange things that don't come up the same way in git. All of these are in
non-code files, but they do defeat the identical tarball mode.

For example, a number of files have commits showing up in cvs with nothing at
all changed. This triggered an update of the keywords only, with no contents
changed.

For example, look at:
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/Attic/FAQ_AIX.diff?r1=1.19.2.9;r2=1.19.2.10
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/Attic/FAQ_HPUX.diff?r1=1.14.2.9;r2=1.14.2.10
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/Attic/FAQ_Solaris.diff?r1=1.22.2.10;r2=1.22.2.11

Some show up as completely empty, like:
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/generate_history.pl.diff?r1=1.1;r2=1.1.2.1


Does anybody know how this can even happen? Wouldn't cvs normally just
not commit the file if there are no changes?



I'm also seeing some entries tagged with "vendor branch", such as:
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/storage/smgr/README
revision 1.1.1.1

Same issue there, the file comes out on the other end with the wrong keyword (in
this case, listed as 1.1).

I admit I don't even know what a vendor branch is, except I know I've
been annoyed by them before :-)


Other than keywords, the contents of every branch and every tag are correct. I
haven't checked any arbitrary revisions (that's really hard to do with cvs..),
but I have a script that checks every branch and every tag present in
configure, configure.in and README (in the root dir), which should hopefully
cover everything. Does anybody know a better way to find out all tags and
branches that exist, or some other file I should also be looking at?


The full list of files having keyword issues due to these two things (AFAICT so
far, these are the only issues) is:
doc/FAQ_AIX
doc/src/sgml/generate_history.pl
doc/src/sgml/release-old.sgml
src/backend/catalog/README
src/backend/nodes/README
src/backend/po/es.po
src/backend/storage/ipc/README
src/backend/storage/smgr/README
src/bin/initdb/po/de.po
src/bin/initdb/po/fr.po
src/bin/pg_config/po/de.po
src/bin/pg_config/po/fr.po
src/bin/pg_controldata/po/de.po
src/bin/pg_controldata/po/es.po
src/bin/pg_controldata/po/fr.po
src/bin/pg_ctl/po/de.po
src/bin/pg_dump/po/de.po
src/bin/pg_dump/po/fr.po
src/bin/pg_dump/po/sv.po
src/bin/pg_resetxlog/po/de.po
src/bin/psql/po/de.po
src/bin/psql/po/es.po
src/bin/psql/po/fr.po
src/bin/psql/po/sv.po
src/bin/scripts/po/de.po
src/bin/scripts/po/fr.po
src/interfaces/libpq/po/de.po
src/interfaces/libpq/po/fr.po
src/interfaces/libpq/po/sv.po



Any thoughts or ideas on how to fix these? Or should we accept and ignore
those differences?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 we gain something quite specific and tangible, namely, the
 expectation that patch authors will stay on top of their patches
 if they want them reviewed by the community.
 
Barring some operational emergency here, I'll be reviewing the
status of all the open patches in the CF today.  If I can't find any
new posting by the author for the patch in question, I'll mark it
"Returned With Feedback".  I'll probably be cracking the whip on a few
others, one way or another.  If anyone wonders why I don't do so for
certain patches, it will probably be because I've had off-list
messages about needing more time to do a proper review or being in
transit and unable to do more than post short emails at the moment.
 
I do request that all authors and reviewers make an effort to keep
the CF app page up-to-date.  If you're not sure all recent patches,
reviews, and significant comment posts are reflected in the app for
a patch for which you're an author or reviewer, please check as soon
as possible and make it right; it'll save time for me and will help
keep the process moving smoothly.
 
Thanks, all.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some git conversion issues

2010-07-20 Thread Kevin Grittner
Magnus Hagander mag...@hagander.net wrote:
 
 I'm also seeing some entries tagged with "vendor branch", such as:

http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/storage/smgr/README
 revision 1.1.1.1
 
 Same issue there, the file comes out on the other end with the
 wrong keyword  (in this case, listed as 1.1).
 
 I admit I don't even know what a vendor branch is, except I know
 I've been annoyed by them before :-)
 
I believe revision 1.1.1.1 is normally seen only for a file brought in
through the cvs import command.  "vendor branch" would make some
sense as a commit message for an import.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread Simon Riggs
On Tue, 2010-07-20 at 09:05 -0400, Robert Haas wrote:
 On Tue, Jul 20, 2010 at 8:21 AM, Simon Riggs si...@2ndquadrant.com wrote:
  On Tue, 2010-07-20 at 07:49 -0400, Robert Haas wrote:
  A further point is that it's very difficult to
  keep track of progress if the CF page reflects a whole bunch of
  supposedly "Waiting on Author" patches that are really quite
  thoroughly dead.
 
  True, but the point under discussion is what to do if no reply is
  received from an author. That is something entirely different from a
  patch hitting a brick wall.
 
  We gain nothing by moving early on author-delay situations, so I suggest
  we don't.
 
 No, we gain something quite specific and tangible, namely, the
 expectation that patch authors will stay on top of their patches if
 they want them reviewed by the community.  If that expectation doesn't
 seem important to you, feel free to try running a CommitFest without
 it.  If you can make it work, I'll happily sign on.

I don't think so. We can assume people wrote a patch because they want
it included in Postgres. Bumping them doesn't help them or us, since
there is always an issue other than wish-to-complete. Not everybody is
able to commit time in the way we do and we should respect that better.

Authors frequently have to wait a long time for a review; why should
reviewers not be as patient as authors must be?

We should be giving authors as much leeway as possible, or they may not
come back.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread Kevin Grittner
Simon Riggs si...@2ndquadrant.com wrote:
 
 I don't think so. We can assume people wrote a patch because they
 want it included in Postgres. Bumping them doesn't help them or
 us, since there is always an issue other than wish-to-complete.
 Not everybody is able to commit time in the way we do and we
 should respect that better.
 
Sure.  If people mail me off-list about needing more time, I'm
willing to accommodate, within reason.
 
 Authors frequently have to wait a long time for a review; why
 should reviewers not be as patient as authors must be?
 
Let's keep this in perspective.  We're talking about pushing review
to less than two months away because of lack of author response for
over a month.  And should a patch appear before then, there's
nothing that says an interested member of the community can't review
it before the next CF.  You, for example, would be free to review it
at any time a patch might appear.
 
 We should be giving authors as much leeway as possible, or they
 may not come back.
 
One phenomenon I've noticed is that sometimes a patch is submitted
because an end user has solved their own problem for themselves, but
wishes to share the solution with the community.  They're not always
motivated to go to the lengths required to polish it up to the
standard required for inclusion in core.  In such cases, unless
someone with the time to do so finds it interesting enough to pick
up, it is just going to drop.  I hope such authors feel comfortable
submitting their next effort, as it might be something which
interests a larger audience than the previous effort.  We should do
what we can to ensure that they understand the dynamics of that.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some git conversion issues

2010-07-20 Thread Magnus Hagander
On Tue, Jul 20, 2010 at 15:31, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Magnus Hagander mag...@hagander.net wrote:

 I'm also seeing some entries tagged with vendor branch, such as:

 http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/storage/smgr/README
 revision 1.1.1.1

 Same issue there, the file comes out on the other end with the
 wrong keyword  (in this case, listed as 1.1).

 I admit I don't even know what a vendor branch is, except I know
 I've been annoyed by them before :-)

 I believe revision 1.1.1.1 is normally seen only for file brought in
 through the cvs import command.  vendor branch would make some
 sense as a commit message for an import.

Yeah, something like that. But why, for the file above, do we have one
initial revision and one vendor branch, whereas for other files we
don't? (and there's no difference between them)

Or rather, we do have two, for example for md.c - but the second one is
not listed as being on a vendor branch.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 9:23 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:

 we gain something quite specific and tangible, namely, the
 expectation that patch authors will stay on top of their patches
 if they want them reviewed by the community.

 Barring some operational emergency here, I'll be reviewing the
 status of all the open patches in the CF today.  If I can't find any
 new posting by the author for the patch in question, I'll mark it
 "Returned With Feedback".  I'll probably be cracking the whip on a few
 others, one way or another.  If anyone wonders why I don't do so for
 certain patches, it will probably be because I've had off-list
 messages about needing more time to do a proper review or being in
 transit and unable to do more than post short emails at the moment.

 I do request that all authors and reviewers make an effort to keep
 the CF app page up-to-date.  If you're not sure all recent patches,
 reviews, and significant comment posts are reflected in the app for
 a patch for which you're an author or reviewer, please check as soon
 as possible and make it right; it'll save time for me and will help
 keep the process moving smoothly.

Thanks, I appreciate it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread David Christensen

On Jul 19, 2010, at 10:40 PM, Robert Haas wrote:

 On Wed, Jun 23, 2010 at 9:22 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Jun 23, 2010 at 9:17 AM, gabrielle gor...@gmail.com wrote:
 On Mon, Jun 21, 2010 at 6:16 PM, Robert Haas robertmh...@gmail.com wrote:
 Well, that might be a good idea, too, but my expectation is that:
 
 psql -f one -f two -f three
 
 ought to behave in a manner fairly similar to:
 
 cat one two three  all
 psql -f all
 
 and it sounds like with this patch that's far from being the case.
 
 Correct.  Each is handled individually.
 
 Should I continue to check on ON_ERROR_ROLLBACK results, or bounce
 this back to the author?
 
 It doesn't hurt to continue to review and find other problems so that
 the author can try to fix them all at once, but it certainly seems
 clear that it's not ready to commit at this point, so it definitely
 needs to go back to the author for a rework.
 
 Since it has been over a month since this review was posted and no new
 version of the patch has appeared, I think we should mark this patch
 as Returned with Feedback.


Sorry for the delays in response.  This is fine; I think there are some 
semantic questions that should still be resolved at this point, particularly if 
we're moving toward supporting multiple -c and -f lines as expressed (an idea 
that I agree would be useful).  Specifically:

1) Does the -1 flag with multiple files indicate a single transaction for all 
commands/files or that each command/file is run in its own transaction?  I'd 
implemented this with the intent that all files were run in a single 
transaction, but it's at least a bit ambiguous, and should probably be 
documented at the very least.

2) I had a question (expressed in the comments) about how the final error 
handling status should be reported/handled.  I can see this affecting a number 
of things, particularly ON_ERROR_{STOP,ROLLBACK} behavior; specifically, if 
there was an error, it sounds like it should just abort processing of any other 
queued files/commands at this point (in the case of ON_ERROR_STOP, at least).

3) With the switch to multiple intermixed -c and -f flags, the internal 
representation will essentially have to change to a queue of structs; I think 
in that case, we'd want to modify the current .psqlrc handling to push a struct 
representing the .psqlrc file at the front of the queue, depending on the code 
paths that currently set this up.  Are there any gotchas to this approach?  
(I'm looking essentially for odd code paths where say .psqlrc was not loaded 
before, but now would be given the proper input of -c, -f file, -f -.)

I'll look more in-depth at the posted feedback and Mark's proposed patch.

Regards,

David
--
David Christensen
End Point Corporation
da...@endpoint.com





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some git conversion issues

2010-07-20 Thread Kevin Grittner
Magnus Hagander mag...@hagander.net wrote:
 
 I believe revision 1.1.1.1 is normally seen only for file brought
 in through the cvs import command.  vendor branch would make
 some sense as a commit message for an import.
 
 Yeah, something like that. But why, for the file above, do we have
 one initial revision and one vendor branch, whereas for other
 files we don't? (and there's no difference between them)
 
Hmmm...  A quick check of CVS documentation indicates that 1.1.1 is
reserved for the vendor branch created by an import.  New vendor
versions can be imported and will bump 1.1.1.1 to 1.1.1.2 and so on.
When you commit a modification to a vendor branch file, it goes
onto the trunk.  So, either there was something which *looked* like
a modification to CVS which got massaged out (whitespace or some
such), or someone independently imported the file and someone
(possibly a different someone) committed the file independently of
the import.  I think the latter is probably more likely, but the
former seems within the realm of possibility.
 
Where they are identical, unless the imported version is included in
any tag or PostgreSQL branch, I would eliminate it and keep the
normal copy.  Short of maintaining a coherent history, I don't see
the point of keeping two separate but identical revisions of a file.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 10:00 AM, David Christensen da...@endpoint.com wrote:
 Sorry for the delays in response.  This is fine; I think there are some 
 semantic questions that should still be resolved at this point, particularly 
 if we're moving toward supporting multiple -c and -f lines as expressed (an 
 idea that I agree would be useful).  Specifically:

 1) Does the -1 flag with multiple files indicate a single transaction for all 
 commands/files or that each command/file is run in its own transaction?  I'd 
 implemented this with the intent that all files were run in a single 
 transaction, but it's at least a bit ambiguous, and should probably be 
 documented at the very least.

I think your implementation is right.  Documentation is always good.

 2) I had a question (expressed in the comments) about how the final error 
 handling status should be reported/handled.  I can see this affecting a 
 number of things, particularly ON_ERROR_{STOP,ROLLBACK} behavior; 
 specifically, if there was an error, it sounds like it should just abort 
 processing of any other queued files/commands at this point (in the case of 
 ON_ERROR_STOP, at least).

Right.  I think it should behave much as if you concatenated the files
and then ran psql on the result.  Except with better reporting of
error locations, etc.

 3) With the switch to multiple intermixed -c and -f flags, the internal 
 representation will essentially have to change to a queue of structs; I think 
 in that case, we'd want to modify the current .psqlrc handling to push a 
 struct representing the .psqlrc file at the front of the queue, depending on 
 the code paths that currently set this up.  Are there any gotchas to this 
 approach?  (I'm looking essentially for odd code paths where say .psqlrc was 
 not loaded before, but now would be given the proper input of -c, -f file, -f 
 -.)

 I'll look more in-depth at the posted feedback and Mark's proposed patch.

Well, IIRC, one of -c and -f suppresses psqlrc, and the other does
not.  This doesn't seem very consistent to me, but I'm not sure
there's much to be done about it at this point.  I guess if you use
whichever one suppresses psqlrc even once, it's suppressed, and
otherwise it's not.  :-(

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql \conninfo command (was: Patch: psql \whoami option)

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 12:16 AM, David Christensen da...@endpoint.com wrote:
 On Jul 19, 2010, at 11:10 PM, Robert Haas wrote:

 On Tue, Jul 20, 2010 at 12:07 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jul 20, 2010 at 12:02 AM, David Christensen da...@endpoint.com 
 wrote:
 I would propose to print instead:

 You are connected to database "rhaas" via local socket as user "rhaas".


 One minor quibble here; you lose the ability to see which pg instance 
 you're running on if there are multiple ones running on different local 
 sockets, so maybe either the port or the socket path should show up here 
 still.

 Doh.  Will fix.

 Something like the attached?


 Looks good to me.

OK, committed.
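
For anyone following along, the new command is used like this (the
output wording below follows the proposal upthread and may differ
slightly from what was finally committed):

  postgres=# \conninfo
  You are connected to database "rhaas" via local socket as user "rhaas".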

Thanks for the patch!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Query results differ depending on operating system (using GIN)

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 5:41 AM, Artur Dabrowski a...@astec.com.pl wrote:
 I have been redirected here from pg-general.

 I tested full text search using a GIN index and it turned out that the
 results depend on the operating system. Not all the rows are found when
 executing some of the queries on a pg server installed on Win XP SP3 or
 CentOS 5.4, while everything seems to be fine on Ubuntu 4.4.1.

 More details and tested queries are described here:
 http://old.nabble.com/Incorrect-FTS-results-with-GIN-index-ts29172750.html

 I hope you can help with this weird problem.

This seems like it's definitely a bug, but I don't know much about the
GIN code.  Copying Oleg and Teodor...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Query optimization problem

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 1:57 AM, Zotov zo...@oe-it.ru wrote:
 I wrote to
   pgsql-b...@postgresql.org
 and they told me to write to
   pgsql-performa...@postgresql.org
 and they told me to write here.

 I don't want to know how to optimize the query myself (I know how); I
 think the planner should do it.

According to the EXPLAIN ANALYZE output, your "slow" query is
executing in 0.007 ms, and your "fast" query is executing in 0.026 ms
(i.e. not as quickly as the "slow" query).  Since you mention that it
takes 7 s further down, I suspect this is not the real EXPLAIN ANALYZE
output on the real data that you're having a problem with.  You might
have better luck if you post the actual EXPLAIN ANALYZE output here.
Incidentally, sorry for not responding sooner to your private email -
I was on vacation last week.  But please do keep all replies on-list
so that everyone can comment.

All that having been said, I think the issue here is that the query
planner isn't inferring that d1.ID=some constant implies d2.BasedOn=some
constant, even though there's a join clause d2.BasedOn=d1.ID.  I'm not
really sure why it isn't doing that...  I suspect Tom Lane is the only
person who can comment intelligently on that, and he's away this week
(but if anyone else has an idea, feel free to jump in...).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sql/med review - problems with patching

2010-07-20 Thread David Fetter
On Tue, Jul 20, 2010 at 11:40:18AM +0200, Pavel Stehule wrote:
 2010/7/20 Itagaki Takahiro itagaki.takah...@gmail.com:
  2010/7/14 Pavel Stehule pavel.steh...@gmail.com:
  please, can you refresh the patch?
 
  Updated patch attached. The latest version is always in the git
  repo.  http://repo.or.cz/w/pgsql-fdw.git   (branch: fdw) I'm
  developing the patch on postgres' git repo. So, regression test
  for dblink might fail because of out-of-sync issue between cvs and
  git.
 
  When I looked at the documentation I missed a tutorial for foreign
  tables. There is only a reference. I miss a paragraph where it is
  clearly and simply specified what is possible now and what isn't.
  The enhancement of dblink isn't documented.
 
  Sure. I'll start to write documentation when we agree the design
  of FDW.
 
  In the function pgIterate(ForeignScanState *scanstate) you iterate
  over the pg result. I think using a cursor and fetching multiple
  rows would be preferable.
 
  Sure, but I'm thinking that it will be improved after libpq
  supports a protocol-level cursor. The libpq improvement will apply
  to many more applications, including postgresql_fdw.
 
 
 is there some time frame for this task - or a ToDo entry? At a minimum
 it has to be documented, because it can be an issue on larger sets -
 speed, memory usage. I am afraid about the speed of queries like
 
 select * from large_dblink_tab limit 100;

The general issue of passing qualifiers to the remote data source is
complex, especially when the DML for that data source is different
from PostgreSQL's DML.

Do you have some ideas as to how to solve this problem in general?  In
this case?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Solaris Sparc - dblink regression test failure

2010-07-20 Thread Dave Page
Continuing my fun setting up some new Solaris buildfarm critters, I'm
seeing this failure in the dblink test, on 9.0 and 9.1:

gmake[1]: Leaving directory `/export/home/dpage/postgresql/contrib/cube'
gmake[1]: Entering directory `/export/home/dpage/postgresql/contrib/dblink'
gmake -C ../../src/test/regress pg_regress
gmake[2]: Entering directory `/export/home/dpage/postgresql/src/test/regress'
gmake[2]: `pg_regress' is up to date.
gmake[2]: Leaving directory `/export/home/dpage/postgresql/src/test/regress'
../../src/test/regress/pg_regress --inputdir=.
--psqldir=/tmp/pgsql/bin --dbname=contrib_regression dblink
(using postmaster on Unix socket, default port)
== dropping database contrib_regression ==
DROP DATABASE
== creating database contrib_regression ==
CREATE DATABASE
ALTER DATABASE
== running regression test queries==
test dblink   ... ERROR:  invalid attribute number 4
STATEMENT:  SELECT dblink_build_sql_insert('foo','1 2 3 4',4,'{0,
a, {a0,b0,c0}}','{99, xyz, {za0,zb0,zc0}}');
ERROR:  invalid attribute number 4
STATEMENT:  SELECT dblink_build_sql_update('foo','1 2 3 4',4,'{0,
a, {a0,b0,c0}}','{99, xyz, {za0,zb0,zc0}}');
ERROR:  invalid attribute number 4
STATEMENT:  SELECT dblink_build_sql_delete('foo','1 2 3 4',4,'{0,
a, {a0,b0,c0}}');
ERROR:  connection not available
STATEMENT:  SELECT *
FROM dblink('SELECT * FROM foo') AS t(a int, b text, c text[])
WHERE t.a > 7;
ERROR:  relation foobar does not exist at character 49
STATEMENT:  DECLARE rmt_foo_cursor CURSOR FOR SELECT * FROM foobar
ERROR:  cursor rmt_foobar_cursor does not exist
STATEMENT:  FETCH 4 FROM rmt_foobar_cursor
ERROR:  cursor rmt_foobar_cursor does not exist
STATEMENT:  CLOSE rmt_foobar_cursor
ERROR:  cursor rmt_foo_cursor does not exist
STATEMENT:  FETCH 4 FROM rmt_foo_cursor
ERROR:  cursor rmt_foo_cursor does not exist
CONTEXT:  Error occurred on dblink connection named unnamed: could
not fetch from cursor.
STATEMENT:  SELECT *
FROM dblink_fetch('rmt_foo_cursor',4) AS t(a int, b text, c text[]);
ERROR:  cursor rmt_foo_cursor does not exist
STATEMENT:  FETCH 4 FROM rmt_foo_cursor
ERROR:  connection not available
STATEMENT:  SELECT *
FROM dblink('SELECT * FROM foo') AS t(a int, b text, c text[])
WHERE t.a > 7;
ERROR:  relation foobar does not exist at character 15
STATEMENT:  SELECT * FROM foobar
ERROR:  relation foobar does not exist at character 8
STATEMENT:  UPDATE foobar SET f3[2] = 'b99' WHERE f1 = 11
ERROR:  could not establish connection
DETAIL:  missing "=" after "myconn" in connection info string

STATEMENT:  SELECT *
FROM dblink('myconn','SELECT * FROM foo') AS t(a int, b text, c text[])
WHERE t.a > 7;
ERROR:  relation foobar does not exist at character 15
STATEMENT:  SELECT * FROM foobar
ERROR:  duplicate connection name
STATEMENT:  SELECT dblink_connect('myconn','dbname=contrib_regression');
ERROR:  relation foobar does not exist at character 49
STATEMENT:  DECLARE rmt_foo_cursor CURSOR FOR SELECT * FROM foobar
ERROR:  DECLARE CURSOR can only be used in transaction blocks
STATEMENT:  DECLARE xact_test CURSOR FOR SELECT * FROM foo
ERROR:  DECLARE CURSOR can only be used in transaction blocks
CONTEXT:  Error occurred on dblink connection named unnamed: could
not execute command.
STATEMENT:  SELECT dblink_exec('myconn','DECLARE xact_test CURSOR FOR
SELECT * FROM foo');
NOTICE:  there is no transaction in progress
ERROR:  cursor rmt_foobar_cursor does not exist
STATEMENT:  FETCH 4 FROM rmt_foobar_cursor
ERROR:  cursor rmt_foo_cursor does not exist
STATEMENT:  FETCH 4 FROM rmt_foo_cursor
ERROR:  cursor rmt_foo_cursor does not exist
CONTEXT:  Error occurred on dblink connection named myconn: could
not fetch from cursor.
STATEMENT:  SELECT *
FROM dblink_fetch('myconn','rmt_foo_cursor',4) AS t(a int, b
text, c text[]);
ERROR:  could not establish connection
DETAIL:  missing "=" after "myconn" in connection info string

STATEMENT:  SELECT *
FROM dblink('myconn','SELECT * FROM foo') AS t(a int, b text, c text[])
WHERE t.a > 7;
ERROR:  connection myconn not available
STATEMENT:  SELECT dblink_disconnect('myconn');
ERROR:  canceling statement due to user request
STATEMENT:  select * from foo where f1  3
ERROR:  password is required
DETAIL:  Non-superusers must provide a password in the connection string.
STATEMENT:  SELECT dblink_connect('myconn', 'fdtest');
LOG:  unexpected EOF on client connection
ERROR:  source row not found
STATEMENT:  SELECT dblink_build_sql_insert('test_dropped', '2', 1,
   ARRAY['1'::TEXT], ARRAY['2'::TEXT]);
ERROR:  source row not found
STATEMENT:  SELECT dblink_build_sql_update('test_dropped', '2', 1,
   ARRAY['1'::TEXT], ARRAY['2'::TEXT]);
FAILED

==
 1 of 1 tests failed.
==

The differences that caused some tests to fail can be viewed in 

Re: [HACKERS] Query optimization problem

2010-07-20 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 All that having been said, I think the issue here is that the query
 planner isn't inferring that d1.ID=some constant implies d2.ID=some
 constant, even though there's a join clause d1.ID=d2.ID.  

I think that's what the Equivalence Classes are for. Or at least that's
what they do in my head, not forcibly in the code.

The specific diff between the two queries is :

   JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
- WHERE (d1.ID=234409763) or (d2.ID=234409763)
+ WHERE (d2.BasedOn=234409763) or (d2.ID=234409763)

So the OP would appreciate that the planner is able to consider applying
the restriction on d2.BasedOn rather than d1.ID given that d2.BasedOn is
the same thing as d1.ID, from the JOIN.

I have no idea if Equivalence Classes are where to look for this, and if
they're meant to extend up to there, and if that's something possible or
wise to implement, though.

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread David Christensen

On Jul 20, 2010, at 9:05 AM, Robert Haas wrote:

 On Tue, Jul 20, 2010 at 10:00 AM, David Christensen da...@endpoint.com 
 wrote:
 Sorry for the delays in response.  This is fine; I think there are some 
 semantic questions that should still be resolved at this point, particularly 
 if we're moving toward supporting multiple -c and -f lines as expressed (an 
 idea that I agree would be useful).  Specifically:
 
 1) Does the -1 flag with multiple files indicate a single transaction for 
 all commands/files or that each command/file is run in its own transaction?  
 I'd implemented this with the intent that all files were run in a single 
 transaction, but it's at least a bit ambiguous, and should probably be 
 documented at the very least.
 
 I think your implementation is right.  Documentation is always good.
 
 2) I had a question (expressed in the comments) about how the final error 
 handling status should be reported/handled.  I can see this affecting a 
 number of things, particularly ON_ERROR_{STOP,ROLLBACK} behavior; 
 specifically, if there was an error, it sounds like it should just abort 
 processing of any other queued files/commands at this point (in the case of 
 ON_ERROR_STOP, at least).
 
 Right.  I think it should behave much as if you concatenated the files
 and then ran psql on the result.  Except with better reporting of
 error locations, etc.
 
 3) With the switch to multiple intermixed -c and -f flags, the internal 
 representation will essentially have to change to a queue of structs; I 
 think in that case, we'd want to modify the current .psqlrc handling to push 
 a struct representing the .psqlrc file at the front of the queue, depending 
 on the code paths that currently set this up.  Are there any gotchas to this 
 approach?  (I'm looking essentially for odd code paths where say .psqlrc was 
 not loaded before, but now would be given the proper input of -c, -f file, 
 -f -.)
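
To make the queue-of-structs idea concrete, here is a minimal sketch
(editor's illustration; the type and field names are hypothetical, not
psql's actual ones):

#include <stdlib.h>

typedef enum { ACT_COMMAND, ACT_FILE } ActionType;  /* -c vs. -f */

typedef struct Action
{
    ActionType     type;
    const char    *value;   /* the command string or the file name */
    struct Action *next;
} Action;

/* Append one action per -c/-f option, preserving command-line order. */
static void
queue_action(Action **head, ActionType type, const char *value)
{
    Action *a = malloc(sizeof(Action));

    a->type = type;
    a->value = value;
    a->next = NULL;
    while (*head)
        head = &(*head)->next;
    *head = a;
}

/* .psqlrc handling would push one ACT_FILE entry onto the front of the
 * queue before execution begins, unless it is suppressed. */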
 
 I'll look more in-depth at the posted feedback and Mark's proposed patch.
 
 Well, IIRC, one of -c and -f suppresses psqlrc, and the other does
 not.  This doesn't seem very consistent to me, but I'm not sure
 there's much to be done about it at this point.  I guess if you use
 whichever one suppresses psqlrc even once, it's suppressed, and
 otherwise it's not.  :-(


That seems sub-optimal; I can see people wanting to use this feature to do 
something like:

psql -c 'set work_mem = blah' -f script.sql

and then being surprised when it works differently than just `psql -f 
script.sql`.

psql -c "select 'starting'" -f file1.sql -c "select 'midway'" -f file2.sql \
  -c "select 'done!'"

Although I wonder if the general use case for .psqlrc is just interactive 
mode; i.e., hypothetically, if you're running scripts that are sensitive to 
the environment, you're running with -X anyway; so maybe that's not that big 
of a deal, as it's kind of an implied -X with multiple -c or -f commands.  
And if you really wanted it, we could add a flag to explicitly include 
.psqlrc (or the user could just specify -f path/to/psqlrc).

Regards,

David
--
David Christensen
End Point Corporation
da...@endpoint.com





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reducing NUMERIC size for 9.1

2010-07-20 Thread Robert Haas
On Fri, Jul 16, 2010 at 2:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I'm not entirely happy with the way I handled the variable-length
 struct, although I don't think it's horrible, either. I'm willing to
 rework it if someone has a better idea.

 I don't like the way you did that either (specifically, not the kluge
 in NUMERIC_DIGITS()).  It would probably work better if you declared
 two different structs, or a union of same, to represent the two layout
 cases.

 A couple of other thoughts:

 n_sign_dscale is now pretty inappropriately named, probably better to
 change the field name.  This will also help to catch anything that's
 not using the macros.  (Renaming the n_weight field, or at least burying
 it in an extra level of struct, would be helpful for the same reason.)

I'm not sure what you have in mind here.  If we create a union of two
structs, we'll still have to pick one of them to use to check the high
bits of the first word, so I'm not sure we'll be adding all that much
in terms of clarity.  One possibility would be to name the fields
something like n_header1 and n_header2, or even just n_header[], but
I'm not sure if that's any better.  If it is I'm happy to do it.
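
For concreteness, here is one possible shape of the two-struct-plus-union
idea (editor's sketch only; the field names, widths, and flag-bit choice
are assumptions, not a committed format):

#include <stdint.h>

typedef int16_t NumericDigit;   /* one base-10000 digit */

struct NumericShort             /* 2-byte header */
{
    uint16_t     n_header;      /* flag bits + sign + dscale + weight */
    NumericDigit n_data[1];     /* digits (struct hack, variable length) */
};

struct NumericLong              /* 4-byte header, the old layout */
{
    uint16_t     n_sign_dscale; /* flag bits + sign + display scale */
    int16_t      n_weight;      /* weight of first digit */
    NumericDigit n_data[1];     /* digits */
};

union NumericChoice
{
    uint16_t            n_header;  /* first word is common to both forms */
    struct NumericShort n_short;
    struct NumericLong  n_long;
};

/* The high bits of the first word would select the layout, e.g.: */
#define NUMERIC_IS_SHORT(nc)  (((nc)->n_header & 0x8000) != 0)

With something like that, the access macros can test the first word of the
union instead of kluging around a single struct layout.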

 It seems like you've handled the NAN case a bit awkwardly.  Since the
 weight is uninteresting for a NAN, it's okay to not store the weight
 field, so I think what you should do is consider that the dscale field
 is still full-width, ie the format of the first word remains old-style
 not new-style.  I don't remember whether dscale is meaningful for a NAN,
 but if it is, your approach is constraining what is possible to store,
 and is also breaking compatibility with old databases.

There is only one NaN value.  Neither weight nor dscale is meaningful.
I think if the high two bits of the first word are 11 we never examine
anything else - do you see somewhere that we're doing otherwise?

 Also, I wonder whether you can do anything with depending on the actual
 bit values of the flag bits --- specifically, it's short header format
 iff first bit is set.  The NUMERIC_HEADER_SIZE macro in particular could
 be made more efficient with that.

Right, OK.

 The sign extension code in the NUMERIC_WEIGHT() macro seems a bit
 awkward; I wonder if there's a better way.  One solution might be to
 offset the value (ie, add or subtract NUMERIC_SHORT_WEIGHT_MIN) rather
 than try to sign-extend per se.

Hmm... so, if the weight is X we store the value
X-NUMERIC_SHORT_WEIGHT_MIN as an unsigned integer?  That's kind of a
funny representation - I *think* it works out to sign extension with
the high bit flipped.  I guess we could do it that way, but it might
make it harder/more confusing to do bit arithmetic with the weight
sign bit later on.
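
A toy illustration of the two options (editor's sketch; the 6-bit width and
the macro names are invented, not the patch's) also shows that the offset
form really is two's complement with the high bit flipped:

#include <stdio.h>

#define WEIGHT_BITS 6
#define WEIGHT_MASK ((1 << WEIGHT_BITS) - 1)
#define WEIGHT_SIGN (1 << (WEIGHT_BITS - 1))
#define WEIGHT_MIN  (-(1 << (WEIGHT_BITS - 1)))  /* stand-in for NUMERIC_SHORT_WEIGHT_MIN */

/* Option A: field holds two's complement; decode by sign-extending. */
static int decode_sign_extend(unsigned w)
{
    int v = w & WEIGHT_MASK;
    return (v & WEIGHT_SIGN) ? v - (1 << WEIGHT_BITS) : v;
}

/* Option B: field holds X - WEIGHT_MIN as an unsigned value. */
static int decode_offset(unsigned w)
{
    return (int) (w & WEIGHT_MASK) + WEIGHT_MIN;
}

int main(void)
{
    int x = -3;
    unsigned a = (unsigned) x & WEIGHT_MASK;                /* 0x3d */
    unsigned b = (unsigned) (x - WEIGHT_MIN) & WEIGHT_MASK; /* 0x1d: high bit flipped */

    printf("%d %d\n", decode_sign_extend(a), decode_offset(b)); /* -3 -3 */
    return 0;
}

Either way the decode is cheap; the difference is mostly which form is
easier to reason about when doing bit arithmetic on the header word.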

 Please do NOT commit this:

                                 (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 !                                errmsg("value overflows numeric format %x w=%d s=%u",
 !                                       result->n_sign_dscale,
 !                                       NUMERIC_WEIGHT(result), NUMERIC_DSCALE(result))));

 or at least hide it in #ifdef DEBUG_NUMERIC or some such.

Woopsie.  That's debugging leftovers, sorry.

 Other than that the code changes look pretty clean, I'm mostly just
 dissatisfied with the access macros.

Thanks for the review.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sql/med review - problems with patching

2010-07-20 Thread Pavel Stehule
2010/7/20 David Fetter da...@fetter.org:
 On Tue, Jul 20, 2010 at 11:40:18AM +0200, Pavel Stehule wrote:
 2010/7/20 Itagaki Takahiro itagaki.takah...@gmail.com:
  2010/7/14 Pavel Stehule pavel.steh...@gmail.com:
  please, can you refresh the patch?
 
  Updated patch attached. The latest version is always in the git
  repo.  http://repo.or.cz/w/pgsql-fdw.git   (branch: fdw) I'm
  developing the patch on postgres' git repo. So, regression test
  for dblink might fail because of out-of-sync issue between cvs and
  git.
 
  When I looked at the documentation I missed a tutorial for foreign
  tables. There is only a reference. I miss a paragraph where it is
  clearly and simply specified what is possible now and what isn't.
  The enhancement of dblink isn't documented.
 
  Sure. I'll start to write documentation when we agree the design
  of FDW.
 
  In the function pgIterate(ForeignScanState *scanstate) you iterate
  over the pg result. I think using a cursor and fetching multiple
  rows would be preferable.
 
  Sure, but I'm thinking that it will be improved after libpq
  supports a protocol-level cursor. The libpq improvement will apply
  to many more applications, including postgresql_fdw.
 

 is there some time frame for this task - or a ToDo entry? At a minimum
 it has to be documented, because it can be an issue on larger sets -
 speed, memory usage. I am afraid about the speed of queries like

 select * from large_dblink_tab limit 100;

 The general issue of passing qualifiers to the remote data source is
 complex, especially when the DML for that data source is different
 from PostgreSQL's DML.

 Do you have some ideas as to how to solve this problem in general?  In
 this case?

Yes, I can. I expect that you can read from a foreign table row by
row, because that is supported by the executor - via ExecForeignScan().

dblink's exec is called inside this method and the result is stored.
Repeated calls of this method mean repeated reading from the tuplestore
storage, so an outer LIMIT has no effect on the loaded data - the full
result is fetched. I propose a slightly different strategy: use a
cursor and repeatedly fetch n rows. The PL/pgSQL FOR statement uses 50
rows; for external tables 1000 can be enough. With this strategy at
most (n - 1) rows are fetched uselessly, we don't need a local
tuplestore, and we get a result early - which helps with a LIMIT
clause.

So, roughly (see the sketch below):

ExecInitForeignScan()
-- open cursor

ExecForeignScan()
-- if fetched rows remain, return one
-- if not, fetch the next 1000 rows

ExecEndScan()
-- close cursor
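
An editor's sketch of that loop in plain libpq terms (the table, cursor,
and function names are hypothetical, and real FDW code would live in the
executor callbacks rather than one flat function):

#include <stdio.h>
#include <libpq-fe.h>

#define FETCH_BATCH 1000

static void
scan_foreign_table(PGconn *conn)
{
    PGresult *res;
    char      cmd[64];

    /* ExecInitForeignScan(): open the cursor. */
    PQclear(PQexec(conn, "BEGIN"));
    PQclear(PQexec(conn, "DECLARE fdw_cur CURSOR FOR SELECT * FROM large_tab"));

    snprintf(cmd, sizeof(cmd), "FETCH %d FROM fdw_cur", FETCH_BATCH);
    for (;;)
    {
        int ntuples, i;

        /* ExecForeignScan(): refill only when the batch is exhausted. */
        res = PQexec(conn, cmd);
        if (PQresultStatus(res) != PGRES_TUPLES_OK)
        {
            PQclear(res);
            break;
        }
        ntuples = PQntuples(res);
        for (i = 0; i < ntuples; i++)
            printf("%s\n", PQgetvalue(res, i, 0));  /* hand one row up */
        PQclear(res);
        if (ntuples < FETCH_BATCH)
            break;                                  /* cursor drained */
    }

    /* ExecEndScan(): close the cursor. */
    PQclear(PQexec(conn, "CLOSE fdw_cur"));
    PQclear(PQexec(conn, "COMMIT"));
}

With batches of 1000, a query like "select * from large_dblink_tab limit
100" pays for at most one fetch instead of materializing the whole table
in a local tuplestore.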

The same mechanism is used in PL/pgSQL.

Regards

Pavel Stehule




 Cheers,
 David.
 --
 David Fetter da...@fetter.org http://fetter.org/
 Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
 Skype: davidfetter      XMPP: david.fet...@gmail.com
 iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

 Remember to vote!
 Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] could not reattach to shared memory on Windows

2010-07-20 Thread Etienne Dube

On 09/02/2010 4:09 PM, Etienne Dube wrote:

Magnus Hagander wrote:

 IIRC, we've had zero reports on whether the patch worked at all on 8.2
 in an environment where the problem actually existed. So yes, some
 testing and feedback would be much appreciated.

//Magnus


Thanks for your quick reply.
We upgraded to Service Pack 2 and it solved the problem. Nevertheless, 
I'll try to reproduce the issue under a Win2008 SP1 VM to see whether 
the patch makes a difference. 8.2.x win32 binaries are built using 
MinGW, right?


Etienne





The "could not reattach to shared memory" bug came back to bite us, this 
time on a production machine running Windows Server 2008 R2 x64. I 
manually applied the patch against the 8.2.17 sources and installed the 
build on a test server. It has been running for two days without any 
issue. We'll see after a while if the patch actually fixes the problem 
(it seems to happen only after the postgres service has been up and 
running for some time) but in case you want to include this fix in a 
future 8.2.18 release, I've attached the new patch to apply against the 
REL8_2_STABLE branch.


Etienne

Index: backend/port/sysv_shmem.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/port/sysv_shmem.c,v
retrieving revision 1.47.2.2
diff -u -r1.47.2.2 sysv_shmem.c
--- backend/port/sysv_shmem.c   1 May 2010 22:46:50 -   1.47.2.2
+++ backend/port/sysv_shmem.c   17 Jul 2010 19:19:51 -
@@ -49,6 +49,10 @@
 
 unsigned long UsedShmemSegID = 0;
 void  *UsedShmemSegAddr = NULL;
+#ifdef WIN32
+Size   UsedShmemSegSize = 0;
+#endif
+
 
 static void *InternalIpcMemoryCreate(IpcMemoryKey memKey, Size size);
 static void IpcMemoryDetach(int status, Datum shmaddr);
@@ -445,6 +449,7 @@
 
/* Save info for possible future use */
UsedShmemSegAddr = memAddress;
+   UsedShmemSegSize = size;
UsedShmemSegID = (unsigned long) NextShmemSegID;
 
return hdr;
Index: backend/port/win32/shmem.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/port/win32/Attic/shmem.c,v
retrieving revision 1.13.2.1
diff -u -r1.13.2.1 shmem.c
--- backend/port/win32/shmem.c  4 May 2009 08:36:42 -   1.13.2.1
+++ backend/port/win32/shmem.c  17 Jul 2010 19:22:13 -
@@ -12,8 +12,11 @@
  */
 
 #include "postgres.h"
+#include "miscadmin.h"
 
 static DWORD s_segsize = 0;
+extern void *UsedShmemSegAddr;
+extern Size UsedShmemSegSize;
 
 /* Detach from a shared mem area based on its address */
 int
@@ -29,6 +32,13 @@
 void *
 shmat(int memId, void *shmaddr, int flag)
 {
+   /* Release the memory region reserved in the postmaster */
+   if (IsUnderPostmaster)
+   {
+   if (VirtualFree(shmaddr, 0, MEM_RELEASE) == 0)
+			elog(FATAL, "failed to release reserved memory region (addr=%p): %lu",
+				 shmaddr, GetLastError());
+   }
/* TODO -- shmat needs to count # attached to shared mem */
void   *lpmem = MapViewOfFileEx((HANDLE) memId,

FILE_MAP_WRITE | FILE_MAP_READ,
@@ -128,3 +138,53 @@
 
return (int) hmap;
 }
+
+/*
+ * pgwin32_ReserveSharedMemoryRegion(hChild)
+ *
+ * Reserve the memory region that will be used for shared memory in a child
+ * process. It is called before the child process starts, to make sure the
+ * memory is available.
+ *
+ * Once the child starts, DLLs loading in different order or threads getting
+ * scheduled differently may allocate memory which can conflict with the
+ * address space we need for our shared memory. By reserving the shared
+ * memory region before the child starts, and freeing it only just before we
+ * attempt to get access to the shared memory forces these allocations to
+ * be given different address ranges that don't conflict.
+ *
+ * NOTE! This function executes in the postmaster, and should for this
+ * reason not use elog(FATAL) since that would take down the postmaster.
+ */
+int
+pgwin32_ReserveSharedMemoryRegion(HANDLE hChild)
+{
+   void *address;
+
+   Assert(UsedShmemSegAddr != NULL);
+   Assert(UsedShmemSegSize != 0);
+
+   address = VirtualAllocEx(hChild, UsedShmemSegAddr, UsedShmemSegSize,
+   MEM_RESERVE, 
PAGE_READWRITE);
+   if (address == NULL) {
+   /* Don't use FATAL since we're running in the postmaster */
+		elog(LOG, "could not reserve shared memory region (addr=%p) for child %lu: %lu",
+			 UsedShmemSegAddr, hChild, GetLastError());
+   return false;
+   }
+   if (address != UsedShmemSegAddr)
+   {
+   /*
+* Should never happen - in theory if allocation granularity 
causes strange
+* effects it could, so check just in case.
+   

Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 11:23 AM, David Christensen da...@endpoint.com wrote:
 Well, IIRC, one of -c and -f suppresses psqlrc, and the other does
 not.  This doesn't seem very consistent to me, but I'm not sure
 there's much to be done about it at this point.  I guess if you use
 whichever one suppresses psqlrc even once, it's suppressed, and
 otherwise it's not.  :-(

 That seems sub-optimal; I can see people wanting to use this feature to do 
 something like:

 psql -c 'set work_mem = blah' -f script.sql

 and then being surprised when it works differently than just `psql -f 
 script.sql`.

I agree... but then they might also be surprised if psql -c
'something' works differently from psql -c 'something' -f /dev/null

 Although I wonder if the general usecase for .psqlrc is just in interactive 
 mode; i.e., hypothetically if you're running scripts that are sensitive to 
 environment, you're running with -X anyway; so maybe that's not that big of a 
 deal, as it's kind of an implied -X with multiple -c or -f commands.  And if 
 you really wanted it, we could add a flag to explicitly include .psqlrc (or 
 the user could just specify -f path/to/psqlrc).

It's tempting to propose making .psqlrc apply only in interactive
mode, period.  But that would be an incompatibility with previous
releases, and I'm not sure it's the behavior we want, either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Query optimization problem

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 11:23 AM, Dimitri Fontaine
dfonta...@hi-media.com wrote:
 Robert Haas robertmh...@gmail.com writes:
 All that having been said, I think the issue here is that the query
 planner isn't inferring that d1.ID=some constant implies d2.ID=some
 constant, even though there's a join clause d1.ID=d2.ID.

 I think that's what the Equivalence Classes are for. Or at least that's
 what they do in my head, not forcibly in the code.

 The specific diff between the two queries is :

   JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
 - WHERE (d1.ID=234409763) or (d2.ID=234409763)
 + WHERE (d2.BasedOn=234409763) or (d2.ID=234409763)

 So the OP would appreciate that the planner is able to consider applying
 the restriction on d2.BasedOn rather than d1.ID given that d2.BasedOn is
 the same thing as d1.ID, from the JOIN.

 I have no idea if Equivalence Classes are where to look for this, and if
 they're meant to extend up to there, and if that's something possible or
 wise to implement, though.

I was thinking of the equivalence class machinery as well.  I think
the OR clause may be the problem.  If you just had d1.ID=constant, I
think it would infer that d1.ID, d2.BasedOn, and the constant formed
an equivalence class.  But here you obviously can't smash the constant
into the equivalence class, and I think the planner's not smart enough
to consider other ways of applying an equivalent qual.  In fact, I
have some recollection that Tom has explicitly rejected adding support
for this in the past, on the grounds that the computation would be too
expensive for the number of queries it would help.  Still, it seems to
keep coming up.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Solaris Sparc - dblink regression test failure

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 11:08 AM, Dave Page dp...@pgadmin.org wrote:
 Any ideas?

Are you by any chance running off of the broken git mirror?

http://archives.postgresql.org/pgsql-hackers/2010-07/msg00916.php

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.0, second run

2010-07-20 Thread Kevin Grittner
Andrew Dunstan and...@dunslane.net wrote:
 
 I despaired of this repo being anything like reliable months ago.
 AFAIK it is using a known to be broken version of fromcvs.
 
Could we have it pull (using git) from the repo you have working
correctly?  (Or would that be too Rube Goldbergesque?)
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Solaris Sparc - dblink regression test failure

2010-07-20 Thread Dave Page
On Tue, Jul 20, 2010 at 4:53 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jul 20, 2010 at 11:08 AM, Dave Page dp...@pgadmin.org wrote:
 Any ideas?

 Are you by any chance running off of the broken git mirror?

 http://archives.postgresql.org/pgsql-hackers/2010-07/msg00916.php

I might be, yes :-)


-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread Mark Wong
On Tue, Jul 20, 2010 at 4:06 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jul 20, 2010 at 3:20 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Mon, 2010-07-19 at 23:40 -0400, Robert Haas wrote:

 Since it has been over a month since this review was posted and no new
 version of the patch has appeared, I think we should mark this patch
 as Returned with Feedback.

 Mark posted a new patch 6 days ago, AFAICS.

 Hmm.  I can't find it in my mail, in my archives, or linked off the
 CommitFest application.  Was it posted under a different subject line?
  Do you have a link to the archives?

My bad, I didn't want to appear to hijack the patch from David.  I
have added the link to the commitfest app, and I'll give it here:

http://archives.postgresql.org/message-id/aanlktikfpzrtrl6392ghatqdwlcwqtxfdsmxh5cp5...@mail.gmail.com

Regards,
Mark

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.0, second run

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 12:01 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Andrew Dunstan and...@dunslane.net wrote:

 I despaired of this repo being anything like reliable months ago.
 AFAIK it is using a known to be broken version of fromcvs.

 Could we have it pull (using git) from the repo you have working
 correctly?  (Or would that be too Rube Goldbergesque?)

It would result in a massive merge commit and the duplication of the
entire history.  The correct solution is probably to (a) install
Andrew's fixed version of the import tool on the server and (b) rewind
the history on the server so it reimports all the subsequent commits.
Sometimes doing only (b) is sufficient to correct the problem, since
the tool seems rather sensitive to ephemeral states of the
repository.

Unfortunately, (a) has not happened.  Magnus seems to feel that Andrew
has not provided sufficient details about which version he should be
running and whether it will likely break anything, and I gather that
Andrew feels otherwise.  Figuring out who is right and who is wrong
and what to do about it is above my pay grade, but it would be really
nice if someone could get this straightened out.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.0, second run

2010-07-20 Thread Magnus Hagander
On Tue, Jul 20, 2010 at 18:13, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jul 20, 2010 at 12:01 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Andrew Dunstan and...@dunslane.net wrote:

 I despaired of this repo being anything like reliable months ago.
 AFAIK it is using a known to be broken version of fromcvs.

 Could we have it pull (using git) from the repo you have working
 correctly?  (Or would that be too Rube Goldbergesque?)

 It would result in a massive merge commit and the duplication of the
 entire history.  The correct solution is probably to (a) install
 Andrew's fixed version of the import tool on the server and (b) rewind
 the history on the server so it reimports all the subsequent commits.
 Sometimes doing only (b) is sufficient to correct the problem, since
 the tool seems rather sensitive to ephemeral states of the
 respository.

 Unfortunately, (a) has not happened.  Magnus seems to feel that Andrew
 has not provided sufficient details about which version he should be
 running and whether it will likely break anything, and I gather that
 Andrew feels otherwise.  Figuring out who is right and who is wrong
 and what to do about it is above my pay grade, but it would be really
 nice if someone could get this straightened out.

Meh, who cares who's right or wrong :-)

My main point is that I am unsure whether this may have any adverse
effects, and I haven't had the time to investigate whether it does or
not. Previously we've just applied a manual correction patch to bring
the branch tip up to the correct state, which is supposedly good enough
for the users of the git server. In which case, someone just needs to
provide said patch :-)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.0, second run

2010-07-20 Thread Andrew Dunstan



Kevin Grittner wrote:

 Andrew Dunstan and...@dunslane.net wrote:

 I despaired of this repo being anything like reliable months ago.
 AFAIK it is using a known to be broken version of fromcvs.

 Could we have it pull (using git) from the repo you have working
 correctly?  (Or would that be too Rube Goldbergesque?)


The trouble is that they don't have a common git commit history. I 
suspect we'll just totally mangle the repo if we try that. But anyone 
can experiment if they want to. My repo is available at 
git://github.com/oicu/pg-cvs-mirror.git or 
http://github.com/oicu/pg-cvs-mirror.git, or people can just use that 
repo. I have four buildfarm members running off it daily, and I run a 
daily health check on it against CVS. I can undertake to maintain it 
until at least the middle of August (after which I'll be travelling for 
a few weeks).


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 12:08 PM, Mark Wong mark...@gmail.com wrote:
 On Tue, Jul 20, 2010 at 4:06 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jul 20, 2010 at 3:20 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Mon, 2010-07-19 at 23:40 -0400, Robert Haas wrote:

 Since it has been over a month since this review was posted and no new
 version of the patch has appeared, I think we should mark this patch
 as Returned with Feedback.

 Mark posted a new patch 6 days ago, AFAICS.

 Hmm.  I can't find it in my mail, in my archives, or linked off the
 CommitFest application.  Was it posted under a different subject line?
  Do you have a link to the archives?

 My bad, I didn't want to appear to hijack the patch from David.  I
 have added the link to the commitfest app, and I'll give it here:

 http://archives.postgresql.org/message-id/aanlktikfpzrtrl6392ghatqdwlcwqtxfdsmxh5cp5...@mail.gmail.com

Oh, cool, I missed that, obviously.  So I guess we need someone to
review that version, then.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.0, second run

2010-07-20 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 It would result in a massive merge commit and the duplication of
 the entire history.
 
Ah, well, if the two repositories don't share the same IDs, it's a
clear no-go.  Now that I think about it, it would be a bit much to
expect those to match on independent conversions from CVS.
 
How is this going to play out when we do the official conversion
to git?  Will those of us on repositories based off of
git.postgresql.org be faced with similar issues, or are we using the
repo there as the base for the conversion?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.0, second run

2010-07-20 Thread Magnus Hagander
On Tue, Jul 20, 2010 at 18:32, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:

 It would result in a massive merge commit and the duplication of
 the entire history.

 Ah, well, if the two repositories don't share the same IDs, it a
 clear no-go.  Now that I think about it, it would be a bit much to
 expect those to match on independent conversions from CVS.

 How is this going to play out when we do the official conversion
 to git?  Will those of us on repositories based off of
 git.postgresql.org be faced with similar issues, or are we using the
 repo there as the base for the conversion?

No, it will be a completely new repository. Those with the old one
will need to extract a patch from that and then apply it to the new
one.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: pgindent run for 9.0, second run

2010-07-20 Thread Andrew Dunstan



Magnus Hagander wrote:

 On Tue, Jul 20, 2010 at 18:13, Robert Haas robertmh...@gmail.com wrote:

 On Tue, Jul 20, 2010 at 12:01 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:

 Andrew Dunstan and...@dunslane.net wrote:

 I despaired of this repo being anything like reliable months ago.
 AFAIK it is using a known to be broken version of fromcvs.

 Could we have it pull (using git) from the repo you have working
 correctly?  (Or would that be too Rube Goldbergesque?)

 It would result in a massive merge commit and the duplication of the
 entire history.  The correct solution is probably to (a) install
 Andrew's fixed version of the import tool on the server and (b) rewind
 the history on the server so it reimports all the subsequent commits.
 Sometimes doing only (b) is sufficient to correct the problem, since
 the tool seems rather sensitive to ephemeral states of the
 repository.

 Unfortunately, (a) has not happened.  Magnus seems to feel that Andrew
 has not provided sufficient details about which version he should be
 running and whether it will likely break anything, and I gather that
 Andrew feels otherwise.  Figuring out who is right and who is wrong
 and what to do about it is above my pay grade, but it would be really
 nice if someone could get this straightened out.

 Meh, who cares who's right or wrong :-)

 My main point is that I am unsure whether this may have any adverse
 effects, and I haven't had the time to investigate whether it does or
 not. Previously we've just applied a manual correction patch to bring
 the branch tip up to the correct state, which is supposedly good enough
 for the users of the git server. In which case, someone just needs to
 provide said patch :-)



Given that the repo's remaining lifetime is measured in weeks, that 
seems reasonable.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] managing git disk space usage

2010-07-20 Thread Robert Haas
Tom and, I believe, also Andrew have expressed some concerns about the
space that will be taken up by having multiple copies of the git
repository on their systems.  While most users can probably get by
with a single repository, committers will likely need one for each
back-branch that they work with, and we have quite a few of those.

After playing around with this a bit, I've come to the conclusion that
there are a couple of possible options but they've all got some
drawbacks.

1. Clone the origin.  Then, clone the clone n times locally.  This
uses hard links, so it saves disk space.  But, every time you want to
pull, you first have to pull to the main clone, and then to each of
the slave clones.  And same thing when you want to push.

2. Clone the origin n times.  Use more disk space.  Live with it.  :-)

3. Clone the origin once.  Apply patches to multiple branches by
switching branches.  Playing around with it, this is probably a
tolerable way to work when you're only going back one or two branches
but it's certainly a big nuisance when you're going back 5-7 branches.

4. Clone the origin.  Use that to get at the master branch.  Then
clone that clone n-1 times, one for each back-branch.  This makes it a
bit easier to push and pull when you're only dealing with the master
branch, but you still have the double push/double pull problem for all
the other branches.

5. Use git clone --shared or git clone --reference or
git-new-workdir.  While I once thought this was the solution, I can't
take very seriously any solution that has a warning in the manual that
says, essentially, git gc may corrupt your repository if you do this.

I'm not really sure which of these I'm going to do yet, and I'm not
sure what to recommend to others, either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] managing git disk space usage

2010-07-20 Thread Aidan Van Dyk
* Robert Haas robertmh...@gmail.com [100720 13:04]:
 
 3. Clone the origin once.  Apply patches to multiple branches by
 switching branches.  Playing around with it, this is probably a
 tolerable way to work when you're only going back one or two branches
 but it's certainly a big nuisance when you're going back 5-7 branches.

This is what I do when I'm working on a project that has completely
proper dependencies, and you don't need to always re-run configure
between different branches.  I use ccache heavily, so configure takes
longer than a complete build with a couple-dozen
actually-not-previously-seen changes...

But *all* dependencies need to be proper in the build system, or you end
up needing a git-clean-type cleanup between branch switches, forcing a
new configure run too, which takes too much time...

Maybe this will cause make dependencies to be refined in PG ;-)

It has the advantage that, if back-patching (or in reality, forward-
patching) all happens in one repository, the git conflict machinery is
all using the same cache of resolutions, meaning that if you apply the
same patch to two different branches, with identical code/conflict, you
don't need to do the whole conflict resolution by hand from scratch in
the second branch.

 5. Use git clone --shared or git clone --reference or
 git-new-workdir.  While I once thought this was the solution, I can't
 take very seriously any solution that has a warning in the manual that
 says, essentially, git gc may corrupt your repository if you do this.

This is the type of setup I often use.  I have a central set of git
repos that are straight mirror-clones of project repositories, kept
up-to-date via cron.  And any time I clone a work repo, I use
--reference.

Since I make sure I don't remove anything from the reference repo, I
don't have to worry about losing objects other repositories might be
using from the cache repo.  In case anyone is wondering, that's:
git clone --mirror $REPO /data/src/cache/$project.git
git --git-dir=/data/src/cache/$project.git config gc.auto 0

And then in crontab:
git --git-dir=/data/src/cache/$project.git fetch --quiet --all

With gc.auto disabled, and the only commands ever run being git fetch,
no objects are removed, even if a remote rewinds and throws away
commits.

But this way means that the separate repos only share the past, central-
repository history, which means that you have to jump through hoops if
you want to be able to use git's handy merging/cherry-picking/conflict
tools when trying to rebase/port patches between branches.  You're
pretty much limited to exporting a patch, changing to the new
branch-repository, and applying the patch.

a.

-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.




Re: [HACKERS] dynamically allocating chunks from shared memory

2010-07-20 Thread Alvaro Herrera
Excerpts from Markus Wanner's message of vie jul 02 19:44:46 -0400 2010:

 Having written a very primitive kind of a dynamic memory allocator for 
 imessages [1], I've always wanted a better alternative. So I've 
 investigated a bit, refactored step-by-step, and finally came up with 
 the attached, lock based dynamic shared memory allocator. Its interface 
 is as simple as malloc() and free(). A restart of the postmaster should 
 truncate the whole area.

Interesting, thanks.

I gave it a skim and found that it badly needs a lot more code comments.

I'm also unconvinced that spinlocks are the best locking primitive here.
Why not lwlocks?

 Being a component which needs to pre-allocate its area in shared memory 
 in advance, you need to define a maximum size for the pool of 
 dynamically allocatable memory. That's currently defined in shmem.h 
 instead of a GUC.

This should be an easy change; I agree that it needs to be configurable.

I'm not sure what kind of resistance you'll see to the idea of a
dynamically allocatable shmem area.  Maybe we could use this in other
areas such as allocating space for heavyweight lock objects.  Right now 
the memory usage for them could grow due to a transitory increase in
lock traffic, leading to out-of-memory conditions later in other
modules.  We've seen reports of that problem, so it'd be nice to be able
to fix that with this infrastructure.

I didn't look at the imessages patch (except to notice that I didn't
very much like the handling of out-of-memory, but you already knew that).
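
To make the malloc()/free()-style interface concrete, here is a toy
first-fit free-list allocator over a preallocated region (editor's sketch
only: all names are invented, and locking - the spinlock vs. LWLock
question above - plus alignment and error handling are omitted):

#include <stddef.h>

typedef struct Chunk
{
    size_t        size;        /* usable bytes after the header */
    int           is_free;
    struct Chunk *next;
} Chunk;

static Chunk *pool_head;

void
pool_init(void *region, size_t size)
{
    pool_head = (Chunk *) region;
    pool_head->size = size - sizeof(Chunk);
    pool_head->is_free = 1;
    pool_head->next = NULL;
}

void *
pool_alloc(size_t size)
{
    Chunk *c;

    for (c = pool_head; c != NULL; c = c->next)
    {
        if (!c->is_free || c->size < size)
            continue;
        if (c->size >= size + sizeof(Chunk) + 16)
        {
            /* Split: carve a new free chunk out of the tail. */
            Chunk *rest = (Chunk *) ((char *) (c + 1) + size);

            rest->size = c->size - size - sizeof(Chunk);
            rest->is_free = 1;
            rest->next = c->next;
            c->size = size;
            c->next = rest;
        }
        c->is_free = 0;
        return c + 1;
    }
    return NULL;    /* pool exhausted; the caller must cope */
}

void
pool_free(void *p)
{
    Chunk *c = (Chunk *) p - 1;

    c->is_free = 1;
    if (c->next && c->next->is_free)   /* naive forward coalescing */
    {
        c->size += sizeof(Chunk) + c->next->size;
        c->next = c->next->next;
    }
}

The pool being fixed at postmaster start is exactly why a maximum size
must be chosen up front, as noted above.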

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] managing git disk space usage

2010-07-20 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 2. Clone the origin n times.  Use more disk space.  Live with it. 
:-)
 
But each copy uses almost 0.36% of the formatted space on my 150GB
drive!
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parsing of aggregate ORDER BY clauses

2010-07-20 Thread Daniel Grace
On Mon, Jul 19, 2010 at 4:08 PM, Hitoshi Harada umi.tan...@gmail.com wrote:

 2010/7/19 Tom Lane t...@sss.pgh.pa.us:
  I looked into the problem reported here:
  http://archives.postgresql.org/pgsql-bugs/2010-07/msg00119.php
 

[...]

 
  2. Split the processing of aggregates with ORDER BY/DISTINCT so that the
  sorting/uniqueifying is done in a separate expression node that can work
  with the native types of the given columns, and only after that do we
  perform coercion to the aggregate function's input types.  This would be
  logically the cleanest thing, perhaps, but it'd represent a very major
  rework of the patch, with really no hope of getting it done for 9.0.

[...]

  #3 seems the sanest fix, but I wonder if anyone has an objection or
  better idea.

 I didn't look at the code yet; #2 sounds like the way to go. But I see
 the breakage is unacceptable for 9.0, so #3 is the choice for 9.0;
 will we fix it as #2 for 9.1 or later?

I'm the original reporter of the mentioned bug.

One possible concern might be typecasts that aren't a 1:1
representation.  While no two VARCHARs are going to produce the same
TEXT, this is not true in other cases (1.1::float::integer and
1.2::float::integer both produce 1, for instance).

Off the top of my head, I can't think of a good example where this
would cause a problem -- it'd be easy enough to manufacture a possible
test case, but it'd be so contrived that I don't know whether it's
something that would be seen in production code.  But if we SELECT
SOME_INTEGER_AGGREGATE(DISTINCT floatcol ORDER BY floatcol), should
the DISTINCT operate on floatcol (i.e. 1.1 and 1.2 are distinct, even
if it means the function is called with '1' twice) or
floatcol::integer (1.1 and 1.2 are not distinct)?

I'm guessing the former, even if it means the function is called
multiple times with the same final (after typecasting) input value.
The latter would only be correct if the user specifically wrote it as
DISTINCT floatval::INTEGER.

-- 
Daniel Grace

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] SAVEPOINTs and COMMIT performance

2010-07-20 Thread Simon Riggs

As part of a performance investigation for a customer I've noticed an
O(N^2) performance issue on COMMITs of transactions that contain many
SAVEPOINTs. I've consistently measured COMMIT times of around 9 seconds,
with 49% CPU, mostly in LockReassignCurrentOwner().

BEGIN;
INSERT...
SAVEPOINT ...
INSERT...
SAVEPOINT ...
... (repeat 10,000 times)
COMMIT;

The way SAVEPOINTs work is that each is nested within the previous one,
so that at COMMIT time we must recursively commit all the
subtransactions before we issue final commit.

That's a shame because ResourceOwnerReleaseInternal() contains an
optimisation to speed up final commit, by calling ProcReleaseLocks().

What we actually do is recursively call LockReassignCurrentOwner(),
which sequentially scans LockMethodLocalHash at each level of
transaction.  The comments refer to this as the "retail" rather than
the "wholesale" method, which never gets to execute anything
worthwhile in this case.
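
A toy model of the cost (editor's illustration, not PostgreSQL code):
committing N nested subtransactions, each one scanning a local lock hash
that holds on the order of N entries, visits roughly N^2 entries in total.

#include <stdio.h>

int
main(void)
{
    long n = 10000;      /* savepoints, as in the script above */
    long visits = 0;
    long level;

    for (level = 0; level < n; level++)  /* one subcommit per level */
        visits += n;     /* full scan of the local lock hash */

    printf("%ld entry visits for %ld savepoints\n", visits, n); /* ~1e8 */
    return 0;
}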

This issue does NOT occur in PLpgSQL functions that contain many
EXCEPTION clauses in a loop, since in that case the subtransactions are
started and committed from the top level so that the subxact nesting
never goes too deep.

Fix looks like we need special handling for the depth-first case, rather
than just a recursion loop in CommitTransactionCommand().

Issue looks like it goes all the way back; no fix for 9.0.

I also notice that the nesting model of SAVEPOINTs means that
read-only subtransactions will still generate an xid when followed by a
DML statement. That's unnecessary, but required given the current design.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] crash-recovery replay of CREATE TABLESPACE is broken in HEAD

2010-07-20 Thread Bruce Momjian
Bruce Momjian wrote:
 Bruce Momjian wrote:
   The attached patch does as suggested.  I added the recovery code to the
   create tablespace function so I didn't have to duplicate all the code
   that computes the path names.
   
   Attached.
  
  Uh, another question.  Looking at the createdb recovery, I see:
  
  /*
   * Our theory for replaying a CREATE is to forcibly drop the target
   * subdirectory if present, then re-copy the source data. This may 
  be
   * more work than needed, but it is simple to implement.
   */
  if (stat(dst_path, &st) == 0 && S_ISDIR(st.st_mode))
  {
  if (!rmtree(dst_path, true))
  ereport(WARNING,
  (errmsg("some useless files may be left behind in "
  "old database directory \"%s\"",
  dst_path)));
  }
  
  Should I be using rmtree() on the mkdir target?
  
  Also, the original tablespace recovery code did not drop the symlink
  first.  I assume that was not a bug only because we don't support moving
  tablespaces:
 
 For consistency with CREATE DATABASE recovery and for reliability, I
 coded the rmtree() call instead.  Patch attached.

Attached patch applied to HEAD and 9.0.   9.0 open item moved to
completed.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +
Index: src/backend/commands/dbcommands.c
===
RCS file: /cvsroot/pgsql/src/backend/commands/dbcommands.c,v
retrieving revision 1.235
diff -c -c -r1.235 dbcommands.c
*** src/backend/commands/dbcommands.c	26 Feb 2010 02:00:38 -	1.235
--- src/backend/commands/dbcommands.c	20 Jul 2010 18:11:06 -
***
*** 1908,1913 
--- 1908,1914 
		if (stat(dst_path, &st) == 0 && S_ISDIR(st.st_mode))
  		{
  			if (!rmtree(dst_path, true))
+ /* If this failed, copydir() below is going to error. */
  ereport(WARNING,
  		(errmsg("some useless files may be left behind in old database directory \"%s\"",
  dst_path)));
Index: src/backend/commands/tablespace.c
===
RCS file: /cvsroot/pgsql/src/backend/commands/tablespace.c,v
retrieving revision 1.77
diff -c -c -r1.77 tablespace.c
*** src/backend/commands/tablespace.c	18 Jul 2010 04:47:46 -	1.77
--- src/backend/commands/tablespace.c	20 Jul 2010 18:11:07 -
***
*** 562,567 
--- 562,586 
  		 location)));
  	}
  
+ 	if (InRecovery)
+ 	{
+ 		struct stat st;
+ 
+ 		/*
+ 		 * Our theory for replaying a CREATE is to forcibly drop the target
+ 		 * subdirectory if present, and then recreate it. This may be
+ 		 * more work than needed, but it is simple to implement.
+ 		 */
+ 		if (stat(location_with_version_dir, &st) == 0 && S_ISDIR(st.st_mode))
+ 		{
+ 			if (!rmtree(location_with_version_dir, true))
+ /* If this failed, mkdir() below is going to error. */
+ ereport(WARNING,
+ 		(errmsg("some useless files may be left behind in old database directory \"%s\"",
+ location_with_version_dir)));
+ 		}
+ 	}
+ 
  	/*
  	 * The creation of the version directory prevents more than one tablespace
  	 * in a single location.
***
*** 580,585 
--- 599,614 
  			location_with_version_dir)));
  	}
  
+ 	/* Remove old symlink in recovery, in case it points to the wrong place */
+ 	if (InRecovery)
+ 	{
+ 		if (unlink(linkloc) < 0 && errno != ENOENT)
+ 			ereport(ERROR,
+ 	(errcode_for_file_access(),
+ 	 errmsg("could not remove symbolic link \"%s\": %m",
+ 			linkloc)));
+ 	}
+ 	
  	/*
  	 * Create the symlink under PGDATA
  	 */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynamically allocating chunks from shared memory

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 1:50 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 I'm not sure what kind of resistance you'll see to the idea of a
 dynamically allocatable shmem area.  Maybe we could use this in other
 areas such as allocating space for heavyweight lock objects.  Right now
 the memory usage for them could grow due to a transitory increase in
 lock traffic, leading to out-of-memory conditions later in other
 modules.  We've seen reports of that problem, so it'd be nice to be able
 to fix that with this infrastructure.

Well, you can't really fix that problem with this infrastructure,
because this infrastructure only allows shared memory to be
dynamically allocated from a pool set aside for such allocations in
advance.  If a surge in demand can exhaust all the heavyweight lock
space in the system, it can also exhaust the shared pool from which
more heavyweight lock space can be allocated.  The failure might
manifest itself in a totally different subsystem though, since the
allocation that failed wouldn't necessarily be a heavyweight lock
allocation, but some other allocation that failed as a result of space
used by the heavyweight locks.

It would be more interesting if you could expand (or contract) the
size of shared memory as a whole while the system is up and running.
Then, perhaps, max_locks_per_transaction and other, similar GUCs could
be made PGC_SIGHUP, which would give you a way out of such situations
that didn't involve taking down the entire cluster.  I'm not too sure
how to do that, though.

With respect to imessages specifically, what is the motivation for
using shared memory rather than something like an SLRU?  The new
LISTEN implementation uses an SLRU and handles variable-size messages,
so it seems like it might be well-suited to this task.

Incidentally, the link for the imessages patch on the CommitFest page
points to 
http://archives.postgresql.org/message-id/ab0cd52a64e788f4ecb4515d1e6e4...@localhost
- which is the dynamic shmem patch.  So I'm not sure where to find the
latest imessages patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] managing git disk space usage

2010-07-20 Thread Peter Eisentraut
On tis, 2010-07-20 at 13:28 -0400, Aidan Van Dyk wrote:
 But *all* dependancies need to be proper in the build system, or you
 end
 up needing a git-clean-type-cleanup between branch switches, forcing a
 new configure run too, which takes too much time...

This realization, while true, doesn't really help, because we are
talking about maintaining 5+ year old back branches, where we are not
going to fiddle with the build system at this time.  Also, the switch
from 9.0 to 9.1 the other day showed everyone who cared to watch that
the dependencies are currently not correct for major version switches,
so this method will definitely not work at the moment.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] antisocial things you can do in git (but not CVS)

2010-07-20 Thread Robert Haas
I have some concerns related to the upcoming conversion to git and how
we're going to avoid having things get messy as people start using the
new repository.  git has a lot more flexibility and power than CVS,
and I'm worried that it would be easy, even accidentally, to screw up
our history.

1. Inability to cleanly and easily (and programmatically) identify who
committed what.  With CVS, the author of a revision is the person who
committed it, period.  With git, the author string can be set to
anything the person typing 'git commit' feels like.  I think there is
also a committer field, but that doesn't always appear and I'm not
clear on how it works.  Also, the author field defaults to something
dumb if you don't explicitly set it to a value.  So I'm worried we
could end up with stuff like this in the repository:

Author: rh...@rhaas-laptop
Author: Robert Haas robertmh...@gmail.com
Author: Robert Haas rh...@enterprisedb.com
Author: Robert Haas
rh...@some-place-i-might-hypothetically-work-in-the-future.com
Author: The Guy Who Wrote Some Patch Which Robert Haas Ended Up
Committing somerandomem...@somerandomdomain.whatever

Right now, it's easy to find all the commits by a particular
committer, and it's easy to see who committed a particular patch, and
the number of distinct committers is pretty small.  I'd hate to give
that up.

git log | grep '^Author' | sort | uniq -c | sort -n | less

My preference would be to stick to a style where we identify the
committer using the author tag and note the patch author, reviewers,
whether the committer made changes, etc. in the commit message.  A
single author field doesn't feel like enough for our workflow, and
having a mix of authors and committers in the author field seems like
a mess.

2. Branch and tag management.  In CVS, there are branches and tags in
only one place: on the server.  In git, you can have local branches
and tags and remote branches and tags, and you can pull and push tags
between servers.  If I'm working on a git repository that has branches
master, REL9_0_STABLE .. REL7_4_STABLE, inner_join_removal,
numeric_2b, and temprelnames, I want to make sure that I don't
accidentally push the last three of those to the authoritative
server... but I do want to push all the others.  Similarly I want to
push only the correct subset of tags (though that should be less of
an issue, at least for me, as I don't usually create local tags).  I'm
not sure how to set this up, though.

3. Merge commits.  I believe that we have consensus that commits
should always be done as a squash, so that the history of all of our
branches is linear.  But it seems to me that someone could
accidentally push a merge commit, either because they forgot to squash
locally, or because of a conflict between their local git repo's
master branch and origin/master.  Can we forbid this?

4. History rewriting.  Under what circumstances, if any, are we OK
with rebasing the master?  For example, if we decide not to have merge
commits, and somebody does a merge commit anyway, are we going to
rebase to get rid of it?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynamically allocating chunks from shared memory

2010-07-20 Thread Markus Wanner

Hello Alvaro,

thank you for looking through this code.

On 07/20/2010 07:50 PM, Alvaro Herrera wrote:

Interesting, thanks.

I gave it a skim and found that it badly needs a lot more code comments.


Hm.. yeah, the dynshmem stuff could probably use more comments. (The 
bgworker stuff is probably a better example).



I'm also unconvinced that spinlocks are the best locking primitive here.
Why not lwlocks?


It's derived from a completely lock-free algorithm, as proposed by Maged 
M. Michael in "Scalable Lock-Free Dynamic Memory Allocator". I dropped 
all of the CAS primitives with their retry loop around and did further 
simplifications. Spinlocks simply looked like the simplest thing to 
fall-back to. But yeah, splitting into read and write accesses and using 
lwlocks might be a win. Or it might not. I honestly don't know. And it's 
probably not the best performing allocator ever. But it's certainly 
better than nothing.


I did recently release the lock-free variant as well as a lock based 
one, see http://www.bluegap.ch/projects/wamalloc/ for more information.



I'm not sure what kind of resistance you'll see to the idea of a
dynamically allocatable shmem area.


So far neither resistance nor applause. I'd love to hear more of an 
echo. Even if it's resistance.



Maybe we could use this in other
areas


..which is why I've published this separately from Postgres-R.


such as allocating space for heavyweight lock objects.  Right now
the memory usage for them could grow due to a transitory increase in
lock traffic, leading to out-of-memory conditions later in other
modules.  We've seen reports of that problem, so it'd be nice to be able
to fix that with this infrastructure.


Maybe, yes. Sounds like a nice idea.


I didn't look at the imessages patch (except to notice that I didn't
very much like the handling of out-of-memory, but you already knew that).


As all of the allocation problem has now been ripped out, the imessages 
patch got quite a bit smaller. imsg.c now consists of only around 370 
lines of code.


The handling of the out-of-(shared)-memory situation could certainly be 
improved, yes. Note that I've already separated out an 
IMessageCreateInternal() method, which simply returns NULL in that case. 
Is that the API you'd prefer?


Getting back to the dynshmem stuff: I don't mind much *which* allocator 
to use. I also looked at jemalloc, but haven't been able to integrate it 
into Postgres. So I've extended my experiment with wamalloc and turned 
it into something usable for Postgres.


Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some git conversion issues

2010-07-20 Thread Bruce Momjian
Magnus Hagander wrote:
 Working on the git conversion with keywords, and I've noticed a couple of
 strange things that don't come up the same way in git. All of these are in
 non-code files, but they do defeat the identical tarball mode.
 
 For example, a number of files have commits showing up in cvs with nothing at
 all changed. This triggered an update of the keywords only, with no contents
 changed.
 
 For example, look at:
 http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/Attic/FAQ_AIX.diff?r1=1.19.2.9;r2=1.19.2.10
 http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/Attic/FAQ_HPUX.diff?r1=1.14.2.9;r2=1.14.2.10
 http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/Attic/FAQ_Solaris.diff?r1=1.22.2.10;r2=1.22.2.11
 
 Some show up as completely empty, like:
 http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/generate_history.pl.diff?r1=1.1;r2=1.1.2.1
 
 
 Does anybody know how this can even happen? Wouldn't cvs normally just
 not commit the file if there are no changes?

I have reproduced this by modifying just the CVS tag:

$PostgreSQL: pgsql/src/backend/catalog/README,v 1.15 2010/07/20
18:38:53 momjian Exp $

so it is possible, and can be ignored.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] antisocial things you can do in git (but not CVS)

2010-07-20 Thread Magnus Hagander
On Tue, Jul 20, 2010 at 20:34, Robert Haas robertmh...@gmail.com wrote:
 I have some concerns related to the upcoming conversion to git and how
 we're going to avoid having things get messy as people start using the
 new repository.  git has a lot more flexibility and power than CVS,
 and I'm worried that it would be easy, even accidentally, to screw up
 our history.

 1. Inability to cleanly and easily (and programmatically) identify who
 committed what.  With CVS, the author of a revision is the person who
 committed it, period.  With git, the author string can be set to
 anything the person typing 'git commit' feels like.  I think there is
 also a committer field, but that doesn't always appear and I'm not
 clear on how it works.  Also, the author field defaults to something
 dumb if you don't explicitly set it to a value.  So I'm worried we
 could end up with stuff like this in the repository:

I'm pretty sure we can enforce this on the server side, refusing
commits that don't follow our standard. I haven't done it myself
(yet), but I've read about it.


 My preference would be to stick to a style where we identify the
 committer using the author tag and note the patch author, reviewers,
 whether the committer made changes, etc. in the commit message.  A
 single author field doesn't feel like enough for our workflow, and
 having a mix of authors and committers in the author field seems like
 a mess.

+1.


 2. Branch and tag management.  In CVS, there are branches and tags in
 only one place: on the server.  In git, you can have local branches
 and tags and remote branches and tags, and you can pull and push tags
 between servers.  If I'm working on a git repository that has branches
 master, REL9_0_STABLE .. REL7_4_STABLE, inner_join_removal,
 numeric_2b, and temprelnames, I want to make sure that I don't
 accidentally push the last three of those to the authoritative
 server... but I do want to push all the others.  Similarly I want to
 push only the correct subset of tags (though that should be less of
 an issue, at least for me, as I don't usually create local tags).  I'm
 not sure how to set this up, though.

We could put a safeguard in place on the server that won't let you
push a branch and require that additions of new branches be done
manually on the server?


 3. Merge commits.  I believe that we have consensus that commits
 should always be done as a squash, so that the history of all of our
 branches is linear.  But it seems to me that someone could
 accidentally push a merge commit, either because they forgot to squash
 locally, or because of a conflict between their local git repo's
 master branch and origin/master.  Can we forbid this?

Again, I haven't done it, but I've read about it, and I'm almost
certain we can enforce this, yes.


 4. History rewriting.  Under what circumstances, if any, are we OK
 with rebasing the master?  For example, if we decide not to have merge
 commits, and somebody does a merge commit anyway, are we going to
 rebase to get rid of it?

That's something we need a good policy for. Merge commits are special.
For content commits, I think we should basically *never* do that. If
someone commits bad content, we should just make a revert commit which
keeps history linear and just removes the changes as a new commit.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some git conversion issues

2010-07-20 Thread Magnus Hagander
On Tue, Jul 20, 2010 at 20:42, Bruce Momjian br...@momjian.us wrote:
 Magnus Hagander wrote:
 Working on the git conversion with keywords, and I've noticed a couple of
 strange things that don't come up the same way in git. All of these are in
 non-code files, but they do defeat the identical tarball mode.

 For example, a number of files have commits showing up in cvs with nothing at
 all changed. This triggered an update of the keywords only, with no contents
 changed.

 For example, look at:
 http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/Attic/FAQ_AIX.diff?r1=1.19.2.9;r2=1.19.2.10
 http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/Attic/FAQ_HPUX.diff?r1=1.14.2.9;r2=1.14.2.10
 http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/Attic/FAQ_Solaris.diff?r1=1.22.2.10;r2=1.22.2.11

 Some show up as completely empty, like:
 http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/doc/src/sgml/generate_history.pl.diff?r1=1.1;r2=1.1.2.1


 Does anybody know how this can even happen? Wouldn't cvs normally just
 not commit the file if there are no changes?

 I have reproduced this by modifying just the CVS tag:

        $PostgreSQL: pgsql/src/backend/catalog/README,v 1.15 2010/07/20
        18:38:53 momjian Exp $

To clarify what Bruce said on IM to me: the situation is when the
workflow is to manually copy a file from one repo to another before
committing (a working one vs the committer one for example), it may
not be up to date on the version id. Personally this never happens
because I move files by making cvs diff in one and applying in the
other, not copying the files.

For one thing, this showed up in a lot of .po files for 8.1.0RC1.
Peter, can you comment on if this coincides with the tools you use to
do those things?


 so it is possible, and can be ignored.

Yeah, but do we want to? It means a git checkout of the branch tip
will be slightly different - in keywords only - from the previous
checkout in cvs.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some git conversion issues

2010-07-20 Thread Bruce Momjian
Magnus Hagander wrote:
  I have reproduced this by modifying just the CVS tag:
 
         $PostgreSQL: pgsql/src/backend/catalog/README,v 1.15 2010/07/20
         18:38:53 momjian Exp $
 
 To clarify with what Bruce said on IM to me, the situation is when the
 workflow is to manually copy a file from one repo to another before
 committing (a working one vs the committer one for example), it may
 not be up to date on the version id. Personally this never happens
 because I move files by making cvs diff in one and applying in the
 other, not copying the files.
 
 For one thing, this showed up in a lot of .po files for 8.1.0RC1.
 Peter, can you comment on if this coincides with the tools you use to
 do those things?
 
 
  so it is possible, and can be ignored.
 
 Yeah, but do we want to? It means a git checkout of the branch tip
 will be slightly different - in keywords only - from the previous
 checkout in cvs.
 

Well, we are planning to remove these tags in git anyway, and I don't
think anyone cares about those tags not being consistent.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [PATCH] Re: [HACKERS] Adding XMLEXISTS to the grammar

2010-07-20 Thread Peter Eisentraut
On tis, 2010-06-29 at 12:22 +0100, Mike Fowler wrote:
 Mike Fowler wrote:  
  Thanks again for your help Robert, turns out the fault was in the 
  pg_proc entry (the 3 up there should've been a two!). Once I took the 
  grammar out it was quickly obvious where I'd gone wrong.
 
  Attached is a patch with the revised XMLEXISTS function, complete with 
  grammar support and regression tests. The implemented grammar is:
 
  XMLEXISTS ( xpath_expression PASSING BY REF xml_value [BY REF] )
 
  Though the full grammar makes everything after the xpath_expression 
  optional, I've left it as mandatory simply to avoid lots of rework of 
  the function (would need new null checks, memory handling would need 
  reworking).
 
 
 As with the xpath_exists patch I've now added the SGML documentation 
 detailing this function and extended the regression test a little to 
 test XML literals.

Some thoughts, mostly nitpicks:

The snippet of documentation could be clearer.  It says "if the xml
satisfies the xpath".  Not sure what that means exactly.  An XPath
expression, by definition, returns a value.  How is that value used to
determine the result?

Naming of parser symbols: xmlexists_list isn't actually a list of
xmlexists's.  That particular rule can probably be done away with anyway
and the code put directly into the XMLEXISTS rule.

Why is the first argument AexprConst instead of a_expr?  The SQL
standard says it's a character string literal, but I think we can very
well allow arbitrary expressions.

xmlexists_query_argument_list should be optional.

The rules xml_default_passing_mechanism and xml_passing_mechanism are
pretty useless to have as separate rules.  Just mention the tokens where
they are used.

Why c_expr?

Call the C-level function xmlexists for consistency.





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynamically allocating chunks from shared memory

2010-07-20 Thread Markus Wanner

Hi,

On 07/20/2010 08:23 PM, Robert Haas wrote:

Well, you can't really fix that problem with this infrastructure,


No, but it would allow you to make better use of the existing amount of 
shared memory, possibly avoiding the problem in certain scenarios.



The failure might
manifest itself in a totally different subsystem though, since the
allocation that failed wouldn't necessarily be a heavyweight lock
allocation, but some other allocation that failed as a result of space
used by the heavyweight locks.


Yeah, that's a valid concern. Maybe it could be addressed by keeping 
track of dynshmem usage per module, and somehow informing the user about 
the usage pattern in case of OOM.



It would be more interesting


Sure, but then you'd definitely need a dynamic allocator, no?


With respect to imessages specifically, what is the motivation for
using shared memory rather than something like an SLRU?  The new
LISTEN implementation uses an SLRU and handles variable-size messages,
so it seems like it might be well-suited to this task.


Well, imessages predates the new LISTEN implementation by some moons. 
They are intended to replace (unix-ish) pipes between processes. I fail 
to see the immediate link between (S)LRU and inter-process message 
passing. It might be more useful for multiple LISTENers, but I bet it 
has slightly different semantics than imessages.


But to be honest, I don't know too much about the new LISTEN 
implementation. Do you think a loss-less 
(single)-process-to-(single)-process message passing system could be 
built on top of it?



Incidentally, the link for the imessages patch on the CommitFest page
points to 
http://archives.postgresql.org/message-id/ab0cd52a64e788f4ecb4515d1e6e4...@localhost
- which is the dynamic shmem patch.  So I'm not sure where to find the
latest imessages patch.


The archive doesn't display attachments very well. But the imessages 
patch is part of that mail. Maybe you still find it in your local mailbox?


In the archive view, it starts at the line that says:
*** src/backend/storage/ipc/imsg.c  dc149eef487eafb43409a78b8a33c70e7d3c2bfa

(and, well, the dynshmem stuff ends just before that line. Those were 
two .diff files attached, IIRC).


Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynamically allocating chunks from shared memory

2010-07-20 Thread Alvaro Herrera
Excerpts from Markus Wanner's message of mar jul 20 14:36:55 -0400 2010:

  I'm also unconvinced that spinlocks are the best locking primitive here.
  Why not lwlocks?
 
 It's derived from a completely lock-free algorithm, as proposed by Maged 
 M. Michael in "Scalable Lock-Free Dynamic Memory Allocator".

Hmm, deriving code from a paper published by IBM sounds like bad news --
who knows what patents they hold on the techniques there?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] antisocial things you can do in git (but not CVS)

2010-07-20 Thread Peter Eisentraut
On tis, 2010-07-20 at 14:34 -0400, Robert Haas wrote:
 Right now, it's easy to find all the commits by a particular
 committer, and it's easy to see who committed a particular patch, and
 the number of distinct committers is pretty small.  I'd hate to give
 that up.
 
 git log | grep '^Author' | sort | uniq -c | sort -n | less

git log --format=full | grep '^Commit' | sort | uniq -c | sort -n | less

 My preference would be to stick to a style where we identify the
 committer using the author tag and note the patch author, reviewers,
 whether the committer made changes, etc. in the commit message.  A
 single author field doesn't feel like enough for our workflow, and
 having a mix of authors and committers in the author field seems like
 a mess.

Well, I had looked forward to actually putting the real author into the
author field.

 2. Branch and tag management.  In CVS, there are branches and tags in
 only one place: on the server.  In git, you can have local branches
 and tags and remote branches and tags, and you can pull and push tags
 between servers.  If I'm working on a git repository that has branches
 master, REL9_0_STABLE .. REL7_4_STABLE, inner_join_removal,
 numeric_2b, and temprelnames, I want to make sure that I don't
 accidentally push the last three of those to the authoritative
 server... but I do want to push all the others.  Similarly I want to
 push only the correct subset of tags (though that should be less of
 an issue, at least for me, as I don't usually create local tags).  I'm
 not sure how to set this up, though.

I'm going to use one separate clone for my development and one
pristine one for the final commits and copy the patches over manually.
That also solves the next problem ...

 3. Merge commits.  I believe that we have consensus that commits
 should always be done as a squash, so that the history of all of our
 branches is linear.  But it seems to me that someone could
 accidentally push a merge commit, either because they forgot to squash
 locally, or because of a conflict between their local git repo's
 master branch and origin/master.  Can we forbid this?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Finding slave WAL application time delay

2010-07-20 Thread Bruce Momjian
Someone at OSCON just asked if there is a way to find the _time_ delay
between received and applied WAL.  Right now
pg_last_xlog_replay_location() only reports the information as a WAL
position.  It would be nice to report it as a time, e.g. milliseconds.

Because an idle master will not generate WAL, I don't think there is a
way to report the time delay in receiving WAL.
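
For illustration, what a 9.0 standby can already report (a sketch using
the existing functions; note it yields WAL positions only, which is
exactly the limitation above):

	SELECT pg_last_xlog_receive_location() AS received,
	       pg_last_xlog_replay_location()  AS replayed;

Turning that difference into a time delay is the part that's missing.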

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some git conversion issues

2010-07-20 Thread Peter Eisentraut
On tis, 2010-07-20 at 20:46 +0200, Magnus Hagander wrote:
 For one thing, this showed up in a lot of .po files for 8.1.0RC1.
 Peter, can you comment on if this coincides with the tools you use to
 do those things?

There are/were some games being played with the key words, so this
effect sounds plausible to me.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynamically allocating chunks from shared memory

2010-07-20 Thread Markus Wanner

Hi,

On 07/20/2010 09:05 PM, Alvaro Herrera wrote:

Hmm, deriving code from a paper published by IBM sounds like bad news --
who knows what patents they hold on the techniques there?


Yeah, that might be an issue. Note, however, that the lock-based variant 
differs substantially from what's been published. And I sort of doubt 
their patents cover a lot of stuff that's not lock-free-ish.


But again, I'd also very much welcome any other allocator. In my 
opinion, it's the most annoying drawback of the process-based design 
compared to a threaded variant (from the perspective of a developer).


Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some git conversion issues

2010-07-20 Thread Peter Eisentraut
On tis, 2010-07-20 at 15:12 +0200, Magnus Hagander wrote:
 Working on the git conversion

What's the tool that is being used now?  Can you keep us up to date on
the options etc. you plan to use?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread Alvaro Herrera
Excerpts from Robert Haas's message of mar jul 20 11:48:29 -0400 2010:

  That seems sub-optimal; I can see people wanting to use this feature to do 
  something like:
 
  psql -c 'set work_mem = blah' -f script.sql
 
  and then being surprised when it works differently than just `psql -f 
  script.sql`.
 
 I agree... but then they might also be surprised if psql -c
 'something' works differently from psql -c 'something' -f /dev/null

I think we should just make sure -X works, and have .psqlrc be read when
it's not specified regardless of -f and -c switches.

Otherwise it's just plain confusing.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some git conversion issues

2010-07-20 Thread Magnus Hagander
On Tue, Jul 20, 2010 at 21:15, Peter Eisentraut pete...@gmx.net wrote:
 On tis, 2010-07-20 at 15:12 +0200, Magnus Hagander wrote:
 Working on the git conversion

 What's the tool that is being used now?  Can you keep us up to date on
 the options etc. you plan to use?

I'm currently running tests with a bunch of different tools, to see
which looks best. Right now, it looks like cvs2git is best when we
want the keywords included. And yes, I'll definitely post the exact
options as soon as I've finished the testruns.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] managing git disk space usage

2010-07-20 Thread Peter Eisentraut
On tis, 2010-07-20 at 13:04 -0400, Robert Haas wrote:
 2. Clone the origin n times.  Use more disk space.  Live with it.  :-)

Well, I plan to use cp -a to avoid cloning over the network n times, but
other than that that was my plan.  My .git directory currently takes 283
MB, so I think I can just about live with that.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Query optimization problem

2010-07-20 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jul 20, 2010 at 11:23 AM, Dimitri Fontaine
 dfonta...@hi-media.com wrote:
   JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
 - WHERE (d1.ID=234409763) or (d2.ID=234409763)
 + WHERE (d2.BasedOn=234409763) or (d2.ID=234409763)

 I was thinking of the equivalence class machinery as well.  I think
 the OR clause may be the problem.  If you just had d1.ID=constant, I
 think it would infer that d1.ID, d2.BasedOn, and the constant formed
 an equivalence class.  But here you obviously can't smash the constant
 into the equivalence class, and I think the planner's not smart enough
 to consider other ways of applying an equivalent qual.  In fact, I
 have some recollection that Tom has explicitly rejected adding support
 for this in the past, on the grounds that the computation would be too
 expensive for the number of queries it would help.  Still, it seems to
 keep coming up.

Well, what I'm thinking now could have nothing to do with how the code
works. I'd have to check, but well, it's easier to write this mail and
get the chance to have you wonder :)

So, the JOIN condition teaches us that d2.BasedOn=d1.ID, and the OP
would want the planner to derive that (d1.ID=234409763) is the same
thing as (d2.BasedOn=234409763). I guess it would make sense to produce
plans for both formulations and pick one based on the costs.

Now, whether it makes sense to generate this many more plans to analyze
in the general case, I have no idea. But given only one join and only
one WHERE clause where the equivalence applies…

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] standard_conforming_strings

2010-07-20 Thread Peter Eisentraut
On tis, 2010-07-20 at 13:31 +0300, Marko Kreen wrote:
   http://wiki.postgresql.org/wiki/Standard_conforming_strings
 
 There are two sorts of support:
 
 1. Detect stdstr on startup and use that setting.
 
 2. Detect online changes to stdstr and follow them.
 
 AFAICS psycopg does not support 2).  Should we care about that?
 
 There are probably other drivers with partial support.

I think a driver could also just use E'' when it sees a backslash in the
string.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-07-20 Thread David Christensen

On Jul 20, 2010, at 2:18 PM, Alvaro Herrera wrote:

 Excerpts from Robert Haas's message of mar jul 20 11:48:29 -0400 2010:
 
 That seems sub-optimal; I can see people wanting to use this feature to do 
 something like:
 
 psql -c 'set work_mem = blah' -f script.sql
 
 and then being surprised when it works differently than just `psql -f 
 script.sql`.
 
 I agree... but then they might also be surprised if psql -c
 'something' works differently from psql -c 'something' -f /dev/null
 
 I think we should just make sure -X works, and have .psqlrc be read when
 it's not specified regardless of -f and -c switches.
 
 Otherwise it's just plain confusing.


+1.

Regards,

David
--
David Christensen
End Point Corporation
da...@endpoint.com





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] managing git disk space usage

2010-07-20 Thread Andrew Dunstan



Robert Haas wrote:

Tom and, I believe, also Andrew have expressed some concerns about the
space that will be taken up by having multiple copies of the git
repository on their systems.  While most users can probably get by
with a single repository, committers will likely need one for each
back-branch that they work with, and we have quite a few of those.

After playing around with this a bit, I've come to the conclusion that
there are a couple of possible options but they've all got some
drawbacks.

1. Clone the origin.  Then, clone the clone n times locally.  This
uses hard links, so it saves disk space.  But, every time you want to
pull, you first have to pull to the main clone, and then to each of
the slave clones.  And same thing when you want to push.


  


You can have a cron job that does the first pull fairly frequently. It 
should be a fairly cheap operation unless the git protocol is dumber 
than I think.


The second pull is the equivalent of what we do now with cvs update.

Given that, you could push commits directly to the authoritative repo and 
wait for the cron job to catch up your local base clone.


I think that's the pattern I will probably try to follow.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Status report on writeable CTEs

2010-07-20 Thread David Fetter
On Sat, Jul 17, 2010 at 01:15:22AM +0900, Hitoshi Harada wrote:
 2010/7/17 Marko Tiikkaja marko.tiikk...@cs.helsinki.fi:
  On 7/16/10 6:15 PM +0300, Hitoshi Harada wrote:
 
  1. Use MaterialNode instead of adding DtScanNode. Since MaterialNode
  is an existing node that works with a single tuplestore, it might be sane to
  modify this so that it accepts tuplestore from Query instead of its
  child node.
 
  I thought about this, but I don't necessarily like the idea of overloading
  executor nodes.
 
 Neither do I have a good shape for this solution. Maybe it's not a good
 idea. But my concern is adding DtScanNode, which looks similar to
 MaterialNode. Of course each purpose is different, but quite a big part
 will overlap, I think.

Any ideas as to how to factor this out?  Handiest ideas would be in
the form of a patch ;)

  2. Use temp table instead of tuplestore list. Since we agreed we need
  to execute each plan one by one, starting and shutting down the executor,
  it now looks like a very simple strategy.
 
  I didn't look at this because I thought using a tuplestore receiver in the
  portal logic was simple enough.  Any thoughts on how this would work?
 
 It's just deconstructing queries like:
 
 WITH t AS (INSERT INTO x ... RETURNING *)
 SELECT * FROM t;
 
 to
 
 CREATE TEMP TABLE t AS INSERT INTO x ... RETURNING *;
 SELECT * FROM t;
 
 While the second statement is not implemented yet, it will be much simpler.

So CTAS would get expanded into CTA[row-returning query] ?
Interesting.  How much work would that part be?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Status report on writeable CTEs

2010-07-20 Thread Merlin Moncure
On Fri, Jul 16, 2010 at 12:15 PM, Hitoshi Harada umi.tan...@gmail.com wrote:
 2010/7/17 Marko Tiikkaja marko.tiikk...@cs.helsinki.fi:
 On 7/16/10 6:15 PM +0300, Hitoshi Harada wrote:

 1. Use MaterialNode instead of adding DtScanNode. Since MaterialNode
 is an existing node that works with a single tuplestore, it might be sane to
 modify this so that it accepts tuplestore from Query instead of its
 child node.

 I thought about this, but I don't necessarily like the idea of overloading
 executor nodes.

 Neither do I have a good shape for this solution. Maybe it's not a good
 idea. But my concern is adding DtScanNode, which looks similar to
 MaterialNode. Of course each purpose is different, but quite a big part
 will overlap, I think.

 2. Use temp table instead of tuplestore list. Since we agreed we need
 to execute each plan one by one, starting and shutting down the executor,
 it now looks like a very simple strategy.

 I didn't look at this because I thought using a tuplestore receiver in the
 portal logic was simple enough.  Any thoughts on how this would work?

 It's just deconstructing queries like:

 WITH t AS (INSERT INTO x ... RETURNING *)
 SELECT * FROM t;

 to

 CREATE TEMP TABLE t AS INSERT INTO x ... RETURNING *;
 SELECT * FROM t;

Is it acceptable for a wCTE query to manipulate the system catalogs?
Couldn't this cause performance issues in some cases?

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] antisocial things you can do in git (but not CVS)

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 2:42 PM, Magnus Hagander mag...@hagander.net wrote:
 On Tue, Jul 20, 2010 at 20:34, Robert Haas robertmh...@gmail.com wrote:
 I have some concerns related to the upcoming conversion to git and how
 we're going to avoid having things get messy as people start using the
 new repository.  git has a lot more flexibility and power than CVS,
 and I'm worried that it would be easy, even accidentally, to screw up
 our history.

 1. Inability to cleanly and easily (and programmatically) identify who
 committed what.  With CVS, the author of a revision is the person who
 committed it, period.  With git, the author string can be set to
 anything the person typing 'git commit' feels like.  I think there is
 also a committer field, but that doesn't always appear and I'm not
 clear on how it works.  Also, the author field defaults to something
 dumb if you don't explicitly set it to a value.  So I'm worried we
 could end up with stuff like this in the repository:

 I'm pretty sure we can enforce this on the server side, refusing
 commits that don't follow our standard. I haven't done it myself
 (yet), but I've read about it.

+1, though I see downthread that Peter has a contrary opinion.

 My preference would be to stick to a style where we identify the
 committer using the author tag and note the patch author, reviewers,
 whether the committer made changes, etc. in the commit message.  A
 single author field doesn't feel like enough for our workflow, and
 having a mix of authors and committers in the author field seems like
 a mess.

 +1.


 2. Branch and tag management.  In CVS, there are branches and tags in
 only one place: on the server.  In git, you can have local branches
 and tags and remote branches and tags, and you can pull and push tags
 between servers.  If I'm working on a git repository that has branches
 master, REL9_0_STABLE .. REL7_4_STABLE, inner_join_removal,
 numeric_2b, and temprelnames, I want to make sure that I don't
 accidentally push the last three of those to the authoritative
 server... but I do want to push all the others.  Similarly I want to
 push only the correct subset of tags (though that should be less of
 an issue, at least for me, as I don't usually create local tags).  I'm
 not sure how to set this up, though.

 We could put a safeguard in place on the server that won't let you
 push a branch and require that additions of new branches be done
 manually on the server?

On this one, I'd just like a way to prevent accidents.  Is there maybe
a config option I can set on my local repository?
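
(One purely local safeguard that might work -- an untested sketch, not
project policy: set explicit push refspecs so a bare "git push" only
ever considers the branches you list.  Branch names below are just
examples.

	git config remote.origin.push refs/heads/master
	git config --add remote.origin.push refs/heads/REL9_0_STABLE

Anything not listed would then have to be pushed explicitly by name.)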

 3. Merge commits.  I believe that we have consensus that commits
 should always be done as a squash, so that the history of all of our
 branches is linear.  But it seems to me that someone could
 accidentally push a merge commit, either because they forgot to squash
 locally, or because of a conflict between their local git repo's
 master branch and origin/master.  Can we forbid this?

 Again, I haven't done it, but I've read about it, and I'm almost
 certain we can enforce this, yes.

OK, that sounds good...

 4. History rewriting.  Under what circumstances, if any, are we OK
 with rebasing the master?  For example, if we decide not to have merge
 commits, and somebody does a merge commit anyway, are we going to
 rebase to get rid of it?

 That's something we need a good policy for. Merge commits are special.
 For content commits, I think we should basically *never* do that. If
 someone commits bad content, we should just make a revert commit which
 keeps history linear and just removes the changes as a new commit.

Yeah, I agree.  I'm not sure that merge commits are the ONLY situation
where we'd want to do this, but it should be reserved for cases where
just reversing out the diff wouldn't be sufficient for some reason and
we need to make it as though it never happened.  I don't think it's
probably necessary to disallow this completely - the default setting
of allowing it only with + is probably enough.
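
(For anyone unfamiliar with the notation: the "+" prefix on a refspec
is what forces a non-fast-forward update, e.g.

	git push origin +master

so requiring it at least makes history rewriting a deliberate act.)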

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] antisocial things you can do in git (but not CVS)

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 3:12 PM, Peter Eisentraut pete...@gmx.net wrote:
 Well, I had looked forward to actually putting the real author into the
 author field.

What if there's more than one?  What if you make changes yourself?
How will you credit the reviewer?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynamically allocating chunks from shared memory

2010-07-20 Thread Alvaro Herrera
Excerpts from Markus Wanner's message of mar jul 20 14:54:42 -0400 2010:

  With respect to imessages specifically, what is the motivation for
  using shared memory rather than something like an SLRU?  The new
  LISTEN implementation uses an SLRU and handles variable-size messages,
  so it seems like it might be well-suited to this task.
 
 Well, imessages predates the new LISTEN implementation by some moons. 
 They are intended to replace (unix-ish) pipes between processes. I fail 
 to see the immediate link between (S)LRU and inter-process message 
 passing. It might be more useful for multiple LISTENers, but I bet it 
 has slightly different semantics than imessages.

I guess what Robert is saying is that you don't need shmem to pass
messages around.  The new LISTEN implementation was just an example.
imessages aren't supposed to use it directly.  Rather, the idea is to
store the messages in a new SLRU area.  Thus you don't need to mess with
dynamically allocating shmem at all.

 But to be honest, I don't know too much about the new LISTEN 
 implementation. Do you think a loss-less 
 (single)-process-to-(single)-process message passing system could be 
 built on top of it?

I don't think you should build on top of LISTEN but of slru.c.  This is
probably more similar to multixact (see multixact.c) than to the new
LISTEN implementation.

I think it should be rather straightforward.  There would be a unique
append-point; each process desiring to send a new message to another
backend would add a new message at that point.  There would be one read
pointer per backend, and it would be advanced as messages are consumed.
Old segments could be trimmed as backends advance their read pointer,
similar to how sinval queue is handled.
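
To illustrate the bookkeeping this would need (all names invented, not
working code; the real backend limit would come from MaxBackends):

	#include <stdint.h>

	#define IMSG_MAX_BACKENDS 1024	/* placeholder value */

	typedef struct ImsgSlruControl
	{
		int64_t	append_point;	/* next free offset in the SLRU
					 * area; senders advance it */
		int64_t	read_pointer[IMSG_MAX_BACKENDS];
					/* per-backend consume position;
					 * segments behind the minimum can
					 * be trimmed */
	} ImsgSlruControl;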

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for 9.1: initdb -C option

2010-07-20 Thread Kevin Grittner
Top posting.  On purpose.  :p
 
This patch seems to be foundering in a sea of opinions.  It seems
that everybody wants to do *something* about this, but what?
 
For one more opinion, my shop has chosen to never touch the default
postgresql.conf file any more, beyond adding one line to the bottom
of it which is an include directive, to bring in our overrides.
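
(Concretely, the one added line looks something like

	include 'local_overrides.conf'

with the file name being whatever a given shop uses.)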
 
What will make everyone happy here?
 
-Kevin
 
 
KaiGai Kohei kai...@ak.jp.nec.com wrote:
 (2010/03/29 14:04), David Christensen wrote:
 
 Enclosed is a patch to add a -C option to initdb to allow you to
 easily append configuration directives to the generated
 postgresql.conf file for use in programmatic generation.  In my
 case, I'd been creating multiple db clusters with a script and
 would have specific overrides that I needed to make.  This patch
 fell out of the desire to make this a little cleaner.  
 Please review and comment.
 
  From the commit message:
 
  This is a simple mechanism to allow you to provide explicit
  overrides to any GUC at initdb time.  As a basic example,
  consider the case where you are programmatically generating
  multiple db clusters in order to test various
  configurations:
 
$ for cluster in 1 2 3 4 5 6;
   do initdb -D data$cluster -C "port = 1234$cluster" \
 -C 'max_connections = 10' -C shared_buffers=1M;
  done
 
  A possible future improvement would be to provide some basic
  formatting corrections to allow specifications such as -C
  'port 1234', -C port=1234, and -C 'port = 1234' to all be
  ultimately output as 'port = 1234' in the final output. 
  This would be consistent with postmaster's parsing.
 
  The -C flag was chosen to be a mnemonic for config.
 
 I'd like to volunteer reviewing your patch at first in this commit
 fest.
 
 We already had a few comments on the list before. I want to see
 your opinion for the suggestions prior to code reviews.
 
 Itagaki-san suggested:
 
 | Why don't you use just echo 'options' \
|   >> $PGDATA/postgresql.conf ?
 | Could you explain where the -C options is better than initdb +
 | echo?
 
 Greg suggested:
 | We had a patch not quite make it for 9.0 that switched over the 
 | postgresql.conf file to make it easy to scan a whole directory
 | looking for configuration files:
 | 
http://archives.postgresql.org/message-id/9837222c0910240641p7d75e2a4u2cfa6c1b5e603...@mail.gmail.com
 |
 | The idea there was to eventually reduce the amount of
 | postgresql.conf hacking that initdb and other tools have to do. 
 | Your patch would add more code into a path that I'd like to see
 | reduced significantly.
 |
 | That implementation would make something easy enough for your
| use case too (below untested but shows the general idea):
 |
 | $ for cluster in 1 2 3 4 5 6;
 |  do initdb -D data$cluster
 |  (
|  cat <<EOF
 |  port = 1234$cluster;
 |  max_connections = 10;
 |  shared_buffers=1M;
 |  EOF
|  ) > data$cluster/conf.d/99clustersetup
 | done
 |
 | This would actually work just fine for what you're doing right
| now if you used >> data$cluster/postgresql.conf for that next
 | to last line there.  There would be duplicates, which I'm
 | guessing is what you wanted to avoid with this patch, but the
 | later values set for the parameters added to the end would win
 | and be the active ones.
 
 Peter suggested:
  
 | I like this idea, but please use small -c for consistency with
 | the postgres program.
 
It seems to me that what Greg suggested is a recent trend. Keeping 
additional configuration in separate files makes it easy for packagers
to install and uninstall third-party plugins, compared with one
consolidated configuration file.
 
However, $PGDATA/postgresql.conf is created by initdb, so it does
not belong to any particular package. I am not certain whether the
recent trend is also suitable for us.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for 9.1: initdb -C option

2010-07-20 Thread David Christensen

On Jul 20, 2010, at 5:00 PM, Kevin Grittner wrote:

 Top posting.  On purpose.  :p
 
 This patch seems to be foundering in a sea of opinions.  It seems
 that everybody wants to do *something* about this, but what?
 
 For one more opinion, my shop has chosen to never touch the default
 postgresql.conf file any more, beyond adding one line to the bottom
 of it which is an include directive, to bring in our overrides.
 
 What will make everyone happy here?

So you'll now issue:

$ initdb ... -C 'include localconfig.conf' ? :-)

Regards,

David
--
David Christensen
End Point Corporation
da...@endpoint.com





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] antisocial things you can do in git (but not CVS)

2010-07-20 Thread Andrew Dunstan



Robert Haas wrote:

I have some concerns related to the upcoming conversion to git and how
we're going to avoid having things get messy as people start using the
new repository.  git has a lot more flexibility and power than CVS,
and I'm worried that it would be easy, even accidentally, to screw up
our history.

1. Inability to cleanly and easily (and programmatically) identify who
committed what.  
  


Each committer sets their name and email using git config. Doesn't look 
like a problem. We don't have such a large number of committers that 
this should be much of an issue. Maybe we can set a pre-receive hook to 
make sure that it's set appropriately?
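
(That per-committer setup is just, with placeholder values:

	git config --global user.name "A. Committer"
	git config --global user.email committer@example.org

so there is not much room for error once it is done.)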


2. Branch and tag management.  
  

[snip]

I'm inclined to say that as now committers should not normally push 
tags. Marc or whoever is managing things should create the various tags. 
I think our current tagging policy is about right. git push doesn't 
push tags by default, so you'd have to be trying hard to mess this up.

3. Merge commits.  I believe that we have consensus that commits
should always be done as a squash, so that the history of all of our
branches is linear.  But it seems to me that someone could
accidentally push a merge commit, either because they forgot to squash
locally, or because of a conflict between their local git repo's
master branch and origin/master.  Can we forbid this?
  


Again, a pre-receive hook might be able to handle this. See 
http://progit.org/book/ch7-4.html
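
(An untested sketch of what such a hook might look like, rejecting any
pushed commit that has more than one parent:

	#!/bin/sh
	# pre-receive: stdin carries "<old-sha> <new-sha> <refname>" per ref
	while read oldrev newrev refname
	do
		for c in $(git rev-list "$oldrev..$newrev")
		do
			# --parents prints the commit plus its parents,
			# so three or more words means a merge commit
			n=$(git rev-list --parents -n 1 "$c" | wc -w)
			if [ "$n" -gt 2 ]; then
				echo "merge commit $c rejected on $refname" >&2
				exit 1
			fi
		done
	done
	exit 0

Newly created refs, where old-sha is all zeroes, would need extra
handling, but that's the general shape.)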

4. History rewriting.  Under what circumstances, if any, are we OK
with rebasing the master?  For example, if we decide not to have merge
commits, and somebody does a merge commit anyway, are we going to
rebase to get rid of it?

  


In the end, if we screw up badly enough we can just roll things back. It 
would be a pain, but not insurmountably so. I think we need to expect 
that there will be some teething issues. I keep 7 days worth of backups 
of the CVS repo constantly now, and I'll probably do the same with git, 
and I'm sure there will be other backups.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for 9.1: initdb -C option

2010-07-20 Thread Kevin Grittner
David Christensen da...@endpoint.com wrote:
 On Jul 20, 2010, at 5:00 PM, Kevin Grittner wrote:
 
 my shop has chosen to never touch the default postgresql.conf
 file any more, beyond adding one line to the bottom of it which
 is an include directive, to bring in our overrides.
 
 So you'll now issue:
 
 $ initdb ... -C 'include localconfig.conf' ? :-)
 
Yeah, that would cover us.  I'm wondering if it is flexible enough
to serve everyone else so well.  I hesitate to suggest it, but
perhaps it would, in combination with the include directive
supporting a directory name to mean all files in the directory?  Or
maybe if it supported wildcards?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynamically allocating chunks from shared memory

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 5:46 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Excerpts from Markus Wanner's message of mar jul 20 14:54:42 -0400 2010:

  With respect to imessages specifically, what is the motivation for
  using shared memory rather than something like an SLRU?  The new
  LISTEN implementation uses an SLRU and handles variable-size messages,
  so it seems like it might be well-suited to this task.

 Well, imessages predates the new LISTEN implementation by some moons.
 They are intended to replace (unix-ish) pipes between processes. I fail
 to see the immediate link between (S)LRU and inter-process message
 passing. It might be more useful for multiple LISTENers, but I bet it
 has slightly different semantics than imessages.

 I guess what Robert is saying is that you don't need shmem to pass
 messages around.  The new LISTEN implementation was just an example.
 imessages aren't supposed to use it directly.  Rather, the idea is to
 store the messages in a new SLRU area.  Thus you don't need to mess with
 dynamically allocating shmem at all.

Right.  I might be full of bull, but that's what I'm saying.  :-)

 But to be honest, I don't know too much about the new LISTEN
 implementation. Do you think a loss-less
 (single)-process-to-(single)-process message passing system could be
 built on top of it?

 I don't think you should build on top of LISTEN but of slru.c.  This is
 probably more similar to multixact (see multixact.c) than to the new
 LISTEN implementation.

 I think it should be rather straightforward.  There would be a unique
 append-point; each process desiring to send a new message to another
 backend would add a new message at that point.  There would be one read
 pointer per backend, and it would be advanced as messages are consumed.
 Old segments could be trimmed as backends advance their read pointer,
 similar to how sinval queue is handled.

If the messages are mostly unicast, it might be nice to contrive a
method whereby backends didn't need to explicitly advance over
messages destined only for other backends.  Like maybe allocate a
small, fixed amount of shared memory sufficient for two "pointers"
into the SLRU area per backend, and then use the SLRU to store each
message with a header indicating where the next message is to be
found.  For each backend, you store one pointer to the first queued
message and one pointer to the last queued message.  New messages can
be added by making the current last message point to a newly added
message and updating the last message pointer for that backend.  You'd
need to think about the locking and reference counting carefully to
make sure you eventually freed up unused pages, but it seems like it
might be doable.  Of course, if the messages are mostly multi/anycast,
or if the rate of messaging is low enough that the aforementioned
complexity is not worth bothering with, then, what you said.
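
To sketch that layout (invented names, illustration only):

	#include <stdint.h>

	typedef struct ImsgHeader	/* precedes each message in the SLRU */
	{
		int64_t	next;	/* offset of the next message for the
				 * same backend, or -1 if none yet */
		int32_t	len;	/* payload length in bytes */
	} ImsgHeader;

	typedef struct ImsgQueue	/* one per backend, in fixed shmem */
	{
		int64_t	head;	/* first unconsumed message, or -1 */
		int64_t	tail;	/* last queued message, or -1 */
	} ImsgQueue;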

One big advantage of attacking the problem with an SLRU is that
there's no fixed upper limit on the amount of data that can be
enqueued at any given time.  You can spill to disk or whatever as
needed (although hopefully you won't normally do so, for performance
reasons).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch for 9.1: initdb -C option

2010-07-20 Thread KaiGai Kohei
(2010/07/21 7:33), Kevin Grittner wrote:
 David Christensenda...@endpoint.com  wrote:
 On Jul 20, 2010, at 5:00 PM, Kevin Grittner wrote:
 
 my shop has chosen to never touch the default postgresql.conf
 file any more, beyond adding one line to the bottom of it which
 is an include directive, to bring in our overrides.
 
 So you'll now issue:

 $ initdb ... -C 'include localconfig.conf' ? :-)
 
 Yeah, that would cover us.  I'm wondering if it is flexible enough
 to serve everyone else so well.  I hesitate to suggest it, but
 perhaps it would, in combination with the include directive
 supporting a directory name to mean all files in the directory?  Or
 maybe if it supported wildcards?
 
I recall that David introduced the following example as a usage of
this feature.

  $ for cluster in 1 2 3 4 5 6;
do
  initdb -D data$cluster -C "port = 1234$cluster" \
         -C 'max_connections = 10' \
         -C shared_buffers=1M;
done

In this case, it tries to set up six database clusters with constant
max_connections and shared_buffers, but individual port numbers for
each database cluster.

Even if we support an include directive here, it may not help us
describe postgresql.conf in a smart way.

Then, how about Itagaki-san's suggestion?

Itagaki-san suggested:
|  Enclosed is a patch to add a -C option to initdb to allow you to easily
|  append configuration directives to the generated postgresql.conf file
| Why don't you just use echo 'options' >> $PGDATA/postgresql.conf ?
| Could you explain where the -C option is better than initdb + echo?

As long as you don't need any special configuration during the initial
setup launched by initdb, it indeed seems to me that we can edit the
postgresql.conf later.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com



Re: [HACKERS] Patch for 9.1: initdb -C option

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 6:00 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 What will make everyone happy here?

Nothing.

But on a more serious note, the basic dilemma with this patch is
whether it's useful enough to justify the extra code.  I think it's
pretty clearly harmless (modulo the fact that it might have bugs), but
is it useful enough to bother with?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] antisocial things you can do in git (but not CVS)

2010-07-20 Thread Andrew Dunstan
Robert Haas wrote:
 On Tue, Jul 20, 2010 at 3:12 PM, Peter Eisentraut pete...@gmx.net wrote:
  Well, I had looked forward to actually putting the real author into the
  author field.

 What if there's more than one?  What if you make changes yourself?
 How will you credit the reviewer?

I think our current practice is fine. Put it in the commit log.

cheers

andrew



Re: [HACKERS] Status report on writeable CTEs

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 5:13 PM, Merlin Moncure mmonc...@gmail.com wrote:
 2. Use a temp table instead of the tuplestore list. Since we agreed we
 need to execute each plan one by one, starting and shutting down the
 executor, it now looks like a very simple strategy.

 I didn't look at this because I thought using a tuplestore receiver in the
 portal logic was simple enough.  Any thoughts on how this would work?

 It's just deconstructing queries like:

 WITH t AS (INSERT INTO x ... RETURNING *)
 SELECT * FROM t;

 to

 CREATE TEMP TABLE t AS INSERT INTO x ... RETURNING *;
 SELECT * FROM t;

 Is it acceptable for a wCTE query to manipulate the system catalogs?
 Couldn't this cause performance issues in some cases?

Yeah, I suspect the performance wouldn't be too hot.  I think the idea
of writing into a tuplestore and then reading back out from it is the
way to go.
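
The existing tuplestore API already supports that write-then-read pattern;
here is a rough sketch of the idea (slot setup and error handling elided;
this illustrates the tuplestore calls, not the actual wCTE plumbing):

    #include "postgres.h"
    #include "miscadmin.h"          /* for work_mem */
    #include "utils/tuplestore.h"

    /*
     * "slot" is assumed to be a TupleTableSlot matching the CTE's tuple
     * descriptor; only the public tuplestore interface is used below.
     */
    static void
    wcte_buffer_sketch(TupleTableSlot *slot)
    {
        Tuplestorestate *store;

        /* bounded by work_mem in memory; spills to disk beyond that */
        store = tuplestore_begin_heap(false, false, work_mem);

        /* writing side: stash each RETURNING tuple as it is produced */
        tuplestore_puttupleslot(store, slot);

        /* reading side: scan the buffered tuples back out in order */
        while (tuplestore_gettupleslot(store, true, false, slot))
        {
            /* feed the tuple to the outer query here */
        }

        tuplestore_end(store);
    }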

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Finding slave WAL application time delay

2010-07-20 Thread Robert Haas
On Tue, Jul 20, 2010 at 3:13 PM, Bruce Momjian br...@momjian.us wrote:
 Someone at OSCON just asked if there is a way to find the _time_ delay
 between received and applied WAL.  Right now
 pg_last_xlog_replay_location() only reports the information as a WAL
 location.  It would be nice to report it as a time, e.g. in milliseconds.

There's probably more than one way you can define this.  See, for
example, previous discussions of max_standby_delay.

 Because an idle master will not generate WAL, I don't think there is a
 way to report the time delay in receiving WAL.

You could certainly come up with a sensible definition for this,
though there are also nonsensical ones.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


