Re: [HACKERS] Autovacuum vs statement_timeout

2007-04-18 Thread Magnus Hagander
On Tue, Apr 17, 2007 at 10:33:21PM -0400, Bruce Momjian wrote:
 Alvaro Herrera wrote:
  I think that is too strong an assumption, which is why I'm planning to
  back-patch the change to reset statement_timeout to 0 on autovacuum till
  8.0, as discussed.  I think I should also backpatch the change to set
  zero_damaged_pages as well (which is not on 8.0 AFAIR).
  
  It's very, very easy to change things in postgresql.conf.  Actually
  knowing what you are doing (i.e. thinking about the consequences for
  VACUUM and such) is a whole other matter.
 
 Frankly, setting statement_timeout in postgresql.conf seems so risky in
 so many ways, perhaps we just need to document that the parameter
 probably should not be set in postgresql.conf, and why.

I'd suggest doing both. Tell people that it's dangerous (probably to the
point of a comment in the sample config file), but *also* force it in
pg_dump since you can't really expect people to read the documentation.
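
To make the intent concrete, here is a minimal sketch of what forcing the
setting would mean at the session level (Python with the psycopg2 driver;
the connection string, output file, and table name are invented, and
pg_dump itself is of course C):

    import psycopg2

    # Override any statement_timeout inherited from postgresql.conf for
    # this session only, so a long-running COPY cannot be killed mid-dump.
    conn = psycopg2.connect("dbname=mydb")
    cur = conn.cursor()
    cur.execute("SET statement_timeout = 0")
    with open("bigtable.out", "w") as f:
        cur.copy_expert("COPY public.bigtable TO STDOUT", f)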

//Magnus


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Can't ri_KeysEqual() consider two nulls as equal?

2007-04-18 Thread Richard Huxton

Simon Riggs wrote:

On Tue, 2007-04-17 at 17:16 -0400, Tom Lane wrote:

A recent discussion led me to the idea that FK triggers are fired
unnecessarily during an UPDATE if the foreign-key column(s) contain
any NULLs, because ri_KeysEqual() treats two nulls as unequal,
and therefore we conclude the row has changed when it has not.


That the FK trigger *can be optimised away* is true. There's no need to have
a discussion about whether NULL == NULL; the critical test is: if I overwrote
it, would you be able to tell? The answer is no, so away it goes.


The test should perhaps be named "unchanged" rather than "equal".
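
As a minimal sketch of those semantics (Python, with SQL NULL modeled as
None; an illustration of the proposal, not the actual ri_KeysEqual() code):

    def keys_unchanged(old_keys, new_keys):
        # Unlike SQL equality, where NULL = NULL yields unknown, a key
        # that was NULL and is still NULL counts as unchanged, so the FK
        # trigger can be skipped.
        assert len(old_keys) == len(new_keys)
        for old, new in zip(old_keys, new_keys):
            if old is None and new is None:
                continue
            if old is None or new is None or old != new:
                return False
        return True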

--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] schema creation during initdb

2007-04-18 Thread sharath kumar

Sorry, I put the question wrongly. I wanted to ask whether we can create a
schema at createdb time, so that I can hide the following from the
user:
psql -c 'create schema foo' mytemplate
psql -c 'create table foo.bar ...' mytemplate
Whenever a user runs the createdb command, the above schema and table
should be created automatically.
Thanks,
Sharat.

On 4/18/07, Andrew Dunstan [EMAIL PROTECTED] wrote:


sharath kumar wrote:
 How can I create a schema during initdb time?
 For example pg_catalog, pg_toast, and information_schema are created
 during initdb time. Likewise I want to create my own schema at initdb
 time. How can I do it? Also, how can I create a table in this schema at
 that time?



This question really does not belong on -hackers, which is about
postgres development, not usage. Next time use pgsql-general.

You have misunderstood the purpose of initdb, which is to prepare a
location for running a postmaster against.

After initdb has run there are no user databases yet created (unless you
count the postgres database).

So you would need to do something like this:

  initdb ...
  pg_ctl start ...
  createdb mytemplate
  psql -c 'create schema foo' mytemplate
  psql -c 'create table foo.bar ...' mytemplate


After that you can do this:

  createdb --template mytemplate newdb

and the newdb will have your schema and table.

HTH

andrew






Re: [HACKERS] schema creation during initdb

2007-04-18 Thread Heikki Linnakangas

sharath kumar wrote:

Sorry, I put the question wrongly. I wanted to ask whether we can create a
schema at createdb time, so that I can hide the following from the
user:
psql -c 'create schema foo' mytemplate
psql -c 'create table foo.bar ...' mytemplate
Whenever a user runs the createdb command, the above schema and table should
be created automatically.


You should create a template database with the schema and the tables you 
want.  See chapter "19.3. Template Databases" in the user manual:


http://www.postgresql.org/docs/8.2/interactive/manage-ag-templatedbs.html

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] schema creation during initdb

2007-04-18 Thread NikhilS

Hi,

On 4/18/07, sharath kumar [EMAIL PROTECTED] wrote:


Sorry, I put the question wrongly. I wanted to ask whether we can create a
schema at createdb time, so that I can hide the following from the
user:
 psql -c 'create schema foo' mytemplate
 psql -c 'create table foo.bar ...' mytemplate
Whenever a user runs the createdb command, the above schema and table should
be created automatically.
Thanks,
Sharat.



One way that I can think of doing this would be by adding the details about
the new schema and the tables that lie therein to the
src/backend/catalog/information_schema.sql file (I don't know if this is the
recommended way, though). These will end up becoming part of template1, and
any subsequently created databases will contain them.

Regards,
Nikhils

--
EnterpriseDB   http://www.enterprisedb.com


Re: [HACKERS] Hacking on PostgreSQL via GIT

2007-04-18 Thread Aidan Van Dyk
* Tom Lane [EMAIL PROTECTED] [070418 01:33]:
 
 Um ... why do either of you feel there's an issue there?
 
 We switched over to $PostgreSQL$ a few years ago specifically to avoid
 creating merge problems for downstream repositories.  If there are any
 other keyword expansions left in the source text I'd vote to remove
 them.  If you have a problem with $PostgreSQL$, why?

Mine is only a generic warning. I convert many CVS repos to GIT, all
using the same gateway setup, so I haven't done anything specific for
PostgreSQL.  Most other projects are not as disciplined as PostgreSQL,
and I regularly see Modified, Date, Id, Log, etc. keywords, as well as
project-specific ones like PostgreSQL, OpenBSD, FreeBSD, etc.

Un-expansion *may* not be perfect...
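
For illustration, the heart of such an un-expansion pass is a one-line
rewrite (Python sketch; the keyword list is an example, not my gateway's
actual configuration):

    import re

    KEYWORDS = ("PostgreSQL", "Id", "Header", "Date", "Log", "Modified")
    EXPANDED = re.compile(r"\$(%s):[^$\n]*\$" % "|".join(KEYWORDS))

    def unexpand(text):
        # Collapse "$Keyword: expanded metadata $" back to bare "$Keyword$"
        # so diffs and merges never see the volatile expansion.
        return EXPANDED.sub(r"$\1$", text)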

a.

-- 
Aidan Van Dyk Create like a god,
[EMAIL PROTECTED]   command like a king,
http://www.highrise.ca/   work like a slave.




Re: [HACKERS] Hacking on PostgreSQL via GIT

2007-04-18 Thread Alvaro Herrera
Tom Lane wrote:
 Martin Langhoff [EMAIL PROTECTED] writes:
  Aidan Van Dyk wrote:
  And remember the warning I gave that my conversion is *not* a direct CVS
  import - I intentionally *unexpand* all Keywords before stuffing them
  into GIT so that merging and branching can ignore all the Keyword
  conflicts... 
 
  My import is unexpanding those as well to support rebasing and merging
  better.
 
 Um ... why do either of you feel there's an issue there?
 
 We switched over to $PostgreSQL$ a few years ago specifically to avoid
 creating merge problems for downstream repositories.  If there are any
 other keyword expansions left in the source text I'd vote to remove
 them.  If you have a problem with $PostgreSQL$, why?

One weird thing I noticed some time ago is that we have an $Id$ (or was
it $Header$? I don't remember) somewhere, which was supposed to be from
the upstream repo where we got the file, but it was being expanded
to our local version of the file.  We _also_ have the $PostgreSQL$ tag
in there, which carries the same info.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] schema creation during initdb

2007-04-18 Thread Andrew Dunstan

NikhilS wrote:



One way that I can think of doing this would be by adding the details 
about the new schema and the tables that lie therein to the 
src/backend/catalog/information_schema.sql file (I don't know if this 
is the recommended way, though). These will end up becoming part of 
template1, and any subsequently created databases will contain them.




Well, *I* wouldn't recommend it. The installed version of that file 
(which, if anything, is what he should mangle rather than the source) is 
really just an implementation artefact.


Why would this be better than either loading the schema+table directly 
into template1 after initdb, or making your own template db and loading 
them into that?


cheers

andrew

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Autovacuum vs statement_timeout

2007-04-18 Thread Robert Treat
On Tuesday 17 April 2007 21:25, Alvaro Herrera wrote:
 I think that is too strong an assumption, which is why I'm planning to
 back-patch the change to reset statement_timeout to 0 on autovacuum till
 8.0, as discussed.  I think I should also backpatch the change to set
 zero_damaged_pages as well (which is not on 8.0 AFAIR).

*blinks*  Um, can I get a pointer to that thread?  I can't imagine why we 
would actually want to automatically destroy our data without oversight from 
a DBA... I must be reading that wrong. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Can't ri_KeysEqual() consider two nulls as equal?

2007-04-18 Thread Stephan Szabo
On Tue, 17 Apr 2007, Tom Lane wrote:

 A recent discussion led me to the idea that FK triggers are fired
 unnecessarily during an UPDATE if the foreign-key column(s) contain
 any NULLs, because ri_KeysEqual() treats two nulls as unequal,
 and therefore we conclude the row has changed when it has not.
 I claim that both ri_KeysEqual() and ri_OneKeyEqual() could consider
 two nulls to be equal.  Furthermore it seems like ri_AllKeysUnequal()
 should do so too; the case can't arise at the moment because the sole
 caller already knows that one of the key sets contains no nulls, but
 if this weren't so, the optimization would be actively wrong if we
 concluded that two nulls were unequal.

Do you have any suggestions for alternate names? Keeping Equal in the names
seems dangerous, since people would likely expect the functions to act like
normal equality (with nulls being different).

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Autovacuum vs statement_timeout

2007-04-18 Thread Robert Treat
On Tuesday 17 April 2007 20:54, Tom Lane wrote:
 Robert Treat [EMAIL PROTECTED] writes:
  I'm with Joshua on this one. Statement_timeout is often used as a means
  for protection from long running statements due to server load and
  locking and all of the above commands can certainly fall into that area.
  If people feel strongly that the command line programs need a way to
  circumvent it, add a --ignore-statement-timeout option or similar
  mechanism.

 The worst-case scenario here is that your server fails and you discover
 that all your backups are corrupt because you didn't notice pg_dump was
 failing due to statement_timeout.  (Maybe it just recently started to
 fail because your biggest table grew past the point at which the COPY
 command exceeded statement_timeout.)


I don't think I recall anyone ever complaining about this, and this scenario 
has been plausible for *years*...

 I'm not excited about the other ones but I can see the argument for
 making pg_dump force the timeout to 0.


Allowing pg_dump to run unchecked could also lead to problems such as 
exceeding maintenance windows, causing performance issues, or causing 
trouble due to lock contention with ongoing pg_dumps.  I'll grant that the 
downsides aren't as extreme, but the current functionality provides simple 
workarounds (setting up specific dump users, for example).  If we force 
pg_dump to a 0 timeout, what means will be provided for the DBA who doesn't 
want to let pg_dump run unchecked? 
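
(For the record, the "specific dump users" workaround is just a per-role
override of the server-wide default. A sketch in Python/psycopg2, with the
role name and the two-hour ceiling invented:)

    import psycopg2

    conn = psycopg2.connect("dbname=postgres")
    conn.autocommit = True
    cur = conn.cursor()
    # Give the dump role its own limit, in milliseconds; pg_dump run as
    # this role inherits it at login instead of the postgresql.conf value.
    cur.execute("ALTER ROLE dump_user SET statement_timeout = 7200000")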

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Autovacuum vs statement_timeout

2007-04-18 Thread Alvaro Herrera
Robert Treat wrote:
 On Tuesday 17 April 2007 21:25, Alvaro Herrera wrote:
  I think that is too strong an assumption, which is why I'm planning to
  back-patch the change to reset statement_timeout to 0 on autovacuum till
  8.0, as discussed.  I think I should also backpatch the change to set
  zero_damaged_pages as well (which is not on 8.0 AFAIR).
 
 *blinks*  Um, can I get a pointer to that thread?  I can't imagine why we 
 would actually want to automatically destroy our data without oversight from 
 a DBA... I must be reading that wrong. 

You are -- I intend to set it to _off_ :-)

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


[HACKERS] Backend Crash

2007-04-18 Thread Harvell F
I've got a database corruption/backend crash problem with my 8.1.3  
database on Mac OS X Server 10.4.  I'm beginning the process of  
trying to recover it.  If anyone is interested in trying to fully  
understand the what, where, and why of the crash, please contact me.   
I've provided the basic information on the crash below.


Thanks,
  F

--
F Harvell
407 467-1919



--- cut ---

The database error was first identified by a series of emails that  
were sent with incorrect data.  My first step was to try to get a  
database dump (which crashed):


[EMAIL PROTECTED] subscription]$ pg_dump -U dinkdb -W dinkdb -f ~/dinkdb-`date +%Y%m%d`.dump

Password:
pg_dump: WARNING:  terminating connection because of crash of another  
server process
DETAIL:  The postmaster has commanded this server process to roll  
back the current transaction and exit, because another server process  
exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database  
and repeat your command.

pg_dump: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
pg_dump: SQL command to dump the contents of table feature_view  
failed: PQendcopy() failed.
pg_dump: Error message from server: server closed the connection  
unexpectedly

This probably means the server terminated abnormally
before or while processing the request.
pg_dump: The command was: COPY public.feature_view (visit_id,  
category, feature, username, notes, browser, capability_code, cookie,  
ip, created, updated) TO stdout;



I then shut down the server, rebooted it, and tried another dump:

[EMAIL PROTECTED] fharvell]$ pg_dump -U dinkdb -W dinkdb -f ~/dinkdb-`date +%Y%m%d`.dump

Password:
pg_dump: WARNING:  terminating connection because of crash of another  
server process
DETAIL:  The postmaster has commanded this server process to roll  
back the current transaction and exit, because another server process  
exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database  
and repeat your command.

pg_dump: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
pg_dump: SQL command to dump the contents of table browser_summary  
failed: PQendcopy() failed.
pg_dump: Error message from server: server closed the connection  
unexpectedly

This probably means the server terminated abnormally
before or while processing the request.
pg_dump: The command was: COPY public.browser_summary (browser,  
capability_code, aggregate_year, aggregate_month, aggregate_count,  
cookie, ip, created, updated) TO stdout;



I've now shut down the database and am copying it before trying to  
dump individual tables to recover as much data as possible.



The crash reporter provides:

**

Host Name:  amos
Date/Time:  2007-04-18 09:15:30.708 -0400
OS Version: 10.4.9 (Build 8P135)
Report Version: 4

Command: psql
Path:    /usr/local/pgsql/bin/psql
Parent:  bash [1239]

Version: ??? (???)

PID:1410
Thread: 0

Exception:  EXC_BAD_ACCESS (0x0001)
Codes:  KERN_INVALID_ADDRESS (0x0001) at 0x656c6972

Thread 0 Crashed:
0   libSystem.B.dylib   0x90007658 szone_free + 3148
1   libSystem.B.dylib   0x90015c50 fclose + 176
2   libedit.2.dylib 0x96c3f334 history_end + 1632
3   libedit.2.dylib 0x96c3f7bc history + 468
4   libedit.2.dylib 0x96c41c58 write_history + 84
5   psql0x7cd4 saveHistory + 56 (crt.c:355)
6   psql0x7d94 finishInput + 116 (crt.c:355)
7   libSystem.B.dylib   0x90014ef8 __cxa_finalize + 260
8   libSystem.B.dylib   0x90014dc4 exit + 36
9   psql0x1f58 _start + 344 (crt.c:249)
10  psql0x1dfc start + 60

Thread 0 crashed with PPC Thread State 64:
  srr0: 0x90007658    srr1: 0x1000d030  vrsave: 0x
    cr: 0x42002404     xer: 0x2000          lr: 0x90007624   ctr: 0x90014d20
    r0: 0x90007624      r1: 0xbfffef90      r2: 0x42002402    r3: 0x000d
    r4: 0x              r5: 0x000d          r6: 0x80808080    r7: 0x0003
    r8: 0x34313000      r9: 0xbfffeec5     r10: 0x           r11: 0x42002402
   r12: 0x90014d20     r13: 0x             r14: 0x           r15: 0x
   r16: 0x             r17: 0x0030         r18: 0x0400       r19: 0x0032
   r20: 0x0260         r21: 0x01806400     r22: 0xa0001fac   r23: 0x0264
   r24: 0x0002         r25: 0x0003         r26: 0x0002       r27: 0x656c696e
   r28: 0x0180         r29: 0x01806000     r30:

Re: [HACKERS] Backend Crash

2007-04-18 Thread Harvell F
  Just as a follow-up, it turns out that our Fibre Channel RAID was
power-cycled while the systems were up and running.  There are
several write errors in the postgresql log.


  Now I'm off to try to recover the data...

--
F Harvell
407 467-1919




On 18 Apr 2007, at 10:08, Harvell F wrote:

I've got a database corruption/backend crash problem with my 8.1.3  
database on Mac OS X Server 10.4.  I'm beginning the process of  
trying to recover it.  If anyone is interested in trying to fully  
understand the what, where, and why of the crash, please contact  
me.  I've provided the basic information on the crash below.






Re: [HACKERS] Autovacuum vs statement_timeout

2007-04-18 Thread Alvaro Herrera
Robert Treat wrote:
 On Tuesday 17 April 2007 20:54, Tom Lane wrote:

  I'm not excited about the other ones but I can see the argument for
  making pg_dump force the timeout to 0.
 
 Allowing pg_dump to run unchecked could also lead to problems such as 
 exceeding maintenance windows, causing performance issues, or causing 
 trouble due to lock contention with ongoing pg_dumps.

I have never ever seen a request to be able to control pg_dump and have
it stop dumping if the time taken to dump exceeded a threshold.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Autovacuum vs statement_timeout

2007-04-18 Thread Alvaro Herrera
  Tom Lane wrote:
   I seem to remember that we'd agreed that autovacuum should ignore any
   globally set statement_timeout, on the grounds that a poorly chosen
   setting could indefinitely prevent large tables from being vacuumed.

FWIW in testing, I just noticed that autovacuum does not pay attention
to statement_timeout anyway, because it is only set in
start_xact_command, which is used for interactive commands only, not in
autovacuum.  So there's no need to patch anything.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] Background LRU Writer/free list

2007-04-18 Thread Greg Smith
I'm mostly done with my review of the "Automatic adjustment of 
bgwriter_lru_maxpages" patch.  In addition to issues already brought up 
with that code, there are some small things that need to be done to merge 
it with the recent pg_stat_bgwriter patch, and I have some concerns about 
its unbounded scanning of the buffer pool; I'll write that up in more 
detail or just submit an improved patch as I get time this week.


But there's a fundamental question that has been bugging me, and I think 
it impacts the direction that code should take.  Unless I'm missing 
something in my reading, buffers written out by the LRU writer aren't ever 
put onto the free list.  I assume this is to avoid prematurely 
removing buffers that contain useful data.  In cases where a substantial 
percentage of the buffer cache is dirty, the LRU writer has to scan a 
significant portion of the pool looking for one of the rare clean buffers, 
then write it out.  When a client goes to grab a free buffer afterward, it 
has to scan the same section of the pool to find the now clean buffer, 
which seems redundant.


With the new patch, the LRU writer is fairly well bounded in that it 
doesn't write out more than it thinks it will need; you shouldn't get into 
a situation where many more pages are written than will be used in the 
near future.  Given that mindset, shouldn't pages the LRU scan writes just 
get moved onto the free list?
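
A toy model of the double scan (Python; the pool size and dirty fraction
are invented, and usage counts are ignored): the writer walks a stretch of
the pool cleaning buffers in place, and an allocating backend then re-walks
the same stretch to find one of them:

    import random

    POOL = [{"dirty": random.random() < 0.9} for _ in range(1000)]

    def lru_writer(hand, to_clean):
        # Write dirty buffers out but leave them where they are -- no
        # free list -- which is what the current code does.
        scanned = 0
        for i in range(hand, hand + len(POOL)):
            scanned += 1
            buf = POOL[i % len(POOL)]
            if buf["dirty"]:
                buf["dirty"] = False
                to_clean -= 1
                if to_clean == 0:
                    break
        return scanned

    def backend_alloc(hand):
        # A backend hunting for a clean buffer repeats the same walk.
        scanned = 0
        for i in range(hand, hand + len(POOL)):
            scanned += 1
            if not POOL[i % len(POOL)]["dirty"]:
                return scanned
        return scanned

    print(lru_writer(0, 10), backend_alloc(0))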


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Autovacuum vs statement_timeout

2007-04-18 Thread Joshua D. Drake

Alvaro Herrera wrote:

Robert Treat wrote:

On Tuesday 17 April 2007 20:54, Tom Lane wrote:



I'm not excited about the other ones but I can see the argument for
making pg_dump force the timeout to 0.
Allowing pg_dump to run unchecked could also lead to problems such as 
exceeding maintenance windows, causing performance issues, or causing 
trouble due to lock contention with ongoing pg_dumps.


If you have that problem, you need bigger hardware. pg_dump is a 
priority application. Not to mention, if you *really* want that type of 
behavior it is easy enough to wrap pg_dump in perl or python.


Let the foot guns be available to those that can script them :)
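
Something like this, say -- a sketch of that wrapper idea in Python rather
than perl, with the database name, path, and window length all invented:

    import subprocess

    def timed_dump(dbname, outfile, window_seconds):
        # Run pg_dump as a child process; if it exceeds the allowed
        # maintenance window, subprocess kills it and raises.  The policy
        # lives in the wrapper, not in the server's statement_timeout.
        try:
            subprocess.run(["pg_dump", "-f", outfile, dbname],
                           timeout=window_seconds, check=True)
        except subprocess.TimeoutExpired:
            raise RuntimeError("pg_dump exceeded the maintenance window")

    timed_dump("mydb", "/backups/mydb.dump", 4 * 3600)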

Sincerely,

Joshua D. Drake





--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] utf8 COPY DELIMITER?

2007-04-18 Thread Jim C. Nasby
On Tue, Apr 17, 2007 at 02:28:18PM -0400, Tom Lane wrote:
 I doubt that supporting a single multibyte character would be an
 interesting extension --- if we wanted to do anything at all there, we'd
 just generalize the delimiter to be an arbitrary string.  But it would
 certainly slow down COPY by some amount, which is an area where you'll
 get push-back for performance losses, so you'd need to make a convincing
 use-case for it.

Couldn't we use a fast code path (what we have now) for the case when
the delimiter is a single byte? That would allow for multi-character
delimiters without penalizing those that don't use them.

As for use case, I worked on migrating some stuff out of a MySQL
database a while ago, and having arbitrary string delimiters would have
made life easier.
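
To illustrate the shape of the two-path idea (a Python sketch; COPY's real
parser is C and handles quoting, escapes, and encodings that this ignores):

    def split_fields(line, delim):
        if len(delim) == 1:
            return split_single(line, delim)   # cheap per-byte comparison
        return split_multi(line, delim)        # general substring search

    def split_single(line, d):
        # Fast path: one comparison per character, roughly what the
        # single-byte case costs today.
        fields, start = [], 0
        for i, ch in enumerate(line):
            if ch == d:
                fields.append(line[start:i])
                start = i + 1
        fields.append(line[start:])
        return fields

    def split_multi(line, d):
        # Slow path, taken only when a longer delimiter is actually used.
        fields, start = [], 0
        while True:
            i = line.find(d, start)
            if i < 0:
                fields.append(line[start:])
                return fields
            fields.append(line[start:i])
            start = i + len(d)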
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Autovacuum vs statement_timeout

2007-04-18 Thread Robert Treat
On Wednesday 18 April 2007 11:30, Alvaro Herrera wrote:
 Robert Treat wrote:
  On Tuesday 17 April 2007 20:54, Tom Lane wrote:
   I'm not excited about the other ones but I can see the argument for
   making pg_dump force the timeout to 0.
 
  Allowing pg_dump to run un-checked could also lead to problems such as
  exceeding maintenence windows causing performance issues, or causing
  trouble due to lock contention with ongoing pg_dumps.

 I have never ever seen a request to be able to control pg_dump and have
 it stop dumping if the time taken to dump exceeded a threshold.

Given that we already have the functionality, I suspect you wouldn't...

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Background LRU Writer/free list

2007-04-18 Thread Jim C. Nasby
On Wed, Apr 18, 2007 at 09:09:11AM -0400, Greg Smith wrote:
 I'm mostly done with my review of the "Automatic adjustment of 
 bgwriter_lru_maxpages" patch.  In addition to issues already brought up 
 with that code, there are some small things that need to be done to merge 
 it with the recent pg_stat_bgwriter patch, and I have some concerns about 
 its unbounded scanning of the buffer pool; I'll write that up in more 
 detail or just submit an improved patch as I get time this week.
 
 But there's a fundamental question that has been bugging me, and I think 
 it impacts the direction that code should take.  Unless I'm missing 
 something in my reading, buffers written out by the LRU writer aren't ever 
 put onto the free list.  I assume this is to avoid prematurely 
 removing buffers that contain useful data.  In cases where a substantial 
 percentage of the buffer cache is dirty, the LRU writer has to scan a 
 significant portion of the pool looking for one of the rare clean buffers, 
 then write it out.  When a client goes to grab a free buffer afterward, it 
 has to scan the same section of the pool to find the now clean buffer, 
 which seems redundant.
 
 With the new patch, the LRU writer is fairly well bounded in that it 
 doesn't write out more than it thinks it will need; you shouldn't get into 
 a situation where many more pages are written than will be used in the 
 near future.  Given that mindset, shouldn't pages the LRU scan writes just 
 get moved onto the free list?

I've wondered the same thing myself.

If we're worried about freeing pages that we might want back, we could
change the code so that ReadBuffer would also look at the free list if
it couldn't find a page before going to the OS for it.

So if you make this change, will BgBufferSync start incrementing
StrategyControl->nextVictimBuffer and decrementing buf->usage_count like
StrategyGetBuffer does now?
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] utf8 COPY DELIMITER?

2007-04-18 Thread Andrew Dunstan

Jim C. Nasby wrote:

On Tue, Apr 17, 2007 at 02:28:18PM -0400, Tom Lane wrote:
  

I doubt that supporting a single multibyte character would be an
interesting extension --- if we wanted to do anything at all there, we'd
just generalize the delimiter to be an arbitrary string.  But it would
certainly slow down COPY by some amount, which is an area where you'll
get push-back for performance losses, so you'd need to make a convincing
use-case for it.



Couldn't we use a fast code path (what we have now) for the case when
the delimiter is a single byte? That would allow for multi-character
delimiters without penalizing those that don't use them.

As for use case, I worked on migrating some stuff out of a MySQL
database a while ago, and having arbitrary string delimiters would have
made life easier.
  


The first thing to note is that the COPY code is quite complex and 
fragile. Personally, I'd want a heck of a lot of convincing to see it 
changed, and your use case looks to me like it would be better handled 
by preprocessing using a perl script.


Also, if we accept string delimiters on input, we should also allow them 
on output.


cheers

andrew



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [RFC] PostgreSQL Access Control Extension (PGACE)

2007-04-18 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes:
 Give the team some credit, though; they've managed to come up with a 
 system that integrates OS-level ACLs for both SElinux and TxSol, are not 
 asking us to incorporate two different sets, and are coming to us with a 
 serious proposal that has a lot of work behind it.  Please don't blow 
 them off like they were undergrads submitting a semester project.  If 
 they need to come back after 8.3 beta so we can properly pay attention 
 to the proposal, then say so.

Well, personally I won't have any cycles to think hard about any post-8.3
work until after the beta is out.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Background LRU Writer/free list

2007-04-18 Thread Gregory Stark

Greg Smith [EMAIL PROTECTED] writes:

 I'm mostly done with my review of the "Automatic adjustment of
 bgwriter_lru_maxpages" patch.  In addition to issues already brought up with
 that code, there are some small things that need to be done to merge it with
 the recent pg_stat_bgwriter patch, and I have some concerns about its 
 unbounded
 scanning of the buffer pool; I'll write that up in more detail or just submit
 an improved patch as I get time this week.

I had a thought on this. Instead of sleeping for a constant amount of time and
then estimating the number of pages needed for that constant amount of time,
perhaps what bgwriter should be doing is sleeping for a variable amount of
time and estimating the length of time it needs to sleep to arrive at a
constant number of pages being needed.

The reason I think this may be better is that the percentage of the shared
buffers the bgwriter allows to get old between wakeups seems more likely to
be a universal constant that people won't have to adjust than a fixed time
interval between bgwriter cleanup operations.

Just a thought.
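
A sketch of that control loop (Python; the bounds and the target are
invented numbers):

    def next_sleep_ms(prev_sleep_ms, pages_used, target_pages,
                      min_ms=10, max_ms=1000):
        # Hold the work per wakeup roughly constant: if more pages than
        # the target were consumed during the last sleep, shorten the next
        # one proportionally; if fewer, lengthen it.
        if pages_used <= 0:
            return max_ms
        estimate = prev_sleep_ms * target_pages / pages_used
        return max(min_ms, min(max_ms, estimate))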

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Backend Crash

2007-04-18 Thread Gregory Stark
Harvell F [EMAIL PROTECTED] writes:

   Just as a follow-up, it turns out that our Fibre Channel RAID was
 power-cycled while the systems were up and running.  There are several
 write errors in the postgresql log.

   Now I'm off to try to recover the data...

That's still a problem; it indicates either a bug in Postgres or -- sadly more
likely -- a problem with your hardware or system software setup. In a working
system Postgres guarantees that a situation like that will result in
transactions failing to commit (either with errors or freezing), not corrupted
data. Data once committed should never be lost.

In order for this to happen something in your software and hardware setup must
be caching writes then hiding the errors from Postgres. For instance systems
where fsync lies and reports success before it has written the data to disk
can result in silently corrupted data on any power outage or system crash. 

Could you send the write errors? Or at least the first page or so of them?
And check the system logs at that time for any lower-level errors as well.

What kind of drives are in the Fibre Channel RAID? Are they SCSI, PATA, or
SATA? Can you check their configuration at all or does the RAID hide all that
from you? Does the RAID have a battery backed cache?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Background LRU Writer/free list

2007-04-18 Thread Tom Lane
Greg Smith [EMAIL PROTECTED] writes:
 With the new patch, the LRU writer is fairly well bounded in that it 
 doesn't write out more than it thinks it will need; you shouldn't get into 
 a situation where many more pages are written than will be used in the 
 near future.  Given that mindset, shouldn't pages the LRU scan writes just 
 get moved onto the free list?

This just seems like a really bad idea: throwing away data we might
want.  Furthermore, if the page was dirty, then it's probably been
accessed more recently than adjacent pages that are clean, so
preferentially zapping just-written pages seems backwards.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Backend Crash

2007-04-18 Thread Tom Lane
Harvell F [EMAIL PROTECTED] writes:
 I've got a database corruption/backend crash problem with my 8.1.3  
 database on Mac OS X Server 10.4.  I'm beginning the process of  
 trying to recover it.  If anyone is interested in trying to fully  
 understand the what, where, and why of the crash, please contact me.   
 I've provided the basic information on the crash below.

These traces are for psql, not the backend.  It looks like you're having
an issue with Apple's libedit failing at psql exit, which is something
I seem to remember we fixed, so you might want to consider an update from
8.1.3.  However, that's got nothing to do with the server-side problem.
Since pg_dump's backend is saying that some *other* process crashed,
the first thing to do is identify what it is that's crashing.  Have you
looked in the postmaster log?

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Background LRU Writer/free list

2007-04-18 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 I had a thought on this. Instead of sleeping for a constant amount of time and
 then estimating the number of pages needed for that constant amount of time,
 perhaps what bgwriter should be doing is sleeping for a variable amount of
 time and estimating the length of time it needs to sleep to arrive at a
 constant number of pages being needed.

That's an interesting idea, but a possible problem with it is that we
can't vary the granularity of a sleep time as finely as we can vary the
number of buffers processed per iteration.  Assuming that the system's
tick rate is the typical 100Hz, we have only 10ms resolution on sleep
times.

 The reason I think this may be better is that the percentage of the shared
 buffers the bgwriter allows to get old between wakeups seems more likely to
 be a universal constant that people won't have to adjust than a fixed time
 interval between bgwriter cleanup operations.

Why?  What you're really trying to determine, I think, is the I/O load
imposed by the bgwriter, and pages-per-second seems a pretty natural
way to think about that; percentage of shared buffers not so much.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Can't ri_KeysEqual() consider two nulls as equal?

2007-04-18 Thread David Fetter
On Wed, Apr 18, 2007 at 07:51:48AM -0700, Stephan Szabo wrote:
 On Tue, 17 Apr 2007, Tom Lane wrote:
 
  A recent discussion led me to the idea that FK triggers are fired
  unnecessarily during an UPDATE if the foreign-key column(s)
  contain any NULLs, because ri_KeysEqual() treats two nulls as
  unequal, and therefore we conclude the row has changed when it has
  not.  I claim that both ri_KeysEqual() and ri_OneKeyEqual() could
  consider two nulls to be equal.  Furthermore it seems like
  ri_AllKeysUnequal() should do so too; the case can't arise at the
  moment because the sole caller already knows that one of the key
  sets contains no nulls, but if this weren't so, the optimization
  would be actively wrong if we concluded that two nulls were
  unequal.
 
 Do you have any suggestions for alternate names?  Keeping Equal in the
 names seems dangerous, since people would likely expect the functions to
 act like normal equality (with nulls being different).

How about NotDistinct, as in SQL's IS NOT DISTINCT FROM?

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Can't ri_KeysEqual() consider two nulls as equal?

2007-04-18 Thread Tom Lane
Stephan Szabo [EMAIL PROTECTED] writes:
 On Tue, 17 Apr 2007, Tom Lane wrote:
 I claim that both ri_KeysEqual() and ri_OneKeyEqual() could consider
 two nulls to be equal.

 Do you have any suggestions for alternate names? Keeping Equal in the names
 seems dangerous, since people would likely expect the functions to act like
 normal equality (with nulls being different).

I think Richard's suggestion of KeysUnchanged would work fine.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Background LRU Writer/free list

2007-04-18 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Why?  What you're really trying to determine, I think, is the I/O load
 imposed by the bgwriter, and pages-per-second seems a pretty natural
 way to think about that; percentage of shared buffers not so much.

What I'm saying is that pages/s will vary from system to system. Busier
systems will have higher i/o rates, so a DBA on a system with a higher rate
will want to adjust the bgwriter sleep time lower than a DBA on a system
where the bgwriter isn't doing much work.

In particular I'm worried about what happens on a very busy cpu-bound system
where adjusting the sleep times would result in it deciding to not sleep at
all. On such a system sleeping for even 10ms might be too long. But we
probably don't want to make the default even as low as 10ms.

Anyways, if we have a working patch that works the other way around we could
experiment with that and see if there are actual situations where sleeping for
0ms is necessary. Perhaps a mixture of the two approaches will be necessary
anyways because of the granularity issue.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Hacking on PostgreSQL via GIT

2007-04-18 Thread Martin Langhoff
Tom Lane wrote:
 Um ... why do either of you feel there's an issue there?
 
 We switched over to $PostgreSQL$ a few years ago specifically to avoid
 creating merge problems for downstream repositories.  If there are any
 other keyword expansions left in the source text I'd vote to remove
 them.  If you have a problem with $PostgreSQL$, why?

I have to accept the blame for not researching the repo in the
first place. I didn't know about $PostgreSQL$ - from the looks of it, it
acts _just_ like $Id$. So I guess you use $PostgreSQL$ instead of $Id$.

As GIT won't touch them, Florian will probably be just fine with his
patches, and I doubt they'll be more than a minor annoyance, if at all.

Keyword expansions are generally bad because SCM tools should track
_content_ - and keyword expansions _modify_ it to add metadata that is
somewhat redundant, obtainable in other ways, and should just not be in
the middle of the _data_. Those modifications lead to patches that have
bogus hunks and sometimes don't apply, MD5/SHA1 checksums that don't
match and a whole lot of uncertainty.

You can't just say the content is the same by comparing bytes or SHA1
digests if the committer, the path or the history are different. And it
is a mighty important ability for an SCM.
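
(The content addressing at stake is easy to state precisely; a small Python
sketch of git's blob hash:)

    import hashlib

    def git_blob_sha1(data: bytes) -> str:
        # git names a file by the SHA-1 of "blob <length>\0" plus the exact
        # bytes, so one expanded $PostgreSQL: ...$ line yields a different
        # object id than the pristine file.
        return hashlib.sha1(b"blob %d\0" % len(data) + data).hexdigest()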

The argument runs much longer than that - and the flamewars are quite
entertaining. If anyone's keen we're having one right now on
[EMAIL PROTECTED] . I am sure Pg hackers will find parallels between
keyword expansion (as a misfeature everyone is used to) and the SQL
travesties that early MySQL is famous for.

I've picked my poison... ran away from MySQL to Pg, and from
CVS/SVN/Arch to GIT. Not looking back :-)

cheers


m
-- 
---
Martin @ Catalyst .Net .NZ  Ltd, PO Box 11-053, Manners St,  Wellington
WEB: http://catalyst.net.nz/   PHYS: Level 2, 150-154 Willis St
OFFICE: +64(4)916-7224  UK: 0845 868 5733 ext 7224  MOB: +64(21)364-017
  Make things as simple as possible, but no simpler - Einstein
---

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Hacking on PostgreSQL via GIT

2007-04-18 Thread Jim C. Nasby
On Wed, Apr 18, 2007 at 06:39:34PM +1200, Martin Langhoff wrote:
 Keyword expansions are generally bad because SCM tools should track
 _content_ - and keyword expansions _modify_ it to add metadata that is
 somewhat redundant, obtainable in other ways, and should just not be in
 the middle of the _data_. Those modifications lead to patches that have
 bogus hunks and sometimes don't apply, MD5/SHA1 checksums that don't
 match and a whole lot of uncertainty.

Then how do you tell what version a file is if it's outside of a
checkout?
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [RFC] PostgreSQL Access Control Extension (PGACE)

2007-04-18 Thread KaiGai Kohei
... which presumably wouldn't involve any added dependency on outside 
code.

For people who are already using SELinux or Trusted Solaris, making the
database dependent on that infrastructure might be seen as a plus, but
I'm not sure the rest of the world would be pleased.  


Yes, I was thinking that this should be a compile-time option with a lot 
of warnings in the Docs.


Yes, those facilities are not enabled without the '--enable-selinux'
compile-time option.  It's a bit unclear to me what "a lot of warnings in
the Docs" means.

Give the team some credit, though; they've managed to come up with a 
system that integrates OS-level ACLs for both SElinux and TxSol, are not 
asking us to incorporate two different sets, and are coming to us with a 
serious proposal that has a lot of work behind it.  Please don't blow 
them off like they were undergrads submitting a semester project.  If 
they need to come back after 8.3 beta so we can properly pay attention 
to the proposal, then say so.


I'm not in a hurry to merge those facilities regardless.
(8.3 is already feature-frozen, as announced earlier.)

As I mentioned at first, the purpose of this discussion is to obtain
feedback from the PostgreSQL community for our development.
I believe it will also help the SE- stuff to be merged into a later
version of PostgreSQL.


There are also

some interesting questions about SQL spec compliance and whether a
database that silently hides some rows from you will give semantically
consistent results.


Yeah -- that's a potentially serious issue; KaiGai, have you looked into 
it?


Yes, I believe the policy of filtering out any violating tuple behaves
consistently.  The policy enforces that any such tuple is filtered before
it is used, so computations are not affected by filtered tuples.

But proving innocence is generally a hard task.
First, I want to know which points worry you the most.

Thanks,
--
KaiGai Kohei [EMAIL PROTECTED]

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] [RFC] PostgreSQL Access Control Extension (PGACE)

2007-04-18 Thread Tom Lane
KaiGai Kohei [EMAIL PROTECTED] writes:
 There are also
 some interesting questions about SQL spec compliance and whether a
 database that silently hides some rows from you will give semantically
 consistent results.
 
 Yeah -- that's a potentially serious issue; KaiGai, have you looked into 
 it?

 Yes, I believe the policy of filtering out any violating tuple behaves
 consistently.  The policy enforces that any such tuple is filtered before
 it is used, so computations are not affected by filtered tuples.

 But proving innocence is generally a hard task.
 First, I want to know which points worry you the most.

Unique constraints and foreign-key constraints seem the most pressing
problems.  What will you do to avoid having different viewers have
different opinions about whether a constraint is violated?

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Hacking on PostgreSQL via GIT

2007-04-18 Thread Aidan Van Dyk
* Jim C. Nasby [EMAIL PROTECTED] [070418 14:39]:
 On Wed, Apr 18, 2007 at 06:39:34PM +1200, Martin Langhoff wrote:
  Keyword expansions are generally bad because SCM tools should track
  _content_ - and keyword expansions _modify_ it to add metadata that is
  somewhat redundant, obtainable in other ways, and should just not be in
  the middle of the _data_. Those modifications lead to patches that have
  bogus hunks and sometimes don't apply, MD5/SHA1 checksums that don't
  match and a whole lot of uncertainty.
 
 Then how do you tell what version a file is if it's outside of a
 checkout?

That's what all the fun is about ;-)  Some would say that labelling the
file is the job of the release processes.  Others say it's the job of
the SCM system...

Of course I just sit on the fence because in the work I have to do, I'm
quite happy that nothing is outside of a checkout.  GIT is good enough
that I have it everywhere.  I realise not everyone's that lucky..

;-)

a.

-- 
Aidan Van Dyk Create like a god,
[EMAIL PROTECTED]   command like a king,
http://www.highrise.ca/   work like a slave.




Re: [HACKERS] Hacking on PostgreSQL via GIT

2007-04-18 Thread Aidan Van Dyk
* Aidan Van Dyk [EMAIL PROTECTED] [070418 15:03]:

  Then how do you tell what version a file is if it's outside of a
  checkout?
 
 That's what all the fun is about ;-)  Some would say that labelling the
 file is the job of the release processes.  Others say it's the job of
 the SCM system...

Noting that taking something outside of a checkout means you've
released it from the VCS...

-- 
Aidan Van Dyk Create like a god,
[EMAIL PROTECTED]   command like a king,
http://www.highrise.ca/   work like a slave.




Re: [HACKERS] Hacking on PostgreSQL via GIT

2007-04-18 Thread Alvaro Herrera
Aidan Van Dyk wrote:
 * Aidan Van Dyk [EMAIL PROTECTED] [070418 15:03]:
 
   Then how do you tell what version a file is if it's outside of a
   checkout?
  
  That's what all the fun is about ;-)  Some would say that labelling the
  file is the job of the release processes.  Others say it's the job of
  the SCM system...
 
 Noting that if you take something outside of a checkout means you've
 released it from the VCS...

Which is not always what happens in reality.  Consider for example that
we borrowed some files from NetBSD, OpenBSD, Tcl, zic and others.  It
would be nice to know exactly at what point we borrowed the file, so we
can go to the upstream repo and check if there's any bug fix that we
should also apply to our local copy.  And we _also_ modify the file
locally, of course, so just digesting the file we have to get a SHA1 (or
whatever) identifier is not an option.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Remaining VACUUM patches

2007-04-18 Thread Alvaro Herrera
There are two additional patches in the VACUUM code.  One is Heikki's
patch to recalculate OldestXmin in the vacuum run.

http://groups.google.es/group/pgsql.patches/browse_thread/thread/b2cfc901534d8990/40ba5b2fbb8f5b91
(much nicer than our archives because the whole thread is there, not
just month-sized pieces).

That thread ended without any conclusion; it was said that the patch would
be reconsidered when Simon Riggs' patch for the WAL flushing bug
lands, but I don't know which patch that is.  Is it in the patch queue?
Was it already applied?

The problem with the patch is that the DBT-2 test showed decreased
performance, but that was still under investigation.

What is the status of this?


The other patch was ITAGAKI Takahiro's patch to fix n_dead_tuples in
pgstats after VACUUM when there is concurrent update activity.  This
patch is still on hold largely because the above patch would cause it to
be a bit obsolete.  So I think if we're not going to apply the former,
we should apply this one.

http://archives.postgresql.org/pgsql-hackers/2007-02/msg00051.php

http://archives.postgresql.org/pgsql-patches/2007-02/msg00021.php

Comments?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Remaining VACUUM patches

2007-04-18 Thread Heikki Linnakangas

Alvaro Herrera wrote:

There are two additional patches in the VACUUM code.  One is Heikki's
patch to recalculate OldestXmin in the vacuum run.

http://groups.google.es/group/pgsql.patches/browse_thread/thread/b2cfc901534d8990/40ba5b2fbb8f5b91
(much nicer than our archives because the whole thread is there, not
just month-sized pieces).

That thread ended without any conclusion; it was said that the patch would
be reconsidered when Simon Riggs' patch for the WAL flushing bug
lands, but I don't know which patch that is.  Is it in the patch queue?
Was it already applied?


It's in the patch queue, not applied. It's the one with the title "Bug: 
Buffer cache is not scan resistant":

http://momjian.us/mhonarc/patches/msg00048.html


The problem with the patch is that the DBT-2 test showed decreased
performance, but that was still under investigation.

What is the status of this?


The plan is that I'll rerun the DBT-2 test after the above patch is 
applied. After that we'll decide if we want the OldestXmin patch or not.



The other patch was ITAGAKI Takahiro's patch to fix n_dead_tuples in
pgstats after VACUUM when there is concurrent update activity.  This
patch is still on hold largely because the above patch would cause it to
be a bit obsolete.  So I think if we're not going to apply the former,
we should apply this one.


I'd like to have the "buffer cache is not scan resistant" patch reviewed 
first to get the ball rolling on these other patches. The vacuum-related 
patches are just small tweaks, and they don't conflict with any of the 
bigger patches in the queue, so there's no reason to rush them.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] Remaining VACUUM patches

2007-04-18 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 The other patch was ITAGAKI Takahiro's patch to fix n_dead_tuples in
 pgstats after VACUUM when there is concurrent update activity.  This
 patch is still on hold largely because the above patch would cause it to
 be a bit obsolete.

I objected (and still object) to this patch because it allows
n_dead_tuples to drift arbitrarily far away from reality --- a series of
vacuums will incrementally update it using probably-inaccurate deltas,
and there's nothing to ensure that the result converges rather than
diverging.  In the real world it will result in n_dead_tuples becoming
less accurate, not more so.
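
As a toy illustration (Python; all the numbers are invented), suppose each
cycle's estimated delta is consistently 5% short:

    true_dead = 0
    counter = 0                     # what pgstats would report
    for cycle in range(100):
        created = 1000              # dead tuples created since last VACUUM
        true_dead = 0               # VACUUM actually removes all of them
        removed_estimate = int(created * 0.95)   # biased delta
        counter = max(0, counter + created - removed_estimate)

    print(counter - true_dead)      # 5000 after 100 cycles, still growing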

There was some discussion about better ways to do it, IIRC, but no new
patch has been submitted.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] modifying the table function

2007-04-18 Thread Islam Hegazy
Thanks for the documentation link. It helped me to understand how data are 
passed back to the client.


I figured out that data is sent back to the client using the 'printtup' 
function.  It is called by ExecSelect, which is called by ExecutorRun, etc.
What I understand now is that the data is sent to the client and stored 
there until the client receives a message from the server telling it to 
display the data.  The server sends the display message from within 
'ReadyForQuery', but I can't figure out exactly how.


What I expect is that when I call 'exec_simple_query' several times, the 
server sends only one row for each call.  But since they are separate calls, 
the client overwrites the previous results, or saves each new row in a new 
place in memory, such that it displays only the last row when the server 
invokes 'ReadyForQuery'.


I wonder whether I am on the right track, and how I can identify this kind 
of message sent from the server.


Regards
Islam Hegazy

- Original Message - 
From: Tom Lane [EMAIL PROTECTED]

To: Islam Hegazy [EMAIL PROTECTED]
Cc: pgsql-hackers@postgresql.org
Sent: Tuesday, April 17, 2007 1:44 AM
Subject: Re: [HACKERS] modifying the table function



Islam Hegazy [EMAIL PROTECTED] writes:
My question is how to inform the client that there is a tuple to display,
and then return to the backend to continue the query execution?


I'd suggest you start by reading
http://developer.postgresql.org/pgdocs/postgres/protocol.html
and then develop a clear specification at that level of what you
think should happen.  Perhaps after that exercise it will be clearer
how to change the code.  Think first, program later.

regards, tom lane 



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[HACKERS] Report on PANIC: unexpected hash relation size problem

2007-04-18 Thread Tom Lane
I finally got a chance to look at a reproducible case of this, thanks
to Vlastimil Krejcir.  It's now blindingly obvious that my patch of
2006-11-19, which was intended to fix bug #2737, has broken things.

The problem is partly a misordering of operations in
_hash_expandtable(): it calls _hash_alloc_buckets() before it's actually
committed to doing the table expansion.  If it aborts the bucket split
because it can't get the bucket lock, the relation EOF has already been
pushed out, and thus the next time we come through and actually do a
bucket split, the sanity check that compares actual to expected size
blows up.  As well it should.

This part is not hard to fix: we can just re-order the operations.
However there is a remaining risk, which is that _hash_alloc_buckets()
could fail internally after partially advancing the EOF. If it's trying
to expand the index across a 1Gb segment boundary, and fails (eg, due to
out-of-disk-space) after doing at least one successful smgrextend, then
the index is simply broken, because the filesystem EOF won't match what
we expect, causing subsequent bucket splits to panic here.  Now a
failure right at that point is pretty unlikely, but it could happen,
and having the index broken to the point of PANIC certainly isn't a very
acceptable consequence.

I can't see any way that we could positively guarantee there is no such
failure, since we are changing filesystem state that's not roll-back-able.
I fear what we have to do here is abandon the idea that we can ensure
that the smgr/filesystem EOF for the index is exactly the last used
page; rather, the invariant will need to be that the EOF is greater than
or equal to the last used page.  This is a bit annoying because it blows
away the concurrency improvement I had hoped to make in
_hash_getovflpage, as per this comment:

 * We have to fetch the page with P_NEW to ensure smgr's idea of the
 * relation length stays in sync with ours.  XXX It's annoying to do this
 * with metapage write lock held; would be better to use a lock that
 * doesn't block incoming searches.  Best way to fix it would be to stop
 * maintaining hashm_spares[hashm_ovflpoint] and rely entirely on the
 * smgr relation length to track where new overflow pages come from;
 * then we could release the metapage before we do the smgrextend.
 * FIXME later (not in beta...)

This approach will condemn us to relying on
hashm_spares[hashm_ovflpoint] forever, because it's really what tracks
the last used page.  I don't see any very good alternative, though.

I thought for a bit about still relying on the EOF as the indication of
how much space is used, but that would mean that a failure partway
through _hash_alloc_buckets() would result in permanent wastage of
however much space it had managed to allocate before failing.  Another
problem is that I'm not sure this wouldn't break the overflow page
addressing scheme.  (The overflow page addressing scheme is unreasonably
complicated in any case ... I've occasionally thought about getting rid
of the bitmaps in favor of a simple linked list of unused pages ... but
having to change it makes any such fix even more invasive.)

Anyone have a better idea?

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] modifying the table function

2007-04-18 Thread Tom Lane
Islam Hegazy [EMAIL PROTECTED] writes:
 I wonder whether I am on the right track, and how the client can 
 recognize this kind of message from the server.

Seems like you're doing it the hard way.  Wouldn't it be easier to fix
the client to display data before it's received the whole query result?
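
As a minimal sketch of that direction (my_func() is a placeholder table
function, connection settings come from the environment; one possible
approach, not the only one), a libpq client can fetch through a cursor
and display each batch as it arrives instead of buffering the whole
result:

    #include <stdio.h>
    #include <libpq-fe.h>

    int main(void)
    {
        PGconn   *conn = PQconnectdb("");   /* settings from environment */
        PGresult *res;
        int       i, ntup;

        if (PQstatus(conn) != CONNECTION_OK)
            return 1;

        PQclear(PQexec(conn, "BEGIN"));
        PQclear(PQexec(conn, "DECLARE c CURSOR FOR SELECT * FROM my_func()"));

        for (;;)
        {
            res = PQexec(conn, "FETCH 10 FROM c");
            if (PQresultStatus(res) != PGRES_TUPLES_OK)
            {
                PQclear(res);
                break;              /* error; stop fetching */
            }
            ntup = PQntuples(res);
            for (i = 0; i < ntup; i++)
                printf("%s\n", PQgetvalue(res, i, 0));  /* display now */
            PQclear(res);
            if (ntup == 0)
                break;              /* cursor exhausted */
        }

        PQclear(PQexec(conn, "CLOSE c"));
        PQclear(PQexec(conn, "COMMIT"));
        PQfinish(conn);
        return 0;
    }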

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Hacking on PostgreSQL via GIT

2007-04-18 Thread Jim C. Nasby
On Thu, Apr 19, 2007 at 10:07:08AM +1200, Martin Langhoff wrote:
 Jim C. Nasby wrote:
  Then how do you tell what version a file is if it's outside of a
  checkout?
 
 It's trivial for git to answer that - the file will either be pristine,
 and then we can just scan for the matching SHA1, or modified, and we can
 scan (taking a wee bit more time) to find the closest matches in
 your history, in what branches and commits.
 
 The actual scripting for this isn't written just yet -- Linus posted a
 proof-of-concept shell implementation along the lines of
 
  git rev-list --no-merges --full-history v0.5..v0.7 -- \
      src/widget/widget.c > rev-list

  best_commit=none
  best=100
  while read commit
  do
      git cat-file blob "$commit:src/widget/widget.c" > tmpfile
      lines=$(diff reference-file tmpfile | wc -l)
      if [ $lines -lt $best ]
      then
          echo "Best so far: $commit $lines"
          best=$lines
          best_commit=$commit
      fi
  done < rev-list
 
 and it's fast. One of the good properties of this is that you can ask
 for a range of your history (v0.5 to v0.7 in the example) and an exact
 path (src/widget/widget.c) but you can also say --all (meaning in all
 branches) and a handwavy over there, like src. And git will take an
 extra second or two on a large repo, but tell you about all the good
 candidates across the branches.
 
 Metadata is metadata, and we can fish it out of the SCM easily - and
 data is data, and it's silly to pollute it with metadata that is mostly
 incidental.
 
 If I find time today I'll post to the git list a cleaned up version of
 Linus' shell script as
 
 git-findclosestmatch <head or range or --all> <path/to/scan/> \
 randomfile.c

Not bad... took you 40 lines to answer my question. Let's see if I can
beat that...

  Then how do you tell what version a file is if it's outside of a
  checkout?

Answer: you look at the $Id$ (or in this case, $PostgreSQL$) tag.

Sorry, tried to get it to 2 lines, but couldn't. ;)
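
For anyone unfamiliar with the convention, an expanded tag looks roughly
like this -- the path and revision here are made up:

    /* hypothetical expansion:
     * $PostgreSQL: pgsql/src/backend/utils/example.c,v 1.42 2007/04/18 12:00:00 someone Exp $
     */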

I understand the argument about metadata and all, and largely agree with
it. But on the other hand I think a version identifier is a critical
piece of information; it's just as critical as the file name when it
comes to identifying the information contained in the file.

Or does GIT not use filenames, either? :)
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Background LRU Writer/free list

2007-04-18 Thread Greg Smith

On Wed, 18 Apr 2007, Tom Lane wrote:

Furthermore, if the page was dirty, then it's probably been accessed 
more recently than adjacent pages that are clean, so preferentially 
zapping just-written pages seems backwards.


The LRU background writer only writes out pages that have a usage_count of 
0, so they can't have been accessed too recently.  Assuming the buffer 
allocation rate continues its historical trend, these are the pages that 
are going to be written out and then allocated for something new one way 
or another in the next interval; the content is expected to be lost 
shortly no matter what.


As for preferring dirty pages over clean ones, on a re-read my question 
wasn't as clear as I wanted it to be.  I think that clean pages near the 
strategy point should also be moved to the free list by the background 
writer.  You know clients are expected to require x buffers in the next y 
ms based on the history of the server (the new piece of information 
provided by the patch in the queue), and the LRU background writer is 
working in advance to make them available.  If you're doing all that, 
doesn't it make sense to finish the job by putting the pages on the free 
list, where the clients can grab them without running their own scan over 
the buffer cache?
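
As a toy model of that proposal (my own stand-in structures, not the
actual bufmgr code), one round of such a background writer might look
like:

    #include <stdbool.h>
    #include <stdio.h>

    typedef struct Buffer
    {
        int  usage_count;
        bool dirty;
        struct Buffer *free_next;    /* singly linked free list */
    } Buffer;

    #define NBUFFERS 16
    static Buffer  pool[NBUFFERS];
    static Buffer *free_head = NULL;
    static int     next_victim = 0;  /* clock-sweep position */

    static void write_page(Buffer *b) { b->dirty = false; /* pretend I/O */ }

    /* Move up to 'goal' zero-usage buffers ahead of the strategy point
     * onto the free list: clean ones directly, dirty ones after writing
     * them, so backends can allocate without their own clock sweep. */
    static void bgwriter_lru_round(int goal)
    {
        int moved = 0, scanned = 0;

        while (moved < goal && scanned++ < NBUFFERS)
        {
            Buffer *b = &pool[next_victim];

            next_victim = (next_victim + 1) % NBUFFERS;
            if (b->usage_count > 0)
            {
                b->usage_count--;    /* age it, as the sweep does now */
                continue;
            }
            if (b->dirty)
                write_page(b);       /* flush before it can be reused */
            b->free_next = free_head;
            free_head = b;
            moved++;
        }
    }

    int main(void)
    {
        pool[0].dirty = true;        /* one dirty page, rest clean */
        bgwriter_lru_round(4);
        printf("freed %s\n", free_head ? "some buffers" : "none");
        return 0;
    }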


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Background LRU Writer/free list

2007-04-18 Thread Greg Smith

On Wed, 18 Apr 2007, Jim C. Nasby wrote:


So if you make this change will BgBufferSync start incrementing
StrategyControl-nextVictimBuffer and decrementing buf-usage_count like
StrategyGetBuffer does now?


Something will need to keep advancing the nextVictimBuffer; I hadn't 
really finished the implementation yet.  I just wanted to get an idea if this 
was even feasible, or if there was some larger issue that made the whole 
idea moot.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Background LRU Writer/free list

2007-04-18 Thread Greg Smith

On Wed, 18 Apr 2007, Gregory Stark wrote:

In particular I'm worried about what happens on a very busy cpu-bound 
system where adjusting the sleep times would result in it deciding to 
not sleep at all. On such a system sleeping for even 10ms might be too 
long... Anyways, if we have a working patch that works the other way 
around we could experiment with that and see if there are actual 
situations where sleeping for 0ms is necessary.


I've been waiting for 8.3 to settle down before packaging the prototype 
auto-tuning background writer concept I'm working on (you can peek at the 
code at http://www.westnet.com/~gsmith/content/postgresql/bufmgr.c ), 
which already implements some of the ideas you're talking about in your 
messages today.  I estimate how much of the buffer pool is dirty, use that 
to compute an expected I/O rate, and try to adjust parameters to meet a 
quality of service guarantee for how often the entire buffer pool is 
scanned.  This is one of those problems that gets more difficult the more 
you dig into it; with all that done I still feel like I'm only halfway 
finished and several parts worked radically differently in reality than I 
expected them to.


If you're allowing the background writer to write 1000 pages at a clip, 
that's 8MB each interval.  Doing that every 200ms makes for an I/O rate of 
40MB/s.  In a system that cares about data integrity, you'll exceed the 
ability of the WAL to sustain page writes (which limits how fast you can 
dirty pages) long before the interval approaches 0ms.  What I do in my 
code is set the interval to 200ms, compute what the maximum pages to write 
must be, and if it's >1000 then I reduce the interval.  I've tested 
dumping into a fairly fast disk array with tons of cache and I've never 
been able to get useful throughput below an 80ms interval; the OS just 
clamps down and makes you wait for I/O instead regardless of how little 
you intended to sleep.  Eventually, it's got to hit disk, and you can only 
buffer for so long before that starts to slow you down.
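
For what it's worth, here is the arithmetic above as a tiny sketch (8K
pages assumed; the demand number is my own invention):

    #include <stdio.h>

    #define PAGE_BYTES 8192        /* BLCKSZ */
    #define MAX_PAGES  1000        /* per-round write budget */

    int main(void)
    {
        double interval_ms = 200.0;
        double bytes_per_sec = MAX_PAGES * (double) PAGE_BYTES
                               / (interval_ms / 1000.0);

        printf("%.0f MB/s\n", bytes_per_sec / 1e6);   /* ~41 MB/s */

        /* If demand exceeds the page budget, shorten the interval
         * rather than raising the budget (hypothetical demand): */
        double pages_needed = 2500.0;
        if (pages_needed > MAX_PAGES)
            interval_ms *= MAX_PAGES / pages_needed;  /* 200ms -> 80ms */
        printf("new interval: %.0f ms\n", interval_ms);
        return 0;
    }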


Anyway, this is a tangent discussion.  The LRU patch that's in the queue 
doesn't really care if it runs with a short interval or a long one, 
because it automatically scales how much work it does according to how 
much time passed.  I think that may only be a bit of tweaking away from a 
solid solution.  Tuning the "all" scan, which is what you're talking about 
when you speak in terms of the statistics about the overall buffer pool, 
is a much harder job.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match