Re: [HACKERS] pg_archive_bypass

2010-07-05 Thread Dimitri Fontaine
Bruce Momjian <br...@momjian.us> writes:
 Turns out 'REM' acts like /bin/true on Windows.  I have documented that
 fact in the attached, applied patch.

I guess that kills it for the pg_archive_bypass internal command, but
in a good way. Thanks!

Regards,
-- 
Dimitri Fontaine
PostgreSQL DBA, Architecte

Damn, I'm still baffled that we can escape the internal commands
altogether.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pessimal trivial-update performance

2010-07-05 Thread Jesper Krogh

On 2010-07-04 06:11, Tom Lane wrote:

Robert Haas <robertmh...@gmail.com> writes:
   

CREATE OR REPLACE FUNCTION update_tab() RETURNS void AS $$
BEGIN
INSERT INTO tab VALUES (0);
FOR i IN 1..10 LOOP
UPDATE tab SET x = x + 1;
END LOOP;
END
$$ LANGUAGE plpgsql;
 

I believe that none of the dead row versions can be vacuumed during this
test.  So yes, it sucks, but is it representative of real-world cases?

   

The problem can generally be described as tuples seeing multiple
updates in the same transaction?

I think that every time PostgreSQL is used with an ORM, there is
a certain amount of multiple updates taking place. I have actually
been reworking the client side to get around multiple updates, since they
popped up in one of my profiling runs. Although the time I optimized
away ended up being both round-trip time and update time, having
the database do half of it transparently might have been enough
to leave me with a bigger problem elsewhere..

To sum up: yes, I think it is indeed a real-world case.

Jesper

--
Jesper

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] log files and permissions

2010-07-05 Thread Martin Pihlak
Martin Pihlak wrote:
 Attached is a patch that adds a GUC log_file_mode which allows specifying
 the creation mode for the log files. Presently it lacks documentation, which
 I'll add if the idea is generally acceptable.
 

Updated patch attached.

regards,
Martin

*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***
*** 2789,2794  local0.*/var/log/postgresql
--- 2789,2813 
    </listitem>
   </varlistentry>
 
+  <varlistentry id="guc-log-file-mode" xreflabel="log_file_mode">
+   <term><varname>log_file_mode</varname> (<type>integer</type>)</term>
+   <indexterm>
+    <primary><varname>log_file_mode</> configuration parameter</primary>
+   </indexterm>
+   <listitem>
+    <para>
+     When <varname>logging_collector</varname> is enabled,
+     this parameter sets the permissions of the created log files.
+     The value is an octal number consisting of 3 digits signifying
+     the permissions for the user, group and others.
+    </para>
+    <para>
+     This parameter can only be set in the <filename>postgresql.conf</>
+     file or on the server command line.
+    </para>
+   </listitem>
+  </varlistentry>
+ 
   <varlistentry id="guc-log-rotation-age" xreflabel="log_rotation_age">
    <term><varname>log_rotation_age</varname> (<type>integer</type>)</term>
    <indexterm>
*** a/src/backend/commands/variable.c
--- b/src/backend/commands/variable.c
***
*** 914,916  show_role(void)
--- 914,946 
  
  	return endptr + 1;
  }
+ 
+ 
+ /*
+  * LOG_FILE_MODE
+  */
+ 
+ extern int Log_file_mode;		/* in guc.c */
+ 
+ /*
+  * assign_log_file_mode: GUC assign_hook for log_file_mode
+  */
+ const char *
+ assign_log_file_mode(const char *value, bool doit, GucSource source)
+ {
+ 	char *endptr;
+ 	long file_mode = strtol(value, &endptr, 8);
+ 
+ 	if (!*value || *endptr || file_mode < 0 || file_mode > 0777)
+ 	{
+ 		ereport(GUC_complaint_elevel(source),
+ 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 				 errmsg("invalid value for parameter \"log_file_mode\"")));
+ 		return NULL;
+ 	}
+ 
+ 	if (doit)
+ 		Log_file_mode = (int) file_mode;
+ 
+ 	return value;
+ }
*** a/src/backend/postmaster/syslogger.c
--- b/src/backend/postmaster/syslogger.c
***
*** 73,78  int			Log_RotationSize = 10 * 1024;
--- 73,79 
  char	   *Log_directory = NULL;
  char	   *Log_filename = NULL;
  bool		Log_truncate_on_rotation = false;
+ int			Log_file_mode = 0600;
  
  /*
   * Globally visible state (used by elog.c)
***
*** 135,140  static void syslogger_parseArgs(int argc, char *argv[]);
--- 136,142 
  static void process_pipe_input(char *logbuffer, int *bytes_in_logbuffer);
  static void flush_pipe_input(char *logbuffer, int *bytes_in_logbuffer);
  static void open_csvlogfile(void);
+ static FILE *logfile_open(const char *filename, const char *mode, bool am_rotating);
  
  #ifdef WIN32
  static unsigned int __stdcall pipeThread(void *arg);
***
*** 516,530  SysLogger_Start(void)
  	 */
  	filename = logfile_getname(time(NULL), NULL);
  
! 	syslogFile = fopen(filename, "a");
! 
! 	if (!syslogFile)
! 		ereport(FATAL,
! 				(errcode_for_file_access(),
! 				 (errmsg("could not create log file \"%s\": %m",
! 						 filename))));
! 
! 	setvbuf(syslogFile, NULL, LBF_MODE, 0);
  
  	pfree(filename);
  
--- 518,524 
  	 */
  	filename = logfile_getname(time(NULL), NULL);
  
! 	syslogFile = logfile_open(filename, "a", false);
  
  	pfree(filename);
  
***
*** 1004,1018  open_csvlogfile(void)
  
  	filename = logfile_getname(time(NULL), ".csv");
  
! 	fh = fopen(filename, "a");
! 
! 	if (!fh)
! 		ereport(FATAL,
! 				(errcode_for_file_access(),
! 				 (errmsg("could not create log file \"%s\": %m",
! 						 filename))));
! 
! 	setvbuf(fh, NULL, LBF_MODE, 0);
  
  #ifdef WIN32
  	_setmode(_fileno(fh), _O_TEXT);		/* use CRLF line endings on Windows */
--- 998,1004 
  
  	filename = logfile_getname(time(NULL), ".csv");
  
! 	fh = logfile_open(filename, "a", false);
  
  #ifdef WIN32
  	_setmode(_fileno(fh), _O_TEXT);		/* use CRLF line endings on Windows */
***
*** 1025,1030  open_csvlogfile(void)
--- 1011,1040 
  }
  
  /*
+  * Open the logfile, set permissions and buffering options.
+  */
+ static FILE *
+ logfile_open(const char *filename, const char *mode, bool am_rotating)
+ {
+ 	FILE   *fh;
+ 
+ 	fh = fopen(filename, mode);
+ 
+ 	if (fh)
+ 	{
+ 		setvbuf(fh, NULL, LBF_MODE, 0);
+ 		fchmod(fileno(fh), Log_file_mode);
+ 	}
+ 	else
+ 		ereport(am_rotating ? LOG : FATAL,
+ 				(errcode_for_file_access(),
+ 				 (errmsg("could not create%slog file \"%s\": %m",
+ 						 am_rotating ? " new " : " ", filename))));
+ 
+ 	return fh;
+ }
+ 
+ /*
   * perform logfile rotation
   */
  static void
***
*** 1070,1088  logfile_rotate(bool time_based_rotation, int size_rotation_for)
  		if (Log_truncate_on_rotation && time_based_rotation &&
  			last_file_name != NULL &&
  			strcmp(filename, last_file_name) != 0)
! 			fh = 

Re: [HACKERS] pessimal trivial-update performance

2010-07-05 Thread Pierre C



The problem can generally be described as tuples seeing multiple
updates in the same transaction?

I think that every time PostgreSQL is used with an ORM, there is
a certain amount of multiple updates taking place. I have actually
been reworking the client side to get around multiple updates, since they
popped up in one of my profiling runs. Although the time I optimized
away ended up being both round-trip time and update time, having
the database do half of it transparently might have been enough
to leave me with a bigger problem elsewhere..

To sum up: yes, I think it is indeed a real-world case.

Jesper


On the Python side, elixir and sqlalchemy have an excellent way of  
handling this, basically when you start a transaction, all changes are  
accumulated in a session object and only flushed to the database on  
session commit (which is also generally the transaction commit). This has  
multiple advantages, for instance it is able to issue multiple-line  
statements, updates are only done once, you save a lot of roundtrips, etc.  
Of course it is most of the time not compatible with database triggers, so  
if there are triggers the ORM needs to be told about them.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pessimal trivial-update performance

2010-07-05 Thread Jesper Krogh

On 2010-07-05 12:11, Pierre C wrote:


 The problem can generally be described as tuples seeing multiple
 updates in the same transaction?

 I think that every time PostgreSQL is used with an ORM, there is a
 certain amount of multiple updates taking place. I have actually
 been reworking the client side to get around multiple updates, since
 they popped up in one of my profiling runs. Although the time I
 optimized away ended up being both round-trip time and update time,
 having the database do half of it transparently might have
 been enough to leave me with a bigger problem elsewhere..

 To sum up: yes, I think it is indeed a real-world case.

 Jesper

 On the Python side, elixir and sqlalchemy have an excellent way of
 handling this, basically when you start a transaction, all changes
 are accumulated in a session object and only flushed to the
 database on session commit (which is also generally the transaction
 commit). This has multiple advantages, for instance it is able to
 issue multiple-line statements, updates are only done once, you save
 a lot of roundtrips, etc. Of course it is most of the time not
 compatible with database triggers, so if there are triggers the ORM
 needs to be told about them.


How about unique constraints, foreign key violations and checks? Would
you also postpone those errors to commit time? And transactions with lots 
of data?


It doesn't really seem like a net benefit to me, but I can see applications
where it will easily fit.

Jesper


Re: [HACKERS] pessimal trivial-update performance

2010-07-05 Thread Andres Freund
On Monday 05 July 2010 12:11:38 Pierre C wrote:
  The problem can generally be described as tuples seeing multiple
  updates in the same transaction?
  
  I think that every time PostgreSQL is used with an ORM, there is
  a certain amount of multiple updates taking place. I have actually
  been reworking the client side to get around multiple updates, since they
  popped up in one of my profiling runs. Although the time I optimized
  away ended up being both round-trip time and update time, having
  the database do half of it transparently might have been enough
  to leave me with a bigger problem elsewhere..
  
  To sum up: yes, I think it is indeed a real-world case.
  
  Jesper
 
 On the Python side, elixir and sqlalchemy have an excellent way of
 handling this, basically when you start a transaction, all changes are
 accumulated in a session object and only flushed to the database on
 session commit (which is also generally the transaction commit). This has
 multiple advantages, for instance it is able to issue multiple-line
 statements, updates are only done once, you save a lot of roundtrips, etc.
 Of course it is most of the time not compatible with database triggers, so
 if there are triggers the ORM needs to be told about them.
It's also not concurrency-safe in many cases.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Table partitioning - is anything coming?

2010-07-05 Thread Greg Smith

Igor Kryltsov wrote:

I am not asking for any firm dates, but roughly when (and if) do you think
there will be any enhancements for automating partitioning in Postgres?
  


The earliest possible date for that is the summer of 2011 when 
PostgreSQL 9.1 might be released:  
http://wiki.postgresql.org/wiki/PostgreSQL_9.1_Development_Plan


An outline of the plans is at 
http://wiki.postgresql.org/wiki/Table_partitioning and the current patch 
submitted to help in this direction is due to be reviewed this month:  
https://commitfest.postgresql.org/action/patch_view?id=266


P.S. Attaching PNGs instead of writing text out will annoy some 
contributors here; it's not something you want to get into the habit 
of on any of the mailing lists.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Always truncate segments before unlink

2010-07-05 Thread Tom Lane
Takahiro Itagaki <itagaki.takah...@oss.ntt.co.jp> writes:
 In mdunlink(), we truncate the first main fork to zero length
 and actually unlink at the next checkpoint, but other segments
 are not truncated and only unlinked. Then, if another backend
 has the segments open, the disk space occupied by them is not reclaimed
 until all of the backends close their file descriptors. A longer
 checkpoint timeout and connection pooling make things worse.

Truncating seems like an ugly kluge that's not fixing the real problem.
Why are there open descriptors for a dropped relation?  They should all
get closed as a consequence of relcache flush.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] logistics for beta3

2010-07-05 Thread Robert Haas
On Mon, Jun 28, 2010 at 2:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Josh Berkus j...@agliodbs.com writes:
 Therefore, I propose that we set a beta3 release date for July 8th.
 That should give it enough space from the American Holiday.

 You mean wrap on Thursday the 8th for release on Monday the 12th?
 That'd be fine with me.  Actual release on the 8th would mean asking
 people to do release prep work when they should be out watching
 fireworks.

AIUI, this is the plan we decided on.  So:

- Someone (presumably Bruce) needs to run pgindent.  Any reason to
wait any longer on that?
- Someone will need to branch the tree after the wrap and stamp it
9.1devel.  Who is doing that?
- We have three (fairly minor) remaining open items that should
ideally be dealt with.

They are:

* bump catalog version for plpython3u change? Use RTLD_GLOBAL? -- I
don't immediately know what the bit about RTLD_GLOBAL is referring to,
and it's not obvious to me that anyone cares about the catversion
issue enough to argue about it.  So maybe there's nothing to do here
at all.  Anyhow, we should make a decision.
* normalize use of LDFLAGS - I believe Tom is dealing with this.  His
proposal seems sensible to me, although as demonstrated just this
week, I am not an expert on the idiosyncrasies of linking on different
platforms.  This doesn't strike me as a show-stopper for beta3.
* Fix log_temp_files units - I believe the consensus here is to revert
Simon's comment/doco changes and instead make the behavior match the
documentation in both 8.4 and HEAD.  I have been assuming Simon would
take care of this since it was his patch, but we're running out of
time.  This one is probably good to get fixed before we ship a beta
documenting a behavior we intend to change.  If necessary I can pick
this up...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logistics for beta3

2010-07-05 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 - Someone will need to branch the tree after the wrap and stamp it
 9.1devel.  Who is doing that?

Marc generally takes care of making branches.

 * bump catalog version for plpython3u change? Use RTLD_GLOBAL? -- I
 don't immediately know what the bit about RTLD_GLOBAL is referring to,
 and it's not obvious to me that anyone cares about the catversion
 issue enough to argue about it.  So maybe there's nothing to do here
 at all.  Anyhow, we should make a decision.

Peter was suggesting that if we *stopped* using RTLD_GLOBAL then it
might be possible to use plpython2 and plpython3 concurrently in one
backend.  After looking at the archives I'm not convinced that's
workable --- it sounds like not using RTLD_GLOBAL would have the effect
of breaking Python's extension scheme altogether.  But at any rate the
ball's in his court to make and test a concrete proposal if he wishes.

 * normalize use of LDFLAGS - I believe Tom is dealing with this.

Yeah, I hope to get that committed today.  Any later than today will not
leave enough time for buildfarm testing before the wrap.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logistics for beta3

2010-07-05 Thread Robert Haas
On Mon, Jul 5, 2010 at 11:01 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 * normalize use of LDFLAGS - I believe Tom is dealing with this.

 Yeah, I hope to get that committed today.  Any later than today will not
 leave enough time for buildfarm testing before the wrap.

Hmm.  So does that mean we need to get log_temp_files fixed today also?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logistics for beta3

2010-07-05 Thread Marc G. Fournier

On Mon, 5 Jul 2010, Robert Haas wrote:


On Mon, Jun 28, 2010 at 2:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Josh Berkus j...@agliodbs.com writes:

Therefore, I propose that we set a beta3 release date for July 8th.
That should give it enough space from the American Holiday.


You mean wrap on Thursday the 8th for release on Monday the 12th?
That'd be fine with me.  Actual release on the 8th would mean asking
people to do release prep work when they should be out watching
fireworks.


AIUI, this is the plan we decided on.  So:

- Someone (presumably Bruce) needs to run pgindent.  Any reason to
wait any longer on that?
- Someone will need to branch the tree after the wrap and stamp it
9.1devel.  Who is doing that?


Me, after I wrap


Marc G. Fournier        Hub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo: yscrappy    Skype: hub.org    ICQ: 7615664    MSN: scra...@hub.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logistics for beta3

2010-07-05 Thread Robert Haas
On Mon, Jul 5, 2010 at 11:18 AM, Marc G. Fournier scra...@hub.org wrote:
 Me, after I wrap

Cool, thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logistics for beta3

2010-07-05 Thread Andrew Dunstan



Marc G. Fournier wrote:


- Someone (presumably Bruce) needs to run pgindent.  Any reason to
wait any longer on that?




The typedefs list on the buildfarm needs to be refreshed. That will take 
me some time, since I wasn't aware we were about to do a pgindent run.


Starting now ...

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logistics for beta3

2010-07-05 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Jul 5, 2010 at 11:01 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah, I hope to get that committed today.  Any later than today will not
 leave enough time for buildfarm testing before the wrap.

 Hmm.  So does that mean we need to get log_temp_files fixed today also?

No, I'm just concerned about the possibility of needing multiple
buildfarm cycles to shake out platform-specific problems with the
LDFLAGS changes.  The log_temp_files business doesn't have any
portability risks AFAICS.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Buildfarm + Git tryouts

2010-07-05 Thread Chris Browne
I'm trying to start preparing buildfarm nodes for the upcoming Git
migration, and have run into a few issues.  I speculate that -hackers
is one of the better places for this to get discussed; if it should be
elsewhere, I'm sure Andrew Dunstan won't be shy to redirect this :-).

What I was hoping to do was to run nodes for a little while against
the repo at git://github.com/oicu/pg-cvs-mirror.git to validate that
it all works against git.

The best instructions available thus far...
http://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto

Unfortunately, I'm finding some anomalies in terms of differences
between my environments and what that suggests.

1.  git clone --mirror is apparently new in git version 1.6

  http://kerneltrap.org/mailarchive/git/2008/8/2/2793244

  My somewhat mouldy Ubuntu node is on git 1.5.4, which doesn't
  include this.

  There's a workaround, as --mirror is just syntactic sugar

   $ git clone --mirror $URL
  may be replaced by
   $ git clone --bare $URL
   $ (cd $(basename $URL) && git remote add --mirror origin $URL)

  I see options:
   a) Require git 1.6 or so (latest stable is 1.7.1.1)
   b) Offer the alternative mirror approach

2.  SCM.pl seems to expect the repository directory to be called 
pgsql (around line #373)

I wound up hacking up the code to head to a specific directory,
which isn't a terribly good hack, but apparently works for now.

-   chdir 'pgsql';
+   chdir '/opt/build-farm/pgsql.git';

It feels as though something's not quite right about how the
nearby Git repository is referenced.

3.  Some problems checking status.

i) Status Line: 491 bad ts parameter - [timestamp omitted] is in the future

I know my clock's reasonable - ntp is reporting I'm within 0.25s of
some stratum 2 nodes.  Is it possible that the buildfarm server is
ill-synced?

ii) Status Line: 460 script version too low

I just pulled a buildfarm build last week, so I'm not sure what this
would relate to.

That's where I've gotten thus far.  Node caracara isn't reporting in
just now because of this testing.  I hope other people will also be
doing similar tryouts so that the revision effort to shift to Git
doesn't wait to *start* until the development repo shifts.
-- 
let name="cbbrowne" and tld="gmail.com" in String.concat "@" [name;tld];;
http://linuxfinances.info/info/linuxdistributions.html
Rules of the Evil Overlord #77. If I have a fit of temporary insanity
and decide to give  the hero the chance to reject a  job as my trusted
lieutenant,  I will  retain enough  sanity to  wait until  my current
trusted  lieutenant  is  out  of  earshot before  making  the  offer.
http://www.eviloverlord.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buildfarm + Git tryouts

2010-07-05 Thread Alvaro Herrera
Excerpts from Chris Browne's message of lun jul 05 12:33:49 -0400 2010:

 3.  Some problems checking status.
 
 i) Status Line: 491 bad ts parameter - [timestamp omitted] is in the future
 
 I know my clock's reasonable - ntp is reporting I'm within 0.25s of
 some stratum 2 nodes.  Is it possible that the buildfarm server is
 ill-synced?

I asked around and was told that the server is in sync currently.  It
has been known to be out of sync before; if this problem still comes up,
maybe there's a software bug or something.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] t_self as system column

2010-07-05 Thread Alvaro Herrera
Is there a reason we don't have t_self as one of the system columns that
you can examine from SQL?  I'd propose its addition otherwise.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] t_self as system column

2010-07-05 Thread Tom Lane
Alvaro Herrera alvhe...@alvh.no-ip.org writes:
 Is there a reason we don't have t_self as one of the system columns that
 you can examine from SQL?  I'd propose its addition otherwise.

pg_attribute bloat?  I'm a bit hesitant to add a row per table for
something we've gotten along without for so long, especially something
with as bizarre a definition as t_self has got.

At one time I was hoping to get rid of explicit entries in pg_attribute
for system columns, which would negate this concern.  I think we're
stuck with them now, though, because of per-column permissions.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buildfarm + Git tryouts

2010-07-05 Thread Andrew Dunstan



Alvaro Herrera wrote:

Excerpts from Chris Browne's message of lun jul 05 12:33:49 -0400 2010:

  

3.  Some problems checking status.

i) Status Line: 491 bad ts parameter - [timestamp omitted] is in the future

I know my clock's reasonable - ntp is reporting I'm within 0.25s of
some stratum 2 nodes.  Is it possible that the buildfarm server is
ill-synced?



I asked around and was told that the server is in sync currently.  It
has been known to be out of sync before; if this problem still comes up,
maybe there's a software bug or something.

  



This discussion really belongs on pgbuildfarm-members, I think.

I have added a small fudge factor to the time test - let's see if that 
changes things.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logistics for beta3

2010-07-05 Thread Andrew Dunstan



Andrew Dunstan wrote:



Marc G. Fournier wrote:


- Someone (presumably Bruce) needs to run pgindent.  Any reason to
wait any longer on that?




The typedefs list on the buildfarm needs to be refreshed. That will 
take me some time, since I wasn't aware we were about to do a 
pgindent run.


Starting now ...




completed.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] t_self as system column

2010-07-05 Thread Robert Haas
On Mon, Jul 5, 2010 at 2:08 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Alvaro Herrera alvhe...@alvh.no-ip.org writes:
 Is there a reason we don't have t_self as one of the system columns that
 you can examine from SQL?  I'd propose its addition otherwise.

 pg_attribute bloat?  I'm a bit hesitant to add a row per table for
 something we've gotten along without for so long, especially something
 with as bizarre a definition as t_self has got.

 At one time I was hoping to get rid of explicit entries in pg_attribute
 for system columns, which would negate this concern.  I think we're
 stuck with them now, though, because of per-column permissions.

Because someone might want to grant per-column permissions on those
columns?  That seems like an awfully thin reason to keep all that
bloat around.  I bet the number of people who have granted per-column
permissions on, say, cmax can be counted on one hand - possibly with
five fingers left over.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] t_self as system column

2010-07-05 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Jul 5, 2010 at 2:08 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 At one time I was hoping to get rid of explicit entries in pg_attribute
 for system columns, which would negate this concern.  I think we're
 stuck with them now, though, because of per-column permissions.

 Because someone might want to grant per-column permissions on those
 columns?  That seems like an awfully thin reason to keep all that
 bloat around.  I bet the number of people who have granted per-column
 permissions on, say, cmax can be counted on one hand - possibly with
 five fingers left over.

I'd agree with that argument for the most part, but I'm not entirely
sure about oid, which has some characteristics of a user-data column.

(OTOH, maybe we could allow just oid to retain an explicit pg_attribute
entry... could be messy though.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pessimal trivial-update performance

2010-07-05 Thread Robert Haas
On Sun, Jul 4, 2010 at 9:48 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Sure.  What you'd need is for HeapTupleSatisfiesVacuum to observe that
 (a) the tuple's xmin and xmax are equal,
 (b) they're equal to my own transaction's XID,
 (c) none of the live snapshots in my backend can see cmin but not cmax,
 (d) cmax < currentCommandId, ensuring that every future snapshot will
    see cmax too (not quite convinced this is certain to hold).
[...]
 Of course, you'd also need to get to HeapTupleSatisfiesVacuum in the
 first place.  The complained-of case lacks any VACUUM call.  Maybe a HOT
 cleanup would happen at the right time but I'm not sure.  If it doesn't,
 adding one would represent a significant expenditure that would usually
 not be repaid.

It looks like a HOT cleanup happens when pd_prune_xid falls behind
OldestXmin.  Normally, we set pd_prune_xid to the xmax of the deleted
tuple, but we could perhaps fudge that here to get the desired
behavior; maybe just set it to FrozenXID.  Where it gets sticky is
that the proposed rules for HeapTupleSatisfiesVacuum() give different
answers depending on who does the vacuuming, so if backend A sets a
hint to say, hey, there's vacuumable stuff on this page, and then backend
B tries to prune it, nothing will happen.  What would be nicer is if
there were a way for the updater to mark the item pointer or tuple in
some way that would make it look vacuumable to everyone, but without
breaking the HOT chain.
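
Just to restate the four quoted conditions in code form -- a rough sketch
only, nothing that exists in the backend today, and
NoLiveSnapshotSeesCminButNotCmax() is an invented placeholder for
condition (c):

static bool
TupleIsDeadToEveryone(HeapTupleHeader tup)
{
	TransactionId	xmin = HeapTupleHeaderGetXmin(tup);
	TransactionId	xmax = HeapTupleHeaderGetXmax(tup);

	return TransactionIdEquals(xmin, xmax) &&						/* (a) */
		TransactionIdIsCurrentTransactionId(xmax) &&				/* (b) */
		NoLiveSnapshotSeesCminButNotCmax(tup) &&					/* (c) */
		HeapTupleHeaderGetCmax(tup) < GetCurrentCommandId(false);	/* (d) */
}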

 Another issue here is that since xmin is certainly within the GlobalXmin
 horizon, it would be essential to preserve the update chain ctid links,
 ie, make the tuple's update predecessor point to its successor.  That
 seems workable for the case of cleaning out an intermediate entry in a
 HOT chain, but not otherwise.

Yeah, that's a shame.  HOT is huge, but it would be great if we had a
way to do partial vacuuming even when the indexed columns are updated.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 64-bit pgbench V2

2010-07-05 Thread Greg Smith
Attached is an updated second rev of the patch I sent a few months ago, 
to expand pgbench to support database scales larger than around 
4,294--where the 32-bit integer for the account number overflows in the 
current version.  The current limit makes for about a 60GB database.  
Last week I ran this on a system with 72GB of RAM, which is already 
quite common, and wasn't able to get a test that didn't fit in RAM.  
Without a bug fix here I am concerned that pgbench will ship in 9.0 
already obsolete for the generation of hardware it is going to be 
deployed on.


The main tricky part was figuring how to convert the \setshell 
implementation.  That uses strtol to parse the number that should have 
been returned by the shell call.  It turns out there are a stack of ways 
to do something similar but return 64 bits instead:


* strtoll is defined by ISO C99
* strtoq was used on some earlier BSD systems
* MSVC has _strtoi64 for signed and _strtoui64 for unsigned 64bit integers

According to the gnulib docs at 
http://www.gnu.org/software/gnulib/manual/html_node/strtoll.html , 
strtoll is missing on HP-UX 11, OSF/1 5.1, Interix 3.5, so one of the 
HP-UX boxes might be a useful testbed for what works on a trickier platform.


For prototype purposes, I wrote the patch to include some minimal logic 
to map the facility available to strtoint64, falling back to the 32-bit 
strtol if that's the best available.  There are three ways I could 
foresee this going:


1) Keep this ugly bit of code isolated to pgbench
2) Move it to src/include/c.h where the other 64-bit int abstraction is done
3) Push the problem toward autoconf

I don't have a clear argument for or against those individual options, 
they all seem reasonable from some perspectives.


The only open issue I'm not sure about is whether the situation where 
the code falls back to 32-bits should be documented, or even a warning 
produced if you create something at a scale without some strtoll 
available.  Given that it only impacts the \setrandom case, it's not 
really a disaster that it might not work, so long as there's 
documentation explaining the potential limitations.  I'll write those if 
necessary, but I think that some testing on known tricky platforms that 
I don't have setup here is the best next step, so I'm looking for 
feedback on that.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index c830dee..e6621e2 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -60,6 +60,20 @@
 #define INT64_MAX	INT64CONST(0x7FFFFFFFFFFFFFFF)
 #endif
 
+/* Try to find a string to 64-bit integer implementation */
+#ifndef strtoint64
+#ifdef strtoll
+#define strtoint64	strtoll
+#elif defined(_strtoi64)
+#define strtoi64	_strtoi64
+#elif defined(strtoq)
+#define strtoi64	strtoq
+#else
+/* Fall back to 32 bit version if no 64-bit version is available */
+#define strtoi64	strtol
+#endif
+#endif
+
 /*
  * Multi-platform pthread implementations
  */
@@ -310,14 +324,14 @@ usage(const char *progname)
 }
 
 /* random number generator: uniform distribution from min to max inclusive */
-static int
-getrand(int min, int max)
+static int64
+getrand(int64 min, int64 max)
 {
 	/*
 	 * Odd coding is so that min and max have approximately the same chance of
 	 * being selected as do numbers between them.
 	 */
-	return min + (int) (((max - min + 1) * (double) random()) / (MAX_RANDOM_VALUE + 1.0));
+	return min + (int64) (((max - min + 1) * (double) random()) / (MAX_RANDOM_VALUE + 1.0));
 }
 
 /* call PQexec() and exit() on failure */
@@ -627,7 +641,7 @@ runShellCommand(CState *st, char *variable, char **argv, int argc)
 	FILE	   *fp;
 	char		res[64];
 	char	   *endptr;
-	int			retval;
+	int64			retval;
 
 	/*
 	 * Join arguments with whilespace separaters. Arguments starting with
@@ -700,7 +714,7 @@ runShellCommand(CState *st, char *variable, char **argv, int argc)
 	}
 
 	/* Check whether the result is an integer and assign it to the variable */
-	retval = (int) strtol(res, &endptr, 10);
+	retval = strtoll(res, &endptr, 10);
 	while (*endptr != '\0' && isspace((unsigned char) *endptr))
 		endptr++;
 	if (*res == '\0' || *endptr != '\0')
@@ -708,7 +722,7 @@ runShellCommand(CState *st, char *variable, char **argv, int argc)
 		fprintf(stderr, "%s: must return an integer ('%s' returned)\n", argv[0], res);
 		return false;
 	}
-	snprintf(res, sizeof(res), "%d", retval);
+	snprintf(res, sizeof(res), INT64_FORMAT, retval);
 	if (!putVariable(st, "setshell", variable, res))
 		return false;
 
@@ -956,8 +970,9 @@ top:
 		if (pg_strcasecmp(argv[0], setrandom) == 0)
 		{
 			char	   *var;
-			int			min,
-		max;
+			int64		min,
+		max,
+		rand;
 			char		res[64];
 
 			if (*argv[2] == ':')
@@ -997,15 +1012,16 @@ top:
 
 			if (max < min || max > MAX_RANDOM_VALUE)
 			{
-				fprintf(stderr, "%s: invalid maximum number %d\n", 

Re: [HACKERS] 64-bit pgbench V2

2010-07-05 Thread Tom Lane
Greg Smith g...@2ndquadrant.com writes:
 The main tricky part was figuring how to convert the \setshell 
 implementation.  That uses strtol to parse the number that should have 
 been returned by the shell call.  It turns out there are a stack of ways 
 to do something similar but return 64 bits instead:

Please choose a way that doesn't introduce new portability assumptions.
The backend gets along fine without strtoll, and I don't see why pgbench
should have to require it.

(BTW, I don't actually believe that the proposed code works at all,
since in general strtoll or other variants aren't going to be macros,
but plain functions.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 64-bit pgbench V2

2010-07-05 Thread Robert Haas
On Mon, Jul 5, 2010 at 8:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Smith g...@2ndquadrant.com writes:
 The main tricky part was figuring how to convert the \setshell
 implementation.  That uses strtol to parse the number that should have
 been returned by the shell call.  It turns out there are a stack of ways
 to do something similar but return 64 bits instead:

 Please choose a way that doesn't introduce new portability assumptions.
 The backend gets along fine without strtoll, and I don't see why pgbench
 should have to require it.

It doesn't seem very palatable to have multiple handwritten integer
parsers floating around the code base either.  Maybe we should try to
standardize something and ship it in src/port, or somesuch.
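
For illustration only, here is a hand-rolled sketch of the kind of helper
that could live in src/port -- hypothetical name, no overflow checking, and
using int64_t from <stdint.h> rather than the backend's int64 typedef just
to keep the example self-contained:

#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>

static bool
pg_parse_int64(const char *str, int64_t *result)
{
	int64_t		val = 0;
	bool		neg = false;
	const char *p = str;

	while (isspace((unsigned char) *p))
		p++;
	if (*p == '-' || *p == '+')
		neg = (*p++ == '-');
	if (!isdigit((unsigned char) *p))
		return false;			/* no digits at all */
	while (isdigit((unsigned char) *p))
		val = val * 10 + (*p++ - '0');	/* note: no overflow check in this sketch */
	while (isspace((unsigned char) *p))
		p++;
	if (*p != '\0')
		return false;			/* trailing garbage */
	*result = neg ? -val : val;
	return true;
}

Something along those lines would sidestep the strtoll portability question
entirely, at the cost of doing our own overflow handling.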

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logistics for beta3

2010-07-05 Thread Robert Haas
On Mon, Jul 5, 2010 at 3:14 PM, Andrew Dunstan and...@dunslane.net wrote:
 Andrew Dunstan wrote:
 Marc G. Fournier wrote:
 - Someone (presumably Bruce) needs to run pgindent.  Any reason to
 wait any longer on that?

 The typedefs list on the buildfarm needs to be refreshed. That will take
 me some time, since I wasn't aware we were about to do a pg_indent run.

 Starting now ...

 completed.

Cool.  So, should we have Bruce go ahead and pgindent now?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Always truncate segments before unlink

2010-07-05 Thread Takahiro Itagaki

Tom Lane t...@sss.pgh.pa.us wrote:
 Truncating seems like an ugly kluge that's not fixing the real problem.
 Why are there open descriptors for a dropped relation?  They should all
 get closed as a consequence of relcache flush.

Relcache will be flushed at the next command, but there could be some
*idle backends* kept by connection pooling. They won't close dropped files
until the shared cache invalidation queue is almost filled, which might take
a long time.

Another solution might be to send the PROCSIG_CATCHUP_INTERRUPT
signal not only when the queue length crosses a threshold but also on a
timeout, where the signal is sent when some messages have been sitting
in the queue longer than 30 sec - 1 min.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logistics for beta3

2010-07-05 Thread Marc G. Fournier

On Mon, 5 Jul 2010, Robert Haas wrote:


Cool.  So, should we have Bruce go ahead and pgindent now?


Yup, as that will give 3 days before wrap / branch to deal with any 
fallout from it :)



Marc G. Fournier        Hub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo: yscrappy    Skype: hub.org    ICQ: 7615664    MSN: scra...@hub.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] get_whatever_oid, part 2

2010-07-05 Thread KaiGai Kohei
(2010/06/15 2:25), Robert Haas wrote:
 Per previous discussion:
 
 http://archives.postgresql.org/pgsql-hackers/2010-05/msg01195.php
 http://archives.postgresql.org/pgsql-hackers/2010-05/msg01577.php
 
 Changes in this patch:
 
 - Rename TSParserGetPrsid to get_tsparser_oid, TSDictionaryGetDictid
 to get_tsdictionary_oid, TSTemplateGetTmplid to get_tstemplate_oid,
 TSConfigGetCfgid to get_tsconfiguration_oid, FindConversionByName to
 get_conversion_oid, and GetConstraintName to get_constraint_oid.
 - Add new functions get_opclass_oid, get_opfamily_oid, get_rule_oid,
 get_rule_oid_without_relid, get_trigger_oid, and get_cast_oid.
 - Refactor existing code to use these functions wherever possible.
 
I checked this patch, and I was surprised that we have so much
code duplication. I believe this patch can effectively eliminate
that duplication, and it is worth applying.
However, here are some points to be fixed/revised.

1. [BUG] DropCast() does not handle missing_ok correctly.

| --- a/src/backend/commands/functioncmds.c
| +++ b/src/backend/commands/functioncmds.c
| @@ -1759,31 +1759,23 @@ DropCast(DropCastStmt *stmt)
|  {
| Oid sourcetypeid;
| Oid targettypeid;
| -   HeapTuple   tuple;
| ObjectAddress object;
|
| /* when dropping a cast, the types must exist even if you use IF EXISTS */
| 	sourcetypeid = typenameTypeId(NULL, stmt->sourcetype, NULL);
| 	targettypeid = typenameTypeId(NULL, stmt->targettype, NULL);
|
| -   tuple = SearchSysCache2(CASTSOURCETARGET,
| -   ObjectIdGetDatum(sourcetypeid),
| -   ObjectIdGetDatum(targettypeid));
| -   if (!HeapTupleIsValid(tuple))
| +   object.classId = CastRelationId;
| +   object.objectId = get_cast_oid(sourcetypeid, targettypeid,
| +  stmt->missing_ok);
| +   object.objectSubId = 0;
| +
| +   if (!OidIsValid(object.objectId))
| {
| -   if (!stmt->missing_ok)
| -   ereport(ERROR,
| -   (errcode(ERRCODE_UNDEFINED_OBJECT),
| -errmsg("cast from type %s to type %s does not exist",
| -   format_type_be(sourcetypeid),
| -   format_type_be(targettypeid))));
| -   else
| -   ereport(NOTICE,
| +   ereport(NOTICE,
|  (errmsg("cast from type %s to type %s does not exist, skipping",
|  format_type_be(sourcetypeid),
|  format_type_be(targettypeid))));
| -
| -   return;
| }
|
| /* Permission check */

You removed the 'return' on the path where get_cast_oid() with missing_ok = true
returns InvalidOid. That seems to me to be a bug that needs fixing.
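
For clarity, the control flow I would expect there is something like this
(a sketch of the fix only, not the committed code):

	object.objectId = get_cast_oid(sourcetypeid, targettypeid,
								   stmt->missing_ok);
	if (!OidIsValid(object.objectId))
	{
		ereport(NOTICE,
				(errmsg("cast from type %s to type %s does not exist, skipping",
						format_type_be(sourcetypeid),
						format_type_be(targettypeid))));
		return;		/* <-- this early exit is what the patch dropped */
	}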


2. We can reduce more code duplication using get_opfamily_oid() and
   get_opclass_oid().

I found translations from a qualified name to schema/object names
in GetIndexOpClass(), RenameOpFamily() and AlterOpFamilyOwner().
The new APIs make it possible to eliminate the code duplication here.

GetIndexOpClass() needs the input type of the operator class, in addition
to its OID, but that can be obtained using get_opclass_input_type().
RenameOpFamily() and AlterOpFamilyOwner() need a copy of the operator
family tuple, in addition to its OID, but that can be obtained using
GetSysCacheCopy1(OPFAMILYOID).
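
For example, RenameOpFamily() could then shrink to roughly the following
sketch (the syscache helper is actually spelled SearchSysCacheCopy1() in the
tree, and amID / opfamilyname stand in for the existing local variables):

	Oid			opfOid;
	HeapTuple	tup;

	opfOid = get_opfamily_oid(amID, opfamilyname, false);

	tup = SearchSysCacheCopy1(OPFAMILYOID, ObjectIdGetDatum(opfOid));
	if (!HeapTupleIsValid(tup))
		elog(ERROR, "cache lookup failed for opfamily %u", opfOid);

	/* ... continue with the rename using the writable tuple copy ... */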


3. Are OpClassCacheLookup() and OpFamilyCacheLookup() still needed?

OpFamilyCacheLookup() is only called from RemoveOpFamily()
(apart from get_opfamily_oid()), because it needs the namespace OID
in addition to the OID of the operator family.
If we had get_opfamily_namespace() in lsyscache.c, could we merge
get_opfamily_oid() and OpFamilyCacheLookup() completely?

OpClassCacheLookup() is only called from RemoveOpClass(),
RenameOpClass() and AlterOpClassOwner(): RemoveOpClass()
needs the namespace OID in addition to the OID of the operator class,
and the rest of them want to copy the HeapTuple in order to update it.
If we had get_opclass_namespace() in lsyscache.c, RemoveOpClass()
could use get_opclass_oid() instead of OpClassCacheLookup().
And we could use a pair of get_opclass_oid() and GetSysCacheCopy1()
instead of OpClassCacheLookup() and heap_copytuple() in
RenameOpClass() and AlterOpClassOwner().


4. The argument names are incorrect.

| --- a/src/include/catalog/namespace.h
| +++ b/src/include/catalog/namespace.h
| @@ -74,16 +74,16 @@ extern bool OpfamilyIsVisible(Oid opfid);
|  extern Oid ConversionGetConid(const char *conname);
|  extern bool ConversionIsVisible(Oid conid);
|
| -extern Oid TSParserGetPrsid(List *names, bool failOK);
| +extern Oid get_tsparser_oid(List *names, bool failOK);
|  extern bool TSParserIsVisible(Oid prsId);
|
| -extern Oid TSDictionaryGetDictid(List *names, bool failOK);
| +extern Oid get_tsdictionary_oid(List *names, bool failOK);
|  extern bool TSDictionaryIsVisible(Oid dictId);
|
| -extern Oid TSTemplateGetTmplid(List *names, bool failOK);
| +extern Oid get_tstemplate_oid(List *names, bool failOK);
|  extern bool TSTemplateIsVisible(Oid