Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-09-03 Thread Greg Smith

On 09/02/2011 11:10 AM, Tomas Vondra wrote:

My 'ideal' solution would be either to add another GUC (to turn this
on/off) or allow log_checkpoints to have three values

log_checkpoints = {off, normal, detailed}

where 'normal' provides the current output and 'detail' produces this much
verbose output.
   


If this is going to be acceptable, that's likely the only path it could 
happen by and still meet what you're looking for.  I will just again 
stress that the part you're working on instrumenting better right now is 
not actually where larger systems really run into the most problems 
here, based on what I've seen.  I added a series of log messages to 9.1 
at DEBUG1, aimed at tracking the sync phase.  That's where I see many 
more checkpoint issues than in the write one.  On Linux in particular, 
it's almost impossible for the write phase to be more of a problem than 
the sync one.


So the logging you're adding here I don't ever expect to turn on.  But I 
wouldn't argue against an option to handle the logging use-case you're 
concerned about.  Letting people observe for themselves and decide which 
of the phases is more interesting to their workload seems appropriate.  
Then users have options for what to log, no matter which type of problem 
they run into.


If you're expanding log_checkpoints to an enum, for that to handle what 
I think everybody might ever want (for what checkpoints do now at 
least), I'd find that more useful if it happened like this instead:


log_checkpoints = {off, on, write, sync, verbose}

I don't think you should change the semantics of off/on, which will 
avoid breaking existing postgresql.conf files and resources that suggest 
tuning advice.  write can toggle on what you're adding; sync should 
control whether the DEBUG1 messages showing the individual file names in 
the sync phase appear; and verbose can include both.


As far as a heuristic for making this less chatty when there's nothing 
exciting happening goes, I think something based on how much time has 
passed would be the best one.  In your use case, I would guess you don't 
really care whether a message appears every n%.  If I understand you 
correctly now, you would mainly care about getting enough log detail to 
know 1) when things are running really slow, or b) often enough that the 
margin of error in your benchmark results from unaccounted checkpoint 
writes is acceptable.  In both of those cases, I'd think a time-based 
threshold would be appropriate, and that also deals with the time-based 
checkpoints, too.


If your logging criteria for the write phase was display a message any 
time more than 30 seconds have passed since last seeing one, that would 
give you only a few lines of output in a boring, normal 
checkpoint--certainly less than the 9 in-progress samples you're 
outputting now, at 10% intervals.  But in the pathological situations 
where writes are super slow, your log data would become correspondingly 
denser, which is exactly what you want in that situation.


I think combining the two makes the most sense:  log when =30 seconds 
have passed since the last message, and there's been =10% more progress 
made.  (Maybe do the progress check before the time one, to cut down on 
gettimeofday() calls)  That would give you 4 in-progress reports during 
a standard 2.5 minute write phase, and in cases where the checkpoints 
are taking a long time you'd get as many as 9.  That's pretty close to 
auto-tuning the amount of log output, so the amount of it is roughly 
proportional to how likely the information it's logging will be 
interesting.


We certainly don't want to add yet another GUC just to control the 
frequency.  I don't think it will be too hard to put two hard-coded 
thresholds in and do good enough for just about everyone though.  I 
would probably prefer setting those thresholds to 60 seconds/20% 
instead.  That might not be detailed enough for you though.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade automatic testing

2011-09-03 Thread Peter Eisentraut
On fre, 2011-09-02 at 19:49 -0400, Tom Lane wrote:
 The only one that's problematic is pg_regress.so; contrib modules are
 already installed in $libdir.  I still think that installing
 pg_regress.so in $libdir may be the most reasonable solution, assuming
 that the delta involved isn't too great.  Yeah, we would have to
 back-patch the changes into every release branch we want to test
 upgrading from, but how risky is that really?  The *only* thing it
 affects is the regression tests.

Or maybe make use of dynamic_library_path.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade automatic testing

2011-09-03 Thread Bruce Momjian
Peter Eisentraut wrote:
 But if you think about it, it doesn't really test pg_upgrade, it tests
 pg_dump.  So the test could just as well be moved to src/bin/pg_dump/
 and be labeled pg_dump smoke test or whatever.  (Minor detail: The bug
 fix above involved the --binary-upgrade flag, so it is somewhat
 pg_upgrade related.)
 
 A real pg_upgrade test suite should naturally upgrade across binary
 incompatible versions.  The real question is how you develop a useful
 test input.  Most pg_upgrade issues are not bugs of omission or
 regression but unexpected corner cases discovered with databases of
 nontrivial usage patterns.  (The recent one related to upgrade from 8.3
 is an exception.)  Because the basic premise of pg_upgrade is, dump and
 restore the schema, move over the files, that's it, and the rest of the
 code is workarounds for obscure details that are difficult to anticipate
 let alone test for.

You might want to read my blog entry on why pg_upgrade relies so much on
external tools:

http://momjian.us/main/blogs/pgblog/2011.html#June_15_2011_2

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sha1, sha2 functions into core?

2011-09-03 Thread k...@rice.edu
On Fri, Sep 02, 2011 at 04:27:46PM -0500, Ross J. Reedstrom wrote:
 On Fri, Sep 02, 2011 at 02:05:45PM -0500, k...@rice.edu wrote:
  On Fri, Sep 02, 2011 at 09:54:07PM +0300, Peter Eisentraut wrote:
   On ons, 2011-08-31 at 13:12 -0500, Ross J. Reedstrom wrote:
Hmm, this thread seems to have petered out without a conclusion. Just
wanted to comment that there _are_ non-password storage uses for these
digests: I use them in a context of storing large files in a bytea
column, as a means to doing data deduplication, and avoiding pushing
files from clients to server and back.
   
   But I suppose you don't need the hash function in the database system
   for that.
   
  
  It is very useful to have the same hash function used internally by
  PostgreSQL exposed externally. I know you can get the code and add an
  equivalent one of your own...
  
 Thanks for the support Ken, but Peter's right: the only backend use in
 my particular case is to let the backend do the hash calc during bulk
 loads: in the production code path, having the hash in two places
 doesn't save any work, since the client code has to calculate the hash
 in order to test for its existence in the backend. I suppose if the
 network cost was negligable, I could just push the files anyway, and
 have a before-insert trigger calculate the hash and do the dedup: then
 it'd be hidden in the backend completely. But as is, I can do all the
 work in the client.
 

While it is true that it doesn't save any work. My motivation for having
it exposed is that good hash functions are non-trivial to find. I have
dealt with computational artifacts produced by hash functions that seemed
at first to be good. We use a very well behaved function within the data-
base and exposing it will help prevent bad user hash function
implementations.

Regards,
Ken

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL low watermark during base backup

2011-09-03 Thread Dimitri Fontaine
Magnus Hagander mag...@hagander.net writes:

 Attached patch implements a low watermark wal location in the
 walsender shmem array. Setting this value in a walsender prevents
 transaction log removal prior to this point - similar to how
 wal_keep_segments work, except with an absolute number rather than

Cool.  The first use case that comes to my mind is when to clean old WAL
files when using multiple standby servers.  Will it help here?

 relative. For now, this is set when running a base backup with WAL
 included - to prevent the required WAL to be recycled away while the
 backup is running, without having to guestimate the value for
 wal_keep_segments.

I would have guessed that if you stream WALs in parallel of the backup,
and begin streaming before you pg_start_backup(), you don't need
anything more.  Is that wrong?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_restore --no-post-data and --post-data-only

2011-09-03 Thread Andrew Dunstan



On 09/03/2011 04:49 PM, Dimitri Fontaine wrote:

Andrew Dunstanand...@dunslane.net  writes:

Oh, I meant just having it create separate custom format files for each
database. As shell scripts all over the world have been doing for years,
but it would be nice if it was simply built in.

I guess it could be done, although I'm not going to do it :-) I'm more about
making somewhat hard things easier than easy things slightly easier :-)

Then what about issuing an archive (tar or ar format here) containing
one custom file per database plus the globals file, SQL, plus maybe a
database listing, and hacking pg_restore so that it knows what to do
with such an input ?

Bonus points if that supports the current -l and -L options, of course.




That's probably a lot of code for a little benefit, at least from my 
POV, but others might find it useful.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Large C files

2011-09-03 Thread Bruce Momjian
FYI, here are all the C files with over 6k lines:

-  45133 ./interfaces/ecpg/preproc/preproc.c
-  33651 ./backend/parser/gram.c
-  17551 ./backend/parser/scan.c
   14209 ./bin/pg_dump/pg_dump.c
   10590 ./backend/access/transam/xlog.c
9764 ./backend/commands/tablecmds.c
8681 ./backend/utils/misc/guc.c
-   7667 ./bin/psql/psqlscan.c
7213 ./backend/utils/adt/ruleutils.c
6814 ./backend/utils/adt/selfuncs.c
6176 ./backend/utils/adt/numeric.c
6030 ./pl/plpgsql/src/pl_exec.c

I have dash-marked the files that are computer-generated.  It seems
pg_dump.c and xlog.c should be split into smaller C files.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade automatic testing

2011-09-03 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On fre, 2011-09-02 at 19:49 -0400, Tom Lane wrote:
 The only one that's problematic is pg_regress.so; contrib modules are
 already installed in $libdir.  I still think that installing
 pg_regress.so in $libdir may be the most reasonable solution, assuming
 that the delta involved isn't too great.  Yeah, we would have to
 back-patch the changes into every release branch we want to test
 upgrading from, but how risky is that really?  The *only* thing it
 affects is the regression tests.

 Or maybe make use of dynamic_library_path.

That seemed like a promising idea at first, but on reflection I thought
probably it's not such a great idea.  The problem is, where do you
inject the setting?  In a temp installation we could put it in
postgresql.conf, but for make installcheck the only way that seems
like it would work at all is to apply it as a database configuration
(ALTER DATABASE SET), and that seems problematic.  In particular, it
would not work for testing pg_dump, since pg_dump doesn't copy those
settings.  (I know we've talked about making it do so, but we'd
certainly not wish to back-patch such a change.)

(BTW, this also strikes me as a counterexample for the recently
proposed change to make pg_dumpall dump such settings at the end.
If you've got datatypes or indexes that depend on a shared library
that's found via ALTER DATABASE SET dynamic_library_path, it will
absolutely not work to restore the database contents before that
setting is applied.)

Anyway, after giving up on that I went back to plan A, namely install
regress.so and friends into $libdir.  That turns out to be really quite
straightforward, though I had to hack pg_regress.c a bit to get its idea
of $libdir to match up exactly with the way the backend sees it.
(The only reason this matters is that there's one error report in the
regression tests where the full expansion of $libdir is reported.
Maybe we should just drop that one test case instead of maintaining
the infrastructure for replacing @libdir@ in pg_regress.c.)

Attached is a draft patch for HEAD.  It passes make check and make
installcheck on Unix, but I've not touched the MSVC scripts.
Comments?

regards, tom lane

diff --git a/contrib/dummy_seclabel/Makefile b/contrib/dummy_seclabel/Makefile
index 105400f..3a5f968 100644
*** a/contrib/dummy_seclabel/Makefile
--- b/contrib/dummy_seclabel/Makefile
*** top_builddir = ../..
*** 12,14 
--- 12,24 
  include $(top_builddir)/src/Makefile.global
  include $(top_srcdir)/contrib/contrib-global.mk
  endif
+ 
+ # Build/install only the files needed for regression test support
+ REGRESSION_MODULES = dummy_seclabel
+ 
+ .PHONY: regression-modules install-regression-modules
+ 
+ regression-modules: $(addsuffix $(DLSUFFIX), $(REGRESSION_MODULES))
+ 
+ install-regression-modules: $(addsuffix $(DLSUFFIX), $(REGRESSION_MODULES))
+ 	$(INSTALL_SHLIB) $(addsuffix $(DLSUFFIX), $(REGRESSION_MODULES)) '$(DESTDIR)$(pkglibdir)/'
diff --git a/contrib/spi/Makefile b/contrib/spi/Makefile
index 0c11bfc..83578fb 100644
*** a/contrib/spi/Makefile
--- b/contrib/spi/Makefile
*** top_builddir = ../..
*** 28,30 
--- 28,40 
  include $(top_builddir)/src/Makefile.global
  include $(top_srcdir)/contrib/contrib-global.mk
  endif
+ 
+ # Build/install only the files needed for regression test support
+ REGRESSION_MODULES = autoinc refint
+ 
+ .PHONY: regression-modules install-regression-modules
+ 
+ regression-modules: $(addsuffix $(DLSUFFIX), $(REGRESSION_MODULES))
+ 
+ install-regression-modules: $(addsuffix $(DLSUFFIX), $(REGRESSION_MODULES))
+ 	$(INSTALL_SHLIB) $(addsuffix $(DLSUFFIX), $(REGRESSION_MODULES)) '$(DESTDIR)$(pkglibdir)/'
diff --git a/src/test/regress/GNUmakefile b/src/test/regress/GNUmakefile
index 90aea6c..96839db 100644
*** a/src/test/regress/GNUmakefile
--- b/src/test/regress/GNUmakefile
*** EXTRADEFS = '-DHOST_TUPLE=$(host_tuple)
*** 41,47 
  
  # Build regression test driver
  
! all: pg_regress$(X)
  
  pg_regress$(X): pg_regress.o pg_regress_main.o | submake-libpgport
  	$(CC) $(CFLAGS) $^ $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@
--- 41,47 
  
  # Build regression test driver
  
! all: pg_regress$(X) regression-modules
  
  pg_regress$(X): pg_regress.o pg_regress_main.o | submake-libpgport
  	$(CC) $(CFLAGS) $^ $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@
*** pg_regress.o: pg_regress.c $(top_builddi
*** 53,65 
  $(top_builddir)/src/port/pg_config_paths.h: $(top_builddir)/src/Makefile.global
  	$(MAKE) -C $(top_builddir)/src/port pg_config_paths.h
  
! install: all installdirs
  	$(INSTALL_PROGRAM) pg_regress$(X) '$(DESTDIR)$(pgxsdir)/$(subdir)/pg_regress$(X)'
  
  installdirs:
  	$(MKDIR_P) '$(DESTDIR)$(pgxsdir)/$(subdir)'
  
! uninstall:
  	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/pg_regress$(X)'
  
  
--- 53,65 
  $(top_builddir)/src/port/pg_config_paths.h: $(top_builddir)/src/Makefile.global
  	$(MAKE) -C 

[HACKERS] Re: [COMMITTERS] pgsql: Remove fmgr.h include in cube contrib --- caused crash on a Ge

2011-09-03 Thread Bruce Momjian
Alvaro Herrera wrote:
 Excerpts from Bruce Momjian's message of vie sep 02 12:20:50 -0300 2011:
 
  Wow, that is interesting.  So the problem is the inclusion of
  replication/walsender.h in xlog.h.  Hard to see how that could cause the
  cube regression tests to fail, but of course, it is.
 
 Hmm, so you included walsender.h into xlog.h?  That seems a bit funny
 considering that walsender.h already includes xlog.h.  It seems the
 reason for this is only the AllowCascadeReplication() definition.  Maybe
 that should go elsewhere instead, for example walsender.h?

Moved to walsender.h. Thanks.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Remove fmgr.h include in cube contrib --- caused crash on a Ge

2011-09-03 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Alvaro Herrera wrote:
 Hmm, so you included walsender.h into xlog.h?  That seems a bit funny
 considering that walsender.h already includes xlog.h.  It seems the
 reason for this is only the AllowCascadeReplication() definition.  Maybe
 that should go elsewhere instead, for example walsender.h?

 Moved to walsender.h. Thanks.

You seem to have entirely missed the point of Alvaro's remark, which is
that you've got xlog.h including walsender.h (still) as well as
walsender.h including xlog.h.  That's broken.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Remove fmgr.h include in cube contrib --- caused crash on a Ge

2011-09-03 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Alvaro Herrera wrote:
  Hmm, so you included walsender.h into xlog.h?  That seems a bit funny
  considering that walsender.h already includes xlog.h.  It seems the
  reason for this is only the AllowCascadeReplication() definition.  Maybe
  that should go elsewhere instead, for example walsender.h?
 
  Moved to walsender.h. Thanks.
 
 You seem to have entirely missed the point of Alvaro's remark, which is
 that you've got xlog.h including walsender.h (still) as well as
 walsender.h including xlog.h.  That's broken.

Oh, OK, done.  xlog.h removed from walsender.h and tested.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Remove fmgr.h include in cube contrib --- caused crash on a Ge

2011-09-03 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Tom Lane wrote:
 You seem to have entirely missed the point of Alvaro's remark, which is
 that you've got xlog.h including walsender.h (still) as well as
 walsender.h including xlog.h.  That's broken.

 Oh, OK, done.  xlog.h removed from walsender.h and tested.

I would have gone the other way on that one, if possible.  Seems like
xlog.h ought to be the lower-level file.

Some quick mechanical analysis shows that's not the only circular
inclusion path in our headers, either: we have storage/proc.h including
replication/syncrep.h which includes storage/proc.h.  That one appears
to be Simon's fault not yours though.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] xlogdump 0.4.0

2011-09-03 Thread Satoshi Nagayasu
Hi hackers,

I'm pleased to announce the latest release of xlogdump.
xlogdump is a tool for extracting data from WAL segment files.

Here is xlogdump README:
https://github.com/snaga/xlogdump/blob/master/xlogdump/README.xlogdump

xlogdump was originally developed by Tom Lane and Diogo Biazus around
a five years ago, but not be maintained these few years.

So, I picked it up and have fixed some problems and enhanced.
It has still several issues, but it seems to start working,
so it's time to try it again, folks!

https://github.com/snaga/xlogdump/downloads

xlogdump is now able to be compiled against 8.2~9.0 (it was tough fixes!),
so it could be a fun (of course, also useful) stuff for many hackers and
users.

Please visit the new project repository, and try it.

https://github.com/snaga/xlogdump

If you have any bug reports and/or comments, please feel free to email me.

Thanks in advance,

p.s.
I really appreciate Tom and Diogo for their previous great work.
-- 
NAGAYASU Satoshi satoshi.nagay...@gmail.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Remove fmgr.h include in cube contrib --- caused crash on a Ge

2011-09-03 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Tom Lane wrote:
  You seem to have entirely missed the point of Alvaro's remark, which is
  that you've got xlog.h including walsender.h (still) as well as
  walsender.h including xlog.h.  That's broken.
 
  Oh, OK, done.  xlog.h removed from walsender.h and tested.
 
 I would have gone the other way on that one, if possible.  Seems like
 xlog.h ought to be the lower-level file.

Agreed.  Let me work on that.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Remove fmgr.h include in cube contrib --- caused crash on a Ge

2011-09-03 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Tom Lane wrote:
 I would have gone the other way on that one, if possible.  Seems like
 xlog.h ought to be the lower-level file.

 Agreed.  Let me work on that.

Uh, I just did it.  Painful.  It would have been a lot easier before
the pgrminclude run, because that baked-in a whole lot of bad decisions.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers