Re: [HACKERS] patch to slightly improve clarity of a comment in postgresql.conf.sample

2011-08-26 Thread Dougal Sutherland
Sure, I like that better.


On Thu, Aug 25, 2011 at 9:29 AM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Aug 24, 2011 at 7:33 AM, Dougal Sutherland dou...@gmail.com
 wrote:
  The attached change to postgresql.conf.sample makes it more clear at a
  glance that the default value of listen_addresses is 'localhost', not
  'localhost, *'. This would have saved a friend an hour or two of fiddling
  tonight.

 How about:

# defaults to 'localhost'; use '*' for all
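
 For context, the whole entry in postgresql.conf.sample would then read
 roughly as follows (the surrounding lines here are an illustration based
 on the stock file, not part of the proposed patch):

 #listen_addresses = 'localhost'   # what IP address(es) to listen on;
                                   # comma-separated list of addresses;
                                   # defaults to 'localhost'; use '*' for all
                                   # (change requires restart)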

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company



Re: [HACKERS] Removal of useless include references

2011-08-26 Thread Christian Ullrich

* Bruce Momjian wrote:


The attached patch removes unneeded include references, and marks some
includes as needing to be skipped by pgrminclude.


There are several unrelated changes to pg_upgrade in that patch, too.

--
Christian





Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-26 Thread Greg Smith

On 08/25/2011 04:57 PM, Tomas Vondra wrote:

(b) sends bgwriter stats (so that the buffers_checkpoint is updated)
   


The idea behind only updating the stats in one chunk, at the end, is 
that it makes one specific thing easier to do.  Let's say you're running 
a monitoring system that is grabbing snapshots of pg_stat_bgwriter 
periodically.  If you want to figure out how much work a checkpoint did, 
you only need two points of data to compute that right now.  Whenever 
you see either of the checkpoint count numbers increase, you just 
subtract off the previous sample; now you've got a delta for how many 
buffers that checkpoint wrote out.  You can derive the information about 
the buffer counts involved that appears in the logs quite easily this 
way.  The intent was to make that possible to do, so that people can 
figure this out without needing to parse the log data.
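
As a sketch of that two-sample arithmetic, assuming the monitoring
system saves each snapshot into a table (the bgwriter_snap table here
is illustrative; the pg_stat_bgwriter columns are the real ones):

-- one-time setup, then one INSERT per monitoring pass
CREATE TABLE bgwriter_snap AS
    SELECT now() AS snap_time, * FROM pg_stat_bgwriter;
INSERT INTO bgwriter_snap SELECT now(), * FROM pg_stat_bgwriter;

-- when checkpoints_timed + checkpoints_req increased between samples,
-- the checkpoint's write volume is the buffers_checkpoint delta
SELECT max(buffers_checkpoint) - min(buffers_checkpoint) AS bufs_written
  FROM (SELECT buffers_checkpoint FROM bgwriter_snap
        ORDER BY snap_time DESC LIMIT 2) AS last_two;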


Spreading out the updates defeats that idea.  It also makes it possible 
to see the buffer writes more in real-time, as they happen.  You can 
make a case for both approaches having their use cases; the above is 
just summarizing the logic behind why it's done the way it is right 
now.  I don't think many people are actually doing things with this to 
the level where their tool will care.  The most popular consumer of 
pg_stat_bgwriter data I see is Munin graphing changes, and I don't think 
it will care either way.


Giving people the option of doing it the other way is a reasonable idea, 
but I'm not sure there's enough use case there to justify adding a GUC 
just for that.  My next goal here is to eliminate checkpoint_segments, 
not to add yet another tunable extremely few users would ever touch.


As for throwing more log data out, I'm not sure what new analysis you're 
thinking of that it allows.  I/O gets increasingly spiky as you zoom in 
on it; averaging over a shorter period can easily end up providing less 
insight about trends.  If anything, I spend more time summarizing the 
data that's already there, rather than wanting to break it down.  It's 
already providing way too much detail for most people.  Customers tell 
me they don't care to see checkpoint stats unless they're across a day 
or more of sampling, so even the current once every ~5 minutes is way 
more info than they want.  I have all this log parsing code and things 
that look at pg_stat_bgwriter to collect that data and produce higher 
level reports.  And lots of it would break if any of this patch is added 
and people turn it on.  I imagine other log/stat parsing programs might 
suffer issues too.  That's your other hurdle for change here:  the new 
analysis techniques have to be useful enough to justify that some 
downstream tool disruption is inevitable.


If you have an idea for how to use this extra data for something useful, 
let's talk about what that is and see if it's possible to build it in 
instead.  This problem is harder than it looks, mainly because the way 
the OS caches writes here makes trying to derive hard numbers from what 
the background writer is doing impossible.  When the database writes 
things out, and when they actually get written to disk, they are not the 
same event.  The actual write often happens during the sync phase, and
not being able to track that beast is where I see the most problems.
The write phase, the easier part to instrument in the database, is
pretty boring.  That's why the last extra logging I added here focused
on adding visibility to the sync activity instead.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us




Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-26 Thread Magnus Hagander
On Thu, Aug 25, 2011 at 22:57, Tomas Vondra t...@fuzzy.cz wrote:
 Hello,

 I'd like to propose a small patch that allows better checkpoint progress
 monitoring. The patch is quite simple - it adds a new integer GUC
 checkpoint_update_limit, and every time the checkpoint writes this number
 of buffers, it does two things:

 (a) logs a checkpoint status message into the server log, with info
 about the total number of buffers to write, the number of buffers already
 written, current and average write speed, and an estimate of remaining time

 (b) sends bgwriter stats (so that the buffers_checkpoint is updated)

 I believe this will make checkpoint tuning easier, especially with large
 shared buffers and when there's other write activity (so that it's
 difficult to see checkpoint I/O).

 The default value (0) means this continuous logging is disabled.

This seems like the wrong thing to write to the log. It's really only
useful *during* the checkpoint run, isn't it? If so, I think it should
go in a pg_stat view. In theory, this could be the "progress view" or
"progress field" talked about around Greg's previous patch - or it
could just be modifying the commandstring in pg_stat_activity. Either
way, it should be updated in shared memory in that case (like current
query is), and not sent with a message to the collector.

IMHO, of course ;)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-26 Thread Tomas Vondra
On 26 August 2011, 9:35, Greg Smith wrote:
 On 08/25/2011 04:57 PM, Tomas Vondra wrote:
 (b) sends bgwriter stats (so that the buffers_checkpoint is updated)

 As for throwing more log data out, I'm not sure what new analysis you're
 thinking of that it allows.  I/O gets increasingly spiky as you zoom in
 on it; averaging over a shorter period can easily end up providing less
 insight about trends.  If anything, I spend more time summarizing the
 data that's already there, rather than wanting to break it down.  It's
 already providing way too much detail for most people.  Customers tell
 me they don't care to see checkpoint stats unless they're across a day
 or more of sampling, so even the current once every ~5 minutes is way
 more info than they want.  I have all this log parsing code and things
 that look at pg_stat_bgwriter to collect that data and produce higher
 level reports.  And lots of it would break if any of this patch is added
 and people turn it on.  I imagine other log/stat parsing programs might
 suffer issues too.  That's your other hurdle for change here:  the new
 analysis techniques have to be useful enough to justify that some
 downstream tool disruption is inevitable.

I was aware that by continuously updating pg_stat_bgwriter, the data won't
be synchronized (i.e. the buffers_checkpoint counters will change while
the number of requested/timed checkpoints remain the same).

But does that really break the tools that process the data? When you're
working with summarized data, the result should be more or less the same,
as the difference will be smoothed out by averaging etc. There can only
ever be one in-progress checkpoint, so if you get 24 checkpoints a day,
the difference will be 1/24 of a checkpoint. Yes, it's a difference.

A really crazy workaround would be to change checkpoints_requested /
checkpoints_timed to double, and use that to indicate current progress of
the checkpoint. So for example 10.54 would mean 10 checkpoints completed,
plus one checkpoint in progress that has already written 54% of its blocks.
But yes, that's a bit crazy.
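
A sketch of how a consumer would split such a fractional counter apart
(hypothetical, of course, since the column is an integer today):

select floor(checkpoints_timed) as checkpoints_completed,
       100 * (checkpoints_timed - floor(checkpoints_timed)) as pct_written
  from pg_stat_bgwriter;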

 If you have an idea for how to use this extra data for something useful,
 let's talk about what that is and see if it's possible to build it in
 instead.  This problem is harder than it looks, mainly because the way
 the OS caches writes here makes trying to derive hard numbers from what
 the background writer is doing impossible.  When the database writes
 things out, and when they actually get written to disk, they are not the
 same event.  The actual write often happens during the sync phase, and
 not being able to track that beast is where I see the most problems.
 The write phase, the easier part to instrument in the database, is
 pretty boring.  That's why the last extra logging I added here focused
 on adding visibility to the sync activity instead.

Hmmm, let me explain what led me to this patch - right now I'm doing a
comparison of filesystems with various block sizes (both fs and db
blocks). I've realized that the db block size significantly influences the
frequency of checkpoints and the amount of data to write, so I'm collecting
data from pg_stat_bgwriter too. The benchmark goes like this:

1. collect pg_stat_bgwriter stats
2. run pgbench for 10 minutes
3. collect pg_stat_bgwriter stats (to compute difference with (1))
4. kill the postmaster
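
In shell form, roughly (the database name and the stop method are
illustrative):

psql bench -c "select * from pg_stat_bgwriter" > stats.before   # step 1
pgbench -T 600 bench                                            # step 2
psql bench -c "select * from pg_stat_bgwriter" > stats.after    # step 3
pg_ctl -D "$PGDATA" stop -m immediate                           # step 4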

The problem is that when checkpoint stats are collected, there might be a
checkpoint in progress and in that case the stats are incomplete. In some
cases (especially with very small db blocks) this has significant impact
because the checkpoints are less frequent.

I can't infer this from other data (e.g. iostat), because that does not
tell me what I/O is caused by the checkpoint.

Yes, this does not consider sync timing, but in my case that's not a big
issue (the page cache is rather small so the data are actually forced to
the drive soon).

I could probably live with keeping the current pg_stat_bgwriter logic
(i.e. updating just once) and writing checkpoint status just to the log. I
don't think that should break any current tools that parse logs, because
the message is completely different (prefixed with 'checkpoint status') so
any reasonably written tool should be OK.

Tomas




Re: [HACKERS] [v9.1] sepgsql - userspace access vector cache

2011-08-26 Thread Kohei KaiGai
Robert, thanks for your review.

 For me, the line you removed from dml.out causes the regression tests to fail.

Fixed. Why did I remove this line??

 I don't understand what this is going for:

 +       /*
 +        * To boost up trusted procedure checks on db_procedure object
 +        * class, we also confirm the decision when user calls a procedure
 +        * labeled as 'tcontext'.
 +        */

 Can you explain?

Yes. It also caches the expected security label when a client labeled
as scontext tries to execute a procedure labeled as tcontext, to reduce
the number of system call invocations in fmgr_hook and needs_fmgr_hook.
If the expected security label is not the same as scontext, it means
the procedure performs as a trusted procedure that switches the security
label of the client during its execution; like a security invoker
function.
The pair of security labels is the only factor that determines whether
the procedure is a trusted procedure or not. Thus, it is suitable to
cache in the userspace avc.

As an aside, the reason why we don't cache the default security label
assigned to newly created named objects (such as tables, ...) is that
selinux allows an exceptional default security label to be set up on a
particular name, so it is not suitable for the avc structure.
(I'm waiting for this interface to be included into libselinux.)

 sepgsql_avc_check_perms_label has a formatting error on the line that
 says "result = false".  It's not indented correctly.

OK, I fixed it.

 Several functions do this: sepgsql_avc_check_valid(); do { ... } while
 (!sepgsql_avc_check_valid);  I don't understand why we need a loop
 there.

It prevents inconsistent access control decisions caused by a
concurrent security policy reload.
I want the following steps to be executed atomically:
 1) Look up the object class number on the kernel side
 2) Look up the permission bits on the kernel side
 3) Ask the kernel side for its access control decision

selinux_status_updated returns 1 if any selinux status on the kernel
side (one that requires flushing the userspace caches) has changed
since the last invocation.
In that case, we retry the whole process from the beginning to ensure
the entire access control decision is made under either the old or the
new policy.
That is why I enclosed these blocks in a do {...} while() loop.
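
A rough C sketch of that pattern (the lookup_* and ask_* helpers are
placeholders for illustration, not the actual uavc.c functions;
selinux_status_updated() is the real libselinux call):

#include <selinux/selinux.h>
#include <stdbool.h>

/* placeholders standing in for the real kernel-side queries */
static security_class_t lookup_object_class(void);
static access_vector_t lookup_permission_bits(void);
static bool ask_kernel_decision(const char *scontext, const char *tcontext,
                                security_class_t tclass, access_vector_t av);

static bool
check_perms_atomically(const char *scontext, const char *tcontext)
{
    bool        allowed;

    do
    {
        security_class_t tclass = lookup_object_class();    /* step 1 */
        access_vector_t  av = lookup_permission_bits();     /* step 2 */

        allowed = ask_kernel_decision(scontext, tcontext,
                                      tclass, av);          /* step 3 */
    } while (selinux_status_updated() > 0); /* policy reloaded? redo all */

    return allowed;
}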

 The comment for sepgsql_avc_check_perms_label uses the word "elsewhere"
 when it really means "otherwise".

OK, I fixed it.

 Changing the calling sequence of sepgsql_get_label() would perhaps be
 better separated out into its own patch.

OK, I reverted it.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp
 configure.in   |4 +-
 contrib/sepgsql/Makefile   |2 +-
 contrib/sepgsql/dml.c  |   59 +++---
 contrib/sepgsql/hooks.c|   64 +++---
 contrib/sepgsql/proc.c |   68 ++-
 contrib/sepgsql/relation.c |   69 +++
 contrib/sepgsql/schema.c   |   39 ++--
 contrib/sepgsql/selinux.c  |2 +-
 contrib/sepgsql/sepgsql.h  |   18 ++-
 contrib/sepgsql/uavc.c |  511 
 doc/src/sgml/sepgsql.sgml  |   12 +-
 11 files changed, 649 insertions(+), 199 deletions(-)

diff --git a/configure.in b/configure.in
index a844afc..b444358 100644
--- a/configure.in
+++ b/configure.in
@@ -964,8 +964,8 @@ fi
 
 # for contrib/sepgsql
 if test $with_selinux = yes; then
-  AC_CHECK_LIB(selinux, selinux_sepgsql_context_path, [],
-   [AC_MSG_ERROR([library 'libselinux', version 2.0.93 or newer, is required for SELinux support])])
+  AC_CHECK_LIB(selinux, selinux_status_open, [],
+   [AC_MSG_ERROR([library 'libselinux', version 2.0.99 or newer, is required for SELinux support])])
 fi
 
 # for contrib/uuid-ossp
diff --git a/contrib/sepgsql/Makefile b/contrib/sepgsql/Makefile
index 1978ccf..e273d8f 100644
--- a/contrib/sepgsql/Makefile
+++ b/contrib/sepgsql/Makefile
@@ -1,7 +1,7 @@
 # contrib/sepgsql/Makefile
 
 MODULE_big = sepgsql
-OBJS = hooks.o selinux.o label.o dml.o \
+OBJS = hooks.o selinux.o uavc.o label.o dml.o \
 	schema.o relation.o proc.o
 DATA_built = sepgsql.sql
 REGRESS = label dml misc
diff --git a/contrib/sepgsql/dml.c b/contrib/sepgsql/dml.c
index 22666b7..3199337 100644
--- a/contrib/sepgsql/dml.c
+++ b/contrib/sepgsql/dml.c
@@ -150,12 +150,11 @@ check_relation_privileges(Oid relOid,
 		  uint32 required,
 		  bool abort)
 {
-	char		relkind = get_rel_relkind(relOid);
-	char	   *scontext = sepgsql_get_client_label();
-	char	   *tcontext;
+	ObjectAddress	object;
 	char	   *audit_name;
 	Bitmapset  *columns;
 	int			index;
+	char		relkind = get_rel_relkind(relOid);
 	bool		result = true;
 
 	/*
@@ -184,45 +183,43 @@ check_relation_privileges(Oid relOid,
 	/*
 	 * Check permissions on the relation
 	 */
-	tcontext = sepgsql_get_label(RelationRelationId, relOid, 0);
-	audit_name = getObjectDescriptionOids(RelationRelationId, relOid);
+	object.classId = RelationRelationId;
+	object.objectId = relOid;
+	object.objectSubId = 0;
+	audit_name = 

Re: [HACKERS] tsvector concatenation - backend crash

2011-08-26 Thread jesper
 Hi

 The attached SQL file gives (at least in my hands) a reliable backend crash
 with this stacktrace .. reproduced on both 9.0.4 and HEAD. I'm sorry
 I cannot provide a more trimmed-down set of vectors that reproduces the
 bug, thus the obfuscated dataset. But even deleting single terms in the
 vectors makes the bug go away.

OK, I found 8.3.0 to be good, so I ran a git bisect on it.. it gave
me this commit:

e6dbcb72fafa4031c73cc914e829a6dec96ab6b6 is the first bad commit
commit e6dbcb72fafa4031c73cc914e829a6dec96ab6b6
Author: Tom Lane t...@sss.pgh.pa.us
Date:   Fri May 16 16:31:02 2008 +

Extend GIN to support partial-match searches, and extend tsquery to
support
prefix matching using this facility.

Teodor Sigaev and Oleg Bartunov

:04 04 febf59ba02bcd4ce3863e880c6bbd989e0b7b1d2
5e96383e628dd27b5c68b0186af18f80fb7ef129 M  doc
:04 04 b920deca6f074b83dd5d2bd0446785a23019d11a
3f10e54cdeac63129f34865adcadf34ff74ff9a8 M  src
bisect run success

Which means that 8.3 releases are OK, but 8.4 and forward have the problem.

The commit at least touches the same area.. the patch is over 3K lines,
though, and my C skills are not that good.

Attached is the git bisect script.. just for the archives.

Jesper

git-bisect-script
Description: Binary data
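
Since the attachment is stored as binary data in the archives, here is a
rough sketch of what such a bisect run script typically looks like (paths
and the exact crash test are assumptions, not the actual script):

#!/bin/sh
# driven by: git bisect start HEAD REL8_3_0; git bisect run ./test.sh
./configure --prefix=/tmp/pg-bisect && make -j4 install || exit 125
# (initdb and startup of a scratch cluster elided)
/tmp/pg-bisect/bin/psql -f crash.sql regression   # crash => nonzero => bad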



Re: [HACKERS] Questions and experiences writing a Foreign Data Wrapper

2011-08-26 Thread Albe Laurenz
I wrote:
 I wrote a FDW for Oracle to a) learn some server coding
 and b) see how well the FDW API works for me.

I have released the software on PgFoundry:
http://oracle-fdw.projects.postgresql.org/

Would it make sense to mention that in chapter 5.10
of the documentation?

Yours,
Laurenz Albe



Re: [HACKERS] Removal of useless include references

2011-08-26 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  It has been years since I ran src/tools/pginclude/pgrminclude to remove
  unnecessary include files.  (I have already fixed things so include
  files can be compiled on their own.)
 
  The attached patch removes unneeded include references, and marks some
  includes as needing to be skipped by pgrminclude.
 
  I am sure applying this patch will break builds on some platforms and
  some option combinations so I will monitor the buildfarm when I apply it
  and make adjustments.
 
 The last time you did this was in July 2006.  It took us two weeks to
 mostly recover, but we were still dealing with some fallout in December,
 cf
 http://archives.postgresql.org/pgsql-hackers/2006-12/msg00491.php
 
 We had the buildfarm then, had had it for a couple years.  The notion
 that watching the buildfarm is enough is fully disproven by history.
 
 Unless you have a better test plan than last time (which this isn't),
 I don't think this should be done at all.  The benefits are microscopic
 and the pain real.

I don't have a better plan.  There are #ifdef code blocks that often
don't get processed and therefore this can't be done better.  I will
abandon the idea.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] tsvector concatenation - backend crash

2011-08-26 Thread Tom Lane
Jesper Krogh jes...@krogh.cc writes:
 On 2011-08-26 05:28, Tom Lane wrote:
 Hm ... I can reproduce this on one of my usual machines, but not
 another.  What platform are you on exactly?

 64 bit Ubuntu Lucid (amd64).

Huh, weird ... because the platform it's not failing for me on is
Fedora 14 x86_64.  Which is annoying, because that machine has better
tools for looking for memory stomps than the 32-bit HP box where I
do see the problem.  Anyway, will see what I can find.

regards, tom lane



Re: [HACKERS] Questions and experiences writing a Foreign Data Wrapper

2011-08-26 Thread Andrew Dunstan



On 08/26/2011 07:27 AM, Albe Laurenz wrote:

I wrote:

I wrote a FDW for Oracle to a) learn some server coding
and b) see how well the FDW API works for me.

I have released the software on PgFoundry:
http://oracle-fdw.projects.postgresql.org/

Would it make sense to mention that in chapter 5.10
of the documentation?





I don't think so, any more than any other external module should be
mentioned in the docs. There are known FDWs for several well-known
external databases, several file formats, and more exotic data sources
such as Twitter. I don't think we want to maintain a list of these in
the docs.


cheers

andrew



Re: [HACKERS] Questions and experiences writing a Foreign Data Wrapper

2011-08-26 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 08/26/2011 07:27 AM, Albe Laurenz wrote:
 http://oracle-fdw.projects.postgresql.org/
 
 Would it make sense to mention that in chapter 5.10
 of the documentation?

 I don't think so, any more than any other external module should be 
 mentioned in the docs. There are known FDWs for several well known 
 external databases, several file formats, and more exotic data sources 
 such as twitter. I don't think we want to  maintain a list of these in 
 the docs.

Wiki page, maybe?

regards, tom lane



Re: [HACKERS] WIP: Fast GiST index build

2011-08-26 Thread Alexander Korotkov
On Thu, Aug 25, 2011 at 11:08 PM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:

 Could you share the test scripts, patches and data sets etc. needed to
 reproduce the tests you've been running? I'd like to try them out on a test
 server.


1) I've updated the links to the datasets on the wiki page.
2) The script for index quality testing, fastbuild_test.php, is attached.
In order to run it you need PHP with the pdo and pdo_pgsql modules. The
plantuner module is also required (it is used to force the planner to use a
specific index). After running that script, the following query returns a
relative score of index quality:

select indexname, avg(count::real/(select count from test_result a2 where
a2.indexname = 'usnoa2_idx3' and a2.predicate = a1.predicate and
a2.tablename = a1.tablename)::real) from test_result a1 where a1.tablename =
'usnoa2' group by indexname;

where 'usnoa2' is the table name and 'usnoa2_idx3' is the name of the index
whose quality is taken to be 1.
3) A patch which makes plantuner work with HEAD is also attached.
4) A patch with my split algorithm implementation is attached. Its current
form is appropriate only for testing purposes.
5) For index creation I use a simple script which is attached as
'indexes.sql'. I also run a similar script, with different index names,
with my split patch.

Feel free to ask questions about all this stuff.

--
With best regards,
Alexander Korotkov.


fastbuild_test.php.gz
Description: GNU Zip compressed data


plantuner.patch.gz
Description: GNU Zip compressed data


my_split.patch.gz
Description: GNU Zip compressed data
select pg_stat_statements_reset();
set log_statement_stats = on;
set synchronize_seqscans = off;

create index uniform_idx1 on uniform using gist(point) with (buffering=on);
create index uniform_idx2 on uniform using gist(point) with (buffering=auto);
create index uniform_idx3 on uniform using gist(point) with (buffering=off);

create index usnoa2_idx1 on usnoa2 using gist(point) with (buffering=on);
create index usnoa2_idx2 on usnoa2 using gist(point) with (buffering=auto);
create index usnoa2_idx3 on usnoa2 using gist(point) with (buffering=off);

create index usnoa2_shuffled_idx1 on usnoa2_shuffled using gist(point) with (buffering=on);
create index usnoa2_shuffled_idx2 on usnoa2_shuffled using gist(point) with (buffering=auto);
create index usnoa2_shuffled_idx3 on usnoa2_shuffled using gist(point) with (buffering=off);




Re: [HACKERS] WIP: Fast GiST index build

2011-08-26 Thread Alexander Korotkov
On Thu, Aug 25, 2011 at 10:53 PM, Heikki Linnakangas 
heikki.linnakan...@enterprisedb.com wrote:


 In the tests on the first version of the patch I found the index quality
 of a regular build much better than that of a buffering build (without
 neighbor relocation). Now it's similar, though that's because the index
 quality of the regular index build became worse. Thereby, in the current
 tests the regular index build is faster than in the previous ones. I see
 the following possible causes:
  1) I didn't save the source random data, so it's new random data now.
  2) Some environment parameters of my test setup may have changed, though
 I doubt it.
 Despite these possible explanations, it seems quite strange to me.


 That's pretty surprising. Assuming the data is truly random, I wouldn't
 expect a big difference in the index quality of one random data set over
 another. If the index quality depends so much on, say, the distribution of
 the few first tuples that are inserted to it, that's a quite interesting
 find on its own, and merits some further research.

Yeah, it's pretty strange. Using the same random datasets in different
tests can help to exclude one possible cause of the difference.


  In order to compare index build methods on higher-quality indexes, I've
 tried to build indexes with my double sorting split method (see:
 http://syrcose.ispras.ru/2011/files/SYRCoSE2011_Proceedings.pdf#page=36).
 So on the uniform dataset, search is about 10 times faster! And, as
 expected, the regular index build becomes much slower. It has been running
 for more than 60 hours and only 50% of the index is complete (estimated by
 file sizes).

 Also, automatic switching to buffering build shows better index quality
 results in all the tests, while it's hard for me to explain that.


 Hmm, makes me a bit uneasy that we're testing with a modified page
 splitting algorithm. But if the new algorithm is that good, could you post
 that as a separate patch, please?

I've posted it in another message and I will try to get it into a more
appropriate form. Let me clarify this a little. I don't think my split
algorithm is 10 times better than state-of-the-art algorithms. I think the
currently used new linear split shows unreasonably bad results in many
cases. For example, uniformly distributed data is a pretty easy case, and
with almost any splitting algorithm we can get an index with almost zero
overlaps. But the new linear split produces huge overlaps in this case.
That's why I decided to experiment with another split algorithm.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Questions and experiences writing a Foreign Data Wrapper

2011-08-26 Thread Dave Page
On Fri, Aug 26, 2011 at 3:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andrew Dunstan and...@dunslane.net writes:
 On 08/26/2011 07:27 AM, Albe Laurenz wrote:
 http://oracle-fdw.projects.postgresql.org/

 Would it make sense to mention that in chapter 5.10
 of the documentation?

 I don't think so, any more than any other external module should be
 mentioned in the docs. There are known FDWs for several well known
 external databases, several file formats, and more exotic data sources
 such as twitter. I don't think we want to  maintain a list of these in
 the docs.

 Wiki page, maybe?

For example this one: http://wiki.postgresql.org/wiki/Foreign_data_wrappers


-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pg_restore --no-post-data and --post-data-only

2011-08-26 Thread Andrew Dunstan



On 08/25/2011 06:15 PM, Andrew Dunstan wrote:



But we could add these switches to pg_dump too if people feel
it's worthwhile. I haven't looked, but the logic should not be terribly
hard.





Something like the attached, in fact, which seems pretty simple.

cheers

andrew



diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index f6cd7eb..e9b4cc6 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -140,6 +140,8 @@ static int	column_inserts = 0;
 static int	no_security_labels = 0;
 static int	no_unlogged_table_data = 0;
 static int	serializable_deferrable = 0;
+static int  exclude_post_data = 0;
+static int  post_data_only = 0;
 
 
 static void help(const char *progname);
@@ -334,6 +336,8 @@ main(int argc, char **argv)
 		{"use-set-session-authorization", no_argument, &use_setsessauth, 1},
 		{"no-security-labels", no_argument, &no_security_labels, 1},
 		{"no-unlogged-table-data", no_argument, &no_unlogged_table_data, 1},
+		{"no-post-data", no_argument, &exclude_post_data, 1},
+		{"post-data-only", no_argument, &post_data_only, 1},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -790,7 +794,7 @@ main(int argc, char **argv)
 	dumpStdStrings(g_fout);
 
 	/* The database item is always next, unless we don't want it at all */
-	if (include_everything && !dataOnly)
+	if (include_everything && !dataOnly && !post_data_only)
 		dumpDatabase(g_fout);
 
 	/* Now the rearrangeable objects. */
@@ -876,6 +880,8 @@ help(const char *progname)
 	printf(_("  --no-unlogged-table-data    do not dump unlogged table data\n"));
 	printf(_("  --quote-all-identifiers     quote all identifiers, even if not key words\n"));
 	printf(_("  --serializable-deferrable   wait until the dump can run without anomalies\n"));
+	printf(_("  --no-post-data              do not dump constraints, indexes, rules, triggers\n"));
+	printf(_("  --post-data-only            only dump constraints, indexes, rules, triggers\n"));
 	printf(_("  --use-set-session-authorization\n"
 			 "                              use SET SESSION AUTHORIZATION commands instead of\n"
 			 "                              ALTER OWNER commands to set ownership\n"));
@@ -7023,6 +7029,25 @@ collectComments(Archive *fout, CommentItem **items)
 static void
 dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 {
+
+	int skip = 0;
+
+	switch (dobj->objType)
+	{
+		case DO_INDEX:
+		case DO_TRIGGER:
+		case DO_CONSTRAINT:
+		case DO_FK_CONSTRAINT:
+		case DO_RULE:
+			skip = exclude_post_data;
+			break;
+		default:
+			skip = post_data_only;
+	}
+
+	if (skip)
+		return;
+	
 	switch (dobj->objType)
 	{
 		case DO_NAMESPACE:
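
With a patch along these lines, usage would presumably look like this
(a sketch using the switch names from the patch):

pg_dump --no-post-data mydb   > schema_and_data.sql
pg_dump --post-data-only mydb > constraints_and_indexes.sql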



Re: [HACKERS] [GENERAL] pg_upgrade problem

2011-08-26 Thread hubert depesz lubaczewski
On Fri, Aug 26, 2011 at 12:18:55AM -0400, Bruce Momjian wrote:
 
 OK, this was very helpful.  I found out that there is a bug in current
 9.0.X, 9.1.X, and HEAD that I introduced recently when I excluded temp
 tables.  (The bug is not in any released version of pg_upgrade.)  The
 attached, applied patches should fix it for you.  I assume you are
 running 9.0.X, and not 9.0.4.

pg_upgrade worked. Now I'm doing reindex and later on vacuumdb -az.

will keep you posted.

Best regards,

depesz




Re: [HACKERS] pg_restore --no-post-data and --post-data-only

2011-08-26 Thread Robert Haas
On Fri, Aug 26, 2011 at 11:22 AM, Andrew Dunstan and...@dunslane.net wrote:
 But we could also add these switches to pg_dump too if people feel it's
 worthwhile. I haven't looked but the logic should not be terribly hard.

 Something like the attached, in fact, which seems pretty simple.

It seems like there are three sets of things you might want here:
pre-data, data, post-data.  So in the end we could end up with:

--pre-data-only
--post-data-only
--data-only
--no-pre-data
--no-post-data
--no-data

And then maybe someone will want just the create index commands and
not the constraint commands.  It seems like it might be more elegant
to come up with a single switch where you can list which things you
want:

--sections='predata data'
--sections='postdata'
--sections='index'

Just thinking out loud

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pg_restore --no-post-data and --post-data-only

2011-08-26 Thread Jeff Davis
On Fri, 2011-08-26 at 12:46 -0400, Robert Haas wrote:
 --sections='predata data'
 --sections='postdata'
 --sections='index'

Agreed. After command line options reach a certain level of complexity,
I think it's worth looking for a more general way to express them.

Regards,
Jeff Davis




Re: [HACKERS] pg_restore --no-post-data and --post-data-only

2011-08-26 Thread Andrew Dunstan



On 08/26/2011 12:46 PM, Robert Haas wrote:

On Fri, Aug 26, 2011 at 11:22 AM, Andrew Dunstan and...@dunslane.net wrote:

But we could also add these switches to pg_dump too if people feel it's
worthwhile. I haven't looked but the logic should not be terribly hard.

Something like the attached, in fact, which seems pretty simple.

It seems like there are three sets of things you might want here:
pre-data, data, post-data.  So in the end we could end up with:

--pre-data-only
--post-data-only
--data-only
--no-pre-data
--no-post-data
--no-data

And then maybe someone will want just the create index commands and
not the constraint commands.  It seems like it might be more elegant
to come up with a single switch where you can list which things you
want:

--sections='predata data'
--sections='postdata'
--sections='index'

Just thinking out loud


I knew there would be some bike-shedding about how we specify these 
things, which is why I haven't written docs yet.


All the possibilities you specify except for the indexes section can be 
done by using these switches in combination with -s and -a.


For anything more fine-grained, I'm inclined to say that people need to 
roll their own. pg_restore's --list and --use-list give you extremely 
fine-grained control. I have working scripts which use these for example 
to filter out londiste and pgq objects, certain large tables, audit 
objects and more. As an example of the complexity I think we should 
avoid, which section would UNIQUE and PRIMARY KEY constraints belong in?
"constraints", because that's what they are, or "indexes", because that's
what they create? No matter which answer you choose, someone will claim
you have violated POLA.


Chopping things into pre-data, data and post-data would get us around 
99% of the cases we could reasonably provide for in my experience. That 
seems enough :-)


I don't have anything in principle against your '--sections=foo bar' 
suggestion, but it would be more work to program. Simpler, and probably 
more consistent with how we do other things, would be allowing multiple 
--section options, if we don't want to have named options such as I have 
provided.


cheers

andrew



Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-26 Thread Greg Smith

On 08/26/2011 03:54 AM, Magnus Hagander wrote:

In theory, this could be the progress view or
progress field talked about around Gregs previous patch - or it
could just be modifying the commandstring in pg_stat_activity.


Right.  The whole progress indicator idea is hard to do for queries in 
general.  But there's enough of these other progress indicator ideas 
around now that it may be worth putting a standard way to handle them in 
here.  It sounds like that would be sufficient to address the area Tomas 
is trying to instrument better.  I badly want a progress indicator on 
CREATE INDEX CONCURRENTLY too, to at least let me know what phase of the 
build process it's on.  That's turned into a major headache recently.


If we run with the idea of just allowing backends to publish a progress 
text string, I think this one maps into a similar space as the 
autovacuum one.  Publishing how many seconds the operation has been 
running for may be reasonable too.  Whether the overhead of the timing 
calls necessary to compute that will be high or not depends on the 
refresh rate of the progress info.  My suggestion before was to name 
these as key=value pairs for easy parsing; here's three examples now:


autovacuum:  pgbench_accounts h=182701 m=301515 d=321345 s=62.231
(cache hits, cache misses, dirty writes, seconds)

background writer:  checkpoint b=511 t=3072 s=5.321
(buffers written, total, seconds)

create index concurrently:  pgbench_accounts p=1 b=62 t=6213 s=81.232
(phase, blocks processed, total block estimate, seconds)

I think that the idea of making this easily human readable is 
optimistic, because it will make all these strings big enough to start 
mattering.  Given that, we almost have to assume the only consumers of 
this data will be able to interpret it using the documentation.  I'd be 
happy with just the minimal data set in each case, not including any 
statistics you can easily derive from the values given (like the MB/s 
readings).  Adding that figure in particular to more of the log messages 
would be nice though.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us




Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-26 Thread Tomas Vondra
On 26 August 2011, 9:54, Magnus Hagander wrote:
 This seems like the wrong thing to write to the log. It's really only
 useful *during* the checkpoint run, isn't it? If so, I think it should
 go in a pg_stat view. In theory, this could be the progress view or
 progress field talked about around Gregs previous patch - or it
 could just be modifying the commandstring in pg_stat_activity. Either
 way, it should be updated in shared memory in that case (like current
 query is), and not sent with a message to the collector.

I personally find it handy, for example, when I need to find out why
performance degraded at a certain point in the past. Now I can see there
was a checkpoint (thanks to log_checkpoints=on), but I don't know any
details about its progress.

I already collect info from pg_stat_bgwriter; that's why I thought I
could update it more often. The log file is a natural destination for such
information, IMHO. I see this as an extension of the current checkpoint
messages that are written to the log.

The commandstring approach is probably fine for interactive work, but not
very handy when you need to analyze something that already happened.
Unless you collected the data, of course.

Tomas




Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-26 Thread Tomas Vondra
On 26 August 2011, 19:17, Greg Smith wrote:
 On 08/26/2011 03:54 AM, Magnus Hagander wrote:
 In theory, this could be the progress view or
 progress field talked about around Gregs previous patch - or it
 could just be modifying the commandstring in pg_stat_activity.

 Right.  The whole progress indicator idea is hard to do for queries in
 general.  But there's enough of these other progress indicator ideas
 around now that it may be worth putting a standard way to handle them in
 here.  It sounds like that would be sufficient to address the area Tomas
 is trying to instrument better.  I badly want a progress indicator on
 CREATE INDEX CONCURRENTLY too, to at least let me know what phase of the
 build process it's on.  That's turned into a major headache recently.

 If we run with the idea of just allowing backends to publish a progress
 text string, I think this one maps into a similar space as the
 autovacuum one.  Publishing how many seconds the operation has been
 running for may be reasonable too.  Whether the overhead of the timing
 calls necessary to compute that will be high or not depends on the
 refresh rate of the progress info.  My suggestion before was to name
 these as key=value pairs for easy parsing; here's three examples now:

 autovacuum:  pgbench_accounts h=182701 m=301515 d=321345 s=62.231
 (cache hits, cache misses, dirty writes, seconds)

 background writer:  checkpoint b=511 t=3072 s=5.321
 (buffers written, total, seconds)

 create index concurrently:  pgbench_accounts p=1 b=62 t=6213 s=81.232
 (phase, blocks processed, total block estimate, seconds)

 I think that the idea of making this easily human readable is
 optimistic, because it will make all these strings big enough to start
 mattering.  Given that, we almost have to assume the only consumers of
 this data will be able to interpret it using the documentation.  I'd be
 happy with just the minimal data set in each case, not including any
 statistics you can easily derive from the values given (like the MB/s
 readings).  Adding that figure in particular to more of the log messages
 would be nice though.

I'm a bit confused - are you talking about updating the process title or
about writing the info to the log? The process title is probably fine for
watching the progress interactively, but it really does not solve what I
need.

I need to be able to infer the progress for past events, so I'd have to
sample the 'ps ax' output regularly.

Tomas




Re: [HACKERS] pg_restore --no-post-data and --post-data-only

2011-08-26 Thread Robert Haas
On Fri, Aug 26, 2011 at 1:15 PM, Andrew Dunstan and...@dunslane.net wrote:
 I don't have anything in principle against your '--sections=foo bar'
 suggestion, but it would be more work to program. Simpler, and probably more
 consistent with how we do other things, would be allowing multiple --section
 options, if we don't want to have named options such as I have provided.

I wouldn't object to that, but more work to program probably means
about an extra 10 lines of code in this particular case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pg_restore --no-post-data and --post-data-only

2011-08-26 Thread Alvaro Herrera
Excerpts from Robert Haas's message of Fri Aug 26 15:36:36 -0300 2011:
 On Fri, Aug 26, 2011 at 1:15 PM, Andrew Dunstan and...@dunslane.net wrote:
  I don't have anything in principle against your '--sections=foo bar'
  suggestion, but it would be more work to program. Simpler, and probably more
  consistent with how we do other things, would be allowing multiple --section
  options, if we don't want to have named options such as I have provided.
 
 I wouldn't object to that, but more work to program probably means
 about an extra 10 lines of code in this particular case.

The --section=data --section=indexes proposal seems very reasonable to
me -- more so than --sections='data indexes'.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Inputting relative datetimes

2011-08-26 Thread Jim Nasby
On Aug 25, 2011, at 5:08 AM, Dean Rasheed wrote:
 Funny you should mention intervals...
 
 timestamptz 'today' - interval '5 days'
 timestamptz 'now' + interval '2 hours'
 
 
 Yes, but what I am trying to achieve is a way of entering such
 relative timestamps using a single input value, so that absolute and
 relative timestamps can both be bound to a SQL query using just one
 variable.


Even if the community doesn't want to add this to core, I think it would be a
great add-on to put on PGXN. If you don't feel up to writing it yourself,
perhaps you could pay one of the consulting companies to do it for you?
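
A sketch of what such an add-on's entry point could look like (entirely
illustrative; the name and the two-branch parse rule are assumptions, not
the actual proposal):

CREATE FUNCTION parse_reltime(text) RETURNS timestamptz AS $$
    SELECT CASE WHEN $1 ~ '^[+-]'
                THEN now() + $1::interval   -- relative: '+2 hours', '-5 days'
                ELSE $1::timestamptz        -- absolute timestamp
           END
$$ LANGUAGE sql STABLE;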
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net





Re: [HACKERS] pg_restore --no-post-data and --post-data-only

2011-08-26 Thread Jim Nasby
On Aug 26, 2011, at 12:15 PM, Andrew Dunstan wrote:
 I knew there would be some bike-shedding about how we specify these things, 
 which is why I haven't written docs yet.

While we're debating what shade of yellow to paint the shed...

My actual use case is to be able to inject SQL into a SQL-formatted dump
either pre- or post-data (I'm on 8.3, so I don't actually dump any data; I'm
*mostly* emulating the ability to dump data on just certain tables).

So for what I'm doing, the ideal interface would be a way to tell pg_dump
"When you're done dumping all table structures but before you get to any
constraints, please run $COMMAND and inject its output into the dump
output." For some of the data obfuscation we're doing it would be easiest
if $COMMAND was a perl script instead of SQL, but we could probably convert
it.

Of course, many other folks actually need the ability to just spit out specific 
portions of the dump; I'm hoping we can come up with something that supports 
both concepts.
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net





[HACKERS] Cryptic error message in low-memory conditions

2011-08-26 Thread Daniel Farina
Hello list,

This is something that I've only recently somewhat pinned down to a cause...

Some Postgres servers will error out for a while with the following
error message:

"expected authentication request from server, but received c"

If one uses Their Favorite Search Engine, this message is scattered
around the internet, all in reference to Postgres, I think, but none
of the top results seem to have any lucid responses or cause listed.
We've seen this reproduce -- sometimes for minutes at a time -- and
after catching one in the act I am reasonably confident that one
common cause of this is systems that are low on memory, which I
confirmed by looking at postgres logs and matching them up against our
monitoring system.

Critical statistics first: the systems run Linux with overcommit off,
so malloc returns NULL now and again. There is no OOM killer activity.
 SSL is the transport, and SQL role password authentication is in use.
 There is no swap.

Here's an example of the various kinds of failure one can get from
connecting to a system that is low on memory:

2011-08-26 16:03:06 | INFO psql? failed with exception #<PGError:
FATAL:  out of memory
DETAIL:  Failed on request of size 488.>

2011-08-26 16:02:27 | INFO psql? failed with exception #<PGError:
expected authentication request from server, but received c>

2011-08-26 16:01:51 | INFO psql? failed with exception #<PGError:
expected authentication request from server, but received c>

2011-08-26 16:01:15 | INFO psql? failed with exception #<PGError:
expected authentication request from server, but received c>

2011-08-26 16:00:39 | INFO psql? failed with exception #<PGError:
expected authentication request from server, but received c>

2011-08-26 16:00:01 | INFO psql? failed with exception #<PGError:
expected authentication request from server, but received c>

2011-08-26 15:59:25 | INFO psql? failed with exception #<PGError:
expected authentication request from server, but received c>

2011-08-26 15:58:48 | INFO psql? failed with exception #<PGError:
expected authentication request from server, but received c>

2011-08-26 15:58:12 | INFO psql? failed with exception #<PGError:
FATAL:  out of memory>


On the backend side, one can see that often there is a failure to
fork, which is basically expected in this condition.  Various
statements will be reporting OOM also.

The commonality of an error message that does not say anything about
being out of memory is representative of the norm, and nominally one
does not get any express indication that the system is out of memory,
but otherwise responsive.  This puts someone doing monitoring (like
us) in a tricky position: the utilizer of the database is free to use
their memory -- that's what it's for -- but the problem is we cannot
determine that the server is basically online, if fully utilized.
This defeats the ever-common "authenticate and run SELECT 1;" basic
monitoring style frequently used to determine the most basic levels of
uptime.

If the 'out of memory' conditions were delivered most of the time,
we could act differently, but for now we basically have to assume that
postgres is offline and poke around.  It's also interesting to note
that the systems are basically responsive (ssh can always seem to
fork, as I'm poking around tools like 'ls' et al seem to be fine), and
sometimes the load average isn't even extreme -- a leaky application
with too many connections can cause this, so it's not like every tiny
last scrap of memory has been consumed.

-- 
fdr



Re: [HACKERS] tsvector concatenation - backend crash

2011-08-26 Thread Tom Lane
jes...@krogh.cc writes:
 The attached SQL file gives (at least in my hands) a reliable backend crash
 with this stacktrace .. reproduced on both 9.0.4 and HEAD. I'm sorry
 I cannot provide a more trimmed-down set of vectors that reproduces the
 bug, thus the obfuscated dataset. But even deleting single terms in the
 vectors makes the bug go away.

I found it.  tsvector_concat does this to compute the worst-case output
size needed:

/* conservative estimate of space needed */
out = (TSVector) palloc0(VARSIZE(in1) + VARSIZE(in2));

Unfortunately, that's not really worst case: it could be that the output
will require more alignment padding bytes than the inputs did, if there
is a mix of lexemes with and without position data.  For example, if in1
contains one lexeme of odd length without position data, and in2
contains one lexeme of even length with position data (and no pad byte),
and in1's lexeme sorts before in2's, then we will need a pad byte in the
second lexeme where there was none before.

The core of the fix is to suppose that we might need a newly-added pad
byte for each lexeme:

out = (TSVector) palloc0(VARSIZE(in1) + VARSIZE(in2) + i1 + i2);

which really is an overestimate but I don't feel a need to be tenser
about it.  What I actually committed is a bit longer because I added
some comments and some Asserts ...

 Ok, I found 8.3.0 to be good so i ran a git bisect on it.. it gave
 me this commit:
 
 e6dbcb72fafa4031c73cc914e829a6dec96ab6b6 is the first bad commit
 commit e6dbcb72fafa4031c73cc914e829a6dec96ab6b6
 Author: Tom Lane t...@sss.pgh.pa.us
 Date:   Fri May 16 16:31:02 2008 +
 
 Extend GIN to support partial-match searches, and extend tsquery to
 support
 prefix matching using this facility.

AFAICT this is a red herring: the bug exists all the way back to where
tsvector_concat was added, in 8.3.  I think the reason that your test
case happens to not crash before this commit is that it changed the sort
ordering rules for lexemes.  As you can see from my minimal example
above, we might need different numbers of pad bytes depending on how the
lexemes sort relative to each other.

Anyway, patch is committed; thanks for the report!

regards, tom lane



Re: [HACKERS] pg_restore --no-post-data and --post-data-only

2011-08-26 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 The --section=data --section=indexes proposal seems very reasonable to
 me -- more so than --sections='data indexes'.

+1 ... not only easier to code and less squishily defined, but more like
the existing precedent for other pg_dump switches, such as --table.

regards, tom lane



Re: [HACKERS] pg_restore --no-post-data and --post-data-only

2011-08-26 Thread Andrew Dunstan



On 08/26/2011 04:46 PM, Jim Nasby wrote:

On Aug 26, 2011, at 12:15 PM, Andrew Dunstan wrote:

I knew there would be some bike-shedding about how we specify these things, 
which is why I haven't written docs yet.

While we're debating what shade of yellow to paint the shed...

My actual use case is to be able to inject SQL into a SQL-formatted dump
either pre- or post-data (I'm on 8.3, so I don't actually dump any data;
I'm *mostly* emulating the ability to dump data on just certain tables).

So for what I'm doing, the ideal interface would be a way to tell pg_dump
"When you're done dumping all table structures but before you get to any
constraints, please run $COMMAND and inject its output into the dump
output." For some of the data obfuscation we're doing it would be easiest
if $COMMAND was a perl script instead of SQL, but we could probably
convert it.

Of course, many other folks actually need the ability to just spit out specific 
portions of the dump; I'm hoping we can come up with something that supports 
both concepts.



Well, the Unix approach is to use tools that do one thing well to build 
up more complex tools. Making pg_dump run some external command to 
inject things into the stream seems like the wrong thing given this 
philosophy. Use pg_dump to get the bits you want (pre-data, post-data) 
and sandwich them around whatever else you want. As for getting data 
from just certain tables, I just posted a patch for pg_dump to exclude 
data for certain tables, and we could look at providing a positive as 
well as a negative filter if there is sufficient demand.
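
In sketch form, the sandwich described above might look like this (the
switch names are borrowed from the patch upthread, and obfuscate.pl stands
in for whatever generates the injected SQL):

pg_dump --no-post-data mydb    >  dump.sql   # schema (and data)
perl obfuscate.pl              >> dump.sql   # injected SQL
pg_dump --post-data-only mydb  >> dump.sql   # constraints, indexes, etc.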


cheers

andrew



Re: [HACKERS] Removal of useless include references

2011-08-26 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of Fri Aug 26 01:35:45 -0300 2011:
 It has been years since I ran src/tools/pginclude/pgrminclude to remove
 unnecessary include files.  (I have already fixed things so include
 files can be compiled on their own.)
 
 The attached patch removes unneeded include references, and marks some
 includes as needing to be skipped by pgrminclude.

In btree_gist I think you should remove #include "postgres.h" from the
.h file and put it in the .c files instead, as is customary.  I think
that would make the other changes incorrect.  ltree.h and pg_upgrade.h
also get this wrong.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Cryptic error message in low-memory conditions

2011-08-26 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 Some Postgres servers will error out for a while with the following
 error message:
 expected authentication request from server, but received "c"
 [ and this seems to be triggered by fork failures in the server ]

<spock>Fascinating.</spock>

I poked at this for awhile by the expedient of hot-wiring postmaster.c
to always fail the fork request:

*** src/backend/postmaster/postmaster.c.orig    Tue Aug 23 17:13:09 2011
--- src/backend/postmaster/postmaster.c    Fri Aug 26 19:08:25 2011
***************
*** 3182,3188 ****
  #ifdef EXEC_BACKEND
      pid = backend_forkexec(port);
  #else                           /* !EXEC_BACKEND */
!     pid = fork_process();
      if (pid == 0)               /* child */
      {
          free(bn);
--- 3182,3189 ----
  #ifdef EXEC_BACKEND
      pid = backend_forkexec(port);
  #else                           /* !EXEC_BACKEND */
!     pid = -1; // fork_process();
!     errno = ENOMEM;
      if (pid == 0)               /* child */
      {
          free(bn);

The basic case did what I was expecting:

$ psql "dbname=regression"
psql: could not fork new process for connection: Cannot allocate memory

which was unsurprising since I'm quite sure that code path got tested
in basically this fashion, back in the dark ages.  But sure enough,
over a TCP connection with SSL support turned on, it fails as described.
strace shows the postmaster is sending what it's supposed to.
I eventually figured out the problem: libpq sees the "E" response,
thinks it must be talking to a postmaster too old to understand
NEGOTIATE_SSL_CODE, and closes the socket and tries again.  But *it
forgets to clear out its input buffer*, so the body of the error message
is still there waiting to be read, and the next connection sees the "c"
as the first byte of the first postmaster response.  The fix for this is

*** src/interfaces/libpq/fe-connect.c.orig    Thu Jul 28 10:39:57 2011
--- src/interfaces/libpq/fe-connect.c    Fri Aug 26 18:52:18 2011
***************
*** 2064,2069 ****
--- 2064,2072 ----
          conn->allow_ssl_try = false;
          /* Assume it ain't gonna handle protocol 3, either */
          conn->pversion = PG_PROTOCOL(2, 0);
+         /* Discard any unread/unsent data */
+         conn->inStart = conn->inCursor = conn->inEnd = 0;
+         conn->outCount = 0;
          /* Must drop the old connection */
          closesocket(conn->sock);
          conn->sock = -1;

and similarly in the other places in fe-connect.c where we're abandoning
an open connection (not sure every one of those has to have it, but it
seems a good safety practice).

That gets us to the right place in the default SSL mode:

$ psql "host=localhost sslmode=prefer dbname=regression"
psql: could not fork new process for connection: Cannot allocate memory

but we're still not there if you're trying to force SSL:

$ psql "host=localhost sslmode=require dbname=regression"
psql: server does not support SSL, but SSL was required

The reason for this is that that same bit of code supposes that any
"E" response must mean that the postmaster didn't recognize
NEGOTIATE_SSL_CODE.  It doesn't (and of course shouldn't) pay any
attention to the actual textual error message.

Now, seeing as how NEGOTIATE_SSL_CODE has been understood by every build
since PG 7.0, I believe that this is dead code and we could remove it;
it seems exceedingly unlikely that any modern build of libpq will ever
be used to talk to a server that responds to that with "E".

In fact it's worse than just delivering a misleading error message in
the "require" case, because if you're not doing "require" then what
happens next is that the code forces the protocol level down to 2 and
tries again, supposing that it must be dealing with an ancient server.
In the normal case where you are talking to a server under load rather
than a deliberately-broken one, it's entirely possible that the second
connection attempt succeeds.  And what you've got then is that the
connection is operating in protocol 2 rather than what the user probably
expected, disabling assorted functionality that he may well be depending
on.  I don't recall having seen reports that could match that syndrome,
but then again Daniel's complaint is a new one on me too.

So I'm thinking we'd be well advised to eliminate the assumption that an
"E" response could be fixed by downgrading to protocol 2, and instead
just make this code report the error message it got from the postmaster.
That's more than a one-liner so I don't have a patch for it yet.

Lastly, I noticed that if I tried this repeatedly on a Unix socket,
I sometimes got

psql: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
could not send startup packet: Broken pipe

rather than the expected results.  

Re: [HACKERS] Removal of useless include references

2011-08-26 Thread Bruce Momjian
Bruce Momjian wrote:
 Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
   It has been years since I ran src/tools/pginclude/pgrminclude to remove
   unnecessary include files.  (I have already fixed things so include
   files can be compiled on their own.)
  
   The attached patch removes unneeded include references, and marks some
   includes as needing to be skipped by pgrminclude.
  
   I am sure applying this patch will break builds on some platforms and
   some option combinations so I will monitor the buildfarm when I apply it
   and make adjustments.
  
  The last time you did this was in July 2006.  It took us two weeks to
  mostly recover, but we were still dealing with some fallout in December,
  cf
  http://archives.postgresql.org/pgsql-hackers/2006-12/msg00491.php
  
  We had the buildfarm then, had had it for a couple years.  The notion
  that watching the buildfarm is enough is fully disproven by history.
  
  Unless you have a better test plan than last time (which this isn't),
  I don't think this should be done at all.  The benefits are microscopic
  and the pain real.
 
 I don't have a better plan.  There are #ifdef code blocks that often
 don't get processed and therefore this can't be done better.  I will
 abandon the idea.

OK, try #2.  I already had code that removed #if/#else/#endif code in
*.h files for better testing, so I extended that to all *.c files.  This
reduces the size of the diff from 6.6k lines to 4.7k lines, and it makes
it much less likely that there will be problems from running
pgrminclude.

The current patch is here:

http://momjian.us/expire/pgrminclude.diff

I tested the patch on BSD and Linux.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Removal of useless include references

2011-08-26 Thread Bruce Momjian
Alvaro Herrera wrote:
 Excerpts from Bruce Momjian's message of Fri Aug 26 01:35:45 -0300 2011:
  It has been years since I ran src/tools/pginclude/pgrminclude to remove
  unnecessary include files.  (I have already fixed things so include
  files can be compiled on their own.)
  
  The attached patch removes unneeded include references, and marks some
  includes as needing to be skipped by pgrminclude.
 
 In btree_gist I think you should remove #include "postgres.h" from the
 .h file and put it in the .c files instead, as is customary.  I think
 that would make the other changes incorrect.  ltree.h and pg_upgrade.h
 also get this wrong.

Thanks, done.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Inputting relative datetimes

2011-08-26 Thread Robert Haas
On Fri, Aug 26, 2011 at 4:32 PM, Jim Nasby j...@nasby.net wrote:
 On Aug 25, 2011, at 5:08 AM, Dean Rasheed wrote:
 Funny you should mention intervals...

 timestamptz 'today' - interval '5 days'
 timestamptz 'now' + interval '2 hours'


 Yes, but what I am trying to achieve is a way of entering such
 relative timestamps using a single input value, so that absolute and
 relative timestamps can both be bound to a SQL query using just one
 variable.

 Even if the community doesn't want to add this to core, I think it would be a 
 great add-on to put on PGXN. If you don't feel up to writing it yourself, 
 perhaps you could pay one of the consulting companies to do it for you?

Frankly, our current date parsing code is pretty darn strange and
flaky.  If nobody's found the energy to rationalize that, what are the
chances that we can add a whole bunch more functionality without also
adding a whole bunch more bugs?  For examples of the sorts of things
we haven't gotten around to fixing, see:

http://archives.postgresql.org/pgsql-hackers/2011-03/msg01295.php

Another problem here is that it seems possible, even likely, that
everyone will have their own particular flavor of what they'd like to
see accepted: "two weeks ago", "a week ago Tuesday", "next Saturday",
"last Friday", "two weeks from Saturday", "Christmas plus three
fortnights"...  if it weren't already spaghetti code...  give it time.  I'm not
necessarily opposed to the idea (especially as a contrib module), but
I'm a little nervous that we might be overestimating the extent to
which Dean's needs are universal.
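
To make the gap concrete: the special values already parse as a single
literal, but relative arithmetic does not (a sketch; I'd expect the second
command to fail with an invalid-input-syntax error):

    psql -c "SELECT 'tomorrow'::timestamptz"        # works: now/today/tomorrow are accepted
    psql -c "SELECT 'now + 2 hours'::timestamptz"   # fails: no relative offsets in a single literal

so anything in this direction means new parsing code, with all the hazards
above.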

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] dropdb and dropuser: IF EXISTS

2011-08-26 Thread Robert Haas
On Fri, Aug 26, 2011 at 12:08 AM, Josh Kupershmidt schmi...@gmail.com wrote:
 I noticed a few places where it would be handy if dropdb took a flag
 like --if-exists which would basically just add in the 'IF EXISTS'
 clause to the DROP DATABASE statement. For example, scripts like
 find_static or mbregress.sh use dropdb && createdb, but they generate
 noisy errors from dropdb when run for the first time since there's no
 --if-exists flag. (They could just pipe 'DROP DATABASE IF EXISTS ...'
 to psql, but what's the point of having dropdb if it's not used?)

 Attached is a very quick patch implementing the --if-exists or -X
 option for dropdb and dropuser. I didn't bother adding in a check to
 make sure the server version was 8.2+ since we're not even supporting
 8.1 nowadays, though that'd be easy enough to add in.

+1 for --if-exists, but -X isn't doing a lot for me, especially since
we've used -X for other purposes in other commands.  I'd just skip
having a short form for this one.
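
To spell out what the scripts would then look like (a sketch of the proposed
behavior; nothing is committed yet):

    dropdb --if-exists regression    # silent if the database is absent
    createdb regression

versus piping 'DROP DATABASE IF EXISTS regression' to psql, which works today
but, as Josh says, bypasses dropdb entirely.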

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] cheaper snapshots redux

2011-08-26 Thread Robert Haas
On Thu, Aug 25, 2011 at 6:24 PM, Jim Nasby j...@nasby.net wrote:
 On Aug 25, 2011, at 8:24 AM, Robert Haas wrote:
 My hope (and it might turn out that I'm an optimist) is that even with
 a reasonably small buffer it will be very rare for a backend to
 experience a wraparound condition.  For example, consider a buffer
 with ~6500 entries, approximately 64 * MaxBackends, the approximate
 size of the current subxip arrays taken in aggregate.  I hypothesize
 that a typical snapshot on a running system is going to be very small
 - a handful of XIDs at most - because, on the average, transactions
 are going to commit in *approximately* increasing XID order and, if
 you take the regression tests as representative of a real workload,
 only a small fraction of transactions will have more than one XID.  So

 BTW, there's a way to actually gather some data on this by using PgQ (part of 
 Skytools and used by Londiste). PgQ works by creating ticks at regular 
 intervals, where a tick is basically just a snapshot of committed XIDs. 
 Presumably Slony does something similar.

 I can provide you with sample data from our production systems if you're 
 interested.

Yeah, that would be great.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] cheaper snapshots redux

2011-08-26 Thread Robert Haas
On Thu, Aug 25, 2011 at 6:29 PM, Jim Nasby j...@nasby.net wrote:
 Actually, I wasn't thinking about the system dynamically sizing shared memory 
 on its own... I was only thinking of providing the ability for a user to 
 change something like shared_buffers and allow that change to take effect 
 with a SIGHUP instead of requiring a full restart.

I agree.  That would be awesome.  Sadly, I don't have time to work on it.  :-(

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] PATCH: regular logging of checkpoint progress

2011-08-26 Thread Noah Misch
On Fri, Aug 26, 2011 at 10:46:33AM +0200, Tomas Vondra wrote:
 Hmmm, let me explain what led me to this patch - right now I'm doing a
 comparison of filesystems with various block sizes (both fs and db
 blocks). I've realized that the db block size significantly influences
 frequency of checkpoints and amount of data to write, so I'm collecting
 data from pg_stat_bgwriter too. The benchmark goes like this
 
 1. collect pg_stat_bgwriter stats
 2. run pgbench for 10 minutes
 3. collect pg_stat_bgwriter stats (to compute difference with (1))
 4. kill the postmaster
 
 The problem is that when checkpoint stats are collected, there might be a
 checkpoint in progress and in that case the stats are incomplete.  In some
 cases (especially with very small db blocks) this has a significant impact
 because the checkpoints are less frequent.

Could you remove this hazard by adding a step 2a. psql -c CHECKPOINT?
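
Something along these lines, forcing any in-progress checkpoint to complete
before the final sample (a sketch, assuming a database named bench, default
connection settings, and PGDATA set in the environment):

    psql -At -c "SELECT * FROM pg_stat_bgwriter" bench   # step 1
    pgbench -T 600 bench                                 # step 2
    psql -c "CHECKPOINT" bench                           # step 2a
    psql -At -c "SELECT * FROM pg_stat_bgwriter" bench   # step 3
    pg_ctl stop -m immediate                             # step 4

The CHECKPOINT in step 2a waits for a checkpoint to complete, so the counters
read in step 3 should no longer be skewed by a checkpoint still in flight.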

-- 
Noah Misch                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] cheaper snapshots redux

2011-08-26 Thread Gokulakannan Somasundaram
On Tue, Aug 23, 2011 at 5:25 AM, Robert Haas robertmh...@gmail.com wrote:

 I've been giving this quite a bit more thought, and have decided to
 abandon the scheme described above, at least for now.  It has the
 advantage of avoiding virtually all locking, but it's extremely
 inefficient in its use of memory in the presence of long-running
 transactions.  For example, if there's an open transaction that's been
 sitting around for 10 million transactions or so and has an XID
 assigned, any new snapshot is going to need to probe into the big
 array for any XID in that range.  At 8 bytes per entry, that means
 we're randomly accessing about ~80MB of memory-mapped data.  That
 seems problematic both in terms of blowing out the cache and (on small
 machines) possibly even blowing out RAM.  Nor is that the worst case
 scenario: a transaction could sit open for 100 million transactions.

First, I respectfully disagree with you on the point of the 80MB.  I would
say it's very rare that a small system (with 1 GB of RAM) would have a
long-running transaction sitting idle while 10 million transactions go by.
Should an optimization be left out for the sake of very small systems, when
it would help achieve high enterprise workloads?

Second, if we make use of memory-mapped files, why should we think that all
80MB of data will always reside in memory?  Won't it get paged out by the
operating system when it is in need of memory?  Or do you have some
specific OS in mind?

Thanks,
Gokul.