Re: [HACKERS] procpid?

2011-06-12 Thread Peter Eisentraut
On lör, 2011-06-11 at 16:23 -0400, Bruce Momjian wrote:
 Uh, I am the first one I remember complaining about this so I don't
 see why we should break compatibility for such a low-level problem. 

I complain about it every day to the wall. :)




[HACKERS] Boolean operators without commutators vs. ALL/ANY

2011-06-12 Thread Florian Pflug
Hi

I've recently wanted to define a check constraint on an array
column that verifies that all array entries match some regular
expression. Unfortunately, that turned out to be harder than expected.

The most natural way of expressing such a check would be
  CHECK ('<regexp>' ~ ALL(field)),
but that doesn't work, because ~ expects the *value*
to be the left argument and the *pattern* to be the right.

The next try was
  CHECK (ALL(field) ~ '<regexp>'),
but that doesn't even parse.

Ok, so then use UNNEST() and BOOL_AND(), I figured, and wrote
  CHECK ((SELECT BOOL_AND(v ~ '<regexp>') FROM UNNEST(field) v)).
But that of course led to nothing but
  ERROR: cannot use subquery in check constraint

So in the end, I had to wrap the sub-query in a SQL-language
function and use that in the check constraint, as sketched
below. While this solved my immediate problem, the necessity
of doing that highlights a few problems
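
A minimal sketch of that workaround (all names invented):

  CREATE FUNCTION all_match(text[], text) RETURNS boolean AS
    'SELECT bool_and(v ~ $2) FROM unnest($1) v'
  LANGUAGE sql IMMUTABLE;

  CREATE TABLE example (
    field text[] CHECK (all_match(field, '^[a-z]+$'))
  );

(Beware that bool_and() over an empty array yields NULL, which a
CHECK constraint treats as passing.)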

(A) "~" is an extremely bad name for the regexp-matching
operators, since its visual form is symmetric but its
behaviour isn't. This doesn't only make its usage very
error-prone, it also makes it very hard to come up with a
sensible name for a commutator of "~". I suggest that we
add "=~" as an alias for "~", "~=" as a commutator
for "=~", and deprecate "~". The same holds for "~~".
We might want to do this starting with 9.1. (A userland
stand-in for the commutator is sketched below, after (C).)

(B) There should be a way to use ANY()/ALL() with the
array elements becoming the left arguments of the operator.
Ideally, we'd support ANY(array) <operator> value,
but if that's not possible grammar-wise, I suggest we extend
the OPERATOR() syntax to allow
  value OPERATOR(COMMUTATOR <operator>) ANY(array).
OPERATOR(COMMUTATOR <operator>) would use the COMMUTATOR
of the specified operator if one exists, and otherwise
use the original operator with the arguments swapped.

(C) Why do we forbid sub-queries in CHECK constraints?
I do realize that any non-IMMUTABLE CHECK constraint is
a foot-gun, but since we already allow STABLE and even
VOLATILE functions to be used inside CHECK constraints,
forbidding sub-queries seems a bit pointless...
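
As an aside regarding (A): a userland stand-in for the missing
commutator is already possible today. A sketch, with purely
illustrative names:

  -- pattern on the left, value on the right
  CREATE FUNCTION regexp_match_rev(text, text) RETURNS boolean AS
    'SELECT $2 ~ $1'
  LANGUAGE sql IMMUTABLE;

  CREATE OPERATOR ~= (
    LEFTARG = text, RIGHTARG = text, PROCEDURE = regexp_match_rev
  );

With the pattern on the left, the original constraint becomes
expressible as CHECK ('^[a-z]+$' ~= ALL(field)).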

best regards,
Florian Pflug





Re: [HACKERS] Range Types and extensions

2011-06-12 Thread Florian Pflug
On Jun12, 2011, at 04:37 , Robert Haas wrote:
 On Thu, Jun 9, 2011 at 6:26 PM, Florian Pflug f...@phlo.org wrote:
 On Jun8, 2011, at 17:46 , Jeff Davis wrote:
 It looks like the type input function may be a problem, because it
 doesn't look like it knows what the collation is yet. In other words,
 PG_GET_COLLATION() is zero for the type input function.
 
 But I need to do a comparison to find out if the range is valid or not.
 For instance:
  '[a, Z)'::textrange
 is valid in en_US but not C.
 
 Maybe that check should just be removed? If one views the range
 '[L, U)' as a concise way of expressing L <= x AND x < U for some
 x, then allowing the case L > U seems quite natural. There won't
 be any such x of course, but the range is still valid, just empty.
 
 Actually, thinking about this a bit, I believe this is the only
 way text ranges can support collations. If the validity of a range
 depends on the collation, then changing the collation after creation
 seems weird, since it can make previous valid ranges invalid and
 vice versa.
 
 There could be a function RANGE_EMPTY() which people can put into
 their CHECK constraints if they don't want such ranges to sneak
 into their tables...
 
 I think the collation is going to have to be baked into the type
 definition, no?  You can't just up and change the collation of the
 column as you could for a straight text column, if that might cause
 the contents of some rows to be viewed as invalid.

Now you've lost me. If a text range is simply a pair of strings,
as I suggested, and collations are applied only during comparison
and RANGE_EMPTY(), why would the collation have to be baked into
the type?

If you're referring to the case
  (1) Create table with text-range column and collation C1
  (2) Add check constraint containing RANGE_EMPTY()
  (3) Add data
  (4) Alter column to have collation C2, possibly changing
  the result of RANGE_EMPTY() for existing ranges.
then that points to a problem with ALTER COLUMN.
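
Spelled out in SQL, with textrange and range_empty() as proposed
(not existing) syntax:

  CREATE TABLE t (r textrange COLLATE "en_US");     -- (1)
  ALTER TABLE t ADD CHECK (NOT range_empty(r));     -- (2)
  INSERT INTO t VALUES ('[a, Z)');                  -- (3) non-empty under en_US
  ALTER TABLE t ALTER COLUMN r TYPE textrange
    COLLATE "C";                                    -- (4) now empty, silently
                                                    --     violating (2)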

best regards,
Florian Pflug




[HACKERS] pg_trgm: unicode string not working

2011-06-12 Thread Sushant Sinha
I am using pg_trgm for spelling correction as prescribed in the
documentation. But I see that it does not work for unicode strings. The
database was initialized with utf8 encoding and the C locale.

Here is the table:
\d words
      Table "public.words"
 Column |  Type   | Modifiers
--------+---------+-----------
 word   | text    |
 ndoc   | integer |
 nentry | integer |
Indexes:
    "words_idx" gin (word gin_trgm_ops)

Query: select word from words where word % 'कतद';

I get an error:

ERROR:  GIN indexes do not support whole-index scans


Any idea what is wrong?

-Sushant.




Re: [HACKERS] pg_trgm: unicode string not working

2011-06-12 Thread Florian Pflug
Hi

Next time, please post questions regarding the usage of postgres
to the -general list, not to -hackers. The purpose of -hackers is
to discuss the development of postgres proper, not the development
of applications using postgres.

On Jun12, 2011, at 13:33 , Sushant Sinha wrote:
 I am using pg_trgm for spelling correction as prescribed in the
 documentation. But I see that it does not work for unicode strings. The
 database was initialized with utf8 encoding and the C locale.

I think you need to use a locale (more precisely, a CTYPE) in which
'क', 'त', 'द' are considered to be alphanumeric.

You can specify the CTYPE when creating the database with
  CREATE DATABASE ... LC_CTYPE = ...
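
For example (the locale name is system-dependent, hi_IN.utf8 below is
only an illustration; TEMPLATE template0 is needed to override
template1's locale):

  CREATE DATABASE mydb
    TEMPLATE template0
    ENCODING 'UTF8'
    LC_COLLATE 'C'
    LC_CTYPE 'hi_IN.utf8';   -- illustrative locale name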

 Here is the table:
 \d words
       Table "public.words"
  Column |  Type   | Modifiers
 --------+---------+-----------
  word   | text    |
  ndoc   | integer |
  nentry | integer |
 Indexes:
     "words_idx" gin (word gin_trgm_ops)
 
 Query: select word from words where word % 'कतद';
 
 I get an error:
 
 ERROR:  GIN indexes do not support whole-index scans


pg_trgm probably ignores non-alphanumeric characters during
comparison, so you end up with an empty search string, which
translates to a whole-index scan. Postgres up to 9.0 does
not support such scans for GIN indices.
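
You can check this with pg_trgm's show_trgm() function; under a CTYPE
that doesn't classify these characters as alphanumeric I'd expect
something like the following (untested against your setup):

  SELECT show_trgm('कतद');   -- empty trigram set
  SELECT show_trgm('word');  -- the usual five trigrams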

Note that this restriction was removed in postgres 9.1, which
is currently in beta. However, GIN indexes must be re-created
with REINDEX after upgrading from 9.0 to leverage that
improvement.
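
In your case that would be something like

  REINDEX INDEX words_idx;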

best regards,
Florian Pflug




Re: [HACKERS] Creating new remote branch in git?

2011-06-12 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Uh, I think someone needs to add this to our wiki:

I did.

regards, tom lane



Re: [HACKERS] Small SSI issues

2011-06-12 Thread Kevin Grittner
 Heikki Linnakangas  wrote:
 On 10.06.2011 18:05, Kevin Grittner wrote:
 Heikki Linnakangas wrote:
 o There is no safeguard against actually wrapping around the
 SLRU, just the warning

 Any thoughts on what we should do instead? If someone holds open a
 transaction long enough to burn through a billion transaction IDs
 (or possibly less if someone uses a smaller BLCKSZ), should we
 generate a FATAL error?
 
 FATAL is a bit harsh, ERROR seems more appropriate.
 
If we don't cancel the long-running transaction, don't we continue to
have a problem?
 
 Do checks such as that argue for keeping the volatile flag, or do
 you think we can drop it if we make those changes? (That would
 also allow dropping a number of casts which exist just to avoid
 warnings.)
 
 I believe we can drop it, I'll double-check.
 
I see you committed a patch for this, but there were some casts which
become unnecessary with that change that you missed.  Patch attached
to clean up the ones I could spot.
 
-Kevin




ssi-nocast-1.patch
Description: Binary data



Re: [HACKERS] Creating new remote branch in git?

2011-06-12 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Uh, I think someone needs to add this to our wiki:
 
 I did.

I saw your commit that mentioned how to create a new branch.  My problem
was with using workdir:


http://wiki.postgresql.org/wiki/Committing_with_Git#Committing_Using_a_Single_Clone_and_multiple_workdirs

I had to use:

git config branch.REL9_1_STABLE.remote origin
git config branch.REL9_1_STABLE.merge refs/heads/REL9_1_STABLE

or I get errors like this during 'pull':

$ git-new-workdir postgresql/.git/ 8.2
Checking out files: 100% (3851/3851), done.

$ cd 8.2

$ git checkout -b REL8_2_STABLE origin/REL8_2_STABLE
Checking out files: 100% (3908/3908), done.
error: Not tracking: ambiguous information for ref 
refs/remotes/origin/REL8_2_STABLE
Switched to a new branch 'REL8_2_STABLE'

$ git pull
You asked me to pull without telling me which branch you
want to merge with, and 'branch.REL8_2_STABLE.merge' in
your configuration file does not tell me, either. Please
specify which branch you want to use on the command line and
try again (e.g. 'git pull <repository> <refspec>').
See git-pull(1) for details.

If you often merge with the same branch, you may want to
use something like the following in your configuration file:

[branch "REL8_2_STABLE"]
    remote = <nickname>
    merge = <remote-ref>

[remote "<nickname>"]
    url = <url>
    fetch = <refspec>

See git-config(1) for details.

(Is that "error: Not tracking: ambiguous information" error harmless?)

Once I execute this:

$  git config branch.REL8_2_STABLE.remote origin
$  git config branch.REL8_2_STABLE.merge refs/heads/REL8_2_STABLE

'pull' then works:

$ git pull
Already up-to-date.

So my point is I don't think we document the need to either update
.git/config or run those commands.  The pull error message suggests
updating .git/config, but ideally we should tell users how to set this
up.  

Editing the config file was mentioned in this email thread:

http://archives.postgresql.org/pgsql-hackers/2011-06/msg00860.php

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] WIP: collect frequency statistics for arrays

2011-06-12 Thread Alexander Korotkov
On Fri, Jun 10, 2011 at 9:03 PM, Simon Riggs si...@2ndquadrant.com wrote:

 Initial comments are that the code is well structured and I doubt
 there will be problems at the code level. Looks like a good patch.

I'm worried about the performance of column @> const estimation. It takes
O(m*(n+m)) time, where m is the length of the const and n is the statistics
target. It can probably be too slow in some cases.
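
For concreteness, the estimation in question is for clauses of this
shape (table and column names invented):

  EXPLAIN SELECT * FROM docs WHERE tags @> ARRAY['a', 'b', 'c'];

i.e. the planner must estimate the fraction of rows whose array contains
every element of the constant.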


 At the moment I see no tests. If this code will be exercised by
 existing tests then you should put some notes with the patch to
 explain that, or at least provide some pointers as to how I might test
 this.

I didn't find any existing tests that check selectivity estimation accuracy,
and I found it difficult to create them, because regression tests give a
binary result while estimation accuracy is a quantitative value. The existing
regression tests only cover the case where a typanalyze or selectivity
estimation function falls over; I've added an "ANALYZE array_op_test;" line
to the array test so that those tests cover that failure case for this
patch's functions too.
It seems that selectivity estimation accuracy has to be tested manually on
various distributions, and I've only done a very small amount of such testing.
Unfortunately, a few months passed before I got the idea about the column @>
const case, and now I don't have sufficient time for it due to my GSoC
project. It would be great if you could help me with these tests.


 Also, I'd like to see some more explanation. Either in comments, or
 just as a post to hackers. That saves me time, but we need to be clear
 about what this does and does not do, what it might do in the future
 etc.. 3+ years from now we need to be able to remember what the code
 was supposed to do. You will forget yourself in time, if you write
 enough patches. Based on this, I think you'll be writing quite a few
 more.

I've added some more comments. I'm afraid they will need to be completely
rewritten before committing because of my English. If some particular points
need more clarification, please point them out.


 And of course, a few lines for the docs also.

I found that the earlier statistics patch for tsvector only corrected the
article about the pg_stats view; I've corrected that article a little bit too.

--
With best regards,
Alexander Korotkov.


arrayanalyze-0.3.patch.gz
Description: GNU Zip compressed data



[HACKERS] Detailed documentation for external calls (threading, shared resources etc)

2011-06-12 Thread Seref Arikan
Greetings,
This is actually a request for documentation guidance. I intend to
develop an extension to PostgreSQL. Basically I'd like to place calls
to the network using ZeroMQ, and I need detailed information about
a lot of things, especially threading issues. I need to have some
global resources which will presumably be used by multiple threads.
I can see that there is a lot of documentation, but I'd really
appreciate pointers towards the books or key documents that would
help me move forward faster (docs/books about the inner workings of
key functionality). I'll be using C (most likely the best option) to
develop the code, so which books/documents would you recommend?

Cheers
Seref



Re: [HACKERS] On-the-fly index tuple deletion vs. hot_standby

2011-06-12 Thread Noah Misch
On Sun, Jun 12, 2011 at 12:15:29AM -0400, Robert Haas wrote:
 On Sat, Jun 11, 2011 at 11:40 PM, Noah Misch n...@leadboat.com wrote:
  We currently achieve that wait-free by first marking the page with the next
  available xid and then reusing it when that mark (btpo.xact) predates the
  oldest running xid (RecentXmin).  (At the moment, I'm failing to work out why
  this is OK with scans from transactions that haven't allocated an xid, but I
  vaguely recall convincing myself it was fine at one point.)  It would indeed
  also be enough to call GetLockConflicts(locktag-of-index, AccessExclusiveLock)
  and check whether any of the returned transactions have PGPROC.xmin below the
  mark.  That's notably more expensive than just comparing RecentXmin, so I'm
  not sure how well it would pay off overall.  However, it could only help us on
  the master.  (Not strictly true, but any way I see to extend it to the standby
  has critical flaws.)  On the master, we can see a conflicting transaction and
  put off reusing the page.  By the time the record hits the standby, we have to
  apply it, and we might have a running transaction that will hold a lock on the
  index for the next, say, 72 hours.  At such times, vacuum_defer_cleanup_age or
  hot_standby_feedback ought to prevent the recovery stall.

  This did lead me to realize that what we do in this regard on the standby can
  be considerably independent from what we do on the master.  If fruitful, the
  standby can prove the absence of a scan holding a right-link in a completely
  different fashion.  So, we *could* take the cleanup-lock approach on the
  standby without changing very much on the master.
 
 Well, I'm generally in favor of trying to fix this problem without
 changing what the master does.  It's a weakness of our replication
 technology that the standby has no better way to cope with a cleanup
 operation on the master than to start killing queries, but then again
 it's a weakness of our MVCC technology that we don't reuse space
 quickly enough and end up with bloat.  I hear a lot more complaints
 about the second weakness than I do about the first.

I fully agree.  That said, if this works on the standby, we may as well also use
it opportunistically on the master, to throttle bloat.

 At any rate, if taking a cleanup lock on the right-linked page on the
 standby is sufficient to fix the problem, that seems like a far
 superior solution in any case.  Presumably the frequency of someone
 having a pin on that particular page will be far lower than any
 matching based on XID or heavyweight locks.  And the vast majority of
 such pins should disappear before the startup process feels obliged to
 get out its big hammer.

Yep; looks promising.

Does such a thing have a chance of being backpatchable?  I think the chances
start slim and fall almost to zero on account of the difficulty of avoiding a
WAL format change.  Assuming that conclusion, I do think it's worth starting
with something simple, even if it means additional bloat on the master in the
wal_level=hot_standby + vacuum_defer_cleanup_age / hot_standby_feedback case.
In choosing those settings, the administrator has taken constructive steps to
accept master-side bloat in exchange for delaying recovery conflict.  What's
your opinion?

Thanks,
nm



[HACKERS] Make relation_openrv atomic wrt DDL

2011-06-12 Thread Noah Misch
On Thu, Jan 20, 2011 at 09:48:12PM -0500, Robert Haas wrote:
 On Thu, Jan 20, 2011 at 6:19 PM, Noah Misch n...@leadboat.com wrote:
  On Thu, Jan 20, 2011 at 09:36:11PM +, Simon Riggs wrote:
  I agree that the DDL behaviour is wrong and should be fixed. Thank you
  for championing that alternative view.
 
  Swapping based upon names only works and is very flexible, much more so
  than EXCHANGE could be.
 
  A separate utility might be worth it, but the feature set of that should
  be defined in terms of correctly-working DDL behaviour. It's possible
  that no further requirement exists. I remove my own patch from
  consideration for this release.
 
  I'll review your patch and commit it, problems or objections excepted. I
  haven't looked at it in any detail.
 
  Thanks.  I wouldn't be very surprised if that patch is even the wrong way to
  achieve these semantics, but it's great that we're on the same page as to
  which semantics they are.
 
 I think Noah's patch is a not a good idea, because it will result in
 calling RangeVarGetRelid twice even in the overwhelmingly common case
 where no relevant invalidation occurs.  That'll add several syscache
 lookups per table to very common operations.

Valid concern.

[Refresher: this was a patch to improve behavior for this test case:

psql -c "CREATE TABLE t AS VALUES ('old'); CREATE TABLE new_t AS VALUES ('new')"
psql -c 'SELECT pg_sleep(2) FROM t' &  # block the ALTER or DROP briefly
sleep 1                                # give prev time to take AccessShareLock

# Do it this way, and the next SELECT gets data from the old table.
psql -c 'ALTER TABLE t RENAME TO old_t; ALTER TABLE new_t RENAME TO t' &
# Do it this way, and get: ERROR:  could not open relation with OID 41380
#psql -c 'DROP TABLE t; ALTER TABLE new_t RENAME TO t' &

psql -c 'SELECT * FROM t'              # I get 'old' or an error, never 'new'.
psql -c 'DROP TABLE IF EXISTS t, old_t, new_t'

It did so by rechecking the RangeVar->oid resolution after locking the found
relation, by which time concurrent DDL could no longer be interfering.]

I benchmarked the original patch with this function:

Datum
nmtest(PG_FUNCTION_ARGS)
{
	int32		n = PG_GETARG_INT32(0);
	int			i;
	RangeVar   *rv = makeRangeVar(NULL, "pg_am", 0);

	for (i = 0; i < n; ++i)
	{
		Relation	r = relation_openrv(rv, AccessShareLock);

		relation_close(r, AccessShareLock);
	}
	PG_RETURN_INT32(4);
}

(Releasing the lock before transaction end makes for an unrealistic benchmark,
but so would opening the same relation millions of times in a single
transaction.  I'm trying to isolate the cost that would be spread over
millions of transactions opening relations a handful of times.  See attached
shar archive for a complete extension wrapping that test function.)

Indeed, the original patch slowed it by about 50%.  I improved the patch,
adding a global SharedInvalidMessageCounter to increment as we process
messages.  If this counter does not change between the RangeVarGetRelid() call
and the post-lock AcceptInvalidationMessages() call, we can skip the second
RangeVarGetRelid() call.  With the updated patch, I get these timings (in ms)
for runs of SELECT nmtest(1000):

master: 19697.642, 20087.477, 19748.995
patched: 19723.716, 19879.538, 20257.671

In other words, no significant difference.  Since the patch removes the
no-longer-needed pre-lock call to AcceptInvalidationMessages(), changing to
relation_close(r, NoLock) in the test case actually reveals a 15%
performance improvement.  Granted, nothing to get excited about in light of
the artificiality.

This semantic improvement would be hard to test with the current pg_regress
suite, so I do not include any test case addition in the patch.  If
sufficiently important, it could be done with isolationtester.

Thanks,
nm
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 01a492e..63537fd 100644
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***
*** 979,1004  relation_openrv(const RangeVar *relation, LOCKMODE lockmode)
  {
Oid relOid;
  
!	/*
!	 * Check for shared-cache-inval messages before trying to open the
!	 * relation.  This is needed to cover the case where the name identifies a
!	 * rel that has been dropped and recreated since the start of our
!	 * transaction: if we don't flush the old syscache entry then we'll latch
!	 * onto that entry and suffer an error when we do RelationIdGetRelation.
!	 * Note that relation_open does not need to do this, since a relation's
!	 * OID never changes.
!	 *

Re: [HACKERS] Detailed documentation for external calls (threading, shared resources etc)

2011-06-12 Thread Florian Pflug
On Jun12, 2011, at 19:26 , Seref Arikan wrote:
 This is actually a request for documentation guidance. I intend to
 develop an extension to postgresql. Basically I'd like to place calls
 to network using ZeroMQ, and I need to have detailed information about
 a lot of things, especially  threading issues. I need to have some
 global resources which will be presumably used by  multiple threads.
 I can see that there is a lot of documentation, but I'd really
 appreciate pointers towards the books, or key documents that'd help me
 move forward faster (docs/books about inner workings of key
 functionality) I'll be using C (most likely the best option) to
 develop code, so which books/documents would you recommend?


There are no threading issues in postgres, because postgres doesn't
use threads. Each client connection is serviced by one backend process,
launched by the postmaster when a new client connects. Communication
between backend processes takes place via a shared memory segment
and at times also via signals.

The documentation contains extensive information about how to interface
3rd-party code with postgres. To see how to interface C functions with
the SQL layer, read http://www.postgresql.org/docs/9.0/interactive/xfunc-c.html.
If you need to also access the database from your C-language functions,
also read http://www.postgresql.org/docs/9.0/interactive/spi.html.
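
The SQL-level declaration of such a C function then follows the pattern
from the xfunc-c page (path and names below are placeholders):

  CREATE FUNCTION add_one(integer) RETURNS integer
    AS '/path/to/your_module', 'add_one'   -- placeholder path
    LANGUAGE C STRICT;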

More exhaustive documentation is spread around the source tree in
the form of README files. I suggest you read the ones concerned with
the postgres subsystems you're dealing with. At the very least, you
should read src/backend/utils/mmgr/README, which explains how
postgres manages memory.

The various contrib modules, found in contrib/ in the source tree,
are also a good reference.

best regards,
Florian Pflug





Re: [HACKERS] Small SSI issues

2011-06-12 Thread Heikki Linnakangas

On 12.06.2011 17:59, Kevin Grittner wrote:

Heikki Linnakangas  wrote:
On 10.06.2011 18:05, Kevin Grittner wrote:

Heikki Linnakangas wrote:

o There is no safeguard against actually wrapping around the
SLRU, just the warning


Any thoughts on what we should do instead? If someone holds open a
transaction long enough to burn through a billion transaction IDs
(or possibly less if someone uses a smaller BLCKSZ), should we
generate a FATAL error?


FATAL is a bit harsh, ERROR seems more appropriate.


If we don't cancel the long-running transaction, don't we continue to
have a problem?


Yes, but ERROR is enough to kill the transaction. Unless it's in a 
subtransaction, I guess. But anyway, there's no guarantee that the 
long-running transaction will hit the problem first, or at all. You're 
much more likely to kill an innocent new transaction that tries to 
acquire its first locks, than an old long-running transaction that's 
been running for a while already, probably idle doing nothing, or doing 
a long seqscan on a large table with the whole table locked already.



I see you committed a patch for this, but there were some casts which
become unnecessary with that change that you missed.  Patch attached
to clean up the ones I could spot.


Ah, thanks, applied. I didn't realize those were also only needed 
because of the volatile qualifier.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



[HACKERS] Re: Detailed documentation for external calls (threading, shared resources etc)

2011-06-12 Thread Greg Stark
Well, process or thread, the issues are the same.

Check out the existing modules and how they use spinlocks and lwlocks to
protect shared memory data structures.

One thing to beware of is that there's no shared-memory allocator, so all
shared data structures need to be fixed-size and allocated at startup. Anything
where that isn't good enough usually ends up as an on-disk data structure.
On Jun 12, 2011 8:37 PM, Florian Pflug f...@phlo.org wrote:
 On Jun12, 2011, at 19:26 , Seref Arikan wrote:
 This is actually a request for documentation guidance. I intend to
 develop an extension to postgresql. Basically I'd like to place calls
 to network using ZeroMQ, and I need to have detailed information about
 a lot of things, especially threading issues. I need to have some
 global resources which will be presumably used by multiple threads.
 I can see that there is a lot of documentation, but I'd really
 appreciate pointers towards the books, or key documents that'd help me
 move forward faster (docs/books about inner workings of key
 functionality) I'll be using C (most likely the best option) to
 develop code, so which books/documents would you recommend?


 There are no threading issues in postgres, because postgres doesn't
 use threads. Each client connection is serviced by one backend process,
 launched by the postmaster when a new client connects. Communication
 between backend processes takes places via a shared memory segment
 and at times also via signals.

 The documentation contains extensive information about how to interface
 3rd-party code with postgres. To see how to interface C functions with
 the SQL layer, read
http://www.postgresql.org/docs/9.0/interactive/xfunc-c.html.
 If you need to also access the database from your C-language functions,
 also read http://www.postgresql.org/docs/9.0/interactive/spi.html.

 More exhaustive documentation is spread around the source tree in
 the form of README files. I suggest you read the ones concerned with
 the postgres subsystems you're dealing with. At the very least, you
  should read src/backend/utils/mmgr/README, which explains how
 postgres manages memory.

 The various contrib modules, found in contrib/ in the source tree,
 are also a good reference.

 best regards,
 Florian Pflug





[HACKERS] Formatting curmudgeon HOWTO

2011-06-12 Thread Greg Smith
With another round of GSoC submissions approaching, I went looking
around for some better guidance on the topic of how to follow terse
submission guidelines like "blend in with the surrounding code" or
"remove spurious whitespace".  And I didn't find any.  Many mature
open-source projects say things like that, but I haven't been able to
find a tutorial on just what that means, or how to do it.


Now we have http://wiki.postgresql.org/wiki/Creating_Clean_Patches to
fill that role, which fits in between "Working with Git" and "Submitting
a Patch" as a fairly detailed walkthrough.  That should be an easier URL
to point people who submit malformed patches toward than the
documentation we've had before.


Advocacy aside:  this page might be a good one to submit to sites that 
publish open-source news.  It's pretty generic advice, is useful but not 
widely documented information, and it reflects well on our development 
practice.  I'm trying to reverse the perception we hear about sometimes 
that submitting patches to PostgreSQL is unreasonably difficult.  Seeing 
an example of how much easier it is to read a well formatted patch 
serves well for making people understand why the project has high 
expectations for formatting work.  It's not pedantic, it's functionally 
better.  I threw it onto reddit as a first spot to popularize:  
http://www.reddit.com/r/technology/comments/hy0aq/creating_clean_patches_with_git_diff/


--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us





[HACKERS] lazy vxid locks, v1

2011-06-12 Thread Robert Haas
Here is a patch that applies over the reducing the overhead of
frequent table locks (fastlock-v3) patch and allows heavyweight VXID
locks to spring into existence only when someone wants to wait on
them.  I believe there is a large benefit to be had from this
optimization, because the combination of these two patches virtually
eliminates lock manager traffic on pgbench -S workloads.  However,
there are several flies in the ointment.

1. It's a bit of a kludge.  I leave it to readers of the patch to
determine exactly what about this patch they think is kludgey, but
it's likely not the empty set.  I suspect that MyProc-fpLWLock needs
to be renamed to something a bit more generic if we're going to use it
like this, but I don't immediately know what to call it.  Also, the
mechanism whereby we take SInvalWriteLock to work out the mapping from
BackendId to PGPROC * is not exactly awesome.  I don't think it
matters from a performance point of view, because operations that need
VXID locks are sufficiently rare that the additional lwlock traffic
won't matter a bit.  However, we could avoid this altogether if we
rejiggered the mechanism for allocating PGPROCs and backend IDs.
Right now, we allocate PGPROCs off of linked lists, except for
auxiliary procs which allocate them by scanning a three-element array
for an empty slot.  Then, when the PGPROC subscribes to sinval, the
sinval mechanism allocates a backend ID by scanning for the lowest
unused backend ID in the ProcState array.  If we changed the logic for
allocating PGPROCs to mimic what the sinval queue currently does, then
the backend ID could be defined as the offset into the PGPROC array.
Translating between a backend ID and a PGPROC * now becomes a matter
of pointer arithmetic.  Not sure if this is worth doing.

2. Bad things happen with large numbers of connections.  This patch
increases peak performance, but as you increase the number of
concurrent connections beyond the number of CPU cores, performance
drops off faster with the patch than without it.  For example, on the
32-core loaner from Nate Boley, using 80 pgbench -S clients, unpatched
HEAD runs at ~36K TPS; with fastlock, it jumps up to about ~99K TPS;
with this patch also applied, it drops down to about ~64K TPS, despite
the fact that nearly all the contention on the lock manager locks has
been eliminated.  On Stefan Kaltenbrunner's 40-core box, he was
actually able to see performance drop down below unpatched HEAD with
this applied!  This is immensely counterintuitive.  What is going on?

Profiling reveals that the system spends enormous amounts of CPU time
in s_lock.  LWLOCK_STATS reveals that the only lwlock with significant
amounts of blocking is the BufFreelistLock; but that doesn't explain
the high CPU utilization.  In fact, it appears that the problem is
with the LWLocks that are frequently acquired in *shared* mode.  There
is no actual lock conflict, but each LWLock is protected by a spinlock
which must be acquired and released to bump the shared locker counts.
In HEAD, everything bottlenecks on the lock manager locks and so it's
not really possible for enough traffic to build up on any single
spinlock to have a serious impact on performance.  The locks being
sought there are exclusive, so when they are contended, processes just
get descheduled.  But with the exclusive locks out of the way,
everyone very quickly lines up to acquire shared buffer manager locks,
buffer content locks, etc. and large pile-ups ensue, leading to
massive cache line contention and tons of CPU usage.  My initial
thought was that this was contention over the root block of the index
on the pgbench_accounts table and the buf mapping lock protecting it,
but instrumentation showed otherwise.  I hacked up the system to
report how often each lwlock spinlock exceeded spins_per_delay.  The
following is the end of a report showing the locks with the greatest
amounts of excess spinning:

lwlock 0: shacq 0 exacq 191032 blk 42554 spin 272
lwlock 41: shacq 5982347 exacq 11937 blk 1825 spin 4217
lwlock 38: shacq 6443278 exacq 11960 blk 1726 spin 4440
lwlock 47: shacq 6106601 exacq 12096 blk 1555 spin 4497
lwlock 34: shacq 6423317 exacq 11896 blk 1863 spin 4776
lwlock 45: shacq 6455173 exacq 12052 blk 1825 spin 4926
lwlock 39: shacq 6867446 exacq 12067 blk 1899 spin 5071
lwlock 44: shacq 6824502 exacq 12040 blk 1655 spin 5153
lwlock 37: shacq 6727304 exacq 11935 blk 2077 spin 5252
lwlock 46: shacq 6862206 exacq 12017 blk 2046 spin 5352
lwlock 36: shacq 6854326 exacq 11920 blk 1914 spin 5441
lwlock 43: shacq 7184761 exacq 11874 blk 1863 spin 5625
lwlock 48: shacq 7612458 exacq 12109 blk 2029 spin 5780
lwlock 35: shacq 7150616 exacq 11916 blk 2026 spin 5782
lwlock 33: shacq 7536878 exacq 11985 blk 2105 spin 6273
lwlock 40: shacq 7199089 exacq 12068 blk 2305 spin 6290
lwlock 456: shacq 36258224 exacq 0 blk 0 spin 54264
lwlock 42: shacq 43012736 exacq 11851 blk 10675 spin 62017
lwlock 4: shacq 72516569 exacq 190 blk 196 spin 341914



Re: [HACKERS] Creating new remote branch in git?

2011-06-12 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 I saw your commit that mentioned how to create a new branch.  My problem
 was with using workdir:

There seems to be something rather broken with your setup, because I
don't find it necessary to do any of that stuff; the recipe in the wiki
page works fine for me.  What git version are you using?  Maybe a buggy
version of git-new-workdir?

regards, tom lane



Re: [HACKERS] lazy vxid locks, v1

2011-06-12 Thread Greg Stark
On Sun, Jun 12, 2011 at 10:39 PM, Robert Haas robertmh...@gmail.com wrote:
 I hacked up the system to
 report how often each lwlock spinlock exceeded spins_per_delay.

I don't doubt the rest of your analysis but one thing to note, number
of spins on a spinlock is not the same as the amount of time spent
waiting for it.

When there's contention on a spinlock the actual test-and-set
instruction ends up taking a long time while cache lines are copied
around. In theory you could have processes spending an inordinate
amount of time waiting on a spinlock even though they never actually
hit spins_per_delay or you could have processes that quickly exceed
spins_per_delay.

I think in practice the results are the same, because the code the
spinlocks protect is always short, so it's hard to get the second case
on a multi-core box without actually having contention anyway.



-- 
greg



Re: [HACKERS] procpid?

2011-06-12 Thread Dimitri Fontaine
Peter Eisentraut pete...@gmx.net writes:
 On lör, 2011-06-11 at 16:23 -0400, Bruce Momjian wrote:
 Uh, I am the first one I remember complaining about this so I don't
 see why we should break compatibility for such a low-level problem. 

 I complain about it every day to the wall. :)

+1 !
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] reducing the overhead of frequent table locks, v3

2011-06-12 Thread Noah Misch
On Sun, Jun 12, 2011 at 03:57:08PM -0400, Robert Haas wrote:
 Thus far, locks taken via the fast-path mechanism are not shown in
 pg_locks.  I've been mulling over what to do about that.  It's a bit
 tricky to show a snapshot of the locks in a way that's guaranteed to
 be globally consistent, because you'd need to seize one lock per
 backend plus one lock per lock manager partition, which will typically
 exceed the maximum number of LWLocks that can be simultaneously held
 by a single backend.  And if you don't do that, then you must either
 scan the per-backend queues first and then the lock manager
 partitions, or the other way around.  Since locks can bounce from the
 per-backend queues to the primary lock table, the first offers the
 possibility of seeing the same lock twice, while the second offers the
 possibility of missing it altogether.  I'm inclined to scan the
 per-backend queues first and just document that in rare cases you may
 see duplicate entries.  We could also de-duplicate before returning
 results but I doubt it's worth the trouble.  Anyway, opinions?

Possibly returning duplicates seems okay.

 A related question is whether a fast-path lock should be displayed
 differently in pg_locks than one which lives in the primary lock
 table.  We could add a new boolean (or char) column to pg_locks to
 mark locks as fast-path or not, or maybe change the granted column to
 a three-valued column (fast-path-granted, normal-granted, waiting).
 Or we could omit to distinguish.  Again, opinions?

An extra boolean for that sounds good.
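
Viewed from SQL, that might look roughly like

  SELECT locktype, relation::regclass, pid, mode, granted,
         fastpath                           -- strawman column name
  FROM pg_locks;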



Re: [HACKERS] Make relation_openrv atomic wrt DDL

2011-06-12 Thread Robert Haas
On Sun, Jun 12, 2011 at 3:18 PM, Noah Misch n...@leadboat.com wrote:
 Indeed, the original patch slowed it by about 50%.  I improved the patch,
 adding a global SharedInvalidMessageCounter to increment as we process
 messages.  If this counter does not change between the RangeVarGetRelid() call
 and the post-lock AcceptInvalidationMessages() call, we can skip the second
 RangeVarGetRelid() call.  With the updated patch, I get these timings (in ms)
 for runs of SELECT nmtest(1000):

 master: 19697.642, 20087.477, 19748.995
 patched: 19723.716, 19879.538, 20257.671

 In other words, no significant difference.  Since the patch removes the
 no-longer-needed pre-lock call to AcceptInvalidationMessages(), changing to
 relation_close(r, NoLock) in the test case actually reveals a 15%
 performance improvement.  Granted, nothing to get excited about in light of
 the artificiality.

In point of fact, given the not-so-artificial results I just posted on
another thread (lazy vxid locks) I'm *very* excited about trying to
reduce the cost of AcceptInvalidationMessages().  I haven't reviewed
your patch in detail, but is there a way we can encapsulate the
knowledge of the invalidation system down inside the sinval machinery,
rather than letting the heap code have to know directly about the
counter?  Perhaps AIV() could return true or false depending on
whether any invalidation messages were processed, or somesuch.

 This semantic improvement would be hard to test with the current pg_regress
 suite, so I do not include any test case addition in the patch.  If
 sufficiently important, it could be done with isolationtester.

I haven't had a chance to look closely at the isolation tester yet,
but I'm excited about the possibilities for testing this sort of
thing.  Not sure whether it's worth including this or not, but it
doesn't seem like a bad idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Creating new remote branch in git?

2011-06-12 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  I saw your commit that mentioned how to create a new branch.  My problem
  was with using workdir:
 
 There seems to be something rather broken with your setup, because I
 don't find it necessary to do any of that stuff; the recipe in the wiki
 page works fine for me.  What git version are you using?  Maybe a buggy
 version of git-new-workdir?

I am running git version 1.7.3.  What is odd is that I didn't need it
when I originally set this up, but now I do, or maybe I manually updated
.git/config last time.  

Did the system create the .git/config '[branch REL9_1_STABLE]' section
for you or did you create it manually?  That is what those 'git config'
commands do.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Creating new remote branch in git?

2011-06-12 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Did the system create the .git/config '[branch REL9_1_STABLE]' section
 for you or did you create it manually?

git created them for me.  I did no config hacking whatever, but now
I have:

[branch "REL9_1_STABLE"]
	remote = origin
	merge = refs/heads/REL9_1_STABLE
	rebase = true

which exactly parallels the pre-existing entries for the other branches.

One point that might affect this is that in ~/.gitconfig I have

[branch]
autosetuprebase = always

which is as per the setup recommendations on the wiki page.

regards, tom lane



Re: [HACKERS] Creating new remote branch in git?

2011-06-12 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Did the system create the .git/config '[branch REL9_1_STABLE]' section
  for you or did you create it manually?
 
 git created them for me.  I did no config hacking whatever, but now
 I have:
 
 [branch "REL9_1_STABLE"]
   remote = origin
   merge = refs/heads/REL9_1_STABLE
   rebase = true
 
 which exactly parallels the pre-existing entries for the other branches.
 
 One point that might affect this is that in ~/.gitconfig I have
 
 [branch]
   autosetuprebase = always
 
 which is as per the setup recommendations on the wiki page.

I have the same in my ~/.gitconfig:

[branch]
autosetuprebase = always

I am attaching my ~/.gitconfig.

Do I need to run this in every branch?

git config branch.master.rebase true

Right now our wiki only says to run it in the master branch.  I am
attaching my postgresql/.git/config file too.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
[user]
name = Bruce Momjian
email = br...@momjian.us
[core]
excludesfile = /u/postgres/.gitignore
editor = emastd
pager = less -x4 -E
[remote "origin"]
fetch = +refs/heads/*:refs/remotes/origin/*
url = ssh://g...@gitmaster.postgresql.org/postgresql.git
[remote "github"]
fetch = +refs/heads/*:refs/remotes/origin/*
url = g...@github.com:bmomjian/postgres.git
[diff]
external = git-external-diff
[branch]
autosetuprebase = always
[branch "master"]
rebase = true
[gc]
auto = 0
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
[remote "origin"]
fetch = +refs/heads/*:refs/remotes/origin/*
url = ssh://g...@gitmaster.postgresql.org/postgresql.git
[branch "master"]
remote = origin
merge = refs/heads/master
rebase = true



Re: [HACKERS] lazy vxid locks, v1

2011-06-12 Thread Robert Haas
On Sun, Jun 12, 2011 at 5:58 PM, Greg Stark st...@mit.edu wrote:
 On Sun, Jun 12, 2011 at 10:39 PM, Robert Haas robertmh...@gmail.com wrote:
 I hacked up the system to
 report how often each lwlock spinlock exceeded spins_per_delay.

 I don't doubt the rest of your analysis but one thing to note, number
 of spins on a spinlock is not the same as the amount of time spent
 waiting for it.

 When there's contention on a spinlock the actual test-and-set
 instruction ends up taking a long time while cache lines are copied
 around. In theory you could have processes spending an inordinate
 amount of time waiting on a spinlock even though they never actually
 hit spins_per_delay or you could have processes that quickly exceed
 spins_per_delay.

 I think in practice the results are the same because the code the
 spinlocks protect is always short so it's hard to get the second case
 on a multi-core box without actually having contention anyways.

All good points.  I don't immediately have a better way of measuring
what's going on.  Maybe dtrace could do it, but I don't really know
how to use it and am not sure it's set up on any of the boxes I have
for testing.  Throwing gettimeofday() calls into SpinLockAcquire()
seems likely to change the overall system behavior enough to make the
results utterly meaningless.  It wouldn't be real difficult to count
the number of times that we TAS() rather than just counting the number
of times we TAS() more than spins-per-delay, but I'm not sure whether
that would really address your concern.  Hopefully, further
experimentation will make things more clear.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Creating new remote branch in git?

2011-06-12 Thread Robert Haas
On Sun, Jun 12, 2011 at 7:59 PM, Bruce Momjian br...@momjian.us wrote:
 Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Did the system create the .git/config '[branch REL9_1_STABLE]' section
  for you or did you create it manually?

 git created them for me.  I did no config hacking whatever, but now
 I have:

  [branch "REL9_1_STABLE"]
        remote = origin
        merge = refs/heads/REL9_1_STABLE
        rebase = true

 which exactly parallels the pre-existing entries for the other branches.

 One point that might affect this is that in ~/.gitconfig I have

 [branch]
       autosetuprebase = always

 which is as per the setup recommendations on the wiki page.

 I have the same in my ~/.gitconfig:

        [branch]
                autosetuprebase = always

 I am attaching my ~/.gitconfig.

 Do I need to run this in every branch?

        git config branch.master.rebase true

 Right now our wiki only says to run it in the master branch.  I am
 attaching my postgresql/.git/config file too.

This is ironclad evidence that you followed the directions out of
order, but yes, running that for every branch will fix it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Make relation_openrv atomic wrt DDL

2011-06-12 Thread Noah Misch
On Sun, Jun 12, 2011 at 06:20:53PM -0400, Robert Haas wrote:
 On Sun, Jun 12, 2011 at 3:18 PM, Noah Misch n...@leadboat.com wrote:
  Indeed, the original patch slowed it by about 50%.  I improved the patch,
  adding a global SharedInvalidMessageCounter to increment as we process
  messages.  If this counter does not change between the RangeVarGetRelid()
  call and the post-lock AcceptInvalidationMessages() call, we can skip the
  second RangeVarGetRelid() call.  With the updated patch, I get these
  timings (in ms) for runs of SELECT nmtest(1000):

  master: 19697.642, 20087.477, 19748.995
  patched: 19723.716, 19879.538, 20257.671

  In other words, no significant difference.  Since the patch removes the
  no-longer-needed pre-lock call to AcceptInvalidationMessages(), changing to
  relation_close(r, NoLock) in the test case actually reveals a 15%
  performance improvement.  Granted, nothing to get excited about in light of
  the artificiality.
 
 In point of fact, given the not-so-artificial results I just posted on
 another thread (lazy vxid locks) I'm *very* excited about trying to
 reduce the cost of AcceptInvalidationMessages().

Quite interesting.  A quick look suggests there is room for optimization there.

 I haven't reviewed
 your patch in detail, but is there a way we can encapsulate the
 knowledge of the invalidation system down inside the sinval machinery,
 rather than letting the heap code have to know directly about the
 counter?  Perhaps AIV() could return true or false depending on
 whether any invalidation messages were processed, or somesuch.

I actually did it exactly that way originally.  The problem was the return value
only applying to the given call, while I wished to answer a question like "Did
any call to AcceptInvalidationMessages() between code point A and code point B
process a message?"  Adding AcceptInvalidationMessages() calls to code between A
and B would make the return value test yield a false negative.  A global counter
was the best thing I could come up with that avoided this hazard.



Re: [HACKERS] On-the-fly index tuple deletion vs. hot_standby

2011-06-12 Thread Robert Haas
On Sun, Jun 12, 2011 at 3:01 PM, Noah Misch n...@leadboat.com wrote:
 I fully agree.  That said, if this works on the standby, we may as well
 also use it opportunistically on the master, to throttle bloat.

As long as the performance cost is de minimis, I agree.

 At any rate, if taking a cleanup lock on the right-linked page on the
 standby is sufficient to fix the problem, that seems like a far
 superior solution in any case.  Presumably the frequency of someone
 having a pin on that particular page will be far lower than any
 matching based on XID or heavyweight locks.  And the vast majority of
 such pins should disappear before the startup process feels obliged to
 get out its big hammer.

 Yep; looks promising.

 Does such a thing have a chance of being backpatchable?  I think the chances
 start slim and fall almost to zero on account of the difficulty of avoiding a
 WAL format change.

I can't see back-patching it.  Maybe people would feel it was worth
considering if we were getting legions of complaints about this
problem, but so far you're the only one.  In any case, back-patching a
WAL format change is a complete non-starter -- we can't go making
minor versions non-interoperable.

 Assuming that conclusion, I do think it's worth starting
 with something simple, even if it means additional bloat on the master in the
 wal_level=hot_standby + vacuum_defer_cleanup_age / hot_standby_feedback case.
 In choosing those settings, the administrator has taken constructive steps to
 accept master-side bloat in exchange for delaying recovery conflict.  What's
 your opinion?

I'm pretty disinclined to go tinkering with 9.1 at this point, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Make relation_openrv atomic wrt DDL

2011-06-12 Thread Robert Haas
On Sun, Jun 12, 2011 at 9:23 PM, Noah Misch n...@leadboat.com wrote:
 I haven't reviewed
 your patch in detail, but is there a way we can encapsulate the
 knowledge of the invalidation system down inside the sinval machinery,
 rather than letting the heap code have to know directly about the
 counter?  Perhaps AIV() could return true or false depending on
 whether any invalidation messages were processed, or somesuch.

 I actually did it exactly that way originally.  The problem was the return
 value only applying to the given call, while I wished to answer a question
 like "Did any call to AcceptInvalidationMessages() between code point A and
 code point B process a message?"  Adding AcceptInvalidationMessages() calls
 to code between A and B would make the return value test yield a false
 negative.  A global counter was the best thing I could come up with that
 avoided this hazard.

Oh, interesting point.  What if AcceptInvalidationMessages returns the
counter?  Maybe with typedef uint32 InvalidationPositionId or
something like that, to make it partially self-documenting, and
greppable.

Taking that a bit further, what if we put that counter in
shared-memory?  After writing new messages into the queue, a writer
would bump this count (only one process can be doing this at a time
because SInvalWriteLock is held) and memory-fence.  Readers would
memory-fence and then read the count before acquiring the lock.  If it
hasn't changed since we last read it, then don't bother acquiring
SInvalReadLock, because no new messages have arrived.  Or maybe an
exact multiple of 2^32 messages have arrived, but there's probably
some way to finesse around that issue, like maybe also using some kind
of memory barrier to allow resetState to be checked without the lock.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pg_trgm: unicode string not working

2011-06-12 Thread Robert Haas
On Sun, Jun 12, 2011 at 8:40 AM, Florian Pflug f...@phlo.org wrote:
 Note that this restriction was removed in postgres 9.1 which
 is currently in beta. However, GIN indexes must be re-created
 with REINDEX after upgrading from 9.0 to leverage that
 improvement.

Does pg_upgrade know about this?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Boolean operators without commutators vs. ALL/ANY

2011-06-12 Thread Robert Haas
On Sun, Jun 12, 2011 at 7:46 AM, Florian Pflug f...@phlo.org wrote:
 So in the end, I had to wrap the sub-query in a SQL-language
 function and use that in the check constraint. While this
 solved my immediate problem, the necessity of doing that
 highlights a few problems

 (A) "~" is an extremely bad name for the regexp-matching
 operators, since its visual form is symmetric but its
 behaviour isn't. This not only makes its usage very
 error-prone, it also makes it very hard to come up with a
 sensible name for a commutator of "~". I suggest that we
 add "=~" as an alias for "~", "~=" as a commutator
 for "=~", and deprecate "~". The same holds for "~~".

Does any other database or programming language implement it this way?

 (B) There should be a way to use ANY()/ALL() with the
 array elements becoming the left arguments of the operator.
 Ideally, we'd support ANY(array) operator value,
 but if that's not possible grammar-wise, I suggest we extend
 the OPERATOR() syntax to allow
  value OPERATOR(COMMUTATOR operator) ANY(array).
 OPERATOR(COMMUTATOR operator) would use the COMMUTATOR
 of the specified operator if one exists, and otherwise
 use the original operator with the arguments swapped.

It seems to me that if we provided some way of handling this, your
first proposal would be moot; and I have to say I like the idea of
allowing this a lot more than tinkering with the operator names.  I'm
not crazy about the proposed syntax, though; it seems cumbersome, and
it's really only needed for SOME/ALL/ANY, not in general operator
expressions.  Since ANY is a reserved keyword, I believe we could
allow something like expr op ANY BACKWARD ( ... ) -- or some other
keyword in lieu of BACKWARD if you prefer.

Hath the spec anything to say about this?

 (C) Why do we forbid sub-queries in CHECK constraints?
 I do realize that any non-IMMUTABLE CHECK constraint is
 a foot-gun, but since we already allow STABLE and even
 VOLATILE functions to be used inside CHECK constraints,
 forbidding sub-queries seems a bit pointless...

Dunno.  Maybe it's just an implementation restriction?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Boolean operators without commutators vs. ALL/ANY

2011-06-12 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sun, Jun 12, 2011 at 7:46 AM, Florian Pflug f...@phlo.org wrote:
 (B) There should be a way to use ANY()/ALL() with the
 array elements becoming the left arguments of the operator.

 It seems to me that if we provided some way of handling this, your
 first proposal would be moot; and I have to say I like the idea of
 allowing this a lot more than tinkering with the operator names.

There are syntactic reasons not to do that.  It'd be a lot easier just
to provide a commutator operator for ~.
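
For what it's worth, a minimal sketch of such a commutator can already
be written today with CREATE OPERATOR.  The "~=" spelling below follows
Florian's suggestion upthread; the support function name is invented:

  CREATE FUNCTION textregexeq_commutator(pattern text, val text)
  RETURNS boolean LANGUAGE sql IMMUTABLE STRICT
  AS $$ SELECT $2 ~ $1 $$;

  CREATE OPERATOR ~= (
      LEFTARG = text,
      RIGHTARG = text,
      PROCEDURE = textregexeq_commutator,
      COMMUTATOR = ~
  );

With that in place, the constraint from the start of the thread becomes
writable as CHECK ('regexp' ~= ANY(field)).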

 (C) Why do we forbid sub-queries in CHECK constraints?

 Dunno.  Maybe it's just an implementation restriction?

(1) We don't want to invoke the planner in the places where we'd
have to do so to make that work.

(2) It's just about inevitable that a sub-query would have results
dependent on other rows beside the one being checked.  As such, it
would be trying to enforce semantics that you simply can't enforce
via CHECK.  (And yes, you can bypass that with a function, but guess
what: it still won't actually work.)
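
As a concrete sketch of that failure mode (all names here are invented):
a function-wrapped "uniqueness" CHECK passes every row when it is checked
individually, yet fails to enforce the property, because each
transaction's snapshot excludes the other's uncommitted rows.

  CREATE TABLE t (x int);
  CREATE FUNCTION x_absent(v int) RETURNS boolean LANGUAGE sql
  AS $$ SELECT NOT EXISTS (SELECT 1 FROM t WHERE x = v) $$;
  ALTER TABLE t ADD CONSTRAINT pseudo_unique CHECK (x_absent(x));
  -- Two sessions concurrently inserting the same value both pass the
  -- CHECK; once both commit, the intended invariant is silently broken.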

regards, tom lane



Re: [HACKERS] Range Types and extensions

2011-06-12 Thread Robert Haas
On Sun, Jun 12, 2011 at 7:53 AM, Florian Pflug f...@phlo.org wrote:
 I think the collation is going to have to be baked into the type
 definition, no?  You can't just up and change the collation of the
 column as you could for a straight text column, if that might cause
 the contents of some rows to be viewed as invalid.

 Now you've lost me. If a text range is simply a pair of strings,
 as I suggested, and collations are applied only during comparison
 and RANGE_EMPTY(), why would the collation have to be baked into
 the type?

 If you're referring to the case
  (1) Create table with text-range column and collation C1
  (2) Add check constraint containing RANGE_EMPTY()
  (3) Add data
  (4) Alter column to have collation C2, possibly changing
      the result of RANGE_EMPTY() for existing ranges.
 then that points to a problem with ALTER COLUMN.

No, I'm saying that you might have a column containing  '[a, Z)', and
someone might change the collation of the column from en_US to C.
When the collation was en_US, the column could legally contain that
value, but now that the collation is C, it can't.  ALTER TABLE isn't
going to recheck the validity of the data when someone changes the
collation: that's only supposed to affect the sort order, not the
definition of what is a legal value.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Boolean operators without commutators vs. ALL/ANY

2011-06-12 Thread Robert Haas
On Sun, Jun 12, 2011 at 11:44 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sun, Jun 12, 2011 at 7:46 AM, Florian Pflug f...@phlo.org wrote:
 (B) There should be a way to use ANY()/ALL() with the
 array elements becoming the left arguments of the operator.

 It seems to me that if we provided some way of handling this, your
 first proposal would be moot; and I have to say I like the idea of
 allowing this a lot more than tinkering with the operator names.

 There are syntactic reasons not to do that.  It'd be a lot easier just
 to provide a commutator operator for ~.

Details?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP: collect frequency statistics for arrays

2011-06-12 Thread Robert Haas
On Sun, Jun 12, 2011 at 12:17 PM, Alexander Korotkov
aekorot...@gmail.com wrote:
 I'm worrying about the performance of "column <@ const" estimation. It takes
 O(m*(n+m)) time, where m is the const length and n is the statistics target.
 It can probably be too slow in some cases.

Hmm, that doesn't sound terribly promising.  The best thing to do here
is probably to construct a pessimal case and test it.  So, say, look
for a column <@ a 100-element array with a statistics target of 400...
once you know how bad it is, then we can make intelligent decisions
about where to go with it.
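
Something along these lines, say against the regression database's
array_op_test table, should do it (the array literal is abbreviated
here; the real test would need 100 distinct elements):

  ALTER TABLE array_op_test ALTER COLUMN i SET STATISTICS 400;
  ANALYZE array_op_test;
  \timing on
  EXPLAIN SELECT * FROM array_op_test
    WHERE i <@ '{1,2,3,4,5,6,7,8,9,10}';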

If the data type is hashable, you could consider building a hash table
on the MCVs and then do a probe for each element in the array.  I
think that's better than the other way around because there can't be
more than 10k MCVs, whereas the input constant could be arbitrarily
long.  I'm not entirely sure whether this case is important enough to
be worth spending a lot of code on, but then again it might not be
that much code.

Another option is to bound the number of operations you're willing to
perform to some reasonable limit, say, 10 * default_statistics_target.
 Work out ceil((10 * default_statistics_target) /
number-of-elements-in-const) and consider at most that many MCVs.
When this limit kicks in you'll get a less-accurate selectivity
estimate, but that's a reasonable price to pay for not blowing out
planning time.
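
For example, with the statistics target of 400 and the 100-element
constant from the pessimal case above, the cap would work out to:

  SELECT ceil((10 * 400)::numeric / 100);  -- at most 40 MCVs considered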

 At the moment I see no tests. If this code will be exercised by
 existing tests then you should put some notes with the patch to
 explain that, or at least provide some pointers as to how I might test
 this.

 I didn't find any existing tests which check selectivity estimation accuracy.
 And I found it difficult to create them, because regression tests give a
 binary result while estimation accuracy is a quantitative value. The existing
 regression tests cover the case where the typanalyze or selectivity estimation
 function falls over. I've added an "ANALYZE array_op_test;" line to the array
 test so that these tests cover that failure case for this patch's functions
 too. It seems that selectivity estimation accuracy should be tested manually
 on various distributions; I've done only a very small amount of such testing.
 Unfortunately, a few months passed before I got the idea about the
 "column <@ const" case, and now I don't have sufficient time for it due to my
 GSoC project. It would be great if you could help me with these tests.

AFAIK, we don't have regression tests for any other selectivity
estimator; except perhaps to the extent that we verify they don't
crash.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pgbench--new transaction type

2011-06-12 Thread Greg Smith

On 06/11/2011 03:21 PM, Jeff Janes wrote:

I wouldn't expect IPC chatter to show up in profiling, because it
costs wall time, but not CPU time.  The time spent might be attributed
to the kernel, or to pgbench, or to nothing at all.

Profilers aren't necessarily just accumulating raw CPU time, though.  If
the approach includes periodically sampling "what code is active right
now?", you might be able to separate this out even though it's not using
CPU time in the normal fashion.  I think you might just need to use a
better profiler.


Anyway, the sort of breakdown this helps produce is valuable 
regardless.  I highlighted the statement you made because I reflexively 
challenge theorizing about code hotspots on general principle.  The 
measurement-based breakdown you provided was more what I look for.




But there is no
repository of such "useful for developer testing" patches, other than
this mailing list.  That being the case, would it even be worthwhile
to fix it up more and submit it to commitfest?

The activity around profiling pgbench and trying to crack one of the
bottlenecks has been picking up a lot of momentum lately, and if we're
lucky that will continue throughout 9.2 development.  As such, now seems
as good a time as any to consider adding something like this.  We may end
up reskinning lots of pgbench before this is over.  I added your patch to
the CommitFest.



So at a loop of 512, you would have overhead of 59.0/512=0.115 out of
total time of 17.4, or 0.7% overhead.  So that should be large enough.

That I think is workable.  If the split were a compile-time constant
fixed at 512 unless you specifically changed it, even the worst typical
cases shouldn't suffer much from batch overhang.  If you create a
database so large that you only get 50 TPS, which is unusual but not
that rare, having a 512-execution batch size would mean you might get
your -T set end time lagging about 10 seconds (512 / 50 TPS) behind its
original plan.  Unlike the 10K you started with, that is reasonable; that
does sound like the sweet spot where overhead is low but time overrun
isn't too terrible.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us





Re: [HACKERS] Range Types and extensions

2011-06-12 Thread Darren Duncan

Robert Haas wrote:

On Sun, Jun 12, 2011 at 7:53 AM, Florian Pflug f...@phlo.org wrote:

I think the collation is going to have to be baked into the type
definition, no?  You can't just up and change the collation of the
column as you could for a straight text column, if that might cause
the contents of some rows to be viewed as invalid.

Now you've lost me. If a text range is simply a pair of strings,
as I suggested, and collations are applied only during comparison
and RANGE_EMPTY(), why would the collation have to be baked into
the type?

If you're referring to the case
 (1) Create table with text-range column and collation C1
 (2) Add check constraint containing RANGE_EMPTY()
 (3) Add data
 (4) Alter column to have collation C2, possibly changing
 the result of RANGE_EMPTY() for existing ranges.
then that points to a problem with ALTER COLUMN.


No, I'm saying that you might have a column containing  '[a, Z)', and
someone might change the collation of the column from en_US to C.
When the collation was en_US, the column could legally contain that
value, but now that the collation is C, it can't.  ALTER TABLE isn't
going to recheck the validity of the data when someone changes the
collation: that's only supposed to affect the sort order, not the
definition of what is a legal value.


You can have the same collation problem even without range types.

Consider the following:
 (1) Create table with the 2 text columns {L,R}, where both columns have the
collation en_US.
 (2) Add check constraint requiring L <= R.
 (3) Add a record with the value 'a' for L and 'Z' for R.
 (4) Alter the columns to have the collation C.

Good language design principles demand that the semantics for this simplified 
case and the semantics for replacing {L,R} with a single range-of-text-typed 
column be the same, including what happens with CHECK and ALTER TABLE.


Likewise, anything that affects ORDER BY should affect {<,>,<=,>=} and friends
the same way and vice-versa, and likewise should affect range validity.


It makes sense for collation to be considered part of text data types, and 
changing collation is casting from one text type to another.  Generally 
speaking, any inherent or applied aspect of a text or other value (such as 
collation) that affects the results of any deterministic operations on those 
values (such as sorting) should be considered part of the data type of those values.


-- Darren Duncan



Re: [HACKERS] Range Types and extensions

2011-06-12 Thread Robert Haas
On Mon, Jun 13, 2011 at 12:47 AM, Darren Duncan dar...@darrenduncan.net wrote:
 If you're referring to the case
  (1) Create table with text-range column and collation C1
  (2) Add check constraint containing RANGE_EMPTY()
  (3) Add data
  (4) Alter column to have collation C2, possibly changing
     the result of RANGE_EMPTY() for existing ranges.
 then that points to a problem with ALTER COLUMN.

 No, I'm saying that you might have a column containing  '[a, Z)', and
 someone might change the collation of the column from en_US to C.
 When the collation was en_US, the column could legally contain that
 value, but now that the collation is C, it can't.  ALTER TABLE isn't
 going to recheck the validity of the data when someone changes the
 collation: that's only supposed to affect the sort order, not the
 definition of what is a legal value.

 You can have the same collation problem even without range types.

 Consider the following:
  (1) Create table with the 2 text columns {L,R} and both columns have the
 collation en_US.
  (2) Add check constraint requiring L <= R.
  (3) Add a record with the value 'a' for L and 'Z' for R.
  (4) Alter the columns to have the collation C.

Oh, good point.

rhaas=# create table sample (t text collate "en_US", check (t < 'Z'));
CREATE TABLE
rhaas=# insert into sample values ('a');
INSERT 0 1
rhaas=# alter table sample alter column t type text collate "C";
ERROR:  check constraint "sample_t_check" is violated by some row

But interestingly, my Mac has a different notion of how this collation
works: it thinks 'a' > 'Z' even in en_US.  :-(
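
(That difference is easy to probe directly, assuming the en_US collation
exists in the installation:

  SELECT 'a' < 'Z' COLLATE "en_US" AS en_us,
         'a' < 'Z' COLLATE "C"     AS c;

glibc systems typically return true for the first comparison and false
for the second, while OS X's libc sorts en_US in plain byte order and
returns false for both.)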

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Make relation_openrv atomic wrt DDL

2011-06-12 Thread Noah Misch
On Sun, Jun 12, 2011 at 10:56:41PM -0400, Robert Haas wrote:
 On Sun, Jun 12, 2011 at 9:23 PM, Noah Misch n...@leadboat.com wrote:
  I haven't reviewed
  your patch in detail, but is there a way we can encapsulate the
  knowledge of the invalidation system down inside the sinval machinery,
  rather than letting the heap code have to know directly about the
  counter?  Perhaps AIV() could return true or false depending on
  whether any invalidation messages were processed, or somesuch.

  I actually did it exactly that way originally.  The problem was the
  return value only applying to the given call, while I wished to answer
  a question like "Did any call to AcceptInvalidationMessages() between
  code point A and code point B process a message?"  Adding
  AcceptInvalidationMessages() calls to code between A and B would make
  the return value test yield a false negative.  A global counter was
  the best thing I could come up with that avoided this hazard.
 
 Oh, interesting point.  What if AcceptInvalidationMessages returns the
 counter?  Maybe with typedef uint32 InvalidationPositionId or
 something like that, to make it partially self-documenting, and
 greppable.

That might be a start, but it's not a complete replacement for the global
counter.  AcceptInvalidationMessages() is actually called in LockRelationOid(),
but the comparison needs to happen a level up in RangeVarLockRelid().  So, we
would be adding encapsulation in one place to lose it in another.  Also, in the
uncontended case, the patch only calls AcceptInvalidationMessages() once per
relation_openrv.  It compares the counter after that call with a counter as the
last caller left it -- RangeVarLockRelid() doesn't care who that caller was.

 Taking that a bit further, what if we put that counter in
 shared-memory?  After writing new messages into the queue, a writer
 would bump this count (only one process can be doing this at a time
 because SInvalWriteLock is held) and memory-fence.  Readers would
 memory-fence and then read the count before acquiring the lock.  If it
 hasn't changed since we last read it, then don't bother acquiring
 SInvalReadLock, because no new messages have arrived.  Or maybe an
 exact multiple of 2^32 messages have arrived, but there's probably
 someway to finesse around that issue, like maybe also using some kind
 of memory barrier to allow resetState to be checked without the lock.

This probably would not replace a backend-local counter of processed messages
for RangeVarLockRelid()'s purposes.  It's quite possibly a good way to reduce
SInvalReadLock traffic, though.

Exact multiples of 2^32 messages need not be a problem, because the queue is
limited to MAXNUMMESSAGES (4096, currently).  I think you will need to pack into
one 32-bit value all data each backend needs to decide whether to proceed with
the full process.  Given that queue offsets fit into 13 bits (easily reduced to
12) and resetState is a bit, that seems practical enough at first glance.

nm



Re: [HACKERS] FOREIGN TABLE doc fix

2011-06-12 Thread Shigeru Hanada
Thanks for the review.

(2011/06/12 13:21), Robert Haas wrote:
 2011/6/9 Shigeru Hanada han...@metrosystems.co.jp:
 Attached patch includes fixes for FOREIGN TABLE documents:
 
 I committed the changes to ALTER FOREIGN TABLE, but I think the
 changes to CREATE FOREIGN TABLE need more thought.  The first of the
 two hunks you've proposed to add doesn't seem necessary to me, and the
 second one seems like it belongs in a chapter on how to write a
 foreign data wrapper correctly, rather than here.

Agreed.  How about the section for IterateForeignScan() in 50.1
"Foreign Data Wrapper Callback Routines"[1] for the second hunk?  It
seems the proper place to describe the responsibility for applying NOT
NULL constraints, because it is where an FDW author would deal with the
issue.  The section also mentions the responsibility for column
signature matching.
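
To illustrate the behavior in question with file_fdw (the server name,
file path, and data are of course made up):

  CREATE EXTENSION file_fdw;
  CREATE SERVER files FOREIGN DATA WRAPPER file_fdw;
  CREATE FOREIGN TABLE ft (a int NOT NULL)
    SERVER files OPTIONS (filename '/tmp/data.csv', format 'csv');
  -- If /tmp/data.csv contains empty fields, SELECT * FROM ft currently
  -- returns NULLs despite the NOT NULL declaration, without any error.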

By the way, I found another documentation issue.  5.10 "Foreign Data"[2]
says that an FDW for PostgreSQL is available, like the FDW for files,
but postgresql_fdw won't be available for the 9.1 release, at least as a
bundled extension.  ISTM that such a mention should be removed to avoid
misunderstanding.

Please find attached the revised patch.

[1] http://developer.postgresql.org/pgdocs/postgres/fdw-routines.html
[2] http://developer.postgresql.org/pgdocs/postgres/ddl-foreign-data.html

Regards,
-- 
Shigeru Hanada
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 9709dd6..09d7a24 100644
*** a/doc/src/sgml/ddl.sgml
--- b/doc/src/sgml/ddl.sgml
*** ANALYZE measurement;
*** 3021,3030 
  <firstterm>foreign data wrapper</firstterm>. A foreign data wrapper is a
  library that can communicate with an external data source, hiding the
  details of connecting to the data source and fetching data from it. There
! are several foreign data wrappers available, which can for example read
! plain data files residing on the server, or connect to another PostgreSQL
! instance. If none of the existing foreign data wrappers suit your needs,
! you can write your own; see <xref linkend="fdwhandler">.
  </para>
  
  <para>
--- 3021,3031 
  <firstterm>foreign data wrapper</firstterm>. A foreign data wrapper is a
  library that can communicate with an external data source, hiding the
  details of connecting to the data source and fetching data from it. There
! is a foreign data wrapper available as a <filename>contrib</filename> module,
! which can read plain data files residing on the server.  Other kinds of
! foreign data wrappers might be found as third-party products.  If none of
! the existing foreign data wrappers suit your needs, you can write your
! own; see <xref linkend="fdwhandler">.
  </para>
  
  <para>
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index fc07f12..f1318d7 100644
*** a/doc/src/sgml/fdwhandler.sgml
--- b/doc/src/sgml/fdwhandler.sgml
*** IterateForeignScan (ForeignScanState *no
*** 180,185 
--- 180,193 
  </para>
  
+ <para>
+  Note that <productname>PostgreSQL</productname>'s executor doesn't care
+  whether the rows returned violate the NOT NULL constraints that were
+  defined on the foreign table's columns.  If you want an FDW that
+  enforces NOT NULL constraints, you need to raise an error when a row
+  fetched from the foreign source violates such a constraint.
+ </para>
+ 
  <para>
  <programlisting>
  void
  ReScanForeignScan (ForeignScanState *node);
