Re: [HACKERS] Patch for reserved connections for replication users

2013-10-13 Thread Amit Kapila
On Thu, Oct 10, 2013 at 3:17 AM, Gibheer gibh...@zero-knowledge.org wrote:
 On Mon, 7 Oct 2013 11:39:55 +0530
 Amit Kapila amit.kapil...@gmail.com wrote:
 Robert Haas wrote:
 On Mon, Aug 5, 2013 at 2:04 AM, Andres Freund
 andres(at)2ndquadrant(dot)com wrote:
  Hmm.  It seems like this patch is making MaxConnections no longer
  mean the maximum number of connections, but rather the maximum
  number of non-replication connections.  I don't think I support
  that definitional change, and I'm kinda surprised if this is
  sufficient to implement it anyway (e.g. see InitProcGlobal()).
 
  I don't think the implementation is correct, but why don't you
  like the definitional change? The set of things you can do from
  replication connections are completely different from a normal
  connection. So using separate pools for them seems to make sense.
  That they end up allocating similar internal data seems to be an
  implementation detail to me.

  Because replication connections are still connections.  If I tell
  the system I want to allow 100 connections to the server, it should
  allow 100 connections, not 110 or 95 or any other number.

 I think that to reserve connections for replication, a mechanism
 similar to superuser_reserved_connections should be used, rather than
 the one for autovacuum workers or background workers.
 This won't change the definition of MaxConnections. Another thing is
 that rather than introducing a new parameter for replication reserved
 connections, it is better to use max_wal_senders, as it can serve the
 purpose.

 Review for replication_reserved_connections-v2.patch, considering we
 are going to use a mechanism similar to superuser_reserved_connections
 and won't allow the definition of MaxConnections to change.



 Hi,

 I took the time and reworked the patch with the feedback till now.
 Thank you very much Amit!

 So this patch uses max_wal_senders together with the idea of the first
 patch I sent. The error messages are also adjusted to make it obvious
 how it is supposed to be, and all checks work, as far as I could tell.

If I understand correctly, the patch now has an implementation such that
a. if the number of connections left is (ReservedBackends +
max_wal_senders), then only superuser or replication connections
will be allowed;
b. if the number of connections left is ReservedBackends, then only
superuser connections will be allowed.

So it will ensure that max_wal_senders is used to reserve connection
slots from being used by non-superuser connections. I find the new
usage of max_wal_senders acceptable; if anyone else thinks otherwise,
please let us know.
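If that reading is right, the two-tier policy can be sketched in C as follows. This is purely illustrative: the function and parameter names are mine, not the patch's symbols, and the real checks live in the backend's connection-startup path.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative sketch of the two-tier reservation described above
 * (hypothetical names, not the patch's actual code):
 *  - within the last reserved_backends free slots, only interactive
 *    superuser connections are admitted;
 *  - within the last (reserved_backends + max_wal_senders) free
 *    slots, only superuser or replication connections are admitted.
 */
static bool
connection_allowed(int free_slots, int reserved_backends,
                   int max_wal_senders, bool is_superuser,
                   bool is_walsender)
{
    if (free_slots <= 0)
        return false;                         /* no slots left at all */
    if (free_slots <= reserved_backends)
        return is_superuser && !is_walsender; /* superuser-only tier */
    if (free_slots <= reserved_backends + max_wal_senders)
        return is_superuser || is_walsender;  /* superuser or replication */
    return true;                              /* unreserved slots remain */
}
```

Note that a walsender is rejected in the innermost tier even though it runs as superuser, matching the review comment that replication connections should not consume the interactive superuser reserve.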


1.
+<varname>superuser_reserved_connections</varname>
+<varname>max_wal_senders</varname> only superuser and WAL connections
+are allowed.

Here a minus seems to be missing before max_wal_senders, and I think it
will be better to use replication connections rather than WAL
connections.

2.
-new replication connections will be accepted.
+new WAL or other connections will be accepted.

I think as per new implementation, we don't need to change this line.

3.
+ * reserved slots from max_connections for wal senders. If the number of free
+ * slots (max_connections - max_wal_senders) is depleted.

 Above calculation (max_connections - max_wal_senders) needs to
include superuser reserved connections.

4.
+ /*
+ * Although replication connections currently require superuser privileges, we
+ * don't allow them to consume the superuser reserved slots, which are
+ * intended for interactive use.
  */
  if ((!am_superuser || am_walsender) &&
  ReservedBackends > 0 &&
  !HaveNFreeProcs(ReservedBackends))
  ereport(FATAL,
  (errcode(ERRCODE_TOO_MANY_CONNECTIONS),
- errmsg("remaining connection slots are reserved for non-replication
superuser connections")));
+ errmsg("remaining connection slots are reserved for superuser connections")));

Will there be any problem if we do the above check before the check
for wal senders and reserved replication connections (+
!HaveNFreeProcs(max_wal_senders + ReservedBackends))) and don't change
the error message in this check? I think this will ensure that users
who don't enable max_wal_senders will see the same error message as
before, and the purpose of reserving connections for replication can
be served by your second check.


5.
+ gettext_noop("Sets the maximum number wal sender connections and
reserves them."),

Sets the maximum number 'of' wal sender connections and reserves them.
a. 'of' is missing in the above line.
b. I think 'reserves them' is not completely right, because superuser
connections will be allowed. How about adding something similar to
'and reserves them from being used by non-superuser connections' in
the above line?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynamic shared memory

2013-10-13 Thread Amit Kapila
On Wed, Oct 9, 2013 at 1:10 AM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Sep 26, 2013 at 9:27 AM, Noah Misch n...@leadboat.com wrote:
  There's no data corruption problem if we proceed - but there likely
  has been one leading to the current state.

 +1 for making this one a PANIC, though.  With startup behind us, a valid dsm
 state file pointed us to a control segment with bogus contents.  The
 conditional probability of shared memory corruption seems higher than that of
 a DBA editing the dsm state file of a running cluster to incorrectly name as
 the dsm control segment some other existing shared memory segment.

 To respond specifically to this point... inability to open a file on
 disk does not mean that shared memory is corrupted.  Full stop.

 A scenario I have seen a few times is that someone changes the
 permissions on part or all of $PGDATA while the server is running.  I
 have only ever seen this happen on Windows.  What typically happens
 today - depending on the exact scenario - is that the checkpoints will
 fail, but the server will remain up, sometimes even committing
 transactions under synchronous_commit=off, even though it can't write
 out its data.  If you fix the permissions before shutting down the
 server, you don't even lose any data.  Making inability to read a file
 into a PANIC condition will cause any such cluster to remain up only
 as long as nobody tries to use dynamic shared memory, and then throw
 up its guts.  I don't think users will appreciate that.

 I am tempted to commit the latest version of this patch as I have it.

1. Do you think we should add information about the pg_dynshmem file at:
http://www.postgresql.org/docs/devel/static/storage-file-layout.html
That page contains information about all files/folders in the data directory.

2.
+/*
+ * Forget that a temporary file is owned by a ResourceOwner
+ */
+void
+ResourceOwnerForgetDSM(ResourceOwner owner, dsm_segment *seg)
+{

Above function description should use 'dynamic shmem segment' rather
than temporary file.
Forget that a dynamic shmem segment is owned by a ResourceOwner


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Patch for reserved connections for replication users

2013-10-13 Thread Gibheer
On Sun, 13 Oct 2013 11:38:17 +0530
Amit Kapila amit.kapil...@gmail.com wrote:

 On Thu, Oct 10, 2013 at 3:17 AM, Gibheer gibh...@zero-knowledge.org
 wrote:
  On Mon, 7 Oct 2013 11:39:55 +0530
  Amit Kapila amit.kapil...@gmail.com wrote:
  Robert Haas wrote:
  On Mon, Aug 5, 2013 at 2:04 AM, Andres Freund
  andres(at)2ndquadrant(dot)com wrote:
   Hmm.  It seems like this patch is making MaxConnections no
   longer mean the maximum number of connections, but rather the
   maximum number of non-replication connections.  I don't think
   I support that definitional change, and I'm kinda surprised if
   this is sufficient to implement it anyway (e.g. see
   InitProcGlobal()).
  
   I don't think the implementation is correct, but why don't you
   like the definitional change? The set of things you can do from
   replication connections are completely different from a normal
   connection. So using separate pools for them seems to make
   sense. That they end up allocating similar internal data seems
   to be an implementation detail to me.
 
   Because replication connections are still connections.  If I
   tell the system I want to allow 100 connections to the server,
   it should allow 100 connections, not 110 or 95 or any other
   number.
 
  I think that to reserve connections for replication, a mechanism
  similar to superuser_reserved_connections should be used, rather
  than the one for autovacuum workers or background workers.
  This won't change the definition of MaxConnections. Another thing
  is that rather than introducing a new parameter for replication
  reserved connections, it is better to use max_wal_senders, as it
  can serve the purpose.
 
  Review for replication_reserved_connections-v2.patch, considering
  we are going to use a mechanism similar to
  superuser_reserved_connections and won't allow the definition of
  MaxConnections to change.
 
 
 
  Hi,
 
  I took the time and reworked the patch with the feedback till now.
  Thank you very much Amit!
 
  So this patch uses max_wal_senders together with the idea of the
  first patch I sent. The error messages are also adjusted to make it
  obvious how it is supposed to be, and all checks work, as far as I
  could tell.
 
 If I understand correctly, the patch now has an implementation such
 that
 a. if the number of connections left is (ReservedBackends +
 max_wal_senders), then only superuser or replication connections
 will be allowed;
 b. if the number of connections left is ReservedBackends, then only
 superuser connections will be allowed.

That is correct.

 So it will ensure that max_wal_senders is used to reserve connection
 slots from being used by non-superuser connections. I find the new
 usage of max_wal_senders acceptable; if anyone else thinks otherwise,
 please let us know.
 
 
 1.
 +<varname>superuser_reserved_connections</varname>
 +<varname>max_wal_senders</varname> only superuser and WAL
 connections
 +are allowed.
 
 Here a minus seems to be missing before max_wal_senders, and I think
 it will be better to use replication connections rather than WAL
 connections.

This is fixed.

 2.
 -new replication connections will be accepted.
 +new WAL or other connections will be accepted.
 
 I think as per new implementation, we don't need to change this line.

I reverted that change.

 3.
 + * reserved slots from max_connections for wal senders. If the
 number of free
 + * slots (max_connections - max_wal_senders) is depleted.
 
  Above calculation (max_connections - max_wal_senders) needs to
 include superuser reserved connections.

My first thought was that I would not add it here. When superuser
reserved connections are not set, then only max_wal_senders would
count.
But you are right, it has to be included, as 3 connections are
reserved by default for superusers.

 4.
 + /*
 + * Although replication connections currently require superuser
 privileges, we
 + * don't allow them to consume the superuser reserved slots, which
 are
 + * intended for interactive use.
   */
   if ((!am_superuser || am_walsender) &&
   ReservedBackends > 0 &&
   !HaveNFreeProcs(ReservedBackends))
   ereport(FATAL,
   (errcode(ERRCODE_TOO_MANY_CONNECTIONS),
 - errmsg("remaining connection slots are reserved for non-replication
 superuser connections")));
 + errmsg("remaining connection slots are reserved for superuser
 connections")));
 
 Will there be any problem if we do the above check before the check
 for wal senders and reserved replication connections (+
 !HaveNFreeProcs(max_wal_senders + ReservedBackends))) and don't change
 the error message in this check? I think this will ensure that users
 who don't enable max_wal_senders will see the same error message as
 before, and the purpose of reserving connections for replication can
 be served by your second check.

I have attached two patches, one that checks only max_wal_senders first
and the other checks reserved_backends first. Both return the original
message when max_wal_senders is not set, and I think it is 

Re: [HACKERS] removing old ports and architectures

2013-10-13 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 I think we should remove support for the following architectures:
 - superH

This one was contributed just a year or two ago, if memory serves,
which suggests that somebody out there cares about it.  OTOH, if
they still care, we could insist they provide whatever atomic ops
we want to depend on.

 - PA-RISC. I think Tom was the remaining user there? Maybe just !gcc.

Until pretty recently, there was a PA-RISC machine (not mine) in the
buildfarm.  I don't see it in the list today though.  In any case,
HP's compiler has always been a PITA, so no objection to requiring gcc
for this platform.

regards, tom lane




Re: [HACKERS] Compression of full-page-writes

2013-10-13 Thread Jesper Krogh

On 11/10/13 19:06, Andres Freund wrote:

On 2013-10-11 09:22:50 +0530, Amit Kapila wrote:

I think it will be difficult to prove, by using any compression
algorithm, that it compresses in most of the scenarios.
In many cases it can so happen that the WAL will also not be reduced
and tps can also come down if the data is non-compressible, because
any compression algorithm will have to try to compress the data and it
will burn some cpu for that, which in turn will reduce tps.

Then those concepts maybe aren't such a good idea after all. Storing
lots of compressible data in an uncompressed fashion isn't an all that
common usecase. I most certainly don't want postgres to optimize for
blank padded data, especially if it can hurt other scenarios. Just not
enough benefit.
That said, I actually have relatively high hopes for compressing full
page writes. There often enough is a lot of repetitiveness between rows
on the same page that it should be useful outside of such strange
scenarios. But maybe pglz is just not a good fit for this, it really
isn't a very good algorithm in this day and age.


Hm. There is a clear benefit for compressible data and clearly
no benefit for incompressible data.

How about letting autovacuum taste the compressibility of
pages on a per relation/index basis and set a flag that triggers
this functionality where it provides a benefit?

Not hugely more magical than figuring out whether the data ends up
in the heap or in a toast table as it is now.

--
Jesper




Re: [HACKERS] ECPG FETCH readahead

2013-10-13 Thread Boszormenyi Zoltan

On 2013-10-11 00:16, Alvaro Herrera wrote:

Boszormenyi Zoltan escribió:

On 2013-09-10 03:04, Peter Eisentraut wrote:

You need to update the dblink regression tests.

Done.

Dude, this is a humongous patch.


You *know* that the patch is available in pieces at 
https://github.com/zboszor/ecpg-readahead
right? You must have read the new initial announcement at
http://archives.postgresql.org/message-id/520f4b90.2010...@cybertec.at

You can review and merge them one by one.
The transformation of ecpg_execute() and friends starts at
2d04ff83984c63c34e55175317e3e431eb58e00c


   I was shocked by it initially, but on
further reading, I observed that it's only a huge patch which also does
some mechanical changes to test output.  I think it'd be better to split
the part that's responsible for the changed lines in test output
mentioning ecpg_process_output.


It's 1e0747576e96aae3ec8c60c46baea130aaf8916e in the above repository.

Also, another huge regression test patch is where the cursor readahead
is enabled unconditionally: 27e43069082b29cb6fa4d3414e6484ec7fb80cbe


   That should be a reasonably small
patch which changes ecpg_execute slightly, adds the new function, and
is followed by the enormous resulting mechanical changes in test
output. It should be possible to commit that relatively quickly.  Then
there's the rest of the patch, which adds a huge pile of new code.

I think there are some very minor changes to backend code as well --
would it make sense to post that as a separate piece?


It's 2ad207e6371a33d6a985c76ac066dd51ed5681cb
Also, cd881f0363b1aff1508cfa347e8df6b4981f0ee7 to fix the dblink regression 
test.

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/





Re: [HACKERS] removing old ports and architectures

2013-10-13 Thread Andres Freund
On 2013-10-12 18:35:00 -0700, Peter Geoghegan wrote:
 Not so sure about these.
  - M32R (no userspace CAS afaics)

I don't think M32R will hurt us/anybody much.

  - 32bit/v9 sparc (doesn't have proper atomics, old)

Sparc v9 is from 1995, so I think not supporting it anymore is
fair. It's afaics not supported by Sun Studio's intrinsics either.

  - mips for anything but gcc >= 4.4, using gcc's atomics support

The reason I'd like to de-support mips for older GCCs is that writing
assembler for them isn't trivial enough to do it blindly and I've had -
for other stuff - difficulties getting my hand on them. GCC provides all
the atomics for mips since 4.2, so we can just rely on that.
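For reference, "gcc's atomics support" here means compiler builtins like the legacy `__sync` family, which need no per-architecture assembler. A minimal CAS-based increment looks like this (a sketch, not PostgreSQL code):

```c
#include <assert.h>

/*
 * Sketch of a lock-free increment built on gcc's __sync builtins,
 * the kind of compiler-provided atomics the discussion proposes to
 * rely on instead of hand-written mips/s390 assembler.  The CAS loop
 * retries until no other thread has changed *ptr in between.
 */
static int
atomic_increment(volatile int *ptr)
{
    int old;

    do
    {
        old = *ptr;
    } while (!__sync_bool_compare_and_swap(ptr, old, old + 1));
    return old + 1;             /* the value we installed */
}
```

On the architectures named here, gcc lowers this to the native LL/SC or CAS instruction, which is exactly why requiring a recent-enough gcc removes the need for hand-maintained assembler.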

  - s390 for anything but gcc >= 4.4, using gcc's atomics support

Easier to write assembler, but still untestable and even harder to get
access on.
I think 4.2 should be fine as well.

 I think we should think hard about removing support for MIPS. A lot of
 Chinese chip manufacturers have licensed MIPS technology in just the
 last couple of years, so there is plenty of it out there; I'd be
 slightly concerned that the proposed restrictions on MIPS would be
 onerous. Much of this is the kind of hardware that a person might
 plausibly want to run Postgres on.

That's a fair point. But all of them will use gcc, right? I've
previously thought we'd need 4.4 because there's an incompatibility
between 4.3 and 4.4 but I think it won't touch us, so 4.2 which added
atomics for mips seems fine. Given there's no buildfarm animal and
there's lots of variety out there that seems like a fair amount of
support.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] removing old ports and architectures

2013-10-13 Thread Andres Freund
On 2013-10-13 11:34:42 +0200, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I think we should remove support for the following architectures:
  - superH
 
 This one was contributed just a year or two ago, if memory serves,
 which suggests that somebody out there cares about it.  OTOH, if
 they still care, we could insist they provide whatever atomic ops
 we want to depend on.

It was 2009 - aac3c301b5e8178841e5749b3657c1a639ba06c1 . I haven't yet
verified if gcc's atomics support is acceptable for the platform
(checkout is running for the last 3h...).
If it's supported, falling back to that seems easy enough.

  - PA-RISC. I think Tom was the remaining user there? Maybe just !gcc.
 
 Until pretty recently, there was a PA-RISC machine (not mine) in the
 buildfarm.  I don't see it in the list today though.  In any case,
 HP's compiler has always been a PITA, so no objection to requiring gcc
 for this platform.

The reason I'd like to generally get rid of PA-RISC is that it's the
only platform that doesn't seem to have any form of compare and
swap. GCC should provide fallbacks - with some warnings - using
spinlocks instead but I am afraid people will start doing things like
atomic operations in signal handlers that won't be noticed and will be a
PITA to debug.

Having read a fair amount of assembler looking at this I have to say,
anybody thinking LL/SC architectures are neat...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] removing old ports and architectures

2013-10-13 Thread Andres Freund
On 2013-10-13 14:08:59 +0200, Andres Freund wrote:
 On 2013-10-13 11:34:42 +0200, Tom Lane wrote:
  Andres Freund and...@2ndquadrant.com writes:
   I think we should remove support for the following architectures:
   - superH
  
  This one was contributed just a year or two ago, if memory serves,
  which suggests that somebody out there cares about it.  OTOH, if
  they still care, we could insist they provide whatever atomic ops
  we want to depend on.
 
 It was 2009 - aac3c301b5e8178841e5749b3657c1a639ba06c1 . I haven't yet
 verified if gcc's atomics support is acceptable for the platform
 (checkout is running for the last 3h...).
 If it's supported, falling back to that seems easy enough.

Ok, so the git checkout has finally
finished. 
http://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/sh/sync.md;hb=HEAD
makes for an interesting read...

In short: On newer sh variants (SH4) true hardware support is used; on
older ones the Linux kernel provides helpers. The variant has to be
selected by gcc parameters unfortunately, but I'd guess that's ok
(e.g. SH4 -matomic-model=hard-llcs).

   - PA-RISC. I think Tom was the remaining user there? Maybe just !gcc.
  
  Until pretty recently, there was a PA-RISC machine (not mine) in the
  buildfarm.  I don't see it in the list today though.  In any case,
  HP's compiler has always been a PITA, so no objection to requiring gcc
  for this platform.
 
 The reason I'd like to generally get rid of PA-RISC is that it's the
 only platform that doesn't seem to have any form of compare and
 swap. GCC should provide fallbacks - with some warnings - using
 spinlocks instead

Hm. I had misunderstood the documentation around this a bit - source
code still is the only reasonable documentation :(.
PA-RISC is only supported on linux using kernel helpers (which switch
off interrupts for the duration of the operation...).

 but I am afraid people will start doing things like
 atomic operations in signal handlers that won't be noticed and will be a
 PITA to debug.

That's already a possibility with our memory barriers :(.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] removing old ports and architectures

2013-10-13 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 That's a fair point. But all of them will use gcc, right? I've
 previously thought we'd need 4.4 because there's an incompatibility
 between 4.3 and 4.4 but I think it won't touch us, so 4.2 which added
 atomics for mips seems fine. Given there's no buildfarm animal and
 there's lots of variety out there that seems like a fair amount of
 support.

FWIW, I wouldn't have the slightest bit of difficulty with setting a
project policy that if you want some platform to be supported, you
must provide a buildfarm animal running on it.

More to the point for this specific case, it seems like our process
ought to be
(1) select a preferably-small set of gcc atomic intrinsics that we
want to use.
(2) provide macro wrappers for these so that substituting other
implementations isn't too tough.
(3) for any platform where people don't want to use recent gcc,
insist they provide substitute implementations of the macros.

I don't think the core project has to be responsible for implementing
(3), except maybe on very-mainstream arches such as x86.  We can
adopt a self-help attitude for everything else.
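Steps (1)-(3) might look roughly like this as a macro layer. The names are hypothetical, not PostgreSQL's eventual API; the non-gcc branch is where a platform port would plug in its substitute implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical wrapper macros over a small set of gcc atomic
 * intrinsics (step 2).  A platform without a recent gcc would supply
 * its own definitions - e.g. spinlock-based fallbacks - in place of
 * the #error (step 3).
 */
#if defined(__GNUC__)
#define my_atomic_cas_u32(ptr, oldval, newval) \
    __sync_bool_compare_and_swap((ptr), (oldval), (newval))
#define my_atomic_fetch_add_u32(ptr, add) \
    __sync_fetch_and_add((ptr), (add))
#else
#error "provide substitute atomics for this compiler (step 3)"
#endif
```

Because callers only ever see the macro names, swapping in a per-platform implementation never touches the call sites, which is the point of step (2).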

But ... having said all that, it would be nice to see some proof of
significant performance benefits that we're going to get from kicking
those non-mainstream arches to the curb.  I'm not enamored of removing
real functionality now for vague promises of benefits later.

regards, tom lane




Re: [HACKERS] Triggers on foreign tables

2013-10-13 Thread Ronan Dunklau
On Saturday, 12 October 2013 07:30:35, Kohei KaiGai wrote:
 2013/10/10 Ronan Dunklau rdunk...@gmail.com:

 Sorry, I'm uncertain the point above.
 Are you saying FDW driver may be able to handle well a case when
 a remote tuple to be updated is different from a remote tuple being
 fetched on the first stage? Or, am I misunderstanding?

I was not clear in the point above: what I meant was that, from my
understanding, the FDW driver has no obligation to use the
system-attribute tupleid to identify the remote tuple. IIRC, one of
the early API proposals had the tupleid passed as an argument to the
ExecForeign* hooks. Now, these hooks receive the whole planslot
instead. This slot can store additional resjunk attributes (thanks to
AddForeignUpdateTargets) which shouldn't be altered during execution
and are carried on until the modify stage. So, these additional
attributes can be used as identifiers instead of the tupleid.

In fact, this is the approach I implemented for the multicorn fdw, and I 
believe it to be used in other FDWs as well (HadoopFDW does that, if I 
understand it correctly).

So, it doesn't make sense to me to assume that the tupleid will be
filled as a valid remote identifier in the context of triggers.

 
 From another standpoint, I'd like to cancel the above my suggestion,
 because after-row update or delete trigger tries to fetch an older image
 of tuple next to the actual update / delete jobs.
 Even if FDW is responsible to identify a remote tuple using tupleid,
 the identified tuple we can fetch is the newer image because FDW
 driver already issues a remote query to update or delete the target
 record.
 So, sooner or later, FDW shall be responsible for keeping a complete
 tuple image when the foreign table has row-level triggers, until the
 write operation is finished.

+1

 Does the incomplete tuple mean a tuple image where some of the
 columns are replaced with NULL due to optimization, as postgres_fdw
 does?
 If so, a solution is to enforce the FDW driver to fetch all the
 columns if its managing foreign table has row-level triggers for
 update / delete.

What I meant is that a DELETE statement, for example, does not build a scan-
stage reltargetlist consisting of the whole set of attributes: the only 
attributes that are fetched are the ones built by addForeignUpdateTargets, and 
whatever additional attributes are involved in the DELETE statement (in the 
WHERE clause, for example).

I apologize if this is not clear, since both my technical and English
vocabulary are maybe not precise enough to get my point across.

  Or, perform a second round-trip to the foreign data store to fetch the
  whole tuple.
 
 It might be a solution for before-row trigger, but not for after-row
 trigger, unfortunately.

+1

 As I noted above, a 2nd round trip is not a suitable design to
 support after-row triggers. Likely, the GetTupleForTrigger() hook
 shall return a cached older image fetched on the first round of the
 remote scan.
 So, I guess every FDW driver will have similar implementation to handle
 these the situation, thus it makes sense that PostgreSQL core has
 a feature to support this handling; that keeps a tuple being fetched last
 and returns older image for row-level triggers.
 
 How about your opinion?

I like this idea, but I don't really see all the implications of such a 
design.
Where would this tuple be stored ?
From what I understand, after-triggers are queued for later execution, using 
the old/new tupleid for later retrieval (in the AfterTriggerEventData struct). 
This would need a major change to work with foreign tables. Would that involve 
a special path for executing those triggers ASAP?

 
 And, I also want some comments from committers, not only from mine.

+1

Thank you for taking the time to think this through.


--
Ronan Dunklau




Re: [HACKERS] removing old ports and architectures

2013-10-13 Thread Andres Freund
On 2013-10-13 16:56:12 +0200, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  That's a fair point. But all of them will use gcc, right? I've
  previously thought we'd need 4.4 because there's an incompatibility
  between 4.3 and 4.4 but I think it won't touch us, so 4.2 which added
  atomics for mips seems fine. Given there's no buildfarm animal and
  there's lots of variety out there that seems like a fair amount of
  support.

 FWIW, I wouldn't have the slightest bit of difficulty with setting a
 project policy that if you want some platform to be supported, you
 must provide a buildfarm animal running on it.

 More to the point for this specific case, it seems like our process
 ought to be
 (1) select a preferably-small set of gcc atomic intrinsics that we
 want to use.
 (2) provide macro wrappers for these so that substituting other
 implementations isn't too tough.
 (3) for any platform where people don't want to use recent gcc,
 insist they provide substitute implementations of the macros.
 
 I don't think the core project has to be responsible for implementing
 (3), except maybe on very-mainstream arches such as x86.  We can
 adopt a self-help attitude for everything else.

Sounds fair to me.

The question about platforms that simply cannot provide such atomics
like PA-RISC, which afaics is the only one, remains tho. I am not sure
we really want to provide codepaths that are only going to be tested
there.

I do plan to propose a set of macros for this once we know what cases we
need to support.

 But ... having said all that, it would be nice to see some proof of
 significant performance benefits that we're going to get from kicking
 those non-mainstream arches to the curb.  I'm not enamored of removing
 real functionality now for vague promises of benefits later.

The reason I am bringing this up is that I'd like lift the patch in
http://archives.postgresql.org/message-id/20130926225545.GB26663%40awork2.anarazel.de
from a purely POC state to something committable. The numbers in it seem
to justify causing some pain.
Once the infrastructure is there we can easily get some really nice
further benefits by changing buffer pinning - that's one of the most
expensive things wrt. scalability after the above patch.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] removing old ports and architectures

2013-10-13 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 The question remains about platforms that simply cannot provide such
 atomics, like PA-RISC, which AFAICS is the only one. I am not sure we
 really want to provide code paths that are only going to be tested
 there.

PA-RISC is a dead architecture.  According to wikipedia, HP hasn't sold
any such machines since 2008, and won't support them beyond 2013.  If
that really is the only case we're worried about supporting, it's an
easy decision.

What worries me more is that you mentioned several cases where the gcc
atomics exist but need kernel support.  I have to think that a trap
to the kernel would make the operation so expensive as to be a serious
performance loss, not gain.  So it seems to me that platforms like that
are essentially being kicked to the curb if we make this change, even
if they theoretically could still work.  Are there any that we really
care about?

regards, tom lane




Re: [HACKERS] removing old ports and architectures

2013-10-13 Thread Andres Freund
On 2013-10-13 20:39:21 +0200, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  The question remains about platforms that simply cannot provide such
  atomics, like PA-RISC, which AFAICS is the only one. I am not sure we
  really want to provide code paths that are only going to be tested
  there.
 
 PA-RISC is a dead architecture.  According to wikipedia, HP hasn't sold
 any such machines since 2008, and won't support them beyond 2013.  If
 that really is the only case we're worried about supporting, it's an
 easy decision.

Great.

 What worries me more is that you mentioned several cases where the gcc
 atomics exist but need kernel support.  I have to think that a trap
 to the kernel would make the operation so expensive as to be a serious
 performance loss, not gain.  So it seems to me that platforms like that
 are essentially being kicked to the curb if we make this change, even
 if they theoretically could still work.

I think it's not that bad on most of them: they all seem to use some
lightweight trap mechanism that basically just changes protection and
stops interrupts while doing the math, so there is no chance of
scheduling and such.

 Are there any that we really care about?

I don't think so. At least not if we're restricting ourselves to 32bit
cmpxchg/xadd which I think will be enough for the first rounds of
improvement.

It's:
- PA-RISC
- sparc before ultrasparcs (1995)
- Multi-CPU/core SuperH before SH4 (uniprocessor SH2 has some cute
  interrupt-handling tricks that do not require a trap)
- arm before v6

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Auto-tuning work_mem and maintenance_work_mem

2013-10-13 Thread Dimitri Fontaine
MauMau maumau...@gmail.com writes:
 I understand this problem occurs only when the user configured the
 application server to use distributed transactions, the application server
 crashed between prepare and commit/rollback, and the user doesn't recover
 the application server.  So only improper operation produces the problem.

The reason that parameter's default was changed from 5 to 0 is that
some people would mistakenly use a prepared transaction without a
transaction manager. So few people actually use a transaction manager
that it's better to make those who do set the parameter in PostgreSQL
explicitly.
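For context, the parameter in question is max_prepared_transactions, and a transaction manager drives the two-phase protocol roughly as follows (table and GID names illustrative):

```sql
-- Requires max_prepared_transactions > 0 (the default is now 0).
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
PREPARE TRANSACTION 'xa-txn-1';  -- phase one: persist the transaction
-- ... later, the transaction manager resolves it:
COMMIT PREPARED 'xa-txn-1';      -- or ROLLBACK PREPARED 'xa-txn-1'
```

With the default of 0, the PREPARE TRANSACTION step simply fails, which is exactly the safety net being described: an orphaned prepared transaction holds locks and blocks VACUUM until someone resolves it.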

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Auto-tuning work_mem and maintenance_work_mem

2013-10-13 Thread Dimitri Fontaine
Magnus Hagander mag...@hagander.net writes:
 Frankly, I think we'd help 1000 times more users if we enabled a few WAL
 senders by default and bumped the wal_level. Mainly so they could run
 one-off base backups. That's used by orders of magnitude more users than XA.

+1, or += default max_wal_senders actually ;-)

My vote would be to have default to at least 3, so that you can run both
a pg_basebackup -Xs (stream) and a standby or a pg_receivexlog in
parallel. Maybe 5 is an even better default.
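Concretely, the setup being argued for would look something like this (values illustrative; `pg_basebackup -X stream` itself needs two slots, which is why 3 leaves room for one standby or pg_receivexlog alongside it):

```
# postgresql.conf
wal_level = hot_standby      # anything above 'minimal' permits base backups
max_wal_senders = 3          # 2 for pg_basebackup -X stream, 1 spare
```

```shell
pg_basebackup -D /path/to/backup -X stream
```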

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Auto-tuning work_mem and maintenance_work_mem

2013-10-13 Thread Andres Freund
On 2013-10-12 09:04:55 +0200, Magnus Hagander wrote:
 Frankly, I think we'd help 1000 times more users if we enabled a few WAL
 senders by default and bumped the wal_level. Mainly so they could run
 one-off base backups. That's used by orders of magnitude more users than XA.

Yes, I've thought about that several times as well. I think it might
actually not be too hard to allow increasing the WAL level dynamically,
similar to what we do with full_page_writes. If we then allowed
starting WAL senders with a lower wal_level, we would get there without
breaking older installations that rely on the TRUNCATE & COPY
optimizations et al.
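For reference, the optimization in question: at wal_level = minimal, loading data into a table created or truncated in the same transaction can skip WAL entirely, with the relation fsync'ed at commit instead. A sketch:

```sql
BEGIN;
TRUNCATE bulk_target;                   -- table (re)created in this transaction
COPY bulk_target FROM '/tmp/data.csv';  -- may skip WAL at wal_level = minimal
COMMIT;                                 -- relation is fsync'ed instead
```

Raising the default wal_level would silently disable this, which is why the dynamic approach above is attractive.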

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-10-13 Thread Peter Geoghegan
On Wed, Oct 9, 2013 at 1:11 PM, Peter Geoghegan p...@heroku.com wrote:
 Unfortunately, I have a very busy schedule in the month ahead,
 including travelling to Ireland and Japan, so I don't think I'm going
 to get the opportunity to work on this too much. I'll try and produce
 a V4 that formally proposes some variant of my ideas around visibility
 of locked tuples.

V4 is attached.

Most notably, this adds the modifications to HeapTupleSatisfiesMVCC(),
though they're neater than in the snippet I sent earlier.

There is also some clean-up around row-level locking. That code has
been simplified. I also try and handle serialization failures in a
better way, though that really needs the attention of a subject matter
expert.

There are a few additional XXX comments highlighting areas of concern,
particularly around serializable behavior. I've deferred making higher
isolation levels care about wrongly relying on the special
HeapTupleSatisfiesMVCC() exception (e.g. they won't throw a
serialization failure), mostly because I couldn't decide where to do
the test in time before travelling tomorrow.

I've added code to do heap_prepare_insert before value locks are held.
Whatever our eventual value locking implementation, that's going to be
a useful optimization. Though unfortunately I ran out of time to give
this the scrutiny it really deserves, I suppose that it's something
that we can return to later.
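For readers joining the thread, the rough shape of the proposed feature, per earlier versions of this patch series (the exact spelling in V4 may differ), pairs the new clause with a writable CTE:

```sql
-- Sketch only; table and column names illustrative.
WITH rej AS (
    INSERT INTO tab (key, val)
    VALUES (1, 'new')
    ON DUPLICATE KEY LOCK FOR UPDATE
    RETURNING REJECTS *
)
UPDATE tab SET val = rej.val
FROM rej
WHERE tab.key = rej.key;
```

The insert either succeeds outright or locks the conflicting row and hands it back, so the wCTE's UPDATE can finish the upsert without a race.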

I ask that reviewers continue to focus on concurrency issues and broad
design issues, and continue to defer discussion about an eventual
value locking implementation. I continue to think that that's the most
useful way of proceeding for the time being. My earlier points about
probable areas of concern [1] remain a good place for reviewers to
start.

[1] 
http://www.postgresql.org/message-id/cam3swzsvsrtzphjnpjahtj0rffs-gjfhu86vpewf+eo8gwz...@mail.gmail.com

-- 
Peter Geoghegan


insert_on_dup.v4.2013_10_13.patch.gz
Description: GNU Zip compressed data
