Re: [HACKERS] 9.2.1 index-only scans : abnormal heap fetches after VACUUM FULL

2014-02-18 Thread Heikki Linnakangas

On 02/16/2014 10:19 PM, Jim Nasby wrote:

On 1/24/14, 3:52 PM, Jaime Casanova wrote:

On Tue, Dec 3, 2013 at 11:25 AM, Bruce Momjian br...@momjian.us wrote:


Is everyone else OK with this approach?  Updated patch attached.


Hi,

I started to look at this patch and I found that it fails an assertion
as soon as you run a VACUUM FULL after a lazy VACUUM, even if those are
on unrelated relations. For example, in an assert-enabled build with
the regression database, run:

VACUUM customer;
[... insert here whatever commands you like or nothing at all ...]
VACUUM FULL tenk1;


Is anyone else confused/concerned that regression testing didn't pick this up?


I wouldn't expect that to be explicitly tested - it's pretty unexpected 
that they work independently but fail when run one after another. But 
it's a bit surprising that we don't happen to hit that combination in any 
of the tests by pure chance.



The vacuum.sql test does not test lazy vacuum at all, and I can't seem to find 
any other tests that test lazy vacuum either...


There are several lazy vacuums in the regression suite:

sql/alter_table.sql:vacuum analyze atacc1(a);
sql/alter_table.sql:vacuum analyze atacc1(pg.dropped.1);
sql/hs_standby_disallowed.sql:VACUUM hs2;
sql/indirect_toast.sql:VACUUM FREEZE toasttest;
sql/indirect_toast.sql:VACUUM FREEZE toasttest;
sql/matview.sql:VACUUM ANALYZE hogeview;
sql/numeric_big.sql:VACUUM ANALYZE num_exp_add;
sql/numeric_big.sql:VACUUM ANALYZE num_exp_sub;
sql/numeric_big.sql:VACUUM ANALYZE num_exp_div;
sql/numeric_big.sql:VACUUM ANALYZE num_exp_mul;
sql/numeric_big.sql:VACUUM ANALYZE num_exp_sqrt;
sql/numeric_big.sql:VACUUM ANALYZE num_exp_ln;
sql/numeric_big.sql:VACUUM ANALYZE num_exp_log10;
sql/numeric_big.sql:VACUUM ANALYZE num_exp_power_10_ln;
sql/numeric.sql:VACUUM ANALYZE num_exp_add;
sql/numeric.sql:VACUUM ANALYZE num_exp_sub;
sql/numeric.sql:VACUUM ANALYZE num_exp_div;
sql/numeric.sql:VACUUM ANALYZE num_exp_mul;
sql/numeric.sql:VACUUM ANALYZE num_exp_sqrt;
sql/numeric.sql:VACUUM ANALYZE num_exp_ln;
sql/numeric.sql:VACUUM ANALYZE num_exp_log10;
sql/numeric.sql:VACUUM ANALYZE num_exp_power_10_ln;
sql/sanity_check.sql:VACUUM;
sql/without_oid.sql:VACUUM ANALYZE wi;
sql/without_oid.sql:VACUUM ANALYZE wo;

Most of those commands are there to analyze, rather than vacuum, but 
lazy vacuum is definitely exercised by the regression tests. I agree 
it's quite surprising that vacuum.sql doesn't actually perform any lazy 
vacuums.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-18 Thread Andres Freund
Hi Robert,

On 2014-02-17 20:31:34 -0500, Robert Haas wrote:
 1. How safe is it to try to do decoding inside of a regular backend?
 What we're doing here is entering a special mode where we forbid the
 use of regular snapshots in favor of requiring the use of decoding
 snapshots, and forbid access to non-catalog relations.  We then run
 through the decoding process; and then exit back into regular mode.
 On entering and on exiting this special mode, we
 InvalidateSystemCaches().  I don't see a big problem with having
 special backends (e.g. walsender) use this special mode, but I'm less
 convinced that it's wise to try to set things up so that we can switch
 back and forth between decoding mode and regular mode in a single
 backend.

The main reason the SQL interface exists is that it's awfully hard to
use isolationtester, pg_regress et al. when the output isn't also visible
via SQL. We tried hacking things in other ways, but that's what it came
down to. If you recall, the SQL changes interface was previously only in
a test_logical_decoding extension, because I wasn't sure it's all that
interesting for real use cases.
It's sure nice for testing things, though.
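Concretely, exercising the decoding machinery via SQL looks roughly like this (a sketch using the function names from the current patch series; they may change before commit):

```sql
-- Create a logical slot using the test_decoding output plugin.
SELECT * FROM pg_create_logical_replication_slot('regression_slot', 'test_decoding');

-- Do some work...
CREATE TABLE data(id serial PRIMARY KEY, payload text);
INSERT INTO data(payload) VALUES ('demo');

-- Fetch (and consume) the decoded changes through plain SQL,
-- which is what makes pg_regress/isolationtester testing possible.
SELECT * FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL);

-- Clean up.
SELECT pg_drop_replication_slot('regression_slot');
```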

 I worry that won't end up working out very cleanly, and I
 think the prohibition against using this special mode in an
 XID-bearing transaction is merely a small downpayment on future pain
 in this area.

That restriction is in principle only needed when creating the slot, not
when getting changes. The only problem is that some piece of code
doesn't know about it.

The reasons it exists are twofold: One is that when looking for an
initial snapshot, we wait for concurrent transactions to end. If we'd
wait for the transaction itself, we'd be in trouble; it could never
happen. The second reason is that the code does a XactLockTableWait() to
make it visible that it's waiting, so isolationtester knows it should
background the command. It's not good to wait on oneself.
But neither is actually needed when not creating the slot; the code just
needs to be told about that.

 That having been said, I can't pretend at this point
 either to understand the genesis of this particular restriction or
 what other problems are likely to crop up in trying to allow this
 mode-switching.  So it's possible that I'm overblowing it, but it's
 makin' me nervous.

I am not terribly concerned, but I can understand where you are coming
from. I think for replication solutions this isn't going to be needed,
but it's much more handy for testing and such.

 2. I think the snapshot-export code is fundamentally misdesigned.  As
 I said before, the idea that we're going to export one single snapshot
 at one particular point in time strikes me as extremely short-sighted.

I don't think so. It's precisely what you need to implement a simple
replication solution. Yes, there are use cases that could benefit from
more possibilities, but that's always the case.

  For example, consider one-to-many replication where clients may join
 or depart the replication group at any time.  Whenever somebody joins,
 we just want a (snapshot, LSN) pair such that they can apply all
 changes after the LSN except for XIDs that would have been visible to
 the snapshot.

And? They need to create individual replication slots, each of which will
get a snapshot.

 And in fact, we don't even need any special machinery
 for that; the client can just make a connection and *take a snapshot*
 once decoding is initialized enough.

No, they can't, for two reasons: For one, the commit order between snapshots
and WAL isn't necessarily the same. For another, clients would then need logic
to detect whether a transaction's contents have already been applied or
not yet; that's nontrivial.

 This code is going to great
 pains to be able to export a snapshot at the precise point when all
 transactions that were running in the first xl_running_xacts record
 seen after the start of decoding have ended, but there's nothing
 magical about that point, except that it's the first point at which a
 freshly-taken snapshot is guaranteed to be good enough to establish an
 initial state for any table in the database.

I still maintain that there's something magic about that moment. It's
when all *future* (from the POV of the snapshot) changes will be
streamed, and all *past* changes are included in the exported snapshot.
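The intended workflow at that moment is roughly the following (a sketch; the walsender-command spelling and the snapshot identifier are assumptions based on the proposed grammar):

```sql
-- On a replication connection: create the slot. Among other things,
-- the command reports the name of an exported snapshot.
CREATE_REPLICATION_SLOT myslot LOGICAL test_decoding;

-- On a regular connection: import that snapshot and copy the initial
-- data. Everything the snapshot saw is in the copy; everything
-- committed afterwards arrives via the change stream.
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION SNAPSHOT '000003A1-1';  -- hypothetical snapshot name
COPY mytable TO STDOUT;
COMMIT;
```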

 But do you really want to keep that snapshot around long enough to
 copy the entire database?  I bet you don't: if the database is big,
 holding back xmin for long enough to copy the whole thing isn't likely
 to be fun.

Well, that's how pg_dump works; it's not this patch's problem to fix
that.

 You might well want to copy one table at a time, with
 progressively newer snapshots, and apply to each table only those
 transactions that weren't part of the initial snapshot for that table.
  Many other patterns are possible.  What you've got baked in here
 right now is suitable only for the simplest imaginable case, and yet
 we're paying a 

Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-18 Thread Andres Freund
On 2014-02-17 21:10:26 -0500, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  1. How safe is it to try to do decoding inside of a regular backend?
  What we're doing here is entering a special mode where we forbid the
  use of regular snapshots in favor of requiring the use of decoding
  snapshots, and forbid access to non-catalog relations.  We then run
  through the decoding process; and then exit back into regular mode.
  On entering and on exiting this special mode, we
  InvalidateSystemCaches().
 
 How often is such a mode switch expected to happen?  I would expect
 frequent use of InvalidateSystemCaches() to be pretty much disastrous
 for performance, even absent any of the possible bugs you're worried
 about.  It would likely be better to design things so that a decoder
 backend does only that.

Very infrequently: when it's starting to decode, and when it's
ending. When used via walsender, that should only happen at connection
start/end, which surely shouldn't be frequent.
It's more frequent when using the SQL interface, but since that's not a
streaming interface, on a busy server there would still be a couple of
megabytes of transactions to decode per reset.

  3. As this feature is proposed, the only plugin we'll ship with 9.4 is
  a test_decoding plugin which, as its own documentation says, doesn't
  do anything especially useful.  What exactly do we gain by forcing
  users who want to make use of these new capabilities to write C code?
 
 TBH, if that's all we're going to ship, I'm going to vote against
 committing this patch to 9.4 at all.  Let it wait till 9.5 when we
 might be able to build something useful on it.

There *are* useful things around already. We didn't include postgres_fdw
in the same release as the FDW code either. I don't see why this should
be held to a different standard.

 To point out just
 one obvious problem, how much confidence can we have in the APIs
 being right if there are no usable clients?

Because there *are* clients. They just either don't sound likely to be
suitable for core code (too specialized) or have already been submitted
(the json plugin).

There's a whole replication suite built on top of this, to a good degree
just to test it. So I am fairly confident that the most important parts
are covered. There sure are additional features I want, but that's not
surprising.

 The most recent precedent I can think of is the FDW APIs, which I'd
 be the first to admit are still in flux.  But we didn't ship anything
 there without non-toy contrib modules to exercise it.  If we had,
 we'd certainly have regretted it, because in the creation of those
 contrib modules we found flaws in the initial design.

Which non-toy FDW was there? file_fdw was in 9.1, but that's a toy. And
*8.4* had CREATE FOREIGN DATA WRAPPER, without it doing anything...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-18 Thread Andres Freund
On 2014-02-17 18:49:59 -0800, Peter Geoghegan wrote:
 On Mon, Feb 17, 2014 at 6:35 PM, Robert Haas robertmh...@gmail.com wrote:
  What I actually suspect is going to happen if we ship this as-is is
  that people are going to start building logical replication solutions
  on top of the test_decoding module even though it explicitly says that
  it's just test code.  This is *really* cool technology and people are
  *hungry* for it.  But writing C is hard, so if there's not a polished
  plugin available, I bet people are going to try to use the
 not-polished one.  I think we should try to get out ahead of that.
 
 Tom made a comparison with FDWs, so I'll make another. The Multicorn
 module made FDW authorship much more accessible by wrapping it in a
 Python interface, I believe with some success. I don't want to stand
 in the way of building a fully-featured test_decoding module, but I
 think that those that would misuse test_decoding as it currently
 stands can be redirected to a third-party wrapper. As you say, it's
 pretty cool stuff, so it seems likely that someone will build one for
 us.

Absolutely. I *sure* hope somebody is going to build such an
abstraction. I am not entirely sure what it'd look like, but ...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-18 Thread Andres Freund
On 2014-02-17 21:35:23 -0500, Robert Haas wrote:
 What
 I don't understand is why we're not taking the test_decoding module,
 polishing it up a little to produce some nice, easily
 machine-parseable output, calling it basic_decoding, and shipping
 that.  Then people who want something else can build it, but people
 who are happy with something basic will already have it.

Because every project is going to need their own plugin
*anyway*. Londiste and Slony are surely going to ignore changes to relations
they don't need, query their own metadata, and want
compatibility with their earlier formats as far as possible. Sometime not
too far away they will want to optionally support binary output because
it's so much faster.
There's just not much chance that either of these will be able to agree
on a format short term.

So, possibly we could agree to something that consumers *outside* of
replication could use.

 What I actually suspect is going to happen if we ship this as-is is
 that people are going to start building logical replication solutions
 on top of the test_decoding module even though it explicitly says that
 it's just test code.  This is *really* cool technology and people are
 *hungry* for it.  But writing C is hard, so if there's not a polished
 plugin available, I bet people are going to try to use the
 not-polished one.  I think we should try to get out ahead of that.

I really hope there will be nicer ones by the time 9.4 is
released. Euler did send in a json plugin
http://archives.postgresql.org/message-id/52A5BFAE.1040209%2540timbira.com.br
, but there hasn't been much feedback yet. It's hard to start discussing
something that needs a couple of patches to pg before you can develop
your own patch...
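For context on how small such a plugin is, its skeleton against the proposed output-plugin API looks roughly like this (a sketch only; the committed callback names and signatures may differ from the patch series):

```c
#include "postgres.h"
#include "fmgr.h"
#include "replication/logical.h"
#include "replication/output_plugin.h"

PG_MODULE_MAGIC;

extern void _PG_output_plugin_init(OutputPluginCallbacks *cb);

/* Called once per row change inside a decoded transaction. */
static void
my_change(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
		  Relation relation, ReorderBufferChange *change)
{
	/* A real plugin would format the old/new tuple data here. */
	OutputPluginPrepareWrite(ctx, true);
	appendStringInfoString(ctx->out, "change");
	OutputPluginWrite(ctx, true);
}

void
_PG_output_plugin_init(OutputPluginCallbacks *cb)
{
	cb->change_cb = my_change;
	/* begin_cb, commit_cb, startup_cb, shutdown_cb are set similarly */
}
```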

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] inherit support for foreign tables

2014-02-18 Thread Etsuro Fujita
 From: Shigeru Hanada [mailto:shigeru.han...@gmail.com]

 2014-02-10 21:00 GMT+09:00 Etsuro Fujita fujita.ets...@lab.ntt.co.jp:
  (2014/02/07 21:31), Etsuro Fujita wrote:
  So, I've modified the patch so
  that we continue to disallow SET STORAGE on a foreign table *in the
  same manner as before*, but, as your patch does, allow it on an
  inheritance hierarchy that contains foreign tables, with the
  semantics that we quietly ignore the foreign tables and apply the
  operation to the plain tables, by modifying the ALTER TABLE simple
 recursion mechanism.
  Attached is the updated version of the patch.
 
 I'm not sure that ALTER TABLE against a parent table should affect
 descendants even when some of them are foreign tables.  I think the rule should
 be simple enough for users to understand; of course it should also be
 consistent and have backward compatibility.

Yeah, understandability is important.  But I think flexibility is also 
important.  In other words, I think it is a bit too inflexible to disallow, 
e.g., SET STORAGE on an inheritance tree that contains foreign 
table(s) just because we disallow SET STORAGE on foreign tables directly.

 If a foreign table can be modified through the inheritance tree, this kind of
 change can be done:
 
 1) create a foreign table as a child of an ordinary table
 2) run ALTER TABLE parent; the foreign table is also changed
 3) remove the foreign table from the inheritance tree by ALTER TABLE child NO
 INHERIT parent
 4) here we can't do the same thing as in 2), because it is not a child anymore.
 
 So IMO we should determine which ALTER TABLE features are allowed on foreign
 tables, and allow them regardless of the recursivity.

What I think should be newly allowed on foreign tables is:

* ADD table_constraint
* DROP CONSTRAINT
* [NO] INHERIT parent_table
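In other words, under this proposal things like the following would start to work (a sketch; ft1 and pt1 are hypothetical tables):

```sql
-- Attach/detach a foreign table to/from an inheritance tree.
ALTER FOREIGN TABLE ft1 INHERIT pt1;
ALTER FOREIGN TABLE ft1 NO INHERIT pt1;

-- Table constraints on a foreign table: not enforced locally, but a
-- declared guarantee about the remote data.
ALTER FOREIGN TABLE ft1 ADD CONSTRAINT ft1_a_positive CHECK (a > 0);
ALTER FOREIGN TABLE ft1 DROP CONSTRAINT ft1_a_positive;
```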

Thanks,

Best regards,
Etsuro Fujita





Re: [HACKERS] inherit support for foreign tables

2014-02-18 Thread Kyotaro HORIGUCHI
Hello,

 2014-02-10 21:00 GMT+09:00 Etsuro Fujita fujita.ets...@lab.ntt.co.jp:
  (2014/02/07 21:31), Etsuro Fujita wrote:
  So, I've modified the patch so
  that we continue to disallow SET STORAGE on a foreign table *in the same
  manner as before*, but, as your patch does, allow it on an inheritance
  hierarchy that contains foreign tables, with the semantics that we
  quietly ignore the foreign tables and apply the operation to the plain
  tables, by modifying the ALTER TABLE simple recursion mechanism.
  Attached is the updated version of the patch.
 
 I'm not sure that ALTER TABLE against a parent table should affect
 descendants even when some of them are foreign tables.  I think the rule
 should be simple enough for users to understand; of course it should
 also be consistent and have backward compatibility.

Could you think of any use cases in which we are happy with ALTER
TABLE's inheritance tree walking?  IMHO, ALTER FOREIGN TABLE
always comes with some change to the data source, so implicitly
invoking such commands should be turned off by default. If
ALTER'ing the whole family is dreadfully useful for certain
cases, it might be explicitly turned on by some syntax added to
CREATE FOREIGN TABLE or ALTER TABLE.

It would look like the following:

CREATE FOREIGN TABLE ft1 () INHERITS (pt1 ALLOW_INHERITED_ALTER, pt2);

ALTER TABLE INCLUDE FOREIGN CHILDREN parent ADD COLUMN add1 integer;

These look quite bad :-( but still seem better than
accidentally ALTER'ing foreign children.

If foreign tables were allowed to be ALTER'ed with 'ALTER TABLE',
some restructuring of 'ALTER TABLE' and 'ALTER FOREIGN
TABLE' would be needed.

 If a foreign table can be modified through the inheritance tree, this kind
 of change can be done:
 
 1) create a foreign table as a child of an ordinary table
 2) run ALTER TABLE parent; the foreign table is also changed
 3) remove the foreign table from the inheritance tree by ALTER TABLE child
 NO INHERIT parent
 4) here we can't do the same thing as in 2), because it is not a child anymore.
 
 So IMO we should determine which ALTER TABLE features are allowed on
 foreign tables, and allow them regardless of the recursivity.
 
 Comments?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] Turning recovery.conf into GUCs

2014-02-18 Thread Michael Paquier
This patch has been in Waiting for Author for a couple of weeks and has
received a review, at least from Andres, during this commit fest. As the
situation is not progressing much, I am going to mark it as Returned
with feedback.
If there are any problems with that please let me know.
Thanks,
-- 
Michael




Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-02-18 Thread KONDO Mitsumasa

(2014/02/17 21:44), Rajeev rastogi wrote:

It got compiled successfully on Windows.

Thank you for checking on Windows! It is very helpful for me.

Can we agree on adding all three statistics? I think stdev alone is difficult
for users to interpret, but if there are min and max, we can picture each statement's 
situation more easily. And I don't want to manage this feature only in my monitoring 
tool, pg_statsinfo, because it is beneficial for a lot of 
pg_stat_statements users and for improving Postgres performance in the future.


Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center





Re: [HACKERS] Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease

2014-02-18 Thread MauMau

From: Peter Geoghegan p...@heroku.com

You mentioned a hang during a B-Tree insert operation - do you happen
to have a backtrace that relates to that?


Sorry, I may have misunderstood.  The three stack traces I attached are not 
related to btree.  I recall that I saw one stack trace containing 
bt_insert(), but I'm not sure.


When the hang occurred, INSERT/UPDATE/COMMIT statements stopped for at least 
one hour, while SELECT statements ran smoothly.


Regards
MauMau





Re: [HACKERS] Do you know the reason for increased max latency due to xlog scaling?

2014-02-18 Thread MauMau

From: Andres Freund and...@2ndquadrant.com

On 2014-02-18 01:35:52 +0900, MauMau wrote:

For example, please see the max latencies of test set 2 (PG 9.3) and test
set 4 (xlog scaling with padding).  They are 207.359 and 1219.422
respectively.  The throughput is of course greatly improved, but I think
the response time should not be sacrificed, if at all possible.  There are
some users who are sensitive to max latency, such as stock exchanges and
online games.


You need to compare both at the same throughput to have any meaningful
comparison.


I'm sorry for my lack of understanding, but could you tell me why you think 
so?  When the user upgrades to 9.4 and runs the same workload, he would 
experience vastly increased max latency --- or in other words, greater 
variance in response times.  With my simple understanding, that sounds like 
a problem for response-sensitive users.



Regards
MauMau





Re: [HACKERS] Do you know the reason for increased max latency due to xlog scaling?

2014-02-18 Thread Andres Freund
On 2014-02-18 20:49:06 +0900, MauMau wrote:
 From: Andres Freund and...@2ndquadrant.com
 On 2014-02-18 01:35:52 +0900, MauMau wrote:
 For example, please see the max latencies of test set 2 (PG 9.3) and test
 set 4 (xlog scaling with padding).  They are 207.359 and 1219.422
 respectively.  The throughput is of course greatly improved, but I think
 the response time should not be sacrificed, if at all possible.  There are
 some users who are sensitive to max latency, such as stock exchanges and
 online games.
 
 You need to compare both at the same throughput to have any meaningful
 comparison.
 
 I'm sorry for my lack of understanding, but could you tell me why you think
 so?  When the user upgrades to 9.4 and runs the same workload, he would
 experience vastly increased max latency --- or in other words, greater
 variance in response times.

No, the existing data indicates no such thing. When they upgrade they
will have the *same* throughput as before. The datapoints you cite
indicate that there's an increase in latency, but it occurs while
processing several times as much data! The highest throughput of set 2 is
3223, while the highest for set 4 is 14145.
To get an interesting latency comparison you'd need to use pgbench --rate
with a rate *both* versions can sustain.
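In other words, something along these lines (a sketch; the client counts, duration and rate are placeholders, and --rate is new in the 9.4 pgbench):

```shell
# Run the same rate-limited workload against both builds and compare
# the reported latency distribution, not the peak-throughput numbers.
# Pick a rate well below the slower build's peak, e.g.:
pgbench -c 32 -j 32 -T 300 --rate=2500 --progress=10 bench
```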

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] patch: option --if-exists for pg_dump

2014-02-18 Thread Pavel Stehule
Hello


2014-02-17 22:14 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:

 Hello


 2014-02-17 18:10 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com:

 Jeevan Chalke wrote:

 I don't understand this code.  (Well, it's pg_dump.)  Or maybe I do
 understand it, and it's not doing what you think it's doing.  I mean, in
 this part:

 diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
 index 7fc0288..c08a0d3 100644
 --- a/src/bin/pg_dump/pg_backup_archiver.c
 +++ b/src/bin/pg_dump/pg_backup_archiver.c
 @@ -413,8 +413,84 @@ RestoreArchive(Archive *AHX)
  	/* Select owner and schema as necessary */
  	_becomeOwner(AH, te);
  	_selectOutputSchema(AH, te->namespace);
 -	/* Drop it */
 -	ahprintf(AH, "%s", te->dropStmt);
 +
 +	if (*te->dropStmt != '\0')
 +	{
 +		/* Inject IF EXISTS clause to DROP part when required. */
 +		if (ropt->if_exists)

 It does *not* modify te->dropStmt, it only sends ahprintf() a different
 version of what was stored (injecting the wanted IF EXISTS clause).  If
 that is correct, then why are we, in this other part, trying to remove
 the IF EXISTS clause?


 we should not modify te->dropStmt, because a DROP statement is produced
 only in this fragment. This additional logic ensures correct syntax for all
 variations of DROP.

 When I wrote this patch I had an initial problem with understanding the
 relation between pg_dump and pg_restore, and I pushed IF EXISTS to all
 related DROP statement producers. But I was wrong. All the drop statements
 are reparsed, transformed and serialized in this fragment, so only this
 fragment should be modified. The IF EXISTS clause can be injected earlier, when
 you read a plain text dump (produced by pg_dump --if-exists) in pg_restore.
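To make the effect concrete, the clean section of a plain dump changes roughly like this (a sketch; table name is hypothetical):

```sql
-- pg_dump --clean              emits:
DROP TABLE public.t;

-- pg_dump --clean --if-exists  emits:
DROP TABLE IF EXISTS public.t;
```

The second form is what pg_restore must be able to parse back when reading a plain-text dump.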


I was wrong - IF EXISTS was there because I generated DROP IF EXISTS
elsewhere in some very old stages of this patch. It is useless to check for
it there now.





 @@ -2942,9 +3018,39 @@ _getObjectDescription(PQExpBuffer buf, TocEntry *te, ArchiveHandle *AH)
  	strcmp(type, "OPERATOR CLASS") == 0 ||
  	strcmp(type, "OPERATOR FAMILY") == 0)
  	{
 -	/* Chop "DROP " off the front and make a modifiable copy */
 -	char	   *first = pg_strdup(te->dropStmt + 5);
 -	char	   *last;
 +	char	   *first;
 +	char	   *last;
 +
 +	/*
 +	 * Object description is based on the dropStmt statement, which may
 +	 * have an IF EXISTS clause.  Thus we need to update the offset such
 +	 * that it won't be included in the object description.
 +	 */

 Maybe I am mistaken and te->dropStmt already contains the IF EXISTS
 bit for some reason; but if so I don't know why that is.  Care to
 explain?


 pg_restore is able to read a plain dump produced by pg_dump
 --if-exists. That is how IF EXISTS can get into te->dropStmt.



 I also think that _getObjectDescription() becomes overworked after this
 patch.  I wonder if we should be storing te->objIdentity so that we can
 construct the ALTER OWNER command without going to as much trouble as
 parsing the DROP command.  Is there a way to do that? Maybe we can ask
 the server for the object identity, for example.  There is a new
 function to do that in 9.3 which perhaps we can now use.


 Do you mean a pg_describe_object function?

 Probably it is possible, but it's a significantly more invasive change;
 you would have to get the object identity, which is not trivial.

 Regards


It is ironic - this is dead code; it is not used now. So I removed it
from the patch.

Reduced, fixed patch attached + used tests

Regards

Pavel





 Pavel


 --
 Álvaro Herrera                http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



commit 937830b60920a8fac84cd2adb24c3d1b5280b036
Author: Pavel Stehule pavel.steh...@gooddata.com
Date:   Tue Feb 18 14:25:40 2014 +0100

reduced2

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 8d45f24..c39767b 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -650,6 +650,16 @@ PostgreSQL documentation
  </varlistentry>
 
  <varlistentry>
+  <term><option>--if-exists</option></term>
+  <listitem>
+   <para>
+    It uses conditional commands (with <literal>IF EXISTS</literal>
+    clause) for cleaning database schema.
+   </para>
+  </listitem>
+ </varlistentry>
+
+ <varlistentry>
   <term><option>--disable-dollar-quoting</option></term>
   <listitem>
    <para>
diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index 5c6a101..ba6583d 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ 

Re: [HACKERS] [bug fix] postgres.exe fails to start on Windows Server 2012 due to ASLR

2014-02-18 Thread Magnus Hagander
On Tue, Feb 4, 2014 at 12:57 PM, Craig Ringer cr...@2ndquadrant.com wrote:

 On 02/04/2014 07:28 PM, MauMau wrote:
 
 
  Please don't mind, I didn't misunderstand your intent.  I think we
  should apply this in the next minor release to avoid unnecessary
  confusion -- more new users would use PostgreSQL on Windows 8/2012 and
  hit this problem.
 
  I added this patch to the CommitFest.

 It's really a bugfix suitable for backpatching IMO.


Unfortunately we missed the releases that have just been wrapped.

I've applied this and backpatched to 9.3 and 9.2, which is as far back as
it goes cleanly.

In 9.1 the build system looked significantly different, which makes it
strange since the original report in this thread was about 9.1 but the
patch supplied clearly wasn't for that.

Does somebody want to look at backpatching this to 9.1 and earlier, or
should we just say that it's not fully supported on those Windows versions
unless you apply the registry workaround?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] [review] PostgreSQL Service on Windows does not start if data directory given is relative path

2014-02-18 Thread Magnus Hagander
On Mon, Feb 3, 2014 at 3:06 PM, MauMau maumau...@gmail.com wrote:

 From: Rajeev rastogi rajeev.rast...@huawei.com

  I will update commitFest with the latest patch immediately after sending
 the mail.


 OK, done setting the status to ready for committer.


We already have two different versions of make_absolute_path() in the tree
- one in src/backend/utils/init/miscinit.c and one in
src/test/regress/pg_regress.c.

I don't think we need a third one.

If we put it in port/ like this patch does, then we should make the other
callers use it. We might need one for frontend and one for backend (I
haven't looked into that in much detail), but the one used by
pg_regress and pg_ctl should definitely be shared.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] [bug fix] postgres.exe fails to start on Windows Server 2012 due to ASLR

2014-02-18 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 I've applied this and backpatched to 9.3 and 9.2, which is as far back as
 it goes cleanly.

 In 9.1 the build system looked significantly different, which makes it
 strange since the original report in this thread was about 9.1 but the
 patch supplied clearly wasn't for that.

 Does somebody want to look at backpatching this to 9.1 and earlier, or
 should we just say that it's not fully supported on those Windows versions
 unless you apply the registry workaround?

The question I think needs to be asked is not that, but "what about the
mingw and cygwin builds?".

regards, tom lane




Re: [HACKERS] inherit support for foreign tables

2014-02-18 Thread Tom Lane
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:
 Could you guess any use cases in which we are happy with ALTER
 TABLE's inheritance tree walking?  IMHO, ALTER FOREIGN TABLE
 always comes with some changes of the data source so implicitly
 invoking of such commands should be turned off by default. If the
 ALTER'ing the whole family is dreadfully useful for certain
 cases, it might be explicitly turned on by some syntax added to
 CREATE FOREIGN TABLE or ALTER TABLE.

 It would look like the following:

 CREATE FOREIGN TABLE ft1 () INHERITS (pt1 ALLOW_INHERITED_ALTER, pt2);

 ALTER TABLE INCLUDE FOREIGN CHILDREN parent ADD COLUMN add1 integer;

Ugh.  This adds a lot of logical complication without much real use,
IMO.  The problems that have been discussed in this thread are that
certain types of ALTER are sensible to apply to foreign tables and
others are not --- so how does a one-size-fits-all switch help that?

Also, there are some types of ALTER for which recursion to children
*must* occur or things become inconsistent --- ADD COLUMN itself is
an example, and ADD CONSTRAINT is another.  It would not be acceptable
for an ALLOW_INHERITED_ALTER switch to suppress that, but if the
switch is leaky then it's even less consistent and harder to explain.

regards, tom lane




Re: [HACKERS] [bug fix] postgres.exe fails to start on Windows Server 2012 due to ASLR

2014-02-18 Thread Magnus Hagander
On Tue, Feb 18, 2014 at 3:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Magnus Hagander mag...@hagander.net writes:
  I've applied this and backpatched to 9.3 and 9.2, which is as far back as
  it goes cleanly.

  In 9.1 the build system looked significantly different, which makes it
  strange since the original report in this thread was about 9.1 but the
  patch supplied clearly wasn't for that.

  Does somebody want to look at backpatching this to 9.1 and earlier, or
  should we just say that it's not fully supported on those Windows
 versions
  unless you apply the registry workaround?

 The question I think needs to be asked is not that, but "what about the
 mingw and cygwin builds?".



Cygwin has their own implementation of fork(). So the fix there has to be
done in cygwin and not us - presumably they have either already fixed it,
or will fix it in order to get general compatibility with the new ASLR
versions.

Mingw is a different story. But some googling indicates that mingw disables
ASLR by default. You need to specify --dynamicbase if you want it to be
dynamic, which we don't. (Being mingw, I've not managed to find any actual
docs, but for example
http://lists.cypherpunks.ca/pipermail/otr-dev/2012-August/001373.html seems
to indicate it has to be enabled).

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] OS X and ossp-uuid, next chapter

2014-02-18 Thread Robert Haas
On Sat, Feb 15, 2014 at 4:17 PM, Peter Eisentraut pete...@gmx.net wrote:
 We already know that the uuid-ossp extension doesn't build on OS X unless a
 small patch is applied.

 This has now gotten slightly worse after the Autoconf upgrade, because
 it will now fail if a header is present but cannot be compiled.
 (Previous versions would only warn.  This is part of a decade-long
 transition process Autoconf is doing.)

 So now you get:

 checking ossp/uuid.h usability... no
 checking ossp/uuid.h presence... no
 checking for ossp/uuid.h... no
 checking uuid.h usability... no
 checking uuid.h presence... yes
 configure: WARNING: uuid.h: present but cannot be compiled
 configure: WARNING: uuid.h: check for missing prerequisite headers?
 configure: WARNING: uuid.h: see the Autoconf documentation
 configure: WARNING: uuid.h: section Present But Cannot Be Compiled
 configure: WARNING: uuid.h: proceeding with the compiler's result
 configure: WARNING: ##  ##
 configure: WARNING: ## Report this to pgsql-b...@postgresql.org ##
 configure: WARNING: ##  ##
 checking for uuid.h... no
 configure: error: header file ossp/uuid.h or uuid.h is required for
 OSSP-UUID

 A possible patch is to hack up the uuid.h check to revert to the old
 behavior; see attached.

 I don't necessarily want to apply that, because it's an OS-specific hack
 and it doesn't even work by itself unless we also patch the place where
 the header is used (previously discussed).  But I'll put it on record
 here for future reporters and for the benefit of packagers.

Given commit e6170126fc201052b0ec5fc92177eb181d602d26, I think we
should just remove uuid-ossp from our tree.  Somebody can maintain it
on PGXN if they feel the urge.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [review] PostgreSQL Service on Windows does not start if data directory given is relative path

2014-02-18 Thread MauMau

From: Magnus Hagander mag...@hagander.net

We already have two different versions of make_absolute_path() in the tree
- one in src/backend/utils/init/miscinit.c and one in
src/test/regress/pg_regress.c.

I don't think we need a third one.

If we put it in port/ like this patch does, then we should make the other
callers use it. We might need one for frontend and one for backend (I
haven't looked into it in that much detail), but definitely the one between
pg_regress and pg_ctl should be shared.


Agreed.  Sorry, I should have proposed the refactoring.

Rajeev, could you refactor the code so that there is only one 
make_absolute_path()?  I think we can do like this:


1. Delete get_absolute_path() in src/port/path.c.
2. Delete make_absolute_path() in src/test/regress/pg_regress.c.
3. Move make_absolute_path() from src/backend/utils/init/miscinit.c to 
src/port/path.c, and delete the declaration in miscadmin.h.
4. Modify make_absolute_path() in path.c so that it can be used both in 
backend and frontend.  That is, surround the error message output with 
#ifdef FRONTEND ... #else ... #endif.  See some other source files in 
src/port.



Regards
MauMau






Re: [HACKERS] OS X and ossp-uuid, next chapter

2014-02-18 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sat, Feb 15, 2014 at 4:17 PM, Peter Eisentraut pete...@gmx.net wrote:
 We already know that the uuid-ossp extension doesn't build on OS X unless a
 small patch is applied.
 
 This has now gotten slightly worse after the Autoconf upgrade, because
 it will now fail if a header is present but cannot be compiled.

 Given commit e6170126fc201052b0ec5fc92177eb181d602d26, I think we
 should just remove uuid-ossp from our tree.  Somebody can maintain it
 on PGXN if they feel the urge.

I think that's premature, since that commit hasn't shipped yet.  Give
people a release or two to switch over.

I'd be good with putting a notice on uuid-ossp saying that it's deprecated
and will be removed soon.  Of course, our track record for enforcing such
things is bad, but we gotta start somewhere ...

regards, tom lane




Re: [HACKERS] Do you know the reason for increased max latency due to xlog scaling?

2014-02-18 Thread Jeff Janes
On Tue, Feb 18, 2014 at 3:49 AM, MauMau maumau...@gmail.com wrote:

 From: Andres Freund and...@2ndquadrant.com

 On 2014-02-18 01:35:52 +0900, MauMau wrote:

 For example, please see the max latencies of test set 2 (PG 9.3) and test
 set 4 (xlog scaling with padding).  They are 207.359 and 1219.422
 respectively.  The throughput is of course greatly improved, but I think
 the
 response time should not be sacrificed if at all possible.  There are
 some
 users who are sensitive to max latency, such as stock exchange and online
 games.


 You need to compare both at the same throughput to have any meaningful
 comparison.


 I'm sorry for my lack of understanding, but could you tell me why you
 think so?  When the user upgrades to 9.4 and runs the same workload, he
 would experience vastly increased max latency


The tests shown have not tested that.  The test is not running the same
workload on 9.4, but rather a vastly higher workload.  If we were to
throttle the workload in 9.4 (using pgbench's new -R, for example) to the
same level it was in 9.3, we probably would not see the max latency
increase.  But that was not tested, so we don't know for sure.



 --- or in other words, greater variance in response times.  With my simple
 understanding, that sounds like a problem for response-sensitive users.


If you need the throughput provided by 9.4, then using 9.3 gets lower
variance simply by refusing to do 80% of the assigned work.  If you don't
need the throughput provided by 9.4, then you probably have some natural
throttling in place.

If you want a real-world like test, you might try to crank up the -c and -j
to the limit in 9.3 in a vain effort to match 9.4's performance, and see
what that does to max latency.  (After all, that is what a naive web app is
likely to do--continue to make more and more connections as requests come
in faster than they can finish.)

Cheers,

Jeff


Re: [HACKERS] [bug fix] postgres.exe fails to start on Windows Server 2012 due to ASLR

2014-02-18 Thread MauMau

From: Magnus Hagander mag...@hagander.net

Unfortunately we missed the releases that have just been wrapped.


It's really unfortunate... I hope the next minor release will be soon.


I've applied this and backpatched to 9.3 and 9.2, which is as far back as
it goes cleanly.


Thank you.



In 9.1 the build system looked significantly different, which makes it
strange since the original report in this thread was about 9.1 but the
patch supplied clearly wasn't for that.

Does somebody want to look at backpatching this to 9.1 and earlier, or
should we just say that it's not fully supported on those Windows versions
unless you apply the registry workaround?


I'll try it at least for 9.1 in a few days when time permits.  The fix 
should be adding only one line in Project.pm.


Regards
MauMau





Re: [HACKERS] Do you know the reason for increased max latency due to xlog scaling?

2014-02-18 Thread Heikki Linnakangas

On 02/18/2014 06:27 PM, Jeff Janes wrote:

On Tue, Feb 18, 2014 at 3:49 AM, MauMau maumau...@gmail.com wrote:


--- or in other words, greater variance in response times.  With my simple
understanding, that sounds like a problem for response-sensitive users.


If you need the throughput provided by 9.4, then using 9.3 gets lower
variance simply by refusing to do 80% of the assigned work.  If you don't
need the throughput provided by 9.4, then you probably have some natural
throttling in place.

If you want a real-world like test, you might try to crank up the -c and -j
to the limit in 9.3 in a vain effort to match 9.4's performance, and see
what that does to max latency.  (After all, that is what a naive web app is
likely to do--continue to make more and more connections as requests come
in faster than they can finish.)


You're missing MauMau's point. In essence, he's comparing two systems 
with the same number of clients, issuing queries as fast as they can, 
and one can do 2000 TPS while the other one can do 10,000 TPS. You would 
expect the lower-throughput system to have a *higher* average latency. 
Each query takes longer, that's why the throughput is lower. If you look 
at the avg_latency columns in the graphs 
(http://hlinnaka.iki.fi/xloginsert-scaling/padding/), that's exactly 
what you see.


But what MauMau is pointing out is that the *max* latency is much higher 
in the system that can do 10,000 TPS. So some queries are taking much 
longer, even though on average the latency is lower. In an ideal, 
totally fair system, each query would take the same amount of time to 
execute, and after it's saturated, increasing the number of clients just 
makes that constant latency higher.


Yeah, I'm pretty sure that's because of the extra checkpoints. If you 
look at the individual test graphs, there are clear spikes in latency, 
but the latency is otherwise small. With a higher TPS, you reach 
checkpoint_segments quicker; I should've eliminated that effect in the 
tests I ran...


- Heikki




Re: [HACKERS] Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease

2014-02-18 Thread Heikki Linnakangas

On 02/17/2014 10:36 PM, Andres Freund wrote:

On 2014-02-17 22:30:54 +0200, Heikki Linnakangas wrote:

This is what I came up with. I like it, I didn't have to contort lwlocks as
much as I feared. I added one field to LWLock structure, which is used to
store the position of how far a WAL inserter has progressed. The LWLock code
calls it just "value", without caring what's stored in it, and it's used by
new functions LWLockWait and LWLockWakeup to implement the behavior the WAL
insertion slots have, to wake up other processes waiting for the slot
without releasing it.

This passes regression tests, but I'll have to re-run the performance tests
with this. One worry is that if the padded size of the LWLock struct is
smaller than a cache line, neighboring WAL insertion locks will compete for
the cache line. Another worry is that since I added a field to LWLock
struct, it might now take 64 bytes on platforms where it used to be 32 bytes
before. That wastes some memory.


Why don't you allocate them in a separate tranche, from xlog.c? Then you
can store them inside whatever bigger object you want, guaranteeing
exactly the alignment you need. possibly you even can have the extra
value in the enclosing object?


Good idea. New patch attached, doing that.

I'll try to find time on some multi-CPU hardware to performance test 
this against current master, to make sure there's no regression.


- Heikki
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 508970a..3eef968 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -86,7 +86,7 @@ int			sync_method = DEFAULT_SYNC_METHOD;
 int			wal_level = WAL_LEVEL_MINIMAL;
 int			CommitDelay = 0;	/* precommit delay in microseconds */
 int			CommitSiblings = 5; /* # concurrent xacts needed to sleep */
-int			num_xloginsert_slots = 8;
+int			num_xloginsert_locks = 8;
 
 #ifdef WAL_DEBUG
 bool		XLOG_DEBUG = false;
@@ -289,7 +289,7 @@ XLogRecPtr	XactLastRecEnd = InvalidXLogRecPtr;
  * (which is almost but not quite the same as a pointer to the most recent
  * CHECKPOINT record).	We update this from the shared-memory copy,
 * XLogCtl->Insert.RedoRecPtr, whenever we can safely do so (ie, when we
- * hold an insertion slot).  See XLogInsert for details.  We are also allowed
+ * hold an insertion lock).  See XLogInsert for details.  We are also allowed
 * to update from XLogCtl->RedoRecPtr if we hold the info_lck;
  * see GetRedoRecPtr.  A freshly spawned backend obtains the value during
  * InitXLOGAccess.
@@ -361,63 +361,45 @@ typedef struct XLogwrtResult
 	XLogRecPtr	Flush;			/* last byte + 1 flushed */
 } XLogwrtResult;
 
-
 /*
- * A slot for inserting to the WAL. This is similar to an LWLock, the main
- * difference is that there is an extra xlogInsertingAt field that is protected
- * by the same mutex. Unlike an LWLock, a slot can only be acquired in
- * exclusive mode.
- *
- * The xlogInsertingAt field is used to advertise to other processes how far
- * the slot owner has progressed in inserting the record. When a backend
- * acquires a slot, it initializes xlogInsertingAt to 1, because it doesn't
- * yet know where it's going to insert the record. That's conservative
- * but correct; the new insertion is certainly going to go to a byte position
- * greater than 1. If another backend needs to flush the WAL, it will have to
- * wait for the new insertion. xlogInsertingAt is updated after finishing the
- * insert or when crossing a page boundary, which will wake up anyone waiting
- * for it, whether the wait was necessary in the first place or not.
- *
- * A process can wait on a slot in two modes: LW_EXCLUSIVE or
- * LW_WAIT_UNTIL_FREE. LW_EXCLUSIVE works like in an lwlock; when the slot is
- * released, the first LW_EXCLUSIVE waiter in the queue is woken up. Processes
- * waiting in LW_WAIT_UNTIL_FREE mode are woken up whenever the slot is
- * released, or xlogInsertingAt is updated. In other words, a process in
- * LW_WAIT_UNTIL_FREE mode is woken up whenever the inserter makes any progress
- * copying the record in place. LW_WAIT_UNTIL_FREE waiters are always added to
- * the front of the queue, while LW_EXCLUSIVE waiters are appended to the end.
- *
- * To join the wait queue, a process must set MyProc->lwWaitMode to the mode
- * it wants to wait in, MyProc->lwWaiting to true, and link MyProc to the head
- * or tail of the wait queue. The same mechanism is used to wait on an LWLock,
- * see lwlock.c for details.
+ * Inserting to WAL is protected by a bunch of WALInsertLocks. Each WAL
+ * insertion lock consists of a lightweight lock, plus an indicator of how
+ * far the insertion has progressed (insertingAt).
+ *
+ * The insertingAt value is used when writing the WAL to disk, to avoid
+ * waiting unnecessarily for an insertion that's still in-progress, but has
+ * already finished inserting all WAL beyond the point you're going to write
+ * the WAL up to. This isn't just to optimize, it's necessary to 

[HACKERS] Re: [COMMITTERS] pgsql: Add a GUC to report whether data page checksums are enabled.

2014-02-18 Thread Alvaro Herrera
Heikki Linnakangas wrote:
 Add a GUC to report whether data page checksums are enabled.

Is there a reason this wasn't back-patched to 9.3?  I think it should
be.



-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Re: [COMMITTERS] pgsql: Add a GUC to report whether data page checksums are enabled.

2014-02-18 Thread Peter Geoghegan
On Tue, Feb 18, 2014 at 11:39 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Is there a reason this wasn't back-patched to 9.3?  I think it should
 be.

+1.


-- 
Peter Geoghegan




[HACKERS] Re: [COMMITTERS] pgsql: Add a GUC to report whether data page checksums are enabled.

2014-02-18 Thread Heikki Linnakangas

On 02/18/2014 09:39 PM, Alvaro Herrera wrote:

Heikki Linnakangas wrote:

Add a GUC to report whether data page checksums are enabled.


Is there a reason this wasn't back-patched to 9.3?  I think it should
be.


I considered it a new feature, so not back-patching was the default. If 
you want to back-patch it, I won't object.


- Heikki




[HACKERS] Re: [COMMITTERS] pgsql: Add a GUC to report whether data page checksums are enabled.

2014-02-18 Thread Andres Freund
On 2014-02-18 22:23:59 +0200, Heikki Linnakangas wrote:
 On 02/18/2014 09:39 PM, Alvaro Herrera wrote:
 Heikki Linnakangas wrote:
 Add a GUC to report whether data page checksums are enabled.
 
 Is there a reason this wasn't back-patched to 9.3?  I think it should
 be.

Thirded.

 I considered it a new feature, so not back-patching was the default. If you
 want to back-patch it, I won't object.

Imo it's essentially a simple oversight in the checksum patch...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Do you know the reason for increased max latency due to xlog scaling?

2014-02-18 Thread Andres Freund
On 2014-02-18 19:12:32 +0200, Heikki Linnakangas wrote:
 You're missing MauMau's point. In essence, he's comparing two systems with
 the same number of clients, issuing queries as fast as they can, and one can
 do 2000 TPS while the other one can do 10,000 TPS. You would expect the
 lower-throughput system to have a *higher* average latency.
 Each query takes
 longer, that's why the throughput is lower. If you look at the avg_latency
 columns in the graphs (http://hlinnaka.iki.fi/xloginsert-scaling/padding/),
 that's exactly what you see.
 
 But what MauMau is pointing out is that the *max* latency is much higher in
 the system that can do 10,000 TPS. So some queries are taking much longer,
 even though on average the latency is lower. In an ideal, totally fair
 system, each query would take the same amount of time to execute, and after
 it's saturated, increasing the number of clients just makes that constant
 latency higher.

Consider me being enthusiastically unenthusiastic about that fact. The
change in throughput still makes this pretty uninteresting. There's so
many things that are influenced by a factor 5 increase in throughput,
that a change in max latency is really not saying much.
There's also the point that with 5 times the throughput it's getting
more likely to sleep while holding critical locks and such.

 Yeah, I'm pretty sure that's because of the extra checkpoints. If you look
 at the individual test graphs, there are clear spikes in latency, but the
 latency is otherwise small. With a higher TPS, you reach checkpoint_segments
 quicker; I should've eliminated that effect in the tests I ran...

I don't think that'd be a good idea. The number of full page writes so
greatly influences the WAL characteristics, that changing checkpoint
segments would make the tests much harder to compare.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] CREATE FOREIGN TABLE ( ... LIKE ... )

2014-02-18 Thread Andres Freund
On 2014-02-18 08:35:35 +0900, Michael Paquier wrote:
 On Tue, Feb 18, 2014 at 7:22 AM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-02-17 23:07:45 +0900, Michael Paquier wrote:
  On Mon, Feb 17, 2014 at 6:28 PM, Andres Freund and...@2ndquadrant.com 
  wrote:
   I don't think this really has gone above Needs Review yet.
  I am not sure that this remark makes the review of this patch
  progress much :(
 
  By the way, I spent some time looking at it and here are some
  comments:
 
  David just pinged me and tricked me into having a quick look :)
 
  Unless I miss something this possibly allows column definitions to slip
  by that shouldn't because normally all fdw column definitions are passed
  through transformColumnDefinition() which does some checks, but the
  copied ones aren't.
  I haven't looked long enough to see whether that's currently
  problematic, but even if not, it's sure a trap waiting to spring.

 transformColumnDefinition contains checks about serial and constraints
 mainly. The only thing that could be problematic IMO is the process
 done exclusively for foreign tables which is the creation of some
 ALTER FOREIGN TABLE ALTER COLUMN commands when per-column options are
 detected, something that is not passed to a like'd table with this
 patch. This may merit a comment in the code.

As I said, I am not all that concerned that it's a big problem today,
but imo it's an accident waiting to happen.

I rather wonder if the code shouldn't just ensure it's running
transformTableLikeClause() before transformColumnDefinition() by doing
it in a separate loop.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Do you know the reason for increased max latency due to xlog scaling?

2014-02-18 Thread Heikki Linnakangas

On 02/18/2014 10:51 PM, Andres Freund wrote:

On 2014-02-18 19:12:32 +0200, Heikki Linnakangas wrote:

Yeah, I'm pretty sure that's because of the extra checkpoints. If you look
at the individual test graphs, there are clear spikes in latency, but the
latency is otherwise small. With a higher TPS, you reach checkpoint_segments
quicker; I should've eliminated that effect in the tests I ran...


I don't think that'd be a good idea. The number of full page writes so
greatly influences the WAL characteristics, that changing checkpoint
segments would make the tests much harder to compare.


I was just thinking of bumping up checkpoint_segments so high that there 
are no checkpoints during any of the tests.
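
For the record, pushing checkpoints entirely outside a short benchmark window
on 9.3/9.4-era settings amounts to something like this (values purely
illustrative, not a recommendation for production):

```ini
# postgresql.conf -- illustrative values only, to keep size- and
# time-triggered checkpoints out of a short benchmark run
checkpoint_segments = 1000      # ~16 GB of WAL before a size-triggered checkpoint
checkpoint_timeout = 1h         # the maximum allowed in these releases
```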


- Heikki




Re: [HACKERS] Patch: show xid and xmin in pg_stat_activity and pg_stat_replication

2014-02-18 Thread Andres Freund
On 2014-02-17 10:44:41 +0100, Christian Kruse wrote:
 This is true for now. But one of the purposes of using
 LocalPgBackendStatus instead of PgBackendStatus was to be able to add
 more fields like this in future. And thus we might need to change this
 in future, so why not do it now?

I don't think that argument is particularly valid. All (?) the other
functions return just one value, so they won't ever have the need
to return more.

Not really sure which way is better.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Do you know the reason for increased max latency due to xlog scaling?

2014-02-18 Thread Andres Freund
On 2014-02-18 23:01:08 +0200, Heikki Linnakangas wrote:
 On 02/18/2014 10:51 PM, Andres Freund wrote:
 On 2014-02-18 19:12:32 +0200, Heikki Linnakangas wrote:
 Yeah, I'm pretty sure that's because of the extra checkpoints. If you look
 at the individual test graphs, there are clear spikes in latency, but the
 latency is otherwise small. With a higher TPS, you reach checkpoint_segments
 quicker; I should've eliminated that effect in the tests I ran...
 
 I don't think that'd be a good idea. The number of full page writes so
 greatly influences the WAL characteristics, that changing checkpoint
 segments would make the tests much harder to compare.
 
 I was just thinking of bumping up checkpoint_segments so high that there are
 no checkpoints during any of the tests.

Hm. I actually think that full page writes are an interesting part of
this because they are so significantly differently sized than normal
records.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Do you know the reason for increased max latency due to xlog scaling?

2014-02-18 Thread Jeff Janes
On Tue, Feb 18, 2014 at 9:12 AM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 On 02/18/2014 06:27 PM, Jeff Janes wrote:

 On Tue, Feb 18, 2014 at 3:49 AM, MauMau maumau...@gmail.com wrote:

  --- or in other words, greater variance in response times.  With my
 simple
 understanding, that sounds like a problem for response-sensitive users.


 If you need the throughput provided by 9.4, then using 9.3 gets lower
 variance simply by refusing to do 80% of the assigned work.  If you don't
 need the throughput provided by 9.4, then you probably have some natural
 throttling in place.

 If you want a real-world like test, you might try to crank up the -c and
 -j
 to the limit in 9.3 in a vain effort to match 9.4's performance, and see
 what that does to max latency.  (After all, that is what a naive web app
 is
 likely to do--continue to make more and more connections as requests come
 in faster than they can finish.)


 You're missing MauMau's point. In essence, he's comparing two systems with
 the same number of clients, issuing queries as fast as they can, and one
 can do 2000 TPS while the other one can do 10,000 TPS. You would expect the
 lower-throughput system to have a *higher* average latency. Each query
 takes longer, that's why the throughput is lower. If you look at the
 avg_latency columns in the graphs (http://hlinnaka.iki.fi/
 xloginsert-scaling/padding/), that's exactly what you see.

 But what MauMau is pointing out is that the *max* latency is much higher
 in the system that can do 10,000 TPS. So some queries are taking much
 longer, even though on average the latency is lower. In an ideal, totally
 fair system, each query would take the same amount of time to execute, and
 after it's saturated, increasing the number of clients just makes that
 constant latency higher.


I thought that this was the point I was making, not the point I was
missing.  You have the same hard drives you had before, but now due to a
software improvement you are cramming 5 times more stuff through them.
 Yeah, you will get bigger latency spikes.  Why wouldn't you?  You are now
beating the snot out of your hard drives, whereas before you were not.

If you need 10,000 TPS, then you need to upgrade to 9.4.  If you need it
with low maximum latency as well, then you probably need to get better IO
hardware as well (maybe not--maybe more tuning could help).  With 9.3 you
didn't need better IO hardware, because you weren't capable of maxing out
what you already had.  With 9.4 you can max it out, and this is a good
thing.

If you need 10,000 TPS but only 2000 TPS are completing under 9.3, then
what is happening to the other 8000 TPS? Whatever is happening to them, it
must be worse than a latency spike.

On the other hand, if you don't need 10,000 TPS, then measuring max latency
at 10,000 TPS is the wrong thing to measure.

Cheers,

Jeff


Re: [HACKERS] [BUGS] BUG #9210: PostgreSQL string store bug? not enforce check with correct characterSET/encoding

2014-02-18 Thread Tom Lane
I wrote:
 dig...@126.com writes:
 select t, t::bytea from convert_from('\xeec1', 'sql_ascii') as g(t);
 [ fails to check that string is valid in database encoding ]

 Hm, yeah.  Normal input to the database goes through pg_any_to_server(),
 which will apply a validation step if the source encoding is SQL_ASCII
 and the destination encoding is something else.  However, pg_convert and
 some other places call pg_do_encoding_conversion() directly, and that
 function will just quietly do nothing if either encoding is SQL_ASCII.

 The minimum-refactoring solution to this would be to tweak
 pg_do_encoding_conversion() so that if the src_encoding is SQL_ASCII but
 the dest_encoding isn't, it does pg_verify_mbstr() rather than nothing.

 I'm not sure if this would break anything we need to have work,
 though.  Thoughts?  Do we want to back-patch such a change?

I looked through all the callers of pg_do_encoding_conversion(), and
AFAICS this change is a good idea.  There are a whole bunch of places
that use pg_do_encoding_conversion() to convert from the database encoding
to encoding X (most usually UTF8), and right now if you do that in a
SQL_ASCII database you have no assurance whatever that what is produced
is actually valid in encoding X.  I think we need to close that loophole.
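
For illustration, a minimal sketch of the loophole, in the spirit of the
original bug report (the byte sequence and database encoding are the
reporter's example, run in a UTF8 database):

```sql
-- \xeec1 is not valid UTF-8, yet the SQL_ASCII-to-database-encoding
-- path currently passes it through without any validation:
SELECT t, t::bytea
FROM convert_from('\xeec1', 'sql_ascii') AS g(t);

-- With the proposed tweak, the conversion would run pg_verify_mbstr()
-- on the input and raise an invalid-byte-sequence error instead of
-- quietly producing a string that is bogus in the database encoding.
```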

I found one place --- utf_u2e() in plperl_helpers.h --- that is aware of
the lack of checking and forces a pg_verify_mbstr call for itself; but
it apparently is concerned about whether the source data is actually utf8
in the first place, which I think is not really
pg_do_encoding_conversion's bailiwick.  I'm okay with
pg_do_encoding_conversion being a no-op if src_encoding == dest_encoding.

Barring objections, I will fix and back-patch this.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #9210: PostgreSQL string store bug? not enforce check with correct characterSET/encoding

2014-02-18 Thread Tom Lane
I wrote:
 I looked through all the callers of pg_do_encoding_conversion(), and
 AFAICS this change is a good idea.  There are a whole bunch of places
 that use pg_do_encoding_conversion() to convert from the database encoding
 to encoding X (most usually UTF8), and right now if you do that in a
 SQL_ASCII database you have no assurance whatever that what is produced
 is actually valid in encoding X.  I think we need to close that loophole.

BTW, a second look confirms that all but one or two of the callers of
pg_do_encoding_conversion() are in fact converting either to or from
the database encoding.  That probably means they ought to be using
pg_any_to_server or pg_server_to_any instead, so that they can take
advantage of the pre-cached default conversion in the not-unlikely
situation that the other encoding is the current client_encoding.

This would imply that we ought to put a validate-if-source-is-SQL_ASCII
step into pg_server_to_any() as well, since that function currently
short-circuits calling pg_do_encoding_conversion() in that case.

regards, tom lane




Re: [HACKERS] inherit support for foreign tables

2014-02-18 Thread Kyotaro HORIGUCHI
Hello,

At Tue, 18 Feb 2014 09:49:23 -0500, Tom Lane wrote
 Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:
  Could you guess any use case in which we are happy with ALTER
  TABLE's inheritance tree walking?  IMHO, ALTER FOREIGN TABLE
  always comes with some change of the data source, so implicit
  invocation of such commands should be turned off by default. If
  ALTER'ing the whole family is dreadfully useful for certain
  cases, it might be explicitly turned on by some syntax added to
  CREATE FOREIGN TABLE or ALTER TABLE.
 
  It would look like the following,
 
  CREATE FOREIGN TABLE ft1 () INHERITS (pt1 ALLOW_INHERITED_ALTER, pt2);
 
  ALTER TABLE INCLUDE FOREIGN CHILDREN parent ADD COLUMN add1 integer;
 
 Ugh.  This adds a lot of logical complication without much real use,
 IMO.

Yeah! Those are exactly the words I wanted for the examples. Sorry
for bothering you. I also don't see any use case driving us to
overcome the extra complexity.

  The problems that have been discussed in this thread are that
 certain types of ALTER are sensible to apply to foreign tables and
 others are not --- so how does a one-size-fits-all switch help that?

None, I think. I intended only to show what this ALTER inheritance
walkthrough would bring in.

 Also, there are some types of ALTER for which recursion to children
 *must* occur or things become inconsistent --- ADD COLUMN itself is
 an example, and ADD CONSTRAINT is another.  It would not be acceptable
 for an ALLOW_INHERITED_ALTER switch to suppress that, but if the
 switch is leaky then it's even less consistent and harder to explain.

Exactly. That is the problem that comes with allowing implicit ALTER
on foreign children, which Shigeru's last message seems to refer to.

At Tue, 18 Feb 2014 14:47:51 +0900, Shigeru Hanada wrote
 I'm not sure that ALTER TABLE against a parent table should affect
 descendants even when some of them are foreign tables.

To avoid misunderstanding: my opinion on whether ALTER should
apply to foreign children is 'no'.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-18 Thread Hiroshi Inoue
(2014/02/12 15:31), Inoue, Hiroshi wrote:
 (2014/02/12 3:03), Tom Lane wrote:
 Hiroshi Inoue in...@tpf.co.jp writes:
 (2014/02/09 8:06), Andrew Dunstan wrote:
 Yeah. Incidentally, we didn't quite get rid of dlltool for Cygwin. We
 did get rid of dllwrap. But I agree this is worth trying for Mingw.

 I tried MINGW port with the attached change and successfully built
 src and contrib and all parallel regression tests were OK.

 I cleaned this up a bit (the if-nesting in Makefile.shlib was making
 my head hurt, not to mention that it left a bunch of dead code) and
 committed it.
 
 Thanks.
 
 By my count, the only remaining usage of dlltool is in plpython's
 Makefile.  Can we get rid of that?
 
 Maybe this is one of the few use cases of dlltool.
 Because python doesn't ship with its MINGW import library, the
 Makefile uses dlltool to generate an import library from the python
 DLL.
 
 Also, the only remaining usage of dllwrap is in src/bin/pgevent/Makefile.
 Do we need that either?
 
 Maybe this can be removed.
 I would make a patch later.

Sorry for the late reply.
Attached is a patch to remove dllwrap from pgevent/Makefile.

regards,
Hiroshi Inoue



diff --git a/src/bin/pgevent/Makefile b/src/bin/pgevent/Makefile
index 1d90276..0a7c16d 100644
--- a/src/bin/pgevent/Makefile
+++ b/src/bin/pgevent/Makefile
@@ -17,30 +17,21 @@ include $(top_builddir)/src/Makefile.global
 ifeq ($(PORTNAME), win32)
 
 OBJS=pgevent.o pgmsgevent.o
-NAME=pgevent.dll
+NAME=pgevent
+SHLIB_LINK = 
+SHLIB_EXPORTS = exports.txt
 
-all: $(NAME)
+all: all-lib
 
-install: all install-lib
+include $(top_srcdir)/src/Makefile.shlib
 
-pgevent.dll: pgevent.def $(OBJS)
-	$(DLLWRAP) --def $< -o $(NAME) $(OBJS)
+install: all install-lib
 
 pgmsgevent.o: pgmsgevent.rc win32ver.rc
	$(WINDRES) $< -o $@ --include-dir=$(top_builddir)/src/include --include-dir=$(top_srcdir)/src/include --include-dir=$(srcdir) --include-dir=.
 
-all-lib: $(NAME)
-
-install-lib: $(NAME)
-	$(INSTALL_STLIB) $< '$(DESTDIR)$(libdir)/$<'
-
-uninstall-lib:
-	rm -f '$(DESTDIR)$(libdir)/$(NAME)'
-
-clean distclean:
-	rm -f $(OBJS) $(NAME) win32ver.rc
 
-clean-lib:
-	rm -f $(NAME)
+clean distclean: clean-lib
+	rm -f $(OBJS) win32ver.rc $(DLL_DEFFILE)
 
 endif
diff --git a/src/bin/pgevent/exports.txt b/src/bin/pgevent/exports.txt
new file mode 100644
index 000..70dcd25
--- /dev/null
+++ b/src/bin/pgevent/exports.txt
@@ -0,0 +1,5 @@
+; dlltool --output-def pgevent.def pgevent.o pgmsgevent.o
+EXPORTS
+	DllUnregisterServer@0 ;
+	DllRegisterServer@0 ;
+	DllInstall ;



[HACKERS] Is anyone aware of data loss causing MultiXact bugs in 9.3.2?

2014-02-18 Thread Peter Geoghegan
I've had multiple complaints of apparent data loss on 9.3.2 customer
databases. There are 2 total, both complaints from the past week, one
of which I was able to confirm. The customer's complaint is that
certain rows are either visible or invisible, depending on whether an
index scan is used or a sequential scan (I confirmed this with an
EXPLAIN ANALYZE). The expectation of the customer is that the rows
should always be visible, because he didn't delete them. These are
trivial queries on single trivial tables, but they have foreign keys.

It appears from our internal records that the database that I
confirmed the issue on has always been on 9.3.2 (the heap files have
only been touched by 9.3.2 binaries, although there is more than one
timeline). I cannot swear that this was never on an earlier point
release of 9.3, but it is considered quite unlikely. When I ran a
manual VACUUM FREEZE, I could no longer reproduce the issue (i.e. the
index scan and sequential scan both subsequently indicated that the
row of interest was not visible, and so are at least in agreement). To
me this suggests a problem with MultiXacts. However, I was under the
impression that the 9.3.3 fix "Rework tuple freezing protocol" addressed
an issue that could not manifest in such a way, and so it isn't
obvious to me that this is a known bug. A reading of the 9.3.3 release
notes offers no obvious clues as to what the problem might be. Apart
from the freezing rework, and the "Account for remote row locks
propagated by local updates" item, nothing jumps out, and it isn't
obvious to me how even the most pertinent two items from *.3 might
relate to this. Have I missed something? Is this likely to be
worth debugging further, to determine if there is an undiscovered bug?
If so, I'm sure I can recover a copy of the data as it was before and
reproduce the problem.

I think that since this database was (very probably) always on 9.3.2, we
would not have run the vacuum/vacuum_freeze_table_age amelioration
recommended for those upgrading to that point release (described in the
*.2 release notes). On this database, as of right now:

 txid_current
──
  6242282
(1 row)


-- 
Peter Geoghegan




Re: [HACKERS] Is anyone aware of data loss causing MultiXact bugs in 9.3.2?

2014-02-18 Thread Alvaro Herrera
Peter Geoghegan wrote:
 I've had multiple complaints of apparent data loss on 9.3.2 customer
 databases. There are 2 total, both complaints from the past week, one
 of which I was able to confirm. The customer's complaint is that
 certain rows are either visible or invisible, depending on whether an
 index scan is used or a sequential scan (I confirmed this with an
 EXPLAIN ANALYZE).

The multixact bugs would cause tuples to be hidden at the heap level.
If the tuples are visible in a seqscan, then these are more likely to be
related to index problems, not a multixact problem.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Is anyone aware of data loss causing MultiXact bugs in 9.3.2?

2014-02-18 Thread Peter Geoghegan
On Tue, Feb 18, 2014 at 5:50 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 The multixact bugs would cause tuples to be hidden at the heap level.
 If the tuples are visible in a seqscan, then these are more likely to be
 related to index problems, not multixact problem.

That was my first suspicion, but then re-indexing didn't help, while
VACUUM FREEZE had the immediate effect of making both plans give a
consistent answer.
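
For reference, a sketch of the kind of cross-check used here; the table
and key value are hypothetical, but the planner GUCs are the standard
ones for forcing each plan type:

```sql
-- Force an index scan:
SET enable_seqscan = off;
EXPLAIN ANALYZE SELECT * FROM some_table WHERE id = 42;

-- Force a sequential scan:
RESET enable_seqscan;
SET enable_indexscan = off;
SET enable_bitmapscan = off;
EXPLAIN ANALYZE SELECT * FROM some_table WHERE id = 42;

-- On the affected database the two plans disagreed about whether the
-- row was visible until VACUUM FREEZE was run.
```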

-- 
Peter Geoghegan




Re: [HACKERS] Is anyone aware of data loss causing MultiXact bugs in 9.3.2?

2014-02-18 Thread Peter Geoghegan
On Tue, Feb 18, 2014 at 5:56 PM, Peter Geoghegan p...@heroku.com wrote:
 That was my first suspicion, but then re-indexing didn't help, while
 VACUUM FREEZE had the immediate effect of making both plans give a
 consistent answer.

I should add that this is an unremarkable int4 primary key (which was
dropped and recreated by the customer). There is no obvious reason why
we should see wrong answers that are attributable to application code
or environment.


-- 
Peter Geoghegan




Re: [HACKERS] Is anyone aware of data loss causing MultiXact bugs in 9.3.2?

2014-02-18 Thread Peter Geoghegan
On Tue, Feb 18, 2014 at 5:50 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Peter Geoghegan wrote:
 I've had multiple complaints of apparent data loss on 9.3.2 customer
 databases. There are 2 total, both complaints from the past week, one
 of which I was able to confirm. The customer's complaint is that
 certain rows are either visible or invisible, depending on whether an
 index scan is used or a sequential scan (I confirmed this with an
 EXPLAIN ANALYZE).

 The multixact bugs would cause tuples to be hidden at the heap level.
 If the tuples are visible in a seqscan, then these are more likely to be
 related to index problems, not multixact problem.

I guess I wasn't clear enough here: The row in question was visible
with a sequential scan but *not* with an index scan. So you have it
backwards here (understandably).


-- 
Peter Geoghegan




Re: [HACKERS] inherit support for foreign tables

2014-02-18 Thread Kyotaro HORIGUCHI
Hello,

At Tue, 18 Feb 2014 19:24:50 +0900, Etsuro Fujita wrote
  From: Shigeru Hanada [mailto:shigeru.han...@gmail.com]
  2014-02-10 21:00 GMT+09:00 Etsuro Fujita fujita.ets...@lab.ntt.co.jp:
   (2014/02/07 21:31), Etsuro Fujita wrote:
   So, I've modified the patch so
   that we continue to disallow SET STORAGE on a foreign table *in the
   same manner as before*, but, as your patch does, allow it on an
   inheritance hierarchy that contains foreign tables, with the
   semantics that we quietly ignore the foreign tables and apply the
   operation to the plain tables, by modifying the ALTER TABLE simple
  recursion mechanism.
   Attached is the updated version of the patch.
  
  I'm not sure that ALTER TABLE against a parent table should affect
  descendants even when some of them are foreign tables.  I think the rule
  should be simple enough for users to understand; of course it should also
  be consistent and backward compatible.
 
 Yeah, the understandability is important.  But I think the
 flexibility is also important.  In other words, I think it is a
 bit too inflexible that we disallow e.g., SET STORAGE to be set
 on an inheritance tree that contains foreign table(s) because
 we disallow SET STORAGE to be set on foreign tables directly.

What use case needs such flexibility badly enough to outweigh the
lost predictability of ALTER TABLE's behavior and/or code
maintainability? I don't agree with the idea that ALTER TABLE
implicitly affects foreign children, for the reasons given
upthread. A feature turned on/off through special syntax also
seems to have little hope.

If you still want that, I suppose ALTER FOREIGN TABLE could be
changed so as to walk through the inheritance tree and affect
only the foreign tables along the way. That would naturally affect
foreign tables with no unanticipated behavior.

Thoughts?

  If foreign table can be modified through inheritance tree, this kind of
  change can be done.
  
  1) create a foreign table as a child of an ordinary table
  2) run ALTER TABLE parent, the foreign table is also changed
  3) remove foreign table from the inheritance tree by ALTER TABLE child NO
  INHERIT parent
  4) here we can't do the same thing as in 2), because it is not a child anymore.
  
  So IMO we should determine which ALTER TABLE features are allowed for foreign
  tables, and allow them regardless of recursion.
 
 What I think should be newly allowed to be set on foreign tables is
 
 * ADD table_constraint
 * DROP CONSTRAINT

As of 9.3

http://www.postgresql.org/docs/9.3/static/sql-alterforeigntable.html

 Consistency with the foreign server is not checked when a
 column is added or removed with ADD COLUMN or DROP COLUMN, a
 NOT NULL constraint is added, or a column type is changed with
 SET DATA TYPE. It is the user's responsibility to ensure that
 the table definition matches the remote side.

So I believe implicit and automatic application of any constraint
to foreign children is considerably dangerous.


 * [NO] INHERIT parent_table

Is this usable for foreign children in an inheritance tree? NO INHERIT
removes all foreign children, but INHERIT is a no-op.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-02-18 Thread Amit Kapila
On Thu, Jan 2, 2014 at 4:38 PM, Christian Kruse
christ...@2ndquadrant.com wrote:
 Hi,

 On 02/01/14 10:02, Andres Freund wrote:
  Christian's idea of a context line seems plausible to me.  I don't
  care for this implementation too much --- a global variable?  Ick.

 Yea, the data should be stored in ErrorContextCallback.arg instead.

 Fixed.

I have looked into this patch. Below are some points which I
think should be improved in this patch:

1. The new context added by this patch is displayed in the wrong place
Session-1
Create table foo(c1 int);
insert into foo values(generate_series(1,10));
Begin;
update foo set c1 =11;

Session-2
postgres=# begin;
BEGIN
postgres=# update foo set c1 = 13 where c1=3;
Cancel request sent
ERROR:  canceling statement due to user request
CONTEXT:  relation name: foo (OID 57496)
tuple ctid (0,7)

Does this patch expect to print the context on a cancel request
as well?
I don't find that useful. I think the reason is that after the
context is set up, if any other error happens, it will display the
newly set-up context.

2a. About the message:
LOG:  process 1936 still waiting for ShareLock on transaction 842 after 1014.000 ms
CONTEXT:  relation name: foo (OID 57496)
tuple (ctid (0,1)): (1, 2, abc  )

Isn't it better to display the database name as well? A related
message, shown below, displays the database name too.

LOG:  process 3012 still waiting for ExclusiveLock on tuple (0,1) of
relation 57499 of database 12045 after 1014.000 ms

2b. I think there is no need to display *ctid* before the tuple offset;
that it is a ctid seems obvious, and omitting it is consistent with other
similar messages.

3. About callback
I think rather than calling the setup/teardown functions from multiple
places, it would be better to have wrapper functions for
XactTableLockWait() and MultiXactIdWait().

Check how the error callback is set in plpgsql_exec_function();
can we do something similar to avoid the palloc?

I think we still need to see how to avoid issue mentioned in point-1.

4. About the values of tuple
I think it might be useful to display some unique part of the tuple for
debugging, but what will we do in case there is no primary
key on the table? Maybe we can display the first few columns (2 or 3)
or just display the tuple offset, as is done in the other similar message
shown above. More to the point, I have observed that in a few cases
displaying the values of the tuple can be confusing as well. For example:

Session-1
Create table foo(c1 int);
postgres=# insert into foo values(generate_series(1,3));
INSERT 0 3
postgres=# begin;
BEGIN
postgres=# update foo set c1 = 4;
UPDATE 3

Session-2
postgres=# update foo set c1=6;
LOG:  process 1936 still waiting for ShareLock on transaction 884 after 1014.000 ms
CONTEXT:  relation name: foo (OID 57499)
tuple (ctid (0,1)): (1)

Session-3
postgres=# begin;
BEGIN
postgres=# update foo set c1=5 where c1 =3;
LOG:  process 3012 still waiting for ShareLock on transaction 884 after 1015.000 ms
CONTEXT:  relation name: foo (OID 57499)
tuple (ctid (0,3)): (3)

Session-1
Commit;

Session-2
LOG:  process 1936 acquired ShareLock on transaction 884 after 25294.000 ms
CONTEXT:  relation name: foo (OID 57499)
tuple (ctid (0,1)): (1)
UPDATE 3

Session-3
LOG:  process 3012 acquired ShareLock on transaction 884 after 13217.000 ms
CONTEXT:  relation name: foo (OID 57499)
tuple (ctid (0,3)): (3)
LOG:  process 3012 still waiting for ShareLock on transaction 885 after 1014.000 ms
CONTEXT:  relation name: foo (OID 57499)
tuple (ctid (0,6)): (4)

Now it is a bit confusing to display values with the tuple:
the statement in session-3 asked to update the row with value (3), but we
have started waiting on the row with value (4). Although that is correct, it
might not make much sense to the user. Also, after session-2 commits the
transaction, session-3 will report that it acquired the lock, but it still
will not be able to update the row.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] inherit support for foreign tables

2014-02-18 Thread Shigeru Hanada
Hi Horiguchi-san,

2014-02-18 19:29 GMT+09:00 Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp:
 Could you guess any use case in which we are happy with ALTER
 TABLE's inheritance tree walking?  IMHO, ALTER FOREIGN TABLE
 always comes with some change of the data source, so implicit
 invocation of such commands should be turned off by default.

Imagine a case where the foreign data source has attributes (A, B, C, D)
but the foreign tables and their parent were defined as (A, B, C).  If the
user wants to use D as well, ALTER TABLE parent ADD COLUMN D type
would be useful (rather, necessary?) to keep consistency.

Changing a data type to a compatible one (i.e., int to numeric,
varchar(n) to text) and adding a CHECK/NOT NULL constraint would also
be possible.
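
As a sketch of that scenario (the table and column names here are made up
for illustration):

```sql
-- Remote source now has columns (a, b, c, d); the local parent and its
-- foreign children were defined with only (a, b, c).  With inheritance
-- recursion, one command keeps parent and foreign children in sync:
ALTER TABLE parent ADD COLUMN d integer;

-- Without recursion, each foreign child would need its own
-- ALTER FOREIGN TABLE ... ADD COLUMN d integer, with the risk of the
-- local definitions drifting apart from each other.
```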

-- 
Shigeru HANADA

