from:"Christophe Pettus"

Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

2017-09-25 Thread Christophe Pettus

> On Sep 25, 2017, at 07:55, Andrew Dunstan  
> wrote:
> Let's ask a couple of users who I think are or have been actually
> hurting on this point. Christophe and David, any opinions?

Since about 90% of what I encounter in this area are automatically-generated 
migrations, having a clear set of (perhaps restrictive) rules which never fail 
is the most important.  It's easy to split the CREATE or ALTERs out into their 
own transaction, and leave usage (such as populating a table from a migration) 
to a second transaction.

It's not clear to me that this is a vote either way, but I think the easiest 
thing to explain ("you cannot use a new enum value in the same transaction that 
created it") is the best in this situation.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [COMMITTERS] pgsql: Implement table partitioning.

2016-12-10 Thread Christophe Pettus


> On Dec 9, 2016, at 22:52, Keith Fiske  wrote:
> On Fri, Dec 9, 2016 at 10:01 PM, Robert Haas  wrote:
>> One thing that's tricky/annoying about this is that if you have a
>> DEFAULT partition and then add a partition, you have to scan the
>> DEFAULT partition for data that should be moved to the new partition.
>> That makes what would otherwise be a quick operation slow.  Still, I'm
>> sure there's a market for that feature.
> 
> I would find that perfectly acceptable as long as a caveat about the 
> performance impact was included in the documentation.

+1.  I don't think it's conceptually different from adding a column with a 
default, in that regard; you just have to know the impact.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] HashAggregate row estimate = 200

2016-06-03 Thread Christophe Pettus

Something I've noticed frequently is that HashAggregate will, especially if the 
children are Append with one of the nodes a non-seqscan, estimate 200 rows 
rather than a calculated vlaue.  Where is that value coming from?  The 
statistics target, a hardwired constant, or something else?
--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Alter or rename enum value

2016-03-27 Thread Christophe Pettus

On Mar 27, 2016, at 7:20 AM, Tom Lane  wrote:
> I do not know whether this would be a meaningful improvement for
> common use-cases, though.

It would certainly be a step forward over the current situation.  It would mean 
that a specific imaginable use-case (inserting a new enum value, then 
populating a dimension table for it) would have to be done as two migrations 
rather than one, but that is much more doable in most tools than having a 
migration run without a transaction at all.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Alter or rename enum value

2016-03-26 Thread Christophe Pettus

On Mar 26, 2016, at 7:40 AM, Andrew Dunstan  wrote:
> It would be nice if we could find a less broad brush approach to dealing with 
> the issue.

I don't know how doable this is, but could we use the existing mechanism of 
marking an index invalid if it contains an enum type to which a value was 
added, and the transaction was rolled back?  For the 90% use case, that would 
be acceptable, I would expect.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Alter or rename enum value

2016-03-25 Thread Christophe Pettus


On Mar 25, 2016, at 11:50 AM, Andrew Dunstan  wrote:

> I don't believe anyone knows how to do that safely.

The core issue, for me, is that not being able to modify enum values in a 
transaction breaks a very wide variety of database migration tools.  Even a 
very brutal solution like marking indexes containing the altered type invalid 
on a ROLLBACK would be preferable to the current situation.

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Deciding which index to use

2014-11-11 Thread Christophe Pettus

Where in the optimizer code does PostgreSQL decide which of several 
possibly-matching partial indexes to use?
--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] jsonb and nested hstore

2014-02-28 Thread Christophe Pettus

On Feb 28, 2014, at 2:12 PM, Peter Geoghegan  wrote:

> AFAICT, we have these
> operator classes that work fine with jsonb for the purposes of
> hstore-style indexing (the hstore operator classes).

That assumes that it is acceptable that jsonb be packaged in the hstore 
extension.  To put it mildly, there's no consensus on that point; indeed, I 
think there's consensus that's a non-starter.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] jsonb and nested hstore

2014-02-28 Thread Christophe Pettus


On Feb 28, 2014, at 1:34 PM, Peter Geoghegan  wrote:

> Amazon RDS Postgres has hstore.

Just observing that putting something in -contrib does not mean every 
installation can automatically adopt it.

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] jsonb and nested hstore

2014-02-28 Thread Christophe Pettus

On Feb 28, 2014, at 1:35 PM, Peter Geoghegan  wrote:
> I don't think hstore-style indexing is "advanced"; it's the main
> reason for there being a jsonb, in my view.

jsonb is significantly faster than json even without indexing; there are plenty 
of reasons to have jsonb even if we don't initially have indexing operations 
for it.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] jsonb and nested hstore

2014-02-28 Thread Christophe Pettus

On Feb 28, 2014, at 1:03 PM, Josh Berkus  wrote:
> However, the problems with admins not wanting to install -contrib aren't
> really about what's in or not in -contrib.

I'll also mention that an increasingly large number of people are running 
PostgreSQL in an environment where they don't get to pick what packages are 
installed on their server (RDS, for example).  Telling them that something is 
in -contrib can very well be telling them "You can't have it" in those cases.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] jsonb and nested hstore

2014-02-28 Thread Christophe Pettus

On Feb 28, 2014, at 6:27 AM, Robert Haas  wrote:

> Taken individually, none of those decisions seem crazy, but taken
> together it's pretty weird.  Instead of inventing a new type (jsonb)
> designed from the ground up to do what we want, we're, well, we're
> doing what Christophe says: creating our own proprietary hierarchical
> type and then making the hierarchical type everyone else uses depend
> on it.  Described in those terms, it's hard for me to believe that
> anyone here thinks that's not a strange thing to do.

A lot of it is that we're getting really tied up in knots about terminology.  
Because of the history of the project, it's being approached as "jsonb depends 
on hstore2", rather than, "We need a binary format, BSON won't cut it, but 
hstore2 is creating one, so let's use the same for both to avoid duplication of 
effort."

Put that last way, it's a more sensible decision.  My specific concern was 
"Well, if you want binary json, install hstore" is a very strange presentation 
to give to customers.  Many of the user-facing objections can be solved just by 
removing the implicit cast from jsonb to hstore, and the remaining operators 
(if they don't make it into this patch) can be added over time.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] jsonb and nested hstore

2014-02-27 Thread Christophe Pettus

On Feb 27, 2014, at 11:15 PM, Peter Geoghegan  wrote:

> I don't think that's obvious at all. Anyone is free to spend their
> time however they please, but personally I don't think that that's a
> wise use of anyone's time.

I believe you are misunderstanding me.  If there are actual technical problems 
or snags to migrating jsonb into core with full operator and index support, 
then the way we find out is to do the implementation, unless you know of a 
specific technical holdup already.

> There are no technical issues of any real consequence in this specific 
> instance.

There was no technical reason that json couldn't have been an extension, 
either, but there were very compelling presentational reasons to have it in 
core.  jsonb has exactly the same presentational issues.

> Yes, people who have the ability to block the feature entirely. I am
> attempting to build consensus by reaching a compromise that weighs
> everyone's concerns.

The thing I still haven't heard is why jsonb in core is a bad idea, except that 
it is too much code.  Is that the objection?

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] jsonb and nested hstore

2014-02-27 Thread Christophe Pettus

On Feb 27, 2014, at 9:59 PM, Peter Geoghegan  wrote:

> I don't find that very reassuring.

Obviously, we have to try it, and that will decide it.

> I don't understand why an extension is seen as not befitting
> of a more important feature.

contrib/ is considered a secondary set of features; I routinely get pushback 
from clients about using hstore because it's not in core, and they are thus 
suspicious of it.  The educational project required to change that far exceeds 
any technical work we are talking about here..  There's a very large 
presentational difference between having a feature in contrib/ and in core, at 
the minimum, setting aside the technical issues (such as the 
extensions-calling-extensions problem).

We have an existence proof of this already: if there was absolutely no 
difference between having things being in contrib/ and being in core, full text 
search would still be in contrib/.

> You are basically suggesting putting all of hstore in core, because
> jsonb and hstore are approximately the same thing. That seem quite a
> bit more controversial than putting everything in the hstore
> extension.

Well, "controversy" is just a way of saying there are people who don't like the 
idea, and I get that.  But I don't see the basis for the dislike.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] jsonb and nested hstore

2014-02-27 Thread Christophe Pettus

On Feb 27, 2014, at 9:28 PM, Peter Geoghegan  wrote:

> The only problem with that is now you have to move the implementation
> of ||, plus a bunch of other hstore operators into core. That seems
> like a more difficult direction to move in from a practical
> perspective, and I'm not sure that you won't hit a snag elsewhere.

Implementing operators for new types in PostgreSQL is pretty well-trod ground.  
I really don't know what snags we might hit.

> I suppose that putting it in core would be slightly preferable
> given the strategic importance of jsonb, but it's not something that
> I'd weigh too highly.

I'm completely unsure how to parse the idea that something is strategically 
important but we shouldn't put it in core.  If json was important enough to 
make it into core, jsonb certainly is.

Honestly, I really don't understand the resistance to putting jsonb in core.  
There are missing operators, yes; that's a very straight-forward hole to plug.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] jsonb and nested hstore

2014-02-27 Thread Christophe Pettus

On Feb 27, 2014, at 9:12 PM, Craig Ringer  wrote:

> On 02/28/2014 12:43 PM, Christophe Pettus wrote:
>> My proposal is that we break the dependencies of jsonb (at least, at the 
>> user-visible level) on hstore2, thus allowing it in core successfully. jsonb 
>> || jsonb returning hstore seems like a bug to me, not a feature we should be 
>> supporting.
> 
> Urgh, really?
> 
> That's not something I'd be excited to be stuck with into the future.

The reason that we're even here is that there's no jsonb || jsonb operator (or 
the other operators that one would expect).

If you try || without the hstore, you get an error, of course:

postgres=# select '{"foo":{"bar":"yellow"}}'::jsonb || '{}'::jsonb;
ERROR:  operator does not exist: jsonb || jsonb
LINE 1: select '{"foo":{"bar":"yellow"}}'::jsonb || '{}'::jsonb;
 ^
HINT:  No operator matches the given name and argument type(s). You might need 
to add explicit type casts.

The reason it works with hstore installed is that there's an implicit cast from 
hstore to jsonb:

postgres=# create extension hstore;
CREATE EXTENSION
postgres=# select '{"foo":{"bar":"yellow"}}'::jsonb || '{}'::jsonb;
 ?column? 
--
 "foo"=>{"bar"=>"yellow"}
(1 row)

--

But I think we're piling broken on broken here.  Just creating an appropriate 
jsonb || jsonb operator solves this problem.  That seems the clear route 
forward.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] jsonb and nested hstore

2014-02-27 Thread Christophe Pettus

On Feb 27, 2014, at 8:31 PM, Peter Geoghegan  wrote:

> On Thu, Feb 27, 2014 at 8:23 PM, Christophe Pettus  wrote:
>> Surely, the answer is to define a jsonb || jsonb (and likely the other 
>> combinatorics of json and jsonb), along with the appropriate GIN and GiST 
>> interfaces for jsonb.  Why would that not work?
> 
> I'm not the one opposed to putting jsonb stuff in the hstore module!

My proposal is that we break the dependencies of jsonb (at least, at the 
user-visible level) on hstore2, thus allowing it in core successfully. jsonb || 
jsonb returning hstore seems like a bug to me, not a feature we should be 
supporting.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] jsonb and nested hstore

2014-02-27 Thread Christophe Pettus

On Feb 27, 2014, at 8:04 PM, Peter Geoghegan  wrote:

> I'm hearing a lot about how important jsonb is, but not much on how to
> make the simple jsonb cases that are currently broken (as illustrated
> by my earlier examples [1], [2]) work.

Surely, the answer is to define a jsonb || jsonb (and likely the other 
combinatorics of json and jsonb), along with the appropriate GIN and GiST 
interfaces for jsonb.  Why would that not work?

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] jsonb and nested hstore

2014-02-27 Thread Christophe Pettus

On Feb 27, 2014, at 5:31 PM, Peter Geoghegan  wrote:

> Now, it's confusing that it has to go through hstore, perhaps, but
> that's hardly all that bad in and of itself.

Yes, it is.  It strikes me as irrational to have jsonb depend on hstore.  Let's 
be honest with ourselves: if we were starting over, we wouldn't start by 
creating our own proprietary hierarchical type and then making the hierarchical 
type everyone else uses depend on it.  hstore exists because json didn't.  But 
json does now, and we shouldn't create a jsonb dependency on hstore.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] jsonb and nested hstore

2014-02-25 Thread Christophe Pettus

On Feb 25, 2014, at 1:57 PM, Hannu Krosing  wrote:

> It is not in any specs, but nevertheless all major imlementations do it and
> some code depends on it.

I have no doubt that some code depends on it, but "all major implementations" 
is too strong a statement.  BSON, in particular, does not have stable field 
order.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Streaming replication bug in 9.3.2, "WAL contains references to invalid pages"

2014-01-02 Thread Christophe Pettus

Greetings,

We've had two clients experience a crash on the secondary of a streaming 
replication pair, running PostgreSQL 9.3.2.  In both cases, the messages were 
close to this example:

2013-12-30 18:08:00.464 PST,,,23869,,52ab4839.5d3d,16,,2013-12-13 09:47:37 
PST,1/0,0,WARNING,01000,"page 45785 of relation base/236971/365951 is 
uninitialized","xlog redo vacuum: rel 1663/236971/365951; blk 45794, 
lastBlockVacuumed 45784"""
2013-12-30 18:08:00.465 PST,,,23869,,52ab4839.5d3d,17,,2013-12-13 09:47:37 
PST,1/0,0,PANIC,XX000,"WAL contains references to invalid pages","xlog redo 
vacuum: rel 1663/236971/365951; blk 45794, lastBlockVacuumed 45784"""
2013-12-30 18:08:00.950 PST,,,23866,,52ab4838.5d3a,8,,2013-12-13 09:47:36 
PST,,0,LOG,0,"startup process (PID 23869) was terminated by signal 6: 
Aborted",""

In both cases, the indicated relation was a primary key index.  In one case, 
rebuilding the primary key index caused the problem to go away permanently (to 
date).  In the second case, the problem returned even after a full dump / 
restore of the master database (that is, after a dump / restore of the master, 
and reimaging the secondary, the problem returned at the same primary key 
index, although of course with a different OID value).

It looks like this has been experienced on 9.2.6, as well:


http://www.postgresql.org/message-id/flat/CAL_0b1s4QCkFy_55kk_8XWcJPs7wsgVWf8vn4=jxe6v4r7h...@mail.gmail.com

Let me know if there's any further information I can provide.

Best,
--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "stuck spinlock"

2013-12-13 Thread Christophe Pettus


On Dec 13, 2013, at 8:52 AM, Tom Lane  wrote:
> Please apply commit 478af9b79770da43a2d89fcc5872d09a2d8731f8 and see
> if that doesn't fix it for you.

It appears to fix it.  Thanks!

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "stuck spinlock"

2013-12-13 Thread Christophe Pettus


On Dec 13, 2013, at 1:49 PM, Merlin Moncure  wrote:
> Is this an edge case or something that will hit a lot of users?

My understanding (Tom can correct me if I'm wrong, I'm sure) is that it is an 
issue for servers on 9.3.2 where there are a lot of query cancellations due to 
facilities like statement_timeout or lock_timeout that cancel a query 
asynchronously.  I assume pg_cancel_backend() would apply as well.

We've only seen it on one client, and that client had a *lot* (thousands on 
thousands) of statement_timeout cancellations.

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "stuck spinlock"

2013-12-13 Thread Christophe Pettus

On Dec 13, 2013, at 8:52 AM, Tom Lane  wrote:

> Please apply commit 478af9b79770da43a2d89fcc5872d09a2d8731f8 and see
> if that doesn't fix it for you.

Great, thanks.  Would the statement_timeout firing invoke this path?  (I'm 
wondering why this particular installation was experiencing this.)

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "stuck spinlock"

2013-12-12 Thread Christophe Pettus


On Dec 12, 2013, at 7:40 PM, Peter Geoghegan  wrote:
> Couldn't that just be the app setting it locally?

Yes, that's what is happening there (I had to check with the client's 
developers).  It's possible that the one-minute repeat is due to the 
application reissuing the query, rather than specifically related to the 
spinlock issue.  What this does reveal is that all the spinlock issues have 
been on long-running queries, for what it is worth.

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "stuck spinlock"

2013-12-12 Thread Christophe Pettus

On Dec 12, 2013, at 6:41 PM, Andres Freund  wrote:

> Christophe: are there any "unusual" ERROR messages preceding the crash,
> possibly some minutes before?

Interestingly, each spinlock PANIC is *followed*, about one minute later (+/- 
five seconds) by a "canceling statement due to statement timeout" on that exact 
query.  The queries vary enough in text that it is unlikely to be a coincidence.

There are a *lot* of "canceling statement due to statement timeout" messages, 
which is interesting, because:

postgres=# show statement_timeout;
 statement_timeout 
-------
 0
(1 row)

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "stuck spinlock"

2013-12-12 Thread Christophe Pettus

 0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00
md0   0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00
md1   0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00
dm-0  0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00
dm-1  0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00
sdc   0.00 0.00   33.000.00 0.32 0.0019.88 
0.123.523.520.00   0.61   2.00
dm-2  0.00 0.00   33.000.00 0.32 0.0019.88 
0.123.523.520.00   0.61   2.00
sdd   0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   1.080.000.290.000.00   98.62

Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz 
avgqu-sz   await r_await w_await  svctm  %util
sda   0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00
sdb   0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00
md0   0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00
md1   0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00
dm-0  0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00
dm-1  0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00
sdc   0.00 0.006.004.00 0.07 0.0320.80 
0.000.000.000.00   0.00   0.00
dm-2  0.00 0.006.004.00 0.07 0.0320.80 
0.000.000.000.00   0.00   0.00
sdd   0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
   0.500.000.080.040.00   99.37

Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz 
avgqu-sz   await r_await w_await  svctm  %util
sda   0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00
sdb   0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00
md0   0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00
md1   0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00
dm-0  0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00
dm-1  0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00
sdc   0.00 0.00   11.003.00 0.22 0.1249.14 
0.000.000.000.00   0.00   0.00
dm-2  0.00 0.00   11.003.00 0.22 0.1249.14 
0.000.000.000.00   0.00   0.00
sdd   0.00 0.000.000.00 0.00 0.00 0.00 
0.000.000.000.00   0.00   0.00


--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "stuck spinlock"

2013-12-12 Thread Christophe Pettus


On Dec 12, 2013, at 6:15 PM, Tom Lane  wrote:

> Are you possibly using any nonstandard extensions?

No, totally stock PostgreSQL.

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "stuck spinlock"

2013-12-12 Thread Christophe Pettus

On Dec 12, 2013, at 5:45 PM, Tom Lane  wrote:

> Presumably, we are seeing the victim rather than the perpetrator of
> whatever is going wrong.

This is probing about a bit blindly, but the only thing I can see about this 
system that is in some way unique (and this is happening on multiple machines, 
so it's unlikely to be hardware) is that there are a relatively large number of 
relations (like, 440,000+) distributed over many schemas.  Is there anything 
that pins a buffer that is O(N) to the number of relations?

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "stuck spinlock"

2013-12-12 Thread Christophe Pettus

f19489fa0 "")
at /tmp/buildd/postgresql-9.3-9.3.2/build/../src/backend/tcop/pquery.c:790
#26 0x7fa041d3d960 in exec_simple_query (
query_string=0x7fa0426034a0 "SELECT COUNT(*) FROM \"signups\"  WHERE 
(signups.active_customer_expires_at > '2013-11-11' and 
(signups.overdue_invoices_count is null or signups.overdue_invoices_count = 
0))")
at 
/tmp/buildd/postgresql-9.3-9.3.2/build/../src/backend/tcop/postgres.c:1048
#27 PostgresMain (argc=, argv=, 
dbname=0x7fa0425eb138 "nbuild", username=)
at 
/tmp/buildd/postgresql-9.3-9.3.2/build/../src/backend/tcop/postgres.c:4005
#28 0x7fa041cfa765 in BackendRun (port=0x7fa04262f050)
at 
/tmp/buildd/postgresql-9.3-9.3.2/build/../src/backend/postmaster/postmaster.c:3999
#29 BackendStartup (port=0x7fa04262f050)
at 
/tmp/buildd/postgresql-9.3-9.3.2/build/../src/backend/postmaster/postmaster.c:3688
#30 ServerLoop ()
at 
/tmp/buildd/postgresql-9.3-9.3.2/build/../src/backend/postmaster/postmaster.c:1589
#31 PostmasterMain (argc=, argv=)
at 
/tmp/buildd/postgresql-9.3-9.3.2/build/../src/backend/postmaster/postmaster.c:1258
#32 0x7fa041b36ea2 in main (argc=5, argv=0x7fa0425e91a0)
at /tmp/buildd/postgresql-9.3-9.3.2/build/../src/backend/main/main.c:196
--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "stuck spinlock"

2013-12-12 Thread Christophe Pettus

p/postgres.c:1048
#25 PostgresMain (argc=, argv=, 
---Type  to continue, or q  to quit---
dbname=0x7f699dd3e138 "nbuild", username=)
at 
/tmp/buildd/postgresql-9.3-9.3.2/build/../src/backend/tcop/postgres.c:4005
#26 0x7f699c6ff765 in BackendRun (port=0x7f699dd82050)
at 
/tmp/buildd/postgresql-9.3-9.3.2/build/../src/backend/postmaster/postmaster.c:3999
#27 BackendStartup (port=0x7f699dd82050)
at 
/tmp/buildd/postgresql-9.3-9.3.2/build/../src/backend/postmaster/postmaster.c:3688
#28 ServerLoop ()
at 
/tmp/buildd/postgresql-9.3-9.3.2/build/../src/backend/postmaster/postmaster.c:1589
#29 PostmasterMain (argc=, argv=)
at 
/tmp/buildd/postgresql-9.3-9.3.2/build/../src/backend/postmaster/postmaster.c:1258
#30 0x00007f699c53bea2 in main (argc=5, argv=0x7f699dd3c1a0)
at /tmp/buildd/postgresql-9.3-9.3.2/build/../src/backend/main/main.c:196

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "stuck spinlock"

2013-12-12 Thread Christophe Pettus


On Dec 12, 2013, at 4:04 PM, Tom Lane  wrote:
> If you aren't getting a core file for a PANIC, then core
> files are disabled.

And just like that, we get one.  Stack trace:

#0  0x7f699a4fa425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x7f699a4fa425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x7f699a4fdb8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x7f699c81991b in errfinish ()
#3  0x7f699c81a477 in elog_finish ()
#4  0x7f699c735db3 in s_lock ()
#5  0x7f699c71e1f0 in ?? ()
#6  0x7f699c71eaf9 in ?? ()
#7  0x7f699c71f53e in ReadBufferExtended ()
#8  0x7f699c56d03a in index_fetch_heap ()
#9  0x7f699c67a0b7 in ?? ()
#10 0x7f699c66e98e in ExecScan ()
#11 0x7f699c6679a8 in ExecProcNode ()
#12 0x7f699c67407f in ExecAgg ()
#13 0x7f699c6678b8 in ExecProcNode ()
#14 0x7f699c664dd2 in standard_ExecutorRun ()
#15 0x7f6996ad928d in ?? ()
   from /usr/lib/postgresql/9.3/lib/auto_explain.so
#16 0x7f69968d3525 in ?? ()
   from /usr/lib/postgresql/9.3/lib/pg_stat_statements.so
#17 0x7f699c745207 in ?? ()
#18 0x7f699c746651 in PortalRun ()
#19 0x7f699c742960 in PostgresMain ()
#20 0x7f699c6ff765 in PostmasterMain ()
#21 0x7f699c53bea2 in main ()


--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "stuck spinlock"

2013-12-12 Thread Christophe Pettus


On Dec 12, 2013, at 3:18 PM, Tom Lane  wrote:
> Hm, a PANIC really ought to result in a core file.  You sure you don't
> have that disabled (perhaps via a ulimit setting)?

Since it's using the Ubuntu packaging, we have pg_ctl_options = '-c' in 
/etc/postgresql/9.3/main/pg_ctl.conf.

> As for the root cause, it's hard to say.  The file/line number says it's
> a buffer header lock that's stuck.  I rechecked all the places that lock
> buffer headers, and all of them have very short code paths to the
> corresponding unlock, so there's no obvious explanation how this could
> happen.

The server was running with shared_buffers=100GB, but the problem has 
reoccurred now with shared_buffers=16GB.

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "stuck spinlock"

2013-12-12 Thread Christophe Pettus

On Dec 12, 2013, at 3:33 PM, Andres Freund  wrote:

> Any other changes but the upgrade? Maybe a different compiler version?

Just the upgrade; they're using the Ubuntu packages from apt.postgresql.org.

> Also, could you share some details about the workload? Highly
> concurrent? Standby? ...

The workload is not very highly concurrent; actually quite lightly loaded.   
There are a very large number (442,000) of user tables.  No standby attached.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] "stuck spinlock"

2013-12-12 Thread Christophe Pettus


On Dec 12, 2013, at 3:37 PM, Peter Geoghegan  wrote:
> Show pg_config output.

Below; it's the Ubuntu package.

BINDIR = /usr/lib/postgresql/9.3/bin
DOCDIR = /usr/share/doc/postgresql-doc-9.3
HTMLDIR = /usr/share/doc/postgresql-doc-9.3
INCLUDEDIR = /usr/include/postgresql
PKGINCLUDEDIR = /usr/include/postgresql
INCLUDEDIR-SERVER = /usr/include/postgresql/9.3/server
LIBDIR = /usr/lib
PKGLIBDIR = /usr/lib/postgresql/9.3/lib
LOCALEDIR = /usr/share/locale
MANDIR = /usr/share/postgresql/9.3/man
SHAREDIR = /usr/share/postgresql/9.3
SYSCONFDIR = /etc/postgresql-common
PGXS = /usr/lib/postgresql/9.3/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--with-tcl' '--with-perl' '--with-python' '--with-pam' 
'--with-openssl' '--with-libxml' '--with-libxslt' 
'--with-tclconfig=/usr/lib/tcl8.5' '--with-tkconfig=/usr/lib/tk8.5' 
'--with-includes=/usr/include/tcl8.5' 'PYTHON=/usr/bin/python' 
'--mandir=/usr/share/postgresql/9.3/man' 
'--docdir=/usr/share/doc/postgresql-doc-9.3' 
'--sysconfdir=/etc/postgresql-common' '--datarootdir=/usr/share/' 
'--datadir=/usr/share/postgresql/9.3' '--bindir=/usr/lib/postgresql/9.3/bin' 
'--libdir=/usr/lib/' '--libexecdir=/usr/lib/postgresql/' 
'--includedir=/usr/include/postgresql/' '--enable-nls' 
'--enable-integer-datetimes' '--enable-thread-safety' '--enable-debug' 
'--disable-rpath' '--with-ossp-uuid' '--with-gnu-ld' '--with-pgport=5432' 
'--with-system-tzdata=/usr/share/zoneinfo' 'CFLAGS=-g -O2 -fstack-protector 
--param=ssp-buffer-size=4 -Wformat -Wformat-security -Werror=format-security 
-fPIC -pie -I/usr/include/mit-krb5 -DLINUX_OOM_ADJ=0' 
'LDFLAGS=-Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,now -Wl,--as-needed 
-L/usr/lib/mit-krb5 -L/usr/lib/x86_64-linux-gnu/mit-krb5' '--with-krb5' 
'--with-gssapi' '--with-ldap' 'CPPFLAGS=-D_FORTIFY_SOURCE=2'
CC = gcc
CPPFLAGS = -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2 
-I/usr/include/tcl8.5
CFLAGS = -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat 
-Wformat-security -Werror=format-security -fPIC -pie -I/usr/include/mit-krb5 
-DLINUX_OOM_ADJ=0 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g
CFLAGS_SL = -fpic
LDFLAGS = -L../../../src/common -Wl,-Bsymbolic-functions -Wl,-z,relro 
-Wl,-z,now -Wl,--as-needed -L/usr/lib/mit-krb5 
-L/usr/lib/x86_64-linux-gnu/mit-krb5 -L/usr/lib/x86_64-linux-gnu -Wl,--as-needed
LDFLAGS_EX = 
LDFLAGS_SL = 
LIBS = -lpgport -lpgcommon -lxslt -lxml2 -lpam -lssl -lcrypto -lkrb5 -lcom_err 
-lgssapi_krb5 -lz -ledit -lcrypt -ldl -lm 
VERSION = PostgreSQL 9.3.2

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] "stuck spinlock"

2013-12-12 Thread Christophe Pettus

Greetings,

Immediately after an upgrade from 9.3.1 to 9.3.2, we have a client getting 
frequent (hourly) errors of the form:

/var/lib/postgresql/9.3/main/pg_log/postgresql-2013-12-12_211710.csv:2013-12-12 
21:40:10.328 
UTC,"n","n",32376,"10.2.1.142:52451",52aa24eb.7e78,5,"SELECT",2013-12-12 
21:04:43 UTC,9/7178,0,PANIC,XX000,"stuck spinlock (0x7f7df94672f4) detected at 
/tmp/buildd/postgresql-9.3-9.3.2/build/../src/backend/storage/buffer/bufmgr.c:1099",,""

uname -a: Linux postgresql3-master 3.8.0-33-generic #48~precise1-Ubuntu SMP Thu 
Oct 24 16:28:06 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux.

Generally, there's no core file (which is currently enable), as the postmaster 
just normally exits the backend.

Diagnosis suggestions?
 
--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-20 Thread Christophe Pettus

On Nov 20, 2013, at 3:57 PM, Andres Freund  wrote:

> On 2013-11-20 15:52:22 -0800, Josh Berkus wrote:
>> Oh, so this doesn't just happen when the base backup is first taken;
>> *any* time the standby is restarted, it can happen. (!!!)
> 
> Yes.

So, to be completely clear, any secondary running the affected versions which 
is started with hot_standby=on could potentially be corrupted even if it never 
connects to a primary?

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-19 Thread Christophe Pettus

Hi, Andres,

>From my understanding, the problem only occurs over streaming replication; if 
>the secondary was never a hot standby, and only used the archived WAL 
>segments, that would be safe.  Is that correct?
--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-19 Thread Christophe Pettus


On Nov 19, 2013, at 10:51 AM, Andres Freund  wrote:

> You seem to imply that I/we should do that work?

No, just that it be done.  Of course, the more support from the professional PG 
community that is given to it, the better.

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-19 Thread Christophe Pettus

On Nov 19, 2013, at 10:29 AM, Andres Freund  wrote:

> It's pretty unlikely that any automated testing would have cought this,
> the required conditions are too unlikely for that.

I would expect that "promote secondary while primary is under heavy load" is 
clear-cut test case.  What concerns me more is that we don't seem to have a 
framework to put in a regression test on the bug you just found (and thank you 
for finding it so quickly!).

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-19 Thread Christophe Pettus

On Nov 19, 2013, at 10:25 AM, Andres Freund  wrote:
> I am not sure how much code it's going to take (I'd hope somewhat less),
> but it certainly will take some time to agree how it should be built and
> then building and integrating it.

Given that the situation we're in right now is that we have an unknown number 
of silently corrupt secondaries out there which will only be discovered when 
someone promotes them to being a primary (possibly because the current primary 
died without a backup), I'd say that this is something pretty urgent.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-19 Thread Christophe Pettus

On Nov 19, 2013, at 6:59 AM, Andres Freund  wrote:

> Yes. There's less expensive ways to do it, but those seem to complicated
> to suggest.

If this is something that could be built into to a tool, acknowledging the 
complexity, I'd be happy to see about building it.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-18 Thread Christophe Pettus


On Nov 18, 2013, at 2:26 PM, Andres Freund  wrote:

> Trying to reproduce the issue with and without hot_standby=on would be
> very helpful, but I guess that's time consuming?

I've been working on it, but I haven't gotten it to fail yet.  I'll keep at it.

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-18 Thread Christophe Pettus

Great!  If there's any further data I can supply to help, let me know.

On Nov 18, 2013, at 2:15 PM, Andres Freund  wrote:

> Hi,
> 
> Afaics it's likely a combination/interaction of bugs and fixes between:
> * the initial HS code
> * 5a031a5556ff83b8a9646892715d7fef415b83c3
> * f44eedc3f0f347a856eea8590730769125964597
> 
> But that'd mean nobody noticed it during 9.3's beta...
> 
> Greetings,
> 
> Andres Freund
> 
> -- 
> Andres Freundhttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
> 
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-18 Thread Christophe Pettus

On Nov 18, 2013, at 12:57 PM, Andres Freund  wrote:

> Were there any kind of patterns in the lost data? What kind of workload
> are they running? I have an idea what the issue might be...

On the P1 > S1 case, the data corrupted was data modified in the last few 
minutes before the switchover.  I don't want to over-analyze, but it was within 
the checkpoint_timeout value for that sever.

On the P2 > S2 case, it's less obvious what the pattern is, since there was no 
cutover.

Insufficient information on the P3 > S3 case.

Each of them is a reasonably high-volume OLTP-style workload.  The P1/P2 client 
has a very high level of writes; the P3 more read-heavy, but still a fair 
number of writes. 

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-18 Thread Christophe Pettus


On Nov 18, 2013, at 12:00 PM, Christophe Pettus  wrote:

> One more correction: After rsync finished and the pg_base_backup() was 
> issued, the contents of pg_xlog/ on S1 were deleted.

pg_stop_backup(), sorry.

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-18 Thread Christophe Pettus


On Nov 18, 2013, at 11:47 AM, Andres Freund  wrote:
> Without deleting any data, including pg_xlog/, backup.label, anything?

One more correction: After rsync finished and the pg_base_backup() was issued, 
the contents of pg_xlog/ on S1 were deleted.

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-18 Thread Christophe Pettus

On Nov 18, 2013, at 11:47 AM, Andres Freund  wrote:

> Did you have hot_standby enabled on all of those machines? Even on the
> 9.0.13 cluster?

Actually, it's a bit more complex than this:

1. We don't know about P0, the 9.0.13 machine.  I assume it was, but it was 
managed elsewhere.
2. P1 never had hot_standby = 'on', as it was never intended to be a hot 
stand_by.
3. S1 did have hot_standby = 'on.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-18 Thread Christophe Pettus


On Nov 18, 2013, at 11:47 AM, Andres Freund  wrote:

> Without deleting any data, including pg_xlog/, backup.label, anything?

Correct.

> Did you have hot_standby enabled on all of those machines? Even on the
> 9.0.13 cluster?

Yes.

> That was just recovery command and primary conninfo?

Correct.

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-18 Thread Christophe Pettus

On Nov 18, 2013, at 11:28 AM, Andres Freund  wrote:
> Could you detail how exactly the base backup was created? Including the
> *exact* logic for copying?

0. Before any of this began, P1 was archiving WAL segments to AWS-S3.
1. pg_start_backup('', true) on P1.
2. Using rsync -av on P1, the entire $PGDATA directory was pushed from P1 to S2.
3. Once the rsync was complete, pg_stop_backup() on P1.
4. Create appropriate recovery.conf on S1.
5. Bring up PostgreSQL on S1.
6. PostgreSQL recovers normally (pulling WAL segments from WAL-E), and 
eventually connects to P1.

> Do you have the log entries for the startup after the base backup?

Sadly, not anymore.

> This server is gone, right?

Correct.

> Could you list the *exact* steps you did to startup the cluster?

0. Before any of this began, P2 was archiving WAL segments to AWS-S3.
1. Initial (empty) data directory deleted on S2.
2. New data directory created with:

/usr/lib/postgresql/9.3/bin/pg_basebackup --verbose --progress 
--xlog-method=stream --host= --user=repluser --pgdata=/data/9.3/main

3. Once the pg_basebackup completed, create appropriate recovery.conf on S1.
4. Bring up PostgreSQL on S2.
5. PostgreSQL recovers normally (pulling a small number of WAL segments from 
WAL-E), and eventually connects to P2.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-18 Thread Christophe Pettus


On Nov 18, 2013, at 10:58 AM, Christophe Pettus  wrote:
> As a note, P1 was created from another system (let's call it P0) using just 
> WAL shipping (no streaming replication), and no data corruption was observed.

As another data point, P0 was running 9.0.13, rather than 9.0.14.
--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

2013-11-18 Thread Christophe Pettus

Three times in the last two weeks, we have experience data corruption secondary 
servers using streaming replication on client systems.  The versions involved 
are 9.0.14, 9.2.5, and 9.3.1.  Each incident was separate; two cases they were 
for the same client (9.0.14 and 9.3.1), one for a different client (9.2.5).

The details of each incident are similar, but not identical.

The details of each incident are:

INCDIDENT #1: 9.0.14 -- A new secondary (S1) was initialized using rsync off of 
an existing, correct primary (P1) for the base backup, and using WAL-E for WAL 
segment shipping.  Both the primary and secondary were running 9.0.14.  S1 
properly connected to the primary once the it was caught up on WAL segments, 
and S1 was then promoted as a primary using the trigger file.

No errors in the log files on either system.

After promotion, it was discovered that there was significant data loss on S1.  
Rows that were present on P1 were missing on S1, and some rows were duplicated 
(including duplicates that violated primary key and other unique constraints).  
The indexes were corrupt, in that they seemed to think that the duplicates were 
not duplicated, and that the missing rows were still present.

Because the client's schema included a "last_updated" field, we were able to 
determine that all of the rows that were either missing or duplicated had been 
updated on P1 shortly (3-5 minutes) before S1 was promoted.  It's possible, but 
not confirmed, that there were active queries (including updates) running on P1 
at the moment of S1's promotion.

As a note, P1 was created from another system (let's call it P0) using just WAL 
shipping (no streaming replication), and no data corruption was observed.

P1 and S1 were both AWS instances running Ubuntu 12.04, using EBS (with xfs as 
the file system) as the data volume.

P1 and S1 have been destroyed at this point.


INCIDENT #2: 9.3.1 -- In order to repair the database, a pg_dump was taken of 
S1y, after having dropped the primary and unique constraints, and restored into 
a new 9.3.1 server, P2.  Duplicate rows were purged, and missing rows were 
added again.  The database, a new primary, was then put back into production, 
and ran without incident.

A new secondary, S2 was created off of the primary.  This secondary was created 
using pg_basebackup using --xlog-method=stream, although the WAL-E archiving 
was still present.

S2 attached to P2 without incident and no errors in the logs, but 
nearly-identical corruption was discovered (although this time without the 
duplicated rows, just missing rows).  At this point, it's not clear if there 
was some clustering in the "last_updated" timestamp for the rows that are 
missing from S2.  No duplicated rows were observed.

P2 and S2 are both AWS instances running Ubuntu 12.04, using EBS (with xfs as 
the file system) as the data volume.

No errors in the log files on either system.

P2 and S2 are still operational.


INCIDENT #3: 9.2.5 -- A client was migrating a large database from a 9.2.2 
system (P3) to a new 9.2.5 system (S3) using streaming replication.  As I 
personally didn't do the steps on this one, I don't have quite as much 
information, but the basics are close to incident #2: When S3 was promoted 
using the trigger file, no errors were observed and the database came up 
normally, but rows were missing from S3 that were present on P3.

P1 is running Centos 6.3 with ext4 as the file system.

P2 is running Centos 6.4 with ext3 as the file system.

Log shipping in this case was done via rsync.

P3 and S3 are still operational.

No errors in the log files on either system.

--

Obviously, we're very concerned that a bug was introduced in the latest minor 
release.  We're happy to gather data as required to assist in diagnosing this.
--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Getting the clog bits for a particular xid

2013-11-13 Thread Christophe Pettus

As part of doing some database corruption investigation, I'm trying to get the 
pg_clog/ bit pair for a particular transaction.

Let's say we check on a particular tuple, and get:

SELECT xmin, id FROM mytable WHERE pk=4727366;

   xmin|id
---+--
 107898222 |  4727366

Each pg_clog file (as least, as of 9.3.1) is 256KB, so there are 256KB * 8 
bits/byte / 2 bits/transaction = 1M transactions per file

So:

107898222 / 1048576 = 102, or 0x0066
107898222 % 1048576 = 943470.

So, we're looking at file 0x0066.  It's the 943470th transaction in that file, 
or the 943470*2 = 1886940th bit.  So, (counting from the MSB being 0), it's the 
4th and 5th bit of byte offset 235867 in that file.

Is that correct?
--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] ereport documentation patch

2013-08-20 Thread Christophe Pettus


On Aug 19, 2013, at 11:28 PM, Heikki Linnakangas wrote:

> On 19.08.2013 23:40, Christophe Pettus wrote:
>> Is it reasonable to note in the documentation that ereport does not return 
>> if the error severity is greater than or equal to ERROR?
> 
> Yeah, it probably would be good to mention that. Got a patch?

Attached!


ereport-no-return-doc.patch
Description: Binary data



--
-- Christophe Pettus
   x...@thebuild.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] ereport documentation patch

2013-08-19 Thread Christophe Pettus

Is it reasonable to note in the documentation that ereport does not return if 
the error severity is greater than or equal to ERROR?

--
-- Christophe Pettus
   x...@thebuild.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_rewind, a tool for resynchronizing an old master after failover

2013-05-28 Thread Christophe Pettus

On May 28, 2013, at 12:49 PM, Alvaro Herrera wrote:

> We've had request from companies because they wanted to distribute
> Postgres and lawyers weren't comfortable with copyright statements in
> assorted files.  In those cases we've asked the people mentioned in such
> copyright statements, got approval to remove the offending copyright
> lines, and removed them.

I assume this topic has come up and been rejected for some reason, but just in 
case: The Django project requires an explicit agreement for contributions that 
end up in the main source tree for it, part of which is the acceptance of the 
Django license and copyright notice.  (I don't have my copy right in front of 
me, but I don't think it's a full-on assignment of copyright.)

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Logging both start and end of temporary file usage

2012-07-12 Thread Christophe Pettus

In working with s a client to analyze their temp file usage, it became useful 
to know when a temporary file was created as well as when it was closed.  That 
way, we could process the logs to determine a high water mark of overall temp 
file usage, to know how high it was safe (in that workload) to set work_mem.  
So, I wrote a quick patch that logged both the open and close of the temp file.

Since the final size of the file isn't known at the time that the file is 
created, the patch just logs the filename.  The particular file can be 
correlated with the size by the name when the close message is logged.  Of 
course, there's no information about the pattern of the file size over time, 
but it's a bit more information than was there before.

As we don't know the size of the file until close time, the open is only logged 
if log_temp_files is 0 (the idea being that's "maximum logging").

If this sounds like something worthwhile in general, I can package it up as a 
proper patch.
--
-- Christophe Pettus
   x...@thebuild.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] xlog min recovery request ... is past current point ...

2012-02-03 Thread Christophe Pettus

PostgreSQL 9.0.4:

While bringing up a streaming replica, and while it is working its way through 
the WAL segments before connecting to the primary, I see a lot of messages of 
the form:

2012-02-01 21:26:13.978 PST,,,24448,,4f2a1e61.5f80,54,,2012-02-01 21:25:53 
PST,1/0,0,LOG,0,"restored log file ""00010DB40065"" from 
archive",""
2012-02-01 21:26:14.032 PST,,,24448,,4f2a1e61.5f80,55,,2012-02-01 21:25:53 
PST,1/0,0,WARNING,01000,"xlog min recovery request DB5/42E15098 is past current 
point DB4/657FA490","writing block 5 of relation base/155650/156470_vm
xlog redo insert: rel 1663/155650/1658867; tid 9640/53"""
2012-02-01 21:26:14.526 PST,,,24448,,4f2a1e61.5f80,56,,2012-02-01 21:25:53 
PST,1/0,0,LOG,0,"restored log file ""00010DB40066"" from 
archive",""

All of these are on _vm relations.  The recovery completed successfully and the 
secondary connected to the primary without issue, so: Are these messages 
something to be concerned over?

--
-- Christophe Pettus
  x...@thebuild.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] and it's not a bunny rabbit, either

2010-12-26 Thread Christophe Pettus

On Dec 26, 2010, at 7:55 PM, Robert Haas wrote:

> "tables do not support %s"
> "views do not support %s"
> "indexes do not support %s"

The more detail we can give, the better, of course.  Nothing's more frustrating 
than having a command with an error like, "Object does not support requested 
operation."  Thanks, computer program: "Swerved off road, hit tree" is about as 
useful.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Why don't we accept exponential format for integers?

2010-12-17 Thread Christophe Pettus

Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> isinstance(10,int)
True
>>> isinstance(1e10,int)
False

--
-- Christophe Pettus
   x...@thebuild.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] hstores in pl/python

2010-12-13 Thread Christophe Pettus

On Dec 13, 2010, at 7:19 PM, Robert Haas wrote:
> If we decree that Python dictionaries map
> onto hstore, does that mean they DON'T map onto json, or Pavel's
> hand-wavy proposal for associative arrays?  Because from 10,000 feet
> it sure isn't obvious why hstore would be preferable to either of the
> other two, except that it already exists and the early bird gets the
> worm.

I'll mention that psycopg2, the most widely Python DBI implementation for 
PostgreSQL, has a built-in mapping of hstore to dict, so signs are definitely 
pointing towards a hstore == dict standardization.  It also suffers from the 
problem that it needs to sniff the hstore OID, which is somewhat annoying, 
especially in a web environment where the sniff has to happen repeatedly.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Final(?) proposal for wal_sync_method changes

2010-12-07 Thread Christophe Pettus


On Dec 7, 2010, at 2:43 PM, Josh Berkus wrote:
> Because nobody sane uses OSX on the server?

The XServe running 10.5 server and 9.0.1 at the other end of the office takes 
your remark personally. :)

--
-- Christophe Pettus
   x...@thebuild.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range types

2009-12-15 Thread Christophe Pettus



On Dec 15, 2009, at 3:40 PM, Jeff Davis wrote:


Based on the premise that timestamps are a continuous value and the
granularity/precision is entirely an implementation detail, you're
right. But I disagree with the premise, at least in some cases that I
think are worthwhile.


The argument is, in essence:

DECIMAL is continuous.
DECIMAL(10,3) is discrete.

timestamptz in general is a continuous value (unless we're talking  
Planck times :) ).  There is no way for us to guarantee that  
next(timestamptz) will have the same value across all platforms; its  
epsilon is platform dependent.


However, if we specify a scale on timestamptz, it becomes much more  
useful.  Just making up a syntax, if we had timestamptz(milliseconds),  
then it's discrete and we know what next(timestamptz(milliseconds)) is.


But in the current implementation, the only way I can see making that  
work is if we specify a scale for timestamptz, and that strikes me as  
a big change to its semantics.


--
-- Christophe Pettus
   x...@thebuild.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Apprentices? (was =patch - Report the schema...)

2009-11-15 Thread Christophe Pettus



On Nov 15, 2009, at 5:14 PM, Andrew Dunstan wrote:
Then I think we need to start being more creative about ways to ease  
the path for people who want to get people involved.



With that as an inspiration, I'd like to offer a modest proposal:  
Apprentices.


In my case, I would be very exciting about participating more in PG  
development.  I have a reasonable amount of C/C++ experience (20-ish  
years), and have been a PG admin/client developer since 1998, but I  
lack enough familiarity with the codebase to do patch review on non- 
trivial patches, or to launch in and do development on a non-trivial  
feature.  (And, as has been noted, there are essential no non-trivial  
features needing work.)


On the other hand, I'm sure there are those such as myself who are  
perfectly capable in *assisting* in some way: Reviewing a patch for  
specific things, writing test cases, working on a subsection of a  
larger patch.  This also is a solution for the intimidating prospect  
of working on a large patch and being sold, "Um, that's really  
completely the wrong thing.  Sorry."


Not every primary contribution or patch reviewer is going to be  
comfortable working with other people, because of temperament or work  
style, but I'm sure some are.  Might this help?

--
-- Christophe Pettus
   x...@thebuild.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: String key space for advisory locks

2009-10-27 Thread Christophe Pettus



On Oct 27, 2009, at 4:37 PM, Merlin Moncure wrote:

'as is', advisory locks is a fantastic feature that can be used for
signaling, mutexing, etc that are relatively difficult things to do in
the transactional world of sql.  My main gripe is that the 'shared id'
method for doing record pessimistic locks is basically a nuclear
missile pointed at your shared buffers if you don't have lot of
discipline in the queries that lock IDs.  Maybe this argues for more
of a 'sql exposed' pessimistic lock feature that operates on similar
level as 'for update'...I'm not sure...curious what thoughts you have
about improving them.


Advisory locks have, as you say, a raft of very useful characteristics:

1. Enforced as much or as little as you wish, depending on your  
application design.

2. Race-condition-free.
3. Cleaned up automatically on session end.

Of course, 2^64 potential entries are enough for anyone.  The  
usability issue comes when you have multiple domains that you want to  
apply advisory locks to in a single database.  For example, if you  
have multiple tables (one of which, just for example, has a character  
pk), and perhaps some inter-client mutex signaling for things that are  
outside of a transactional model ("this client is importing a file  
from an outside source, so don't you do it"), it's unappealing to have  
to come up with ways of representing those in a 64-bit namespace.


Hashing isn't a terrible solution, assuming collisions don't become an  
issue; a well-designed hashtext64() would help a lot.

--
-- Christophe Pettus
   x...@thebuild.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: String key space for advisory locks

2009-10-26 Thread Christophe Pettus



On Oct 26, 2009, at 5:24 PM, Itagaki Takahiro wrote:


Hmmm, hashtext() returns int32. ,
Can you reduce the collision issue if we had hashtext64()?


That would certainly reduce the chance of a collison considerably,  
assuming the right algorithm.


--
-- Christophe Pettus
   x...@thebuild.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Proposal: String key space for advisory locks

2009-10-25 Thread Christophe Pettus


Greetings,

I'd like to propose a potential patch, and wanted to get preliminary  
feedback on it before I started looking into the design.


Summary:Add a string key space to the advisory lock functionality.

Rationale:

Right now, the key spaces (the range of unique values that can be used  
as identity) for advisory locks are either a bigint or two ints.  This  
is, of course, technically more than one could imaginably need in any  
application.  The difficulty arises when the number of potential  
advisory locks is related to rows in one or more tables.


For example, suppose one wanted to use advisory locks to signal that a  
queue entry is being processed, and entries in that queue have a  
primary key that's also a bigint.  There's no problem; the advisory  
lock id is the primary key for the row.


And, then, one wants to use an advisory lock to signal that a  
particular record in another table is being processed in a long-term  
process.  One has a series of unappealing alternatives at that point,  
mostly involving encoding a table ID and the primary key of a record  
into the 64 bit number, or just hoping that the primary key doesn't  
overflow an int, and using the 2 x int form.


API Changes:

Overloading the various advisory lock functions to take a suitable  
string type (varchar(64)?) in addition to the bigint / 2 x int  
variations.  As with the bigint / 2 x int forms, this string namespace  
would be disjoint from the other key spaces.


Thanks in advance for any comments.
--
-- Christophe Pettus
   x...@thebuild.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [GENERAL] contrib/plantuner - enable PostgreSQL planner hints

2009-10-16 Thread Christophe Pettus



On Oct 16, 2009, at 10:04 AM, decibel wrote:

Out of curiosity, did you look at doing hints as comments in a query?


I don't think that a contrib module could change the grammar.

--
-- Christophe Pettus
   x...@thebuild.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

68 matches

Mail list logo