date:20121102

Re: [HACKERS] Bug in ALTER COLUMN SET DATA TYPE ?

2012-11-02 Thread Pavan Deolasee

On Sat, Nov 3, 2012 at 10:21 AM, Nikhil Sontakke  wrote:

>
>
>> So coming back to the issue, do you think it's a good idea to teach
>> ATAddCheckConstraint() that the call is coming from a late phase of ALTER
>> TABLE ?
>>
>> +1
>
> You mentioned AT_PASS_OLD_INDEX in your original mail, but I guess that
> you meant that we should check for  AT_PASS_OLD_CONSTR and then not raise
> an error?
>
>
Yeah, I meant  AT_PASS_OLD_CONSTR.

Thanks,
Pavan

Re: [HACKERS] Bug in ALTER COLUMN SET DATA TYPE ?

2012-11-02 Thread Nikhil Sontakke

> So coming back to the issue, do you think it's a good idea to teach
> ATAddCheckConstraint() that the call is coming from a late phase of ALTER
> TABLE ?
>
> +1

You mentioned AT_PASS_OLD_INDEX in your original mail, but I guess that you
meant that we should check for  AT_PASS_OLD_CONSTR and then not raise an
error?

Regards,
Nikhils
-- 
StormDB - http://www.stormdb.com
The Database Cloud
Postgres-XC Support and Service

Re: [HACKERS] RFC: Timing Events

2012-11-02 Thread Josh Berkus


> I don't see all that going into core without a much bigger push than I
> think people will buy.  What people really want for all these is a
> proper trending system, and that means graphs and dashboards and
> bling--not a history table.

Well, I'm particularly thinking for autoconfiguration.  For example, to
set vacuum_freeze_min_age properly, you have to know the XID "burn rate"
of the server, which is only available via history.  I really don't want
to be depending on a graphical monitoring utility to find these things out.

> This whole approach has the assumption that things are going to fall off
> sometimes.  To expand on that theme for a second, right now I'm more
> worried about the "99%" class of problems.  Neither pg_stat_statements
> nor this idea are very good for tracking the rare rogue problem down.
> They're both aimed to make things that happen a lot more statistically
> likely to be seen, by giving an easier UI to glare at them frequently.
> That's not ideal, but I suspect really fleshing the whole queue consumer
> -> table idea needs to happen to do much better.

I'm just concerned that for some types of incidents, it would be much
more than 1% *of what you want to look at* which fall off.  For example,
consider a server which does 95% reads at a very high rate, but has 2%
of its writes cronically having lock waits.  That's something you want
to solve, but it seems fairly probably that these relatively infrequent
queries would have fallen off the bottom of pg_stat_statements.  Same
thing with the relative handful of queries which do large on-disk sorts.

The problem I'm worried about is that pg_stat_statements is designed to
keep the most frequent queries, but sometimes the thing you really need
to look at is not in the list of most frequent queries.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-02 Thread Josh Berkus


> -Add a configuration subdirectory to the default installation.  Needs to
> follow the config file location, so things like the Debian relocation of
> postgresql.conf still work.  Maybe it has zero files; maybe it has one
> that's named for this purpose, which defaults to the usual:
> 
> # Don't edit this file by hand!  It's overwritten by...
> 
> -Have the standard postgresql.conf end by including that directory
> -SQL parameter changes collect up all other active parameter changes,
> rewrite that file, and signal the server.  If any change requested
> requires a full server restart. warn the user of that fact.

+1

Simple, easy to understand, easy to customize.

> The only obvious bad case I can think of here is if someone has left the
> directory there, but deleted the include_dir statement; then the file
> would be written successfully but never included.  Seems like in the
> worst case the postgresql.conf parser would just need to flag whether it
> found the default directory included or not, to try and flag avoid
> obvious foot shooting.

Yes, and we can have the comment:

# this includes the default directory for extra configuration files
# do not delete or comment this out; remove any extra configuration
# files you don't want instead

... or similar to warn users.  Frankly, if someone removes the
"includedir config/" line, we can presume they know what they are doing.

For that matter, some users might want to move the line to the beginning
of the file, instead of the end.

> Some of the better received ideas I floated for merging the
> recovery.conf file seemed headed this way too.  That also all blocked
> behind the include directory bit being surprisingly tricky to get
> committed.  So that's possible to revive usefully now.  And as much as I
> hate to expand scope by saying it, both changes should probably be
> tackled at once.  It's important to make sure they're both solved well
> by whatever is adopted, they are going to co-exist as committed one day.

Yes.

I'll also point out that includedir would help solve the issue of
"postgresql.conf is under Puppet, but I want to change the logging
options ..." more handily than current solutions.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Synchronous commit not... synchronous?

2012-11-02 Thread Daniel Farina

On Fri, Nov 2, 2012 at 5:08 PM, Hannu Krosing  wrote:
> On 11/02/2012 09:46 PM, Daniel Farina wrote:
>>
>> The bar for "reliable" non-volatile storage for me are things like
>> Amazon's S3, and I think a lot of that has to do with the otherwise
>> relatively impoverished semantics it has, so I think this reliability
>> profile will be or has been duplicated elsewhere.
>>
>> In general, this has some relation to remastering issues.
>>
>> In the future, I'd like to be able to turn off the local pg_xlog, at my
>> option.
>
> Have you tried things like mounting remote RAM drive
> over NFS or similar for pg_xlog ?
>
> You probably could even play with DRBD and have one or both
> of the drives be RAM drives.
>
> Hannu

I'm not so interested in such creative workarounds, especially because
the operational maintenance cost is huge when magnified when all I
wish I could do is buffer WAL and forward to a socket, and then
unblock commits when my replication partner says 'alright', just as
they do in spirit for flushing pg_xlog -- what I want is simpler than
a full blown network file system. I also don't need this kind of thing
in the near future -- I'm way behind what's possible as-is.

And then there are issues like transport encryption and being able to
do this across a geography, ugh.

--
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Unresolved error 0xC0000409 on Windows Server

2012-11-02 Thread Noah Misch

On Fri, Nov 02, 2012 at 02:05:47PM -0500, Merlin Moncure wrote:
> On Fri, Nov 2, 2012 at 1:25 PM, Matthew Gerber  
> wrote:
> > I am encountering an error on my Postgres installation for Windows Server
> > 64-bit. The error was posted here a couple months ago; however, no solution
> > was found on the pgsql-bugs list, so I am reposting it to pgsql-hackers in
> > the hopes that someone will be able to help. My error message is identical
> > to the one previously posted:
> >
> > 2012-11-01 22:36:26 EDT LOG:  0: server process (PID 7060) was
> > terminated by exception 0xC409
> > 2012-11-01 22:36:26 EDT DETAIL:  Failed process was running: INSERT INTO
> > [snipped SQL command]

Could you post an anonymized query, post an anonymized query plan, and/or
describe the general nature of the query?  Does it call functions?  About how
many rows does it insert?

What server settings have you customized?
https://wiki.postgresql.org/wiki/Server_Configuration

If you could get a stack trace or minidump, that would be most helpful:
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Windows

Magnus's questions for the reporter of bug #7517 are relevant, too.  Does the
system have any antivirus software installed?

> > 2012-11-01 22:36:26 EDT LOG:  0: all server processes terminated;
> > reinitializing
> > 2012-11-01 22:36:26 EDT LOCATION:  PostmasterStateMachine,
> > src\backend\postmaster\postmaster.c:3135
> > 2012-11-01 22:36:36 EDT FATAL:  XX000: pre-existing shared memory block is
> > still in use
> > 2012-11-01 22:36:36 EDT HINT:  Check if there are any old server processes
> > still running, and terminate them.
> > 2012-11-01 22:36:36 EDT LOCATION:  PGSharedMemoryCreate,
> > src\backend\port\win32_shmem.c:194

This part smells like a bug in its own right.

> hm, several times over the last couple of months (both on postgres 9.1
> and 9.2), i've seen  a similar crash, but on linux.  It hits the log
> like this:
> 
> Execution halted  (~ 200x)
> Error: segfault from C stack overflow
> Execution halted  (~ 30x)
> LOG:  server process (PID 19882) was terminated by signal 11: Segmentation 
> fault
> LOG:  terminating any other active server processes

> note the lack of LOG in 'Execution halted', etc.  This has happened
> several times, on different servers using different workloads (but
> always under load).  As of yet, I've been unable to get a core but I
> hope to get one next time it happens.  I wonder if it's a similar
> cause?

Google suggests those unadorned messages originate in R.  Do the affected
systems use PL/R?  If so ...

> One thing I've been tempted to try is raising max_stack_depth to see
> if that helps the problem.

... that probably won't help.  Depending on the specifics of the situation,
*lowering* max_stack_depth might tend to give you an ERROR instead of a crash.
Or it might just give R a bit more stack space to devour before reaching the
same crash it would otherwise reach.

Thanks,
nm


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Incorrect behaviour when using a GiST index on points

2012-11-02 Thread Noah Misch

On Fri, Nov 02, 2012 at 09:01:17PM +0400, Alexander Korotkov wrote:
> On Fri, Nov 2, 2012 at 4:46 PM, Noah Misch  wrote:
> > On Fri, Nov 02, 2012 at 04:05:30PM +0400, Alexander Korotkov wrote:
> > > On Thu, Oct 18, 2012 at 11:18 PM, Noah Misch  wrote:
> >
> > > > > --- 1339,1356 
> > > > >   *recheck = false;
> > > > >   break;
> > > > >   case BoxStrategyNumberGroup:
> > > > > ! /*
> > > > > !  * This code repeats logic of on_ob which uses
> > > > simple comparison
> > > > > !  * rather than FP* functions.
> > > > > !  */
> > > > > ! query = PG_GETARG_BOX_P(1);
> > > > > ! key = DatumGetBoxP(entry->key);
> > > > > !
> > > > > ! *recheck = false;
> > > > > ! result = key->high.x >= query->low.x &&
> > > > > !  key->low.x <= query->high.x &&
> > > > > !  key->high.y >= query->low.y &&
> > > > > !  key->low.y <= query->high.y;
> > > >
> > > > For leaf entries, this correctly degenerates to on_pb().  For internal
> > > > entries, it must, but does not, implement box_overlap().  (The fuzzy
> > > > box_overlap() would be fine.)

> >  It
> > remains that the code here must somehow implement a box_overlap()-style
> > calculation for internal pages.
> >
> 
> Sorry, didn't understand this point. What exactly do you mean by
> box_overlap()-style?

point_ops index entries have type "box".  On leaf pages, the box for each
entry is trivial, having high == low.  At leaf pages, gist_point_consistent()
should implement "point <@ box" with an algorithm equivalent to on_pb(); your
latest code achieves that.  In internal pages, the box for each entry is
rarely trivial; it spans all points stored on the leaf page reachable through
its downlink.  At internal pages, gist_point_consistent() should implement
"point <@ box" with an algorithm near-equivalent to box_overlap().  (As an
optional deviation, it may use exact comparisons despite box_overlap() using
fuzzy comparisons.)  Looking at the math again, your latest code does achieve
that, too.  I was thrown off by your use of a different, albeit mathematically
equivalent, algorithm from the one used in box_overlap().  Please don't do
that; either use box_overlap()'s algorithm here, or change box_overlap() to
use the shorter algorithm you have introduced.  Formulating the same
calculation differently in related code is a recipe for confusion.  (Then
again, perhaps the equivalence of the algorithms is obvious to everyone
entitled to travel within 1 km of the geometric type implementation.)

Thanks,
nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Synchronous commit not... synchronous?

2012-11-02 Thread Hannu Krosing


On 11/02/2012 09:46 PM, Daniel Farina wrote:

The bar for "reliable" non-volatile storage for me are things like
Amazon's S3, and I think a lot of that has to do with the otherwise
relatively impoverished semantics it has, so I think this reliability
profile will be or has been duplicated elsewhere.

In general, this has some relation to remastering issues.

In the future, I'd like to be able to turn off the local pg_xlog, at my option.

Have you tried things like mounting remote RAM drive
over NFS or similar for pg_xlog ?

You probably could even play with DRBD and have one or both
of the drives be RAM drives.

Hannu




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Bugs in planner's equivalence-class processing

2012-11-02 Thread Tom Lane

Joel Jacobson  writes:
> If helpful, here is a simple test to reproduce the problem:
> http://pgsql.privatepaste.com/6429e8a200

FWIW, this is fixed already in git, or at least this particular example
gives what seems the right answer:

 fooid | barid | fooint 
---+---+
 2 |   |  1
(1 row)

> Would you recommend me to rewrite all queries of this particular
> type, where you have COALESCE in the WHERE statement,
> as a precaution?

No, but you might want to grab the appropriate patch and apply it
locally, if you tend to write queries like this.  You want one
of these:

Author: Tom Lane 
Branch: master [72a4231f0] 2012-10-18 12:30:10 -0400
Branch: REL9_2_STABLE [0237b3945] 2012-10-18 12:30:25 -0400
Branch: REL9_1_STABLE [447dad719] 2012-10-18 12:29:00 -0400
Branch: REL9_0_STABLE [afdc7515f] 2012-10-18 12:29:06 -0400
Branch: REL8_4_STABLE [779016271] 2012-10-18 12:29:13 -0400
Branch: REL8_3_STABLE [c29a91037] 2012-10-18 12:29:19 -0400

Fix planning of non-strict equivalence clauses above outer joins.

> We haven't migrated to 9.2 yet, but perhaps there is a risk
> similar queries can render the same problems even in 9.1?

9.2 has considerably more scope to make this kind of error, but related
bugs can be demonstrated as far back as 7.4.  It's a bit surprising
nobody noticed until now.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [v9.3] writable foreign tables

2012-11-02 Thread Alexander Korotkov

On Mon, Sep 24, 2012 at 12:49 PM, Kohei KaiGai  wrote:

> 2012/9/23 Kohei KaiGai :
> > 2012/8/29 Kohei KaiGai :
> >> 2012/8/28 Kohei KaiGai :
> >>> 2012/8/28 Tom Lane :
>  Kohei KaiGai  writes:
> >> Would it be too invasive to introduce a new pointer in
> TupleTableSlot
> >> that is NULL for anything but virtual tuples from foreign tables?
> 
> > I'm not certain whether the duration of TupleTableSlot is enough to
> > carry a private datum between scan and modify stage.
> 
>  It's not.
> 
> > Is it possible to utilize ctid field to move a private pointer?
> 
>  UPDATEs and DELETEs do not rely on the ctid field of tuples to carry
> the
>  TID from scan to modify --- in fact, most of the time what the modify
>  step is going to get is a "virtual" TupleTableSlot that hasn't even
>  *got* a physical CTID field.
> 
>  Instead, the planner arranges for the TID to be carried up as an
>  explicit resjunk column named ctid.  (Currently this is done in
>  rewriteTargetListUD(), but see also preptlist.c which does some
> related
>  things for SELECT FOR UPDATE.)
> 
>  I'm inclined to think that what we need here is for FDWs to be able to
>  modify the details of that behavior, at least to the extent of being
>  able to specify a different data type than TID for the row
>  identification column.
> 
> >>> Hmm. It seems to me a straight-forward solution rather than ab-use
> >>> of ctid system column. Probably, cstring data type is more suitable
> >>> to carry a private datum between scan and modify stage.
> >>>
> >>> One problem I noticed is how FDW driver returns an extra field that
> >>> is in neither system nor regular column.
> >>> Number of columns and its data type are defined with TupleDesc of
> >>> the target foreign-table, so we also need a feature to extend it on
> >>> run-time. For example, FDW driver may have to be able to extend
> >>> a "virtual" column with cstring data type, even though the target
> >>> foreign table does not have such a column.
> >>>
> >> I tried to investigate the related routines.
> >>
> >> TupleDesc of TupleTableSlot associated with ForeignScanState
> >> is initialized at ExecInitForeignScan as literal.
> >> ExecAssignScanType assigns TupleDesc of the target foreign-
> >> table on tts_tupleDescriptor, "as-is".
> >> It is the reason why IterateForeignScan cannot return a private
> >> datum except for the columns being declared as regular ones.
> >>
> > The attached patch improved its design according to the upthread
> > discussion. It now got away from ab-use of "ctid" field, and adopts
> > a concept of pseudo-column to hold row-id with opaque data type
> > instead.
> >
> > Pseudo-column is Var reference towards attribute-number larger
> > than number of attributes on the target relation; thus, it is not
> > a substantial object. It is normally unavailable to reference such
> > a larger attribute number because TupleDesc of each ScanState
> > associated with a particular relation is initialized at ExecInitNode.
> >
> > The patched ExecInitForeignScan was extended to generate its
> > own TupleDesc including pseudo-column definitions on the fly,
> > instead of relation's one, when scan-plan of foreign-table requires
> > to have pseudo-columns.
> >
> > Right now, the only possible pseudo-column is "rowid" being
> > injected at rewriteTargetListUD(). It has no data format
> > restriction like "ctid" because of VOID data type.
> > FDW extension can set an appropriate value on the "rowid"
> > field in addition to contents of regular columns at
> > IterateForeignScan method, to track which remote row should
> > be updated or deleted.
> >
> > Another possible usage of this pseudo-column is push-down
> > of target-list including complex calculation. It may enable to
> > move complex mathematical formula into remote devices
> > (such as GPU device?) instead of just a reference of Var node.
> >
> > This patch adds a new interface: GetForeignRelInfo being invoked
> > from get_relation_info() to adjust width of RelOptInfo->attr_needed
> > according to the target-list which may contain "rowid" pseudo-column.
> > Some FDW extension may use this interface to push-down a part of
> > target list into remote side, even though I didn't implement this
> > feature on file_fdw.
> >
> > RelOptInfo->max_attr is a good marker whether the plan shall have
> > pseudo-column reference. Then, ExecInitForeignScan determines
> > whether it should generate a TupleDesc, or not.
> >
> > The "rowid" is fetched using ExecGetJunkAttribute as we are currently
> > doing on regular tables using "ctid", then it shall be delivered to
> > ExecUpdate or ExecDelete. We can never expect the fist argument of
> > them now, so "ItemPointer tupleid" redefined to "Datum rowid", and
> > argument of BR-trigger routines redefined also.
> >
> > [kaigai@iwashi sepgsql]$ cat ~/testfile.csv
> > 10  aaa
> > 11  bbb
> > 12  ccc
> > 1

Re: [HACKERS] Synchronous commit not... synchronous?

2012-11-02 Thread Daniel Farina

On Fri, Nov 2, 2012 at 1:06 PM, Jeff Janes  wrote:
>> I see why it is implemented this way, but it's also still pretty
>> unsatisfying because it means that with cancellation requests clients
>> are in theory able to commit an unlimited number of transactions,
>> synchronous commit or no.
>
> What evil does this allow the client to perpetrate?

The client can commit against my will by accident in an automated
system whose behavior is at least moderately complex and hard to
understand completely for all involved, and then the client's author
subsequently writes me a desperate or angry support request asking why
data was lost. This is not the best time for me to ask "did you setup
a scheduled task to cancel hanging queries automatically? Because
yeah...there's this...thing."

>> It's probably close enough for most purposes, but what would you think
>> about a "2PC-ish" mode at the physical (rather than logical/PREPARE
>> TRANSACTION) level, whereby the master would insist that its standbys
>> have more data written (or at least received...or at least sent) than
>> it has guaranteed flushed to its own xlog at any point?
>
> Then if they interrupt the commit, the remote has it permanently but
> the local does not.  That would be corruption.

That is a good point.

When the server starts up it could interrogate it standbys for WAL to
apply. My ideal is to get a similar relationship between a master and
its 'local' pg_xlog, except over socket, and possibly (but entirely
optionally) to a non-Postgres receiver of WAL, that may buffer WAL and
then submit it directly to what is typically thought of as the
archives.  I have a number of reasons for doing that, but they can all
be summed as: block devices are much more prone to failures -- both
simple and byzantine -- than memory and network traffic with enough
fidelity checking (such as TLS), and the pain from block device
failures -- in particular, the byzantine ones -- is very high when
they occur.

The bar for "reliable" non-volatile storage for me are things like
Amazon's S3, and I think a lot of that has to do with the otherwise
relatively impoverished semantics it has, so I think this reliability
profile will be or has been duplicated elsewhere.

In general, this has some relation to remastering issues.

In the future, I'd like to be able to turn off the local pg_xlog, at my option.

This is something that I've been very slowly moving forward on for a
while, with the first step being writing a Postgres proxy, currently
underway.  The tool support for this kind of facility is not really in
existence yet, but I'll catch up some day...

> What the "DETAIL" doesn't make clear about the current system is that
> the commit *will* be replicated to the standby *eventually*, unless
> the master burns down first.  In particular, if any commit after this
> one makes it to the standby, then the interrupted one is guaranteed to
> have made it as well.
>
>> This would be a nice invariant to have when dealing with a large
>> number of systems, allowing for the catching of some tricky bugs, that
>> standbys are always greater-than-or-equal-to the master's XLogPos.
>
> Could you elaborate on that?

Sure. I'd like to sanity check failovers with as many simple
invariants as I can to catch problems.  Losing a cheap to confirm
invariant is losing a check, so that would be unfortunate when doing
more failovers simultaneously than humans can realistically be
involved with in a short amount of time, as the results of a bug there
are most unpleasant.

--
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Bugs in planner's equivalence-class processing

2012-11-02 Thread Joel Jacobson

When I tried our test suite against 9.2.1 one of the tests failed,
it looked really strange, the query had returned a row which could
impossibly match the WHERE statement.

I heard on #postgresql this bug has already been reported.

If helpful, here is a simple test to reproduce the problem:
http://pgsql.privatepaste.com/6429e8a200

>From what I understand from the discussion, this "bug" or behaviour
has been around for quite some time, dating back to 7.4,
perhaps not exactly the same in all major versions though.

Would you recommend me to rewrite all queries of this particular
type, where you have COALESCE in the WHERE statement,
as a precaution?

We haven't migrated to 9.2 yet, but perhaps there is a risk
similar queries can render the same problems even in 9.1?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Unresolved error 0xC0000409 on Windows Server

2012-11-02 Thread Merlin Moncure

On Fri, Nov 2, 2012 at 1:25 PM, Matthew Gerber  wrote:
> Hello,
>
> I am encountering an error on my Postgres installation for Windows Server
> 64-bit. The error was posted here a couple months ago; however, no solution
> was found on the pgsql-bugs list, so I am reposting it to pgsql-hackers in
> the hopes that someone will be able to help. My error message is identical
> to the one previously posted:
>
> 2012-11-01 22:36:26 EDT LOG:  0: server process (PID 7060) was
> terminated by exception 0xC409
> 2012-11-01 22:36:26 EDT DETAIL:  Failed process was running: INSERT INTO
> [snipped SQL command]
> 2012-11-01 22:36:26 EDT HINT:  See C include file "ntstatus.h" for a
> description of the hexadecimal value.
> 2012-11-01 22:36:26 EDT LOCATION:  LogChildExit,
> src\backend\postmaster\postmaster.c:2884
> 2012-11-01 22:36:26 EDT LOG:  0: terminating any other active server
> processes
> 2012-11-01 22:36:26 EDT LOCATION:  HandleChildCrash,
> src\backend\postmaster\postmaster.c:2682
> 2012-11-01 22:36:26 EDT WARNING:  57P00: terminating connection because of
> crash of another server process
> 2012-11-01 22:36:26 EDT DETAIL:  The postmaster has commanded this server
> process to roll back the current transaction and exit, because another
> server process exited abnormally and possibly corrupted shared memory.
> 2012-11-01 22:36:26 EDT HINT:  In a moment you should be able to reconnect
> to the database and repeat your command.
> 2012-11-01 22:36:26 EDT LOCATION:  quickdie,
> src\backend\tcop\postgres.c:2556
> 2012-11-01 22:36:26 EDT LOG:  0: all server processes terminated;
> reinitializing
> 2012-11-01 22:36:26 EDT LOCATION:  PostmasterStateMachine,
> src\backend\postmaster\postmaster.c:3135
> 2012-11-01 22:36:36 EDT FATAL:  XX000: pre-existing shared memory block is
> still in use
> 2012-11-01 22:36:36 EDT HINT:  Check if there are any old server processes
> still running, and terminate them.
> 2012-11-01 22:36:36 EDT LOCATION:  PGSharedMemoryCreate,
> src\backend\port\win32_shmem.c:194
>
> The error happens regularly while performing database INSERTS. The [snipped
> SQL command] part above contains the INSERT command that was executing when
> the server crashed. After restarting the server the command executes fine,
> so it's not a problem with the command. I installed Postgres from the
> standard Windows binary "postgresql-9.2.1-1-windows-x64.exe" and I have not
> changed any configuration settings from their default values.
>
> Does anyone know what might be happening and how I might fix it?

hm, several times over the last couple of months (both on postgres 9.1
and 9.2), i've seen  a similar crash, but on linux.  It hits the log
like this:

Execution halted  (~ 200x)
Error: segfault from C stack overflow
Execution halted  (~ 30x)
LOG:  server process (PID 19882) was terminated by signal 11: Segmentation fault
LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.

note the lack of LOG in 'Execution halted', etc.  This has happened
several times, on different servers using different workloads (but
always under load).  As of yet, I've been unable to get a core but I
hope to get one next time it happens.  I wonder if it's a similar
cause?

One thing I've been tempted to try is raising max_stack_depth to see
if that helps the problem.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Synchronous commit not... synchronous?

2012-11-02 Thread Shaun Thomas


On 11/02/2012 12:31 PM, Simon Riggs wrote:


If people want full two phase commit, that option exists also.


I was about to say...  isn't that what savepoints are for?


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Synchronous commit not... synchronous?

2012-11-02 Thread Simon Riggs

On 2 November 2012 16:27, Jeff Janes  wrote:
> On Thu, Nov 1, 2012 at 12:42 PM, Daniel Farina  wrote:
>> On Wed, Oct 31, 2012 at 10:10 PM, Michael Paquier
>>  wrote:
>>> Btw, I believe that this is correct behavior, because in Peter's case the
>>> manual command gets the priority on the value of synchronous_commit, no?
>>> If anybody thinks that I am wrong, feel free to argue on that of course...
>>
>> The idea of canceling a COMMIT statement causing a COMMIT seems pretty
>> strange to me.
>
> It would be.  But you are not cancelling the commit, you are
> *attempting* to cancel the commit.  The message you receive explains
> to what extend your attempt succeeded.

That is correct.

It is possible to cancel the COMMIT, but only until it happens.

If people want full two phase commit, that option exists also.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Incorrect behaviour when using a GiST index on points

2012-11-02 Thread Alexander Korotkov

On Fri, Nov 2, 2012 at 4:46 PM, Noah Misch  wrote:

> On Fri, Nov 02, 2012 at 04:05:30PM +0400, Alexander Korotkov wrote:
> > On Thu, Oct 18, 2012 at 11:18 PM, Noah Misch  wrote:
>
> > > > --- 1339,1356 
> > > >   *recheck = false;
> > > >   break;
> > > >   case BoxStrategyNumberGroup:
> > > > ! /*
> > > > !  * This code repeats logic of on_ob which uses
> > > simple comparison
> > > > !  * rather than FP* functions.
> > > > !  */
> > > > ! query = PG_GETARG_BOX_P(1);
> > > > ! key = DatumGetBoxP(entry->key);
> > > > !
> > > > ! *recheck = false;
> > > > ! result = key->high.x >= query->low.x &&
> > > > !  key->low.x <= query->high.x &&
> > > > !  key->high.y >= query->low.y &&
> > > > !  key->low.y <= query->high.y;
> > >
> > > For leaf entries, this correctly degenerates to on_pb().  For internal
> > > entries, it must, but does not, implement box_overlap().  (The fuzzy
> > > box_overlap() would be fine.)  I recommend making
> gist_point_consistent()'s
> > > treatment of boxes resemble its treatment of circles and polygons; that
> > > eases
> > > verifying their correctness.  Call gist_box_consistent.  Then, for leaf
> > > entries, call box_contain_pt().
> > >
> >
> > I have two objections on doing that:
> > 1) It's not evident for me that fuzzy comparison in internal pages is
> fine.
> > Obviously, it depends on data distribution. It's easy to provide an
> example
> > when fuzzy comparison will lead to significant performance degradation.
> > 2) With PolygonStrategyNumberGroup CircleStrategyNumberGroup it's faster
> to
> > do simple box comparison than doing calculation for exact circle and
> > especially polygon check. In this case previous filtering in leaf pages
> > looks reasonable. With BoxStrategyNumberGroup exact calculation is
> simpler
> > than gist_box_consistent.
>
> That's fair; I withdraw the recommendation to use gist_box_consistent().
>  It
> remains that the code here must somehow implement a box_overlap()-style
> calculation for internal pages.
>

Sorry, didn't understand this point. What exactly do you mean by
box_overlap()-style?

--
With best regards,
Alexander Korotkov.

[HACKERS] the number of pending entries in GIN index with FASTUPDATE=on

2012-11-02 Thread Fujii Masao

Hi,

Is there the way to know the number of pending entries in GIN index which
was created with FASTUPDATE = on? If not, is it worth implementing the
function returning that number?

I sometimes would like to know that number when I measure how much
pending entries affect the performance of GIN index scan and tune how
frequently autovacuum should run VACUUM to clean up them.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Synchronous commit not... synchronous?

2012-11-02 Thread Jeff Janes

On Thu, Nov 1, 2012 at 12:42 PM, Daniel Farina  wrote:
> On Wed, Oct 31, 2012 at 10:10 PM, Michael Paquier
>  wrote:
>> Btw, I believe that this is correct behavior, because in Peter's case the
>> manual command gets the priority on the value of synchronous_commit, no?
>> If anybody thinks that I am wrong, feel free to argue on that of course...
>
> The idea of canceling a COMMIT statement causing a COMMIT seems pretty
> strange to me.

It would be.  But you are not cancelling the commit, you are
*attempting* to cancel the commit.  The message you receive explains
to what extend your attempt succeeded.

Cheers,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Extensions Documentation

2012-11-02 Thread David E. Wheeler

On Nov 2, 2012, at 7:56 AM, Dimitri Fontaine  wrote:

>> I'd also be in favor of adding hooks to generate man pages.
> 
> Who still use their local copy of the docs (without search ability)
> anyway? About man pages, I don't know how many DBA are looking there
> when they want to find some documentation.

I use man pages *all the time*.

> I think it all gets down to having a local text file installed and a
> proper web site to show off the extension's documentation, tutorial,
> quick start, etc. A good example of that would be the pgmp extension:
> 
>  http://pgmp.projects.pgfoundry.org/
>  https://github.com/dvarrazzo/pgmp/

Yes, and also http://pgxn.org/dist/pgmp/. But that doesn't help if you're 
offline or behind a moronic corporate firewall.

> Well I'm not really seeing how improving the local copy of any
> documentation is going to change the habit of people to just use some
> online reference with good integrated search facility, or even more
> often, $SEARCH_ENGINE.

Yes, that was one of my motivations for creating PGXN. But I still would love 
to be able to send someone instructions like this:

pgxn install pgtap
man pgtap

Rather than this:

pgxn install pgtap
less `pg_config --docdir `/extension/pgtap.mmd

The man page will be easier on the eyes, too.

When I *do* need the HTML docs, I use this when I need to access the full docs 
on a long flight:

open `pg_config --docdir`/html/index.html

I would love to see an index of links to pages for each installed extension, 
too. We have that for contrib stuff, why not third-party extensions?

Best,

David

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-02 Thread Dimitri Fontaine

Andres Freund  writes:
> I think it only really becomes viable with the introduction of 
> directory includes where you can use one file per value.

+1

-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Bug in ALTER COLUMN SET DATA TYPE ?

2012-11-02 Thread Pavan Deolasee

On Fri, Nov 2, 2012 at 9:09 PM, Pavan Deolasee wrote:

>
>
> I also though that for a moment, but the commit that I mentioned did not
> move that error message from somewhere else. I also tested by resetting
> to commit 1f0363001166ef6a43619846e44cfb9dbe7335ed (previous to the
> offending commit) and don't see any error.
>
> But since you're seeing it, may it was taken out by some other older
> commit and then added back again. Will look more into it
>
>
Here is more trace:

The error message was first added by:
commit cd902b331dc4b0c170e800441a98f9213d98b46b
Author: Tom Lane 
Date:   Fri May 9 23:32:05 2008 +

It was then removed by:
commit 61d81bd28dbec65a6b144e0cd3d0bfe25913c3ac
Author: Alvaro Herrera 
Date:   Mon Dec 5 15:10:18 2011 -0300

And then again added back by:
commit 09ff76fcdb275769ac4d1a45a67416735613d04b
Author: Alvaro Herrera 
Date:   Fri Apr 20 23:46:20 2012 -0300

So coming back to the issue, do you think it's a good idea to teach
ATAddCheckConstraint() that the call is coming from a late phase of ALTER
TABLE ?

Thanks,
Pavan

Re: [HACKERS] Bug in ALTER COLUMN SET DATA TYPE ?

2012-11-02 Thread Pavan Deolasee

On Fri, Nov 2, 2012 at 9:00 PM, Tom Lane  wrote:

>
>
> In the 8.4 branch, the error is coming from here:
>
> regression=# \set VERBOSITY verbose
> regression=# ALTER TABLE test ALTER COLUMN a  TYPE numeric;
> ERROR:  42P16: constraint must be added to child tables too
> LOCATION:  ATAddCheckConstraint, tablecmds.c:4467
>
> which appears to have been added in commit cd902b331.  I think the
> commit Pavan is looking at just moved the logic around.
>

I also though that for a moment, but the commit that I mentioned did not
move that error message from somewhere else. I also tested by resetting
to commit 1f0363001166ef6a43619846e44cfb9dbe7335ed (previous to the
offending commit) and don't see any error.

But since you're seeing it, may it was taken out by some other older commit
and then added back again. Will look more into it

Thanks,
Pavan

Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-02 Thread Greg Stark

On Fri, Nov 2, 2012 at 1:19 AM, Greg Smith  wrote:
>> The idea at the time was to use the include *directory* functionality,
>> for say a "config.d" directory in pgdata. The builtin one would then
>> use a predictable filename in this directory, so that the DBA who
>> prefers it can drop files both before and after that file into the
>> directory.
>
>
> That's how I remember things as well.

This sounds similar but a bit different from the solution I advocated
for and thought there was widespread support for.

If we changed the default postgresql.conf to be empty except for an
"include postgresql.conf.auto" and had tools to write out
postgresql.conf.auto then things would basically just work.

The main gotcha would have been if people *do* put any settings in
postgresql.conf manually then they would override any auto settings
(if they came after the include) or be overridden by them (if they
come before the include). This might be a bit confusing but I think it
would be fine -- the tools might want to display a warning if the
current source is from a setting in a different file.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Bug in ALTER COLUMN SET DATA TYPE ?

2012-11-02 Thread Tom Lane

Nikhil Sontakke  writes:
>> Hmm.. I haven't yet tested on other branches. But I'm surprised that you
>> find it broken on so much old branches. "git annotate" suggests that the
>> offending error message was added by commit
>> 09ff76fcdb275769ac4d1a45a67416735613d04b and that commit is dated Apr 20,
>> 2012

> Mea Culpa. I did indeed add that error message. But it's strange when Tom
> says that this is broken since 8.3!

In the 8.4 branch, the error is coming from here:

regression=# \set VERBOSITY verbose
regression=# ALTER TABLE test ALTER COLUMN a  TYPE numeric;
ERROR:  42P16: constraint must be added to child tables too
LOCATION:  ATAddCheckConstraint, tablecmds.c:4467

which appears to have been added in commit cd902b331.  I think the
commit Pavan is looking at just moved the logic around.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Bug in ALTER COLUMN SET DATA TYPE ?

2012-11-02 Thread Nikhil Sontakke

> > postgres=# CREATE TABLE test (a float check (a > 10.2));
>> > CREATE TABLE
>> > postgres=# CREATE TABLE test_child() INHERITS(test);
>> > CREATE TABLE
>> > postgres=# ALTER TABLE test ALTER COLUMN a SET DATA TYPE numeric;
>> > ERROR:  constraint must be added to child tables too
>>
>> Interestingly, this works in 8.3 and earlier ... but it fails as
>> described in all later versions.  So we broke it quite some time back.
>> It would be good to identify exactly what change broke it.
>>
>>
> Hmm.. I haven't yet tested on other branches. But I'm surprised that you
> find it broken on so much old branches. "git annotate" suggests that the
> offending error message was added by commit
> 09ff76fcdb275769ac4d1a45a67416735613d04b and that commit is dated Apr 20,
> 2012
>
>

Mea Culpa. I did indeed add that error message. But it's strange when Tom
says that this is broken since 8.3!

Regards,
Nikhils
-- 
StormDB - http://www.stormdb.com
The Database Cloud
Postgres-XC Support and Service

Re: [HACKERS] unfixed bugs with extensions

2012-11-02 Thread Dimitri Fontaine

Dimitri Fontaine  writes:
> Do you want me to polish it? I'd just need some review or hints to fix
> it the way you want.

Oh, catching-up on -hackers before doing the same thing in -commiters is
not the best approach, sorry about that.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Bug in ALTER COLUMN SET DATA TYPE ?

2012-11-02 Thread Pavan Deolasee

On Fri, Nov 2, 2012 at 8:11 PM, Tom Lane  wrote:

> Pavan Deolasee  writes:
> > I noticed this behavior on master and it seems like a bug to me:
>
> > postgres=# CREATE TABLE test (a float check (a > 10.2));
> > CREATE TABLE
> > postgres=# CREATE TABLE test_child() INHERITS(test);
> > CREATE TABLE
> > postgres=# ALTER TABLE test ALTER COLUMN a SET DATA TYPE numeric;
> > ERROR:  constraint must be added to child tables too
>
> Interestingly, this works in 8.3 and earlier ... but it fails as
> described in all later versions.  So we broke it quite some time back.
> It would be good to identify exactly what change broke it.
>
>
Hmm.. I haven't yet tested on other branches. But I'm surprised that you
find it broken on so much old branches. "git annotate" suggests that the
offending error message was added by commit
09ff76fcdb275769ac4d1a45a67416735613d04b and that commit is dated Apr 20,
2012

Thanks,
Pavan

Re: [HACKERS] unfixed bugs with extensions

2012-11-02 Thread Dimitri Fontaine

Alvaro Herrera  writes:
> We have a couple of unfixed bugs regarding extensions, for which patches
> have been proposed but remain unfixed in git.
>
> The oldest is bug #6704, for which a proposed fix for the master branch
> was posted here: 
> http://archives.postgresql.org/message-id/m2zk4e6p7m@2ndquadrant.fr
> There was some disagreement about the proper way forward, so I hadn't
> looked at this patch.  I just did, and find that even if the approach
> taken by the patch is the correct one, it needs some polish before it
> can be committed.

Do you want me to polish it? I'd just need some review or hints to fix
it the way you want.

> The other one was reported by Marko Kreen in

Has been fixed by Tom.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Extensions Documentation

2012-11-02 Thread Dimitri Fontaine

"David E. Wheeler"  writes:
> Put it into the HTML directory
> (share/docs/html/extensions/$extension.html) and inject its name into
> the TOC.
>
> I'd also be in favor of adding hooks to generate man pages.

Who still use their local copy of the docs (without search ability)
anyway? About man pages, I don't know how many DBA are looking there
when they want to find some documentation.

I think it all gets down to having a local text file installed and a
proper web site to show off the extension's documentation, tutorial,
quick start, etc. A good example of that would be the pgmp extension:

  http://pgmp.projects.pgfoundry.org/
  https://github.com/dvarrazzo/pgmp/

>> That said, if there are things we could put in, e.g., pgxs, to make
>> building documentation simpler, we can discuss that.
>
> Yeah, that would be ideal. But if no one has really thought about how
> to go about it yet…

Well I'm not really seeing how improving the local copy of any
documentation is going to change the habit of people to just use some
online reference with good integrated search facility, or even more
often, $SEARCH_ENGINE.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Synchronous commit not... synchronous?

2012-11-02 Thread Fujii Masao

On Fri, Nov 2, 2012 at 4:42 AM, Daniel Farina  wrote:
> On Wed, Oct 31, 2012 at 10:10 PM, Michael Paquier
>  wrote:
>> Btw, I believe that this is correct behavior, because in Peter's case the
>> manual command gets the priority on the value of synchronous_commit, no?
>> If anybody thinks that I am wrong, feel free to argue on that of course...
>
> The idea of canceling a COMMIT statement causing a COMMIT seems pretty
> strange to me.
>
> I would also not expect a cancelled INSERT statement to INSERT, as
> seems would happen by applying the same rules in the
> autocommit/implicit commit case here.

So how should we handle the case where cancel request or SIGINT arrives
while COMMIT is waiting for its WAL to be replicated to the standby? It's hard
to rollback the COMMIT because its WAL has already been flushed to
the disk locally. You think that we should prevent COMMIT at that state
from being canceled?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [9.1] 2 bugs with extensions

2012-11-02 Thread Dimitri Fontaine

Tom Lane  writes:
>> I think it may be time to bite the bullet and change that (including
>> breaking dumpSequence() into two separate functions).  I'm a little bit
>> worried about the compatibility implications of back-patching such a
>> change, though.  Is it likely that anybody out there is depending on the
>> fact that, eg, pg_dump --section=pre-data currently includes SEQUENCE SET
>> items?  Personally I think it's more likely that that'd be seen as a
>> bug, but ...

FWIW, +1

> Specifically, I'm thinking this, which looks rather bulky but most of
> the diff is from reindenting the guts of dumpSequence().

I see that you commited that patch, thanks a lot Tom!

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Bug in ALTER COLUMN SET DATA TYPE ?

2012-11-02 Thread Tom Lane

Pavan Deolasee  writes:
> I noticed this behavior on master and it seems like a bug to me:

> postgres=# CREATE TABLE test (a float check (a > 10.2));
> CREATE TABLE
> postgres=# CREATE TABLE test_child() INHERITS(test);
> CREATE TABLE
> postgres=# ALTER TABLE test ALTER COLUMN a SET DATA TYPE numeric;
> ERROR:  constraint must be added to child tables too

Interestingly, this works in 8.3 and earlier ... but it fails as
described in all later versions.  So we broke it quite some time back.
It would be good to identify exactly what change broke it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Prefetch index pages for B-Tree index scans

2012-11-02 Thread John Lumby

Claudio Freire wrote:
>
> On Thu, Nov 1, 2012 at 10:59 PM, Greg Smith  wrote:
> > On 11/1/12 6:13 PM, Claudio Freire wrote:
> >
> >> posix_fadvise what's the trouble there, but the fact that the kernel
> >> stops doing read-ahead when a call to posix_fadvise comes. I noticed
> >> the performance hit, and checked the kernel's code. It effectively
> >> changes the prediction mode from sequential to fadvise, negating the
> >> (assumed) kernel's prefetch logic.
> >
> > The Linux posix_fadvise implementation never seemed like it was well liked
> > by the kernel developers. Quirky stuff like this popped up all the time
> > during that period, when effective_io_concurrency was being added. I wonder
> > how far back the fadvise/read-ahead conflict goes back.
>
> Well, to be precise it's not so much as a problem in posix_fadvise
> itself, it's a problem in how it interacts with readahead. Since
> readahead works at the memory mapper level, and only when actually
> performing I/O (which would seem at first glance quite sensible), it
> doesn't get to see fadvise activity.
>
> FADV_WILLNEED is implemented as a forced readahead, which doesn't
> update any of the readahead context structures. Again, at first
> glance, this would seem sensible (explicit hints shouldn't interfere
> with pattern detection logic). However, since those pages are (after
> the fadvise call) under async I/O, next time the memory mapper needs
> that page, instead of requesting I/O through readahead logic, it will
> wait for async I/O to complete.
>
> IOW, what was sequential in fact, became invisible to readahead,
> indistinguishable from random I/O. Whatever page fadvise failed to
> predict will be treated as random I/O, and here the trouble lies.

And this may be one other advantage of async io over posix_fadvise in
the linux environment (with the present mmap behaviour) :
that async io achives the same objective of improving disk/processing overlap
without the mentioned interference with read-ahead.
Although to confirm this would ideally require 3-way comparing
   posix-fadvise + existing readahead behaviour
   posix-fadvise + modify existing readahead behaviour 
  to not force waiting for current async io
                              (i.e. just check the aio and continue normal 
readahead if in progress)
   async io wth no posix-fadvise

It seems in general to be preferable to have an implementation that is
less dependent on specific behaviour of the OS read-head mechanism.

>
> >> I've mused about the possibility to batch async_io requests, and use
> >> the scatter/gather API instead of sending tons of requests to the
> >
> >> kernel. I think doing so would enable a zero-copy path that could very
> >> possibly imply big speed improvements when memory bandwidth is the
> >> bottleneck.
> >
> > Another possibly useful bit of history here for you. Greg Stark wrote a
> > test program that used async I/O effectively on both Linux and Solaris.
> > Unfortunately, it was hard to get that to work given how Postgres does its
> > buffer I/O, and using processes instead of threads. This looks like the
> > place he commented on why:
> >
> > http://postgresql.1045698.n5.nabble.com/Multi-CPU-Queries-Feedback-and-or-suggestions-wanted-td1993361i20.html
> >
> > The part I think was relevant there from him:
> >
> > "In the libaio view of the world you initiate io and either get a
> > callback or call another syscall to test if it's complete. Either
> > approach has problems for Postgres. If the process that initiated io
> > is in the middle of a long query it might take a long time, or not even
> > never get back to complete the io. The callbacks use threads...
> >
> > And polling for completion has the problem that another process could
> > be waiting on the io and can't issue a read as long as the first
> > process has the buffer locked and io in progress. I think aio makes a
> > lot more sense if you're using threads so you can start a thread to
> > wait for the io to complete."
>
> I noticed that. I always envisioned async I/O as managed by some
> dedicated process. One that could check for completion or receive
> callbacks. Postmaster, for instance.

Thanks for the mentioning this posting.    Interesting.
However,    the OP describes an implementation based on libaio. 
Today what we have (for linux) is librt,  which is quite different.
It is arguable worse than libaio (well actually I am sure it is worse)
since it is essentially just an encapsulation of using threads to do
synchronous ios  -  you can look at it as making it easier to do what the 
application could do itself if it set up its own pthreads. The linux
kernel does not know about it and so the CPU overhead of checking for
completion is higher.
But if async io is used *only* for prefetching, and not for the actual
ReadBuffer itself   (which is what I did),   then the problem
mentioned by the OP 
 "If the process that initiated io is in the middle of a long

Re: [HACKERS] Incorrect behaviour when using a GiST index on points

2012-11-02 Thread Noah Misch

On Fri, Nov 02, 2012 at 04:05:30PM +0400, Alexander Korotkov wrote:
> On Thu, Oct 18, 2012 at 11:18 PM, Noah Misch  wrote:

> > > --- 1339,1356 
> > >   *recheck = false;
> > >   break;
> > >   case BoxStrategyNumberGroup:
> > > ! /*
> > > !  * This code repeats logic of on_ob which uses
> > simple comparison
> > > !  * rather than FP* functions.
> > > !  */
> > > ! query = PG_GETARG_BOX_P(1);
> > > ! key = DatumGetBoxP(entry->key);
> > > !
> > > ! *recheck = false;
> > > ! result = key->high.x >= query->low.x &&
> > > !  key->low.x <= query->high.x &&
> > > !  key->high.y >= query->low.y &&
> > > !  key->low.y <= query->high.y;
> >
> > For leaf entries, this correctly degenerates to on_pb().  For internal
> > entries, it must, but does not, implement box_overlap().  (The fuzzy
> > box_overlap() would be fine.)  I recommend making gist_point_consistent()'s
> > treatment of boxes resemble its treatment of circles and polygons; that
> > eases
> > verifying their correctness.  Call gist_box_consistent.  Then, for leaf
> > entries, call box_contain_pt().
> >
> 
> I have two objections on doing that:
> 1) It's not evident for me that fuzzy comparison in internal pages is fine.
> Obviously, it depends on data distribution. It's easy to provide an example
> when fuzzy comparison will lead to significant performance degradation.
> 2) With PolygonStrategyNumberGroup CircleStrategyNumberGroup it's faster to
> do simple box comparison than doing calculation for exact circle and
> especially polygon check. In this case previous filtering in leaf pages
> looks reasonable. With BoxStrategyNumberGroup exact calculation is simpler
> than gist_box_consistent.

That's fair; I withdraw the recommendation to use gist_box_consistent().  It
remains that the code here must somehow implement a box_overlap()-style
calculation for internal pages.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Incorrect behaviour when using a GiST index on points

2012-11-02 Thread Alexander Korotkov

On Thu, Oct 18, 2012 at 11:18 PM, Noah Misch  wrote:

> On Thu, Oct 11, 2012 at 07:17:28AM -0400, Noah Misch wrote:
> > On Tue, Oct 02, 2012 at 01:58:40PM -0400, Noah Misch wrote:
> > > > > On Mon, Aug 27, 2012 at 7:43 PM, Tom Lane 
> wrote:
> > > > >> There's also the big-picture question of whether we should just
> get rid
> > > > >> of fuzzy comparisons in the geometric types instead of trying to
> hack
> > > > >> indexes to work around them.
> >
> > > In any event, I think we should entertain a patch to make the GiST
> operator
> > > class methods bug-compatible with corresponding operators.  Even if we
> decide
> > > to change operator behavior in HEAD, the back branches could use it.
> >
> > We have broad agreement that the specific implementation of fuzz in
> geometric
> > comparison operators is shoddy, but nobody has voiced interest in
> designing a
> > concrete improvement.  I propose adding a TODO item "Remove or improve
> > rounding in geometric comparison operators", endorsing Alexander's
> design, and
> > reviewing his patch.  Objections?
>
> TODO added, and here's a review:
>
> The patch adds no regression tests; it should add tests illustrating the
> problems it fixes.
>

I've added some tests to points.sql.


> I audited the other indexable geometric operators for similar problems.
>  This
> passage in gist_point_consistent_internal(), which handles (point,point)
> operators, caught my suspicion:
>
> case RTSameStrategyNumber:
> if (isLeaf)
> {
> result = FPeq(key->low.x, query->x)
> && FPeq(key->low.y, query->y);
> }
> else
> {
> result = (query->x <= key->high.x &&
> query->x >= key->low.x &&
>   query->y <= key->high.y
> && query->y >= key->low.y);
> }
> break;
>
> A leaf entry reachable from an internal entry may fall exactly on the
> internal-entry bounding box.  Since we would accept a fuzzy match at the
> leaf
> level, I think we must also accept a fuzzy match at the internal level.


Good catch, fixed.


>  > *** a/src/backend/access/gist/gistproc.c
> > --- b/src/backend/access/gist/gistproc.c
>
> > ***
> > *** 1326,1331  gist_point_consistent(PG_FUNCTION_ARGS)
> > --- 1327,1333 
> >   bool   *recheck = (bool *) PG_GETARG_POINTER(4);
> >   boolresult;
> >   StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
> > + BOX*query, *key;
>
> This function now has "query" variables within subsidiary blocks redundant
> with and masking this one.  Avoid doing that.


Fixed.

>
> >   switch (strategyGroup)
> >   {
> > ***
> > *** 1337,1348  gist_point_consistent(PG_FUNCTION_ARGS)
> >   *recheck = false;
> >   break;
> >   case BoxStrategyNumberGroup:
> > ! result = DatumGetBool(DirectFunctionCall5(
> > !
> gist_box_consistent,
> > !
> PointerGetDatum(entry),
> > !
> PG_GETARG_DATUM(1),
> > !
> Int16GetDatum(RTOverlapStrategyNumber),
> > !
>  0, PointerGetDatum(recheck)));
> >   break;
> >   case PolygonStrategyNumberGroup:
> >   {
> > --- 1339,1356 
> >   *recheck = false;
> >   break;
> >   case BoxStrategyNumberGroup:
> > ! /*
> > !  * This code repeats logic of on_ob which uses
> simple comparison
> > !  * rather than FP* functions.
> > !  */
> > ! query = PG_GETARG_BOX_P(1);
> > ! key = DatumGetBoxP(entry->key);
> > !
> > ! *recheck = false;
> > ! result = key->high.x >= query->low.x &&
> > !  key->low.x <= query->high.x &&
> > !  key->high.y >= query->low.y &&
> > !  key->low.y <= query->high.y;
>
> For leaf entries, this correctly degenerates to on_pb().  For internal
> entries, it must, but does not, implement box_overlap().  (The fuzzy
> box_overlap() would be fine.)  I recommend making gist_point_consistent()'s
> treatment of boxes resemble its treatment of circles and polygons; that
> eases
> verifying their correctness.  Call gist_box_consistent.  Then, for leaf
> entries, call box_contain_pt().
>

I have two objections on doing that:
1) It's not evident for me that fuzzy comparison in internal pages is fine.
Obviously, it depends on data distribution. It's easy to provide an example
when fuzzy com

Re: [HACKERS] Problem Observed in behavior of Create Index Concurrently and Hot Update

2012-11-02 Thread Pavan Deolasee

On Wed, Oct 31, 2012 at 2:40 PM, Simon Riggs  wrote:

>
> >
> > diff --git a/src/backend/utils/cache/relcache.c
> > b/src/backend/utils/cache/relcache.c
> > index a59950e..9cadb3f 100644
> > --- a/src/backend/utils/cache/relcache.c
> > +++ b/src/backend/utils/cache/relcache.c
> > @@ -3355,6 +3355,12 @@ RelationGetIndexList(Relation relation)
> > oidvector  *indclass;
> > boolisnull;
> >
> > +   /*
> > +* Ignore any indexes that are currently being dropped
> > +*/
> > +   if (!index->indisvalid && !index->indisready)
> > +   continue;
> > +
> > /* Add index's OID to result list in the proper order */
> > result = insert_ordered_oid(result, index->indexrelid);
>
> Agreed, will fix.
>
>
Thanks Simon.

Looks like a severe data corruption issue. Is there a minor release planned
in near future or would this need one ?

Please let me know if you need any help with this. I investigated this a
bit before posting my analysis (just to ensure that its not a HOT's bug),
but since it involved DROP INDEX CONCURRENTLY, thought it will best be
addressed by you.

Thanks,
Pavan

[HACKERS] Bug in ALTER COLUMN SET DATA TYPE ?

2012-11-02 Thread Pavan Deolasee

Hi,

I noticed this behavior on master and it seems like a bug to me:

postgres=# CREATE TABLE test (a float check (a > 10.2));
CREATE TABLE
postgres=# CREATE TABLE test_child() INHERITS(test);
CREATE TABLE
postgres=# ALTER TABLE test ALTER COLUMN a SET DATA TYPE numeric;

 ERROR:  constraint must be added to child
tables too

The error message seems unintended. Sure, we are changing the data type of
a column which has a constraint on it. The ALTER TABLE mechanism would drop
and recreate that constraint after changing the data type of the column.

The ATAddCheckConstraint() function checks if the "recurse" flag is passed
(basically check against ONLY clause). If the flag is not passed and the
table has children, it raises the above mentioned exception. This is right
for a normal ADD CONSTRAINT operation because we don't want to allow
constraint addition ONLY on the parent table unless there are no child
tables currently on the parent. But when constraint is being re-added as a
side-effect of another ALTER TABLE command, we shouldn't really be raising
an exception because ATPrepCmd() would have expanded to child tables and
there would appropriate commands in the work queue to recreate constraints
on all the child tables as well.

So I think we need to teach ATAddCheckConstraint() to not do this check if
its being called from AT_PASS_OLD_INDEX of ALTER TABLE.

I can work up a patch if we are in agreement that this indeed is a bug and
the approach that I mentioned for fixing this is also right. Comments ?

Thanks,
Pavan

Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-02 Thread Magnus Hagander

On Fri, Nov 2, 2012 at 2:19 AM, Greg Smith  wrote:

> On 10/31/12 12:17 PM, Magnus Hagander wrote:
>
>  The idea at the time was to use the include *directory* functionality,
>> for say a "config.d" directory in pgdata. The builtin one would then
>> use a predictable filename in this directory, so that the DBA who
>> prefers it can drop files both before and after that file into the
>> directory.
>>
>
> That's how I remember things as well.  Unfortunately Amit's proposal seems
> like an even more complicated version of ideas that were clearly beaten
> down multiple times over many years now, partly for being too complicated.
>
> The only idea I remember ever crossing the gap between the "edit by hand"
> and "tool config" crowd was based on the include directory concept.  The
> bugs in that implementation are finally worked out and the include_dir
> feature committed recently, so now it's possible to consider using it as a
> building block now.
>
> Here is a much simpler proposal for example:
>
> -Add a configuration subdirectory to the default installation.  Needs to
> follow the config file location, so things like the Debian relocation of
> postgresql.conf still work.  Maybe it has zero files; maybe it has one
> that's named for this purpose, which defaults to the usual:
>

What do you mean by "needs to follow"? In particular, do you mean that it
should be relative to postgresql.conf? I think that would actually be a
*problem* for any system that moves the config file away, like debian,
since you'd then have to grant postgres write permissions on a directory in
/etc/...



> # Don't edit this file by hand!  It's overwritten by...
>
> -Have the standard postgresql.conf end by including that directory
> -SQL parameter changes collect up all other active parameter changes,
> rewrite that file, and signal the server.  If any change requested requires
> a full server restart. warn the user of that fact.
>
> And that's basically it.  Cranky old-timers can remove the include
> directive and/or directory if they don't like it, act as if nothing has
> changed, and move along.  Everyone else gets the beginning of a multiple
> co-existing tool change standard.
>
> The only obvious bad case I can think of here is if someone has left the
> directory there, but deleted the include_dir statement; then the file would
> be written successfully but never included.  Seems like in the worst case
> the postgresql.conf parser would just need to flag whether it found the
> default directory included or not, to try and flag avoid obvious foot
> shooting.
>

Yes. And we could pretty easily find that - have the function that reloads
the config file actually check the source file and line number to make sure
it matches the one fro mthe auto file, and give a WARNING if it doesn't
(which indicates that either the file isn't included, or something else
"later in the chain" overwrote it)


Some of the better received ideas I floated for merging the recovery.conf
> file seemed headed this way too.  That also all blocked behind the include
> directory bit being surprisingly tricky to get committed.  So that's
> possible to revive usefully now.  And as much as I hate to expand scope by
> saying it, both changes should probably be tackled at once.  It's important
> to make sure they're both solved well by whatever is adopted, they are
> going to co-exist as committed one day.
>

Yeah, we don't need the code for both, but we certainly need a "reasonable
design" capable of dealing with both, so we don't paint ourselves into a
corner.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: [HACKERS] gistchoose vs. bloat

2012-11-02 Thread Alexander Korotkov

On Sun, Oct 21, 2012 at 11:03 AM, Jeff Davis  wrote:

> On Thu, 2012-10-18 at 15:09 -0300, Alvaro Herrera wrote:
> > Jeff, do you think we need more review of this patch?
>
> In the patch, it refers to rd_options without checking for NULL first,
> which needs to be fixed.
>
> There's actually still one place where it says "id" rather than "is".
> Just a nitpick.
>
> Regarding my point 4 from the previous email, I mildly disagree with the
> style, but I don't see a correctness problem there.
>
> If the first two items are fixed, then the patch is fine with me.
>

First two items are fixed in attached version of the patch.

--
With best regards,
Alexander Korotkov.


gist_choose_bloat-0.4.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SP-GiST for ranges based on 2d-mapping and quad-tree

2012-11-02 Thread Alexander Korotkov

On Sun, Oct 21, 2012 at 11:22 AM, Jeff Davis  wrote:

> On Tue, 2012-09-04 at 17:45 +0400, Alexander Korotkov wrote:
> > On Mon, Aug 20, 2012 at 12:25 AM, Jeff Davis 
> > wrote:
> > I am taking a look at this patch now. A few quick comments:
> >
> > * It looks like bounds_adjacent modifies it's by-reference
> > arguments,
> > which is a little worrying to me. The lower/upper labels are
> > flipped
> > back, but the inclusivities are not. Maybe just pass by value
> > instead?
> >
> > * Bounds_adjacent is sensitive to the argument order. Can't it
> > just take
> > bound1 and bound2?
> >
> >
> > Fixed. Patch is attached.
> >
> It looks like this is basically the same diff as v0.1. Did something get
> mixed up?
>

Right version of patch is attached.

--
With best regards,
Alexander Korotkov.


range_spgist_adjacent-0.3.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

40 matches

Mail list logo