Re: [HACKERS] REVIEW: Determining client_encoding from client locale

2011-02-18 Thread Peter Eisentraut
On tis, 2011-02-08 at 02:05 -0800, Ibrar Ahmed wrote:
> I have modified the code to use ADD_STARTUP_OPTION instead of writing code
> again.

Committed this version.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_resetxlog display bogosity

2011-02-18 Thread Bruce Momjian

Is this a TODO item?

---

Alvaro Herrera wrote:
> I just noticed that if I give pg_resetxlog a timeline ID with the -l
> switch, it will display this value as "TimeLineID of latest checkpoint".
> Which is not really the truth.
> 
> I wonder if pg_resetxlog should display the actual pg_control values in
> one section, and the values that would be set after a reset in a
> different section, so that it is extra clear.  So it would look like
> 
>   pg_control values:
> 
>   pg_control version number:903
>   Catalog version number:   201004261
>   Database system identifier:   5509100787461288958
>   Latest checkpoint's TimeLineID:   1
>   Latest checkpoint's NextXID:  0/667
>   Latest checkpoint's NextOID:  16390
>   Latest checkpoint's NextMultiXactId:  1
>   Latest checkpoint's NextMultiOffset:  0
>   Latest checkpoint's oldestXID:654
>   Latest checkpoint's oldestXID's DB:   1
>   Latest checkpoint's oldestActiveXID:  0
>   Maximum data alignment:   8
>   Database block size:  8192
>   Blocks per segment of large relation: 131072
>   WAL block size:   8192
>   Bytes per WAL segment:16777216
>   Maximum length of identifiers:64
>   Maximum columns in an index:  32
>   Maximum size of a TOAST chunk:1996
>   Date/time type storage:   64-bit integers
>   Float4 argument passing:  by value
>   Float8 argument passing:  by value
> 
>   Values to be used after reset:
> 
>   First log file ID:14
>   First log file segment:   28
>   TimeLineID:   57
> 
> 
> (I'd also like to point out that the "Latest checkpoint's" phrasing is awkward
> and cumbersome for translated output, but I'm refraining from suggesting a
> reword because it'd complicate matters for programs that try to read the
> output)
> 
> -- 
> Álvaro Herrera 
> 

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Sync Rep v17

2011-02-18 Thread Bruce Momjian
Simon Riggs wrote:
> Most parameters are set on the primary. Set
> 
>   primary: synchronous_standby_names = 'node1, node2, node3'
> 
> which means that whichever of those standbys connects first will
> become the main synchronous standby. Servers arriving later will
> be potential standbys (standby standbys doesn't sound right...).
> synchronous_standby_names can change at reload.
> 
> Currently, the standby_name is the application_name parameter
> set in the primary_conninfo.
> 
> When we set this for a client, or in postgresql.conf
> 
>   primary: synchronous_replication = on
> 
> then we will wait at commit until the synchronous standby has
> reached the WAL location of our commit point.
> 
> If the current synchronous standby dies then one of the other standbys
> will take over. (I think it would be a great idea to make the
> list a priority order, but I haven't had time to code that).
> 
> If none of the standbys are available, then we don't wait at all
> if allow_standalone_primary is set.
> allow_standalone_primary can change at reload.

Are we going to allow a command to be run when these things happen, like
archive_command does, or is that something for 9.2?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] CopyReadLineText optimization revisited

2011-02-18 Thread Bruce Momjian

Was this implemented?  Is it a TODO?

---

Heikki Linnakangas wrote:
> I'm reviving the effort I started a while back to make COPY faster:
> 
> http://archives.postgresql.org/pgsql-patches/2008-02/msg00100.php
> http://archives.postgresql.org/pgsql-patches/2008-03/msg00015.php
> 
> The patch I now have is based on using memchr() to search end-of-line. 
> In a nutshell:
> 
> * we perform possible encoding conversion early, one input block at a 
> time, rather than after splitting the input into lines. This allows us 
> to assume in the later stages that the data is in server encoding, 
> allowing us to search for the '\n' byte without worrying about 
> multi-byte characters.
> 
> * instead of the byte-at-a-time loop in CopyReadLineText(), use memchr() 
> to find the next NL/CR character. This is where the speedup comes from. 
> Unfortunately we can't do that in the CSV codepath, because newlines can 
> be embedded in quoted fields, so that's unchanged.
> 
> These changes seem to give an overall speedup of between 0-10%, 
> depending on the shape of the table. I tested various tables from the 
> TPC-H schema, and a narrow table consisting of just one short text column.
> 
> I can't think of a case where these changes would be a net loss in 
> performance, and it didn't perform worse on any of the cases I tested 
> either.
> 
> There's a small fly in the ointment: the patch won't recognize backslash 
> followed by a linefeed as an escaped linefeed. I think we should simply 
> drop support for that. The docs already say:
> 
> > It is strongly recommended that applications generating COPY data convert 
> > data newlines and carriage returns to the \n and \r sequences respectively. 
> > At present it is possible to represent a data carriage return by a 
> > backslash and carriage return, and to represent a data newline by a 
> > backslash and newline. However, these representations might not be accepted 
> > in future releases. They are also highly vulnerable to corruption if the 
> > COPY file is transferred across different machines (for example, from Unix 
> > to Windows or vice versa).
> 
> I vaguely recall that we discussed this some time ago already and agreed 
> that we can drop it if it makes life easier.
> 
> This patch is in pretty good shape, however it needs to be tested with 
> different exotic input formats. Also, the loop in CopyReadLineText could 
> probably be cleaned up a bit; some of the uglifications that were done 
> for performance reasons in the old code are no longer necessary, as 
> memchr() is doing the heavy-lifting and the loop only iterates 1-2 times 
> per line in typical cases.
> 
> 
> It's not strictly necessary, but how about dropping support for the old 
> COPY protocol, and the EOF marker \. while we're at it? It would allow 
> us to drop some code, making the remaining code simpler, and reduce the 
> testing effort. Thoughts on that?
> 
> -- 
>Heikki Linnakangas
>EnterpriseDB   http://www.enterprisedb.com

[ Attachment, skipping... ]

> 

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

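The memchr()-based scan Heikki describes can be illustrated with a self-contained sketch (a hypothetical helper, not the actual patch code): once encoding conversion has happened up front, the buffer is known to be in the server encoding, so a raw byte search for NL/CR is safe even with multi-byte characters present.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical illustration of the technique: return the offset of the
 * first '\n' or '\r' in buf, or -1 if neither occurs in the first len
 * bytes.  memchr() does the heavy lifting instead of a byte-at-a-time
 * loop, which is where the reported speedup comes from.
 */
static long
find_eol(const char *buf, size_t len)
{
    const char *nl = memchr(buf, '\n', len);
    const char *cr = memchr(buf, '\r', len);

    if (nl == NULL && cr == NULL)
        return -1;
    if (nl == NULL)
        return (long) (cr - buf);
    if (cr == NULL)
        return (long) (nl - buf);
    return (long) ((nl < cr ? nl : cr) - buf);
}
```

As the thread notes, this only works for the text format; the CSV code path still needs the slower scan because newlines may appear inside quoted fields.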


Re: [HACKERS] Sync Rep v17

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 7:06 PM, Simon Riggs  wrote:
> The patch is a very light touch on a few areas of code, plus a chunk of
> specific code, all on master-side. Pretty straight really. I'm sure
> problems will be found; it's not long since I completed this; thanks to
> Daniel Farina for your help with patch assembly.

This looks like it's in far better shape than the previous version.
Thanks to you and Dan for working on it.

As you have this coded, enabling synchronous_replication forces all
transactions to commit synchronously.  This means that, when
synchronous_replication=on, you get synchronous_commit=on behavior
even if you have synchronous_commit=off.  I'd prefer to see us go the
other way, and say that you can only get synchronous replication when
you're also getting synchronous commit.

Even if not, we should at least arrange the test so that we don't
force synchronous commit for transactions that haven't written XLOG.
That is, instead of:

((wrote_xlog && XactSyncCommit) || forceSyncCommit || nrels > 0 ||
SyncRepRequested())

...we should instead say:

((wrote_xlog && (XactSyncCommit || SyncRepRequested())) ||
forceSyncCommit || nrels > 0)

...but as I say I'd be inclined to instead keep the existing test, and
just skip SyncRepWaitForLSN() if we aren't committing synchronously.
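Assuming hypothetical flag names that mirror the quoted expressions (none of this is the actual xact.c code), the two variants differ only for transactions that wrote no XLOG:

```c
#include <assert.h>
#include <stdbool.h>

/* As coded in the patch: sync rep forces every commit to be synchronous. */
static bool
force_sync_commit_as_coded(bool wrote_xlog, bool xact_sync_commit,
                           bool force_sync_commit, int nrels,
                           bool sync_rep_requested)
{
    return (wrote_xlog && xact_sync_commit) || force_sync_commit ||
           nrels > 0 || sync_rep_requested;
}

/* Proposed: only transactions that actually wrote XLOG wait for sync rep. */
static bool
force_sync_commit_proposed(bool wrote_xlog, bool xact_sync_commit,
                           bool force_sync_commit, int nrels,
                           bool sync_rep_requested)
{
    return (wrote_xlog && (xact_sync_commit || sync_rep_requested)) ||
           force_sync_commit || nrels > 0;
}
```

With the proposed form, a read-only transaction (no XLOG written, no relations dropped) commits without waiting even when synchronous replication is requested.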

I'm unsure of the value of sync_replication_timeout_client (which
incidentally is spelled replication_timeout_client in one place, and
elsewhere asynchornus -> asynchronous).  I think in practice if you
start hitting that timeout, your server will slow to such a crawl that
you might as well be down.  On the other hand, I see no particular
harm in leaving the option in there either, though I definitely think
the default should be changed to -1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] disposition of remaining patches

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 6:04 PM, Tom Lane  wrote:
> FWIW, my thought is to try to get the API patch committed and then do
> the file_fdw patch.  Maybe I'm hopelessly ASCII-centric, but I do not
> see encoding considerations as a blocking factor for this.  If we define
> that files are read in the database encoding, it's still a pretty damn
> useful feature.  We can look at whether that can be improved after we
> have some kind of feature at all.

Sure.  OTOH, Itagaki Takahiro's solution wasn't a lot of code, so if
he feels reasonably confident in it, I'd like to see it committed.

> postgresql_fdw may have to live as an external project for the 9.1
> cycle, unless it's in much better shape than you suggest above.
> I won't feel too bad about that as long as the core support exists.
> More than likely, people would want to improve it on a faster release
> cycle than the core anyway.

I think as long as we have one implementation in contrib, we're OK to release.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Snapshot synchronization, again...

2011-02-18 Thread Alvaro Herrera
Excerpts from Joachim Wieland's message of Fri Feb 18 21:41:51 -0300 2011:
> On Fri, Feb 18, 2011 at 8:57 PM, Alvaro Herrera
>  wrote:
> > 1. why are you using the expansible char array stuff instead of using
> > the StringInfo facility?
> >
> > 2. is md5 the most appropriate digest for this?  If you need a
> > cryptographically secure hash, do we need something stronger?  If not,
> > why not just use hash_any?
> 
> We don't need a cryptographically secure hash.

Ok.

> Should I send an updated patch? Anything else?

No need, I'm halfway through already.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Sync Rep v17

2011-02-18 Thread Alvaro Herrera
Excerpts from Thom Brown's message of Fri Feb 18 21:32:17 -0300 2011:

> I see the updated patch I provided last time to fix various errata and
> spacing issues got lost in the last round of conversations.  Maybe
> it's safer to provide a patch for the patch.

interdiff?

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Snapshot synchronization, again...

2011-02-18 Thread Joachim Wieland
On Fri, Feb 18, 2011 at 8:57 PM, Alvaro Herrera
 wrote:
> 1. why are you using the expansible char array stuff instead of using
> the StringInfo facility?
>
> 2. is md5 the most appropriate digest for this?  If you need a
> cryptographically secure hash, do we need something stronger?  If not,
> why not just use hash_any?

We don't need a cryptographically secure hash.

There is no special reason for why it is like it is, I just didn't
think of the better alternatives that you are proposing.

Should I send an updated patch? Anything else?


Thanks for the review,
Joachim
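Since no adversary is assumed, any fast non-cryptographic digest over the serialized snapshot suffices. PostgreSQL's hash_any is internal, so as a stand-in here is FNV-1a, purely to illustrate the kind of checksum being discussed (hypothetical helper, not the patch code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * FNV-1a, a simple non-cryptographic 32-bit hash, shown as an
 * illustration of checksumming serialized snapshot bytes when
 * cryptographic strength is not required.
 */
static uint32_t
fnv1a(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;       /* FNV offset basis */

    for (size_t i = 0; i < len; i++)
    {
        h ^= data[i];
        h *= 16777619u;             /* FNV prime */
    }
    return h;
}
```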



Re: [HACKERS] Sync Rep v17

2011-02-18 Thread Simon Riggs
On Sat, 2011-02-19 at 00:32 +, Thom Brown wrote:

> I see the updated patch I provided last time to fix various errata and
> spacing issues got lost in the last round of conversations.  Maybe
> it's safer to provide a patch for the patch.

I'm sorry about that Thom, your changes were and are welcome. The docs
were assembled rather quickly about 2 hours ago and we'll definitely
need your care and attention to bring them to a good level of quality.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 




Re: [HACKERS] Re: [COMMITTERS] pgsql: Make a hard state change from catchup to streaming mode.

2011-02-18 Thread Simon Riggs
On Fri, 2011-02-18 at 09:41 -0600, Kevin Grittner wrote:
> Robert Haas  wrote:
> > Simon Riggs  wrote:
>  
> >> Make a hard state change from catchup to streaming mode.
> >> More useful state change for monitoring purposes, plus a
> >> required change for synchronous replication patch.
> > 
> > As far as I can see, this patch was not posted or discussed before
> > commit, and I'm not sure it's the behavior everyone wants.  It has
> > the effect of preventing the system from ever going backwards from
> > "streaming" to "catchup".  Is that what we want?
>  
> We are looking at moving to streaming replication instead of WAL
> file shipping, but we often have WAN outages.  These can last
> minutes, hours, or even a few days.  What would be the impact of
> this patch on us during and after such outages?

None at all. The patch introduces no behavioural changes, only a useful
but minor re-categorisation of what is already happening, so that it's
easier to monitor what happens following startup of a standby.
 
> I don't know how well such experience generalizes, but my personal
> experience with replication technology is that "hard state changes"
> tend to make things more "clunky" and introduce odd issues at the
> state transitions.  Where different message types are intermingled
> without such hard state changes, I've seen more graceful behavior.
>  
> Of course, take that with a grain of salt -- I haven't read the
> patch and am talking in generalities based on having written a
> couple serious replication tools in the past, and having used a few
> others.

I respect your experience.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 




Re: [HACKERS] Sync Rep v17

2011-02-18 Thread Thom Brown
On 19 February 2011 00:06, Simon Riggs  wrote:
>
> Well, good news all round.
>
> v17 implements what I believe to be the final set of features for sync
> rep. This one I'm actually fairly happy with. It can be enjoyed best at
> DEBUG3.
>
> The patch is a very light touch on a few areas of code, plus a chunk of
> specific code, all on master-side. Pretty straight really. I'm sure
> problems will be found; it's not long since I completed this; thanks to
> Daniel Farina for your help with patch assembly.
>
> Which is just as well, because the other news is that I'm off on holiday
> for a few days, which is most inconvenient. I won't be committing this
> for at least a week and will be absent from the list. OTOH, I think it's ready
> for a final review and commit, so I'm OK if you commit or OK if you
> leave it for me.
>
> That's not the end of it. I can see a few things we could/should do in
> this release, but this includes all the must-do things. Docs could do
> with a little love also. So I expect work for me when I return.
>
>
>
>
> Config Summary
> ==
>
> Most parameters are set on the primary. Set
>
>  primary: synchronous_standby_names = 'node1, node2, node3'
>
> which means that whichever of those standbys connects first will
> become the main synchronous standby. Servers arriving later will
> be potential standbys (standby standbys doesn't sound right...).
> synchronous_standby_names can change at reload.
>
> Currently, the standby_name is the application_name parameter
> set in the primary_conninfo.
>
> When we set this for a client, or in postgresql.conf
>
>  primary: synchronous_replication = on
>
> then we will wait at commit until the synchronous standby has
> reached the WAL location of our commit point.
>
> If the current synchronous standby dies then one of the other standbys
> will take over. (I think it would be a great idea to make the
> list a priority order, but I haven't had time to code that).
>
> If none of the standbys are available, then we don't wait at all
> if allow_standalone_primary is set.
> allow_standalone_primary can change at reload.
>
> Whatever happens, if you set sync_replication_timeout_client
> then backends will give up waiting if some WALsender doesn't
> wake them quickly enough.
>
> You can generally leave these parameters at their default settings
>
>  primary: sync_replication_timeout_client = 120s
>  primary: allow_standalone_primary = on
>  standby: wal_receiver_status_interval = 10s

I see the updated patch I provided last time to fix various errata and
spacing issues got lost in the last round of conversations.  Maybe
it's safer to provide a patch for the patch.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935



[HACKERS] Sync Rep v17

2011-02-18 Thread Simon Riggs

Well, good news all round.

v17 implements what I believe to be the final set of features for sync
rep. This one I'm actually fairly happy with. It can be enjoyed best at
DEBUG3.

The patch is a very light touch on a few areas of code, plus a chunk of
specific code, all on master-side. Pretty straight really. I'm sure
problems will be found; it's not long since I completed this; thanks to
Daniel Farina for your help with patch assembly.

Which is just as well, because the other news is that I'm off on holiday
for a few days, which is most inconvenient. I won't be committing this
for at least a week and will be absent from the list. OTOH, I think it's ready
for a final review and commit, so I'm OK if you commit or OK if you
leave it for me.

That's not the end of it. I can see a few things we could/should do in
this release, but this includes all the must-do things. Docs could do
with a little love also. So I expect work for me when I return.




Config Summary
==

Most parameters are set on the primary. Set

  primary: synchronous_standby_names = 'node1, node2, node3'

which means that whichever of those standbys connects first will
become the main synchronous standby. Servers arriving later will
be potential standbys (standby standbys doesn't sound right...).
synchronous_standby_names can change at reload.

Currently, the standby_name is the application_name parameter
set in the primary_conninfo.

When we set this for a client, or in postgresql.conf

  primary: synchronous_replication = on

then we will wait at commit until the synchronous standby has
reached the WAL location of our commit point.

If the current synchronous standby dies then one of the other standbys
will take over. (I think it would be a great idea to make the
list a priority order, but I haven't had time to code that).

If none of the standbys are available, then we don't wait at all
if allow_standalone_primary is set.
allow_standalone_primary can change at reload.

Whatever happens, if you set sync_replication_timeout_client
then backends will give up waiting if some WALsender doesn't
wake them quickly enough.

You can generally leave these parameters at their default settings

  primary: sync_replication_timeout_client = 120s
  primary: allow_standalone_primary = on
  standby: wal_receiver_status_interval = 10s
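Gathered into file form, the summary above might look like this (hostnames are invented for illustration; parameter names are as spelled in this v17 patch and may differ from what is eventually committed):

```ini
# primary: postgresql.conf
synchronous_standby_names = 'node1, node2, node3'
synchronous_replication = on
sync_replication_timeout_client = 120s   # -1 = wait forever
allow_standalone_primary = on

# standby: postgresql.conf
wal_receiver_status_interval = 10s

# standby: recovery.conf -- application_name must match an entry in
# synchronous_standby_names for this server to become the sync standby
primary_conninfo = 'host=primary.example.com port=5432 application_name=node1'
```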

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index cee09c7..aad9b4e 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2010,8 +2010,117 @@ SET ENABLE_SEQSCAN TO OFF;
 You should also consider setting hot_standby_feedback
 as an alternative to using this parameter.

+
+
+ Synchronous Replication
+
+ 
+  These settings control the behavior of the built-in
+  synchronous replication feature.
+  These parameters would be set on the primary server that is
+  to send replication data to one or more standby servers.
+ 
+
+ 
+ 
+  synchronous_replication (boolean)
+  
+   synchronous_replication configuration parameter
+  
+  
+   
+Specifies whether transaction commit will wait for WAL records
+to be replicated before the command returns a success
+indication to the client.  The default setting is off.
+When on, there will be a delay while the client waits
+for confirmation of successful replication. That delay will
+increase depending upon the physical distance and network activity
+between primary and standby. The commit wait will last until the
+first reply from any standby. Multiple standby servers allow
+increased availability and possibly increase performance as well.
+   
+
+   
+On the primary, this parameter can be changed at any time; the
+behavior for any one transaction is determined by the setting in
+effect when it commits.  It is therefore possible, and useful, to have
+some transactions replicate synchronously and others asynchronously.
+For example, to make a single multistatement transaction commit
+asynchronously when the default is synchronous replication, issue
+SET LOCAL synchronous_replication TO OFF within the
+transaction.
+   
+  
+ 
+
+ 
+  sync_replication_timeout_client (integer)
+  
+   sync_replication_timeout_client configuration parameter
+  
+  
+   
+If the client has synchronous_replication set,
+and a synchronous standby is currently available
+then the commit will wait for up to replication_timeout_client
+seconds before it returns a success. The commit will wait
+forever for a confirmation when replication_timeout_client
+is set to -1.
+   
+   
+If the client has synchronous_replication set,
+		and yet 

Re: [HACKERS] review: FDW API

2011-02-18 Thread Tom Lane
I wrote:
> ... My feeling is it'd be best to pass down
> all the information the executor node has got --- probably we should
> just pass the ForeignScanState node itself, and leave a void * in that
> for FDW-private data, and be done with it.  Otherwise we're going to be
> adding missed stuff back to the API every time somebody notices that
> their FDW can't do X because they don't have access to the necessary
> information.

Attached is a rewritten version of fdwhandler.sgml that specifies what I
think is a more future-proof API for the callback functions.  Barring
objections, I'll push ahead with editing the code to match.

regards, tom lane




 
   Writing A Foreign Data Wrapper

   
foreign data wrapper
handler for
   

   
All operations on a foreign table are handled through its foreign data
wrapper, which consists of a set of functions that the planner and
executor call. The foreign data wrapper is responsible for fetching
data from the remote data source and returning it to the
PostgreSQL executor. This chapter outlines how
to write a new foreign data wrapper.
   

   
The FDW author needs to implement a handler function, and optionally
a validator function. Both functions must be written in a compiled
language such as C, using the version-1 interface.
For details on C language calling conventions and dynamic loading,
see .
   

   
The handler function simply returns a struct of function pointers to
callback functions that will be called by the planner and executor.
Most of the effort in writing an FDW is in implementing these callback
functions.
The handler function must be registered with
PostgreSQL as taking no arguments and returning
the special pseudo-type fdw_handler.
The callback functions are plain C functions and are not visible or
callable at the SQL level.
   

   
The validator function is responsible for validating options given in the
CREATE FOREIGN DATA WRAPPER, CREATE
SERVER and CREATE FOREIGN TABLE commands.
The validator function must be registered as taking two arguments, a text
array containing the options to be validated, and an OID representing the
type of object the options are associated with (in the form of the OID
of the system catalog the object would be stored in).  If no validator
function is supplied, the options are not checked at object creation time.
   

   
The foreign data wrappers included in the standard distribution are good
references when trying to write your own.  Look into the
contrib/file_fdw subdirectory of the source tree.
The  reference page also has
some useful details.
   

   

 The SQL standard specifies an interface for writing foreign data wrappers.
 However, PostgreSQL does not implement that API, because the effort to
 accommodate it into PostgreSQL would be large, and the standard API hasn't
 gained wide adoption anyway.

   

   
Foreign Data Wrapper Callback Routines


 The FDW handler function returns a palloc'd FdwRoutine
 struct containing pointers to the following callback functions:




FdwPlan *
PlanForeignScan (Oid foreigntableid,
 PlannerInfo *root,
 RelOptInfo *baserel);


 Plan a scan on a foreign table. This is called when a query is planned.
 foreigntableid is the pg_class OID of the
 foreign table.  root is the planner's global information
 about the query, and baserel is the planner's information
 about this table.
 The function must return a palloc'd struct that contains cost estimates,
 a string to show for this scan in EXPLAIN, and any
 FDW-private information that is needed to execute the foreign scan at a
 later time.  (Note that the private information must be represented in
 a form that copyObject knows how to copy.)



 The information in root and baserel can be used
 to reduce the amount of information that has to be fetched from the
 foreign table (and therefore reduce the cost estimate).
 baserel->baserestrictinfo is particularly interesting, as
 it contains restriction quals (WHERE clauses) that can be
 used to filter the rows to be fetched.  (The FDW is not required to
 enforce these quals, as the finished plan will recheck them anyway.)
 baserel->reltargetlist can be used to determine which
 columns need to be fetched.




void
BeginForeignScan (ForeignScanState *node,
  int eflags);


 Begin executing a foreign scan. This is called during executor startup.
 It should perform any initialization needed before the scan can start.
 The ForeignScanState node has already been created, but
 its fdw_private field is still NULL.  Information about
 the table to scan is accessible through the
 ForeignScanState node (in particular, from the underlying

Re: [HACKERS] DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)

2011-02-18 Thread Bruce Momjian
Robert Haas wrote:
> On Fri, Feb 18, 2011 at 5:10 PM, Bruce Momjian  wrote:
> > Robert Haas wrote:
> >> On Fri, Feb 18, 2011 at 4:59 PM, Tom Lane  wrote:
> >> > Robert Haas  writes:
> >> >> On Fri, Feb 18, 2011 at 4:46 PM, Tom Lane  wrote:
> >> >>> AFAIR nothing's been done about it, so it's a TODO.
> >> >
> >> >> I was thinking of adding it to the 9.1 open items list, but the wiki's
> >> >> been down every time I've tried to go there.
> >> >
> >> > Since the problem's been there since forever, I don't see that it's an
> >> > open item for 9.1. ?That list normally is for "must fix before ship"
> >> > items, not development projects.
> >>
> >> OK. ?If you don't feel it warrants being on that list, then the TODO
> >> is OK with me.
> >
> > Agreed. ?Do you want me to do it, or will you?
> 
> You do it.  :-)

Done:

Restructure truncation logic to be more resistant to failure

This also involves not writing dirty buffers for a truncated or
dropped relation

* 
http://archives.postgresql.org/pgsql-hackers/2010-08/msg01032.php 

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] disposition of remaining patches

2011-02-18 Thread Tom Lane
Josh Berkus  writes:
> On 2/18/11 3:04 PM, Tom Lane wrote:
>> postgresql_fdw may have to live as an external project for the 9.1
>> cycle, unless it's in much better shape than you suggest above.
>> I won't feel too bad about that as long as the core support exists.
>> More than likely, people would want to improve it on a faster release
>> cycle than the core anyway.

> FDWs seem like perfect candidates for Extensions.  We'll eventually want
> postgresql_fdw in core, but most FDWs will never be there.

Yeah, agreed as to both points.  I would imagine that we'd absorb
postgresql_fdw into core late in the 9.2 devel cycle, which would still
leave quite a few months where it could be improved on a rapid release
cycle.

regards, tom lane



Re: [HACKERS] disposition of remaining patches

2011-02-18 Thread Andrew Dunstan



On 02/18/2011 05:47 PM, Robert Haas wrote:

3, 4, 5. SQL/MED.  Tom has picked up the main FDW API patch, which I
expect means it'll go in.  I am not so sure about the FDW patches,
though: in particular, based on Heikki's comments, the postgresql_fdw
patch seems to be badly in need of some more work.  The file_fdw patch
may be in better shape (I'm not 100% sure), but it needs the encoding
fix patch Itagaki Takahiro recently proposed.  For this to be
worthwhile, we presumably need to get at least one FDW committed along
with the API patch.



I'm not sure it's not useful without one, but it would be better with it. I 
agree we need some actual uses.


If people want more, I'm prepared to put some hurried effort into making 
one just for COPY to a text array, since the consensus didn't seem to be 
in favor of piggybacking this onto file_fdw. That would exercise the 
part of the new COPY API that would not otherwise be exercised by 
file_fdw. If not, I'll eventually contribute that for 9.2.


cheers

andrew




Re: [HACKERS] disposition of remaining patches

2011-02-18 Thread Josh Berkus
On 2/18/11 3:04 PM, Tom Lane wrote:
> postgresql_fdw may have to live as an external project for the 9.1
> cycle, unless it's in much better shape than you suggest above.
> I won't feel too bad about that as long as the core support exists.
> More than likely, people would want to improve it on a faster release
> cycle than the core anyway.

FDWs seem like perfect candidates for Extensions.  We'll eventually want
postgresql_fdw in core, but most FDWs will never be there.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] disposition of remaining patches

2011-02-18 Thread Josh Berkus
On 2/18/11 2:47 PM, Robert Haas wrote:
> The CommitFest application currently reflects 17 remaining patches for
> CommitFest 2011-01.

I'm impressed, actually.  This is way further along than I expected us
to be.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] disposition of remaining patches

2011-02-18 Thread Tom Lane
Robert Haas  writes:
> 3, 4, 5. SQL/MED.  Tom has picked up the main FDW API patch, which I
> expect means it'll go in.  I am not so sure about the FDW patches,
> though: in particular, based on Heikki's comments, the postgresql_fdw
> patch seems to be badly in need of some more work.  The file_fdw patch
> may be in better shape (I'm not 100% sure), but it needs the encoding
> fix patch Itagaki Takahiro recently proposed.  For this to be
> worthwhile, we presumably need to get at least one FDW committed along
> with the API patch.

FWIW, my thought is to try to get the API patch committed and then do
the file_fdw patch.  Maybe I'm hopelessly ASCII-centric, but I do not
see encoding considerations as a blocking factor for this.  If we define
that files are read in the database encoding, it's still a pretty damn
useful feature.  We can look at whether that can be improved after we
have some kind of feature at all.

postgresql_fdw may have to live as an external project for the 9.1
cycle, unless it's in much better shape than you suggest above.
I won't feel too bad about that as long as the core support exists.
More than likely, people would want to improve it on a faster release
cycle than the core anyway.

regards, tom lane



Re: [HACKERS] Initial review of xslt with no limits patch

2011-02-18 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian  writes:
> > Tom Lane wrote:
> >> I think we have a few TODO items here:
> >> 
> >> * Invent ... and document ... an API that permits safe assembly of a
> >> parameter list from non-constant (and perhaps untrustworthy) values.
> >> 
> >> * Fix xslt_process' failure to report (some?) errors detected by libxslt.
> >> 
> >> * Move the functionality to a less deprecated place.
> >> 
> >> None of these are within the scope of the current patch though.
> 
> > Should any of these be added to our TODO list under XML?
> 
> Yes, all of them, since nothing's been done about any of 'em ...

OK, TODO items added:

Move XSLT from contrib/xml2 to a more reasonable location

* http://archives.postgresql.org/pgsql-hackers/2010-08/msg00539.php 

Report errors returned by the XSLT library

* http://archives.postgresql.org/pgsql-hackers/2010-08/msg00562.php 

Improve the XSLT parameter passing API

* http://archives.postgresql.org/pgsql-hackers/2010-08/msg00416.php 

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 5:10 PM, Bruce Momjian  wrote:
> Robert Haas wrote:
>> On Fri, Feb 18, 2011 at 4:59 PM, Tom Lane  wrote:
>> > Robert Haas  writes:
>> >> On Fri, Feb 18, 2011 at 4:46 PM, Tom Lane  wrote:
>> >>> AFAIR nothing's been done about it, so it's a TODO.
>> >
>> >> I was thinking of adding it to the 9.1 open items list, but the wiki's
>> >> been down every time I've tried to go there.
>> >
>> > Since the problem's been there since forever, I don't see that it's an
>> > open item for 9.1.  That list normally is for "must fix before ship"
>> > items, not development projects.
>>
>> OK.  If you don't feel it warrants being on that list, then the TODO
>> is OK with me.
>
> Agreed.  Do you want me to do it, or will you?

You do it.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



[HACKERS] disposition of remaining patches

2011-02-18 Thread Robert Haas
The CommitFest application currently reflects 17 remaining patches for
CommitFest 2011-01.

1. Change pg_last_xlog_receive_location not to move backwards.  We
don't have complete consensus on what to do here.  If we can agree on
a way forward, I think we can finish this one up pretty quickly.  It's
partially being held up by #2.
2. Synchronous replication.  Splitting up this patch has allowed some
progress to be made here, but there is a lot left to do, and I fear
that trying to hash out the design issues at this late date is not
going to lead to a great final product.  The proposed timeout to make
the server boot out clients that don't seem to be responding is not
worth committing, as it will only work when the server isn't
generating WAL, which can't be presumed to be the normal state of
affairs.  The patch to avoid ever letting the WAL sender status go
backward from catchup to streaming was committed without discussion,
and needs to be reverted for reasons discussed on that thread.  An
updated version of the main patch has yet to be posted.
3, 4, 5. SQL/MED.  Tom has picked up the main FDW API patch, which I
expect means it'll go in.  I am not so sure about the FDW patches,
though: in particular, based on Heikki's comments, the postgresql_fdw
patch seems to be badly in need of some more work.  The file_fdw patch
may be in better shape (I'm not 100% sure), but it needs the encoding
fix patch Itagaki Takahiro recently proposed.  For this to be
worthwhile, we presumably need to get at least one FDW committed along
with the API patch.
6. Writeable CTEs.  Tom said he'd look at this one.
7. contrib/btree_gist KNN.  Needs updating as a result of the
extensions patch.  This ball is really in Teodor and Oleg's court.
8, 9, 10, 11, 12, 13, 14.  PL/python patches.  I believe Peter was
working on these, but I haven't seen any updates in a while.
15. Fix snapshot taking inconsistencies.  Tom said he'd look at this one.
16. synchronized snapshots.  Alvaro is working on this one.
17. determining client_encoding from client locale.  This is Peter's
patch.  Peter, are you planning to commit this?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP - Add ability to constrain backend temporary file space

2011-02-18 Thread Mark Kirkwood

On 19/02/11 11:30, Josh Berkus wrote:

Yeah, the disadvantage is that (like statement timeout) it is a 'bottom
of the cliff' type of protection. The advantage is there are no false
positives...

Yeah, just trying to get a handle on the proposed feature.  I have no
objections; it seems like a harmless limit for most people, and useful
to a few.

No worries and sorry, I should have used the "per backend" phrase in the 
title to help clarify what was intended.


Cheers

Mark



Re: [HACKERS] WIP - Add ability to constrain backend temporary file space

2011-02-18 Thread Josh Berkus

> Yeah, the disadvantage is that (like statement timeout) it is a 'bottom
> of the cliff' type of protection. The advantage is there are no false
> positives...

Yeah, just trying to get a handle on the proposed feature.  I have no
objections; it seems like a harmless limit for most people, and useful
to a few.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] SR standby hangs

2011-02-18 Thread Andrew Dunstan



On 02/18/2011 03:42 PM, Robert Haas wrote:

On Fri, Feb 18, 2011 at 2:50 PM, Tom Lane  wrote:

Robert Haas  writes:

On Fri, Feb 18, 2011 at 2:35 PM, Andrew Dunstan  wrote:

It's not running HS, so there's no query to wait on.

That seems to imply that recovery has leaked a buffer pin.

No, because then the sanity check in LockBufferForCleanup would have
fired:

/* There should be exactly one local pin */
if (PrivateRefCount[buffer - 1] != 1)
elog(ERROR, "incorrect local pin count: %d",
 PrivateRefCount[buffer - 1]);

Hmm, yeah.


Some sort of deadly embrace with the bgwriter, maybe?

Maybe.

I think it'd be useful to know what the buffer header thinks the
refcount on that buffer is, and what the startup process and the
bgwriter each have for PrivateRefCount[buffer].



I'll see what I can find out (damn I hate driving debuggers).

cheers

andrew



Re: [HACKERS] DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)

2011-02-18 Thread Bruce Momjian
Robert Haas wrote:
> On Fri, Feb 18, 2011 at 4:59 PM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> On Fri, Feb 18, 2011 at 4:46 PM, Tom Lane  wrote:
> >>> AFAIR nothing's been done about it, so it's a TODO.
> >
> >> I was thinking of adding it to the 9.1 open items list, but the wiki's
> >> been down every time I've tried to go there.
> >
> > Since the problem's been there since forever, I don't see that it's an
> > open item for 9.1.  That list normally is for "must fix before ship"
> > items, not development projects.
> 
> OK.  If you don't feel it warrants being on that list, then the TODO
> is OK with me.

Agreed.  Do you want me to do it, or will you?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 4:59 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Fri, Feb 18, 2011 at 4:46 PM, Tom Lane  wrote:
>>> AFAIR nothing's been done about it, so it's a TODO.
>
>> I was thinking of adding it to the 9.1 open items list, but the wiki's
>> been down every time I've tried to go there.
>
> Since the problem's been there since forever, I don't see that it's an
> open item for 9.1.  That list normally is for "must fix before ship"
> items, not development projects.

OK.  If you don't feel it warrants being on that list, then the TODO
is OK with me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)

2011-02-18 Thread Tom Lane
Robert Haas  writes:
> On Fri, Feb 18, 2011 at 4:46 PM, Tom Lane  wrote:
>> AFAIR nothing's been done about it, so it's a TODO.

> I was thinking of adding it to the 9.1 open items list, but the wiki's
> been down every time I've tried to go there.

Since the problem's been there since forever, I don't see that it's an
open item for 9.1.  That list normally is for "must fix before ship"
items, not development projects.

regards, tom lane



Re: [HACKERS] WIP - Add ability to constrain backend temporary file space

2011-02-18 Thread Mark Kirkwood

On 19/02/11 10:38, Josh Berkus wrote:



To answer the other question, what happens when the limit is exceeded is
modeled on statement timeout, i.e query is canceled and a message says
why (exceeded temp files size).

When does this happen?  When you try to allocate the file, or when it
does the original tape sort estimate?

The disadvantage of the former is that the user waited for minutes in
order to have their query cancelled.  The disadvantage of the latter is
that the estimate isn't remotely accurate.



Neither - it checks each write (I think this is pretty cheap - it adds two 
int/double additions plus a division and a comparison to FileWrite). If the 
check shows you've written more than the limit, you get canceled. So you 
can exceed the limit by at most one buffer size.


Yeah, the disadvantage is that (like statement timeout) it is a 'bottom 
of the cliff' type of protection. The advantage is there are no false 
positives...


Cheers

Mark



Re: [HACKERS] DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)

2011-02-18 Thread Bruce Momjian
Robert Haas wrote:
> On Fri, Feb 18, 2011 at 4:46 PM, Tom Lane  wrote:
> > Bruce Momjian  writes:
> >> Did we make any progress on this?  Is it a TODO?
> >
> > AFAIR nothing's been done about it, so it's a TODO.
> 
> I was thinking of adding it to the 9.1 open items list, but the wiki's
> been down every time I've tried to go there.

It is up now.  :-)

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 4:46 PM, Tom Lane  wrote:
> Bruce Momjian  writes:
>> Did we make any progress on this?  Is it a TODO?
>
> AFAIR nothing's been done about it, so it's a TODO.

I was thinking of adding it to the 9.1 open items list, but the wiki's
been down every time I've tried to go there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)

2011-02-18 Thread Tom Lane
Bruce Momjian  writes:
> Did we make any progress on this?  Is it a TODO?

AFAIR nothing's been done about it, so it's a TODO.

regards, tom lane



Re: [HACKERS] DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)

2011-02-18 Thread Bruce Momjian

Did we make any progress on this?  Is it a TODO?

---

Tom Lane wrote:
> Robert Haas  writes:
> > On Sun, Aug 15, 2010 at 9:25 PM, Tom Lane  wrote:
> >> ... and PANIC is absolutely, entirely, 100% unacceptable here.  I don't
> >> think you understand the context.  We've already written the truncate
> >> action to WAL (as we must: WAL before data change).  If we PANIC, that
> >> means we'll PANIC again during WAL replay.  So that solution means a
> >> down, and perhaps unrecoverably broken, database.
> 
> > All right, that would be bad.
> 
> Actually ... after some study of the code, I find that I'm holding this
> proposal to a higher standard than the current code maintains.
> According to our normal rules for applying WAL-loggable data changes,
> there should be a critical section wrapping the application of the
> data changes with making the WAL entry.  RelationTruncate fails to
> do any such thing: it just pushes out the WAL entry and then calls
> smgrtruncate, and so any error inside the latter just results in
> elog(ERROR).  Whereupon the system state is entirely different from what
> WAL says it should be.  So my previous gut feeling that we need to
> rethink this whole thing seems justified.
> 
> I traced through everything that leads to an ftruncate() call in the
> backend as of HEAD, and found that we have these cases to worry about:
> 
> mdunlink calls ftruncate directly, and does nothing worse than
> elog(WARNING) on failure.  This is fine because all it wants to do
> is save some disk space until the file gets deleted "for real" at
> the next checkpoint.  Failure only means we waste disk space
> temporarily, and is surely not cause for panic.
> 
> All other calls proceed (sometimes indirectly) from RelationTruncate
> or replay of the WAL record it emits.  We have not thought hard
> about the order of the various truncations this does and whether or
> not we have a self-consistent state if it fails partway through.
> If we don't want to make the whole thing a critical section, we need
> to figure that out.
> 
> RelationTruncate (and its subsidiary RelationTruncateIndexes) is called
> from heap_truncate_one_rel (which itself does things in an unsafe order),
> and from lazy_truncate_heap in VACUUM.
> 
> heap_truncate_one_rel has (indirectly) two call sources:
> 
> from ExecuteTruncate for a locally created rel, where we don't care,
> and would definitely rather NOT have a panic on error: just rolling back
> the transaction is fine thank you very much.
> 
> from PreCommit_on_commit_actions, to process ON COMMIT DELETE ROWS.
> Here again, so far as heaps are concerned, rollback on error would be
> plenty since any inserted rows would then just be dead.  The tricky
> part is the indexes for such a table.  If we truncate them before
> truncating the heap, then the worst possible case is an internally
> inconsistent index on a temp table, which will be automatically cleaned
> up during the next successful commit in its session.  So it's pretty
> hard to justify a PANIC response here either.
> 
> So it seems like the only case where there is really grounds for PANIC
> on failure is the VACUUM case.  And there we might apply Heikki's idea
> of trying to zero the untruncatable pages first.
> 
> I'm thinking that we need some sort of what-to-do-on-error flag passed
> into RelationTruncate, plus at least order-of-operations fixes in
> several other places, if not a wholesale refactoring of this whole call
> stack.  But I'm running out of steam and don't have a concrete proposal
> to make right now.  In any case, we've got more problems here than just
> the original one of forgetting dirty buffers too soon.
> 
>   regards, tom lane
> 

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] WIP - Add ability to constrain backend temporary file space

2011-02-18 Thread Tom Lane
Mark Kirkwood  writes:
> Added. With respect to the datatype, using int with KB units means the 
> largest temp size is approx 2047GB - I know that seems like a lot now... 
> but maybe someone out there wants (say) their temp files limited to 
> 4096GB :-)

[ shrug... ]  Sorry, I can't imagine a use case for this parameter where
the value isn't a *lot* less than that.  Maybe if it were global, but
not if it's per-backend.

regards, tom lane



Re: [HACKERS] WIP - Add ability to constrain backend temporary file space

2011-02-18 Thread Josh Berkus

> Obviously you need to do the same sort of arithmetic as you do with
> work_mem to decide on a reasonable limit to cope with multiple users
> creating temp files. Conservative dbas might want to set it to (free
> disk)/max_connections etc. Obviously for ad-hoc systems it is a bit more
> challenging - but having a per-backend limit is way better than having
> what we have now, which is ... errr... nothing.

Agreed.

> To answer the other question, what happens when the limit is exceeded is
> modeled on statement timeout, i.e query is canceled and a message says
> why (exceeded temp files size).

When does this happen?  When you try to allocate the file, or when it
does the original tape sort estimate?

The disadvantage of the former is that the user waited for minutes in
order to have their query cancelled.  The disadvantage of the latter is
that the estimate isn't remotely accurate.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] WIP - Add ability to constrain backend temporary file space

2011-02-18 Thread Mark Kirkwood

On 19/02/11 08:48, Josh Berkus wrote:

On 2/18/11 11:44 AM, Robert Haas wrote:

On Fri, Feb 18, 2011 at 2:41 PM, Josh Berkus  wrote:

Second, the main issue with these sorts of macro-counters has generally
been their locking effect on concurrent activity.  Have you been able to
run any tests which try to run lots of small externally-sorted queries
at once on a multi-core machine, and checked the effect on throughput?

Since it's apparently a per-backend limit, that doesn't seem relevant.

Oh!  I missed that.

What good would a per-backend limit do, though?

And what happens with queries which exceed the limit?  Error message?  Wait?




By "temp files" I mean those in pgsql_tmp. LOL - A backend limit will 
have the same sort of usefulness as work_mem does - i.e. stop a query 
eating all your filesystem space or bringing a server to its knees with 
I/O load. We have had this happen twice - I know of other folks who have too.


Obviously you need to do the same sort of arithmetic as you do with 
work_mem to decide on a reasonable limit to cope with multiple users 
creating temp files. Conservative dbas might want to set it to (free 
disk)/max_connections etc. Obviously for ad-hoc systems it is a bit more 
challenging - but having a per-backend limit is way better than having 
what we have now, which is ... errr... nothing.
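For illustration, such a setting might look like this in postgresql.conf; the GUC name and units here are assumptions about the WIP patch, not settled syntax:

```
# Hypothetical per-backend cap on temporary file space (pgsql_tmp).
# A conservative DBA might pick roughly free_disk / max_connections.
temp_file_limit = 20GB
```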


As an example I'd find it useful to avoid badly written queries causing 
too much io load on the db backend of (say) a web system (i.e such a 
system should not *have* queries that want to use that much resource).


To answer the other question, what happens when the limit is exceeded is 
modeled on statement timeout, i.e query is canceled and a message says 
why (exceeded temp files size).


Cheers

Mark



Re: [HACKERS] WIP - Add ability to constrain backend temporary file space

2011-02-18 Thread Mark Kirkwood

On 19/02/11 02:34, Robert Haas wrote:


Please add this to the next CommitFest:

https://commitfest.postgresql.org/action/commitfest_view/open

With respect to the datatype of the GUC, int seems clearly correct.
Why would you want to use a float?



Added. With respect to the datatype, using int with KB units means the 
largest temp size is approx 2047GB - I know that seems like a lot now... 
but maybe someone out there wants (say) their temp files limited to 
4096GB :-)


Cheers

Mark



Re: [HACKERS] review: FDW API

2011-02-18 Thread Tom Lane
Heikki Linnakangas  writes:
> On 18.02.2011 22:16, Tom Lane wrote:
>> Question after first look: what is the motivation for passing
>> estate->es_param_list_info to BeginScan?  AFAICS, even if there is a
>> reason for that function to need that, it isn't receiving any info that
>> would be sufficient to let it know what's in there.

> The idea is that when the query is planned, the FDW can choose to push 
> down a qual that contains a parameter marker, like "WHERE remotecol = 
> $1". At execution time, it needs the value of the parameter to send it 
> to the remote server. The PostgreSQL FDW does that, although I didn't 
> test it so it might well be broken.

s/might well be/is/ --- there's no guarantee that parameters are valid
at executor setup time.  The place that needs to be grabbing the
parameter value for that purpose is BeginScan.

>> What seems more
>> likely to be useful is to pass in the EState pointer, as for example
>> being able to look at es_query_cxt seems like a good idea.

> By "look at", you mean allocate stuff in it?

Right.  I suppose you're going to comment that CurrentMemoryContext is
probably the same thing, but in general it's not going to pay to make
this API run with blinders on.  My feeling is it'd be best to pass down
all the information the executor node has got --- probably we should
just pass the ForeignScanState node itself, and leave a void * in that
for FDW-private data, and be done with it.  Otherwise we're going to be
adding missed stuff back to the API every time somebody notices that
their FDW can't do X because they don't have access to the necessary
information.  That definitional instability will trump any ABI stability
that might be gained from not relying on executor node types.  (And it's
not like changing ScanState in a released version is an entirely safe
thing to do even today --- there are lots of loadable modules that know
about that struct.)

regards, tom lane



Re: [HACKERS] SR standby hangs

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 2:50 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Fri, Feb 18, 2011 at 2:35 PM, Andrew Dunstan  wrote:
>>> It's not running HS, so there's no query to wait on.
>
>> That seems to imply that recovery has leaked a buffer pin.
>
> No, because then the sanity check in LockBufferForCleanup would have
> fired:
>
>        /* There should be exactly one local pin */
>        if (PrivateRefCount[buffer - 1] != 1)
>                elog(ERROR, "incorrect local pin count: %d",
>                         PrivateRefCount[buffer - 1]);

Hmm, yeah.

> Some sort of deadly embrace with the bgwriter, maybe?

Maybe.

I think it'd be useful to know what the buffer header thinks the
refcount on that buffer is, and what the startup process and the
bgwriter each have for PrivateRefCount[buffer].
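For reference, one way to pull those numbers out, sketched as a hypothetical gdb session (`buffer` is the local variable in the LockBufferForCleanup frame shown in the backtrace Andrew posted; repeat the PrivateRefCount print attached to the bgwriter):

```
$ gdb -p <startup_pid>                      # attach to the wedged recovery process
(gdb) frame 2                               # LockBufferForCleanup
(gdb) print buffer
(gdb) print PrivateRefCount[buffer - 1]     # this process's local pins
(gdb) print BufferDescriptors[buffer - 1]   # shared refcount lives in the header
```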

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP - Add ability to constrain backend temporary file space

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 2:48 PM, Josh Berkus  wrote:
> On 2/18/11 11:44 AM, Robert Haas wrote:
>> On Fri, Feb 18, 2011 at 2:41 PM, Josh Berkus  wrote:
>>> Second, the main issue with these sorts of macro-counters has generally
>>> been their locking effect on concurrent activity.  Have you been able to
>>> run any tests which try to run lots of small externally-sorted queries
>>> at once on a multi-core machine, and checked the effect on throughput?
>>
>> Since it's apparently a per-backend limit, that doesn't seem relevant.
>
> Oh!  I missed that.
>
> What good would a per-backend limit do, though?
>
> And what happens with queries which exceed the limit?  Error message?  Wait?

Well I have not RTFP, but I assume it'd throw an error.  Waiting isn't
going to accomplish anything.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



[HACKERS] SR standby hangs

2011-02-18 Thread Andrew Dunstan


PostgreSQL Experts Inc has a client with a 9.0.2 streaming replication 
server that somehow becomes wedged after running for some time.


The server is running as a warm standby, and the client's application 
tries to connect to both the master and the slave, accepting whichever 
lets it connect (hence hot standby is not turned on).


Archive files are being shipped as well as WAL streaming.

The symptom is that the recovery process blocks forever on a semaphore. 
We've crashed it and got the following backtrace:


   #0  0x003493ed5337 in semop () from /lib64/libc.so.6
   #1  0x005bd103 in PGSemaphoreLock (sema=0x2b14986aec38, interruptOK=1
   '\001') at pg_sema.c:420
   #2  0x005de645 in LockBufferForCleanup () at bufmgr.c:2432
   #3  0x00463733 in heap_xlog_clean (lsn=,
   record=0x1787e1c0) at heapam.c:4168
   #4  heap2_redo (lsn=, record=0x1787e1c0) at 
heapam.c:4858
   #5  0x00488780 in StartupXLOG () at xlog.c:6250
   #6  0x0048a888 in StartupProcessMain () at xlog.c:9254
   #7  0x004a11ef in AuxiliaryProcessMain (argc=2, argv=) at bootstrap.c:412
   #8  0x005c66c9 in StartChildProcess (type=StartupProcess) at
   postmaster.c:4427
   #9  0x005c8ab7 in PostmasterMain (argc=1, argv=0x17858bb0) at
   postmaster.c:1088
   #10 0x005725fe in main (argc=1, argv=) at 
main.c:188


The platform is CentOS 5.5 x86-64, kernel version 2.6.18-194.11.4.el5

I'm not quite sure where to start digging. Has anyone else seen 
something similar? Our consultant reports having seen a similar problem 
elsewhere, at a client who was running hot standby on 9.0.1, but the 
problem did not recur, as it does fairly regularly with this client.


cheers

andrew




Re: [HACKERS] review: FDW API

2011-02-18 Thread Heikki Linnakangas

On 18.02.2011 22:16, Tom Lane wrote:

> Heikki Linnakangas writes:
>> Another version, rebased against master branch and with a bunch of small
>> cosmetic fixes.
>>
>> I guess this is as good as this is going to get for 9.1.
>
> This is *badly* in need of another cleanup pass; it's full of typos,
> contradictory comments, #ifdef NOT_USED stuff, etc etc.  And the
> documentation is really inadequate.  If you're out of energy to go
> over it, I guess I should step up.


If you have the energy, by all means, thanks.


> Question after first look: what is the motivation for passing
> estate->es_param_list_info to BeginScan?  AFAICS, even if there is a
> reason for that function to need that, it isn't receiving any info that
> would be sufficient to let it know what's in there.


The idea is that when the query is planned, the FDW can choose to push 
down a qual that contains a parameter marker, like "WHERE remotecol = 
$1". At execution time, it needs the value of the parameter to send it 
to the remote server. The PostgreSQL FDW does that, although I didn't 
test it so it might well be broken.



> What seems more
> likely to be useful is to pass in the EState pointer, as for example
> being able to look at es_query_cxt seems like a good idea.


By "look at", you mean allocate stuff in it?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] contrib loose ends: 9.0 to 9.1 incompatibilities

2011-02-18 Thread Dimitri Fontaine
Tom Lane  writes:
>> OK.  Thanks for nailing all of this down - that's got to have been a
>> heck of a job.

+1

> Yeah, it was a bit of a pain, and took longer than I would've hoped.

Well, with some luck (and effort) 9.2 will have the missing DDL pieces.
I think the extension features means we now need support for all kind of
ALTER things, even on operator classes and families.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] review: FDW API

2011-02-18 Thread Tom Lane
Heikki Linnakangas  writes:
> Another version, rebased against master branch and with a bunch of small 
> cosmetic fixes.

> I guess this is as good as this is going to get for 9.1.

This is *badly* in need of another cleanup pass; it's full of typos,
contradictory comments, #ifdef NOT_USED stuff, etc etc.  And the
documentation is really inadequate.  If you're out of energy to go
over it, I guess I should step up.

Question after first look: what is the motivation for passing
estate->es_param_list_info to BeginScan?  AFAICS, even if there is a
reason for that function to need that, it isn't receiving any info that
would be sufficient to let it know what's in there.  What seems more
likely to be useful is to pass in the EState pointer, as for example
being able to look at es_query_cxt seems like a good idea.

BTW, I see no particularly good reason to let the FDW omit ReScan.
If it wants to implement that as end-and-begin, it can do so internally.
It would be a lot clearer to just insist that all the function pointers
be valid, as indeed some (not all) of the comments say already.

regards, tom lane



Re: [HACKERS] Snapshot synchronization, again...

2011-02-18 Thread Alvaro Herrera


I have two questions: 

1. why are you using the expansible char array stuff instead of using
the StringInfo facility?

2. is md5 the most appropriate digest for this?  If you need a
cryptographically secure hash, do we need something stronger?  If not,
why not just use hash_any?

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] SR standby hangs

2011-02-18 Thread Tom Lane
Robert Haas  writes:
> On Fri, Feb 18, 2011 at 2:35 PM, Andrew Dunstan  wrote:
>> It's not running HS, so there's no query to wait on.

> That seems to imply that recovery has leaked a buffer pin.

No, because then the sanity check in LockBufferForCleanup would have
fired:

    /* There should be exactly one local pin */
    if (PrivateRefCount[buffer - 1] != 1)
        elog(ERROR, "incorrect local pin count: %d",
             PrivateRefCount[buffer - 1]);

Some sort of deadly embrace with the bgwriter, maybe?

regards, tom lane



Re: [HACKERS] WIP - Add ability to constrain backend temporary file space

2011-02-18 Thread Josh Berkus
On 2/18/11 11:44 AM, Robert Haas wrote:
> On Fri, Feb 18, 2011 at 2:41 PM, Josh Berkus  wrote:
>> Second, the main issue with these sorts of macro-counters has generally
>> been their locking effect on concurrent activity.  Have you been able to
>> run any tests which try to run lots of small externally-sorted queries
>> at once on a multi-core machine, and checked the effect on throughput?
> 
> Since it's apparently a per-backend limit, that doesn't seem relevant.

Oh!  I missed that.

What good would a per-backend limit do, though?

And what happens with queries which exceed the limit?  Error message?  Wait?


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] WIP - Add ability to constrain backend temporary file space

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 2:41 PM, Josh Berkus  wrote:
> Second, the main issue with these sorts of macro-counters has generally
> been their locking effect on concurrent activity.  Have you been able to
> run any tests which try to run lots of small externally-sorted queries
> at once on a multi-core machine, and checked the effect on throughput?

Since it's apparently a per-backend limit, that doesn't seem relevant.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] SR standby hangs

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 2:35 PM, Andrew Dunstan  wrote:
> It's not running HS, so there's no query to wait on.

That seems to imply that recovery has leaked a buffer pin.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP - Add ability to constrain backend temporary file space

2011-02-18 Thread Josh Berkus
Mark,

> I got to wonder how hard this would be to do in Postgres, and attached
> is my (WIP) attempt. It provides a guc (max_temp_files_size) to limit
> the size of all temp files for a backend and amends fd.c to cancel
> execution if the total size of temporary files exceeds this.

First, are we just talking about pgsql_tmp here, or the pg_temp
tablespace?  That is, just sort/hash files, or temporary tables as well?

Second, the main issue with these sorts of macro-counters has generally
been their locking effect on concurrent activity.  Have you been able to
run any tests which try to run lots of small externally-sorted queries
at once on a multi-core machine, and checked the effect on throughput?

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] using a lot of maintenance_work_mem

2011-02-18 Thread Tom Lane
Frederik Ramm  writes:
> The single query where pg9.0 beat pg8.3 by a country mile was a CREATE 
> INDEX statement on a BIGINT column to a table with about 500 million 
> records - this cost 2679 seconds on normal 8.3, 2443 seconds on 
> large-memory 8.3, and around 1650 seconds on 9.0, large memory or not.

FWIW, that's probably due to bigint having become pass-by-value on
64-bit platforms.

regards, tom lane



Re: [HACKERS] Assertion failure on UNLOGGED VIEW and SEQUENCE

2011-02-18 Thread Tom Lane
Robert Haas  writes:
> On Fri, Feb 18, 2011 at 2:17 PM, Tom Lane  wrote:
>> +1 for where you put the tests, but I don't think
>> ERRCODE_SYNTAX_ERROR is an appropriate errcode.  I'd go with
>> FEATURE_NOT_SUPPORTED for both, I think.

> I hesitate to use FEATURE_NOT_SUPPORTED for something that's
> nonsensical anyway.  I picked SYNTAX_ERROR after some scrutiny of what
> I believe to be parallel cases, such as EXPLAIN (FOO) SELECT 1 and
> CREATE TABLE t AS SELECT 1 INTO me.

[ shrug... ]  I don't care enough to argue about it.

regards, tom lane



Re: [HACKERS] Debian readline/libedit breakage

2011-02-18 Thread Bruce Momjian
Tom Lane wrote:
> Andrew Dunstan  writes:
> > On 02/17/2011 04:09 PM, Martijn van Oosterhout wrote:
> >> This is supported. Where it goes wonky is that this also has to work
> >> when the connection is via SSL. So libpq provides a function to return
> >> (via a void*) a pointer to the OpenSSL structure so that can be used to
> >> communicate with the server.
> 
> > Ugh. Maybe not the best design decision we've ever made.
> 
> libpq-fe.h is pretty clear on this matter:
> 
> /* Get the OpenSSL structure associated with a connection. Returns NULL for
>  * unencrypted connections or if any other TLS library is in use. */
> extern void *PQgetssl(PGconn *conn);
> 
> We are under no compulsion to emulate OpenSSL if we switch to another
> library.  The design intent is that we'd provide a separate function
> (PQgetnss?) and callers that know how to use that library would call
> that function.  If they don't, it's not our problem.

Who uses this?  ODBC?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] SR standby hangs

2011-02-18 Thread Andrew Dunstan



On 02/18/2011 02:23 PM, Tom Lane wrote:

> Andrew Dunstan writes:
>> The symptom is that the recovery process blocks forever on a semaphore.
>> We've crashed it and got the following backtrace:
>>  #0  0x003493ed5337 in semop () from /lib64/libc.so.6
>>  #1  0x005bd103 in PGSemaphoreLock (sema=0x2b14986aec38, interruptOK=1
>>  '\001') at pg_sema.c:420
>>  #2  0x005de645 in LockBufferForCleanup () at bufmgr.c:2432
>>  #3  0x00463733 in heap_xlog_clean (lsn=<value optimized out>,
>>  record=0x1787e1c0) at heapam.c:4168
>>  #4  heap2_redo (lsn=<value optimized out>, record=0x1787e1c0) at heapam.c:4858
>>  #5  0x00488780 in StartupXLOG () at xlog.c:6250
>
> So who's holding the buffer lock that it wants?  Are you sure this is an
> actual hang, and not just recovery waiting for a standby query to complete?




It's not running HS, so there's no query to wait on.

cheers

andrew



Re: [HACKERS] review: FDW API

2011-02-18 Thread Tom Lane
Robert Haas  writes:
> On Fri, Feb 18, 2011 at 10:47 AM, Tom Lane  wrote:
>> The main downside of that is that relation relkinds would have
>> to become fixed (because there would be no practical way of updating
>> RTEs in stored rules), which means the "convert relation to view" hack
>> would have to go away.  Offhand I think no one cares about that anymore,
>> but ...

> That actually sounds like a possible problem, because it's possible to
> create views with circular dependencies using CORV, and they won't
> dump-and-reload properly without that hack.  It's not a particularly
> useful thing to do, of course, and I think we could reengineer pg_dump
> to not need the hack even if someone does do it, but that sounds like
> more work than we want to tackle right now.

Urgh.  That's problematic, because even if we changed pg_dump (which
would not be that hard I think), we'd still have to cope with dump files
emitted by existing versions of pg_dump.  The time constant before that
stops being an issue is measured in years.  I'm not at all sure whether
the circular dependency case is infrequent enough that we could get away
with saying "tough luck" to people who hit the case.

[ thinks a bit ... ]  But we can probably hack our way around that:
teach the rule rewriter to update relkind in any RTE it brings in from a
stored rule.  We already do something similar in some other cases where
a stored parsetree node contains information that could become obsolete.

But that conclusion just makes it even clearer that fixing this
performance problem, if it even is real, should be a separate patch.

regards, tom lane



Re: [HACKERS] SR standby hangs

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 2:16 PM, Andrew Dunstan  wrote:
> I'm not quite sure where to start digging. Has anyone else seen
> something similar? Our consultant reports having seen a similar problem
> elsewhere, at a client who was running hot standby on 9.0.1, but the
> problem did not recur, as it does fairly regularly with this client.

I've seen a very similar backtrace that only involved one system (no
Hot Standby).  The problem in that case appears to have been an open
cursor holding a buffer pin.  LockBufferForCleanup() has logic that's
supposed to prevent that from going on too long during HS - it should
nuke the guy with the buffer in when the timeout expires - but maybe
there's a bug in that mechanism.

As a side matter, it would be good to improve this in the non-Hot
Standby case also.  An open cursor can tie down an autovacuum worker
forever, which is not a good thing, as it's easily possible for the
number of open cursors to be larger than the number of available
autovacuum workers...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pg_basebackup and wal streaming

2011-02-18 Thread Bruce Momjian
Magnus Hagander wrote:
> Better late than never (or?), here's the final cleanup of
> pg_streamrecv for moving into the main distribution, per discussion
> back in late dec or early jan. It also includes the "stream logs in
> parallel to backup" part that was not completed on pg_basebackup.
> 
> Other than that, the only changes to pg_basebackup are the moving of a
> couple of functions into streamutil.c to make them usable from both,
> and the progress format output fix Fujii-san mentioned recently.
> 
> Should be complete except for Win32 support (needs thread/fork thing
> for the  background streaming feature. Shouldn't be too hard, and I
> guess that falls on me anyway..) and the reference documentation.
> 
> And with no feedback to my question here
> (http://archives.postgresql.org/pgsql-hackers/2011-02/msg00805.php), I
> went with the "duplicate the macros that are needed to avoid loading
> postgres.h" path.
> 
> Yes, I realize that this is far too late in the CF process really, but
> I wanted to post it anyway... If it's too late to be acceptable it
> should be possible to maintain this outside the main repository until
> 9.2, since it only changes frontend binaries. So I'm not actually
> going to put it on the CF page unless someone else says that's a good
> idea, to at least share the blame from Robert ;)

Well, if you are going to stand behind it, the CF is not a requirement
and you can apply it.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] using a lot of maintenance_work_mem

2011-02-18 Thread Frederik Ramm

Tom & Kevin,

   thank you for your replies. Kevin, I had already employed all the
tricks you mention except temporary tables, which would be hard for me due
to the structure of my application (though I could try using something
like pgbouncer). Thanks a lot for sharing the ideas.


Tom Lane wrote:
>> If I were to either (a) increase MaxAllocSize to, say, 48 GB instead of
>> 1 GB, or (b) hack tuplesort.c to ignore MaxAllocSize, just for my local
>> setup - would that likely be viable in my situation, or would I break
>> countless things?
>
> You would break countless things.


Indeed I did. I tried to raise the MaxAllocSize from 1 GB to a large 
number, but immediately got strange memory allocation errors during the 
regression test (something that looked like a wrapped integer in a 
memory allocation request).


I reduced the number in steps, and found I could compile and run 
PostgreSQL 8.3 with a MaxAllocSize of 4 GB, and PostgreSQL 9.0 with 2 GB 
without breakage.


In a completely un-scientific test run, comprising 42 individual SQL 
statements aimed at importing and indexing a large volume of data, I got 
the following results:


pg8.3 with normal MaxAllocSize .. 15284s
pg8.3 with MaxAllocSize increased to 4 GB ... 14609s (-4.5%)
pg9.0 with normal MaxAllocSize .. 12969s (-15.2%)
pg9.0 with MaxAllocSize increased to 2 GB ... 13211s (-13.5%)


> I'd want to see some evidence that it's actually
> helpful for production situations.  I'm a bit dubious that you're going
> to gain much here.


So, on the whole it seems you were right; the performance, at least with 
that small memory increase I managed to build in without breaking 
things, doesn't increase a lot, or not at all for PostgreSQL 9.0.


The single query that gained most from the increase in memory was an 
ALTER TABLE statement to add a BIGINT primary key to a table with about 
50 million records - this was 75% faster on both 8.3 and 9.0, but since 
it took only 120 seconds to begin with, it didn't change the result a 
lot.


The single query where pg9.0 beat pg8.3 by a country mile was a CREATE 
INDEX statement on a BIGINT column of a table with about 500 million 
records - this cost 2679 seconds on normal 8.3, 2443 seconds on 
large-memory 8.3, and around 1650 seconds on 9.0, large memory or not.


The query that, on both 8.3 and 9.0, took about 10% longer with more 
memory was a CREATE INDEX statement on a TEXT column.


All this, as I said, completely un-scientific - I did take care to flush 
caches and not run anything in parallel, but that was about all I did so 
it might come out differently when run often.


My result of all of this? Switch to 9.0 of course ;)

Bye
Frederik

--
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"



Re: [HACKERS] Assertion failure on UNLOGGED VIEW and SEQUENCE

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 2:17 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> OK.  Proposed patch attached.  It looks to me like an unlogged view is
>> inherently nonsensical, whereas an unlogged sequence is sensible but
>> we don't implement it (and may never implement it, absent some
>> evidence that the overhead of WAL logging sequence changes is worth
>> getting excited about), so I wrote the error messages to reflect that
>> distinction.  I also added a couple of matching regression tests, and
>> documented that UNLOGGED works with SELECT INTO.  I put the check for
>> views in DefineView adjacent to the other check that already cares
>> about relpersistence, and I put the one in DefineSequence to match, at
>> the top for lack of any compelling theory of where it ought to go.  I
>> looked at stuffing it all the way down into DefineRelation but that
>> looks like it would require some other rejiggering of existing logic
>> and assertions, which seems pointless and potentially prone to
>> breaking things.
>
> Regression tests for this seem pretty pointless (ie, a waste of cycles
> forevermore).  +1 for where you put the tests, but I don't think
> ERRCODE_SYNTAX_ERROR is an appropriate errcode.  I'd go with
> FEATURE_NOT_SUPPORTED for both, I think.

I hesitate to use FEATURE_NOT_SUPPORTED for something that's
nonsensical anyway.  I picked SYNTAX_ERROR after some scrutiny of what
I believe to be parallel cases, such as EXPLAIN (FOO) SELECT 1 and
CREATE TABLE t AS SELECT 1 INTO me.

> Also, it might be worth
> putting some of the above justification into the comments, eg
>
>        /* Unlogged sequences are not implemented --- not clear if useful */
>
> versus
>
>        /* Unlogged views are pretty nonsensical */
>
> rather than duplicate comments describing non-duplicate cases.

Good idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] SR standby hangs

2011-02-18 Thread Tom Lane
Andrew Dunstan  writes:
> The symptom is that the recovery process blocks forever on a semaphore.
> We've crashed it and got the following backtrace:

> #0  0x003493ed5337 in semop () from /lib64/libc.so.6
> #1  0x005bd103 in PGSemaphoreLock (sema=0x2b14986aec38, interruptOK=1
> '\001') at pg_sema.c:420
> #2  0x005de645 in LockBufferForCleanup () at bufmgr.c:2432
> #3  0x00463733 in heap_xlog_clean (lsn=<value optimized out>,
> record=0x1787e1c0) at heapam.c:4168
> #4  heap2_redo (lsn=<value optimized out>, record=0x1787e1c0) at heapam.c:4858
> #5  0x00488780 in StartupXLOG () at xlog.c:6250

So who's holding the buffer lock that it wants?  Are you sure this is an
actual hang, and not just recovery waiting for a standby query to complete?

regards, tom lane



Re: [HACKERS] review: FDW API

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 10:47 AM, Tom Lane  wrote:
> Heikki Linnakangas  writes:
>> On 15.02.2011 23:00, Tom Lane wrote:
>>> I think moving the error check downstream would be a good thing.
>
>> Ok, I tried moving the error checks to preprocess_rowmarks().
>> Unfortunately RelOptInfos haven't been built at that stage yet, so you
>> still have to do the catalog lookup to get the relkind. That defeats the
>> purpose.
>
> Mph.  It seems like the right fix here is to add relkind to
> RangeTblEntry: it could be filled in for free in addRangeTableEntry, for
> example.

Heikki and I came to the same conclusion yesterday while chatting
about this on IM.

> The main downside of that is that relation relkinds would have
> to become fixed (because there would be no practical way of updating
> RTEs in stored rules), which means the "convert relation to view" hack
> would have to go away.  Offhand I think no one cares about that anymore,
> but ...

That actually sounds like a possible problem, because it's possible to
create views with circular dependencies using CORV, and they won't
dump-and-reload properly without that hack.  It's not a particularly
useful thing to do, of course, and I think we could reengineer pg_dump
to not need the hack even if someone does do it, but that sounds like
more work than we want to tackle right now.

> In any case, this is looking like a performance optimization that should
> be dealt with in a separate patch.  I'd suggest leaving the code in the
> form with the extra relkind lookups for the initial commit.  It's
> possible that no one would notice the extra lookups anyway --- have you
> benchmarked it?

This is a good point... although I think this results in at least one
extra syscache lookup per table per SELECT, even when no foreign
tables are involved.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Assertion failure on UNLOGGED VIEW and SEQUENCE

2011-02-18 Thread Tom Lane
Robert Haas  writes:
> OK.  Proposed patch attached.  It looks to me like an unlogged view is
> inherently nonsensical, whereas an unlogged sequence is sensible but
> we don't implement it (and may never implement it, absent some
> evidence that the overhead of WAL logging sequence changes is worth
> getting excited about), so I wrote the error messages to reflect that
> distinction.  I also added a couple of matching regression tests, and
> documented that UNLOGGED works with SELECT INTO.  I put the check for
> views in DefineView adjacent to the other check that already cares
> about relpersistence, and I put the one in DefineSequence to match, at
> the top for lack of any compelling theory of where it ought to go.  I
> looked at stuffing it all the way down into DefineRelation but that
> looks like it would require some other rejiggering of existing logic
> and assertions, which seems pointless and potentially prone to
> breaking things.

Regression tests for this seem pretty pointless (ie, a waste of cycles
forevermore).  +1 for where you put the tests, but I don't think
ERRCODE_SYNTAX_ERROR is an appropriate errcode.  I'd go with
FEATURE_NOT_SUPPORTED for both, I think.  Also, it might be worth
putting some of the above justification into the comments, eg

/* Unlogged sequences are not implemented --- not clear if useful */

versus

/* Unlogged views are pretty nonsensical */

rather than duplicate comments describing non-duplicate cases.

regards, tom lane



[HACKERS] SR standby hangs

2011-02-18 Thread Andrew Dunstan


[this time from the right address - apologies if we get a duplicate]

PostgreSQL Experts Inc has a client with a 9.0.2 streaming replication server 
that somehow becomes wedged after running for some time.

The server is running as a warm standby, and the client's application
tries to connect to both the master and the slave, accepting whichever
lets it connect (hence hot standby is not turned on).

Archive files are being shipped as well as WAL streaming.

The symptom is that the recovery process blocks forever on a semaphore.
We've crashed it and got the following backtrace:

   #0  0x003493ed5337 in semop () from /lib64/libc.so.6
   #1  0x005bd103 in PGSemaphoreLock (sema=0x2b14986aec38, interruptOK=1
   '\001') at pg_sema.c:420
   #2  0x005de645 in LockBufferForCleanup () at bufmgr.c:2432
   #3  0x00463733 in heap_xlog_clean (lsn=<value optimized out>,
   record=0x1787e1c0) at heapam.c:4168
   #4  heap2_redo (lsn=<value optimized out>, record=0x1787e1c0) at heapam.c:4858
   #5  0x00488780 in StartupXLOG () at xlog.c:6250
   #6  0x0048a888 in StartupProcessMain () at xlog.c:9254
   #7  0x004a11ef in AuxiliaryProcessMain (argc=2, argv=<value optimized out>) at bootstrap.c:412
   #8  0x005c66c9 in StartChildProcess (type=StartupProcess) at
   postmaster.c:4427
   #9  0x005c8ab7 in PostmasterMain (argc=1, argv=0x17858bb0) at
   postmaster.c:1088
   #10 0x005725fe in main (argc=1, argv=<value optimized out>) at main.c:188


The platform is CentOS 5.5 x86-64, kernel version 2.6.18-194.11.4.el5

I'm not quite sure where to start digging. Has anyone else seen
something similar? Our consultant reports having seen a similar problem
elsewhere, at a client who was running hot standby on 9.0.1, but the
problem did not recur, as it does fairly regularly with this client.

cheers

andrew





Re: [HACKERS] Assertion failure on UNLOGGED VIEW and SEQUENCE

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 10:02 AM, Tom Lane  wrote:
> If by the first option you mean causing the failure report to be "syntax
> error" rather than anything more specific, then I agree that option
> sucks.  I'd actually vote for putting the error test somewhere in
> statement execution code, well downstream of gram.y.  The parser's job
> is to produce a parse tree, not to encapsulate knowledge about which
> combinations of options are supported.

OK.  Proposed patch attached.  It looks to me like an unlogged view is
inherently nonsensical, whereas an unlogged sequence is sensible but
we don't implement it (and may never implement it, absent some
evidence that the overhead of WAL logging sequence changes is worth
getting excited about), so I wrote the error messages to reflect that
distinction.  I also added a couple of matching regression tests, and
documented that UNLOGGED works with SELECT INTO.  I put the check for
views in DefineView adjacent to the other check that already cares
about relpersistence, and I put the one in DefineSequence to match, at
the top for lack of any compelling theory of where it ought to go.  I
looked at stuffing it all the way down into DefineRelation but that
looks like it would require some other rejiggering of existing logic
and assertions, which seems pointless and potentially prone to
breaking things.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


unlogged-fixes.patch
Description: Binary data



Re: [HACKERS] Proposal: collect frequency statistics for arrays

2011-02-18 Thread Tom Lane
Alexander Korotkov  writes:
> I have following proposal. Currently the ts_typanalyze function accumulates
> frequency statistics for ts_vector using lossy counting technique. But no
> frequency statistics is collecting over arrays. I'm going to generalize
> ts_typanalyze to make it collecting statistics for arrays too. ts_typanalyze
> internally uses lexeme comparison and hashing. I'm going to use functions
> from default btree and hash opclasses of array element type in this
> capacity. Collected frequency statistics for arrays can be used for && and
> @> operators selectivity estimation.

It'd be better to just make a separate function for arrays, instead of
trying to kluge ts_typanalyze to the point where it'd cover both cases.

regards, tom lane



[HACKERS] Proposal: collect frequency statistics for arrays

2011-02-18 Thread Alexander Korotkov
Hackers,

I have the following proposal. Currently the ts_typanalyze function
accumulates frequency statistics for ts_vector using the lossy counting
technique, but no frequency statistics are collected over arrays. I'm going
to generalize ts_typanalyze to make it collect statistics for arrays too.
ts_typanalyze internally uses lexeme comparison and hashing. I'm going to
use functions from the default btree and hash opclasses of the array
element type in this capacity. The collected frequency statistics for
arrays can be used for && and @> operator selectivity estimation.
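For readers unfamiliar with it, the lossy counting technique mentioned above can be sketched briefly. This is an illustrative Python version only — ts_typanalyze's actual implementation is C code working over tsvector lexemes, and the epsilon value and toy stream below are assumptions chosen for the example:

```python
def lossy_count(stream, epsilon=0.1):
    """Approximate frequencies of items whose true frequency may
    exceed epsilon * len(stream), using bounded memory."""
    bucket_width = int(1 / epsilon)          # bucket width w = 1/epsilon
    counts = {}                              # item -> (count, max undercount)
    current_bucket = 1
    for n, item in enumerate(stream, start=1):
        if item in counts:
            f, delta = counts[item]
            counts[item] = (f + 1, delta)
        else:
            # a newly tracked item may have been missed in earlier buckets
            counts[item] = (1, current_bucket - 1)
        if n % bucket_width == 0:            # end of bucket: prune rare items
            for key in [k for k, (f, d) in counts.items()
                        if f + d <= current_bucket]:
                del counts[key]
            current_bucket += 1
    return {k: f for k, (f, _) in counts.items()}

# Frequent items survive pruning; rare ones are periodically discarded.
freqs = lossy_count(["a"] * 50 + ["b"] * 30 + list("cdefghij") * 2)
```

The point for statistics collection is that the memory footprint depends on epsilon rather than on the number of distinct elements, which is what makes it usable inside ANALYZE.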

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Fix for Index Advisor related hooks

2011-02-18 Thread Tom Lane
Heikki Linnakangas  writes:
> On 18.02.2011 17:02, Gurjeet Singh wrote:
>>> On Wed, Feb 16, 2011 at 6:37 PM, Tom Lane   wrote:
 Fetch the values you need and stuff 'em in the struct.  Don't expect
 relcache to do it for you.

>>> I also wish to make Index Advisor faster by not having to lookup this info
>>> every time a new query comes in and that's why I was trying to use the
>>> guts of IndexSupportInitialize() where it does the caching.

> Nah, I don't buy that, sounds like a premature optimization. Just 
> planning a bunch of SQL statements, even if there's hundreds of them in 
> the file, shouldn't take that long. And even if it does, I don't believe 
> without evidence that the catalog lookups for the hypothetical indexes 
> is where the time is spent.

But even more to the point, IndexSupportInitialize is simply not well
matched to the problem.  It's designed to help produce a relcache entry
from a pg_index entry, and in particular to look up opfamily and input
datatype from an opclass OID.  (Oh, and to produce info about index
support functions, which you certainly don't need.)  But as far as I can
see, an index advisor would already *have* opfamily and input datatype,
because what it's going to be starting from is some query WHERE
condition that it thinks would be worth optimizing.  What it's going to
get from looking up that operator in pg_amop is opfamily and opcintype
information.  Looking up an opclass from that is unnecessary work as far
as I can see (you don't need it to put in IndexOptInfo, for sure), and
reversing the lookup afterwards is certainly pointless.

So even granted that performance is critical, you haven't made a case
why exposing IndexSupportInitialize is going to be useful.

regards, tom lane



Re: [HACKERS] Fix for Index Advisor related hooks

2011-02-18 Thread Heikki Linnakangas

On 18.02.2011 17:02, Gurjeet Singh wrote:

On Fri, Feb 18, 2011 at 2:27 AM, Heikki Linnakangas<
heikki.linnakan...@enterprisedb.com>  wrote:


On 17.02.2011 14:30, Gurjeet Singh wrote:


On Wed, Feb 16, 2011 at 6:37 PM, Tom Lane   wrote:

Fetch the values you need and stuff 'em in the struct.  Don't expect

relcache to do it for you.  The only reason relcache is involved in the
current workflow is that we try to cache the information across queries
in order to save on planner startup time ... but I don't think that that
concern is nearly as pressing for something like Index Advisor.  You'll
have enough to do tracking changes in IndexOptInfo, you don't need to
have to deal with refactorings inside relcache as well.



I also wish to make Index Advisor faster by not having to lookup this info
every time a new query comes in and that's why I was trying to use the
guts
of IndexSupportInitialize() where it does the caching.



I doubt performance matters much here. It's not like you're going to be
explaining hundreds of queries per second with a hypothetical index
installed. I think of this as a manual tool that you run from a GUI when you
wonder if an index on column X would help.

The question is, can the Index Advisor easily access all the information it
needs to build the IndexOptInfo? If not, what's missing?



There's a command line tool in the Index Adviser contrib that takes a file
full of SQL and runs them against the Index Adviser. People would want that
tool to be as fast as it can be.


Nah, I don't buy that, sounds like a premature optimization. Just 
planning a bunch of SQL statements, even if there's hundreds of them in 
the file, shouldn't take that long. And even if it does, I don't believe 
without evidence that the catalog lookups for the hypothetical indexes 
is where the time is spent.



Another use case of the Index Advisor is to be switched on for a few hours
while the application runs, and gather the recommendations for the whole
run. We'll need good performance in that case too.


How exactly does that work? I would imagine that you log all the 
different SQL statements and how often they're run during that period. 
Similar to pgFouine, for example. And only then you run the index 
advisor on the collected SQL statements.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] contrib loose ends: 9.0 to 9.1 incompatibilities

2011-02-18 Thread Tom Lane
Robert Haas  writes:
> On Fri, Feb 18, 2011 at 12:01 PM, Tom Lane  wrote:
>> OK, I held my nose and inserted UPDATE commands to make the opclasses
>> match.  AFAICT the only remaining discrepancy between contrib modules
>> made fresh in 9.1 and those updated from 9.0 is the question of citext's
>> collation property, which as noted in the other thread is not worth
>> dealing with until the collation stuff is a bit better thought out.

> OK.  Thanks for nailing all of this down - that's got to have been a
> heck of a job.

Yeah, it was a bit of a pain, and took longer than I would've hoped.
It was worth doing though --- I think it not unlikely that in the
long run, the extensions feature will be seen as the largest single
improvement in 9.1.

regards, tom lane



Re: [HACKERS] contrib loose ends: 9.0 to 9.1 incompatibilities

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 12:01 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Feb 17, 2011 at 9:13 PM, Tom Lane  wrote:
>>> 2. intarray and tsearch2 use some core support functions in their
>>> GIN opclasses, and those support functions changed signatures in 9.1.
>>> The current solution to this involves having stub functions in core
>>> with the old signatures; when you do an upgrade from the 9.0 version
>>> of one of these contrib modules, its opclass will be pointing at the
>>> stub version instead of the preferred version.  I guess we could fix
>>> that with a direct UPDATE on pg_amproc but I'm not sure that's a
>>> good idea.  Note these functions aren't actually *members* of the
>>> extensions, just things it references, so the odds of future trouble
>>> seem pretty small.  On the other hand, if we don't do this, it's
>>> unclear when we'll ever be able to get rid of the stubs.
>>>
>>> Comments?
>
>> ISTM that the pg_amproc entries are part of the operator class, which
>> is owned by the extension.  So it's the upgrade script's job to leave
>> the operator class in the right state.
>
> OK, I held my nose and inserted UPDATE commands to make the opclasses
> match.  AFAICT the only remaining discrepancy between contrib modules
> made fresh in 9.1 and those updated from 9.0 is the question of citext's
> collation property, which as noted in the other thread is not worth
> dealing with until the collation stuff is a bit better thought out.

OK.  Thanks for nailing all of this down - that's got to have been a
heck of a job.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Re: [COMMITTERS] pgsql: Make a hard state change from catchup to streaming mode.

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 10:56 AM, Simon Riggs  wrote:
> We need a hard state change at one particular point to keep sync rep
> requestors from hanging for hours when the standby is connected but a
> long way from being caught up.

That's a matter of opinion.  I think there are a number of people here
who would say that what you need is a good way to know when you've
caught up, and that you shouldn't enable sync rep until that happens.
What you're proposing would be useful too, if it didn't break other
cases, but it does.  This is precisely why it's a bad idea for us to
be trying to do this kind of engineering at the very last minute.

> This was a part of my sync rep patch that was easier to break out and
> commit early. There has never been any comment or objection to this
> concept and the same feature has existed in my patches for months.

You posted the latest version of your sync rep patch on January 15th,
after months of inactivity.  Heikki reviewed it on the 21st, and there
it sat un-updated for three weeks.  If your expectation is that any
portion of that patch to which nobody specifically objected is fair
game to commit without further discussion, I don't think that's the
way it works around here.

Post your patches and we'll review them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] contrib loose ends: 9.0 to 9.1 incompatibilities

2011-02-18 Thread Tom Lane
Robert Haas  writes:
> On Thu, Feb 17, 2011 at 9:13 PM, Tom Lane  wrote:
>> 2. intarray and tsearch2 use some core support functions in their
>> GIN opclasses, and those support functions changed signatures in 9.1.
>> The current solution to this involves having stub functions in core
>> with the old signatures; when you do an upgrade from the 9.0 version
>> of one of these contrib modules, its opclass will be pointing at the
>> stub version instead of the preferred version.  I guess we could fix
>> that with a direct UPDATE on pg_amproc but I'm not sure that's a
>> good idea.  Note these functions aren't actually *members* of the
>> extensions, just things it references, so the odds of future trouble
>> seem pretty small.  On the other hand, if we don't do this, it's
>> unclear when we'll ever be able to get rid of the stubs.
>> 
>> Comments?

> ISTM that the pg_amproc entries are part of the operator class, which
> is owned by the extension.  So it's the upgrade script's job to leave
> the operator class in the right state.

OK, I held my nose and inserted UPDATE commands to make the opclasses
match.  AFAICT the only remaining discrepancy between contrib modules
made fresh in 9.1 and those updated from 9.0 is the question of citext's
collation property, which as noted in the other thread is not worth
dealing with until the collation stuff is a bit better thought out.

regards, tom lane



Re: [HACKERS] Re: [COMMITTERS] pgsql: Make a hard state change from catchup to streaming mode.

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 10:41 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Fri, Feb 18, 2011 at 10:09 AM, Simon Riggs  wrote:
>>> Make a hard state change from catchup to streaming mode.
>>> More useful state change for monitoring purposes, plus a
>>> required change for synchronous replication patch.
>
>> As far as I can see, this patch was not posted or discussed before
>> commit, and I'm not sure it's the behavior everyone wants.  It has the
>> effect of preventing the system from ever going backwards from
>> "streaming" to "catchup".  Is that what we want?
>
> That seems like a very bad idea from here.  Being able to go back to
> catchup after loss of the streaming connection is essential for
> robustness.  If we now have to restart the slave for that to happen,
> it's not an improvement.

No, that's not the case where it matters.  The state would get reset
on reconnect.  The problem is when, say, the master server is
generating WAL at a rate which exceeds the network bandwidth of the
link between the master and the standby.  The previous coding will
make us flip back into the catchup state when that happens.

Actually, the old code is awfully sensitive, and knowing that you are
not caught up is not really enough information: you need to know how
far behind you are, and how long you've been behind for.  I'm guessing
that Simon intended this patch to deal with that problem, but it's not
the right fix.
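The distinction can be sketched concretely. This is purely illustrative Python, not proposed PostgreSQL behavior — the threshold values are invented for the example — but it shows how combining lag size with time behind avoids flapping on a momentary blip while still reporting a genuinely lagging standby:

```python
def walsender_state(lag_bytes, seconds_behind,
                    lag_threshold=16 * 1024 * 1024, grace_seconds=5):
    """Report 'catchup' only when the standby is both far behind and
    has been behind for a while; a brief spike stays 'streaming'."""
    if lag_bytes > lag_threshold and seconds_behind > grace_seconds:
        return "catchup"
    return "streaming"

# A short burst of WAL does not flip the state...
state_during_spike = walsender_state(64 * 1024 * 1024, 1)
# ...but sustained lag does.
state_when_lagging = walsender_state(64 * 1024 * 1024, 60)
```

A raw "caught up right now?" bit, by contrast, flips on every burst of WAL generation, which is the sensitivity complained about above.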

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] ALTER TYPE COLLATABLE?

2011-02-18 Thread Tom Lane
Peter Eisentraut  writes:
> On tor, 2011-02-17 at 17:50 -0500, Tom Lane wrote:
>> Is it time for a direct UPDATE on the pg_type row?  If so, to what?  I see
>> pg_type.typcollation is supposed to be an OID, so how the heck does
>> one map a bool CREATE TYPE parameter into the catalog entry?

> It's 100, which is the OID of "default" in pg_collation.  The value may
> be different for domains.  (Earlier versions of the feature had a
> boolean column and a separate collation column for domains, but somehow
> it turned out to be quite redundant.)

While testing a fix for this, I observe that pg_dump is entirely broken
on the subject, because it fails to dump anything at all about the
typcollation property when dumping a base type.  I also rather wonder
exactly what pg_dump would dump to restore a value of
pg_type.typcollation that's not either 0 or 100.

In short: I think this feature is quite a few bricks shy of a load yet,
and there's no point in my kluging something in citext until it settles
down more.

regards, tom lane



Re: [HACKERS] Students enrollment

2011-02-18 Thread Kevin Grittner
Roman Prykhodchenko  wrote:
 
> My students have to make their science projects during this
> half-year. I would like to involve some of them in PostgreSQL to
> give them an opportunity to work on a real project instead of
> writing their own equation solvers or any other useless apps.
> 
> Here is my vision of such collaboration:
> 
> - I ask students who would like to work on the PostgreSQL project and
>   we choose one
> - You select the supervisor for the student. The supervisor will
>   be the student's "boss" and will give him tasks and help in
>   solving complex problems.
> - The project should take the student two to three months of time and
>   you can decide what can be developed by the student during this term.
> - The student discusses tasks with the supervisor and agrees on them
>   with me.
> - When the tasks are agreed the student begins working on them and
>   writing his Project report.
> - During the term I coordinate the student and solve organization
>   problems.
> - When the term is finished the student should have completed all
>   tasks agreed at the beginning of the term, and should also have
>   finished his Project report.
> - The student can continue working with you after the science
>   project is finished, if he likes it.
 
Since I haven't seen a reply by anyone else, I'll point out that
right now the committers are at their most busy time of the year,
trying to wrap up the 9.1 release.  I'll offer what advice I can;
someone else will probably flesh this out more later.
 
The usual way something like this is done is for the student to have
a mentor rather than a supervisor, and for the student to deal
directly with the PostgreSQL community through this list.  The
student would be expected to discuss, perhaps at great length, what
was to be implemented and how it should be approached.  The
community tends to operate by consensus.
 
When a first attempt at an implementation is presented, there is
usually significant input and the patch is usually returned for
revision, sometimes after another lengthy round of discussion on the
list.  There is no guarantee of acceptance, although active
participation in the discussions and close attention to concerns and
suggestions helps considerably.  Patches are not accepted without
corresponding updates to the documentation and the regression tests.
 
In the end it comes down to an evaluation, after seeing the patch,
of how beneficial the patch is balanced against the risks and costs,
including the cost of long term maintenance of the extra code.
 
That said, I'm sure it would be an educational experience for the
student, and likely to benefit the PostgreSQL community.  The first
two hurdles, obviously, would be to pick a good project and find a
mentor.  The project should be interesting to the student, within
their abilities, and seen as valuable to the community.  The student
might want to review the TODO list and follow the PostgreSQL lists
for ideas.  It would also be wise to follow the other links from the
Developers tab on the main PostgreSQL web page.
 
-Kevin



Re: [HACKERS] Snapshot synchronization, again...

2011-02-18 Thread Alvaro Herrera

Looking into this patch.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Re: [COMMITTERS] pgsql: Make a hard state change from catchup to streaming mode.

2011-02-18 Thread Simon Riggs
On Fri, 2011-02-18 at 10:41 -0500, Tom Lane wrote:
> Robert Haas  writes:
> > On Fri, Feb 18, 2011 at 10:09 AM, Simon Riggs  wrote:
> >> Make a hard state change from catchup to streaming mode.
> >> More useful state change for monitoring purposes, plus a
> >> required change for synchronous replication patch.
> 
> > As far as I can see, this patch was not posted or discussed before
> > commit, and I'm not sure it's the behavior everyone wants.  It has the
> > effect of preventing the system from ever going backwards from
> > "streaming" to "catchup".  Is that what we want?
> 
> That seems like a very bad idea from here.  Being able to go back to
> catchup after loss of the streaming connection is essential for
> robustness.  If we now have to restart the slave for that to happen,
> it's not an improvement.

If we lose the connection then we start another walsender process, so
that is a non-issue.

We need a hard state change at one particular point to keep sync rep
requestors from hanging for hours when the standby is connected but a
long way from being caught up.

This was a part of my sync rep patch that was easier to break out and
commit early. There has never been any comment or objection to this
concept and the same feature has existed in my patches for months.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 




Re: [HACKERS] Debian readline/libedit breakage

2011-02-18 Thread Magnus Hagander
On Fri, Feb 18, 2011 at 16:51, Tom Lane  wrote:
> Andrew Dunstan  writes:
>> On 02/17/2011 04:09 PM, Martijn van Oosterhout wrote:
>>> This is supported. Where it goes wonky is that this also has to work
>>> when the connection is via SSL. So libpq provides a function to return
>>> (via a void*) a pointer to the OpenSSL structure so that can be used to
>>> communicate with the server.
>
>> Ugh. Maybe not the best design decision we've ever made.
>
> libpq-fe.h is pretty clear on this matter:
>
> /* Get the OpenSSL structure associated with a connection. Returns NULL for
>  * unencrypted connections or if any other TLS library is in use. */
> extern void *PQgetssl(PGconn *conn);
>
> We are under no compulsion to emulate OpenSSL if we switch to another
> library.  The design intent is that we'd provide a separate function
> (PQgetnss?) and callers that know how to use that library would call
> that function.  If they don't, it's not our problem.

Yeah, the only issue there is that it should perhaps have been called
PQgetopenssl(). We did that right for PQinitOpenSSL(). But then not
for PQinitSSL(). So we aren't exactly consistent..

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Debian readline/libedit breakage

2011-02-18 Thread Tom Lane
Andrew Dunstan  writes:
> On 02/17/2011 04:09 PM, Martijn van Oosterhout wrote:
>> This is supported. Where it goes wonky is that this also has to work
>> when the connection is via SSL. So libpq provides a function to return
>> (via a void*) a pointer to the OpenSSL structure so that can be used to
>> communicate with the server.

> Ugh. Maybe not the best design decision we've ever made.

libpq-fe.h is pretty clear on this matter:

/* Get the OpenSSL structure associated with a connection. Returns NULL for
 * unencrypted connections or if any other TLS library is in use. */
extern void *PQgetssl(PGconn *conn);

We are under no compulsion to emulate OpenSSL if we switch to another
library.  The design intent is that we'd provide a separate function
(PQgetnss?) and callers that know how to use that library would call
that function.  If they don't, it's not our problem.

regards, tom lane



Re: [HACKERS] review: FDW API

2011-02-18 Thread Tom Lane
Heikki Linnakangas  writes:
> On 15.02.2011 23:00, Tom Lane wrote:
>> I think moving the error check downstream would be a good thing.

> Ok, I tried moving the error checks to preprocess_rowmarks(). 
> Unfortunately RelOptInfos haven't been built at that stage yet, so you 
> still have to do the catalog lookup to get the relkind. That defeats the 
> purpose.

Mph.  It seems like the right fix here is to add relkind to
RangeTblEntry: it could be filled in for free in addRangeTableEntry, for
example.  The main downside of that is that relation relkinds would have
to become fixed (because there would be no practical way of updating
RTEs in stored rules), which means the "convert relation to view" hack
would have to go away.  Offhand I think no one cares about that anymore,
but ...

In any case, this is looking like a performance optimization that should
be dealt with in a separate patch.  I'd suggest leaving the code in the
form with the extra relkind lookups for the initial commit.  It's
possible that no one would notice the extra lookups anyway --- have you
benchmarked it?

regards, tom lane



Re: [HACKERS] Debian readline/libedit breakage

2011-02-18 Thread Andrew Dunstan



On 02/17/2011 04:09 PM, Martijn van Oosterhout wrote:

On Wed, Feb 16, 2011 at 04:33:19PM -0800, Joshua D. Drake wrote:

Maybe we really should consider moving to NSS instead?

http://www.mozilla.org/projects/security/pki/nss/

If it solves the license problem, it is well supported etc.

For the record, which library you choose only matters for a fairly
small (and easy) part of the patch. Changing libpq to be SSL library
agnostic is more work.

For the people who aren't following, the issue is there are libraries
out there that use libpq to setup the connection to the postgres server
(so handing all authentication, et al) and then stealing the FD and
implementing the rest of the protocol themselves.

This is supported. Where it goes wonky is that this also has to work
when the connection is via SSL. So libpq provides a function to return
(via a void*) a pointer to the OpenSSL structure so that can be used to
communicate with the server.


Ugh. Maybe not the best design decision we've ever made.


As you can imagine, unless the library you use is *binary* compatible
with OpenSSL, you're kinda stuck. The idea I suggested way back was to
introduce a passthrough mode which would hide all the connection
details within libpq, simplifying the code on both sides. Then after a
few releases you could remove the old code and change the SSL library
at leisure.

I guess the painless option however is no longer available.



Could we provide an abstraction layer over whatever SSL library is in 
use with things like read/write/poll? Maybe that's what you had in mind 
for the passthrough mode.
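The shape of such a layer can be sketched. This is an illustrative Python sketch only — libpq itself is C and would expose a table of function pointers rather than an abstract class, and all names here are assumptions — but it shows the idea: callers program against read/write/poll and never see which TLS library sits behind it:

```python
from abc import ABC, abstractmethod

class SecureChannel(ABC):
    """Minimal surface a caller would use instead of a raw OpenSSL pointer."""
    @abstractmethod
    def read(self, nbytes): ...
    @abstractmethod
    def write(self, data): ...
    @abstractmethod
    def poll(self, timeout): ...

class LoopbackChannel(SecureChannel):
    """Stand-in backend used only to exercise the interface; a real
    implementation would wrap OpenSSL, NSS, or another TLS library."""
    def __init__(self):
        self._buf = b""
    def write(self, data):
        self._buf += data
        return len(data)
    def read(self, nbytes):
        out, self._buf = self._buf[:nbytes], self._buf[nbytes:]
        return out
    def poll(self, timeout):
        return bool(self._buf)

chan = LoopbackChannel()
chan.write(b"startup packet")
```

Swapping the TLS library then means adding one backend implementation, with no change to callers — which is exactly what a `PQgetssl`-style raw-pointer getter prevents.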


cheers

andrew



Re: [HACKERS] Re: [COMMITTERS] pgsql: Make a hard state change from catchup to streaming mode.

2011-02-18 Thread Tom Lane
Robert Haas  writes:
> On Fri, Feb 18, 2011 at 10:09 AM, Simon Riggs  wrote:
>> Make a hard state change from catchup to streaming mode.
>> More useful state change for monitoring purposes, plus a
>> required change for synchronous replication patch.

> As far as I can see, this patch was not posted or discussed before
> commit, and I'm not sure it's the behavior everyone wants.  It has the
> effect of preventing the system from ever going backwards from
> "streaming" to "catchup".  Is that what we want?

That seems like a very bad idea from here.  Being able to go back to
catchup after loss of the streaming connection is essential for
robustness.  If we now have to restart the slave for that to happen,
it's not an improvement.

regards, tom lane



[HACKERS] Re: [COMMITTERS] pgsql: Make a hard state change from catchup to streaming mode.

2011-02-18 Thread Kevin Grittner
Robert Haas  wrote:
> Simon Riggs  wrote:
 
>> Make a hard state change from catchup to streaming mode.
>> More useful state change for monitoring purposes, plus a
>> required change for synchronous replication patch.
> 
> As far as I can see, this patch was not posted or discussed before
> commit, and I'm not sure it's the behavior everyone wants.  It has
> the effect of preventing the system from ever going backwards from
> "streaming" to "catchup".  Is that what we want?
 
We are looking at moving to streaming replication instead of WAL
file shipping, but we often have WAN outages.  These can last
minutes, hours, or even a few days.  What would be the impact of
this patch on us during and after such outages?
 
I don't know how well such experience generalizes, but my personal
experience with replication technology is that "hard state changes"
tend to make things more "clunky" and introduce odd issues at the
state transitions.  Where different message types are intermingled
without such hard state changes, I've seen more graceful behavior.
 
Of course, take that with a grain of salt -- I haven't read the
patch and am talking in generalities based on having written a
couple serious replication tools in the past, and having used a few
others.
 
-Kevin



[HACKERS] Re: [COMMITTERS] pgsql: Make a hard state change from catchup to streaming mode.

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 10:09 AM, Simon Riggs  wrote:
> Make a hard state change from catchup to streaming mode.
> More useful state change for monitoring purposes, plus a
> required change for synchronous replication patch.

As far as I can see, this patch was not posted or discussed before
commit, and I'm not sure it's the behavior everyone wants.  It has the
effect of preventing the system from ever going backwards from
"streaming" to "catchup".  Is that what we want?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Fix for Index Advisor related hooks

2011-02-18 Thread Gurjeet Singh
On Fri, Feb 18, 2011 at 2:27 AM, Heikki Linnakangas <
heikki.linnakan...@enterprisedb.com> wrote:

> On 17.02.2011 14:30, Gurjeet Singh wrote:
>
>> On Wed, Feb 16, 2011 at 6:37 PM, Tom Lane  wrote:
>>
>> Fetch the values you need and stuff 'em in the struct.  Don't expect
>>> relcache to do it for you.  The only reason relcache is involved in the
>>> current workflow is that we try to cache the information across queries
>>> in order to save on planner startup time ... but I don't think that that
>>> concern is nearly as pressing for something like Index Advisor.  You'll
>>> have enough to do tracking changes in IndexOptInfo, you don't need to
>>> have to deal with refactorings inside relcache as well.
>>>
>>>
>> I also wish to make Index Advisor faster by not having to lookup this info
>> every time a new query comes in and that's why I was trying to use the
>> guts
>> of IndexSupportInitialize() where it does the caching.
>>
>
> I doubt performance matters much here. It's not like you're going to be
> explaining hundreds of queries per second with a hypothetical index
> installed. I think of this as a manual tool that you run from a GUI when you
> wonder if an index on column X would help.
>
> The question is, can the Index Advisor easily access all the information it
> needs to build the IndexOptInfo? If not, what's missing?


There's a command line tool in the Index Adviser contrib that takes a file
full of SQL and runs them against the Index Adviser. People would want that
tool to be as fast as it can be.

Another use case of the Index Advisor is to be switched on for a few hours
while the application runs, and gather the recommendations for the whole
run. We'll need good performance in that case too.

I'll try to figure out what info we need for IndexOptInfo; it'll take
some time though.

Regards,
-- 
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurjeet@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


Re: [HACKERS] Assertion failure on UNLOGGED VIEW and SEQUENCE

2011-02-18 Thread Tom Lane
Robert Haas  writes:
> On Fri, Feb 18, 2011 at 6:42 AM, Itagaki Takahiro
>> The easiest fix would be to prevent them at the parser level.

> Well, that sucks.  I had intended for that to be disallowed at the
> parser level, but obviously I failed.  It seems that disallowing this in
> the parser will require duplicating the OptTemp production.  Or
> alternatively we can just add an error check to the CREATE VIEW and
> CREATE SEQUENCE productions, i.e.

> if ($4 == RELPERSISTENCE_UNLOGGED)
> ereport(ERROR, ...);

> I am somewhat leaning toward the latter option, to avoid unnecessarily
> bloating the size of the parser tables, but I can do it the other way
> if people prefer.

If by the first option you mean causing the failure report to be "syntax
error" rather than anything more specific, then I agree that option
sucks.  I'd actually vote for putting the error test somewhere in
statement execution code, well downstream of gram.y.  The parser's job
is to produce a parse tree, not to encapsulate knowledge about which
combinations of options are supported.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] About the performance of startup after dropping many tables

2011-02-18 Thread Tom Lane
Robert Haas  writes:
> Possibly, but it's not necessarily a bad idea to improve performance
> for people with crazy schemas.

It is if it introduces unmaintainable code.  I see no way to collapse
multiple drop operations into one that's not going to be a Rube Goldberg
device.  I'm especially unwilling to introduce such a thing into the
xlog replay code paths, where it's guaranteed to get little testing.

(BTW, it seems like a workaround for the OP is just to CHECKPOINT right
after dropping all those tables.  Or even reconsider their shutdown
procedure.)

regards, tom lane



[HACKERS] Re: [COMMITTERS] pgsql: Separate messages for standby replies and hot standby feedback.

2011-02-18 Thread Simon Riggs
On Fri, 2011-02-18 at 07:11 -0500, Robert Haas wrote:
> On Fri, Feb 18, 2011 at 6:34 AM, Simon Riggs  wrote:
> > Separate messages for standby replies and hot standby feedback.
> > Allow messages to be sent at different times, and greatly reduce
> > the frequency of hot standby feedback. Refactor to allow additional
> > message types.
> 
> You could refactor this some more to avoid calling
> GetCurrentTimestamp() and TimestampDifferenceExceeds() twice each, but
> I'm not sure whether it's worth it.

Thanks for the observation.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 




Re: [HACKERS] Assertion failure on UNLOGGED VIEW and SEQUENCE

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 6:42 AM, Itagaki Takahiro
 wrote:
> UNLOGGED is defigned in OptTemp in gram.y, so the parser accepts
> CREATE UNLOGGED VIEW and CREATE UNLOGGED SEQUENCE, but they are
> actually not supported.
>
> =# CREATE UNLOGGED VIEW uv AS SELECT 1;
> TRAP: FailedAssertion("!(relkind == 'r' || relkind == 't')", File:
> "heap.c", Line: 1246)
>
> =# CREATE UNLOGGED SEQUENCE us;
> TRAP: FailedAssertion("!(relkind == 'r' || relkind == 't')", File:
> "heap.c", Line: 1246)
>
> The easiest fix would be preventing them at the parser level.

Well, that sucks.  I had intended for that to be disallowed at the
parser level, but obviously I failed.  It seems that disallowing this in
the parser will require duplicating the OptTemp production.  Or
alternatively we can just add an error check to the CREATE VIEW and
CREATE SEQUENCE productions, i.e.

if ($4 == RELPERSISTENCE_UNLOGGED)
ereport(ERROR, ...);

I am somewhat leaning toward the latter option, to avoid unnecessarily
bloating the size of the parser tables, but I can do it the other way
if people prefer.
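As a hedged sketch of the second option, the check in the CREATE VIEW grammar production might look like the following (the error code, message wording, and error position are my assumptions, not the committed text):

```c
/* in gram.y, CreateViewStmt production -- sketch only */
if ($4 == RELPERSISTENCE_UNLOGGED)
    ereport(ERROR,
            (errcode(ERRCODE_SYNTAX_ERROR),
             errmsg("views cannot be unlogged because they do not have storage"),
             parser_errposition(@2)));
```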

In scrutinizing this code again, I notice that I accidentally added
the ability to create an unlogged table using SELECT INTO, as in
"SELECT 1 INTO UNLOGGED foo", just as you can also do "SELECT 1 INTO
TEMP foo".  I don't see any particular reason to disallow that, but I
should probably update the documentation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP - Add ability to constrain backend temporary file space

2011-02-18 Thread Robert Haas
On Thu, Feb 17, 2011 at 10:17 PM, Mark Kirkwood
 wrote:
> This is WIP, it does seem to work ok, but some areas/choices I'm not
> entirely clear about are mentioned in the patch itself. Mainly:
>
> - name of the guc... better suggestions welcome
> - datatype for the guc - real would be good, but at the moment the nice
> parse KB/MB/GB business only works for int

Please add this to the next CommitFest:

https://commitfest.postgresql.org/action/commitfest_view/open

With respect to the datatype of the GUC, int seems clearly correct.
Why would you want to use a float?
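For reference, an int GUC gets the KB/MB/GB suffix parsing by declaring a unit flag in its guc.c entry; a hedged sketch of such an entry follows (the name, category, and limits are placeholders, not choices from the patch under discussion):

```c
/* sketch of a guc.c ConfigureNamesInt entry -- name and limits are placeholders */
{
    {"temp_file_limit", PGC_SUSET, RESOURCES,
        gettext_noop("Limits the total size of temporary files used by each session."),
        gettext_noop("-1 means no limit."),
        GUC_UNIT_KB          /* enables parsing of KB/MB/GB suffixes */
    },
    &temp_file_limit,
    -1, -1, INT_MAX,
    NULL, NULL
},
```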

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Add support for logging the current role

2011-02-18 Thread Robert Haas
On Thu, Feb 17, 2011 at 10:40 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Feb 17, 2011 at 4:53 PM, Tom Lane  wrote:
>>> In short, add a bit of overhead at SetUserId time in order to make this
>>> cheap (and accurate) in elog.c.
>
>> As Stephen says, I think this is utterly impractical; those routines
>> can't ever throw any kind of error.
>
> Why would they need to throw an error?  It'd be on the caller's head to
> supply the role name along with OID.  We can keep the name in a static
> buffer of size NAMEDATALEN, so don't tell me about palloc failures
> either.

OK, but there are not a small number of callers of that function, and
they don't necessarily have the correct info at hand.  For example,
you'd need to add prevUserAsText to TransactionStateData, which
doesn't seem appealing.
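Tom's scheme, as described upthread, would cache the name alongside the OID at SetUserId time so that elog.c never needs catalog access or palloc. A rough sketch (the function and variable names here are invented for illustration):

```c
/* sketch only -- names invented for illustration */
static Oid  CurrentLoggedUserId = InvalidOid;
static char CurrentLoggedUserName[NAMEDATALEN];   /* static buffer, no palloc */

void
SetUserIdAndName(Oid userid, const char *username)
{
    CurrentLoggedUserId = userid;
    /* the caller supplies the name, so no syscache lookup and no error paths */
    strlcpy(CurrentLoggedUserName, username ? username : "", NAMEDATALEN);
}
```

The objection is that every caller of SetUserId would then need the role name at hand, which is why per-transaction state such as TransactionStateData would have to grow a name field too.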

> The logging design as it stands seems to me to be a Rube Goldberg device
> that is probably going to have corner-case bugs quite aside from its
> possible performance issues.

I think you're overreacting.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] contrib loose ends: 9.0 to 9.1 incompatibilities

2011-02-18 Thread Robert Haas
On Thu, Feb 17, 2011 at 9:13 PM, Tom Lane  wrote:
> I wrote:
>> Robert Haas  writes:
>>> I think we should try to make the state match as closely as possible,
>>> no matter how you got there.  Otherwise, I think we're storing up a
>>> host of future pain for ourselves.
>
>> Well, if you're willing to hold your nose for the "UPDATE pg_proc" hack,
>> we can make it so.
>
> I believe I've now fixed all the discrepancies between fresh installs
> and 9.0 updates of contrib modules, except for these:
>
> 1. citext COLLATABLE option (see adjacent thread)
>
> 2. intarray and tsearch2 use some core support functions in their
> GIN opclasses, and those support functions changed signatures in 9.1.
> The current solution to this involves having stub functions in core
> with the old signatures; when you do an upgrade from the 9.0 version
> of one of these contrib modules, its opclass will be pointing at the
> stub version instead of the preferred version.  I guess we could fix
> that with a direct UPDATE on pg_amproc but I'm not sure that's a
> good idea.  Note these functions aren't actually *members* of the
> extensions, just things it references, so the odds of future trouble
> seem pretty small.  On the other hand, if we don't do this, it's
> unclear when we'll ever be able to get rid of the stubs.
>
> Comments?

ISTM that the pg_amproc entries are part of the operator class, which
is owned by the extension.  So it's the upgrade script's job to leave
the operator class in the right state.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] SQL/MED - file_fdw

2011-02-18 Thread Shigeru HANADA

On Wed, 16 Feb 2011 16:48:33 +0900
Itagaki Takahiro  wrote:

> On Tue, Feb 8, 2011 at 00:30, Shigeru HANADA  
> wrote:
> > Fixed version is attached.
> 
> I reviewed your latest git version, that is a bit newer than the attached 
> patch.
> http://git.postgresql.org/gitweb?p=users/hanada/postgres.git;a=commit;h=0e1a1e1b0e168cb3d8ff4d637747d0ba8f7b8d55
> 
> The code still works with small adjustment, but needs to be rebased on the
> latest master, especially for extension support and copy API changes.
> 
> Here are a list of comments and suggestions:

Thanks for the comments.  Revised version of patch has been attached.

> * You might forget some combination or unspecified options in
> file_fdw_validator().
>   For example, format == NULL or !csv && header cases. I've not tested all
>   cases, but please recheck validations used in BeginCopy().

Right, I've revised validation based on BeginCopy(), and added
regression tests about validation.

> * estimate_costs() needs own row estimation rather than using baserel.
> > > What value does baserel->tuples have?
> > > Foreign tables are never analyzed for now. Is the number correct?
> > No, baserel->tuples is always 0, and baserel->pages is 0 too.
> > In addition, width and rows are set to 100 and 1, as default values.
> 
> It means baserel is not reliable at all, right?

Right, tables which have not been ANALYZEd have default stats in
baserel.  But getting the # of records would need another pass over the file...

> If so, we need alternative
> solution in estimate_costs(). We adjust statistics with runtime relation
> size in estimate_rel_size(). Also, we use get_rel_data_width() for not
> analyzed tables. We could use similar technique in file_fdw, too.

Ah, using get_relation_data_width(), the exported version of
get_rel_data_width(), seems to help estimation.  I'll research around
it a little more.  By the way, would adding ANALYZE support for foreign
tables be a reasonable idea for this issue?

> * Should use int64 for file byte size (or, uint32 in blocks).
> unsigned long might be 32bit. ulong is not portable.
> 
> * Oid List (list_make1_oid) might be more handy than Const to hold relid
>   in FdwPlan.fdw_private.
> 
> * I prefer FileFdwExecutionState to FileFdwPrivate, because there are
>   two different 'fdw_private' variables in FdwPlan and FdwExecutionState.
> 
> * The comment in fileIterate seems wrong. It should be
>   /* Store the next tuple as a virtual tuple. */  , right?
> 
> * #include  is needed.

Fixed all of above.

Regards,
--
Shigeru Hanada


20110218-file_fdw.patch.gz
Description: Binary data



Re: [HACKERS] About the performance of startup after dropping many tables

2011-02-18 Thread Robert Haas
On Thu, Feb 17, 2011 at 10:37 PM, Tom Lane  wrote:
> Gan Jiadong  writes:
>> we have PG 8.3.13 in our system. When running performance cases, we find that
>> startup recovery costs about 3 minutes. It is too long in our system.
>
> Maybe you should rethink the assumption that dropping 4 tables is a
> cheap operation.  Why do you have that many in the first place, let
> alone that many that you drop and recreate frequently?  Almost
> certainly, you need a better-conceived schema.

Possibly, but it's not necessarily a bad idea to improve performance
for people with crazy schemas.

What concerns me a little bit about the proposed scheme, though, is
that it's only going to work if all of those tables are dropped by a
single transaction.  You still need one pass through all of
shared_buffers for every transaction that drops one or more relations.
 Now, I'm not sure, maybe there's no help for that, but ever since
commit c2281ac87cf4828b6b828dc8585a10aeb3a176e0 it's been on my mind
that loops that iterate through the entire buffer cache are bad for
scalability.

Conventional wisdom seems to be that performance tops out at, or just
before, 8GB, but it's already the case that that's a quite a small
fraction of the memory on a large machine, and that's only going to
keep getting worse.  Admittedly, the existing places where we loop
through the whole buffer cache are probably not the primary reason for
that limitation, but...
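The pattern in question, roughly as it appears in the relation-drop path (simplified; locking and tag details elided):

```c
/* simplified sketch of the O(shared_buffers) scan being discussed */
for (i = 0; i < NBuffers; i++)
{
    volatile BufferDesc *bufHdr = &BufferDescriptors[i];

    /* every buffer header is inspected, whatever the relation's size */
    if (RelFileNodeEquals(bufHdr->tag.rnode, rnode))
        InvalidateBuffer(bufHdr);   /* InvalidateBuffer handles the locking */
}
```

Since the loop runs once per dropping transaction, batching many drops into one transaction amortizes the scan, which is the behavior the proposed scheme depends on.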

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



[HACKERS] Re: [COMMITTERS] pgsql: Separate messages for standby replies and hot standby feedback.

2011-02-18 Thread Robert Haas
On Fri, Feb 18, 2011 at 6:34 AM, Simon Riggs  wrote:
> Separate messages for standby replies and hot standby feedback.
> Allow messages to be sent at different times, and greatly reduce
> the frequency of hot standby feedback. Refactor to allow additional
> message types.

You could refactor this some more to avoid calling
GetCurrentTimestamp() and TimestampDifferenceExceeds() twice each, but
I'm not sure whether it's worth it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



[HACKERS] Server Name

2011-02-18 Thread Simon Riggs

We agreed to add a parameter called sync_standbys
(http://archives.postgresql.org/message-id/4d1dcf5a.7070...@enterprisedb.com),
which required the concept of a server name.

I've written this patch to add a server name parameter and to send an
info message which returns the server name, attached.

For now, Sync Rep will be written to match sync_standbys against the
application_name instead. We may yet agree that this is the right way
to go, so I've not rushed to apply the patch given here after all.
This patch now has a lower priority than the main sync rep patch.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index c7f5bd5..46096e6 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -125,6 +125,7 @@ static void XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr);
 static void XLogWalRcvFlush(bool dying);
 static void XLogWalRcvSendReply(void);
 static void XLogWalRcvSendHSFeedback(void);
+static void XLogWalRcvSendInfo(void);
 
 /* Signal handlers */
 static void WalRcvSigHupHandler(SIGNAL_ARGS);
@@ -276,6 +277,9 @@ WalReceiverMain(void)
 	walrcv_connect(conninfo, startpoint);
 	DisableWalRcvImmediateExit();
 
+	/* Give the primary our identification and configuration information */
+	XLogWalRcvSendInfo();
+
 	/* Loop until end-of-streaming or error */
 	for (;;)
 	{
@@ -305,6 +309,7 @@ WalReceiverMain(void)
 		{
 			got_SIGHUP = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			XLogWalRcvSendInfo();
 		}
 
 		/* Wait a while for data to arrive */
@@ -702,3 +707,26 @@ XLogWalRcvSendHSFeedback(void)
 	memcpy(&buf[1], &feedback_message, sizeof(StandbyHSFeedbackMessage));
 	walrcv_send(buf, sizeof(StandbyHSFeedbackMessage) + 1);
 }
+
+/*
+ * Send info message to primary.
+ */
+static void
+XLogWalRcvSendInfo(void)
+{
+	char		buf[sizeof(StandbyInfoMessage) + 1];
+	StandbyInfoMessage	info_message;
+
+	/* Get current timestamp. */
+	info_message.sendTime = GetCurrentTimestamp();
+	strlcpy(info_message.servername, ServerName, sizeof(info_message.servername));
+
+	elog(DEBUG2, "sending standby info servername \"%s\"",
+ info_message.servername);
+
+	/* Prepend with the message type and send it. */
+	buf[0] = 'i';
+	memcpy(&buf[1], &info_message, sizeof(StandbyInfoMessage));
+	walrcv_send(buf, sizeof(StandbyInfoMessage) + 1);
+}
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index e04d59e..43e47d9 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -119,6 +119,7 @@ static void StartReplication(StartReplicationCmd * cmd);
 static void ProcessStandbyMessage(void);
 static void ProcessStandbyReplyMessage(void);
 static void ProcessStandbyHSFeedbackMessage(void);
+static void ProcessStandbyInfoMessage(void);
 static void ProcessRepliesIfAny(void);
 
 
@@ -537,6 +538,10 @@ ProcessStandbyMessage(void)
 			ProcessStandbyHSFeedbackMessage();
 			break;
 
+		case 'i':
+			ProcessStandbyInfoMessage();
+			break;
+
 		default:
 			ereport(COMMERROR,
 	(errcode(ERRCODE_PROTOCOL_VIOLATION),
@@ -655,6 +660,16 @@ ProcessStandbyHSFeedbackMessage(void)
 	}
 }
 
+static void
+ProcessStandbyInfoMessage(void)
+{
+	StandbyInfoMessage	info;
+
+	pq_copymsgbytes(&reply_message, (char *) &info, sizeof(StandbyInfoMessage));
+
+	elog(DEBUG2, "server name %s", info.servername);
+}
+
 /* Main loop of walsender process */
 static int
 WalSndLoop(void)
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 55cbf75..fd733e0 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -186,6 +186,7 @@ static const char *assign_pgstat_temp_directory(const char *newval, bool doit, G
 static const char *assign_application_name(const char *newval, bool doit, GucSource source);
 static const char *show_unix_socket_permissions(void);
 static const char *show_log_file_mode(void);
+static const char *assign_server_name(const char *newval, bool doit, GucSource source);
 
 static char *config_enum_get_options(struct config_enum * record,
 		const char *prefix, const char *suffix,
@@ -405,6 +406,9 @@ int			tcp_keepalives_idle;
 int			tcp_keepalives_interval;
 int			tcp_keepalives_count;
 
+char		*ServerName = NULL;
+
+
 /*
  * These variables are all dummies that don't do anything, except in some
  * cases provide the value for SHOW to display.  The real state is elsewhere
@@ -2365,6 +2369,15 @@ static struct config_string ConfigureNamesString[] =
 	},
 
 	{
+		{"server_name", PGC_POSTMASTER, CLIENT_CONN_STATEMENT,
+			gettext_noop("Allows setting of a unique name for this server."),
+			NULL
+		},
+		&ServerName,
+		"", assign_server_name, NULL
+	},
+
+	{
 		{"temp_tablespaces", PGC_USERSET, CLIENT_CONN_STATEMENT,
 			gettext_noop("Sets the tablespace(s) to use f

[HACKERS] Assertion failure on UNLOGGED VIEW and SEQUENCE

2011-02-18 Thread Itagaki Takahiro
UNLOGGED is defined in OptTemp in gram.y, so the parser accepts
CREATE UNLOGGED VIEW and CREATE UNLOGGED SEQUENCE, but they are
actually not supported.

=# CREATE UNLOGGED VIEW uv AS SELECT 1;
TRAP: FailedAssertion("!(relkind == 'r' || relkind == 't')", File:
"heap.c", Line: 1246)

=# CREATE UNLOGGED SEQUENCE us;
TRAP: FailedAssertion("!(relkind == 'r' || relkind == 't')", File:
"heap.c", Line: 1246)

The easiest fix would be preventing them at the parser level.

-- 
Itagaki Takahiro



Re: [HACKERS] ALTER TYPE COLLATABLE?

2011-02-18 Thread Peter Eisentraut
On tor, 2011-02-17 at 17:50 -0500, Tom Lane wrote:
> What are we going to do to allow the citext update script to fix this?
> I see no sign that ALTER TYPE can fix it (and am unsure that we'd want
> to add such a feature, particularly not right now).

How would this normally be handled if a type changes properties or wants
to make use of a new property?  I guess the answer is that there is no
"normally".

> Is it time for a direct UPDATE on the pg_type row?  If so, to what?  I see
> pg_type.typcollation is supposed to be an OID, so how the heck does
> one map a bool CREATE TYPE parameter into the catalog entry?

It's 100, which is the OID of "default" in pg_collation.  The value may
be different for domains.  (Earlier versions of the feature had a
boolean column and a separate collation column for domains, but somehow
it turned out to be quite redundant.)






Re: [HACKERS] COPY ENCODING revisited

2011-02-18 Thread Itagaki Takahiro
On Fri, Feb 18, 2011 at 03:57, Robert Haas  wrote:
> On Wed, Feb 16, 2011 at 10:45 PM, Itagaki Takahiro
> I am not qualified to fully review this patch because I'm not all that
> familiar with the encoding stuff, but it looks reasonably sensible on
> a quick read-through.  I am supportive of making a change in this area
> even at this late date, because it seems to me that if we're not going
> to change this then we're pretty much giving up on having a usable
> file_fdw in 9.1.  And since postgresql_fdw isn't in very good shape
> either, that would mean we may as well give up on SQL/MED.  We might
> have to do that anyway, but I don't think we should do it just because
> of this issue, if there's a reasonable fix.

One design issue is the new function names:
  extern char *pg_client_to_server(const char *s, int len);
  extern char *pg_server_to_client(const char *s, int len);
+ extern char *pg_any_to_server(const char *s, int len, int encoding);
+ extern char *pg_server_to_any(const char *s, int len, int encoding);

They don't contain any words related to "encoding" or "conversion".

Ishii-san, do you have comments? I guess you designed the area.
Please let me know if there are better alternatives.
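For context, the intended difference is that the new functions take an explicit encoding rather than assuming client_encoding; a hedged sketch of a caller on the COPY side (variable names invented here):

```c
/* sketch: convert a line read by COPY ... ENCODING 'latin1' into the
 * server encoding; an unchanged pointer means no conversion was needed */
char *converted = pg_any_to_server(raw_line, raw_len, PG_LATIN1);

/* ... use converted ... */
if (converted != raw_line)
    pfree(converted);
```

Whatever the final names, the signature makes the source encoding a parameter, which is what file_fdw needs to read files that are not in the client encoding.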

-- 
Itagaki Takahiro


