date:20120430

Re: [HACKERS] Future In-Core Replication

2012-04-30 Thread Dave Page

On Sun, Apr 29, 2012 at 11:20 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:

 Excerpts from Simon Riggs's message of jue abr 26 11:10:09 -0300 2012:
 On Thu, Apr 26, 2012 at 1:41 PM, Simon Riggs si...@2ndquadrant.com wrote:

  I will also be organising a small-medium sized Future of In-Core
  Replication meeting in Ottawa on Wed 16 May, 6-10pm.

 Thanks for such rapid response. I've put up a wiki page and will be
 adding names as they come through

 http://wiki.postgresql.org/wiki/PgCon2012CanadaInCoreReplicationMeeting

 How is this not redundant with the Cluster Summit?
 http://wiki.postgresql.org/wiki/PgCon2012CanadaClusterSummit

 ... oh, you're also already enlisted in that one.  Sigh.

My understanding is that the agenda for the cluster meeting is almost
entirely dedicated to Postgres-XC.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Future In-Core Replication

2012-04-30 Thread Simon Riggs

On Mon, Apr 30, 2012 at 7:35 AM, Dave Page dp...@pgadmin.org wrote:
 On Sun, Apr 29, 2012 at 11:20 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:

 Excerpts from Simon Riggs's message of jue abr 26 11:10:09 -0300 2012:
 On Thu, Apr 26, 2012 at 1:41 PM, Simon Riggs si...@2ndquadrant.com wrote:

  I will also be organising a small-medium sized Future of In-Core
  Replication meeting in Ottawa on Wed 16 May, 6-10pm.

 Thanks for such rapid response. I've put up a wiki page and will be
 adding names as they come through

 http://wiki.postgresql.org/wiki/PgCon2012CanadaInCoreReplicationMeeting

 How is this not redundant with the Cluster Summit?
 http://wiki.postgresql.org/wiki/PgCon2012CanadaClusterSummit

 ... oh, you're also already enlisted in that one.  Sigh.

 My understanding is that the agenda for the cluster meeting is almost
 entirely dedicated to Postgres-XC.

Yes. Regrettably, there wasn't a 90 minute slot available to discuss
this at the cluster meeting, the dev meeting agenda is fairly full and
my presentation on replication was rejected for the main conference.
As a result, the additional meeting was the only way left open to me
to initiate detailed discussion.

The additional meeting will give us 240 minutes of discussion and
allows us to bring in some technical users as well.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] smart shutdown at end of transaction (was: Default mode for shutdown)

2012-04-30 Thread Albe Laurenz

Tom Lane wrote:
 On Fri, Apr 27, 2012 at 7:29 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 No, I'm not happy with that.  Smart shutdown is defined to not
affect
 current sessions.  I'm fine with having a fourth mode that acts as
you
 suggest (and, probably, even with making it the default); but not
with
 taking away a behavior that people may well be relying on.

 Agreed, but not sure what to call the new mode: smarter?

 I'm not necessarily opposed to commandeering the name smart for the
 new behavior, so that what we have to find a name for is the old
smart
 behavior.  How about
 
   slow- allow existing sessions to finish (old smart)
   smart   - allow existing transactions to finish (new)
   fast- kill active queries
   immediate - unclean shutdown

But if the meaning of smart changes, then people who use
pg_ctl stop -m smart and expect that active sessions will not be
affected will get a surprise.

Wouldn't it be better to pick a different name for the new fourth
mode?  It could still be the default mode, but I think that people
who explicitly specify a certain mode are more likely to care about
the exact behaviour.

I second Heikki's suggestions for mode names.

And +1 from me on changing the default behaviour.

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.2 release notes, beta time?

2012-04-30 Thread Magnus Hagander

On Mon, Apr 30, 2012 at 02:23, Devrim GÜNDÜZ dev...@gunduz.org wrote:
 On Sun, 2012-04-29 at 13:23 +0100, Simon Riggs wrote:
  (As a side note, RPMs *may not* be ready, because I (and Magnus)
 will be
  at PGDay Turkey on 12th, and will be busy over the whole weekend).

 Is that a closed meeting? I hadn't seen any mention of that anywhere.

 Not that much. I've been tweeting about that (in Turkish, sorry) and
 sending out announcements to local mailing lists for quite some time,
 but we not able to finalize the venue until last week. This is the 2nd
 one after last year. I will do a wider announcement next time, I promise
 -- and the new event submission policy on postgresql.org raised a
 barrier against adding an item to the website, but that is more a -www
 topic.

Uh, are you referring to the policy at
http://wiki.postgresql.org/wiki/NewsEventsApproval? I don't see
anything there that would exclude this.

 (Oh, and you need to have the secret key to enter the meeting.)

I think Simon can be included in the group of people who can be
accepted with just the secret handshake, no?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] smart shutdown at end of transaction (was: Default mode for shutdown)

2012-04-30 Thread Wolfgang Wilhelm

Just for the ones interested in a view on another turf:

In Oracle shutdown immediate is the fastest _clean_ shutdown and shutdown 
abort is equal to shutdown immediate in PG.
The other modes are called shutdown normal and shutdown transactional.


Wolfgang




 Von: Tom Lane t...@sss.pgh.pa.us
An: Simon Riggs si...@2ndquadrant.com 
CC: Robert Haas robertmh...@gmail.com; Alvaro Herrera 
alvhe...@commandprompt.com; Magnus Hagander mag...@hagander.net; 
PostgreSQL-development pgsql-hackers@postgresql.org 
Gesendet: 20:48 Freitag, 27.April 2012
Betreff: Re: [HACKERS] smart shutdown at end of transaction (was: Default mode 
for shutdown) 
 
Simon Riggs si...@2ndquadrant.com writes:
 On Fri, Apr 27, 2012 at 7:29 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 No, I'm not happy with that.  Smart shutdown is defined to not affect
 current sessions.  I'm fine with having a fourth mode that acts as you
 suggest (and, probably, even with making it the default); but not with
 taking away a behavior that people may well be relying on.

 Agreed, but not sure what to call the new mode: smarter?

I'm not necessarily opposed to commandeering the name smart for the
new behavior, so that what we have to find a name for is the old smart
behavior.  How about

    slow    - allow existing sessions to finish (old smart)
    smart    - allow existing transactions to finish (new)
    fast    - kill active queries
    immediate - unclean shutdown

            regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Analyzing foreign tables memory problems

2012-04-30 Thread Albe Laurenz

During ANALYZE, in analyze.c, functions compute_minimal_stats
and compute_scalar_stats, values whose length exceed
WIDTH_THRESHOLD (= 1024) are not used for calculating statistics
other than that they are counted as too wide rows and assumed
to be all different.

This works fine with regular tables; values exceeding that threshold
don't get detoasted and won't consume excessive memory.

With foreign tables the situation is different.  Even though
values exceeding WIDTH_THRESHOLD won't get used, the complete
rows will be fetched from the foreign table.  This can easily
exhaust maintenance_work_mem.

A foreign data wrapper has no good way to counter the problem.
It can return truncated values in ist AcquireSampleRowsFunc,
but WIDTH_THRESHOLD is private to analyze.c and it's a bad idea
to hard code a cutoff limit of 1025.

I can think of two remedies:
1) Expose WIDTH_THRESHOLD in commands/vacuum.h and add documentation
   so that the authors of foreign data wrappers are aware of the
   problem and can avoid it on their side.
   This would be quite simple.

2) Instead of one callback that returns all sample rows, have
   a callback that just returns the next table row (or the next
   table row from a subset of the pages of the table in the
   internal case).  This function could be called in a short-lived
   memory context.  Vitter's algorithm for selecting a sample
   and the truncation of excessively long values would then be
   handled in analyze.c.
   This would avoid the problem completely and make it easier
   to write a foreign data wrapper.
   I haven't thought this out completely, and it would require
   bigger changes to analyze.c and the API than are probably
   welcome this close to beta.

What is your opinion?

Yours,
Laurenz Albe



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] enable_indexonly

2012-04-30 Thread Magnus Hagander

On Sat, Apr 28, 2012 at 17:41, Andrew Dunstan and...@dunslane.net wrote:

 On 04/27/2012 12:44 PM, Magnus Hagander wrote:


 Hmm. Forgive me, I pressed the wrong button and looked at current docs
 rather than dev docs.

 (Easier when they used to look different...)


 Maybe we should have the stylesheet watermark the dev docs pages.

 Are you offering to write the patch? ;)



 I am about the worst person in the world to ask to do this. But there are
 plenty of web jockeys in the community who would be better at it than would
 I.

:-) Well, same here. But if one of said web jockeys can come up with a
good stylesheet, I'll be happy to integrate it.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts

2012-04-30 Thread Ryan Kelly

On Sun, Apr 29, 2012 at 10:12:40PM -0400, Alvaro Herrera wrote:
 
 Excerpts from Ryan Kelly's message of sáb ene 14 16:22:21 -0300 2012:
 
  I have attached a new patch which handles the connect_timeout option by
  adding a PQconnectTimeout(conn) function to access the connect_timeout
  which I then use to retrieve the existing value from the old connection.
 
 Was this patch dropped entirely?  If not and it hasn't been committed
 yet, I think it belongs in the open CF here:
 https://commitfest.postgresql.org/action/commitfest_view?id=14
Needs some freshening if anyone still wants it. Update against latest
HEAD attached.

 
 -- 
 Álvaro Herrera alvhe...@commandprompt.com
 The PostgreSQL Company - Command Prompt, Inc.
 PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-Ryan Kelly
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 8a820ac..bf4d110 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1479,6 +1479,24 @@ char *PQoptions(const PGconn *conn);
   /para
  /listitem
 /varlistentry
+
+varlistentry id=libpq-pqconnecttimeout
+ term
+  functionPQconnectTimeout/function
+  indexterm
+   primaryPQconnectTimeout/primary
+  /indexterm
+ /term
+
+ listitem
+  para
+   Returns the connect_timeout property as given to libpq.
+synopsis
+char *PQconnectTimeout(const PGconn *conn);
+/synopsis
+  /para
+ /listitem
+/varlistentry
/variablelist
   /para
 
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index dd59aa1..90dfe13 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1504,7 +1504,7 @@ static bool
 do_connect(char *dbname, char *user, char *host, char *port)
 {
 	PGconn	   *o_conn = pset.db,
-			   *n_conn;
+			   *n_conn = NULL;
 	char	   *password = NULL;
 
 	if (!dbname)
@@ -1537,7 +1537,7 @@ do_connect(char *dbname, char *user, char *host, char *port)
 
 	while (true)
 	{
-#define PARAMS_ARRAY_SIZE	8
+#define PARAMS_ARRAY_SIZE	9
 		const char **keywords = pg_malloc(PARAMS_ARRAY_SIZE * sizeof(*keywords));
 		const char **values = pg_malloc(PARAMS_ARRAY_SIZE * sizeof(*values));
 
@@ -1555,17 +1555,120 @@ do_connect(char *dbname, char *user, char *host, char *port)
 		values[5] = pset.progname;
 		keywords[6] = client_encoding;
 		values[6] = (pset.notty || getenv(PGCLIENTENCODING)) ? NULL : auto;
-		keywords[7] = NULL;
-		values[7] = NULL;
+		keywords[7] = connect_timeout;
+		values[7] = PQconnectTimeout(o_conn);
+		keywords[8] = NULL;
+		values[8] = NULL;
 
-		n_conn = PQconnectdbParams(keywords, values, true);
+		/* attempt connection asynchronously */
+		n_conn = PQconnectStartParams(keywords, values, true);
+
+		if (sigsetjmp(sigint_interrupt_jmp, 1) != 0)
+		{
+			/* interrupted during connection attempt */
+			PQfinish(n_conn);
+			n_conn = NULL;
+		}
+		else
+		{
+			time_t		end_time = -1;
+
+			/*
+			 * maybe use a connection timeout. this code essentially stolen
+			 * from src/interfaces/libpq/fe-connect.c connectDBComplete
+			 */
+			if (PQconnectTimeout(n_conn) != NULL)
+			{
+int			timeout = atoi(PQconnectTimeout(n_conn));
+if (timeout  0)
+{
+	/*
+	 * Rounding could cause connection to fail; need at least 2 secs
+	 */
+	if (timeout  2)
+		timeout = 2;
+	/* calculate the finish time based on start + timeout */
+	end_time = time(NULL) + timeout;
+}
+			}
+
+			while(end_time  0 || time(NULL)  end_time)
+			{
+int			poll_res;
+int			rc;
+fd_set		read_mask,
+			write_mask;
+struct timeval timeout;
+struct timeval *ptr_timeout;
+
+poll_res = PQconnectPoll(n_conn);
+if (poll_res == PGRES_POLLING_OK ||
+	poll_res == PGRES_POLLING_FAILED)
+{
+	break;
+}
+
+if (poll_res == PGRES_POLLING_READING)
+	FD_SET(PQsocket(n_conn), read_mask);
+if (poll_res == PGRES_POLLING_WRITING)
+	FD_SET(PQsocket(n_conn), write_mask);
+
+/*
+ * Compute appropriate timeout interval. essentially stolen
+ * from src/interfaces/libpq/fe-misc.c pqSocketPoll. Maybe
+ * that function could be made public? we could then replace
+ * the whole inside of this while loop, assuming it is safe
+ * to longjmp out from there.
+ */
+if (end_time == ((time_t) -1))
+	ptr_timeout = NULL;
+else
+{
+	time_t  now = time(NULL);
+
+	if (end_time  now)
+		timeout.tv_sec = end_time - now;
+	else
+		timeout.tv_sec = 0;
+	timeout.tv_usec = 0;
+	ptr_timeout = timeout;
+}
+
+sigint_interrupt_enabled = true;
+if (cancel_pressed)
+{
+	PQfinish(n_conn);
+	n_conn = NULL;
+	sigint_interrupt_enabled = false;
+	break;
+}
+rc = select(PQsocket(n_conn) + 1,
+			read_mask, write_mask, NULL,
+			ptr_timeout);
+sigint_interrupt_enabled = false;
+
+if (rc  0  errno != EINTR)
+	break;
+			}
+
+			if (PQstatus(n_conn) != CONNECTION_OK 
+end_time  0  time(NULL) = end_time)

Re: [HACKERS] Future In-Core Replication

2012-04-30 Thread Atri Sharma

On Mon, Apr 30, 2012 at 12:39 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Mon, Apr 30, 2012 at 7:35 AM, Dave Page dp...@pgadmin.org wrote:
 On Sun, Apr 29, 2012 at 11:20 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:

 Excerpts from Simon Riggs's message of jue abr 26 11:10:09 -0300 2012:
 On Thu, Apr 26, 2012 at 1:41 PM, Simon Riggs si...@2ndquadrant.com wrote:

  I will also be organising a small-medium sized Future of In-Core
  Replication meeting in Ottawa on Wed 16 May, 6-10pm.

 Thanks for such rapid response. I've put up a wiki page and will be
 adding names as they come through

 http://wiki.postgresql.org/wiki/PgCon2012CanadaInCoreReplicationMeeting

 How is this not redundant with the Cluster Summit?
 http://wiki.postgresql.org/wiki/PgCon2012CanadaClusterSummit

 ... oh, you're also already enlisted in that one.  Sigh.

 My understanding is that the agenda for the cluster meeting is almost
 entirely dedicated to Postgres-XC.

 Yes. Regrettably, there wasn't a 90 minute slot available to discuss
 this at the cluster meeting, the dev meeting agenda is fairly full and
 my presentation on replication was rejected for the main conference.
 As a result, the additional meeting was the only way left open to me
 to initiate detailed discussion.

 The additional meeting will give us 240 minutes of discussion and
 allows us to bring in some technical users as well.

 --
  Simon Riggs   http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

Hi All,

Will the live streaming/recording of the meeting be available?

Atri

-- 
Regards,

Atri
l'apprenant

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] smart shutdown at end of transaction (was: Default mode for shutdown)

2012-04-30 Thread Greg Stark

On Mon, Apr 30, 2012 at 9:55 AM, Wolfgang Wilhelm
wolfgang20121...@yahoo.de wrote:
 Just for the ones interested in a view on another turf:

 In Oracle shutdown immediate is the fastest _clean_ shutdown and shutdown
 abort is equal to shutdown immediate in PG.
 The other modes are called shutdown normal and shutdown transactional.

Though the behaviour users see is quite different. In Oracle the
fastest clean shutdown still requires rolling back transactions which
can take a long time. In Postgres rolling back transactions is
instantaneous so a shutdown immediate will appear to behave like a
shutdown abort in Oracle in that it will always run fast even if the
effect on the database is different.


-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts

2012-04-30 Thread Alvaro Herrera


Excerpts from Ryan Kelly's message of lun abr 30 07:10:14 -0400 2012:
 On Sun, Apr 29, 2012 at 10:12:40PM -0400, Alvaro Herrera wrote:
  
  Excerpts from Ryan Kelly's message of sáb ene 14 16:22:21 -0300 2012:
  
   I have attached a new patch which handles the connect_timeout option by
   adding a PQconnectTimeout(conn) function to access the connect_timeout
   which I then use to retrieve the existing value from the old connection.
  
  Was this patch dropped entirely?  If not and it hasn't been committed
  yet, I think it belongs in the open CF here:
  https://commitfest.postgresql.org/action/commitfest_view?id=14
 Needs some freshening if anyone still wants it. Update against latest
 HEAD attached.

Well, do *you* want it?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Analyzing foreign tables memory problems

2012-04-30 Thread Tom Lane

Albe Laurenz laurenz.a...@wien.gv.at writes:
 During ANALYZE, in analyze.c, functions compute_minimal_stats
 and compute_scalar_stats, values whose length exceed
 WIDTH_THRESHOLD (= 1024) are not used for calculating statistics
 other than that they are counted as too wide rows and assumed
 to be all different.

 This works fine with regular tables; values exceeding that threshold
 don't get detoasted and won't consume excessive memory.

 With foreign tables the situation is different.  Even though
 values exceeding WIDTH_THRESHOLD won't get used, the complete
 rows will be fetched from the foreign table.  This can easily
 exhaust maintenance_work_mem.

I'm fairly skeptical that this is a real problem, and would prefer not
to complicate wrappers until we see some evidence from the field that
it's worth worrying about.  The WIDTH_THRESHOLD logic was designed a
dozen years ago when common settings for work_mem were a lot smaller
than today.  Moreover, to my mind it's always been about avoiding
detoasting operations as much as saving memory, and we don't have
anything equivalent to that consideration in foreign data wrappers.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] default_transaction_isolation = serializable causes crash under Hot Standby

2012-04-30 Thread Kevin Grittner

Robert Haas robertmh...@gmail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:
 
 * throw a WARNING if serializable is stated in other cases, and
 downgrade the request to repeatable read
 
 I think this would be reasonable, but it's still my second choice.
 The advantage of throwing an ERROR is that someone will presumably
 be forced to realize that a problem exists and fix it, whereas a
 WARNING may just generate a combination of log spam and unexpected
 behavior forever.  Also, we currently block cases where you try to
 set transaction_isolation by throwing an ERROR, so it seems a bit
 more consistent to do that in other cases as well.  Still, it's a
 reasonable choice, and certainly better than failing an assertion.
 
I'm not totally clear on your first choice.  Are you looking for
something similar to the patch I posted, except that it would dodge
all resulting errors at the point where they are promoted to FATAL
(before HS is really functional)?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] default_transaction_isolation = serializable causes crash under Hot Standby

2012-04-30 Thread Robert Haas

On Mon, Apr 30, 2012 at 10:26 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:
 Simon Riggs si...@2ndquadrant.com wrote:

 * throw a WARNING if serializable is stated in other cases, and
 downgrade the request to repeatable read

 I think this would be reasonable, but it's still my second choice.
 The advantage of throwing an ERROR is that someone will presumably
 be forced to realize that a problem exists and fix it, whereas a
 WARNING may just generate a combination of log spam and unexpected
 behavior forever.  Also, we currently block cases where you try to
 set transaction_isolation by throwing an ERROR, so it seems a bit
 more consistent to do that in other cases as well.  Still, it's a
 reasonable choice, and certainly better than failing an assertion.

 I'm not totally clear on your first choice.  Are you looking for
 something similar to the patch I posted, except that it would dodge
 all resulting errors at the point where they are promoted to FATAL
 (before HS is really functional)?

That's my vote.  Where is that FATAL error coming from?  I'm guessing
it's somehow resulting from failure to set up the startup transaction
in InitPostgres().  If that's the case, we ought to be able to work
around it, because surely repeatable read would be fine for the
startup transaction, which doesn't really do anything anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Analyzing foreign tables memory problems

2012-04-30 Thread Albe Laurenz

Tom Lane wrote:
 During ANALYZE, in analyze.c, functions compute_minimal_stats
 and compute_scalar_stats, values whose length exceed
 WIDTH_THRESHOLD (= 1024) are not used for calculating statistics
 other than that they are counted as too wide rows and assumed
 to be all different.
 
 This works fine with regular tables; values exceeding that threshold
 don't get detoasted and won't consume excessive memory.
 
 With foreign tables the situation is different.  Even though
 values exceeding WIDTH_THRESHOLD won't get used, the complete
 rows will be fetched from the foreign table.  This can easily
 exhaust maintenance_work_mem.

 I'm fairly skeptical that this is a real problem, and would prefer not
 to complicate wrappers until we see some evidence from the field that
 it's worth worrying about.  The WIDTH_THRESHOLD logic was designed a
 dozen years ago when common settings for work_mem were a lot smaller
 than today.  Moreover, to my mind it's always been about avoiding
 detoasting operations as much as saving memory, and we don't have
 anything equivalent to that consideration in foreign data wrappers.

If I have a table with 10 rows and default_statistics_target
at 100, then a sample of 3 rows will be taken.

If each row contains binary data of 1MB (an Image), then the
data structure returned will use about 30 GB of memory, which
will probably exceed maintenance_work_mem.

Or is there a flaw in my reasoning?

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: add timing of buffer I/O requests

2012-04-30 Thread Greg Stark

On Sun, Apr 29, 2012 at 12:26 AM, Robert Haas robertmh...@gmail.com wrote:
 As for track_iotiming - track_io_timing, I'm fine with that as well.

I'm still grumpy about the idea of a GUC changing the explain analyze
output. How would people feel about adding an explain option that
explicitly requests io timing for this explain analyze and then having
the io timing be enabled if either it's requested by explain analyze
or if it's set on globally? That would make it more consistent with
the other explain analyze options?

I realize I don't get to be grumpy without actually contributing
anything, but I'm happy to write up the patch if people agree with the
change.


-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: add timing of buffer I/O requests

2012-04-30 Thread Tom Lane

Greg Stark st...@mit.edu writes:
 On Sun, Apr 29, 2012 at 12:26 AM, Robert Haas robertmh...@gmail.com wrote:
 As for track_iotiming - track_io_timing, I'm fine with that as well.

 I'm still grumpy about the idea of a GUC changing the explain analyze
 output. How would people feel about adding an explain option that
 explicitly requests io timing for this explain analyze and then having
 the io timing be enabled if either it's requested by explain analyze
 or if it's set on globally? That would make it more consistent with
 the other explain analyze options?

I think it's going to be hard to decouple that altogether.  For
instance, if track_io_timing were not on but you did EXPLAIN (TIMING),
you'd end up with timing info getting sent to the stats collector for
just that one statement.  That seems a bit weird too.

I see where you're coming from but I don't think it's a good idea to
add an EXPLAIN option unless you can make the two behaviors (EXPLAIN
reporting and stats collection) truly independent.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Analyzing foreign tables memory problems

2012-04-30 Thread Tom Lane

Albe Laurenz laurenz.a...@wien.gv.at writes:
 Tom Lane wrote:
 I'm fairly skeptical that this is a real problem, and would prefer not
 to complicate wrappers until we see some evidence from the field that
 it's worth worrying about.

 If I have a table with 10 rows and default_statistics_target
 at 100, then a sample of 3 rows will be taken.

 If each row contains binary data of 1MB (an Image), then the
 data structure returned will use about 30 GB of memory, which
 will probably exceed maintenance_work_mem.

 Or is there a flaw in my reasoning?

Only that I don't believe this is a real-world scenario for a foreign
table.  If you have a foreign table in which all, or even many, of the
rows are that wide, its performance is going to suck so badly that
you'll soon look for a different schema design anyway.

I don't want to complicate FDWs for this until it's an actual bottleneck
in real applications, which it may never be, and certainly won't be
until we've gone through a few rounds of performance refinement for
basic operations.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Analyzing foreign tables memory problems

2012-04-30 Thread Simon Riggs

On Mon, Apr 30, 2012 at 3:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Albe Laurenz laurenz.a...@wien.gv.at writes:
 During ANALYZE, in analyze.c, functions compute_minimal_stats
 and compute_scalar_stats, values whose length exceed
 WIDTH_THRESHOLD (= 1024) are not used for calculating statistics
 other than that they are counted as too wide rows and assumed
 to be all different.

 This works fine with regular tables; values exceeding that threshold
 don't get detoasted and won't consume excessive memory.

 With foreign tables the situation is different.  Even though
 values exceeding WIDTH_THRESHOLD won't get used, the complete
 rows will be fetched from the foreign table.  This can easily
 exhaust maintenance_work_mem.

 I'm fairly skeptical that this is a real problem

AFAIK its not possible to select all columns from an Oracle database.
If you use an unqualified LONG column as part of the query then you
get an error.

So there are issues with simply requesting data for analysis.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts

2012-04-30 Thread Ryan Kelly

On Mon, Apr 30, 2012 at 09:02:33AM -0400, Alvaro Herrera wrote:
 
 Excerpts from Ryan Kelly's message of lun abr 30 07:10:14 -0400 2012:
  On Sun, Apr 29, 2012 at 10:12:40PM -0400, Alvaro Herrera wrote:
   
   Excerpts from Ryan Kelly's message of sáb ene 14 16:22:21 -0300 2012:
   
I have attached a new patch which handles the connect_timeout option by
adding a PQconnectTimeout(conn) function to access the connect_timeout
which I then use to retrieve the existing value from the old connection.
   
   Was this patch dropped entirely?  If not and it hasn't been committed
   yet, I think it belongs in the open CF here:
   https://commitfest.postgresql.org/action/commitfest_view?id=14
  Needs some freshening if anyone still wants it. Update against latest
  HEAD attached.
 
 Well, do *you* want it?
Of course. That way I can stop patching my psql and go back to using the
one that came with my release :)

-Ryan Kelly

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Typo in access/transam/xlog.c

2012-04-30 Thread Peter Geoghegan

Within access/transam/xlog.c , the following comment has an obvious error:

 * (This should not be called for for synchronous commits.)

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Analyzing foreign tables memory problems

2012-04-30 Thread Noah Misch

On Mon, Apr 30, 2012 at 12:27:45PM +0200, Albe Laurenz wrote:
 During ANALYZE, in analyze.c, functions compute_minimal_stats
 and compute_scalar_stats, values whose length exceed
 WIDTH_THRESHOLD (= 1024) are not used for calculating statistics
 other than that they are counted as too wide rows and assumed
 to be all different.
 
 This works fine with regular tables; values exceeding that threshold
 don't get detoasted and won't consume excessive memory.
 
 With foreign tables the situation is different.  Even though
 values exceeding WIDTH_THRESHOLD won't get used, the complete
 rows will be fetched from the foreign table.  This can easily
 exhaust maintenance_work_mem.
 
 A foreign data wrapper has no good way to counter the problem.
 It can return truncated values in ist AcquireSampleRowsFunc,
 but WIDTH_THRESHOLD is private to analyze.c and it's a bad idea
 to hard code a cutoff limit of 1025.
 
 I can think of two remedies:
 1) Expose WIDTH_THRESHOLD in commands/vacuum.h and add documentation
so that the authors of foreign data wrappers are aware of the
problem and can avoid it on their side.
This would be quite simple.

Seems reasonable.  How would the FDW return an indication that a value was
non-NULL but removed due to excess width?

Not all databases can cheaply filter out wide column values; by the time the
remote side has an exact width, the remote I/O damage may already be done.  To
dodge that problem, when a column has SET STATISTICS 0, the FDW should be
able to completely omit reading it.  (I haven't studied the API needs, if any,
to make that possible.)

 2) Instead of one callback that returns all sample rows, have
a callback that just returns the next table row (or the next
table row from a subset of the pages of the table in the
internal case).  This function could be called in a short-lived
memory context.  Vitter's algorithm for selecting a sample
and the truncation of excessively long values would then be
handled in analyze.c.
This would avoid the problem completely and make it easier
to write a foreign data wrapper.
I haven't thought this out completely, and it would require
bigger changes to analyze.c and the API than are probably
welcome this close to beta.

This solves the (in your downthread example) 30 GiB of memory consumption, but
you'll still read 30 GiB on the remote side and ship it all over the network.
To call this fixed, we'll need something like (1) that lets the FDW limit
volume at the remote side.

Thanks,
nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2012-04-30 Thread Noah Misch

When GIN changes a metapage, we WAL-log its ex-header content and never use a
backup block.  This reduces WAL volume since the vast majority of the metapage
is unused.  However, ginRedoUpdateMetapage() only restores the WAL-logged
content if the metapage LSN predates the WAL record LSN.  If a metapage write
tore and updated the LSN but not the other content, we would fail to complete
the update.  Instead, unconditionally reinitialize the metapage similar to how
_bt_restore_meta() handles the situation.

I found this problem by code reading and did not attempt to build a test case
illustrating its practical consequences.  It's possible that there's no
problem in practice on account of some reason I haven't contemplated.

Thanks,
nm
*** a/src/backend/access/gin/ginxlog.c
--- b/src/backend/access/gin/ginxlog.c
***
*** 492,504  ginRedoUpdateMetapage(XLogRecPtr lsn, XLogRecord *record)
return; /* assume index was 
deleted, nothing to do */
metapage = BufferGetPage(metabuffer);
  
!   if (!XLByteLE(lsn, PageGetLSN(metapage)))
!   {
!   memcpy(GinPageGetMeta(metapage), data-metadata, 
sizeof(GinMetaPageData));
!   PageSetLSN(metapage, lsn);
!   PageSetTLI(metapage, ThisTimeLineID);
!   MarkBufferDirty(metabuffer);
!   }
  
if (data-ntuples  0)
{
--- 492,503 
return; /* assume index was 
deleted, nothing to do */
metapage = BufferGetPage(metabuffer);
  
!   GinInitMetabuffer(metabuffer);
!   memcpy(GinPageGetMeta(metapage), data-metadata, 
sizeof(GinMetaPageData));
! 
!   PageSetLSN(metapage, lsn);
!   PageSetTLI(metapage, ThisTimeLineID);
!   MarkBufferDirty(metabuffer);
  
if (data-ntuples  0)
{

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Future In-Core Replication

2012-04-30 Thread Bruce Momjian

On Thu, Apr 26, 2012 at 01:41:33PM +0100, Simon Riggs wrote:
 Some people have talked about the need for multi-master replication,
 whereby 2+ databases communicate changes to one another. This topic
 has been discussed in some depth in Computer Science academic papers,
 most notably, The Dangers of Replication and a Solution by the late
 Jim Gray. I've further studied this to the point where I have a
 mathematical model of this that allows me to predict what our likely
 success will be from implementing that. Without meaning to worry you,
 MM replication alone is not a solution for large data or the general
 case. For the general case, single master replication will continue to
 be the most viable option. For large and distributed data sets, some
 form of partitioning/sharding is required simply because full
 multi-master replication just isn't viable at both volume and scale.
 So my take on this is that MM is desirable, but is not the only thing
 we need - we also need partial/filtered replication to make large
 systems practical. Hence why I've been calling this the
 Bi-Directional Replication project. I'm aware that paragraph alone
 requires lots of explanation, which I hope to do both in writing and
 in person at the forthcoming developer conference.

I would love to see a layout of exactly where these things make sense,
similar to what we do at the bottom of our documentation for High
Availability, Load Balancing, and Replication:


http://www.postgresql.org/docs/9.1/static/different-replication-solutions.html

Users and developers just can't seem to get the calculus of where things
make sense into their heads, me included.

For example, you said that MM replication alone is not a solution for
large data or the general case.  Why is that?  Is the goal of your work
really to do logical replciation, which allows for major version
upgrades?  Is that the defining feature?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: add conversion from pg_wchar to multibyte

2012-04-30 Thread Robert Haas

On Sun, Apr 29, 2012 at 8:12 AM, Erik Rijkers e...@xs4all.nl wrote:
 Perhaps I'm too early with these tests, but FWIW I reran my earlier test 
 program against three
 instances.  (the patches compiled fine, and make check was without problem).

These tests results seem to be more about the pg_trgm changes than the
patch actually on this thread, unless I'm missing something.  But the
executive summary seems to be that pg_trgm might need to be a bit
smarter about costing the trigram-based search, because when the
number of trigrams is really big, using the index is
counterproductive.  Hopefully that's not too hard to fix; the basic
approach seems quite promising.

(I haven't actually looked at the patch on this thread yet to
understand how it fits in; the above comments are about the pg_trgm
regex stuff.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] precision and scale functions for numeric

2012-04-30 Thread Robert Haas

On Sun, Apr 29, 2012 at 1:51 PM, Peter Eisentraut pete...@gmx.net wrote:
 I didn't find a good way to find out how many digits a numeric value has
 or things like whether a numeric value is an integer.  (I had to go
 through bc(1) for the latter.)  Functions like precision() and scale()
 would have been quite handy.  Are there other ways to do this, or would
 this make a good addition?

I think you could test for integer-ness by testing whether val % 0 = 0.

You could in general test of digits after the decimal point by casting
the value to text and using POSITION() to find the decimal point.  If
POSITION() returns zero then zero; else subtract the result from the
string length.  Similarly, it's pretty easy to regexp away the
non-digits and measure the length of the resulting string.

I don't have a good feeling for whether these operations are common
enough to justify adding a few more functions.  They haven't yet come
up for me personally.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Future In-Core Replication

2012-04-30 Thread Merlin Moncure

On Mon, Apr 30, 2012 at 12:38 PM, Bruce Momjian br...@momjian.us wrote:
 For example, you said that MM replication alone is not a solution for
 large data or the general case.  Why is that?  Is the goal of your work
 really to do logical replciation, which allows for major version
 upgrades?  Is that the defining feature?

TBH, I don't think MM replication belongs in the database at all.
Ditto any replication solution that implements 'eventual consistency'
such that after the fact conflict resolution is required.  In an SQL
database, when a transaction commits, it should remain so.  It belongs
in the application layer.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2012-04-30 Thread Tom Lane

Noah Misch n...@leadboat.com writes:
 When GIN changes a metapage, we WAL-log its ex-header content and never use a
 backup block.  This reduces WAL volume since the vast majority of the metapage
 is unused.  However, ginRedoUpdateMetapage() only restores the WAL-logged
 content if the metapage LSN predates the WAL record LSN.  If a metapage write
 tore and updated the LSN but not the other content, we would fail to complete
 the update.  Instead, unconditionally reinitialize the metapage similar to how
 _bt_restore_meta() handles the situation.

 I found this problem by code reading and did not attempt to build a test case
 illustrating its practical consequences.  It's possible that there's no
 problem in practice on account of some reason I haven't contemplated.

I think there's no problem in practice; the reason is that the
GinMetaPageData struct isn't large enough to extend past the first
physical sector of the page.  So it's in the same disk sector as the
LSN and tearing is impossible.  Still, this might be a good
future-proofing move, in case GinMetaPageData gets larger.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Analyzing foreign tables memory problems

2012-04-30 Thread Albe Laurenz

Simon Riggs wrote:
 During ANALYZE, in analyze.c, functions compute_minimal_stats
 and compute_scalar_stats, values whose length exceed
 WIDTH_THRESHOLD (= 1024) are not used for calculating statistics
 other than that they are counted as too wide rows and assumed
 to be all different.

 This works fine with regular tables; values exceeding that threshold
 don't get detoasted and won't consume excessive memory.

 With foreign tables the situation is different.  Even though
 values exceeding WIDTH_THRESHOLD won't get used, the complete
 rows will be fetched from the foreign table.  This can easily
 exhaust maintenance_work_mem.

 I'm fairly skeptical that this is a real problem

 AFAIK its not possible to select all columns from an Oracle database.
 If you use an unqualified LONG column as part of the query then you
 get an error.

 So there are issues with simply requesting data for analysis.

To detail on the specific case of Oracle, I have given up on LONG
since a) it has been deprecated for a long time and
b) it is not possible to retrieve a LONG column unless you know
in advance how long it is.

But you can have several BLOB and CLOB columns in a table, each
of which can be arbitrarily large and can lead to the problem
I described.

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Analyzing foreign tables memory problems

2012-04-30 Thread Albe Laurenz

Tom Lane wrote:
 I'm fairly skeptical that this is a real problem, and would prefer not
 to complicate wrappers until we see some evidence from the field that
 it's worth worrying about.

 If I have a table with 10 rows and default_statistics_target
 at 100, then a sample of 3 rows will be taken.

 If each row contains binary data of 1MB (an Image), then the
 data structure returned will use about 30 GB of memory, which
 will probably exceed maintenance_work_mem.

 Or is there a flaw in my reasoning?

 Only that I don't believe this is a real-world scenario for a foreign
 table.  If you have a foreign table in which all, or even many, of the
 rows are that wide, its performance is going to suck so badly that
 you'll soon look for a different schema design anyway.

Of course it wouldn't work well to SELECT * from such a foreign table,
but it would work well enough to get one or a few rows at a time,
which is probably such a table's purpose in life anyway.

 I don't want to complicate FDWs for this until it's an actual bottleneck
 in real applications, which it may never be, and certainly won't be
 until we've gone through a few rounds of performance refinement for
 basic operations.

I agree that it may not be the right thing to do something invasive
to solve an anticipated problem that may never be one.

So scrap my second idea.  But I think that exposing WIDTH_THRESHOLD
wouldn't be unreasonable, would it?

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Future In-Core Replication

2012-04-30 Thread Simon Riggs

On Mon, Apr 30, 2012 at 6:38 PM, Bruce Momjian br...@momjian.us wrote:

 I would love to see a layout of exactly where these things make sense,
 similar to what we do at the bottom of our documentation for High
 Availability, Load Balancing, and Replication:

        
 http://www.postgresql.org/docs/9.1/static/different-replication-solutions.html

 Users and developers just can't seem to get the calculus of where things
 make sense into their heads, me included.

 For example, you said that MM replication alone is not a solution for
 large data or the general case.  Why is that?  Is the goal of your work
 really to do logical replciation, which allows for major version
 upgrades?  Is that the defining feature?

Good question.

The use case, its breadth and utility are always the first place I
start. I'm in the middle of writing a presentation that explains this
from first principles and will be discussing that at the PgCon
meeting. It's taken a long time to articulate that rather than make
leaps of assumption and belief.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Analyzing foreign tables memory problems

2012-04-30 Thread Albe Laurenz

Noah Misch wrote:
 During ANALYZE, in analyze.c, functions compute_minimal_stats
 and compute_scalar_stats, values whose length exceed
 WIDTH_THRESHOLD (= 1024) are not used for calculating statistics
 other than that they are counted as too wide rows and assumed
 to be all different.
 
 This works fine with regular tables;

 With foreign tables the situation is different.  Even though
 values exceeding WIDTH_THRESHOLD won't get used, the complete
 rows will be fetched from the foreign table.  This can easily
 exhaust maintenance_work_mem.

 I can think of two remedies:
 1) Expose WIDTH_THRESHOLD in commands/vacuum.h and add documentation
so that the authors of foreign data wrappers are aware of the
problem and can avoid it on their side.
This would be quite simple.

 Seems reasonable.  How would the FDW return an indication that a value was
 non-NULL but removed due to excess width?

The FDW would return a value of length WIDTH_THRESHOLD+1 that is
long enough to be recognized as too long, but not long enough to
cause a problem.

 Not all databases can cheaply filter out wide column values; by the time the
 remote side has an exact width, the remote I/O damage may already be done.  To
 dodge that problem, when a column has SET STATISTICS 0, the FDW should be
 able to completely omit reading it.  (I haven't studied the API needs, if any,
 to make that possible.)

Depending on the capabilities of the remote side, a FDW can
do more or less intelligent things to avoid the problem.
But it must know WIDTH_THRESHOLD.

Disabling statistics for a column as a workaround is an
interesting idea, but would be more work for the FDW writer
and the user.

 2) Instead of one callback that returns all sample rows, have
a callback that just returns the next table row (or the next
table row from a subset of the pages of the table in the
internal case).  This function could be called in a short-lived
memory context.  Vitter's algorithm for selecting a sample
and the truncation of excessively long values would then be
handled in analyze.c.
This would avoid the problem completely and make it easier
to write a foreign data wrapper.

 This solves the (in your downthread example) 30 GiB of memory consumption, but
 you'll still read 30 GiB on the remote side and ship it all over the network.
 To call this fixed, we'll need something like (1) that lets the FDW limit
 volume at the remote side.

You are right.  I guess the first idea is the more promising one.

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] precision and scale functions for numeric

2012-04-30 Thread David Johnston

 -Original Message-
 From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-
 ow...@postgresql.org] On Behalf Of Robert Haas
 Sent: Monday, April 30, 2012 2:20 PM
 To: Peter Eisentraut
 Cc: pgsql-hackers
 Subject: Re: [HACKERS] precision and scale functions for numeric

 I think you could test for integer-ness by testing whether val % 0 = 0.

Either I am missing something here or you are.  Since Modulus is a division
function anything % 0 results in a division-by-zero ERROR - division has
to occur before a remainder can be obtained.

Maybe val % 2 NOT IN (0,1) ...

David J.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Future In-Core Replication

2012-04-30 Thread Robert Haas

On Mon, Apr 30, 2012 at 2:33 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Mon, Apr 30, 2012 at 12:38 PM, Bruce Momjian br...@momjian.us wrote:
 For example, you said that MM replication alone is not a solution for
 large data or the general case.  Why is that?  Is the goal of your work
 really to do logical replciation, which allows for major version
 upgrades?  Is that the defining feature?

 TBH, I don't think MM replication belongs in the database at all.
 Ditto any replication solution that implements 'eventual consistency'
 such that after the fact conflict resolution is required.  In an SQL
 database, when a transaction commits, it should remain so.  It belongs
 in the application layer.

I basically agree, at least in the medium term.  The logical
replication solutions we have today generally seem to work by watching
the inserts, updates, and deletes go by and writing the changed tuples
to a side table.  This is not very performant, because it amounts to
writing the data four times: we have to write WAL for the original
change, write the data files for the original change, write more WAL
for the change records, and the write those data files.  Since all
large database solutions are eventually I/O-bound, this is not great.
Writing and flushing a separate replication log in parallel to WAL
would get us down to three writes, and extracting tuple data from the
existing WAL would get us down to two writes, which is as well as we
ever know how to do.

If we just had that much in core - that is, the ability to efficiently
extra tuple inserts, updates, and deletes on a logical level - it
would be much easier to build a good logical replication system around
PostgreSQL than it is today, and the existing systems could be adapted
to deliver higher performance by making use of the new infrastructure.
 The other half of the changes - applying the updates - is relatively
straightforward, and it wouldn't bother me to leave that in user-land,
especially in the MMR case, where you have to deal with conflict
resolution rules that may be much simpler to express in a higher-level
language than they would be in C.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] precision and scale functions for numeric

2012-04-30 Thread Robert Haas

On Mon, Apr 30, 2012 at 3:33 PM, David Johnston pol...@yahoo.com wrote:
 -Original Message-
 From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-
 ow...@postgresql.org] On Behalf Of Robert Haas
 Sent: Monday, April 30, 2012 2:20 PM
 To: Peter Eisentraut
 Cc: pgsql-hackers
 Subject: Re: [HACKERS] precision and scale functions for numeric


 I think you could test for integer-ness by testing whether val % 0 = 0.


 Either I am missing something here or you are.  Since Modulus is a division
 function anything % 0 results in a division-by-zero ERROR - division has
 to occur before a remainder can be obtained.

 Maybe val % 2 NOT IN (0,1) ...

Oops.  I meant (val % 1) = 0, not (val % 0) = 0.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] precision and scale functions for numeric

2012-04-30 Thread Kevin Grittner

David Johnston pol...@yahoo.com wrote:
 
 I think you could test for integer-ness by testing whether val % 0 =
0.
 
 Modulus is a division function anything % 0 results in a
division-by-zero
 
It seems pretty clear that he meant % 1.
 
test=# select '1.01'::numeric % 1;
 ?column? 
--
 0.01
(1 row)
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Future In-Core Replication

2012-04-30 Thread Bruce Momjian

On Mon, Apr 30, 2012 at 07:55:00PM +0100, Simon Riggs wrote:
 On Mon, Apr 30, 2012 at 6:38 PM, Bruce Momjian br...@momjian.us wrote:
 
  I would love to see a layout of exactly where these things make sense,
  similar to what we do at the bottom of our documentation for High
  Availability, Load Balancing, and Replication:
 
         
  http://www.postgresql.org/docs/9.1/static/different-replication-solutions.html
 
  Users and developers just can't seem to get the calculus of where things
  make sense into their heads, me included.
 
  For example, you said that MM replication alone is not a solution for
  large data or the general case.  Why is that?  Is the goal of your work
  really to do logical replciation, which allows for major version
  upgrades?  Is that the defining feature?
 
 Good question.
 
 The use case, its breadth and utility are always the first place I
 start. I'm in the middle of writing a presentation that explains this
 from first principles and will be discussing that at the PgCon
 meeting. It's taken a long time to articulate that rather than make
 leaps of assumption and belief.

Yep, it is the assumption and belief that always confuses me.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Future In-Core Replication

2012-04-30 Thread Kevin Grittner

Robert Haas robertmh...@gmail.com wrote:
 
 The other half of the changes - applying the updates - is
 relatively straightforward, and it wouldn't bother me to leave
 that in user-land, especially in the MMR case, where you have to
 deal with conflict resolution rules that may be much simpler to
 express in a higher-level language than they would be in C.
 
I've developed successful MMR more than once, and while I wouldn't
say it's exactly been implemented in the database, it hasn't exactly
been in application space either.  The most successful
implementations I've worked with have been a layer just outside the
database, of which application software was completely unaware.  The
database wasn't aware of the coordination per se; but it *did* need
to provide transaction information in a clean way, and the
declarations of how data was distributed were in the database.  In
my experience a declarative definition of data distribution has
always been sufficient, and certainly cleaner to deal with than
imperative coding would be.
 
YMMV.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Future In-Core Replication

2012-04-30 Thread Merlin Moncure

On Mon, Apr 30, 2012 at 2:38 PM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Apr 30, 2012 at 2:33 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Mon, Apr 30, 2012 at 12:38 PM, Bruce Momjian br...@momjian.us wrote:
 For example, you said that MM replication alone is not a solution for
 large data or the general case.  Why is that?  Is the goal of your work
 really to do logical replciation, which allows for major version
 upgrades?  Is that the defining feature?

 TBH, I don't think MM replication belongs in the database at all.
 Ditto any replication solution that implements 'eventual consistency'
 such that after the fact conflict resolution is required.  In an SQL
 database, when a transaction commits, it should remain so.  It belongs
 in the application layer.

 I basically agree, at least in the medium term.  The logical
 replication solutions we have today generally seem to work by watching
 the inserts, updates, and deletes go by and writing the changed tuples
 to a side table.  This is not very performant, because it amounts to
 writing the data four times: we have to write WAL for the original
 change, write the data files for the original change, write more WAL
 for the change records, and the write those data files.  Since all
 large database solutions are eventually I/O-bound, this is not great.
 Writing and flushing a separate replication log in parallel to WAL
 would get us down to three writes, and extracting tuple data from the
 existing WAL would get us down to two writes, which is as well as we
 ever know how to do.

 If we just had that much in core - that is, the ability to efficiently
 extra tuple inserts, updates, and deletes on a logical level - it
 would be much easier to build a good logical replication system around
 PostgreSQL than it is today, and the existing systems could be adapted
 to deliver higher performance by making use of the new infrastructure.
  The other half of the changes - applying the updates - is relatively
 straightforward, and it wouldn't bother me to leave that in user-land,
 especially in the MMR case, where you have to deal with conflict
 resolution rules that may be much simpler to express in a higher-level
 language than they would be in C.

Yeah -- here at $work the SQL Server team (once in a while we cross
no-man's land and converse) has some fancy technology that sits
directly on top of the transaction log and exposes an API that you can
use to peek into the river of data running through the log and do
stuff with it.  In our case, they use it to triage extracts from about
100 or so distributed databases into a centralized store in a
relatively realtime fashion.  HS/SR simply can't do that and there
would be tremendous value in something that could.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: add conversion from pg_wchar to multibyte

2012-04-30 Thread Kevin Grittner

Robert Haas robertmh...@gmail.com wrote:
 
 Hopefully that's not too hard to fix; the basic approach seems
 quite promising.
 
After playing with trigram searches for name searches against copies
of production database with appropriate indexing, our shop has
chosen it as the new way to do name searches here.  It's really
nice.
 
My biggest complaint is related to setting the threshold for the %
operator.  It seems to me that there should be a GUC to control the
default, and that there should be a way to set the threshold for
each % operator in a query (if there is more than one).  The
function names which must be used on the connection before running
the queries don't give any clue that they are related to trigrams:
show_limit() and set_limit() are nearly useless for conveying the
semantics of what they do.
 
Even with those issues, trigram similarity searching is IMO one of
the top five coolest things about PostgreSQL and should be promoted
heavily.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Future In-Core Replication

2012-04-30 Thread Josh Berkus


 If we just had that much in core - that is, the ability to efficiently
 extra tuple inserts, updates, and deletes on a logical level - it
 would be much easier to build a good logical replication system around
 PostgreSQL than it is today, and the existing systems could be adapted
 to deliver higher performance by making use of the new infrastructure.

Well, this *is* the purpose of the cluster-hackers group, to add backend
support which would make external replication systems easier to build
and more efficient.  So far the only real feature to come out of that
has been the Command Triggers, but if you read the TODO list of that
group you'll see that it's a laundry list of things replication systems
need support for in the backend.

http://wiki.postgresql.org/wiki/ClusterFeatures
http://wiki.postgresql.org/wiki/PgCon2012CanadaClusterSummit

What puts me off about this let's start from first principles approach
is that in our community we have years of experience (in a couple cases,
over a decade) with logical-level replication.  It seems like the last
thing we need is Yet Another PostgreSQL Replication System, started over
from scratch and years away from being production quality.  Don't we
have enough external replication systems with not enough developers
behind them?

Even if improving an existing replication system proves to be
impossible, it would make more sense to start with an analysis of the
strengths and deficiencies of Slony, Londiste, Bucardo, etc., than with
some kind of clean-room approach.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Future In-Core Replication

2012-04-30 Thread Simon Riggs

On Mon, Apr 30, 2012 at 11:43 PM, Josh Berkus j...@agliodbs.com wrote:

 Well, this *is* the purpose of the cluster-hackers group

Well, I tried all available means to discuss my ideas before
organising an external meeting. You can think of the InCore meeting as
an extension of the cluster hackers meeting if you wish.

 What puts me off about this let's start from first principles approach
 is that in our community we have years of experience (in a couple cases,
 over a decade) with logical-level replication.  It seems like the last
 thing we need is Yet Another PostgreSQL Replication System, started over
 from scratch and years away from being production quality.  Don't we
 have enough external replication systems with not enough developers
 behind them?

I've tried very hard to express my admiration and respect for
developers of many replication systems both personally when we meet in
person and on list. And I've mentioned that I'd like to include as
many ideas as possible in an in-core approach. Yes, we have many
external replication systems. Many, many people have expressed the
desire for more coherent features in core.

I believe we can achieve production quality code in 1, maybe 2
releases. That is only possible by building on what we already have
and reusing the concepts, experience and perhaps even code from other
projects. We are closer to that than your statement allows. Yes, we
have enough external replication systems and that is one reason why we
need to put things in core and not just create another external
system.

 Even if improving an existing replication system proves to be
 impossible, it would make more sense to start with an analysis of the
 strengths and deficiencies of Slony, Londiste, Bucardo, etc., than with
 some kind of clean-room approach.

I'm not sure why you think I would *not* be starting with that analysis.

I look forward to discussing this in person, where I'm sure it will be easier.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Future In-Core Replication

2012-04-30 Thread Josh Berkus


 Well, this *is* the purpose of the cluster-hackers group
 
 Well, I tried all available means to discuss my ideas before
 organising an external meeting. You can think of the InCore meeting as
 an extension of the cluster hackers meeting if you wish.

That comment wasn't for you, it was for other folks who didn't seem to
be aware that some of these ideas had already been discussed.  You, of
course, know because you were there.

And, if you need a 3-hour meeting there's no question that you need the
evening thing.  There's just not enough time in anyone's schedule.

 I'm not sure why you think I would *not* be starting with that analysis.

That wasn't the impression I got from your prior emails to hackers.
Thanks for the clarification!

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Future In-Core Replication

2012-04-30 Thread Tatsuo Ishii

 Those are the basic requirements that I am trying to address. There
 are a great many important details, but the core of this is probably
 what I would call logical replication, that is shipping changes to
 other nodes in a way that does not tie us to the same physical
 representation that recovery/streaming replication does now. Of
 course, non-physical replication can take many forms.

Guessing from shipping changes to other nodes, you seem to
implicitly aim at asynchronous replication? If so, I am afraid it will
force users to pay some cost to migrate from existig applications.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Call for Lightning Talks for pgCon

2012-04-30 Thread Josh Berkus

Hackers, users, pgCon attendees:

You want to give a lightning talk at pgCon!

Yes, you do.  The fun, the glory, the laughter, the everlasting fame!
These can all be yours.

Be one of the ten brave and true who put together five minutes about
PostgreSQL tools, experiences, forks, ideas, websites, or even
(especially) jokes.Anything from hacking wal files to the
PostgreSQL drinking game is an acceptable topic for the lighting talks.
  A short schedule:

- Right Now: send your lightning talk idea to li...@pgcon.org.
  I'll need a title, speaker full name, speaker cell phone number,
  and brief (one sentence) description.
- This Friday: I'll get back to you with acceptance (or not)
- Friday, May 11th (or sooner) you get me your PDF slides for the talk.
- Thursday, May 17, 5:15 PM: meet Magnus Hagander in the plenary
  presentation room for your order of speaking.
- Thursday, May 17, 5:30PM to 6:30PM: you and 9 others deliver
  your talks

Fine print: Lightning talks are strictly five (5) minutes in length, and
speakers who run over will be cut off.  PDF slides or browser access
only, which will be presented on the conference laptop, so no demos,
animations, private network access, or installed software.  Lightning
talks are subject to pgCon's anti-discrimination policy.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

45 matches

Mail list logo