Re: [HACKERS] posix_fadvise missing in the walsender

2013-02-18 Thread Heikki Linnakangas

On 17.02.2013 14:55, Joachim Wieland wrote:

In access/transam/xlog.c we give the OS buffer caching a hint that we
won't need a WAL file any time soon with

 posix_fadvise(openLogFile, 0, 0, POSIX_FADV_DONTNEED);

before closing the WAL file, but only if we don't have walsenders.
That's reasonable because the walsender will reopen that same file
shortly after.

However the walsender doesn't call posix_fadvise once it's done with
the file and I'm proposing to add this to walsender.c for consistency
as well.

Since there could be multiple walsenders, only the slowest one
should call this function. Finding out the slowest walsender can be
done by inspecting the shared memory and looking at the sentPtr of
each walsender.
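
(For illustration, a hedged sketch of such a scan -- the function name is invented, but WalSndCtl, max_wal_senders, and the per-sender mutex/sentPtr fields are the existing ones from walsender_private.h:)

#include "postgres.h"

#include "replication/walsender.h"
#include "replication/walsender_private.h"
#include "storage/spin.h"

/*
 * Sketch only: return true if no other active walsender has sent less
 * WAL than we have; only then would it be safe for this walsender to
 * issue the POSIX_FADV_DONTNEED hint on the old segment.
 */
static bool
IAmSlowestWalSender(XLogRecPtr mySentPtr)
{
	int			i;

	for (i = 0; i < max_wal_senders; i++)
	{
		/* use volatile pointer to prevent code rearrangement */
		volatile WalSnd *walsnd = &WalSndCtl->walsnds[i];
		XLogRecPtr	theirSentPtr;

		if (walsnd->pid == 0)
			continue;			/* unused slot */

		SpinLockAcquire(&walsnd->mutex);
		theirSentPtr = walsnd->sentPtr;
		SpinLockRelease(&walsnd->mutex);

		if (theirSentPtr < mySentPtr)
			return false;		/* someone else is further behind */
	}
	return true;
}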


I doubt it's worth it; the OS cache generally does a reasonable job at 
deciding what to keep. In the non-walsender case, it's pretty clear that 
we don't need the WAL file anymore, but if we need to work any harder 
than a one-line posix_fadvise call, meh. I could be convinced otherwise 
with some performance test results, of course.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add PQconninfoParseParams and PQconninfodefaultsMerge to libpq

2013-02-18 Thread Heikki Linnakangas

On 18.02.2013 06:07, Amit Kapila wrote:

On Sunday, February 17, 2013 8:44 PM Phil Sorber wrote:

On Sun, Feb 17, 2013 at 1:35 AM, Amit Kapila amit.kap...@huawei.com wrote:

Now the patch of Phil Sorber provides 2 new API's,
PQconninfoParseParams() and PQconninfodefaultsMerge(). Using these
API's I can think of the below way for the patch "pass a connection
string to pg_basebackup":

1. Call the existing function PQconninfoParse() with the connection
string input by the user and get a PQconninfoOption.

2. Now take the existing keywords (the individual options specified by
the user), extract the keywords from the PQconninfoOption structure,
and call the new API PQconninfoParseParams(), which will return a
PQconninfoOption. The PQconninfoOption structure returned in this step
will contain all keywords.

3. Call PQconninfodefaultsMerge() to merge in any default values, if
they exist. Not sure if this step is required?

4. Extract the individual keywords from the PQconninfoOption structure
and call PQconnectdbParams().

Is this in line with what you have in mind, or have you thought of some
other simpler way of using the new API's?


Yep, that's roughly what I had in mind. I don't think it's necessary to 
merge defaults in step 3, but it needs to add the replication=true and 
dbname=replication options.



I think what would be nice is an additional connect function that took
PQconninfoOption as a parameter. Or at least something that can
convert from PQconninfoOption to a connection string or keyword/value
arrays.


Yes, it would be better if we want to use the new API's; I think it can
save step 4 and some part of step 2.


pg_basebackup needs to add options to the array, so I don't think a new 
connect function would help much. It's not a lot of code to iterate 
through the PQconninfoOption array and turn it back into keywords and 
values arrays, so I'd just do that straight in the client code.
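
(For concreteness, a minimal sketch of that client-side loop, using only libpq's public API; the helper name and the exact way the two extra options are appended are assumptions based on this thread, not code from the patch:)

#include <stdlib.h>

#include <libpq-fe.h>

/*
 * Sketch: rebuild parallel keyword/value arrays from a PQconninfoOption
 * list, append the replication options, and connect.
 */
static PGconn *
connect_from_options(PQconninfoOption *opts)
{
	PQconninfoOption *o;
	const char **keywords;
	const char **values;
	int			n = 0;
	int			i = 0;
	PGconn	   *conn;

	for (o = opts; o && o->keyword; o++)
		if (o->val)
			n++;

	/* two extra slots for the replication options, one for the NULL end */
	keywords = malloc((n + 3) * sizeof(const char *));
	values = malloc((n + 3) * sizeof(const char *));

	for (o = opts; o && o->keyword; o++)
	{
		if (!o->val)
			continue;
		keywords[i] = o->keyword;
		values[i] = o->val;
		i++;
	}

	/* the two options Heikki mentions above */
	keywords[i] = "replication";
	values[i] = "true";
	i++;
	keywords[i] = "dbname";
	values[i] = "replication";
	i++;
	keywords[i] = NULL;
	values[i] = NULL;

	conn = PQconnectdbParams(keywords, values, 0);
	free(keywords);
	free(values);
	return conn;
}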


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add PQconninfoParseParams and PQconninfodefaultsMerge to libpq

2013-02-18 Thread Amit Kapila
On Monday, February 18, 2013 1:41 PM Heikki Linnakangas wrote:
 On 18.02.2013 06:07, Amit Kapila wrote:
  On Sunday, February 17, 2013 8:44 PM Phil Sorber wrote:
  On Sun, Feb 17, 2013 at 1:35 AM, Amit kapilaamit.kap...@huawei.com
  wrote:
  Now the patch of Phil Sorber provides 2 new API's
  PQconninfoParseParams(), and PQconninfodefaultsMerge(),
  using these API's I can think of below way for patch "pass a
  connection string to pg_basebackup", ...
 
  1. Call existing function PQconninfoParse() with connection string
  input by user and get PQconninfoOption.
 
  2. Now use the existing keywords (individual options specified by
  user) and extract the keywords from
  PQconninfoOption structure and call new API
  PQconninfoParseParams() which will return PQconninfoOption.
  The PQconninfoOption structure returned in this step will
 contain
  all keywords
 
  3. Call PQconninfodefaultsMerge() to merge any default values if
  exist. Not sure if this step is required?
 
  4. Extract individual keywords from PQconninfoOption structure and
  call PQconnectdbParams.
 
  Is this in line with what you have in mind, or have you thought of some
  other simpler way of using the new API's?
 
 Yep, that's roughly what I had in mind. I don't think it's necessary to
 merge defaults in step 3, but it needs to add the replication=true
 and
 dbname=replication options.
 
  I think what would be nice is an additional connect function that
 took
  PQconninfoOption as a parameter. Or at least something that can
  convert from PQconninfoOption to a connection string or
 keyword/value
  arrays.
 
  Yes, it would be better if we want to use the new API's; I think it can
  save step 4 and some part of step 2.
 
 pg_basebackup needs to add options to the array, so I don't think a new
 connect function would help much. It's not a lot of code to iterate
 through the PQconninfoOption array and turn it back into keywords and
 values arrays, so I'd just do that straight in the client code.

Okay, got the point.

Phil, I will try to finish the combined patch. Is it possible for you to
complete the documentation for the new API's?

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Materialized views WIP patch

2013-02-18 Thread Alvaro Herrera
Kevin Grittner wrote:

 I'm OK with that approach, and in the absence of anyone pushing for
 another direction, will make that change to pg_dump.  I'm thinking
 I would only do this for materialized views which were not
 scannable, but which cause REFRESH failures on other materialized
 views if not refreshed first (recursively evaluated), rather than
 just automatically refreshing all MVs on restore.  The reason this
 seems important is that some MVs may take a long time to refresh,
 and a user might want a dump/restore to get to a point where they
 can use the rest of the database while building the contents of
 some MVs in the background or during off hours.

Maybe it would be a good idea to try to put such commands at the very
end of the dump, if possible.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add PQconninfoParseParams and PQconninfodefaultsMerge to libpq

2013-02-18 Thread Amit Kapila
On Monday, February 18, 2013 1:41 PM Heikki Linnakangas wrote:
 On 18.02.2013 06:07, Amit Kapila wrote:
  On Sunday, February 17, 2013 8:44 PM Phil Sorber wrote:
  On Sun, Feb 17, 2013 at 1:35 AM, Amit kapilaamit.kap...@huawei.com
  wrote:
  Now the patch of Phil Sorber provides 2 new API's
  PQconninfoParseParams(), and PQconninfodefaultsMerge(),
  using these API's I can think of below way for patch "pass a
  connection string to pg_basebackup", ...
 
  1. Call existing function PQconninfoParse() with connection string
  input by user and get PQconninfoOption.
 
  2. Now use the existing keywords (individual options specified by
  user) and extract the keywords from
  PQconninfoOption structure and call new API
  PQconninfoParseParams() which will return PQconninfoOption.
  The PQconninfoOption structure returned in this step will
 contain
  all keywords
 
  3. Call PQconninfodefaultsMerge() to merge any default values if
  exist. Not sure if this step is required?
 
  4. Extract individual keywords from PQconninfoOption structure and
  call PQconnectdbParams.
 
  Is this in line with what you have in mind, or have you thought of some
  other simpler way of using the new API's?
 
 Yep, that's roughly what I had in mind. I don't think it's necessary to
 merge defaults in step 3, but it needs to add the replication=true
 and
 dbname=replication options.

The advantage I can see of calling PQconninfoParseParams() in step 2 is
that it will remove duplicate values by overriding the values of
conflicting keywords. This is done in the function
conninfo_array_parse(), which is called from PQconninfoParseParams().
Am I right, or is there some other advantage of calling
PQconninfoParseParams()?

If there is no other advantage, then since this is also done in
PQconnectdbParams(), can't we avoid calling PQconninfoParseParams()?
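
(A tiny standalone illustration of that override behavior, taking the above description of conninfo_array_parse() at face value; the host names are placeholders:)

#include <stdio.h>

#include <libpq-fe.h>

int
main(void)
{
	/* "host" appears twice; per the description above, the later value wins */
	const char *keywords[] = {"host", "dbname", "host", NULL};
	const char *values[] = {"first.example.com", "postgres", "second.example.com", NULL};
	PGconn	   *conn = PQconnectdbParams(keywords, values, 0);

	printf("host actually used: %s\n", PQhost(conn));
	PQfinish(conn);
	return 0;
}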

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup with -R option and start standby have problems with escaped password

2013-02-18 Thread Boszormenyi Zoltan

On 2013-01-29 11:15, Magnus Hagander wrote:

On Thu, Jan 24, 2013 at 7:04 AM, Hari Babu haribabu.ko...@huawei.com wrote:

On Wed, Jan 23, 2013 11:48 PM, Magnus Hagander wrote:

On Wed, Jan 23, 2013 at 10:18 AM, Hari Babu haribabu.ko...@huawei.com wrote:

Test scenario to reproduce:
 1. Start the server
 2. Create the user as follows:
./psql postgres -c "create user user1 superuser login password 'use''1'"
 3. Take the backup with the -R option as follows:
./pg_basebackup -D ../../data1 -R -U user1 -W

The following error occurs when the new standby starts on the backup
database:

FATAL:  could not connect to the primary server: missing "=" after "1'" in
connection info string

What does the resulting recovery.conf file look like?

The recovery.conf which is generated is as follows:

standby_mode = 'on'
primary_conninfo = 'user=''user1'' password=''use''1'' port=''5432'' '


I observed that the problem is in reading primary_conninfo from the
recovery.conf file: the function GUC_scanstr removes the quotes from the
string and also converts each doubled single quote ('') into a single
quote (').

When the same connection string is then used to connect to the primary
server, the function conninfo_parse cannot parse the escaped quotes
properly, which leads to the problem.

Please correct me if anything is wrong in my observation.

Well, it's clearly broken at least :O

Zoltan, do you have time to look at it? I won't have time until at
least after FOSDEM, unfortunately.


I looked at it briefly. What I tried first was adding another pair of
single quotes manually, like this:

primary_conninfo = 'user=''user1'' password=''use1'' host=''192.168.1.2'' port=''5432'' sslmode=''disable'' sslcompression=''1'' '


But it doesn't solve the problem either; I got:

FATAL:  could not connect to the primary server: missing "=" after "'1'" in
connection info string


This worked though:

primary_conninfo = 'user=user1 password=use\'1 host=192.168.1.2 port=5432 sslmode=disable sslcompression=1 '


When I added an elog() to print the conninfo string in libpqrcv_connect(),
I saw that the double quotes were properly eliminated by ParseConfigFp()
in the first case.

So, there is a bug in generating recovery.conf, which fails to
double-escape the values, and another bug in libpq's connection string
parsing when a parameter value starts with a single-quote character.

Attached are two patches to fix these two bugs; the libpq part can
be back-patched.

Best regards,
Zoltán Böszörményi

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index eea9c6b..6528f77 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -4216,9 +4216,12 @@ conninfo_parse(const char *conninfo, PQExpBuffer errorMessage,
 }
 if (*cp == '\'')
 {
-	*cp2 = '\0';
 	cp++;
-	break;
+	if (*cp != '\'')
+	{
+		*cp2 = '\0';
+		break;
+	}
 }
 *cp2++ = *cp++;
 			}
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fb5a1bd..1c2ef9a 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1127,6 +1127,7 @@ escape_quotes(const char *src)
 static void
 GenerateRecoveryConf(PGconn *conn)
 {
+	PQExpBufferData		arg_buf;
 	PQconninfoOption *connOptions;
 	PQconninfoOption *option;
 
@@ -1137,6 +1138,13 @@ GenerateRecoveryConf(PGconn *conn)
 		disconnect_and_exit(1);
 	}
 
+	initPQExpBuffer(&arg_buf);
+	if (PQExpBufferDataBroken(arg_buf))
+	{
 		fprintf(stderr, _("%s: out of memory"), progname);
+		disconnect_and_exit(1);
+	}
+
 	connOptions = PQconninfo(conn);
 	if (connOptions == NULL)
 	{
@@ -1150,6 +1158,7 @@ GenerateRecoveryConf(PGconn *conn)
 	for (option = connOptions; option && option->keyword; option++)
 	{
 		char	   *escaped;
+		char	   *escaped2;
 
 		/*
 		 * Do not emit this setting if: - the setting is replication,
@@ -1169,10 +1178,13 @@ GenerateRecoveryConf(PGconn *conn)
 		 * necessary and doubled single quotes around the value string.
 		 */
 		escaped = escape_quotes(option->val);
+		printfPQExpBuffer(&arg_buf, "%s='%s'", option->keyword, escaped);
 
-		appendPQExpBuffer(recoveryconfcontents, "%s=''%s'' ", option->keyword, escaped);
+		escaped2 = escape_quotes(arg_buf.data);
+		appendPQExpBuffer(recoveryconfcontents, "%s ", escaped2);
 
 		free(escaped);
+		free(escaped2);
 	}
 
 	appendPQExpBufferStr(recoveryconfcontents, "'\n");
@@ -1183,6 +1195,7 @@ GenerateRecoveryConf(PGconn *conn)
 	}
 
 	PQconninfoFree(connOptions);
+	termPQExpBuffer(&arg_buf);
 }
 
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Materialized views WIP patch

2013-02-18 Thread Kevin Grittner
Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Kevin Grittner wrote:

 I'm OK with that approach, and in the absence of anyone pushing
 for another direction, will make that change to pg_dump.  I'm
 thinking I would only do this for materialized views which were
 not scannable, but which cause REFRESH failures on other
 materialized views if not refreshed first (recursively
 evaluated), rather than just automatically refreshing all MVs on
 restore.  The reason this seems important is that some MVs may
 take a long time to refresh, and a user might want a
 dump/restore to get to a point where they can use the rest of
 the database while building the contents of some MVs in the
 background or during off hours.

 Maybe it would be a good idea to try to put such commands at the
 very end of the dump, if possible.

Here is the dump order as currently implemented in that patch.  MVs
are created at the same priority as tables and views.  MV REFRESH
and MV index builds obviously need to follow population of table
data. These are at the same priority because it makes the most
sense to populate an MV without any indexes and then build them
before the MV is used to populate some other MV.  Dependency
information is used to get that to sort properly within the
priority level.

    1,      /* DO_NAMESPACE */
    2,      /* DO_PROCLANG */
    3,      /* DO_COLLATION */
    4,      /* DO_EXTENSION */
    5,      /* DO_TYPE */
    5,      /* DO_SHELL_TYPE */
    6,      /* DO_FUNC */
    7,      /* DO_AGG */
    8,      /* DO_OPERATOR */
    9,      /* DO_OPCLASS */
    9,      /* DO_OPFAMILY */
    10,     /* DO_CAST */
    11,     /* DO_CONVERSION */
    12,     /* DO_TSPARSER */
    13,     /* DO_TSTEMPLATE */
    14,     /* DO_TSDICT */
    15,     /* DO_TSCONFIG */
    16,     /* DO_FDW */
    17,     /* DO_FOREIGN_SERVER */
    18,     /* DO_TABLE */
    19,     /* DO_DUMMY_TYPE */
    20,     /* DO_ATTRDEF */
    21,     /* DO_BLOB */
    22,     /* DO_PRE_DATA_BOUNDARY */
    23,     /* DO_TABLE_DATA */
    24,     /* DO_BLOB_DATA */
    25,     /* DO_POST_DATA_BOUNDARY */
    26,     /* DO_CONSTRAINT */
    27,     /* DO_INDEX */
    28,     /* DO_REFRESH_MATVIEW */
    28,     /* DO_MATVIEW_INDEX */
    29,     /* DO_RULE */
    30,     /* DO_TRIGGER */
    31,     /* DO_FK_CONSTRAINT */
    32,     /* DO_DEFAULT_ACL */
    33,     /* DO_EVENT_TRIGGER */

I don't think that pushing MV refreshes and index creation farther
down the list should require anything beyond adjusting the priority
numbers.  I don't see a problem pushing them to the end.  Does
anyone else see anything past priority 28 that MV population should
*not* follow?

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Materialized views WIP patch

2013-02-18 Thread Kevin Grittner
Thom Brown t...@linux.com wrote:
 On 16 February 2013 01:01, Kevin Grittner kgri...@ymail.com wrote:
 Unless something else comes up in review or I get feedback to
 the contrary I plan to deal with the above-mentioned issues and
 commit this within a week or two.

 At the moment it's not possible to rename a column without using
 ALTER TABLE on an MV.

 Also, shouldn't we have the ability to set the collation on a
 column of the MV?

Will fix.

 And the inconsistency between regular views and MVs is still
 present, where MVs always dump with column parameters in their
 definition, and views never do.  Not a show-stopper, but curious
 nonetheless.

I haven't worried about this because current behavior generates
correct results -- this seems like a micro-optimization.  The
explanation for why it wound up that way is that creating a
materialized view is in many ways more like creating a table than
like creating a view -- it seemed safer and less invasive to modify
the CREATE TABLE code than the CREATE VIEW code, and specifying
column names just fell out of that as part of the minimal change. 
In looking at the pg_dump output, though, I see that the CMV AS
clause also is getting the names right with column-level aliases,
so it should be pretty simple and safe to leave off the
column-list section for MVs.  I guess it's worth it just to
forestall further questions on the topic.

Thanks!

-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON Function Bike Shedding

2013-02-18 Thread Merlin Moncure
On Fri, Feb 15, 2013 at 11:25 AM, Robert Haas robertmh...@gmail.com wrote:
 Note that I have given json_get() and json_get_path() the same names, as it 
 seems to me that the former is the same as the latter, with only one 
 parameter. Same for json_get_as_text() and json_get_path_as_text().

 I realize I'm in the minority here, but -1 from me on all of this.
 Should we also rename xml_is_well_formed() to just is_well_formed()?
 string_agg() to agg()?  Eventually we will have more data types, and
 some of them will have functions that could also be called rows() or
 get_values(), but it's unlikely that they'll have exactly the same
 behavior, which will start to make things confusing.

It's a little late, but I'd like to rebut this point:
 string_agg() to agg()?

This is not germane to the discussion. string_agg means you are
aggregating *to* a string, not from one, which is a completely
different thing.  This also applies to to_char, to_date, etc.  If you
wanted to do just 'agg()', you'd have to supply the output type somehow --
the only way to do that in postgres is to use the hstore null::foo trick
(which is not an improvement, obviously).

  xml_is_well_formed() to just is_well_formed()?

Again, this is not the same thing. It does not work on the xml type,
but on text, so you'd have to supply a hint to the specific behavior if
you wanted to abstract the type out of the function.  Because the
returned type is unambiguously boolean though, you can get away with:

validate(format text, data text);

select validate('json', <json string>);
select validate('xml', <xml string>);
etc.

if you wanted to.  And yes, I absolutely think this is superior to
cluttering the public namespace with xml-specific verbiage, and it could
be extended to other formats.  Look at it the other way: we currently
have encode(format text, stuff bytea).  Would we be better off with
hex_encode(bytea), escape_encode(bytea), etc.?

The argument for removing json_ prefix is that when function behaviors
are unambiguously controlled by the arguments, decorating the function
name to match the input argument is unnecessary verbosity.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: PATCH: Split stats file per database WAS: [HACKERS] autovacuum stress-testing our system

2013-02-18 Thread Alvaro Herrera
Tomas Vondra wrote:

 So, here's v10 of the patch (based on the v9+v9a), that implements the
 approach described above.
 
 It turned out to be much easier than I expected (basically just a
 rewrite of the pgstat_read_db_statsfile_timestamp() function).

Thanks.  I'm giving this another look now.  I think the new code means
we no longer need the first_write logic; just let the collector idle
until we get the first request.  (If for some reason we considered that
we should really be publishing initial stats as early as possible, we
could just do a write_statsfiles(allDbs) call before entering the main
loop.  But I don't see any reason to do this.  If you do, please speak
up.)

Also, it seems to me that the new pgstat_db_requested() logic is
slightly bogus (in the inefficient sense, not the incorrect sense):
we should be comparing the timestamp of the request vs.  what's already
on disk instead of blindly returning true if the list is nonempty.  If
the request is older than the file, we don't need to write anything and
can discard the request.  For example, suppose that backend A sends a
request for a DB; we write the file.  If then quickly backend B also
requests stats for the same DB, with the current logic we'd go write the
file, but perhaps backend B would be fine with the file we already
wrote.
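
(A hedged sketch of that check, with stand-in types; the DBWriteRequest
struct itself appears in the diff below, and any committed version may
differ:)

#include <stdbool.h>
#include <stdint.h>

typedef unsigned int Oid;		/* stand-in for the real typedef */
typedef int64_t TimestampTz;	/* stand-in for the real typedef */

typedef struct DBWriteRequest
{
	Oid			databaseid;
	TimestampTz request_time;
	struct DBWriteRequest *next;	/* simplified; the patch uses an slist */
} DBWriteRequest;

/*
 * Return true only if some pending request for this database is newer
 * than the stats file already on disk; otherwise the file written for
 * an earlier request is good enough and the write can be skipped.
 */
static bool
pgstat_db_write_needed(DBWriteRequest *requests, Oid dbid,
					   TimestampTz file_written_at)
{
	DBWriteRequest *req;

	for (req = requests; req != NULL; req = req->next)
	{
		if (req->databaseid == dbid &&
			req->request_time > file_written_at)
			return true;
	}
	return false;
}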

Another point is that I think there's a better way to handle nonexistent
files, instead of having to read the global file and all the DB records
to find the one we want.  Just try to read the database file, and only
if that fails try to read the global file and compare the timestamp (so
there might be two timestamps for each DB, one in the global file and
one in the DB-specific file.  I don't think this is a problem).  The
point is to avoid having to read the global file if possible.

So here's v11.  I intend to commit this shortly.  (I wanted to get it
out before lunch, but I introduced a silly bug that took me a bit to
fix.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
*** a/src/backend/postmaster/pgstat.c
--- b/src/backend/postmaster/pgstat.c
***
*** 38,43 
--- 38,44 
  #include "access/xact.h"
  #include "catalog/pg_database.h"
  #include "catalog/pg_proc.h"
+ #include "lib/ilist.h"
  #include "libpq/ip.h"
  #include "libpq/libpq.h"
  #include "libpq/pqsignal.h"
***
*** 66,73 
   * Paths for the statistics files (relative to installation's $PGDATA).
   * --
   */
! #define PGSTAT_STAT_PERMANENT_FILENAME		"global/pgstat.stat"
! #define PGSTAT_STAT_PERMANENT_TMPFILE		"global/pgstat.tmp"
  
  /* --
   * Timer definitions.
--- 67,75 
   * Paths for the statistics files (relative to installation's $PGDATA).
   * --
   */
! #define PGSTAT_STAT_PERMANENT_DIRECTORY		"pg_stat"
! #define PGSTAT_STAT_PERMANENT_FILENAME		"pg_stat/global.stat"
! #define PGSTAT_STAT_PERMANENT_TMPFILE		"pg_stat/global.tmp"
  
  /* --
   * Timer definitions.
***
*** 115,120  int			pgstat_track_activity_query_size = 1024;
--- 117,123 
   * Built from GUC parameter
   * --
   */
+ char	   *pgstat_stat_directory = NULL;
  char	   *pgstat_stat_filename = NULL;
  char	   *pgstat_stat_tmpname = NULL;
  
***
*** 219,229  static int	localNumBackends = 0;
   */
  static PgStat_GlobalStats globalStats;
  
! /* Last time the collector successfully wrote the stats file */
! static TimestampTz last_statwrite;
  
! /* Latest statistics request time from backends */
! static TimestampTz last_statrequest;
  
  static volatile bool need_exit = false;
  static volatile bool got_SIGHUP = false;
--- 222,237 
   */
  static PgStat_GlobalStats globalStats;
  
! /* Write request info for each database */
! typedef struct DBWriteRequest
! {
! 	Oid			databaseid;		/* OID of the database to write */
! 	TimestampTz request_time;	/* timestamp of the last write request */
! 	slist_node	next;
! } DBWriteRequest;
  
! /* Latest statistics request times from backends */
! static slist_head	last_statrequests = SLIST_STATIC_INIT(last_statrequests);
  
  static volatile bool need_exit = false;
  static volatile bool got_SIGHUP = false;
***
*** 252,262  static void pgstat_sighup_handler(SIGNAL_ARGS);
  static PgStat_StatDBEntry *pgstat_get_db_entry(Oid databaseid, bool create);
  static PgStat_StatTabEntry *pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry,
  	 Oid tableoid, bool create);
! static void pgstat_write_statsfile(bool permanent);
! static HTAB *pgstat_read_statsfile(Oid onlydb, bool permanent);
  static void backend_read_statsfile(void);
  static void pgstat_read_current_status(void);
  
  static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
  static void pgstat_send_funcstats(void);
  static HTAB *pgstat_collect_oids(Oid catalogid);
--- 260,275 
  static PgStat_StatDBEntry *pgstat_get_db_entry(Oid databaseid, bool create);
  static PgStat_StatTabEntry 

Re: [HACKERS] 9.2.3 crashes during archive recovery

2013-02-18 Thread Heikki Linnakangas

On 16.02.2013 10:40, Ants Aasma wrote:

On Fri, Feb 15, 2013 at 3:49 PM, Heikki Linnakangas
hlinnakan...@vmware.com  wrote:

While this solution would help solve my issue, it assumes that the
correct amount of WAL files are actually there. Currently the docs for
setting up a standby refer to "24.3.4. Recovering Using a Continuous
Archive Backup", and that step recommends emptying the contents of
pg_xlog. If this is chosen as the solution the docs should be adjusted
to recommend using pg_basebackup -x for setting up the standby.


When the backup is taken using pg_start_backup or pg_basebackup,
minRecoveryPoint is set correctly anyway, and it's OK to clear out pg_xlog.


How is minRecoveryPoint supposed to get set?


I was slightly imprecise. minRecoveryPoint is not set at backup, but 
backupStartPoint is. minRecoveryPoint is set later, once the 
end-of-backup record is seen. A valid backupStartPoint has the same 
effect as minRecoveryPoint: the system is considered inconsistent until 
the end-of-backup record is seen.



I just tried taking a
pg_basebackup on master running pgbench. The resulting data dir
controlfile had this:

Min recovery ending location: 0/0

The end location was not in the backup_label either.

Looking at basebackup.c the process seems to be:
1. pg_start_backup
2. copy out backup_label
3. copy out rest of the datadir
4. copy out control file
5. pg_stop_backup
6. copy out WAL
7. send backup stop location

And pg_basebackup.c only uses the stop location to decide how much WAL to fetch.

I don't see anything here that could correctly communicate min
recovery point. Maybe I'm missing something.


backupStartPoint is set, which signals recovery to wait for an 
end-of-backup record, until the system is considered consistent. If the 
backup is taken from a hot standby, backupEndPoint is set, instead of 
inserting an end-of-backup record.
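
(A hedged sketch of that rule with stand-in types; the real logic lives
in xlog.c's CheckRecoveryConsistency() and is more involved:)

#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;	/* stand-in for the real typedef */

/*
 * The system becomes consistent once replay has passed minRecoveryPoint,
 * but a valid backupStartPoint overrides that: recovery stays
 * inconsistent until the end-of-backup record (or, for backups taken
 * from a hot standby, the recorded backupEndPoint) has been reached.
 */
static bool
recovery_is_consistent(XLogRecPtr replayed,
					   XLogRecPtr minRecoveryPoint,
					   bool backupStartPointValid,
					   bool sawEndOfBackup)
{
	if (backupStartPointValid && !sawEndOfBackup)
		return false;
	return replayed >= minRecoveryPoint;
}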


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: PATCH: Split stats file per database WAS: [HACKERS] autovacuum stress-testing our system

2013-02-18 Thread Tomas Vondra
On 18.2.2013 16:50, Alvaro Herrera wrote:
 Tomas Vondra wrote:
 
 So, here's v10 of the patch (based on the v9+v9a), that implements the
 approach described above.

 It turned out to be much easier than I expected (basically just a
 rewrite of the pgstat_read_db_statsfile_timestamp() function).
 
 Thanks.  I'm giving this another look now.  I think the new code means
 we no longer need the first_write logic; just let the collector idle
 until we get the first request.  (If for some reason we considered that
 we should really be publishing initial stats as early as possible, we
 could just do a write_statsfiles(allDbs) call before entering the main
 loop.  But I don't see any reason to do this.  If you do, please speak
 up.)
 
 Also, it seems to me that the new pgstat_db_requested() logic is
 slightly bogus (in the inefficient sense, not the incorrect sense):
 we should be comparing the timestamp of the request vs.  what's already
 on disk instead of blindly returning true if the list is nonempty.  If
 the request is older than the file, we don't need to write anything and
 can discard the request.  For example, suppose that backend A sends a
 request for a DB; we write the file.  If then quickly backend B also
 requests stats for the same DB, with the current logic we'd go write the
 file, but perhaps backend B would be fine with the file we already
 wrote.

Hmmm, you're probably right.

  Another point is that I think there's a better way to handle nonexistent
 files, instead of having to read the global file and all the DB records
 to find the one we want.  Just try to read the database file, and only
 if that fails try to read the global file and compare the timestamp (so
 there might be two timestamps for each DB, one in the global file and
 one in the DB-specific file.  I don't think this is a problem).  The
  point is to avoid having to read the global file if possible.

I don't think that's a good idea. Keeping the timestamps in one place is
a significant simplification, and I don't think it's worth the
additional complexity. And the overhead is minimal.

So my vote on this change is -1.

 
 So here's v11.  I intend to commit this shortly.  (I wanted to get it
 out before lunch, but I introduced a silly bug that took me a bit to
 fix.)

;-)

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Materialized views WIP patch

2013-02-18 Thread Noah Misch
On Mon, Feb 18, 2013 at 06:49:14AM -0800, Kevin Grittner wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com wrote:
  Maybe it would be a good idea to try to put such commands at the
  very end of the dump, if possible.

     25,     /* DO_POST_DATA_BOUNDARY */
     26,     /* DO_CONSTRAINT */
     27,     /* DO_INDEX */
     28,     /* DO_REFRESH_MATVIEW */
     28,     /* DO_MATVIEW_INDEX */
     29,     /* DO_RULE */
     30,     /* DO_TRIGGER */
     31,     /* DO_FK_CONSTRAINT */
     32,     /* DO_DEFAULT_ACL */
     33,     /* DO_EVENT_TRIGGER */
 
 I don't think that pushing MV refreshes and index creation farther
 down the list should require anything beyond adjusting the priority
 numbers.  I don't see a problem pushing them to the end.  Does
 anyone else see anything past priority 28 that MV population should
 *not* follow?

DO_EVENT_TRIGGER should remain last; it may change the behavior of nearly any
other command.

Moving DO_REFRESH_MATVIEW past DO_TRIGGER would affect the outcome when the MV
calls functions that ultimately trip triggers or rules.  Currently, the
behavior will be the same as for CHECK constraints: the rules and triggers
don't exist yet.  This may also affect, for the better, MVs referencing views
that need the CREATE TABLE ... CREATE RULE _RETURN restoration pathway.  It
looks like a positive change.  On the flip side, I wonder if there's some case
I'm not considering where it's important to delay restoring rules and/or
triggers until after restoring objects for which restoration can entail calls
to arbitrary user functions.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: PATCH: Split stats file per database WAS: [HACKERS] autovacuum stress-testing our system

2013-02-18 Thread Alvaro Herrera
Tomas Vondra wrote:
 On 18.2.2013 16:50, Alvaro Herrera wrote:

  Also, it seems to me that the new pgstat_db_requested() logic is
  slightly bogus (in the inefficient sense, not the incorrect sense):
  we should be comparing the timestamp of the request vs.  what's already
  on disk instead of blindly returning true if the list is nonempty.  If
  the request is older than the file, we don't need to write anything and
  can discard the request.  For example, suppose that backend A sends a
  request for a DB; we write the file.  If then quickly backend B also
  requests stats for the same DB, with the current logic we'd go write the
  file, but perhaps backend B would be fine with the file we already
  wrote.
 
 Hmmm, you're probably right.

I left it as is for now; I think it warrants revisiting.

  Another point is that I think there's a better way to handle nonexistent
  files, instead of having to read the global file and all the DB records
  to find the one we want.  Just try to read the database file, and only
  if that fails try to read the global file and compare the timestamp (so
  there might be two timestamps for each DB, one in the global file and
  one in the DB-specific file.  I don't think this is a problem).  The
  point is avoid having to read the global file if possible.
 
 I don't think that's a good idea. Keeping the timestamps at one place is
 a significant simplification, and I don't think it's worth the
 additional complexity. And the overhead is minimal.
 
 So my vote on this change is -1.

Fair enough.

I have pushed it now.  Further testing, of course, is always welcome.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [pgsql-advocacy] Call for Google Summer of Code mentors, admins

2013-02-18 Thread Andres Freund
On 2013-02-14 10:02:13 -0800, Josh Berkus wrote:
 - Please suggest project ideas for GSOC

pg_upgrade support for Debian's pg_upgradecluster

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [pgsql-advocacy] Call for Google Summer of Code mentors, admins

2013-02-18 Thread Magnus Hagander
On Thu, Feb 14, 2013 at 7:02 PM, Josh Berkus j...@agliodbs.com wrote:
 Folks,

 Once again, Google is holding Summer of Code.  We need to assess whether
 we want to participate this year.

 Questions:

 - Who wants to mentor for GSOC?

Sign me up on that list. Depending on projects, of course.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Materialized views WIP patch

2013-02-18 Thread Kevin Grittner
Noah Misch n...@leadboat.com wrote:
 On Mon, Feb 18, 2013 at 06:49:14AM -0800, Kevin Grittner wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Maybe it would be a good idea to try to put such commands at
 the very end of the dump, if possible.

 25,     /* DO_POST_DATA_BOUNDARY */
 26,     /* DO_CONSTRAINT */
 27,     /* DO_INDEX */
 28,     /* DO_REFRESH_MATVIEW */
 28,     /* DO_MATVIEW_INDEX */
 29,     /* DO_RULE */
 30,     /* DO_TRIGGER */
 31,     /* DO_FK_CONSTRAINT */
 32,     /* DO_DEFAULT_ACL */
 33,     /* DO_EVENT_TRIGGER */

 I don't think that pushing MV refreshes and index creation
 farther down the list should require anything beyond adjusting
 the priority numbers.  I don't see a problem pushing them to the
 end.  Does anyone else see anything past priority 28 that MV
 population should *not* follow?

 DO_EVENT_TRIGGER should remain last; it may change the behavior
 of nearly any other command.

 Moving DO_REFRESH_MATVIEW past DO_TRIGGER would affect the
 outcome when the MV calls functions that ultimately trip triggers
 or rules. Currently, the behavior will be the same as for CHECK
 constraints: the rules and triggers don't exist yet.  This may
 also affect, for the better, MVs referencing views that need the
 CREATE TABLE ... CREATE RULE _RETURN restoration pathway.  It
 looks like a positive change.  On the flip side, I wonder if
 there's some case I'm not considering where it's important to
 delay restoring rules and/or triggers until after restoring
 objects for which restoration can entail calls to arbitrary user
 functions.

I didn't quite follow all of Noah's points or their implications,
so we chatted off-list.  He made a couple additional observations
which allow some simplification of the patch, and allow MV REFRESH
to be moved to the very end of the priority list without ill
effect.

(1)  While it might be incorrect for the CREATE INDEX on a
materialized view to come after event triggers are set up, REFRESH
can be expected to be a routine action in the presence of such
triggers, and it might actually be incorrect to REFRESH when the
triggers are not present.

(2)  REFRESH MATERIALIZED VIEW creates and builds a new heap, and
reindexes it after the data has been loaded, so the timing of the
CREATE INDEX statements on MVs is not critical, as long as they are
done after the CREATE and before the REFRESH.  We could drop them
into the same priority as the other CREATE INDEX statements, and it
would not be a big deal because the MVs would be empty.

This should allow me to simplify the code a little bit and move the
RMV step to the very end.  That may have some advantages when users
want to start using the database while MVs are being populated.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgsql: Clean up c.h / postgres.h after Assert() move

2013-02-18 Thread Alvaro Herrera
Jeff Janes wrote:
 commit 381d4b70a9854a7b5b9f12d828a0824f8564f1e7 introduced some
 compiler warnings:
 
 assert.c:26: warning: no previous prototype for 'ExceptionalCondition'
 elog.c: In function 'pg_re_throw':
 elog.c:1628: warning: implicit declaration of function 'ExceptionalCondition'
 elog.c:1630: warning: 'noreturn' function does return

Fixed, thanks for the report.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: PATCH: Split stats file per database WAS: [HACKERS] autovacuum stress-testing our system

2013-02-18 Thread Alvaro Herrera
Alvaro Herrera wrote:

 I have pushed it now.  Further testing, of course, is always welcome.

Mastodon failed:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mastodon&dt=2013-02-19%2000%3A00%3A01

probably worth investigating a bit; we might have broken something.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] unaccent module - two params function should be immutable

2013-02-18 Thread Pavel Stehule
Hello

There was a proposal to change the flag of this function to IMMUTABLE, so
that it can be used in indexes:

CREATE FUNCTION unaccent(regdictionary, text)
RETURNS text
AS 'MODULE_PATHNAME', 'unaccent_dict'
LANGUAGE C STABLE STRICT;


Is there any progress?

Regards

Pavel Stehule


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers