Re: [HACKERS] Function to know last log write timestamp
On 2014-08-11 12:42:06 +0900, Fujii Masao wrote: On Mon, Aug 11, 2014 at 10:48 AM, Tatsuo Ishii is...@postgresql.org wrote: On Mon, Aug 11, 2014 at 9:23 AM, Tatsuo Ishii is...@postgresql.org wrote: We can know the LSN of the last committed WAL record on the primary by using pg_current_xlog_location(), but it seems there's no API to know the time when the WAL record was created. I would like to know the standby delay by using pg_last_xact_replay_timestamp() together with such an API. If there's no such API, it would be useful to invent one, IMO. +1 I proposed that function before, but unfortunately it failed to be applied. But I still think that function is useful for calculating the replication delay. The past discussion is http://www.postgresql.org/message-id/cahgqgwf3zjfunej5ka683ku5rqubtswtqfq7g1x0g34o+jx...@mail.gmail.com I looked into the thread briefly and found Simon and Robert gave -1 for this because of a performance concern. I'm not sure if it's an actual performance penalty or not. Maybe we need to measure the penalty? I think that the performance penalty is negligibly small because the patch I posted before added only three stores to shared memory per commit/abort. Uh. It adds another atomic operation (the spinlock) to the commit path. That's surely *not* insignificant. At the very least the concurrency approach needs to be rethought. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] nulls in GIN index
On 08/11/2014 01:19 AM, worthy7 wrote: http://www.postgresql.org/docs/9.1/static/gin-implementation.html As of PostgreSQL 9.1, NULL key values can be included in the index. Also, placeholder NULLs are included in the index for indexed items that are NULL or contain no keys according to extractValue. This allows searches that should find empty items to do so. How do I define an index that includes nulls then? You don't need to do anything special; any NULL values will be indexed automatically. - Heikki
Re: [HACKERS] Function to know last log write timestamp
On Mon, Aug 11, 2014 at 3:54 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-08-11 12:42:06 +0900, Fujii Masao wrote: On Mon, Aug 11, 2014 at 10:48 AM, Tatsuo Ishii is...@postgresql.org wrote: On Mon, Aug 11, 2014 at 9:23 AM, Tatsuo Ishii is...@postgresql.org wrote: We can know the LSN of the last committed WAL record on the primary by using pg_current_xlog_location(), but it seems there's no API to know the time when the WAL record was created. I would like to know the standby delay by using pg_last_xact_replay_timestamp() together with such an API. If there's no such API, it would be useful to invent one, IMO. +1 I proposed that function before, but unfortunately it failed to be applied. But I still think that function is useful for calculating the replication delay. The past discussion is http://www.postgresql.org/message-id/cahgqgwf3zjfunej5ka683ku5rqubtswtqfq7g1x0g34o+jx...@mail.gmail.com I looked into the thread briefly and found Simon and Robert gave -1 for this because of a performance concern. I'm not sure if it's an actual performance penalty or not. Maybe we need to measure the penalty? I think that the performance penalty is negligibly small because the patch I posted before added only three stores to shared memory per commit/abort. Uh. It adds another atomic operation (the spinlock) to the commit path. That's surely *not* insignificant. At the very least the concurrency approach needs to be rethought. No, the patch doesn't add a spinlock at all. What the commit path additionally does is: (1) increment the counter in shared memory, (2) set the timestamp of the last commit record in shared memory, (3) increment the counter in shared memory again. There is no extra spinlock. OTOH, when pg_last_xact_insert_timestamp reads the timestamp from the shared memory, it checks whether the counter values are the same or not before and after reading the timestamp. If they are not the same, it tries to read the timestamp again. 
This logic is necessary for reading a consistent timestamp value there. Regards, -- Fujii Masao
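The counter-bracketing protocol described above (often called a seqlock) can be sketched in C roughly as follows. The struct and function names are illustrative, not the actual pgstat code:

```c
#include <stdint.h>

/* Hypothetical shared-memory slot; the field names are made up. */
typedef struct
{
	volatile uint32_t changecount;		/* bumped before and after each write */
	volatile int64_t  last_commit_ts;	/* timestamp of last commit record */
} CommitTsSlot;

/*
 * Writer (commit path): two counter increments bracket the store.
 * While a write is in progress the counter is odd; a reader that sees
 * the same even value before and after its read got a consistent value.
 * On weakly ordered hardware write memory barriers would be needed
 * between these statements.
 */
static void
slot_write(CommitTsSlot *slot, int64_t ts)
{
	slot->changecount++;		/* now odd: write in progress */
	slot->last_commit_ts = ts;
	slot->changecount++;		/* even again: write complete */
}

/*
 * Reader (pg_last_xact_insert_timestamp): retry until the counter is
 * stable and even across the read.
 */
static int64_t
slot_read(CommitTsSlot *slot)
{
	uint32_t	before, after;
	int64_t		ts;

	do
	{
		before = slot->changecount;
		ts = slot->last_commit_ts;
		after = slot->changecount;
	} while (before != after || (before & 1));

	return ts;
}
```

The commit path thus pays only three plain stores to an already-local cacheline, with no lock acquisition; all retry cost is pushed onto the (rare) reader.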
Re: [HACKERS] Support for N synchronous standby servers
On Mon, Aug 11, 2014 at 2:10 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Mon, Aug 11, 2014 at 1:26 PM, Fujii Masao masao.fu...@gmail.com wrote: Thanks for updating the patch! Again I tested the feature and found something wrong. I set synchronous_standby_num to 2 and started three standbys. Two of them are included in synchronous_standby_names, i.e., they are synchronous standbys. That is, the other standby is always asynchronous. When I shut down one of the synchronous standbys and executed a write transaction, the transaction was successfully completed. So the transaction doesn't wait for two sync standbys in that case. Probably this is a bug. Well, that's working in my case :) Oh, that worked on my machine, too, this time... I did something wrong. Sorry for the noise. Regards, -- Fujii Masao
Re: [HACKERS] Function to know last log write timestamp
On 2014-08-11 16:20:41 +0900, Fujii Masao wrote: On Mon, Aug 11, 2014 at 3:54 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-08-11 12:42:06 +0900, Fujii Masao wrote: On Mon, Aug 11, 2014 at 10:48 AM, Tatsuo Ishii is...@postgresql.org wrote: On Mon, Aug 11, 2014 at 9:23 AM, Tatsuo Ishii is...@postgresql.org wrote: We can know the LSN of the last committed WAL record on the primary by using pg_current_xlog_location(), but it seems there's no API to know the time when the WAL record was created. I would like to know the standby delay by using pg_last_xact_replay_timestamp() together with such an API. If there's no such API, it would be useful to invent one, IMO. +1 I proposed that function before, but unfortunately it failed to be applied. But I still think that function is useful for calculating the replication delay. The past discussion is http://www.postgresql.org/message-id/cahgqgwf3zjfunej5ka683ku5rqubtswtqfq7g1x0g34o+jx...@mail.gmail.com I looked into the thread briefly and found Simon and Robert gave -1 for this because of a performance concern. I'm not sure if it's an actual performance penalty or not. Maybe we need to measure the penalty? I think that the performance penalty is negligibly small because the patch I posted before added only three stores to shared memory per commit/abort. Uh. It adds another atomic operation (the spinlock) to the commit path. That's surely *not* insignificant. At the very least the concurrency approach needs to be rethought. No, the patch doesn't add a spinlock at all. What the commit path additionally does is: (1) increment the counter in shared memory, (2) set the timestamp of the last commit record in shared memory, (3) increment the counter in shared memory again. There is no extra spinlock. Ah, I see. There's another patch somewhere down that thread (cahgqgwg4xfzjfyzabn5v__d3qpynnsgbph3nar6p40elivk...@mail.gmail.com). The patch in the message you linked to *does* use a spinlock though. 
OTOH, when pg_last_xact_insert_timestamp reads the timestamp from the shared memory, it checks whether the counter values are the same or not before and after reading the timestamp. If they are not the same, it tries to read the timestamp again. This logic is necessary for reading a consistent timestamp value there. Yea, that approach then just touches a cacheline that should already be local. I doubt that the implementation is correct on some more lenient platforms (missing write memory barrier), but that's not your fault. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Support for N synchronous standby servers
On Mon, Aug 11, 2014 at 4:26 PM, Fujii Masao masao.fu...@gmail.com wrote: On Mon, Aug 11, 2014 at 2:10 PM, Michael Paquier michael.paqu...@gmail.com wrote: Oh, that worked in my machine, too, this time... I did something wrong. Sorry for the noise. No problem, thanks for spending time testing. -- Michael
Re: [HACKERS] [REVIEW] pg_last_xact_insert_timestamp
Hi, On 2011-10-04 20:52:59 +0900, Fujii Masao wrote:

*** a/src/backend/access/transam/xact.c
--- b/src/backend/access/transam/xact.c
*** 1066,1071 **** RecordTransactionCommit(void)
--- 1066,1074 ----
  		(void) XLogInsert(RM_XACT_ID, XLOG_XACT_COMMIT_COMPACT, rdata);
  	}
+
+ 	/* Save timestamp of latest transaction commit record */
+ 	pgstat_report_xact_end_timestamp(xactStopTimestamp);
  }

Perhaps that pgstat_report() should instead be combined with the pgstat_report_xact_timestamp(0) in CommitTransaction()? Then the number of changecount increases and cacheline references would stay the same. The only thing that'd change would be a single additional assignment. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] nulls in GIN index
Perhaps I'm missing something. The table has 2 columns, text and ftstext. text: how are you ftstext: (nothing) Because how, are, and you are too common to be tsvectored. Which is fine. So if a user searches for how are you: select * from tbl_lines WHERE ftstext @@ plainto_tsquery('English', 'how are you') returns nothing. Which I somewhat understand, but I want it to return all the rows with nothing in the ftstext. plainto_tsquery('English', 'how are you') = '' and the ftstext of some rows is also = ''. So why doesn't the index return all these rows when a null string is searched? I think you can see what I'm trying to achieve; how do I do it?
Re: [HACKERS] [REVIEW] pg_last_xact_insert_timestamp
On Mon, Aug 11, 2014 at 4:46 PM, Andres Freund and...@2ndquadrant.com wrote: Hi, On 2011-10-04 20:52:59 +0900, Fujii Masao wrote: *** a/src/backend/access/transam/xact.c --- b/src/backend/access/transam/xact.c *** *** 1066,1071 RecordTransactionCommit(void) --- 1066,1074 (void) XLogInsert(RM_XACT_ID, XLOG_XACT_COMMIT_COMPACT, rdata); } + + /* Save timestamp of latest transaction commit record */ + pgstat_report_xact_end_timestamp(xactStopTimestamp); } Perhaps that pgstat_report() should instead be combined with the pgstat_report_xact_timestamp(0) in CommitTransaction()? Then the number of changecount increases and cacheline references would stay the same. The only thing that'd change would be a single additional assignment. Sounds like a good suggestion. While reading the patch again, I found it didn't handle the COMMIT/ABORT PREPARED case properly. According to commit e74e090, pg_last_xact_replay_timestamp() now returns the timestamp of COMMIT/ABORT PREPARED. pg_last_xact_insert_timestamp() is mainly expected to be used to calculate the replication delay, so it also needs to return that timestamp. But the patch didn't change the 2PC code at all. We need to add pgstat_report_xact_end_timestamp() into FinishPreparedTransaction(), RecordTransactionCommitPrepared() or RecordTransactionAbortPrepared(). Regards, -- Fujii Masao
Re: [HACKERS] psql: show only failed queries
On Sat, Aug 9, 2014 at 4:16 AM, Pavel Stehule pavel.steh...@gmail.com wrote: Hi 2014-08-08 13:58 GMT+02:00 Fujii Masao masao.fu...@gmail.com: On Fri, Aug 8, 2014 at 4:31 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2014-08-07 7:10 GMT+02:00 Fujii Masao masao.fu...@gmail.com: On Thu, Aug 7, 2014 at 6:26 AM, Pavel Stehule pavel.steh...@gmail.com wrote: Hello, updated version of the patch in attachment Thanks! But ISTM you forgot to attach the patch. grr .. I am sorry No problem. Thanks for the patch! Attached is the revised version of the patch. +/* all psql known variables are included in list by default */ +for (known_varname = known_varnames; *known_varname; known_varname++) +varnames[nvars++] = pg_strdup(*known_varname); Don't we need to append both prefix and suffix to even known variables? ??? I am not sure if I understand well - probably system read-only variables such as DBNAME, USER, VERSION should not be there I had that question because complete_from_variables() is also called by the tab-completion of \echo : and it shows a half-baked variables list. So I thought probably we need to change complete_from_variables() more to fix the problem. I understand now. I fixed it. I have a question. \echo probably should not show empty known variables. The data for autocomplete for \echo should be the same as the result of \set Agreed. I think that only the variables having set values should be displayed in the \echo : case. So I modified complete_from_variables() so that the unset variables are not shown in that case but all the variables are shown in the tab-completion of \set. +else if (strcmp(prev2_wd, \\set) == 0) +{ +if (strcmp(prev_wd, AUTOCOMMIT) == 0) ISTM that some psql variables like IGNOREEOF are not there. Why not? yes, there is no completion for DBNAME, ENCODING, FETCH_COUNT, HISTCONTROL, HISTFILE, HISTSIZE, HOST, IGNOREEOFF, PROMPT*, USER, VERSION There are more reasons: * parameter is not an enum (string, number or both): FETCH_COUNT, PROMPT, HISTSIZE, .. 
* variable is pseudo read-only - changing it has no effect: DBNAME, ENCODING, VERSION So HISTCONTROL should be there because it doesn't have such reasons at all? yes ISTM that you forgot to add HISTCONTROL to your patch. So I just added that. I added the tab-completion for the \unset command. Like \echo :, only the variables having set values should be displayed in the \unset case. perfect I changed complete_from_variables() so that it checks the memory size of the variable array even when appending the known variables. If the memory size is not enough, it's dynamically increased. Practically this check would not be required because the initial size of the array is sufficiently larger than the number of the known variables. I added this as a safe-guard. Typo: IGNOREEOFF - IGNOREEOF I removed the value none from the value list of ECHO because it's not documented and a user might get confused when he or she sees the undocumented value none. Thought? isn't it better to fix the doc? I don't know any reason why we should not support none I'm OK with this. The attached patch adds the support of the none value both in the ECHO and HISTCONTROL variables (because HISTCONTROL had the same problem as ECHO had), and also adds the description of that value into the document. I looked at the code; you removed a check against duplicate varnames in the list. Is it ok? Oh, just revived that code. Regards, -- Fujii Masao *** a/doc/src/sgml/ref/psql-ref.sgml --- b/doc/src/sgml/ref/psql-ref.sgml *** *** 2827,2833 bar they are sent to the server. The switch for this is option-e/option. If set to literalerrors/literal then only failed queries are displayed on standard error output. The switch ! for this is option-b/option. /para /listitem /varlistentry --- 2827,2835 they are sent to the server. The switch for this is option-e/option. If set to literalerrors/literal then only failed queries are displayed on standard error output. The switch ! for this is option-b/option. If unset, or if set to ! 
literalnone/literal (or any other value than those above) then ! no queries are displayed. /para /listitem /varlistentry *** *** 2892,2899 bar list. If set to a value of literalignoredups/literal, lines matching the previous history line are not entered. A value of literalignoreboth/literal combines the two options. If ! unset, or if set to any other value than those above, all lines ! read in interactive mode are saved on the history list. /para note para --- 2894,2902 list. If set to a
[HACKERS] ProcessUtilityHook DropStmt RenameStmt
Hi! I try to catch a DropStmt and convert it to a RenameStmt, like petere's pg_trashcan, but I don't want to create a new schema. I only want to rename the table in case of a DROP TABLE query. Is this possible with something like:

ProcessUtility(Node *parsetree, const char *queryString, ParamListInfo params, bool isTopLevel, DestReceiver *dest, char *completionTag)
{
    if (nodeTag(parsetree) == T_DropStmt)
    {
        DropStmt *stmt = (DropStmt *) parsetree;

        if (stmt->removeType == OBJECT_TABLE)
        {
            RenameStmt *newstmt = makeNode(RenameStmt);

            newstmt->objectType = stmt->removeType;
            newstmt->newname = new_name;
            parsetree = (Node *) newstmt;
        }
        (*prev_ProcessUtility) (parsetree, queryString, context, params, dest, completionTag);
    }
}

regards ge0has
Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL
Stephen Frost sfr...@snowman.net wrote: Our grace period for active backends after unclean exit of one of their peers is low, milliseconds to seconds. Our grace period for active backends after unclean exit of the postmaster is unconstrained. At least one of those policies has to be wrong. Like Andres and Robert, I pick the second one. Ditto for me. +1 In fact, I would say that is slightly understated. The grace period for active backends after unclean exit of one of their peers is low, milliseconds to seconds, *unless the postmaster has also crashed* -- in which case it is unconstrained. Why is the crash of a backend less serious if the postmaster has also crashed? Certainly it can't be considered surprising that, if the postmaster is crashing, other backends might also be crashing around the same time? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] nulls in GIN index
On Monday 11 August 2014 at 11:17:56, worthy7 worthy@gmail.com wrote: Perhaps I'm missing something. The table has 2 columns, text and ftstext. text: how are you ftstext: (nothing) Because how, are, and you are too common to be tsvectored. Which is fine. So if a user searches for how are you: select * from tbl_lines WHERE ftstext @@ plainto_tsquery('English', 'how are you') returns nothing. Which I somewhat understand, but I want it to return all the rows with nothing in the ftstext. plainto_tsquery('English', 'how are you') = '' and the ftstext of some rows is also = ''. So why doesn't the index return all these rows when a null string is searched? I think you can see what I'm trying to achieve; how do I do it? Use the 'simple' dictionary: my_fts_column @@ to_tsquery('simple', 'how are you') -- Andreas Joseph Krogh CTO / Partner - Visena AS Mobile: +47 909 56 963 andr...@visena.com www.visena.com
Re: [HACKERS] Hokey wrong versions of libpq in apt.postgresql.org
JD, * Joshua D. Drake (j...@commandprompt.com) wrote: The issue that I specifically ran into is that by using apt.postgresql.org in its default configuration, I can't add certain extensions (without jumping through hoops). Simple: Assume a running 9.2.9 from apt.postgresql.org apt-get install pgxnclient sudo pgxnclient install pg_repack Doesn't work. Because it is looking for libs in /var/lib/postgresql/9.2 not /var/lib/postgresql/9.3. Have you got any postgresql-server-dev package installed? If not, then pg_config is going to grab the info for the libpq-dev that's installed, but I doubt the extension is going to compile without the server-dev package installed anyway.. In any case, pgxnclient should probably be more intelligent when it's working under a Debian-based installation where multiple major versions of PG can be installed. Yes. I can get the 9.2 libpq but that isn't really the point is it? This is quite unexpected behavior from an operational perspective. It should just work but it doesn't because we are shipping from apt.postgresql.org a 9.3 version of libpq. I don't believe the 9.3 version of libpq is the issue here at all, see above.. Thanks, Stephen
Re: [HACKERS] Proposal: Incremental Backup
On Tue, Aug 5, 2014 at 8:04 PM, Simon Riggs si...@2ndquadrant.com wrote: To decide whether we need to re-copy the file, you read the file until we find a block with a later LSN. If we read the whole file without finding a later LSN then we don't need to re-copy. That means we read each file twice, which is slower, but the file is at most 1GB in size, which we can assume will be mostly in memory for the second read. That seems reasonable, although copying only the changed blocks doesn't seem like it would be a whole lot harder. Yes, you'd need a tool to copy those blocks back into the places where they need to go, but that's probably not a lot of work and the disk savings, in many cases, would be enormous. As Marco says, that can be optimized using filesystem timestamps instead. The idea of using filesystem timestamps gives me the creeps. Those aren't always very granular, and I don't know that (for example) they are crash-safe. Does every filesystem on every platform make sure that the mtime update hits the disk before the data? What about clock changes made manually by users, or automatically by ntpd? I recognize that there are people doing this today, because it's what we have, and it must not suck too much, because people are still doing it ... but I worry that if we do it this way, we'll end up with people saying PostgreSQL corrupted my data and will have no way of tracking the problem back to the filesystem or system clock event that was the true cause of the problem, so they'll just blame the database. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
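The read-until-a-newer-LSN check Simon describes could look roughly like the sketch below. It assumes, as PostgreSQL's page layout does, that each 8 KB page begins with its LSN (two 32-bit halves); the helper name and the flat uint64 LSN representation are illustrative:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLCKSZ 8192

/*
 * Scan a relation segment page by page and return 1 as soon as any
 * page carries an LSN newer than `since` (the file must be re-copied),
 * or 0 if every page is older (the file can be skipped).  The LSN is
 * read from the first 8 bytes of each page, high half then low half,
 * mirroring PageHeaderData's pd_lsn.
 */
static int
segment_needs_copy(FILE *fp, uint64_t since)
{
	char		page[BLCKSZ];

	while (fread(page, 1, BLCKSZ, fp) == BLCKSZ)
	{
		uint32_t	hi, lo;
		uint64_t	lsn;

		memcpy(&hi, page, 4);
		memcpy(&lo, page + 4, 4);
		lsn = ((uint64_t) hi << 32) | lo;

		if (lsn > since)
			return 1;		/* found a newer block: must re-copy */
	}
	return 0;				/* every block predates the backup: skip */
}
```

The early return means an actively updated segment is usually detected within its first few pages, so the "read each file twice" cost falls mainly on segments that turn out to be unchanged.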
Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL
On 2014-08-10 18:36:18 -0400, Noah Misch wrote: [Due for a new subject line?] On Sat, Aug 09, 2014 at 08:16:01PM +0200, Andres Freund wrote: On 2014-08-09 14:09:36 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-08-09 14:00:49 -0400, Tom Lane wrote: I don't think it's anywhere near as black-and-white as you guys claim. What it comes down to is whether allowing existing transactions/sessions to finish is more important than allowing new sessions to start. Depending on the application, either could be more important. Nah. The current behaviour circumvents security measures we normally consider absolutely essential. If the postmaster died some bad shit went on. The likelihood of hitting corner case bugs where it's important that we react to a segfault/panic with a restart/crash replay is rather high. What's your point? Once a new postmaster starts, it *will* do a crash restart, because certainly no shutdown checkpoint ever happened. That's not saying much. For one, there can be online checkpoints in that time. So it's certainly not guaranteed (or even all that likely) that all the WAL since the incident is replayed. For another, it can be *hours* before all the backends finish. IIRC we'll continue to happily write WAL and everything after postmaster (and possibly some backends, corrupting shmem) have crashed. The bgwriter, checkpointer, backends will continue to write dirty buffers to disk. We'll IIRC continue to write checkpoints. Those are simply not things we should be doing after the postmaster has crashed, if we can avoid it at all. The basic support processes, including the checkpointer, exit promptly upon detecting a postmaster exit. Checkpoints cease. Only after finishing an 'in process' checkpoint though afaics. And only if no new checkpoint has been requested since. The latter because we don't even test for postmaster death if a latch has been set... I think it's similar for the bgwriter and such. 
Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL
On Sat, Aug 9, 2014 at 2:00 PM, Tom Lane t...@sss.pgh.pa.us wrote: +1. I think the current behaviour is a seriously bad idea. I don't think it's anywhere near as black-and-white as you guys claim. What it comes down to is whether allowing existing transactions/sessions to finish is more important than allowing new sessions to start. Depending on the application, either could be more important. It's partly about that, and I think the answer is that being able to start new sessions is almost always more important; but it's also about the fact that the postmaster provides essential protections against data corruption, and running without those protections is a bad idea. If it's not a bad idea, then why do we need those protections ever? Why have we put so much effort into bullet-proofing them over the years? I mean, we could simply regard the unexpected end of a backend as being something that is probably OK and we'd usually be right; after all, a backend wouldn't crap out without releasing a critical spinlock very often. A lot of users would probably be very happy to be liberated from the tyranny of a server-wide restart every time a backend crashes, and 90% of the time nothing bad would happen. But clearly this is insanity, because every now and then something would go terribly wrong and there would be no automated way for the system to recover, and on even rarer occasions your data would get eaten. That is why it is right to think that the service provided by the postmaster is essential, not nice-to-have. Ideally we'd have some way to configure the behavior appropriately for a given installation; but short of that, it's unclear to me that unilaterally changing the system's bias is something our users would thank us for. I've not noticed a large groundswell of complaints about it (though this may just reflect that we've made the postmaster pretty darn robust, so that the case seldom comes up). I do think that's a large part of it. 
The postmaster doesn't get killed very often, and when it does, things are often messed up to a degree where the user's just going to reboot anyway. But I've encountered customers who managed to corrupt their database because backends didn't exit when the postmaster died, because it turns out that removing postmaster.pid defeats the shared memory interlocks that normally prevent starting a new postmaster, and the customer did that. And I've personally experienced at least one protracted outage that resulted from orphaned backends preventing 'pg_ctl restart' from working. If the postmaster weren't so reliable, I'm sure these kinds of problems would be a lot more common. But the fact that they're uncommon doesn't mean that the current behavior is the best one, and I'm convinced that it isn't. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] failures on barnacle (CLOBBER_CACHE_RECURSIVELY) because of memory leaks
Hi! So after 83 days, the regression tests on barnacle completed, and it smells like another memory leak in CacheMemoryContext, similar to those fixed in 078b2ed on May 18. Barnacle is one of those machines with -DCLOBBER_CACHE_RECURSIVELY, and the tests were running with a snapshot from May 19, so the memory leaks detected with only -DCLOBBER_CACHE_ALWAYS should be fixed there. I have a script in place looking for excessive memory usage of postgres backends - whenever a backend allocates more than 512MB of VSS, it does MemoryContextStats(TopMemoryContext) on it (from gdb). The results get logged into the postmaster log, and thus get to the buildfarm. See this: http://pgbuildfarm.org/cgi-bin/show_log.pl?nm=barnacle&dt=2014-05-19%2011%3A25%3A03 Essentially, it looks like this: TopMemoryContext: 69984 total in 10 blocks; 6216 free (24 chunks); 63768 used TopTransactionContext: 8192 total in 1 blocks; 5768 free (23 chunks); 2424 used ... Relcache by OID: 8192 total in 1 blocks; 592 free (0 chunks); 7600 used CacheMemoryContext: 301981696 total in 45 blocks; 3701744 free (140 chunks); 298279952 used ... or like this: CacheMemoryContext: 1602215936 total in 200 blocks; 497336 free (138 chunks); 1601718600 used So probably another low-probability memory leak, happening apparently only in CLOBBER_CACHE_RECURSIVELY. regards Tomas
Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations
On Fri, Aug 8, 2014 at 10:30 AM, MauMau maumau...@gmail.com wrote: I've tracked down the real root cause. The fix is very simple. Please check the attached one-liner patch. I confirmed that the fix is already in 9.3 and 9.5devel, so I just copied the code fragment from 9.5devel to 9.2.9. The attached patch is for 9.2.9. I didn't check 9.4 and other versions. Why wasn't the fix applied to 9.2? It was considered a performance improvement - i.e. a feature - rather than a bug fix, so it was only added to master. That was after the release of 9.2 and before the release of 9.3, so it's in newer branches but not older ones. Author: Heikki Linnakangas heikki.linnakan...@iki.fi Branch: master Release: REL9_3_BR [c9d7dbacd] 2013-01-29 10:43:33 +0200 Skip truncating ON COMMIT DELETE ROWS temp tables, if the transaction hasn't touched any temporary tables. We could try harder, and keep track of whether we've inserted to any temp tables, rather than accessed them, and which temp tables have been inserted to. But this is dead simple, and already covers many interesting scenarios. I'd support back-porting that commit to 9.1 and 9.2 as a fix for this problem. As the commit message says, it's dead simple. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Proposal: Incremental Backup
On Mon, Aug 11, 2014 at 12:27 PM, Robert Haas robertmh...@gmail.com wrote: As Marco says, that can be optimized using filesystem timestamps instead. The idea of using filesystem timestamps gives me the creeps. Those aren't always very granular, and I don't know that (for example) they are crash-safe. Does every filesystem on every platform make sure that the mtime update hits the disk before the data? What about clock changes made manually by users, or automatically by ntpd? I recognize that there are people doing this today, because it's what we have, and it must not suck too much, because people are still doing it ... but I worry that if we do it this way, we'll end up with people saying "PostgreSQL corrupted my data" and will have no way of tracking the problem back to the filesystem or system clock event that was the true cause of the problem, so they'll just blame the database. I have the same creeps. I only do it on a live system, after a first full rsync, where mtime persistence is not an issue, and where I know ntp updates have not happened. I had a problem once where a differential rsync with timestamps didn't work as expected, and corrupted a slave. It was a test system so I didn't care much at the time, but if it were a backup, I'd be quite pissed. Basically, mtimes aren't trustworthy across reboots. Granted, this was a very old system, debian 5 when it was new, IIRC, so it may be better now. But it does illustrate just how bad things can get when one trusts timestamps. This case was an old out-of-sync slave on a test setup that got de-synchronized, and I tried to re-synchronize it with a delta rsync to avoid the hours it would take to actually compare everything (about a day). One segment that was modified after the sync loss was not transferred, causing trouble at the slave, so I was forced to re-synchronize with a full rsync (delta, but without timestamps). 
This was either before pg_basebackup or before I heard of it ;-), but in any case, if it happened on a test system with little activity, you can be certain it can happen on a production system. So I now only trust mtime when there has been neither a reboot nor an ntpd running since the last mtime-less rsync. In those cases, the optimization works and helps a lot. But I doubt you'll take many incremental backups matching those conditions. Say what you will of anecdotal evidence, but the issue is quite clear theoretically as well: modifications to file segments that aren't reflected within mtime granularity. There are many reasons why mtime could lose precision. Being an old filesystem with second-precision timestamps is just one, but not the only one. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
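[Editor's note: the failure mode above is easy to model. The sketch below is a toy, not rsync's actual algorithm: a quick-check comparing (size, mtime), with mtimes truncated to the filesystem's granularity, silently misses a same-size rewrite that lands in the same timestamp tick.]

```python
# Toy model of a timestamp-based "has this file changed?" test. On a
# filesystem with coarse (e.g. 1-second) mtime granularity, a same-size
# rewrite within the same tick is indistinguishable from "unchanged".
def same_by_stat(size_a, mtime_a, size_b, mtime_b, granularity=1.0):
    """True if a quick-check using (size, truncated mtime) sees no change."""
    tick_a = int(mtime_a / granularity)
    tick_b = int(mtime_b / granularity)
    return size_a == size_b and tick_a == tick_b
```

With 1-second granularity, a 16MB WAL segment rewritten 0.7s after the last comparison point looks unchanged - exactly the untransferred-segment scenario described above. Comparing content (rsync --checksum) avoids this at the cost of reading everything.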
Re: [HACKERS] replication commands and log_statements
On Fri, Aug 8, 2014 at 1:04 PM, Fujii Masao masao.fu...@gmail.com wrote: On Sat, Aug 9, 2014 at 12:29 AM, Robert Haas robertmh...@gmail.com wrote: On Fri, Aug 8, 2014 at 11:00 AM, Bruce Momjian br...@momjian.us wrote: On Fri, Aug 8, 2014 at 08:51:13AM -0400, Robert Haas wrote: On Thu, Aug 7, 2014 at 12:07 PM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: At 2014-08-07 23:22:43 +0900, masao.fu...@gmail.com wrote: That is, we log replication commands only when log_statement is set to all. Neither a new parameter is introduced nor is log_statement redefined as a list. That sounds good to me. It sounds fairly unprincipled to me. I liked the idea of making log_statement a list, but if we aren't gonna do that, I think this should be a separate parameter. I am unclear whether there is enough demand for a separate replication logging parameter --- using log_statement=all made sense to me. Most people don't want to turn on log_statement=all because it produces too much log volume. See, for example: http://bonesmoses.org/2014/08/05/on-postgresql-logging-verbosity/ But logging replication commands is quite low-volume, so it is not hard to imagine someone wanting to log all replication commands but not all SQL statements. You can do that by executing ALTER ROLE <replication user> SET log_statement TO 'all'. If you don't use the replication user to execute SQL statements, no SQL statements are logged in that setting. If you have a user devoted to it, I suppose that's true. I still think it shouldn't get munged together like that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] postgresql.auto.conf and reload
On Sun, Aug 10, 2014 at 3:54 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Fri, Aug 8, 2014 at 11:41 AM, Fujii Masao masao.fu...@gmail.com wrote: Yep, right. ParseConfigFp() is not a good place to pick up data_directory. What about the attached patch which makes ProcessConfigFile() instead of ParseConfigFp() pick up data_directory just after the configuration file is parsed? I think this is a better way to handle it. A few comments about the patch: Thanks for the review! Attached is the updated version of the patch.

1. Can't we directly have the code by having else in the below loop?

	if (DataDir &&
		!ParseConfigFile(PG_AUTOCONF_FILENAME, NULL, false, 0, elevel,
						 &head, &tail))

Done.

2.
+	if (DataDir == NULL)
+	{
+		ConfigVariable *prev = NULL;
+		for (item = head; item;)
+		{
..
..
+	}

If data_directory is not present in the list, then can't we directly return and leave the other work in function ProcessConfigFile() for the second pass? Done.

3. I think the comments can be improved:

a.
+	  In this case,
+	 * we should not pick up all the settings except the data_directory
+	 * from postgresql.conf yet because they might be overwritten
+	 * with the settings in PG_AUTOCONF_FILENAME file which will be
+	 * read later.

It would be better if we change it as below: In this case, we shouldn't pick any settings except the data_directory from postgresql.conf because they might be overwritten with the settings in PG_AUTOCONF_FILENAME file which will be read later. Done.

b.
+	   Now only the data_directory needs to be picked up to
+	 * read PG_AUTOCONF_FILENAME file later.

This part of the comment seems to be repetitive, as you already mentioned some part of it in the first line as well as at one other location. Okay, I just removed that line.

While updating the patch, I found that the ConfigVariable removed from the list has fields whose memory was already allocated but never freed. So I included the code to free them in the patch. 
Regards,

-- Fujii Masao

*** a/src/backend/utils/misc/guc-file.l
--- b/src/backend/utils/misc/guc-file.l
***************
*** 45,50 ****
  static unsigned int ConfigFileLineno;
--- 45,52 ----
  static const char *GUC_flex_fatal_errmsg;
  static sigjmp_buf *GUC_flex_fatal_jmp;
+ static void FreeConfigVariable(ConfigVariable *item);
+
  /* flex fails to supply a prototype for yylex, so provide one */
  int GUC_yylex(void);
*************** ProcessConfigFile(GucContext context)
*** 151,164 ****
  	 * file is in the data directory, we can't read it until the DataDir has
  	 * been set.
  	 */
! 	if (DataDir &&
! 		!ParseConfigFile(PG_AUTOCONF_FILENAME, NULL, false, 0, elevel,
! 						 &head, &tail))
  	{
! 		/* Syntax error(s) detected in the file, so bail out */
! 		error = true;
! 		ErrorConfFile = PG_AUTOCONF_FILENAME;
! 		goto cleanup_list;
  	}

  	/*
--- 153,218 ----
  	 * file is in the data directory, we can't read it until the DataDir has
  	 * been set.
  	 */
! 	if (DataDir)
  	{
! 		if (!ParseConfigFile(PG_AUTOCONF_FILENAME, NULL, false, 0, elevel,
! 							 &head, &tail))
! 		{
! 			/* Syntax error(s) detected in the file, so bail out */
! 			error = true;
! 			ErrorConfFile = PG_AUTOCONF_FILENAME;
! 			goto cleanup_list;
! 		}
! 	}
! 	/*
! 	 * Pick up only the data_directory if DataDir is not set, which
! 	 * means that the configuration file is read for the first time and
! 	 * PG_AUTOCONF_FILENAME file cannot be read yet. In this case,
! 	 * we shouldn't pick any settings except the data_directory
! 	 * from postgresql.conf because they might be overwritten
! 	 * with the settings in PG_AUTOCONF_FILENAME file which will be
! 	 * read later. OTOH, since it's ensured that data_directory doesn't
! 	 * exist in PG_AUTOCONF_FILENAME file, it will never be overwritten
! 	 * later.
! 	 */
! 	else
! 	{
! 		ConfigVariable *prev = NULL;
!
! 		for (item = head; item;)
! 		{
! 			ConfigVariable *ptr = item;
!
! 			item = item->next;
! 			if (strcmp(ptr->name, "data_directory") != 0)
! 			{
! 				if (prev == NULL)
! 					head = ptr->next;
! 				else
! 				{
! 					prev->next = ptr->next;
! 					/*
! 					 * On removing last item in list, we need to update tail
! 					 * to ensure that list will be maintained.
! 					 */
! 					if (prev->next == NULL)
! 						tail = prev;
! 				}
! 				FreeConfigVariable(ptr);
! 			}
! 			else
! 				prev = ptr;
! 		}
!
! 		/*
! 		 * Quick exit if data_directory is not present in list.
! 		 *
! 		 * Don't remember when we last successfully loaded the config file in
! 		 * this case because that time will be set soon by subsequent load of
! 		 * the config file.
! 		 */
! 		if (head == NULL)
! 			return;
  	}

  	/*
*************** ParseConfigFp(FILE *fp, const char *config_file, int depth, int elevel,
*** 677,683 ****
  				*tail_p = prev_item;
  			}
! 			pfree(cur_item);
  			break;
  		}
  	}
--- 731,737 ----
  				*tail_p = prev_item;
  			}
! 			FreeConfigVariable(cur_item);
  			break;
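[Editor's note: the fiddly part of the patch is the list surgery in the else branch - unlink everything except the data_directory item while keeping head and tail consistent. A Python rendering of the same pointer handling, purely as an illustration (not the C code):]

```python
# Sketch of the patch's singly-linked-list filtering: drop every node
# whose name != keep_name, freeing is implicit in Python; the interesting
# part is keeping head and tail correct while unlinking.
class ConfigVariable:
    def __init__(self, name, value):
        self.name, self.value, self.next = name, value, None

def keep_only(head, tail, keep_name):
    """Unlink every node whose name != keep_name; return (head, tail)."""
    prev = None
    item = head
    while item is not None:
        ptr, item = item, item.next      # save node, advance cursor
        if ptr.name != keep_name:
            if prev is None:
                head = ptr.next          # removing the current head
            else:
                prev.next = ptr.next
                if prev.next is None:    # removed the last item: fix tail
                    tail = prev
        else:
            prev = ptr
    if head is None:                     # nothing survived
        tail = None
    return head, tail
```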
Re: [HACKERS] psql output change in 9.4
On Fri, Aug 8, 2014 at 9:34 PM, Peter Eisentraut pete...@gmx.net wrote: This is 9.3:

peter=# \a
Output format is unaligned.
peter=# \a
Output format is aligned.
peter=# \x
Expanded display is on.
peter=# \x
Expanded display is off.

This is new in 9.4:

peter=# \a
Output format (format) is unaligned.
peter=# \a
Output format (format) is aligned.
peter=# \x
Expanded display (expanded) is on.
peter=# \x
Expanded display (expanded) is off.

What is the point of that change? I suppose it is so that you can use \pset without arguments to show all settings:

peter=# \pset
Border style (border) is 1.
Target width (columns) unset.
Expanded display (expanded) is off.
...

But those are unrelated features, and the changed output doesn't make any sense in the contexts I show above. I think this should be reverted, and the \pset output should be implemented separately.

Yes, the \pset patch (commit c64e68fd9f1132fec563fb5de53dc3bcccb5fc3b) caused this behavior change. I can't remember whether I noticed it at the time and thought it was a reasonable change, or whether I didn't notice it when committing. Either way, clarifying the name of the parameter which is being displayed does not seem like a particularly bad idea to me even in the contexts you mention. I've certainly run commands like \a and \t and then said to myself, "crap, which pset parameter does this correspond to?". And there was no easy way to figure it out. I think the output could justly be criticized for making it insufficiently clear that the parenthesized text is, in fact, the name of the pset parameter. We could write something like: Border style (parameter border) is 1. But I don't know whether that would be considered an improvement or just extra verbosity. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.4 pg_restore --help changes
On Fri, Aug 8, 2014 at 9:44 PM, Peter Eisentraut pete...@gmx.net wrote: This is a valuable feature change, but I think the help output is unhelpful. The left side has a singular placeholder, but the right side talks about objects in plural. How do I actually specify multiple indexes? Is it --index=foo,bar? It's --index=foo --index=bar, but that's unclear from the help. I think it would be clearer to write something like

  -I, --index=NAME    restore named indexes (repeat for multiple)

or perhaps make a note at the bottom: The options -I, -n, -P, -t, -T, --section can be combined and specified multiple times to select multiple objects. Other ideas? I like the note at the bottom idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql output change in 9.4
Robert Haas robertmh...@gmail.com writes: On Fri, Aug 8, 2014 at 9:34 PM, Peter Eisentraut pete...@gmx.net wrote: What is the point of that change? I think the output could justly be criticized for making it insufficiently clear that the parenthesized text is, in fact, the name of the pset parameter. Quite; that wasn't apparent to me either. We could write something like: Border style (parameter border) is 1. How about Border style (\pset border) is 1. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing dependency to wsock32.lib when compiling code on WIndows
On Tue, Jul 15, 2014 at 5:14 AM, Michael Paquier michael.paqu...@gmail.com wrote: On Tue, Jul 15, 2014 at 8:46 PM, Magnus Hagander mag...@hagander.net wrote: On Thu, Jun 19, 2014 at 4:13 PM, MauMau maumau...@gmail.com wrote: From: Michael Paquier michael.paqu...@gmail.com You are right, I only thought about the MS scripts when working on this patch. Updated patch for a complete cleanup is attached (note I updated manually ./configure for test purposes and did not re-run autoconf). I marked this patch as ready for committer. I confirmed: I'm pretty sure this is not ready for committer due to: * The binaries can be built with MSVC 2008 Express. I didn't try to build on MinGW. Somebody needs to test it on mingw first :) That should be an easy test for someone who has a mingw install around, so better leave it at needs review so more people see it and can run those tests. I vaguely recall that I tested as well with MinGW when writing v2 of this patch after MauMau reviewed v1. Btw, just in case, I have just gotten my MinGW box working a bit more and re-tested it now. And it worked :) Of course someone other than the patch author with a MinGW environment at hand is invited to test. I guess I should have done that. While trying to test more recent stuff against mingw, I kept running into a problem which bisected down to this (a16bac36eca8158cbf78987e953). I am following the instructions at https://wiki.postgresql.org/wiki/Building_With_MinGW, using the mingw64 set of instructions and of course the git instructions. Except that I am running as my own user, not creating a dummy account, and I am using local hardware, not an amazon rental. 
This is the warning/error I get:

auth.c: In function 'ClientAuthentication':
auth.c:458:6: warning: implicit declaration of function 'gai_strerror' [-Wimplicit-function-declaration]
auth.c:458:6: warning: format '%s' expects argument of type 'char *', but argument 3 has type 'int' [-Wformat]
auth.c:458:6: warning: format '%s' expects argument of type 'char *', but argument 2 has type 'int' [-Wformat]
auth.c:476:6: warning: format '%s' expects argument of type 'char *', but argument 3 has type 'int' [-Wformat]
auth.c:476:6: warning: format '%s' expects argument of type 'char *', but argument 2 has type 'int' [-Wformat]
auth.c: In function 'CheckRADIUSAuth':
auth.c:2282:3: warning: format '%s' expects argument of type 'char *', but argument 3 has type 'int' [-Wformat]
hba.c: In function 'parse_hba_line':
hba.c:1060:5: warning: implicit declaration of function 'gai_strerror' [-Wimplicit-function-declaration]
hba.c:1060:5: warning: format '%s' expects argument of type 'char *', but argument 3 has type 'int' [-Wformat]
hba.c:1131:6: warning: format '%s' expects argument of type 'char *', but argument 3 has type 'int' [-Wformat]
hba.c: In function 'parse_hba_auth_opt':
hba.c:1607:4: warning: format '%s' expects argument of type 'char *', but argument 3 has type 'int' [-Wformat]
pqcomm.c: In function 'StreamServerPort':
pqcomm.c:334:4: warning: implicit declaration of function 'gai_strerror' [-Wimplicit-function-declaration]
pqcomm.c:334:4: warning: format '%s' expects argument of type 'char *', but argument 4 has type 'int' [-Wformat]
pqcomm.c:338:4: warning: format '%s' expects argument of type 'char *', but argument 3 has type 'int' [-Wformat]
mingwcompat.c:60:1: warning: 'RegisterWaitForSingleObject' redeclared without dllimport attribute: previous dllimport ignored [-Wattributes]
pgstat.c: In function 'pgstat_init':
pgstat.c:353:3: warning: implicit declaration of function 'gai_strerror' [-Wimplicit-function-declaration]
pgstat.c:353:3: warning: format '%s' expects argument of type 'char *', but argument 2 has type 'int' [-Wformat]
postmaster.c: In function 'BackendInitialize':
postmaster.c:3938:3: warning: implicit declaration of function 'gai_strerror' [-Wimplicit-function-declaration]
postmaster.c:3938:3: warning: format '%s' expects argument of type 'char *', but argument 2 has type 'int' [-Wformat]
libpq/auth.o:auth.c:(.text+0x1949): undefined reference to `gai_strerror'
libpq/auth.o:auth.c:(.text+0x19c4): undefined reference to `gai_strerror'
libpq/auth.o:auth.c:(.text+0x1b1a): undefined reference to `gai_strerror'
libpq/auth.o:auth.c:(.text+0x1cb4): undefined reference to `gai_strerror'
libpq/auth.o:auth.c:(.text+0x1cdc): undefined reference to `gai_strerror'
libpq/hba.o:hba.c:(.text+0x1fa0): more undefined references to `gai_strerror' follow
collect2.exe: error: ld returned 1 exit status
make[2]: *** [postgres] Error 1
make[1]: *** [all-backend-recurse] Error 2
make: *** [all-src-recurse] Error 2

Any suggestions on what to try? This patch doesn't seem to explicitly mention gai_strerror at all, so it must be some indirect effect. Cheers, Jeff
Re: [HACKERS] replication commands and log_statements
* Robert Haas (robertmh...@gmail.com) wrote: On Fri, Aug 8, 2014 at 1:04 PM, Fujii Masao masao.fu...@gmail.com wrote: You can do that by executing ALTER ROLE <replication user> SET log_statement TO 'all'. If you don't use the replication user to execute SQL statements, no SQL statements are logged in that setting. If you have a user devoted to it, I suppose that's true. I still think it shouldn't get munged together like that. Folks *should* have a dedicated replication user, imv. That said, I agree with Robert in that I don't particularly like this recommendation for how to enable logging of replication commands. For one thing, it means having to remember to set the per-role GUC for every replication user which is created, and that's the kind of trivially-missed step that can get people into trouble. Thanks, Stephen
Re: [HACKERS] psql output change in 9.4
2014-08-11 19:52 GMT+02:00 Tom Lane t...@sss.pgh.pa.us: Robert Haas robertmh...@gmail.com writes: On Fri, Aug 8, 2014 at 9:34 PM, Peter Eisentraut pete...@gmx.net wrote: What is the point of that change? I think the output could justly be criticized for making it insufficiently clear that the parenthesized text is, in fact, the name of the pset parameter. Quite; that wasn't apparent to me either. We could write something like: Border style (parameter border) is 1. How about Border style (\pset border) is 1. +1 Pavel regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bad estimation together with large work_mem generates terrible slow hash joins
On Sat, Aug 9, 2014 at 9:13 AM, Tomas Vondra t...@fuzzy.cz wrote: Adding the least-significant bit does not work, we need to get back to adding the most-significant one. Not sure what's the least complex way to do that, though. I'm thinking about computing the nbuckets limit (how many buckets may fit into work_mem):

tupsize = HJTUPLE_OVERHEAD +
          MAXALIGN(sizeof(MinimalTupleData)) +
          MAXALIGN(tupwidth);

nbuckets_bits_max = my_log2(work_mem / tupsize)

and then start with nbatches from this bit, like this:

  31            23     max                      0
   |------------|------|----------------------|
    <- batches    free        buckets ->

That doesn't seem unreasonable provided there's enough bit space, but, as you point out, the day might not be far off when there isn't. It also strikes me that when there's only 1 batch, the set of bits that map onto the batch number is zero-width, and one zero-width bit range is as good as another. In other words, if you're only planning to do one batch, you can easily grow the number of buckets on the fly. Growing the number of buckets only becomes difficult once you have more than one batch. And I mention that because, in the scenario mentioned in your original post (a hash table with a small number of buckets (8192) containing a large number of tuples [~3.3M]), presumably what happens is you start out thinking you are going to have one batch with 8K buckets. Then, as more tuples continue to pour in, you see the load factor rising and respond by bumping up the size of the hash table. Now eventually you realize, gosh, this isn't even going to fit into work_mem, so you need to start batching it. But by that point you've presumably done as much as you're going to do in terms of increasing the number of buckets; after that, you're just going to add more batches as necessary. So not being able to further increase the number of buckets once the number of batches is no longer 1 wouldn't be a problem in that case. 
Now if you start out planning for multiple batches, and then you discover you'd like to keep the same number of batches but have more buckets per batch, that's gonna be hard. It's not impossible, because you could steal some of the unused high-order bits above the number of batches, and then concatenate them with the low-order bits that you already had, but that seems like kind of an ugly kludge, and AFAICS it only helps if the new tuples are narrower by enough to justify the cost of splitting all the buckets. I won't say that couldn't happen, but I think it would be better to keep that complexity out of the patch unless we're absolutely certain it's justified. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
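[Editor's note: the sizing arithmetic Tomas quoted upthread can be mirrored in a few lines for concreteness. This is a simulation for illustration only; the overhead constants below are stand-ins, not the exact values of HJTUPLE_OVERHEAD or sizeof(MinimalTupleData).]

```python
# Mirror of the quoted sizing arithmetic: how many bucket bits could fit
# in work_mem. Overhead constants are assumptions, not the real C values.
MAXIMUM_ALIGNOF = 8
HJTUPLE_OVERHEAD = 16       # stand-in value
MINIMAL_TUPLE_SIZE = 24     # stand-in for sizeof(MinimalTupleData)

def maxalign(n):
    """Round n up to the next multiple of MAXIMUM_ALIGNOF."""
    return (n + MAXIMUM_ALIGNOF - 1) & ~(MAXIMUM_ALIGNOF - 1)

def my_log2(num):
    """Smallest i with 2^i >= num, i.e. ceil(log2(num)), as in PostgreSQL's my_log2."""
    i = 0
    while (1 << i) < num:
        i += 1
    return i

def nbuckets_bits_max(work_mem_bytes, tupwidth):
    tupsize = HJTUPLE_OVERHEAD + maxalign(MINIMAL_TUPLE_SIZE) + maxalign(tupwidth)
    return my_log2(work_mem_bytes // tupsize)
```

With these stand-in constants, 1MB of work_mem and 40-byte tuples give tupsize = 80 and at most 2^14 buckets; whatever high-order hash bits remain above that are what is available for the batch number.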
Re: [HACKERS] psql output change in 9.4
On Mon, Aug 11, 2014 at 1:52 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Fri, Aug 8, 2014 at 9:34 PM, Peter Eisentraut pete...@gmx.net wrote: What is the point of that change? I think the output could justly be criticized for making it insufficiently clear that the parenthesized text is, in fact, the name of the pset parameter. Quite; that wasn't apparent to me either. We could write something like: Border style (parameter border) is 1. How about Border style (\pset border) is 1. That would look just fine as a response to \a or \x, but I'm not sure it would look as good as a response to \pset, which prints out that line for every parameter (why does every line say \pset when the command I just typed is \pset?). However, I can certainly live with it if others prefer that to what I suggested. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb format is pessimal for toast compression
On Sat, Aug 9, 2014 at 3:51 PM, Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kgri...@ymail.com writes: Stephen Frost sfr...@snowman.net writes: Trying to move the header to the end just for the sake of this doesn't strike me as a good solution as it'll make things quite a bit more complicated. Why is that? How much harder would it be to add a single offset field to the front to point to the part we're shifting to the end? It is not all that unusual to put a directory at the end, like in the .zip file format. Yeah, I was wondering that too. Arguably, directory-at-the-end would be easier to work with for on-the-fly creation, not that we do any such thing at the moment. I think the main thing that's bugging Stephen is that doing that just to make pglz_compress happy seems like a kluge (and I have to agree). Here's a possibly more concrete thing to think about: we may very well someday want to support JSONB object field or array element extraction without reading all blocks of a large toasted JSONB value, if the value is stored external without compression. We already went to the trouble of creating analogous logic for substring extraction from a long uncompressed text or bytea value, so I think this is a plausible future desire. With the current format you could imagine grabbing the first TOAST chunk, and then if you see the header is longer than that you can grab the remainder of the header without any wasted I/O, and for the array-subscripting case you'd now have enough info to fetch the element value from the body of the JSONB without any wasted I/O. With directory-at-the-end you'd have to read the first chunk just to get the directory pointer, and this would most likely not give you any of the directory proper; but at least you'd know exactly how big the directory is before you go to read it in. The former case is probably slightly better. 
However, if you're doing an object key lookup rather than an array element fetch, neither of these formats is really friendly at all, because each binary-search probe probably requires bringing in one or two toast chunks from the body of the JSONB value so you can look at the key text. I'm not sure if there's a way to redesign the format to make that less painful/expensive --- but certainly, having the key texts scattered through the JSONB value doesn't seem like a great thing from this standpoint. I think that's a good point. On the general topic, I don't think it's reasonable to imagine that we're going to come up with a single heuristic that works well for every kind of input data. What pglz is doing - assuming that if the beginning of the data is incompressible then the rest probably is too - is fundamentally reasonable, notwithstanding the fact that it doesn't happen to work out well for JSONB. We might be able to tinker with that general strategy in some way that seems to fix this case and doesn't appear to break others, but there's some risk in that, and there's no obvious reason in my mind why PGLZ should be required to fly blind. So I think it would be a better idea to arrange some method by which JSONB (and perhaps other data types) can provide compression hints to pglz. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
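[Editor's note: the wasted-I/O comparison can be made concrete with a little chunk arithmetic. A sketch, assuming the usual ~1996-byte TOAST chunk payload on 8KB pages - treat that number as an assumption here:]

```python
# Which TOAST chunks must be read to fetch bytes [offset, offset+length)
# of an uncompressed external datum, assuming a fixed chunk payload size.
TOAST_MAX_CHUNK_SIZE = 1996  # typical payload per chunk on 8KB pages (assumption)

def chunks_needed(offset, length, chunk_size=TOAST_MAX_CHUNK_SIZE):
    """Return the list of chunk sequence numbers covering the byte range."""
    if length <= 0:
        return []
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return list(range(first, last + 1))
```

A header confined to chunk 0 costs one probe before the element fetch; a directory-at-the-end layout instead costs an initial read near the final chunk(s) to find the directory, which is the trade-off described above.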
Re: [HACKERS] jsonb format is pessimal for toast compression
Robert Haas robertmh...@gmail.com writes: ... I think it would be a better idea to arrange some method by which JSONB (and perhaps other data types) can provide compression hints to pglz. I agree with that as a long-term goal, but not sure if it's sane to push into 9.4. What we could conceivably do now is (a) add a datatype OID argument to toast_compress_datum, and (b) hard-wire the selection of a different compression-parameters struct if it's JSONBOID. The actual fix would then be to increase the first_success_by field of this alternate struct. I had been worrying about API-instability risks associated with (a), but on reflection it seems unlikely that any third-party code calls toast_compress_datum directly, and anyway it's not something we'd be back-patching to before 9.4. The main objection to (b) is that it wouldn't help for domains over jsonb (and no, I don't want to insert a getBaseType call there to fix that). A longer-term solution would be to make this some sort of type property that domains could inherit, like typstorage is already. (Somebody suggested dealing with this by adding more typstorage values, but I don't find that attractive; typstorage is known in too many places.) We'd need some thought about exactly what we want to expose, since the specific knobs that pglz_compress has today aren't necessarily good for the long term. This is all kinda ugly really, but since I'm not hearing brilliant ideas for redesigning jsonb's storage format, maybe this is the best we can do for now. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
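[Editor's note: the first_success_by idea is easy to model with a toy compressor. The sketch below uses zlib as a stand-in for pglz and illustrates only the heuristic, not the real algorithm: give up unless the first first_success_by bytes show any savings. A jsonb-shaped datum - a near-incompressible offsets header followed by highly repetitive content - defeats a small probe window.]

```python
# Toy model of a "first_success_by" heuristic, with zlib standing in for
# pglz: abort unless compressing the probe window saves anything.
import random
import zlib

def try_compress(data, first_success_by):
    probe = data[:first_success_by]
    if len(zlib.compress(probe)) >= len(probe):
        return None                       # probe saved nothing: store raw
    out = zlib.compress(data)
    return out if len(out) < len(data) else None

# jsonb-shaped test datum: ~1KB of near-random bytes (think: offset
# array), then a highly repetitive body.
random.seed(42)
header = bytes(random.randrange(256) for _ in range(1024))
datum = header + b'{"key": "value"}' * 512
```

With a 1KB probe window the datum is stored uncompressed even though most of it is trivially compressible; widening the window (the alternate compression-parameters struct proposed above) is exactly the knob that fixes it.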
Re: [HACKERS] jsonb format is pessimal for toast compression
On Mon, Aug 11, 2014 at 12:07 PM, Robert Haas robertmh...@gmail.com wrote: I think that's a good point. I think that there may be something to be said for the current layout. Having adjacent keys and values could take better advantage of CPU cache characteristics. I've heard of approaches to improving B-Tree locality that forced keys and values to be adjacent on individual B-Tree pages [1], for example. I've heard of this more than once. And FWIW, I believe based on earlier research of user requirements in this area that very large jsonb datums are not considered all that compelling. Document database systems have considerable limitations here. On the general topic, I don't think it's reasonable to imagine that we're going to come up with a single heuristic that works well for every kind of input data. What pglz is doing - assuming that if the beginning of the data is incompressible then the rest probably is too - is fundamentally reasonable, nonwithstanding the fact that it doesn't happen to work out well for JSONB. We might be able to tinker with that general strategy in some way that seems to fix this case and doesn't appear to break others, but there's some risk in that, and there's no obvious reason in my mind why PGLZ should be require to fly blind. So I think it would be a better idea to arrange some method by which JSONB (and perhaps other data types) can provide compression hints to pglz. If there is to be any effort to make jsonb a more effective target for compression, I imagine that that would have to target redundancy between JSON documents. With idiomatic usage, we can expect plenty of it. [1] http://www.vldb.org/conf/1999/P7.pdf , We also forced each key and child pointer to be adjacent to each other physically -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql: show only failed queries
2014-08-11 14:59 GMT+02:00 Fujii Masao masao.fu...@gmail.com: On Sat, Aug 9, 2014 at 4:16 AM, Pavel Stehule pavel.steh...@gmail.com wrote: Hi 2014-08-08 13:58 GMT+02:00 Fujii Masao masao.fu...@gmail.com: On Fri, Aug 8, 2014 at 4:31 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2014-08-07 7:10 GMT+02:00 Fujii Masao masao.fu...@gmail.com: On Thu, Aug 7, 2014 at 6:26 AM, Pavel Stehule pavel.steh...@gmail.com wrote: Hello, updated version of the patch in attachment. Thanks! But ISTM you forgot to attach the patch. grr .. I am sorry No problem. Thanks for the patch! Attached is the revised version of the patch.

+ /* all psql known variables are included in list by default */
+ for (known_varname = known_varnames; *known_varname; known_varname++)
+     varnames[nvars++] = pg_strdup(*known_varname);

Don't we need to append both prefix and suffix to even known variables? ??? I am not sure if I understand well - probably system read-only variables such as DBNAME, USER, VERSION should not be there I had that question because complete_from_variables() is also called by the tab-completion of \echo : and it shows a half-baked variables list. So I thought probably we need to change complete_from_variables() more to fix the problem. I understand now. I fixed it. I have a question. \echo probably should not show empty known variables. The data for autocomplete for \echo should be the same as the result of \set Agreed. I think that only the variables having set values should be displayed in the \echo : case. So I modified complete_from_variables() so that the unset variables are not shown in that case but all the variables are shown in the tab-completion of \set.

+ else if (strcmp(prev2_wd, "\\set") == 0)
+ {
+     if (strcmp(prev_wd, "AUTOCOMMIT") == 0)

ISTM that some psql variables like IGNOREEOF are not there. Why not? 
yes, there is no completion for DBNAME, ENCODING, FETCH_COUNT, HISTCONTROL, HISTFILE, HISTSIZE, HOST, IGNOREEOFF, PROMPT*, USER, VERSION There are more reasons: * parameter is not an enum (string, number or both): FETCH_COUNT, PROMPT, HISTSIZE, .. * variable is pseudo read-only - changing it has no effect: DBNAME, ENCODING, VERSION So HISTCONTROL should be there because it doesn't have such reasons at all? yes ISTM that you forgot to add HISTCONTROL to your patch. So I just added that. I added the tab-completion for the \unset command. Like \echo :, only the variables having set values should be displayed in the \unset case. perfect I changed complete_from_variables() so that it checks the memory size of the variable array even when appending the known variables. If the memory size is not enough, it's dynamically increased. Practically this check would not be required because the initial size of the array is sufficiently larger than the number of the known variables. I added this as a safeguard. Typo: IGNOREEOFF -> IGNOREEOF I removed the value none from the value list of ECHO because it's not documented and a user might get confused when he or she sees the undocumented value none. Thought? isn't it better to fix the doc? I don't know any reason why we should not support none I'm OK with this. The attached patch adds support of the none value both in the ECHO and HISTCONTROL variables (because HISTCONTROL had the same problem as ECHO had), and also adds the description of that value into the document. I looked at the code; you removed the check against duplicate varnames in the list. Is it ok? Oh, just revived that code. yes, It is looking well regards Pavel Regards, -- Fujii Masao
Re: [HACKERS] 9.4 pg_restore --help changes
Robert Haas wrote: On Fri, Aug 8, 2014 at 9:44 PM, Peter Eisentraut pete...@gmx.net wrote: or perhaps make a note at the bottom The options -I, -n, -P, -t, -T, --section can be combined and specified multiple times to select multiple objects. Other ideas? I like the note at the bottom idea. +1 -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb format is pessimal for toast compression
* Tom Lane (t...@sss.pgh.pa.us) wrote: Robert Haas robertmh...@gmail.com writes: ... I think it would be a better idea to arrange some method by which JSONB (and perhaps other data types) can provide compression hints to pglz. I agree with that as a long-term goal, but not sure if it's sane to push into 9.4. Agreed. What we could conceivably do now is (a) add a datatype OID argument to toast_compress_datum, and (b) hard-wire the selection of a different compression-parameters struct if it's JSONBOID. The actual fix would then be to increase the first_success_by field of this alternate struct. Isn't the offset to compressible data variable, though, depending on the number of keys, etc? Would we be increasing first_success_by based off of some function which inspects the object? I had been worrying about API-instability risks associated with (a), but on reflection it seems unlikely that any third-party code calls toast_compress_datum directly, and anyway it's not something we'd be back-patching to before 9.4. Agreed. The main objection to (b) is that it wouldn't help for domains over jsonb (and no, I don't want to insert a getBaseType call there to fix that). While not ideal, that seems like an acceptable compromise for 9.4 to me. A longer-term solution would be to make this some sort of type property that domains could inherit, like typstorage is already. (Somebody suggested dealing with this by adding more typstorage values, but I don't find that attractive; typstorage is known in too many places.) I think that was me, and having it be something which domains can inherit makes sense. Would we be able to use this approach to introduce type- (and domains inherited from that type) specific compression algorithms, perhaps? 
Or even get to a point where we could have a chunk-based compression scheme for certain types of objects (such as JSONB) where we keep track of which keys exist at which points in the compressed object, allowing us to skip to the specific chunk which contains the requested key, similar to what we do with uncompressed data? We'd need some thought about exactly what we want to expose, since the specific knobs that pglz_compress has today aren't necessarily good for the long term. Agreed. This is all kinda ugly really, but since I'm not hearing brilliant ideas for redesigning jsonb's storage format, maybe this is the best we can do for now. This would certainly be an improvement over what's going on now, and I love the idea of possibly being able to expand this in the future to do more. What I'd hate to see is having all of this and only ever using it to say skip ahead another 1k for JSONB. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] jsonb format is pessimal for toast compression
Stephen Frost sfr...@snowman.net writes: * Tom Lane (t...@sss.pgh.pa.us) wrote: What we could conceivably do now is (a) add a datatype OID argument to toast_compress_datum, and (b) hard-wire the selection of a different compression-parameters struct if it's JSONBOID. The actual fix would then be to increase the first_success_by field of this alternate struct. Isn't the offset to compressible data variable, though, depending on the number of keys, etc? Would we be increasing first_success_by based off of some function which inspects the object? Given that this is a short-term hack, I'd be satisfied with setting it to INT_MAX. If we got more ambitious, we could consider improving the cutoff logic so that it gives up at x% of the object or n bytes, whichever comes first; but I'd want to see some hard evidence that that was useful before adding any more cycles to pglz_compress. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb format is pessimal for toast compression
* Peter Geoghegan (p...@heroku.com) wrote: If there is to be any effort to make jsonb a more effective target for compression, I imagine that that would have to target redundancy between JSON documents. With idiomatic usage, we can expect plenty of it. While I certainly agree, that's a rather different animal to address and doesn't hold a lot of relevance to the current problem. Or, to put it another way, I don't think anyone is going to be surprised that two rows containing the same data (even if they're inserted in the same transaction and have the same visibility information) are compressed together in some fashion. We've got a clear example of someone, quite reasonably, expecting their JSONB object to be compressed using the normal TOAST mechanism, and we're failing to do that in cases where it's actually a win to do so. That's the focus of this discussion and what needs to be addressed before 9.4 goes out. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] jsonb format is pessimal for toast compression
On Mon, Aug 11, 2014 at 1:01 PM, Stephen Frost sfr...@snowman.net wrote: We've got a clear example of someone, quite reasonably, expecting their JSONB object to be compressed using the normal TOAST mechanism, and we're failing to do that in cases where it's actually a win to do so. That's the focus of this discussion and what needs to be addressed before 9.4 goes out. Sure. I'm not trying to minimize that. We should fix it, certainly. However, it does bear considering that JSON data, with each document stored in a row, is not an effective target for TOAST compression in general, even as text. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.4 logical replication - walsender keepalive replies
On 07/14/2014 01:19 PM, Steve Singer wrote: On 07/06/2014 10:11 AM, Andres Freund wrote: Hi Steve, Right. I thought about this for a while, and I think we should change two things. For one, don't request replies here. It's simply not needed, as this isn't dealing with timeouts. For another, don't just check ->flush < sentPtr but also ->write < sentPtr. The reason we're sending these feedback messages is to inform the 'logical standby' that there's been WAL activity which it can't see because they don't correspond to anything that's logically decoded (e.g. vacuum stuff). Would that suit your needs? Greetings, Yes I think that will work for me. I tested with the attached patch that I think does what you describe and it seems okay. Any feedback on this? Do we want that change for 9.4, or do we want something else? Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.4 logical replication - walsender keepalive replies
On 2014-08-11 17:22:27 -0400, Steve Singer wrote: On 07/14/2014 01:19 PM, Steve Singer wrote: On 07/06/2014 10:11 AM, Andres Freund wrote: Hi Steve, Right. I thought about this for a while, and I think we should change two things. For one, don't request replies here. It's simply not needed, as this isn't dealing with timeouts. For another, don't just check ->flush < sentPtr but also ->write < sentPtr. The reason we're sending these feedback messages is to inform the 'logical standby' that there's been WAL activity which it can't see because they don't correspond to anything that's logically decoded (e.g. vacuum stuff). Would that suit your needs? Greetings, Yes I think that will work for me. I tested with the attached patch that I think does what you describe and it seems okay. Any feedback on this? Do we want that change for 9.4, or do we want something else? I plan to test and apply it in the next few days. Digging myself out from under stuff from before my holiday right now... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bad estimation together with large work_mem generates terrible slow hash joins
On 11.8.2014 20:25, Robert Haas wrote: On Sat, Aug 9, 2014 at 9:13 AM, Tomas Vondra t...@fuzzy.cz wrote: Adding the least-significant bit does not work, we need to get back to adding the most-significant one. Not sure what's the least complex way to do that, though. I'm thinking about computing the nbuckets limit (how many buckets may fit into work_mem):

tupsize = HJTUPLE_OVERHEAD + MAXALIGN(sizeof(MinimalTupleData)) + MAXALIGN(tupwidth);
nbuckets_bits_max = my_log2(work_mem / tupsize)

and then start with nbatches from this bit, like this:

 31           23           max                0
  | <- batches |    free    |    buckets ->   |

That doesn't seem unreasonable provided there's enough bit space, but, as you point out, the day might not be far off when there isn't. Thinking about this a bit more, I believe the danger is not that imminent. 32 bits mean the Hash node should handle 2^32 tuples in total (possibly split into multiple batches). It used to be 2^32 buckets thanks to NTUP_PER_BUCKET=10, but the patch lowers this to 1 so it's tuples now. But while we're a bit closer to exhausting the 32 bits, I think it's not really that big an issue. Not only is hashing a table with ~4e9 rows going to be a hellish experience, but if we really want to support it, we should probably switch to 64-bit hashes. Because adding some checks is not going to help - it may detect an issue, but it won't fix it. And actually, even if the two bit ranges overlap, it does not mean the hashjoin will stop working. There'll be a dependency between batches and buckets, causing non-uniform usage of the buckets, but it won't blow up. So I don't think we need to worry about exhausting the 32 bits for now. It also strikes me that when there's only 1 batch, the set of bits that map onto the batch number is zero-width, and one zero-width bit range is as good as another. In other words, if you're only planning to do one batch, you can easily grow the number of buckets on the fly. 
Growing the number of buckets only becomes difficult once you have more than one batch. Yes, that's true. It's also roughly what the patch does IMHO. If you know you'll need more batches, you can start with the maximum number of buckets right away (because you expect there's more data than work_mem, so shoot for nbuckets = my_log2(work_mem / tuple_size) or something like that). It's also true that if you're starting with a single batch, you can do whatever you want with nbuckets until you need to do (nbatches *= 2). It's also true that once you're done with the first batch, all the other batches will be just fine with the number of buckets, because the batches will be about equally sized. But I think this is pretty much what the patch does, in the end. And I mention that because, in the scenario mentioned in your original post (a hash table with a small number of buckets (8192) containing a large number of tuples [~3.3M]), presumably what happens is you start out thinking you are going to have one batch with 8K buckets. Then, as more tuples continue to pour in, you see the load factor rising and respond by bumping up the size of the hash table. Now eventually you realize, gosh, this isn't even going to fit into work_mem, so you need to start batching it. But by that point you've presumably done as much as you're going to do in terms of increasing the number of buckets; after that, you're just going to add more batches as necessary. So not being able to further increase the number of buckets once the number of batches is no longer 1 wouldn't be a problem in that case. Now if you start out planning for multiple batches, and then you discover you'd like to keep the same number of batches but have more buckets per batch, that's gonna be hard. 
It's not impossible, because you could steal some of the unused high-order bits above the number of batches, and then concatenate them with the low-order bits that you already had, but that seems like kind of an ugly kludge, and AFAICS it only helps if the new tuples are narrower by enough to justify the cost of splitting all the buckets. I won't say that couldn't happen, but I think it would be better to keep that complexity out of the patch unless we're absolutely certain it's justified. I'm definitely in favor of keeping the patch as simple as possible. I was considering reversing the bits of the hash, because that's pretty much the simplest solution. But I think you're right it might actually work like this:

* Are more batches needed?
  (yes) => just use nbuckets = my_log2(work_mem / tuple_size)
  (no)  => go ahead, until processing all tuples or hitting work_mem
     (work_mem)   => meh, use the same nbuckets as above
     (all tuples) => compute optimal nbuckets /
Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations
Robert Haas robertmh...@gmail.com writes: On Fri, Aug 8, 2014 at 10:30 AM, MauMau maumau...@gmail.com wrote: I've tracked down the real root cause. The fix is very simple. Please check the attached one-liner patch. I'd support back-porting that commit to 9.1 and 9.2 as a fix for this problem. As the commit message says, it's dead simple. While I have no great objection to back-porting Heikki's patch, it seems like a very large stretch to call this a root-cause fix. At best it's band-aiding one symptom in a rather fragile way. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] failures on barnacle (CLOBBER_CACHE_RECURSIVELY) because of memory leaks
Tomas Vondra t...@fuzzy.cz writes: So after 83 days, the regression tests on barnacle completed, Hah, that's perseverance! and it smells like another memory leak in CacheMemoryContext, similar to those fixed in 078b2ed on May 18. Ugh, will look. See this: http://pgbuildfarm.org/cgi-bin/show_log.pl?nm=barnacle&dt=2014-05-19%2011%3A25%3A03 Initially I was more worried about the crashes, but it looks like that's probably just a side-effect of the leak: [537a053d.1f4b:2] LOG: server process (PID 32110) was terminated by signal 9: Killed [537a053d.1f4b:3] DETAIL: Failed process was running: select pg_get_viewdef('vv3', true); [537a053d.1f4b:4] LOG: terminating any other active server processes [537a053d.1f52:2] WARNING: terminating connection because of crash of another server process ... [537a053d.1f4b:7] LOG: server process (PID 3882) was terminated by signal 9: Killed [537a053d.1f4b:8] DETAIL: Failed process was running: SELECT gid FROM pg_prepared_xacts; [537a053d.1f4b:9] LOG: terminating any other active server processes ... Evidently the OOM killer is at large on this machine. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] failures on barnacle (CLOBBER_CACHE_RECURSIVELY) because of memory leaks
On 12.8.2014 02:05, Tom Lane wrote: Evidently the OOM killer is at large on this machine. Yes. It's a machine with only 8GB of RAM, and there are 3 VMs (LXC containers), with 2GB of RAM each. That's not much, but while it's mostly out of necessity, it's apparently a good way to catch leaks. Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing dependency to wsock32.lib when compiling code on WIndows
On Mon, Aug 11, 2014 at 11:02:48AM -0700, Jeff Janes wrote: While trying to test more recent stuff against mingw, I kept running into a problem which bisected down to this (a16bac36eca8158cbf78987e953). This is the warning/error I get: auth.c: In function 'ClientAuthentication': auth.c:458:6: warning: implicit declaration of function 'gai_strerror' [-Wimplicit-function-declaration] libpq/auth.o:auth.c:(.text+0x1949): undefined reference to `gai_strerror' libpq/auth.o:auth.c:(.text+0x19c4): undefined reference to `gai_strerror' libpq/auth.o:auth.c:(.text+0x1b1a): undefined reference to `gai_strerror' libpq/auth.o:auth.c:(.text+0x1cb4): undefined reference to `gai_strerror' libpq/auth.o:auth.c:(.text+0x1cdc): undefined reference to `gai_strerror' Any suggestions on what to try? This patch doesn't seem to explicitly mention gai_strerror at all, so it must be some indirect effect. I, too, encountered that. The patch let configure detect HAVE_GETADDRINFO under MinGW-w64; passing ac_cv_func_getaddrinfo=no on the configure command line is a workaround. Under !HAVE_GETADDRINFO, configure arranges to build src/port/getaddrinfo.c, and src/include/getaddrinfo.h injects the replacement gai_strerror() prototype. That understandably doesn't happen in a HAVE_GETADDRINFO build, yet src/include/port/win32/sys/socket.h does the following unconditionally: /* * we can't use the windows gai_strerror{AW} functions because * they are defined inline in the MS header files. So we'll use our * own */ #undef gai_strerror Somehow or other, we must bring these parts into agreement. -- Noah Misch EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improvement of versioning on Windows, take two
On Sun, Aug 10, 2014 at 9:31 PM, MauMau maumau...@gmail.com wrote: From: Michael Paquier michael.paqu...@gmail.com LINK : fatal error LNK1104: cannot open file '.\release\postgres\src_timezone_win32ver.obj' Oh yes, right. I don't really know how I missed this error when testing v1. Adding an explicit call to RemoveFile('src\timezone\win32ver.rc') for project postgres calms down the build. Is the attached working for you? Regards, -- Michael From f88aa01e1f6f17d2b1a0cba05fb9efb8fe06e45f Mon Sep 17 00:00:00 2001 From: Michael Paquier mich...@otacoo.com Date: Tue, 12 Aug 2014 10:02:32 +0900 Subject: [PATCH] Versioning support on MinGW and MSVC for remaining exe and dll files Windows versioning is added for the following files: - regress.dll (MSVC only) - isolationtester.exe - pg_isolation_regress.exe - pg_regress.exe - pg_regress_ecpg.exe - zic.exe --- src/interfaces/ecpg/test/Makefile | 1 + src/test/isolation/Makefile | 7 +-- src/test/regress/GNUmakefile | 7 +-- src/timezone/Makefile | 5 - src/tools/msvc/Mkvcbuild.pm | 7 +++ src/tools/msvc/clean.bat | 3 +++ 6 files changed, 25 insertions(+), 5 deletions(-) diff --git a/src/interfaces/ecpg/test/Makefile b/src/interfaces/ecpg/test/Makefile index 56f6a17..a2e0a83 100644 --- a/src/interfaces/ecpg/test/Makefile +++ b/src/interfaces/ecpg/test/Makefile @@ -14,6 +14,7 @@ override CPPFLAGS := \ $(CPPFLAGS) PGFILEDESC = ECPG Test - regression tests for ECPG +PGAPPICON = win32 # where to find psql for testing an existing installation PSQLDIR = $(bindir) diff --git a/src/test/isolation/Makefile b/src/test/isolation/Makefile index 26bcf22..77bc43d 100644 --- a/src/test/isolation/Makefile +++ b/src/test/isolation/Makefile @@ -6,12 +6,15 @@ subdir = src/test/isolation top_builddir = ../../.. 
include $(top_builddir)/src/Makefile.global +PGFILEDESC = pg_isolation_tester/isolationtester - isolation regression tests +PGAPPICON = win32 + # where to find psql for testing an existing installation PSQLDIR = $(bindir) override CPPFLAGS := -I$(srcdir) -I$(libpq_srcdir) -I$(srcdir)/../regress $(CPPFLAGS) -OBJS = specparse.o isolationtester.o +OBJS = specparse.o isolationtester.o $(WIN32RES) all: isolationtester$(X) pg_isolation_regress$(X) @@ -21,7 +24,7 @@ submake-regress: pg_regress.o: | submake-regress rm -f $@ $(LN_S) $(top_builddir)/src/test/regress/pg_regress.o . -pg_isolation_regress$(X): isolation_main.o pg_regress.o +pg_isolation_regress$(X): isolation_main.o pg_regress.o $(WIN32RES) $(CC) $(CFLAGS) $^ $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@ isolationtester$(X): $(OBJS) | submake-libpq submake-libpgport diff --git a/src/test/regress/GNUmakefile b/src/test/regress/GNUmakefile index b084e0a..64c9924 100644 --- a/src/test/regress/GNUmakefile +++ b/src/test/regress/GNUmakefile @@ -14,6 +14,9 @@ subdir = src/test/regress top_builddir = ../../.. 
include $(top_builddir)/src/Makefile.global +PGFILEDESC = pg_regress - regression tests +PGAPPICON = win32 + # file with extra config for temp build TEMP_CONF = ifdef TEMP_CONFIG @@ -43,7 +46,7 @@ EXTRADEFS = '-DHOST_TUPLE=$(host_tuple)' \ all: pg_regress$(X) -pg_regress$(X): pg_regress.o pg_regress_main.o | submake-libpgport +pg_regress$(X): pg_regress.o pg_regress_main.o $(WIN32RES) | submake-libpgport $(CC) $(CFLAGS) $^ $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@ # dependencies ensure that path changes propagate @@ -177,7 +180,7 @@ bigcheck: all tablespace-setup clean distclean maintainer-clean: clean-lib # things built by `all' target rm -f $(OBJS) refint$(DLSUFFIX) autoinc$(DLSUFFIX) dummy_seclabel$(DLSUFFIX) - rm -f pg_regress_main.o pg_regress.o pg_regress$(X) + rm -f pg_regress_main.o pg_regress.o pg_regress$(X) $(WIN32RES) # things created by various check targets rm -f $(output_files) $(input_files) rm -rf testtablespace diff --git a/src/timezone/Makefile b/src/timezone/Makefile index ef739e9..ab65d22 100644 --- a/src/timezone/Makefile +++ b/src/timezone/Makefile @@ -12,11 +12,14 @@ subdir = src/timezone top_builddir = ../.. 
include $(top_builddir)/src/Makefile.global +PGFILEDESC = timezone - timezone database +PGAPPICON = win32 + # files to build into backend OBJS= localtime.o strftime.o pgtz.o # files needed to build zic utility program -ZICOBJS= zic.o ialloc.o scheck.o localtime.o +ZICOBJS= zic.o ialloc.o scheck.o localtime.o $(WIN32RES) # timezone data files TZDATA = africa antarctica asia australasia europe northamerica southamerica \ diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm index e6fb3ec..8026543 100644 --- a/src/tools/msvc/Mkvcbuild.pm +++ b/src/tools/msvc/Mkvcbuild.pm @@ -112,6 +112,7 @@ sub mkvcbuild $postgres-AddFiles('src\backend\utils\misc', 'guc-file.l'); $postgres-AddFiles('src\backend\replication', 'repl_scanner.l', 'repl_gram.y'); + $postgres-RemoveFile('src\timezone\win32ver.rc'); $postgres-AddDefine('BUILDING_DLL');
Re: [HACKERS] psql: show only failed queries
On Tue, Aug 12, 2014 at 4:41 AM, Pavel Stehule pavel.steh...@gmail.com wrote: ... I looked at the code; you removed the check against duplicate varnames in the list. Is it ok? Oh, just revived that code. yes, It is looking well Ok, committed! Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql: show only failed queries
Oh, just revived that code. yes, It is looking well Ok, committed! super thank you very much Regards Pavel Regards, -- Fujii Masao
Re: [HACKERS] replication commands and log_statements
On Tue, Aug 12, 2014 at 2:34 AM, Robert Haas robertmh...@gmail.com wrote: On Fri, Aug 8, 2014 at 1:04 PM, Fujii Masao masao.fu...@gmail.com wrote: On Sat, Aug 9, 2014 at 12:29 AM, Robert Haas robertmh...@gmail.com wrote: On Fri, Aug 8, 2014 at 11:00 AM, Bruce Momjian br...@momjian.us wrote: On Fri, Aug 8, 2014 at 08:51:13AM -0400, Robert Haas wrote: On Thu, Aug 7, 2014 at 12:07 PM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: At 2014-08-07 23:22:43 +0900, masao.fu...@gmail.com wrote: That is, we log replication commands only when log_statement is set to all. Neither new parameter is introduced nor log_statement is redefined as a list. That sounds good to me. It sounds fairly unprincipled to me. I liked the idea of making log_statement a list, but if we aren't gonna do that, I think this should be a separate parameter. I am unclear there is enough demand for a separate replication logging parameter --- using log_statement=all made sense to me. Most people don't want to turn on log_statement=all because it produces too much log volume. See, for example: http://bonesmoses.org/2014/08/05/on-postgresql-logging-verbosity/ But logging replication commands is quite low-volume, so it is not hard to imagine someone wanting to log all replication commands but not all SQL statements. You can do that by executing ALTER ROLE replication user SET log_statement TO 'all'. If you don't use the replication user to execute SQL statements, no SQL statements are logged in that setting. If you have a user devoted to it, I suppose that's true. I still think it shouldn't get munged together like that. Why do we need to treat only replication commands as special ones and add new parameter to display them? I agree to add new parameter like log_replication to log the replication activity instead of the command itself, like checkpoint activity which can be enabled by log_checkpoints. But I'm not sure why logging of replication commands also needs to be controlled by separate parameter. 
And I still think that those who set log_statement to all expect all commands, including replication commands, to be logged. I'm afraid that a separate parameter covering only replication commands might confuse users. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] postgresql.auto.conf and reload
On Mon, Aug 11, 2014 at 11:08 PM, Fujii Masao masao.fu...@gmail.com wrote: While updating the patch, I found that the ConfigVariable removed from the list has fields whose memory was already allocated, but we forgot to free them. So I included the code to free them in the patch. Yes, that is right. ! /* ! * Pick up only the data_directory if DataDir is not set, which ! * means that the configuration file is read for the first time and ! * PG_AUTOCONF_FILENAME file cannot be read yet. In this case, ! * we shouldn't pick any settings except the data_directory ! * from postgresql.conf because they might be overwritten ! * with the settings in PG_AUTOCONF_FILENAME file which will be ! * read later. OTOH, since it's ensured that data_directory doesn't ! * exist in PG_AUTOCONF_FILENAME file, it will never be overwritten ! * later. ! */ ! else It is a bit inconvenient to read this part of the code; some other places in the same file put the comment inside the else block, which seems better. One example is below: else { /* * ordinary variable, append to list. For multiple items of * same parameter, retain only which comes later. */ I have marked this as Ready For Committer. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] replication commands and log_statements
On Tue, Aug 12, 2014 at 9:29 AM, Fujii Masao masao.fu...@gmail.com wrote:
> On Tue, Aug 12, 2014 at 2:34 AM, Robert Haas robertmh...@gmail.com wrote:
>> If you have a user devoted to it, I suppose that's true. I still
>> think it shouldn't get munged together like that.
>
> Why do we need to treat only replication commands as special ones and
> add a new parameter to display them?

One difference is that replication commands are internally generated
commands. Do we log such internally generated commands anywhere else
with log_statement = all?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Removing dependency to wsock32.lib when compiling code on WIndows
On Tue, Aug 12, 2014 at 9:43 AM, Noah Misch n...@leadboat.com wrote:
> Somehow or other, we must bring these parts into agreement.

It is interesting to see that with MinGW-32b getaddrinfo is still set
to no at the configure phase. What if we simply wrap the undef of
gai_strerror like in the patch attached? I just set up an environment
with MinGW-64 and I was able to build the code (tested as well with
MinGW-32, btw).

Another idea would be to enforce getaddrinfo to no at configure for
MinGW environments.

Thoughts?
-- 
Michael

From 284aff1932d085587aa98a6023d8a6bc4ab9a16a Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Mon, 11 Aug 2014 19:11:45 -0700
Subject: [PATCH] Fix build error of MinGW-64 caused by incorrect
 gai_strerror

Since commit a16bac3, configure detects HAVE_GETADDRINFO at the
configure phase in builds done with MinGW-64, resulting in an incorrect
definition of gai_strerror, as this is unconditionally undefined in
src/include/port/win32/sys/socket.h. This commit simply prevents
gai_strerror from being undefined when HAVE_GETADDRINFO is detected.

This error was first triggered with libpq, but no doubt it may have
been present in other places, libpq being only the first one to fail.
---
 src/include/port/win32/sys/socket.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/include/port/win32/sys/socket.h b/src/include/port/win32/sys/socket.h
index edaee6a..33e697a 100644
--- a/src/include/port/win32/sys/socket.h
+++ b/src/include/port/win32/sys/socket.h
@@ -28,6 +28,8 @@
  * they are defined inline in the MS header files. So we'll use our
  * own */
+#if !defined(HAVE_GETADDRINFO)
 #undef gai_strerror
+#endif
 
 #endif   /* WIN32_SYS_SOCKET_H */
-- 
2.0.4
Re: [HACKERS] 9.5: Memory-bounded HashAgg
On Mon, 2014-08-11 at 01:29 +0200, Tomas Vondra wrote:
> On 10.8.2014 23:26, Jeff Davis wrote:
>> This patch requires the Memory Accounting patch, or something
>> similar to track memory usage.
>
> I think the patch you sent actually includes the accounting patch. Is
> that on purpose, or by accident?

Accident, thank you.

> So once a group gets into memory, it stays there? That's going to
> work fine for aggregates with fixed-size state (int4, or generally
> state that gets allocated and does not grow), but I'm afraid for
> aggregates with growing state (for example array_agg and similar)
> that's not really a solution.

I agree in theory, but for now I'm just not handling that case at all
because there is other work that needs to be done first. For one thing,
we would need a way to save the transition state, and we don't really
have that. In the case of array_agg, the state is not serialized and
there's no generic way to ask it to serialize itself without
finalizing.

I'm open to ideas. Do you think my patch is going generally in the
right direction, and we can address this problem later; or do you think
we need a different approach entirely?

> While hacking on the hash join, I envisioned the hash aggregate might
> work in a very similar manner, i.e. something like this:
>
> * nbatches=1, nbits=0
> * when work_mem gets full => nbatches *= 2, nbits += 1
> * get rid of half the groups, using nbits from the hash
>   => dump the current states into 'states.batchno' file
>   => dump further tuples to 'tuples.batchno' file
> * continue until the end, or until work_mem gets full again

It would get a little messy with HashAgg. Hashjoin is dealing entirely
with tuples; HashAgg deals with tuples and groups.

Also, if the transition state is fixed-size (or even nearly so), it
makes no sense to remove groups from the hash table before they are
finished. We'd need to detect that somehow, and it seems almost like
two different algorithms (though maybe not a bad idea to use a
different algorithm for things like array_agg).
Not saying that it can't be done, but (unless you have an idea) it
requires quite a bit more work than what I did here.

> It also seems to me that the logic of the patch is about this:
>
> * try to look up the group in the hash table
> * found => call the transition function
> * not found
>   * enough space => call the transition function
>   * not enough space => tuple/group goes to a batch
>
> Which pretty much means all tuples need to do the lookup first. The
> nice thing about the hash-join approach is that you don't really need
> to do the lookup - you just need to peek at the hash to see whether
> the group belongs to the current batch (and if not, to which batch it
> should go).

That's an interesting point. I suspect that, in practice, the cost of
hashing the tuple is more expensive than (or at least not much cheaper
than) doing a failed lookup.

> For aggregates using 'internal' to pass pointers, that requires some
> help from the author - serialization/deserialization functions.

Ah, yes, this is what I was referring to earlier.

>> * EXPLAIN details for disk usage
>> * choose number of partitions intelligently
>
> What is the purpose of HASH_DISK_MAX_PARTITIONS? I mean, when we
> decide we need 2048 partitions, why should we use less if we believe
> it will get us over work_mem?

Because I suspect there are costs in having an extra file around that
I'm not accounting for directly. We are implicitly assuming that the
OS will keep around enough buffers for each BufFile to do sequential
writes when needed. If we create a zillion partitions, then either we
end up with random I/O or we push the memory burden into the OS buffer
cache.

We could try to model those costs explicitly to put some downward
pressure on the number of partitions we select, but I just chose to
cap it for now.

> For us, removing the sort is a big deal, because we're working with
> 100M rows regularly. It's more complicated though, because the sort
> is usually enforced by COUNT(DISTINCT) and that's not going to
> disappear because of this patch.
> But that's solvable with a custom aggregate.

I hope this offers you a good alternative. I'm not sure it will ever
beat sort for very high cardinality cases, but I hope it can beat sort
when the group size averages something higher than one. It will also be
safer, so the optimizer can be more aggressive about choosing HashAgg.

Thank you for taking a look so quickly!

Regards,
    Jeff Davis
Re: [HACKERS] Removing dependency to wsock32.lib when compiling code on WIndows
On Tue, Aug 12, 2014 at 01:58:17PM +0900, Michael Paquier wrote:
> On Tue, Aug 12, 2014 at 9:43 AM, Noah Misch n...@leadboat.com wrote:
>> Somehow or other, we must bring these parts into agreement.
>
> It is interesting to see that with MinGW-32b getaddrinfo is still set
> to no at the configure phase. What if we simply wrap the undef of
> gai_strerror like in the patch attached? I just set up an environment
> with MinGW-64 and I was able to build the code (tested as well with
> MinGW-32, btw).

It's easy to devise something that fixes the build. What is the right
fix, and why?

Note that MinGW-w64 is available in both 32-bit and 64-bit flavors; it
is a fork of MinGW, which is always 32-bit. I experienced the problem
with 64-bit MinGW-w64; I don't know how 32-bit MinGW-w64 compares.

> Another idea would be to enforce getaddrinfo to no at configure for
> MinGW environments.

Consistency with the MSVC build is desirable: either HAVE_GETADDRINFO
for both or !HAVE_GETADDRINFO for both.

nm

-- 
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com