Re: [HACKERS] Function to know last log write timestamp

2014-08-11 Thread Andres Freund
On 2014-08-11 12:42:06 +0900, Fujii Masao wrote:
 On Mon, Aug 11, 2014 at 10:48 AM, Tatsuo Ishii is...@postgresql.org wrote:
  On Mon, Aug 11, 2014 at 9:23 AM, Tatsuo Ishii is...@postgresql.org wrote:
  We can know the LSN of last committed WAL record on primary by using
  pg_current_xlog_location(). It seems there's no API to know the time
  when the WAL record was created. I would like to know standby delay by
  using pg_last_xact_replay_timestamp() together with such an API.
 
  If there's no such API, it would be useful to invent such an API IMO.
 
  +1
 
  I proposed that function before, but unfortunately it failed to be applied.
  But I still think that function is useful to calculate the replication 
  delay.
  The past discussion is
  http://www.postgresql.org/message-id/cahgqgwf3zjfunej5ka683ku5rqubtswtqfq7g1x0g34o+jx...@mail.gmail.com
 
  I looked into the thread briefly and found Simon and Robert gave -1
  for this because of performance concerns. I'm not sure if it's an actual
  performance penalty or not. Maybe we need to measure the penalty?
 
 I think that the performance penalty is negligibly small because the patch
 I posted before added only three stores to shared memory per
 commit/abort.

Uh. It adds another atomic operation (the spinlock) to the commit
path. That's surely *not* insignificant. At the very least the
concurrency approach needs to be rethought.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] nulls in GIN index

2014-08-11 Thread Heikki Linnakangas

On 08/11/2014 01:19 AM, worthy7 wrote:


http://www.postgresql.org/docs/9.1/static/gin-implementation.html
As of PostgreSQL 9.1, NULL key values can be included in the index. Also,
placeholder NULLs are included in the index for indexed items that are NULL
or contain no keys according to extractValue. This allows searches that
should find empty items to do so.

How do I define an index that includes nulls then?


You don't need to do anything special, any NULL values will be indexed 
automatically.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function to know last log write timestamp

2014-08-11 Thread Fujii Masao
On Mon, Aug 11, 2014 at 3:54 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-08-11 12:42:06 +0900, Fujii Masao wrote:
 On Mon, Aug 11, 2014 at 10:48 AM, Tatsuo Ishii is...@postgresql.org wrote:
  On Mon, Aug 11, 2014 at 9:23 AM, Tatsuo Ishii is...@postgresql.org 
  wrote:
  We can know the LSN of last committed WAL record on primary by using
  pg_current_xlog_location(). It seems there's no API to know the time
  when the WAL record was created. I would like to know standby delay by
  using pg_last_xact_replay_timestamp() together with such an API.
 
  If there's no such API, it would be useful to invent such an API IMO.
 
  +1
 
  I proposed that function before, but unfortunately it failed to be 
  applied.
  But I still think that function is useful to calculate the replication 
  delay.
  The past discussion is
  http://www.postgresql.org/message-id/cahgqgwf3zjfunej5ka683ku5rqubtswtqfq7g1x0g34o+jx...@mail.gmail.com
 
  I looked into the thread briefly and found Simon and Robert gave -1
  for this because of performance concerns. I'm not sure if it's an actual
  performance penalty or not. Maybe we need to measure the penalty?

 I think that the performance penalty is negligibly small because the patch
 I posted before added only three stores to shared memory per
 commit/abort.

 Uh. It adds another atomic operation (the spinlock) to the commit
 path. That's surely *not* insignificant. At the very least the
 concurrency approach needs to be rethought.

No, the patch doesn't add the spinlock at all. What the commit path
additionally does is:

1. increment the counter in shared memory
2. set the timestamp of last commit record to shared memory
3.  increment the counter in shared memory

There is no extra spinlock.

OTOH, when pg_last_xact_insert_timestamp reads the timestamp from
the shared memory, it checks whether the counter values are the same
or not before and after reading the timestamp. If they are not the same,
it tries to read the timestamp again. This logic is necessary for reading
the consistent timestamp value there.
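
For illustration, here is a minimal, self-contained sketch of that
counter-based protocol (essentially a seqlock). The struct and function
names below are made up for this example and are not the identifiers used
in the actual patch; on weakly ordered platforms the writer and reader
would additionally need memory barriers, as noted downthread:

#include <stdint.h>

/* Hypothetical shared-memory area; names are illustrative only. */
typedef struct CommitTsShared
{
    volatile uint32_t counter;     /* bumped before and after each update */
    volatile int64_t  commit_ts;   /* timestamp of the last commit record */
} CommitTsShared;

/* Writer (commit path): two counter stores plus the timestamp store. */
static void
report_commit_timestamp(CommitTsShared *shared, int64_t ts)
{
    shared->counter++;             /* odd value = update in progress */
    /* a write barrier would be needed here on weak-memory hardware */
    shared->commit_ts = ts;
    /* ... and another one here */
    shared->counter++;             /* back to an even value */
}

/* Reader (pg_last_xact_insert_timestamp): retry until a stable snapshot. */
static int64_t
read_commit_timestamp(CommitTsShared *shared)
{
    uint32_t before, after;
    int64_t  ts;

    do
    {
        before = shared->counter;
        ts = shared->commit_ts;
        after = shared->counter;
    } while (before != after || (before & 1) != 0);

    return ts;
}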

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers

2014-08-11 Thread Fujii Masao
On Mon, Aug 11, 2014 at 2:10 PM, Michael Paquier
michael.paqu...@gmail.com wrote:


 On Mon, Aug 11, 2014 at 1:26 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Thanks for updating the patch! Again I tested the feature and found
 something wrong. I set synchronous_standby_num to 2 and started three
 standbys. Two of them are included in synchronous_standby_names, i.e.,
 they are synchronous standbys. That is, the other standby is always
 asynchronous. When I shut down one of the synchronous standbys and
 executed a write transaction, the transaction completed successfully.
 So the transaction doesn't wait for two sync standbys in that case.
 Probably this is a bug.
 Well, that's working in my case :)

Oh, that worked on my machine, too, this time... I did something wrong.
Sorry for the noise.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function to know last log write timestamp

2014-08-11 Thread Andres Freund
On 2014-08-11 16:20:41 +0900, Fujii Masao wrote:
 On Mon, Aug 11, 2014 at 3:54 PM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-08-11 12:42:06 +0900, Fujii Masao wrote:
  On Mon, Aug 11, 2014 at 10:48 AM, Tatsuo Ishii is...@postgresql.org 
  wrote:
   On Mon, Aug 11, 2014 at 9:23 AM, Tatsuo Ishii is...@postgresql.org 
   wrote:
   We can know the LSN of last committed WAL record on primary by using
   pg_current_xlog_location(). It seems there's no API to know the time
   when the WAL record was created. I would like to know standby delay by
   using pg_last_xact_replay_timestamp() together with such an API.
  
   If there's no such API, it would be useful to invent such an API IMO.
  
   +1
  
   I proposed that function before, but unfortunately it failed to be 
   applied.
   But I still think that function is useful to calculate the replication 
   delay.
   The past discussion is
   http://www.postgresql.org/message-id/cahgqgwf3zjfunej5ka683ku5rqubtswtqfq7g1x0g34o+jx...@mail.gmail.com
  
   I looked into the thread briefly and found Simon and Robert gave -1
   for this because of performance concerns. I'm not sure if it's an actual
   performance penalty or not. Maybe we need to measure the penalty?
 
  I think that the performance penalty is negligibly small because the patch
  I posted before added only three stores to shared memory per
  commit/abort.
 
  Uh. It adds another atomic operation (the spinlock) to the commit
  path. That's surely *not* insignificant. At the very least the
  concurrency approach needs to be rethought.
 
 No, the patch doesn't add the spinlock at all. What the commit path
 additionally does is:
 
 1. increment the counter in shared memory
 2. set the timestamp of last commit record to shared memory
 3.  increment the counter in shared memory
 
 There is no extra spinlock.

Ah, I see. There's another patch somewhere down that thread
(cahgqgwg4xfzjfyzabn5v__d3qpynnsgbph3nar6p40elivk...@mail.gmail.com). The
patch in the message you linked to *does* use a spinlock though.

 OTOH, when pg_last_xact_insert_timestamp reads the timestamp from
 the shared memory, it checks whether the counter values are the same
 or not before and after reading the timestamp. If they are not the same,
 it tries to read the timestamp again. This logic is necessary for reading
 the consistent timestamp value there.

Yea, that approach then just touches a cacheline that should already be
local. I doubt that the implementation is correct on some more lenient
platforms (missing write memory barrier), but that's not your fault.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers

2014-08-11 Thread Michael Paquier
On Mon, Aug 11, 2014 at 4:26 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Mon, Aug 11, 2014 at 2:10 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 Oh, that worked on my machine, too, this time... I did something wrong.
 Sorry for the noise.
No problem, thanks for spending time testing.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] pg_last_xact_insert_timestamp

2014-08-11 Thread Andres Freund
Hi,

On 2011-10-04 20:52:59 +0900, Fujii Masao wrote:
 *** a/src/backend/access/transam/xact.c
 --- b/src/backend/access/transam/xact.c
 ***
 *** 1066,1071  RecordTransactionCommit(void)
 --- 1066,1074 
   
   (void) XLogInsert(RM_XACT_ID, XLOG_XACT_COMMIT_COMPACT, 
 rdata);
   }
 + 
 + /* Save timestamp of latest transaction commit record */
 + pgstat_report_xact_end_timestamp(xactStopTimestamp);
   }


Perhaps that pgstat_report() should instead be combined with the
pgstat_report_xact_timestamp(0) in CommitTransaction()? Then the number
of changecount increases and cacheline references would stay the
same. The only thing that'd change would be a single additional
assignment.
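
A rough sketch of what such a combined reporting call might look like,
modeled on the changecount protocol that pgstat_report_xact_timestamp()
already uses; the function name and the st_xact_end_timestamp field below
are illustrative stand-ins for whatever the patch actually adds:

static void
pgstat_report_xact_timestamps(TimestampTz xact_start, TimestampTz xact_end)
{
    volatile PgBackendStatus *beentry = MyBEEntry;

    if (!pgstat_track_activities || !beentry)
        return;

    /* one changecount bump covers both assignments */
    beentry->st_changecount++;
    beentry->st_xact_start_timestamp = xact_start;   /* 0 at commit */
    beentry->st_xact_end_timestamp = xact_end;       /* hypothetical field */
    beentry->st_changecount++;
    Assert((beentry->st_changecount & 1) == 0);
}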

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] nulls in GIN index

2014-08-11 Thread worthy7
Perhaps I'm missing something

Table has 2 columns, text and ftstext

text: "how are you"
ftstext: (nothing)

Because "how" and "are" and "you" are too common to be tsvectored. Which is
fine.

So if a user searches for "how are you":
select * from tbl_lines WHERE
ftstext @@ plainto_tsquery('English', 'how are you')

Returns nothing. Which I somewhat understand, but I want it to return all
the rows with nothing in the ftstext.
plainto_tsquery('English', 'how are you') = ''
and the ftstext of some rows is also = ''
So why doesn't the index return all these rows when a null string is
searched?

I think you can see what I'm trying to achieve; how do I do it?



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/nulls-in-GIN-index-tp5814384p5814416.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] pg_last_xact_insert_timestamp

2014-08-11 Thread Fujii Masao
On Mon, Aug 11, 2014 at 4:46 PM, Andres Freund and...@2ndquadrant.com wrote:
 Hi,

 On 2011-10-04 20:52:59 +0900, Fujii Masao wrote:
 *** a/src/backend/access/transam/xact.c
 --- b/src/backend/access/transam/xact.c
 ***
 *** 1066,1071  RecordTransactionCommit(void)
 --- 1066,1074 

   (void) XLogInsert(RM_XACT_ID, 
 XLOG_XACT_COMMIT_COMPACT, rdata);
   }
 +
 + /* Save timestamp of latest transaction commit record */
 + pgstat_report_xact_end_timestamp(xactStopTimestamp);
   }


 Perhaps that pgstat_report() should instead be combined with the
 pgstat_report_xact_timestamp(0) in CommitTransaction()? Then the number
 of changecount increases and cacheline references would stay the
 same. The only thing that'd change would be a single additional
 assignment.

Sounds like a good suggestion.

While reading the patch again, I found it didn't handle the COMMIT/ABORT
PREPARED case properly. As of commit e74e090, pg_last_xact_replay_timestamp()
now returns the timestamp of COMMIT/ABORT PREPARED.
pg_last_xact_insert_timestamp() is mainly expected to be used to calculate
the replication delay, so it also needs to return that timestamp. But the patch
didn't change 2PC code at all. We need to add pgstat_report_xact_end_timestamp()
into FinishPreparedTransaction(), RecordTransactionCommitPrepared() or
RecordTransactionAbortPrepared().

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql: show only failed queries

2014-08-11 Thread Fujii Masao
On Sat, Aug 9, 2014 at 4:16 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hi


 2014-08-08 13:58 GMT+02:00 Fujii Masao masao.fu...@gmail.com:

 On Fri, Aug 8, 2014 at 4:31 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
 
 
 
  2014-08-07 7:10 GMT+02:00 Fujii Masao masao.fu...@gmail.com:
 
  On Thu, Aug 7, 2014 at 6:26 AM, Pavel Stehule pavel.steh...@gmail.com
  wrote:
   Hello
  
   updated version patch in attachment
 
  Thanks! But ISTM you forgot to attached the patch.
 
 
  grr .. I am sorry

 No problem. Thanks for the patch! Attached is the revised version of the
 patch.

   +/* all psql known variables are included in list by default */
   +for (known_varname = known_varnames; *known_varname;
   known_varname++)
   +varnames[nvars++] = pg_strdup(*known_varname);
  
   Don't we need to append both prefix and suffix to even known
   variables?
  
  
   ??? I am not sure if I understand well - probably system read only
   variables as DBNAME, USER, VERSION should not be there
 
  I had that question because complete_from_variables() is also called by
  the
  tab-completion of \echo : and it shows half-baked variables list. So
  I thought probably we need to change complete_from_variables() more to
  fix the problem.
 
 
  I understand now.
 
  I fixed it.
 
   I have a question. \echo probably should not show empty known
   variables.
 
  data for autocomplete for \echo should be same as result of \set

 Agreed. I think that only the variables having the set values should be
 displayed in \echo : case. So I modified complete_from_variables()
 so that the unset variables are not shown in that case but all the
 variables
 are shown in the tab-completion of \set.

    +else if (strcmp(prev2_wd, "\\set") == 0)
    +{
    +if (strcmp(prev_wd, "AUTOCOMMIT") == 0)
  
   ISTM that some psql variables like IGNOREEOF are not there. Why not?
  
  
   yes, there are not complete for DBNAME, ENCODING, FETCH_COUNT,
   HISTCONTROL,
   HISTFILE, HISTSIZE, HOST, IGNOREEOFF, PROMPT*,USER,  VERSION
  
   There are more reasons:
  
    * parameter is not an enum (string, number or both): FETCH_COUNT,
    PROMPT,
    HISTSIZE, ..
   
    * variable is pseudo read-only - changing it has no effect: DBNAME,
    ENCODING, VERSION
 
  So HISTCONTROL should be there because it doesn't have such reasons at
  all?
 
 
  yes

 ISTM that you forgot to add HISTCONTROL to your patch. So I just added
 that.

 I added the tab-completion for \unset command. Like \echo :, only
 the variables having the set values should be displayed in \unset case.


 perfect



 I changed complete_from_variables() so that it checks the memory size of
 the variable array even when appending the known variables. If the memory
 size is not enough, it's dynamically increased. Practically this check
 would
 not be required because the initial size of the array is sufficiently larger
 than
 the number of the known variables. I added this as a safeguard.

 Typo: IGNOREEOFF -> IGNOREEOF

 I removed the value "none" from the value list of ECHO because it's not
 documented and a user might get confused when he or she sees the
 undocumented
 value "none". Thoughts?


 isn't it better to fix the doc? I don't know any reason why we should not
 support "none"

I'm OK with this. The attached patch adds support for the "none" value both
in the ECHO and HISTCONTROL variables (because HISTCONTROL had the same
problem as ECHO had), and also adds a description of that value to
the documentation.

 I looked at the code; you removed the check against duplicate varnames in
 the list. Is that ok?

Oh, I just revived that code.

Regards,

-- 
Fujii Masao
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
***
*** 2827,2833  bar
  they are sent to the server. The switch for this is
  <option>-e</option>. If set to <literal>errors</literal> then only
  failed queries are displayed on standard error output. The switch
! for this is <option>-b</option>.
  </para>
  </listitem>
</varlistentry>
--- 2827,2835 
  they are sent to the server. The switch for this is
  <option>-e</option>. If set to <literal>errors</literal> then only
  failed queries are displayed on standard error output. The switch
! for this is <option>-b</option>. If unset, or if set to
! <literal>none</literal> (or any other value than those above) then
! no queries are displayed.
  </para>
  </listitem>
</varlistentry>
***
*** 2892,2899  bar
   list. If set to a value of <literal>ignoredups</literal>, lines
   matching the previous history line are not entered. A value of
   <literal>ignoreboth</literal> combines the two options. If
!  unset, or if set to any other value than those above, all lines
!  read in interactive mode are saved on the history list.
  </para>
  <note>
  <para>
--- 2894,2902 
   list. If set to a 

[HACKERS] ProcessUtilityHook DropStmt RenameStmt

2014-08-11 Thread geohas

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi!

I try to catch a DropStmt and convert it to a Rename Stmt, like petere's
pg_trashcan, but i don't like to create new schema. I only like to
rename the table in case of a drop table query.
Is this possible with something like:

ProcessUtility (Node * parsetree,
   const char *queryString,
   ParamListInfo params,
   bool isTopLevel,
   DestReceiver * dest, char *completionTag)

{
    if (nodeTag(parsetree) == T_DropStmt)
    {
        DropStmt   *stmt = (DropStmt *) parsetree;

        if (stmt->removeType == OBJECT_TABLE)
        {
            RenameStmt *newstmt = makeNode(RenameStmt);

            newstmt->objectType = stmt->removeType;
            newstmt->newname = new_name;
            parsetree = (Node *) newstmt;
        }
        (*prev_ProcessUtility) (parsetree, queryString, context, params,
                                dest, completionTag);
    }
}

regards

ge0has
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.22 (GNU/Linux)

iQEcBAEBAgAGBQJT6MfNAAoJEJFGMlQe7wR/RCQH/1KOwtCLDT2QVrGm/PKfIFGF
e6w+oOCUYz8v78s+uvI5Y5qEuUr2wqYuUhhV7UWXWBwKgLPkSvUTv04TWS9Ms6FJ
+Zn+yzqWUygdwDzKbKY3/qYreYAL6ZBv62ldjtApNUh1VHpPtZsPWtIe/485KB6v
W4xZt7PUAKOUlqTiQwaZok2rdYt0t7vWdVmw6qncUnlPGBpJM/XGGwDl4w5NCK23
Ls5ueLpe8gKoH1eMYG27FKo1rRARVBtB3zPkXmmfRZR+f1FUIkhiDkfm1AYhBJPy
FG0yExArvZjZLQIIEaenb8GzwjR04Ulaqej5CLPdOB0NomkN0aN0CKcSRT9SrME=
=Y3t4
-END PGP SIGNATURE-



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL

2014-08-11 Thread Kevin Grittner
Stephen Frost sfr...@snowman.net wrote:

 Our grace period for active backends after unclean exit of one
 of their peers is low, milliseconds to seconds.  Our grace
 period for active backends after unclean exit of the postmaster
 is unconstrained.  At least one of those policies has to be
 wrong. Like Andres and Robert, I pick the second one.

 Ditto for me.

+1

In fact, I would say that is slightly understated.  The grace
period for active backends after unclean exit of one of their peers
is low, milliseconds to seconds, *unless the postmaster has also
crashed* -- in which case it is unconstrained.  Why is the crash of
a backend less serious if the postmaster has also crashed?
Certainly it can't be considered surprising that if the
postmaster is crashing, other backends might also be crashing
around the same time?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] nulls in GIN index

2014-08-11 Thread Andreas Joseph Krogh
On Monday 11 August 2014 at 11:17:56, worthy7 worthy@gmail.com wrote:
 Perhaps I'm missing something

 Table has 2 columns, text and ftstext

 text: how are you
 ftstest: (nothing)

 Because how and are and you are too common to be tsvectored. Which is
 fine.

 So if a user searches for how are you:
 select * from tbl_lines WHERE
 ftstext @@ plainto_tsquery('English', 'how are you')

 Returns nothing. Which I somewhat understand, but I want it to return all
 the rows with nothing in the ftstext.
 plainto_tsquery('English', 'how are you') = ''
 and the ftstext of some rows is also = ''
 So why doesn't the index return all these rows when a null string is
 searched.

 I think you can see what I'm trying to achieve, how do I do it?

Use the 'simple' dictionary:

my_fts_column @@ to_tsquery('simple', 'how are you')

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com
www.visena.com
https://www.visena.com

Re: [HACKERS] Hokey wrong versions of libpq in apt.postgresql.org

2014-08-11 Thread Stephen Frost
JD,

* Joshua D. Drake (j...@commandprompt.com) wrote:
 The issue that I specifically ran into is that by using
 apt.postgresql.org in its default configuration, I can't add certain
 extensions (without jumping through hoops). Simple:
 
 Assume a running 9.2.9 from apt.postgresql.org
 apt-get install pgxnclient
 sudo pgxnclient install pg_repack
 
 
 
 Doesn't work. Because it is looking for libs in
 /var/lib/postgresql/9.2 not /var/lib/postgresql/9.3.

Have you got any postgresql-server-dev package installed?  If not, then
pg_config is going to grab the info for the libpq-dev that's installed,
but I doubt the extension is going to compile without the server-dev
package installed anyway..

In any case, pgxnclient should probably be more intelligent when it's
working under a Debian-based installation where multiple major versions
of PG can be installed.

 Yes. I can get the 9.2 libpq but that isn't really the point is it?
 This is quite unexpected behavior from an operational perspective.
 It should just work but it doesn't because we are shipping from
 apt.postgresql.org a 9.3 version of libpq.

I don't believe the 9.3 version of libpq is the issue here at all, see
above..

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Proposal: Incremental Backup

2014-08-11 Thread Robert Haas
On Tue, Aug 5, 2014 at 8:04 PM, Simon Riggs si...@2ndquadrant.com wrote:
 To decide whether we need to re-copy the file, you read the file until
 we find a block with a later LSN. If we read the whole file without
 finding a later LSN then we don't need to re-copy. That means we read
 each file twice, which is slower, but the file is at most 1GB in size,
 which we can assume will be mostly in memory for the second read.

That seems reasonable, although copying only the changed blocks
doesn't seem like it would be a whole lot harder.  Yes, you'd need a
tool to copy those blocks back into the places where they need to go,
but that's probably not a lot of work and the disk savings, in many
cases, would be enormous.
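
As a rough, standalone illustration of the block-LSN test (not code from
any posted patch): every heap or index page starts with a header whose
first 8 bytes are the page LSN, stored as two 32-bit halves with the high
word first, so a backup tool could scan a relation file in 8kB blocks and
compare each page LSN against a threshold. This assumes the default block
size and same-endian reads, and omits error handling:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLCKSZ 8192                     /* default PostgreSQL block size */

/*
 * Return true if any block in the file carries a page LSN newer than
 * 'threshold', i.e. the file (or block) would need to be re-copied.
 */
static bool
file_changed_since(FILE *fp, uint64_t threshold)
{
    unsigned char page[BLCKSZ];

    while (fread(page, 1, BLCKSZ, fp) == BLCKSZ)
    {
        uint32_t hi, lo;
        uint64_t lsn;

        memcpy(&hi, page, sizeof(hi));                /* pd_lsn, high half */
        memcpy(&lo, page + sizeof(hi), sizeof(lo));   /* pd_lsn, low half */
        lsn = ((uint64_t) hi << 32) | lo;

        if (lsn > threshold)
            return true;        /* newer block found: must copy */
    }
    return false;               /* all blocks older: file can be skipped */
}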

 As Marco says, that can be optimized using filesystem timestamps instead.

The idea of using filesystem timestamps gives me the creeps.  Those
aren't always very granular, and I don't know that (for example) they
are crash-safe.  Does every filesystem on every platform make sure
that the mtime update hits the disk before the data?  What about clock
changes made manually by users, or automatically by ntpd? I recognize
that there are people doing this today, because it's what we have, and
it must not suck too much, because people are still doing it ... but I
worry that if we do it this way, we'll end up with people saying
"PostgreSQL corrupted my data" and will have no way of tracking the
problem back to the filesystem or system clock event that was the true
cause of the problem, so they'll just blame the database.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL

2014-08-11 Thread Andres Freund
On 2014-08-10 18:36:18 -0400, Noah Misch wrote:
 [Due for a new subject line?]
 
 On Sat, Aug 09, 2014 at 08:16:01PM +0200, Andres Freund wrote:
  On 2014-08-09 14:09:36 -0400, Tom Lane wrote:
   Andres Freund and...@2ndquadrant.com writes:
On 2014-08-09 14:00:49 -0400, Tom Lane wrote:
I don't think it's anywhere near as black-and-white as you guys claim.
What it comes down to is whether allowing existing 
transactions/sessions
to finish is more important than allowing new sessions to start.
Depending on the application, either could be more important.
   
Nah. The current behaviour circumvents security measures we normally
consider absolutely essential. If the postmaster died some bad shit went
on. The likelihood of hitting corner case bugs where it's important that
we react to a segfault/panic with a restart/crash replay is rather high.
   
   What's your point?  Once a new postmaster starts, it *will* do a crash
   restart, because certainly no shutdown checkpoint ever happened.
  
  That's not saying much. For one, there can be online checkpoints in that
  time. So it's certainly not guaranteed (or even all that likely) that
  all the WAL since the incident is replayed.  For another, it can be
  *hours* before all the backends finish.
  
  IIRC we'll continue to happily write WAL and everything after postmaster
  (and possibly some backends, corrupting shmem) have crashed. The
  bgwriter, checkpointer, backends will continue to write dirty buffers to
  disk. We'll IIRC continue to write checkpoints. That's simply not
  something we should be doing after the postmaster crashed if we can
  avoid it at all.
 
 The basic support processes, including the checkpointer, exit promptly upon
 detecting a postmaster exit.  Checkpoints cease.

Only after finishing an 'in process' checkpoint though afaics. And only
if no new checkpoint has been requested since. The latter because we
don't even test for postmaster death if a latch has been set... I think
it's similar for the bgwriter and such.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL

2014-08-11 Thread Robert Haas
On Sat, Aug 9, 2014 at 2:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 +1. I think the current behaviour is a seriously bad idea.

 I don't think it's anywhere near as black-and-white as you guys claim.
 What it comes down to is whether allowing existing transactions/sessions
 to finish is more important than allowing new sessions to start.
 Depending on the application, either could be more important.

It's partly about that, and I think the answer is that being able to
start new sessions is almost always more important; but it's also
about about the fact that the postmaster provides essential
protections against data corruption, and running without those
protections is a bad idea.  If it's not a bad idea, then why do we
need those protections ever?  Why have we put so much effort into
bullet-proofing them over the years?

I mean, we could simply regard the unexpected end of a backend as
being something that is probably OK and we'd usually be right; after
all, a backend wouldn't crap out without releasing a critical spinlock
very often. A lot of users would probably be very happy to be
liberated from the tyranny of a server-wide restart every time a
backend crashes, and 90% of the time nothing bad would happen.  But
clearly this is insanity, because every now and then something would
go terribly wrong and there would be no automated way for the system
to recover, and on even rarer occasions your data would get eaten.
That is why it is right to think that the service provided by the
postmaster is essential, not nice-to-have.

 Ideally we'd have some way to configure the behavior appropriately for
 a given installation; but short of that, it's unclear to me that
 unilaterally changing the system's bias is something our users would
 thank us for.  I've not noticed a large groundswell of complaints about
 it (though this may just reflect that we've made the postmaster pretty
 darn robust, so that the case seldom comes up).

I do think that's a large part of it.  The postmaster doesn't get
killed very often, and when it does, things are often messed up to a
degree where the user's just going to reboot anyway.  But I've
encountered customers who managed to corrupt their database because
backends didn't exit when the postmaster died, because it turns out
that removing postmaster.pid defeats the shared memory interlocks that
normally prevent starting a new postmaster, and the customer did that.
And I've personally experienced at least one protracted outage that
resulted from orphaned backends preventing 'pg_ctl restart' from
working.  If the postmaster weren't so reliable, I'm sure these kinds
of problems would be a lot more common.

But the fact that they're uncommon doesn't mean that the current
behavior is the best one, and I'm convinced that it isn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] failures on barnacle (CLOBBER_CACHE_RECURSIVELY) because of memory leaks

2014-08-11 Thread Tomas Vondra
Hi!

So after 83 days, the regression tests on barnacle completed, and it
smells like another memory leak in CacheMemoryContext, similar to those
fixed in 078b2ed on May 18.

Barnacle is one of those machines with -DCLOBBER_CACHE_RECURSIVELY, and
the tests were running with a snapshot from May 19, so the memory leaks
detected with only -DCLOBBER_CACHE_ALWAYS should be fixed there.

I have a script in place looking for excessive memory usage of postgres
backends - whenever a backend allocates more than 512MB of VSS, it does
MemoryContextStats(TopMemoryContext) on it (from gdb).

The results get logged into the postmaster log, and thus get to the
buildfarm.

See this:
http://pgbuildfarm.org/cgi-bin/show_log.pl?nm=barnacle&dt=2014-05-19%2011%3A25%3A03

Essentially, it looks like this:

TopMemoryContext: 69984 total in 10 blocks; 6216 free (24 chunks); 63768 used
  TopTransactionContext: 8192 total in 1 blocks; 5768 free (23 chunks);
2424 used
 ...
  Relcache by OID: 8192 total in 1 blocks; 592 free (0 chunks); 7600 used
  CacheMemoryContext: 301981696 total in 45 blocks; 3701744 free (140
chunks); 298279952 used
 ...

or like this:

   CacheMemoryContext: 1602215936 total in 200 blocks; 497336 free (138
chunks); 1601718600 used


So probably another low-probability memory leak, happening apparently only
in CLOBBER_CACHE_RECURSIVELY.


regards
Tomas



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations

2014-08-11 Thread Robert Haas
On Fri, Aug 8, 2014 at 10:30 AM, MauMau maumau...@gmail.com wrote:
 I've tracked down the real root cause.  The fix is very simple.  Please
 check the attached one-liner patch.

 I confirmed that the fix is already in 9.3 and 9.5devel, so I just copied
 the code fragment from 9.5devel to 9.2.9.  The attached patch is for 9.2.9.
 I didn't check 9.4 and other versions.  Why wasn't the fix applied to 9.2?

It was considered a performance improvement - i.e. a feature - rather
than a bug fix, so it was only added to master.  That was after the
release of 9.2 and before the release of 9.3, so it's in newer
branches but not older ones.

Author: Heikki Linnakangas heikki.linnakan...@iki.fi
Branch: master Release: REL9_3_BR [c9d7dbacd] 2013-01-29 10:43:33 +0200

Skip truncating ON COMMIT DELETE ROWS temp tables, if the transaction hasn't
touched any temporary tables.

We could try harder, and keep track of whether we've inserted to any temp
tables, rather than accessed them, and which temp tables have been inserted
to. But this is dead simple, and already covers many interesting scenarios.

I'd support back-porting that commit to 9.1 and 9.2 as a fix for this
problem.  As the commit message says, it's dead simple.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Incremental Backup

2014-08-11 Thread Claudio Freire
On Mon, Aug 11, 2014 at 12:27 PM, Robert Haas robertmh...@gmail.com wrote:

 As Marco says, that can be optimized using filesystem timestamps instead.

 The idea of using filesystem timestamps gives me the creeps.  Those
 aren't always very granular, and I don't know that (for example) they
 are crash-safe.  Does every filesystem on every platform make sure
 that the mtime update hits the disk before the data?  What about clock
 changes made manually by users, or automatically by ntpd? I recognize
 that there are people doing this today, because it's what we have, and
 it must not suck too much, because people are still doing it ... but I
 worry that if we do it this way, we'll end up with people saying
 "PostgreSQL corrupted my data" and will have no way of tracking the
 problem back to the filesystem or system clock event that was the true
 cause of the problem, so they'll just blame the database.

I have the same creeps. I only do it on a live system, after a first
full rsync, where mtime persistence is not an issue, and where I know
ntp updates have not happened.

I had a problem once where a differential rsync with timestamps didn't
work as expected, and corrupted a slave. It was a test system so I
didn't care much at the time, but if it were a backup, I'd be quite
pissed.

Basically, mtimes aren't trustworthy across reboots. Granted, this was
a very old system, debian 5 when it was new, IIRC, so it may be better
now. But it does illustrate just how bad things can get when one
trusts timestamps. This case was an old out-of-sync slave on a test
set up that got de-synchronized, and I tried to re-synchronize it with
a delta rsync to avoid the hours it would take to actually compare
everything (about a day). One segment that was modified after the sync
 loss was not transferred, causing trouble at the slave, so I was forced
to re-synchronize with a full rsync (delta, but without timestamps).
This was either before pg_basebackup or before I heard of it ;-), but
in any case, if it happened on a test system with little activity, you
can be certain it can happen on a production system.

So I now only trust mtime when there has been neither a reboot nor an
ntpd running since the last mtime-less rsync. On those cases, the
optimization works and helps a lot. But I doubt you'll take many
incremental backups matching those conditions.

Say what you will of anecdotal evidence, but the issue is quite clear
theoretically as well: modifications to file segments that aren't
reflected within mtime granularity. There are many reasons why mtime
could lose precision. Being an old filesystem with second-precision
timestamps is just one, but not the only one.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] replication commands and log_statements

2014-08-11 Thread Robert Haas
On Fri, Aug 8, 2014 at 1:04 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Sat, Aug 9, 2014 at 12:29 AM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Aug 8, 2014 at 11:00 AM, Bruce Momjian br...@momjian.us wrote:
 On Fri, Aug  8, 2014 at 08:51:13AM -0400, Robert Haas wrote:
 On Thu, Aug 7, 2014 at 12:07 PM, Abhijit Menon-Sen a...@2ndquadrant.com 
 wrote:
  At 2014-08-07 23:22:43 +0900, masao.fu...@gmail.com wrote:
  That is, we log replication commands only when log_statement is set to
  all. Neither new parameter is introduced nor log_statement is
  redefined as a list.
 
  That sounds good to me.

 It sounds fairly unprincipled to me.  I liked the idea of making
 log_statement a list, but if we aren't gonna do that, I think this
 should be a separate parameter.

 I am unclear there is enough demand for a separate replication logging
 parameter --- using log_statement=all made sense to me.

 Most people don't want to turn on log_statement=all because it
 produces too much log volume.

 See, for example:
 http://bonesmoses.org/2014/08/05/on-postgresql-logging-verbosity/

 But logging replication commands is quite low-volume, so it is not
 hard to imagine someone wanting to log all replication commands but
 not all SQL statements.

 You can do that by executing
 ALTER ROLE replication user SET log_statement TO 'all'.
 If you don't use the replication user to execute SQL statements,
 no SQL statements are logged in that setting.

If you have a user devoted to it, I suppose that's true.  I still
think it shouldn't get munged together like that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgresql.auto.conf and reload

2014-08-11 Thread Fujii Masao
On Sun, Aug 10, 2014 at 3:54 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Fri, Aug 8, 2014 at 11:41 AM, Fujii Masao masao.fu...@gmail.com wrote:

 Yep, right. ParseConfigFp() is not good place to pick up data_directory.
 What about the attached patch which makes ProcessConfigFile() instead of
 ParseConfigFp() pick up data_directory just after the configuration file
 is
 parsed?

 I think this is better way to handle it. Few comments about patch:

Thanks for the review! Attached is the updated version of the patch.

 1. can't we directly have the code by having else in below loop.
 if (DataDir &&
 !ParseConfigFile(PG_AUTOCONF_FILENAME, NULL, false, 0, elevel,
 &head, &tail))

Done.

 2.
 + if (DataDir == NULL)
 + {
 + ConfigVariable *prev = NULL;
 + for (item = head; item;)
 + {
 ..
 ..
 + }

 If data_directory is not present in list, then can't we directly return
 and leave the other work in function ProcessConfigFile() for second
 pass.

Done.

 3.
 I think comments can be improved:
 a.
 + In this case,
 + * we should not pick up all the settings except the data_directory
 + * from postgresql.conf yet because they might be overwritten
 + * with the settings in PG_AUTOCONF_FILENAME file which will be
 + * read later.

 It would be better if we change it as below:

 In this case, we shouldn't pick any settings except the data_directory
 from postgresql.conf because they might be overwritten
 with the settings in PG_AUTOCONF_FILENAME file which will be
 read later.

Done.

 b.
 +Now only the data_directory needs to be picked up to
 +  * read PG_AUTOCONF_FILENAME file later.

 This part of comment seems to be repetitive as you already mentioned
 some part of it in first line as well as at one other location:

Okay, I just remove that line.

While updating the patch, I found that when a ConfigVariable is removed
from the list, the memory already allocated for its fields was never
freed. So I included code to free it in the patch.
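
For reference, the FreeConfigVariable() helper that the patch below
declares but whose body is not included in the excerpt would presumably
look something like this sketch, based on the fields that ParseConfigFp()
allocates for each ConfigVariable (the actual implementation may differ):

static void
FreeConfigVariable(ConfigVariable *item)
{
    pfree(item->name);
    pfree(item->value);
    pfree(item->filename);
    pfree(item);
}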

Regards,

-- 
Fujii Masao
*** a/src/backend/utils/misc/guc-file.l
--- b/src/backend/utils/misc/guc-file.l
***
*** 45,50  static unsigned int ConfigFileLineno;
--- 45,52 
  static const char *GUC_flex_fatal_errmsg;
  static sigjmp_buf *GUC_flex_fatal_jmp;
  
+ static void FreeConfigVariable(ConfigVariable *item);
+ 
  /* flex fails to supply a prototype for yylex, so provide one */
  int GUC_yylex(void);
  
***
*** 151,164  ProcessConfigFile(GucContext context)
  	 * file is in the data directory, we can't read it until the DataDir has
  	 * been set.
  	 */
! 	if (DataDir &&
! 		!ParseConfigFile(PG_AUTOCONF_FILENAME, NULL, false, 0, elevel,
! 		 &head, &tail))
  	{
! 		/* Syntax error(s) detected in the file, so bail out */
! 		error = true;
! 		ErrorConfFile = PG_AUTOCONF_FILENAME;
! 		goto cleanup_list;
  	}
  
  	/*
--- 153,218 
  	 * file is in the data directory, we can't read it until the DataDir has
  	 * been set.
  	 */
! 	if (DataDir)
  	{
! 		if (!ParseConfigFile(PG_AUTOCONF_FILENAME, NULL, false, 0, elevel,
! 			 &head, &tail))
! 		{
! 			/* Syntax error(s) detected in the file, so bail out */
! 			error = true;
! 			ErrorConfFile = PG_AUTOCONF_FILENAME;
! 			goto cleanup_list;
! 		}
! 	}
! 	/*
! 	 * Pick up only the data_directory if DataDir is not set, which
! 	 * means that the configuration file is read for the first time and
! 	 * PG_AUTOCONF_FILENAME file cannot be read yet. In this case,
! 	 * we shouldn't pick any settings except the data_directory
! 	 * from postgresql.conf because they might be overwritten
! 	 * with the settings in PG_AUTOCONF_FILENAME file which will be
! 	 * read later. OTOH, since it's ensured that data_directory doesn't
! 	 * exist in PG_AUTOCONF_FILENAME file, it will never be overwritten
! 	 * later.
! 	 */
! 	else
! 	{
! 		ConfigVariable *prev = NULL;
! 
! 		for (item = head; item;)
! 		{
! 			ConfigVariable *ptr = item;
! 
! 			item = item->next;
! 			if (strcmp(ptr->name, "data_directory") != 0)
! 			{
! 				if (prev == NULL)
! 					head = ptr->next;
! 				else
! 				{
! 					prev->next = ptr->next;
! 					/*
! 					 * On removing last item in list, we need to update tail
! 					 * to ensure that list will be maintained.
! 					 */
! 					if (prev->next == NULL)
! 						tail = prev;
! 				}
! 				FreeConfigVariable(ptr);
! 			}
! 			else
! 				prev = ptr;
! 		}
! 
! 		/*
! 		 * Quick exit if data_directory is not present in list.
! 		 *
! 		 * Don't remember when we last successfully loaded the config file in
! 		 * this case because that time will be set soon by subsequent load of
! 		 * the config file.
! 		 */
! 		if (head == NULL)
! 			return;
  	}
  
  	/*
***
*** 677,683  ParseConfigFp(FILE *fp, const char *config_file, int depth, int elevel,
  			*tail_p = prev_item;
  	}
  
! 	pfree(cur_item);
  	break;
  }
  			}
--- 731,737 
  			*tail_p = prev_item;
  	}
  
! 	FreeConfigVariable(cur_item);
  	break;
  

Re: [HACKERS] psql output change in 9.4

2014-08-11 Thread Robert Haas
On Fri, Aug 8, 2014 at 9:34 PM, Peter Eisentraut pete...@gmx.net wrote:
 This is 9.3:

 peter=# \a
 Output format is unaligned.
 peter=# \a
 Output format is aligned.
 peter=# \x
 Expanded display is on.
 peter=# \x
 Expanded display is off.

 This is new in 9.4:

 peter=# \a
 Output format (format) is unaligned.
 peter=# \a
 Output format (format) is aligned.
 peter=# \x
 Expanded display (expanded) is on.
 peter=# \x
 Expanded display (expanded) is off.

 What is the point of that change?

 I suppose it is so that you can use \pset without arguments to show all
 settings:

 peter=# \pset
 Border style (border) is 1.
 Target width (columns) unset.
 Expanded display (expanded) is off.
 ...

 But those are unrelated features, and the changed output doesn't make
 any sense in the contexts I show above.

 I think this should be reverted, and the \pset output should be
 implemented separately.

Yes, the \pset patch (commit c64e68fd9f1132fec563fb5de53dc3bcccb5fc3b)
caused this behavior change.   I can't remember whether I noticed it
at the time and thought it was a reasonable change, or whether I
didn't notice it when committing.

Either way, clarifying the name of the parameter which is being
displayed does not seem like particularly bad idea to me even in the
contexts you mention.  I've certainly run commands like \a and \t and
then said to myself, "crap, which pset parameter does this correspond
to?"  And there was no easy way to figure it out.

I think the output could justly be criticized for making it
insufficiently clear that the parenthesized text is, in fact, the name
of the pset parameter.  We could write something like:

Border style (parameter border) is 1.

But I don't know whether that would be considered an improvement or
just extra verbosity.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 pg_restore --help changes

2014-08-11 Thread Robert Haas
On Fri, Aug 8, 2014 at 9:44 PM, Peter Eisentraut pete...@gmx.net wrote:
 This is a valuable feature change, but I think the help output is
 unhelpful.  The left side has a singular placeholder, but the right side
 talks about objects in plural.  How do I actually specify multiple
 indexes?  Is it --index=foo,bar?  It's --index=foo --index=bar, but
 that's unclear from the help.

 I think it would be clearer to write something like

   -I, --index=NAME restore named indexes (repeat for multiple)

 or perhaps make a note at the bottom

 The options -I, -n, -P, -t, -T, --section can be combined and
 specified multiple times to select multiple objects.

 Other ideas?

I like the note at the bottom idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql output change in 9.4

2014-08-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Aug 8, 2014 at 9:34 PM, Peter Eisentraut pete...@gmx.net wrote:
 What is the point of that change?

 I think the output could justly be criticized for making it
 insufficiently clear that the parenthesized text is, in fact, the name
 of the pset parameter.

Quite; that wasn't apparent to me either.

 We could write something like:
 Border style (parameter border) is 1.

How about

Border style (\pset border) is 1.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing dependency to wsock32.lib when compiling code on WIndows

2014-08-11 Thread Jeff Janes
On Tue, Jul 15, 2014 at 5:14 AM, Michael Paquier michael.paqu...@gmail.com
wrote:

 On Tue, Jul 15, 2014 at 8:46 PM, Magnus Hagander mag...@hagander.net
 wrote:
  On Thu, Jun 19, 2014 at 4:13 PM, MauMau maumau...@gmail.com wrote:
  From: Michael Paquier michael.paqu...@gmail.com
 
  You are right, I only though about the MS scripts when working on this
  patch. Updated patch for a complete cleanup is attached (note I
  updated manually ./configure for test purposes and did not re-run
  autoconf).
 
 
  I marked this patch as ready for committer.  I confirmed:
 
  I'm pretty sure this is not ready for commiter due to:
 
  * The binaries can be built with MSVC 2008 Express.  I didn't try to
 build
  on MinGW.
 
  Somebody needs to test it on mingw first :)
 
  That should be an easy test for someone who has a mingw install
  around, so better leave it at needs review so more people see it and
  can run those tests.
 I vaguely recall that I tested as well with MinGW when writing v2 of
 this patch after MauMau reviewed v1. Btw, just in case, I have just
 made working my MinGW box a bit more and re-tested it now. And it
 worked :)

 Of course someone else than the patch author with a MinGW environment
 at hand is invited to test.


I guess I should have done that

While trying to test more recent stuff against mingw, I kept running into a
problem which bisected down to this (a16bac36eca8158cbf78987e953).

I am following the instructions at
https://wiki.postgresql.org/wiki/Building_With_MinGW,
using the mingw64 set of instructions and of course the git instructions.
 Except that I am running as my own user, not creating a dummy account, and
I am using local hardware, not an amazon rental.

This is the warning/error I get:

auth.c: In function 'ClientAuthentication':
auth.c:458:6: warning: implicit declaration of function 'gai_strerror'
[-Wimplicit-function-declaration]
auth.c:458:6: warning: format '%s' expects argument of type 'char *', but
argument 3 has type 'int' [-Wformat]
auth.c:458:6: warning: format '%s' expects argument of type 'char *', but
argument 2 has type 'int' [-Wformat]
auth.c:476:6: warning: format '%s' expects argument of type 'char *', but
argument 3 has type 'int' [-Wformat]
auth.c:476:6: warning: format '%s' expects argument of type 'char *', but
argument 2 has type 'int' [-Wformat]
auth.c: In function 'CheckRADIUSAuth':
auth.c:2282:3: warning: format '%s' expects argument of type 'char *', but
argument 3 has type 'int' [-Wformat]
hba.c: In function 'parse_hba_line':
hba.c:1060:5: warning: implicit declaration of function 'gai_strerror'
[-Wimplicit-function-declaration]
hba.c:1060:5: warning: format '%s' expects argument of type 'char *', but
argument 3 has type 'int' [-Wformat]
hba.c:1131:6: warning: format '%s' expects argument of type 'char *', but
argument 3 has type 'int' [-Wformat]
hba.c: In function 'parse_hba_auth_opt':
hba.c:1607:4: warning: format '%s' expects argument of type 'char *', but
argument 3 has type 'int' [-Wformat]
pqcomm.c: In function 'StreamServerPort':
pqcomm.c:334:4: warning: implicit declaration of function 'gai_strerror'
[-Wimplicit-function-declaration]
pqcomm.c:334:4: warning: format '%s' expects argument of type 'char *', but
argument 4 has type 'int' [-Wformat]
pqcomm.c:338:4: warning: format '%s' expects argument of type 'char *', but
argument 3 has type 'int' [-Wformat]
mingwcompat.c:60:1: warning: 'RegisterWaitForSingleObject' redeclared
without dllimport attribute: previous dllimport ignored [-Wattributes]
pgstat.c: In function 'pgstat_init':
pgstat.c:353:3: warning: implicit declaration of function 'gai_strerror'
[-Wimplicit-function-declaration]
pgstat.c:353:3: warning: format '%s' expects argument of type 'char *', but
argument 2 has type 'int' [-Wformat]
postmaster.c: In function 'BackendInitialize':
postmaster.c:3938:3: warning: implicit declaration of function
'gai_strerror' [-Wimplicit-function-declaration]
postmaster.c:3938:3: warning: format '%s' expects argument of type 'char
*', but argument 2 has type 'int' [-Wformat]
libpq/auth.o:auth.c:(.text+0x1949): undefined reference to `gai_strerror'
libpq/auth.o:auth.c:(.text+0x19c4): undefined reference to `gai_strerror'
libpq/auth.o:auth.c:(.text+0x1b1a): undefined reference to `gai_strerror'
libpq/auth.o:auth.c:(.text+0x1cb4): undefined reference to `gai_strerror'
libpq/auth.o:auth.c:(.text+0x1cdc): undefined reference to `gai_strerror'
libpq/hba.o:hba.c:(.text+0x1fa0): more undefined references to
`gai_strerror' follow
collect2.exe: error: ld returned 1 exit status
make[2]: *** [postgres] Error 1
make[1]: *** [all-backend-recurse] Error 2
make: *** [all-src-recurse] Error 2


Any suggestions on what to try?  This patch doesn't seem to explicitly
mention gai_strerror at all, so it must be some indirect effect.


Cheers,

Jeff


Re: [HACKERS] replication commands and log_statements

2014-08-11 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Fri, Aug 8, 2014 at 1:04 PM, Fujii Masao masao.fu...@gmail.com wrote:
  You can do that by executing
  ALTER ROLE replication user SET log_statement TO 'all'.
  If you don't use the replication user to execute SQL statements,
  no SQL statements are logged in that setting.
 
 If you have a user devoted to it, I suppose that's true.  I still
 think it shouldn't get munged together like that.

Folks *should* have a dedicated replication user, imv.  That said, I
agree with Robert in that I don't particularly like this recommendation
for how to enable logging of replication commands.  For one thing, it
means having to remember to set the per-role GUC for every replication
user which is created and that's the kind of trivially-missed step that
can get people into trouble.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] psql output change in 9.4

2014-08-11 Thread Pavel Stehule
2014-08-11 19:52 GMT+02:00 Tom Lane t...@sss.pgh.pa.us:

 Robert Haas robertmh...@gmail.com writes:
  On Fri, Aug 8, 2014 at 9:34 PM, Peter Eisentraut pete...@gmx.net
 wrote:
  What is the point of that change?

  I think the output could justly be criticized for making it
  insufficiently clear that the parenthesized text is, in fact, the name
  of the pset parameter.

 Quite; that wasn't apparent to me either.

  We could write something like:
  Border style (parameter border) is 1.

 How about

 Border style (\pset border) is 1.


+1

Pavel


 regards, tom lane


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] bad estimation together with large work_mem generates terrible slow hash joins

2014-08-11 Thread Robert Haas
On Sat, Aug 9, 2014 at 9:13 AM, Tomas Vondra t...@fuzzy.cz wrote:
 Adding the least-significant bit does not work; we need to get back to adding
 the most-significant one. Not sure what's the least complex way to do
 that, though.

 I'm thinking about computing the nbuckets limit (how many buckets may
 fit into work_mem):

 tupsize = HJTUPLE_OVERHEAD +
 MAXALIGN(sizeof(MinimalTupleData)) +
 MAXALIGN(tupwidth);

 nbuckets_bits_max = my_log2(work_mem / tupsize)

 and then start with nbatches from this bit, like this:

  31   23 max   1570
   |||||
   |  |   <- batches|   free   | <-  buckets   |

That doesn't seem unreasonable provide there's enough bit space, but,
as you point out, the day might not be far off when there isn't.

It also strikes me that when there's only 1 batch, the set of bits
that map onto the batch number is zero-width, and one zero-width bit
range is as good as another.  In other words, if you're only planning
to do one batch, you can easily grow the number of buckets on the fly.
Growing the number of buckets only becomes difficult once you have
more than one batch.
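
To make that concrete, here is roughly how a hash value is split into
bucket and batch numbers today (a simplified sketch modeled on
ExecHashGetBucketAndBatch(), assuming nbuckets and nbatch are powers of
two; not the literal source):

#include <stdint.h>

static void
get_bucket_and_batch(uint32_t hashvalue,
                     int nbuckets, int log2_nbuckets, int nbatch,
                     int *bucketno, int *batchno)
{
    /* buckets come from the low-order bits of the hash value */
    *bucketno = hashvalue & (nbuckets - 1);

    if (nbatch > 1)
        /* batches come from the bits just above the bucket bits */
        *batchno = (hashvalue >> log2_nbuckets) & (nbatch - 1);
    else
        *batchno = 0;           /* single batch: no bits consumed */
}

With a single batch the batch mask is empty, so doubling nbuckets merely
consumes previously unused bits; once nbatch > 1, widening the bucket mask
shifts the batch bits and would reassign tuples already written to batch
files, which is the difficulty discussed here.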

And I mention that because, in the scenario mentioned in your original
post (a hash table with small number of buckets (8192) containing
large number of tuples [~3.3M]), presumably what happens is you start
out thinking you are going to have one batch with 8K buckets.  Then,
as more tuples continue to pour in, you see the load factor rising and
respond by bumping up the size of the hash table.  Now eventually
you realize, gosh, this isn't even going to fit into work_mem, so you
need to start batching it.  But by that point you've presumably done
as much as you're going to do in terms of increasing the number of
buckets; after that, you're just going to add more batches as
necessary.  So not being able to further increase the number of
buckets once the number of batches is no longer 1 wouldn't be a
problem in that case.

Now if you start out planning for multiple batches, and then you
discover you'd like to keep the same number of batches but have more
buckets per batch, that's gonna be hard.  It's not impossible, because
you could steal some of the unused high-order bits above the number of
batches, and then concatenate them with the low-order bits that you
already had, but that seems like kind of an ugly kludge, and AFAICS it
only helps if the new tuples are narrower by enough to justify the
cost of splitting all the buckets.  I won't say that couldn't happen,
but I think it would be better to keep that complexity out of the
patch unless we're absolutely certain it's justified.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql output change in 9.4

2014-08-11 Thread Robert Haas
On Mon, Aug 11, 2014 at 1:52 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Aug 8, 2014 at 9:34 PM, Peter Eisentraut pete...@gmx.net wrote:
 What is the point of that change?

 I think the output could justly be criticized for making it
 insufficiently clear that the parenthesized text is, in fact, the name
 of the pset parameter.

 Quite; that wasn't apparent to me either.

 We could write something like:
 Border style (parameter border) is 1.

 How about

 Border style (\pset border) is 1.

That would look just fine as a response to \a or \x, but I'm not sure
it would look as good as a response to \pset, which prints out that
line for every parameter (why does every line say \pset when the
command I just typed is \pset?).  However, I can certainly live with
it if others prefer that to what I suggested.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-08-11 Thread Robert Haas
On Sat, Aug 9, 2014 at 3:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kgri...@ymail.com writes:
 Stephen Frost sfr...@snowman.net writes:
 Trying to move the header to the end just for the sake of this
 doesn't strike me as a good solution as it'll make things quite
 a bit more complicated.

 Why is that?  How much harder would it be to add a single offset
 field to the front to point to the part we're shifting to the end?
 It is not all that unusual to put a directory at the end, like in
 the .zip file format.

 Yeah, I was wondering that too.  Arguably, directory-at-the-end would
 be easier to work with for on-the-fly creation, not that we do any
 such thing at the moment.  I think the main thing that's bugging Stephen
 is that doing that just to make pglz_compress happy seems like a kluge
 (and I have to agree).

 Here's a possibly more concrete thing to think about: we may very well
 someday want to support JSONB object field or array element extraction
 without reading all blocks of a large toasted JSONB value, if the value is
 stored external without compression.  We already went to the trouble of
 creating analogous logic for substring extraction from a long uncompressed
 text or bytea value, so I think this is a plausible future desire.  With
 the current format you could imagine grabbing the first TOAST chunk, and
 then if you see the header is longer than that you can grab the remainder
 of the header without any wasted I/O, and for the array-subscripting case
 you'd now have enough info to fetch the element value from the body of
 the JSONB without any wasted I/O.  With directory-at-the-end you'd
 have to read the first chunk just to get the directory pointer, and this
 would most likely not give you any of the directory proper; but at least
 you'd know exactly how big the directory is before you go to read it in.
 The former case is probably slightly better.  However, if you're doing an
 object key lookup not an array element fetch, neither of these formats are
 really friendly at all, because each binary-search probe probably requires
 bringing in one or two toast chunks from the body of the JSONB value so
 you can look at the key text.  I'm not sure if there's a way to redesign
 the format to make that less painful/expensive --- but certainly, having
 the key texts scattered through the JSONB value doesn't seem like a great
 thing from this standpoint.

I think that's a good point.

On the general topic, I don't think it's reasonable to imagine that
we're going to come up with a single heuristic that works well for
every kind of input data.  What pglz is doing - assuming that if the
beginning of the data is incompressible then the rest probably is too
- is fundamentally reasonable, notwithstanding the fact that it
doesn't happen to work out well for JSONB.  We might be able to tinker
with that general strategy in some way that seems to fix this case and
doesn't appear to break others, but there's some risk in that, and
there's no obvious reason in my mind why PGLZ should be required to fly
blind.  So I think it would be a better idea to arrange some method by
which JSONB (and perhaps other data types) can provide compression
hints to pglz.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-08-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 ... I think it would be a better idea to arrange some method by
 which JSONB (and perhaps other data types) can provide compression
 hints to pglz.

I agree with that as a long-term goal, but not sure if it's sane to
push into 9.4.

What we could conceivably do now is (a) add a datatype OID argument to
toast_compress_datum, and (b) hard-wire the selection of a different
compression-parameters struct if it's JSONBOID.  The actual fix would
then be to increase the first_success_by field of this alternate struct.
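
For illustration only, a minimal self-contained sketch of what (a) plus (b)
could boil down to -- hypothetical struct and names, with the numeric knob
values as placeholders rather than pglz's real defaults:

    #include <limits.h>

    typedef unsigned int Oid;

    #define JSONBOID 3802           /* jsonb's OID in pg_type */

    /* Simplified stand-in for pglz's compression-parameters struct. */
    typedef struct
    {
        int min_input_size;         /* don't bother compressing below this */
        int min_comp_rate;          /* required space saving, in percent */
        int first_success_by;       /* give up if no match in the first N bytes */
    } StrategySketch;

    static const StrategySketch default_strategy = {32, 25, 1024};

    /*
     * Same knobs, but never abandon compression just because the leading
     * bytes (jsonb's offset header) look incompressible.
     */
    static const StrategySketch jsonb_strategy = {32, 25, INT_MAX};

    static const StrategySketch *
    choose_strategy(Oid typid)
    {
        return (typid == JSONBOID) ? &jsonb_strategy : &default_strategy;
    }

toast_compress_datum would then presumably pass the chosen struct down to
pglz_compress instead of always using the default parameters.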

I had been worrying about API-instability risks associated with (a),
but on reflection it seems unlikely that any third-party code calls
toast_compress_datum directly, and anyway it's not something we'd
be back-patching to before 9.4.

The main objection to (b) is that it wouldn't help for domains over jsonb
(and no, I don't want to insert a getBaseType call there to fix that).

A longer-term solution would be to make this some sort of type property
that domains could inherit, like typstorage is already.  (Somebody
suggested dealing with this by adding more typstorage values, but
I don't find that attractive; typstorage is known in too many places.)
We'd need some thought about exactly what we want to expose, since
the specific knobs that pglz_compress has today aren't necessarily
good for the long term.

This is all kinda ugly really, but since I'm not hearing brilliant
ideas for redesigning jsonb's storage format, maybe this is the
best we can do for now.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-08-11 Thread Peter Geoghegan
On Mon, Aug 11, 2014 at 12:07 PM, Robert Haas robertmh...@gmail.com wrote:
 I think that's a good point.

I think that there may be something to be said for the current layout.
Having adjacent keys and values could take better advantage of CPU
cache characteristics. I've heard of approaches to improving B-Tree
locality that forced keys and values to be adjacent on individual
B-Tree pages [1], for example. I've heard of this more than once. And
FWIW, based on earlier research of user requirements in this
area, I believe that very large jsonb datums are not considered all that
compelling. Document database systems have considerable limitations
here.

 On the general topic, I don't think it's reasonable to imagine that
 we're going to come up with a single heuristic that works well for
 every kind of input data.  What pglz is doing - assuming that if the
 beginning of the data is incompressible then the rest probably is too
 - is fundamentally reasonable, notwithstanding the fact that it
 doesn't happen to work out well for JSONB.  We might be able to tinker
 with that general strategy in some way that seems to fix this case and
 doesn't appear to break others, but there's some risk in that, and
 there's no obvious reason in my mind why PGLZ should be required to fly
 blind.  So I think it would be a better idea to arrange some method by
 which JSONB (and perhaps other data types) can provide compression
 hints to pglz.

If there is to be any effort to make jsonb a more effective target for
compression, I imagine that that would have to target redundancy
between JSON documents. With idiomatic usage, we can expect plenty of
it.

[1] http://www.vldb.org/conf/1999/P7.pdf , "We also forced each key
and child pointer to be adjacent to each other physically"
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql: show only failed queries

2014-08-11 Thread Pavel Stehule
2014-08-11 14:59 GMT+02:00 Fujii Masao masao.fu...@gmail.com:

 On Sat, Aug 9, 2014 at 4:16 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  Hi
 
 
  2014-08-08 13:58 GMT+02:00 Fujii Masao masao.fu...@gmail.com:
 
  On Fri, Aug 8, 2014 at 4:31 AM, Pavel Stehule pavel.steh...@gmail.com
  wrote:
  
  
  
   2014-08-07 7:10 GMT+02:00 Fujii Masao masao.fu...@gmail.com:
  
   On Thu, Aug 7, 2014 at 6:26 AM, Pavel Stehule 
 pavel.steh...@gmail.com
   wrote:
Hello
   
updated version patch in attachment
  
   Thanks! But ISTM you forgot to attached the patch.
  
  
   grr .. I am sorry
 
  No problem. Thanks for the patch! Attached is the revised version of the
  patch.
 
+/* all psql known variables are included in list by default
 */
+for (known_varname = known_varnames; *known_varname;
known_varname++)
+varnames[nvars++] = pg_strdup(*known_varname);
   
Don't we need to append both prefix and suffix to even known
variables?
   
   
??? I am not sure if I understand well - probably system read
 only
variables as DBNAME, USER, VERSION should not be there
  
   I had that question because complete_from_variables() is also called
 by
   the
   tab-completion of \echo : and it shows half-baked variables list.
 So
   I thought probably we need to change complete_from_variables() more
 to
   fix the problem.
  
  
   I understand now.
  
   I fixed it.
  
   I have a question.\echo probably should not to show empty known
   variable.
  
   data for autocomplete for \echo should be same as result of \set
 
  Agreed. I think that only the variables having the set values should be
  displayed in \echo : case. So I modified complete_from_variables()
  so that the unset variables are not shown in that case but all the
  variables
  are shown in the tab-completion of \set.
 
+else if (strcmp(prev2_wd, \\set) == 0)
+{
+if (strcmp(prev_wd, AUTOCOMMIT) == 0)
   
ISTM that some psql variables like IGNOREEOF are not there. Why
 not?
   
   
yes, there are not complete for DBNAME, ENCODING, FETCH_COUNT,
HISTCONTROL,
HISTFILE, HISTSIZE, HOST, IGNOREEOFF, PROMPT*,USER,  VERSION
   
There are more reasons:
   
* paremeter is not a enum (string, number or both): FETCH_COUNT,
PROMPT,
HISTSIZE, ..
   
* variable is pseudo read only  - change has not any effect:
 DBNAME,
ENCODING, VERSION
  
   So HISTCONTROL should be there because it doesn't have such reasons
 at
   all?
  
  
   yes
 
  ISTM that you forgot to add HISTCONTROL to your patch. So I just added
  that.
 
  I added the tab-completion for \unset command. Like \echo :, only
  the variables having the set values should be displayed in \unset
 case.
 
 
  perfect
 
 
 
  I changed complete_from_variables() so that it checks the memory size of
  the variable array even when appending the known variables. If the
 memory
  size is not enough, it's dynamically increased. Practically this check
  would
  not be required because the initial size of the array is enough larger
  than
  the number of the known variables. I added this as the safe-guard.
 
  Typo: IGNOREEOFF - IGNOREEOF
 
  I removed the value none from the value list of ECHO because it's
 not
  documented and a user might get confused when he or she sees the
  undocumented
  value none. Thought?
 
 
  isn't better to fix doc? I don't know any reason why we should not to
  support none

 I'm OK with this. The attached patch adds the support of none value both
 in ECHO and HISTCONTROL variables (because HISTCONTROL had the same
 problem as ECHO had), and also adds the description of that value into
 the document.

  I looked to code, you removed a check against duplicate varname in list.
 Is
  it ok?

 Oh, just revived that code.


yes, It is looking well

regards

Pavel



 Regards,

 --
 Fujii Masao



Re: [HACKERS] 9.4 pg_restore --help changes

2014-08-11 Thread Alvaro Herrera
Robert Haas wrote:
 On Fri, Aug 8, 2014 at 9:44 PM, Peter Eisentraut pete...@gmx.net wrote:

  or perhaps make a note at the bottom
 
  The options -I, -n, -P, -t, -T, --section can be combined and
  specified multiple times to select multiple objects.
 
  Other ideas?
 
 I like the note at the bottom idea.

+1

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-08-11 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Robert Haas robertmh...@gmail.com writes:
  ... I think it would be a better idea to arrange some method by
  which JSONB (and perhaps other data types) can provide compression
  hints to pglz.
 
 I agree with that as a long-term goal, but not sure if it's sane to
 push into 9.4.

Agreed.

 What we could conceivably do now is (a) add a datatype OID argument to
 toast_compress_datum, and (b) hard-wire the selection of a different
 compression-parameters struct if it's JSONBOID.  The actual fix would
 then be to increase the first_success_by field of this alternate struct.

Isn't the offset to the compressible data variable though, depending on the
number of keys, etc.?  Would we be increasing first_success_by based on
some function which inspects the object?

 I had been worrying about API-instability risks associated with (a),
 but on reflection it seems unlikely that any third-party code calls
 toast_compress_datum directly, and anyway it's not something we'd
 be back-patching to before 9.4.

Agreed.

 The main objection to (b) is that it wouldn't help for domains over jsonb
 (and no, I don't want to insert a getBaseType call there to fix that).

While not ideal, that seems like an acceptable compromise for 9.4 to me.

 A longer-term solution would be to make this some sort of type property
 that domains could inherit, like typstorage is already.  (Somebody
 suggested dealing with this by adding more typstorage values, but
 I don't find that attractive; typstorage is known in too many places.)

Think that was me, and having it be something which domains can inherit
makes sense.  Would we be able to use this approach to introduce compression
algorithms specific to a type (and to domains inherited from that type),
perhaps?  Or even get to a point where we could have a chunk-based
compression scheme for certain types of objects (such as JSONB) where we
keep track of which keys exist at which points in the compressed object,
allowing us to skip to the specific chunk which contains the requested
key, similar to what we do with uncompressed data?

 We'd need some thought about exactly what we want to expose, since
 the specific knobs that pglz_compress has today aren't necessarily
 good for the long term.

Agreed.

 This is all kinda ugly really, but since I'm not hearing brilliant
 ideas for redesigning jsonb's storage format, maybe this is the
 best we can do for now.

This would certainly be an improvement over what's going on now, and I
love the idea of possibly being able to expand this in the future to do
more.  What I'd hate to see is having all of this and only ever using it
to say skip ahead another 1k for JSONB.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-08-11 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:
 What we could conceivably do now is (a) add a datatype OID argument to
 toast_compress_datum, and (b) hard-wire the selection of a different
 compression-parameters struct if it's JSONBOID.  The actual fix would
 then be to increase the first_success_by field of this alternate struct.

 Isn't the offset-to-compressable-data variable though, depending on the
 number of keys, etc?  Would we be increasing first_success_by based off
 of some function which inspects the object?

Given that this is a short-term hack, I'd be satisfied with setting it
to INT_MAX.

If we got more ambitious, we could consider improving the cutoff logic
so that it gives up at x% of the object or n bytes, whichever comes
first; but I'd want to see some hard evidence that that was useful
before adding any more cycles to pglz_compress.
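
To make the suggested cutoff concrete, a trivial sketch (hypothetical name,
not pglz code):

    /*
     * Stop looking for a first match after max_bytes bytes or max_percent
     * percent of the input, whichever limit is hit first.
     */
    static int
    first_success_limit(int srclen, int max_bytes, int max_percent)
    {
        long    pct_limit = ((long) srclen * max_percent) / 100;

        return (pct_limit < max_bytes) ? (int) pct_limit : max_bytes;
    }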

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-08-11 Thread Stephen Frost
* Peter Geoghegan (p...@heroku.com) wrote:
 If there is to be any effort to make jsonb a more effective target for
 compression, I imagine that that would have to target redundancy
 between JSON documents. With idiomatic usage, we can expect plenty of
 it.

While I certainly agree, that's a rather different animal to address and
doesn't hold a lot of relevance to the current problem.  Or, to put it
another way, I don't think anyone is going to be surprised that two rows
containing the same data (even if they're inserted in the same
transaction and have the same visibility information) aren't compressed
together in some fashion.

We've got a clear example of someone, quite reasonably, expecting their
JSONB object to be compressed using the normal TOAST mechanism, and
we're failing to do that in cases where it's actually a win to do so.
That's the focus of this discussion and what needs to be addressed
before 9.4 goes out.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-08-11 Thread Peter Geoghegan
On Mon, Aug 11, 2014 at 1:01 PM, Stephen Frost sfr...@snowman.net wrote:
 We've got a clear example of someone, quite reasonably, expecting their
 JSONB object to be compressed using the normal TOAST mechanism, and
 we're failing to do that in cases where it's actually a win to do so.
 That's the focus of this discussion and what needs to be addressed
 before 9.4 goes out.

Sure. I'm not trying to minimize that. We should fix it, certainly.
However, it does bear considering that JSON data, with each document
stored in a row, is not an effective target for TOAST compression in
general, even as text.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 logical replication - walsender keepalive replies

2014-08-11 Thread Steve Singer

On 07/14/2014 01:19 PM, Steve Singer wrote:

On 07/06/2014 10:11 AM, Andres Freund wrote:

Hi Steve,



Right. I thought about this for a while, and I think we should change
two things. For one, don't request replies here. It's simply not needed,
as this isn't dealing with timeouts. For another, don't just check
->flush < sentPtr but also ->write < sentPtr. The reason we're sending these
feedback messages is to inform the 'logical standby' that there's been
WAL activity which it can't see because they don't correspond to
anything that's logically decoded (e.g. vacuum stuff).
Would that suit your needs?

Greetings,


Yes I think that will work for me.
I tested with the attached patch that I think  does what you describe 
and it seems okay.






Any feedback on this?  Do we want that change for 9.4, or do we want 
something else?






Andres Freund









--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 logical replication - walsender keepalive replies

2014-08-11 Thread Andres Freund
On 2014-08-11 17:22:27 -0400, Steve Singer wrote:
 On 07/14/2014 01:19 PM, Steve Singer wrote:
 On 07/06/2014 10:11 AM, Andres Freund wrote:
 Hi Steve,
 
 Right. I thought about this for a while, and I think we should change
 two things. For one, don't request replies here. It's simply not needed,
 as this isn't dealing with timeouts. For another, don't just check
 ->flush < sentPtr but also ->write < sentPtr. The reason we're sending these
 feedback messages is to inform the 'logical standby' that there's been
 WAL activity which it can't see because they don't correspond to
 anything that's logically decoded (e.g. vacuum stuff).
 Would that suit your needs?
 
 Greetings,
 
 Yes I think that will work for me.
 I tested with the attached patch that I think  does what you describe and
 it seems okay.
 
 
 
 
 Any feedback on this?  Do we want that change for 9.4, or do we want
 something else?

I plan to test and apply it in the next few days. Digging myself from
under stuff from before my holiday right now...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bad estimation together with large work_mem generates terrible slow hash joins

2014-08-11 Thread Tomas Vondra
On 11.8.2014 20:25, Robert Haas wrote:
 On Sat, Aug 9, 2014 at 9:13 AM, Tomas Vondra t...@fuzzy.cz wrote:
 Adding the least-significant bit does not work; we need to get back to adding
 the most-significant one. Not sure what's the least complex way to do
 that, though.

 I'm thinking about computing the nbuckets limit (how many buckets may
 fit into work_mem):

 tupsize = HJTUPLE_OVERHEAD +
 MAXALIGN(sizeof(MinimalTupleData)) +
 MAXALIGN(tupwidth);

 nbuckets_bits_max = my_log2(work_mem / tupsize)

 and then start with nbatches from this bit, like this:

  31           23    max          15        7         0
   |            |      |           |        |         |
   |            | <- batches  |    free    | -> buckets |
 
 That doesn't seem unreasonable provided there's enough bit space, but,
 as you point out, the day might not be far off when there isn't.

Thinking about this a bit more, I believe the danger is not that
imminent. 32 bits mean the Hash node should handle 2^32 tuples in total
(possibly split into multiple batches).

It used to be 2^32 buckets thanks to NTUP_PER_BUCKET=10, but the patch
lowers this to 1 so it's tuples now.

But while we're a bit closer to exhausting the 32 bits, I think
it's not really that big an issue. Not only is hashing a table with ~4e9 rows
going to be a hellish experience, but if we really want to support
it, we should probably switch to 64-bit hashes.

Because adding some checks is not going to help - it may detect an
issue, but it won't fix it. And actually, even if the two values
overlap, it does not mean the hashjoin will stop working. There'll be a
dependency between batches and buckets, causing non-uniform usage of the
buckets, but it won't blow up.

So I don't think we need to worry about exhausting the 32 bits for now.

 It also strikes me that when there's only 1 batch, the set of bits
 that map onto the batch number is zero-width, and one zero-width bit
 range is as good as another.  In other words, if you're only planning
 to do one batch, you can easily grow the number of buckets on the fly.
 Growing the number of buckets only becomes difficult once you have
 more than one batch.

Yes, that's true. It's also roughly what the patch does IMHO.

If you know you'll need more batches, you can start with the maximum
number of buckets right away, because you expect there's more data than
fits in work_mem, so shoot for

nbuckets = mylog2(work_mem/tuple_size)

or something like that. It's also true that if you're starting with a
single batch, you can do whatever you want with nbuckets until you need
to do (nbatches *= 2).

It's also true that once you're done with the first batch, all the other
batches will be just fine with the number of buckets, because the
batches will be about equally sized.

But I think this is pretty much what the patch does, in the end.
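
For reference, the sizing math quoted above written out as a self-contained
sketch (the overhead constants are placeholders, not the real values):

    #include <stddef.h>

    #define MAXALIGN(x)              (((x) + 7) & ~((size_t) 7))
    #define HJTUPLE_OVERHEAD_SKETCH  16   /* placeholder */
    #define MINIMAL_TUPLE_SKETCH     24   /* placeholder for sizeof(MinimalTupleData) */

    static int
    my_log2_sketch(long num)
    {
        int     i = 0;
        long    limit = 1;

        while (limit < num)
        {
            limit <<= 1;
            i++;
        }
        return i;
    }

    /* how many bucket bits fit into work_mem (given in kilobytes)? */
    static int
    nbuckets_bits_max(long work_mem_kb, int tupwidth)
    {
        size_t  tupsize = HJTUPLE_OVERHEAD_SKETCH +
                          MAXALIGN(MINIMAL_TUPLE_SKETCH) +
                          MAXALIGN((size_t) tupwidth);

        return my_log2_sketch((work_mem_kb * 1024L) / (long) tupsize);
    }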


 And I mention that because, in the scenario mentioned in your original
 post (a hash table with small number of buckets (8192) containing
 large number of tuples [~3.3M]), presumably what happens is you start
 out thinking you are going to have one batch with 8K buckets.  Then,
 as more tuples continue to pour in, you see the load factor rising and
 respond by bumping up the size of the hash table.  Now eventually
 you realize, gosh, this isn't even going to fit into work_mem, so you
 need to start batching it.  But by that point you've presumably done
 as much as you're going to do in terms of increasing the number of
 buckets; after that, you're just going to add more batches as
 necessary.  So not being able to further increase the number of
 buckets once the number of batches is no longer 1 wouldn't be a
 problem in that case.
 
 Now if you start out planning for multiple batches, and then you
 discover you'd like to keep the same number of batches but have more
 buckets per batch, that's gonna be hard.  It's not impossible, because
 you could steal some of the unused high-order bits above the number of
 batches, and then concatenate them with the low-order bits that you
 already had, but that seems like kind of an ugly kludge, and AFAICS it
 only helps if the new tuples are narrower by enough to justify the
 cost of splitting all the buckets.  I won't say that couldn't happen,
 but I think it would be better to keep that complexity out of the
 patch unless we're absolutely certain it's justified.

I'm definitely in favor of keeping the patch as simple as possible.

I was considering reversing the bits of the hash, because that's
pretty much the simplest solution. But I think you're right that it might
actually work like this:

  * Are more batches needed?

  (yes) => just use nbuckets = my_log2(work_mem / tuple_size)

  (no) => go ahead, until processing all tuples or hitting work_mem

  (work_mem) => meh, use the same nbuckets above

  (all tuples) => compute optimal nbuckets /

Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations

2014-08-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Aug 8, 2014 at 10:30 AM, MauMau maumau...@gmail.com wrote:
 I've tracked down the real root cause.  The fix is very simple.  Please
 check the attached one-liner patch.

 I'd support back-porting that commit to 9.1 and 9.2 as a fix for this
 problem.  As the commit message says, it's dead simple.

While I have no great objection to back-porting Heikki's patch, it seems
like a very large stretch to call this a root-cause fix.  At best it's
band-aiding one symptom in a rather fragile way.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] failures on barnacle (CLOBBER_CACHE_RECURSIVELY) because of memory leaks

2014-08-11 Thread Tom Lane
Tomas Vondra t...@fuzzy.cz writes:
 So after 83 days, the regression tests on barnacle completed,

Hah, that's perseverance!

 and it
 smells like another memory leak in CacheMemoryContext, similar to those
 fixed in 078b2ed on May 18.

Ugh, will look.

 See this:
 http://pgbuildfarm.org/cgi-bin/show_log.pl?nm=barnacledt=2014-05-19%2011%3A25%3A03

Initially I was more worried about the crashes, but it looks like that's
probably just a side-effect of the leak:

[537a053d.1f4b:2] LOG:  server process (PID 32110) was terminated by signal 9: 
Killed
[537a053d.1f4b:3] DETAIL:  Failed process was running: select 
pg_get_viewdef('vv3', true);
[537a053d.1f4b:4] LOG:  terminating any other active server processes
[537a053d.1f52:2] WARNING:  terminating connection because of crash of another 
server process
...
[537a053d.1f4b:7] LOG:  server process (PID 3882) was terminated by signal 9: 
Killed
[537a053d.1f4b:8] DETAIL:  Failed process was running: SELECT gid FROM 
pg_prepared_xacts;
[537a053d.1f4b:9] LOG:  terminating any other active server processes
...

Evidently the OOM killer is at large on this machine.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] failures on barnacle (CLOBBER_CACHE_RECURSIVELY) because of memory leaks

2014-08-11 Thread Tomas Vondra
On 12.8.2014 02:05, Tom Lane wrote:

 Evidently the OOM killer is at large on this machine.

Yes. It's a machine with only 8GB of RAM, and there are 3 VMs (LXC
containers), with 2GB of RAM each. That's not much, but while it's
mostly out of necessity, it's apparently a good way to catch leaks.

Tomas



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing dependency to wsock32.lib when compiling code on WIndows

2014-08-11 Thread Noah Misch
On Mon, Aug 11, 2014 at 11:02:48AM -0700, Jeff Janes wrote:
 While trying to test more recent stuff against mingw, I kept running into a
 problem which bisected down to this (a16bac36eca8158cbf78987e953).

 This is the warning/error I get:
 
 auth.c: In function 'ClientAuthentication':
 auth.c:458:6: warning: implicit declaration of function 'gai_strerror'
 [-Wimplicit-function-declaration]

 libpq/auth.o:auth.c:(.text+0x1949): undefined reference to `gai_strerror'
 libpq/auth.o:auth.c:(.text+0x19c4): undefined reference to `gai_strerror'
 libpq/auth.o:auth.c:(.text+0x1b1a): undefined reference to `gai_strerror'
 libpq/auth.o:auth.c:(.text+0x1cb4): undefined reference to `gai_strerror'
 libpq/auth.o:auth.c:(.text+0x1cdc): undefined reference to `gai_strerror'

 Any suggestions on what to try?  This patch doesn't seem to explicitly
 mention gai_strerror at all, so it must be some indirect effect.

I, too, encountered that.  The patch let configure detect HAVE_GETADDRINFO
under MinGW-w64; passing ac_cv_func_getaddrinfo=no on the configure command
line is a workaround.  Under !HAVE_GETADDRINFO, configure arranges to build
src/port/getaddrinfo.c, and src/include/getaddrinfo.h injects the replacement
gai_strerror() prototype.  That understandably doesn't happen in a
HAVE_GETADDRINFO build, yet src/include/port/win32/sys/socket.h does the
following unconditionally:

/*
 * we can't use the windows gai_strerror{AW} functions because
 * they are defined inline in the MS header files. So we'll use our
 * own
 */
#undef gai_strerror

Somehow or other, we must bring these parts into agreement.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvement of versioning on Windows, take two

2014-08-11 Thread Michael Paquier
On Sun, Aug 10, 2014 at 9:31 PM, MauMau maumau...@gmail.com wrote:
 From: Michael Paquier michael.paqu...@gmail.com
  LINK : fatal error LNK1104: cannot open file
 '.\release\postgres\src_timezone_win32ver.obj'
Oh yes, right. I don't really know how I missed this error when
testing v1. Adding an explicit call to
RemoveFile('src\timezone\win32ver.rc') for the postgres project calms down
the build. Does the attached work for you?
Regards,
-- 
Michael
From f88aa01e1f6f17d2b1a0cba05fb9efb8fe06e45f Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Tue, 12 Aug 2014 10:02:32 +0900
Subject: [PATCH] Versioning support on MinGW and MSVC for remaining exe and
 dll files

Windows versioning is added for the following files:
- regress.dll (MSVC only)
- isolationtester.exe
- pg_isolation_regress.exe
- pg_regress.exe
- pg_regress_ecpg.exe
- zic.exe
---
 src/interfaces/ecpg/test/Makefile | 1 +
 src/test/isolation/Makefile   | 7 +--
 src/test/regress/GNUmakefile  | 7 +--
 src/timezone/Makefile | 5 -
 src/tools/msvc/Mkvcbuild.pm   | 7 +++
 src/tools/msvc/clean.bat  | 3 +++
 6 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/src/interfaces/ecpg/test/Makefile b/src/interfaces/ecpg/test/Makefile
index 56f6a17..a2e0a83 100644
--- a/src/interfaces/ecpg/test/Makefile
+++ b/src/interfaces/ecpg/test/Makefile
@@ -14,6 +14,7 @@ override CPPFLAGS := \
 	$(CPPFLAGS)
 
 PGFILEDESC = ECPG Test - regression tests for ECPG
+PGAPPICON = win32
 
 # where to find psql for testing an existing installation
 PSQLDIR = $(bindir)
diff --git a/src/test/isolation/Makefile b/src/test/isolation/Makefile
index 26bcf22..77bc43d 100644
--- a/src/test/isolation/Makefile
+++ b/src/test/isolation/Makefile
@@ -6,12 +6,15 @@ subdir = src/test/isolation
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
+PGFILEDESC = pg_isolation_tester/isolationtester - isolation regression tests
+PGAPPICON = win32
+
 # where to find psql for testing an existing installation
 PSQLDIR = $(bindir)
 
 override CPPFLAGS := -I$(srcdir) -I$(libpq_srcdir) -I$(srcdir)/../regress $(CPPFLAGS)
 
-OBJS =  specparse.o isolationtester.o
+OBJS =  specparse.o isolationtester.o $(WIN32RES)
 
 all: isolationtester$(X) pg_isolation_regress$(X)
 
@@ -21,7 +24,7 @@ submake-regress:
 pg_regress.o: | submake-regress
 	rm -f $@  $(LN_S) $(top_builddir)/src/test/regress/pg_regress.o .
 
-pg_isolation_regress$(X): isolation_main.o pg_regress.o
+pg_isolation_regress$(X): isolation_main.o pg_regress.o $(WIN32RES)
 	$(CC) $(CFLAGS) $^ $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@
 
 isolationtester$(X): $(OBJS) | submake-libpq submake-libpgport
diff --git a/src/test/regress/GNUmakefile b/src/test/regress/GNUmakefile
index b084e0a..64c9924 100644
--- a/src/test/regress/GNUmakefile
+++ b/src/test/regress/GNUmakefile
@@ -14,6 +14,9 @@ subdir = src/test/regress
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
+PGFILEDESC = pg_regress - regression tests
+PGAPPICON = win32
+
 # file with extra config for temp build
 TEMP_CONF =
 ifdef TEMP_CONFIG
@@ -43,7 +46,7 @@ EXTRADEFS = '-DHOST_TUPLE=$(host_tuple)' \
 
 all: pg_regress$(X)
 
-pg_regress$(X): pg_regress.o pg_regress_main.o | submake-libpgport
+pg_regress$(X): pg_regress.o pg_regress_main.o $(WIN32RES) | submake-libpgport
 	$(CC) $(CFLAGS) $^ $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@
 
 # dependencies ensure that path changes propagate
@@ -177,7 +180,7 @@ bigcheck: all tablespace-setup
 clean distclean maintainer-clean: clean-lib
 # things built by `all' target
 	rm -f $(OBJS) refint$(DLSUFFIX) autoinc$(DLSUFFIX) dummy_seclabel$(DLSUFFIX)
-	rm -f pg_regress_main.o pg_regress.o pg_regress$(X)
+	rm -f pg_regress_main.o pg_regress.o pg_regress$(X) $(WIN32RES)
 # things created by various check targets
 	rm -f $(output_files) $(input_files)
 	rm -rf testtablespace
diff --git a/src/timezone/Makefile b/src/timezone/Makefile
index ef739e9..ab65d22 100644
--- a/src/timezone/Makefile
+++ b/src/timezone/Makefile
@@ -12,11 +12,14 @@ subdir = src/timezone
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
+PGFILEDESC = timezone - timezone database
+PGAPPICON = win32
+
 # files to build into backend
 OBJS= localtime.o strftime.o pgtz.o
 
 # files needed to build zic utility program
-ZICOBJS= zic.o ialloc.o scheck.o localtime.o
+ZICOBJS= zic.o ialloc.o scheck.o localtime.o $(WIN32RES)
 
 # timezone data files
 TZDATA = africa antarctica asia australasia europe northamerica southamerica \
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index e6fb3ec..8026543 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -112,6 +112,7 @@ sub mkvcbuild
 	$postgres-AddFiles('src\backend\utils\misc', 'guc-file.l');
 	$postgres-AddFiles('src\backend\replication', 'repl_scanner.l',
 		'repl_gram.y');
+	$postgres-RemoveFile('src\timezone\win32ver.rc');
 	$postgres-AddDefine('BUILDING_DLL');
 	

Re: [HACKERS] psql: show only failed queries

2014-08-11 Thread Fujii Masao
On Tue, Aug 12, 2014 at 4:41 AM, Pavel Stehule pavel.steh...@gmail.com wrote:



 2014-08-11 14:59 GMT+02:00 Fujii Masao masao.fu...@gmail.com:

 On Sat, Aug 9, 2014 at 4:16 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  Hi
 
 
  2014-08-08 13:58 GMT+02:00 Fujii Masao masao.fu...@gmail.com:
 
  On Fri, Aug 8, 2014 at 4:31 AM, Pavel Stehule pavel.steh...@gmail.com
  wrote:
  
  
  
   2014-08-07 7:10 GMT+02:00 Fujii Masao masao.fu...@gmail.com:
  
   On Thu, Aug 7, 2014 at 6:26 AM, Pavel Stehule
   pavel.steh...@gmail.com
   wrote:
Hello
   
updated version patch in attachment
  
   Thanks! But ISTM you forgot to attached the patch.
  
  
   grr .. I am sorry
 
  No problem. Thanks for the patch! Attached is the revised version of
  the
  patch.
 
+/* all psql known variables are included in list by default
*/
+for (known_varname = known_varnames; *known_varname;
known_varname++)
+varnames[nvars++] = pg_strdup(*known_varname);
   
Don't we need to append both prefix and suffix to even known
variables?
   
   
??? I am not sure if I understand well - probably system read
only
variables as DBNAME, USER, VERSION should not be there
  
   I had that question because complete_from_variables() is also called
   by
   the
   tab-completion of \echo : and it shows half-baked variables list.
   So
   I thought probably we need to change complete_from_variables() more
   to
   fix the problem.
  
  
   I understand now.
  
   I fixed it.
  
   I have a question.\echo probably should not to show empty known
   variable.
  
   data for autocomplete for \echo should be same as result of \set
 
  Agreed. I think that only the variables having the set values should be
  displayed in \echo : case. So I modified complete_from_variables()
  so that the unset variables are not shown in that case but all the
  variables
  are shown in the tab-completion of \set.
 
+else if (strcmp(prev2_wd, \\set) == 0)
+{
+if (strcmp(prev_wd, AUTOCOMMIT) == 0)
   
ISTM that some psql variables like IGNOREEOF are not there. Why
not?
   
   
yes, there are not complete for DBNAME, ENCODING, FETCH_COUNT,
HISTCONTROL,
HISTFILE, HISTSIZE, HOST, IGNOREEOFF, PROMPT*,USER,  VERSION
   
There are more reasons:
   
* paremeter is not a enum (string, number or both): FETCH_COUNT,
PROMPT,
HISTSIZE, ..
   
* variable is pseudo read only  - change has not any effect:
DBNAME,
ENCODING, VERSION
  
   So HISTCONTROL should be there because it doesn't have such reasons
   at
   all?
  
  
   yes
 
  ISTM that you forgot to add HISTCONTROL to your patch. So I just added
  that.
 
  I added the tab-completion for \unset command. Like \echo :, only
  the variables having the set values should be displayed in \unset
  case.
 
 
  perfect
 
 
 
  I changed complete_from_variables() so that it checks the memory size
  of
  the variable array even when appending the known variables. If the
  memory
  size is not enough, it's dynamically increased. Practically this check
  would
  not be required because the initial size of the array is enough larger
  than
  the number of the known variables. I added this as the safe-guard.
 
  Typo: IGNOREEOFF - IGNOREEOF
 
  I removed the value none from the value list of ECHO because it's
  not
  documented and a user might get confused when he or she sees the
  undocumented
  value none. Thought?
 
 
  isn't better to fix doc? I don't know any reason why we should not to
  support none

 I'm OK with this. The attached patch adds the support of none value both
 in ECHO and HISTCONTROL variables (because HISTCONTROL had the same
 problem as ECHO had), and also adds the description of that value into
 the document.

  I looked to code, you removed a check against duplicate varname in list.
  Is
  it ok?

 Oh, just revived that code.


 yes, It is looking well

Ok, committed!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql: show only failed queries

2014-08-11 Thread Pavel Stehule
 
  Oh, just revived that code.
 
 
  yes, It is looking well

 Ok, committed!


super

thank you very much

Regards

Pavel



 Regards,

 --
 Fujii Masao



Re: [HACKERS] replication commands and log_statements

2014-08-11 Thread Fujii Masao
On Tue, Aug 12, 2014 at 2:34 AM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Aug 8, 2014 at 1:04 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Sat, Aug 9, 2014 at 12:29 AM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Aug 8, 2014 at 11:00 AM, Bruce Momjian br...@momjian.us wrote:
 On Fri, Aug  8, 2014 at 08:51:13AM -0400, Robert Haas wrote:
 On Thu, Aug 7, 2014 at 12:07 PM, Abhijit Menon-Sen a...@2ndquadrant.com 
 wrote:
  At 2014-08-07 23:22:43 +0900, masao.fu...@gmail.com wrote:
  That is, we log replication commands only when log_statement is set to
  all. Neither new parameter is introduced nor log_statement is
  redefined as a list.
 
  That sounds good to me.

 It sounds fairly unprincipled to me.  I liked the idea of making
 log_statement a list, but if we aren't gonna do that, I think this
 should be a separate parameter.

 I am unclear there is enough demand for a separate replication logging
 parameter --- using log_statement=all made sense to me.

 Most people don't want to turn on log_statement=all because it
 produces too much log volume.

 See, for example:
 http://bonesmoses.org/2014/08/05/on-postgresql-logging-verbosity/

 But logging replication commands is quite low-volume, so it is not
 hard to imagine someone wanting to log all replication commands but
 not all SQL statements.

 You can do that by executing
 ALTER ROLE <replication user> SET log_statement TO 'all'.
 If you don't use the replication user to execute SQL statements,
 no SQL statements are logged in that setting.

 If you have a user devoted to it, I suppose that's true.  I still
 think it shouldn't get munged together like that.

Why do we need to treat only replication commands as special and add a new
parameter to display them? I would agree to adding a new parameter like
log_replication to log the replication activity instead of the commands
themselves, much like checkpoint activity can be enabled by log_checkpoints.
But I'm not sure why logging of replication commands also needs to be
controlled by a separate parameter.

And I still think that those who set log_statement to all expect all
commands, including replication commands, to be logged. I'm afraid
that a parameter only for replication commands might confuse
users.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgresql.auto.conf and reload

2014-08-11 Thread Amit Kapila
On Mon, Aug 11, 2014 at 11:08 PM, Fujii Masao masao.fu...@gmail.com wrote:


 While updating the patch, I found that the ConfigVariable which
 is removed from the list has fields whose memory has already been
 allocated but which we forgot to free. So I included code to free
 them in the patch.

Yes, that is right.

! /*
!  * Pick up only the data_directory if DataDir is not set, which
!  * means that the configuration file is read for the first time and
!  * PG_AUTOCONF_FILENAME file cannot be read yet. In this case,
!  * we shouldn't pick any settings except the data_directory
!  * from postgresql.conf because they might be overwritten
!  * with the settings in PG_AUTOCONF_FILENAME file which will be
!  * read later. OTOH, since it's ensured that data_directory doesn't
!  * exist in PG_AUTOCONF_FILENAME file, it will never be overwritten
!  * later.
!  */
! else

It is a bit inconvenient to read this part of the code; some other places in
the same file put the comment inside the else construct, which seems
better. One example is below:

else
{
/*
 * ordinary variable, append to list.  For multiple items of
 * same parameter, retain only which comes later.
 */


I have marked this as Ready For Committer.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] replication commands and log_statements

2014-08-11 Thread Amit Kapila
On Tue, Aug 12, 2014 at 9:29 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Tue, Aug 12, 2014 at 2:34 AM, Robert Haas robertmh...@gmail.com
wrote:
 
  If you have a user devoted to it, I suppose that's true.  I still
  think it shouldn't get munged together like that.

 Why do we need to treat only replication commands as special ones and
 add new parameter to display them?

One difference is that replication commands are internally generated
commands. Do we log such internally generated
commands anywhere else with log_statement = all?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Removing dependency to wsock32.lib when compiling code on WIndows

2014-08-11 Thread Michael Paquier
On Tue, Aug 12, 2014 at 9:43 AM, Noah Misch n...@leadboat.com wrote:

 Somehow or other, we must bring these parts into agreement.


It is interesting to see that with MinGW-32 getaddrinfo is still set
to no at the configure phase. What if we simply wrap the undef of gai_strerror
as in the attached patch? I just set up an environment with MinGW-64
and was able to build the code (tested with MinGW-32 as well, btw).

Another idea would be to force getaddrinfo to no at configure time for
MinGW environments.
Thoughts?
-- 
Michael
From 284aff1932d085587aa98a6023d8a6bc4ab9a16a Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Mon, 11 Aug 2014 19:11:45 -0700
Subject: [PATCH] Fix build error of MinGW-64 caused by incorrect gai_strerror

Since commit a16bac3, configure detects HAVE_GETADDRINFO at configure
phase in builds done with MinGW-64, resulting in an incorrect definition
of gai_strerror, as this is unconditionally undefined in
src/include/port/win32/sys/socket.h. This commit simply prevents
gai_strerror from being undefined when HAVE_GETADDRINFO is detected.

This error was first triggered with libpq, but it may well be present
in other places; libpq is simply the first one to fail.
---
 src/include/port/win32/sys/socket.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/include/port/win32/sys/socket.h b/src/include/port/win32/sys/socket.h
index edaee6a..33e697a 100644
--- a/src/include/port/win32/sys/socket.h
+++ b/src/include/port/win32/sys/socket.h
@@ -28,6 +28,8 @@
  * they are defined inline in the MS header files. So we'll use our
  * own
  */
+#if !defined(HAVE_GETADDRINFO)
 #undef gai_strerror
+#endif
 
 #endif   /* WIN32_SYS_SOCKET_H */
-- 
2.0.4


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5: Memory-bounded HashAgg

2014-08-11 Thread Jeff Davis
On Mon, 2014-08-11 at 01:29 +0200, Tomas Vondra wrote:
 On 10.8.2014 23:26, Jeff Davis wrote:
  This patch is requires the Memory Accounting patch, or something
  similar to track memory usage.
 
 I think the patch you sent actually includes the accounting patch. Is
 that on purpose, or by accident?

Accident, thank you.

 So once a group gets into memory, it stays there? That's going to work
 fine for aggregates with fixed-size state (int4, or generally state that
 gets allocated and does not grow), but I'm afraid for aggregates with
 growing state (as for example array_agg and similar) that's not really a
 solution.

I agree in theory, but for now I'm just not handling that case at all
because there is other work that needs to be done first. For one thing,
we would need a way to save the transition state, and we don't really
have that. In the case of array_agg, the state is not serialized and
there's no generic way to ask it to serialize itself without finalizing.

I'm open to ideas. Do you think my patch is going generally in the right
direction, and we can address this problem later; or do you think we
need a different approach entirely?

 While hacking on the hash join, I envisioned the hash aggregate might
 work in a very similar manner, i.e. something like this:
 
   * nbatches=1, nbits=0
   * when work_mem gets full => nbatches *= 2, nbits += 1
   * get rid of half the groups, using nbits from the hash
  => dump the current states into 'states.batchno' file
  => dump further tuples to 'tuples.batchno' file
   * continue until the end, or until work_mem gets full again

It would get a little messy with HashAgg. Hashjoin is dealing entirely
with tuples; HashAgg deals with tuples and groups.

Also, if the transition state is fixed-size (or even nearly so), it
makes no sense to remove groups from the hash table before they are
finished. We'd need to detect that somehow, and it seems almost like two
different algorithms (though maybe not a bad idea to use a different
algorithm for things like array_agg).

Not saying that it can't be done, but (unless you have an idea) it requires
quite a bit more work than what I did here.

 It also seems to me that the logic of the patch is about this:
 
   * try to lookup the group in the hash table
 * found => call the transition function
 * not found
 * enough space => call transition function
 * not enough space => tuple/group goes to a batch
 
 Which pretty much means all tuples need to do the lookup first. The nice
 thing on the hash-join approach is that you don't really need to do the
 lookup - you just need to peek at the hash whether the group belongs to
 the current batch (and if not, to which batch it should go).

That's an interesting point. I suspect that, in practice, hashing the
tuple is about as expensive as (or at least not much cheaper than)
doing a failed lookup.
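
For illustration, a minimal sketch of that hash peek (hypothetical function
name; the bit layout follows the hash-join style scheme discussed earlier in
the thread):

    /*
     * With nbatch a power of two, the bits just above the bucket bits select
     * the batch, so a tuple that belongs to another batch can be written to
     * that batch's spill file without touching the hash table at all.
     */
    static inline int
    batch_for_hash(unsigned int hashvalue, int log2_nbuckets, int nbatch)
    {
        if (nbatch == 1)
            return 0;
        return (int) ((hashvalue >> log2_nbuckets) & (unsigned int) (nbatch - 1));
    }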

 For aggregates using 'internal' to pass pointers that requires some help
 from the author - serialization/deserialization functions.

Ah, yes, this is what I was referring to earlier.

 * EXPLAIN details for disk usage
 * choose number of partitions intelligently
 
 What is the purpose of HASH_DISK_MAX_PARTITIONS? I mean, when we decide
 we need 2048 partitions, why should we use less if we believe it will
 get us over work_mem?

Because I suspect there are costs in having an extra file around that
I'm not accounting for directly. We are implicitly assuming that the OS
will keep around enough buffers for each BufFile to do sequential writes
when needed. If we create a zillion partitions, then either we end up
with random I/O or we push the memory burden into the OS buffer cache.

We could try to model those costs explicitly to put some downward
pressure on the number of partitions we select, but I just chose to cap
it for now.
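
As a rough illustration of that trade-off (hypothetical names and cap value,
not the patch's actual constants):

    /*
     * Aim for partitions that individually fit in work_mem, but cap the
     * count so we do not keep an excessive number of spill files open.
     */
    #define MAX_PARTITIONS_SKETCH 256    /* placeholder cap */

    static int
    choose_num_partitions(double estimated_bytes, double work_mem_bytes)
    {
        int     n = (int) (estimated_bytes / work_mem_bytes) + 1;

        if (n < 1)
            n = 1;
        if (n > MAX_PARTITIONS_SKETCH)
            n = MAX_PARTITIONS_SKETCH;
        return n;
    }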

 For us, removing the sort is a big deal, because we're working with
 100M rows regularly. It's more complicated though, because the sort is
 usually enforced by COUNT(DISTINCT) and that's not going to disappear
 because of this patch. But that's solvable with a custom aggregate.

I hope this offers you a good alternative.

I'm not sure it will ever beat sort for very high cardinality cases, but
I hope it can beat sort when the group size averages something higher
than one. It will also be safer, so the optimizer can be more aggressive
about choosing HashAgg.

Thank you for taking a look so quickly!

Regards,
Jeff Davis






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing dependency to wsock32.lib when compiling code on WIndows

2014-08-11 Thread Noah Misch
On Tue, Aug 12, 2014 at 01:58:17PM +0900, Michael Paquier wrote:
 On Tue, Aug 12, 2014 at 9:43 AM, Noah Misch n...@leadboat.com wrote:
 
  Somehow or other, we must bring these parts into agreement.
 
 
 It is interesting to see that with MinGW-32b getaddrinfo is still set
 to no at configure phase. What if we simply wrap undef gai_strerror
 like in the patch attached? I just set up an environment with MinGW-64
 and I was able to build the code (tested as well with MinGW-32 btw).

It's easy to devise something that fixes the build.  What is the right fix,
and why?

Note that MinGW-w64 is available in both 32-bit and 64-bit.  It is a fork of
MinGW, which is always 32-bit.  I experienced the problem with 64-bit
MinGW-w64; I don't know how 32-bit MinGW-w64 compares.

 Another idea would be to enforce getaddrinfo to no at configure for
 MinGW environments.

Consistency with the MSVC build is desirable, either HAVE_GETADDRINFO for both
or !HAVE_GETADDRINFO for both.

nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers