Re: [HACKERS] Proposal: pg_confcheck - syntactic & semantic validation of postgresql configuration files

2015-11-08 Thread Peter Eisentraut
On 10/14/15 1:50 PM, Andres Freund wrote:
> On October 14, 2015 7:45:53 PM GMT+02:00, Alvaro Herrera 
>  wrote:
>> Amir Rohan wrote:
>>
>>> it does fail the "dependent options" test:
>>> $ postgres -C "archive_mode"
>>> on
>>> $ postgres -C wal_level
>>> minimal
>>>
>>> no errors, great, let's try it:
>>> $ pg_ctl restart
>>>
>>> FATAL:  WAL archival cannot be enabled when wal_level is "minimal"
>>
>> This complaint could be fixed if we had a --check-config that runs the
>> check hook for every variable, I think.

I think that would be widely useful and fairly uncontroversial.

> The problem is that this invariant, and some others, is checked outside the
> GUC framework. Which we should probably change, which IIRC will require some
> new infrastructure.

In the extreme, this problem is not solvable (halting problem).  If we
had a dry-run checking functionality, there would probably be more
incentive to normalize many of the common dependency cases into a
declarative system.
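
For reference, the specific cross-variable rule tripped over upthread can be
sketched as standalone code like this (a rough illustration only, not the
actual backend check or error path; the function and its arguments are
invented here):

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* toy validator for the archive_mode/wal_level dependency discussed above */
static bool
config_is_consistent(bool archive_mode, const char *wal_level)
{
	if (archive_mode && strcmp(wal_level, "minimal") == 0)
	{
		fprintf(stderr,
				"WAL archival cannot be enabled when wal_level is \"minimal\"\n");
		return false;
	}
	return true;
}

int
main(void)
{
	/* mirrors the pg_ctl restart failure shown upthread */
	return config_is_consistent(true, "minimal") ? 0 : 1;
}

A hypothetical --check-config would effectively run a battery of such
cross-variable validators after all the per-variable check hooks have passed.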





Re: [HACKERS] Rework the way multixact truncations work

2015-11-08 Thread Noah Misch
On Thu, Oct 29, 2015 at 08:46:52AM +0100, Andres Freund wrote:
> On October 29, 2015 7:59:03 AM GMT+01:00, Noah Misch  
> wrote:
> >That helps; thanks.  Your design seems good.  I've located only insipid
> >defects.
> 
> Great!
> 
> > I propose to save some time by writing a patch series
> >eliminating
> >them, which you could hopefully review.  Does that sound good?
> 
> Yes, it does.

I have pushed a stack of branches to https://github.com/nmisch/postgresql.git:

mxt0-revert - reverts commits 4f627f8 and aa29c1c
mxt1-disk-independent - see below
mxt2-cosmetic - update already-wrong comments and formatting
mxt3-main - replaces commit 4f627f8
mxt4-rm-legacy - replaces commit aa29c1c

The plan is to squash each branch into one PostgreSQL commit.  In addition to
examining overall "git diff mxt2-cosmetic mxt3-main", I recommend reviewing
itemized changes and commit log entries in "git log -p --reverse --no-merges
mxt2-cosmetic..mxt3-main".  In particular, when a change involves something
you discussed upthread or was otherwise not obvious, I put a statement of
rationale in the commit log.

> +  * Make sure to only attempt truncation if there's values to truncate
> +  * away. In normal processing values shouldn't go backwards, but there's
> +  * some corner cases (due to bugs) where that's possible.

Which bugs are those?  I would like to include more detail if available.


If anything here requires careful study, it's the small mxt1-disk-independent
change, which I have also attached in patch form.  That patch removes the
SlruScanDirCbFindEarliest() test from TruncateMultiXact(), which in turn makes
multixact control decisions independent of whether TruncateMultiXact() is
successful at unlinking segments.  Today, one undeletable segment file can
cause TruncateMultiXact() to give up on truncation completely for a span of
hundreds of millions of MultiXactId.  Patched multixact.c will, like CLOG,
make its decisions strictly based on the content of files it expects to exist.
It will no longer depend on the absence of files it hopes will not exist.

To aid in explaining the change's effects, I will define some terms.  A
"logical wrap" occurs when no range of 2^31 integers covers the set of
MultiXactId stored in xmax fields.  A "file-level wrap" occurs when there
exists a pair of pg_multixact/offsets segment files such that:

  | segno_a * SLRU_PAGES_PER_SEGMENT * MULTIXACT_OFFSETS_PER_PAGE -
    segno_b * SLRU_PAGES_PER_SEGMENT * MULTIXACT_OFFSETS_PER_PAGE | > 2^31

A logical wrap implies either a file-level wrap or missing visibility
metadata, but a file-level wrap does not imply other consequences.  The
SlruScanDirCbFindEarliest() test is usually redundant with
find_multixact_start(), because MultiXactIdPrecedes(oldestMXact, earliest)
almost implies that find_multixact_start() will fail.  The exception arises
when pg_multixact/offsets files compose a file-level wrap, which can happen
when TruncateMultiXact() fails to unlink segments as planned.  When it does
happen, the result of SlruScanDirCbFindEarliest(), and therefore the computed
"earliest" value, is undefined.  (This outcome is connected to our requirement
to use only half the pg_clog or pg_multixact/offsets address space at any one
time.  The PagePrecedes callbacks for these SLRUs cease to be transitive if
more than half their address space is in use.)
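
To make that definition concrete, here is a small standalone sketch (not from
the patch) of the file-level-wrap test; the two constants use the default
values and are shown purely for illustration:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SLRU_PAGES_PER_SEGMENT		32
#define MULTIXACT_OFFSETS_PER_PAGE	2048

static bool
is_file_level_wrap(int64_t segno_a, int64_t segno_b)
{
	int64_t		mxids_per_segment =
		(int64_t) SLRU_PAGES_PER_SEGMENT * MULTIXACT_OFFSETS_PER_PAGE;
	int64_t		distance = (segno_a - segno_b) * mxids_per_segment;

	if (distance < 0)
		distance = -distance;
	return distance > ((int64_t) 1 << 31);
}

int
main(void)
{
	/* segments more than half the offsets address space apart: wrapped */
	printf("%d\n", is_file_level_wrap(0, 40000));
	return 0;
}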

The SlruScanDirCbFindEarliest() test can be helpful when a file-level wrap
coexists with incorrect oldestMultiXactId (extant xmax values define "correct
oldestMultiXactId").  If we're lucky with readdir() order, the test will block
truncation so we don't delete still-needed segments.  I am content to lose
that, because (a) the code isn't reliable for (or even directed toward) that
purpose and (b) sites running on today's latest point releases no longer have
incorrect oldestMultiXactId.
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index cb12fc3..cb4a0cd 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -2945,29 +2945,6 @@ SlruScanDirCbRemoveMembers(SlruCtl ctl, char *filename, int segpage,
return false;   /* keep going */
 }
 
-typedef struct mxtruncinfo
-{
-   int earliestExistingPage;
-} mxtruncinfo;
-
-/*
- * SlruScanDirectory callback
- * This callback determines the earliest existing page number.
- */
-static bool
-SlruScanDirCbFindEarliest(SlruCtl ctl, char *filename, int segpage, void *data)
-{
-   mxtruncinfo *trunc = (mxtruncinfo *) data;
-
-   if (trunc->earliestExistingPage == -1 ||
-   ctl->PagePrecedes(segpage, trunc->earliestExistingPage))
-   {
-   trunc->earliestExistingPage = segpage;
-   }
-
-   return false;   /* keep going */
-}
-
 /*
  * Remove all MultiXactOffset and MultiXactMember segments before the oldest
  * ones 

Re: [HACKERS] eXtensible Transaction Manager API

2015-11-08 Thread Michael Paquier
On Sun, Nov 8, 2015 at 1:53 AM, Konstantin Knizhnik wrote:
> In tsDTM approach two phase commit is performed by coordinator and currently
> is using standard PostgreSQL two phase commit:
>
> Code in GO performing two phase commit:
>
>   exec(conn1, "prepare transaction '" + gtid + "'")
>   exec(conn2, "prepare transaction '" + gtid + "'")
>   exec(conn1, "select dtm_begin_prepare($1)", gtid)
>   exec(conn2, "select dtm_begin_prepare($1)", gtid)
>   csn = _execQuery(conn1, "select dtm_prepare($1, 0)", gtid)
>   csn = _execQuery(conn2, "select dtm_prepare($1, $2)", gtid, csn)
>   exec(conn1, "select dtm_end_prepare($1, $2)", gtid, csn)
>   exec(conn2, "select dtm_end_prepare($1, $2)", gtid, csn)
>   exec(conn1, "commit prepared '" + gtid + "'")
>   exec(conn2, "commit prepared '" + gtid + "'")
>
> If commit at some of the nodes failed, coordinator should rollback prepared
> transaction at all nodes.

Not always. If COMMIT PREPARED fails at some of the nodes but succeeds
on others, the transaction is already partially acknowledged as
committed in the cluster. Hence it makes more sense for the
coordinator to commit the transaction on the remaining nodes. Before
issuing any COMMIT PREPARED queries, though, I guess it's fine to roll back
the transactions on all nodes.
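
To make that coordinator behaviour concrete, here is a hedged libpq sketch
(not the Go code from the thread); conns[] and gtid are assumed to come from
the surrounding application:

#include <stdbool.h>
#include <stdio.h>
#include <libpq-fe.h>

static bool
commit_prepared_everywhere(PGconn **conns, int nconns, const char *gtid)
{
	char		sql[256];
	bool		all_ok = true;

	snprintf(sql, sizeof(sql), "COMMIT PREPARED '%s'", gtid);

	for (int i = 0; i < nconns; i++)
	{
		PGresult   *res = PQexec(conns[i], sql);

		if (PQresultStatus(res) != PGRES_COMMAND_OK)
		{
			fprintf(stderr, "node %d: %s", i, PQerrorMessage(conns[i]));
			all_ok = false;		/* must be retried later; rolling back is no longer an option */
		}
		PQclear(res);
	}

	return all_ok;
}

Once any node has acknowledged COMMIT PREPARED, the only sane recovery for the
remaining failures is to retry the commit, not to roll back.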
-- 
Michael




Re: [HACKERS] eXtensible Transaction Manager API

2015-11-08 Thread Konstantin Knizhnik

On 11/08/2015 02:46 PM, Michael Paquier wrote:

On Sun, Nov 8, 2015 at 1:53 AM, Konstantin Knizhnik wrote:

In tsDTM approach two phase commit is performed by coordinator and currently
is using standard PostgreSQL two phase commit:

Code in GO performing two phase commit:

   exec(conn1, "prepare transaction '" + gtid + "'")
   exec(conn2, "prepare transaction '" + gtid + "'")
   exec(conn1, "select dtm_begin_prepare($1)", gtid)
   exec(conn2, "select dtm_begin_prepare($1)", gtid)
   csn = _execQuery(conn1, "select dtm_prepare($1, 0)", gtid)
   csn = _execQuery(conn2, "select dtm_prepare($1, $2)", gtid, csn)
   exec(conn1, "select dtm_end_prepare($1, $2)", gtid, csn)
   exec(conn2, "select dtm_end_prepare($1, $2)", gtid, csn)
   exec(conn1, "commit prepared '" + gtid + "'")
   exec(conn2, "commit prepared '" + gtid + "'")

If commit at some of the nodes failed, coordinator should rollback prepared
transaction at all nodes.

Not always. If COMMIT PREPARED fails at some of the nodes but succeeds
on others, the transaction is already partially acknowledged as
committed in the cluster. Hence it makes more sense for the
coordinator to commit transactions on the remaining nodes. Before
issuing any COMMIT PREPARED queries, I guess that's fine to rollback
the transactions on all nodes though.

We will get an inconsistency if the transaction is committed on only a subset of the
nodes involved in it.
Assume the bank debit-credit example: if a distributed transaction transfers
money from an account at one node to an account at another node,
then committing the transaction at just one node causes an incorrect total balance.
The main goal of the DTM is to preserve ACID semantics for distributed transactions,
so either the transaction is committed at all nodes, or it is not committed at
all.





Re: [HACKERS] OS X El Capitan and DYLD_LIBRARY_PATH

2015-11-08 Thread Michael Paquier
On Sun, Nov 8, 2015 at 7:26 AM, Tom Lane  wrote:
> "David E. Wheeler"  writes:
>> Apple says that the more people file bugs, the more likely the issue is to 
>> get attention. So by all means, please file radars about this. You can 
>> reference 21732670 as the bug you’re duping (they marked mine as a dupe for 
>> that one).
>
> Done here.

Let's be noisy.
-- 
Michael




Re: [HACKERS] [PROPOSAL] Improvements of Hunspell dictionaries support

2015-11-08 Thread Artur Zakirov

On 07.11.2015 17:20, Emre Hasegeli wrote:

It seems to have something to do with the order of the affixes.  It
works, if I move affix 2646 to the beginning of the list.

[1] https://tr-spell.googlecode.com/files/dict_aff_5000_suffix_113_words.zip

Thank you for the reply.

This was because of the flag field size of the SPELL struct: long flags were
being truncated in the .dict file.


I attached a new patch. It is a temporary patch, not final; it can be done
better.
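
To illustrate the truncation being described (this is just an illustration,
not part of the patch; the MAXFLAGLEN value is an assumption for the example):

#include <stdio.h>

#define MAXFLAGLEN 16	/* assumed size of the fixed-length flag field */

int
main(void)
{
	char		flag[MAXFLAGLEN];

	/* emulates the fixed-size copy performed when a spell entry is added */
	snprintf(flag, sizeof(flag), "%s", "301990,301795,301815,301820");
	printf("%s\n", flag);	/* prints "301990,301795,3" -- the tail flags are lost */
	return 0;
}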


--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

*** a/src/backend/tsearch/spell.c
--- b/src/backend/tsearch/spell.c
***
*** 153,159  cmpspell(const void *s1, const void *s2)
  static int
  cmpspellaffix(const void *s1, const void *s2)
  {
! 	return (strncmp((*(SPELL *const *) s1)->p.flag, (*(SPELL *const *) s2)->p.flag, MAXFLAGLEN));
  }
  
  static char *
--- 153,159 
  static int
  cmpspellaffix(const void *s1, const void *s2)
  {
! 	return (strcmp((*(SPELL *const *) s1)->flag, (*(SPELL *const *) s2)->flag));
  }
  
  static char *
***
*** 237,242  cmpaffix(const void *s1, const void *s2)
--- 237,309 
  	   (const unsigned char *) a2->repl);
  }
  
+ static unsigned short
+ decodeFlag(IspellDict *Conf, char *sflag, char **sflagnext)
+ {
+ 	unsigned short	s;
+ 	char		   *next;
+ 
+ 	switch (Conf->flagMode)
+ 	{
+ 		case FM_LONG:
+ 			s = (int)sflag[0] << 8 | (int)sflag[1];
+ 			if (sflagnext)
+ 				*sflagnext = sflag + 2;
+ 			break;
+ 		case FM_NUM:
+ 			s = (unsigned short) strtol(sflag, &next, 10);
+ 			if (sflagnext)
+ 			{
+ 				if (next)
+ 				{
+ 					*sflagnext = next;
+ 					while (**sflagnext)
+ 					{
+ 						if (**sflagnext == ',')
+ 						{
+ 							*sflagnext = *sflagnext + 1;
+ 							break;
+ 						}
+ 						*sflagnext = *sflagnext + 1;
+ 					}
+ 				}
+ 				else
+ 					*sflagnext = 0;
+ 			}
+ 			break;
+ 		default:
+ 			s = (unsigned short) *((unsigned char *)sflag);
+ 			if (sflagnext)
+ 				*sflagnext = sflag + 1;
+ 	}
+ 
+ 	return s;
+ }
+ 
+ static bool
+ isAffixFlagInUse(IspellDict *Conf, int affix, unsigned short affixflag)
+ {
+ 	char *flagcur;
+ 	char *flagnext = 0;
+ 
+ 	if (affixflag == 0)
+ 		return true;
+ 
+ 	flagcur = Conf->AffixData[affix];
+ 
+ 	while (*flagcur)
+ 	{
+ 		if (decodeFlag(Conf, flagcur, &flagnext) == affixflag)
+ 			return true;
+ 		if (flagnext)
+ 			flagcur = flagnext;
+ 		else
+ 			break;
+ 	}
+ 
+ 	return false;
+ }
+ 
  static void
  NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
  {
***
*** 255,261  NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
  	}
  	Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
  	strcpy(Conf->Spell[Conf->nspell]->word, word);
! 	strlcpy(Conf->Spell[Conf->nspell]->p.flag, flag, MAXFLAGLEN);
  	Conf->nspell++;
  }
  
--- 322,328 
  	}
  	Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
  	strcpy(Conf->Spell[Conf->nspell]->word, word);
! 	Conf->Spell[Conf->nspell]->flag = cpstrdup(Conf, flag);
  	Conf->nspell++;
  }
  
***
*** 355,361  FindWord(IspellDict *Conf, const char *word, int affixflag, int flag)
  	else if ((flag & StopMiddle->compoundflag) == 0)
  		return 0;
  
! 	if ((affixflag == 0) || (strchr(Conf->AffixData[StopMiddle->affix], affixflag) != NULL))
  		return 1;
  }
  node = StopMiddle->node;
--- 422,428 
  	else if ((flag & StopMiddle->compoundflag) == 0)
  		return 0;
  
! 	if (isAffixFlagInUse(Conf, StopMiddle->affix, affixflag))
  		return 1;
  }
  node = StopMiddle->node;
***
*** 394,400  NIAddAffix(IspellDict *Conf, int flag, char flagflags, const char *mask, const c
  
  	Affix = Conf->Affix + Conf->naffixes;
  
! 	if (strcmp(mask, ".") == 0)
  	{
  		Affix->issimple = 1;
  		Affix->isregis = 0;
--- 461,467 
  
  	Affix = Conf->Affix + Conf->naffixes;
  
! 	if (strcmp(mask, ".") == 0 || *mask == '\0')
  	{
  		Affix->issimple = 1;
  		Affix->isregis = 0;
***
*** 429,443  NIAddAffix(IspellDict *Conf, int flag, char flagflags, const char *mask, const c
  		err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
  		 REG_ADVANCED | REG_NOSUB,
  		 DEFAULT_COLLATION_OID);
  		if (err)
! 		{
! 			char		errstr[100];
! 
! 			pg_regerror(err, &(Affix->reg.regex), errstr, sizeof(errstr));
! 			ereport(ERROR,
! 	(errcode(ERRCODE_INVALID_REGULAR_EXPRESSION),
! 	 errmsg("invalid regular expression: %s", errstr)));
! 		}
  	}
  
  	Affix->flagflags = flagflags;
--- 496,504 
  		err = pg_regcomp(&(Affix->reg.regex), wmask, wmasklen,
  		 REG_ADVANCED | REG_NOSUB,
  		 DEFAULT_COLLATION_OID);
+ 		/* Ignore regular expression error and do not add wrong affix */
  		if (err)
! 			return;
  	}
  
  	Affix->flagflags = flagflags;
***
*** 595,604  addFlagValue(IspellDict *Conf, char *s, uint32 val)
  

Re: [HACKERS] Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

2015-11-08 Thread Michael Paquier
On Sat, Nov 7, 2015 at 3:54 PM, Michael Paquier wrote:
> I thought about something like that at some point by saving a minimum
> activity pointer in XLogCtl, updated each time a segment was forcibly
> switched or after inserting a checkpoint record. Then the bgwriter
> looked at if the current insert position matched this minimum activity
> pointer, skipping LogStandbySnapshot if both positions match. Does
> this match your line of thoughts?

Looking at the code, it occurred to me that the LSN position saved for
a XLOG_SWITCH record is the last position of current segment, so we
would still need to check if the current insert LSN matches the
beginning of a new segment and if the last segment was forcibly
switched by saving RecPtr of RequestXLogSwitch in XLogCtl for example.
Thoughts?
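
As a toy model of the scheme described above (standalone code, not backend
code; all names here are invented for illustration):

#include <stdbool.h>
#include <stdint.h>

typedef uint64_t FakeXLogRecPtr;	/* stand-in for XLogRecPtr */

/* saved whenever a segment is forcibly switched or a checkpoint record is inserted */
static FakeXLogRecPtr minimum_activity_ptr;

static void
note_forced_activity(FakeXLogRecPtr ptr)
{
	minimum_activity_ptr = ptr;
}

/* bgwriter side: log a standby snapshot only if WAL has advanced since then */
static bool
standby_snapshot_needed(FakeXLogRecPtr curr_insert_ptr)
{
	return curr_insert_ptr != minimum_activity_ptr;
}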
-- 
Michael




Re: [HACKERS] pam auth - add rhost item

2015-11-08 Thread Michael Paquier
On Sat, Oct 17, 2015 at 1:00 AM, kolo hhmow  wrote:
> Ok.
> Thank you all!

This patch was listed twice in the CF app. I removed the duplicated
entry and kept this one:
https://commitfest.postgresql.org/7/392/
Could you add your name as an author please?
-- 
Michael




Re: [HACKERS] eXtensible Transaction Manager API

2015-11-08 Thread Simon Riggs
On 8 November 2015 at 16:59, Konstantin Knizhnik 
wrote:

> On 11/08/2015 02:46 PM, Michael Paquier wrote:
>
>> On Sun, Nov 8, 2015 at 1:53 AM, Konstantin Knizhnik wrote:
>>
>>> In tsDTM approach two phase commit is performed by coordinator and
>>> currently
>>> is using standard PostgreSQL two phase commit:
>>>
>>> Code in GO performing two phase commit:
>>>
>>>exec(conn1, "prepare transaction '" + gtid + "'")
>>>exec(conn2, "prepare transaction '" + gtid + "'")
>>>exec(conn1, "select dtm_begin_prepare($1)", gtid)
>>>exec(conn2, "select dtm_begin_prepare($1)", gtid)
>>>csn = _execQuery(conn1, "select dtm_prepare($1, 0)", gtid)
>>>csn = _execQuery(conn2, "select dtm_prepare($1, $2)", gtid,
>>> csn)
>>>exec(conn1, "select dtm_end_prepare($1, $2)", gtid, csn)
>>>exec(conn2, "select dtm_end_prepare($1, $2)", gtid, csn)
>>>exec(conn1, "commit prepared '" + gtid + "'")
>>>exec(conn2, "commit prepared '" + gtid + "'")
>>>
>>> If commit at some of the nodes failed, coordinator should rollback
>>> prepared
>>> transaction at all nodes.
>>>
>> Not always. If COMMIT PREPARED fails at some of the nodes but succeeds
>> on others, the transaction is already partially acknowledged as
>> committed in the cluster. Hence it makes more sense for the
>> coordinator to commit transactions on the remaining nodes. Before
>> issuing any COMMIT PREPARED queries, I guess that's fine to rollback
>> the transactions on all nodes though.
>>
> We will get inconsistency if  transaction is committed on some subset of
> nodes involved in transaction.
> Assume bank debit-credit example. If some distributed transaction
> transfers money from the account at one node to the account and another
> node,
> then committing transaction just at one node cause incorrect total balance.
> The main goal of DTM is to preserve ACID semantic for distributed
> transaction, so either transaction is committed at all nodes, either it is
> not committed at all.


Agreed.

COMMIT PREPARED is a pretty thin layer; the work is all done in the
PREPARE. With a DTM, the main commit itself is done once only in the DTM,
so all the COMMIT PREPARED would do is release local node resources.

-- 
Simon Riggs    http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Uh-oh: documentation PDF output no longer builds in HEAD

2015-11-08 Thread Andres Freund
On 2015-11-08 13:34:18 -0500, Tom Lane wrote:
> $ cd doc/src/sgml
> $ make postgres-US.pdf
> ... lots of crap later ...
> ! TeX capacity exceeded, sorry [number of strings=245828].

> We ran into a very similar issue back around 9.0, and solved it with an
> ugly style-sheet hack, see thread here:
> http://www.postgresql.org/message-id/flat/1270189232.5018.7.ca...@hp-laptop2.gunduz.org
>
> As noted then, and as I reconfirmed just now, you can *not* fix this by
> hacking TeX's parameters: there is a hard-wired limit of 2^18 strings
> regardless of what you try to set in texmf.cnf.

While taking just short of forever, postgres-US.pdf seems to build on
my debian unstable as of 8d7396e509 + some additional docs. Is this
dependent on what version of TeX you're using (plain tex, pdftex,
xetex, whatnot)?

postgres-US.log contains:
360764 strings out of 481710
2617927 string characters out of 6028023
857532 words of memory out of 5085000
252961 multiletter control sequences out of 15000+60
101035 words of font info for 156 fonts, out of 800 for 9000
36 hyphenation exceptions out of 8191

So at least debian's version of tex seems to have worked around the
limit somehow. I found only one interesting-looking setting in the
relevant config files:

%%
%% jacking up TeX settings for the unique uses of jadetex
%%
extra_mem_bot.jadetex = 85000
extra_mem_bot.pdfjadetex = 85000

Andres




Re: [HACKERS] extend pgbench expressions with functions

2015-11-08 Thread Robert Haas
On Sat, Nov 7, 2015 at 2:45 AM, Fabien COELHO  wrote:
> After looking at the generated html version, I find that the "1/param" and
> "2/param" formula are very simple and pretty easy to read, and they would
> not be really enhanced with additional spacing.
>
> ISTM that adaptative spacing (no spacing for level 1 operations, some for
> higher level) is a good approach for readability, ie:
>
>f(i) - f(i+1)
>  ^ no spacing here
> ^ some spacing here
>
> So I would suggest to keep the submitted version, unless this is a blocker.

Well, I think with the ".0" version it looks more like floating-point
math, and I like the extra white-space.  But I'm happy to hear other
opinions.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [BUGS] BUG #13611: test_postmaster_connection failed (Windows, listen_addresses = '0.0.0.0' or '::')

2015-11-08 Thread Noah Misch
On Thu, Oct 29, 2015 at 05:39:36PM +0900, Tatsuo Ishii wrote:

> > I am happy to finish it, but I am no less happy if you finish it.  Which do
> > you prefer?
> 
> Please go ahead and commit.
> 
> > Should the back-branch commits mirror the master branch?  A more-cautious
> > alternative would be to, in back branches, wrap the change in #ifdefs so it
> > takes effect only on Windows, OpenBSD and NetBSD.  It could break setups 
> > with
> > local firewall rules that block connections to "127.0.0.1" or "::1" without
> > blocking "0.0.0.0" or "::".  Such firewall rules sound outlandish enough 
> > that
> > I would be fairly comfortable not worrying about this and making the change
> > unconditional in all branches.  It's a judgment call, though.
> 
> I think back patching with #ifdefs is better. On Windows etc. the case
> has been broken anyway and the fix should only bring benefits to users.

I committed it with #ifdef in 9.1-9.4 and without #ifdef in 9.5 and master.




Re: [HACKERS] Rework the way multixact truncations work

2015-11-08 Thread Andres Freund
On November 8, 2015 12:54:07 AM PST, Noah Misch  wrote:

>I have pushed a stack of branches to
>https://github.com/nmisch/postgresql.git:
>
>mxt0-revert - reverts commits 4f627f8 and aa29c1c
>mxt1-disk-independent - see below
>mxt2-cosmetic - update already-wrong comments and formatting
>mxt3-main - replaces commit 4f627f8
>mxt4-rm-legacy - replaces commit aa29c1c
>
>The plan is to squash each branch into one PostgreSQL commit.  In
>addition to
>examining overall "git diff mxt2-cosmetic mxt3-main", I recommend
>reviewing
>itemized changes and commit log entries in "git log -p --reverse
>--no-merges
>mxt2-cosmetic..mxt3-main".  In particular, when a change involves
>something
>you discussed upthread or was otherwise not obvious, I put a statement
>of
>rationale in the commit log.

I'm not following along right now - in order to make cleanups, the plan is to
revert a couple of commits and then redo them prettified?

Andres
Hi
--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.




Re: [HACKERS] Bitmap index scans use of filters on available columns

2015-11-08 Thread Tom Lane
Jeff Janes  writes:
> On Fri, Nov 6, 2015 at 7:15 PM, Tomas Vondra
>  wrote:
>> I've however also noticed that all the 'like' procedures are marked as not
>> leak proof, which is a bit unfortunate because that's the example from
>> Jeff's e-mail that started this thread.

> Is there a reason they aren't leak proof?  I don't see that they have
> side effects, or that they can throw errors.  What do they leak?

Huh?

regression=# select 'z' like '\';
ERROR:  LIKE pattern must not end with escape character

regards, tom lane




[HACKERS] Uh-oh: documentation PDF output no longer builds in HEAD

2015-11-08 Thread Tom Lane
$ cd doc/src/sgml
$ make postgres-US.pdf
... lots of crap later ...

[3253.0.51
! TeX capacity exceeded, sorry [number of strings=245828].
 
   \endgroup \set@typeset@protect 
l.1879198 {1}}
  \Node%
!  ==> Fatal error occurred, no output PDF file produced!
Transcript written on postgres-US.log.
make: *** [postgres-US.pdf] Error 1
rm postgres-US.tex-pdf

The A4-format PDF still builds, which implies that this has something to
do with the number of pages produced.  The 9.5 branch also still builds,
but it seems highly likely that it is within a few pages of failing as
well.  (HEAD is only about a dozen pages longer than 9.5 in A4 format,
and presumably the difference is around that for US format.)

We ran into a very similar issue back around 9.0, and solved it with an
ugly style-sheet hack, see thread here:
http://www.postgresql.org/message-id/flat/1270189232.5018.7.ca...@hp-laptop2.gunduz.org

As noted then, and as I reconfirmed just now, you can *not* fix this by
hacking TeX's parameters: there is a hard-wired limit of 2^18 strings
regardless of what you try to set in texmf.cnf.

Thoughts?

regards, tom lane




Re: [HACKERS] Rework the way multixact truncations work

2015-11-08 Thread Noah Misch
On Sun, Nov 08, 2015 at 11:11:42AM -0800, Andres Freund wrote:
> On November 8, 2015 12:54:07 AM PST, Noah Misch  wrote:
> 
> >I have pushed a stack of branches to
> >https://github.com/nmisch/postgresql.git:
> >
> >mxt0-revert - reverts commits 4f627f8 and aa29c1c
> >mxt1-disk-independent - see below
> >mxt2-cosmetic - update already-wrong comments and formatting
> >mxt3-main - replaces commit 4f627f8
> >mxt4-rm-legacy - replaces commit aa29c1c
> >
> >The plan is to squash each branch into one PostgreSQL commit.  In
> >addition to
> >examining overall "git diff mxt2-cosmetic mxt3-main", I recommend
> >reviewing
> >itemized changes and commit log entries in "git log -p --reverse
> >--no-merges
> >mxt2-cosmetic..mxt3-main".  In particular, when a change involves
> >something
> >you discussed upthread or was otherwise not obvious, I put a statement
> >of
> >rationale in the commit log.
> 
> I'm not following along right now - in order to make cleanups the plan is to 
> revert a couple commits and then redo them prettyfied?

Yes, essentially.  Given the volume of updates, this seemed neater than
framing those updates as in-tree incremental development.




Re: [HACKERS] Uh-oh: documentation PDF output no longer builds in HEAD

2015-11-08 Thread Andres Freund
On 2015-11-08 16:29:56 -0500, Tom Lane wrote:
> and...@anarazel.de (Andres Freund) writes:
> > While taking pretty short of forever, postgres-US.pdf seems to build on
> > my debian unstable as of 8d7396e509 + some additional docs. Is this
> > dependant of what version of text you're using (plain tex, pdftex,
> > xetex, whatnot)?
> 
> > postgres-US.log contains:
> > 360764 strings out of 481710
> 
> Interesting.  They must have boosted the strings limit from 2^18 to 2^19.
> According to what I've read, this is doable when compiling TeX from
> source, but we can hardly expect users (or packagers) to do that if their
> distribution hasn't built it that way.  (The 2^18 limit I'm seeing is with
> RHEL6's tex package.

Debian uses pdflatex from texlive-base (2015.20151016-1). Maybe that's
the relevant difference.

> I'm currently downloading the Fedora rawhide package
> to see if it's any better, but man that is one large package...)

It indeed is. I've not found any relevant patches in debian's
package. Lots of changing paths and defaults, but afaics nothing else.

> BTW, I realized after poking around that the hack I put in back in 9.0
> probably only eliminates about 5000 strings from the pool, because
> it should save one string per \pagelabel entry added to the .aux
> file, and there are less than 5000 such entries after a successful
> build.  So that was a good quick-n-dirty fix but it's really only
> scratching the surface of the problem: there are ~24 other strings
> getting made somewhere.  I wonder if a better answer is possible.

Debian's pdfjadetex package has the following comment in
README.jadetex.cfg:

> * Not Labelling Elements
> 
> In some cases, it is possible for pdfjadetex to error out even with
> expanded texmf.cnf settings.  The sign of this is that jadetex is able
> to process the file, but pdfjadetex isn't.  The upstream maintainer,
> Sebastian Rahtz, had this to say:
> 
> | pdfjadetex _can_ go over a string limit in TeX
> | which *isn't* changeable in texmf.cnf. The workaround is to write a
> | file called jadetex.cfg, containing just the line
> |
> | \LabelElementsfalse
> |
> | and see if that helps. it stops jadetex from using up a string for
> | every element. If that leaves unsatisfied cross-references, try
> | "jadetex" instead of "pdfjadetex", and create your PDF in
> | another via
> | (ie via Distiller).

Might be worthwhile to see whether \LabelElementsfalse makes a huge
difference.




Re: [HACKERS] Uh-oh: documentation PDF output no longer builds in HEAD

2015-11-08 Thread Tom Lane
and...@anarazel.de (Andres Freund) writes:
> While taking pretty short of forever, postgres-US.pdf seems to build on
> my debian unstable as of 8d7396e509 + some additional docs. Is this
> dependant of what version of text you're using (plain tex, pdftex,
> xetex, whatnot)?

> postgres-US.log contains:
> 360764 strings out of 481710

Interesting.  They must have boosted the strings limit from 2^18 to 2^19.
According to what I've read, this is doable when compiling TeX from
source, but we can hardly expect users (or packagers) to do that if their
distribution hasn't built it that way.  (The 2^18 limit I'm seeing is with
RHEL6's tex package.  I'm currently downloading the Fedora rawhide package
to see if it's any better, but man that is one large package...)

BTW, I realized after poking around that the hack I put in back in 9.0
probably only eliminates about 5000 strings from the pool, because
it should save one string per \pagelabel entry added to the .aux
file, and there are less than 5000 such entries after a successful
build.  So that was a good quick-n-dirty fix but it's really only
scratching the surface of the problem: there are ~24 other strings
getting made somewhere.  I wonder if a better answer is possible.

regards, tom lane




Re: [HACKERS] Rework the way multixact truncations work

2015-11-08 Thread Andres Freund
On November 8, 2015 11:52:05 AM PST, Noah Misch  wrote:
>On Sun, Nov 08, 2015 at 11:11:42AM -0800, Andres Freund wrote:
>> On November 8, 2015 12:54:07 AM PST, Noah Misch 
>wrote:
>> 
>> >I have pushed a stack of branches to
>> >https://github.com/nmisch/postgresql.git:
>> >
>> >mxt0-revert - reverts commits 4f627f8 and aa29c1c
>> >mxt1-disk-independent - see below
>> >mxt2-cosmetic - update already-wrong comments and formatting
>> >mxt3-main - replaces commit 4f627f8
>> >mxt4-rm-legacy - replaces commit aa29c1c
>> >
>> >The plan is to squash each branch into one PostgreSQL commit.  In
>> >addition to
>> >examining overall "git diff mxt2-cosmetic mxt3-main", I recommend
>> >reviewing
>> >itemized changes and commit log entries in "git log -p --reverse
>> >--no-merges
>> >mxt2-cosmetic..mxt3-main".  In particular, when a change involves
>> >something
>> >you discussed upthread or was otherwise not obvious, I put a
>statement
>> >of
>> >rationale in the commit log.
>> 
>> I'm not following along right now - in order to make cleanups the
>plan is to revert a couple commits and then redo them prettyfied?
>
>Yes, essentially.  Given the volume of updates, this seemed neater than
>framing those updates as in-tree incremental development.

I don't like that plan. I don't have a problem doing that in some development 
branch somewhere, but I fail to see any benefit doing that in 9.5/master. It'll 
just make the history more convoluted for no benefit.

I'll obviously still review the changes.

Even for review it's not particularly convenient, because now the entirety of
the large changes essentially needs to be reviewed anew, given they're not the
same.

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.




Re: [HACKERS] Bitmap index scans use of filters on available columns

2015-11-08 Thread Jeff Janes
On Fri, Nov 6, 2015 at 7:15 PM, Tomas Vondra
 wrote:
> Hi,
>
> On 11/07/2015 02:18 AM, Robert Haas wrote:
>>
>> On Fri, Nov 6, 2015 at 7:11 PM, Tomas Vondra
>>  wrote:

 I think LEAKPROOF is probably fine for this.  How would the new thing
 be different?
>>>
>>>
>>> I don't think so - AFAIK "leakproof" is about not revealing information
>>> about arguments, nothing more and nothing less. It does not say whether
>>> it's
>>> safe to evaluate on indexes, and I don't think we should extend the
>>> meaning
>>> in this direction.
>>
>>
>> That seems like a non-answer answer.
>
>
> I'm not claiming I have an answer, really. My knowledge of leakproof stuff
> is a bit shallow. Also, I had a few beers earlier today, which does not
> really improve the depth of knowledge on any topic except politics and
> football (aka soccer). So you may be perfectly right.
>
>> Clearly, if a function can leak information about it's arguments,
>> for example by throwing an error, we can't call it on tuples that
>> might not even be visible, or the behavior of the query might be
>> change. So any function that is not leakproof is also not safe for
>> this purpose.
>>
>> Now that doesn't rule out the possibility that the functions for
>> which this optimization is safe are a subset of the leakproof
>> functions - but offhand I don't see why that should be the case. The
>> index tuple is just a tuple, and the values it contains are just
>> datums, no more or less than in a heap tuple. There could be a reason
>> for concern here, but saying that it might not be "safe to evaluate
>> on indexes" just begs the question: WHY wouldn't that be safe?
>
>
> Ah. For some reason I thought the "division by zero" example is one of the
> non-safe cases, but now I see int4div is not marked as leakproof, so we
> couldn't push it to index anyway.
>
> I've however also noticed that all the 'like' procedures are marked as not
> leak proof, which is a bit unfortunate because that's the example from
> Jeff's e-mail that started this thread.

Is there a reason they aren't leak proof?  I don't see that they have
side effects, or that they can throw errors.  What do they leak?

Anyway, I just picked that for convenience.  One of the original
pieces that lead me on this quest had a range inclusion rather than a
LIKE.  I just thought that changing it to a LIKE made the data
generator simpler and easier to reason about, without changing the
fundamental principle.

Cheers,

Jeff




Re: [HACKERS] Some questions about the array.

2015-11-08 Thread Robert Haas
On Fri, Nov 6, 2015 at 9:44 PM, Jim Nasby  wrote:
>> Since the start-pos is recorded in the array, I wonder if it's worth
>> supporting negative indexing for arrays with the default 1-indexed
>> element numbering, and just ERRORing for others. Does anyone really
>> use anything else?
>
> I'd prefer that over using something like ~.

I'm not necessarily objecting to that, but it's not impossible that it
could break something for some existing user.  We can decide not to
care about that, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Uh-oh: documentation PDF output no longer builds in HEAD

2015-11-08 Thread Tom Lane
and...@anarazel.de (Andres Freund) writes:
>> In some cases, it is possible for pdfjadetex to error out even with
>> expanded texmf.cnf settings.  The sign of this is that jadetex is able
>> to process the file, but pdfjadetex isn't.  The upstream maintainer,
>> Sebastian Rahtz, had this to say:
>> 
>> | pdfjadetex _can_ go over a string limit in TeX
>> | which *isn't* changeable in texmf.cnf. The workaround is to write a
>> | file called jadetex.cfg, containing just the line
>> |
>> | \LabelElementsfalse

Interesting.  That seems to be a slightly more aggressive version of my
9.0-era hack: it effectively turns FlowObjectSetup into a no-op that won't
generate page labels at all, saving *both* of the strings it would create,
not just one.  However, I'm afraid that's not gonna do: it looks like it
turns a large fraction of the index entries from page numbers into "??".
And some of the table-of-contents entries as well.  (It looks like maybe
only things with explicit id= entries get correct page number data with
this setting.  We could maybe live with that if the tool threw an error
about missing ids; but it doesn't, it just emits "??" ...)

Curiously though, that gets us down to this:

 30615 strings out of 245828
 397721 string characters out of 1810780

which implies that indeed FlowObjectSetup *is* the cause of most of
the strings being entered.  I'm not sure how that squares with the
observation that there are less than 5000 \pagelabel entries in the
postgres-US.aux file.  Time for more digging.

regards, tom lane




Re: [HACKERS] snapshot too old, configured by time

2015-11-08 Thread Steve Singer

On 10/15/2015 05:47 PM, Kevin Grittner wrote:
All other issues raised by Álvaro and Steve have been addressed, 
except for this one, which I will argue against: 


I've been looking through the updated patch


In snapmgr.c


+ * XXX: If we can trust a read of an int64 value to be atomic, we can skip the
+ * spinlock here.
+ */
+int64
+GetOldSnapshotThresholdTimestamp(void)

Was your intent with the XXX for it to be a TODO to only acquire the lock
on platforms without atomic 64-bit operations?


On a more general note:

I've tried various manual tests of this feature and it sometimes works 
as expected and sometimes doesn't.
I'm getting the feeling that how I expect it to work isn't quite in sync 
with how it does work.


I'd expect the following to be sufficient to generate the error (see the libpq sketch after this list):

T1: Obtains a snapshot that can see some rows
T2: Waits 60 seconds and performs an update on those rows
T2: Performs a vacuum
T1: Waits 60 seconds, tries to select from the table.  The snapshot 
should be too old
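
A rough libpq sketch of that sequence (not part of the patch or its TAP
tests; it assumes old_snapshot_threshold = 1 on the server and an existing
table "t" with an integer column "x"):

#include <stdio.h>
#include <unistd.h>
#include <libpq-fe.h>

static void
run(PGconn *conn, const char *sql)
{
	PGresult   *res = PQexec(conn, sql);

	if (PQresultStatus(res) != PGRES_COMMAND_OK &&
		PQresultStatus(res) != PGRES_TUPLES_OK)
		fprintf(stderr, "%s -> %s", sql, PQerrorMessage(conn));
	PQclear(res);
}

int
main(void)
{
	PGconn	   *t1 = PQconnectdb("");	/* connection parameters from environment */
	PGconn	   *t2 = PQconnectdb("");

	if (PQstatus(t1) != CONNECTION_OK || PQstatus(t2) != CONNECTION_OK)
		return 1;

	/* T1: obtain a snapshot that can see some rows */
	run(t1, "BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ");
	run(t1, "SELECT count(*) FROM t");

	sleep(61);							/* T2: wait past the threshold */
	run(t2, "UPDATE t SET x = x + 1");	/* T2: update those rows */
	run(t2, "VACUUM t");				/* T2: vacuum */

	sleep(61);							/* T1: wait again */
	run(t1, "SELECT count(*) FROM t");	/* expect: ERROR: snapshot too old */

	PQfinish(t1);
	PQfinish(t2);
	return 0;
}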



For example, it seems that in test 002 the select_ok on conn1 following
the vacuum but right before the final sleep is critical to the snapshot
too old error showing up (i.e. if I remove that select_ok but leave in the
sleep, I don't get the error).


Is this intended and if so is there a better way we can explain how 
things work?



Also is 0 intended to be an acceptable value for old_snapshot_threshold 
and if so what should we expect to see then?






So if I understand correctly, every call to BufferGetPage needs to have
a TestForOldSnapshot immediately afterwards?  It seems easy to later
introduce places that fail to test for old snapshots.  What happens if
they do?  Does vacuum remove tuples anyway and then the query returns
wrong results?  That seems pretty dangerous.  Maybe the snapshot could
be an argument of BufferGetPage?

There are 486 occurrences of BufferGetPage in the source code, and
this patch follows 36 of them with TestForOldSnapshot.  This only
needs to be done when a page is going to be used for a scan which
produces user-visible results.  That is, it is *not* needed for
positioning within indexes to add or vacuum away entries, for heap
inserts or deletions (assuming the row to be deleted has already
been found).  It seems wrong to modify about 450 BufferGetPage
references to add a NULL parameter; and if we do want to make that
noop change as insurance, it seems like it should be a separate
patch, since the substance of this patch would be buried under the
volume of that.

I will add this to the November CF.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company








Re: [HACKERS] PATCH: 9.5 replication origins fix for logical decoding

2015-11-08 Thread Andres Freund
Hi,

On 2015-10-19 21:43:32 +0800, Craig Ringer wrote:
> Patch revision 3 attached. It's a one-liner, with just the fix, and an
> explanatory note in the patch header.

Pushed to 9.5 and master.

Thanks for noticing the issue,

Andres Freund




Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-08 Thread Kouhei Kaigai
> -Original Message-
> From: Kaigai Kouhei(海外 浩平)
> Sent: Sunday, November 08, 2015 12:38 AM
> To: 'Robert Haas'
> Cc: Etsuro Fujita; Tom Lane; Kyotaro HORIGUCHI; pgsql-hackers@postgresql.org;
> Shigeru Hanada
> Subject: Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
> 
> > On Fri, Nov 6, 2015 at 9:42 AM, Kouhei Kaigai  wrote:
> > > This patch needs to be rebased.
> > > One thing different from the latest version is fdw_recheck_quals of
> > > ForeignScan was added. So, ...
> > >
> > > (1) Principle is that FDW driver knows what qualifiers were pushed down
> > > and how does it kept in the private field. So, fdw_recheck_quals is
> > > redundant and to be reverted.
> > >
> > > (2) Even though the principle is as described in (1), however,
> > > wired logic in ForeignRecheck() and fdw_recheck_quals are useful
> > > default for most of FDW drivers. So, it shall be kept and valid
> > > only if RecheckForeignScan callback is not defined.
> > >
> > > Which is better approach for the v3 patch?
> > > My preference is (1), because fdw_recheck_quals is a new feature,
> > > thus, FDW driver has to be adjusted in v9.5 more or less, even if
> > > it already supports qualifier push-down.
> > > In general, interface becomes more graceful to stick its principle.
> >
> > fdw_recheck_quals seems likely to be very convenient for FDW authors,
> > and I think ripping it out would be a terrible decision.
> >
> OK, I try to co-exist fdw_recheck_quals and RecheckForeignScan callback.
> 
> > I think ForeignRecheck should first call ExecQual to test
> > fdw_recheck_quals.  If it returns false, return false.  If it returns
> > true, then give the FDW callback a chance, if one is defined.  If that
> > returns false, return false.   If we haven't yet returned false,
> > return true.
> >
> I think ExecQual on fdw_recheck_quals shall be called next to the
> RecheckForeignScan callback, because econtext->ecxt_scantuple shall
> not be reconstructed unless RecheckForeignScan callback is not called
> if scanrelid==0.
> 
> If RecheckForeignScan is called prior to ExecQual, FDW driver can
> take either of two options according to its preference.
> 
> (1) RecheckForeignScan callback reconstruct a joined tuple based on
> the primitive EPQ slots, but nothing are rechecked by itself.
> ForeignRecheck runs ExecQual on fdw_recheck_quals that represents
> qualifiers of base relations and join condition.
> 
> (2) RecheckForeignScan callback reconstruct a joined tuple based on
> the primitive EPQ slots, then rechecks qualifiers of base relations
> and join condition by itself. It put NIL on fdw_recheck_quals, so
> ExecQual in ForeignRecheck() always true.
> 
> In either case, we cannot use ExecQual prior to reconstruction of
> a joined tuple because only FDW driver knows how to reconstruct it.
> So, it means ForeignScan with scanrelid==0 always has to set NIL on
> fdw_recheck_quals, if we would put ExecQual prior to the callback.
>
The attached patch is an adjusted version of the previous one.
Even though it keeps both the new callback and fdw_recheck_quals,
the callback is invoked first, as follows.


@@ -85,6 +86,18 @@ ForeignRecheck(ForeignScanState *node, TupleTableSlot *slot)
 
ResetExprContext(econtext);
 
+   /*
+* FDW driver has to recheck visibility of EPQ tuple towards
+* the scan qualifiers once it gets pushed down.
+* In addition, if this node represents a join sub-tree, not
+* a scan, FDW driver is also responsible to reconstruct
+* a joined tuple according to the primitive EPQ tuples.
+*/
+   if (fdwroutine->RecheckForeignScan)
+   {
+   if (!fdwroutine->RecheckForeignScan(node, slot))
+   return false;
+   }
return ExecQual(node->fdw_recheck_quals, econtext, false);
 }


If the callback is invoked first, the FDW driver can reconstruct a joined tuple
in whatever way suits it, and the remaining checks can then be done by ExecQual
on fdw_recheck_quals on the caller side.
If the callback were placed at the tail, the FDW driver would have no such choice.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 



pgsql-fdw-epq-recheck.v3.patch
Description: pgsql-fdw-epq-recheck.v3.patch



Re: [HACKERS] eXtensible Transaction Manager API

2015-11-08 Thread Michael Paquier
On Mon, Nov 9, 2015 at 4:33 AM, Simon Riggs  wrote:
> On 8 November 2015 at 16:59, Konstantin Knizhnik 
> wrote:
>>
>> On 11/08/2015 02:46 PM, Michael Paquier wrote:
>>>
>>> On Sun, Nov 8, 2015 at 1:53 AM, Konstantin Knizhnik wrote:

 In tsDTM approach two phase commit is performed by coordinator and
 currently
 is using standard PostgreSQL two phase commit:

 Code in GO performing two phase commit:

exec(conn1, "prepare transaction '" + gtid + "'")
exec(conn2, "prepare transaction '" + gtid + "'")
exec(conn1, "select dtm_begin_prepare($1)", gtid)
exec(conn2, "select dtm_begin_prepare($1)", gtid)
csn = _execQuery(conn1, "select dtm_prepare($1, 0)", gtid)
csn = _execQuery(conn2, "select dtm_prepare($1, $2)", gtid,
 csn)
exec(conn1, "select dtm_end_prepare($1, $2)", gtid, csn)
exec(conn2, "select dtm_end_prepare($1, $2)", gtid, csn)
exec(conn1, "commit prepared '" + gtid + "'")
exec(conn2, "commit prepared '" + gtid + "'")

 If commit at some of the nodes failed, coordinator should rollback
 prepared
 transaction at all nodes.
>>>
>>> Not always. If COMMIT PREPARED fails at some of the nodes but succeeds
>>> on others, the transaction is already partially acknowledged as
>>> committed in the cluster. Hence it makes more sense for the
>>> coordinator to commit transactions on the remaining nodes. Before
>>> issuing any COMMIT PREPARED queries, I guess that's fine to rollback
>>> the transactions on all nodes though.
>>
>> We will get inconsistency if  transaction is committed on some subset of
>> nodes involved in transaction.
>> Assume bank debit-credit example. If some distributed transaction
>> transfers money from the account at one node to the account and another
>> node,
>> then committing transaction just at one node cause incorrect total
>> balance.
>> The main goal of DTM is to preserve ACID semantic for distributed
>> transaction, so either transaction is committed at all nodes, either it is
>> not committed at all.
>
>
> Agreed.
>
> COMMIT PREPARED is a pretty thin layer; the work is all done in the PREPARE.
> With a DTM, the main commit itself is done once only in the DTM, so all the
> COMMIT PREPARED would do is release local node resources.

Sure. Now imagine that the pg_twophase entry is corrupted for this
transaction on one node. This would trigger a PANIC on it, and the
transaction would not be committed everywhere. I am aware of the fact
that by definition PREPARE TRANSACTION ensures that a transaction will
be committed with COMMIT PREPARED, just trying to see any corner cases
with the approach proposed. The DTM approach is actually rather close
to what a GTM in Postgres-XC does :)
-- 
Michael




Re: [HACKERS] eXtensible Transaction Manager API

2015-11-08 Thread Amit Kapila
On Sat, Nov 7, 2015 at 10:23 PM, Konstantin Knizhnik <
k.knizh...@postgrespro.ru> wrote:

>
> Lock manager is one of the tasks we are currently working on.
> There are still a lot of open questions:
> 1. Should distributed lock manager (DLM) do something else except
> detection of distributed deadlock?
>

I think so.  Basically, the DLM should be responsible for maintaining
all the lock information, which in turn means that any backend process
that needs to acquire or release a lock needs to interact with the DLM; without
that, I don't think even global deadlock detection can work (to detect deadlocks
among all the nodes, it needs to know the lock info of all nodes).
This is somewhat in line with what we currently do during lock conflicts,
i.e. if a backend incurs a lock conflict and decides to sleep, before
sleeping it tries to detect an early deadlock, so if we make the DLM responsible
for managing locks, the same could be achieved in a distributed system.


> 2. Should DLM be part of XTM API or it should be separate API?
>

We might be able to do it either way (make it part of the XTM API or devise a
separate API). I think the more important point here is to first get the
high-level design for Distributed Transactions (which primarily includes consistent
Commit/Rollback, snapshots, distributed lock manager (DLM) and recovery).



> 3. Should DLM be implemented by separate process or should it be part of
> arbiter (dtmd).
>

That's an important decision. I think it will depend on which kind of design
we choose for the distributed transaction manager (an arbiter-based solution or
a non-arbiter-based solution, something like tsDTM).  I think the DLM should be
separate, else the arbiter will become a hot spot with respect to contention.


> 4. How to globally identify resource owners (transactions) in global lock
> graph. In case of DTM we have global (shared) XIDs,
> and in tsDTM - global transactions IDs, assigned by application (which is
> not so clear how to retrieve).
> In other cases we may need to have local->global transaction id mapping,
> so looks like DLM should be part of DTM...
>
>
I think the DLM should in itself have all the necessary information to find
deadlocks or anything else required by the locking system.  For what kind of
cases do you envision needing to identify global resource owners?
And I think if we require some interaction between the DLM and the DTM, then we
can implement that, rather than making the DLM part of the DTM.


>
>
> Also I have
> noticed that discussion about Rollback is not there, example how will
> Rollback happen with API's provided in your second approach (tsDTM)?
>
>
> In tsDTM approach two phase commit is performed by coordinator and
> currently is using standard PostgreSQL two phase commit:
>
>
> If commit at some of the nodes failed, coordinator should rollback
> prepared transaction at all nodes.
>
>
Can you please explain more about the tsDTM approach: how timestamps
are used, what exactly CSN is (is it Commit Sequence Number?), and
how it is used in the prepare phase?  Is CSN used as a timestamp?
Is the coordinator also one of the PostgreSQL instances?

>
> I think in this patch, it is important to see the completeness of all the
> API's that needs to be exposed for the implementation of distributed
> transactions and the same is difficult to visualize without having complete
> picture of all the components that has some interaction with the
> distributed
> transaction system.  On the other hand we can do it in incremental fashion
> as and when more parts of the design are clear.
>
>
> That is exactly what we are going to do - we are trying to integrate DTM
> with existed systems (pg_shard, postgres_fdw, BDR) and find out what is
> missed and should be added. In parallel we are trying to compare efficiency
> and scalability of different solutions.
>

One thing I have noticed is that in the DTM approach you seem to
be considering a centralized (arbiter) transaction manager and a
centralized lock manager, which seems to be a workable approach,
but I think such an approach won't scale for write-heavy or mixed
read-write workloads.  Have you thought about distributing the responsibility
for global transaction and lock management?  I think such a system
might be somewhat difficult to design, but the scalability would be better.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Returning 'Infinity'::TIMESTAMPTZ from "to_timestamp" function

2015-11-08 Thread Tom Lane
Vitaly Burovoy  writes:
> To add an ability to construct 'Infinity' TIMESTAMPTZ via
> "to_timestamp" call, there are two ways:

> 1. Rewrite the function "pg_catalog.to_timestamp(double precision)" as
> an internal one. It's the easiest way, because it allows to avoid
> usage of INTERVAL as a helper (of course, there is still possible to
> use intervals as shown above in user's scripts, but without "Infinity"
> support).

> 2. Add support of infinite intervals. It is harder, because it touches
> a lot of functions. I can add that support if it is in demand.

> Which way is preferred?

I think you should stay away from infinite intervals; that seems like
there would be a lot of definitional questions to be resolved.  Even
if we decide we want to deal with that someday, it shouldn't be a blocking
issue for conversion between infinite floats and infinite timestamps.
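
For what it's worth, the float-to-timestamp side of that can be sketched as
standalone code like this (not the backend function; the sentinel values and
the epoch offset are assumptions made for the illustration):

#include <math.h>
#include <stdint.h>

typedef int64_t FakeTimestamp;			/* microseconds since 2000-01-01 */
#define DT_NOBEGIN	INT64_MIN			/* assumed '-infinity' sentinel */
#define DT_NOEND	INT64_MAX			/* assumed 'infinity' sentinel */
#define UNIX_TO_PG_EPOCH_SECS 946684800	/* 1970-01-01 .. 2000-01-01 */

static FakeTimestamp
unix_seconds_to_timestamp(double seconds)
{
	if (isinf(seconds))
		return (seconds < 0) ? DT_NOBEGIN : DT_NOEND;
	return (FakeTimestamp) ((seconds - UNIX_TO_PG_EPOCH_SECS) * 1000000.0);
}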

regards, tom lane




Re: [HACKERS] Transactions involving multiple postgres foreign servers

2015-11-08 Thread Ashutosh Bapat
On Sat, Nov 7, 2015 at 12:07 AM, Robert Haas  wrote:

> On Wed, Aug 12, 2015 at 6:25 AM, Ashutosh Bapat
>  wrote:
> > The previous patch would not compile on the latest HEAD. Here's updated
> > patch.
>
> Perhaps unsurprisingly, this doesn't apply any more.  But we have
> bigger things to worry about.
>
> The recent eXtensible Transaction Manager and the slides shared at the
> Vienna sharding summit, now posted at
> https://drive.google.com/file/d/0B8hhdhUVwRHyMXpRRHRSLWFXeXc/view make
> me think that some careful thought is needed here about what we want
> and how it should work. Slide 10 proposes a method for the extensible
> transaction manager API to interact with FDWs.  The FDW would do this:
>
> select dtm_join_transaction(xid);
> begin transaction;
> update...;
> commit;
>
> I think the idea here is that the commit command doesn't really
> commit; it just escapes the distributed transaction while leaving it
> marked not-committed.  When the transaction subsequently commits on
> the local server, the XID is marked committed and the effects of the
> transaction become visible on all nodes.
>

Since the foreign server (referred to in the slides as the secondary server)
is required to call "create extension pg_dtm" and select
dtm_join_transaction(xid);, I assume that the foreign server has to be a
PostgreSQL server, one which has this extension installed and a
version that can support it. So, we cannot use the extension
for all FDWs, and even for postgres_fdw it can be used only for a foreign
server with the above capabilities. The slides mention just FDW, but I think
they mean postgres_fdw and not all FDWs.


>
> I think that this API is intended to provide not only consistent
> cross-node decisions about whether a particular transaction has
> committed, but also consistent visibility.  If the API is sufficient
> for that and if it can be made sufficiently performant, that's a
> strictly stronger guarantee than what this proposal would provide.
>
> On the other hand, I see a couple of problems:
>
> 1. The extensible transaction manager API is meant to be pluggable.
> Depending on which XTM module you choose to load, the SQL that needs
> to be executed by postgres_fdw on the remote node will vary.
> postgres_fdw shouldn't have knowledge of all the possible XTMs out
> there, so it would need some way to know what SQL to send.
>
> 2. If the remote server isn't running the same XTM as the local
> server, or if it is running the same XTM but is not part of the same
> group of cooperating nodes as the local server, then we can't send a
> command to join the distributed transaction at all.  In that case, the
> 2PC for FDW approach is still, maybe, useful.
>

Elaborating more on this: Slide 11 shows the arbiter protocol for starting a
transaction, and the next slide shows the same for commit. Slide 15 shows the
transaction flow diagram for tsDTM. The DTM approach doesn't specify how
XIDs are communicated between nodes, but it's implicit in the protocol that
the XID space is shared by the nodes. Similarly, tsDTM assumes that the CSN
space is shared by all the nodes (see the synchronization for max(CSN)). This
cannot be assumed for FDWs (not even postgres_fdw), where foreign servers
are independent entities with independent XID spaces.


>
> On the whole, I'm inclined to think that the XTM-based approach is
> probably more useful and more general, if we can work out the problems
> with it.  I'm not sure that I'm right, though, nor am I sure how hard
> it will be.
>
>
2PC for FDW and XTM are trying to solve different problems with some
commonality. 2PC for FDW is trying to solve the problem of atomic commit (I am
borrowing the terminology you used at PGCon 2015) for FDWs in general
(although limited to FDWs which can support two-phase commit), and XTM tries
to solve the problems of atomic visibility, atomic commit and consistency for
postgres_fdw where the foreign servers support XTM. The only thing common
between these two is atomic commit.

If we accept XTM and discard 2PC for FDW, we will not be able to support
atomic commit for FDWs in general. That, I think, would be a serious
limitation for postgres_fdw, especially now that DML is allowed. If we accept
only 2PC for FDW and discard XTM, we won't be able to get atomic visibility
and consistency for postgres_fdw with foreign servers supporting XTM. That
would again be a serious limitation for solutions implementing sharding,
multi-master clusters, etc.

There are approaches like [1] by which a cluster of heterogeneous servers
(each with some level of snapshot isolation) can be constructed. Ideally that
will enable PostgreSQL users to maximize their utilization of FDWs.

Any distributed transaction management requires 2PC in some form or other.
So, we should implement 2PC for FDW keeping in mind the various forms of 2PC
used in practice. Use that infrastructure to build XTM-like capabilities
for restricted postgres_fdw uses. Previously, I have 

Re: [HACKERS] Adjust errorcode in background worker code

2015-11-08 Thread Craig Ringer
On 7 November 2015 at 02:55, Robert Haas  wrote:

> I wonder if we need to think about inventing some new error codes.  I
> can sort of understand that "feature not supported" is something that
> can come in a large number of different contexts and mean pretty much
> the same all the time, but I'm betting that things like "invalid
> parameter value" and "invalid text representation" and "object not in
> prerequisite state" cover an amazing breadth of errors that may not
> actually be that similar to each other.

That depends on what client applications are going to do about the error.

If the app's only choice is likely to be "huh, that's weird, I'll barf
and tell the user about it" then there's little point changing
anything. The app just spits out a different error and nobody gains
anything.

If there are error sites where an application could take specific,
useful action in response to an error then they could well benefit
from being changed. ERRCODE_T_R_DEADLOCK_DETECTED for example is
extremely useful to applications.

So the first step should be auditing error sites of interest to see
whether the client app could possibly do anything constructive about
the error. Then see if there's an existing category it'd fit better,
or if we have a group of errors that would go well together in a new
category where the application action for them is similar.

Having too many highly specific errors is at least as bad as too few
broad categories. You land up with huge case statements and error
number lists from which something inevitably gets missed. They're
annoying to write, so they start getting re-used and cargo-culted in
increasingly outdated forms. Next thing you know you've invented Java
exception handling ;)

Personally the only PostgreSQL errors I've ever cared about handling
specifically have been:

* deadlock detected, serialization failure for tx abort and retry loop
* connection-level errors, for reconnect and retry handling, graceful
reporting of DB shutdown, etc
* constraint violations, where useful info can be extracted to tell
the user what exactly was wrong with their input
* access permission errors, to drive more informative client-side UI

The great majority of server-side errors go into the "huh, something's
broken that shouldn't have, tell the user to tell the admin" box.

I'm nuts about handling errors in my code compared to most of what
I've seen. The majority of apps I see aren't even prepared to cope
with a deadlock without data loss, tending to barf an error into the
logs and/or to the user, forget what they were doing, and hope someone
comes to rescue them or re-enters the info.
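
A minimal client-side retry loop of that kind might look like the sketch
below (libpq; the function name and back-off policy are illustrative only).
It retries solely on SQLSTATE 40001 (serialization_failure) and 40P01
(deadlock_detected) and gives up on anything else:

#include <string.h>
#include <libpq-fe.h>

/* Sketch: run one SQL statement in its own transaction, retrying on
 * deadlock or serialization failure.  Returns 0 on success, -1 on failure. */
static int
exec_with_retry(PGconn *conn, const char *sql, int max_attempts)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++)
    {
        PGresult   *res = PQexec(conn, "BEGIN");
        const char *sqlstate;
        int         retryable;

        PQclear(res);

        res = PQexec(conn, sql);
        if (PQresultStatus(res) == PGRES_COMMAND_OK ||
            PQresultStatus(res) == PGRES_TUPLES_OK)
        {
            PQclear(res);
            PQclear(PQexec(conn, "COMMIT"));
            return 0;
        }

        /* Statement failed: decide whether a retry makes sense. */
        sqlstate = PQresultErrorField(res, PG_DIAG_SQLSTATE);
        retryable = sqlstate != NULL &&
            (strcmp(sqlstate, "40001") == 0 ||      /* serialization_failure */
             strcmp(sqlstate, "40P01") == 0);       /* deadlock_detected */

        PQclear(res);
        PQclear(PQexec(conn, "ROLLBACK"));

        if (!retryable)
            return -1;          /* report it; retrying won't help */
    }

    return -1;                  /* gave up after max_attempts */
}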

Do you think anyone has *ever* written code to trap
ERRCODE_AMBIGUOUS_FUNCTION or ERRCODE_INVALID_ARGUMENT_FOR_LOG?  Not
counting simple translation layers that map every exception ... I
doubt it.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] eXtensible Transaction Manager API

2015-11-08 Thread Simon Riggs
On 7 November 2015 at 16:53, Konstantin Knizhnik 
wrote:


> On 11/07/2015 05:11 PM, Amit Kapila wrote:
>
>
> Today, while studying your proposal and related material, I noticed
> that in both the approaches, DTM and tsDTM, you are talking about
> committing a transaction and acquiring the snapshot consistently, but
> have not touched upon how the locks will be managed across nodes and
> how deadlock detection across nodes will work.  This will also be one
> of the crucial points in selecting one of the approaches.
>
>
> The lock manager is one of the tasks we are currently working on.
> There are still a lot of open questions:
> 1. Should the distributed lock manager (DLM) do anything else besides
> detecting distributed deadlocks?
> 2. Should the DLM be part of the XTM API, or should it be a separate API?
> 3. Should the DLM be implemented as a separate process, or should it be part
> of the arbiter (dtmd)?
> 4. How do we globally identify resource owners (transactions) in the global
> lock graph?  In the case of DTM we have global (shared) XIDs,
> and in tsDTM global transaction IDs assigned by the application (though it is
> not so clear how to retrieve them).
> In other cases we may need a local->global transaction id mapping,
> so it looks like the DLM should be part of the DTM...
>

Yes, we need a Distributed Lock Manager, but I think it's a separate thing
from the DTM.

(I'm loath to use the phrase DLM, which was used within Oracle Parallel
Server for a buffer lock manager, which is not what is being discussed.)

-- 
Simon Riggs    http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Patch: Implement failover on libpq connect level.

2015-11-08 Thread Craig Ringer
On 7 November 2015 at 04:59, Robert Haas  wrote:

> So, I really wonder why we're not happy with the ability to substitute
> out just the host and IP.

I tend to agree. That solves 95% of the problem and doesn't foreclose
solving the other 5% some other way if someone really cares later.

I'd rather not see convoluted and complex connstrings, per my prior
post. The JDBC extended URL style seems good enough.

IMO the biggest problem is that client-side failover is way more
complex than "got -ECONNREFUSED, try next host". I think we're all
focusing on not being able to twiddle arbitrary settings while
ignoring the real problem with client failover, i.e. making it
actually work in the real world.
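
To illustrate, the naive loop itself is trivial to write on top of libpq
today (a sketch only; names are made up). What it doesn't answer is the hard
part: whether the server you reached is actually the primary, how long to
keep trying while a failover is still in progress, and so on:

#include <stdio.h>
#include <libpq-fe.h>

/* Sketch: "try the next host" failover.  conninfos is an array of n
 * ordinary conninfo strings, tried in order until one connects. */
static PGconn *
connect_first_available(const char *const *conninfos, int n)
{
    for (int i = 0; i < n; i++)
    {
        PGconn *conn = PQconnectdb(conninfos[i]);

        if (PQstatus(conn) == CONNECTION_OK)
            return conn;        /* reachable -- but is it writable? */

        fprintf(stderr, "could not connect to \"%s\": %s",
                conninfos[i], PQerrorMessage(conn));
        PQfinish(conn);
    }

    return NULL;
}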

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-08 Thread Kouhei Kaigai
> > 
> > @@ -85,6 +86,18 @@ ForeignRecheck(ForeignScanState *node, TupleTableSlot
> *slot)
> >
> > ResetExprContext(econtext);
> >
> > +   /*
> > +* FDW driver has to recheck visibility of EPQ tuple towards
> > +* the scan qualifiers once it gets pushed down.
> > +* In addition, if this node represents a join sub-tree, not
> > +* a scan, FDW driver is also responsible to reconstruct
> > +* a joined tuple according to the primitive EPQ tuples.
> > +*/
> > +   if (fdwroutine->RecheckForeignScan)
> > +   {
> > +   if (!fdwroutine->RecheckForeignScan(node, slot))
> > +   return false;
> > +   }
> > return ExecQual(node->fdw_recheck_quals, econtext, false);
> >   }
> > 
> >
> > If the callback is invoked first, the FDW driver can reconstruct a joined
> > tuple in whatever way is most convenient for it, and the remaining checks
> > can then be done by ExecQual and fdw_recheck_quals on the caller side.
> > If the callback were invoked last, the FDW driver would have no choice.
> 
> To test this change, I think we should update the postgres_fdw patch so
> as to add the RecheckForeignScan.
> 
> Having said that, as I said previously, I don't see much value in adding
> the callback routine, to be honest.  I know KaiGai-san considers that
> that would be useful for custom joins, but I don't think that that would
> be useful even for foreign joins, because I think that in case of
> foreign joins, the practical implementation of that routine in FDWs
> would be to create a secondary plan and execute that plan by performing
> ExecProcNode, as my patch does [1].  Maybe I'm missing something, though.
>
I've never denied that an alternative local sub-plan is one of the best
approaches for postgres_fdw; however, I've also never heard why you can
say that the best approach for postgres_fdw is definitely also the best
for others.
If we were to justify a less flexible interface specification because it is
convenient for one particular extension, it should not be an extension,
but a built-in feature.
My standpoint has been consistent throughout the discussion; we can never
predict which features will be implemented on top of the FDW interface, and
therefore we also cannot predict which implementation is best for EPQ
rechecks.  Only the FDW driver knows which is "best" for it, not us.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 



[HACKERS] Returning 'Infinity'::TIMESTAMPTZ from "to_timestamp" function

2015-11-08 Thread Vitaly Burovoy
Hello everyone!

Continuing the topic of extracting EPOCH from 'Infinity'::TIMESTAMPTZ
and according to an item "converting between infinity timestamp and
float8" in the TODO list...

Even when "SELECT extract(EPOCH FROM TIMESTAMPTZ 'Infinity')" results
'Infinity'::float, there is still trouble to convert it back:
# SELECT to_timestamp('Infinity'::float);
ERROR:  timestamp out of range
CONTEXT:  SQL function "to_timestamp" statement 1

The function "to_timestamp(double precision)" is defined as an SQL-script:
select ('epoch'::pg_catalog.timestamptz + $1 * '1 second'::pg_catalog.interval)

Whereas error message points to a function "timestamptz_pl_interval",
there is still a nuance in a function "interval_mul", because it
returns "Interval->time" as "-Infinity" for both +/-infinity as an
input value (apart from the fact that INTERVAL does not support
infinite values officially).

To add the ability to construct an 'Infinity' TIMESTAMPTZ via a
"to_timestamp" call, there are two ways:

1. Rewrite the function "pg_catalog.to_timestamp(double precision)" as
an internal one. It's the easiest way, because it allows us to avoid
using INTERVAL as a helper (of course, it is still possible to
use intervals as shown above in users' scripts, but without "Infinity"
support).

2. Add support for infinite intervals. It is harder, because it touches
a lot of functions. I can add that support if it is in demand.
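
For illustration, option 1 could look roughly like the sketch below (backend
C, integer timestamps assumed; the function name is hypothetical and the
range check on finite inputs is omitted). The infinite inputs simply map
onto the existing special timestamp values:

Datum
to_timestamp_internal(PG_FUNCTION_ARGS)     /* hypothetical name */
{
    float8      seconds = PG_GETARG_FLOAT8(0);
    TimestampTz result;

    if (isnan(seconds))
        ereport(ERROR,
                (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
                 errmsg("timestamp cannot be NaN")));

    if (isinf(seconds))
    {
        /* Map +/-Infinity directly onto the special timestamp values. */
        if (seconds < 0)
            TIMESTAMP_NOBEGIN(result);
        else
            TIMESTAMP_NOEND(result);
    }
    else
    {
        /* Ordinary case: shift from the Unix epoch to the Postgres epoch. */
        result = (TimestampTz) rint(seconds * USECS_PER_SEC) -
            (POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) * USECS_PER_DAY;
    }

    PG_RETURN_TIMESTAMPTZ(result);
}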

Which way is preferred?
-- 
Best regards,
Vitaly Burovoy




Re: [HACKERS] bootstrap pg_shseclabel in relcache initialization

2015-11-08 Thread Craig Ringer
On 9 November 2015 at 12:40, Adam Brightwell
 wrote:
> Hi All,
>
> While working on an auth hook, I found that I was unable to access the
> pg_shseclabel system table while processing the hook.  I discovered
> that the only tables that were bootstrapped and made available at this
> stage of the auth process were pg_database, pg_authid and
> pg_auth_members.  Unfortunately, this is problematic if you have
> security labels associated with a role that are needed to
> determine auth decisions/actions.
>
> Given that the shared relations currently exposed can also have
> security labels that can be used for auth purposes, I believe it makes
> sense to make those available as well.  I have attached a patch that
> adds this functionality for review/discussion.  If this functionality
> makes sense I'll add it to the commitfest.

Your reasoning certainly makes sense to me. I'm a little surprised
this didn't cause issues for SEPostgreSQL already.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Foreign join pushdown vs EvalPlanQual

2015-11-08 Thread Etsuro Fujita

On 2015/11/09 9:26, Kouhei Kaigai wrote:

I think ForeignRecheck should first call ExecQual to test
fdw_recheck_quals.  If it returns false, return false.  If it returns
true, then give the FDW callback a chance, if one is defined.  If that
returns false, return false.   If we haven't yet returned false,
return true.



I think ExecQual on fdw_recheck_quals should be called after the
RecheckForeignScan callback, because in the scanrelid==0 case
econtext->ecxt_scantuple will not have been reconstructed unless the
RecheckForeignScan callback has been called.


I agree with KaiGai-san.  I think we can define fdw_recheck_quals for
the foreign-join case as quals not in scan.plan.qual, the same way as in
the simple foreign scan case.  (In other words, the quals would be
defined as "otherclauses", ie, rinfo->is_pushed_down=true, that have been
pushed down to the remote server.)  For checking the fdw_recheck_quals,
however, I think we should reconstruct the join tuple first, which I
think is essential for cases where an outer join is performed remotely,
to avoid changing the semantics.  BTW, in my patch [1], a secondary plan
will be created to evaluate such otherclauses after reconstructing the
join tuple.



The attached patch is an adjusted version of the previous one.
Even though the new callback co-exists with fdw_recheck_quals,
the callback is invoked first, as follows.


Thanks for the patch!



@@ -85,6 +86,18 @@ ForeignRecheck(ForeignScanState *node, TupleTableSlot *slot)

ResetExprContext(econtext);

+   /*
+* FDW driver has to recheck visibility of EPQ tuple towards
+* the scan qualifiers once it gets pushed down.
+* In addition, if this node represents a join sub-tree, not
+* a scan, FDW driver is also responsible to reconstruct
+* a joined tuple according to the primitive EPQ tuples.
+*/
+   if (fdwroutine->RecheckForeignScan)
+   {
+   if (!fdwroutine->RecheckForeignScan(node, slot))
+   return false;
+   }
return ExecQual(node->fdw_recheck_quals, econtext, false);
  }


If the callback is invoked first, the FDW driver can reconstruct a joined
tuple in whatever way is most convenient for it, and the remaining checks can
then be done by ExecQual and fdw_recheck_quals on the caller side.
If the callback were invoked last, the FDW driver would have no choice.


To test this change, I think we should update the postgres_fdw patch so 
as to add the RecheckForeignScan.


Having said that, as I said previously, I don't see much value in adding 
the callback routine, to be honest.  I know KaiGai-san considers that 
that would be useful for custom joins, but I don't think that that would 
be useful even for foreign joins, because I think that in case of 
foreign joins, the practical implementation of that routine in FDWs 
would be to create a secondary plan and execute that plan by performing 
ExecProcNode, as my patch does [1].  Maybe I'm missing something, though.


Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/5624d583.10...@lab.ntt.co.jp





Re: [HACKERS] Bitmap index scans use of filters on available columns

2015-11-08 Thread Jeff Janes
On Sun, Nov 8, 2015 at 12:34 PM, Tom Lane  wrote:
> Jeff Janes  writes:
>> On Fri, Nov 6, 2015 at 7:15 PM, Tomas Vondra
>>  wrote:
>>> I've however also noticed that all the 'like' procedures are marked as not
>>> leak proof, which is a bit unfortunate because that's the example from
>>> Jeff's e-mail that started this thread.
>
>> Is there a reason they aren't leak proof?  I don't see that they have
>> side effects, or that they can throw errors.  What do they leak?
>
> Huh?
>
> regression=# select 'z' like '\';
> ERROR:  LIKE pattern must not end with escape character

Ah, I was only thinking of the pattern as a constant.  And there is no
way to declare a function leakproof on one input but not another.

Cheers,

Jeff




[HACKERS] bootstrap pg_shseclabel in relcache initialization

2015-11-08 Thread Adam Brightwell
Hi All,

While working on an auth hook, I found that I was unable to access the
pg_shseclabel system table while processing the hook.  I discovered
that the only tables that were bootstrapped and made available at this
stage of the auth process were pg_database, pg_authid and
pg_auth_members.  Unfortunately, this is problematic if you have
security labels associated with a role that are needed to
determine auth decisions/actions.

Given that the shared relations currently exposed can also have
security labels that can be used for auth purposes, I believe it makes
sense to make those available as well.  I have attached a patch that
adds this functionality for review/discussion.  If this functionality
makes sense I'll add it to the commitfest.
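
For context, the kind of hook this is aimed at looks roughly like the sketch
below (the hook name, label provider, and policy are purely illustrative and
not part of any existing extension). The GetSecurityLabel() call on the role
is the part that needs pg_shseclabel to be readable this early in the session:

/* A sketch of an auth hook that consults a role's security label. */
#include "postgres.h"
#include "fmgr.h"
#include "catalog/objectaddress.h"
#include "catalog/pg_authid.h"
#include "commands/seclabel.h"
#include "libpq/auth.h"
#include "utils/acl.h"

PG_MODULE_MAGIC;

static ClientAuthentication_hook_type prev_client_auth_hook = NULL;

static void
label_check_auth_hook(Port *port, int status)
{
	if (prev_client_auth_hook)
		prev_client_auth_hook(port, status);

	if (status == STATUS_OK)
	{
		ObjectAddress role;
		char	   *label;

		role.classId = AuthIdRelationId;
		role.objectId = get_role_oid(port->user_name, false);
		role.objectSubId = 0;

		/* Reading the role's label requires access to pg_shseclabel. */
		label = GetSecurityLabel(&role, "my_label_provider");

		if (label != NULL && strcmp(label, "quarantined") == 0)
			ereport(FATAL,
					(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
					 errmsg("role \"%s\" is currently quarantined",
							port->user_name)));
	}
}

void
_PG_init(void)
{
	prev_client_auth_hook = ClientAuthentication_hook;
	ClientAuthentication_hook = label_check_auth_hook;
}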

Thanks,
Adam
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 9c3d096..c38a8ac 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -51,6 +51,7 @@
 #include "catalog/pg_opclass.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_rewrite.h"
+#include "catalog/pg_shseclabel.h"
 #include "catalog/pg_tablespace.h"
 #include "catalog/pg_trigger.h"
 #include "catalog/pg_type.h"
@@ -98,6 +99,7 @@ static const FormData_pg_attribute Desc_pg_database[Natts_pg_database] = {Schema
 static const FormData_pg_attribute Desc_pg_authid[Natts_pg_authid] = {Schema_pg_authid};
 static const FormData_pg_attribute Desc_pg_auth_members[Natts_pg_auth_members] = {Schema_pg_auth_members};
 static const FormData_pg_attribute Desc_pg_index[Natts_pg_index] = {Schema_pg_index};
+static const FormData_pg_attribute Desc_pg_shseclabel[Natts_pg_shseclabel] = {Schema_pg_shseclabel};
 
 /*
  *		Hash tables that index the relation cache
@@ -3187,13 +3189,14 @@ RelationCacheInitialize(void)
 /*
  *		RelationCacheInitializePhase2
  *
- *		This is called to prepare for access to shared catalogs during startup.
- *		We must at least set up nailed reldescs for pg_database, pg_authid,
- *		and pg_auth_members.  Ideally we'd like to have reldescs for their
- *		indexes, too.  We attempt to load this information from the shared
- *		relcache init file.  If that's missing or broken, just make phony
- *		entries for the catalogs themselves.  RelationCacheInitializePhase3
- *		will clean up as needed.
+ *		This is called to prepare for access to shared catalogs during
+ *		startup.  We must at least set up nailed reldescs for
+ *		pg_database, pg_authid, pg_auth_members, and pg_shseclabel.
+ *		Ideally we'd like to have reldescs for their indexes, too.  We
+ *		attempt to load this information from the shared relcache init
+ *		file.  If that's missing or broken, just make phony entries for
+ *		the catalogs themselves.  RelationCacheInitializePhase3 will
+ *		clean up as needed.
  */
 void
 RelationCacheInitializePhase2(void)
@@ -3229,8 +3232,10 @@ RelationCacheInitializePhase2(void)
   true, Natts_pg_authid, Desc_pg_authid);
 		formrdesc("pg_auth_members", AuthMemRelation_Rowtype_Id, true,
   false, Natts_pg_auth_members, Desc_pg_auth_members);
+		formrdesc("pg_shseclabel", SharedSecLabelRelation_Rowtype_Id, true,
+  false, Natts_pg_shseclabel, Desc_pg_shseclabel);
 
-#define NUM_CRITICAL_SHARED_RELS	3	/* fix if you change list above */
+#define NUM_CRITICAL_SHARED_RELS	4	/* fix if you change list above */
 	}
 
 	MemoryContextSwitchTo(oldcxt);
@@ -3365,6 +3370,8 @@ RelationCacheInitializePhase3(void)
 			AuthIdRelationId);
 		load_critical_index(AuthMemMemRoleIndexId,
 			AuthMemRelationId);
+		load_critical_index(SharedSecLabelObjectIndexId,
+			SharedSecLabelRelationId);
 
 #define NUM_CRITICAL_SHARED_INDEXES 5	/* fix if you change list above */
 
diff --git a/src/include/catalog/pg_shseclabel.h b/src/include/catalog/pg_shseclabel.h
index 0ff41f3..d8334bf 100644
--- a/src/include/catalog/pg_shseclabel.h
+++ b/src/include/catalog/pg_shseclabel.h
@@ -18,9 +18,10 @@
  *		typedef struct FormData_pg_shseclabel
  * 
  */
-#define SharedSecLabelRelationId		3592
+#define SharedSecLabelRelationId			3592
+#define SharedSecLabelRelation_Rowtype_Id	4066
 
-CATALOG(pg_shseclabel,3592) BKI_SHARED_RELATION BKI_WITHOUT_OIDS
+CATALOG(pg_shseclabel,3592) BKI_SHARED_RELATION BKI_ROWTYPE_OID(4066) BKI_WITHOUT_OIDS BKI_SCHEMA_MACRO
 {
 	Oid			objoid;			/* OID of the shared object itself */
 	Oid			classoid;		/* OID of table containing the shared object */
@@ -31,6 +32,8 @@ CATALOG(pg_shseclabel,3592) BKI_SHARED_RELATION BKI_WITHOUT_OIDS
 #endif
 } FormData_pg_shseclabel;
 
+typedef FormData_pg_shseclabel *Form_pg_shseclabel;
+
 /* 
  *		compiler constants for pg_shseclabel
  * 



Fwd: [HACKERS] Multixid hindsight design

2015-11-08 Thread Craig Ringer
On 25 June 2015 at 00:52, Robert Haas  wrote:

> I agree that we can do much better at testing than we traditionally
> have done, and I think pretty much everyone in the room for the
> developer unconference session on testing at PGCon was also in
> agreement with that.  I really like the idea of taking purpose-built
> testing frameworks - like the one that Heikki created for the WAL
> format changes - and polishing them to the point where they can go
> into core.  That's more work, of course, but very beneficial.

Something that'd really help with that, IMO, would be weakening the
requirement that everything be C or core Perl. Instead, require that
configure detect whether or not the facilities for some tests are present,
and have those tests skip cleanly at 'make' time, with a warning indicating
they were skipped for lack of prerequisites.

I don't see that as significantly different to having some buildfarm
animals not bother to configure or test the PLs, SSL, etc. I
understand why adding to the mix required for the core server build
isn't acceptable, but hopefully separate test suites can be more
flexible. A free-for-all of languages and tools doesn't make sense,
but I'd like to see, say, python and the 'unittest' module added, and
to see acceptance of tests using psycopg2 to stream and decode WAL
from a slot.

Thoughts?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] PATCH: 9.5 replication origins fix for logical decoding

2015-11-08 Thread Craig Ringer
On 9 November 2015 at 07:04, Andres Freund  wrote:
> Hi,
>
> On 2015-10-19 21:43:32 +0800, Craig Ringer wrote:
>> Patch revision 3 attached. It's a one-liner, with just the fix, and an
>> explanatory note in the patch header.
>
> Pushed to 9.5 and master.
>
> Thanks for noticing the issue,

Thanks for taking the time to sanity-check and apply the fix.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Parallel Seq Scan

2015-11-08 Thread Amit Kapila
On Sat, Nov 7, 2015 at 4:11 AM, Robert Haas  wrote:
>
> On Fri, Oct 23, 2015 at 9:22 PM, Amit Kapila 
wrote:
> > The base rel's consider_parallel flag won't be percolated to childrels,
> > so even if we mark base rel as parallel capable, while generating the
> > path it won't be considered.  I think we need to find a way to pass on
> > that information if we want to follow this way.
>
> Fixed in the attached version.  I added a max_parallel_degree check,
> too, per your suggestion.
>

+ else if (IsA(node, SubPlan) || IsA(node, SubLink) ||
+ IsA(node, AlternativeSubPlan) || IsA(node, Param))
+ {
+ /*
+ * Since we don't have the ability to push subplans down to workers
+ * at present, we treat subplan references as parallel-restricted.
+ */
+ if (!context->allow_restricted)
+ return true;
+ }

You seem to have agreed upthread to change this check for PARAM_EXEC
paramkind. I think you have forgotten to change this code.
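
For illustration, one way to split the Param case out of that check, assuming
the intent is that only PARAM_EXEC Params are treated as parallel-restricted,
might be:

	else if (IsA(node, SubPlan) || IsA(node, SubLink) ||
			 IsA(node, AlternativeSubPlan))
	{
		/*
		 * Since we don't have the ability to push subplans down to workers
		 * at present, we treat subplan references as parallel-restricted.
		 */
		if (!context->allow_restricted)
			return true;
	}
	else if (IsA(node, Param))
	{
		/*
		 * Assumed here: only executor parameters (PARAM_EXEC) need to be
		 * parallel-restricted; other Param kinds could be evaluated in a
		 * worker as well.
		 */
		if (((Param *) node)->paramkind == PARAM_EXEC &&
			!context->allow_restricted)
			return true;
	}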

@@ -714,6 +860,9 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
  continue;
  }

+ /* Copy consider_parallel flag from parent. */
+ childrel->consider_parallel = rel->consider_parallel;
+

We are changing the childrel quals in this function, and in
adjust_appendrel_attrs()->adjust_appendrel_attrs_mutator() we are
adding an extra conversion step, in one of the cases, to the Var
participating in the qualification.  So even though it might be safe to
assume that this won't add any new parallel-restricted or parallel-unsafe
expressions to the quals, ideally we should have a separate check for
parallel-safety of the childrel quals, as those might not be identical to
the parent rel's.

> > True, we can do it that way.  What I was trying to convey by the above is
> > that we anyway need checks during path creation at least in some
> > of the cases, so why not do all the checks at that time only, as I
> > think all the information will be available at that time.
> >
> > I think if we store parallelism-related info in RelOptInfo, that can also
> > be made to work, but the only worry I have with that approach is that we
> > need to have checks at two levels: one at RelOptInfo formation time
> > and the other at Path formation time.
>
> I don't really see that as a problem.  What I'm thinking about doing
> (but it's not implemented in the attached patch) is additionally
> adding a ppi_consider_parallel flag to the ParamPathInfo.  This would
> be meaningful only for baserels, and would indicate whether the
> ParamPathInfo's ppi_clauses are parallel-safe.
>
> If we're thinking about adding a parallel path to a baserel, we need
> the RelOptInfo to have consider_parallel set and, if there is a
> ParamPathInfo, we need the ParamPathInfo's ppi_consider_parallel flag
> to be set also.  That shows that both the rel's baserestrictinfo and
> the paramaterized join clauses are parallel-safe.  For a joinrel, we
> can add a path if (1) the joinrel has consider_parallel set and (2)
> the paths being joined are parallel-safe.  Testing condition (2) will
> require a per-Path flag, so we'll end up with one flag in the
> RelOptInfo, a second in the ParamPathInfo, and a third in the Path.
>

I am already adding a parallel_aware flag in the patch to make the seqscan
node parallel_aware, so if you find that patch better compared to the partial
seq scan node patch, then I can take care of the above in that patch.



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

