date:20140717

Christoph Berg c...@df7cb.de writes:
 One place that's been bugging me where case-insensitivity would really
 make sense is this:

 # set work_mem = '1mb';
 ERROR:  22023: invalid value for parameter work_mem: 1mb
 HINT:  Valid units for this parameter are kB, MB, and GB.

Yeah ... there was some pedantry about how kB and KB mean different
things.  IMO that's mere pedantry, but ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] gaussian distribution pgbench

2014-07-17 Thread Robert Haas

On Wed, Jul 16, 2014 at 12:57 AM, Fabien COELHO coe...@cri.ensmp.fr wrote:
 Well, I think the feedback has been pretty clear, honestly.  Here's
 what I'm unhappy about: I can't understand what these options are
 actually doing.

 We can try to improve the documentation, once more!

 However, ISTM that it is not the purpose of pgbench documentation to be a
 primer about what is an exponential or gaussian distribution, so the idea
 would yet be to have a relatively compact explanation, and that the
 interested but clueless reader would document h..self from wikipedia or a
 text book or a friend or a math teacher (who could be a friend as well:-).

Well, I think it's a balance.  I agree that the pgbench documentation
shouldn't try to substitute for a text book or a math teacher, but I
also think that you shouldn't necessarily need to refer to a text book
or a math teacher in order to figure out how to use pgbench.  Saying
it's complicated, so we don't have to explain it would be a cop out;
we need to *make* it simple.  And if there's no way to do that, then
IMHO we should reject the patch in favor of some future patch that
implements something that will be easy for users to understand.

  [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --exponential=10
 starting vacuum...end.
 transaction type: Exponential distribution TPC-B (sort of)
 scaling factor: 1
 exponential threshold: 10.0

 decile percents: 63.2% 23.3% 8.6% 3.1% 1.2% 0.4% 0.2% 0.1% 0.0% 0.0%
 highest/lowest percent of the range: 9.5% 0.0%

 I don't have a clue what that means.  None.

 Maybe we could add in front of the decile/percent

 distribution of increasing account key values selected by pgbench:

I still wouldn't know what that meant.  And it misses the point
anyway: if the documentation is good, this will be unnecessary.  If
the documentation is bad, a printout that tries to illustrate it by
example is not an acceptable substitute.

 Here is an example of an explanation that would make sense to me.
 This is not the actual behavior of your patch, I'm quite sure, so this
 is just an example of the *kind* of explanation that I think is
 needed:

 This is more or less the approximate behavior of the patch, but for 1% of
 the range, not 50%. However I'm not sure that the current documentation is
 so bad.

I think it isn't, because in the system I described, a larger value
indicates a flatter distribution, but in the documentation, a smaller
value indicates a flatter distribution.  That having been said, I
agree the current documentation for the exponential distribution is
not too bad.  But this part does not make sense:

+  A crude approximation of the distribution is that the most frequent 1%
+  values are drawn replaceablethreshold/% of the time.
+  The closer to 0.0 the threshold, the flatter (more uniform) the access
+  distribution.

Given the first statement, I'd expect the lowest possible threshold to
be 0.01, not 0.

The documentation for the Gaussian distribution is in somewhat worse
shape.  Unlike the documentation for exponential, it makes no attempt
at all to give the user a clear idea what the distribution actually
looks like.  The closest it comes is this:

+  In other worlds, the larger the replaceablethreshold/,
+  the narrower the access range around the middle.

But that's not really very close - there's no way for a user to judge
what impact the threshold parameter actually has except to try it.
Unlike the discussion of exponential, which contains a fairly-precise
mathematical characterization of the behavior, the Gaussian stuff has
nothing except a hand-wavy explanation that a higher threshold skews
the distribution more.  (Also, the English expression is in other
words not in other worlds - but in fact the phrase has no business
in that sentence at all, because it is not reiterating the contents of
the previous sentence in different language, but rather making a new
point entirely.  And the following sentence does not start with a
capital letter, though maybe that's because it was intended to be
incorporated into this sentence somehow.)

I think that you also need to consider which instances of the words
gaussian and exponential are referring to the option and which are
referring to the abstract mathematical concept.  When you're talking
about the option, you should use all lower-case (as you've done) but
with literal tags or similar.  When you're referring to the
mathematical distribution, Gaussian should be capitalized.

BTW, I agree with both Heikki's suggestion that we make these options
to setrandom only and not expose command-line options for them, and
with Andres's critique that the documentation of those options is far
too repetitive.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: 9.3: more problems with Could not open file pg_multixact/members/xxxx

2014-07-17 Thread Robert Haas

On Wed, Jul 16, 2014 at 12:46 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 (Maybe I'm just too tired and I'm failing to fully understand the torn
 page protection.  I thought I understood how it worked, but now I'm not
 sure -- I mean I don't see how it can possibly have any value at all.
 Surely if the disk writes the first 512-byte sector of the page and then
 forgets the updates to the next 15 sectors, the page will appear as not
 needing the full page image to be restored ...)

We always restore full page images, regardless of the page LSN.
Otherwise, we'd have exactly the problem you describe.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [v9.5] Custom Plan API

On 2014-07-16 10:43:08 +0900, Shigeru Hanada wrote:
 Kaigai-san,
 
 2014-07-15 21:37 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com:
  Sorry, expected result of sanity-check test was not updated on
  renaming to pg_custom_plan_provider.
  The attached patch fixed up this point.
 
 I confirmed that all regression tests passed, so I marked the patch as
 Ready for committer.

I personally don't see how this patch is 'ready for committer'. I
realize that that state is sometimes used to denote that review needs to
be escalated, but it still seemspremature.

Unless I miss something there hasn't been any API level review of this?
Also, aren't there several open items?

Perhaps there needs to be a stage between 'needs review' and 'ready for
committer'?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

I am not opposed to moving the contrib code into core in the manner
that you oppose. I don't feel strongly either way.

I noticed in passing that your revision says this *within* levenshtein.c:

+ * Guaranteed to work with Name datatype's cstrings.
+ * For full details see levenshtein.c.

On Thu, Jul 17, 2014 at 6:34 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
Patch 2 is a rebase of the feature of Peter that can be applied on top of
patch 1. The code is rather untouched (haven't much played with Peter's
thingies), well-commented, but I think that this needs more work,
particularly when a query has a single RTE like in this case where no hints
are proposed to the user (mentioned upthread):

The only source of disagreement that I am aware of at this point is
the question of whether or not we should accept two candidates from
the same RTE. I lean slightly towards no, as already explained [1]
[2], but it's not as if I feel that strongly either way - this
approach of looking for only a single best candidate per RTE taken in
deference to the concerns of others.

I imagined that when a committer picked this up, an executive decision
would be made one way or the other. I am quite willing to revise the
patch to alter this behavior at the request of a committer.

[1]
http://www.postgresql.org/message-id/CAM3SWZTrm4PmqMmL9=eyx-8f-vx-ha7dme4koms2vcomozg...@mail.gmail.com
[2]
http://www.postgresql.org/message-id/cam3swzs6kiqeqjz4pv3fkp6cgw1ws26exoqtjb_xmw3ze5b...@mail.gmail.com
--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RLS Design

2014-07-17 Thread Alvaro Herrera

Tom Lane wrote:

 20MB messages to the list aren't that friendly.  Please don't do that
 again, unless asked to.

FWIW the message was not distributed to the list.  I got a note from
Adam and dropped it from the moderation queue.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [v9.5] Custom Plan API

2014-07-17 Thread Alvaro Herrera

I haven't followed this at all, but I just skimmed over it and noticed
the CustomPlanMarkPos thingy; apologies if this has been discussed
before.  It seems a bit odd to me; why isn't it sufficient to have a
boolean flag in regular CustomPlan to indicate that it supports
mark/restore?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] BUFFER_LOCK_EXCLUSIVE is used in ginbuildempty().

2014-07-17 Thread Alvaro Herrera

Kyotaro HORIGUCHI wrote:
 Hello,
 
 As far as I see gin seems using GIN_EXCLUSIVE instead of
 BUFFER_LOCK_EXCLUSIVE for LockBuffer, but the raw
 BUFFER_LOCK_EXCLUSIVE appears in ginbuildempty().
 
 Does it has a meaning to fix them to GIN_EXCLUSIVE?

I don't understand the point of having these GIN_EXCLUSIVE / GIN_SHARED
symbols.  It's not like we could do anything different than
BUFFER_LOCK_EXCLUSIVE etc instead.  It there was a GinLockBuffer() it
might make more sense to have specialized symbols, but as it is it seems
pointless.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] BUFFER_LOCK_EXCLUSIVE is used in ginbuildempty().

On Thu, Jul 17, 2014 at 7:47 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 I don't understand the point of having these GIN_EXCLUSIVE / GIN_SHARED
 symbols.  It's not like we could do anything different than
 BUFFER_LOCK_EXCLUSIVE etc instead.  It there was a GinLockBuffer() it
 might make more sense to have specialized symbols, but as it is it seems
 pointless.

It's a pattern common to the index AMs. I think it's kind of pointless
myself, but as long as we're doing it we might as well be consistent.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Portability issues in TAP tests

My Salesforce colleagues have been complaining that the TAP tests added
in 9.4 don't work terribly well for them.  I've been poking at this,
and I believe this is a reasonably complete list of the problems:

1. make [install]check-world tries to run the TAP tests even when
prove was not found by configure.  I regard this as a stop-ship issue
for 9.4; prove is not part of a basic Perl installation, and even if
it were, we don't require Perl to build from a tarball.

2. Most of the tests fail in make check mode, unless you already did
make install, because of failures to load libpq.so.  The right fix for
this seems to be to modify LD_LIBRARY_PATH and friends to include the temp
install's libdir, comparably to what pg_regress.c does (lines 903ff in
HEAD).  Unfortunately we can't just borrow that code since there is no
C wrapper involved in the TAP tests.  (Maybe there should be?)  This again
seems like a stop-ship issue.

3. Many of the tests depend on Test::More's subtest feature, which
unfortunately is of rather recent vintage (2009 or so).  It's not present
for example in RHEL6's version of Test::More, and presumably not in any
older distros either.  I'm not sure if we should consider it acceptable
to depend on this feature: it doesn't seem exactly critical, and if we
continue to use it, there is approximately 0 chance that we'll ever be
able to enable the TAP tests on many of the existing buildfarm members.
In any case, as long as we depend on it, there had better be some cleaner
response to older versions of Test::More than just blowing up.

4. IPC::Run isn't installed by default on RHEL, and probably not on other
distros either.  If there's a reasonably painless way to remove this
dependency, it'd improve the portability of the tests.  This is lower
priority than the previous items, for sure.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [v9.5] Custom Plan API

Alvaro Herrera alvhe...@2ndquadrant.com writes:
 I haven't followed this at all, but I just skimmed over it and noticed
 the CustomPlanMarkPos thingy; apologies if this has been discussed
 before.  It seems a bit odd to me; why isn't it sufficient to have a
 boolean flag in regular CustomPlan to indicate that it supports
 mark/restore?

Yeah, I thought that was pretty bogus too, but it's well down the
list of issues that were there last time I looked at this ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] BUFFER_LOCK_EXCLUSIVE is used in ginbuildempty().

Peter Geoghegan p...@heroku.com writes:
 On Thu, Jul 17, 2014 at 7:47 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
 I don't understand the point of having these GIN_EXCLUSIVE / GIN_SHARED
 symbols.  It's not like we could do anything different than
 BUFFER_LOCK_EXCLUSIVE etc instead.  It there was a GinLockBuffer() it
 might make more sense to have specialized symbols, but as it is it seems
 pointless.

 It's a pattern common to the index AMs. I think it's kind of pointless
 myself, but as long as we're doing it we might as well be consistent.

I think that to the extent that these symbols are used in APIs above the
direct buffer-access layer, they are useful --- for example using
BT_READ/BT_WRITE in _bt_search calls seems like a useful increment of
readability.  GIN seems to have less of that than some of the other AMs,
but I do see GIN_SHARE being used that way in some calls.

BTW, there's one direct usage of BUFFER_LOCK_EXCLUSIVE in the GIST code
as well, which should probably be replaced with GIST_EXCLUSIVE if we're
trying to be consistent.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] gaussian distribution pgbench

2014-07-17 Thread Fabien COELHO




However, ISTM that it is not the purpose of pgbench documentation to be a
primer about what is an exponential or gaussian distribution, so the idea
would yet be to have a relatively compact explanation, and that the
interested but clueless reader would document h..self from wikipedia or a
text book or a friend or a math teacher (who could be a friend as well:-).


Well, I think it's a balance.  I agree that the pgbench documentation
shouldn't try to substitute for a text book or a math teacher, but I
also think that you shouldn't necessarily need to refer to a text book
or a math teacher in order to figure out how to use pgbench.  Saying
it's complicated, so we don't have to explain it would be a cop out;
we need to *make* it simple.  And if there's no way to do that, then
IMHO we should reject the patch in favor of some future patch that
implements something that will be easy for users to understand.


 [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --exponential=10
starting vacuum...end.
transaction type: Exponential distribution TPC-B (sort of)
scaling factor: 1
exponential threshold: 10.0

decile percents: 63.2% 23.3% 8.6% 3.1% 1.2% 0.4% 0.2% 0.1% 0.0% 0.0%
highest/lowest percent of the range: 9.5% 0.0%


I don't have a clue what that means.  None.


Maybe we could add in front of the decile/percent

distribution of increasing account key values selected by pgbench:


I still wouldn't know what that meant.  And it misses the point
anyway: if the documentation is good, this will be unnecessary.  If
the documentation is bad, a printout that tries to illustrate it by
example is not an acceptable substitute.


The decile description is quite classic when discussing statistics.


Here is an example of an explanation that would make sense to me.
This is not the actual behavior of your patch, I'm quite sure, so this
is just an example of the *kind* of explanation that I think is
needed:


This is more or less the approximate behavior of the patch, but for 1% of
the range, not 50%. However I'm not sure that the current documentation is
so bad.


I think it isn't, because in the system I described, a larger value
indicates a flatter distribution, but in the documentation, a smaller
value indicates a flatter distribution.


Ok. But the general thrust was ok.

That having been said, I agree the current documentation for the 
exponential distribution is not too bad.  But this part does not make 
sense:


+  A crude approximation of the distribution is that the most frequent 1%
+  values are drawn replaceablethreshold/% of the time.


I'm trying to be nice to the reader by providing an intuitive 
information. I do not seem to succeed:-) I'm attempting to say that when 
you draw from a range, say 1 to 1000, the first 1%, i.e. values 1 to 10,

are draw about threshold% of the time.

If I draw from one hundred values:

\setrandom x 1 100 exponential 10.0

The 1 will be drawn about 10% of the time, and the 99 next values will 
share the remaining 90%.



+  The closer to 0.0 the threshold, the flatter (more uniform) the access
+  distribution.

Given the first statement, I'd expect the lowest possible threshold to
be 0.01, not 0.


This is in the sense of epsilon, small number close to 0 but different 
from 0. The lowest possible threshold is the smallest 
strictly positive representable with a double.



The documentation for the Gaussian distribution is in somewhat worse
shape.  Unlike the documentation for exponential, it makes no attempt
at all to give the user a clear idea what the distribution actually
looks like.  The closest it comes is this:

+  In other worlds, the larger the replaceablethreshold/,
+  the narrower the access range around the middle.

But that's not really very close - there's no way for a user to judge
what impact the threshold parameter actually has except to try it.
Unlike the discussion of exponential, which contains a fairly-precise
mathematical characterization of the behavior,


I have now added a precise formula for Gaussian. When you see the formula, 
maybe you still would want see the decile to have an intuition.


I think that we assumed that the reader would know that a gaussian 
distribution is the classic bell-shaped distribution, and if not .?he 
would not be interested anyway.



the Gaussian stuff has
nothing except a hand-wavy explanation that a higher threshold skews
the distribution more.  (Also, the English expression is in other
words not in other worlds - but in fact the phrase has no business
in that sentence at all, because it is not reiterating the contents of
the previous sentence in different language, but rather making a new
point entirely.  And the following sentence does not start with a
capital letter, though maybe that's because it was intended to be
incorporated into this sentence somehow.)

I think that you also need to consider which instances of the words
gaussian and exponential are referring to the option and

Re: [HACKERS] [TODO] Process pg_hba.conf keywords as case-insensitive

2014-07-17 Thread Gavin Flower


On 18/07/14 04:08, Tom Lane wrote:

Christoph Berg c...@df7cb.de writes:

One place that's been bugging me where case-insensitivity would really
make sense is this:
# set work_mem = '1mb';
ERROR:  22023: invalid value for parameter work_mem: 1mb
HINT:  Valid units for this parameter are kB, MB, and GB.

Yeah ... there was some pedantry about how kB and KB mean different
things.  IMO that's mere pedantry, but ...

regards, tom lane



But kb  kB do mean different things: kilobits vs kilobytes! :-)
(Network throughput seems to be always in bits per second - my broadband 
download is quoted at 100Mb/s, whereas I get 12MB/s download at best.)



Cheers,
Gavin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Set new system identifier using pg_resetxlog

On 2014-06-27 00:51:02 +0200, Petr Jelinek wrote:
 - while ((c = getopt(argc, argv, fl:m:no:O:x:e:)) != -1)
 + while ((c = getopt(argc, argv, fl:m:no:O:x:e:s::)) != -1)

Why two :?

   {
   switch (c)
   {
 @@ -227,6 +229,33 @@ main(int argc, char *argv[])
   XLogFromFileName(optarg, minXlogTli, 
 minXlogSegNo);
   break;
  
 + case 's':
 + if (optarg)
 + {
 +#ifdef HAVE_STRTOULL
 + set_sysid = strtoull(optarg, endptr, 
 10);
 +#else
 + set_sysid = strtoul(optarg, endptr, 
 10);
 +#endif

Wouldn't it be better to use sscanf()? That should work with large
inputs across platforms.

 + /*
 +  * Validate input, we use strspn 
 because strtoul(l) accepts
 +  * negative numbers and silently 
 converts them to unsigned
 +  */
 + if (set_sysid == 0 || errno != 0 ||
 + endptr == optarg || *endptr != 
 '\0' ||
 + strspn(optarg, 0123456789) != 
 strlen(optarg))
 + {
 + fprintf(stderr, _(%s: invalid 
 argument for option -s\n), progname);
 + fprintf(stderr, _(Try \%s 
 --help\ for more information.\n), progname);
 + exit(1);
 + }

Maybe: 'invalid argument \%s\ ...'?

 @@ -1087,6 +1147,7 @@ usage(void)
   printf(_(  -o OID   set next OID\n));
   printf(_(  -O OFFSETset next multitransaction offset\n));
   printf(_(  -V, --versionoutput version information, then exit\n));
 + printf(_(  -s [SYSID]   set system identifier (or generate 
 one)\n));
   printf(_(  -x XID   set next transaction ID\n));
   printf(_(  -?, --help   show this help, then exit\n));
   printf(_(\nReport bugs to pgsql-b...@postgresql.org.\n));

I think we usually try to keep the options roughly alphabetically
ordered. Same in the getopt() above.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Set new system identifier using pg_resetxlog

On 2014-07-18 00:41:05 +0200, Andres Freund wrote:
 On 2014-06-27 00:51:02 +0200, Petr Jelinek wrote:
  -   while ((c = getopt(argc, argv, fl:m:no:O:x:e:)) != -1)
  +   while ((c = getopt(argc, argv, fl:m:no:O:x:e:s::)) != -1)
 
 Why two :?

Obviously strike that, wanted to delete the paragraph, but forgot about
it. I hadn't remembered the autogeneration of sysids at the beginning of
the patch.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL for VAX on NetBSD/OpenBSD

2014-07-17 Thread Thor Lancelot Simon

On Wed, Jun 25, 2014 at 10:50:47AM -0700, Greg Stark wrote:
 On Wed, Jun 25, 2014 at 10:17 AM, Robert Haas robertmh...@gmail.com wrote:
  Well, the fact that initdb didn't produce a working configuration and
  that make installcheck failed to work properly are bad.  But, yeah,
  it's not totally broken.
 
 Yeah it seems to me that these kinds of autoconf and initdb tests
 failing are different from a platform where the spinlock code doesn't
 work. It's actually valuable to have a platform where people routinely
 trigger those configuration values. If they're broken there's not much
 value in carrying them.

Well, I have to ask this question: why should there be any vax-specific
code?  What facilities beyond what POSIX with the threading extensions
offers on a modern system do you really need?  Why?

It seems to me that NetBSD/vax is a very good platform for testing one's
assumptions about whether one's code is truly portable -- because it is
a moderately weird architecture, with some resource constraints, but with
a modern kernel and runtime offering everything you'd get from a software
point of view on any other platform.

Except, of course, for IEEE floating point, because the VAX's floating point
unit simply does not provide that.  But if other tests fail on the VAX or
one's source tree is littered with any other kind of VAX-specific code or
special cases for VAX, I would submit that this suggests one's code has
fairly serious architectual or implementation discipline issues.

Thor


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Set new system identifier using pg_resetxlog

2014-07-17 Thread Josh Berkus

On 06/30/2014 09:46 AM, Alvaro Herrera wrote:
 If we only had bricks and mortar, I think we would have a tool to
 display and tweak pg_control separately from emptying pg_xlog, rather
 than this odd separation between pg_controldata and pg_resetxlog, each
 of which do a mixture of those things.  But we have a wall two thirds
 done already, so it seems to make more sense to me to continue forward
 rather than tear it down and start afresh.  This patch is a natural
 extension of what we already have.

As I said previously, I disagree.  Being able to set a system identifier
at runtime is useful for a variety of scenarios which do not involve
database surgery or the risk of wiping out your data; in fact, the only
bad thing you can do by changing the system id is break the replication
connection.

As such, tying a change in system id to pg_resetxlog is kind of like
having a combination dental floss dispenser and bazooka.  No, not
*that* trigger!

It pretty much guarantees that the ability to change the systemid, which
could be a generally useful feature with all kinds of application for
HA, clusters and sharding, would remain an internal feature of BDR only,
because everyone else will be afraid to touch it.

If this means that we need to finally create pg_editcontroldata, then so
be it.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [GSoC2014] Patch ALTER TABLE ... SET LOGGED

On 2014-07-16 20:45:15 -0300, Fabrízio de Royes Mello wrote:
 On Wed, Jul 16, 2014 at 7:26 PM, Andres Freund and...@2ndquadrant.com
 wrote:
 
  On 2014-07-16 20:53:06 +0200, Andres Freund wrote:
   On 2014-07-16 20:25:42 +0200, Andres Freund wrote:
Hi,
   
I quickly looked at this patch and I think there's major missing
 pieces
around buffer management and wal logging.
   
a) Currently buffers that are in memory marked as
   permanent/non-permanent aren't forced out to disk/pruned from
 cache,
   not even when they're dirty.
b) When converting from a unlogged to a logged table the relation
 needs
   to be fsynced.
c) Currently a unlogged table changed into a logged one will be
   corrupted on a standby because its contents won't ever be WAL
 logged.
  
   Forget that, didn't notice that you're setting tab-rewrite = true.
 
  So, while that danger luckily isn't there I think there's something
  similar. Consider:
 
  CREATE TABLE blub(...);
  INSERT INTO blub ...;
 
  BEGIN;
  ALTER TABLE blub SET UNLOGGED;
  ROLLBACK;
 
  The rewrite will read in the 'old' contents - but because it's done
  after the pg_class.relpersistence is changed they'll all not be marked
  as BM_PERMANENT in memory. Then the ALTER TABLE is rolled back,
  including the relpersistence setting. Which will unfortunately leave
  pages with the wrong persistency setting in memory, right?
 
 
 That means should I FlushRelationBuffers(rel) before change the
 relpersistence?

I don't think that'd help. I think what this means that you simply
cannot change the relpersistence of the old relation before the heap
swap is successful. So I guess it has to be something like (pseudocode):

OIDNewHeap = make_new_heap(..);
newrel = heap_open(OIDNewHeap, AEL);

/*
 * Change the temporary relation to be unlogged/logged. We have to do
 * that here so buffers for the new relfilenode will have the right
 * persistency set while the original filenode's buffers won't get read
 * in with the wrong (i.e. new) persistency setting. Otherwise a
 * rollback after the rewrite would possibly result with buffers for the
 * original filenode having the wrong persistency setting.
 *
 * NB: This relies on swap_relation_files() also swapping the
 * persistency. That wouldn't work for pg_class, but that can't be
 * unlogged anyway.
 */
AlterTableChangeCatalogToLoggedOrUnlogged(newrel);
FlushRelationBuffers(newrel);
/* copy heap data into newrel */
finish_heap_swap();

And then in swap_relation_files() also copy the persistency.


That's the best I can come up right now at least.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Making joins involving ctid work for the benefit of UPSERT

I'm working on UPSERT again. I think that in order to make useful
progress, I'll have to find a better way of overcoming the visibility
issues (i.e. the problem of what to do about a
still-in-progress-to-our-snapshot row being locked at READ COMMITTED
isolation level [1][2]).

I've made some tentative additions to my earlier patch to change the syntax:

postgres=# explain analyze
with c as (
  insert into ints(key, val) values (1, 'a'), (2, 'b')
  on conflict
  select * for update)
  update ints set val = c.val from c conflicts;
  QUERY PLAN
--
 Update on ints  (cost=0.03..62.34 rows=2 width=104) (actual
time=0.095..0.095 rows=0 loops=1)
   CTE c
 -  Insert on ints ints_1  (cost=0.00..0.03 rows=2 width=36)
(actual time=0.039..0.049 rows=2 loops=1)
   -  Values Scan on *VALUES*  (cost=0.00..0.03 rows=2
width=36) (actual time=0.001..0.002 rows=2 loops=1)
   -  Nested Loop  (cost=0.00..62.31 rows=2 width=104) (actual
time=0.063..0.080 rows=2 loops=1)
 Join Filter: (ints.ctid = c.ctid)
 Rows Removed by Join Filter: 6
 -  CTE Scan on c  (cost=0.00..0.04 rows=2 width=100) (actual
time=0.044..0.055 rows=2 loops=1)
 -  Materialize  (cost=0.00..28.45 rows=1230 width=10)
(actual time=0.005..0.006 rows=4 loops=2)
   -  Seq Scan on ints  (cost=0.00..22.30 rows=1230
width=10) (actual time=0.009..0.009 rows=4 loops=1)
 Planning time: 0.132 ms
 Execution time: 0.159 ms
(12 rows)

This works, as far as it goes. Parse analysis adds the ctid join to
the top-level UPDATE based on the fact that a magical UPDATE statement
(magical due to having a CONFLICTS clause) references a CTE in its
FromList (this is mandated by CONFLICTS during parse analysis), a CTE
containing an INSERT ON CONFLICT, and itself projecting a magical set
of rejected rows sufficient to get a locked c.ctid implicitly. (FWIW,
DELETEs may also accept a CONFLICTS clause and be used like this).

The optimizer does not support joins involving ctid, which is why
there is a seqscan plan - even setting enable_seqscan = off does not
alter the plan right now.

Apart from the obvious fact that we're doing an unnecessary Seq Scan
on the target table, which is unacceptable on performance grounds
(since we do in fact already know exactly what list of tids to pass to
the top-level ModifyTable node to UPDATE), there is another problem:
Since I cannot use a tid scan, I cannot play tricks with visibility
within tidscans for the benefit of solving the basic problem of doing
the right thing for READ COMMITTED. Clearly I need to add a new MVCC
violation in the spirit of EvalPlanQual() to avoid this problem
(since READ COMMITTED serialization failures are unacceptable to
users), but I'd like this new MVCC violation to be as selective as
possible.

Let me take a step back, though. Presently UPDATE WHERE CURRENT OF
does something that isn't a million miles from what I have in mind to
address the problem. That feature exists for sensitive cursors,
where we lock rows (since FOR UPDATE was specified with the DECLARE
CURSOR statement) as they're fetched from the cursor. The upshot is
that in general, rows can change out from under us since they're not
yet locked, including in ways that might result in a new row version
not satisfying the original SELECT FOR UPDATE query predicate -
clearly users don't want to have to deal with that when they
subsequently go to UPDATE WHERE CURRENT OF. Special handling is
required. Purely because UPDATE WHERE CURRENT OF was specified, as
part of this special handling that update must use a tid scan. There
is a little bit of special machinery within the optimizer (within
cost_tidscan()) to force this for the isCurrentOf case.

The tid scan code within the executor also has special UPDATE WHERE
CURRENT OF handling. TidListCreate() specially handles the case by
asking execCurrentOf() to look up the cursor's portal, and to get a
tid from its QueryDesc for the current cursor position. TidNext()
specially fetches the most recent version visible to our estate's
Snapshot when we UPDATE WHERE CURRENT OF, without considering the
original SELECT FOR UPDATE predicate (and whether or not it would in
any sense continue to be satisfied). Each tid is then fetched from the
heap and returned. I think I ought to be able to do something similar
with this new CONFLICTS clause, by similarly marshaling some list of
tids ultimately originating from the ModifyTable node that locked a
potentially-not-yet-visible row (or maybe the optimizer recognizes the
internal special case and creates the tid scan node visibility
credulous). I'd then return heap tuples by using a dirty snapshot.
Since this can only happen for the UPDATE's tid scan, and there must
be a tid scan, it should be correct (on its own fairly reasonable
terms, for READ COMMITTED). After all, at

Re: [HACKERS] [v9.5] Custom Plan API

2014-07-17 Thread Kouhei Kaigai

 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  I haven't followed this at all, but I just skimmed over it and noticed
  the CustomPlanMarkPos thingy; apologies if this has been discussed
  before.  It seems a bit odd to me; why isn't it sufficient to have a
  boolean flag in regular CustomPlan to indicate that it supports
  mark/restore?
 
 Yeah, I thought that was pretty bogus too, but it's well down the list of
 issues that were there last time I looked at this ...
 
IIRC, CustomPlanMarkPos was suggested to keep the interface of
ExecSupportsMarkRestore() that takes plannode tag to determine
whether it support Mark/Restore.
As my original proposition did, it seems to me a flag field in
CustomPlan structure is straightforward, if we don't hesitate to
change ExecSupportsMarkRestore().

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [v9.5] Custom Plan API

2014-07-17 Thread Kouhei Kaigai

 I personally don't see how this patch is 'ready for committer'. I realize
 that that state is sometimes used to denote that review needs to be
 escalated, but it still seemspremature.
 
 Unless I miss something there hasn't been any API level review of this?
 Also, aren't there several open items?
 
Even though some interface specifications are revised according to the
comment from Tom on the last development cycle, the current set of
interfaces are not reviewed by committers. I really want this.

Here are two open items that we want to wait for committers comments.

* Whether set_cheapest() is called for all relkind?

This pactch moved set_cheapest() to the end of set_rel_pathlist(),
to consolidate entrypoint of custom-plan-provider handler function.
It also implies CPP can provider alternative paths towards non-regular
relations (like sub-queries, functions, ...).
Hanada-san wonder whether we really have a case to run alternative
sub-query code. Even though I don't have usecases for alternative
sub-query execution logic, but we also don't have a reason why not
to restrict it.

* How argument of add_path handler shall be derivered?

The handler function (that adds custom-path to the required relation
scan if it can provide) is declared with an argument with INTERNAL
data type. Extension needs to have type-cast on the supplied pointer
to customScanArg data-type (or potentially customHashJoinArg and
so on...) according to the custom plan class.
I think it is well extendable design than strict argument definitions,
but Hanada-san wonder whether it is the best design.

 Perhaps there needs to be a stage between 'needs review' and 'ready for
 committer'?

It needs clarification of 'ready for committer'. I think interface
specification is a kind of task to be discussed with committers
because preference/viewpoint of rr-reviewer are not always same
opinion with them.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


 -Original Message-
 From: Andres Freund [mailto:and...@2ndquadrant.com]
 Sent: Friday, July 18, 2014 3:12 AM
 To: Shigeru Hanada
 Cc: Kaigai Kouhei(海外 浩平); Kohei KaiGai; Simon Riggs; Tom Lane; Stephen
 Frost; Robert Haas; PgHacker; Jim Mlodgenski; Peter Eisentraut
 Subject: Re: [HACKERS] [v9.5] Custom Plan API
 
 On 2014-07-16 10:43:08 +0900, Shigeru Hanada wrote:
  Kaigai-san,
 
  2014-07-15 21:37 GMT+09:00 Kouhei Kaigai kai...@ak.jp.nec.com:
   Sorry, expected result of sanity-check test was not updated on
   renaming to pg_custom_plan_provider.
   The attached patch fixed up this point.
 
  I confirmed that all regression tests passed, so I marked the patch as
  Ready for committer.
 
 I personally don't see how this patch is 'ready for committer'. I realize
 that that state is sometimes used to denote that review needs to be
 escalated, but it still seemspremature.
 
 Unless I miss something there hasn't been any API level review of this?
 Also, aren't there several open items?
 
 Perhaps there needs to be a stage between 'needs review' and 'ready for
 committer'?
 
 Greetings,
 
 Andres Freund
 
 --
  Andres Freund   http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT