Re: [HACKERS] Row Level Security Documentation
On 26 September 2017 at 00:42, Stephen Frost <sfr...@snowman.net> wrote:
> * Dean Rasheed (dean.a.rash...@gmail.com) wrote:
>> Attached is a proposed patch...
>
> I've taken a look through this and generally agree with it.

Thanks for looking at this.

> the bits inside ... tags should be consistently capitalized (you had
> one case of 'All' that I noticed)

Yes, that's a typo. Will fix.

> I wonder if it'd be better to have the "simple" description *first*
> instead of later on, eg, where you have "Thus the overall result is
> that" we might want to try and reword things to describe it as
> "Overall, it works thusly, ..., and the specifics are ...".

OK, that makes sense. I'll try it that way round.

> That's a relatively minor point, however, and I do feel that this patch
> is a definite improvement.  Were you thinking of just applying this for
> v10, or back-patching all or part of it..?

I was planning on back-patching it to 9.5, taking out the parts relating
to restrictive policies as appropriate. Currently the 9.5 and 9.6 docs
are identical, as are 10 and HEAD, and 9.5/9.6 only differ from 10/HEAD
in a couple of places where they mention restrictive policies. IMO we
should stick to that, making any improvements available in the
back-branches.

I was also thinking the same about the new summary table, although I
haven't properly reviewed that yet.

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Row Level Security Documentation
On 5 August 2017 at 10:03, Fabien COELHO wrote:
> Patch applies cleanly, make html ok, new table looks good to me.

So I started looking at this patch, but before even considering the new
table proposed, I think there are multiple issues that need to be
resolved with the current docs:

Firstly, there are 4 separate places in the current CREATE POLICY docs
that say that multiple policies are combined using OR, which is of
course no longer correct, since PG10 added RESTRICTIVE policy support.
In fact, it wasn't even strictly correct in 9.5/9.6, because if multiple
different types of policy apply to a single command (e.g. SELECT and
UPDATE policies, being applied to an UPDATE command), then they are
combined using AND. Rather than trying to make this correct in 4
different places, I think there should be a single new section of the
docs that explains how multiple policies are combined. This will also
reduce the number of places where the pre- and post-v10 docs differ,
making them easier to maintain.

In the 6th paragraph of the description section, the terminology used is
mixed up and does not match the terminology used elsewhere ("command"
instead of "policy" and "policy" instead of "expression").

The description of UPDATE policies lists the commands that the policy
applies to (UPDATE and INSERT ... ON CONFLICT DO UPDATE), but fails to
mention SELECT FOR UPDATE and SELECT FOR SHARE.

The proposed patch distinguishes between UPDATE/DELETE commands that
have WHERE or RETURNING clauses, and those that don't, claiming that the
former require SELECT permissions and the latter don't. That's based on
a couple of statements from the current docs, but it's not entirely
accurate. For example, "UPDATE foo SET a=NULL WHERE true RETURNING
now()" does not require SELECT permissions despite having both WHERE and
RETURNING clauses (OK, I admit that's a rather contrived example, but
still...), and conversely (a more realistic example) "UPDATE foo SET
a=b+c" does require SELECT permissions despite not having WHERE or
RETURNING clauses. I think we need to be more precise about when SELECT
permissions are required.

In the notes section, there is a note about there needing to be at least
one permissive policy for restrictive policies to take effect. That's
well worth pointing out; however, I fear that this note is buried so far
down the page (amongst some other very wordy notes) that readers will
miss it. I suggest moving it to the parameters section, where permissive
and restrictive policies are first introduced, because it's a pretty
crucial fact to be aware of when using these new types of policy.

Attached is a proposed patch to address these issues, along with a few
other minor tidy-ups.

Regards,
Dean

create-policy-docs.patch
Description: Binary data
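To make the combination rules described above concrete, here is a hedged
sketch (all table, column and policy names are hypothetical; the
restrictive-policy syntax applies to PG10 and later):

```sql
-- Hypothetical illustration of how multiple policies combine.
CREATE TABLE accounts (id int, owner text, balance numeric);
ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;

-- Two permissive SELECT policies: a row is visible if EITHER passes (OR).
CREATE POLICY sel_own ON accounts FOR SELECT
    USING (owner = current_user);
CREATE POLICY sel_admin ON accounts FOR SELECT
    USING (current_user = 'admin');

-- A restrictive policy (PG10+): visible rows must ALSO pass this (AND).
CREATE POLICY only_positive ON accounts AS RESTRICTIVE FOR SELECT
    USING (balance >= 0);

-- Net qual applied to SELECT, roughly:
--   (owner = current_user OR current_user = 'admin') AND balance >= 0
-- For an UPDATE command, any applicable UPDATE policies would then be
-- combined with the SELECT policies using AND.
```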
[HACKERS] Re: [COMMITTERS] pgsql: Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition b
On 14 September 2017 at 03:49, Noah Misch wrote:
> On Wed, Sep 13, 2017 at 12:06:40PM -0400, Robert Haas wrote:
>> OK, thanks.  I still don't really like allowing this, but I can see
>> that compatibility with other systems has some value here, and if
>> nobody else is rejecting these cases, maybe we shouldn't either.  So
>> I'll hold my nose and change my vote to canonicalizing rather than
>> rejecting outright.
>
> I vote for rejecting it.  DDL compatibility is less valuable than other
> compatibility.  The hypothetical affected application can change its
> DDL to placate PostgreSQL and use that modified DDL for all other
> databases, too.

So we have 3 options:

1. Do nothing, allowing and keeping any values after a MINVALUE/MAXVALUE.
2. Allow such values, but canonicalise what we store.
3. Reject anything other than MINVALUE/MAXVALUE after MINVALUE/MAXVALUE.

My order of preference is still (1), (2) then (3) because:

- Compatibility.
- Less verbose / easier to type.
- We allow multiple syntaxes for equivalent constraints in other places,
  without canonicalising what the user enters.
- Regarding Robert's coding argument, code in general should not be
  looking at and basing decisions on any values it sees after a
  MINVALUE/MAXVALUE. If it does, at the very least it's doing more work
  than it needs to, and at worst, there's a potential bug there which
  would be more readily exposed by allowing arbitrary values there. Code
  that only worked because of earlier canonicalisation would strike me
  as being somewhat fragile.

However, it seems that there is a clear consensus towards (2) or (3) and
we have viable patches for each, although I'm not sure which of those
the consensus is really leaning towards. Robert, since partitioning was
originally your commit, and you know the wider codebase better, I'm
happy to go with whatever you decide.
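As a hedged sketch of the three options (table names are hypothetical,
and the exact error text under option 3 is illustrative):

```sql
CREATE TABLE t (a int, b int) PARTITION BY RANGE (a, b);

-- The "0" after MINVALUE below has no effect on the partition constraint:
CREATE TABLE t1 PARTITION OF t FOR VALUES FROM (MINVALUE, 0) TO (10, 20);

-- Option 1: accept the command and store (MINVALUE, 0) as entered.
-- Option 2: accept it, but canonicalise and store (MINVALUE, MINVALUE).
-- Option 3: reject it with an error along the lines of
--   ERROR: every bound following MINVALUE must also be MINVALUE
```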
Regards,
Dean
Re: [HACKERS] Re: [COMMITTERS] pgsql: Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition b
On 13 September 2017 at 14:51, Robert Haas wrote:
> Coincidentally, I wrote a patch for this too, but mine goes back to
> rejecting MINVALUE or MAXVALUE followed by anything else.

LGTM, if we decide to go this way.

One minor review comment -- you missed an example code snippet using
(MINVALUE, 0) further down the page in create_table.sgml, which needs
updating.

Regards,
Dean
Re: [HACKERS] Re: [COMMITTERS] pgsql: Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition b
On 13 September 2017 at 10:05, Amit Langote wrote:
> Coincidentally, I just wrote the patch for canonicalizing stored
> values, instead of erroring out.  Please see attached if that's what
> you were thinking too.

Looks reasonable to me, if we decide to go this way.

One minor review comment -- it isn't really necessary to have the
separate new boolean local variables as well as the datum kind
variables. Just tracking the datum kind is sufficient and slightly
simpler. That would also address a slight worry I have that your coding
might result in a compiler warning about the kind variables being used
uninitialised with some less intelligent compilers, not smart enough to
work out that it can't actually happen.

Regards,
Dean
Re: [HACKERS] Re: [COMMITTERS] pgsql: Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition b
On 13 September 2017 at 14:53, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Sep 13, 2017 at 4:51 AM, Dean Rasheed <dean.a.rash...@gmail.com> wrote:
>> A drawback to doing this is that we lose compatibility with syntaxes
>> supported by other databases, which was part of the reason for
>> choosing the terms MINVALUE and MAXVALUE in the first place.
>>
>> So thinking about this afresh, my preference would actually be to just
>> canonicalise the values stored rather than erroring out.
>
> Can you be more specific about what other databases do here?  Which
> other systems support MINVALUE/MAXVALUE, and what are their respective
> behaviors in this situation?

Oracle, MySQL and DB2 all use MINVALUE/MAXVALUE. Actually, Oracle and
MySQL only use MAXVALUE, not MINVALUE, because they don't allow gaps
between partitions and the first partition implicitly starts at
MINVALUE, so the bounds that we currently support are a strict superset
of those supported by Oracle and MySQL.

Both Oracle and MySQL allow finite values after MAXVALUE (usually listed
as "0" in code examples, e.g. see [1]). Oracle explicitly documents the
fact that values after MAXVALUE are irrelevant in [1]. I'm not sure if
MySQL explicitly documents that, but it does behave the same. Also, both
Oracle and MySQL store what the user entered (they do not canonicalise),
as can be seen by looking at ALL_TAB_PARTITIONS in Oracle, or "show
create table" in MySQL.

I have not used DB2.

Regards,
Dean

[1] https://docs.oracle.com/cd/E18283_01/server.112/e16541/part_admin001.htm
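For reference, a hedged sketch of the Oracle-style syntax being
described above (not PostgreSQL syntax; table and column names are
hypothetical):

```sql
-- Oracle-style range partitioning: only upper bounds are specified,
-- and values after MAXVALUE (the "0" below) are documented as
-- irrelevant and ignored.
CREATE TABLE sales (yr NUMBER, mon NUMBER)
PARTITION BY RANGE (yr, mon) (
    PARTITION p1 VALUES LESS THAN (2017, 7),
    PARTITION p2 VALUES LESS THAN (MAXVALUE, 0)
);
```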
Re: [HACKERS] Re: [COMMITTERS] pgsql: Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition b
Robert Haas writes:
> On Tue, Sep 12, 2017 at 9:58 AM, Alvaro Herrera wrote:
>> Did anything happen on this, or did we just forget it completely?
>
> I forgot it.  :-(
>
> I really think we should fix this.

Ah, sorry. This was for me to follow up, and I dropped the ball. Here's
a patch restoring the original error checks (actually not the same
coding as used in previous versions of the patch, because that would
have allowed a MINVALUE after a MAXVALUE and vice versa).

A drawback to doing this is that we lose compatibility with syntaxes
supported by other databases, which was part of the reason for choosing
the terms MINVALUE and MAXVALUE in the first place.

So thinking about this afresh, my preference would actually be to just
canonicalise the values stored rather than erroring out. Thoughts?

Regards,
Dean

restore-minvalue-maxvalue-error-checks.patch
Description: Binary data
Re: [HACKERS] dubious error message from partition.c
On 9 August 2017 at 13:03, Robert Haas wrote:
> On Tue, Aug 8, 2017 at 11:34 PM, Tom Lane wrote:
>> A small suggestion is that it'd be better to write it like "Specified
>> upper bound \"%s\" precedes lower bound \"%s\"."  I think "succeeds"
>> has more alternate meanings than "precedes", so the wording you have
>> seems more confusing than it needs to be.  (Of course, the situation
>> could be the opposite in other languages, but translators have the
>> ability to reverse the ordering if they need to.)
>
> I think that doesn't quite work, because the failure is caused by LB
> <= UB, not LB < UB.  We could fix that by writing "precedes or equals"
> but that seems lame.  Maybe:
>
> Lower bound %s does not precede upper bound %s.

There was an earlier suggestion to use "greater than or equal to". I
think that would work quite well:

ERROR: invalid range bounds for partition \"%s\"
DETAIL: Lower bound %s is greater than or equal to upper bound %s.

Regards,
Dean
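A hedged sketch of the failure case being discussed (table names
hypothetical; the error wording is precisely what this thread is
deciding, so none is shown):

```sql
CREATE TABLE t (a int) PARTITION BY RANGE (a);

-- The lower bound (10) is greater than or equal to the upper bound (5),
-- so this partition definition would be rejected:
CREATE TABLE t1 PARTITION OF t FOR VALUES FROM (10) TO (5);
```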
[HACKERS] Re: [COMMITTERS] pgsql: Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition b
On 8 August 2017 at 19:22, Robert Haas <robertmh...@gmail.com> wrote:
> On Fri, Jul 21, 2017 at 4:24 AM, Dean Rasheed <dean.a.rash...@gmail.com> wrote:
>> Also drop the constraint prohibiting finite values after an unbounded
>> column, and just document the fact that any values after MINVALUE or
>> MAXVALUE are ignored.  Previously it was necessary to repeat UNBOUNDED
>> multiple times, which was needlessly verbose.
>
> I would like to (belatedly) complain about this part of this commit.
> Now you can do stuff like this:
>
> rhaas=# create table foo (a int, b text) partition by range (a, b);
> CREATE TABLE
> rhaas=# create table foo1 partition of foo for values from (minvalue,
> 0) to (maxvalue, 0);
> CREATE TABLE
>
> The inclusion of 0 has no effect, but it is still stored in the
> catalog, shows up in \d foo1, and somebody might think it does
> something.  I think we should go back to requiring bounds after
> minvalue or maxvalue to be minvalue or maxvalue.  I thought maybe the
> idea here was that you were going to allow something like this to
> work, which actually would have saved some typing:
>
> create table foo2 partition of foo for values from (minvalue) to (maxvalue);
>
> But no:
>
> ERROR: FROM must specify exactly one value per partitioning column
>
> So the only way that this has made things less verbose is by letting
> you put in a meaningless value of the data type which takes fewer than
> 8 characters to type.  That doesn't seem to me to be a defensible way
> of reducing verbosity.

Well, perhaps verbosity-reduction isn't sufficient justification, but I
still think this is correct, because logically any values in the bound
after MINVALUE/MAXVALUE are irrelevant, so it seems overly restrictive
to force all later values to be MINVALUE/MAXVALUE when the code will
just ignore those values.
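To illustrate the point about ignored values, a hedged sketch (table
names are hypothetical):

```sql
CREATE TABLE bar (a int, b int) PARTITION BY RANGE (a, b);

-- The two "0" values below are ignored: once a column's bound is
-- MINVALUE/MAXVALUE, all later columns are unbounded, and the implied
-- partition constraint does not reference them at all.
CREATE TABLE bar1 PARTITION OF bar
    FOR VALUES FROM (MINVALUE, 0) TO (MAXVALUE, 0);
```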
I don't think we should allow values after MINVALUE/MAXVALUE to be
omitted altogether, because we document multi-column ranges as being
most easily understood according to the rules of row-wise comparisons,
and they require equal numbers of values in each row.

It's also worth noting that this choice is consistent with other
databases, so at least some people will be used to being able to do
this.

Regards,
Dean
Re: [HACKERS] Minor comment update in partition.c
On 31 July 2017 at 12:53, Beena Emerson wrote:
> The commit d363d42bb9a4399a0207bd3b371c966e22e06bd3 changed
> RangeDatumContent *content to PartitionRangeDatumKind *kind, but a
> comment on function partition_rbound_cmp was left unedited and it
> still mentions content1 instead of kind1.
>
> PFA the patch to fix this.

Yes, you're right, thanks. Patch committed.

Regards,
Dean
Re: [HACKERS] Multi column range partition table
On 17 July 2017 at 17:37, Dean Rasheed <dean.a.rash...@gmail.com> wrote:
> On 17 July 2017 at 16:34, Robert Haas <robertmh...@gmail.com> wrote:
>> Do you want to own this open item, then?
>
> OK.
>
> I need to give the patch another read-through, and then I'll aim to
> push it sometime in the next few days.

Committed.

Regards,
Dean
Re: [HACKERS] Multi column range partition table
On 17 July 2017 at 16:34, Robert Haas <robertmh...@gmail.com> wrote:
> On Sun, Jul 16, 2017 at 6:40 AM, Dean Rasheed <dean.a.rash...@gmail.com> wrote:
>> Technically, anything that can be done using INCLUSIVE/EXCLUSIVE can
>> also be done using MINVALUE/MAXVALUE, by artificially adding another
>> partitioning column and making it unbounded above/below, but that
>> would really just be a hack, and it (artificially adding an extra
>> column) would be unnecessary if we added INCLUSIVE/EXCLUSIVE support
>> in a later release.  Thus, I think the 2 features would complement
>> each other quite nicely.
>
> OK, works for me.  I'm not really keen about the MINVALUE/MAXVALUE
> syntax -- it's really +/- infinity, not a value at all -- but I
> haven't got a better proposal and yours at least has the virtue of
> perhaps being familiar to those who know about Oracle.

Cool. Sounds like we've reached a consensus, albeit with some
reservations around the fact that MINVALUE/MAXVALUE aren't actually
values, despite their names. +/- infinity *are* values for some
datatypes such as timestamps, so it had to be something different from
that, and MINVALUE/MAXVALUE are quite short and simple, and match the
syntax used by 3 other databases.

> Do you want to own this open item, then?

OK.

I need to give the patch another read-through, and then I'll aim to
push it sometime in the next few days.

Regards,
Dean
Re: [HACKERS] Multi column range partition table
On 14 July 2017 at 06:12, Robert Haas wrote:
> I agree that it's a big problem that (10, UNBOUNDED)
> interpreted as a maximum value means first_column <= 10 and when
> interpreted as a minimum value means first_column >= 10, because those
> things aren't opposites of each other.  I guess the proposal here
> would make (10, MAXVALUE) as a maximum value mean first_column <= 10
> and as a minimum would mean first_column > 10, and contrariwise for
> MINVALUE.  That seems to restore the intended design principle of the
> system, which is good

Right. So in general, when using MINVALUE/MAXVALUE for the 2nd column of
a 2-column partitioning scheme, the partition constraints simplify as
follows:

FROM (x, MINVALUE) => col1 >= x
FROM (x, MAXVALUE) => col1 > x
TO (x, MINVALUE)   => col1 < x
TO (x, MAXVALUE)   => col1 <= x

which restores the property that one partition can be made contiguous
with another by having the upper bounds of one partition equal to the
lower bounds of the other. Note that the choice of MINVALUE or MAXVALUE
only affects whether the constraint on the previous column is inclusive
or exclusive. That's quite different from what an INCLUSIVE/EXCLUSIVE
flag would do.

> , but...
>
> ...originally, Amit proposed to attach a postfix INCLUSIVE or
> EXCLUSIVE to each bound specification, and this does feel like a bit
> of a back door to the same place, kinda.  A partition defined to run
> from (10, MAXVALUE) TO (11, MAXVALUE) is a lot like a partition
> defined to run from (10) EXCLUSIVE to (11) EXCLUSIVE.  And if we
> eventually decide to allow that, then what will be the difference
> between a partition which starts at (10, MAXVALUE) EXCLUSIVE and one
> which starts from (10, MAXVALUE) INCLUSIVE?
The INCLUSIVE/EXCLUSIVE flag would apply to the constraint as a whole:

FROM (x, y) INCLUSIVE => (col1, col2) >= (x, y)
FROM (x, y) EXCLUSIVE => (col1, col2) > (x, y)
TO (x, y) INCLUSIVE   => (col1, col2) <= (x, y)
TO (x, y) EXCLUSIVE   => (col1, col2) < (x, y)

which, when expanded out, actually only affects the constraint on the
final column, and then only in the case where all the other columns are
equal to the partition bound value:

FROM (x, y) INCLUSIVE => col1 > x OR (col1 = x AND col2 >= y)
FROM (x, y) EXCLUSIVE => col1 > x OR (col1 = x AND col2 > y)
TO (x, y) INCLUSIVE   => col1 < x OR (col1 = x AND col2 <= y)
TO (x, y) EXCLUSIVE   => col1 < x OR (col1 = x AND col2 < y)

So while MINVALUE/MAXVALUE makes a particular column unbounded
below/above, and as a side-effect can influence the inclusivity of the
preceding column, INCLUSIVE/EXCLUSIVE affects the inclusivity of the
final column (something that MINVALUE/MAXVALUE cannot do).

MINVALUE/MAXVALUE takes precedence, in the sense that if the bound on
any column is MINVALUE/MAXVALUE, that column and any later columns are
unbounded and no longer appear in the partition constraint expression,
and so any INCLUSIVE/EXCLUSIVE flag would have no effect. That seems
pretty intuitive to me -- "unbounded inclusive" is no different from
"unbounded exclusive".

Technically, anything that can be done using INCLUSIVE/EXCLUSIVE can
also be done using MINVALUE/MAXVALUE, by artificially adding another
partitioning column and making it unbounded above/below, but that would
really just be a hack, and it (artificially adding an extra column)
would be unnecessary if we added INCLUSIVE/EXCLUSIVE support in a later
release. Thus, I think the 2 features would complement each other quite
nicely.

Regards,
Dean
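A hedged sketch of the simplification rules discussed in this thread
(table and column names are hypothetical):

```sql
CREATE TABLE t (c1 int, c2 int) PARTITION BY RANGE (c1, c2);

-- TO (5, MINVALUE) simplifies to c1 < 5:
CREATE TABLE t_lo PARTITION OF t
    FOR VALUES FROM (MINVALUE, MINVALUE) TO (5, MINVALUE);

-- FROM (5, MINVALUE) TO (5, MAXVALUE) gives c1 >= 5 AND c1 <= 5,
-- i.e. exactly c1 = 5, with c2 unbounded:
CREATE TABLE t_eq PARTITION OF t
    FOR VALUES FROM (5, MINVALUE) TO (5, MAXVALUE);

-- FROM (5, MAXVALUE) gives c1 > 5; each lower bound above matches the
-- previous partition's upper bound, so the three partitions are
-- contiguous and non-overlapping:
CREATE TABLE t_hi PARTITION OF t
    FOR VALUES FROM (5, MAXVALUE) TO (MAXVALUE, MAXVALUE);
```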
Re: [HACKERS] Multi column range partition table
On 12 July 2017 at 10:46, Ashutosh Bapat
<ashutosh.ba...@enterprisedb.com> wrote:
> On Wed, Jul 12, 2017 at 12:54 AM, Dean Rasheed <dean.a.rash...@gmail.com> wrote:
>> On 11 July 2017 at 13:29, Ashutosh Bapat
>>> The description in this paragraph seems to be attaching intuitive
>>> meaning of word "unbounded" to MAXVALUE and MINVALUE, which have
>>> different intuitive meanings of themselves.  Not sure if that's how
>>> we should describe MAXVALUE/MINVALUE.
>>
>> I'm not sure I understand your point.  MINVALUE and MAXVALUE do mean
>> unbounded below and above respectively.  This paragraph is just
>> making the point that that isn't the same as -/+ infinity.
>
> What confuses me and probably users is something named min/max"value"
> is not a value but something lesser or greater than any other "value".

Ah OK, I see what you're saying.

It's worth noting though that, after a little looking around, I found
that Oracle, MySQL and DB2 all use MINVALUE/MAXVALUE for unbounded range
partitions (although in the case of Oracle and MySQL, they currently
only support specifying upper bounds, and only use MAXVALUE at the
moment). So MINVALUE/MAXVALUE are likely to be familiar to at least some
people coming from other databases.

Of course, for those other databases, the surrounding syntax for
creating partitioned tables is completely different, but at least this
makes the bounds themselves portable (our supported set of bounds will
be a superset of those supported by Oracle and MySQL, and I think the
same as those supported by DB2).

I also personally quite like those terms, because they're nice and
concise, and it's pretty obvious which is which.

Regards,
Dean
Re: [HACKERS] New partitioning - some feedback
On 12 July 2017 at 23:23, Dean Rasheed <dean.a.rash...@gmail.com> wrote:
> I also agree that there probably isn't much need for a list that
> *only* includes partitions, but if someone comes up with a convincing
> use case, then we could add another 2-letter command for that.

Actually, I just thought of a round-about sort of use case: the various
2-letter commands \dE, \dt, etc... are designed to work together, so you
can do things like \dEt or \dtE to list all local and foreign tables,
whilst excluding views, sequences, etc.

So if, for the sake of argument, \dP were made to list partitions, then
you'd be able to do things like \dEPt to list all the various kinds of
tables, including partitions, whilst excluding views, etc. That seems
somewhat more elegant and flexible than \d++ or \d! or whatever. Of
course, you'd have to decide whether a foreign partition came under \dE,
\dP, both or something else.

I'm not sure that we should eat another letter of the alphabet just for
that case, because there aren't many left, and I don't think any will be
natural mnemonics like the others.

Regards,
Dean
Re: [HACKERS] New partitioning - some feedback
On 12 July 2017 at 15:58, Alvaro Herrera wrote:
> Amit Langote wrote:
>> On 2017/07/11 13:34, Alvaro Herrera wrote:
>>> However, the "list tables" command \dt should definitely IMO not
>>> list partitions.
>>
>> Do you mean never?  Even if a modifier is specified?  In the patch I
>> proposed, \d! (or \d+ or \d++, if '!' turns out to be unpopular) will
>> list partitions, but \d or \dt won't.  That is, partitions are hidden
>> by default.
>
> I don't think there is any need for a single list of all partitions of
> all tables -- is there?  I can't think of anything, but then I haven't
> been exposed very much to this feature yet.  For now, I lean towards
> "never".

So just focusing on the listing issue for now... I tend to agree with
some of the upstream comments that a bare \d should list everything,
including partitions, because partitions are still tables that you might
want to do DML or DDL on.

Also, if you look at what we already have, \d lists all types of
relations, and then there are 2-letter commands \dE, \di, \dm, \ds, \dt
and \dv that list just specific kinds of relations; for example, \dE
lists foreign tables, and \dt lists local tables, specifically excluding
foreign tables. So ISTM that the most logical extension of that is:

\d  - list all relations, including partitions
\dt - list only tables that are not foreign tables or partitions of
      other tables

(That's not quite an exact extension of the existing logic, because of
course it's partitioned tables that have the different relkind, not the
partitions, but the above seems like the most useful behaviour.)

With this, I don't think there's any need for any additional modifiers,
like ! or ++. I also agree that there probably isn't much need for a
list that *only* includes partitions, but if someone comes up with a
convincing use case, then we could add another 2-letter command for
that.

> I don't think \d! works terribly well as a mental model, but maybe
> that's just me.

Yeah, I agree. It just looks ugly somehow.
>> So it seems most of us are in favor of showing partitioned tables as
>> "partitioned table" instead of "table" in the table listing.
>
> Yeah, that sounds good to me.

+1

Regards,
Dean
Re: [HACKERS] Multi column range partition table
On 11 July 2017 at 13:29, Ashutosh Bapat wrote:
> + Also note that some element types, such as timestamp,
> + have a notion of "infinity", which is just another value that can
> + be stored.  This is different from MINVALUE and
> + MAXVALUE, which are not real values that can be stored,
> + but rather they are ways of saying the value is unbounded.
> + MAXVALUE can be thought of as being greater than any
> + other value, including "infinity", and MINVALUE as being
> + less than any other value, including "minus infinity".  Thus the range
> + FROM ('infinity') TO (MAXVALUE) is not an empty range; it
> + allows precisely one value to be stored: the timestamp
> + "infinity".
>
> The description in this paragraph seems to be attaching intuitive
> meaning of word "unbounded" to MAXVALUE and MINVALUE, which have
> different intuitive meanings of themselves.  Not sure if that's how we
> should describe MAXVALUE/MINVALUE.

I'm not sure I understand your point. MINVALUE and MAXVALUE do mean
unbounded below and above respectively. This paragraph is just making
the point that that isn't the same as -/+ infinity.

> Most of the patch seems to be replacing "content" with "kind",
> RangeDatumContent with PartitionRangeDatumKind and RANGE_DATUM_FINITE
> with PARTITION_RANGE_DATUM_VALUE.  But those changes in name don't
> seem to be adding much value to the patch.  Replacing
> RANGE_DATUM_NEG_INF and RANGE_DATUM_POS_INF with
> PARTITION_RANGE_DATUM_MINVALUE and PARTITION_RANGE_DATUM_MAXVALUE
> looks like a good change in line with the MINVALUE/MAXVALUE change.
> Maybe we should reuse the previous variables and enum type name,
> except those two, so that the total change introduced by the patch is
> minimal.

No, this isn't just renaming that other enum.
It's about replacing the boolean "infinite" flag on PartitionRangeDatum
with something that can properly enumerate the 3 kinds of
PartitionRangeDatum that are allowed (and, as noted above,
"finite"/"infinite" isn't the right terminology either). Putting that
new enum in parsenodes.h makes it globally available, wherever the
PartitionRangeDatum structure is used. A side-effect of that change is
that the old RangeDatumContent enum that was local to partition.c is no
longer needed.

RangeDatumContent wouldn't be a good name for a globally visible enum of
this kind, because that name fails to link it to partitioning in any
way, and could easily be confused as having something to do with RTEs or
range types. Also, the term "content" is more traditionally used in the
Postgres sources for a field *holding* content, rather than a field
specifying the *kind* of content. On the other hand, you'll note that
the term "kind" is by far the most commonly used term for naming this
kind of enum, and any matching fields.

IMO, code consistency and readability take precedence over keeping patch
sizes down.

Regards,
Dean
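The doc paragraph quoted above can be illustrated with a hedged sketch
(table names are hypothetical):

```sql
CREATE TABLE events (t timestamp) PARTITION BY RANGE (t);

-- 'infinity' is a real, storable timestamp value; MAXVALUE is not a
-- value at all.  This range is therefore not empty: it admits exactly
-- one value, the timestamp 'infinity'.
CREATE TABLE events_inf PARTITION OF events
    FOR VALUES FROM ('infinity') TO (MAXVALUE);
```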
Re: [HACKERS] Multi column range partition table
On 6 July 2017 at 22:43, Joe Conway wrote:
> I agree we should get this right the first time and I also agree with
> Dean's proposal, so I guess I'm a +2

On 7 July 2017 at 03:21, Amit Langote wrote:
> +1 to releasing this syntax in PG 10.

So that's 3 votes in favour of replacing UNBOUNDED with
MINVALUE/MAXVALUE for range partition bounds in PG 10. Not a huge
consensus, but no objections either.

Anyone else have an opinion? Robert, have you been following this
thread?

I was thinking of pushing this later today, in time for beta2.

Regards,
Dean
Re: [HACKERS] Multi column range partition table
On 7 July 2017 at 03:21, Amit Langote wrote:
> The patch looks generally good, although I found and fixed some minor
> issues (typos and such).  Please find attached the updated patch.

Thanks for the review. Those changes all look good.

I also see that I missed an example in the docs at the bottom of the
CREATE TABLE page, so I'll go update that.

Regards,
Dean
Re: [HACKERS] Multi column range partition table
On 6 July 2017 at 21:04, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Dean Rasheed <dean.a.rash...@gmail.com> writes:
>> However, this is also an incompatible syntax change, and any attempt
>> to support both the old and new syntaxes is likely to be messy, so we
>> really need to get consensus on whether this is the right thing to
>> do, and whether it *can* be done now for PG10.
>
> FWIW, I'd much rather see us get it right the first time than release
> PG10 with a syntax that we'll regret later.  I do not think that beta2,
> or even beta3, is too late for such a change.
>
> I'm not taking a position on whether this proposal is actually better
> than what we have.  But if there's a consensus that it is, we should
> go ahead and do it, not worry that it's too late.

OK, thanks. That's good to know.

Regards,
Dean
Re: [HACKERS] Multi column range partition table
On 5 July 2017 at 18:07, Dean Rasheed <dean.a.rash...@gmail.com> wrote: > So if we were to go for maximum flexibility and compatibility with > Oracle, then perhaps what we would do is more like the original idea > of UNBOUNDED ABOVE/BELOW, except call them MINVALUE and MAXVALUE, > which conveniently are already unreserved keywords, as well as being > much shorter. Plus, we would also relax the constraint about having > finite values after MINVALUE/MAXVALUE. > So I know that I have flip-flopped a few times on this now, but I'm now starting to think that this approach, replacing UNBOUNDED with MINVALUE and MAXVALUE is the best way to go, along with permitting finite values after MINVALUE/MAXVALUE. This gives the greatest flexibility, it's not too verbose, and it makes it easy to define contiguous sets of partitions just by making the lower bound of one match the upper bound of another. With this approach, any partition bounds that Oracle allows are also valid in PostgreSQL, not that I would normally give too much weight to that, but it is I think quite a nice syntax. Of course, we also support things that Oracle doesn't allow, such as MINVALUE and gaps between partitions. Parts of the patch are similar to your UNBOUNDED ABOVE/BELOW patch, but there are a number of differences -- most notably, I replaced the "infinite" boolean flag on PartitionRangeDatum with a 3-value enum and did away with all the DefElem nodes and the associated special string constants being copied and compared. However, this is also an incompatible syntax change, and any attempt to support both the old and new syntaxes is likely to be messy, so we really need to get consensus on whether this is the right thing to do, and whether it *can* be done now for PG10. 
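For illustration, here is a sketch of what the proposed syntax would allow (table and partition names invented; this reflects the proposal under discussion, not behaviour committed at the time of writing):

```sql
-- Hypothetical example of the proposed MINVALUE/MAXVALUE bounds.
-- The lower bound of each partition matches the upper bound of the
-- previous one, giving contiguous, non-overlapping ranges.
CREATE TABLE m (a int, b int) PARTITION BY RANGE (a, b);

CREATE TABLE m_p1 PARTITION OF m
    FOR VALUES FROM (MINVALUE, MINVALUE) TO (10, MINVALUE);
CREATE TABLE m_p2 PARTITION OF m
    FOR VALUES FROM (10, MINVALUE) TO (10, 100);
CREATE TABLE m_p3 PARTITION OF m
    FOR VALUES FROM (10, 100) TO (20, MINVALUE);
```

Note that m_p2 and m_p3 use a finite value (100) after a finite value in the first column, while m_p1's upper bound shows MINVALUE in a later column, the case the relaxed rules are meant to permit.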
Regards, Dean Replace_UNBOUNDED_with_MINVALUE_and_MAXVALUE.patch Description: Binary data
Re: [HACKERS] Multi column range partition table
On 5 July 2017 at 10:43, Amit Langote wrote: > 0001 is your patch to tidy up check_new_partition_bound() (must be > applied before 0002) > I pushed this first patch, simplifying check_new_partition_bound() for range partitions, since it seemed like a good simplification, but note that I don't think that was actually the cause of the latent bug you saw upthread. I think the real issue was in partition_rbound_cmp() -- normally, if the upper bound of one partition coincides with the lower bound of another, that function would report the upper bound as the smaller one, but that logic breaks if any of the bound values are infinite, since then it will exit early, returning 0, without ever comparing the "lower" flags on the bounds. I'm tempted to push a fix for that independently, since it's a bug waiting to happen, even though it's not possible to hit it currently. Regards, Dean
Re: [HACKERS] Multi column range partition table
On 5 July 2017 at 10:43, Amit Langote wrote: > In retrospect, that sounds like something that was implemented in the > earlier versions of the patch, whereby there was no ability to specify > UNBOUNDED on a per-column basis. So the syntax was: > > FROM { (x [, ...]) | UNBOUNDED } TO { (y [, ...]) | UNBOUNDED } > > But, it was pointed out to me [1] that that doesn't address the use case, > for example, where part1 goes up to (10, 10) and part2 goes from (10, 10) > up to (10, unbounded). > [Reading that other thread] It's a reasonable point that our syntax is quite different from Oracle's, and doing this takes it even further away, and removes support for things that they do support. For the record, Oracle allows things like the following:

DROP TABLE t1;
CREATE TABLE t1 (a NUMBER, b NUMBER, c NUMBER)
  PARTITION BY RANGE (a,b,c)
  (PARTITION t1p1 VALUES LESS THAN (1,2,3),
   PARTITION t1p2 VALUES LESS THAN (2,3,4),
   PARTITION t1p3 VALUES LESS THAN (3,MAXVALUE,5),
   PARTITION t1p4 VALUES LESS THAN (4,MAXVALUE,6)
  );

INSERT INTO t1 VALUES(1,2,3);
INSERT INTO t1 VALUES(2,3,4);
INSERT INTO t1 VALUES(3,4,5);
INSERT INTO t1 VALUES(3.01,4,5);
INSERT INTO t1 VALUES(4,5,10);

COLUMN subobject_name FORMAT a20;
SELECT a, b, c, subobject_name
  FROM t1, user_objects o
 WHERE o.data_object_id = dbms_rowid.rowid_object(t1.ROWID)
 ORDER BY a,b,c;

   A    B    C SUBOBJECT_NAME
---- ---- ---- --------------
   1    2    3 T1P2
   2    3    4 T1P3
   3    4    5 T1P3
3.01    4    5 T1P4
   4    5   10 T1P4

So they use MAXVALUE instead of UNBOUNDED for an upper bound, which is more explicit. They don't have an equivalent MINVALUE, but it's arguably not necessary, since the first partition's lower bound is implicitly unbounded. With this syntax they don't need to worry about gaps or overlaps between partitions, which is nice, but arguably less flexible. They're also more lax about allowing finite values after MAXVALUE, and they document the fact that any value after a MAXVALUE is ignored.
I don't think their scheme provides any way to define a partition of the above table that would hold all rows for which a < some value. So if we were to go for maximum flexibility and compatibility with Oracle, then perhaps what we would do is more like the original idea of UNBOUNDED ABOVE/BELOW, except call them MINVALUE and MAXVALUE, which conveniently are already unreserved keywords, as well as being much shorter. Plus, we would also relax the constraint about having finite values after MINVALUE/MAXVALUE. I think I'll go play around with that idea to see what it looks like in practice. Your previous patch already does much of that, and is far less invasive. Regards, Dean
Re: [HACKERS] Multi column range partition table
On 5 July 2017 at 10:43, Amit Langotewrote: >> So the more I think about this, the more I think that a cleaner design >> would be as follows: >> >> 1). Don't allow UNBOUNDED, except in the first column, where it can >> keep it's current meaning. >> >> 2). Allow the partition bounds to have fewer columns than the >> partition definition, and have that mean the same as it would have >> meant if you were partitioning by that many columns. So, for >> example, if you were partitioning by (col1,col2), you'd be allowed >> to define a partition like so: >> >> FROM (x) TO (y) >> >> and it would mean >> >> x <= col1 < y >> >> Or you'd be able to define a partition like >> >> FROM (x1,x2) TO (y) >> >> which would mean >> >> (col1 > x1) OR (col1 = x1 AND col2 >= x2) AND col1 < y >> >> 3). Don't allow any value after UNBOUNDED (i.e., only specify >> UNBOUNDED once in a partition bound). > > I assume we don't need the ability of specifying ABOVE/BELOW in this design. > Yes that's right. > In retrospect, that sounds like something that was implemented in the > earlier versions of the patch, whereby there was no ability to specify > UNBOUNDED on a per-column basis. So the syntax was: > > FROM { (x [, ...]) | UNBOUNDED } TO { (y [, ...]) | UNBOUNDED } > Yes, that's where I ended up too. > But, it was pointed out to me [1] that that doesn't address the use case, > for example, where part1 goes up to (10, 10) and part2 goes from (10, 10) > up to (10, unbounded). > > The new design will limit the usage of unbounded range partitions at the > tail ends. > True, but I don't think that's really a problem. When the first column is a discrete type, an upper bound of (10, unbounded) can be rewritten as (11) in the new design. When it's a continuous type, e.g. floating point, it can no longer be represented, because (10.0, unbounded) really means (col1 <= 10.0). 
But we've already decided not to support anything other than inclusive lower bounds and exclusive upper bounds, so allowing this upper bound goes against that design choice. >> Of course, it's pretty late in the day to be proposing this kind of >> redesign, but I fear that if we don't tackle it now, it will just be >> harder to deal with in the future. >> >> Actually, a quick, simple hacky implementation might be to just fill >> in any omitted values in a partition bound with negative infinity >> internally, and when printing a bound, omit any values after an >> infinite value. But really, I think we'd want to tidy up the >> implementation, and I think a number of things would actually get much >> simpler. For example, get_qual_for_range() could simply stop when it >> reached the end of the list of values for the bound, and it wouldn't >> need to worry about an unbounded value following a bounded one. >> >> Thoughts? > > I cooked up a patch for the "hacky" implementation for now, just as you > described in the above paragraph. Will you be willing to give it a look? > I will also think about the non-hacky way of implementing this. > OK, I'll take a look. Meanwhile, I already had a go at the "non-hacky" implementation (WIP patch attached). The more I worked on it, the simpler things got, which I think is a good sign. Part-way through, I realised that the PartitionRangeDatum Node type is no longer needed, because each bound value is now necessarily finite, so the lowerdatums and upperdatums lists in a PartitionBoundSpec can now be made into lists of Const nodes, making them match the listdatums field used for LIST partitioning, and then a whole lot of related code gets simplified. It needed a little bit more code in partition.c to track individual bound sizes, but there were a number of other places that could be simplified, so overall this represents a reduction in the code size and complexity. 
It's not complete (e.g., no doc updates yet), but it passes all the tests, and so far seems to work as I would expect. Regards, Dean

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
new file mode 100644
index 7da2058..aade9f5
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -66,23 +66,26 @@
  * is an upper bound.
  */

-/* Ternary value to represent what's contained in a range bound datum */
-typedef enum RangeDatumContent
+/* Type of an individual bound of a range partition */
+typedef enum RangeBoundKind
 {
-	RANGE_DATUM_FINITE = 0,		/* actual datum stored elsewhere */
-	RANGE_DATUM_NEG_INF,		/* negative infinity */
-	RANGE_DATUM_POS_INF			/* positive infinity */
-} RangeDatumContent;
+	RANGE_BOUND_FINITE = 0,		/* actual bound stored in the datums array */
+	RANGE_BOUND_NEG_INF,		/* negative infinity; NULL datums array */
+	RANGE_BOUND_POS_INF			/* positive infinity; NULL datums array */
+} RangeBoundKind;

 typedef struct PartitionBoundInfoData
 {
 	char		strategy;		/* list or range bounds? */
 	int			ndatums;		/*
Re: [HACKERS] Multi column range partition table
On 3 July 2017 at 10:32, Amit Langote <langote_amit...@lab.ntt.co.jp> wrote: > On 2017/07/03 17:36, Dean Rasheed wrote: >> The bigger question is do we want this for PG10? If so, time is >> getting tight. My feeling is that we do, because otherwise we'd be >> changing the syntax in PG11 of a feature only just released in PG10, >> and I think the current syntax is flawed, so it would be better not to >> have it in any public release. I'd feel better hearing from the >> original committer though. > > The way I have extended the syntax in the posted patch, ABOVE/BELOW (or > whatever we decide instead) are optional. UNBOUNDED without the > ABOVE/BELOW specifications implicitly means UNBOUNDED ABOVE if in FROM and > vice versa, which seems to me like sensible default behavior and what's > already present in PG 10. > > Do you think ABOVE/BELOW shouldn't really be optional? > Hmm, I'm not so sure about that. The more I think about this, the more I think that the current design is broken, and that introducing UNBOUNDED ABOVE/BELOW is just a sticking plaster to cover that up. Yes, it allows nicer multi-column ranges to be defined, as demonstrated upthread. But, it also allows some pretty counterintuitive things like making the lower bound exclusive and the upper bound inclusive. I think that's actually the real problem with the current design. If I have a single-column partition like (col) FROM (x) TO (y) it's pretty clear that's a simple range, inclusive at the lower end and exclusive at the upper end: (x) <= (col) < (y) If I now make that a 2-column partition, but leave the second column unbounded: (col1,col2) FROM (x,UNBOUNDED) TO (y,UNBOUNDED) my initial expectation would have been for that to mean the same thing, i.e., (x) <= (col1) < (y) but that only happens if "UNBOUNDED" means negative infinity in both places. 
That then starts to give the sort of desirable properties you'd expect, like using the same expression for the lower bound of one partition as the upper bound of another makes the two partitions contiguous. But of course, that's not exactly a pretty design either, because then you'd be saying that UNBOUNDED means positive infinity if it's the upper bound of the first column, and negative infinity if it's the lower bound of the first column or either bound of any other column.

Another aspect of the current design I don't like is that you have to keep repeating UNBOUNDED [ABOVE/BELOW] for each of the rest of the columns in the bound, and anything else is an error. That's a pretty verbose way of saying "the rest of the columns are unbounded".

So the more I think about this, the more I think that a cleaner design would be as follows:

1). Don't allow UNBOUNDED, except in the first column, where it can keep its current meaning.

2). Allow the partition bounds to have fewer columns than the partition definition, and have that mean the same as it would have meant if you were partitioning by that many columns. So, for example, if you were partitioning by (col1,col2), you'd be allowed to define a partition like so:

  FROM (x) TO (y)

and it would mean

  x <= col1 < y

Or you'd be able to define a partition like

  FROM (x1,x2) TO (y)

which would mean

  ((col1 > x1) OR (col1 = x1 AND col2 >= x2)) AND col1 < y

3). Don't allow any value after UNBOUNDED (i.e., only specify UNBOUNDED once in a partition bound).

This design has a few neat properties:

- Lower bounds are always inclusive and upper bounds are always exclusive.
- If the expression for the lower bound of one partition is the same as the expression for the upper bound of another, the 2 partitions are contiguous, making it easy to define a covering set of partitions.
- It's much easier to understand what a bound of "(x)" means than "(x,UNBOUNDED [ABOVE/BELOW])".
- It's much less verbose, and there's no needless repetition.
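A sketch of what point 2 would look like in practice (names invented; this abbreviated form is a proposal under discussion here, not committed syntax):

```sql
-- Partitioning by (col1, col2), with bounds listing fewer columns than
-- the partition key, as proposed above.
CREATE TABLE t (col1 int, col2 int) PARTITION BY RANGE (col1, col2);

-- Would mean simply: 1 <= col1 < 2, whatever col2 is.
CREATE TABLE t_p1 PARTITION OF t FOR VALUES FROM (1) TO (2);

-- Would mean: ((col1 > 2) OR (col1 = 2 AND col2 >= 5)) AND col1 < 3
CREATE TABLE t_p2 PARTITION OF t FOR VALUES FROM (2, 5) TO (3);
```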
Of course, it's pretty late in the day to be proposing this kind of redesign, but I fear that if we don't tackle it now, it will just be harder to deal with in the future. Actually, a quick, simple hacky implementation might be to just fill in any omitted values in a partition bound with negative infinity internally, and when printing a bound, omit any values after an infinite value. But really, I think we'd want to tidy up the implementation, and I think a number of things would actually get much simpler. For example, get_qual_for_range() could simply stop when it reached the end of the list of values for the bound, and it wouldn't need to worry about an unbounded value following a bounded one. Thoughts? Regards, Dean
Re: [HACKERS] Multi column range partition table
On 3 July 2017 at 06:00, Amit Langote <langote_amit...@lab.ntt.co.jp> wrote: > On 2017/07/03 2:15, Dean Rasheed wrote: >> My first thought was UNBOUNDED ABOVE/BELOW, because that matches the >> terminology already in use of upper and lower bounds. > > I was starting to like the Ashutosh's suggested UNBOUNDED MIN/MAX syntax, > but could you clarify your comment that ABOVE/BELOW is the terminology > already in use of upper and lower bounds? I couldn't find ABOVE/BELOW in > our existing syntax anywhere that uses the upper/lower bound notion, so > was confused a little bit. > I just meant that the words "above" and "below" more closely match the already-used terms "upper" and "lower" for the bounds, so that terminology seemed more consistent, e.g. "UNBOUNDED ABOVE" => no upper bound. > Also, I assume UNBOUNDED ABOVE signifies positive infinity and vice versa. > Right. I'm not particularly wedded to that terminology. I always find naming things hard, so if anyone can think of anything better, let's hear it. The bigger question is do we want this for PG10? If so, time is getting tight. My feeling is that we do, because otherwise we'd be changing the syntax in PG11 of a feature only just released in PG10, and I think the current syntax is flawed, so it would be better not to have it in any public release. I'd feel better hearing from the original committer though. Meanwhile, I'll continue trying to review the latest patches... Regards, Dean
Re: [HACKERS] Multi column range partition table
On 30 June 2017 at 10:04, Ashutosh Bapat wrote: > On Fri, Jun 30, 2017 at 1:36 PM, Amit Langote > wrote: >> >> Alright, I spent some time implementing a patch to allow specifying >> -infinity and +infinity in arbitrary ways. Of course, it prevents >> nonsensical inputs with appropriate error messages. > > I don't think -infinity and +infinity are the right terms. For a > string or character data type there is no -infinity and +infinity. > Similarly for enums. We need to extend UNBOUNDED somehow to indicate > the end of a given type in the given direction. I thought about > UNBOUNDED LEFT/RIGHT but then whether LEFT indicates -ve side or +side > would cause confusion. Also LEFT/RIGHT may work for a single > dimensional datatype but not for multi-dimensional spaces. How about > MINIMUM/MAXIMUM or UNBOUNDED MIN/MAX to indicate the extremities. > Yes, I think you're right. Also, some datatypes include values that are equal to +/-infinity, which would then behave differently from unbounded as range bounds, so it wouldn't be a good idea to overload that term. My first thought was UNBOUNDED ABOVE/BELOW, because that matches the terminology already in use of upper and lower bounds. Regards, Dean
Re: [HACKERS] Multi column range partition table
On 30 June 2017 at 09:06, Amit Langote wrote: > When testing the patch, I realized that the current code in > check_new_partition_bound() that checks for range partition overlap had a > latent bug that resulted in false positives for the new cases that the new > less restrictive syntax allowed. I spent some time simplifying that code > while also fixing the aforementioned bug. It's implemented in the > attached patch 0001. > I haven't had time to look at 0002 yet, but looking at 0001, I'm not convinced that this really represents much of a simplification, but I do prefer the way it now consistently reports the first overlapping partition in the error message. I'm not entirely convinced by this change either:

-if (equal || off1 != off2)
+if (off2 > off1 + 1 || ((off2 == off1 + 1) && !equal))

Passing probe_is_bound = true to partition_bound_bsearch() will I think cause it to return equal = false when the upper bound of one partition equals the lower bound of another, so relying on the "equal" flag here seems a bit dubious. I think I can just about convince myself that it works, but not for the reasons stated in the comments. It also seems unnecessary for this code to be doing 2 binary searches. I think a better simplification would be to just do one binary search to find the gap that the lower bound fits in, and then test the upper bound of the new partition against the lower bound of the next partition (if there is one), as in the attached patch.
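For reference, the overlap check under discussion is the one that rejects a new partition whose range intersects an existing one, along these lines (names invented; error text approximate, not verbatim server output):

```sql
CREATE TABLE t (a int) PARTITION BY RANGE (a);
CREATE TABLE t_p1 PARTITION OF t FOR VALUES FROM (1) TO (10);

-- The new lower bound (5) falls inside t_p1's range, so
-- check_new_partition_bound() should report the overlapping partition:
CREATE TABLE t_p2 PARTITION OF t FOR VALUES FROM (5) TO (20);
-- ERROR:  partition "t_p2" would overlap partition "t_p1"
```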
Regards, Dean diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c new file mode 100644 index 7da2058..96760a0 --- a/src/backend/catalog/partition.c +++ b/src/backend/catalog/partition.c @@ -745,78 +745,62 @@ check_new_partition_bound(char *relname, if (partdesc->nparts > 0) { PartitionBoundInfo boundinfo = partdesc->boundinfo; - int off1, -off2; - bool equal = false; + int offset; + bool equal; Assert(boundinfo && boundinfo->ndatums > 0 && boundinfo->strategy == PARTITION_STRATEGY_RANGE); /* - * Firstly, find the greatest range bound that is less - * than or equal to the new lower bound. + * Test whether the new lower bound (which is treated + * inclusively as part of the new partition) lies inside an + * existing partition, or in a gap. + * + * If it's in a gap, the next index value will be -1 (the + * lower bound of the next partition). This is also true + * if there is no next partition, since the index array is + * initialised with an extra -1 at the end. + * + * Note that this also allows for the possibility that the + * new lower bound equals an existing upper bound. */ - off1 = partition_bound_bsearch(key, boundinfo, lower, true, - ); + offset = partition_bound_bsearch(key, boundinfo, lower, + true, ); - /* - * off1 == -1 means that all existing bounds are greater - * than the new lower bound. In that case and the case - * where no partition is defined between the bounds at - * off1 and off1 + 1, we have a "gap" in the range that - * could be occupied by the new partition. We confirm if - * so by checking whether the new upper bound is confined - * within the gap. - */ - if (!equal && boundinfo->indexes[off1 + 1] < 0) + if (boundinfo->indexes[offset + 1] < 0) { - off2 = partition_bound_bsearch(key, boundinfo, upper, - true, ); - /* - * If the new upper bound is returned to be equal to - * the bound at off2, the latter must be the upper - * bound of some partition with which the new - * partition clearly overlaps. 
- * - * Also, if bound at off2 is not same as the one - * returned for the new lower bound (IOW, off1 != - * off2), then the new partition overlaps at least one - * partition. + * Check that the new partition will fit in the gap. + * For it to fit, the new upper bound must be less than + * or equal to the lower bound of the next partition, + * if there is one. */ - if (equal || off1 != off2) + if (offset + 1 < boundinfo->ndatums) { - overlap = true; + int32 cmpval; - /* - * The bound at off2 could be the lower bound of - * the partition with which the new partition - * overlaps. In that case, use the upper bound - * (that is, the bound at off2 + 1) to get the - * index of that partition. - */ - if (boundinfo->indexes[off2] < 0) -with = boundinfo->indexes[off2 + 1]; - else -with = boundinfo->indexes[off2]; + cmpval = partition_bound_cmp(key, boundinfo, + offset + 1, upper, + true); + if (cmpval < 0) + { +/* + * The new
Re: [HACKERS] Multi column range partition table
On 23 June 2017 at 08:01, Ashutosh Bapat wrote: > The way we have designed our syntax, we don't have a way to tell that > p3 comes after p2 and they have no gap between those. But I don't > think that's your question. What you are struggling with is a way to > specify a lower bound (10, +infinity) so that anything with i1 > 10 > would go to partition 3. > I think actually there is a fundamental problem here, which arises because UNBOUNDED has 2 different meanings depending on context, and thus it is not possible in general to specify the start of one range to be equal to the end of the previous range, as is necessary to get contiguous non-overlapping ranges. Note that this isn't just a problem for floating point datatypes either; it also applies to other types such as strings. For example, given a partition over (text, int) types defined with the following values:

  FROM ('a', UNBOUNDED) TO ('b', UNBOUNDED)

which is equivalent to

  FROM ('a', -INFINITY) TO ('b', +INFINITY)

where should the next range start? Even if you were to find a way to specify "the next string after 'b'", it wouldn't exactly be pretty. The problem is that the above partition corresponds to "all the strings starting with 'a', plus the string 'b'", which is pretty ugly. A neater way to define the pair of ranges in this case would be:

  FROM ('a', -INFINITY) TO ('b', -INFINITY)
  FROM ('b', -INFINITY) TO ('c', -INFINITY)

since then all strings starting with 'a' would fall into the first partition and all the strings starting with 'b' would fall into the second one.
Currently, when there are 2 partition columns, the partition constraint is defined as

  (a is not null) and (b is not null)
  and (a > al or (a = al and b >= bl))
  and (a < au or (a = au and b < bu))

if the upper bound bu were allowed to be -INFINITY (something that should probably be forbidden unless the previous column's upper bound were finite), then this would simplify to

  (a is not null) and (b is not null)
  and (a > al or (a = al and b >= bl))
  and (a < au)

and in the example above, where bl is -INFINITY, it would further simplify to

  (a is not null) and (b is not null)
  and (a >= al) and (a < au)

There would also be a similar simplification possible if the lower bound of a partition column were allowed to be +INFINITY. So, I think that having UNBOUNDED represent both -INFINITY and +INFINITY depending on context is a design flaw, and that we need to allow both -INFINITY and +INFINITY as upper and lower bounds (provided they are preceded by a column with a finite bound). I think that, in general, that's the only way to allow contiguous non-overlapping partitions to be defined on multiple columns. Regards, Dean
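In the MINVALUE/MAXVALUE spelling proposed elsewhere in this thread, the text/int example above might be sketched as follows (names invented; the constraint is shown approximately, not as verbatim server output):

```sql
CREATE TABLE s (a text, b int) PARTITION BY RANGE (a, b);

-- Contiguous partitions: the upper bound of s_a equals the lower bound of s_b.
CREATE TABLE s_a PARTITION OF s
    FOR VALUES FROM ('a', MINVALUE) TO ('b', MINVALUE);
CREATE TABLE s_b PARTITION OF s
    FOR VALUES FROM ('b', MINVALUE) TO ('c', MINVALUE);

-- Per the simplification described above, s_a's implied constraint
-- reduces to roughly:
--   (a IS NOT NULL) AND (b IS NOT NULL) AND (a >= 'a') AND (a < 'b')
```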
Re: [HACKERS] Rules on table partitions
On 20 June 2017 at 03:01, Amit Langote wrote: > Hmm, yes. The following exercise convinced me. > > create table r (a int) partition by range (a); > create table r1 partition of r for values from (1) to (10); > create rule "_RETURN" as on select to r1 do instead select * from r; > > insert into r values (1); > ERROR: cannot insert into view "r1" > HINT: To enable inserting into the view, provide an INSTEAD OF INSERT > trigger or an unconditional ON INSERT DO INSTEAD rule. > > The error is emitted by CheckValidResultRel() that is called on individual > leaf partitions when setting up tuple-routing in ExecInitModifyTable. > > I agree that we should forbid this case, so please find attached a patch. > Committed. Regards, Dean
Re: [HACKERS] Rules on table partitions
On 20 June 2017 at 03:01, Amit Langote <langote_amit...@lab.ntt.co.jp> wrote: > On 2017/06/19 20:19, Dean Rasheed wrote: >> Currently we allow rules to be defined on table partitions, but these >> rules only fire when the partition is accessed directly, not when it >> is accessed via the parent: > > Yeah, the same thing as will happen with an inheritance setup, but I guess > you are asking whether that's what we want. > > With old-style inheritance based setup, you will see exactly the exact > same result: > > Note that inheritance is expanded in the planner, not the rewriter, so the > rules of partitions (that are added to the query later than the rewriter) > won't fire, AFAICS. Same will apply to partitions. > Ah yes, you're right. I was misremembering how that worked. In an old-style inheritance-based setup you would have to define triggers on the parent table to handle INSERT, and since they would insert directly into the relevant child, any child INSERT rules would be fired, but child UPDATE and DELETE rules would not. At least with built-in partitioning it's consistent in not firing rules on the partitions for any command. >> Perhaps we >> should explicitly forbid this for now -- i.e., raise a "not supported" >> error when attempting to add a rule to a partition, or attach a table >> with rules to a partitioned table. > > We could do that, but an oft-raised question is how different we should > make new partitions from the old-style inheritance child tables? > > Although a slightly different territory, you will also notice that > statement triggers of partitions won't be fired unless they are explicitly > named in the query, which is what happens for inheritance in general and > hence also for partitions. 
> >> Personally, I wouldn't regard adding proper support for rules on >> partitions as a high priority, so I'd be OK with it remaining >> unsupported unless someone cares enough to implement it, but that >> seems preferable to leaving it partially working in this way. > > Sure, if consensus turns out to be that we prohibit rules, statement > triggers, etc. that depend on the relation being explicitly named in the > query to be defined on partitions, I could draft up a patch for v10. > Hmm, perhaps it's OK as-is. The user can always define the rules/triggers on both the parent and the children, if that's what they want. >> Also, as things stand, it is possible to do the following: >> >> CREATE TABLE t2(a int, b int) PARTITION BY RANGE(a); >> CREATE TABLE t2_p PARTITION OF t2 FOR VALUES FROM (1) TO (10); >> CREATE RULE "_RETURN" AS ON SELECT TO t2_p DO INSTEAD SELECT * FROM t2; >> >> which results in the partition becoming a view that selects from the >> parent, which surely ought to be forbidden. > > Hmm, yes. The following exercise convinced me. > > create table r (a int) partition by range (a); > create table r1 partition of r for values from (1) to (10); > create rule "_RETURN" as on select to r1 do instead select * from r; > > insert into r values (1); > ERROR: cannot insert into view "r1" > HINT: To enable inserting into the view, provide an INSTEAD OF INSERT > trigger or an unconditional ON INSERT DO INSTEAD rule. > > The error is emitted by CheckValidResultRel() that is called on individual > leaf partitions when setting up tuple-routing in ExecInitModifyTable. > > I agree that we should forbid this case, Yeah. Also it would be possible to define the rule to select from a non-empty table, leading to rows appearing in the partition, but not the parent. Since we normally explicitly forbid the use of a view as a table partition, it seems worth closing the loophole in this case. > so please find attached a patch. Thanks, I'll take a look at it later today. 
Regards, Dean
[HACKERS] Rules on table partitions
Currently we allow rules to be defined on table partitions, but these rules only fire when the partition is accessed directly, not when it is accessed via the parent:

CREATE TABLE t1(a int, b int) PARTITION BY RANGE(a);
CREATE TABLE t1_p PARTITION OF t1 FOR VALUES FROM (1) TO (10);
INSERT INTO t1 VALUES (1,101), (2,201);

CREATE TABLE t1_p_log(a int, b int, d date);
CREATE RULE t1_p_log_rule AS ON UPDATE TO t1_p
  DO ALSO INSERT INTO t1_p_log VALUES(old.a, old.b, now());

UPDATE t1 SET b=b+1 WHERE a=1;
UPDATE t1_p SET b=b+1 WHERE a=2;

SELECT * FROM t1_p_log;

 a |  b  |     d
---+-----+------------
 2 | 201 | 2017-06-19
(1 row)

I'd regard that as a bug, especially since this kind of thing would have worked with old-style user-defined partitioning. Perhaps we should explicitly forbid this for now -- i.e., raise a "not supported" error when attempting to add a rule to a partition, or attach a table with rules to a partitioned table. Personally, I wouldn't regard adding proper support for rules on partitions as a high priority, so I'd be OK with it remaining unsupported unless someone cares enough to implement it, but that seems preferable to leaving it partially working in this way.

Also, as things stand, it is possible to do the following:

CREATE TABLE t2(a int, b int) PARTITION BY RANGE(a);
CREATE TABLE t2_p PARTITION OF t2 FOR VALUES FROM (1) TO (10);
CREATE RULE "_RETURN" AS ON SELECT TO t2_p DO INSTEAD SELECT * FROM t2;

which results in the partition becoming a view that selects from the parent, which surely ought to be forbidden.

Regards, Dean
Re: [HACKERS] PG10 Partitioned tables and relation_is_updatable()
On 13 June 2017 at 05:50, Ashutosh Bapat <ashutosh.ba...@enterprisedb.com> wrote: > On Tue, Jun 13, 2017 at 12:03 AM, Dean Rasheed <dean.a.rash...@gmail.com> > wrote: >> Barring objections, I'll push my original patch and work up patches >> for the other couple of issues I found. > > No objections, the patch is good to go as is. Sorry for high-jacking > this thread. > Thanks for the review. Patch pushed. Here are patches for the other 2 issues, with test cases showing how they currently fail on HEAD. Regards, Dean diff --git a/src/backend/commands/policy.c b/src/backend/commands/policy.c new file mode 100644 index 4a75842..dad31df --- a/src/backend/commands/policy.c +++ b/src/backend/commands/policy.c @@ -474,7 +474,8 @@ RemoveRoleFromObjectPolicy(Oid roleid, O rel = relation_open(relid, AccessExclusiveLock); - if (rel->rd_rel->relkind != RELKIND_RELATION) + if (rel->rd_rel->relkind != RELKIND_RELATION && + rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE) ereport(ERROR, (errcode(ERRCODE_WRONG_OBJECT_TYPE), errmsg("\"%s\" is not a table", diff --git a/src/test/regress/expected/rowsecurity.out b/src/test/regress/expected/rowsecurity.out new file mode 100644 index e2ec961..26d28f2 --- a/src/test/regress/expected/rowsecurity.out +++ b/src/test/regress/expected/rowsecurity.out @@ -3885,6 +3885,7 @@ RESET SESSION AUTHORIZATION; CREATE ROLE regress_rls_dob_role1; CREATE ROLE regress_rls_dob_role2; CREATE TABLE dob_t1 (c1 int); +CREATE TABLE dob_t2 (c1 int) PARTITION BY RANGE (c1); CREATE POLICY p1 ON dob_t1 TO regress_rls_dob_role1 USING (true); DROP OWNED BY regress_rls_dob_role1; DROP POLICY p1 ON dob_t1; -- should fail, already gone @@ -3892,6 +3893,9 @@ ERROR: policy "p1" for table "dob_t1" d CREATE POLICY p1 ON dob_t1 TO regress_rls_dob_role1,regress_rls_dob_role2 USING (true); DROP OWNED BY regress_rls_dob_role1; DROP POLICY p1 ON dob_t1; -- should succeed +CREATE POLICY p1 ON dob_t2 TO regress_rls_dob_role1,regress_rls_dob_role2 USING (true); +DROP OWNED BY 
regress_rls_dob_role1; +DROP POLICY p1 ON dob_t2; -- should succeed DROP USER regress_rls_dob_role1; DROP USER regress_rls_dob_role2; -- diff --git a/src/test/regress/sql/rowsecurity.sql b/src/test/regress/sql/rowsecurity.sql new file mode 100644 index 3ce9293..ba8fed4 --- a/src/test/regress/sql/rowsecurity.sql +++ b/src/test/regress/sql/rowsecurity.sql @@ -1740,6 +1740,7 @@ CREATE ROLE regress_rls_dob_role1; CREATE ROLE regress_rls_dob_role2; CREATE TABLE dob_t1 (c1 int); +CREATE TABLE dob_t2 (c1 int) PARTITION BY RANGE (c1); CREATE POLICY p1 ON dob_t1 TO regress_rls_dob_role1 USING (true); DROP OWNED BY regress_rls_dob_role1; @@ -1749,6 +1750,10 @@ CREATE POLICY p1 ON dob_t1 TO regress_rl DROP OWNED BY regress_rls_dob_role1; DROP POLICY p1 ON dob_t1; -- should succeed +CREATE POLICY p1 ON dob_t2 TO regress_rls_dob_role1,regress_rls_dob_role2 USING (true); +DROP OWNED BY regress_rls_dob_role1; +DROP POLICY p1 ON dob_t2; -- should succeed + DROP USER regress_rls_dob_role1; DROP USER regress_rls_dob_role2; diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c new file mode 100644 index a637551..cc09355 --- a/src/pl/plpgsql/src/pl_comp.c +++ b/src/pl/plpgsql/src/pl_comp.c @@ -1761,7 +1761,8 @@ plpgsql_parse_cwordtype(List *idents) classStruct->relkind != RELKIND_VIEW && classStruct->relkind != RELKIND_MATVIEW && classStruct->relkind != RELKIND_COMPOSITE_TYPE && - classStruct->relkind != RELKIND_FOREIGN_TABLE) + classStruct->relkind != RELKIND_FOREIGN_TABLE && + classStruct->relkind != RELKIND_PARTITIONED_TABLE) goto done; /* @@ -1987,7 +1988,8 @@ build_row_from_class(Oid classOid) classStruct->relkind != RELKIND_VIEW && classStruct->relkind != RELKIND_MATVIEW && classStruct->relkind != RELKIND_COMPOSITE_TYPE && - classStruct->relkind != RELKIND_FOREIGN_TABLE) + classStruct->relkind != RELKIND_FOREIGN_TABLE && + classStruct->relkind != RELKIND_PARTITIONED_TABLE) ereport(ERROR, (errcode(ERRCODE_WRONG_OBJECT_TYPE), errmsg("relation \"%s\" is not a 
table", relname))); diff --git a/src/test/regress/expected/plpgsql.out b/src/test/regress/expected/plpgsql.out new file mode 100644 index 7ebbde6..35b83f7 --- a/src/test/regress/expected/plpgsql.out +++ b/src/test/regress/expected/plpgsql.out @@ -5979,3 +5979,48 @@ LINE 1: SELECT (SELECT string_agg(id || ^ QUERY: SELECT (SELECT string_agg(id || '=' || name, ',') FROM d) CONTEXT: PL/pgSQL function alter_table_under_transition_tables_upd_func() line 3 at RAISE +-- +-- Check type parsing and record fetching from partitioned tables +-- +CREATE TABLE partitioned_table (a int, b text) PARTITION BY LIST (a); +CREATE TABLE pt_p
Re: [HACKERS] PG10 Partitioned tables and relation_is_updatable()
On 13 June 2017 at 05:50, Ashutosh Bapat <ashutosh.ba...@enterprisedb.com> wrote: > On Tue, Jun 13, 2017 at 12:03 AM, Dean Rasheed <dean.a.rash...@gmail.com> > wrote: >> My initial thought, looking at the patch, is that it might be better >> to have all the macros in one file to make them easier to maintain. > > Right now the macros are listed just below relkind enum in pg_class.h. > Is that a good place or do you think, we should list them in a > separate file? > Yeah, I wondered about putting them in a separate file, but I think just below the relkind enum is probably the best place, so that people changing that enum immediately see the first set of related things to be updated. >> Barring objections, I'll push my original patch and work up patches >> for the other couple of issues I found. > > No objections, the patch is good to go as is. Sorry for high-jacking > this thread. > No worries. I missed that other thread initially, so it was useful to link the 2 threads together. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PG10 Partitioned tables and relation_is_updatable()
On 12 June 2017 at 17:51, Joe Conway wrote: > On 06/12/2017 07:40 AM, Joe Conway wrote: >> On 06/12/2017 01:49 AM, Amit Langote wrote: >>> As he mentioned in his reply, Ashutosh's proposal to abstract away the >>> relkind checks is interesting in this regard. >>> >> I have not looked at Ashutosh's patch yet, but it sounds like a good >> idea to me. > > After looking I remain convinced - +1 in general. > Yes, I think this will probably help, but I worry that it will turn into quite a large and invasive patch, and there are a number of design choices to be made over the naming and precise set of macros. Is this really PG10 material? My initial thought, looking at the patch, is that it might be better to have all the macros in one file to make them easier to maintain. > One thing I would change -- in many cases there are ERROR messages > enumerating the relkinds. E.g.: > 8<- > if (!RELKIND_CAN_HAVE_COLUMN_COMMENT(relation->rd_rel->relkind)) > ereport(ERROR, > (errcode(ERRCODE_WRONG_OBJECT_TYPE), > errmsg("\"%s\" is not a table, view, materialized view, > composite type, or foreign table", > 8<- > > I think those messages should also be rewritten to make them less > relkind specific and more clear, otherwise we still risk getting out of > sync in the future. Maybe something like this: > 8<- > "\"%s\" is not a kind of relation that can have column comments" > 8<- > Thoughts? > -1. I think the existing error messages provide useful information that you'd be removing. If you're worried about the error messages getting out of sync, then wouldn't it be better to centralise them along with the corresponding macros? > In any case, unless another committer wants it, and assuming no > complaints with the concept, I'd be happy to take this. > OK, have at it. Barring objections, I'll push my original patch and work up patches for the other couple of issues I found. 
Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Make ANALYZE more selective about what is a "most common value"?
On 11 June 2017 at 20:19, Tom Lane wrote: >> The standard way of doing this is to calculate the "standard error" of >> the sample proportion - see, for example [3], [4]: >> SE = sqrt(p*(1-p)/n) >> Note, however, that this formula assumes that the sample size n is >> small compared to the population size N, which is not necessarily the >> case. This can be taken into account by applying the "finite >> population correction" (see, for example [5]), which involves >> multiplying by an additional factor: >> SE = sqrt(p*(1-p)/n) * sqrt((N-n)/(N-1)) > > It's been a long time since college statistics, but that wikipedia article > reminds me that the binomial distribution isn't really the right thing for > our problem anyway. We're doing sampling without replacement, so that the > correct model is the hypergeometric distribution. Yes that's right. > The article points out > that the binomial distribution is a good approximation as long as n << N. > Can this FPC factor be justified as converting binomial estimates into > hypergeometric ones, or is it ad hoc? No, it's not just ad hoc. It comes from the variance of the hypergeometric distribution [1] divided by the variance of a binomial distribution [2] with p=K/N, in the notation of those articles. This is actually a very widely used formula, used in fields like analysis of survey data, which is inherently sampling without replacement (assuming the questioners don't survey the same people more than once!). Regards, Dean [1] https://en.wikipedia.org/wiki/Hypergeometric_distribution [2] https://en.wikipedia.org/wiki/Binomial_distribution -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
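Dean's answer above -- that the finite population correction is (the square root of) the hypergeometric variance divided by the binomial variance -- is easy to check numerically. Here is a minimal Python sketch; the function and variable names, and the example values of N, K and n, are mine, chosen purely for illustration:

```python
import math

def binomial_var(n, p):
    # Variance of a binomial sample count: n * p * (1 - p)
    return n * p * (1 - p)

def hypergeometric_var(N, K, n):
    # Variance of a hypergeometric sample count: drawing n rows without
    # replacement from a population of N rows, K of which match
    p = K / N
    return n * p * (1 - p) * (N - n) / (N - 1)

# Illustrative values: table size N, matching rows K, sample size n
N, K, n = 1_000_000, 2000, 30_000

fpc = math.sqrt((N - n) / (N - 1))
ratio = math.sqrt(hypergeometric_var(N, K, n) / binomial_var(n, K / N))

# The finite population correction is exactly sqrt(var_hyper / var_binom)
assert abs(ratio - fpc) < 1e-12

# Sampling the whole table leaves no sampling error at all
assert hypergeometric_var(N, K, N) == 0.0
```

Note that the ratio is (N-n)/(N-1) independently of K and p, which is why the same correction factor applies whatever the candidate value's frequency is.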
Re: [HACKERS] PG10 Partitioned tables and relation_is_updatable()
On 11 June 2017 at 16:59, Joe Conway wrote: > On 06/11/2017 08:55 AM, Joe Conway wrote: >> Yeah, I noticed the same while working on the RLS related patch. I did >> not see anything else in rewriteHandler.c but it is probably worth >> looking wider for other omissions. > > So in particular: > > #define RelationIsUsedAsCatalogTable(relation) \ > ((relation)->rd_options && \ > ((relation)->rd_rel->relkind == RELKIND_RELATION || \ > (relation)->rd_rel->relkind == RELKIND_MATVIEW) ? \ > ((StdRdOptions *) (relation)->rd_options)->user_catalog_table : false) > 8< > > Does RELKIND_PARTITIONED_TABLE apply there? > Hmm, a quick experiment shows that it won't allow a partitioned table to be used as a user catalog table, although I'm not sure if that's intentional. If it is, it doesn't appear to be documented. I found another RLS-related omission -- RemoveRoleFromObjectPolicy() doesn't work for partitioned tables, so DROP OWNED BY will fail if there are any partitioned tables with RLS. I also found another couple of omissions in PL/pgSQL -- plpgsql_parse_cwordtype() and build_row_from_class(), so for example %rowtype doesn't work for partitioned tables. That's it for my quick once-over the code-base, but I'm not confident that will have caught everything. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Make ANALYZE more selective about what is a "most common value"?
On 05/06/17 09:30, Tom Lane wrote: > First, I think we need a larger hard floor on the number of occurrences > of a value that're required to make ANALYZE decide it is a "most common > value"... > > Second, the code also has a rule that potential MCVs need to have an > estimated frequency at least 25% larger than what it thinks the "average" > value's frequency is. A rule of that general form seems like a good idea, > but I now think the 25% threshold is far too small to do anything useful. > In particular, in any case like this where there are more distinct values > than there are sample rows, the "average frequency" estimate will > correspond to less than one occurrence in the sample, so that this rule is > totally useless to filter anything that we would otherwise consider as an > MCV. I wonder if we shouldn't make it be "at least double the estimated > average frequency". > I think we should attempt to come up with a more principled approach to this, taking into account the table and sample sizes. Here's what I found, after a bit of research: A common initial rule of thumb is that the value should occur at least 10 times in the sample - see, for example [1], [2]. The idea behind that goes as follows: Suppose that N is the population size (total number of rows in the table), and n is the sample size, and that some particular candidate value appears x times in the sample. Then the "sample proportion" is given by p = x/n. Taking a different sample would, in general, give a different value for x, and hence for the sample proportion, so repeatedly taking different random samples of the table would lead to a distribution of values of p, and in general, this distribution would be a binomial distribution, since x is the count of sampled rows matching a yes/no question (does the row match the candidate value?). 
The idea behind the rule of thumb above is that if n is large enough, and x is not too close to 0 or n, then the binomial distribution of p can be reasonably approximated by a normal distribution. There are various rules of thumb for this to be true, but this one is actually one of the stronger ones, as can be seen in [2]. Technically the rule should be: x >= 10 and x <= n-10 but it's probably not worth bothering with the second test, since we're looking for MCVs. Note that this says nothing about the margin of error of p, just that it is reasonable to treat it as having a normal distribution, which then allows the margin of error to be analysed using standard techniques. The standard way of doing this is to calculate the "standard error" of the sample proportion - see, for example [3], [4]: SE = sqrt(p*(1-p)/n) Note, however, that this formula assumes that the sample size n is small compared to the population size N, which is not necessarily the case. This can be taken into account by applying the "finite population correction" (see, for example [5]), which involves multiplying by an additional factor: SE = sqrt(p*(1-p)/n) * sqrt((N-n)/(N-1)) This correction factor becomes 0 when n=N (whole table sampled => no error) and it approaches 1 when n is much smaller than N. Having computed the standard error of the sample proportion, it is then possible to establish a confidence interval around p. 
For example, one could say that there is a 95% probability that the total population proportion lies in the range [ pmin = p-2*SE, pmax = p+2*SE ] This gives rise to the possibility of writing another rule for candidate MCVs: If there are Nd distinct values in the table, so that the average frequency of occurrence of any particular value is 1/Nd, then the test pmin > 1/Nd would imply that there is a 97.5% probability that the candidate value is more common than the average (since the 5% of the distribution of p outside the confidence interval is split evenly below pmin and above pmax). To take a concrete example, suppose that the table has N=1,000,000 rows, with Nd=500 distinct values, so that each value occurs on average 2000 times. If the sample size is 30,000 then the current rule that a value needs to have an estimated frequency 25% over the average means that a value would need to be seen 75 times to be considered as an MCV. If that rule were changed to "at least double the estimated average frequency", then a value would need to be seen at least 120 times. On the other hand, using the above rule based on a 95% confidence interval, a value would only need to be seen 78 times, in which case the estimated total number of occurrences of the value in the table would be 2600+/-579, and using the MCV estimate would almost certainly be better than not using it. Regards, Dean [1] https://onlinecourses.science.psu.edu/stat200/node/43 [2] https://en.wikipedia.org/wiki/Binomial_distribution#Normal_approximation [3] http://mathworld.wolfram.com/SampleProportion.html [4] https://en.wikipedia.org/wiki/Population_proportion [5]
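The concrete example above can be reproduced directly. The following is a rough Python sketch, not PostgreSQL code; the helper name mcv_stats is mine, and the numbers are the ones from the example (N=1,000,000 rows, Nd=500 distinct values, sample size n=30,000):

```python
import math

def mcv_stats(N, Nd, n, x):
    """Sample proportion, standard error (with the finite population
    correction), and the resulting ~95% confidence interval for a value
    seen x times in a sample of n rows from a table of N rows."""
    p = x / n
    se = math.sqrt(p * (1 - p) / n) * math.sqrt((N - n) / (N - 1))
    return p, se, (p - 2 * se, p + 2 * se)

N, Nd, n = 1_000_000, 500, 30_000
avg_freq = 1 / Nd          # average frequency of any one value

# A value seen 78 times in the sample passes the proposed pmin > 1/Nd test
p, se, (pmin, pmax) = mcv_stats(N, Nd, n, 78)
assert round(p * N) == 2600       # estimated total occurrences in the table
assert round(2 * se * N) == 579   # ... give or take 579 at ~95% confidence
assert pmin > avg_freq

# ... whereas 77 sightings is not quite enough
_, _, (pmin77, _) = mcv_stats(N, Nd, n, 77)
assert pmin77 <= avg_freq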
[HACKERS] PG10 Partitioned tables and relation_is_updatable()
Hi, It looks like relation_is_updatable() didn't get the message about partitioned tables. Thus, for example, information_schema.views and information_schema.columns report that simple views built on top of partitioned tables are non-updatable, which is wrong. Attached is a patch to fix this. I think this kind of omission is an easy mistake to make when adding a new relkind, so it might be worth having more pairs of eyes looking out for more of the same. I did a quick scan of the rewriter code (prompted by the recent similar omission for RLS on partitioned tables) and I didn't find any more problems there, but I haven't looked elsewhere yet. Regards, Dean teach-relation-is-updatable-about-partitioned-tables.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Statistics "dependency"
On 23 April 2017 at 03:37, Bruce Momjian wrote: > In looking at the new multi-column statistics "dependency" option in > Postgres 10, I am quite confused by the term "dependency". Wouldn't > "correlation" be clearer and less confusing, as "column dependency" > already means something else? > Actually, the terms "dependency" and "correlation" are both quite broad terms that cover a whole range of different things, and hence could be misleading. The precise term for this is "functional dependency" [1], so if anything, the option name should be "functional_dependencies" or some shortening of that, keeping a part of each of those words. Regards, Dean [1] https://en.wikipedia.org/wiki/Functional_dependency -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WITH clause in CREATE STATISTICS
On 21 April 2017 at 01:21, Tomas Vondra wrote: > On 04/21/2017 12:13 AM, Tom Lane wrote: >> >> Alvaro Herrera writes: >>> >>> Simon just pointed out that having the WITH clause appear in the middle >>> of the CREATE STATISTICS command looks odd; apparently somebody else >>> already complained on list about the same. Other commands put the WITH >>> clause at the end, so perhaps we should do likewise in the new command. >> >> +1 for WITH at the end; the existing syntax looks weird to me too. > > -1 from me > Yeah, I'm still marginally in favour of the current syntax because it's a bit more consistent with the CREATE VIEW syntax which puts the WITH (options) clause before the query, and assuming that multi-table and partial statistics support do get added in the future, the thing that the statistics get created on is going to look more like a query. OTOH, a few people have now commented that the syntax looks weird, and not just because of the placement of the WITH clause. I don't have any better ideas, but it's not too late to change if anyone else does... Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multivariate statistics (v19)
On 11 February 2017 at 01:17, Tomas Vondra wrote: > Thanks for the feedback, I'll fix this. I've allowed myself to be a bit > sloppy because the number of attributes in the statistics is currently > limited to 8, so the overflows are currently not an issue. But it doesn't > hurt to make it future-proof, in case we change that mostly artificial limit > sometime in the future. > Ah right, so it can't overflow at present, but it's neater to have an overflow-proof algorithm. Thinking about the exactness of the division steps is quite interesting. Actually, the order of the multiplying factors doesn't matter as long as the divisors are in increasing order. So in both my proposal: result = 1 for (i = 1; i <= k; i++) result = (result * (n-k+i)) / i; and David's proposal, which is equivalent but has the multiplying factors in the opposite order, equivalent to: result = 1 for (i = 1; i <= k; i++) result = (result * (n-i+1)) / i; the divisions are exact at each step. The first time through the loop it divides by 1 which is trivially exact. The second time it divides by 2, having multiplied by 2 consecutive factors, one of which is therefore guaranteed to be divisible by 2. The third time it divides by 3, having multiplied by 3 consecutive factors, one of which is therefore guaranteed to be divisible by 3, and so on. My approach originally seemed more logical to me because of the way it derives from the recurrence relation binomial(n, k) = binomial(n-1, k-1) * n / k, but they both work fine as long as they have suitable overflow checks. It's also interesting that descriptions of this algorithm tend to talk about setting k to min(k, n-k) at the start as an optimisation step, as I did in fact, whereas it's actually more than that -- it helps prevent unnecessary intermediate overflows when k > n/2. Of course, that's not a worry for the current use of this function, but it's good to have a robust algorithm. 
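The claim that both orderings divide exactly at every step is easy to verify by brute force. A small Python sketch (using arbitrary-precision integers, so only the divisibility property is being exercised; the function name is mine):

```python
def divisions_all_exact(n, k, reverse=False):
    """Check that the running product is divisible by i at every step,
    with the multiplying factors taken in increasing (Dean's) or
    decreasing (David's) order and the divisors always increasing."""
    result = 1
    for i in range(1, k + 1):
        factor = (n - i + 1) if reverse else (n - k + i)
        result *= factor
        if result % i != 0:
            return False           # a division would truncate here
        result //= i
    return True

# Both orderings are exact at each step, for every n and k checked
assert all(divisions_all_exact(n, k) and divisions_all_exact(n, k, reverse=True)
           for n in range(30) for k in range(n + 1))
```

This matches the consecutive-factors argument above: after i steps, either ordering has multiplied i consecutive integers, whose product is always divisible by i!.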
Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multivariate statistics (v19)
On 8 February 2017 at 16:09, David Fetter wrote: > Combinations are n!/(k! * (n-k)!), so computing those is more > along the lines of: > > unsigned long long > choose(unsigned long long n, unsigned long long k) { > if (k > n) { > return 0; > } > unsigned long long r = 1; > for (unsigned long long d = 1; d <= k; ++d) { > r *= n--; > r /= d; > } > return r; > } > > which greatly reduces the chance of overflow. > Hmm, but that doesn't actually prevent overflows, since it can overflow in the multiplication step, and there is no protection against that. In the algorithm I presented, the inputs and the intermediate result are kept below INT_MAX, so the multiplication step cannot overflow the 64-bit integer, and it will only raise an overflow error if the actual result won't fit in a 32-bit int. Actually a crucial part of that, which I failed to mention previously, is the first step replacing k with min(k, n-k). This is necessary for inputs like (100,99), which should return 100, and which must be computed as 100 choose 1, not 100 choose 99, otherwise it will overflow internally before getting to the final result. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multivariate statistics (v19)
On 6 February 2017 at 21:26, Alvaro Herrera wrote: > Tomas Vondra wrote: >> On 02/01/2017 11:52 PM, Alvaro Herrera wrote: > >> > Nearby, some auxiliary functions such as n_choose_k and >> > num_combinations are not documented. What is it that they do? I'd >> > move these at the end of the file, keeping the important entry points >> > at the top of the file. >> >> I'd say n-choose-k is a pretty widely known term from combinatorics. The >> comment would essentially say just 'this is n-choose-k' which seems rather >> pointless. So as much as I dislike the self-documenting code, this actually >> seems like a good case of that. > > Actually, we do have such comments all over the place. I knew this as > "n sobre k", so the English name doesn't immediately ring a bell with me > until I look it up; I think the function comment could just say > "n_choose_k -- this function returns the binomial coefficient". > One of the things you have to watch out for when writing code to compute binomial coefficients is integer overflow, since the numerator and denominator get large very quickly. For example, the current code will overflow for n=13, k=12, which really isn't that large. This can be avoided by computing the product in reverse and using a larger datatype like a 64-bit integer to store a single intermediate result. The point about multiplying the terms in reverse is that it guarantees that each intermediate result is an exact integer (a smaller binomial coefficient), so there is no need to track separate numerators and denominators, and you avoid huge intermediate factorials. Here's what that looks like in pseudo-code: binomial(int n, int k): # Save computational effort by using the symmetry of the binomial # coefficients k = min(k, n-k); # Compute the result using binomial(n, k) = binomial(n-1, k-1) * n / k, # starting from binomial(n-k, 0) = 1, and computing the sequence # binomial(n-k+1, 1), binomial(n-k+2, 2), ... # # Note that each intermediate result is an exact integer. 
int64 result = 1; for (int i = 1; i <= k; i++) { result = (result * (n-k+i)) / i; if (result > INT_MAX) Raise overflow error } return (int) result; Note also that I think num_combinations(n) is just an expensive way of calculating 2^n - n - 1. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
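The pseudo-code above translates into a small runnable sketch; this one is in Python rather than C, so its integers can't actually overflow, and the INT_MAX check below only mimics the guard described. The reading of num_combinations(n) as "the number of subsets of size two or more" is inferred from the 2^n - n - 1 remark, not taken from the patch itself.

```python
INT_MAX = 2**31 - 1

def choose(n, k):
    """Binomial coefficient via binomial(n, k) = binomial(n-1, k-1) * n / k,
    raising an error if the result won't fit in a 32-bit int."""
    if k < 0 or k > n:
        return 0
    # Symmetry of the binomial coefficients: also prevents internal
    # overflow for inputs like (100, 99)
    k = min(k, n - k)
    result = 1
    for i in range(1, k + 1):
        result = (result * (n - k + i)) // i   # exact at every step
        if result > INT_MAX:
            raise OverflowError("binomial coefficient out of range")
    return result

assert choose(13, 12) == 13        # the case that overflowed the old coding
assert choose(100, 99) == 100      # needs the min(k, n-k) step
assert choose(30, 15) == 155117520

# num_combinations(n), read as the number of subsets of size >= 2, is just
# 2^n - n - 1: all subsets minus the n singletons and the empty set
def num_combinations(n):
    return sum(choose(n, k) for k in range(2, n + 1))

assert all(num_combinations(n) == 2**n - n - 1 for n in range(2, 20))
```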
Re: [HACKERS] Improving RLS planning
On 29 December 2016 at 15:55, Tom Lane <t...@sss.pgh.pa.us> wrote: > Dean Rasheed <dean.a.rash...@gmail.com> writes: >> On 28 December 2016 at 19:12, Tom Lane <t...@sss.pgh.pa.us> wrote: >>> [ getting back to this patch finally... ] I made the suggested change >>> to that test case, and what I see is a whole lot of "NOTICE: snooped value >>> = whatever" outputs. >>> >>> I'd leave it as shown in the attached diff fragment, except that I'm >>> a bit worried about possible platform dependency of the output. The >>> hashing occurring in the subplans shouldn't affect output order, but >>> I'm not sure if we want a test output like this or not. Thoughts? > >> How about replacing "a = 3" with "a < 7 AND a != 6". That then >> exercises more of the possible types of behaviour for quals: The "a < >> 7" qual is pushed down and used as an index condition. The "a != 6" >> qual is pushed down and used as a filter, because it's cheap and >> leakproof. The leakproof() qual isn't pushed down on cost grounds. The >> snoop() qual isn't pushed down on security grounds. Both snoop() and >> leakproof() are used as filters, along with "a != 6", and a SB subplan >> qual. "a != 6" is executed first because it has a security_level of 0, >> and is cheaper than the subplan. snoop() is executed later, despite >> being cheaper than the other filter quals, because it has a higher >> security_level, and leakproof() is executed last because it has the >> same security level as snoop() but is more expensive. > > Will do, although I think that the next test case (the one with "a = 8") > already shows most of those behaviors. > Except that it doesn't have a cheap leakproof qual like "a != 6", not handled automatically by the index, that order_qual_clauses() can assign security_level = 0 to, and then move to the start of the list. I think it's probably worth having a clause like that in one of the tests, but it could perhaps be added to the "a = 8" test, if you wanted to drop the "a = 3" test. 
Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving RLS planning
On 28 December 2016 at 19:12, Tom Lane <t...@sss.pgh.pa.us> wrote: > Stephen Frost <sfr...@snowman.net> writes: >> * Dean Rasheed (dean.a.rash...@gmail.com) wrote: >>> Hmm. I've not read any of the new code yet, but the fact that this >>> test now reduces to a one-time filter makes it effectively useless as >>> a test of qual evaluation order because it has deduced that it doesn't >>> need to evaluate them. I would suggest replacing the qual with >>> something that can't be reduced, perhaps "2*a = 6". > >> That's a good thought, I agree. > > [ getting back to this patch finally... ] I made the suggested change > to that test case, and what I see is a whole lot of "NOTICE: snooped value > = whatever" outputs. > > I'd leave it as shown in the attached diff fragment, except that I'm > a bit worried about possible platform dependency of the output. The > hashing occurring in the subplans shouldn't affect output order, but > I'm not sure if we want a test output like this or not. Thoughts? > How about replacing "a = 3" with "a < 7 AND a != 6". That then exercises more of the possible types of behaviour for quals: The "a < 7" qual is pushed down and used as an index condition. The "a != 6" qual is pushed down and used as a filter, because it's cheap and leakproof. The leakproof() qual isn't pushed down on cost grounds. The snoop() qual isn't pushed down on security grounds. Both snoop() and leakproof() are used as filters, along with "a != 6", and a SB subplan qual. "a != 6" is executed first because it has a security_level of 0, and is cheaper than the subplan. snoop() is executed later, despite being cheaper than the other filter quals, because it has a higher security_level, and leakproof() is executed last because it has the same security level as snoop() but is more expensive. Thus the test is more likely to highlight any future changes to the pushdown/ordering rules. 
The output is also neater, because nothing ends up being printed by snoop(), although of course it is OK for values greater than 5 to be printed. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE OR REPLACE VIEW bug
On 17 December 2016 at 15:42, Dean Rasheed <dean.a.rash...@gmail.com> wrote:
> It seems that there is a bug in CREATE OR REPLACE VIEW...
>
> DefineView()/DefineVirtualRelation() will need a little re-jigging to
> do things in the required order.

...and the required order for existing views is

1. Add any new columns
2. Add rules to store the new query
3. Update the view options

because 2 will fail if the view's columns don't match the query's
columns.

Attached is a patch enforcing this order and adding some comments to
make it clear why the order matters here.

Barring objections I'll back-patch this to 9.4 where WCO was added.

Regards,
Dean

diff --git a/src/backend/commands/view.c b/src/backend/commands/view.c
new file mode 100644
index c6b0e4f..414507f
--- a/src/backend/commands/view.c
+++ b/src/backend/commands/view.c
@@ -59,15 +59,13 @@ validateWithCheckOption(char *value)
 /*-
  * DefineVirtualRelation
  *
- * Create the "view" relation. `DefineRelation' does all the work,
- * we just provide the correct arguments ... at least when we're
- * creating a view. If we're updating an existing view, we have to
- * work harder.
+ * Create a view relation and use the rules system to store the query
+ * for the view.
  *-
  */
 static ObjectAddress
 DefineVirtualRelation(RangeVar *relation, List *tlist, bool replace,
-                      List *options)
+                      List *options, Query *viewParse)
 {
     Oid         viewOid;
     LOCKMODE    lockmode;
@@ -162,18 +160,13 @@ DefineVirtualRelation(RangeVar *relation
         checkViewTupleDesc(descriptor, rel->rd_att);

         /*
-         * The new options list replaces the existing options list, even if
-         * it's empty.
-         */
-        atcmd = makeNode(AlterTableCmd);
-        atcmd->subtype = AT_ReplaceRelOptions;
-        atcmd->def = (Node *) options;
-        atcmds = lappend(atcmds, atcmd);
-
-        /*
          * If new attributes have been added, we must add pg_attribute entries
          * for them. It is convenient (although overkill) to use the ALTER
          * TABLE ADD COLUMN infrastructure for this.
+         *
+         * Note that we must do this before updating the query for the view,
+         * since the rules system requires that the correct view columns be in
+         * place when defining the new rules.
          */
         if (list_length(attrList) > rel->rd_att->natts)
         {
@@ -192,9 +185,38 @@ DefineVirtualRelation(RangeVar *relation
             atcmd->def = (Node *) lfirst(c);
             atcmds = lappend(atcmds, atcmd);
         }
+
+        AlterTableInternal(viewOid, atcmds, true);
+
+        /* Make the new view columns visible */
+        CommandCounterIncrement();
     }

-    /* OK, let's do it. */
+    /*
+     * Update the query for the view.
+     *
+     * Note that we must do this before updating the view options, because
+     * the new options may not be compatible with the old view query (for
+     * example if we attempt to add the WITH CHECK OPTION, we require that
+     * the new view be automatically updatable, but the old view may not
+     * have been).
+     */
+    StoreViewQuery(viewOid, viewParse, replace);
+
+    /* Make the new view query visible */
+    CommandCounterIncrement();
+
+    /*
+     * Finally update the view options.
+     *
+     * The new options list replaces the existing options list, even if
+     * it's empty.
+     */
+    atcmd = makeNode(AlterTableCmd);
+    atcmd->subtype = AT_ReplaceRelOptions;
+    atcmd->def = (Node *) options;
+    atcmds = list_make1(atcmd);
+    AlterTableInternal(viewOid, atcmds, true);

     ObjectAddressSet(address, RelationRelationId, viewOid);
@@ -211,7 +233,7 @@ DefineVirtualRelation(RangeVar *relation
     ObjectAddress address;

     /*
-     * now set the parameters for keys/inheritance etc. All of these are
+     * Set the parameters for keys/inheritance etc. All of these are
      * uninteresting for views...
      */
     createStmt->relation = relation;
@@ -224,13 +246,20 @@ DefineVirtualRelation(RangeVar *relation
     createStmt->if_not_exists = false;

     /*
-     * finally create the relation (this will error out if there's an
-     * existing view, so we don't need more code to complain if "replace"
-     * is false).
+     * Create the relation (this will error out if there's an existing
+     * view, so we don't need more code to complain if "replace" is
+     * false).
      */
     address = DefineRelation(createStmt, RELKIND_VIEW, InvalidOid,
                              NULL, NULL);
     Assert(address.objectId != InvalidOid);
+
+    /* Make the new view relation visible */
+    CommandCounterIncrement();
+
+    /* Store the query for the view */
+    StoreViewQuery(address.objectId, viewParse, replace);
+
     return address;
 }
 }
@@ -530,16 +559,7 @@ DefineView(ViewStmt *stmt, const char *q
      * aborted.
      */
     address = DefineVirtualRelation(view, viewParse->targetList,
-                                    stmt->replace, stmt->options);
-
-    /*
-     * The relation we have just created is not visibl
[HACKERS] CREATE OR REPLACE VIEW bug
It seems that there is a bug in CREATE OR REPLACE VIEW's handling of
WITH CHECK OPTION (noticed while thinking about the recent change to
pg_dump's handling of circular dependencies in views -- d8c05af).

If you use CREATE OR REPLACE VIEW on a view that isn't auto-updatable
and turn it into one that is, and at the same time attempt to add a
WITH CHECK OPTION (which is exactly what pg_dump will now do), it fails:

CREATE TABLE t1 (a int);
CREATE VIEW v1 AS SELECT null::int AS a;
CREATE OR REPLACE VIEW v1 AS SELECT * FROM t1 WHERE a > 0 WITH CHECK OPTION;
ERROR:  WITH CHECK OPTION is supported only on automatically updatable views
HINT:  Views that do not select from a single table or view are not automatically updatable.

The problem is that before updating the view's query, DefineView()
calls DefineVirtualRelation(), which attempts to add the new check
option to the existing view via the ALTER VIEW mechanism, and that
fails because the new check option isn't valid against the old view
query.

So if we're going to use the ALTER VIEW mechanism to update the view's
options, which is probably the most convenient way to do it,
DefineView()/DefineVirtualRelation() will need a little re-jigging to
do things in the required order. I'll try to knock up a patch to do
that.

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
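The ordering problem can be illustrated with a toy model (Python, with hypothetical names -- this is not the actual PostgreSQL code): options that depend on the view's query, like WITH CHECK OPTION, must be validated against the *new* query, so applying them before the query is replaced fails even when the final state would be legal.

```python
# Toy model (not PostgreSQL code) of the ordering bug described above.
class View:
    def __init__(self, query, auto_updatable):
        self.query = query
        self.auto_updatable = auto_updatable
        self.check_option = False

    def set_options(self, check_option):
        # WITH CHECK OPTION is only valid on an auto-updatable view,
        # so this check must see the *new* query's updatability.
        if check_option and not self.auto_updatable:
            raise ValueError("WITH CHECK OPTION is supported only on "
                             "automatically updatable views")
        self.check_option = check_option

    def replace(self, query, auto_updatable, check_option):
        # The fix: store the new query first, then apply the options.
        self.query = query
        self.auto_updatable = auto_updatable
        self.set_options(check_option)

v = View("SELECT null::int AS a", auto_updatable=False)
# Applying the option against the old, non-updatable query would fail;
# replacing the query first succeeds:
v.replace("SELECT * FROM t1 WHERE a > 0", auto_updatable=True,
          check_option=True)
```

Calling `set_options(True)` on the original view raises the error, mirroring the CREATE OR REPLACE VIEW failure shown above.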
Re: [HACKERS] Add support for restrictive RLS policies
Stephen,

I looked through this in a little more detail and I found one other
issue: the documentation for the system catalog table pg_policy and
the view pg_policies needs to be updated to include the new columns
that have been added.

Other than that, it all looks good to me, subject to the previous
comments.

Regards,
Dean
Re: [HACKERS] Add support for restrictive RLS policies
On 1 December 2016 at 14:38, Stephen Frost <sfr...@snowman.net> wrote:
> * Dean Rasheed (dean.a.rash...@gmail.com) wrote:
>> In get_policies_for_relation() ...
>> ... I think it should sort the restrictive policies by name
>
> Hmmm, is it really the case that the quals will always end up being
> evaluated in that order though?  Isn't order_qual_clauses() going to end
> up changing the order based on the relative cost?  If the cost is the
> same it should maintain the order, but even that could change in the
> future based on the comments, no?  In short, I'm not entirely sure that
> we actually want to be required to always evaluate the quals in order of
> policy name and we might get complaints if we happen to make that work
> today and it ends up being changed later.
>

No, this isn't about the quals that get put into the WHERE clause of
the resulting queries. As you say, order_qual_clauses() is going to
re-order those anyway.

This is about the WithCheckOptions that get generated for UPDATEs and
INSERTs, and having those checked in a predictable order. The main
advantage of that is to guarantee a predictable error message from
self tests that attempt to insert invalid data.

This is basically the same as what was done for CHECK constraints in
e5f455f59fed0632371cddacddd79895b148dc07.

Regards,
Dean
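The point about predictable errors can be sketched as follows (Python, hypothetical helper names -- not the actual PostgreSQL code): if the WITH CHECK expressions are checked in policy-name order, the first violation reported is always the alphabetically-first violated policy, so a self test gets a deterministic error message.

```python
# Sketch: check WITH CHECK expressions in policy-name order so the
# reported violation is deterministic. Policy names and predicates
# here are invented for illustration.
policies = [("p_b", lambda row: row["x"] > 0),
            ("p_a", lambda row: row["x"] % 2 == 0)]

def check_new_row(row):
    for name, check in sorted(policies):   # sort by policy name
        if not check(row):
            raise ValueError(f'new row violates policy "{name}"')

check_new_row({"x": 4})   # passes both policies, no error
```

A row violating both policies (e.g. `{"x": -3}`) always reports `p_a`, never `p_b`, regardless of the order the policies were created in.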
Re: [HACKERS] Add support for restrictive RLS policies
On 30 November 2016 at 21:54, Stephen Frost wrote:
> Unless there's further comments, I'll plan to push this sometime
> tomorrow.
>

Sorry I didn't have time to look at this properly. I was intending to,
but my day job just keeps getting in the way. I do have a couple of
comments relating to the documentation and one relating to the code:

-   row-level security policy.
+   row-level security policy. Note that only the set of roles which the
+   policy applies to and the USING and
+   WITH CHECK expressions are able to be changed using
+   ALTER POLICY. To change other properties of a policy,
+   such as the command it is applied for or if it is a permissive or
+   restrictive policy, the policy must be dropped and recreated.

This note reads a little awkwardly to me. I think I would write it as:

    Note that ALTER POLICY only allows the set of roles to which the
    policy applies and the USING and WITH CHECK expressions to be
    modified. To change other properties of a policy, such as the
    command to which it applies or whether it is permissive or
    restrictive, the policy must be dropped and recreated.

+PERMISSIVE
+
+    Specify that the policy is to be created as a "permissive" policy.
+    All "permissive" policies which are applicable to a given query will
+    be combined together using the boolean "OR" operator. By creating
+    "permissive" policies, administrators can add to the set of records
+    which can be accessed. Policies are PERMISSIVE by default.
+
+RESTRICTIVE
+
+    Specify that the policy is to be created as a "restrictive" policy.
+    All "restrictive" policies which are applicable to a given query will
+    be combined together using the boolean "AND" operator. By creating
+    "restrictive" policies, administrators can reduce the set of records
+    which can be accessed as all "restrictive" policies must be passed for
+    each record.

I don't think this or anywhere else makes it entirely clear that the
user needs to have at least one permissive policy in addition to any
restrictive policies for this to be useful. I think this section is
probably a good place to mention that, since it's probably the first
place people will read about restrictive policies.

I think it also needs to be spelled out exactly how a mix of
permissive and restrictive policies are combined, because there is
more than one way to combine a set of quals with ANDs and ORs
(although only one way really makes sense in this context). So perhaps
an additional note along the lines of:

    Note that there needs to be at least one permissive policy to
    grant access to records before restrictive policies can be
    usefully used to reduce that access. If only restrictive policies
    exist, then no records will be accessible. When a mix of
    permissive and restrictive policies are present, a record is only
    accessible if at least one of the permissive policies passes, in
    addition to all the restrictive policies.

Also, I don't think it's necessary to keep quoting "restrictive" and
"permissive" here.

In get_policies_for_relation(), after the loop that does this:

-    *permissive_policies = lappend(*permissive_policies, policy);
+    {
+        if (policy->permissive)
+            *permissive_policies = lappend(*permissive_policies, policy);
+        else
+            *restrictive_policies = lappend(*restrictive_policies, policy);
+    }

I think it should sort the restrictive policies by name, for the same
reason that hook-restrictive policies are sorted -- so that the WITH
CHECK expressions are checked in a well-defined order (which was
chosen to be consistent with the order of checking of other similar
things, like CHECK constraints and triggers).

I also think that this should be a separate sort step from the sort
for hook policies, inserted just after the loop above, so that the
order is all internal policies sorted by name, followed by all hook
policies sorted by name, to be consistent with the existing principle
that hook policies are applied after internal policies.

Regards,
Dean
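The combination rule proposed in the suggested note (at least one permissive policy must pass, plus all restrictive policies) can be sketched as follows -- a Python toy model with hypothetical names, not the actual PostgreSQL implementation:

```python
# Sketch of combining permissive (OR-ed) and restrictive (AND-ed)
# row-level security policies, per the note suggested above.
def row_visible(row, permissive, restrictive):
    if not any(p(row) for p in permissive):
        return False            # no permissive policy grants access
    return all(r(row) for r in restrictive)

perm = [lambda r: r["owner"] == "alice"]    # combined with OR
restr = [lambda r: not r["archived"]]       # combined with AND

print(row_visible({"owner": "alice", "archived": False}, perm, restr))  # True
print(row_visible({"owner": "alice", "archived": True}, perm, restr))   # False
# With only restrictive policies, no rows are accessible at all:
print(row_visible({"owner": "alice", "archived": False}, [], restr))    # False
```

Note how the empty-permissive case falls out naturally: `any()` over an empty list is false, matching the behaviour that restrictive policies alone grant nothing.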
Re: [HACKERS] Improving RLS planning
On 28 November 2016 at 23:55, Stephen Frost wrote:
>> diff --git a/src/test/regress/expected/updatable_views.out
>> b/src/test/regress/expected/updatable_views.out
> [...]
>> --- 2104,2114 ----
>>
>>   EXPLAIN (VERBOSE, COSTS OFF)
>>   UPDATE v1 SET a=100 WHERE snoop(a) AND leakproof(a) AND a = 3;
>> !        QUERY PLAN
>> ! --
>> !  Result
>> !    One-Time Filter: false
>> ! (2 rows)
>
> Perhaps Dean recalls something more specific, but reviewing this test
> and the others around it, I believe it was simply to see what happens in
> a case which doesn't pass the security barrier view constraint and to
> cover the same cases with the UPDATE as were in the SELECTs above it.  I
> don't see it being an issue that it now results in a one-time filter:
> false result.
>

Hmm. I've not read any of the new code yet, but the fact that this
test now reduces to a one-time filter makes it effectively useless as
a test of qual evaluation order, because it has deduced that it
doesn't need to evaluate them. I would suggest replacing the qual with
something that can't be reduced, perhaps "2*a = 6".

In addition, I think that the tests on this view are probably no
longer adequate for the purpose of validating that the qual evaluation
order is safe. With the old implementation, the subquery scans in the
plans made it pretty clear that it was safe, and likely to remain safe
with variants of those queries, but that's not so obvious with the new
plans.

Maybe some additional quals could be added to the view definition,
perhaps based on the other view columns, to verify that the outer
leaky qual always gets evaluated after the security barrier quals,
regardless of cost. Or perhaps that's something that's better proved
with an all-new set of tests, but it does seem to me that the new
implementation has a higher risk (or at least introduces different
risks) of unsafe evaluation orders that warrant some additional
testing.
Regards,
Dean
Re: [HACKERS] Improving RLS planning
On 10 November 2016 at 17:12, Tom Lane wrote:
> Yeah, I think we'd be best off to avoid the bare term "security".
> It's probably too late to change the RTE field name "securityQuals",
> but maybe we could uniformly call those "security barrier quals" in
> the comments.  Then the basic terminology is that we have security
> barrier views and row-level security both implemented on top of
> security barrier quals, and we should be careful to use the right
> one of those three terms in comments/documentation.
>

+1 for that terminology and no renaming of fields.

Regards,
Dean
Re: [HACKERS] Improving RLS planning
On 8 November 2016 at 16:46, Robert Haas wrote:
> On Thu, Nov 3, 2016 at 6:23 PM, Tom Lane wrote:
>>> I think that ordering might be sub-optimal if you had a mix of
>>> leakproof quals and security quals and the cost of some security quals
>>> were significantly higher than the cost of some other quals.  Perhaps
>>> all leakproof quals should be assigned security_level 0, to allow them
>>> to be checked earlier if they have a lower cost (whether or not they
>>> are security quals), and only leaky quals would have a security_level
>>> greater than zero.  Rule 1 would then not need to check whether the
>>> qual was leakproof, and you probably wouldn't need the separate
>>> "leakproof" bool field on RestrictInfo.
>>
>> Hm, but it would also force leakproof quals to be evaluated in front
>> of potentially-cheaper leaky quals, whether or not that's semantically
>> necessary.
>>

True. That's also what currently happens with RLS and SB views because
leakproof quals are pushed down into subqueries without considering
their cost. It would be nice to do better than that.

>> I experimented with ignoring security_level altogether for leakproof
>> quals, but I couldn't make it work properly, because that didn't lead to
>> a comparison rule that satisfies transitivity.  For instance, consider
>> three quals:
>> A: cost 1, security_level 1, leaky
>> B: cost 2, security_level 1, leakproof
>> C: cost 3, security_level 0, leakproof
>> A should sort before B, since same security_level and lower cost;
>> B should sort before C, since lower cost and leakproof;
>> but A must sort after C, since higher security_level and leaky.
>
> Yeah, this is pretty thorny.  IIUC, all leaky quals of a given
> security level must be evaluated before any quals of the next higher
> security level, or we have a security problem.  Beyond that, we'd
> *prefer* to evaluate cheaper quals first (though perhaps we ought to
> be also thinking about how selective they are) but that's "just" a
> matter of how good the query plan is.  So in this example, security
> dictates that C must precede A, but that's it.  We can pick between
> C-A-B, C-B-A, and B-C-A based on cost.  C-B-A is clearly inferior to
> either of the other two, but it's less obvious whether C-A-B or B-C-A
> is better.  If you expect each predicate to have a selectivity of 50%,
> then C-A-B costs 3+(0.5*1)+(0.25*2) = 4 while B-C-A costs
> 2+(0.5*3)+(0.25*1) = 3.75, so B-C-A is better.  But now make the cost
> of B and C 18 and 20 while keeping the cost of A at 1.  Now C-A-B
> costs 20+(0.5*1)+(0.25*18) = 25 while B-C-A costs 18+(0.5*20)+(0.25*1)
> = 28.25, so now C-A-B is better.
>
> So I think any attempt to come up with a transitive comparison rule is
> doomed.  We could do something like: sort by cost then security level;
> afterwards, allow leakproof quals to migrate forward as many positions
> as is possible without passing a qual that is either higher-cost or
> (non-leakproof and lower security level).  So in the above example we
> would start by sorting them like C-A-B and then check whether B can
> move forward; it can't, so we're done.  If all operators were
> leakproof, this would essentially turn into an insertion-sort that
> orders them strictly by cost, whereas if they're all leaky, it orders
> strictly by security level and then by cost.  With a mix of leaky and
> non-leaky operators you get something in the middle.
>
> I'm not sure that this is practically better than the hack you
> proposed, but I wanted to take the time to comment on the theory here,
> as I see it anyway.
>

Yes, I think you're right. It doesn't look possible to invent a
transitive comparison rule.

I thought perhaps the rule could be to only "push down" a leakproof
qual (change it to a lower security_level) if there were more
expensive quals at the lower level, but as you point out, this doesn't
guarantee cheaper execution.

Regards,
Dean
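The expected-cost arithmetic quoted above can be written out as a small sketch (Python, illustrative only): the cost of an evaluation order is each qual's cost weighted by the fraction of rows that survive the quals evaluated before it.

```python
# Expected cost of evaluating quals in a given order, reproducing the
# arithmetic from the quoted discussion above.
def expected_cost(quals):
    """quals: list of (cost, selectivity) pairs, in evaluation order."""
    surviving = 1.0   # fraction of rows reaching the next qual
    total = 0.0
    for cost, selectivity in quals:
        total += surviving * cost
        surviving *= selectivity
    return total

# A, B, C with costs 1, 2, 3 and selectivity 0.5 each:
A, B, C = (1, 0.5), (2, 0.5), (3, 0.5)
print(expected_cost([C, A, B]))   # 4.0
print(expected_cost([B, C, A]))   # 3.75 -> B-C-A is better here

# Make B and C cost 18 and 20; the preference flips:
A, B, C = (1, 0.5), (18, 0.5), (20, 0.5)
print(expected_cost([C, A, B]))   # 25.0 -> now C-A-B is better
print(expected_cost([B, C, A]))   # 28.25
```

This is why no cost-based tiebreak between security-legal orderings can be decided locally: the better order depends on the actual costs, not just the pairwise comparisons.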
Re: [HACKERS] Improving RLS planning
On 8 November 2016 at 14:45, Tom Lane wrote:
> Stephen Frost writes:
>> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>>> * Since the planner is now depending on Query.hasRowSecurity to be set
>>> whenever there are any securityQuals, I put in an Assert about that,
>>> and promptly found three places in prepjointree.c and the rewriter where
>>> we'd been failing to set it.  I have not looked to see if these represent
>>> live bugs in existing releases, but they might.  Or am I misunderstanding
>>> what the flag is supposed to mean?
>
>> They're independent, actually.  securityQuals can be set via either
>> security barrier view or from RLS, while hasRowSecurity is specifically
>> for the RLS case.  The reason for the distinction is that changing your
>> role isn't going to impact security barrier views at all, while it could
>> impact what RLS policies are used.  See extract_query_dependencies().

Right. securityQuals was added for updatable SB views, which pre-dates RLS.

> OK.  In that case I'll need to adjust the patch so that the planner keeps
> its own flag about whether the query contains any securityQuals; that's
> easy enough.  But I'm still suspicious that the three places I found may
> represent bugs in the management of Query.hasRowSecurity.
>

I don't believe that there are any existing bugs here, but I think 1
or possibly 2 of the 3 places you changed should be kept on robustness
grounds (see below).

Query.hasRowSecurity is only used for invalidation of cached plans,
when the current role or environment changes, causing a change to the
set of policies that need to be applied, and thus requiring that the
query be re-planned. This happens in extract_query_dependencies(),
which walks the query tree and will find any Query.hasRowSecurity
flags and set PlannerGlobal.dependsOnRole, which is sufficient for its
intended purpose.

Regarding the 3 places you mention...

1). rewriteRuleAction() doesn't need to set Query.hasRowSecurity
because it's called during "Step 1" of the rewriter, where non-SELECT
rules are expanded, but RLS expansion doesn't happen until later, at
the end of "Step 2", after SELECT rules are expanded. That said, you
could argue that copying the flag in rewriteRuleAction() makes the
code more bulletproof, even though it is expected to always be false
at that point.

2). pull_up_simple_subquery() technically doesn't need to set
Query.hasRowSecurity because nothing in the planner refers to it, and
plancache.c will have already recorded the fact that the original
query had RLS. However, this seems like a bug waiting to happen, and
it really ought to be copying the flag in case we later add code that
does look at the flag during the planning process.

3). rewriteTargetView() should not set the flag, because the flag is
only for RLS, not for SB views, and we don't want cached plans for SB
views to be invalidated.

On a related note, I think it's worth establishing a terminology
convention for code and comments in this whole area. In the existing
code, or in my head at least, there's a convention that a term that
contains the words "row" or "policy" together with "security" refers
specifically to RLS, not to SB views. For example:

* Row-level security (or just row security) for the name of the feature itself
* row_security -- the configuration setting
* get_row_security_policies()
* Query.hasRowSecurity
* rowsecurity.c

On the other hand, terms that contain just the word "security" without
the words "row" or "policy" have a broader scope and may refer to
either RLS or SB views. For example:

* RangeTblEntry.security_barrier
* RangeTblEntry.securityQuals
* expand_security_quals()
* prepsecurity.c
* The new security_level field

It's a pretty fine distinction, and I don't know how others have been
thinking about this, but I think that it's helpful to make the
distinction, and there are at least a couple of places in the patch
that use RLS-specific terminology for what could also be a SB view.

Regards,
Dean
Re: [HACKERS] Improving RLS planning
On 25 October 2016 at 22:58, Tom Lane wrote:
> The alternative I'm now thinking about pursuing is to get rid of the
> conversion of RLS quals to subqueries.  Instead, we can label individual
> qual clauses with security precedence markings.  Concretely, suppose we
> add an "int security_level" field to struct RestrictInfo.  The semantics
> of this would be that a qual with a lower security_level value must be
> evaluated before a qual with a higher security_level value, unless the
> latter qual is leakproof.  (It would likely also behoove us to add a
> "leakproof" bool field to struct RestrictInfo, to avoid duplicate
> leakproof-ness checks on quals.  But that's just an optimization.)
>

+1 for this approach. It looks like it could potentially be much
simpler. There's some ugly code in the inheritance planner (and
probably one or two other places) that it might be possible to chop
out, which would probably also speed up planning times.

> In the initial implementation, quals coming from a RangeTblEntry's
> securityQuals field would have security_level 0, quals coming from
> anywhere else would have security_level 1; except that if we know
> there are no security quals anywhere (ie not Query->hasRowSecurity),
> we could give all quals security_level 0.  (I think this exception
> may be worth making because there's no need to test leakproofness
> for a qual with security level 0; it could never be a candidate
> for security delay anyway.)
>

Note that the securityQuals list represents nested levels of security
barriers (e.g., nested SB views), so you'd have to actually assign
security_level 0 to the first security qual, security_level 1 to the
second security qual, and so on. Then the quals coming from anywhere
else would have to have a security_level one greater than the maximum
of all the other security levels.

> Having done that much, I think all we need in order to get rid of
> RLS subqueries, and just stick RLS quals into their relation's
> baserestrictinfo list, are two rules:
>
> 1. When selecting potential indexquals, a RestrictInfo can be considered
> for indexqual use only if it is leakproof or has security_level <= the
> minimum among the table's baserestrictinfo clauses.
>
> 2. In order_qual_clauses, sort first by security_level and second by cost.
>

I think that ordering might be sub-optimal if you had a mix of
leakproof quals and security quals and the cost of some security quals
were significantly higher than the cost of some other quals. Perhaps
all leakproof quals should be assigned security_level 0, to allow them
to be checked earlier if they have a lower cost (whether or not they
are security quals), and only leaky quals would have a security_level
greater than zero. Rule 1 would then not need to check whether the
qual was leakproof, and you probably wouldn't need the separate
"leakproof" bool field on RestrictInfo.

Regards,
Dean
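Rule 2, together with the tweak suggested in the reply (treating leakproof quals as security_level 0 so that cheap leakproof quals can run early), can be sketched as follows -- a Python illustration with invented qual names, not the actual planner code:

```python
# Sketch of ordering quals by (effective security level, cost), where
# leakproof quals are treated as level 0 because they can safely run
# before any security barrier qual.
from dataclasses import dataclass

@dataclass
class Qual:
    name: str
    cost: float
    security_level: int
    leakproof: bool

def order_quals(quals):
    def key(q):
        level = 0 if q.leakproof else q.security_level
        return (level, q.cost)
    return [q.name for q in sorted(quals, key=key)]

quals = [
    Qual("leaky_filter", cost=1.0, security_level=1, leakproof=False),
    Qual("cheap_proof", cost=0.5, security_level=1, leakproof=True),
    Qual("rls_qual", cost=2.0, security_level=0, leakproof=False),
]
print(order_quals(quals))  # ['cheap_proof', 'rls_qual', 'leaky_filter']
```

Note how `cheap_proof` jumps ahead of the security barrier qual despite originating at level 1 -- exactly the behaviour Tom's later example shows cannot always be reconciled with a transitive comparison, which is why the thread goes on to discuss the limits of this scheme.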
Re: [HACKERS] multivariate statistics (v19)
On 4 October 2016 at 09:15, Heikki Linnakangas wrote:
> However, for tables and views, each object you store in those views is a
> "table" or "view", but with this thing, the object you store is
> "statistics". Would you have a catalog table called "pg_scissor"?
>

No, probably not (unless it was storing individual scissor blades).
However, in this case, we have related pre-existing catalog tables, so...

> We call the current system table "pg_statistic", though. I agree we should
> call it pg_mv_statistic, in singular, to follow the example of pg_statistic.
>
> Of course, the user-friendly system view on top of that is called
> "pg_stats", just to confuse things more :-).
>

I agree. Given where we are, with a pg_statistic table and a pg_stats
view, I think the least worst solution is to have a pg_statistic_ext
table, and then maybe a pg_stats_ext view.

>> It doesn't seem like the end of the world that it doesn't
>> match the user-facing syntax. A bigger concern is the use of "mv" in
>> the name, because as has already been pointed out, this table may also
>> in the future be used to store univariate expression and partial
>> statistics, so I think we should drop the "mv" and go with something
>> like pg_statistic_ext, or some other more general name.
>
> Also, "mv" makes me think of materialized views, which is completely
> unrelated to this.
>

Yeah, I hadn't thought of that.

Regards,
Dean
Re: [HACKERS] multivariate statistics (v19)
On 30 September 2016 at 12:10, Heikki Linnakangas wrote:
> I fear that using "statistics" as the name of the new object might get a bit
> awkward. "statistics" is a plural, but we use it as the name of a single
> object, like "pants" or "scissors". Not sure I have any better ideas though.
> "estimator"? "statistics collection"? Or perhaps it should be singular,
> "statistic". I note that you actually called the system table
> "pg_mv_statistic", in singular.
>

I think it's OK. The functional dependency is a single statistic, but
MCV lists and histograms are multiple statistics (multiple facts about
the data sampled), so in general when you create one of these new
objects, you are creating multiple statistics about the data. Also I
find "CREATE STATISTIC" just sounds a bit clumsy compared to "CREATE
STATISTICS".

The convention for naming system catalogs seems to be to use the
singular for tables and plural for views, so I guess we should stick
with that. It doesn't seem like the end of the world that it doesn't
match the user-facing syntax. A bigger concern is the use of "mv" in
the name, because as has already been pointed out, this table may also
in the future be used to store univariate expression and partial
statistics, so I think we should drop the "mv" and go with something
like pg_statistic_ext, or some other more general name.

Regards,
Dean
Re: [HACKERS] multivariate statistics (v19)
On 4 October 2016 at 04:25, Michael Paquier wrote:
> OK. A second thing was related to the use of schemas in the new system
> catalogs. As mentioned in [1], those could be removed.
> [1]:
> https://www.postgresql.org/message-id/cab7npqtu40q5_nsghvomjfbyh1hdtqmbfdj+kwfjspam35b...@mail.gmail.com
>

That doesn't work, because if the intention is to be able to one day
support statistics across multiple tables, you can't assume that the
statistics are in the same schema as the table.

In fact, if multi-table statistics are to be allowed in the future, I
think you want to move away from thinking of statistics as depending
on and referring to a single table, and handle them more like views --
i.e., store a pg_node_tree representing the from_clause and add
multiple dependencies at statistics creation time. That was what I was
getting at upthread when I suggested the alternate syntax, and it also
answers Tomas' question about how JOIN might one day be supported.

Of course, if we don't think that we will ever support multi-table
statistics, that all goes away, and you may as well make the
statistics name local to the table, but I think that's a bit limiting.
One way or the other, I think this is a question that needs to be
answered now. My vote is to leave expansion room to support
multi-table statistics in the future.

Regards,
Dean
Re: [HACKERS] multivariate statistics (v19)
On 3 August 2016 at 02:58, Tomas Vondra wrote:
> Attached is v19 of the "multivariate stats" patch series

Hi,

I started looking at this -- just at a very high level -- I've not
read much of the detail yet, but here are some initial review
comments.

I think the overall infrastructure approach for CREATE STATISTICS
makes sense, and I agree with other suggestions upthread that it would
be useful to be able to build statistics on arbitrary expressions.
Although that doesn't need to be part of this patch, it's useful to
keep it in mind as a possible future extension of this initial design.

I can imagine it being useful to be able to create user-defined
statistics on an arbitrary list of expressions, and I think that would
include univariate as well as multivariate statistics. Perhaps that's
something to take into account in the naming of things, e.g., as David
Rowley suggested, something like pg_statistic_ext, rather than
pg_mv_statistic.

I also like the idea that this might one day be extended to support
statistics across multiple tables, although I think that might be
challenging to achieve -- you'd need a method of taking a random
sample of rows from a join between 2 or more tables. However, if the
intention is to be able to support that one day, I think that needs to
be accounted for in the syntax now -- specifically, I think it will be
too limiting to only support things extending the current syntax of
the form table1(col1, col2, ...), table2(col1, col2, ...), because
that precludes building statistics on an expression referring to
columns from more than one table.

So I think we should plan further ahead and use a syntax giving
greater flexibility in the future, for example something structured
more like a query (like CREATE VIEW):

CREATE STATISTICS name
  [ WITH (options) ]
  ON expression [, ...]
  FROM table [, ...]
  WHERE condition

where the first version of the patch would only support expressions
that are simple column references, and would require at least 2 such
columns from a single table with no WHERE clause, i.e.:

CREATE STATISTICS name
  [ WITH (options) ]
  ON column1, column2 [, ...]
  FROM table

For multi-table statistics, a WHERE clause would typically be needed
to specify how the tables are expected to be joined, but potentially
such a clause might also be useful in single-table statistics, to
build partial statistics on a commonly queried subset of the table,
just like a partial index.

Of course, I'm not suggesting that the current patch do any of that --
it's big enough as it is. I'm just throwing out possible future
directions this might go in, so that we don't get painted into a
corner when designing the syntax for the current patch.

Regarding the statistics themselves, I read the description of soft
functional dependencies, and I'm somewhat skeptical about that
algorithm. I don't like the arbitrary thresholds or the sudden jump
from independence to dependence and clause reduction. As others have
said, I think this should account for a continuous spectrum of
dependence from fully independent to fully dependent, and combine
clause selectivities in a way based on the degree of dependence. For
example, if you computed an estimate for the fraction 'f' of the
table's rows for which a -> b, then it might be reasonable to combine
the selectivities using

P(a,b) = P(a) * (f + (1-f) * P(b))

Of course, having just a single number that tells you the columns are
correlated tells you nothing about whether the clauses on those
columns are consistent with that correlation. For example, in the
following table

CREATE TABLE t(a int, b int);
INSERT INTO t SELECT x/10, ((x/10)*789)%100 FROM generate_series(0,999) g(x);

'b' is functionally dependent on 'a' (and vice versa), but if you
query the rows with a<50 and with b<50, those clauses behave
essentially independently, because they're not consistent with the
functional dependence between 'a' and 'b', so the best way to combine
their selectivities is just to multiply them, as we currently do.

So whilst it may be interesting to determine that 'b' is functionally
dependent on 'a', it's not obvious whether that fact by itself should
be used in the selectivity estimates. Perhaps it should, on the
grounds that it's best to attempt to use all the available
information, but only if there are no more detailed statistics
available. In any case, knowing that there is a correlation can be
used as an indicator that it may be worthwhile to build more detailed
multivariate statistics like a MCV list or a histogram on those
columns.

Looking at the ndistinct coefficient 'q', I think it would be better
if the recorded statistic were just the estimate for
ndistinct(a,b,...) rather than a ratio of ndistinct values. That's a
more fundamental statistic, and it's easier to document and easier to
interpret. Also, I don't believe that the coefficient 'q' is the right
number to use for clause estimation:

Looking at
Re: [HACKERS] CREATE POLICY bug ?
[Please reply to the list, not just to me, so that others can benefit from and contribute to the discussion]

On 31 August 2016 at 11:52, Andrea Adami wrote:
> Thnaks Dean, i did further investigations:
> i set the owner of the view to: "mana...@scuola247.it" with:
> ALTER TABLE public.policy_view OWNER TO "mana...@scuola247.it";
> and i thinking to see from the select:
> select * from policy_view
> the rows: 1,2,3
> then
> set role 'mana...@scuola247.it';
> select * from policy_view;
> return rows 1,2,3 as expected but:
> set role 'teac...@scuola247.it';
> select * from policy_view;
> returns rows 4,5 and
> set role 'postgres'
> select * from policy_view
> return nothing ...
> what you thinking about ?
>
> Andrea

That's correct. With the table owned by postgres and the view owned by "mana...@scuola247.it", access to the table via the view is subject to the policies that apply to "mana...@scuola247.it". So regardless of who the current user is, when selecting from the view, the policy "standard" will be applied, and that will limit the visible rows to those for which usr = current_user.

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RLS related docs
On 28 August 2016 at 21:23, Joe Conway wrote:
> Apologies for the delay, but new patch attached. Assuming no more
> comments, will commit this, backpatched to 9.5, in a day or two.
>

Looking at this again, I think there is something fishy about these dump/restore flags. If you do pg_dump --enable-row-security, then row_security is turned on during the dump and only the user-visible portions of the tables are dumped. But why does such a dump emit "SET row_security = on;" as part of the dump? There doesn't appear to be any reason for having row_security turned on during the restore just because it was on during the dump. The INSERT policies may well be different from the SELECT policies, and so this may lead to a dump that cannot be restored. ISTM that row_security should be off inside the dump, and only enabled during restore if the user explicitly asks for it, regardless of what setting was used to produce the dump.

Also, isn't it the case that --enable-row-security during pg_restore is only relevant when performing a data-only restore (like --disable-triggers)? Otherwise, it looks to me as though the restore will create the tables, restore the data, and then only at the end restore the table policies and enable row level security on the tables. So it looks like the flag would have no effect (and a COPY-format dump would work fine) for a non-data-only dump. I never really looked at the RLS dump/restore code. Am I missing something?

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE POLICY bug ?
On 20 August 2016 at 03:15, Andrea Adami wrote:
> when i run the query: "select * from public.policy_view"
> the ouput is the same (all rows) for all users
> i'm doing some mistakes or this is a bug ?
>

No, it looks correct to me. When going through a view, the policies and permission checks that apply are those that would apply to the view's owner, which in this case is postgres, so no policies are applied. Or, quoting from the notes in the CREATE POLICY documentation:

As with normal queries and views, permission checks and policies for the tables which are referenced by a view will use the view owner's rights and any policies which apply to the view owner.

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bogus ANALYZE results for an otherwise-unique column with many nulls
On 5 August 2016 at 21:48, Tom Lane wrote:
> OK, thanks. What shall we do about Andreas' request to back-patch this?
> I'm personally willing to do it, but there is the old bugaboo of "maybe
> it will destabilize a plan that someone is happy with".
>

My inclination would be to back-patch it because arguably it's a bug-fix -- at the very least the old behaviour didn't match the docs for stadistinct:

The number of distinct nonnull data values in the column. A value greater than zero is the actual number of distinct values. A value less than zero is the negative of a multiplier for the number of rows in the table; for example, a column in which values appear about twice on the average could be represented by stadistinct = -0.5.

Additionally, I think that example is misleading because it's only really true if there are no null values in the column. Perhaps it would help to have a more explicit example to illustrate how nulls affect stadistinct, for example:

... for example, a column in which about 80% of the values are nonnull and each nonnull value appears about twice on average could be represented by stadistinct = -0.4.

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
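The arithmetic behind the proposed -0.4 wording can be sketched as follows (a hypothetical helper for illustration, not PostgreSQL code):

```python
def scaled_stadistinct(n_rows, null_frac, avg_dups):
    """Return the negative (scaled) stadistinct for a column where a fraction
    null_frac of the rows are NULL and each nonnull value appears avg_dups
    times on average: distinct = nonnull_rows / avg_dups, and a negative
    stadistinct stores -(distinct / total_rows)."""
    nonnull_rows = n_rows * (1.0 - null_frac)
    n_distinct = nonnull_rows / avg_dups
    return -n_distinct / n_rows

# No nulls, each value appearing about twice: the existing docs' -0.5 example.
print(scaled_stadistinct(1000, 0.0, 2.0))   # -0.5
# 80% nonnull, each nonnull value appearing about twice: -0.4, as proposed.
print(scaled_stadistinct(1000, 0.2, 2.0))
```

This makes explicit why the docs' -0.5 example silently assumes no nulls: the multiplier is scaled by the total row count, nulls included.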
Re: [HACKERS] Combining hash values
On 1 August 2016 at 08:19, Greg Stark wrote:
> Surely combining multiple hashes is the same problem as hashing a block of
> memory? Shouldn't we just use the same algorithm as hash_any()?
>

Yes, I imagine that should work well. I suspect that Thomas's proposal would also probably work well, as would a number of other hashing algorithms with reasonable pedigree, such as the one used for array hashing. I don't have any particular preference, but I do know that what usually turns out to be disastrous is an arbitrary made-up formula like rotating and xor'ing. The last thing we should attempt to do is invent our own hashing algorithms.

On that subject, while looking at hashfunc.c, I spotted that hashint8() has a very obvious deficiency, which causes disastrous performance with certain inputs:

create table foo (a bigint);
insert into foo select (x*2^32)+x from generate_series(1,100) g(x);
analyse foo;
select * from foo f1, foo f2 where f1.a=f2.a;

With random values that query (using a hash join) takes around 2 seconds on my machine, but with the values above I estimate it will take 20 hours (although I didn't wait that long!). The reason is pretty obvious -- all the bigint values above hash to the same value. I'd suggest using hash_uint32() for values that fit in a 32-bit integer and hash_any() otherwise.

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
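The collision is visible without running PostgreSQL at all: hashint8() folds the two 32-bit halves of the value together with XOR before hashing, so every value of the form (x*2^32)+x folds to the same input. A Python sketch of just that folding step (the subsequent hash_uint32() call is omitted; the function name is mine):

```python
def hashint8_fold(val):
    """Sketch of the folding step in hashfunc.c's hashint8(): XOR the low
    32-bit half with the high half (complemented for negative values) before
    the result is fed to hash_uint32()."""
    lo = val & 0xFFFFFFFF
    hi = (val >> 32) & 0xFFFFFFFF
    return lo ^ (hi if val >= 0 else ~hi & 0xFFFFFFFF)

# Mirroring the SELECT above: every (x*2^32)+x value folds to 0, so they all
# hash identically and the hash join degenerates to quadratic behaviour.
folded = {hashint8_fold((x << 32) + x) for x in range(1, 101)}
print(folded)   # {0}
```

With all keys in one hash bucket, the join does O(n^2) comparisons, which is where the pathological runtime comes from.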
Re: [HACKERS] Optimizing numeric SUM() aggregate
On 27 July 2016 at 10:17, Andrew Borodin wrote:
>> if (accum->maxdig > (INT_MAX - INT_MAX / NBASE) / (NBASE - 1))
> Worth noting that (INT_MAX - INT_MAX / NBASE) / (NBASE - 1) == INT_MAX
> / NBASE for any NBASE > 1

Interesting. I think it's clearer the way it is in mul_var() though, because the intention is to avoid letting digits exceed INT_MAX - INT_MAX/NBASE, so that there is no danger of overflow in the carry propagation step. The long form makes that clearer (and is just as efficient, since these expressions are constants that will be folded at compile time).

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
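The quoted identity is easy to sanity-check at PostgreSQL's actual base, NBASE = 10000, with a 32-bit INT_MAX (plain Python integer division standing in for C's truncating division):

```python
INT_MAX = 2**31 - 1
NBASE = 10000          # PostgreSQL's numeric base

# The long form used in mul_var()'s carry-propagation guard ...
long_form = (INT_MAX - INT_MAX // NBASE) // (NBASE - 1)
# ... and the shortened form from the quoted observation.
short_form = INT_MAX // NBASE

print(long_form, short_form)   # 214748 214748
```

So roughly 214748 digit additions can be accumulated between carry-propagation passes at this base.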
Re: [HACKERS] Optimizing numeric SUM() aggregate
On 27 July 2016 at 09:57, Dean Rasheed <dean.a.rash...@gmail.com> wrote:
> it could be
> coded using something like
>
> accum->maxdig += NBASE - 1;
> if (accum->maxdig > (INT_MAX - INT_MAX / NBASE) / (NBASE - 1))
> Propagate carries...
>

Oops, that's not quite right because maxdig actually represents the max possible value divided by NBASE-1 in mul_var(), so actually it ought to have been accum->maxdig++ in the first line.

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimizing numeric SUM() aggregate
On 27 July 2016 at 07:33, Andrew Borodin wrote:
>> I think we could do carry every 0x7FFFFFFF / 10000 accumulation, couldn't we?
>
> I feel that I have to elaborate a bit. Probably my calculations are wrong.
>
> Lets assume we already have accumulated INT_MAX of digits in
> previous-place accumulator. That's almost overflow, but that's not
> overflow. Carring that accumulator to currents gives us INT_MAX/10000
> carried sum.
> So in current-place accumulator we can accumulate: ( INT_MAX - INT_MAX
> / 10000 ) / 9999, where 9999 is max value dropped in current-place
> accumulator on each addition.
> That is INT_MAX * (9999/10000) / 9999, or simply INT_MAX / 10000.
>
> If we use unsigned 32-bit integer that is 429496. Which is 43 times
> less frequent carring.
>
> As a bonus, we get rid of the const in the code (:
>
> Please correct me if I'm wrong.
>

This is basically the same problem that has already been solved in mul_var() and other places in numeric.c, so in this case it could be coded using something like

accum->maxdig += NBASE - 1;
if (accum->maxdig > (INT_MAX - INT_MAX / NBASE) / (NBASE - 1))
    Propagate carries...

I agree that the new code should avoid explicitly referring to constants like 10000, and I don't think there is any reason for this new code to assume NBASE=10000.

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Request for Patch Feedback: Lag & Lead Window Functions Can Ignore Nulls
On 30 May 2016 at 15:44, Andrew Gierth <and...@tao11.riddles.org.uk> wrote: >>>>>> "Dean" == Dean Rasheed <dean.a.rash...@gmail.com> writes: > > Dean> That may be so, but we already support FILTER for all windows > Dean> functions as well as aggregates: > > Not so: > > "If FILTER is specified, then only the input rows for which the > filter_clause evaluates to true are fed to the window function; other > rows are discarded. Only window functions that are aggregates accept a > FILTER clause." > Ah, yes. It's all coming back to me now. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Request for Patch Feedback: Lag & Lead Window Functions Can Ignore Nulls
On 23 May 2016 at 17:01, Jeff Davis wrote:
> On Fri, May 20, 2016 at 1:41 PM, David G. Johnston wrote:
>> How does the relatively new FILTER clause play into this, if at all?
>
> My interpretation of the standard is that FILTER is not allowable for
> a window function, and IGNORE|RESPECT NULLS is not allowable for an
> ordinary aggregate.
>

That may be so, but we already support FILTER for all window functions as well as aggregates:

https://www.postgresql.org/docs/current/static/sql-expressions.html#SYNTAX-AGGREGATES
https://www.postgresql.org/docs/current/static/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS

so to be clear, what we're talking about here is just supporting SQL standard syntax for window functions, rather than adding any new functionality, right?

> So if we support IGNORE|RESPECT NULLS for anything other than a window
> function, we have to come up with our own semantics.
>

Given that we can already do this using FILTER for aggregates, and that IGNORE|RESPECT NULLS for aggregates is not part of the SQL standard, I see no reason to support it.

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RLS related docs
On 25 May 2016 at 02:04, Joe Conway wrote:
> Please see attached two proposed patches for the docs related to RLS:
>
> 1) Correction to pg_restore
> 2) Additional mentions that "COPY FROM" does not allow RLS to be enabled
>
> Comments?
>

The pg_restore change looks good -- that was clearly wrong. Also, +1 for the new note in pg_dump.

For COPY, I think perhaps it would be more logical to put the new note immediately after the third note which describes the privileges required, since it's kind of related, and then we can talk about the RLS policies required, e.g.:

If row-level security is enabled for the table, COPY table TO is internally converted to COPY (SELECT * FROM table) TO, and the relevant security policies are applied. Currently, COPY FROM is not supported for tables with row-level security.

> Related question: I believe
>
> COPY tbl TO ...
>
> is internally converted to
>
> COPY (select * FROM tbl) TO ...
>
> when RLS is involved. Do we want to document that?
>

I think so, yes, because that makes it clearer what policies will be applied.

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql :: support for \ev viewname and \sv viewname
On 4 May 2016 at 13:23, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> On 5/4/16 3:21 AM, Dean Rasheed wrote:
>> Well, appendStringLiteralAH() takes both, so that sets a precedent.
> Works for me. Could you supply an updated patch with a static function
> instead of a macro? Then I think this is good to go.
>
> (bonus points for some tests)
>

OK, pushed that way. I didn't get round to adding any tests though. It strikes me that the most likely bugs in this area are bugs of omission, like this and the missing PARALLEL SAFE recently fixed for functions. Ideally tests would be able to spot those kinds of issues, but it's not obvious how to write such tests.

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] More inaccurate results from numeric pow()
On 2 May 2016 at 18:38, Tom Lane wrote:
> I don't much care for the hardwired magic number here, especially since
> exp_var() does not have its limit expressed as "6000" but as
> "NUMERIC_MAX_RESULT_SCALE * 3". I think you should rephrase the limit
> to use that expression, and also add something like this in exp_var():

OK, that makes sense. Done that way.

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql :: support for \ev viewname and \sv viewname
On 4 May 2016 at 01:35, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote: > On 5/3/16 3:10 PM, Dean Rasheed wrote: >> On 3 May 2016 at 16:52, Peter Eisentraut >>> >>> I would change appendReloptionsArrayAH() to a function and keep AH as the >>> first argument (similar to other functions that take such a handle). >> >> I can understand changing it to a function, but I don't think AH >> should be the first argument. All other append*() functions that >> append to a buffer have the buffer as the first argument, including >> the appendStringLiteralAH() macro on which this is based. > > Well, all the functions that take archive handles have that as the first > argument, so how do we consolidate that? > Well, appendStringLiteralAH() takes both, so that sets a precedent. And I think that makes sense too. The functions that take an archive handle as their first argument are mostly functions whose primary concern is to operate on the archive in some way. All the append*() functions that take a buffer as their first argument are primarily concerned with operating on the buffer. I'd say appendStringLiteralAH() and appendReloptionsArrayAH() fall very much into that second category. They only take an archive handle to get the encoding and std_strings settings controlling *how* they operate on the buffer. The main purpose of those append*() functions is to append to a buffer, so it makes sense that that is their first argument. All the append*() functions are consistent in their argument ordering, including those that also take an archive handle, so I think appendReloptionsArrayAH() should follow that pattern. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql :: support for \ev viewname and \sv viewname
On 3 May 2016 at 16:52, Peter Eisentraut wrote:
> I would change appendReloptionsArrayAH() to a function and keep AH as the
> first argument (similar to other functions that take such a handle).

I can understand changing it to a function, but I don't think AH should be the first argument. All other append*() functions that append to a buffer have the buffer as the first argument, including the appendStringLiteralAH() macro on which this is based.

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] More inaccurate results from numeric pow()
On 2 May 2016 at 19:40, Robert Haas <robertmh...@gmail.com> wrote:
> On Mon, May 2, 2016 at 1:02 PM, Dean Rasheed <dean.a.rash...@gmail.com> wrote:
>> Doing some more testing of the numeric code patched in [1] I noticed
>> another case where the result is inaccurate -- computing 0.12 ^
>> -2345.6 gives a very large number containing 2162 digits, but only the
>> first 2006 correct, while the last 156 digits are wrong.
>
> Just out of curiosity, how can you tell? Where do you get alternate
> output to compare against?
>

The easiest way is to use bc, although it's pretty slow for this kind of thing.

> Also, I wonder what we think the contract with the user is in cases
> like this.

My expectation is that the numeric computations should generally produce results that are correct in all, or nearly all, digits returned. This is commonly expressed in terms of ULPs -- https://en.wikipedia.org/wiki/Unit_in_the_last_place. An error of a few ULPs may be OK, but I don't think hundreds of ULPs is OK.

Actually I think this particular issue is mostly of academic interest, but we went to some effort to get accurate results in the previous patch, and this is just closing a loophole to hopefully complete that work.

> Surely, if we were dealing with floating point numbers, nobody would
> expect a calculation like this to be accurate beyond the first n digits,
> where n is surely much less than 2006. I like the fact that numeric has
> a lot more precision than any built-in floating point type, but does it
> have to get every digit in front of the decimal point exactly right no
> matter how many there are?
>

I would say it should come close. Otherwise we're just returning a lot of noise. Note that there is a limit to how many digits we will ever return, so this is manageable.
> rhaas=# select tan(pi()::numeric/2), tan(pi()/2);
>        tan       |         tan
> ----------------+----------------------
>  618986325617924 | 1.63312393531954e+16
> (1 row)
>

That doesn't prove anything, since we haven't implemented tan(numeric) (and I don't plan to), so the above is actually

select tan((pi()::numeric/2)::float8), tan(pi()/2);

and it's not surprising that the 2 results are wildly different (and infinitely far away from the correct answer). It's using tan(float8) in both cases, just with slightly different float8 inputs. There's not really any reason to expect particularly accurate results from float8 functions in cases like this.

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
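The float8 side of this is reproducible in any IEEE-double environment: the argument passed to tan() is the double closest to pi/2, not pi/2 itself, so the result is a huge finite number, and nudging the input by a single ulp changes the answer completely, even flipping its sign (Python doubles behave like float8 here; math.nextafter needs Python 3.9+):

```python
import math

# math.pi/2 lands slightly below the true pi/2, so tan() is huge and positive.
t1 = math.tan(math.pi / 2)

# One ulp later the argument lands on the other side of pi/2, and tan()
# swings to a huge negative value -- neither figure carries real information.
t2 = math.tan(math.nextafter(math.pi, 4.0) / 2)

print(t1, t2)
```

This is why comparing the two psql outputs above tells you nothing about either implementation's accuracy.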
[HACKERS] More inaccurate results from numeric pow()
Doing some more testing of the numeric code patched in [1] I noticed another case where the result is inaccurate -- computing 0.12 ^ -2345.6 gives a very large number containing 2162 digits, but only the first 2006 are correct, while the last 156 digits are wrong.

The reason is this code in power_var():

/* limit to something that won't cause integer overflow */
val = Max(val, -NUMERIC_MAX_RESULT_SCALE);
val = Min(val, NUMERIC_MAX_RESULT_SCALE);

where "val" is the approximate decimal result weight. Here NUMERIC_MAX_RESULT_SCALE is 2000, so it's clamping the estimated result weight to 2000, and therefore reducing the rscale in the subsequent calculations, causing the loss of precision at around 2000 digits.

In fact it's possible to predict exactly how large we need to allow "val" to become, since the final result is computed using exp_var(), which accepts inputs up to 6000, so the result weight "val" can be up to around log10(exp(6000)) ~= 2606 before the final result causes an overflow.

The obvious fix would be to modify the clamping limits. I think a better answer though is to replace the clamping code with an overflow test, immediately throwing an error if "val" is outside the allowed range, per the attached patch. This has the advantage that it avoids some expensive computations in the case where the result will end up overflowing, but more importantly it means that power_var() isn't so critically dependent on the limits of exp_var() -- if someone in the future increased the limits of exp_var() without touching power_var(), and power_var() clamped to the old range, the problem would resurface. By doing an overflow test in power_var() instead of clamping "val", it will either compute an accurate result or throw an overflow error early on. There should be no possibility of it returning an inaccurate result.
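For reference, the 2606 figure falls straight out of that bound, since the result weight is the base-10 logarithm of the largest value exp_var() can produce:

```python
import math

# exp_var() accepts inputs up to 6000, so the largest decimal result weight
# power_var() can produce without overflowing is log10(exp(6000)).
max_weight = 6000 / math.log(10)
print(max_weight)   # ~2605.77
```

Any guard placed a little above this value rejects only results that were certain to overflow anyway.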
Regards,
Dean

[1] http://www.postgresql.org/message-id/CAEZATCV7w+8iB=07dj8q0zihxqt1semcqutek+4_rogc_zq...@mail.gmail.com

diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 3ba373a..b4a2829 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -8008,9 +8008,16 @@ power_var(NumericVar *base, NumericVar *
 	val *= 0.434294481903252;	/* approximate decimal result weight */

-	/* limit to something that won't cause integer overflow */
-	val = Max(val, -NUMERIC_MAX_RESULT_SCALE);
-	val = Min(val, NUMERIC_MAX_RESULT_SCALE);
+	/*
+	 * Apply a crude overflow test so we can exit early if the result is sure
+	 * to overflow.  exp_var() supports inputs up to 6000, so we know that the
+	 * result will overflow if the approximate decimal result weight exceeds
+	 * log10(exp(6000)) ~= 2610.
+	 */
+	if (Abs(val) > 2610)
+		ereport(ERROR,
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+				 errmsg("value overflows numeric format")));

 	/* choose the result scale */
 	rscale = NUMERIC_MIN_SIG_DIGITS - (int) val;
diff --git a/src/test/regress/expected/numeric_big.out b/src/test/regress/expected/numeric_big.out
index 1b3608a..8bb09c9 100644
--- a/src/test/regress/expected/numeric_big.out
+++ b/src/test/regress/expected/numeric_big.out
@@ -1075,6 +1075,20 @@ SELECT b, p, bc_result, b^p AS power, b^
  27.234987 | 20.230957 | 108142427112079606637962972621.121293 | 108142427112079606637962972621.121293 | 0.00
 (41 rows)

+-- Inputs close to overflow
+--
+-- bc(1) results computed with a scale of 2700 and truncated to 4 decimal
+-- places.
+WITH t(b, p, bc_result) AS (VALUES
+(0.12, -2829.8369,
Re: [HACKERS] psql :: support for \ev viewname and \sv viewname
On 27 April 2016 at 08:36, Dean Rasheed <dean.a.rash...@gmail.com> wrote: > Here is a rough patch based on the way pg_dump handles this. It still > needs a bit of polishing -- in particular I think fmtReloptionsArray() > (copied from pg_dump) should probably be moved to string_utils.c so > that it can be shared between pg_dump and psql. Also, I'm not sure > that's the best name for it -- I think appendReloptionsArray() is a > more accurate description of what is does. > Here are updated patches doing that --- the first moves fmtReloptionsArray() from pg_dump.c to fe_utils/string_utils.c so that it can be shared by pg_dump and psql, and renames it to appendReloptionsArray(). The second patch fixes the actual psql bug. Regards, Dean From a01c897bf2945f8ebef39bf3c58bb14e1739abf9 Mon Sep 17 00:00:00 2001 From: Dean Rasheed <dean.a.rash...@gmail.com> Date: Mon, 2 May 2016 11:27:15 +0100 Subject: [PATCH 1/2] Move and rename fmtReloptionsArray(). Move fmtReloptionsArray() from pg_dump.c to string_utils.c so that it is available to other frontend code. In particular psql's \ev and \sv commands need it to handle view reloptions. Also rename the function to appendReloptionsArray(), which is a more accurate description of what it does. --- src/bin/pg_dump/pg_backup.h | 7 src/bin/pg_dump/pg_dump.c | 80 +++-- src/fe_utils/string_utils.c | 72 + src/include/fe_utils/string_utils.h | 3 ++ 4 files changed, 88 insertions(+), 74 deletions(-) diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h index 83f6029..6b162a4 100644 --- a/src/bin/pg_dump/pg_backup.h +++ b/src/bin/pg_dump/pg_backup.h @@ -288,4 +288,11 @@ extern int archprintf(Archive *AH, const char *fmt,...) 
pg_attribute_printf(2, 3 #define appendStringLiteralAH(buf,str,AH) \ appendStringLiteral(buf, str, (AH)->encoding, (AH)->std_strings) +#define appendReloptionsArrayAH(buf,reloptions,prefix,AH) \ + do { \ + if (!appendReloptionsArray(buf, reloptions, prefix, \ + (AH)->encoding, (AH)->std_strings)) \ + write_msg(NULL, "WARNING: could not parse reloptions array\n"); \ + } while (0) + #endif /* PG_BACKUP_H */ diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index d3f5157..3005bf5 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -261,8 +261,6 @@ static void binary_upgrade_extension_member(PQExpBuffer upgrade_buffer, static const char *getAttrName(int attrnum, TableInfo *tblInfo); static const char *fmtCopyColumnList(const TableInfo *ti, PQExpBuffer buffer); static bool nonemptyReloptions(const char *reloptions); -static void fmtReloptionsArray(Archive *fout, PQExpBuffer buffer, - const char *reloptions, const char *prefix); static char *get_synchronized_snapshot(Archive *fout); static PGresult *ExecuteSqlQueryForSingleRow(Archive *fout, char *query); static void setupDumpWorker(Archive *AHX); @@ -15046,7 +15044,7 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo) if (nonemptyReloptions(tbinfo->reloptions)) { appendPQExpBufferStr(q, " WITH ("); - fmtReloptionsArray(fout, q, tbinfo->reloptions, ""); + appendReloptionsArrayAH(q, tbinfo->reloptions, "", fout); appendPQExpBufferChar(q, ')'); } result = createViewAsClause(fout, tbinfo); @@ -15301,13 +15299,14 @@ dumpTableSchema(Archive *fout, TableInfo *tbinfo) if (nonemptyReloptions(tbinfo->reloptions)) { addcomma = true; -fmtReloptionsArray(fout, q, tbinfo->reloptions, ""); +appendReloptionsArrayAH(q, tbinfo->reloptions, "", fout); } if (nonemptyReloptions(tbinfo->toast_reloptions)) { if (addcomma) appendPQExpBufferStr(q, ", "); -fmtReloptionsArray(fout, q, tbinfo->toast_reloptions, "toast."); +appendReloptionsArrayAH(q, tbinfo->toast_reloptions, "toast.", + fout); } 
appendPQExpBufferChar(q, ')'); } @@ -15908,7 +15907,7 @@ dumpConstraint(Archive *fout, ConstraintInfo *coninfo) if (nonemptyReloptions(indxinfo->indreloptions)) { appendPQExpBufferStr(q, " WITH ("); -fmtReloptionsArray(fout, q, indxinfo->indreloptions, ""); +appendReloptionsArrayAH(q, indxinfo->indreloptions, "", fout); appendPQExpBufferChar(q, ')'); } @@ -16809,7 +16808,7 @@ dumpRule(Archive *fout, RuleInfo *rinfo) { appendPQExpBuffer(cmd, "ALTER VIEW %s SET (", fmtId(tbinfo->dobj.name)); - fmtReloptionsArray(fout, cmd, rinfo->reloptions, ""); + appendReloptionsArrayAH(cmd, rinfo->reloptions, "", fout); appendPQExpBufferStr(cmd, ");\n"); } @@ -17704,73 +17703,6 @@ nonemptyReloptions(const char *reloptions) } /* - * Format a reloptions array and append it to the
Re: [HACKERS] psql :: support for \ev viewname and \sv viewname
[Looking back over old threads] On 22 July 2015 at 22:00, Dean Rasheed <dean.a.rash...@gmail.com> wrote: > This appears to be missing support for view options (WITH CHECK OPTION > and security_barrier), so editing a view with either of those options > will cause them to be stripped off. It seems like this issue was never addressed, and it needs to be fixed for 9.6. Here is a rough patch based on the way pg_dump handles this. It still needs a bit of polishing -- in particular I think fmtReloptionsArray() (copied from pg_dump) should probably be moved to string_utils.c so that it can be shared between pg_dump and psql. Also, I'm not sure that's the best name for it -- I think appendReloptionsArray() is a more accurate description of what is does. Regards, Dean diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c new file mode 100644 index 4fa7760..96bc64d --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -72,6 +72,7 @@ static bool lookup_object_oid(EditableOb Oid *obj_oid); static bool get_create_object_cmd(EditableObjectType obj_type, Oid oid, PQExpBuffer buf); +static bool fmtReloptionsArray(PQExpBuffer buffer, const char *reloptions); static int strip_lineno_from_objdesc(char *obj); static int count_lines_in_buf(PQExpBuffer buf); static void print_with_linenumbers(FILE *output, char *lines, @@ -3274,12 +3275,51 @@ get_create_object_cmd(EditableObjectType * CREATE for ourselves. We must fully qualify the view name to * ensure the right view gets replaced. Also, check relation kind * to be sure it's a view. + * + * Starting with 9.2, views may have reloptions (security_barrier) + * and from 9.4 onwards they may also have WITH [LOCAL|CASCADED] + * CHECK OPTION. These are not part of the view definition + * returned by pg_get_viewdef() and so need to be retrieved + * separately. Materialized views (introduced in 9.3) may have + * arbitrary storage parameter reloptions. 
*/ - printfPQExpBuffer(query, - "SELECT nspname, relname, relkind, pg_catalog.pg_get_viewdef(c.oid, true) FROM " - "pg_catalog.pg_class c LEFT JOIN pg_catalog.pg_namespace n " - "ON c.relnamespace = n.oid WHERE c.oid = %u", - oid); + if (pset.sversion >= 90400) + { +printfPQExpBuffer(query, + "SELECT nspname, relname, relkind, " + "pg_catalog.pg_get_viewdef(c.oid, true), " + "array_remove(array_remove(c.reloptions,'check_option=local'),'check_option=cascaded') AS reloptions, " + "CASE WHEN 'check_option=local' = ANY (c.reloptions) THEN 'LOCAL'::text " + "WHEN 'check_option=cascaded' = ANY (c.reloptions) THEN 'CASCADED'::text ELSE NULL END AS checkoption " + "FROM pg_catalog.pg_class c " + "LEFT JOIN pg_catalog.pg_namespace n " +"ON c.relnamespace = n.oid WHERE c.oid = %u", + oid); + } + else if (pset.sversion >= 90200) + { +printfPQExpBuffer(query, + "SELECT nspname, relname, relkind, " + "pg_catalog.pg_get_viewdef(c.oid, true), " + "c.reloptions AS reloptions, " + "NULL AS checkoption " + "FROM pg_catalog.pg_class c " + "LEFT JOIN pg_catalog.pg_namespace n " +"ON c.relnamespace = n.oid WHERE c.oid = %u", + oid); + } + else + { +printfPQExpBuffer(query, + "SELECT nspname, relname, relkind, " + "pg_catalog.pg_get_viewdef(c.oid, true), " + "NULL AS reloptions, " + "NULL AS checkoption " + "FROM pg_catalog.pg_class c " + "LEFT JOIN pg_catalog.pg_namespace n " +"ON c.relnamespace = n.oid WHERE c.oid = %u", + oid); + } break; } @@ -3304,6 +3344,8 @@ get_create_object_cmd(EditableObjectType char *relname = PQgetvalue(res, 0, 1); char *relkind = PQgetvalue(res, 0, 2); char *viewdef = PQgetvalue(res, 0, 3); + char *reloptions = PQgetvalue(res, 0, 4); + char *checkoption = PQgetvalue(res, 0, 5); /* * If the backend ever supports CREATE OR REPLACE @@ -3328,11 +3370,30 @@ get_create_object_cmd(EditableObjectType break; } appendPQExpBuffer(buf, "%s.", fmtId(nspname)); - appendPQExpBuffer(buf, "%s AS\n", fmtId(relname)); - appendPQExpBufferStr(buf, viewdef); + 
appendPQExpBufferStr(buf, fmtId(relname)); + + /* reloptions, if not an empty array "{}" */ + if (reloptions != NULL && strlen(reloptions) > 2) + { + appendPQExpBufferStr(buf, "\n WITH ("); + if (!fmtReloptionsArray(buf, reloptions)) + { + psql_error("Could not parse relopti
Re: [HACKERS] [COMMITTERS] pgsql: Add trigonometric functions that work in degrees.
On 26 April 2016 at 16:26, Tom Lane wrote: > The next time somebody proposes that we can get exact results out of > floating-point arithmetic, I'm going to run away screaming. > Yeah, I think I will have the same reaction. Thanks for all your hard work getting this to work. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Suspicious behaviour on applying XLOG_HEAP2_VISIBLE.
On 26 April 2016 at 04:48, Andres Freund wrote: > No, I think we got to do this in all branches. I was just wondering > about how to fix vm_extend(). Which I do think we got to fix, even in > the back-branches. > I think replacing CacheInvalidateSmgr() with CacheInvalidateRelcache() in vm_extend() is probably the safer thing to do, and ought to be relatively harmless. It means that an index-only scan won't be notified of VM extension until the end of the other transaction, which might lead to extra heap fetches, but I think that's unlikely to have any performance impact because it ought to be a fairly rare event, and if it was another transaction adding tuples, they wouldn't be all visible before it was committed anyway, so the extra heap fetches would be required in any case. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add trigonometric functions that work in degrees.
On 26 April 2016 at 04:25, Tom Lane wrote: > In short, these tests suggest that we need a coding pattern like > this: > > volatile float8 asin_x = asin(x); > return (asin_x / asin_0_5) * 30.0; > > We could drop the "volatile" by adopting -ffloat-store, but that > doesn't seem like a better answer, because -ffloat-store would > pessimize a lot of code elsewhere. (It causes a whole lot of > changes in float.c, for sure.) Agreed. That looks like the least hacky way of solving the problem. I think it's more readable when the logic is kept local, and it's preferable to avoid any compiler-specific options or global flags that would affect other code. 6b1a213bbd6599228b2b67f7552ff7cc378797bf by itself looks like a worthwhile improvement that ought to be more robust, even though it didn't fix this problem. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add trigonometric functions that work in degrees.
On 19 April 2016 at 14:38, Tom Lane wrote: > Yeah, what I was thinking of printing is something like > > asind(x), > asind(x) IN (-90,-30,0,30,90) AS asind_exact, > ... > > with extra_float_digits=3. The point of this is not necessarily to give > any extra information, though it might, but for failures to be more easily > interpretable. If I'd forgotten how the test worked just a few months > after committing it, how likely is it that some random user faced with a > similar failure would understand what they were seeing? > > Also, though I agree that it might not help much to know whether the > output is 45.0001 or 44., our thoughts would > be trending in quite a different direction if it turns out that the > output is radically wrong, or even a NaN. The existing test cannot > exclude that possibility. > OK, that sounds like it would be a useful improvement to the tests. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add trigonometric functions that work in degrees.
On 19 April 2016 at 05:16, Noah Misch wrote: > On Mon, Apr 18, 2016 at 11:56:55PM -0400, Tom Lane wrote: >> Noah Misch writes: >> > On Mon, Apr 18, 2016 at 10:22:59PM -0400, Tom Lane wrote: >> >> We could alternatively set extra_float_digits to its max value and hope >> >> that off-by-one-in-the-last-place values would get printed as something >> >> visibly different from the exact result. I'm not sure I want to trust >> >> that that works reliably; but maybe it would be worth printing the >> >> result both ways, just to provide additional info when there's a failure. >> >> > We'd have an independent problem if extra_float_digits=3 prints the same >> > digits for distinguishable float values, so I wouldn't mind relying on it >> > not >> > to do that. But can we expect the extra_float_digits=3 representation of >> > those particular values to be the same for every implementation? >> >> Hm? The expected answer is exact (30, 45, or whatever) in each case. >> If we get some residual low-order digits then it's a failure, so we don't >> need to worry about whether it's the same failure everywhere. > > Does something forbid snprintf implementations from printing '45'::float8 as > 45.0001 under extra_float_digits=3? I'm not sure it's really worth having the test output something like 45.0001 since that extra detail doesn't really seem particularly useful beyond the fact that the result wasn't exactly 45. Also you'd have to be careful how you modified the test, since it's possible that 45.0001 might be printed as '45' even under extra_float_digits=3 and so there'd be a risk of the test passing when it ought to fail, unless you also printed something else out to indicate exactness. Thinking about the actual failure in this case (asind(0.5) not returning exactly 30) a couple of theories spring to mind.
One is that the compiler is being more aggressive over function inlining, so init_degree_constants() is being inlined, and then asin_0_5 is being evaluated at compile time rather than runtime, giving a slightly different result, as happened in the original version of this patch. Another theory is that the compiler is performing an unsafe ordering rearrangement and transforming (asin(x) / asin_0_5) * 30.0 into asin(x) * (30.0 / asin_0_5). This might happen for example under something like -funsafe-math-optimizations. I did a search for other people encountering similar problems and I came across the following quite interesting example in the Gnu Scientific Library's implementation of log1p() -- https://github.com/ampl/gsl/blob/master/sys/log1p.c. The reason the code is now written in that way is in response to this bug: https://lists.gnu.org/archive/html/bug-gsl/2007-07/msg8.html with an overly aggressive compiler. So using that technique, we might try using a volatile local variable to enforce the desired evaluation order, e.g.: volatile double tmp; tmp = asin(x) / asin_0_5; return tmp * 30.0; A similar trick could be used to protect against init_degree_constants() being inlined, by writing it as volatile double one_half = 0.5; asin_0_5 = asin(one_half); since then the compiler wouldn't be able to assume that one_half was still equal to 0.5, and wouldn't be able to optimise away the runtime evaluation of asin() in favour of a compile-time constant. These are both somewhat unconventional uses of volatile, but I think they stand a better chance of being more portable, and also more future-proof against compilers that might in the future make more advanced code inlining and rearrangement choices. Of course this is all pure speculation at this stage, but it seems like it's worth a try. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
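For what it's worth, the reason the ratio-based approach can give exact answers at the special points — provided the stated evaluation order is actually what gets executed — can be sketched outside of C (Python here, which has no compile-time constant folding to defend against; the function and variable names are illustrative, not PostgreSQL's):

```python
import math

def asind(x):
    """Arc-sine in degrees, scaled so the special input x = 0.5 gives
    an exact answer: asin(x) / asin_0_5 is then exactly 1.0, so the
    result is exactly 30.0."""
    asin_0_5 = math.asin(0.5)  # computed at run time, like init_degree_constants()
    return (math.asin(x) / asin_0_5) * 30.0

# asind(0.5) == 30.0 exactly, by construction.  If a compiler instead
# rewrites the expression as asin(x) * (30.0 / asin_0_5), or folds
# asin(0.5) into a compile-time constant with different rounding, that
# exactness is lost -- which is what the volatile tricks above aim to
# prevent in the C code.
```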
Re: [HACKERS] improving GROUP BY estimation
On 31 March 2016 at 22:02, Tom Lanewrote: > I'm just concerned about what happens when the Dellera paper stops being > available. I don't mind including that URL as a backup to the written-out > argument I just suggested. > Here's an updated patch with references to both papers, and a more detailed justification for the formula, along with the other changes discussed. Note that although equation (2) in the Dell'Era paper looks different from the Yao formula, it's actually the same. Regards, Dean diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c new file mode 100644 index a6555e9..99f5f7c --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -3439,9 +3439,51 @@ estimate_num_groups(PlannerInfo *root, L reldistinct = clamp; /* - * Multiply by restriction selectivity. + * Update the estimate based on the restriction selectivity, + * guarding against division by zero when reldistinct is zero. + * Also skip this if we know that we are returning all rows. */ - reldistinct *= rel->rows / rel->tuples; + if (reldistinct > 0 && rel->rows < rel->tuples) + { +/* + * Given a table containing N rows with n distinct values in a + * uniform distribution, if we select p rows at random then + * the expected number of distinct values selected is + * + * n * (1 - product((N-N/n-i)/(N-i), i=0..p-1)) + * + * = n * (1 - (N-N/n)! / (N-N/n-p)! * (N-p)! / N!) + * + * See "Approximating block accesses in database + * organizations", S. B. Yao, Communications of the ACM, + * Volume 20 Issue 4, April 1977 Pages 260-261. + * + * Alternatively, re-arranging the terms from the factorials, + * this may be written as + * + * n * (1 - product((N-p-i)/(N-i), i=0..N/n-1)) + * + * This form of the formula is more efficient to compute in + * the common case where p is larger than N/n. 
Additionally, + * as pointed out by Dell'Era, if i << N for all terms in the + * product, it can be approximated by + * + * n * (1 - ((N-p)/N)^(N/n)) + * + * See "Expected distinct values when selecting from a bag + * without replacement", Alberto Dell'Era, + * http://www.adellera.it/investigations/distinct_balls/. + * + * The condition i << N is equivalent to n >> 1, so this is a + * good approximation when the number of distinct values in + * the table is large. It turns out that this formula also + * works well even when n is small. + */ +reldistinct *= + (1 - pow((rel->tuples - rel->rows) / rel->tuples, + rel->tuples / reldistinct)); + } + reldistinct = clamp_row_est(reldistinct); /* * Update estimate of total distinct groups. diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out new file mode 100644 index de64ca7..0fc93d9 --- a/src/test/regress/expected/subselect.out +++ b/src/test/regress/expected/subselect.out @@ -807,27 +807,24 @@ select * from int4_tbl where explain (verbose, costs off) select * from int4_tbl o where (f1, f1) in (select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1); - QUERY PLAN --- - Hash Join + QUERY PLAN + + Hash Semi Join Output: o.f1 Hash Cond: (o.f1 = "ANY_subquery".f1) -> Seq Scan on public.int4_tbl o Output: o.f1 -> Hash Output: "ANY_subquery".f1, "ANY_subquery".g - -> HashAggregate + -> Subquery Scan on "ANY_subquery" Output: "ANY_subquery".f1, "ANY_subquery".g - Group Key: "ANY_subquery".f1, "ANY_subquery".g - -> Subquery Scan on "ANY_subquery" - Output: "ANY_subquery".f1, "ANY_subquery".g - Filter: ("ANY_subquery".f1 = "ANY_subquery".g) - -> HashAggregate - Output: i.f1, (generate_series(1, 2) / 10) - Group Key: i.f1 - -> Seq Scan on public.int4_tbl i - Output: i.f1 -(18 rows) + Filter: ("ANY_subquery".f1 = "ANY_subquery".g) + -> HashAggregate + Output: i.f1, (generate_series(1, 2) / 10) + Group Key: i.f1 + -> Seq Scan on public.int4_tbl i + Output: i.f1 +(15 rows) select * 
from int4_tbl o where (f1, f1) in (select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] improving GROUP BY estimation
On 31 March 2016 at 21:40, Tom Lane wrote: > Alvaro Herrera writes: >> Tom Lane wrote: >>> Another minor gripe is the use of a random URL as justification. This >>> code will still be around when that URL exists nowhere but the Wayback >>> Machine. Can't we find a more formal citation to use? > >> The article text refers to this 1977 S. B. Yao paper "Approximating >> block accesses in database organizations" which doesn't appear to be >> available online, except behind ACM's paywall at >> http://dl.acm.org/citation.cfm?id=359475 > > Well, a CACM citation is perfectly fine by my lights (especially one > that's that far back and therefore certainly patent-free ...) > > Let's use something like this: > > See "Approximating block accesses in database organizations", S. B. Yao, > Communications of the ACM, Volume 20 Issue 4, April 1977 Pages 260-261 > Sounds good. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] improving GROUP BY estimation
On 31 March 2016 at 20:18, Tom Lane wrote: > Also, I wonder if it'd be a good idea to provide a guard against division > by zero --- we know rel->tuples > 0 at this point, but I'm less sure that > reldistinct can't be zero. In the same vein, I'm worried about the first > argument of pow() being slightly negative due to roundoff error, leading > to a NaN result. Yeah, that makes sense. In fact, if we only apply the adjustment when reldistinct > 0 and rel->rows < rel->tuples, and rewrite the first argument to pow() as (rel->tuples - rel->rows) / rel->tuples, then it is guaranteed to be non-negative. If rel->rows >= rel->tuples (not sure if it can be greater), then we just want the original reldistinct. > Maybe we should also consider clamping the final reldistinct estimate to > an integer with clamp_row_est(). The existing code doesn't do that but > it seems like a good idea on general principles. OK, that seems sensible. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
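Putting those guards together, the updated computation would look something like the following (a Python sketch of the logic, not the selfuncs.c patch itself; clamp_row_est() is approximated here by rounding and clamping to at least 1):

```python
def adjusted_ndistinct(tuples, rows, reldistinct):
    """Estimate the number of distinct values among 'rows' selected from
    a table of 'tuples' rows containing 'reldistinct' distinct values.

    Guards discussed above: skip the adjustment when reldistinct is zero
    (avoiding division by zero) or when all rows are returned, and write
    the pow() base as (tuples - rows) / tuples so that roundoff error
    can never make it negative."""
    if reldistinct > 0 and rows < tuples:
        reldistinct *= 1.0 - ((tuples - rows) / tuples) ** (tuples / reldistinct)
    # rough stand-in for clamp_row_est(): round, and clamp to >= 1
    return max(1, round(reldistinct))
```

With rows >= tuples the original reldistinct comes back unchanged (modulo clamping), which is the desired behaviour when the restriction lets every row through.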
Re: [HACKERS] improving GROUP BY estimation
On 30 March 2016 at 14:03, Tomas Vondra wrote: > Attached is v4 of the patch Thanks, I think this is good to go, except that I think we need to use pow() rather than powl() because AIUI powl() is new to C99, and so won't necessarily be available on all supported platforms. I don't think we need worry about loss of precision, since that would only be an issue if rel->rows / rel->tuples were smaller than maybe 10^-14 or so, and it seems unlikely we'll get anywhere near that any time soon. I think this is a good, well thought-out change, so unless anyone objects I'll push it (probably this weekend). Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Add generate_series(date,date) and generate_series(date,date,integer)
On 16 March 2016 at 23:32, David Steele wrote: > On 3/10/16 1:24 PM, Corey Huinker wrote: > >> New patch for Alvaro's consideration. >> >> Very minor changes since the last time, the explanations below are >> literally longer than the changes: >> - Rebased, though I don't think any of the files had changed in the >> mean time >> - Removed infinity checks/errors and the test cases to match >> - Amended documentation to add 'days' after 'step' as suggested >> - Did not add a period as suggested, to remain consistent with other >> descriptions in the same sgml table >> - Altered test case and documentation of 7 day step example so that >> the generated dates do not land evenly on the end date. A reader >> might incorrectly infer that the end date must be in the result set, >> when it doesn't have to be. >> - Removed unnecessary indentation that existed purely due to >> following of other generate_series implementations > > > As far as I can see overall support is in favor of this patch although it is > not overwhelming by any means. > > I think in this case it comes down to a committer's judgement so I have > marked this "ready for committer" and passed the buck on to Álvaro. > So I was pretty much "meh" on this patch too, because I'm not convinced it actually saves much typing, if any. However, I now realise that it introduces a backwards-compatibility breakage. Today it is possible to type SELECT * FROM generate_series('01-01-2000'::date, '01-04-2000', '7 days'); and it works with just that one cast. However, this patch breaks that. Now I'm not saying that I have used the above construct. Probably not in fact, but maybe my work colleagues have. I honestly can't say, but the thought of having to grep through thousands of lines of application code to check isn't particularly appealing. So I'm afraid it's -1 from me.
Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] improving GROUP BY estimation
On 18 March 2016 at 00:37, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote: >> On Sun, 2016-03-13 at 15:24 +, Dean Rasheed wrote: >>> I think that a better formula to use would be >>> >>> reldistinct *= (1 - powl(1 - rel->rows / rel->tuples, rel->tuples / >>> reldistinct)) > > Attached is a v3 of the patch using this formula instead of the original > one. Interestingly, that apparently reduces the number of regression tests > that get broken to a single one. > Cool. > I'm not sure whether we need to provide a link to the PDF the formula comes > from - perhaps we should? > Probably a better URL to give is http://www.adellera.it/investigations/distinct_balls/ which has a link to the PDF version of the paper and also some supporting material. However, while that paper is in general very clear, I don't think it gives a very clear explanation of that particular formula, and it doesn't explain what it represents. It merely says that if "i" can be ignored "for some reason (e.g. i << Nr)", then that formula is an approximation to the exact "without replacement" formula, which is the subject of that paper. But actually, that formula is the exact formula for the expected number of distinct values when selecting *with replacement* from a collection where each of the distinct values occurs an equal number of times. So I think we should say that. Perhaps something along the lines of: /* * Update the estimate based on the restriction selectivity. * * This uses the formula for the expected number of distinct values * when selecting with replacement from a collection where each of * the distinct values occurs an equal number of times (a uniform * distribution of values). This is a very close approximation to * the more correct method of selecting without replacement when * the number of distinct values is quite large --- for example, * see http://www.adellera.it/investigations/distinct_balls/.
* Additionally, it also turns out to work very well even when the * number of distinct values is small. */ Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
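To see how close the "with replacement, equal frequencies" formula is to Yao's exact "without replacement" expectation, the two can be compared numerically (an illustrative sketch; the function names are mine, and n is assumed to divide N so that each value has exactly N/n copies):

```python
from math import prod

def yao_exact(N, n, p):
    """Exact expected number of distinct values when p of N rows are
    drawn without replacement and each of n distinct values occurs
    N/n times (Yao's formula, in product form)."""
    q = N // n  # copies of each value
    return n * (1.0 - prod((N - q - i) / (N - i) for i in range(p)))

def approx(N, n, p):
    """The approximation discussed above: treat each of the N/n copies
    of a value independently (the 'with replacement' simplification),
    so a value escapes the sample with probability (1 - p/N)**(N/n)."""
    return n * (1.0 - (1.0 - p / N) ** (N / n))

# The two agree very closely when n is large, e.g. for N=10000, n=100,
# p=1000 both give roughly 99.997 expected distinct values; and both
# return exactly n when every row is selected (p == N).
```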
Re: [HACKERS] improving GROUP BY estimation
On 4 March 2016 at 13:10, Tomas Vondra wrote: > The risk associated with over-estimation is that we may switch from > HashAggregate to GroupAggregate, and generally select paths better > suited for large number of rows. Not great, because the plan may not be > optimal and thus run much slower - but that usually only happens on > small tables, because on large tables we would probably end up using > those same paths anyway. > > With under-estimation, the main risks are much graver, as we may end up > using HashAggregate only to get killed by OOM. When we're lucky not to > hit that, in my experience this often leads to a cascade of NestedLoop > nodes processing orders of magnitude more tuples than expected. Awful. > > So I think the risk is asymmetric, and that's why I like the new > estimator more, because it tends to over-estimate, but in a very > reasonable way. > Agreed. Under-estimating is worse than over-estimating. - reldistinct *= rel->rows / rel->tuples; + reldistinct *= (1 - powl((reldistinct - 1) / reldistinct, rel->rows)); Looking at this, I agree that this new formula from [1] is generally better than the old one in most (but not all) cases. One problem with it is that it no longer takes into account rel->tuples, which means that when returning all the tuples from the table, it no longer gives an estimate equal to the total number of distinct values in the table. In fact it tends to underestimate when returning nearly all the rows from the table. The old formula is clearly not a good general-purpose estimate. However, there is one case where it does return an accurate estimate -- when all the values in the underlying table are distinct. In this case the new formula consistently produces underestimates, while the old formula works well.
For example: CREATE TABLE t AS SELECT (1 * random())::int a, i::int b FROM generate_series(1,100) s(i); ANALYSE t; EXPLAIN ANALYSE SELECT a FROM t GROUP BY a; So there are clearly around 1 million distinct values for "a", but the new formula gives an estimate of around 630,000. That's not a massive underestimate, but it's sufficiently obvious and visible that it would be good to do better if we can. I think that a better formula to use would be reldistinct *= (1 - powl(1 - rel->rows / rel->tuples, rel->tuples / reldistinct)) This comes from [2], which gives a general formula for selection without replacement, and then gives the above as an approximation to the uniform distribution case. It's not really made clear in that paper, but that approximation is actually the "with replacement" approximation, but starting from different initial assumptions to give a formula that has better boundary conditions and special-case behaviour, as described below. For the record, here's a quick analysis of how the 2 formulae come about: Assume there are: N rows in the table n distinct values (n <= N) p rows are chosen at random (p <= N) so the problem is to estimate how many distinct values there will be in the p rows chosen. Both approaches work by first estimating the probability that some particular value x is *not* chosen. [1] starts from the assumption that each row of the table has a probability of 1/n of being equal to x. So the probability that x is not the first value chosen is P(not x, 1) = 1 - 1/n Then, if the value chosen first is replaced, the probability that x is not the second value chosen is the same.
The "with replacement" approximation makes each choice an independent probability, so the overall probability that x is not in any of the p rows chosen is just the product of the p individual probabilities, which is just P(not x, p) = (1 - 1/n)^p Then of course the probability that x *is* chosen at least once in the p rows is P(x, p) = 1 - (1 - 1/n)^p Finally, since there are n possible values of x, and they're all equally likely in the table, the expected number of distinct values is D(p) = n * (1 - (1 - 1/n)^p) The flaw in this approach is that for large p the "with replacement" approximation gets worse and worse, and the probabilities P(x, p) systematically under-estimate the actual probability that x is chosen, which increases as more values not equal to x are chosen. By the time the last value is chosen P(x, p=N) should actually be 1. [2] starts from a different initial assumption (uniform distribution case) -- for any given value x, assume that the table contains N/n instances of x (ignoring the fact that that might not be an integer). Note that with this assumption there is guaranteed to be at least one instance of x, which is not the case with the above approach. Now consider the first instance of x in the table. If we choose p rows from the table, the probability that that first instance of x is not chosen is P(not x[1], p) = 1 - p / N Making the same "with replacement" simplification, the probability that the
Re: [HACKERS] custom function for converting human readable sizes to bytes
On 18 February 2016 at 10:05, Dean Rasheed <dean.a.rash...@gmail.com> wrote: > OK, I'll add a check for that. > Thanks for testing. > Pushed, with a catversion bump. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] custom function for converting human readable sizes to bytes
On 18 February 2016 at 02:00, Vitaly Burovoy wrote: >> + else >> + have_digits = false; > Does it worth to move it to the declaration and remove that "else" block? > + bool have_digits = false; OK, that's probably a bit neater. > Is it important to use: >> + ObjectIdGetDatum(InvalidOid), >> + Int32GetDatum(-1))); > instead of "0, -1"? Previously I left immediate constants because I > found the same thing in jsonb.c (rows 363, 774)... I think it's preferable to use the macros because they make it clearer what datatypes are being passed around. I think that's the more common style, but the JSONB code is not technically wrong either. >> + if (pg_strcasecmp(strptr, "bytes") == 0) >> + else if >> + else if >> + else if > It seems little ugly for me. In that case (not using values from GUC) > I'd create a static array for it and do a small cycle via it. Would it > better? That's a matter of personal preference. I prefer the values inline because then the values are closer to where they're being used, which for me makes it easier to follow (see e.g., convert_priv_string()). In guc.c they're in lookup tables because they're referred to from multiple places, and because it needs to switch between tables based on context, neither of which is true here. If there is a need to re-use these values elsewhere in the future it can be refactored, but right now that feels like an over-engineered solution for this problem. >> + NumericGetDatum(mul_num), > Two more space for indentation. pgindent pulled that back. > Also I've found that with allowing exponent there is a weird result > (we tried to avoid "type numeric" in the error message): > SELECT > pg_size_bytes('123e1000 > kB'); > ERROR: invalid input syntax for type numeric: > "123e1000" OK, I'll add a check for that. Thanks for testing. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
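The intended behaviour being reviewed here — a number, optional whitespace, then an optional case-insensitive unit with binary (power-of-2) multipliers — can be sketched outside of C like so (a hypothetical Python re-implementation for illustration only; the real function is C, uses numeric arithmetic, and rejects oversized exponents as discussed above):

```python
from decimal import Decimal

# Binary (power-of-2) multipliers, matching the unit set in the patch
UNITS = {'bytes': 1, 'kb': 1024, 'mb': 1024 ** 2,
         'gb': 1024 ** 3, 'tb': 1024 ** 4}

def size_bytes(text):
    """Parse a human-readable size such as '1.5 GB' into bytes."""
    s = text.strip()
    # scan the leading number: sign, digits, decimal point, exponent
    i = 0
    while i < len(s) and (s[i].isdigit() or s[i] in '+-.eE'):
        i += 1
    num = s[:i]
    unit = s[i:].strip().lower() or 'bytes'  # no unit means bytes
    if unit not in UNITS:
        raise ValueError('invalid size unit in %r' % text)
    return int(Decimal(num) * UNITS[unit])
```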
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On 17 February 2016 at 02:32, Jim Nasby <jim.na...@bluetreble.com> wrote: > On 2/11/16 4:21 AM, Dean Rasheed wrote: >> >> Thinking about this some more though, perhaps *sorting* the columns is >> the wrong way to be thinking about it. Perhaps a better approach would >> be to allow the columns to be *listed* (optionally, using a separate >> query). Something like the following (don't get too hung up on the >> syntax): >> >> SELECT name, >> to_char(date, 'Mon') AS month, >> sum(amount) AS amount >> FROM invoices >> GROUP BY 1,2 >> ORDER BY name >> \crosstabview cols = (select to_char(d, 'Mon') from >> generate_series('2000-01-01'::date, '2000-12-01', '1 month') d) > > > My concern with that is that often you don't know what the columns will be, > because you don't know what exact data the query will produce. So to use > this syntax you'd have to re-create a huge chunk of the original query. :( > Yeah, that's a reasonable concern. On the flip side, one of the advantages of the above syntax is that you have absolute control over the columns, whereas with the sort-based syntax you might find some columns missing (e.g., if there were no invoices in August) and that could lead to confusion parsing the results. I'm not totally opposed to specifying a column sort order in psql, and perhaps there's a way to support both 'cols' and 'col_order' options in psql, since there are different situations where one or the other might be more useful. What I am opposed to is specifying the row order in psql, because IMO that's something that should be done entirely in the SQL query. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [patch] Proposal for \crosstabview in psql
On 15 February 2016 at 14:08, Daniel Verite <dan...@manitou-mail.org> wrote: > Dean Rasheed wrote: > >> My biggest problem is with the sorting, for all the reasons discussed >> above. There is absolutely no reason for \crosstabview to be >> re-sorting rows -- they should just be left in the original query >> result order > > Normal top-down display: > > select v,to_char(d,'Mon') as m, c from v_data order by d; > > v | m | c > +-+- > v2 | Jan | bar > v1 | Apr | foo > v1 | Jul | baz > v0 | Jul | qux > > At this point, it seems to me that it's perfectly reasonable for our user > to expect the possibility of sorting additionally by "v" , without > changing the query and without changing the order of the horizontal > header: > > \crosstabview +v m c > > v | Jan | Apr | Jul > +-+-+- > v0 | | | qux > v1 | | foo | baz > v2 | bar | | > I don't find that example particularly compelling. If I want to sort the rows coming out of a query, my first thought is always going to be to add/adjust the query's ORDER BY clause, not use some weird +/- psql syntax. The crux of the problem here is that in a pivoted query resultset SQL can be used to control the order of the rows or the columns, but not both at the same time. IMO it is more natural to use SQL to control the order of the rows. The columns are the result of the psql pivoting, so it's reasonable to control them via psql options. A couple of other points to bear in mind: The number of columns is always going to be quite limited (at most 1600, and usually far less than that), whereas the number of rows could be arbitrarily large. So sorting the rows client-side in the way that you are could get very inefficient, whereas that's not such a problem for the columns. The column values are non-NULL, so they require a more limited set of sort options, whereas the rows could be anything, and people will want all the sort options to be available. 
Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
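To make the "rows in query order, columns pivoted client-side" position concrete, here is a minimal pivot sketch in Python (illustrative only — it mirrors the example data from the quoted message, not psql's implementation):

```python
def crosstab(rows):
    # Pivot (row_key, col_key, value) tuples in the spirit of the
    # \crosstabview discussion: both row order and column order follow
    # first appearance in the result set (i.e. the query's ORDER BY);
    # the client does no re-sorting of its own.
    row_keys, col_keys, cells = [], [], {}
    for r, c, v in rows:
        if r not in row_keys:
            row_keys.append(r)
        if c not in col_keys:
            col_keys.append(c)
        cells[(r, c)] = v
    return [[r] + [cells.get((r, c), '') for c in col_keys]
            for r in row_keys]

# Using the v_data example from the quoted message:
data = [('v2', 'Jan', 'bar'), ('v1', 'Apr', 'foo'),
        ('v1', 'Jul', 'baz'), ('v0', 'Jul', 'qux')]
# crosstab(data) keeps rows in the order v2, v1, v0 and columns in the
# order Jan, Apr, Jul, with empty cells where no value exists.
```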
Re: [HACKERS] custom function for converting human readable sizes to bytes
On 17 February 2016 at 00:39, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote: > On 2/16/16, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote: >> On 2/16/16, Dean Rasheed <dean.a.rash...@gmail.com> wrote: >>> Fixing that in parse_memory_unit() would be messy because it assumes a >>> base unit of kB, so it would require a negative multiplier, and >>> pg_size_bytes() would have to be taught to divide by the magnitude of >>> negative multipliers in the same way that guc.c does. > > Now parse_memory_unit returns -1024 for bytes as divider, constant > "bytes" has moved there. > Add new memory_units_bytes_hint which differs from an original > memory_units_int by "bytes" size unit. > Allow hintmsg be NULL and if so, skip setting dereferenced variable to > memory_units_bytes_hint. > I think that approach is getting more and more unwieldy, and it simply isn't worth the effort just to share a few values from the unit conversion table, especially given that the set of supported units differs between the two places. >>> ISTM that it would be far less code, and much simpler and more >>> readable to just parse the supported units directly in >>> pg_size_bytes(), rather than trying to share code with guc.c, when the >>> supported units are actually different and may well diverge further in >>> the future. >> I've gone with this approach and it is indeed far less code, and much simpler and easier to read. This will also make it easier to maintain/extend in the future. I've made a few minor tweaks to the docs, and added a note to make it clear that the units in these functions work in powers of 2 not 10. I also took the opportunity to tidy up the number scanning code somewhat (I was tempted to rip it out entirely, since it feels awfully close to duplicating the numeric code, but it's probably worth it for the better error message). Additionally, note that I replaced strcasecmp() with pg_strcasecmp(), since AIUI the former is not available on all supported platforms. 
Barring objections, and subject to some more testing, I intend to commit this version.

Regards,
Dean

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index f9eea76..60f117a
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17767,6 +17767,9 @@ postgres=# SELECT * FROM pg_xlogfile_nam
     pg_relation_size
+    pg_size_bytes
+
     pg_size_pretty
@@ -17838,6 +17841,15 @@ postgres=# SELECT * FROM pg_xlogfile_nam
+    pg_size_bytes(text)
+    bigint
+
+    Converts a size in human-readable format with size units into bytes
+
     pg_size_pretty(bigint)
     text
@@ -17968,11 +17980,27 @@ postgres=# SELECT * FROM pg_xlogfile_nam
     pg_size_pretty can be used to format the result of one of
-    the other functions in a human-readable way, using kB, MB, GB or TB as
-    appropriate.
+    the other functions in a human-readable way, using bytes, kB, MB, GB or TB
+    as appropriate.
+
+    pg_size_bytes can be used to get the size in bytes from a
+    string in human-readable format. The input may have units of bytes, kB,
+    MB, GB or TB, and is parsed case-insensitively. If no units are specified,
+    bytes are assumed.
+
+    The units kB, MB, GB and TB used by the functions
+    pg_size_pretty and pg_size_bytes are defined
+    using powers of 2 rather than powers of 10, so 1kB is 1024 bytes, 1MB is
+    1024^2 = 1048576 bytes, and so on.
+
     The functions above that operate on tables or indexes accept a
     regclass argument, which is simply the OID of the table or index in the
     pg_class system catalog.
You do not have to look up

diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
new file mode 100644
index 2084692..91260cd
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -700,6 +700,145 @@ pg_size_pretty_numeric(PG_FUNCTION_ARGS)
 }

 /*
+ * Convert a human-readable size to a size in bytes
+ */
+Datum
+pg_size_bytes(PG_FUNCTION_ARGS)
+{
+	text	   *arg = PG_GETARG_TEXT_PP(0);
+	char	   *str,
+			   *strptr,
+			   *endptr;
+	bool		have_digits;
+	char		saved_char;
+	Numeric		num;
+	int64		result;
+
+	str = text_to_cstring(arg);
+
+	/* Skip leading whitespace */
+	strptr = str;
+	while (isspace((unsigned char) *strptr))
+		strptr++;
+
+	/* Check that we have a valid number and determine where it ends */
+	endptr = strptr;
+
+	/* Part (1): sign */
+	if (*endptr == '-' || *endptr == '+')
+		endptr++;
+
+	/* Part (2): main digit string */
+	if (isdigit((unsigned char) *endptr))
+	{
+		have_digits = true;
+		do
+			endptr++;
+		while (isdigit((unsigned char) *endptr));
+	}
+	else
+		have_digits = false;
+
+	/*
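The parsing behaviour discussed in this thread (optional leading sign and digit string, optional trailing unit parsed case-insensitively, binary multipliers, bytes as the default) can be sketched in Python. This is a hypothetical re-implementation for illustration only, not the committed C code:

```python
# Hypothetical Python sketch of pg_size_bytes()-style parsing:
# scan an optional sign and digit string, then map the trailing
# unit (case-insensitively) through a binary (powers-of-2) table.
UNITS = {"bytes": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}

def size_bytes(text):
    """Parse a human-readable size such as '1.5 MB' into bytes."""
    s = text.strip()
    i = 0
    if i < len(s) and s[i] in "+-":                        # part (1): sign
        i += 1
    while i < len(s) and (s[i].isdigit() or s[i] == "."):  # part (2): digits
        i += 1
    number, unit = s[:i], s[i:].strip().lower()
    if not number.strip("+-"):
        raise ValueError(f'invalid size: "{text}"')
    unit = unit or "bytes"                                 # no unit means bytes
    if unit not in UNITS:
        raise ValueError(f'invalid size unit: "{unit}"')
    return int(round(float(number) * UNITS[unit]))

print(size_bytes("1kB"))     # 1024
print(size_bytes("1.5 MB"))  # 1572864
```

Note how "bytes" sits in the same lookup table as the other units here, which is exactly what the guc.c-sharing approach made awkward (guc.c assumes a base unit of kB).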
Re: [HACKERS] custom function for converting human readable sizes to bytes
On 16 February 2016 at 05:01, Pavel Stehule <pavel.steh...@gmail.com> wrote:
> 2016-02-15 10:16 GMT+01:00 Dean Rasheed <dean.a.rash...@gmail.com>:
>> Is there any reason not to make
>> pg_size_bytes() return numeric?
>
> This is a question. I have not a strong opinion about it. There are no any
> technical objection - the result will be +/- same. But you will enforce
> Numeric into outer expression evaluation.
>
> The result will not be used together with function pg_size_pretty, but much
> more with functions pg_relation_size, pg_relation_size, .. and these
> functions doesn't return Numeric. These functions returns bigint. Bigint is
> much more natural type for this purpose.
>
> Is there any use case for Numeric?

[Shrug] I don't really have a strong opinion about it either, but it seemed that maybe the function might have some wider uses beyond just comparing object sizes, and since it's already computing the numeric size, it might as well just return it.

Looking at the rest of the patch, I think there are other things that need tidying up -- the unit parsing code for one. This is going to some effort to reuse the memory_unit_conversion_table from guc.c, but the result is that it has added quite a bit of new code, and now the responsibility for parsing different units is handled by different functions in different files, which IMO is quite messy. Worse, the error message is wrong and misleading:

select pg_size_bytes('10 bytes'); -- OK

select pg_size_bytes('10 gallons');
ERROR:  invalid size: "10 gallons"
DETAIL:  Invalid size unit "gallons"
HINT:  Valid units for this parameter are "kB", "MB", "GB", and "TB".

which fails to mention that "bytes" is also a valid unit.

Fixing that in parse_memory_unit() would be messy because it assumes a base unit of kB, so it would require a negative multiplier, and pg_size_bytes() would have to be taught to divide by the magnitude of negative multipliers in the same way that guc.c does.
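The misleading-hint problem above goes away once the function owns its unit table: the hint can then simply enumerate exactly the units it accepts. A hypothetical Python sketch of that idea (not the actual C implementation):

```python
# Hypothetical sketch: keep the unit list local to the size parser,
# so the HINT message can enumerate every accepted unit -
# including "bytes", which the guc.c-derived hint omitted.
VALID_UNITS = ["bytes", "kB", "MB", "GB", "TB"]

def unit_hint():
    """Build a hint string listing all accepted units."""
    quoted = ['"%s"' % u for u in VALID_UNITS]
    return "Valid units for this parameter are %s, and %s." % (
        ", ".join(quoted[:-1]), quoted[-1])

print(unit_hint())
# Valid units for this parameter are "bytes", "kB", "MB", "GB", and "TB".
```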
ISTM that it would be far less code, and much simpler and more readable, to just parse the supported units directly in pg_size_bytes(), rather than trying to share code with guc.c, when the supported units are actually different and may well diverge further in the future.

I'll try to hack something up, and see what it looks like.

Regards,
Dean

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] custom function for converting human readable sizes to bytes
> On 12/02/16 10:19, Dean Rasheed wrote:
>> This seems like a reasonable first patch for me as a committer, so
>> I'll take it unless anyone else was planning to do so.

So looking at this, it seems that for the most part pg_size_bytes() will parse any output produced by pg_size_pretty(). The exception is that there are 2 versions of pg_size_pretty(), one that takes bigint and one that takes numeric, whereas pg_size_bytes() returns bigint, so it can't handle all inputs.

Is there any reason not to make pg_size_bytes() return numeric? It would still be compatible with the example use cases, but it would be a better inverse of both variants of pg_size_pretty() and would be more future-proof. It already works internally using numeric, so it's a trivial change to make now, but impossible to change in the future without introducing a new function with a different name, which is messy.

Thoughts?

Regards,
Dean
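The range issue being raised here can be illustrated with a small Python sketch (hypothetical, not PostgreSQL code): pg_size_pretty(numeric) accepts values beyond the bigint range, so parsing its output back into a 64-bit integer can overflow, whereas a numeric result cannot.

```python
INT64_MAX = 2**63 - 1  # PostgreSQL bigint range

def parse_tb(text, as_bigint=True):
    """Toy parser for '<n> TB' inputs, to demonstrate the range issue only."""
    number, unit = text.split()
    assert unit == "TB"
    result = int(number) * 1024**4  # TB -> bytes, binary units
    if as_bigint and result > INT64_MAX:
        raise OverflowError("bigint out of range")
    return result

# 10 million TB exceeds the ~8 EiB bigint limit, but a numeric-returning
# function could still represent it exactly:
print(parse_tb("10000000 TB", as_bigint=False))
```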
Re: [HACKERS] custom function for converting human readable sizes to bytes
On 12 February 2016 at 06:25, Pavel Stehule wrote:
> Thank you for review and other work

This seems like a reasonable first patch for me as a committer, so I'll take it unless anyone else was planning to do so.

Regards,
Dean
Re: [HACKERS] max_parallel_degree context level
On 11 February 2016 at 13:18, Robert Haas wrote:
> On Thu, Feb 11, 2016 at 7:40 AM, Thom Brown wrote:
>> As it currently stands, max_parallel_degree is set to a superuser
>> context
>
> I don't have a clue why it's like that. It seems like it should be
> PGC_USERSET just like, say, work_mem. I think that's just brain fade
> on my part, and I think the current setting will be really
> inconvenient for unprivileged users: as it is, they have no way to
> turn parallel query off. Unless somebody objects, I'll go change
> that.

+1. I would want it to be user settable.

Regards,
Dean