Re: Implementing SQL ASSERTION

2019-01-31 Thread Andres Freund
Hi,

On 2018-11-29 16:54:14 +0100, Dmitry Dolgov wrote:
> > On Tue, Sep 25, 2018 at 1:04 AM Joe Wildish 
> > wrote:
> >
> > All agreed.  I’ll give the patch some TLC and get a new version that
> > addresses the above.
> 
> Hi,
> 
> Just a reminder that the patch still needs to be rebased. Could you please do
> this? I'm moving the item to the next CF.

As nothing has happened, I'm marking this patch as returned with feedback.

Greetings,

Andres Freund



Re: Implementing SQL ASSERTION

2018-11-29 Thread Dmitry Dolgov
> On Tue, Sep 25, 2018 at 1:04 AM Joe Wildish 
> wrote:
>
> All agreed.  I’ll give the patch some TLC and get a new version that
> addresses the above.

Hi,

Just a reminder that the patch still needs to be rebased. Could you please do
this? I'm moving the item to the next CF.



Re: Implementing SQL ASSERTION

2018-09-29 Thread Andrew Gierth
> "Joe" == Joe Wildish  writes:

 >> I haven't looked at the background of this, but if what you want to
 >> know is whether the aggregate function has the semantics of min() or
 >> max() (and if so, which) then the place to look is
 >> pg_aggregate.aggsortop.

 Joe> Thanks for the pointer. I've had a quick look at pg_aggregate, and
 Joe> back at my code, but I think there is more to it than just the
 Joe> sorting property. Specifically we need to know about the aggregate
 Joe> function when combined with connectors <, <=, < ANY, <= ANY, < ALL
 Joe> and <= ALL (and their equivalents with ">" and ">=").

The presence of an aggsortop means "this aggregate function is
interchangeable with (select x from ... order by x using OP limit 1)",
with all of the semantic consequences that implies. Since OP must be the
"<" or ">" member of a btree index opclass, the semantics of its
relationships with other members of the same opfamily can be deduced
from that.
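
To make that equivalence concrete, here is a minimal standalone C sketch (not
PostgreSQL code). It assumes plain non-NULL ints, a non-empty input, and "<"
as the sort operator; agg_min and order_by_limit_1 are hypothetical names used
only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Aggregate form: scan every value and keep the smallest, like min(x). */
int
agg_min(const int *xs, size_t n)
{
    size_t i;
    int best;

    assert(xs != NULL && n > 0);
    best = xs[0];
    for (i = 1; i < n; i++)
        if (xs[i] < best)
            best = xs[i];
    return best;
}

/* Ascending comparator, standing in for the "<" sort operator. */
static int
cmp_int_asc(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/* Subquery form: ORDER BY x USING < LIMIT 1, i.e. sort a copy, take row one. */
int
order_by_limit_1(const int *xs, size_t n)
{
    int *copy;
    int first;

    assert(xs != NULL && n > 0);
    copy = malloc(n * sizeof(int));
    assert(copy != NULL);
    memcpy(copy, xs, n * sizeof(int));
    qsort(copy, n, sizeof(int), cmp_int_asc);
    first = copy[0];
    free(copy);
    return first;
}
```

For any non-empty array both functions return the same value, which is the
property a non-zero aggsortop advertises for an aggregate.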

 Joe> Also, it looks like COUNT and SUM don't have a sortop

Right, because those currently have no semantics that PG needs to know
about or describe.

-- 
Andrew (irc:RhodiumToad)



Re: Implementing SQL ASSERTION

2018-09-29 Thread Joe Wildish
Hi David,

> On 26 Sep 2018, at 19:47, David Fetter  wrote:
> 
>> Invalidating operations are "INSERT(t) and UPDATE(t.b, t.n)".
> 
> So would DELETE(t), assuming n can be negative.

Oops, right you are. Bug in my implementation :-) 

> Is there some interesting and fairly easily documented subset of
> ASSERTIONs that wouldn't have the "can't prove" property?

We can certainly know at the time the ASSERTION is created if we
can use the transition table optimisation, as that relies upon
the expression being written in such a way that a key can be
derived for each expression.

We could warn or disallow the creation on that basis. Ceri & Widom
mention this actually in their papers, and their view is that most
real-world use cases do indeed allow themselves to be optimised
using the transition tables.

-Joe





Re: Implementing SQL ASSERTION

2018-09-29 Thread Joe Wildish
Hi Andrew,

On 25 Sep 2018, at 01:51, Andrew Gierth  wrote:
> I haven't looked at the background of this, but if what you want to know
> is whether the aggregate function has the semantics of min() or max()
> (and if so, which) then the place to look is pg_aggregate.aggsortop.

Thanks for the pointer. I've had a quick look at pg_aggregate, and back
at my code, but I think there is more to it than just the sorting property.
Specifically we need to know about the aggregate function when combined with
connectors <, <=, < ANY, <= ANY, < ALL and <= ALL (and their equivalents
with ">" and ">="). Also, it looks like COUNT and SUM don't have a sortop
(the other aggregates I've catered for do though).

When I come to do the rework of the patch I'll take a more in-depth look
though, and see if this can be utilised.

> As for operators, you can only make assumptions about their meaning if
> the operator is a member of some opfamily that assigns it some
> semantics. 

I had clocked the BT semantics stuff when doing the PoC patch. I have used
the "get_op_btree_interpretation" function for determining operator meaning.

-Joe




Re: Implementing SQL ASSERTION

2018-09-29 Thread Joe Wildish
On 26 Sep 2018, at 12:36, Peter Eisentraut  
wrote:
> 
> On 25/09/2018 01:04, Joe Wildish wrote:
>> Having said all that: there are obviously going to be some expressions
>> that cannot be proven to have no potential for invalidating the assertion
>> truth. I guess this is the prime concern from a concurrency PoV?
> 
> Before we spend more time on this, I think we need to have at least a
> plan for that.  

Having thought about this some more: the answer could lie in using predicate
locks, and enforcing that the transaction be SERIALIZABLE whenever an ASSERTION
is triggered.

To make use of the predicate locks we'd do a transformation on the ASSERTION
expression. I believe that there is a derivation, similar to the one mentioned
up-thread re: "managers and administrators", that would essentially push
predicates into the expression on the basis of the changed data. The semantics
of the expression would remain unchanged, but it would mean that when the
expression is rechecked, the minimal set of data is read and would therefore not
conflict with other DML statements that had triggered the same ASSERTION but had
modified unrelated data. Example:

CREATE TABLE t
  (n INTEGER NOT NULL,
   m INTEGER NOT NULL,
   k INTEGER NOT NULL,
   PRIMARY KEY (n, m));

CREATE ASSERTION sum_k_at_most_10 CHECK
  (NOT EXISTS
    (SELECT * FROM
      (SELECT n, sum(k)
         FROM t
        GROUP BY n)
      AS r(n, ks)
     WHERE ks > 10));

On an INSERT/DELETE/UPDATE of "t", we would transform the inner-most expression
of the ASSERTION to have a predicate of "WHERE n = NEW.n". In my experiments I
can see that doing so allows concurrent transactions to COMMIT that have
modified unrelated segments of "t" (assuming the planner uses Index Scan). The
efficacy of this would be dictated by the granularity of the SIREAD locks; my
understanding is that this can be as low as tuple-level in the case where Index
Scans are used (and this is borne out in my experiments - ie. you don't want a
SeqScan).
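
A rough model of the narrowed recheck, written as standalone C rather than
anything from the patch. It assumes the rows of "t" are held in primary-key
order (n, m) so that a binary search can stand in for the Index Scan; Row,
lower_bound_n and group_sum_exceeds are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct
{
    int n;
    int m;
    int k;
} Row;

/* Index lookup stand-in: first position whose n is >= key (rows sorted by n, m). */
static size_t
lower_bound_n(const Row *rows, size_t nrows, int key)
{
    size_t lo = 0;
    size_t hi = nrows;

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (rows[mid].n < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/*
 * Recheck with "WHERE n = NEW.n" pushed down: sum k for one group only.
 * Returns true if that group's sum exceeds limit, and reports through
 * rows_read how many rows were visited.
 */
bool
group_sum_exceeds(const Row *rows, size_t nrows, int new_n, long limit,
                  size_t *rows_read)
{
    size_t i;
    long sum = 0;

    assert(rows != NULL || nrows == 0);
    assert(rows_read != NULL);
    *rows_read = 0;
    for (i = lower_bound_n(rows, nrows, new_n);
         i < nrows && rows[i].n == new_n; i++)
    {
        sum += rows[i].k;
        (*rows_read)++;
    }
    return sum > limit;
}
```

In this model rows_read never exceeds the size of the one group that changed,
so two transactions touching different values of n read disjoint rows.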

> Perhaps we should disallow cases that we can't
> handle otherwise.  But even that would need some analysis of which
> practical cases we can and cannot handle, how we could extend support in
> the future, etc.


The optimisation I mentioned up-thread, plus the one hypothesised here, both
rely on being able to derive the key of an expression from the underlying base
tables/other expressions. We could perhaps disallow ASSERTIONS that don't have
such properties?

Beyond that I think it starts to get difficult (impossible?) to know which
expressions are likely to be costly on the basis of static analysis. It could be
legitimate to have an ASSERTION defined over what turns out to be a small subset
of a very large table, for example.

-Joe






Re: Implementing SQL ASSERTION

2018-09-26 Thread David Fetter
On Tue, Sep 25, 2018 at 12:04:12AM +0100, Joe Wildish wrote:
> Hi Peter,
> 
> > My feeling is that if we want to move forward on this topic, we need to
> > solve the concurrency question first.  All these optimizations for when
> > we don't need to check the assertion are cool, but they are just
> > optimizations that we can apply later on, once we have solved the
> > critical problems.
> 
> Having said all that: there are obviously going to be some expressions
> that cannot be proven to have no potential for invalidating the assertion
> truth. I guess this is the prime concern from a concurrency PoV? Example:
> 
> CREATE TABLE t (
>   b BOOLEAN NOT NULL,
>   n INTEGER NOT NULL,
>   PRIMARY KEY (b, n)
> );
> 
> CREATE ASSERTION sum_per_b_less_than_10 CHECK
>   (NOT EXISTS
>     (SELECT FROM (SELECT b, SUM(n)
>                     FROM t
>                    GROUP BY b) AS v(b, sum_n)
>       WHERE sum_n > 10));

> 
> Invalidating operations are "INSERT(t) and UPDATE(t.b, t.n)".

So would DELETE(t), assuming n can be negative.

Is there some interesting and fairly easily documented subset of
ASSERTIONs that wouldn't have the "can't prove" property?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Implementing SQL ASSERTION

2018-09-26 Thread Peter Eisentraut
On 25/09/2018 01:04, Joe Wildish wrote:
> Having said all that: there are obviously going to be some expressions
> that cannot be proven to have no potential for invalidating the assertion
> truth. I guess this is the prime concern from a concurrency PoV?

Before we spend more time on this, I think we need to have at least a
plan for that.  Perhaps we should disallow cases that we can't
handle otherwise.  But even that would need some analysis of which
practical cases we can and cannot handle, how we could extend support in
the future, etc.

In the meantime, I have committed parts of your gram.y changes that seem
to come up every time someone dusts off an assertions patch.  Keep that
in mind when you rebase.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Implementing SQL ASSERTION

2018-09-24 Thread Andrew Gierth
> "Joe" == Joe Wildish  writes:

 Joe> Agreed. My assumption was that we would record in the data
 Joe> dictionary the behaviour (or "polarity") of each aggregate
 Joe> function with respect to the various operators. Column in
 Joe> pg_aggregate? I don’t know how we’d record it exactly.

I haven't looked at the background of this, but if what you want to know
is whether the aggregate function has the semantics of min() or max()
(and if so, which) then the place to look is pg_aggregate.aggsortop.

(For a given aggregate foo(x), the presence of an operator oid in
aggsortop means something like "foo(x) is equivalent to (select x from
... order by x using OP limit 1)", and the planner will replace the
aggregate by the applicable subquery if it thinks it'd be faster.)

As for operators, you can only make assumptions about their meaning if
the operator is a member of some opfamily that assigns it some
semantics. For example, the planner can assume that WHERE x=y AND x=1
implies that y=1 (assuming x and y are of appropriate types) not because
it assumes that "=" is the name of a transitive operator, but because
the operators actually selected for (x=1) and (x=y) are both "equality"
members of the same btree operator family. Likewise proving that (a>2)
implies (a>1) requires knowing that > is a btree comparison op.
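
As a small illustration of the last point, here is a standalone C sketch of
implication between two comparisons against constants. It is not the
planner's code. The enum values follow the btree strategy numbering (1 to 5
for <, <=, =, >=, >); Range, range_for and clause_implies are hypothetical
names, and the values are plain ints compared as if the ordering were dense.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum
{
    BT_LT = 1,
    BT_LE = 2,
    BT_EQ = 3,
    BT_GE = 4,
    BT_GT = 5
} BtStrategy;

/* The set of values satisfying "a OP c", as optional lower/upper bounds. */
typedef struct
{
    bool has_lo;
    bool lo_incl;
    int  lo;
    bool has_hi;
    bool hi_incl;
    int  hi;
} Range;

static Range
range_for(BtStrategy s, int c)
{
    Range r = {false, false, 0, false, false, 0};

    assert(s >= BT_LT && s <= BT_GT);
    if (s == BT_GT || s == BT_GE || s == BT_EQ)
    {
        r.has_lo = true;
        r.lo = c;
        r.lo_incl = (s != BT_GT);
    }
    if (s == BT_LT || s == BT_LE || s == BT_EQ)
    {
        r.has_hi = true;
        r.hi = c;
        r.hi_incl = (s != BT_LT);
    }
    return r;
}

/* True if every a satisfying "a s1 c1" also satisfies "a s2 c2". */
bool
clause_implies(BtStrategy s1, int c1, BtStrategy s2, int c2)
{
    Range a = range_for(s1, c1);
    Range b = range_for(s2, c2);
    bool lo_ok;
    bool hi_ok;

    lo_ok = !b.has_lo ||
        (a.has_lo && (a.lo > b.lo ||
                      (a.lo == b.lo && (b.lo_incl || !a.lo_incl))));
    hi_ok = !b.has_hi ||
        (a.has_hi && (a.hi < b.hi ||
                      (a.hi == b.hi && (b.hi_incl || !a.hi_incl))));
    return lo_ok && hi_ok;
}
```

clause_implies(BT_GT, 2, BT_GT, 1) is true and the reverse is false. The
reasoning only works because both operators are known to be comparison
members of one ordering; nothing here looks at operator names. Because the
ordering is treated as dense, "a > 2 implies a >= 3" is reported as false
even though it holds for integers.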

-- 
Andrew (irc:RhodiumToad)



Re: Implementing SQL ASSERTION

2018-09-24 Thread Joe Wildish
Hi Peter,

> On 24 Sep 2018, at 15:06, Peter Eisentraut
>  wrote:
> 
> On 29/04/2018 20:18, Joe Wildish wrote:
>> 
>> Attached is a rebased patch for the prototype.
> 
> I took a look at this.

Thank you for reviewing.

> This has been lying around for a few months, so it will need to be
> rebased again.
> 
> 8< - - - snipped for brevity - - - 8<
> 
> All this new code in constraint.c that checks the assertion expression
> needs more comments and documentation.

All agreed.  I’ll give the patch some TLC and get a new version that
addresses the above.

> Stuff like this isn't going to work:
> 
> static int
> funcMaskForFuncOid(Oid funcOid)
> {
>     char *name = get_func_name(funcOid);
> 
>     if (name == NULL)
>         return OTHER_FUNC;
>     else if (strncmp(name, "min", strlen("min")) == 0)
>         return MIN_AGG_FUNC;
>     else if (strncmp(name, "max", strlen("max")) == 0)
>         return MAX_AGG_FUNC;
> 
> You can't assume from the name of a function what it's going to do.
> Solving this properly might be hard.

Agreed. My assumption was that we would record in the data dictionary the
behaviour (or "polarity") of each aggregate function with respect to the
various operators. Column in pg_aggregate? I don’t know how we’d record it
exactly. A bitmask would be a possibility. Also, I don’t know what we’d do
with custom aggregate functions (or indeed custom operators). Allowing end
users to determine the value would potentially lead to assertion checks
being incorrectly skipped. Maybe we’d say that custom aggregates always
have a neutral polarity and are therefore not subject to this
optimisation.
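
One possible shape for such a bitmask, as a standalone C sketch. None of
these names exist in PostgreSQL or in the patch, and the table entries are
assumptions worked out for this illustration: COUNT means an ungrouped,
unfiltered COUNT(*), a comparison that evaluates to NULL counts as passing,
and anything uncertain falls back to "every operation can invalidate".

```c
#include <assert.h>

typedef enum
{
    AGG_MIN,
    AGG_MAX,
    AGG_COUNT_STAR,
    AGG_SUM,
    AGG_OTHER               /* custom or unknown aggregate: neutral polarity */
} AggKind;

typedef enum
{
    BOUND_UPPER,            /* agg < c  or  agg <= c */
    BOUND_LOWER             /* agg > c  or  agg >= c */
} BoundKind;

enum
{
    DML_INSERT = 1 << 0,
    DML_DELETE = 1 << 1,
    DML_UPDATE = 1 << 2,
    DML_ALL = (1 << 3) - 1
};

/* Bitmask of DML operations that could turn a true comparison false. */
int
invalidating_ops(AggKind agg, BoundKind bound)
{
    assert(bound == BOUND_UPPER || bound == BOUND_LOWER);
    switch (agg)
    {
        case AGG_COUNT_STAR:
            /* the row count only grows on INSERT and only shrinks on DELETE */
            return bound == BOUND_UPPER ? DML_INSERT : DML_DELETE;
        case AGG_MAX:
            /* removing rows cannot raise max(); other cases stay conservative */
            return bound == BOUND_UPPER ? (DML_INSERT | DML_UPDATE) : DML_ALL;
        case AGG_MIN:
            /* removing rows cannot lower min(); other cases stay conservative */
            return bound == BOUND_LOWER ? (DML_INSERT | DML_UPDATE) : DML_ALL;
        case AGG_SUM:
            /* values may be negative, so any change can move the sum either way */
        case AGG_OTHER:
            return DML_ALL;
    }
    return DML_ALL;
}
```

With this table, "10 > COUNT(*)" maps to (AGG_COUNT_STAR, BOUND_UPPER) and
yields DML_INSERT only, while a custom aggregate always yields DML_ALL and so
never has a recheck skipped.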

> This ought to be reproducible for you if you build with assertions.

Yes. I shall correct this when I do the aforementioned rebase and
application of TLC.

> My feeling is that if we want to move forward on this topic, we need to
> solve the concurrency question first.  All these optimizations for when
> we don't need to check the assertion are cool, but they are just
> optimizations that we can apply later on, once we have solved the
> critical problems.

I obviously agree that the concurrency issue needs solving. But I don’t
see that at all as a separate matter from the algos. Far from being merely
optimisations, the research indicates we can go a lot further toward
reducing the need for rechecks and, therefore, reducing the chance of
concurrency conflicts from occurring in the first place. This is true
regardless of whatever mechanism we use to enforce correct behaviour under
concurrent modifications -- e.g. a lock on the ASSERTION object itself,
enforced use of SERIALIZABLE, etc.

By way of example (lifted directly from the AM4DP book):

CREATE TABLE employee (
  id INTEGER PRIMARY KEY,
  dept INTEGER NOT NULL,
  job TEXT NOT NULL
);

CREATE ASSERTION department_managers_need_administrators CHECK
  (NOT EXISTS
    (SELECT dept
       FROM employee a
      WHERE EXISTS (SELECT * FROM employee b
                     WHERE a.dept = b.dept
                       AND b.job IN ('Manager', 'Senior Manager'))
        AND NOT EXISTS (SELECT * FROM employee b
                         WHERE a.dept = b.dept
                           AND b.job = 'Administrator')));

The current implementation derives "DELETE(employee), INSERT(employee) and
UPDATE(employee.dept, employee.job)" as the set of invalidating operations
and triggers accordingly. However, in this case, we can supplement the
triggers by having them inspect the transition tables to see if the actual
data from the triggering DML statement could in fact affect the truth of
the expression: specifically, only do the recheck on DELETE of an
"Administrator", INSERT of a "Manager" or "Senior Manager", or UPDATE when
the new job is a "Manager" or "Senior Manager" or the old job was an
"Administrator".
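
The per-row test described above could look roughly like this. It is a
standalone C sketch, not code from the patch: DmlOp and needs_recheck are
hypothetical names, and old_job/new_job stand for the job column of a row in
the OLD and NEW transition tables (NULL where no such row exists).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum
{
    DML_INSERT,
    DML_DELETE,
    DML_UPDATE
} DmlOp;

static bool
is_manager(const char *job)
{
    return job != NULL &&
        (strcmp(job, "Manager") == 0 || strcmp(job, "Senior Manager") == 0);
}

static bool
is_administrator(const char *job)
{
    return job != NULL && strcmp(job, "Administrator") == 0;
}

/* Decide from one changed row whether the assertion must be rechecked. */
bool
needs_recheck(DmlOp op, const char *old_job, const char *new_job)
{
    switch (op)
    {
        case DML_INSERT:
            assert(new_job != NULL);
            return is_manager(new_job);
        case DML_DELETE:
            assert(old_job != NULL);
            return is_administrator(old_job);
        case DML_UPDATE:
            assert(old_job != NULL && new_job != NULL);
            return is_manager(new_job) || is_administrator(old_job);
    }
    return true;            /* unknown operation: recheck to be safe */
}
```

A statement-level trigger would recheck the assertion if needs_recheck
returns true for any row in the transition tables, and skip it otherwise.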

Now, if this is a company with 10,000 employees, it would presumably only
require a handful of managers, right? ;-) The potential for a concurrency
conflict is then massively reduced when compared to rechecking every time
the employee table is touched.

(This optimisation has some caveats and is reliant upon being able to
derive the key of an expression from the underlying base tables plus some
stuff about functional dependencies. I have started work on it but sadly
not had time to progress it in recent months).

Having said all that: there are obviously going to be some expressions
that cannot be proven to have no potential for invalidating the assertion
truth. I guess this is the prime concern from a concurrency PoV? Example:

CREATE TABLE t (
  b BOOLEAN NOT NULL,
  n INTEGER NOT NULL,
  PRIMARY KEY (b, n)
);

CREATE ASSERTION sum_per_b_less_than_10 CHECK
  (NOT EXISTS
    (SELECT FROM (SELECT b, SUM(n)
                    FROM t
                   GROUP BY b) AS v(b, sum_n)
      WHERE sum_n > 10));

Invalidating operations are "INSERT(t) and UPDATE(t.b, t.n)". I guess the
interesting case, from a concurrency perspective, is how do we avoid an
INSERT WHERE 

Re: Implementing SQL ASSERTION

2018-09-24 Thread Peter Eisentraut
On 29/04/2018 20:18, Joe Wildish wrote:
> On 28 Mar 2018, at 16:13, David Fetter  wrote:
>>
>> Sorry to bother you again, but this now doesn't compile atop master.
> 
> Attached is a rebased patch for the prototype.

I took a look at this.

This has been lying around for a few months, so it will need to be
rebased again.  I applied this patch on top of
68e7e973d22274a089ce95200b3782f514f6d2f8, which was the HEAD around the
time this patch was created, and it applies cleanly there.

Please check your patch for whitespace errors:

warning: squelched 13 whitespace errors
warning: 18 lines add whitespace errors.

Also, reduce the amount of useless whitespace changes in the patch.

There are some compiler warnings:

constraint.c: In function 'CreateAssertion':
constraint.c:1211:2: error: ISO C90 forbids mixed declarations and code
[-Werror=declaration-after-statement]

constraint.c: In function 'oppositeDmlOp':
constraint.c:458:1: error: control reaches end of non-void function
[-Werror=return-type]

The version check in psql's describeAssertions() needs to be updated.
Also, you should use formatPGVersionNumber() to cope with two-part and
one-part version numbers.

All this new code in constraint.c that checks the assertion expression
needs more comments and documentation.

Stuff like this isn't going to work:

static int
funcMaskForFuncOid(Oid funcOid)
{
    char *name = get_func_name(funcOid);

    if (name == NULL)
        return OTHER_FUNC;
    else if (strncmp(name, "min", strlen("min")) == 0)
        return MIN_AGG_FUNC;
    else if (strncmp(name, "max", strlen("max")) == 0)
        return MAX_AGG_FUNC;

You can't assume from the name of a function what it's going to do.
Solving this properly might be hard.

The regression test crashes for me around

frame #4: 0x00010d3a4cdc postgres`castNodeImpl(type=T_SubLink,
ptr=0x7ff27006d230) at nodes.h:582
frame #5: 0x00010d3a61c6
postgres`visitSubLink(node=0x7ff270034040, info=0x7ffee2a23930)
at constraint.c:843

This ought to be reproducible for you if you build with assertions.


My feeling is that if we want to move forward on this topic, we need to
solve the concurrency question first.  All these optimizations for when
we don't need to check the assertion are cool, but they are just
optimizations that we can apply later on, once we have solved the
critical problems.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Implementing SQL ASSERTION

2018-04-29 Thread David Fetter
On Sun, Apr 29, 2018 at 07:18:00PM +0100, Joe Wildish wrote:
> On 28 Mar 2018, at 16:13, David Fetter  wrote:
> > 
> > Sorry to bother you again, but this now doesn't compile atop master.
> 
> Attached is a rebased patch for the prototype.

Thanks!

This is great timing for the 12 cycle :)

Best,
David.



Re: Implementing SQL ASSERTION

2018-03-28 Thread David Fetter
On Sun, Mar 18, 2018 at 12:29:50PM +, Joe Wildish wrote:
> > 
> >> 
> >> This patch no longer applies.  Any chance of a rebase?
> 
> Attached is a rebased version of this patch. It takes into account
> the ACL checking changes and a few other minor amendments.

Sorry to bother you again, but this now doesn't compile atop master.

Best,
David.



Re: Implementing SQL ASSERTION

2018-03-18 Thread David Fetter
On Sun, Mar 18, 2018 at 12:29:50PM +, Joe Wildish wrote:
> > 
> >> 
> >> This patch no longer applies.  Any chance of a rebase?
> >> 
> > 
> 
> 
> Attached is a rebased version of this patch. It takes into account the ACL 
> checking changes and a few other minor amendments.

Thanks!

Best,
David.



Re: Implementing SQL ASSERTION

2018-03-09 Thread Thomas Munro
On Sat, Mar 10, 2018 at 6:37 AM, Robert Haas  wrote:
> On Mon, Jan 15, 2018 at 11:35 AM, David Fetter  wrote:
>> - We follow the SQL standard and make SERIALIZABLE the default
>>   transaction isolation level, and
>
> The consequences of such a decision would include:
>
> - pgbench -S would run up to 10x slower, at least if these old
> benchmark results are still valid:
>
> https://www.postgresql.org/message-id/ca+tgmozog1wfbyrqzjukilsxw5sdujjguey0c2bqsg-tcis...@mail.gmail.com
>
> - pgbench without -S would fail outright, because it doesn't have
> provision to retry failed transactions.
>
> https://commitfest.postgresql.org/16/1419/
>
> - Many user applications would probably also experience similar difficulties.
>
> - Parallel query would no longer work by default, unless this patch
> gets committed:
>
> https://commitfest.postgresql.org/17/1004/
>
> I think a good deal of work to improve the performance of serializable
> would need to be done before we could even think about making it the
> default -- and even then, the fact that it really requires the
> application to be retry-capable seems like a pretty major obstacle.

Also:

- It's not available on hot standbys.  Experimental patches have been
developed based on the read only safe snapshot concept, but some
tricky problems remain unsolved.

- Performance is terrible (conflicts are maximised) if you use any
index type except btree, unless some of these get committed:

https://commitfest.postgresql.org/17/1172/
https://commitfest.postgresql.org/17/1183/
https://commitfest.postgresql.org/17/1466/

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Implementing SQL ASSERTION

2018-03-09 Thread Robert Haas
On Mon, Jan 15, 2018 at 11:35 AM, David Fetter  wrote:
> - We follow the SQL standard and make SERIALIZABLE the default
>   transaction isolation level, and

The consequences of such a decision would include:

- pgbench -S would run up to 10x slower, at least if these old
benchmark results are still valid:

https://www.postgresql.org/message-id/ca+tgmozog1wfbyrqzjukilsxw5sdujjguey0c2bqsg-tcis...@mail.gmail.com

- pgbench without -S would fail outright, because it doesn't have
provision to retry failed transactions.

https://commitfest.postgresql.org/16/1419/

- Many user applications would probably also experience similar difficulties.

- Parallel query would no longer work by default, unless this patch
gets committed:

https://commitfest.postgresql.org/17/1004/

I think a good deal of work to improve the performance of serializable
would need to be done before we could even think about making it the
default -- and even then, the fact that it really requires the
application to be retry-capable seems like a pretty major obstacle.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Implementing SQL ASSERTION

2018-03-08 Thread Joe Wildish
Hi David,

> 
> This patch no longer applies.  Any chance of a rebase?
> 



Of course. I’ll look at it this weekend,

Cheers,
-Joe




Re: Implementing SQL ASSERTION

2018-03-07 Thread David Fetter
On Mon, Jan 15, 2018 at 09:14:02PM +, Joe Wildish wrote:
> Hi David,
> 
> > On 15 Jan 2018, at 16:35, David Fetter  wrote:
> > 
> > It sounds reasonable enough that I'd like to make a couple of Modest
> > Proposals™, to wit:
> > 
> > - We follow the SQL standard and make SERIALIZABLE the default
> >  transaction isolation level, and
> > 
> > - We disallow writes at isolation levels other than SERIALIZABLE when
> >  any ASSERTION could be in play.
> 
> Certainly it would be easy to put a test into the assertion check
> function to require the isolation level be serialisable. I didn’t
> realise that that was also the default level as per the standard.
> That need not necessarily be changed, of course; it would be obvious
> to the user that it was a requirement as the creation of an
> assertion would fail without it, as would any subsequent attempts to
> modify the involved tables.

This patch no longer applies.  Any chance of a rebase?

Best,
David.



Re: Implementing SQL ASSERTION

2018-01-15 Thread Joe Wildish
Hi David,

> On 15 Jan 2018, at 16:35, David Fetter  wrote:
> 
> It sounds reasonable enough that I'd like to make a couple of Modest
> Proposals™, to wit:
> 
> - We follow the SQL standard and make SERIALIZABLE the default
>  transaction isolation level, and
> 
> - We disallow writes at isolation levels other than SERIALIZABLE when
>  any ASSERTION could be in play.

Certainly it would be easy to put a test into the assertion check function to 
require the isolation level be serialisable. I didn’t realise that that was 
also the default level as per the standard. That need not necessarily be 
changed, of course; it would be obvious to the user that it was a requirement 
as the creation of an assertion would fail without it, as would any subsequent 
attempts to modify the involved tables.

-Joe

Re: Implementing SQL ASSERTION

2018-01-15 Thread David Fetter
On Mon, Jan 15, 2018 at 03:40:57PM +0100, Fabien COELHO wrote:
> 
> >>I'm wondering about the effect of MVCC on this: if the check is
> >>performed when the INSERT is done, concurrent inserting transactions
> >>would count the current status which would be ok, but on commit all
> >>concurrent inserts would be there and the count could not be ok anymore?
> 
> >The patch doesn’t attempt to address concurrency (beyond the obvious
> >benefit of reducing the circumstances under which the assertion is
> >checked). I am working under the assumption that we will find some
> >acceptable way for that to be resolved :-) And at the moment, working in
> >serialisable mode addresses this issue. I think that is suggested in the
> >thread actually (essentially, if you want to use assertions, you require
> >that transactions be performed at serialisable isolation level).
> 
> Thanks for the pointers. The "serializable" isolation level restriction
> sounds reasonable.

It sounds reasonable enough that I'd like to make a couple of Modest
Proposals™, to wit:

- We follow the SQL standard and make SERIALIZABLE the default
  transaction isolation level, and

- We disallow writes at isolation levels other than SERIALIZABLE when
  any ASSERTION could be in play.

That latter could range in implementation from crashingly unsubtle to
very precise.  

Crashingly Unsubtle:

Disallow writes at any isolation level other than SERIALIZABLE.

Very Precise:

Disallow writes at any other isolation level when the ASSERTION
could come into play using the same machinery that enforces the
ASSERTION in the first place.
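
The two ends of that range can be written down as a single predicate. This
is a standalone C sketch with hypothetical names (IsoLevel, Policy,
write_allowed); assertion_could_fire stands for whatever answer the
assertion's own invalidation analysis gives for the statement at hand.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum
{
    ISO_READ_COMMITTED,
    ISO_REPEATABLE_READ,
    ISO_SERIALIZABLE
} IsoLevel;

typedef enum
{
    POLICY_UNSUBTLE,        /* any write below SERIALIZABLE is refused */
    POLICY_PRECISE          /* refused only if an ASSERTION could fire */
} Policy;

/* May this write proceed at the given isolation level? */
bool
write_allowed(Policy policy, IsoLevel level, bool assertion_could_fire)
{
    assert(policy == POLICY_UNSUBTLE || policy == POLICY_PRECISE);
    if (level == ISO_SERIALIZABLE)
        return true;
    if (policy == POLICY_UNSUBTLE)
        return false;
    return !assertion_could_fire;
}
```

Under POLICY_PRECISE a READ COMMITTED write to a table that no ASSERTION
depends on still goes through; under POLICY_UNSUBTLE it does not.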

What say?

Best,
David.



Re: Implementing SQL ASSERTION

2018-01-15 Thread Fabien COELHO


>> I'm wondering about the effect of MVCC on this: if the check is 
>> performed when the INSERT is done, concurrent inserting transactions 
>> would count the current status which would be ok, but on commit all 
>> concurrent inserts would be there and the count could not be ok 
>> anymore?

> The patch doesn’t attempt to address concurrency (beyond the obvious 
> benefit of reducing the circumstances under which the assertion is 
> checked). I am working under the assumption that we will find some 
> acceptable way for that to be resolved :-) And at the moment, working in 
> serialisable mode addresses this issue. I think that is suggested in the 
> thread actually (essentially, if you want to use assertions, you require 
> that transactions be performed at serialisable isolation level).

Thanks for the pointers. The "serializable" isolation level restriction 
sounds reasonable.


--
Fabien.

Re: Implementing SQL ASSERTION

2018-01-15 Thread Joe Wildish
Hi Fabien,

>> * certain combinations of aggregates with comparison operations cannot be 
>> invalidating.
>> 
>> As an example of the last point, the expression "CHECK (10 > (SELECT 
>> COUNT(*) FROM t))" cannot be invalidated by a delete or an update but can be 
>> invalidated by an insert.
> 
> I'm wondering about the effect of MVCC on this: if the check is performed 
> when the INSERT is done, concurrent inserting transactions would count the 
> current status which would be ok, but on commit all concurrent inserts would 
> be there and the count could not be ok anymore?

Yes, there was quite a bit of discussion in the original thread about 
concurrency. See here:

https://www.postgresql.org/message-id/flat/1384486216.5008.17.camel%40vanquo.pezone.net#1384486216.5008.17.ca...@vanquo.pezone.net
 


The patch doesn’t attempt to address concurrency (beyond the obvious benefit of 
reducing the circumstances under which the assertion is checked). I am working 
under the assumption that we will find some acceptable way for that to be 
resolved :-) And at the moment, working in serialisable mode addresses this 
issue. I think that is suggested in the thread actually (essentially, if you 
want to use assertions, you require that transactions be performed at 
serialisable isolation level). 

> Maybe if the check was deferred, but this is not currently possible with pg 
> (eg the select can simply be put in a function), and I think there might be race 
> conditions. ISTM that such a check would imply non trivial locking to be 
> okay, it is not just a matter of deciding whether to invoke the check or not.

I traverse into SQL functions so that the analysis can capture invalidating 
operations from the expression inside the function. Only internal and SQL 
functions are considered legal. Other languages are rejected.

-Joe




Re: Implementing SQL ASSERTION

2018-01-15 Thread Fabien COELHO


Hello Joe,

Just a reaction to the example, which is maybe addressed in the patch 
which I have not investigated.


> * certain combinations of aggregates with comparison operations cannot 
> be invalidating.
>
> As an example of the last point, the expression "CHECK (10 > (SELECT 
> COUNT(*) FROM t))" cannot be invalidated by a delete or an update but 
> can be invalidated by an insert.


I'm wondering about the effect of MVCC on this: if the check is performed 
when the INSERT is done, concurrent inserting transactions would count the 
current status which would be ok, but on commit all concurrent inserts 
would be there and the count could not be ok anymore?


Maybe if the check was deferred, but this is not currently possible with 
pg (eg the select can simply be put in a function), and I think there might be 
race conditions. ISTM that such a check would imply non trivial locking to 
be okay, it is not just a matter of deciding whether to invoke the check 
or not.
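
The scenario can be played out with a toy model. This is a standalone C
sketch with hypothetical names (Txn, txn_begin, txn_insert_and_check,
count_after_commits), and it assumes the simplest possible snapshot rule:
each transaction sees the rows committed when it started plus its own
inserts, and nothing else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct
{
    int snapshot_count;     /* rows committed when the snapshot was taken */
    int own_inserts;        /* rows inserted by this transaction so far */
} Txn;

Txn
txn_begin(int committed_count)
{
    Txn t;

    t.snapshot_count = committed_count;
    t.own_inserts = 0;
    return t;
}

/* Insert one row, then evaluate CHECK (limit > COUNT(*)) as this txn sees it. */
bool
txn_insert_and_check(Txn *txn, int limit)
{
    assert(txn != NULL);
    txn->own_inserts++;
    return limit > txn->snapshot_count + txn->own_inserts;
}

/* Row count once every listed transaction has committed. */
int
count_after_commits(int committed_count, const Txn *txns, size_t ntxns)
{
    size_t i;
    int total = committed_count;

    assert(txns != NULL || ntxns == 0);
    for (i = 0; i < ntxns; i++)
        total += txns[i].own_inserts;
    return total;
}
```

Starting from 8 committed rows, two concurrent transactions each insert one
row and each see 9, so both checks pass; after both commit the table holds
10 rows and "10 > COUNT(*)" is false. Neither check could have caught that
from its own snapshot.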


--
Fabien.