On Sat, Feb 10, 2018 at 11:59 AM, Alvaro Herrera <alvhe...@alvh.no-ip.org> wrote:
> PG Doc comments form wrote: > > > create table t(x int); > > create table u(x int); > > > > insert into t values (1), (2), (2), (3), (3); > > insert into u values (1), (2); > > > > select * from t except all select * from u; > > > x > > --- > > 3 > > 3 > > 2 > > (3 rows) > > I find this pretty odd behavior. Is this not an outright bug? I > haven't read the SQL standard on this matter[*], but if they define > EXCEPT ALL to work like this, then it seems pretty useless. (Maybe they > just didn't intend EXCEPT ALL to be useful?) If this is indeed the > case, maybe we should amend the docs not only to explain this behavior > but also to warn against the construct. > I think that's the way it is defined (but I agree, I don't remember ever seeing a use for it) > > [*] I didn't try terribly hard, but couldn't actually find where the > behavior is defined. What I have on hand is a draft of SQL:2011 where > this appears to be defined in 7.13 <query expression> but I was unable > to find the rules for set operations. It refers to 9.12 Grouping > operations but that defines conformance rules only. > > -- > I may have a different version but I see: iii) T contains the following rows: 1) Let R be a row that is a duplicate of some row in ET1 or of some row in ET2 or both. Let m be the number of duplicates of R in ET1 and let n be the number of duplicates of R in ET2, where m ≥ 0 and n ≥ 0. 2) If DISTINCT is specified or implicit, then .... ... 3) If ALL is specified, then Case: A) If UNION is specified, then the number of duplicates of R that T contains is (m + n). B) If EXCEPT is specified, then the number of duplicates of R that T contains is the maximum of (m – n) and 0 (zero). C) If INTERSECT is specified, then the number of duplicates of R that T contains is the minimum of m and n.