Re: [sqlite] Group by Literals
A lot of what we're mentioning is in http://www.sqlite.org/lang_select.html in section 3. For what Keith mentioned below a reminder that min() and max() are special cases where the bare columns are guaranteed to be from same row as (one of the) min or max values. Any other expressions are only guaranteed to be from "an undefined one of" the grouped rows. (But every column in any one single output row will have been calculated from the same input row) -Original Message- From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Keith Medcalf Sent: Wednesday, May 24, 2017 1:22 PM To: SQLite mailing list Subject: Re: [sqlite] Group by Literals This means that you can do things like: SELECT a, b, max(c) FROM t GROUP BY a; And you will be returned the groups of values of a, the max value of c in that group, and the value of b from (one of the rows) containing that maximum value. The actual row (assuming multiple rows have the same max(c)) is undefined (but is in fact determined by the visitation order of the underlying table containing c). ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Group by Literals
On Wednesday, 24 May, 2017 06:07, Denis Burke wrote: > These all produce a single row of output (and it happens to be the last > row > inserted [a1,b5]): > select c1,c2 from t1 group by '1'; > select c1,c2 from t1 group by '2'; > select c1,c2 from t1 group by '3'; > select c1,c2 from t1 group by 'c1'; > In the case of string literals, I would have expected to see all rows > returned and in these cases the group by become effectively a no-op. But > I cannot find documentation to justify my expectation, or to validate the > actual output. In many SQL products this would yield an error of the stating that you can only select either (1) results appearing in the group by clause or (2) the result of an aggregate function applied to the group. That is to say, you can ONLY use group by's of the following form: SELECT , FROM t GROUP BY In the case of grouping by a constant, you have created a group that contains all rows. SQLite, rather than tossing an error, permits you to request columns that do not conform with (1) and (2) above. If you do so, then the returned values for those columns come from "one of the rows" comprising the group. Which particular row, is not defined (to average mortals that is -- in actual fact the values are taken from the last row "visited" that yielded the group aggregate result -- this is called "not defined" because it depends on the vagaries of query optimization and the available indexes -- for a given set of data and known indexes it is 100% determinate -- but that means that you are relying on an implementation detail.) In your case it just happens to be the last row inserted because (1) a table scan is used to retrieve the data for the underlying table (ie, in order by rowid) and (b) the last row inserted happens to have the highest rowid. This means that you can do things like: SELECT a, b, max(c) FROM t GROUP BY a; And you will be returned the groups of values of a, the max value of c in that group, and the value of b from (one of the rows) containing that maximum value. The actual row (assuming multiple rows have the same max(c)) is undefined (but is in fact determined by the visitation order of the underlying table containing c). ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Group by Literals
Apologies for the multiple posts, but having now read the documentation thoroughly, I think the OP has a point and the GROUP BY documentation can benefit from local inclusion of the integer constant explanation that is given later for ORDER-BY (as quoted below) - or perhaps simply referring/linking to the ORDER-BY section for clarity. It seems to be one of those things "we all just know" - but in truth, it isn't obvious from the current documentation. Thanks, Ryan On 2017/05/24 2:38 PM, R Smith wrote: This is quite clear in the documentation I think, and might even be made clear in the SQL standard (but I did not check). An integer literal (and only an integer literal) denotes the column number to order or group by. This is true for all Databases I know of, but that list is obviously not total. A string literal simply groups by the value in that string, like any other expression that evaluates to a constant value , which can only ever return one single result since that value will be the same for all queried rows in the result-set. To quote directly (via copy-paste) from the documentation you referenced, about 3/4 of the way down (emphasis by me): ...// Each ORDER BY expression is processed as follows: 1. If the ORDER BY expression is a *constant integer K* then the expression is considered an alias for the K-th column of the result set (columns are numbered from left to right starting with 1). 2. If the ORDER BY expression is an identifier that corresponds to the alias of one of the output columns, then the expression is considered an alias for that column. 3. Otherwise, if the ORDER BY expression is *any other expression*, it is evaluated and the returned value used to order the output rows//... As far as clarity goes I think the devs did an amicable job here - it's hard for me to come up with a similarly succinct statement that would make point 1 any clearer. Hope this answers the question, Good luck! Ryan On 2017/05/24 2:07 PM, Denis Burke wrote: The SQLite documentation (http://www.sqlite.org/lang_select.html) says the GROUP BY clause accepts [expr]. And [expr] can be composed of a literal. What I cannot find is what SQLite does (or should do) with a literal in the GROUP BY clause. In the simple case of table T1 with two columns C1,C2 CREATE TABLE [T1]( [c1] text COLLATE NOCASE, [c2] text COLLATE NOCASE); INSERT INTO T1 VALUES ('a0','b8'),('a1','b7'),('a2','b8'),('a1','b5'); These queries give the same results: SELECT c1,c2 from T1 GROUP BY c1; SELECT c1,c2 from T1 GROUP BY 1; SELECT c2,c1 from T1 GROUP BY 2; and these two give the same result: SELECT c1,c2 from T1 GROUP BY c2; SELECT c1,c2 from T1 GROUP BY 2; and this produces an error: SELECT c1,c2 from T1 GROUP BY 3; This seems to show in the case of an integer literal, it substitutes the integer for that number of the selected columns. Maybe this is obvious and documented somewhere, but I was not able to find it and was not obvious to me initially. But when the literal GROUP BY is a string: These all produce a single row of output (and it happens to be the last row inserted [a1,b5]): select c1,c2 from t1 group by '1'; select c1,c2 from t1 group by '2'; select c1,c2 from t1 group by '3'; select c1,c2 from t1 group by 'c1'; In the case of string literals, I would have expected to see all rows returned and in these cases the group by become effectively a no-op. But I cannot find documentation to justify my expectation, or to validate the actual output. I did compare to SQL Server to see if behavior was similar and SQL Server simply does not allow this and complains "Each GROUP BY expression must contain at least one column that is not an outer reference." So I think my questions are simply: 1) should literals be allowed in the group by? 2) if so, is SQLite doing what it should with them? Thanks, Denis ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Group by Literals
The number 1 references the first column of the result set, 2 the second, and 3 is an error because there are only 2 columns. If the expression is a constant, then there is only 1 group. You are missing any meaningful information because you do not have an aggregate expression in your select list. -Ursprüngliche Nachricht- Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im Auftrag von Denis Burke Gesendet: Mittwoch, 24. Mai 2017 14:07 An: SQLite mailing list Betreff: [sqlite] Group by Literals The SQLite documentation (http://www.sqlite.org/lang_select.html) says the GROUP BY clause accepts [expr]. And [expr] can be composed of a literal. What I cannot find is what SQLite does (or should do) with a literal in the GROUP BY clause. In the simple case of table T1 with two columns C1,C2 CREATE TABLE [T1]( [c1] text COLLATE NOCASE, [c2] text COLLATE NOCASE); INSERT INTO T1 VALUES ('a0','b8'),('a1','b7'),('a2','b8'),('a1','b5'); These queries give the same results: SELECT c1,c2 from T1 GROUP BY c1; SELECT c1,c2 from T1 GROUP BY 1; SELECT c2,c1 from T1 GROUP BY 2; and these two give the same result: SELECT c1,c2 from T1 GROUP BY c2; SELECT c1,c2 from T1 GROUP BY 2; and this produces an error: SELECT c1,c2 from T1 GROUP BY 3; This seems to show in the case of an integer literal, it substitutes the integer for that number of the selected columns. Maybe this is obvious and documented somewhere, but I was not able to find it and was not obvious to me initially. But when the literal GROUP BY is a string: These all produce a single row of output (and it happens to be the last row inserted [a1,b5]): select c1,c2 from t1 group by '1'; select c1,c2 from t1 group by '2'; select c1,c2 from t1 group by '3'; select c1,c2 from t1 group by 'c1'; In the case of string literals, I would have expected to see all rows returned and in these cases the group by become effectively a no-op. But I cannot find documentation to justify my expectation, or to validate the actual output. I did compare to SQL Server to see if behavior was similar and SQL Server simply does not allow this and complains "Each GROUP BY expression must contain at least one column that is not an outer reference." So I think my questions are simply: 1) should literals be allowed in the group by? 2) if so, is SQLite doing what it should with them? Thanks, Denis ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ Gunter Hick Software Engineer Scientific Games International GmbH FN 157284 a, HG Wien Klitschgasse 2-4, A-1130 Vienna, Austria Tel: +43 1 80100 0 E-Mail: h...@scigames.at This communication (including any attachments) is intended for the use of the intended recipient(s) only and may contain information that is confidential, privileged or legally protected. Any unauthorized use or dissemination of this communication is strictly prohibited. If you have received this communication in error, please immediately notify the sender by return e-mail message and delete all copies of the original communication. Thank you for your cooperation. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Group by Literals
Just realized, the section I've quoted from ORDER BY, not GROUP BY, but the effect pertains the both. On 2017/05/24 2:38 PM, R Smith wrote: This is quite clear in the documentation I think, and might even be made clear in the SQL standard (but I did not check). An integer literal (and only an integer literal) denotes the column number to order or group by. This is true for all Databases I know of, but that list is obviously not total. A string literal simply groups by the value in that string, like any other expression that evaluates to a constant value , which can only ever return one single result since that value will be the same for all queried rows in the result-set. To quote directly (via copy-paste) from the documentation you referenced, about 3/4 of the way down (emphasis by me): ...// Each ORDER BY expression is processed as follows: 1. If the ORDER BY expression is a *constant integer K* then the expression is considered an alias for the K-th column of the result set (columns are numbered from left to right starting with 1). 2. If the ORDER BY expression is an identifier that corresponds to the alias of one of the output columns, then the expression is considered an alias for that column. 3. Otherwise, if the ORDER BY expression is *any other expression*, it is evaluated and the returned value used to order the output rows//... As far as clarity goes I think the devs did an amicable job here - it's hard for me to come up with a similarly succinct statement that would make point 1 any clearer. Hope this answers the question, Good luck! Ryan On 2017/05/24 2:07 PM, Denis Burke wrote: The SQLite documentation (http://www.sqlite.org/lang_select.html) says the GROUP BY clause accepts [expr]. And [expr] can be composed of a literal. What I cannot find is what SQLite does (or should do) with a literal in the GROUP BY clause. In the simple case of table T1 with two columns C1,C2 CREATE TABLE [T1]( [c1] text COLLATE NOCASE, [c2] text COLLATE NOCASE); INSERT INTO T1 VALUES ('a0','b8'),('a1','b7'),('a2','b8'),('a1','b5'); These queries give the same results: SELECT c1,c2 from T1 GROUP BY c1; SELECT c1,c2 from T1 GROUP BY 1; SELECT c2,c1 from T1 GROUP BY 2; and these two give the same result: SELECT c1,c2 from T1 GROUP BY c2; SELECT c1,c2 from T1 GROUP BY 2; and this produces an error: SELECT c1,c2 from T1 GROUP BY 3; This seems to show in the case of an integer literal, it substitutes the integer for that number of the selected columns. Maybe this is obvious and documented somewhere, but I was not able to find it and was not obvious to me initially. But when the literal GROUP BY is a string: These all produce a single row of output (and it happens to be the last row inserted [a1,b5]): select c1,c2 from t1 group by '1'; select c1,c2 from t1 group by '2'; select c1,c2 from t1 group by '3'; select c1,c2 from t1 group by 'c1'; In the case of string literals, I would have expected to see all rows returned and in these cases the group by become effectively a no-op. But I cannot find documentation to justify my expectation, or to validate the actual output. I did compare to SQL Server to see if behavior was similar and SQL Server simply does not allow this and complains "Each GROUP BY expression must contain at least one column that is not an outer reference." So I think my questions are simply: 1) should literals be allowed in the group by? 2) if so, is SQLite doing what it should with them? Thanks, Denis ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Group by Literals
Denis Burke wrote: > The SQLite documentation (http://www.sqlite.org/lang_select.html) says the > GROUP BY clause accepts [expr]. And [expr] can be composed of a literal. > What I cannot find is what SQLite does (or should do) with a literal in the > GROUP BY clause. SQL-92 doesn't allow it: | Each in the shall unambiguously | reference a column of T. But ORDER BY supports referencing columns by their index: | If a contains an , then the | shall be greater than 0 and not greater than the | degree of T. The identifies the column of T | with the ordinal position specified by the . SQLite extends GROUP BY to also allow this kind of column references. > In the case of string literals, I would have expected to see all rows > returned and in these cases the group by become effectively a no-op. | The result of the is a partitioning of T into | a set of groups. The set is the minimum number of groups such | that, for each grouping column of each group of more than one | row, no two values of that grouping column are distinct. If we say that the "value of the grouping column" is the value of the expression itself, then the query must result in a single group because no two values are distinct in this column in any of the rows. Regards, Clemens ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Group by Literals
This is quite clear in the documentation I think, and might even be made clear in the SQL standard (but I did not check). An integer literal (and only an integer literal) denotes the column number to order or group by. This is true for all Databases I know of, but that list is obviously not total. A string literal simply groups by the value in that string, like any other expression that evaluates to a constant value , which can only ever return one single result since that value will be the same for all queried rows in the result-set. To quote directly (via copy-paste) from the documentation you referenced, about 3/4 of the way down (emphasis by me): ...// Each ORDER BY expression is processed as follows: 1. If the ORDER BY expression is a *constant integer K* then the expression is considered an alias for the K-th column of the result set (columns are numbered from left to right starting with 1). 2. If the ORDER BY expression is an identifier that corresponds to the alias of one of the output columns, then the expression is considered an alias for that column. 3. Otherwise, if the ORDER BY expression is *any other expression*, it is evaluated and the returned value used to order the output rows//... As far as clarity goes I think the devs did an amicable job here - it's hard for me to come up with a similarly succinct statement that would make point 1 any clearer. Hope this answers the question, Good luck! Ryan On 2017/05/24 2:07 PM, Denis Burke wrote: The SQLite documentation (http://www.sqlite.org/lang_select.html) says the GROUP BY clause accepts [expr]. And [expr] can be composed of a literal. What I cannot find is what SQLite does (or should do) with a literal in the GROUP BY clause. In the simple case of table T1 with two columns C1,C2 CREATE TABLE [T1]( [c1] text COLLATE NOCASE, [c2] text COLLATE NOCASE); INSERT INTO T1 VALUES ('a0','b8'),('a1','b7'),('a2','b8'),('a1','b5'); These queries give the same results: SELECT c1,c2 from T1 GROUP BY c1; SELECT c1,c2 from T1 GROUP BY 1; SELECT c2,c1 from T1 GROUP BY 2; and these two give the same result: SELECT c1,c2 from T1 GROUP BY c2; SELECT c1,c2 from T1 GROUP BY 2; and this produces an error: SELECT c1,c2 from T1 GROUP BY 3; This seems to show in the case of an integer literal, it substitutes the integer for that number of the selected columns. Maybe this is obvious and documented somewhere, but I was not able to find it and was not obvious to me initially. But when the literal GROUP BY is a string: These all produce a single row of output (and it happens to be the last row inserted [a1,b5]): select c1,c2 from t1 group by '1'; select c1,c2 from t1 group by '2'; select c1,c2 from t1 group by '3'; select c1,c2 from t1 group by 'c1'; In the case of string literals, I would have expected to see all rows returned and in these cases the group by become effectively a no-op. But I cannot find documentation to justify my expectation, or to validate the actual output. I did compare to SQL Server to see if behavior was similar and SQL Server simply does not allow this and complains "Each GROUP BY expression must contain at least one column that is not an outer reference." So I think my questions are simply: 1) should literals be allowed in the group by? 2) if so, is SQLite doing what it should with them? Thanks, Denis ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users