[jira] [Commented] (DERBY-6227) Distinct aggregates don't work well with territory-based collation
[ https://issues.apache.org/jira/browse/DERBY-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170372#comment-14170372 ] ASF subversion and git services commented on DERBY-6227: Commit 1631604 from [~mamtas] in branch 'code/branches/10.10' [ https://svn.apache.org/r1631604 ] DERBY-6227(Distinct aggregates don't work well with territory-based collation) Backporting to 10.10 Had to hand tweek the changes in GroupedAggregateResultSet.java but there weren't too many changes. derbyall and junit suite ran fine. > Distinct aggregates don't work well with territory-based collation > -- > > Key: DERBY-6227 > URL: https://issues.apache.org/jira/browse/DERBY-6227 > Project: Derby > Issue Type: Bug > Components: SQL >Affects Versions: 10.6.1.0, 10.6.2.1, 10.7.1.1, 10.8.1.2, 10.8.2.2, > 10.8.3.0, 10.9.1.0, 10.10.1.1 >Reporter: Knut Anders Hatlen >Assignee: Mamta A. Satoor > Labels: derby_triage10_11 > Fix For: 10.11.1.1 > > Attachments: d6227-1a.diff > > > When working on DERBY-5840, I noticed that GroupedAggregateResultSet would > do duplicate elimination by comparing the java.lang.String representation of > the values. With territory-based collation, it is possible that two values > that have different java.lang.String representation should be considered > duplicates, and this logic will produce incorrect results. > Example: > ij version 10.10 > ij> connect > 'jdbc:derby:memory:db;territory=en_US;collation=TERRITORY_BASED:PRIMARY;create=true'; > ij> create table t(i int, s varchar(10)); > 0 rows inserted/updated/deleted > ij> insert into t values (1, 'a'), (1, 'a'), (2, 'b'), (2, 'B'), (3, 'a'), > (3, 'A'), (3, 'b'), (3, 'B'), (3, 'c'); > 9 rows inserted/updated/deleted > ij> select distinct s from t; > S > -- > b > a > c > 3 rows selected > ij> select i, count(distinct s) from t group by i; > I |2 > --- > 1 |1 > 2 |2 > 3 |5 > 3 rows selected > I would have expected the last query to return > (1, 1) > (2, 1) > (3, 3) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DERBY-6227) Distinct aggregates don't work well with territory-based collation
[ https://issues.apache.org/jira/browse/DERBY-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037220#comment-14037220 ] Knut Anders Hatlen commented on DERBY-6227: --- Thanks, Dag. That is a good question. I just happened to stumble across this problem. I'm not sure if there's any systematic way of finding other instances of it. There are not many callers of DVD.getString() left in the impl.sql.execute package now, so hopefully this was just a one-off. > Distinct aggregates don't work well with territory-based collation > -- > > Key: DERBY-6227 > URL: https://issues.apache.org/jira/browse/DERBY-6227 > Project: Derby > Issue Type: Bug > Components: SQL >Affects Versions: 10.6.1.0, 10.6.2.1, 10.7.1.1, 10.8.1.2, 10.8.2.2, > 10.8.3.0, 10.9.1.0, 10.10.1.1 >Reporter: Knut Anders Hatlen >Assignee: Knut Anders Hatlen > Labels: derby_triage10_11 > Fix For: 10.11.0.0 > > Attachments: d6227-1a.diff > > > When working on DERBY-5840, I noticed that GroupedAggregateResultSet would > do duplicate elimination by comparing the java.lang.String representation of > the values. With territory-based collation, it is possible that two values > that have different java.lang.String representation should be considered > duplicates, and this logic will produce incorrect results. > Example: > ij version 10.10 > ij> connect > 'jdbc:derby:memory:db;territory=en_US;collation=TERRITORY_BASED:PRIMARY;create=true'; > ij> create table t(i int, s varchar(10)); > 0 rows inserted/updated/deleted > ij> insert into t values (1, 'a'), (1, 'a'), (2, 'b'), (2, 'B'), (3, 'a'), > (3, 'A'), (3, 'b'), (3, 'B'), (3, 'c'); > 9 rows inserted/updated/deleted > ij> select distinct s from t; > S > -- > b > a > c > 3 rows selected > ij> select i, count(distinct s) from t group by i; > I |2 > --- > 1 |1 > 2 |2 > 3 |5 > 3 rows selected > I would have expected the last query to return > (1, 1) > (2, 1) > (3, 3) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (DERBY-6227) Distinct aggregates don't work well with territory-based collation
[ https://issues.apache.org/jira/browse/DERBY-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037180#comment-14037180 ] ASF subversion and git services commented on DERBY-6227: Commit 1603793 from [~knutanders] in branch 'code/trunk' [ https://svn.apache.org/r1603793 ] DERBY-6227: Distinct aggregates don't work well with territory-based collation Make the duplicate elimination use the DataValueDescriptors directly instead of converting the values to String objects first. This ensures that the collation rules of the database are used to compare the values, and those rules may be different from those used by String.equals(). > Distinct aggregates don't work well with territory-based collation > -- > > Key: DERBY-6227 > URL: https://issues.apache.org/jira/browse/DERBY-6227 > Project: Derby > Issue Type: Bug > Components: SQL >Affects Versions: 10.6.1.0, 10.6.2.1, 10.7.1.1, 10.8.1.2, 10.8.2.2, > 10.8.3.0, 10.9.1.0, 10.10.1.1 >Reporter: Knut Anders Hatlen >Assignee: Knut Anders Hatlen > Labels: derby_triage10_11 > Attachments: d6227-1a.diff > > > When working on DERBY-5840, I noticed that GroupedAggregateResultSet would > do duplicate elimination by comparing the java.lang.String representation of > the values. With territory-based collation, it is possible that two values > that have different java.lang.String representation should be considered > duplicates, and this logic will produce incorrect results. > Example: > ij version 10.10 > ij> connect > 'jdbc:derby:memory:db;territory=en_US;collation=TERRITORY_BASED:PRIMARY;create=true'; > ij> create table t(i int, s varchar(10)); > 0 rows inserted/updated/deleted > ij> insert into t values (1, 'a'), (1, 'a'), (2, 'b'), (2, 'B'), (3, 'a'), > (3, 'A'), (3, 'b'), (3, 'B'), (3, 'c'); > 9 rows inserted/updated/deleted > ij> select distinct s from t; > S > -- > b > a > c > 3 rows selected > ij> select i, count(distinct s) from t group by i; > I |2 > --- > 1 |1 > 2 |2 > 3 |5 > 3 rows selected > I would have expected the last query to return > (1, 1) > (2, 1) > (3, 3) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (DERBY-6227) Distinct aggregates don't work well with territory-based collation
[ https://issues.apache.org/jira/browse/DERBY-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036075#comment-14036075 ] Dag H. Wanvik commented on DERBY-6227: -- Looks like a good patch to me. Makes me wonder whether we have other places in the execution machinery that uses plain Strings instead of DVDs.. Thanks for adding that comment caveat. +1 > Distinct aggregates don't work well with territory-based collation > -- > > Key: DERBY-6227 > URL: https://issues.apache.org/jira/browse/DERBY-6227 > Project: Derby > Issue Type: Bug > Components: SQL >Affects Versions: 10.6.1.0, 10.6.2.1, 10.7.1.1, 10.8.1.2, 10.8.2.2, > 10.8.3.0, 10.9.1.0, 10.10.1.1 >Reporter: Knut Anders Hatlen >Assignee: Knut Anders Hatlen > Labels: derby_triage10_11 > Attachments: d6227-1a.diff > > > When working on DERBY-5840, I noticed that GroupedAggregateResultSet would > do duplicate elimination by comparing the java.lang.String representation of > the values. With territory-based collation, it is possible that two values > that have different java.lang.String representation should be considered > duplicates, and this logic will produce incorrect results. > Example: > ij version 10.10 > ij> connect > 'jdbc:derby:memory:db;territory=en_US;collation=TERRITORY_BASED:PRIMARY;create=true'; > ij> create table t(i int, s varchar(10)); > 0 rows inserted/updated/deleted > ij> insert into t values (1, 'a'), (1, 'a'), (2, 'b'), (2, 'B'), (3, 'a'), > (3, 'A'), (3, 'b'), (3, 'B'), (3, 'c'); > 9 rows inserted/updated/deleted > ij> select distinct s from t; > S > -- > b > a > c > 3 rows selected > ij> select i, count(distinct s) from t group by i; > I |2 > --- > 1 |1 > 2 |2 > 3 |5 > 3 rows selected > I would have expected the last query to return > (1, 1) > (2, 1) > (3, 3) -- This message was sent by Atlassian JIRA (v6.2#6252)