[jira] Commented: (DERBY-2967) Single character does not match high value unicode character with collation TERRITORY_BASED

Mamta A. Satoor (JIRA) Tue, 28 Aug 2007 12:42:01 -0700

    [ 
https://issues.apache.org/jira/browse/DERBY-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523324
 ]


Mamta A. Satoor commented on DERBY-2967:
----------------------------------------

I do not have any numbers for the performance results for int array vs 
CollationElementIterator. I searched Derby dev list and found few postings 
about how constructing an array before hand may not be not necessary for a 
check which would fail within say first few characters. I also found DERBY-2699 
(performance of like in territory based collation databases may be improved by 
changing way collation elements are calculated.) which also talks about the 
same issue as your comment "One potential problem with the arrays is that the 
array is fully populated even if the value could be disqualified on the first 
character, that would seem to degrade performance, not improve it. " 

I will start investigating into using CollationElementIterator rather than an 
int array. This will fix both this Jira entry and DERBY-2699 and possibly 
DERBY-2698.

If anyone has any comments please let me know.

> Single character does not match high value unicode character with collation 
> TERRITORY_BASED
> -------------------------------------------------------------------------------------------
>
>                 Key: DERBY-2967
>                 URL: https://issues.apache.org/jira/browse/DERBY-2967
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.4.0.0
>            Reporter: Kathey Marsden
>         Attachments: TestFrench.java, TestNorway.java
>
>
> With TERRITORY_BASED collation '_' does not match  the character \uFA2D.  It 
> is the same for english or norwegian. FOR collation UCS_BASIC it matches 
> fine.  Could you tell me if this is a bug?
> Here is a program to reproduce.
> import java.sql.*;
> public class HighCharacter {
>    public static void main(String args[]) throws Exception
>    {
>    System.out.println("\n Territory no_NO");
>    Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
>    Connection conn = 
> DriverManager.getConnection("jdbc:derby:nordb;create=true;territory=no_NO;collation=TERRITORY_BASED");
>    testLikeWithHighestValidCharacter(conn);
>    conn.close();
>    System.out.println("\n Territory en_US");
>    conn = 
> DriverManager.getConnection("jdbc:derby:endb;create=true;territory=en_US;collation=TERRITORY_BASED");
>    testLikeWithHighestValidCharacter(conn);
>    conn.close();
>    System.out.println("\n Collation USC_BASIC");
>    conn = DriverManager.getConnection("jdbc:derby:basicdb;create=true");
>    testLikeWithHighestValidCharacter(conn);
>    }
> public static  void testLikeWithHighestValidCharacter(Connection conn) throws 
> SQLException {
>    Statement stmt = conn.createStatement();
>    try {
>    stmt.executeUpdate("drop table t1");
>    }catch (SQLException se)
>    {// drop failure ok.
>    }
>    stmt.executeUpdate("create table t1(c11 int)");
>    stmt.executeUpdate("insert into t1 values 1");
>  
>    // \uFA2D - the highest valid character according to
>    // Character.isDefined() of JDK 1.4;
>    PreparedStatement ps =
>    conn.prepareStatement("select 1 from t1 where '\uFA2D' like ?");
>      String[] match = { "%", "_", "\uFA2D" };
>    for (int i = 0; i < match.length; i++) {
>    System.out.println("select 1 from t1 where '\\uFA2D' like " + match[i]);
>    ps.setString(1, match[i]);
>    ResultSet rs = ps.executeQuery();
>    if( rs.next() && rs.getString(1).equals("1"))
>        System.out.println("PASS");
>    else          System.out.println("FAIL: no match");
>    rs.close();
>    }
>   }
> }
> Mamta made some comments on this issue in the following thread:
> http://www.nabble.com/Single-character-does-not-match-high-value-unicode-character-with-collation-TERRITORY_BASED.-Is-this-a-bug-tf4118767.html

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (DERBY-2967) Single character does not match high value unicode character with collation TERRITORY_BASED

Reply via email to