[jira] [Updated] (LANG-607) StringUtils methods do not handle Unicode 2.0+ supplementary characters correctly.
     [ https://issues.apache.org/jira/browse/LANG-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Yandell updated LANG-607:
-------------------------------

    Fix Version/s:     (was: 3.x)
                   Patch Needed

> StringUtils methods do not handle Unicode 2.0+ supplementary characters correctly.
> ----------------------------------------------------------------------------------
>
>                 Key: LANG-607
>                 URL: https://issues.apache.org/jira/browse/LANG-607
>             Project: Commons Lang
>          Issue Type: Bug
>          Components: lang.*
>    Affects Versions: 2.5
>         Environment: java version "1.6.0_16"
> Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
> Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)
> Microsoft Windows [Version 6.0.6002]
> Apache Maven 2.2.1 (r801777; 2009-08-06 12:16:01-0700)
> Java version: 1.6.0_16
> Java home: C:\Program Files\Java\jdk1.6.0_16\jre
> Default locale: en_US, platform encoding: Cp1252
> OS name: "windows vista" version: "6.0" arch: "amd64" Family: "windows"
>            Reporter: Gary Gregory
>            Assignee: Gary Gregory
>            Priority: Minor
>             Fix For: Patch Needed
>
>         Attachments: LANG-607.diff
>
>
> The StringUtils.containsAny methods incorrectly match Unicode 2.0+ supplementary characters.
> For example, define a test fixture to be the Unicode character U+20000, where U+20000 is written in Java source as "\uD840\uDC00":
>     private static final String CharU20000 = "\uD840\uDC00";
>     private static final String CharU20001 = "\uD840\uDC01";
> You can see Unicode supplementary characters correctly implemented in the JRE call:
>     assertEquals(-1, CharU20000.indexOf(CharU20001));
> But this is broken:
>     assertEquals(false, StringUtils.containsAny(CharU20000, CharU20001));
>     assertEquals(false, StringUtils.containsAny(CharU20001, CharU20000));
> This is fine:
>     assertEquals(true, StringUtils.contains(CharU20000 + CharU20001, CharU20000));
>     assertEquals(true, StringUtils.contains(CharU20000 + CharU20001, CharU20001));
>     assertEquals(true, StringUtils.contains(CharU20000, CharU20000));
>     assertEquals(false, StringUtils.contains(CharU20000, CharU20001));
> because the method calls the JRE to perform the match.
> More than you want to know:
> - http://java.sun.com/developer/technicalArticles/Intl/Supplementary/

--
This message was sent by Atlassian JIRA
(v6.1#6144)
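
The report above does not show how containsAny performs its match, so the following is only a sketch of one plausible cause and one possible remedy, not the Commons Lang code and not the attached LANG-607.diff. It assumes the false match comes from comparing UTF-16 code units one at a time: U+20000 and U+20001 share the high surrogate \uD840, so a char-by-char scan sees a common "character". The class and method names (SupplementaryContains, containsAnyByChar, containsAnyByCodePoint) are hypothetical and exist only for this illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper for illustration; not part of Commons Lang. */
final class SupplementaryContains {

    /** Assumed faulty approach: every UTF-16 code unit is treated as a character. */
    static boolean containsAnyByChar(String str, String searchChars) {
        for (int i = 0; i < str.length(); i++) {
            // A lone surrogate in str can match the same surrogate in searchChars.
            if (searchChars.indexOf(str.charAt(i)) >= 0) {
                return true;
            }
        }
        return false;
    }

    /** Possible remedy: compare whole code points, so a surrogate pair counts once. */
    static boolean containsAnyByCodePoint(String str, String searchChars) {
        Set<Integer> wanted = new HashSet<>();
        for (int i = 0; i < searchChars.length(); ) {
            int cp = searchChars.codePointAt(i);
            wanted.add(cp);
            i += Character.charCount(cp); // advance 2 chars for a supplementary code point
        }
        for (int i = 0; i < str.length(); ) {
            int cp = str.codePointAt(i);
            if (wanted.contains(cp)) {
                return true;
            }
            i += Character.charCount(cp);
        }
        return false;
    }

    public static void main(String[] args) {
        String charU20000 = "\uD840\uDC00";
        String charU20001 = "\uD840\uDC01";
        // Code-unit scan: the shared high surrogate \uD840 produces a match.
        System.out.println(containsAnyByChar(charU20000, charU20001));
        // Code-point scan: U+20000 and U+20001 are distinct, so no match.
        System.out.println(containsAnyByCodePoint(charU20000, charU20001));
        // The pair \uD840\uDC00 decodes to the single code point U+20000.
        System.out.println(Integer.toHexString(charU20000.codePointAt(0)));
    }
}
```

The code-point version trades a little allocation (the HashSet) for simplicity; a fix inside a library would more likely check for surrogates in place. The diamond operator needs Java 7 or later, which is newer than the 1.6.0_16 runtime listed in the report.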
[jira] Updated: (LANG-607) StringUtils methods do not handle Unicode 2.0+ supplementary characters correctly.
     [ https://issues.apache.org/jira/browse/LANG-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Yandell updated LANG-607:
-------------------------------

    Fix Version/s:     (was: 3.0)
                   3.1

> StringUtils methods do not handle Unicode 2.0+ supplementary characters correctly.
> ----------------------------------------------------------------------------------
>
>                 Key: LANG-607
>                 URL: https://issues.apache.org/jira/browse/LANG-607
>             Fix For: 3.1

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (LANG-607) StringUtils methods do not handle Unicode 2.0+ supplementary characters correctly.
     [ https://issues.apache.org/jira/browse/LANG-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Yandell updated LANG-607:
-------------------------------

Moving to 3.1 as not a backwards incompatibility.

> StringUtils methods do not handle Unicode 2.0+ supplementary characters correctly.
> ----------------------------------------------------------------------------------
>
>                 Key: LANG-607
>                 URL: https://issues.apache.org/jira/browse/LANG-607
>             Fix For: 3.1
[jira] Updated: (LANG-607) StringUtils methods do not handle Unicode 2.0+ supplementary characters correctly.
     [ https://issues.apache.org/jira/browse/LANG-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Gregory updated LANG-607:
------------------------------

    Summary: StringUtils methods do not handle Unicode 2.0+ supplementary characters correctly.  (was: StringUtils methods incorrectly matches Unicode 2.0+ supplementary characters.)

> StringUtils methods do not handle Unicode 2.0+ supplementary characters correctly.
> ----------------------------------------------------------------------------------
>
>                 Key: LANG-607
>                 URL: https://issues.apache.org/jira/browse/LANG-607
>             Fix For: 3.0