Re: Comparing equivalent UTF-8 characters across languages
In case anyone else is facing a similar problem, here's the code, tested on IE 7 and Firefox 3. Note that this only supports Spanish characters; you can add more to the charEquivalency table for other languages.

    import java.util.HashMap;
    import java.util.Map;

    private boolean contains(String container, String sub) {
        char[] contChars = container.toCharArray();
        char[] subChars = sub.toCharArray();
        for (int i = 0; i < contChars.length; i++) {
            if (equivalentFrom(contChars, i, subChars)) {
                return true;
            }
        }
        return false;
    }

    private boolean equivalentFrom(char[] contChars, int contI, char[] subChars) {
        int subI = 0;
        for (; subI < subChars.length && contI < contChars.length; subI++, contI++) {
            if (!equivalent(subChars[subI], contChars[contI])) {
                return false;
            }
        }
        // It is a match only if every character of sub was consumed
        return subI == subChars.length;
    }

    // Maps each accented character to its unaccented equivalent.
    // Keyed by the accented character so that both ú and ü can map to 'u'.
    static Map<Character, Character> charEquivalency = new HashMap<Character, Character>();
    static {
        charEquivalency.put((char) 0x00F1, 'n'); // ñ
        charEquivalency.put((char) 0x00E1, 'a'); // á
        charEquivalency.put((char) 0x00E9, 'e'); // é
        charEquivalency.put((char) 0x00ED, 'i'); // í
        charEquivalency.put((char) 0x00F3, 'o'); // ó
        charEquivalency.put((char) 0x00FA, 'u'); // ú
        charEquivalency.put((char) 0x00FC, 'u'); // ü

        // Insert upper-case equivalents
        Map<Character, Character> upperCase = new HashMap<Character, Character>();
        for (Map.Entry<Character, Character> entry : charEquivalency.entrySet()) {
            Character upperSp = Character.toUpperCase(entry.getKey());
            Character upperEn = Character.toUpperCase(entry.getValue());
            upperCase.put(upperSp, upperEn);
        }
        charEquivalency.putAll(upperCase);
    }

    private boolean equivalent(char first, char second) {
        if (first == second) return true;
        // Reduce both characters to their unaccented form before comparing
        Character firstEquiv = charEquivalency.get(first);
        Character secondEquiv = charEquivalency.get(second);
        char firstBase = (firstEquiv == null) ? first : firstEquiv;
        char secondBase = (secondEquiv == null) ? second : secondEquiv;
        return firstBase == secondBase;
    }

On Feb 27, 3:44 pm, dirk wrote:
> Thanks for the response Freller.
>
> I already am doing it on the server side, as I say MySQL will do it
> with the right configuration settings, no need to use soundex. When
> the user enters some characters in a text field, the server returns
> strings that match those characters, but then when they enter another
> character I don't want to go all the way back to the server, I just
> want to filter the results I already have.
>
> I guess I'll implement it manually for the time being.
> Thanks again,
> Dirk

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups "Google Web Toolkit" group.
To post to this group, send email to Google-Web-Toolkit@googlegroups.com
To unsubscribe from this group, send email to google-web-toolkit+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/Google-Web-Toolkit?hl=en
-~--~~~~--~~--~--~---
Re: Comparing equivalent UTF-8 characters across languages
Thanks for the response, Freller.

I am already doing it on the server side; as I said, MySQL will do it with the right configuration settings, so there is no need to use Soundex. When the user enters some characters in a text field, the server returns strings that match those characters. When they then enter another character, I don't want to go all the way back to the server; I just want to filter the results I already have.

I guess I'll implement it manually for the time being.

Thanks again,
Dirk

On Feb 26, 9:22 pm, Freller wrote:
> You should do that on the server side of your app.
> You should Google soundex, I'm sure you will find plenty of material
> on how to do that.
>
> Freller
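The client-side filtering described in this message (narrowing the results already fetched when one more character is typed) might look roughly like the sketch below. The class name CachedResultFilter and the methods fold and filter are hypothetical, made up for illustration; it only folds the Spanish accented letters and uses nothing from java.text, on the assumption that client-side code should stick to basic JRE classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the thread: filters results already
// held on the client without another round trip to the server.
public class CachedResultFilter {

    // Folds a Spanish accented letter to its unaccented base letter.
    // Any other character is returned unchanged.
    static char fold(char c) {
        switch (c) {
            case '\u00F1': return 'n'; // n with tilde
            case '\u00E1': return 'a'; // a with acute
            case '\u00E9': return 'e'; // e with acute
            case '\u00ED': return 'i'; // i with acute
            case '\u00F3': return 'o'; // o with acute
            case '\u00FA':             // u with acute
            case '\u00FC': return 'u'; // u with diaeresis
            case '\u00D1': return 'N';
            case '\u00C1': return 'A';
            case '\u00C9': return 'E';
            case '\u00CD': return 'I';
            case '\u00D3': return 'O';
            case '\u00DA':
            case '\u00DC': return 'U';
            default: return c;
        }
    }

    // Folds every character of a string.
    static String fold(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            sb.append(fold(s.charAt(i)));
        }
        return sb.toString();
    }

    // Keeps only the cached results that contain the typed text,
    // treating accented and unaccented letters as equivalent.
    static List<String> filter(List<String> cached, String typed) {
        String needle = fold(typed);
        List<String> kept = new ArrayList<String>();
        for (String candidate : cached) {
            if (fold(candidate).indexOf(needle) >= 0) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```

Folding both strings once and then using a plain indexOf keeps the comparison case-sensitive, like String.contains; lower-casing both sides first would be an easy extension if that is wanted.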
Re: Comparing equivalent UTF-8 characters across languages
You should do that on the server side of your app. You should Google Soundex; I'm sure you will find plenty of material on how to do that.

Freller

On Feb 26, 6:14 pm, dirk wrote:
> It seems that in order to do this using regular Java, I would use the
> java.text.Collator class:
> http://java.sun.com/j2se/1.5.0/docs/api/java/text/Collator.html
>
> So now my question becomes simpler, is this supported, or is there an
> equivalent in GWT?
>
> Thanks,
> Dirk
Re: Comparing equivalent UTF-8 characters across languages
It seems that in order to do this using regular Java, I would use the java.text.Collator class:
http://java.sun.com/j2se/1.5.0/docs/api/java/text/Collator.html

So now my question becomes simpler: is this supported, or is there an equivalent in GWT?

Thanks,
Dirk

On Feb 26, 4:59 pm, dirk wrote:
> Hi there,
>
> I'm writing an application in which I'd like to compare characters
> that exist in different languages.
> Specifically, I need String.contains(String s) to return true if one
> string contains characters _equivalent_ to another.
>
> For example:
> "soñar".contains("on")
> should return true, recognising that the 'ñ' character is equivalent
> to 'n'.
>
> My database (MySQL) can do this automatically with the right
> configuration settings. I'm not sure if it's possible in Java. If so,
> is it supported by GWT?
>
> Thanks,
> Dirk
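To illustrate the java.text.Collator idea raised in this message, here is a rough sketch for regular Java on a JVM. The class CollatorContains and its methods are hypothetical names. The choice of Locale.ENGLISH is an assumption: a Spanish-locale collator may well treat ñ as a letter distinct from n, which would defeat the purpose. The sketch also assumes precomposed characters, so that the matching window has the same length as the search string. As far as I know java.text is not available in GWT client code, so this would only help on the server.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical helper, not from the thread: a contains() check built
// on Collator comparisons at PRIMARY strength.
public class CollatorContains {

    // Builds a collator that ignores accent and case differences.
    // Assumption: English rules; Spanish rules may keep n-tilde distinct.
    static Collator accentInsensitive() {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        collator.setStrength(Collator.PRIMARY);
        return collator;
    }

    // True if some window of container, as long as sub, compares equal
    // to sub according to the given collator.
    static boolean containsPrimary(Collator collator, String container, String sub) {
        int n = sub.length();
        for (int i = 0; i + n <= container.length(); i++) {
            if (collator.equals(container.substring(i, i + n), sub)) {
                return true;
            }
        }
        return false;
    }
}
```

Note that PRIMARY strength ignores case as well as accents, so this is looser than the case-sensitive String.contains asked for at the start of the thread.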
Comparing equivalent UTF-8 characters across languages
Hi there,

I'm writing an application in which I'd like to compare characters that exist in different languages. Specifically, I need String.contains(String s) to return true if one string contains characters _equivalent_ to another.

For example:
"soñar".contains("on")
should return true, recognising that the 'ñ' character is equivalent to 'n'.

My database (MySQL) can do this automatically with the right configuration settings. I'm not sure if it's possible in Java. If so, is it supported by GWT?

Thanks,
Dirk
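For plain Java outside GWT, the behaviour asked for here could be sketched with java.text.Normalizer (available from Java 6): decompose each string so that accents become separate combining marks, strip the marks, then use the ordinary String.contains. The class AccentFolding and its method names are hypothetical, and this is an illustration of one possible approach rather than anything the thread settled on.

```java
import java.text.Normalizer;

// Hypothetical helper, not from the thread: accent-insensitive contains()
// for regular Java, based on Unicode canonical decomposition.
public class AccentFolding {

    // Decomposes accented letters (NFD) and removes the combining marks,
    // leaving only the base letters.
    static String stripAccents(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // True if container contains sub once accents are ignored on both sides.
    static boolean containsEquivalent(String container, String sub) {
        return stripAccents(container).contains(stripAccents(sub));
    }
}
```

Unlike a hand-written table, this is not limited to Spanish, but it also removes marks that some languages treat as significant, so whether that is acceptable depends on the application.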