Re: Comparing equivalent UTF-8 characters across languages

2009-02-27 Thread dirk

In case anyone else is facing a similar problem, here's the code,
tested on IE 7 and Firefox 3. Note that it only supports Spanish
characters; you can add more to the charEquivalency table for other
languages.


// Returns true if container has a run of characters equivalent to sub.
private boolean contains(String container, String sub)
{
    char[] contChars = container.toCharArray();
    char[] subChars = sub.toCharArray();

    // Stop once sub no longer fits; an empty sub matches, as in String.contains.
    for(int i = 0; i + subChars.length <= contChars.length; i++)
    {
        if( equivalentFrom(contChars, i, subChars) )
        {
            return true;
        }
    }

    return false;
}

// Returns true if subChars matches contChars starting at index contI.
private boolean equivalentFrom(char[] contChars, int contI, char[] subChars)
{
    int subI = 0;
    for(; subI < subChars.length && contI < contChars.length; subI++, contI++)
    {
        if( ! equivalent(subChars[subI], contChars[contI]) )
        {
            return false;
        }
    }

    // Only a match if all of subChars was consumed.
    return subI == subChars.length;
}


// Requires: import java.util.HashMap; import java.util.Map;
// Maps each accented character to its unaccented base letter. Keying on
// the accented character lets several of them (ú, ü) share one base.
static final Map<Character, Character> charEquivalency =
    new HashMap<Character, Character>();
static
{
    charEquivalency.put( (char)0x00F1, 'n' ); // ñ
    charEquivalency.put( (char)0x00E1, 'a' ); // á
    charEquivalency.put( (char)0x00E9, 'e' ); // é
    charEquivalency.put( (char)0x00ED, 'i' ); // í
    charEquivalency.put( (char)0x00F3, 'o' ); // ó
    charEquivalency.put( (char)0x00FA, 'u' ); // ú
    charEquivalency.put( (char)0x00FC, 'u' ); // ü

    // Insert upper-case equivalents
    Map<Character, Character> upperCase = new HashMap<Character, Character>();
    for(Map.Entry<Character, Character> entry : charEquivalency.entrySet())
    {
        Character upperSp = Character.toUpperCase(entry.getKey());
        Character upperEn = Character.toUpperCase(entry.getValue());
        upperCase.put(upperSp, upperEn);
    }
    charEquivalency.putAll(upperCase);

    // Debug aid: uncomment to log the table.
    // for(Map.Entry<Character, Character> e : charEquivalency.entrySet())
    //     GWT.log(e.getKey() + " -> " + e.getValue(), null);
}

// Returns the unaccented base letter for c, or c itself if it has none.
private static char baseOf(char c)
{
    Character base = charEquivalency.get(c);
    return base == null ? c : base.charValue();
}

// Two characters are equivalent if they share a base letter, so 'n'
// matches 'ñ' and 'ú' matches 'ü'. The comparison stays case-sensitive.
private boolean equivalent(char first, char second)
{
    return baseOf(first) == baseOf(second);
}


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Google Web Toolkit" group.
To post to this group, send email to Google-Web-Toolkit@googlegroups.com
To unsubscribe from this group, send email to 
google-web-toolkit+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/Google-Web-Toolkit?hl=en
-~--~~~~--~~--~--~---



Re: Comparing equivalent UTF-8 characters across languages

2009-02-27 Thread dirk

Thanks for the response, Freller.

I'm already doing it on the server side. As I said, MySQL will do it
with the right configuration settings, so there's no need to use soundex.
When the user enters some characters in a text field, the server returns
strings that match those characters. When they enter another character,
though, I don't want to go all the way back to the server; I just
want to filter the results I already have.

I guess I'll implement it manually for the time being.
Thanks again,
Dirk
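
The client-side filtering described above could look roughly like the
sketch below. It is only an illustration: the class name
CachedSuggestionFilter, the fold() and filter() helpers, and the sample
words are all made up for this example, and it uses only String,
StringBuilder and java.util collections in the hope of staying within
what GWT client code can compile.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of GWT or of the code posted in this thread.
final class CachedSuggestionFilter {
    // Assumed fold table: accented Spanish letter -> unaccented base letter.
    private static final Map<Character, Character> BASE =
        new HashMap<Character, Character>();
    static {
        // ñ á é í ó ú ü followed by their upper-case forms
        String accented = "\u00F1\u00E1\u00E9\u00ED\u00F3\u00FA\u00FC"
                        + "\u00D1\u00C1\u00C9\u00CD\u00D3\u00DA\u00DC";
        String plain = "naeiouuNAEIOUU";
        for (int i = 0; i < accented.length(); i++) {
            BASE.put(accented.charAt(i), plain.charAt(i));
        }
    }

    // Replaces every accented letter found in the table with its base letter.
    static String fold(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            Character base = BASE.get(c);
            out.append(base == null ? c : base.charValue());
        }
        return out.toString();
    }

    // Keeps only the cached strings that contain the typed text,
    // ignoring the accents listed in the table (still case-sensitive).
    static List<String> filter(List<String> cached, String typed) {
        String needle = fold(typed);
        List<String> kept = new ArrayList<String>();
        for (String candidate : cached) {
            if (fold(candidate).indexOf(needle) >= 0) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```

With a cached list of "soñar", "sonido" and "salir", calling
filter(cached, "son") keeps the first two entries. Folding the typed
text once and each candidate once keeps the per-keystroke work small,
which is the point of not going back to the server.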




Re: Comparing equivalent UTF-8 characters across languages

2009-02-26 Thread Freller


You should do that on the server side of your app.
You should Google soundex; I'm sure you will find plenty of material
on how to do that.

Freller





Re: Comparing equivalent UTF-8 characters across languages

2009-02-26 Thread dirk

It seems that in order to do this using regular Java, I would use the
java.text.Collator class:
http://java.sun.com/j2se/1.5.0/docs/api/java/text/Collator.html

So now my question becomes simpler: is this supported, or is there an
equivalent in GWT?

Thanks,
Dirk
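
For reference, a Collator-based check in regular Java (on the JVM, not
GWT client code) might be sketched as follows. The class name
CollatorContains and its contains() helper are invented for this
example. It assumes Locale.ENGLISH, because accents there are not
primary differences; a Spanish collator is expected to treat ñ as a
letter of its own, so it would probably not give the result wanted
here. The thread does not confirm whether GWT emulates
java.text.Collator.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical helper for plain JVM code.
final class CollatorContains {
    // True if some window of container compares equal to sub at PRIMARY
    // strength, which ignores accent differences and also case.
    static boolean contains(String container, String sub) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        collator.setStrength(Collator.PRIMARY);
        int n = sub.length();
        // Fixed-width windows assume both strings use precomposed
        // characters, so equivalent text has the same length.
        for (int i = 0; i + n <= container.length(); i++) {
            if (collator.equals(container.substring(i, i + n), sub)) {
                return true;
            }
        }
        return false;
    }
}
```

Note that a Collator only compares whole strings, so the sliding window
is what turns it into a "contains" test; it is simple rather than fast.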





Comparing equivalent UTF-8 characters across languages

2009-02-26 Thread dirk

Hi there,

I'm writing an application in which I'd like to compare characters
that exist in different languages.
Specifically, I need String.contains(String s) to return true if one
string contains characters _equivalent_ to another.

For example:
"soñar".contains("on")
should return true, recognising that the 'ñ' character is equivalent
to 'n'.

My database (MySQL) can do this automatically with the right
configuration settings. I'm not sure if it's possible in Java. If so,
is it supported by GWT?

Thanks,
Dirk
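
As a point of comparison for the "is it possible in Java" part, regular
Java 6 or later can get this behaviour by decomposing each string with
java.text.Normalizer and dropping the combining marks. This is a
different technique from the lookup table posted elsewhere in the
thread, the class name AccentInsensitive and its helpers are made up
for this sketch, and it is meant for JVM code; nothing in this thread
says GWT provides java.text.Normalizer.

```java
import java.text.Normalizer;

// Hypothetical helper for plain JVM code (Java 6+).
final class AccentInsensitive {
    // Decomposes accented letters (ñ becomes n + combining tilde) and
    // then removes every combining mark, leaving the base letters.
    static String stripAccents(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    // Accent-insensitive, case-sensitive version of String.contains.
    static boolean contains(String container, String sub) {
        return stripAccents(container).contains(stripAccents(sub));
    }
}
```

Here AccentInsensitive.contains("soñar", "on") returns true, because
both sides are reduced to unaccented text before the ordinary
String.contains check. Unlike a hand-written table, this covers any
letter that Unicode decomposes into a base letter plus marks.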
