Re: [The Java Posse] Re: Dick, that's not how you compare strings!

Amarjeet Singh Thu, 12 Aug 2010 01:10:52 -0700

And what on earth are these algorithms for string comparison then?

http://www-igm.univ-mlv.fr/~lecroq/string/index.html


Reg

On Mon, Aug 9, 2010 at 10:29 AM, Dick Wall <dickw...@gmail.com> wrote:
> I can't help but feel that the discussion has got a little bit lost in
> the rough :-). I do wish I had pulled a better example out for that
> original post, but lest anyone not remember, the point was to show how
> closures (and in particular good language support for them) greatly
> cuts boilerplate and enhances readability. I could have used an
> example with some genetic calculation code or something like that, but
> it would have needed far more supporting material. Point is, Java
> exhibits its own ugly backwaters of complexity, and they tend to be in
> features we use all the time (like anonymous inner classes).
>
> Dick
>
> On Aug 8, 3:23 pm, Reinier Zwitserloot <reini...@gmail.com> wrote:
>> So close.
>>
>> Java's own String.CASE_INSENSITIVE_ORDER uses this tactic, and as far
>> as case-insensitive tactics go, this really isn't such a bad one.
>> However, they completely bollocks it up by doing this character by
>> character for some completely unfathomable reason. This is dumb, and
>> explains why STRASSE and straße aren't equal.
>> Character.toUpperCase('\u00DF') can't very well return "SS", so it
>> returns the character unchanged, whereas calling toUpperCase() on the
>> whole string "straße" does produce "STRASSE".
>>
>> Nevertheless, as someone else has pointed out to me, both großman and
>> grossman are somewhat common German surnames and shouldn't be
>> considered equal, so, in many ways, yes, 'case insensitive' as a
>> concept doesn't really make sense beyond English.
>>
>> Doing a canonical comparison to answer the question: "Are these
>> strings most likely intended to be equal considering that they are
>> both written in language X", is completely valid though, and that's
>> exactly what java.text.Collator is for. I don't think this is mission
>> impossible. It's just crazy complicated.
>>
>> Many props to A McDowell for teaching us all about the case folding
>> rules of Unicode. I learned something new.
>>
>> On Aug 8, 9:34 am, Christian Catchpole <christ...@catchpole.net>
>> wrote:
>>
>>
>>
>> > So, without some kind of case translation dictionary that can be
>> > trusted on the particular strings we want to test, can we assume
>> > that it's not actually a solvable problem? (because, like divide by
>> > zero, the question isn't valid to start with)
>>
>> > Could you maybe get better results by (if upperCompare ||
>> > lowerCompare)?
>>
>> > Was I serious for a second there?
>>
>> > GERBILS!
>>
>> > That's better.
>
> --
> You received this message because you are subscribed to the Google Groups 
> "The Java Posse" group.
> To post to this group, send email to javapo...@googlegroups.com.
> To unsubscribe from this group, send email to 
> javaposse+unsubscr...@googlegroups.com.
> For more options, visit this group at 
> http://groups.google.com/group/javaposse?hl=en.
>
>
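For anyone who wants to see the STRASSE/straße behaviour quoted above for
themselves, here is a minimal sketch. The class name CaseCompareDemo and the
helper upperOrLowerEquals are made up for illustration; the helper is just one
reading of Christian's "upperCompare || lowerCompare" idea, not something from
the JDK. Locale.ROOT is my assumption, chosen to keep the result independent of
the machine's default locale.

```java
import java.util.Locale;

class CaseCompareDemo {
    // Hypothetical helper: one reading of the "upperCompare || lowerCompare"
    // idea from the thread. It maps the WHOLE string, so 'ß' may expand to "SS".
    static boolean upperOrLowerEquals(String a, String b) {
        return a.toUpperCase(Locale.ROOT).equals(b.toUpperCase(Locale.ROOT))
            || a.toLowerCase(Locale.ROOT).equals(b.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String upper = "STRASSE";
        String lower = "stra\u00DFe"; // straße

        // A single char cannot become two chars, so 'ß' maps to itself.
        System.out.println(Character.toUpperCase('\u00DF') == '\u00DF'); // true

        // Both of these compare character by character.
        System.out.println(String.CASE_INSENSITIVE_ORDER.compare(upper, lower) == 0); // false
        System.out.println(upper.equalsIgnoreCase(lower)); // false

        // Whole-string mapping applies the special-casing rule ß -> SS.
        System.out.println(lower.toUpperCase(Locale.ROOT)); // STRASSE
        System.out.println(upperOrLowerEquals(upper, lower)); // true
    }
}
```

Note that the helper also makes großman and grossman equal, which is exactly
the objection raised in the quoted text, so treat it as a demonstration rather
than a recommendation.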

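The java.text.Collator approach mentioned in the quoted text could look
roughly like the sketch below. CollatorDemo and sameUnder are hypothetical
names, and the choice of Locale.GERMAN and of the strength levels is my
assumption about what "written in language X" would mean here. How ß and ss
compare depends on the collation rules shipped with your JDK, so the sketch
prints those results rather than promising them.

```java
import java.text.Collator;
import java.util.Locale;

class CollatorDemo {
    // Hypothetical helper: true when the locale's collator treats the two
    // strings as equal at the given strength (Collator.PRIMARY, SECONDARY,
    // TERTIARY or IDENTICAL).
    static boolean sameUnder(Locale locale, int strength, String a, String b) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(strength);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        // PRIMARY ignores case differences; TERTIARY does not.
        System.out.println(sameUnder(Locale.GERMAN, Collator.PRIMARY, "abc", "ABC"));
        System.out.println(sameUnder(Locale.GERMAN, Collator.TERTIARY, "abc", "ABC"));

        // The cases from the thread; output depends on the JDK's German rules.
        System.out.println(sameUnder(Locale.GERMAN, Collator.PRIMARY, "STRASSE", "stra\u00DFe"));
        System.out.println(sameUnder(Locale.GERMAN, Collator.PRIMARY, "gro\u00DFman", "grossman"));
    }
}
```

The strength is the knob that decides how forgiving the comparison is, which is
presumably where the "crazy complicated" part comes in: picking the right
locale and strength for the data at hand.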


-- 
Amarjeet Singh
Phone: +91-98712-76661

