Re: [go-nuts] Question about strings.EqualFold vs Python's casefold

2020-05-01 Thread Miki Tebeka
Thanks!

On Friday, May 1, 2020 at 8:21:48 AM UTC+3, Ian Lance Taylor wrote:
>
> On Thu, Apr 30, 2020 at 9:42 PM Miki Tebeka wrote:
> > 
> > I'm trying to find an example where strings.EqualFold returns true but 
> comparison of strings.ToLower fails. 
> > I've found this example (in Python): 
> > 
> > s1 = "der Fluß" 
> > s2 = "der Fluss" 
> > 
> > print('lower', s1.lower() == s2.lower()) 
> > print('fold ', s1.casefold() == s2.casefold()) 
> > 
> > Which prints False for lower and True for casefold. 
> > 
> > When I try the same in Go:
> >
> > package main
> >
> > import (
> > 	"fmt"
> > 	"strings"
> > )
> >
> > func main() {
> > 	s1 := "der Fluß"
> > 	s2 := "der Fluss"
> >
> > 	fmt.Println("lower", strings.ToLower(s1) == strings.ToLower(s2))
> > 	fmt.Println("fold ", strings.EqualFold(s1, s2))
> > }
> > 
> > I get false for both ToLower and EqualFold. 
> > 
> > Shouldn't Unicode folding be the same across languages? 
> > Also, does anyone have an example I can show in Go where ToLower does 
> not compare and EqualFold does? 
>
> strings.EqualFold uses simple Unicode case folding, which consists only
> of one-to-one character transformations.  Folding "ß" to "ss" is a
> one-to-many transformation; it belongs to full case folding, which is
> what Python's casefold implements, and as such is not used by
> strings.EqualFold.  For a comparison that treats them as equal, you want
> the x/text/search package, as in
>
> package main
>
> import (
> 	"fmt"
>
> 	"golang.org/x/text/language"
> 	"golang.org/x/text/search"
> )
>
> func main() {
> 	m := search.New(language.German, search.Loose)
> 	s1 := "der Fluß"
> 	s2 := "der Fluss"
> 	fmt.Println(m.EqualString(s1, s2))
> }
>
> That should print true. 
>
> An example where strings.EqualFold does not return the same result as 
> strings.ToLower(s1) == strings.ToLower(s2) is 
>
> s1 := "σς" 
> s2 := "ΣΣ" 
>
> strings.EqualFold will return true but comparing the ToLower of the 
> strings will return false.  This is because there are two lower case 
> forms of Σ (depending on position in the word), but strings.ToLower 
> can of course only return one. 
>
> Ian 
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/3e9f1ac7-fb91-44ce-9f7c-68a1f746eba1%40googlegroups.com.


Re: [go-nuts] Question about strings.EqualFold vs Python's casefold

2020-04-30 Thread Ian Lance Taylor
On Thu, Apr 30, 2020 at 9:42 PM Miki Tebeka wrote:
>
> I'm trying to find an example where strings.EqualFold returns true but 
> comparison of strings.ToLower fails.
> I've found this example (in Python):
>
> s1 = "der Fluß"
> s2 = "der Fluss"
>
> print('lower', s1.lower() == s2.lower())
> print('fold ', s1.casefold() == s2.casefold())
>
> Which prints False for lower and True for casefold.
>
> When I try the same in Go:
>
> package main
>
> import (
> 	"fmt"
> 	"strings"
> )
>
> func main() {
> 	s1 := "der Fluß"
> 	s2 := "der Fluss"
>
> 	fmt.Println("lower", strings.ToLower(s1) == strings.ToLower(s2))
> 	fmt.Println("fold ", strings.EqualFold(s1, s2))
> }
>
> I get false for both ToLower and EqualFold.
>
> Shouldn't Unicode folding be the same across languages?
> Also, does anyone have an example I can show in Go where ToLower does not 
> compare and EqualFold does?

strings.EqualFold uses simple Unicode case folding, which consists only
of one-to-one character transformations.  Folding "ß" to "ss" is a
one-to-many transformation; it belongs to full case folding, which is
what Python's casefold implements, and as such is not used by
strings.EqualFold.  For a comparison that treats them as equal, you want
the x/text/search package, as in

package main

import (
	"fmt"

	"golang.org/x/text/language"
	"golang.org/x/text/search"
)

func main() {
	m := search.New(language.German, search.Loose)
	s1 := "der Fluß"
	s2 := "der Fluss"
	fmt.Println(m.EqualString(s1, s2))
}

That should print true.
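
To see why "ß" and "ss" never compare equal under strings.EqualFold, it
can help to list the runes that one-to-one folding treats as
interchangeable.  The sketch below walks unicode.SimpleFold until it
comes back to the starting rune.  The names foldOrbit and contains are
made up for illustration; they are not standard library functions.

```go
package main

import (
	"fmt"
	"unicode"
)

// foldOrbit is a hypothetical helper: it collects every rune that
// unicode.SimpleFold considers equivalent to r, starting with r itself.
func foldOrbit(r rune) []rune {
	orbit := []rune{r}
	// SimpleFold cycles through the equivalence class and eventually
	// returns to the rune we started from.
	for next := unicode.SimpleFold(r); next != r; next = unicode.SimpleFold(next) {
		orbit = append(orbit, next)
	}
	return orbit
}

// contains reports whether rs includes r (illustrative helper).
func contains(rs []rune, r rune) bool {
	for _, x := range rs {
		if x == r {
			return true
		}
	}
	return false
}

func main() {
	// Print the folding class of each rune of interest.
	for _, r := range []rune{'ß', 's', 'Σ'} {
		fmt.Printf("%c: %c\n", r, foldOrbit(r))
	}
	// 's' is not in the class of 'ß', and a single rune can never fold
	// to the two-rune sequence "ss" in a one-to-one scheme.
	fmt.Println(contains(foldOrbit('ß'), 's'))
}
```

Because every step maps one rune to one rune, no sequence of these
folds can turn "ß" into the two-rune string "ss".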

An example where strings.EqualFold does not return the same result as
strings.ToLower(s1) == strings.ToLower(s2) is

s1 := "σς"
s2 := "ΣΣ"

strings.EqualFold will return true but comparing the ToLower of the
strings will return false.  This is because there are two lower case
forms of Σ (depending on position in the word), but strings.ToLower
can of course only return one.

Ian



[go-nuts] Question about strings.EqualFold vs Python's casefold

2020-04-30 Thread Miki Tebeka
Hi,

I'm trying to find an example where strings.EqualFold returns true but 
comparison of strings.ToLower fails.
I've found this example (in Python):

s1 = "der Fluß"
s2 = "der Fluss"

print('lower', s1.lower() == s2.lower())
print('fold ', s1.casefold() == s2.casefold())

Which prints False for lower and True for casefold.

When I try the same in Go:

package main

import (
	"fmt"
	"strings"
)

func main() {
	s1 := "der Fluß"
	s2 := "der Fluss"

	fmt.Println("lower", strings.ToLower(s1) == strings.ToLower(s2))
	fmt.Println("fold ", strings.EqualFold(s1, s2))
}

I get false for both ToLower and EqualFold.

Shouldn't Unicode folding be the same across languages?
Also, does anyone have an example I can show in Go where ToLower does not 
compare and EqualFold does?

Thanks,
Miki
