[go-nuts] Re: Go string and UTF-8

2019-08-20 Thread Pierre Durand
OK, thank you!

On Tuesday, August 20, 2019 at 10:34:55 UTC+2, djeg...@gmail.com wrote:
>
> On Tue, Aug 20, 2019 at 10:12 AM Pierre Durand wrote:
> >
> > I know that by convention Go strings contain UTF-8 encoded text.
>
> As I understand it, this is not entirely true -- see 
> https://blog.golang.org/strings#TOC_2. -- a string is simply a read-only 
> slice of bytes. However, there are at least two places where the language 
> spec uses UTF-8 for strings: source files are expected to be UTF-8 (so 
> string literals hold UTF-8 bytes unless they use byte escapes such as 
> \xff), and the `for range` construct decodes a string as UTF-8. Beyond 
> that, various packages (e.g. unicode/utf8) expect UTF-8 encoded strings 
> as arguments.
>
> > Is it recommended/good practice to store invalid bytes in a string?
>
> Thus the concept of _invalid bytes in a string_ doesn't really exist ;-).
>
> > The use case:
> > - compute a hash => get a []byte
> > - convert the []byte to string (this string is not UTF-8 valid)
> > - use the string as a map key
>
> I don't see any issues with this.
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/e8610962-732e-4744-9faf-c8581c444064%40googlegroups.com.


[go-nuts] Re: Go string and UTF-8

2019-08-20 Thread djego . joss
On Tue, Aug 20, 2019 at 10:12 AM Pierre Durand wrote:
>
> I know that by convention Go strings contain UTF-8 encoded text.

As I understand it, this is not entirely true -- see 
https://blog.golang.org/strings#TOC_2. -- a string is simply a read-only 
slice of bytes. However, there are at least two places where the language 
spec uses UTF-8 for strings: source files are expected to be UTF-8 (so 
string literals hold UTF-8 bytes unless they use byte escapes such as 
\xff), and the `for range` construct decodes a string as UTF-8. Beyond 
that, various packages (e.g. unicode/utf8) expect UTF-8 encoded strings as 
arguments.

> Is it recommended/good practice to store invalid bytes in a string?

Thus the concept of _invalid bytes in a string_ doesn't really exist ;-).

> The use case:
> - compute a hash => get a []byte
> - convert the []byte to string (this string is not UTF-8 valid)
> - use the string as a map key

I don't see any issues with this.
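
A rough sketch of that use case, assuming SHA-256 as the hash (the 
original question does not say which one) and a made-up helper name 
hashKey. Map keys are compared byte by byte, so whether the digest is 
valid UTF-8 makes no difference to the lookup.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// hashKey is a hypothetical helper: it returns the raw SHA-256 digest
// of data as a string. The result is arbitrary bytes, not text.
func hashKey(data []byte) string {
	sum := sha256.Sum256(data) // [32]byte
	return string(sum[:])      // copies the 32 bytes into a string
}

func main() {
	seen := map[string]int{}

	// The same input produces the same key, whatever bytes it contains.
	seen[hashKey([]byte("hello"))]++
	seen[hashKey([]byte("hello"))]++
	seen[hashKey([]byte("world"))]++

	k := hashKey([]byte("hello"))
	fmt.Println(len(k), seen[k], len(seen)) // 32 2 2
}
```

If the hash has a fixed size, the array itself (here [32]byte) is also 
comparable and can serve as the map key directly, which avoids the 
conversion. When printing such keys, use %x or encoding/hex rather 
than %s, since the bytes are not text.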
