Re: [go-nuts] Data Structure for String Interning?

2022-01-26 Thread Jesper Louis Andersen
On Sun, Jan 9, 2022 at 11:52 PM jlfo...@berkeley.edu wrote:
> I'm wondering if there's a map-like data structure that would store a string
> as the key, and the address of the key as the value. I'm aware that a standard
> Go map can't be used for this because its components might be moved a

Re: [go-nuts] Data Structure for String Interning?

2022-01-09 Thread Bakul Shah
> On Jan 9, 2022, at 6:15 PM, burak serdar wrote:
>
> I think the question is "why do you want to intern strings". The solution can
> be different for different use cases.

Indeed. This is a given! Such tricks should only be tried if the benefits outweigh the cost, as per profiling expected us

Re: [go-nuts] Data Structure for String Interning?

2022-01-09 Thread burak serdar
I think the question is "why do you want to intern strings". The solution can be different for different use cases. I work with large JSON files where keys are repeated, and the keys are long strings (URLs), so using a map[string]string to intern the keys makes a lot of sense. For serialization, th
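
The map[string]string approach described above could be sketched roughly as follows. This is only an illustration: the Interner type, its method names, and the example URLs are hypothetical and not taken from the thread, and strings.Clone assumes Go 1.18 or newer.

```go
package main

import (
	"fmt"
	"strings"
)

// Interner is a hypothetical helper (not from the thread) that keeps one
// canonical copy of each distinct string in a map[string]string.
type Interner struct {
	m map[string]string
}

// NewInterner returns an empty Interner.
func NewInterner() *Interner {
	return &Interner{m: make(map[string]string)}
}

// Intern returns the canonical copy of s, recording s the first time it is seen.
func (in *Interner) Intern(s string) string {
	if c, ok := in.m[s]; ok {
		return c
	}
	// Clone so the map does not keep a larger buffer alive when s is a
	// substring of, say, a whole decoded document.
	s = strings.Clone(s)
	in.m[s] = s
	return s
}

// Len reports how many distinct strings have been interned.
func (in *Interner) Len() int { return len(in.m) }

func main() {
	in := NewInterner()
	keys := []string{
		"https://example.com/schema/name",
		"https://example.com/schema/id",
		"https://example.com/schema/name", // repeated key
	}
	for _, k := range keys {
		_ = in.Intern(k)
	}
	fmt.Println(in.Len()) // prints 2
}
```

Whether this pays off depends on how long the keys are and how often they repeat, so it is worth profiling before adopting it.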

Re: [go-nuts] Data Structure for String Interning?

2022-01-09 Thread Bakul Shah
The string header will be 16 bytes on 64-bit machines. If most of the words are much shorter, interning won’t buy you much. For applications where you *know* all the words are short, and the total string space won’t exceed 4 GB, you can try other alternatives. For instance if the max length
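
The header size mentioned above can be checked with a few lines of Go. This is a small sketch: the helper names headerSize and wordSize are made up for illustration, and the two-word layout (pointer plus length) is how the current gc toolchain represents strings rather than something the language specification promises.

```go
package main

import (
	"fmt"
	"unsafe"
)

// wordSize returns the size in bytes of a machine word (a uintptr).
func wordSize() uintptr {
	return unsafe.Sizeof(uintptr(0))
}

// headerSize returns the size in bytes of a string value itself, that is,
// the header (data pointer + length), not the bytes it points to.
func headerSize() uintptr {
	var s string
	return unsafe.Sizeof(s)
}

func main() {
	// On a 64-bit platform this is expected to report a 16-byte header
	// made of two 8-byte words.
	fmt.Printf("header: %d bytes, word: %d bytes\n", headerSize(), wordSize())
}
```

Every reference to an interned string still costs one header, which is why interning saves little when the strings themselves are shorter than the header.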

Re: [go-nuts] Data Structure for String Interning?

2022-01-09 Thread burak serdar
On Sun, Jan 9, 2022 at 4:30 PM jlfo...@berkeley.edu wrote:
> On Sunday, January 9, 2022 at 3:07:18 PM UTC-8 bse...@computer.org wrote:
>> Note that with a map[string]string, the code:
>>
>> m[s]=s
>>
>> The contents of the string s are not duplicated, only the string header s
>> is.
>>
>

Re: [go-nuts] Data Structure for String Interning?

2022-01-09 Thread 'Axel Wagner' via golang-nuts
You might be interested in this package: https://pkg.go.dev/go4.org/intern It is probably your best choice for a long-term maintained implementation of this concept. Matt Layher explained the rationale and design here: https://mdlayher.com/blog/unsafe-string-interning-in-go/ and Brad Fitzpatrick wr

Re: [go-nuts] Data Structure for String Interning?

2022-01-09 Thread jlfo...@berkeley.edu
On Sunday, January 9, 2022 at 3:07:18 PM UTC-8 bse...@computer.org wrote:
> Note that with a map[string]string, the code:
>
> m[s]=s
>
> The contents of the string s are not duplicated, only the string header s
> is.

I didn't know this. That's very good to know. What about this: package

Re: [go-nuts] Data Structure for String Interning?

2022-01-09 Thread burak serdar
Note that with a map[string]string, in the code:

    m[s]=s

the contents of the string s are not duplicated; only the string header is.

On Sun, Jan 9, 2022 at 3:52 PM jlfo...@berkeley.edu wrote:
> I'm aware of Artem Krylysov's idea for string interning published on
> https://artem.krylysov.com/bl

[go-nuts] Data Structure for String Interning?

2022-01-09 Thread jlfo...@berkeley.edu
I'm aware of Artem Krylysov's idea for string interning published on
https://artem.krylysov.com/blog/2018/12/12/string-interning-in-go/

If I understand it correctly, each string is stored twice in a map, once as a key and once as a value. That means that words that only appear once in his example