I looked at using BoltDB but ended up with a slightly simpler approach; I 
don't need disk persistence in this case.

I tried a radix tree implementation but found it slower than my current 
nested map with locks. With that in mind, I rewrote the nested map approach 
using the key partitioning Robert noted.

The primary differences in my approach this time were:
1) Read the whole dataset once and treat it as immutable, so no locking is 
needed.
2) Align the API so it requires no manipulation of the keys after the first 
load, so every lookup is now zero-allocation.
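A minimal sketch of those two points (the Store type and method names here 
are my own, not from the actual code): the dataset is loaded once into a 
plain nested map held in an atomic.Value, so a dev-mode reload is just a 
pointer swap, and lookups take both keys as-is with no per-call allocation.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Store holds the immutable dataset. Readers never take a lock;
// a reload replaces the whole map in one atomic swap.
type Store struct {
	v atomic.Value // holds map[string]map[string]string
}

// Load installs a complete dataset. Readers either see the old
// map or the new one, never a partial update.
func (s *Store) Load(data map[string]map[string]string) {
	s.v.Store(data)
}

// Lookup uses both keys unmodified, so nothing is allocated
// per call. A missing language or key returns ok == false.
func (s *Store) Lookup(lang, key string) (string, bool) {
	m := s.v.Load().(map[string]map[string]string)
	val, ok := m[lang][key]
	return val, ok
}

func main() {
	var s Store
	s.Load(map[string]map[string]string{
		"en": {"greeting": "Hello"},
		"fr": {"greeting": "Bonjour"},
	})
	v, _ := s.Lookup("fr", "greeting")
	fmt.Println(v) // Bonjour
}
```

Indexing a nil inner map is safe in Go (it just misses), so an unknown 
language falls through to ok == false without any extra checks.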

The structure is a simple map[string]map[string]string: language -> key -> 
value. The previous implementation broke the second map down further, 
probably bringing it from a few thousand entries to a few hundred each. I 
don't know where the threshold for that benefit lies; in my experience the 
map implementation can handle tens of thousands of entries with ease.
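To illustrate why avoiding key manipulation matters, here is a hedged 
comparison (the flat map and its "|" separator are hypothetical, purely for 
contrast): a single map keyed by a combined string has to build that string 
on every lookup, which typically allocates, while the nested map indexes 
with the caller's strings as-is.

```go
package main

import "fmt"

// Hypothetical flat layout: one map keyed by "lang|key".
var flat = map[string]string{"en|home.title": "Welcome"}

// Nested layout as described above: language -> key -> value.
var nested = map[string]map[string]string{
	"en": {"home.title": "Welcome"},
}

// lookupFlat must concatenate the combined key, which typically
// allocates a new string on every call.
func lookupFlat(lang, key string) string {
	return flat[lang+"|"+key]
}

// lookupNested indexes both maps with the inputs unmodified:
// no per-call allocation.
func lookupNested(lang, key string) string {
	return nested[lang][key]
}

func main() {
	fmt.Println(lookupFlat("en", "home.title"))
	fmt.Println(lookupNested("en", "home.title"))
}
```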

This brought lookup times down by about 50%:

BenchmarkLookupLooseLazy-8 3000000 458 ns/op 32 B/op 1 allocs/op 
BenchmarkLookupLooseMap-8 5000000 265 ns/op 0 B/op 0 allocs/op
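For reference, the allocs/op number can be reproduced with a benchmark 
along these lines (a sketch only; the dataset and name are stand-ins for 
the real ones, run via go test -bench . -benchmem):

```go
package main

import "testing"

// Stand-in dataset; the real one holds the full language -> key -> value maps.
var dataset = map[string]map[string]string{
	"en": {"home.title": "Welcome"},
}

// BenchmarkLookupNestedMap measures a two-level map lookup;
// ReportAllocs makes allocs/op appear even without -benchmem.
func BenchmarkLookupNestedMap(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		_, _ = dataset["en"]["home.title"]
	}
}
```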

Let me know if you see anything else worth adjusting.

Best,
James


On Wednesday, June 14, 2017 at 2:01:44 PM UTC-7, Robert Johnstone wrote:
>
> I'm surprised that the memory overhead is significant - 100 MB is not that 
> much.
>
> Assuming that you don't need atomic updates to the entire KV store, 
> partition the keys.
>
> Does the periodic reload involve changing the keys?  If not, you could map 
> the dataset into nested structs.  However, you will still need to 
> synchronise access if you want to reload without stopping the server, 
> but that would just be the leaves.  Switching just the top-tier to a struct 
> could help with the contention.
>
>
> On Wednesday, 14 June 2017 14:45:25 UTC-4, James Pettyjohn wrote:
>>
>> I have an application which has all of its text, multiple languages, 
>> stored in XML on disk that is merged into templates on the fly. About 
>> 100MB. Templates use dozens of strings for each render. 
>>
>> Currently this is loaded in full into memory in a bunch of tiered hash 
>> maps. They are lazy-loaded and use multiple locks to perform reads but, 
>> unless in dev mode, actually don't change throughout the lifetime of the 
>> application and should be considered immutable.
>>
>> While workable at a smaller scale, it's slow at scale. The most important 
>> factor is concurrent lookup speed, secondary concern is memory overhead. 
>> And it cannot preclude periodic reload while doing dev.
>>
>> Is there a data structure or lib that suits this scenario more than 
>> others?
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.