Forget about sleep, there's work to be done! I realized that the drawback I
mentioned with my solution, where concurrent calls to the memoized function
with different sets of arguments are handled one by one, can easily be
resolved by having the cache inside synced-memoize store futures for
calculating the values rather than the values themselves. A few small
changes and everything should hopefully work well (don't ask me about the
overhead, though!):
(require '[clojure.core.async :refer [chan go-loop <! >! >!! <!!]])

(defn synced-memoize [f]
  (let [mem (atom {})
        handle-ch (chan)]
    ;; All cache misses are serialized through this single go-loop. The
    ;; cache stores futures, so the loop only *starts* calculations and
    ;; never blocks waiting for them.
    (go-loop []
      (let [[args ret-ch] (<! handle-ch)]
        (>! ret-ch (if-let [e (find @mem args)]
                     (val e)
                     (let [ret (future (apply f args))]
                       (swap! mem assoc args ret)
                       ret)))
        (recur)))
    (fn [& args]
      (if-let [e (find @mem args)]
        (deref (val e))
        (let [ret (chan)]
          (>!! handle-ch [args ret])
          ;; Deref the future outside the go-loop, so slow calculations for
          ;; different argument sets can proceed in parallel.
          (deref (<!! ret)))))))
(def lookup
  (let [calc-value* (synced-memoize calc-value)]
    (fn [cache key]
      (if (contains? @cache key)
        (@cache key)
        ((swap! cache assoc key (calc-value* key)) key)))))
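For comparison, the same race can be avoided without core.async by storing delays in the atom instead of futures: the swap! function only does an assoc (pure, so retries are harmless), and the winning delay is forced outside the swap!, so the underlying function runs at most once per argument set and different argument sets never wait on each other. A minimal sketch (delay-memoize is my name for it, not a standard function):

```clojure
;; Memoization via delays stored in an atom. The swap fn is pure (just an
;; assoc), so swap! retries are harmless; a Delay's body runs at most once
;; no matter how many threads deref it.
(defn delay-memoize [f]
  (let [mem (atom {})]
    (fn [& args]
      (let [d (delay (apply f args))
            ;; Keep an existing delay if one is already cached for these
            ;; args; otherwise install ours. swap! returns the new map.
            m (swap! mem (fn [m] (if (contains? m args) m (assoc m args d))))]
        ;; Deref whichever delay won the race for this argument set.
        (deref (get m args))))))
```

Concurrent callers with the same arguments all deref the same delay, so f runs once per distinct argument set, while callers with different arguments proceed independently.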
On Monday, 1 September 2014 at 23:55:58 UTC+2, Marcus Magnusson wrote:
>
> Huh, I gave it some more thought - of course, the reason why memoize won't
> help us here is that there is no synchronization between simultaneous
> invocations with the same set of arguments. This is typically not a problem
> if the underlying function is fast, but in our case it would be neat with
> an alternative. I got the idea of writing a channel-based version of
> memoize, which will ensure that invoking the memoized function
> simultaneously with the same arguments will only call the underlying
> function once. Below is a quick implementation with one obvious drawback
> (and probably tons of bugs and points of improvement!), namely that if the
> underlying function is costly, and we invoke the memoized function with a
> bunch of different non-cached sets of arguments, then calculating the values
> will be done one-by-one. This could be fixed for example by having
> handle-ch delegate to another channel, where we have one channel per set of
> arguments - I'll leave that for someone else, or for when I've had some
> sleep and realized what a bad idea it was... :) Note that I haven't worked
> much with core.async, and that there is probably a much more
> straightforward solution (for example the one given by Thomas Heller), so
> please let me know of any issues with this:
>
>
> (defn synced-memoize [f]
>   (let [mem (atom {})
>         handle-ch (chan)]
>     (go-loop []
>       (let [[args ret-ch] (<! handle-ch)]
>         (>! ret-ch (if-let [e (find @mem args)]
>                      (val e)
>                      (let [ret (apply f args)]
>                        (swap! mem assoc args ret)
>                        ret)))
>         (recur)))
>     (fn [& args]
>       (if-let [e (find @mem args)]
>         (val e)
>         (let [ret (chan)]
>           (>!! handle-ch [args ret])
>           (<!! ret))))))
>
> (def lookup
>   (let [calc-value* (synced-memoize calc-value)]
>     (fn [cache key]
>       (if (contains? @cache key)
>         (@cache key)
>         (swap! cache assoc key (calc-value* key))))))
>
>
>
> On Monday, 1 September 2014 at 22:35:56 UTC+2, Marcus Magnusson wrote:
>>
>> I reckon if that worked, there would be no need for memoize anyway, but I
>> don't think swap! will allow for it. I'm far from an expert on swap! or
>> atoms, but several swap!s may be run simultaneously on a single atom (and
>> swap! may re-run the swapping function if the atom has been changed since
>> the swapping function was invoked). In other words, if two swap!s are
>> invoked at about the same time for the same key (which is not currently in
>> the cache), both invocations may be given the same value of the cache at
>> that moment, which will lead to the value being re-calculated in both
>> invocations - and even if calc-value is memoized, the same thing may occur
>> in the memoized function.
>>
>> As I said, I'm far from an expert, so I wrote a small test that shows
>> that calc-value may indeed be called more than once (core.async here is
>> simply to ensure that each invocation of calc-value is printed on its own
>> line):
>>
>>
>> (def out-ch (chan 100))
>>
>> (go-loop []
>>   (println (<! out-ch))
>>   (recur))
>>
>> (defn calc-value [x]
>>   (put! out-ch x)
>>   (* x x))
>>
>> (def cache (atom {}))
>>
>> (def lookup
>>   (let [calc-value* (memoize calc-value)]
>>     (fn [cache key]
>>       ((swap! cache (fn [cache']
>>                       (if (contains? cache' key)
>>                         cache'
>>                         (assoc cache' key (calc-value* key)))))
>>        key))))
>>
>>
>> => (dotimes [_ 1000]
>>      (future (lookup cache (rand-int 20))))
>> 11
>> 12
>> 16
>> 1
>> nil
>> 6
>> 15
>> 17
>> 18
>> 14
>> 2
>> 5
>> 19
>> 10
>> 0
>> 7
>> *9*
>> *4*
>> *4*
>> *13*
>> *9*
>> *13*
>>
>>
>> On Monday, 1 September 2014 at 21:32:01 UTC+2, Alex Baranosky wrote:
>>>
>>> I believe you can replace:
>>>
>>> (when-not (contains? @cache k)
>>>   (swap! cache assoc k (calc-value* k)))
>>>
>>> with:
>>>
>>> (swap! cache (fn [cache']
>>>                (if (contains? cache' k)
>>>                  cache'
>>>                  (assoc cache' k (calc-value* k)))))
>>>
>>> Note: I haven't run this code
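One caveat with this approach, easy to verify at the REPL: swap! may invoke its function more than once under contention, so the (calc-value* k) call inside it can be repeated. A small sketch that makes the retry behaviour visible (the names runs, a, and bump! are just for the demonstration):

```clojure
;; `runs` counts how many times the swap fn executes; `a` is the atom being
;; swapped. Under contention, @runs can exceed the number of swap! calls,
;; because swap! re-runs its fn whenever a compare-and-set fails.
(def runs (atom 0))
(def a (atom 0))

(defn bump! []
  ;; Side effect inside the swap fn, mirroring (calc-value* k) above.
  (swap! a (fn [v] (swap! runs inc) (inc v))))

(let [fs (doall (repeatedly 50 #(future (bump!))))]
  (doseq [f fs] (deref f)))
;; @a is always 50; @runs is at least 50, and larger if any swap! retried.
```

This is exactly why a pure swap fn (or storing a delay rather than the computed value) is safer than computing inside swap!.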
>>>
>>>
>>> On Mon, Sep 1, 2014 at 2:20 AM, Colin Fleming <[email protected]>
>>> wrote:
>>>
>>>> Hi Thomas,
>>>>
>>>> Normally I'd agree with you, but in my case it actually works quite
>>>> well since I don't need to expire or worry about sizing. This is for
>>>> caching objects on IntelliJ parse tree elements, and the element with its
>>>> cached values is thrown away every time the document is parsed, so these
>>>> are pretty transient caches. In this particular case the calculation isn't
>>>> too heavyweight either so recalculating it in the unlikely event of
>>>> contention isn't a big deal. In fact, this discussion and the related
>>>> thinking has made me realise that my original solution was actually
>>>> sufficient for my somewhat strange use case, which is nice :-). For more
>>>> typical server side caching, I agree that Guava would be a great solution.
>>>>
>>>> Cheers,
>>>> Colin
>>>>
>>>>
>>>> On 1 September 2014 20:54, Thomas Heller <[email protected]> wrote:
>>>>
>>>>> As much as I like Clojure and atoms, I do not think they are a good
>>>>> fit for caching. Not only is it impossible to address the concurrency
>>>>> issues related to multiple threads loading the same object, but you also
>>>>> have to do expiration and size management yourself. Immutability doesn't
>>>>> help much for caching either. There is core.cache, which covers some of
>>>>> this, but I would probably recommend something like CacheBuilder from the
>>>>> Guava libs:
>>>>>
>>>>> See
>>>>> https://code.google.com/p/guava-libraries/
>>>>>
>>>>> http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/cache/CacheBuilder.html
>>>>>
>>>>> It's Java, but the Clojure<->Java interop is so good that it doesn't
>>>>> matter much.
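To make that concrete, a minimal interop sketch (assuming Guava is on the classpath; the calc-value here is a stand-in for the one defined earlier in the thread):

```clojure
(import '[com.google.common.cache CacheBuilder CacheLoader])
(import '[java.util.concurrent TimeUnit])

;; Stand-in for the thread's calc-value.
(defn calc-value [x] (* x x))

;; A LoadingCache with bounded size and expiration. LoadingCache blocks
;; concurrent loaders for the same key, so the loader runs at most once
;; per key even under contention.
(def guava-cache
  (-> (CacheBuilder/newBuilder)
      (.maximumSize 1000)
      (.expireAfterWrite 10 TimeUnit/MINUTES)
      (.build (proxy [CacheLoader] []
                (load [key] (calc-value key))))))
```

Then (.get guava-cache 5) computes and caches the value on first access and returns the cached value afterwards, with expiration and eviction handled by Guava.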
>>>>>
>>>>> On Saturday, August 30, 2014 7:27:05 AM UTC+2, Colin Fleming wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I want to use a map to cache values based on a key. I'm planning to
>>>>>> use an atom for this. My basic operation is "give me the value for this
>>>>>> key" - if the value exists in the map then that value should be
>>>>>> returned,
>>>>>> otherwise a new value should be calculated, inserted in the map and then
>>>>>> returned. My plan is to implement something like the following:
>>>>>>
>>>>>>
>>>>>> (defn ensure [cache key]
>>>>>>   (if (contains? cache key)
>>>>>>     cache
>>>>>>     (assoc cache key (calc-value key))))
>>>>>>
>>>>>> (let [value (get (swap! cache ensure key) key)]
>>>>>>   ... do my thing with value ...)
>>>>>>
>>>>>>
>>>>>> So 'ensure' ensures that the cache contains the value for key, the
>>>>>> swap! operation returns the cache with the value and then I get it out.
>>>>>> This works but feels a little clumsy, is there a better way to do this?
>>>>>>
>>>>>> Also, looking at the Atom source code, I see that this will cause a
>>>>>> CAS operation even if the value returned from swap! is identical to the
>>>>>> original value. It seems like a reasonable optimisation would be to
>>>>>> check
>>>>>> if the values are identical and not update if so - is there a reason
>>>>>> this
>>>>>> might not be a good idea?
>>>>>>
>>>>>> Thanks,
>>>>>> Colin
>>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "Clojure" group.
>>>>> To post to this group, send email to [email protected]
>>>>> Note that posts from new members are moderated - please be patient
>>>>> with your first post.
>>>>> To unsubscribe from this group, send email to
>>>>> [email protected]
>>>>> For more options, visit this group at
>>>>> http://groups.google.com/group/clojure?hl=en
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "Clojure" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>>
>>>
>>>