Re: How to do a counter in DB cache class?

2021-06-19 Thread 'Adam Johnson' via Django developers (Contributions to Django itself)
I would love to make it work on all backends so we could have an improved
database cache class, but haven't the time.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 

Re: How to do a counter in DB cache class?

2021-06-01 Thread 'Mike Lissner' via Django developers (Contributions to Django itself)
Wow, that's pretty great! Did you consider merging this functionality into
django itself? I hadn't seen anything related before now.

-- 
Mike Lissner
Executive Director
Free Law Project
https://free.law


Re: How to do a counter in DB cache class?

2021-06-01 Thread 'Adam Johnson' via Django developers (Contributions to Django itself)
Hi Mike!

Probabilistic culling probably is the best we can do in the DB cache, aside
from moving culling to a background task.

I wrote an implementation of this probabilistic culling in the django-mysql
cache backend (which is MySQL-only):
https://django-mysql.readthedocs.io/en/latest/cache.html#culling . This
also shows a way of pushing the cull into a background task by setting the
probability to 0 and calling the cull() method.
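
A minimal sketch of that idea, assuming a toy in-memory store in place of the database table. The class and parameter names here are hypothetical and are not Django's or django-mysql's API; the point is only that each write culls with a small probability, and that a probability of 0 leaves culling entirely to an explicit cull() call:

```python
import random


class ProbabilisticCullingCache:
    """Toy in-memory stand-in for a DB-backed cache (hypothetical names)."""

    def __init__(self, max_entries=3, cull_probability=0.01, rng=None):
        self.max_entries = max_entries
        self.cull_probability = cull_probability
        self._rng = rng or random.Random()
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        # random() is in [0.0, 1.0), so a probability of 0 never culls here.
        if self._rng.random() < self.cull_probability:
            self.cull()

    def cull(self):
        """Drop the oldest-inserted entries beyond max_entries; return how many."""
        excess = max(0, len(self._data) - self.max_entries)
        for key in list(self._data)[:excess]:
            del self._data[key]
        return excess
```

With cull_probability=0, a periodic job (cron, a task queue, or similar) would be the only caller of cull().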

Counting in the DB is a bad idea since it will put every write through a
single hot row, and if transactions are active that will even serialize
requests - ouch!
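
For illustration, the DB-counter approach would look roughly like the sketch below. The table and function names are hypothetical, and SQLite is used only so the snippet runs on its own. Every cache write has to update the same row, which is where the contention comes from on a real database server:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cache_write_counter (id INTEGER PRIMARY KEY, writes INTEGER NOT NULL)"
)
conn.execute("INSERT INTO cache_write_counter (id, writes) VALUES (1, 0)")
conn.commit()


def record_write(conn, cull_every=50):
    """Bump the single shared counter row; return True when a cull is due."""
    with conn:  # commits on success, rolls back on error
        conn.execute("UPDATE cache_write_counter SET writes = writes + 1 WHERE id = 1")
        (writes,) = conn.execute(
            "SELECT writes FROM cache_write_counter WHERE id = 1"
        ).fetchone()
    return writes % cull_every == 0
```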

An in-python counter, with a random offset per process (actually process id
*plus* thread id), could also work. But it would be trickier imo.
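
A rough sketch of what such a counter might look like, assuming thread-local state and a random starting value in place of an offset derived from the process and thread ids. The names are hypothetical, with CULL_EVERY_X borrowed from the question:

```python
import random
import threading

CULL_EVERY_X = 50  # hypothetical setting name, taken from the question

_local = threading.local()


def should_cull(cull_every=CULL_EVERY_X):
    """Return True on every `cull_every`-th call made by the current thread."""
    if not hasattr(_local, "count"):
        # Random starting point, so workers started at the same moment
        # do not all reach their cull on the same write.
        _local.count = random.randrange(cull_every)
    _local.count += 1
    if _local.count >= cull_every:
        _local.count = 0
        return True
    return False
```

Each thread in each process keeps its own count, so the cache as a whole is still culled about once per CULL_EVERY_X writes, without any shared state. The trickier part is probably worker lifecycle: counters reset whenever a process restarts, and short-lived workers may never reach a cull.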

Thanks,

Adam

On Tue, 1 Jun 2021 at 18:28, 'Mike Lissner' via Django developers
(Contributions to Django itself) wrote:

> This might be more of a Python question than a Django one, but in this
> ticket I'm hoping to make the DB cache a bit more performant by having it
> not cull stale entries *every* time somebody adds, changes, or touches a
> cache key.
>
> The idea is to use a setting or class attribute so that the cache is
> instead culled every 50 times or 1000 times or whatever. Most of this is
> easy, but is there a good way to maintain a counter of how many times a key
> has been set?
>
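> The best I've come up with is just to draw a random number, cull only
> when it hits one particular value, and let that be good enough. E.g.:
>
> from random import randint
>
> # randint() is inclusive at both ends, so 1..N gives a 1-in-N chance.
> random_number = randint(1, CULL_EVERY_X)
> if random_number == CULL_EVERY_X:
>     cull()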
>
> That's pretty sloppy, but it'd work (on average), and it'd handle things
> like counting in a multi-process setup.
>
> The other approach, I suppose, would be to keep a counter in the DB and
> just increment it as needed, but it'd be nice to do something in memory
> even if it's a little less accurate, since this counter doesn't have to be
> exact.
>
> Anybody have thoughts on this?
>
> Thanks,
>
>
> Mike
