Re: Large Queryset Calculation In Background?

Andre Terra Tue, 22 Nov 2011 16:52:09 -0800

You will definitely need to look into caching those results, perhaps
"permanently" in a database.


My recommendation is redis[1] and possibly tools like sebleier's
django-redis-cache[2]. Cache invalidation is a pain, I know, but it's
pretty much the only way to go.
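
A minimal sketch of the caching pattern described above, assuming a cache object that exposes get/set/delete in the style of Django's django.core.cache.cache. DictCache, cache_key, get_calculated and invalidate are hypothetical names, used here only so the example runs without a Redis server; a real setup would pass in the configured cache backend instead.

```python
class DictCache:
    """In-memory stand-in with get/set/delete, mimicking the shape of Django's cache API."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        # The timeout is accepted for API compatibility but ignored by this stand-in.
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


def cache_key(thing_id):
    # Hypothetical key format; any stable, unique string works.
    return "thing:%s:calculated" % thing_id


def get_calculated(cache, thing_id, compute, timeout=24 * 60 * 60):
    """Return the cached result for thing_id, computing and storing it on a miss."""
    key = cache_key(thing_id)
    result = cache.get(key)
    if result is None:  # note: a legitimately None result would be recomputed each time
        result = compute(thing_id)
        cache.set(key, result, timeout)
    return result


def invalidate(cache, thing_id):
    """Drop the cached result, e.g. after the underlying Thing has changed."""
    cache.delete(cache_key(thing_id))
```

The invalidation call is the painful part: it has to run wherever the underlying data changes, otherwise readers keep seeing stale results until the timeout expires.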

Long term, you will need to profile the bottlenecks and dive into the
Django-generated SQL to see whether you can tune it by refactoring, and
possibly by switching to .raw() or to custom SQL through
connection.cursor() in some cases.
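
As a rough sketch of the profiling step, the standard library's cProfile can show where the time goes before any SQL tuning. profile_call and slow_serialize are hypothetical names; slow_serialize merely stands in for the real calculation.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


def slow_serialize(n):
    # Hypothetical stand-in for the expensive per-object calculation.
    return sum(len(str(i)) for i in range(n))


if __name__ == "__main__":
    _, report = profile_call(slow_serialize, 100000)
    print(report)
```

For the SQL itself, str(queryset.query) shows the statement Django builds for a queryset, and with DEBUG = True the executed statements are recorded in django.db.connection.queries.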

There are plenty of presentations from Python/Django conferences out there
that touch on the subject of ORM optimization, so don't be afraid to google.

Good luck!


Cheers,
AT


[1] http://redis.io
[2] https://github.com/sebleier/django-redis-cache

On Tue, Nov 22, 2011 at 9:04 PM, Nikolas Stevenson-Molnar <
nik.mol...@consbio.org> wrote:

>  I wouldn't expect it to lock the database (though someone with more
> database expertise should address that). I *would* expect it to consume
> significant CPU. If you're on UNIX, you could address this issue by making
> your process 'nice': http://docs.python.org/library/os.html#os.nice The
> nicer a process (the higher the value), the lower its scheduling priority,
> so it yields CPU to other processes. IIRC, nice values default to 0 and
> range from -20 (highest priority) to +19 (lowest priority) on Linux.
>
> _Nik
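
A minimal sketch of the os.nice suggestion above, assuming a UNIX-like system. lower_priority is a hypothetical helper name, and the increment of 10 is an arbitrary choice for illustration.

```python
import os


def lower_priority(increment=10):
    """Raise this process's niceness by `increment`.

    Returns the new niceness, or None where os.nice is unavailable
    (it only exists on UNIX-like platforms).
    """
    if not hasattr(os, "nice"):
        return None
    # os.nice adds the increment to the current niceness and returns the result.
    return os.nice(increment)


if __name__ == "__main__":
    print("niceness is now:", lower_priority())
    # ... run the nightly calculation here ...
```

Calling this once at the top of the cron script affects the whole process. An unprivileged process can raise its niceness but generally cannot lower it again.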
>
>
> On 11/22/2011 2:37 PM, Nan wrote:
>
> Hi folks --
>
> I need to run a fairly CPU-intensive calculation nightly over a
> dataset that's already large and growing quickly.  I'm planning to run
> this via a cron job, but would like to make sure that it neither eats
> up the entire CPU nor locks the database, so that my site can continue
> functioning in the meantime.  The rough outline of what it needs to do
> is as follows:
>
> class OtherThing(models.Model):
>     anotherthing = models.ManyToManyField(Whatever)
>     ...
>
> class Thing(models.Model):
>     other_things = models.ManyToManyField(OtherThing, through='SomethingElse')
>     ...
>
> for thing in Thing.objects.select_related('other_things', 'other_things__anotherthing__etc'):
>     # this mainly involves serialization to a great depth
>     calculated = calculation_on_thing_and_its_otherthings(thing)
>     thing.calculated_data = calculated
>     thing.save()
>
> Will the above approach lock the database for a while or eat tons of
> CPU?  Any suggestions?  I'm using Django 1.2, btw.
>
> Thanks,
> -Nan
>
>
>   --
> You received this message because you are subscribed to the Google Groups
> "Django users" group.
> To post to this group, send email to django-users@googlegroups.com.
> To unsubscribe from this group, send email to
> django-users+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/django-users?hl=en.
>
