Re: Repetitive background tasks

2006-06-26 Thread Simon Willison


On 26 Jun 2006, at 19:07, Harish Mallipeddi wrote:

> I'm wondering if someone could advise me on how to do certain periodic
> background tasks with django? For instance, if I needed to retrieve a
> list of RSS feeds daily to check for updates how would I do that?
>
> Is there a way to do this by resorting to a solution within the django
> framework and not some OS-level solution like cron jobs on Linux? I'm
> developing on Windows and would love it if the solution is
> OS-independent.

Ah. I've always done this kind of stuff with cron - that's certainly  
the gold standard for this kind of problem on Linux/Unix and  
something that's well supported by Django (since Python scripts can  
import and use Django models).
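
A minimal sketch of the kind of script cron could run for the RSS case
from the original question. It uses only the standard library. The
function names, the sample feed and the crontab line in the comment are
illustrative assumptions; a real script would fetch each feed over HTTP
and store the results through Django models.

```python
import xml.etree.ElementTree as ET

# Hypothetical crontab entry (paths are placeholders) to run this daily at 03:00:
#   0 3 * * * /usr/bin/python /path/to/check_feeds.py

# Stand-in for a feed that a real script would download.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First post</title><link>http://example.com/1</link></item>
    <item><title>Second post</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


def new_items(feed_xml, seen_links):
    """Return only the items whose link was not seen on an earlier run."""
    return [(title, link) for title, link in parse_items(feed_xml)
            if link not in seen_links]


if __name__ == "__main__":
    # Pretend the first item was already stored by yesterday's run.
    for title, link in new_items(SAMPLE_FEED, {"http://example.com/1"}):
        print(title, link)
```

Because the script is an ordinary Python program, the same file can be
launched by cron on Linux or by Scheduled Tasks on Windows.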

Hopefully someone who has actually solved this will chip in, but from  
scanning around the web it seems that the equivalent in the Windows  
world is "Scheduled Tasks". There's a thread here that might be  
useful to you:

http://weblogs.asp.net/pmarcucci/archive/2003/10/20/32662.aspx

Cheers,

Simon

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-users
-~--~~~~--~~--~--~---



Re: Repetitive background tasks

2006-06-26 Thread Joseph Heck

It really depends on how you're running your instance of Django... If
you are using mod_python, then you have the central, single process of
Python that you *could* take advantage of and run a thread from.
Perhaps not ideal, and a complete PITA to debug, but available. If
you're running Django with FLUP/FastCGI, those same options
(potentially) aren't available.

For myself, I just expect to have some required "out of band"
processing and run those things as completely external processes. In
one case, it's a system scanner, and I run it from a different machine
than the one the Django instance is served from. It has the same
codebase and accesses the central database, but the process of scanning
takes advantage of win32-specific bits (WMI), and I'm hosting the
Django instance on Linux.

There's a periodic process and a nightly process, both of which I run
with Scheduled Tasks on Windows. For Linux, it would be cron. Finding a
cross-platform scheduling system for invoking processes is, well,
tricky. cron with Cygwin if you're really insistent. I just gave up on
perfect platform parity, but then I was taking advantage of
win32-specific bits anyway.

-joe

On 6/26/06, Harish Mallipeddi <[EMAIL PROTECTED]> wrote:
> I'm wondering if someone could advise me on how to do certain periodic
> background tasks with django? For instance, if I needed to retrieve a
> list of RSS feeds daily to check for updates how would I do that?
>
> Is there a way to do this by resorting to a solution within the django
> framework and not some OS-level solution like cron jobs on Linux? I'm
> developing on Windows and would love it if the solution is
> OS-independent.



Re: Repetitive background tasks

2006-06-26 Thread Glenn Tenney

On Mon, Jun 26, 2006 at 06:07:19PM -, Harish Mallipeddi wrote:
> Is there a way to do this by resorting to a solution within the django
> framework and not some OS-level solution like cron jobs on Linux? 

As has been mentioned, using cron (or cron-like) to schedule running a
Python script is a great solution... but there are other ways that
might work for you too.

1) Within your application, write a "view" that doesn't necessarily
view anything, but instead it does all of the functionality you need
for this repetitive task... then...  Still using cron functionality,
you have some machine somewhere in the world (or on your internal LAN
with appropriate ACLs and firewalling) go to a specific URL that
executes that "view" to do the repetitive tasks.

or

2) Also within your application, write a global function that (a)
immediately checks to see if it's the first time it's invoked each day
(or whatever interval), and then, if it is the first time, (b) does
all of the repetitive tasks...  This translates to: the first click
within the desired interval automatically performs the pending
repetitive tasks.  Note: this is not always a good solution (but can
sometimes be fine), since you might have tasks that need to be done
every interval even if no one clicks your site, or the tasks run a
long time, etc.  This can be combined with #1 above in that the URL in
#1 above would only need to be some regular URL into your application
since ANY click to your application will do the repetitive task once
each interval.
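
Option 2 above can be sketched in a few lines. This is an illustration
under stated assumptions, not code from any framework: the class name
IntervalGuard and its method are made up, and the last-run time is held
in memory, so it is lost on restart and is not shared between server
processes (a real application would probably keep it in the database).

```python
import time


class IntervalGuard:
    """Run a task at most once per interval, triggered by the first caller."""

    def __init__(self, interval_seconds, clock=time.time):
        self.interval = interval_seconds
        self.clock = clock      # injectable so the behaviour can be tested
        self.last_run = None    # in-memory only; lost when the process restarts

    def run_if_due(self, task):
        """Call task() if the interval has elapsed; return True if it ran."""
        now = self.clock()
        if self.last_run is not None and now - self.last_run < self.interval:
            return False
        # Record the time before running, so a failing task is not
        # retried on every single request.
        self.last_run = now
        task()
        return True


# Usage with a fake clock standing in for real time.
fake_now = [0.0]
daily = IntervalGuard(24 * 60 * 60, clock=lambda: fake_now[0])
runs = []
daily.run_if_due(lambda: runs.append("ran"))   # first click of the day: runs
daily.run_if_due(lambda: runs.append("ran"))   # later click: skipped
```

Calling run_if_due() at the top of every view gives the "first click in
the interval does the work" behaviour, with the caveats already noted
above about idle sites and long-running tasks.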


-- 
Glenn Tenney CISSP CISM




Re: Repetitive background tasks

2006-06-26 Thread Tyson Tate


On Jun 26, 2006, at 12:43 PM, Glenn Tenney wrote:
>
> [...]
> 2) Also within your application, write a global function that (a)
> immediately checks to see if it's the first time it's invoked each day
> (or whatever interval), and then, if it is the first time, (b) does
> all of the repetitive tasks...  This translates to: the first click
> within the desired interval automatically performs the pending
> repetitive tasks.  Note: this is not always a good solution (but can
> sometimes be fine), since you might have tasks that need to be done
> every interval even if no one clicks your site, or the tasks run a
> long time, etc.  This can be combined with #1 above in that the URL in
> #1 above would only need to be some regular URL into your application
> since ANY click to your application will do the repetitive task once
> each interval. [...]

I've found a better solution with signals and the dispatcher. The
essence is that you raise a signal every time someone views the
relevant page, for instance, a "Recent Flickr Photos" page. In the
relevant models.py file, you'll have attached a function to be called
when that signal is raised. That function should check when it was
last run and, if that was more than X minutes ago, call the relevant
Flickr synchronization code, or whatever else you want it to do.
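
A rough sketch of that idea, using a tiny hand-written Signal class as a
stand-in for a dispatcher. This is not Django's actual signal API; the
names page_viewed, maybe_sync and sync_photos, and the 15-minute value
for "X", are all assumptions made for illustration.

```python
import time


class Signal:
    """A tiny stand-in for a signal dispatcher (not Django's actual API)."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self, **kwargs):
        # Call every connected receiver and collect the return values.
        return [receiver(**kwargs) for receiver in self._receivers]


page_viewed = Signal()            # hypothetical signal, raised by the view
SYNC_EVERY_SECONDS = 15 * 60      # the "X minutes"; the value is arbitrary
_last_sync = {"at": None}         # when the sync last ran, in this process
sync_log = []


def sync_photos():
    """Placeholder for the real synchronization code."""
    sync_log.append("synced")


def maybe_sync(now=None, **kwargs):
    """Receiver: run the sync only if it has not run in the last X minutes."""
    now = time.time() if now is None else now
    last = _last_sync["at"]
    if last is None or now - last > SYNC_EVERY_SECONDS:
        _last_sync["at"] = now
        sync_photos()
        return True
    return False


# This registration would live next to the models.
page_viewed.connect(maybe_sync)


def recent_photos_view(now=None):
    """Hypothetical view: raises the signal, then builds the page."""
    page_viewed.send(now=now)
    return "page body"
```

The now argument exists only so the timing logic can be exercised
without waiting; a real view would simply send the signal.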

I'm using that method on a current site and it works very well. On  
top of that, it doesn't rely on OS-specific tools to work.

Maybe that would be a nice contrib app: a "fake" cron using the above  
method.

Just a thought, anyways.

-Tyson




Re: Repetitive background tasks

2006-06-26 Thread Aaron

Embrace the OS.

What's wrong with OS-specific tools anyway?  If you are developing on
Windows and deploying to Unix then you can use something like Cygwin to
get cron in there.

When I code in Java I use the excellent Quartz scheduler, but then I
have to use my home-grown monitoring tools to ensure that it's working.
When I code in Python I can use cron and let the sysadmin deal with
failed jobs; he has to watch a bunch of other cron jobs anyway.

But although it is easier to deploy django in a single self-contained
app I would look to the OS to give you what it can.  It may save you
some headaches later.

There are two types of scheduled processes, though: ones based on clock
time (like running the credit card transactions at 11:59) and ones
based on application performance (like cleaning the cache or sessions).

For timely events (like billing) I would stick to the most reliable and
monitorable scheduling tool that you have.  For other apps the signals
trick or other 'in-process' systems will work.  I recently helped a
friend debug a PHP app where his in-process RSS scraper had a subtle bug
that was bringing the server to its knees.

I was just having this debate with the Java folk and I have come to see
it as almost a cultural thing.  I *like* the fact that Python sits close
to the OS, and I *like* the fact that Java is completely hosted in its
own library-rich VM.  Then again, when I develop in MS I *like* Visual
Studio and I *like* MSSQL. (When doing Python or Java I am free to
dislike MSSQL, though.)

As the previous email said:
Just a thought, anyways.

-Aaron





Re: Repetitive background tasks

2006-06-27 Thread Malcolm Tredinnick

On Mon, 2006-06-26 at 13:28 -0700, Tyson Tate wrote:
> 
> On Jun 26, 2006, at 12:43 PM, Glenn Tenney wrote:
> >
> > [...]
> > 2) Also within your application, write a global function that (a)
> > immediately checks to see if it's the first time it's invoked each day
> > (or whatever interval), and then, if it is the first time, (b) does
> > all of the repetitive tasks...  This translates to: the first click
> > within the desired interval automatically performs the pending
> > repetitive tasks.  Note: this is not always a good solution (but can
> > sometimes be fine), since you might have tasks that need to be done
> > every interval even if no one clicks your site, or the tasks run a
> > long time, etc.  This can be combined with #1 above in that the URL in
> > #1 above would only need to be some regular URL into your application
> > since ANY click to your application will do the repetitive task once
> > each interval. [...]
> 
> I've found a better solution with signals and the dispatcher. The  
> essence is that you raise a signal every time someone views the  
> relevant page, for instance, a "Recent Flickr Photos" page. In the  
> relevant models.py file, you'll have attached a function call to be  
> called on the raising of that signal. That function should check when  
> the last time it was run, if it was run more than X minutes ago, have  
> it call the relevant Flickr synchronization code, or whatever else  
> you want it to do.

Not a good idea if the process takes any significant amount of time to
run (such as, for example, retrieving things off the Internet). It will
block the response back to the user.

Malcolm






Re: Repetitive background tasks

2006-06-27 Thread Jay Parlar

On 6/27/06, Malcolm Tredinnick <[EMAIL PROTECTED]> wrote:
>
> On Mon, 2006-06-26 at 13:28 -0700, Tyson Tate wrote:
> > I've found a better solution with signals and the dispatcher. The
> > essence is that you raise a signal every time someone views the
> > relevant page, for instance, a "Recent Flickr Photos" page. In the
> > relevant models.py file, you'll have attached a function call to be
> > called on the raising of that signal. That function should check when
> > the last time it was run, if it was run more than X minutes ago, have
> > it call the relevant Flickr synchronization code, or whatever else
> > you want it to do.
>
> Not a good idea if the process takes any significant amount of time to
> run (such as, for example, retrieving things off the Internet). It will
> block the response back to the user.


You could, though, have a page that sends an XMLHttpRequest to a
different view, which can do the processing. Then the user won't
experience any blocking at all.

Jay P.




Re: Repetitive background tasks

2006-06-27 Thread canen

You could use kronos.py, found here:
http://snakelets.cvs.sourceforge.net/snakelets/Plugins/scheduler/kronos.py?view=markup
TurboGears uses it for its scheduler. You can find the TG
implementation here:
http://trac.turbogears.org/turbogears/browser/branches/1.0/turbogears/scheduler.py?rev=1363





Re: Repetitive background tasks

2006-06-27 Thread Harish Mallipeddi

Thanks everyone for the replies.

Anyways I looked at the kronos scheduler (pointed out by Canen above)
being used in Turbogears and what I wanted is more like this.

So currently the way I'm using this is to put kronos.py into the django
utils directory and then import it from there. In a particular view,
I start a threaded scheduler if it hasn't been started yet and I assign
the repetitive task to it. This way the view's response is
instantaneous even the first time because the task will be running in a
separate thread. This seems to work for me!
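
The "start it once from a view" pattern can be sketched with the
standard library alone. This does not use or reproduce the kronos.py
API; the function names below are hypothetical, and the loop is a plain
threading-based stand-in for a real scheduler. Each server process
would get its own thread, so whether this runs exactly once depends on
how the site is deployed.

```python
import threading

_scheduler_lock = threading.Lock()
_scheduler_thread = None
_stop_event = threading.Event()


def start_scheduler_once(task, interval_seconds):
    """Start the background thread on the first call; later calls do nothing.

    Returns True if this call started the thread.
    """
    global _scheduler_thread
    with _scheduler_lock:
        if _scheduler_thread is not None:
            return False

        def loop():
            # Event.wait() returns False on timeout and True once a stop is
            # requested, so the task first runs one interval after startup.
            while not _stop_event.wait(interval_seconds):
                task()

        # A daemon thread does not keep the process alive at shutdown.
        _scheduler_thread = threading.Thread(target=loop, daemon=True)
        _scheduler_thread.start()
        return True


def stop_scheduler():
    """Ask the loop to exit and wait for the thread to finish."""
    _stop_event.set()
    if _scheduler_thread is not None:
        _scheduler_thread.join()


def some_view(task, interval_seconds=24 * 60 * 60):
    """Hypothetical view: kicks off the scheduler, then responds at once."""
    start_scheduler_once(task, interval_seconds)
    return "response"
```

The lock matters because two requests may arrive at the same moment on a
threaded server; without it both could start a scheduler thread.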

Anyways I think it will be useful if Django has a scheduler like this
one included by default like TurboGears (actually it does not seem to
require much effort; just have to reuse the code in kronos.py).

Cheers,
Harish

