I'll get back to you this evening, but for now, here's a possible fix.....

You're afraid that starting n web2py processes will make your code in "initialize" 
duplicate inserts. That is the normal behaviour of db 
transactions..... if you're running three processes and you're unlucky, 
all three will "see" a clean slate and will try to insert the data.  

But what if you schedule "initialize_task_queue" itself as a repeating task?

Then you can be sure it is never processed concurrently, and that 
function alone will be in charge of reinitializing the task queue. 
I use a similar method in w2p_tvseries to enable groups of tasks that are 
already scheduled: not quite the same thing, but it should work nonetheless.
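
Something along these lines, for example (just a rough sketch: the function 
body, the period/timeout values and the file name are placeholders, and it 
assumes the usual db + Scheduler setup in a model file):

# models/scheduler.py (sketch)
from gluon.scheduler import Scheduler

def initialize_task_queue():
    # (re)insert your three unique application tasks here, in one transaction
    pass

scheduler = Scheduler(db, dict(initialize_task_queue=initialize_task_queue))

# keyed on task_name, so re-running this model just updates the single row
db.scheduler_task.update_or_insert(
    db.scheduler_task.task_name == 'initialize_task_queue',
    task_name='initialize_task_queue',
    function_name='initialize_task_queue',
    period=60,    # check once a minute
    repeats=0,    # 0 = repeat forever
    timeout=30,
)
db.commit()

The workers then make sure that only one of them executes it at any given time.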

On Monday, June 18, 2012 1:29:56 AM UTC+2, Michael Toomim wrote:
>
> To respond to your last two points:
>
> You're right that the models only run on requests... I figured if my 
> website isn't getting any usage then the tasks don't matter anyway. :P
>  
> Yes, I think there are design issues here, but I haven't found a better 
> solution. I'm very interested in hearing better overall solutions! The 
> obvious alternative is to write a standalone script that loops forever, and 
> launch it separately using something like "python web2py.py -S 
> app/controller -M -N -R background_work.py -A foo". But this requires 
> solving the following problems that are *already solved* by the scheduler:
>   • During development, restarting & reloading models as I change code
>   • Killing these background processes when I quit the server
>   • Ensuring that no more than one background process runs at a time
>
> On Wednesday, June 13, 2012 7:16:56 AM UTC-7, Niphlod wrote:
>>
>> Maybe I didn't get exactly what you need, but...... you have 3 tasks 
>> that need to be unique.
>> Also, you want to be sure that if a task crashes it doesn't remain "hung".
>>
>> This should never happen with the scheduler.... the worst case is 
>> that a worker crashes (here "crashes" means it disconnects from the 
>> database) and leaves the task status as RUNNING, but as soon as another 
>> scheduler checks whether that one is still sending heartbeats, it removes 
>> the dead worker and requeues that task.
>> If your task keeps hitting its timeout and it's a repeating task, the best 
>> practice is to raise the timeout.
>>
>> With that assured, you only need to initialize the database if someone 
>> truncates the scheduler_task table, inserting the 3 records in one transaction.
>>
>> If you need to be sure, why all the hassle when you can "prepare" the 
>> task_name column as a unique value and then do 
>> db.scheduler_task.update_or_insert(db.scheduler_task.task_name == myuniquetaskname, **task_record) ?
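>>
>> Something like this, say (a rough sketch: 'check_email', 'crunch_stats' and 
>> 'cleanup' stand in for your 3 task names, and I'm assuming each function is 
>> named like its task):
>>
>> for name in ('check_email', 'crunch_stats', 'cleanup'):
>>     db.scheduler_task.update_or_insert(
>>         db.scheduler_task.task_name == name,
>>         task_name=name,       # unique key for update_or_insert
>>         function_name=name,
>>         repeats=0,            # repeat forever
>>         period=60,
>>         timeout=600,
>>     )
>> db.commit()   # all 3 records in one transaction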
>>
>> PS: code in models gets executed on every request. What if you have no users 
>> accessing the site but you need to call initialize_task_queue? Isn't it 
>> better to insert the values and then start the workers?
>>
>> BTW: a task that needs to be running "forever" but can't be "launched" in 
>> two instances seems to suffer from some design issues, but hey, everyone 
>> should be able to do what they want ;-)
>>
>
