I have a few things that need clarification, and I'm also seeing some odd 
behavior with the scheduler.  I'm using my app's db instance (MySQL) for 
the scheduler.

At the bottom of scheduler.py:

from gluon.scheduler import Scheduler

scheduler = Scheduler(db, heartbeat=3)



I start my workers like this:

Head node:

python web2py.py -K myapp:upload,myapp:upload,myapp:upload,myapp:upload,\
myapp:upload,myapp:download,myapp:download,myapp:download,myapp:download,\
myapp:download,myapp:head_monitorQ

On each of the 5 compute nodes:

GROUP0="myapp:"$HOSTNAME"_comp_0:compQ"
GROUP1="myapp:"$HOSTNAME"_comp_1:compQ"
GROUP2="myapp:"$HOSTNAME"_comp_2:compQ"
GROUP3="myapp:"$HOSTNAME"_comp_3:compQ"
GROUP4="myapp:"$HOSTNAME"_comp_4:compQ"
GROUP5="myapp:"$HOSTNAME"_comp_5:compQ"
GROUP6="myapp:"$HOSTNAME"_comp_6:compQ"
MON="myapp:"$HOSTNAME"_monitorQ"

python web2py.py -K \
$GROUP0,$GROUP1,$GROUP2,$GROUP3,$GROUP4,$GROUP5,$GROUP6,$MON


The head node has 5 "upload" and 5 "download" processes.  Each compute 
node has 7 "compQ" processes that do the actual work.  The hostname-based 
group names are unique so I can remotely manage the workers on each node. 
The monitorQ workers run a task every 30s to provide hardware monitoring 
for my application.
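
For context, the monitoring task is queued roughly like this (the function 
name and group name are just illustrative; period=30 with repeats=0 is what 
gives the every-30s behavior):

# Sketch only: how the recurring monitoring task gets queued.
# 'collect_hw_stats' is a placeholder for my actual task function.
scheduler.queue_task(
    'collect_hw_stats',
    group_name='node01_monitorQ',   # the $HOSTNAME-based MON group
    period=30,                      # run every 30 seconds
    repeats=0,                      # 0 = repeat indefinitely
)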


1) I need to dynamically enable/disable workers to match the available 
hardware.  I was hoping to do this with the disable/resume commands, but 
the behavior isn't what I had hoped for (though I think it is what was 
intended).  I would like to send a command that stops a worker from being 
assigned or picking up jobs until a resume is issued.  From the docs and 
from experimenting, it looks like all disable does is sleep the worker for 
a little while, after which it gets right back to work.  To get my desired 
behavior I currently issue a terminate command, but then I need to SSH 
into each compute node and restart the workers when I want to scale back 
up, which works but is less than ideal.
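
For reference, this is roughly how I'm issuing those commands today from a 
controller (the group names are examples for one compute node, and I'm 
assuming the scheduler's disable/resume/terminate helpers that filter by 
group name):

# Sketch only: the worker-management calls I'm using.
comp_groups = ['node01_comp_%s' % i for i in range(7)]

scheduler.disable(group_names=comp_groups)    # what I hoped would park the workers
scheduler.resume(group_names=comp_groups)     # ...and bring them back later

scheduler.terminate(group_names=comp_groups)  # what I actually do now, followed by
                                              # SSHing in to restart the workers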

*Is there any way to "toggle" a worker into a disabled state?*




2) A previous post from Niphlod explains the worker assignment:

> A QUEUED task is not picked up by a worker, it is first ASSIGNED to a 
> worker that can pick up only the ones ASSIGNED to him. The "assignment" 
> phase is important because:
> - the group_name parameter is honored (task queued with the group_name 
> 'foo' gets assigned only to workers that process 'foo' tasks (the 
> group_names column in scheduler_workers))
> - DISABLED, KILL and TERMINATE workers are "removed" from the assignment 
> alltogether 
> - in multiple workers situations the QUEUED tasks are split amongst 
> workers evenly, and workers "know in advance" what tasks they are allowed 
> to execute (the assignment allows the scheduler to set up n "independant" 
> queues for the n ACTIVE workers)


This is an issue for me because my tasks do not have a uniform run time: 
some jobs take 4 minutes while others take 4 hours.  I keep getting into 
situations where one node is sitting there with plenty of idle workers 
that apparently have no tasks to pick up, while another node is chugging 
along with a backlog of assigned tasks.  Sometimes a single worker on a 
node is even left with all of the assigned tasks while the other workers 
on that node sit idle.

*Is there any built-in way to periodically force a reassignment of tasks 
to deal with this type of situation?*
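
To illustrate what I mean, this is the kind of manual rebalancing I would 
otherwise have to script myself (assuming it is even safe to push 
not-yet-started ASSIGNED tasks back to QUEUED):

# Sketch only: flip ASSIGNED-but-not-RUNNING compQ tasks back to QUEUED so
# the next assignment pass can redistribute them to currently idle workers.
db(
    (db.scheduler_task.status == 'ASSIGNED') &
    (db.scheduler_task.group_name == 'compQ')
).update(status='QUEUED', assigned_worker_name=None)
db.commit()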




3) I had been using "immediate=True" on all of my tasks.  I started to 
see occasional db deadlock errors when scheduling jobs with queue_task(). 
Removing "immediate=True" seemed to fix the problem.

*Is there any reason why immediate could be causing deadlocks?*





