P.S. I always assumed people would run different nodes on different
machines, and therefore with different hostnames, but even in that case
they may get a hostname equal to 127.0.0.1, so there must be a way to
specify a worker_name.

On 12 Aug, 18:35, Massimo Di Pierro <massimo.dipie...@gmail.com>
wrote:
> I see the problem. All your workers have the same name because the
> worker_name defaults to hostname:port of the web2py instance.
> The API has a way to specify a hostname but the command line function
> does not.
>
> There are two simple solutions:
> - specify a worker_name using shell arguments
> - have the worker pick up a UUID worker_name (this may create
> problems; a sketch of this option follows below)
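>
> For the second option, something along these lines could work (just a
> sketch, untested; whether a random name breaks anything would need
> checking):
>
> import socket
> import uuid
>
> # fall back to a guaranteed-unique name instead of hostname:port
> worker_name = '%s#%s' % (socket.gethostname(), uuid.uuid4())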
>
> Meanwhile you can start the workers passing different -p <n> values,
> and this will fake different worker_names:
>
> python web2py.py -K a0 -p 9001
> python web2py.py -K a0 -p 9002
> python web2py.py -K a0 -p 9003
>
> This should solve the problem.
> Your suggested change below should go in trunk, but it does not solve
> the problem; it only makes it rarer.
>
> On 12 Aug, 18:26, Niphlod <niph...@gmail.com> wrote:
>
> > I'm trying some variations, but it seems that the culprit is
> > assigning and retrieving task_scheduled records in the same process.
>
> > I don't know the DAL internals with transactions, locking and
> > commits... a hint though (my 2 cents): just to check, I added a line
> > after line 245:
>
> > ...
> > if task:
> >     if task.assigned_worker_name != self.worker_name:
> >         logging.info('Someone stole my task!')
> >         return False
> >     logging.info('running task %s' % task.name)
> > ...
>
> > and it never actually gets printed.
> > So it's not a problem of "I assigned a task to myself, and before it
> > got executed another worker picked it up", at least I think.
>
> > Right now it seems to work OK only if
>
> > def assign_next_task_new(self, group_names=['main']):
> >     """
> >     find next task that needs to be executed
> >     """
> >     db = self.db
> >     row = db(db.task_scheduled.assigned_worker_name == self.worker_name)\
> >         .select(limitby=(0, 1)).first()
> >     return row
>
> > is used as a replacement for assign_next_task.
>
> > I don't know if it's viable to run a single "assigner" and several
> > workers. In Python, maybe the "assigner" could fetch the list of
> > task_scheduled records in the waiting state (with a sane limitby
> > clause) and split the list evenly among the alive workers, as in the
> > sketch after the link below....
>
> >http://stackoverflow.com/questions/312443/how-do-you-split-a-list-int...
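>
> > A rough sketch of that idea (untested; chunks() is a made-up helper;
> > the table and field names are just the ones used in this thread):
>
> > def chunks(lst, n):
> >     # deal lst into n round-robin slices of roughly equal size
> >     return [lst[i::n] for i in range(n)]
> >
> > def assign_all(db):
> >     workers = [w.name for w in db(db.worker_heartbeat.id > 0).select()]
> >     waiting = db((db.task_scheduled.status == 'queued') &
> >                  ((db.task_scheduled.assigned_worker_name == None) |
> >                   (db.task_scheduled.assigned_worker_name == ''))
> >                 ).select(limitby=(0, 100))
> >     for name, tasks in zip(workers,
> >                            chunks(list(waiting), len(workers))):
> >         for task in tasks:
> >             task.update_record(assigned_worker_name=name,
> >                                status='running')
> >     db.commit()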
>
> > For SQL maniacs, a simple script can be run as the assigner (it needs
> > window functions, so check whether your database supports them).
>
> > For Postgres this works like a charm:
>
> > update task_scheduled
> > set assigned_worker_name = worker.name,
> >     status = 'running',
> >     last_run_time = now()
> > from
> >     (
> >     select ntile((select count(*)::int from worker_heartbeat))
> >            over (order by id) as t_id, *
> >     from task_scheduled
> >     where status = 'queued'
> >       and ((assigned_worker_name is null) or (assigned_worker_name = ''))
> >     ) sched,
> >     (
> >     select ntile((select count(*)::int from worker_heartbeat))
> >            over (order by id) as w_id, name
> >     from worker_heartbeat
> >     ) worker
> > where worker.w_id = sched.t_id
> >   and sched.id = task_scheduled.id
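>
> > (ntile(n) over (order by id) deals the rows into n roughly equal
> > buckets, so the i-th bucket of queued tasks is joined to the i-th
> > worker and the load is spread evenly.)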
>
> > PS: I noticed another "fixable" aspect... worker_heartbeat gets
> > polluted; it would be good to delete the worker's record when Ctrl+C
> > is pressed (or the process is killed gracefully). A sketch of that
> > follows below.
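>
> > Something like this could do it (assuming db and worker_name are in
> > scope; where exactly to hook it into the scheduler is a guess):
>
> > import signal
> > import sys
> >
> > def make_cleanup(db, worker_name):
> >     # build a handler that removes this worker's heartbeat row
> >     def cleanup(signum, frame):
> >         db(db.worker_heartbeat.name == worker_name).delete()
> >         db.commit()
> >         sys.exit(0)
> >     return cleanup
> >
> > handler = make_cleanup(db, worker_name)
> > signal.signal(signal.SIGINT, handler)   # Ctrl+C
> > signal.signal(signal.SIGTERM, handler)  # graceful kill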
