Seems like we have the same situation (Version 2.3.2 (2012-12-17 15:03:30) stable). More and more tasks are just being assigned, but neither completed nor failed. Has this been fixed in trunk? If not, I can send you a debug log via email.
Thanks,
Adnan

On Wednesday, December 19, 2012 2:46:07 PM UTC-5, Niphlod wrote:
>
> yep, I started with "sorry" because I can't reproduce it with either the
> newer one or the older one. But let me (us) know as soon as logs are filled.
>
> On Wednesday, December 19, 2012 7:52:44 PM UTC+1, JimK wrote:
>>
>> I'll give it a try with debug mode later this afternoon and send along
>> the logs.
>>
>> I'm certain that there's a real bug here and not user error, as I tested
>> it quite thoroughly from a blackbox perspective (I didn't know how to turn
>> on logging, doh!). Again, with all other criteria being equal, swapping
>> the scheduler.py file for the 2.12(?) version fixed the problem.
>>
>> Jim
>>
>> On Wednesday, December 19, 2012 1:32:09 AM UTC-8, Niphlod wrote:
>>>
>>> sorry, just tried, can't reproduce the issue. Can you post the logs of
>>> the schedulers?
>>> Maybe start them adding *-D 0* to activate the DEBUG level.
>>> What normally goes on is:
>>> - you start "dir", it's the first one so it becomes the "TICKER", but it
>>>   doesn't see any "metrics" worker so it doesn't assign tasks
>>> - you start "metrics", after a while the "dir" worker sees it and starts
>>>   assigning tasks to "metrics"
>>> - etc.
>>>
>>> On Wednesday, December 19, 2012 04:31:32 UTC+1, JimK wrote:
>>>>
>>>> I'm not sure what the issue was, but replacing the gluon/scheduler.py
>>>> file with the older one fixed the problem. There were a lot of changes in
>>>> 2.3.2, so it's hard to pinpoint what broke things.
>>>>
>>>> On Tuesday, December 18, 2012 6:36:29 PM UTC-8, JimK wrote:
>>>>>
>>>>> There appears to be a major bug with the scheduler workers in 2.3.2,
>>>>> in that tasks are stuck in the ASSIGNED state if their worker group was
>>>>> not the first one started. This problem did not exist until I upgraded
>>>>> to 2.3.2 this morning.
>>>>>
>>>>> I have 2 task groups in my service (dir, metrics).
>>>>>
>>>>> I start the workers after starting the server like so (art is the
>>>>> application name):
>>>>> /usr/bin/python2.6 /var/www/web2py/web2py.py -K art:dir
>>>>> /usr/bin/python2.6 /var/www/web2py/web2py.py -K art:metrics
>>>>> /usr/bin/python2.6 /var/www/web2py/web2py.py -K art:rap
>>>>>
>>>>> If I start the "dir" worker first and then the "metrics" worker, my
>>>>> "dir" group tasks complete fine but the "metrics" group tasks are stuck
>>>>> in the ASSIGNED state.
>>>>> To prove my theory, I then killed all of the workers and started the
>>>>> "metrics" worker first and then the "dir" worker. As hypothesized, the
>>>>> "metrics" group tasks complete but the "dir" group tasks are stuck in the
>>>>> ASSIGNED state.
>>>>>
>>>>> Any ideas???
>>>>>
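
For readers following the thread, the normal flow Niphlod describes (the first worker started becomes the TICKER and only hands out tasks for groups that have a live worker) can be sketched as a toy model. This is not web2py's scheduler code: the `Worker` and `Task` classes and the `start_worker` and `assign_tasks` functions are hypothetical names invented for illustration, and the model shows only the expected behaviour, not the stuck-in-ASSIGNED bug reported above.

```python
# Toy model only. This is NOT gluon/scheduler.py; every name here is hypothetical.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Worker:
    name: str
    groups: List[str]
    is_ticker: bool = False


@dataclass
class Task:
    name: str
    group: str
    status: str = "QUEUED"
    assigned_to: Optional[str] = None


def start_worker(workers, name, groups):
    """Register a worker; the first one registered becomes the ticker."""
    worker = Worker(name, groups, is_ticker=not workers)
    workers.append(worker)
    return worker


def assign_tasks(workers, tasks):
    """One ticker pass: give each queued task to a live worker in its group."""
    if not any(w.is_ticker for w in workers):
        return  # nobody is responsible for assignment yet
    for task in tasks:
        if task.status != "QUEUED":
            continue
        candidates = [w for w in workers if task.group in w.groups]
        if candidates:
            task.status = "ASSIGNED"
            task.assigned_to = candidates[0].name
        # No candidate: the task stays QUEUED until a matching worker appears.


if __name__ == "__main__":
    workers = []
    tasks = [Task("scan", "dir"), Task("collect", "metrics")]

    start_worker(workers, "w-dir", ["dir"])          # becomes the ticker
    assign_tasks(workers, tasks)
    print([(t.name, t.status) for t in tasks])

    start_worker(workers, "w-metrics", ["metrics"])  # seen on the next pass
    assign_tasks(workers, tasks)
    print([(t.name, t.status, t.assigned_to) for t in tasks])
```

In this model a "metrics" task is only marked ASSIGNED once a "metrics" worker exists, and that worker is then expected to pick it up. The behaviour reported in the thread is that the task reaches ASSIGNED but the second worker never runs it, which is why the DEBUG logs from both workers are the useful next step.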