awwwww hell, it's true. huge outputs are effectively blocking. don't know if it's fixable though. I don't see any obvious error in the implementation that could prevent it....
On Monday, February 29, 2016 at 9:26:38 PM UTC+1, Niphlod wrote: > > uhm, not sure about that though. the order of operations in the scheduler > respects the docs, and even if the bugreport has the 128k statement, the > docs don't report it.... let me try..... > > On Monday, February 29, 2016 at 9:14:41 PM UTC+1, Boris Aramis Aguilar > Rodríguez wrote: >> >> It happends not only while printing but also when returning values on a >> function as noted on the example :) so yes i do avoid printing but this >> also happens when returning more than 128kb of data (as noted by the bug >> reports) due to the fact (it seems) that the max data a process can >> comunicate to other via pipes is around that limit so that is the >> underlying reason. >> El feb 29, 2016 2:07 PM, "Niphlod" <nip...@gmail.com> escribió: >> >>> don't know if it's documented on the book but nevertheless there has >>> been a few times it popped up on this group. As output is buffered, >>> especially on Windows but on unixes flavoured OSes as well, the "printed" >>> output should be limited. >>> Of course it never has been a real issue because if you need to return >>> something you shouldn't NEVER EVER use print. >>> And because you should be already accustomed to NEVER EVER using print >>> in any web2py application, first because print doesn't get you anything and >>> second because if you need huge prints you're probably sidestepping a >>> correct usage of the logging library. >>> Anyhow, I'll submit right away a PR on the book with a solid note about >>> it. >>> >>> On Monday, February 29, 2016 at 7:41:41 PM UTC+1, Boris Aramis Aguilar >>> Rodríguez wrote: >>>> >>>> Hi, there is an issue driving me crazy with the web2py scheduler: >>>> >>>> If you return something that has a huge size then it will always >>>> timeout; even if the scheduler task correctly finishes. Let me explain >>>> with >>>> an example: >>>> >>>> def small_test(): >>>> s = 's'*1256018 >>>> another_s = s >>>> #print s >>>> #print another_s >>>> #print 'FINISHED PROCESS' >>>> return dict(s = s, another_s = another_s, f = 'finished') >>>> >>>> small_test is the function to execute, as you can see a string full of >>>> 's' 1256018 times is. Simple >>>> >>>> So when you enqueue the scheduler every time the output is the same: >>>> http://prnt.sc/a9iarj (screenshot of the TIMEOUT) >>>> >>>> As you can see from the screenshot, the process actually finished; >>>> while logging the scheduler output shows the following: >>>> >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475: work to do 1405 >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475: new scheduler_run >>>> record >>>> INFO:web2py.scheduler.PRTALONENETLAPP-SRV#24475:new task 1405 >>>> "small_test" portal/default.small_test >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475: new task allocated: >>>> portal/default.small_test >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475: task starting >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475: task started >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475: new task report: >>>> COMPLETED >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475: result: {"s": >>>> "ssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss$ >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording >>>> heartbeat (RUNNING) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording >>>> heartbeat (RUNNING) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording >>>> heartbeat (RUNNING) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording >>>> heartbeat (RUNNING) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording >>>> heartbeat (RUNNING) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475: freeing workers >>>> that have not sent heartbeat >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording >>>> heartbeat (RUNNING) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording >>>> heartbeat (RUNNING) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording >>>> heartbeat (RUNNING) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording >>>> heartbeat (RUNNING) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording >>>> heartbeat (RUNNING) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475: freeing workers >>>> that have not sent heartbeat >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording >>>> heartbeat (RUNNING) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording >>>> heartbeat (RUNNING) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording >>>> heartbeat (RUNNING) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording >>>> heartbeat (RUNNING) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording >>>> heartbeat (RUNNING) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475: freeing workers >>>> that have not sent heartbeat >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:........recording >>>> heartbeat (RUNNING) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475: task timeout >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475: recording task >>>> report in db (TIMEOUT) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475: status TIMEOUT, >>>> stop_time 2016-02-29 11:56:52.393706, run_result {"s": >>>> "sssssssssssssssssssssssssss$ >>>> INFO:web2py.scheduler.PRTALONENETLAPP-SRV#24475:task completed >>>> (TIMEOUT) >>>> DEBUG:web2py.scheduler.PRTALONENETLAPP-SRV#24475:looping... >>>> INFO:web2py.scheduler.PRTALONENETLAPP-SRV#24475:nothing to do >>>> >>>> >>>> >>>> As you can see there is a TaskReport object in the queue with a >>>> COMPLETED status (I know this because I read the scheduler.py code of >>>> web2py) So I'm pretty sure the task finishes quite fast but then it hangs. >>>> >>>> So I did another test, that doesn't directly use the scheduler but only >>>> calls the executor method from the scheduler and usess process; just like >>>> the scheduler would: >>>> >>>> from gluon.scheduler import Task >>>> from gluon.scheduler import executor >>>> t = Task(app='portal', function='small_test', timeout = 120) >>>> import logging >>>> logging.getLogger().setLevel(logging.DEBUG) >>>> import multiprocessing >>>> queue = multiprocessing.Queue(maxsize = 1) >>>> out = multiprocessing.Queue() >>>> t.task_id = 123 >>>> t.uuid = 'asdfasdf' >>>> p = multiprocessing.Process(target=executor, args=(queue, t, out)) >>>> p.start() >>>> p.join(timeout = 120) >>>> p.is_alive() >>>> >>>> >>>> when the join finishes waiting (2 minutes) if you check for >>>> p.is_alive() it always returns True; but when you do a queue.get() and >>>> then >>>> instantly check for p.is_alive() the process finishes!!!!! >>>> >>>> So i noticed the problem is from multiprocessing library, due to the >>>> fact that it can't handle lots of data from a queue (which seems kind of >>>> strange for my case, but I don't know how it is implemented); anyways i >>>> found this bug: http://bugs.python.org/issue8237 and >>>> http://bugs.python.org/issue8426 >>>> >>>> The interesting part is it is actually documented (I didn't knew that): >>>> >>>> https://docs.python.org/2/library/multiprocessing.html#multiprocessing-programming >>>> >>>> But in my current implementation this will happen quite often, I'll >>>> work on a work-around but I would really like that web2py scheduler could >>>> handle large data output from my processes for me, but well that is my >>>> wish >>>> and I would like to have some guidance on this issue and avoid a >>>> work-around. >>>> >>>> Anyway, this should be documented somewhere in web2py too (that >>>> probably could had saved me a week of code reading and debugging); or it >>>> should be managed somehow (I wouldn't naturally expect an output limit >>>> besides the database implementation). >>>> >>> -- >>> Resources: >>> - http://web2py.com >>> - http://web2py.com/book (Documentation) >>> - http://github.com/web2py/web2py (Source code) >>> - https://code.google.com/p/web2py/issues/list (Report Issues) >>> --- >>> You received this message because you are subscribed to a topic in the >>> Google Groups "web2py-users" group. >>> To unsubscribe from this topic, visit >>> https://groups.google.com/d/topic/web2py/PunI6ffg9xk/unsubscribe. >>> To unsubscribe from this group and all its topics, send an email to >>> web2py+un...@googlegroups.com. >>> For more options, visit https://groups.google.com/d/optout. >>> >> -- Resources: - http://web2py.com - http://web2py.com/book (Documentation) - http://github.com/web2py/web2py (Source code) - https://code.google.com/p/web2py/issues/list (Report Issues) --- You received this message because you are subscribed to the Google Groups "web2py-users" group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.