On Monday, May 16, 2016 at 1:00:32 PM UTC-7, Alfonso Serra wrote:

> The difference is, from my modest knowledge about the scheduler, the
> following.
> [...]
> Scheduler problems:
> - Enabling the scheduler adds overhead to the database, a lot I might say.
> - I have to manually run the scheduler on the server. This is bad because
> after a few days it becomes unresponsive and I have to kill it and restart
> it again, manually.

This also sounds suspicious to me. On my AWS system, I last started the
scheduler on Apr 18. Mine has a light load, that's true, but the only
reason I restarted it was because I was making a code change.

> PythonAnywhere stops their servers for 30 minutes once a month for
> maintenance, so I have to watch for that as well.

That can be an issue, but don't you have to restart the webserver at that
time? Your start-up script for the webserver would handle starting the
scheduler as well.

> - The scheduler writes worker heartbeats every few seconds, in order to
> know how many workers are available. The more workers, the more overhead.
>
> - Since many users should be able to import at the same time, I have to
> declare multiple workers beforehand. Even if no one is importing anything,
> the db is continuously doing IO operations.
>
> I don't know how many users will be importing at the same time, so I
> declare 3 or 4 workers just in case. So only 4 users can import data at
> the same time, and the db is written to 4 times every few seconds, all
> the time.

That is indeed overhead, but you would have to have a lot of activity
going on for it to be measurable by the user.

> On top of all this, I have to show a progress bar for each import
> process. So I write the task's run output to record the percentage done
> every 5%. This way I'm not writing the percentage each time a row is
> inserted, which would be a lot more overhead for the db.
>
> Not only that, the client's browser has to ask the server for the
> percentage, so the server has to query the db as well, say every 5
> seconds (the progress bar's update interval), multiplied by the number
> of clients importing at the same time.
> To sum up, while I'm performing an intensive db operation:
>
> - The scheduler is writing heartbeats every few seconds for each
> available worker.
> - Running tasks are writing their percentage every 5%.
> - The browser is asking the db for the task's progress every 5 seconds.
>
> No wonder it is slower in comparison.
>
> With the thread approach, all statistics (percentage, inserted rows,
> elapsed time...) are in memory only while the import is running, and I'm
> not limited to a fixed number of users importing at the same time:
> threads are launched on demand instead of running all the time.
>
> The times are roughly these, with the same importing function in each
> case and no concurrency:
>
> - Thread + DAL: approx. 4 mins
> - Thread + mysql.connector: 2-3 mins
> - Scheduler: 20-30+ mins
>
> Of course I'll stick with the DAL, so I don't have to write the INSERTs
> myself. It's slower, but not by much.
>
> The scheduler might be good for mailing or maintenance operations, but
> for importing bulk data, not so much.

I maintain my suspicions that something else is at fault.

/dps

--
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)

---
You received this message because you are subscribed to the Google Groups
"web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to web2py+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.