On Monday, May 16, 2016 at 1:00:32 PM UTC-7, Alfonso Serra wrote:
>
> The difference is, from my modest knowledge about the scheduler, the 
> following.
> [...]
>
 

> Scheduler problems:
> - Enabling the scheduler adds overhead to the database, a lot, I might say. 
> - I have to manually run the scheduler on the server. This is bad because 
> after a few days it becomes unresponsive and I have to kill it and restart 
> it again, manually.
>

This also sounds suspicious to me.  On my AWS system, I last started the 
scheduler on Apr 18.  Mine has a light load, that's true, but the only 
reason I restarted it was because I was making a code change.

> PythonAnywhere stops their servers for 30 minutes once a month for 
> maintenance, so I have to watch for that as well.
>
>
That can be an issue, but don't you have to restart the webserver at that 
time anyway?  Your start-up script for the webserver could handle starting 
the scheduler as well.
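For example, a start-up script along these lines could bring both back up after maintenance (a rough sketch; the paths and the app name "myapp" are placeholders, and `-K` is web2py's command-line flag for starting scheduler workers for the named apps):

```shell
#!/bin/sh
# Restart the web2py webserver and its scheduler workers after maintenance.
cd /home/user/web2py   # placeholder: path to the web2py installation

# Start the built-in webserver in the background (adjust host/port as needed;
# -a '<recycle>' reuses the previously set admin password).
nohup python web2py.py -a '<recycle>' -i 0.0.0.0 -p 8000 >web.log 2>&1 &

# Start scheduler workers for "myapp" (-K takes a comma-separated list of
# app names; repeat an app name to get more than one worker for it).
nohup python web2py.py -K myapp,myapp >scheduler.log 2>&1 &
```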
 

> - The scheduler writes worker heartbeats every few seconds, in 
> order to know how many workers are available. The more workers, the more 
> overhead.
>
> - Since many users should be able to import at the same time, I have to 
> declare multiple workers beforehand. 
> Even if no one is importing anything, the db is continuously doing I/O 
> operations.
>  
>
> OK, I don't know how many users will be importing at the same time, so I 
> declare 3 or 4 workers to be running just in case. So only 4 users 
> can import data at the same time, and the db is writing 4 times 
> every few seconds, all the time.
>
>
That is indeed overhead, but you would have to have a lot of activity going 
on for it to be measurable by the user.
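As a back-of-envelope check (my numbers, assuming web2py's default 3-second heartbeat, which is configurable via `Scheduler(db, heartbeat=...)`), the idle-worker write load looks like this:

```python
# Rough estimate of heartbeat write traffic from idle scheduler workers.
# Assumes web2py's default heartbeat of 3 seconds.
heartbeat_seconds = 3
workers = 4

writes_per_minute = workers * (60 // heartbeat_seconds)
print(writes_per_minute)        # 80 tiny UPDATEs per minute
print(writes_per_minute * 60)   # 4800 per hour
```

That is thousands of writes per hour, but each one is a trivial single-row UPDATE, so it should not dominate a bulk import.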

 

> On top of all this, I have to make a progress bar for each import 
> process.
> So I write the percentage done to the task's run output at every 5% step. 
> This way I'm not writing the percentage every time a row is inserted, which 
> would be a lot more overhead for the db.
>
> Not only that, the client's browser has to ask the server for the 
> percentage, so it has to query the db as well, say every 5 seconds 
> (the progress bar's update interval), times the number of clients importing 
> at the same time.
>
> To sum up, while I'm performing an intensive db operation:
>
> - The scheduler is writing heartbeats every few seconds for each available 
> worker.
> - Running tasks are writing percentages at every 5% step. 
> - The browser is querying the db every 5 seconds for each task's progress.
>
> No wonder it's slower in comparison. 
> With the threads approach, all statistics are in memory just while the 
> import is running (percentage, inserted rows, elapsed time...). I'm not 
> limited to a fixed number of users importing at the same time; threads are 
> launched on demand instead of running all the time.
>
> The times are roughly these, same importing function, both using the DAL, 
> no concurrency:
> Thread + DAL: approx. 4 min
> Threads + mysql.connector: 2-3 min
> Scheduler: 20-30+ min
>
> Of course I'll stick with the DAL, so I don't have to write INSERTs myself. 
> It's slower, but not by much.
>
> The scheduler might be good for mailing or maintenance operations, but for 
> importing bulk data, not so much.
>
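As an aside, the thread-based bookkeeping you describe can be sketched roughly like this (my illustration, not web2py code: progress lives in a plain in-memory dict, and the percentage is recorded only at 5% steps rather than per row):

```python
import threading
import time

# In-memory progress registry: task_id -> stats dict. No db writes at all;
# a polling action would simply read from this dict.
progress = {}
progress_lock = threading.Lock()

def import_rows(task_id, rows):
    """Simulated import that records progress only at 5% steps."""
    start = time.time()
    total = len(rows)
    last_step = -1
    for i, row in enumerate(rows, start=1):
        # ... insert `row` into the database here ...
        step = (i * 100 // total) // 5 * 5   # percent, rounded down to 5%
        if step != last_step:                # update at most once per 5% step
            last_step = step
            with progress_lock:
                progress[task_id] = {
                    "percent": step,
                    "inserted": i,
                    "elapsed": time.time() - start,
                }

# Launched on demand, one thread per import, instead of fixed workers.
t = threading.Thread(target=import_rows, args=("job-1", list(range(1000))))
t.start()
t.join()
print(progress["job-1"]["percent"])  # 100
```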


I maintain my suspicion that something else is at fault.
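In particular, before blaming the scheduler I would check how the inserts are committed: batching the whole import into one transaction (or using `executemany`) usually matters far more than where the code runs. A rough illustration, with sqlite3 standing in for MySQL (my example, not your code):

```python
import sqlite3

rows = [(i, "name%d" % i) for i in range(10000)]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t (id INTEGER, name TEXT)")

# Slow pattern: one INSERT (and, with autocommit, one commit) per row.
# for r in rows:
#     db.execute("INSERT INTO t VALUES (?, ?)", r)
#     db.commit()

# Faster pattern: batch the whole import into a single transaction.
db.executemany("INSERT INTO t VALUES (?, ?)", rows)
db.commit()

count = db.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(count)  # 10000
```

With the DAL, `db.mytable.bulk_insert(list_of_dicts)` gives a similar batching effect without hand-written INSERTs.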

/dps
 
 

-- 
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
--- 
You received this message because you are subscribed to the Google Groups 
"web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to web2py+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
