> On Tuesday, 29 November 2011 20:48:52, Roberto De Ioris wrote:
>> I have committed another series of fixes/optimizations.
>>
>> - nodes marked as dead will be skipped even if their reference is > 0
>>   (only already accepted requests will continue)
>> - added another check on auto-optimized-list consistency (I suppose this is
>>   the one throwing your error)
>> - a potential segmentation fault in slot removal should be fixed
>>
>> Let me know if it definitely fixes it for you
>
> I left a slamd job running to do some more stress testing of the uWSGI stack
> I'm working on, and I just had a quick look at how it is doing. This is what
> I found:
>
> 1. evil-reload-on-rss was not able to reload my workers on one node. They now
>    use up to 2GB of memory; evil-reload-on-rss should kill them after they use
>    more than 256MB. Looking at the logs I see a storm of messages related to
>    killing those workers, but they are still alive and the number of restarts
>    is very, very high.
> 2. Worker stats are broken: they show 0 bytes of used memory. I suspect that
>    this is related to the problem above. It seems that uWSGI thinks that it
>    did reload those workers, so it zeroed the memory usage in the stats output.
They are related: you have to add memory-report = true for those options to do
their job. Yes, it is a bit annoying; I will try to find a way to fix it
without adding the memory reports to the logs.

> 3. I have two servers with this application, and now the one with broken
>    output is handling almost all requests and eating all the CPU at the same
>    time, while the second is almost idle. The request count per second seems
>    very low when I look at the fastrouter stats, so maybe the server which is
>    restarting workers all the time is blocking the fastrouter from dispatching
>    requests to the second server? I think that this is related to the
>    misbehaving server, but I don't know if uWSGI could do something about it.

It looks like requests got enqueued on a server announcing itself as healthy,
while the other one can answer requests normally. This is obviously suboptimal.

Maybe we could think about some form of "performance" measure, where nodes
behaving badly lose weight in the round-robin until they are totally removed.

Another feature to add would be forcing the master process to unsubscribe if
all of the workers are behaving badly.

I still do not have code for weighted round-robin, so I am open to every
suggestion.

--
Roberto De Ioris
http://unbit.it
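
As one possible starting point for the weighted round-robin idea raised above, here is a minimal Python sketch, not uWSGI code. It assumes the "smooth" weighted round-robin selection scheme (the variant popularised by nginx) and a simple penalty rule: each reported failure lowers a node's effective weight by one, and a node at weight 0 is skipped. The names `Node`, `WeightedRoundRobin`, `report_failure` and `report_success` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    weight: int           # configured (maximum) weight
    effective: int = -1   # current weight, lowered when the node misbehaves
    current: int = 0      # running counter used by the smooth selection

    def __post_init__(self):
        # Start every node at its full configured weight.
        if self.effective < 0:
            self.effective = self.weight


class WeightedRoundRobin:
    """Hypothetical selector: smooth weighted round-robin with penalties."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def pick(self):
        """Return the next node, or None if every node has weight 0."""
        alive = [n for n in self.nodes if n.effective > 0]
        if not alive:
            return None
        total = sum(n.effective for n in alive)
        # Each live node gains its weight; the leader is chosen and pays
        # back the total, which spreads picks evenly over time.
        for n in alive:
            n.current += n.effective
        best = max(alive, key=lambda n: n.current)
        best.current -= total
        return best

    def report_failure(self, node):
        """Assumed penalty rule: lose one unit of weight per failure."""
        node.effective = max(0, node.effective - 1)
        if node.effective == 0:
            node.current = 0  # node is out of rotation; reset its counter

    def report_success(self, node):
        """Slowly restore weight, never above the configured value."""
        node.effective = min(node.weight, node.effective + 1)
```

With two nodes of weight 3 and 1, four consecutive `pick()` calls select the first node three times and the second once. After enough `report_failure()` calls a node drops to weight 0 and stops receiving requests until `report_success()` raises it again. What counts as a "failure" (timeouts, restart storms, error rates) is left open here.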
