First of all we want to thank you, John Mettraux, for your outstanding work on Ruote and your seemingly tireless responses to people's questions and issues!
We've been running an implementation of Ruote in production for about 1.5 years. The relevant parts of our stack include:

- Heroku
- Rails 3
- ruote
- ruote-kit (web and workers)
- ruote-redis

We use Ruote to do marketing automation (phone calls, emails, etc.). Until recently our load has been fairly light, but we've been pushing more and more workflows (generally in large bursts) through our application and have started to run into performance issues. At the moment we have the following stats:

- ~100k expressions
- ~6k schedules
- 400 MB Redis store (this actually fluctuates between 300 and 500 MB)
- an unknown number of expressions abandoned by external services (could these cause performance issues?)

We've started getting memory warnings from Heroku on our workers, and they will eventually stop processing messages until we restart them. Heroku provides 512 MB of memory and starts paging to disk after that. We've seen warnings with memory usage approaching 1 GB. It's unclear to us how a worker could be consuming more than 2x the size of the store in memory.

While looking through the code this week we noticed the frequent, and with so many schedules perhaps inefficient, checks for schedules to process. We've drastically extended the time between checks (perhaps this could be a configuration option), since none of our schedules need to be processed precisely on time. This change seems to have improved things dramatically, but over time we still get memory warnings and eventually stalled workers.

We've experimented with the number of worker processes and have found that we get the best performance with 5-10 processes running. For some time we had threading disabled, so we tried re-enabling it in an effort to improve performance. It's tough to tell because we haven't done formal benchmarks, but it appears we did see a boost in performance; however, we eventually started receiving errors as Ruote tried to create new threads.
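For what it's worth, the change we made can be sketched generically. Everything below is a hypothetical illustration, not the actual Ruote worker internals — `PollingWorker`, `fetch_msgs`, and `fetch_due_schedules` are made-up names standing in for the real code paths:

```ruby
# Hypothetical sketch (not the real Ruote worker API): a worker loop that
# processes messages on every pass but only polls the schedule set every
# `schedule_check_interval` seconds, so a store with ~6k schedules isn't
# scanned on each iteration.
class PollingWorker
  def initialize(storage, schedule_check_interval: 60)
    @storage = storage
    @interval = schedule_check_interval
    @last_schedule_check = Time.at(0) # force a check on the first pass
  end

  # One pass of the worker loop; `now` is injectable for testing.
  def step(now = Time.now)
    @storage.fetch_msgs.each { |msg| process(msg) }

    # only hit the (large) schedule set when the interval has elapsed
    if now - @last_schedule_check >= @interval
      @storage.fetch_due_schedules(now).each { |sch| trigger(sch) }
      @last_schedule_check = now
    end
  end

  def process(msg); end # placeholder for real msg handling
  def trigger(sch); end # placeholder for firing a due schedule
end

# Minimal in-memory stand-in for the storage, counting schedule queries.
class FakeStorage
  attr_reader :schedule_checks
  def initialize; @schedule_checks = 0; end
  def fetch_msgs; []; end
  def fetch_due_schedules(_now); @schedule_checks += 1; []; end
end
```

The trade-off is obvious but acceptable for us: schedules can fire up to `schedule_check_interval` seconds late, in exchange for far fewer queries against the schedule set.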
Our theory is that this is related to the lack of memory on the virtual machine.

We tend to have fairly simple workflows: a participant that doesn't reply until it gets more information, then a few more participants that reply fairly quickly, inside a cursor. We're not setting any variables and tend to have a few :if conditions.

While reading another thread on performance here (https://groups.google.com/d/topic/openwferu-users/IQloVPNuzcI/discussion) I got to thinking about the encoding and decoding of data. I realize we do not have yajl available in our bundle, and adding it might provide some performance improvement. I've also been investigating MessagePack (http://msgpack.org/) and thinking that, with support in Redis, it might be an even bigger improvement (see benchmarks here: https://gist.github.com/1159138). Should we go down that road, I wonder if you have any advice on converting the existing messages. Would you add code to detect the message format and transition slowly over time, or take the system offline and re-encode all of the existing messages? Also, does adding MessagePack to rufus-json make sense (it's not strictly JSON-related), or would you recommend overriding the #encode/#decode methods in ruote-redis?

Overall, I wanted to get some impressions and see whether others have had performance and/or memory issues with the kind of load we've been putting on Ruote. I've tried to include as much detail as possible; if there are specific questions I can answer, please let me know. I realize I haven't given specific benchmark numbers, so if that would help I can try to run some benchmarks (we've been too busy putting out fires until now).

Thanks again!
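On the slow-transition option, here is the rough shape of what we had in mind: a decode wrapper that accepts either format per document. It assumes (which holds for our data) that JSON-serialized documents always start with `{` or `[` with no leading whitespace, while MessagePack map/array header bytes never match those characters. `DualDecoder` is our hypothetical name, not anything in rufus-json or ruote-redis:

```ruby
require 'json'

begin
  require 'msgpack' # optional; we fall back to JSON-only if the gem is absent
rescue LoadError
end

# Hypothetical sketch of a gradual migration: new writes would go out in
# MessagePack while reads accept either format. Detection assumes JSON
# documents start with '{' or '[' (no leading whitespace), which no
# MessagePack map/array header byte matches.
module DualDecoder
  def self.decode(raw)
    if raw.start_with?('{', '[')
      JSON.parse(raw)
    elsif defined?(MessagePack)
      MessagePack.unpack(raw)
    else
      raise "MessagePack-encoded data found but the msgpack gem is not loaded"
    end
  end
end
```

With something like this in place, old JSON documents could be re-encoded lazily on their next write, avoiding a big-bang offline conversion; the alternative is a one-off script that walks every key and rewrites it while the system is down.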
