Hi Roger and Sam! Sam -- could you please alter the permissions on that Google doc so that people can (at least) add comments? Right now I appear to have a read-only view.
Roger: What you're describing seems to me like it applies less to Ceilometer integration than to log offloading (and specifically, error log offloading) -- Ceilometer is more about real-time stats than logs. This isn't one of the features I had fleshed out in my model yet, as in our environment we dump the logs to local files (via syslog-ng) and any troubleshooting we do thereafter is done by looking at those log files on our software appliances directly (on behalf of our customers, in most cases). But I can see that having the ability to ship logs off the load balancers is going to be important.

I see two possible ways of doing this:

1. Periodic archiving of logs off-server. In practical terms for a software appliance, this would mean haproxy continues to log to disk as usual, then a cron job periodically rsyncs the logs off the appliance to some user-defined destination. I see mostly disadvantages with this approach:

- Shipping the logs can consume a lot of bandwidth while it's happening, so care would be needed in scheduling this so as not to affect production traffic going to the load balancer.
- We'll have to deal with credentials for logging into the remote log archive server (and with communicating and storing those credentials in a secure manner).
- Logs are not immediately available, which makes real-time troubleshooting of problems... er... problematic.
- If the load balancer dies before shipping its logs off, the logs are lost.
- A very busy load balancer needs to worry about disk space (and log rotation) a lot more.

2. Real-time shipping of the logs via syslog-ng. In this case, haproxy would be configured to pass its logs to syslog-ng, which is in turn configured to pass them in real time to a logging host somewhere on the network. The main advantages of this approach are:

- No bandwidth-hogging periodic rsync process to deal with.
- Real-time troubleshooting is now possible.
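(As an aside, for concreteness, option 2's plumbing would amount to something like the fragments below. This is only a sketch, not a tested configuration -- the loopback port, syslog facility, destination hostname "loghost.example.com", and port 5514 are placeholder assumptions, not anything taken from our actual appliances:)

```
# haproxy.cfg fragment: haproxy emits its logs to the local syslog-ng over UDP
global
    log 127.0.0.1:514 local0

# syslog-ng.conf fragment: receive haproxy's logs locally and relay them
# in real time to a remote logging host over TCP
source s_haproxy { udp(ip(127.0.0.1) port(514)); };
destination d_loghost { tcp("loghost.example.com" port(5514)); };
log { source(s_haproxy); destination(d_loghost); };
```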
- If the load balancer dies, we still get to see its last gasps just before it died.
- Log rotation, etc. are not a concern. Neither is disk space on the load balancer.

The main disadvantages of this approach are:

- DoSing the load balancer is easier, as requests now generate extra traffic from the load balancer (to the logging host).
- If the load balancer gets DoSed, the logging host might be getting DoSed as well.

Given the above, I'm favoring the second approach. But does anyone else have ideas with regard to how to handle this?

Other attributes we might want to consider for the logging object:

- Verbosity
- Log format

Anything else? In any case, any kind of 'logging' resource (with its various attributes) probably makes the most sense to attach in a 1:N relationship with the listener in my model (i.e. one 'logging' object can be associated with many listeners).

Thanks,
Stephen

On Wed, Feb 12, 2014 at 1:58 AM, Samuel Bercovici <samu...@radware.com> wrote:

> Hi,
>
> We plan to address LBaaS in Ceilometer for Juno.
> A blueprint was registered:
> https://blueprints.launchpad.net/neutron/+spec/lbaas-ceilometer-integration
> Please use the following Google document to add requirements and
> thoughts:
> https://docs.google.com/document/d/1mrrn6DEQkiySwx4eTaKijr0IJkJpUT3WX277aC12YFg/edit?usp=sharing
>
> Regards,
> -Sam.
>
>
> -----Original Message-----
> From: WICKES, ROGER [mailto:rw3...@att.com]
> Sent: Tuesday, February 11, 2014 7:35 PM
> To: openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] [Neutron][LBaaS] Proposal for model
>
> [Roger] Hi Stephen! Great job! Obviously your experience is both awesome
> and essential here.
>
> I would ask that we add a historical archive object (physically implemented
> as a log file, probably) to your model. When you mentioned sending data off
> to Ceilometer, that triggered me to think about one problem I have had to
> deal with: "what packet went where?" in diagnosing errors, usually related
> to having a bug on 1 out of 5 load-balanced servers -- usually because of a
> deployed version mismatch, but it could also be due to a virus. When our
> customer sees "hey, every now and then this image is broken on a web page,"
> that points us to an inconsistent farm, and having the ability to trace or
> see which server got that customer's packet (as routed to it by the LB)
> would really help in pinpointing the errant server.
>
> > Benefits of a new model
> >
> > If we were to adopt either of these data models, this would enable us
> > to eventually support the following feature sets, in the following
> > ways (for example):
> >
> > Automated scaling of load-balancer services
>
> [Roger] Would the Heat module be called on to add more LBs to the farm?
>
> > I talked about horizontal scaling of load balancers above under "High
> > Availability," but, at least in the case of a software appliance,
> > vertical scaling should also be possible in an active-standby
> > cluster_model by
> ******************************************
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

--
Stephen Balukoff
Blue Box Group, LLC
(800)613-4305 x807