It's the old producer-consumer problem. Using a central, external MQ manager can be a good and elegant idea (see AMQP or ZeroMQ, for example), but I agree with Jean: it won't magically solve the persistence problem. Queue items still need to be secured and stored until brokers consume them, either by the satellites (brokers, schedulers, ...) or by the central MQ.
Some admins will love installing and using external, modern tools like CouchDB or an MQ, but others may be scared of them, and this could be problematic, especially in highly secured environments. Furthermore, code that makes using an MQ optional can be very complex. The choice is not trivial ;)

A good starting point, before rewriting all the code (^^), could be securing the existing queues so they are at least restart-safe and link-loss-safe:

* write queue items to disk in a Python pickle when the scheduler/broker stops
* re-read these pickles when the scheduler/broker restarts

This pickle could also be used to store what you call the "backlog": when the queue is full and no broker comes to fetch the broks, store the oldest broks on disk (not very I/O-costly, I think).

Laurent

On Thursday, 06 January 2011 at 09:59 +0100, nap wrote:

> On Wed, Jan 5, 2011 at 6:59 PM, Hartmut Goebel
> <h.goe...@goebel-consult.de> wrote:
> > On 05.01.2011 11:50, nap wrote:
> > > Yes. There should be one broker in a realm (and a spare of course).
> > > For now I think we are good with only one. Maybe if we really need
> > > it in the future we will add multiple queues for multiple brokers,
> > > but it will complexify the architecture, so if we do not need it,
> > > we won't do it.
> > This can be solved by a decent asynchronous communication.
> >
> > > > * The scheduler and broker queues are only stored in memory, so
> > > > if the scheduler or the broker stops or crashes, will all the
> > > > pending broks be lost? (more problematic)
> > > Yes. There is too much data to store it in a database, in fact.
> > > When the scheduler comes back, the broker asks it for a full state,
> > > so it gets fresh states. The scheduler has the retention so it does
> > > not start from a PENDING environment.
> > This may also be solved by a decent asynchronous communication.
> >
> > > > * When the scheduler queue is full, the oldest broks start to be
> > > > dropped.
> > > Yes, for now there is no "backlog for broks".
> > > See the idea
> > > http://shinken.ideascale.com/a/dtd/Backlogs-for-broks-in-scheduler/76429-10373
> > > to vote for ;)
> > This can be solved by a decent asynchronous communication, too. No
> > need for backlogs.
> > > Yes. With backlogs, it won't be a problem anymore. But it will cost
> > > a lot of I/O if the connection is lost for a long time, of course.
> > This can be solved by a decent asynchronous communication, too. No
> > need for backlogs. [Gosh, I'm repeating myself :-)]
>
> Not sure about it. Yes, we would manage fewer messages. But after all,
> the communication path already goes through a lot of queues. What is
> the point of exporting them?
>
> And what if the MQ server crashes? You still lose your messages/data
> after all, unless there is an HA setup (and there must be one, I think).
>
> In fact, we have two communication types:
> * data (in order: confs, checks, notifications, event handlers and
> broks)
> * orders (like when the arbiter tells a satellite: drop your conf, for
> example)
>
> The order channel must be direct; there is no need for a proxy there.
> Data is already async, because it all goes through queues. The only
> exception is confs: for now, the arbiter checks whether the conf was
> really sent or not, so no proxy for it for now.
>
> Which is the more efficient way: a backlog or an external server?
> Admins won't be happy about installing yet another service either. So
> it must be an option, and it must give us a lot of good things.
> If it does, and the code to manage it doesn't make this part of the
> code too complex, it can be a good module :)
>
> Jean
>
> > --
> > Schönen Gruß - Regards
> > Hartmut Goebel
> > Dipl.-Informatiker (univ.), CISSP, CSSLP
> >
> > Goebel Consult
> > Spezialist für IT-Sicherheit in komplexen Umgebungen
> > http://www.goebel-consult.de
> >
> > Monthly column: http://www.cissp-gefluester.de/
> > Goebel Consult is a member of http://www.7-it.de

------------------------------------------------------------------------------
Learn how Oracle Real Application Clusters (RAC) One Node allows customers
to consolidate database storage, standardize their database environment, and,
should the need arise, upgrade to a full multi-node Oracle RAC database
without downtime or disruption
http://p.sf.net/sfu/oracle-sfdevnl
_______________________________________________
Shinken-devel mailing list
Shinken-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/shinken-devel
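[Editor's note: the restart-safe pickle idea proposed in this thread could be sketched roughly as follows. This is a minimal sketch with hypothetical names; `RestartSafeQueue` is not a real Shinken class, and the real scheduler/broker queues are structured differently.]

```python
import os
import pickle
import tempfile
from collections import deque

class RestartSafeQueue:
    """In-memory brok queue that dumps its pending items to a pickle
    on stop and re-reads them on restart (hypothetical sketch)."""

    def __init__(self, path, maxlen=10000):
        self.path = path
        # deque with maxlen silently drops the oldest items when full,
        # matching the "oldest broks start to be dropped" behavior.
        self.items = deque(maxlen=maxlen)

    def put(self, brok):
        self.items.append(brok)

    def save(self):
        # Called when the scheduler/broker stops: persist pending items.
        with open(self.path, 'wb') as f:
            pickle.dump(list(self.items), f)

    def load(self):
        # Called at restart: re-read pending items, then drop the file
        # so a later crash does not replay stale broks.
        if os.path.exists(self.path):
            with open(self.path, 'rb') as f:
                for item in pickle.load(f):
                    self.items.append(item)
            os.remove(self.path)

# Usage: dump on stop, reload on restart.
path = os.path.join(tempfile.gettempdir(), 'shinken-broks.pickle')
q = RestartSafeQueue(path)
q.put({'type': 'host_check_result', 'host': 'srv1'})
q.save()                      # daemon stops

q2 = RestartSafeQueue(path)
q2.load()                     # daemon restarts, broks survive
```

The same `save()` path could serve as the "backlog" discussed above: when the queue hits `maxlen` and no broker is fetching, spill the oldest items to disk instead of dropping them.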