On Fri, 2006-02-10 at 09:45 +0200, Buchan Milne wrote:
> On Thursday 09 February 2006 19:57, Samuel Tran wrote:
> > On Mon, 2006-02-06 at 14:41 -0500, Aaron Richton wrote:
> > > That's been on my todo list for over a year now. (So I'll join in the
> > > request for a copy if there is such a script!)
> > >
> > > If anybody does write this, it's important to note that something that
> > > strictly compares contextcsns is likely useless (I think it would just be
> > > a false positive disaster). Replication doesn't happen instantly; there
> > > should be some sort of configurable threshold for "csns should be within
> > > <time>".
> > >
> > >
> > > I've been meaning to ask the list: how many of you check up on your
> > > slaves from a consistency perspective? What do you do? (contextcsn is the
> > > approach I've wanted to take. Every time I get annoyed enough to write a
> > > nagios plugin, I notice that everything is in sync and defer it...)
> >
> > I wrote a very generic python script with exhaustive comments/debugging.
> > It can be modified to be used as a Nagios script plugin.
> >
> > To view a description of the script:
> > $ pydoc ldapSynchCheck
> >
> > To view the help:
> > $ ./ldapSynchCheck.py -h
> >
>
> I guess you didn't look at the perl extension script for BigBrother/Hobbit
> that I posted. It assumes that it will be able to:
> 1)read sufficient configuration information from cn=config to be able to
> determine all the databases using sync-repl, and the master for each
> database, on any server

This is a good idea. However, some people may not use cn=config yet. We
don't in our production environment.

> 2)read the contextCSN for any database on any server
> anonymously, but, due to this, requires absolutely no configuration. For use
> with Hobbit, it just needs to be run on the hobbit server, and any host in
> the bb-hosts file just needs 'ol'. Of course, the hobbit server needs to be
> able to access all the LDAP servers involved.

In my script the default binding is anonymous as well. I just wanted to
have the option to bind with a specific dn.

>
> You may want to take a look, so a user of your script doesn't need to provide
> the URIs, but instead can just provide the server to check.
>
> http://www.zarb.org/~bgmilne/hobbit/
>
> At present, it only goes yellow (not red), since there's no real way to
> determine if the server being 3 months behind (ie you catch the 30 second
> period it takes to replicate the first change to one database in 3 months) is
> severe enough for an error .. but it does show how far ahead (which could
> indicate checkpointing/recover problems on the master) or behind the slave is
> (so you don't have to compare contextCSNs in your head).
>

I will take a look at your script.

> I could take a look at making it work for nagios, but we're phasing nagios
> out, and the only LDAP servers monitored for anything by nagios don't use
> sync-repl.
>

I am curious, what are you going to replace Nagios with?

Thanks for your valuable comments.

Best Regards,
Sam
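
P.S. For anyone following the thread, the "csns should be within <time>" idea discussed above can be sketched roughly as below. This is only an illustration, not code taken from ldapSynchCheck.py or from the Hobbit extension: the function names (csn_time, replica_lag, in_sync) and the 60 second default are made up for the example, and it assumes the contextCSN value begins with a UTC timestamp of the form YYYYmmddHHMMSS, optionally with fractional seconds, followed by 'Z' and '#'-separated fields.

```python
from datetime import datetime, timedelta, timezone


def csn_time(csn):
    """Return the timestamp part of a contextCSN value as a UTC datetime.

    Assumes the value starts with YYYYmmddHHMMSS[.ffffff]Z followed by '#'.
    """
    stamp = csn.split("#", 1)[0].rstrip("Z")
    fmt = "%Y%m%d%H%M%S.%f" if "." in stamp else "%Y%m%d%H%M%S"
    return datetime.strptime(stamp, fmt).replace(tzinfo=timezone.utc)


def replica_lag(master_csn, replica_csn):
    """Positive result: replica is behind. Negative: replica is ahead."""
    return csn_time(master_csn) - csn_time(replica_csn)


def in_sync(master_csn, replica_csn, threshold=timedelta(seconds=60)):
    """True if the two CSN timestamps differ by no more than threshold.

    The 60 second default is an arbitrary placeholder, not a recommendation.
    """
    return abs(replica_lag(master_csn, replica_csn)) <= threshold


if __name__ == "__main__":
    # Hypothetical values, made up for the example.
    master = "20060210074530Z#000001#00#000000"
    replica = "20060210074500Z#000001#00#000000"
    print("lag:", replica_lag(master, replica))
    print("in sync:", in_sync(master, replica))
```

The two values would come from a base-scope search for the contextCSN attribute on the suffix entry of each server. Reporting the signed lag rather than just a yes/no keeps the "ahead or behind" information mentioned above.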
