Greetings folks,

I had hoped to get this message out over the weekend, but a trip to the emergency room and four stitches for my 3-year-old drove everything else out of my head (everything's fine, though).
Some of you may have noticed on Friday morning that certain things, like logging into the IMAP server or ssh-ing into the old login server, moya, didn't work. This is what happened, as near as I can tell.

When moya was first set up with the LDAP/Kerberos single sign-on, we configured the connections that request user information from the LDAP server to use SSL. This was set up with a self-signed certificate. Later, TriLUG created its own certificate authority, which we used to sign certificates for the web server, IMAP server, SMTP server, etc. The LDAP service was never converted to use a certificate from the new authority, and later machines didn't even use SSL, mainly because its chief use is when you're actually transmitting passwords over LDAP, which we're not (passwords are handled entirely by Kerberos, which doesn't transmit passwords over the wire at all). So the LDAP connection from moya to the server (yes, also on moya) kept using the old certificate.

This certificate expired this past August! We didn't realize it at the time, though, for two reasons. First, it happened right around the time we moved the login server from moya to dargo (which doesn't connect to the LDAP server using SSL). Second, the name service caching daemon (nscd) was running on moya, and by the time the certificate expired it apparently had everyone (or at least most people) in its cache and could respond to requests even though LDAP lookups weren't working.

We did notice some things start acting weirdly. The user addition script stopped working correctly, and I incorrectly attributed that to the /home directory move.

What finally caused everything to fail was that late Thursday night, around midnight, I restarted the LDAP server and nscd while doing routine maintenance. As soon as they were restarted, nothing on moya (which is the IMAP and mail server) could look up user information. As a result IMAP logins stopped working and, worse yet, mail started bouncing with "No such user" error messages!
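For anyone who wants to guard against the same surprise, here is a minimal sketch of checking how many days a certificate has left. It is only an illustration, not something taken from the TriLUG setup. The function names and the 30-day warning threshold are made up for this example. It assumes you already have the certificate's notAfter string, which "openssl x509 -noout -enddate -in cert.pem" prints after "notAfter=".

```python
import ssl
import time

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after, now=None):
    """Return whole days until the notAfter time (negative once expired).

    not_after is a string such as "Jan  1 00:00:00 2000 GMT", the format
    used in certificate notAfter fields.
    """
    expires = ssl.cert_time_to_seconds(not_after)  # seconds since the epoch
    if now is None:
        now = time.time()
    return int((expires - now) // SECONDS_PER_DAY)


def needs_renewal(not_after, warn_days=30, now=None):
    """True when fewer than warn_days remain (warn_days is an arbitrary choice)."""
    return days_until_expiry(not_after, now) < warn_days
```

Run from cron against each service certificate, a check like this would complain well before the expiry date rather than leaving a cache to hide the problem.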
:-( The outage continued until about 9:30am, when I finally figured out what was happening and fixed the error by turning off SSL.

As a result, people using the TriLUG mail server should check whether they've been suspended from any mailing lists they're on. If you were expecting a particular email between about midnight and 9:30am on Friday morning, you should check with the sender to see if it bounced.

Anyway, we very much apologize for this screwup and hope that it won't happen again.

Cheers,
Tanner Lovelace
--
TriLUG mailing list : http://www.trilug.org/mailman/listinfo/trilug
TriLUG Organizational FAQ : http://trilug.org/faq/
TriLUG Member Services FAQ : http://members.trilug.org/services_faq/
TriLUG PGP Keyring : http://trilug.org/~chrish/trilug.asc
