On Wed, February 19, 2014 00:06, Alan McKinnon wrote:
> On 18/02/2014 14:16, J. Roeleveld wrote:
>> On Tue, February 18, 2014 12:17, Alan McKinnon wrote:
>>> It's a little more complex than just that. It's an auth service and
>>> users are frequently added, removed and modified. The daemon does
>>> syntax checking on its config file at startup or after being HUP'ed,
>>> but that only finds static errors. It catches things like adding
>>> people to a grop instead of to a group, but misses dynamic mistakes
>>> like adding users to groups that don't exist.
>>
>> The auth-service gets the current state from a static file that is only
>> read upon service-start?
>
> Yes.
>
> It's a good design for reasonably static userbases. The user details,
> privilege definitions, password hashes and such are stored in a single
> flat file readable only by root and protected by file permissions.
> Overall protection is provided by restricted shell access to the host.

True, then again, I use ldap for the user accounts at home.
Allows my wife to change her own password and I can quickly add an account
in case someone needs access.
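
For the flat-file case, the dynamic mistake mentioned earlier (users added to
groups that don't exist) could be caught by a wrapper before the daemon is
HUP'ed. A minimal sketch follows. The config syntax here ("group NAME" and
"user NAME groups=a,b") is invented for illustration, since the real file
format isn't shown in this thread, and find_unknown_groups is a hypothetical
helper, not part of the daemon.

```python
# Hypothetical flat-file format, for illustration only:
#   group <name>
#   user <name> groups=<g1>,<g2>
# The real daemon's syntax is not known from this thread.

def find_unknown_groups(config_text):
    """Return (user, group) pairs where the group is never defined."""
    defined = set()
    memberships = []
    for raw in config_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] == "group" and len(fields) == 2:
            defined.add(fields[1])
        elif fields[0] == "user" and len(fields) >= 2:
            for field in fields[2:]:
                if field.startswith("groups="):
                    for group in field[len("groups="):].split(","):
                        if group:
                            memberships.append((fields[1], group))
    # Check only after the whole file is read, so a group may be
    # defined below the users that reference it.
    return [(u, g) for (u, g) in memberships if g not in defined]
```

A wrapper script could refuse to signal the daemon whenever this returns a
non-empty list, and print the offending pairs instead.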

> We're not talking about AT&T's radius servers for dsl users here who
> sign up on a web form - for that you would use a database backend - this
> is for the company's network support personnel who log into the backbone
> and configure the network itself. There's no rush to add new (and
> unproven...) users so this scheme suits me just fine. Yes, it has
> quirks, but these no longer bother me; we get caught out by new
> sysadmins who have not felt that pain yet.

Show them a blood-stained (ok, some dark red paint) stick when they start.
Then tell them it's used when they kill that service? ;)

>>> Despite this all being run out of cron with wrapper scripts to check
>>> validity, automated additions and safety checks between all three
>>> daemons, plus being fully documented on the internal wiki and in bold
>>> blinking red caps in the login motd, people still find ways to stuff
>>> things up in an attempt to fix it.
>>
>> (OT: Do the bold blinking red caps work on all terminals? :) )
>
>
> Um, OK, you got me there. I was exaggerating!

Too bad, I could use that on one of my machines :)

>>> The daemon also tries to log these errors, by writing to a log file it
>>> has no write permissions on.
>>
>> "setuid" on the group with group-write in the umask not an option?
>
> Hmmm, that's worth investigating. I hadn't really considered that as I
> have an aversion to trying to use umask as a control for anything.

Same here, but that could work.
Then again, I believe setuid on the folder does the same on some OSs. (Not
Linux though)
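
A rough sketch of one reading of that idea, assuming Linux semantics, where
the setgid bit on a directory makes new files inherit the directory's group.
The function name and the default file name are made up for the example. The
umask is relaxed only while the file is created, so the group-write bit
survives.

```python
import os
import stat


def make_group_writable_log(log_dir, name="daemon.log"):
    """Create a log file in log_dir that the directory's group can write.

    Assumes Linux behaviour: setgid on a directory makes new files
    inherit the directory's group. The file name is hypothetical.
    """
    mode = stat.S_IMODE(os.stat(log_dir).st_mode)
    # Add setgid plus group rwx to whatever permissions are already set.
    os.chmod(log_dir, mode | stat.S_ISGID | stat.S_IRWXG)

    old_umask = os.umask(0o002)  # keep group-write, mask only other-write
    try:
        path = os.path.join(log_dir, name)
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o664)
        os.close(fd)
    finally:
        os.umask(old_umask)  # always restore the caller's umask
    return path
```

If the log file already exists, its mode is left untouched, so an existing
file with the wrong permissions would still need a one-off chmod.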

>>> There is nothing I can do about the quality of sysadmins, I have no
>>> input into the HR process and damagement thinks cheaper is always
>>> better, including skills. What I can do is find ways to make the
>>> software more resistant to errors than it already is.
>>
>> And only grant access permissions to these rookies once they have proven
>> they understand rule #1: If In Doubt, Call Someone Who Knows!
>
> Hah! I fought that good fight for years and fought it well. They don't
> call me the sysadmin from hell around here without good reason. And I
> did manage to get a cowboy network under control and instill respect for
> how much breakage Cisco's products can cause.
>
> It's getting harder to grant access based purely on expertise,
> especially when someone crunched the numbers. It turns out that the cost
> of fixing mistakes is far less than the cost of leaving new untrained
> people unutilized and having support tickets pile up...

True, unfortunately...
Then again, a core of really good people can be the better option. But
then you end up becoming overly dependent on that group.

>> But yes, I fully understand the methods of HR and Damagement.
>> It is a financial mistake and risk not to include technical expertise
>> checks in the recruitment phase for technical positions.
>
> Interesting story:
>
> I once had a good shouting match with a support manager about the
> quality of his recruits. I demanded to know why he hired so many
> clueless idiots (my exact words). This manager knows me well so he just
> smiled and said "Alan, you didn't get to see the applicants we rejected.
> These are the best in the market who applied".
>
> *That* was a wake-up call of note :-)

Either that was during the IT "boom", or the recruitment tactics were wrong.

>> How much does it cost the company each time this goes wrong and someone
>> like you has to come online to fix the issue?
>> That is what Damagement needs to understand.
>
> Surprisingly, it's not too expensive. There's always one of us on duty
> or standby and outages don't continue unnoticed for long. Longest that I
> recall is 3 minutes, then the phone starts ringing non-stop. Remember,
> this system is internal, it does not service customers.

3 minutes downtime is acceptable, even for customers.
They generally first assume they are making a mistake ;)

--
Joost

