On Tue, February 18, 2014 12:54, Mark David Dumlao wrote:
> On Tue, Feb 18, 2014 at 5:52 PM, J. Roeleveld <jo...@antarean.org> wrote:
>> On Tue, February 18, 2014 10:47, Alan McKinnon wrote:
>>> On 18/02/2014 05:46, Mark David Dumlao wrote:
>>>> I used to use cherokee. Fast, light, awesome, and with a web admin.
>>>> The init script always failed me. /etc/init.d/cherokee stop was not a
>>>> guaranteed stop to all forked cherokee processes - the parent pid
>>>> dies, but some forked process or something, usually related to
>>>> rrdtool, doesn't. Or the parent does exit and erases the pid file but
>>>> it returns control immediately and it's not yet done exiting. Something
>>>> like that or other. Point is, I've several times had to ps aux|grep
>>>> ... kill; zap; start - on production servers.
>>>
>>>
>>> Valid point. Other than vixie-cron (damn thing just never seems to die
>>> properly on any platform so restarts always fail) I don't really run
>>> into these issues
>>
>> Interesting, I have never had issues with restarting vixie-cron using the
>> supplied init-scripts.
>>
>>> What I do run into is daemons that drop privs on start up, like
>>> tac_plus. Unwary new sysadmins always try to start/stop it as root,
>>> causing an unholy mess. Root then owns the log and pid files; when
>>> tac_plus drops privs it can't record its state so continues to service
>>> requests but fails to log any of them. For an auth daemon, that's a
>>> serious issue.
>>
>> Shouldn't sysadmins use the init-scripts for that?
>> If done correctly, permissions should not be an issue.
>>
>> Restarting services without taking file ownership into account will
>> always cause issues. Regardless of the init-system used.
>>
>
> That's just the thing though. As a sysadmin, how do you debug a service
> that isn't starting to begin with?

This isn't what Alan was talking about.
He was talking about restarting an existing, working service.

> Let's say you're new to the service. You're not even sure if you got the
> config right the first time around. Or maybe you're adjusting a setting
> somewhere, and you're confused why it isn't taking effect.

In an environment like the one Alan works in, I wouldn't be the only
person around. There should be someone on call who knows the service.

> All the /upstream documentation/, all the /man pages/, all the
> /usr/share/doc stuff will tell you to start it _raw_. The init script
> obscures the starting options, environment variables, and sometimes even
> the running user from you. What are you gonna do, play a human shell
> script parser? Nobody's perfect, do it enough times and you're going to
> casually gloss over the line where --safe-mode is appended to the string
> depending on the phase of the moon...
>
> If you're lucky, you've never had to start an unfamiliar service, or debug
> someone else's unfamiliar config under time pressure...

I have been on both ends of this.

I have multiple times been under time pressure to get services running
again on unfamiliar systems, talking untrained admins through the process
by phone only.
It is not easy, but staying calm and focused avoids mistakes.
Also, in my experience, a calm, systematic approach is usually faster than
the cowboy method of trying everything I can find on Google.

I have also, too often, had to clean up the mess caused by these cowboy
tactics.

--
Joost

