Peter Tribble wrote:
> On Thu, 2006-06-01 at 20:59, Dave Miner wrote:
>> The administrator would more than likely want to know that his impending 
>> action is in fact going to break something, so that he can pro-actively 
>> deal with the situation, either by not performing the action or 
>> coordinating with the affected user(s) in some way.  To me, this is how 
>> your proposal might work in some large corporations:
> 
> The cynical might suggest that the administrative processes
> in some large corporations would be capable of defeating
> any proposal you come up with...
> 

I'm happy to let them defeat it; I mostly want to keep them from having 
preventable accidents they didn't deliberately choose.

>> - I install Firefox 2 in my domain since the bundled version doesn't get 
>> updated often enough
> 
> Is this because the operating system does not support
> the possibility of multiple versions, or because the
> administration won't use it?
> 

Both happen, though it was more in reference to the fact that release 
cycles take time to roll through.  I don't think it particularly matters 
for my scenario, though I think it's relevant to understanding what 
problems the proposal is addressing.

>> - Outsourced administrator blithely removes package on server with a 
>> library my Firefox install depends on, without knowing that there are 
>> any dependencies
> 
> Did you install firefox on this server? Why are you running
> firefox on a server? Would the outsourced administrator behave
> any differently if they were aware of the dependencies? Why
> was the package removed when it might be used by something else
> (chances are, a 3rd party installation of firefox isn't the
> only thing that's going to break)? Why is firefox reliant on
> a package that might be swept away on a whim? Why is the set
> of installed packages being modified anyway?
> 

Yes, I installed it.  I'm running it there because that's where my Sun 
Ray is connected, perhaps.  I don't know whether the outsourced admin 
would behave differently, but I'd think they might if they had the 
information.  Packages get removed for many reasons - maybe they decided 
to move the service elsewhere, maybe they switched the supported browser 
to Opera and removed things that were no longer needed.
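That scenario is checkable in principle: SVR4 packaging records each 
package's declared dependencies in its depend(4) file, so an 
administrator could scan for reverse dependencies before removing 
anything.  A minimal sketch, using a throwaway directory in place of the 
real /var/sadm/pkg and made-up package names (SUNWfirefox, SUNWlibfoo 
are illustrative, not real packages):

```shell
# Build a fake package database (stand-in for /var/sadm/pkg).
PKGDB=/tmp/fake_sadm_pkg
mkdir -p "$PKGDB/SUNWfirefox/install"
# depend(4) line format: type (P = prerequisite), package abbreviation, name.
printf 'P SUNWlibfoo shared library used by firefox\n' \
  > "$PKGDB/SUNWfirefox/install/depend"

# Before removing SUNWlibfoo, ask: does anything declare a dependency on it?
grep -l 'SUNWlibfoo' "$PKGDB"/*/install/depend
```

If the grep prints any paths, something still depends on the package - 
which is exactly the information the outsourced admin lacked.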

> I used to manage a system with many hundreds of software
> suites, often with many versions. We had a good idea of
> the "official" dependencies, but it was tracking other
> dependencies that was more interesting. Often, 'ls -ultr'
> to see what was actually being accessed was the most
> useful tool. Often, I would be absolutely sure that
> there was no valid reason why something should be accessed,
> but it was necessary to track down what it was before
> wiping it from the system. (Erroneous LD_LIBRARY_PATH
> settings were one common cause.)
> 

Maintaining correct dependencies is a major problem, to be sure, but 
it's somewhat orthogonal to this proposal: if someone has gone to the 
trouble of expressing dependencies in the current state of things, I'd 
rather we were able to check them.
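For what it's worth, the 'ls -ultr' trick Peter mentions sorts files by 
last access time, oldest first, which surfaces things that haven't been 
touched in ages.  A quick demo against a scratch directory (note that 
atime tracking depends on mount options such as noatime/relatime, so 
results will vary):

```shell
# Scratch directory with two dummy "libraries".
DEMO=/tmp/atime_demo
mkdir -p "$DEMO"
touch "$DEMO/libold.so" "$DEMO/libnew.so"
cat "$DEMO/libnew.so" > /dev/null    # read it, possibly bumping its atime

# -u: show/sort by access time; -t: sort by time; -r: oldest first; -l: long listing
ls -ultr "$DEMO"
```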

>> - Sometime later, I run Firefox, it fails.  I'm smart enough to find the 
>> actual Firefox executable, run ldd, and figure out which library is 
>> missing; this is a bit beyond a non-engineer user.
> 
> We have computers. They should do this. For one thing, software
> shouldn't ship with dependencies on components that are likely
> to be removed; for another, why can't software be self-healing?
> 

But if you don't depend on other components, typically you end up 
bundling your own copies, which get stale and replicate the 
maintainability issues more widely.  There's no perfect answer, but I'd 
rather encourage componentry than discourage it.

I agree that we want it to be self-healing, but to do that means 
collecting information that you can use; that's basically all I'm asking 
to have happen here.
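The manual diagnosis I described boils down to one ldd invocation: 
unresolved libraries show up as "not found" in its output.  A sketch, 
using /bin/ls as a stand-in since the actual Firefox binary's path 
varies by install:

```shell
# Point BIN at the real executable (e.g. the firefox-bin under the install
# directory); /bin/ls is just a stand-in that exists everywhere.
BIN=/bin/ls
ldd "$BIN" | grep 'not found' \
  || echo "all shared-library dependencies of $BIN resolved"
```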

> One concern I have is that making it easier to deal with breakage
> might encourage more breakage to occur in the first place.
> 

That's a pretty twisted view of things ;^)  Does SMF automatically 
restarting failed services make it more likely that people write crappy 
services that core dump?  I suppose it could, though there's no evidence 
either way at this point.  It does make it less likely that failures are 
dealt with manually, and that perhaps means the bugs aren't as likely to 
be noticed and reported.  But then again, the infrastructure provides 
the means for us to collect that telemetry automatically, so we could 
get a more complete picture of problems than we ever could the old way.

Dave
