> This is very true. At the same time there has to be a fine balance
> between operations and engineering support. Products should be easy
> enough for operations to get the day-to-day work accomplished, but
> have enough flexibility and control for experts to get under the hood
> for tuning and troubleshooting. No organization should be based on
> having armies of highly experienced and costly technical staff.
> Operations folks have to be able to take care of the bulk of BAU
> tasks so that the highly skilled SAs, engineers, and architects can
> focus on solving real problems and building new solutions. Otherwise,
> the product becomes too complex to operate and manage, which is a bad
> combo when paired with expensive hardware and software. When that
> combo comes up, it's an easy target for disposal.
I'm not saying all support personnel have to be experts. A diligent click-monkey is good enough to answer the phone and take care of simple problems, _provided_ they don't create more work than they resolve, and _provided_ the simplicity isn't the deceptive sort.

For example: ZFS mostly does a good job of providing the appearance of simplicity for reasonably routine uses, but frequently updated best-practices documents would be in order to advise on configuration choices. (For example: above what disk size, or beyond how many disks in a raidz, should one use raidz2 for reasonable safety? How about raidz3? How does using desktop disks vs. "enterprise" grade disks affect that choice? I've seen that sort of thing discussed in general terms, but if there are reasonably straightforward guidelines, I must have missed them.) Still, the separation between the zpool command and the zfs command at least roughly aligns with the difference between laying out a suitable storage pool and simply dividing it up into filesystems or maintaining quotas and such.

For virtualization, though, especially with multiple ways of achieving it (or at any rate, some approximation of it), the configuration choices seem potentially far more complicated. (The first example that comes to mind: how many different ways might one provide storage to client LDOMs? Which ways offer good performance and the freedom to reboot any one control or service LDOM without affecting the clients? How much complexity does that add, and how does it complicate moving LDOMs on the fly?) I don't think that (for example) Ops Center goes all that far in providing a consistent illusion of simplicity of management, at least not from what I've read thus far (I haven't had the opportunity to actually use it yet). I'm certainly not arguing against powerful tools, even reasonably easy-to-use ones.
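(To make that zpool/zfs division of labor concrete, here's a rough sketch; the pool name, device names, and dataset layout are made up, and the choice of raidz2 follows the usual "double parity once disks or vdevs get big" rule of thumb rather than any official guideline:)

```shell
# Pool layout is the zpool command's territory: a 6-disk raidz2 vdev,
# so any two disks can fail without data loss. Device names hypothetical.
zpool create tank raidz2 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0 c0t6d0

# Dividing the pool up and setting policy is the zfs command's territory:
zfs create tank/home
zfs create tank/home/alice
zfs set quota=50G tank/home/alice
zfs set compression=on tank/home
```

A click-monkey can live almost entirely in the second half; it's the first half where the raidz-vs-raidz2-vs-raidz3 judgment calls hide.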
I would prefer command-line ways of doing things in addition to a GUI, both for scripting and for recovering from major power or network disruptions (where interfaces that require access to more ports, or more authentication infrastructure to be running, might not yet be functional).

I think part of the problem is that simplicity is not only deceptive, but almost intentionally deceptive. How simple some powerful management software is, is typically one of its selling points. But without a very clear presentation not only of what it can do, but of what it can't (or worse, can sometimes but not always, depending on other circumstances) -- well, in most places, understanding and spending aren't done by the same person, although hopefully some communication occurs to align them. Does the seller of a product really have an interest in being clear about its limitations? In the long run, I think they do; better a few fewer immediate sales and more customer loyalty than more sales right away and a lot of those customers never coming back. The conflict goes beyond that, though: when simplicity is a selling point, great care is required to avoid treating a realistic appraisal of strengths and limitations as merely something that distracts from the argument that life will be simple if you just buy product X. When someone peddling simplicity follows the "Miracle on 34th Street" policy and tells me that their product _can't_ do everything I want it to, or even shocks me completely by telling me of a better alternative from a competitor, then perhaps I'll believe that simplicity isn't being oversold.

Here's one for you: starting to read the Ops Center documentation, I see it mentioning LDOM software 1.2, while 1.3 is the latest. And I see discussion of provisioning a control LDOM, but nothing about an additional service LDOM.
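(For concreteness, here's a rough sketch of what that extra service domain buys you, done by hand with ldm commands. All domain, volume, and device names are made up, and this assumes the virtual-disk mpgroup multipathing that I understand arrived around LDoms 1.2 -- corrections welcome:)

```shell
# Two virtual disk services: one in the control domain ("primary"),
# one in a separate service domain ("alternate").
ldm add-vds primary-vds0 primary
ldm add-vds alternate-vds0 alternate

# The same shared backend exported through both services, tied together
# with mpgroup so the guest's disk keeps working through a reboot of
# either domain.
ldm add-vdsdev mpgroup=gdisk0 /dev/dsk/c2t0d0s2 gvol0@primary-vds0
ldm add-vdsdev mpgroup=gdisk0 /dev/dsk/c2t0d0s2 gvol0@alternate-vds0
ldm add-vdisk gdisk0 gvol0@primary-vds0 guest0

# Networking: one vnet from a virtual switch in each domain; the guest
# then runs IPMP across the two interfaces for the same reason.
ldm add-vnet vnet0 primary-vsw0 guest0
ldm add-vnet vnet1 alternate-vsw0 guest0
```

That's the kind of thing I'd expect "simple" management software to either do for me or at least document.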
Yet a configuration that allows a reboot of either the control or the service LDOM without affecting the guest LDOMs requires both, and requires multipathing (via both) for client storage and network. The most robust and versatile configuration (where one can maintain the infrastructure with potentially zero impact on the guest LDOMs) seems not to be discussed. Nor does it appear that Ops Center can handle migrating an active guest LDOM (which I gather requires 1.3, and has a lot of constraints as to the circumstances under which it will work).

That's not even getting into containers (zones + resource controls), but it sure looks to me that even with the added complexity of Ops Center providing a (more superficial than I'd hoped) illusion of simplicity for some tasks (status, inventory, provisioning), it's a good bit more difficult than, e.g., a managed VMware installation, where not only can running VMs be migrated, but that can even happen automatically as needed. (Not to say that such a VMware installation is _really_ simple either; they can get cranky too. But at what looks like the present stage of the game, they have rather more managed functionality, albeit at the cost of higher CPU overhead and quite possibly higher power and cooling requirements.)

-- 
This message posted from opensolaris.org
_______________________________________________
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org