On Mon, Nov 19, 2007 at 12:35:47PM -0500, Andrew Hume wrote:
> folks,
> 
>       i wanted to draw on the experience of this group.
> my question has to do with models (or as strasser would have said,  
> ontologies).
> 
>       when a higher-level tool tasked with maintaining 2 active web servers
> notices that one has died, it has to pick a new node and start up the  
> web service there.
> how does the tool know what requirements the web service needs? does  
> it share
> an overall model with the web service, so that it knows the concepts  
> of memory,
> ports, etc? or does the tool essentially get a model from the web  
> service
> and then paw through that to figure out what is needed?


        It doesn't need to know these things.  Just like a well written
makefile doesn't need to know how many bits are in an int.

        There are bunches of different ways the cat can be skinned.  I'd say
decompose the above problem into a few different layers.

        The high-level tool shouldn't be asked to keep two webservers running.
The chunks are too big.  Everything in sysadmin is wrapped up in 'two
webservers running'.

        1) Build a reliable web service on a clustered platform.  Take LSF or
PBS.  Whenever a webservice node is down, another can be launched (with lots
of handwaving, like backend storage, uniformity of hardware, etc.).  A
functional 'batching' system can be constructed out of jumpstart / kickstart
plus puppet / lssconf / whatever, to build whatever you'd like out of
allocated hardware.


        2) A little more strict.  When the original package, or 'webserver'
app, is built there will be some acceptance criteria.  Ideally this would be
rolled into the monitoring suite.  When the original package is qualified on a
set of hardware, those boxes are the ones it will allocate to when a node
fails (or when the load increases and violates an SLA).

        
        3) With a little jimmying you can automate the qualification/acceptance
tests.  Then whenever new hardware shows up (and/or is needed) the webservice
app can get qualified against the new hardware.  This is horribly brutal, but
it has to be done anyway; you might as well just do it, and do it automatically
before it is necessary.  (A rough sketch of 1)-3) together follows this list.)
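
        A minimal sketch of that flow in procedural perl.  Every helper sub
here (spare_hosts, qualify_host, launch_webserver, and so on) is a made-up
placeholder for whatever your allocator, kickstart/puppet setup, and
monitoring suite actually provide; this is a shape, not an implementation.

        #!/usr/bin/perl
        use strict;
        use warnings;

        # made-up stand-ins; fill in with your own tooling
        sub spare_hosts       { return () }   # hardware the allocator handed back
        sub active_webservers { return () }   # nodes meant to be running the app
        sub is_webserver_up   { return 0 }    # the monitoring suite's service check
        sub qualify_host      { return 1 }    # run the acceptance tests on a box
        sub launch_webserver  { print "launching on $_[0]\n" }  # kickstart + puppet + app

        # 3) qualify new hardware as it shows up, before it is needed
        my @qualified = grep { qualify_host($_, 'webserver') } spare_hosts();

        # 1) + 2) when a node dies, relaunch on a box the app was qualified on
        for my $node (active_webservers()) {
            next if is_webserver_up($node);
            my $replacement = shift @qualified
                or die "no qualified hardware left, page a human\n";
            launch_webserver($replacement);
        }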

        
        Or am I missing the point? 
        
        As for the ontologies: there can be no global ontology.  Seemingly
every new software thing will end up with some variation; the ontology will
get mutated, expanded, abused, or start lying.  There are some things that
will be relatively stable.  There needs to be some language to describe the
APIs, but much of the logic can be flushed by just using sane defaults and
building uniformity into the structure.


        Some things in the ontology are unknown unknowns, for example bug
incompatibilities in software not yet written.  The more logical proving that
goes into the solution, the worse these get.  Keep the system grounded in
reality.


        You could go through the entire schema-building of a vlan, a cdma space,
vlanids, ip net blocks, etc.  Or you can just define them away.  Define
networks as well-functioning, and ignore and never use their neat little
tricks, for example running several ipv4 nets on the same vlanid.  If you need
more than 4000 /24 networks, get another datacenter / switching fabric.
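
        Concretely, 'defining them away' can be as dumb as one flat table, one
ipv4 /24 per vlanid, and a refusal to model anything cleverer.  The data below
is made up; the point is only that the model stays this simple on purpose.

        # one net per vlanid, nothing shared, no tricks
        my %network = (
            100 => { cidr => '10.0.100.0/24', role => 'web service'   },
            101 => { cidr => '10.0.101.0/24', role => 'admin/imaging' },
        );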

        You can build elaborate and intricate structures, or you can build
self-service modules and do everything procedurally.  If it fails, it fails,
and the admin can pick up the pieces like we always end up doing anyway.

        Back to the web example.  Each webserver will need a few things.  The
host, because it is a good and happy host, will already have an OS (one chosen
by the app SME), a name, network interfaces, and a network that works to get
everywhere it should (at least a service interface for admin, host imaging,
monitoring, etc.).
        
        Then just run a procedure for each requirement (aka make, if you want
rules).  I ended up just using procedural perl, but ruby, shell, anything'll
work; I assume puppet will too.  The web service has some sort of service
interface, and the host needs to get at that service interface, be it a load
balancer, a dns-rr, akamai, or whatever.

        i.e. 

        install_os($thishost, "redhat_myversion"); 
        add_dns_rr($thishost,$httpservicename); 
        allow_firewall($thishost,$httpservice); 
        add_loadbalanced_server($thishost,$service_uri); 
        patch_monkey($this_port, $this_other_port);
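
        Spelled out slightly more, still as a sketch: every sub and name below
(install_os, add_dns_rr, allow_firewall, add_loadbalanced_server, the host and
service names) is a made-up placeholder for whatever tooling you already have,
not a real module.

        #!/usr/bin/perl
        use strict;
        use warnings;

        # placeholder procedures -- in real life each wraps existing tooling
        # (kickstart, dns updates, firewall pushes, the load balancer API, ...)
        sub install_os              { print "install: @_\n"; 1 }
        sub add_dns_rr              { print "dns:     @_\n"; 1 }
        sub allow_firewall          { print "fw:      @_\n"; 1 }
        sub add_loadbalanced_server { print "lb:      @_\n"; 1 }

        my $thishost        = 'web03.example.com';      # made-up names
        my $httpservicename = 'www.example.com';
        my $httpservice     = 'http';
        my $service_uri     = 'http://www.example.com/';

        # run the procedures in order; bail loudly if a step fails so the
        # admin can pick up the pieces
        install_os($thishost, 'redhat_myversion')        or die "os install failed\n";
        add_dns_rr($thishost, $httpservicename)          or die "dns failed\n";
        allow_firewall($thishost, $httpservice)          or die "firewall failed\n";
        add_loadbalanced_server($thishost, $service_uri) or die "lb failed\n";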
        

        To keep some semblance of sanity the network config and OS work should
probably be driven off a different machine than the webserver (in case of
break-in, blah blah).  This structure is good for about 95% of what anyone
really needs to ever do.  A more locked-down environment can be supported by
assuming fewer things work (and making sure they don't) until they are
constructed procedurally.

        Note: Patchmonkey is real.  It is a blocking email/ticket wrapper for
doing manual patch panel work.

        Reclamation or garbage collection of things can be automatic or manual.
Migration to new hardware is just adding two more nodes to your webserver
cluster, then turning the old ones off.
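
        In the same made-up procedural style as the earlier sketches (every sub
and hostname here is a placeholder, including remove_loadbalanced_server,
power_off, and reclaim_hardware):

        # migration: add the new nodes first, then turn the old ones off
        for my $new (qw(web05 web06)) {
            launch_webserver($new);
            add_loadbalanced_server($new, $service_uri);
        }
        for my $old (qw(web01 web02)) {
            remove_loadbalanced_server($old, $service_uri);   # drain traffic first
            power_off($old);
            reclaim_hardware($old);                           # back to the allocator
        }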


        This sort of thing won't get you static verification of functionality,
but with the complexity of today's systems nothing will, so why try in the
general case.  If you're running close to performance tolerances, doing things
where two seemingly unrelated functional nodes will end up contending for
resources (network bandwidth, for example), this works less well, but it can
still work somewhat by doing qos reservations (oh god the pain), or logical,
kinda-enforced-on-the-honor-system resource allocations.

        
        So maybe that is a site/platform-wide, very simple a) with a b) for
everything under it.

        If you have strong structural requirements, such as everything on the
same vlan, switch, colo, or whatever, push that requirement up to the hardware
allocator, something like the sketch below.
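
        i.e. the constraint lives in the allocation request, not in the
webserver procedures (allocate_host and its arguments are made up, as usual):

        # ask the allocator for a box that satisfies the placement constraint;
        # the webserver procedures never need to know about vlans
        my $host = allocate_host(
            role         => 'webserver',
            same_vlan_as => 'web01',     # or same_switch_as / same_colo_as
        ) or die "allocator could not satisfy the placement constraint\n";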


        Or did I miss the point entirely?


                                                been in a hole for a while
                                                        -gulfie
 



>       it seems like there are just two answers:
> 
> a) there is a global model, populated by things like port numbers,  
> main memory,
> cpu load, available disk space, firewalls and so on. all services can  
> state their requirements
> and measure their usage and performance in terms (predicates etc) of  
> these entities.
> 
> b) the tool inherently knows nothing. it somehow discovers the  
> services extant in the cluster
> and then figures out what to do by grubbing around through the  
> ontologies for each service.
> so when the web service says it needs a port number as part of its  
> installation, the tool
> finds the entity 'port number' as part of the models belonging to the  
> 'firewall' service
> and the 'tcp/ip stack' service for a node.
> 
>       it would be tempting to go with a), which is what i think the CIM  
> folks do.
> even though it is a huge model, you can start small and it is fairly  
> straightforward
> to compute what you need. it is work to extend the model, as you need  
> new code
> to deal with structurally new things (as well as new bits of the model).
> 
>       it would seem like b) is the most flexible and supportive of change, 
> having the least
> impact on existing things when a new service (and model) is  
> introduced. but it is harder,
> and if you thought bcfg2 or puppet specifications are a barrier to  
> adoption, just wait
> til you see ontologies!
> 
>       of course, if you are careful, you could code for b) while adopting  
> a) to start with...
> 
> so to restate, which answer a) or b) (or maybe another one) is the  
> best solution
> for the question of models for configuration management tools?
> 
> ------------------
> Andrew Hume  (best -> Telework) +1 732-886-1886
> [EMAIL PROTECTED]  (Work) +1 973-360-8651
> AT&T Labs - Research; member of USENIX and LOPSA
> 
> 
> 

_______________________________________________
lssconf-discuss mailing list
lssconf-discuss@inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/lssconf-discuss
