Hi,

Perhaps I'm putting too fine a filter on my reading of this
exchange, but there seems to be some confusion between "web service"
and "web server" as the construct in question...?

While some principles and practices might apply to both, I can see some
differences at a more detailed level if the question is being asked at
the "web service" level (registries, WSDL, WS-Policy, SCA, etc.) vs the
"web server" level.

Cheers,
BobN

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Gulfie
Sent: Thursday, January 03, 2008 5:55 AM
To: Andrew Hume
Cc: lssconf-discuss@inf.ed.ac.uk
Subject: Re: [lssconf-discuss] models

On Mon, Nov 19, 2007 at 12:35:47PM -0500, Andrew Hume wrote:
> folks,
> 
>       i wanted to draw on the experience of this group.
> my question has to do with models (or as strasser would have said,  
> ontologies).
> 
>       when a higher-level tool tasked with maintaining 2 active web
> servers notices that one has died, it has to pick a new node and
> start up the web service there.
> how does the tool know what requirements the web service needs? does
> it share an overall model with the web service, so that it knows the
> concepts of memory, ports, etc? or does the tool essentially get a
> model from the web service and then paw through that to figure out
> what is needed?


        It doesn't need to know these things.  Just like a well-written
makefile doesn't need to know how many bits are in an int.

        There are bunches of different ways the cat can be skinned.
I'd say decompose the above problem into a few different layers.

        The high-level tool shouldn't be asked to keep two webservers
running.  The chunks are too big.  All of sysadmin is in 'two
webservers running'.

        1) Build a reliable web service on a clustered platform.
Take LSF or PBS.
        Whenever a webservice node is down, another can be launched
(with lots of handwaving about backend storage, uniformity of
hardware, etc.).  A functional 'batching' system can be constructed
out of jumpstart / kickstart plus puppet / lssconf / whatever, to
build whatever you'd like out of allocated hardware.
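
        Roughly, the whole "high level tool" for (1) can be a dumb
watchdog loop.  A sketch only: node_is_up() and
launch_webservice_node() here are made-up stand-ins for whatever
your batch system or monitoring actually gives you, and the node
names are invented.

    #!/usr/bin/perl
    # Watchdog sketch: notice a dead serving node, grab a spare from
    # the qualified pool, rebuild it, and swap it into service.
    use strict;
    use warnings;

    my @pool    = qw(node01 node02 node03 node04);  # qualified hardware
    my @serving = qw(node01 node02);                # nodes running the web service

    # placeholder liveness check; really this is your monitoring suite
    sub node_is_up {
        my ($n) = @_;
        return system("ping -c1 -W1 $n >/dev/null 2>&1") == 0;
    }

    # placeholder rebuild; really jumpstart/kickstart plus puppet/lssconf/whatever
    sub launch_webservice_node {
        my ($n) = @_;
        print "rebuild and start the web service on $n\n";
    }

    while (1) {
        for my $dead (grep { !node_is_up($_) } @serving) {
            my %busy = map { $_ => 1 } @serving;
            my ($spare) = grep { !$busy{$_} } @pool;
            die "no spare hardware left\n" unless defined $spare;
            launch_webservice_node($spare);
            @serving = ((grep { $_ ne $dead } @serving), $spare);
        }
        sleep 60;
    }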


        2) A little more strict.  When the original package, or
'webserver' app, is built there will be some acceptance criteria.
Ideally these would be rolled into the monitoring suite.  When the
original package is qualified on a set of hardware, those boxes are
the ones it will allocate to when a node fails (or when the load
increases and violates an SLA).
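
        In other words, the allocation rule is just "pick a free box
from the set this package was qualified on".  A sketch; the
%qualified data would really come from wherever your acceptance runs
record their results, and the names are made up.

    use strict;
    use warnings;

    # service name -> hosts the package was qualified on
    my %qualified = ( httpservice => [qw(node03 node04 node07)] );

    # pick a qualified host that isn't already in use
    sub pick_replacement {
        my ($service, %in_use) = @_;
        my ($spare) = grep { !$in_use{$_} } @{ $qualified{$service} || [] };
        return $spare;              # undef means go qualify more hardware
    }

    my $node = pick_replacement('httpservice', node03 => 1);
    print defined $node ? "allocate $node\n" : "nothing qualified and free\n";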

        
        3) With a little jimmying you can automate the
qualification/acceptance tests.  Then whenever new hardware shows up
(and/or is needed) the webservice app can get qualified against the
new hardware.  This is horribly brutal, but it has to be done anyway;
you might as well just do it, and do it automatically, before it is
necessary.
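
        The automation itself is nothing fancy: run the same checks
the monitoring suite runs, and only add the box to the qualified
pool if they all pass.  A sketch; the checks and hostname here are
invented.

    use strict;
    use warnings;

    # the acceptance checks; ideally the exact same ones monitoring runs
    my @acceptance = (
        sub { my ($h) = @_; system("ping -c1 -W1 $h >/dev/null 2>&1") == 0 },
        # ... port reachable, disk present, app smoke test, etc. ...
    );

    sub run_acceptance_tests {
        my ($host) = @_;
        for my $check (@acceptance) {
            return 0 unless $check->($host);
        }
        return 1;
    }

    # when new hardware shows up, qualify it and add it to the pool
    sub qualify_new_hardware {
        my ($host, $qualified) = @_;            # $qualified is an array ref
        push @$qualified, $host if run_acceptance_tests($host);
    }

    my @qualified;
    qualify_new_hardware('node09', \@qualified);
    print "qualified: @qualified\n";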

        
        Or am I missing the point? 
        
        As for the ontologies: there can be no global ontology.
Seemingly every new software thing will end up with some variation;
the ontology will get mutated, expanded, abused, or start lying.
There are some things that will be relatively stable.  There needs to
be some language to describe the APIs, but much of the logic can be
flushed by just using sane defaults and building uniformity into the
structure.


        Some things in the ontology are unknown unknowns, for example
bug incompatibilities in software not yet written.  The more logical
proving that goes into the solution, the worse these will be.  Keep
the system grounded in reality.


        You could go through the entire schema-building of a vlan, a
CIDR space, vlanids, ip net blocks, etc.  Or you can just define them
away.  Define networks as well-functioning, and ignore and never use
their neat little tricks, for example running several ipv4 nets on
the same vlanid.  If you need more than 4000 /24 networks, get
another datacenter / switching fabric.

        You can build elaborate and intricate structures, or you can
build self-service modules and do everything procedurally.  If it
fails, it fails, and the admin can pick up the pieces like we always
end up doing anyway.

        Back to the web example.  Each webserver will need a few
things.  The host, because it is a good and happy host, will already
have an OS (chosen by the app SME), a name, network interfaces, and a
network that works to get everywhere it should (at least a service
interface for admin, host imaging, monitoring, etc.).
        
        Then just run a procedure for each requirement (aka make, if
you want rules).  I ended up just using procedural perl, but ruby,
shell, anything'll work; I assume puppet will too.  The web service
has some sort of service interface, and the host needs to get at that
service interface, be it a load balancer or a dns-rr, or akamai, or
whatever.

        i.e. 

        install_os($thishost, "redhat_myversion"); 
        add_dns_rr($thishost,$httpservicename); 
        allow_firewall($thishost,$httpservice); 
        add_loadbalanced_server($thishost,$service_uri); 
        patch_monkey($this_port, $this_other_port);
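
        Underneath, each of those is just a dumb little sub.  For
example, a sketch of add_dns_rr(): this one shells out to nsupdate,
and the key path, TTL, and the node/service names in the example
call are invented, so substitute however your DNS is actually
managed.

    use strict;
    use warnings;

    sub add_dns_rr {
        my ($host, $servicename) = @_;
        # resolve the host and add an A record for the service name
        my $packed = (gethostbyname($host))[4]
            or die "cannot resolve $host\n";
        my $addr = join '.', unpack 'C4', $packed;
        open my $ns, '|-', 'nsupdate', '-k', '/etc/rndc.key'
            or die "cannot run nsupdate: $!\n";
        print {$ns} "update add $servicename. 300 A $addr\nsend\n";
        close $ns or warn "nsupdate failed for $servicename\n";
    }

    # e.g.
    add_dns_rr('node03', 'www.example.com');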
        

        To keep some semblance of sanity, the network config and OS
work should probably be driven off a different machine than the
webserver (in case of break-in, blah blah).  This structure is good
for about 95% of what anyone really needs to ever do.  A more
locked-down environment can be supported by assuming fewer things
work (and making sure they don't) until they are constructed
procedurally.

        Note: Patchmonkey is real.  It is a blocking email/ticket
wrapper for doing manual patch-panel work.
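
        Not the real thing, but the shape is roughly: file the
ticket/mail, then block until a human says the cable is in.
Something like the following; the address, paths, and the
marker-file convention are all invented.

    use strict;
    use warnings;

    sub patch_monkey {
        my ($from_port, $to_port) = @_;
        my $marker = "/var/tmp/patchmonkey.$from_port.$to_port.done";
        my $msg    = "please patch $from_port -> $to_port, then: touch $marker";
        system("echo '$msg' | mail -s 'patch panel work' noc\@example.com") == 0
            or warn "could not send ticket mail\n";
        sleep 300 until -e $marker;   # block until someone does the work
        unlink $marker;
    }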

        Reclamation or garbage collection of things can be automatic
or manual.  Migration to new hardware is just adding two more nodes
to your webserver cluster, then turning the old ones off.


        This sort of thing won't get you static verification of
functionality, but with the complexity of today's systems nothing
will, so why try in the general case.  If you're running close to
performance tolerances, doing things where two seemingly unrelated
functional nodes will end up contending for resources (network
bandwidth, for example), this works less well, but it can still work
somewhat by doing qos reservations (oh god the pain), or logical,
kinda enforced-on-the-honor-system resource allocations.

        
        So maybe that is a site/platform-wide, very simple a), with a
b) for everything under it.

        If you have strong structural requirements, such as all on
the same vlan, or switch, or colo, or whatever, push that requirement
up to the hardware allocator.


        Or did I miss the point entirely?


                                                been in a hole for a while
                                                        -gulfie
 



>       it seems like there are just two answers:
> 
> a) there is a global model, populated by things like port numbers,
> main memory, cpu load, available disk space, firewalls and so on.
> all services can state their requirements and measure their usage
> and performance in terms (predicates etc) of these entities.
> 
> b) the tool inherently knows nothing. it somehow discovers the
> services extant in the cluster and then figures out what to do by
> grubbing around through the ontologies for each service.
> so when the web service says it needs a port number as part of its
> installation, the tool finds the entity 'port number' as part of the
> models belonging to the 'firewall' service and the 'tcp/ip stack'
> service for a node.
> 
>       it would be tempting to go with a), which is what i think the
> CIM folks do.
> even though it is a huge model, you can start small and it is fairly
> straightforward to compute what you need. it is work to extend the
> model, as you need new code to deal with structurally new things (as
> well as new bits of the model).
> 
>       it would seem like b) is the most flexible and supportive of
> change, having the least impact on existing things when a new
> service (and model) is introduced. but it is harder, and if you
> thought bcfg2 or puppet specifications are a barrier to adoption,
> just wait til you see ontologies!
> 
>       of course, if you are careful, you could code for b) while
> adopting a) to start with...
> 
> so to restate, which answer a) or b) (or maybe another one) is the
> best solution for the question of models for configuration
> management tools?
> 
> ------------------
> Andrew Hume  (best -> Telework) +1 732-886-1886
> [EMAIL PROTECTED]  (Work) +1 973-360-8651
> AT&T Labs - Research; member of USENIX and LOPSA
> 
> 
> 

_______________________________________________
lssconf-discuss mailing list
lssconf-discuss@inf.ed.ac.uk
http://lists.inf.ed.ac.uk/mailman/listinfo/lssconf-discuss
