[service-orientated-architecture] Re: 7x24 monitoring and management of the SOA environment

Robin Wed, 16 Aug 2006 16:11:39 -0700

I think that monitoring running services is like monitoring web
applications. You need a tool that parses the log files to detect
errors that might cause incidents, and/or a tool that acts as a probe,
constantly checking service availability by invoking one of the
service's methods.
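
To make the two approaches concrete, here is a minimal sketch in Python.
It is only an illustration: the error pattern, the 2-second threshold
and the function names are assumptions of mine, not taken from any
particular monitoring product. The probe receives the service call as a
plain callable, so it can wrap whatever client the service really needs.

```python
import re
import time
from typing import Callable, Iterable, List

# Hypothetical pattern; a real one depends on the log format in use.
ERROR_PATTERN = re.compile(r"\b(ERROR|FATAL|Timeout)\b")


def scan_log(lines: Iterable[str]) -> List[str]:
    """Return the log lines that match the error pattern."""
    return [line.rstrip("\n") for line in lines if ERROR_PATTERN.search(line)]


def probe(invoke: Callable[[], object], max_seconds: float = 2.0) -> str:
    """Invoke one service operation and classify the outcome.

    Returns "DOWN" if the call raises, "SLOW" if it succeeds but takes
    longer than max_seconds, and "UP" otherwise.
    """
    start = time.monotonic()
    try:
        invoke()
    except Exception:
        return "DOWN"
    elapsed = time.monotonic() - start
    return "SLOW" if elapsed > max_seconds else "UP"
```

A scheduler would call probe() at a fixed interval and raise an alert
when the status changes. Note that a probe like this only tells you the
service answered; it says nothing about the path the request took.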

But in SOA, the complexity often lies not in monitoring the services
themselves but in everything that sits between the service consumer and
the service provider: mediation systems (ESB, async middleware),
registries, hardware devices (load-balancing, fail-over, ...).
When an application invokes an external service, we might know exactly
which logical service is used, but the service consumer application is
largely ignorant of which physical path its message or request is
following. This is a consequence of loose coupling in large environments.

In these circumstances, correlating an incident on a user-facing
application with an undetected problem in the plumbing is a challenge.
The complexity grows with the number of intermediaries in the chain.

In one of my previous jobs, we had to manage 600+ services from a
variety of back-ends and technologies.

There was one "24*7 on call" operational team dealing exclusively with
middleware and services. This team used, and kept up to date, a
registry/repository documenting the dependencies between service
consumers, service providers, installed service versions and deployment
environments.

We also built an end-to-end logging system to track the physical path
that a message or a request followed.
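
As a rough sketch of the idea (not the system we actually built): every
intermediary writes one record per message carrying a shared correlation
identifier, and the path is rebuilt afterwards by grouping and sorting
those records. The record fields and hop names below are hypothetical,
and the sketch assumes the clocks of the different machines are
reasonably synchronised.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class HopRecord(NamedTuple):
    """One log entry written by an intermediary (hypothetical layout)."""
    correlation_id: str
    timestamp: float
    hop: str


def paths_by_message(records: Iterable[HopRecord]) -> Dict[str, List[str]]:
    """Rebuild the ordered list of hops each message went through."""
    grouped: Dict[str, List[HopRecord]] = defaultdict(list)
    for record in records:
        grouped[record.correlation_id].append(record)
    return {
        cid: [r.hop for r in sorted(recs, key=lambda r: r.timestamp)]
        for cid, recs in grouped.items()
    }
```

The important design point is that the correlation identifier must be
created once, at the edge, and propagated unchanged by every
intermediary; otherwise the records cannot be joined.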

You can also think of SOA operational management as something proactive.
For example: we are about to stop one physical machine for planned
maintenance; what is the impact?
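
With the dependencies documented in a registry, that question becomes a
graph walk. The sketch below assumes two hypothetical mappings exported
from the registry: service -> machines hosting it, and service -> its
direct consumers. It is an illustration, not the query interface of any
real registry product.

```python
from typing import Dict, Set


def impact_of_stopping(
    machine: str,
    deployments: Dict[str, Set[str]],
    consumers: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Estimate what is affected when one machine is stopped.

    deployments: service name -> set of machines hosting it
    consumers:   service name -> set of its direct consumers
    """
    # Services whose only instance runs on this machine become unavailable.
    lost = {s for s, hosts in deployments.items() if hosts == {machine}}
    # Services with other instances keep running with reduced capacity.
    degraded = {
        s for s, hosts in deployments.items()
        if machine in hosts and len(hosts) > 1
    }
    # Walk the consumer relation transitively from the lost services.
    affected: Set[str] = set()
    stack = list(lost)
    while stack:
        service = stack.pop()
        for consumer in consumers.get(service, ()):
            if consumer not in affected:
                affected.add(consumer)
                stack.append(consumer)
    return {"lost": lost, "degraded": degraded, "affected_consumers": affected}
```

The answer is only as good as the registry content, which is why
keeping it up to date was part of the operational team's job.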

I should say that discovering the root cause of an incident is
difficult, but the most difficult situations are:
- when only a fraction of the requests or messages are lost or unhandled
- when the service is working but too slow
- when the service is dynamically invoking other services
In these situations, you really have to analyse the full chain and
start correlating information from the different systems involved to
see exactly where the problem is located. A painful job!
<http://blogs.ittoolbox.com/eai/applications/archives/troubleshooting-composite-applications-8759>
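
For the first two situations, part of the correlation can be automated
once every hop logs a correlation identifier. The sketch below is an
assumption-laden illustration: it takes (correlation_id, hop, timestamp)
tuples and a known, fixed chain of hops, and reports per link how many
messages went missing and how long the link took on average. It does
not cover the third situation, where the chain is decided dynamically.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def hop_report(
    records: Iterable[Tuple[str, str, float]],
    chain: List[str],
) -> List[dict]:
    """Summarise loss and delay for each link of an expected chain.

    records: (correlation_id, hop, timestamp) tuples from all systems
    chain:   hop names in the order a message is expected to visit them
    """
    # correlation_id -> {hop: timestamp at which the message was seen}
    seen: Dict[str, Dict[str, float]] = defaultdict(dict)
    for correlation_id, hop, timestamp in records:
        seen[correlation_id][hop] = timestamp

    report = []
    for prev, cur in zip(chain, chain[1:]):
        entered = [hops for hops in seen.values() if prev in hops]
        arrived = [hops for hops in entered if cur in hops]
        delays = [hops[cur] - hops[prev] for hops in arrived]
        report.append({
            "link": f"{prev} -> {cur}",
            "lost": len(entered) - len(arrived),
            "mean_delay": mean(delays) if delays else None,
        })
    return report
```

A link with a non-zero "lost" count or an unusually high "mean_delay"
is where to start looking; it narrows the search but does not replace
reading the logs of the systems on that link.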

Robin

--- In [email protected], "Erik van
Gilder" <[EMAIL PROTECTED]> wrote:
>
> Hi,
> 
> Have you had much success including your existing 7x24 operations
> staff in managing your "SOA" environment, and, if so, to what do you
> attribute your success? I'm working in a large centralized IT
> operations where the 7x24 monitoring and management of the computing
> environment is the responsibility of an enterprise command center. The
> command center has deep roots in a mainframe operations and continues
> to struggle with the e-commerce infrastructure. I'd like to see the
> operations staff help monitor and manage the environment otherwise the
> developers will bear a heavy burden. Any thoughts?
> 
> In our case, the toolset includes WebMethods, WebSphere and Tivoli,
> but I believe the problem to be tool-independent. 
> 
> Thanks,
> Erik
>
