Oops. I misattributed to Vinoski. It should have been Caganoff.

-Rob

--- In [email protected], "Rob Eamon" 
<rea...@...> wrote:
>
> That's a nice turn-about of Ranadive's "The Power of Now" phrase.
> 
> While I take issue with gratuitous use of "BPM" in this posting, async vs. 
> sync is good to keep in mind when designing the technical details of a 
> system. I agree with Vinoski that synchronous interaction is too often the 
> default approach (e.g. web services). Store-and-forward might be a better 
> default.
> 
> But let's not swing too far the other way in labelling it "the worst of all 
> the options available." A robust, fault-tolerant synchronous interaction is 
> sometimes the right approach.
> 
> -Rob
> 
> --- In [email protected], Gervas Douglas 
> <gervas.douglas@> wrote:
> >
> > <<Steve Vinoski <http://steve.vinoski.net/blog/> nominates RPC as an 
> > historically bad idea 
> > <http://qconlondon.com/london-2009/presentation/RPC+and+its+Offspring%3A+Convenient%2C+Yet+Fundamentally+Flawed>,
> >  
> > yet the synchronous request reply message pattern is undeniably the most 
> > common pattern out there in our client-server world. Steve puts this 
> > down to "convenience" but I actually think it goes deeper than that. 
> > Because RPC is actually not convenient - it causes an awful lot of problems.
> > 
> > In addition to the usual problems cited by Steve, RPC places heavy 
> > demands on scalability. Consider the SLA requirements for an RPC service 
> > provider. An RPC request must receive a response within a reasonable 
> > interval - typically a few tens of seconds at most. But what happens when the 
> > service is under heavy load? What strategies can we use to ensure a 
> > reasonable level of service availability?
> > 
> >    1. Keep adding capacity to the service so as to maintain the required
> >       responsiveness...damn the torpedoes and the budget!
> >    2. The client times out, leaving the request in an indeterminate
> >       state. In the worst case, the service finishes the work only to
> >       find that the client has given up. Under continued assault,
> >       the availability of both client and server continues to
> >       degrade.
> >    3. The service stops accepting requests beyond a given threshold.
> >       Clients which have submitted a request are responded to within the
> >       SLA. Later clients are out of luck until the traffic drops back to
> >       manageable levels. They will need to resubmit later (if it is
> >       still relevant).
> >    4. The client submits the request to a proxy (such as a message queue
> >       
> > <http://www.soabloke.com/2008/09/20/the-architectural-role-of-messaging/>)
> >       and then carries on with other work. The service responds when it
> >       can and hopefully the response is still relevant at that point.
> > 
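The trade-off between Options 2 and 4 can be sketched in a few lines of Python (a rough illustration only; `slow_service` is a hypothetical stand-in for an overloaded provider):

```python
import queue
import threading
import time

def slow_service(payload):
    """Hypothetical stand-in for an overloaded RPC service."""
    time.sleep(0.2)          # simulated processing delay
    return f"processed:{payload}"

# Option 2: synchronous call with a client-side timeout. If the
# deadline passes, the request is left in an indeterminate state --
# the worker thread may still complete it after we have given up.
def call_with_timeout(payload, timeout):
    result = {}
    def worker():
        result["value"] = slow_service(payload)
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout)
    if "value" not in result:
        raise TimeoutError("gave up; outcome of the request is unknown")
    return result["value"]

# Option 4: hand the request to a proxy (here an in-process queue)
# and carry on; the service responds when it can.
requests = queue.Queue()
requests.put("order-42")     # client returns immediately

try:
    call_with_timeout("order-41", timeout=0.05)
except TimeoutError as e:
    print(e)                 # the Option 2 failure mode

print(slow_service(requests.get()))  # drained later, at the service's pace
```

In a real system the queue would be an external broker rather than an in-process `queue.Queue`, but the shape of the trade-off is the same.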
> > Of all these coping strategies, Option 2 seems to be the most 
> > common, even though one of the other strategies is often more 
> > efficient or cost-effective. Option 1 might be preferable 
> > given unlimited funds (ignoring the fact that it is often technically 
> > infeasible). In the real world, Option 2 more often becomes the default.
> > 
> > The best choice depends on what the service consumer represents and on 
> > the cost of the side-effects when the service fails to meet its SLA.
> > 
> > When the client is a human - say ordering something at our web site:
> > 
> >     * Option 2 means that we get a pissed off user. That may represent a
> >       high, medium or low cost to the organization
> >       
> > <http://www.soabloke.com/2008/01/13/cost-vs-benefit-occams-razor-for-the-enterprise-architect/>
> >       depending on the value of that user. In addition there is the cost
> >       of indeterminate requests. What if a request was executed after
> >       the client timed out? There may be a cost of cleaning up or
> >       reversing those requests.
> >     * Option 3 means that we also get a pissed off user - with the
> >       associated costs. We may lose a lot of potential customers who
> >       visit us during the "outage". On the positive side, we minimise
> >       the risk/cost of indeterminate outcomes.
> >     * Option 4 is often acceptable to users - they know we have received
> >       their request and are happy to wait for a notification in the
> >       future. But there are some situations where immediate
> >       gratification is paramount.
> > 
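Option 3 is essentially admission control: accept only as many in-flight requests as the SLA allows, and reject the rest immediately. A minimal sketch, assuming a hypothetical `ThrottledService` wrapper (not any real framework's API):

```python
import threading

class ThrottledService:
    """Option 3 sketch: accept at most `capacity` in-flight requests;
    reject the rest up front so accepted clients stay within the SLA."""
    def __init__(self, capacity):
        self._slots = threading.BoundedSemaphore(capacity)

    def handle(self, work):
        # Non-blocking acquire: past the threshold, fail fast rather
        # than leaving the client to time out in an unknown state.
        if not self._slots.acquire(blocking=False):
            raise RuntimeError("over capacity; please resubmit later")
        try:
            return work()          # the real processing
        finally:
            self._slots.release()

svc = ThrottledService(capacity=1)
print(svc.handle(lambda: "ok"))    # accepted

# Simulate a saturated service: the single slot is already taken.
svc._slots.acquire(blocking=False)
try:
    svc.handle(lambda: "ok")
except RuntimeError as e:
    print(e)                       # later clients are out of luck
```

The key property is the fail-fast rejection: the outcome is never indeterminate, which is exactly the risk Option 3 minimises.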
> > On the other hand, if the client is a system participating in a 
> > long-running BPM flow, then we have a different cost/benefit equation.
> > 
> >     * For Option 2, we don't have a "pissed off" user. But the
> >       transaction times out into an "error bucket" and is left in an
> >       indeterminate state. We must spend time and effort (usually costly
> >       human effort) to determine where that request got to and remediate
> >       that particular process. This can be very costly.
> >     * Option 3 once again has no user impact, and we minimise the risk
> >       of indeterminate requests. But what happens to the halted
> >       processes? Either they error out and must be restarted - which is
> >       expensive - or they must be queued up in some way, in which case
> >       Option 3 becomes equivalent to Option 4.
> >     * In the BPM scenario, Option 4 represents the smoothest path.
> >       Requests are queued up and acted upon when the service can get to
> >       them. All we need is patience, and the process will eventually
> >       complete without the need for unusual process rollbacks or error
> >       handling. If the queue is persistent then we can even handle a
> >       complete outage and restoration of the service.
> > 
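The "persistent queue" point can be illustrated with a small SQLite-backed queue: because the pending requests live in durable storage rather than in memory, a complete outage and restart of the service loses nothing. All names here are illustrative sketches, not any product's API.

```python
import sqlite3

class DurableQueue:
    """Option 4 sketch: a persistent request queue backed by SQLite,
    so in-flight BPM work survives a complete service outage."""
    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS q (id INTEGER PRIMARY KEY, payload TEXT)")
        self.db.commit()

    def put(self, payload):
        # The producer (e.g. a BPM engine) enqueues and moves on.
        self.db.execute("INSERT INTO q (payload) VALUES (?)", (payload,))
        self.db.commit()

    def get(self):
        # The service drains oldest-first whenever it has capacity.
        row = self.db.execute(
            "SELECT id, payload FROM q ORDER BY id LIMIT 1").fetchone()
        if row is None:
            return None
        self.db.execute("DELETE FROM q WHERE id = ?", (row[0],))
        self.db.commit()
        return row[1]

q = DurableQueue(":memory:")   # use a file path for real durability
q.put("step-7:approve-invoice")
print(q.get())
```

A production system would use a message broker with transactional delivery rather than hand-rolled SQL, but the property the blog relies on is the same: the request outlives both the client's attention span and the service's uptime.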
> > So if I am a service designer planning for service capacity 
> > constraints, for human clients I would probably choose (in order) 
> > Options 3 and 4, and weigh the costs 
> > <http://www.soabloke.com/2008/01/13/cost-vs-benefit-occams-razor-for-the-enterprise-architect/>
> > of Option 2. For BPM processes, where the clients are "machines", I would 
> > prefer Option 4 every time. Why make work for myself handling timeouts?
> > 
> > One problem I see so often is that solution designers go for Option 2 by 
> > default - the worst of all the options available to them.>>
> > 
> > You can read this blog at: 
> > http://www.soabloke.com/2009/05/05/the-power-of-later/
> > 
> > Gervas
> >
>

