That's a nice turnabout on Ranadive's "The Power of Now" phrase.
While I take issue with the gratuitous use of "BPM" in this posting, async vs. sync is good to keep in mind when designing the technical details of a system. I agree with Vinoski that synchronous interaction is too often the default approach (e.g. web services). Store-and-forward might be a better default. But let's not swing too far the other way by labelling it "the worst of all the options available." A robust, fault-tolerant synchronous interaction is sometimes the right approach.

-Rob

--- In [email protected], Gervas Douglas <gervas.doug...@...> wrote:
>
> <<Steve Vinoski <http://steve.vinoski.net/blog/> nominates RPC as an
> historically bad idea
> <http://qconlondon.com/london-2009/presentation/RPC+and+its+Offspring%3A+Convenient%2C+Yet+Fundamentally+Flawed>,
> yet the synchronous request-reply message pattern is undeniably the most
> common pattern out there in our client-server world. Steve puts this
> down to "convenience", but I actually think it goes deeper than that,
> because RPC is actually not convenient - it causes an awful lot of
> problems.
>
> In addition to the usual problems cited by Steve, RPC makes heavy
> demands on scalability. Consider the SLA requirements for an RPC service
> provider. An RPC request must be answered within a reasonable time
> interval - typically a few tens of seconds at most. But what happens
> when the service is under heavy load? What strategies can we use to
> ensure a reasonable level of service availability?
>
> 1. Keep adding capacity to the service so as to maintain the required
>    responsiveness... damn the torpedoes and the budget!
> 2. The client times out, leaving the request in an indeterminate
>    state. In the worst case, the service may continue working only to
>    return and find the client has given up. Under continued assault,
>    the availability of both the client and the server continues to
>    degrade.
> 3. The service stops accepting requests beyond a given threshold.
>    Clients which have submitted a request are responded to within the
>    SLA. Later clients are out of luck until the traffic drops back to
>    manageable levels. They will need to resubmit later (if it is
>    still relevant).
> 4. The client submits the request to a proxy (such as a message queue
>    <http://www.soabloke.com/2008/09/20/the-architectural-role-of-messaging/>)
>    and then carries on with other work. The service responds when it
>    can, and hopefully the response is still relevant at that point.
>
> Of all these coping strategies, Option 2 seems to be the most
> common, even when in many cases one of the other strategies would be
> more efficient or cost-effective. Option 1 might be the preferred
> option given unlimited funds (and ignoring the fact that it is often
> technically infeasible). In the real world, Option 2 more often
> becomes the default.
>
> The best choice depends on what the service consumer represents and on
> the cost of the side-effects when the service fails to meet its SLA.
>
> When the client is a human - say, ordering something at our web site:
>
> * Option 2 means that we get a pissed-off user. That may represent a
>   high, medium or low cost to the organization
>   <http://www.soabloke.com/2008/01/13/cost-vs-benefit-occams-razor-for-the-enterprise-architect/>,
>   depending on the value of that user. In addition there is the cost
>   of indeterminate requests. What if a request was executed after
>   the client timed out? There may be a cost of cleaning up or
>   reversing those requests.
> * Option 3 means that we also get a pissed-off user - with the
>   associated costs. We may lose a lot of potential customers who
>   visit us during the "outage". On the positive side, we minimise
>   the risk/cost of indeterminate outcomes.
> * Option 4 is often acceptable to users - they know we have received
>   their request and are happy to wait for a notification in the
>   future.
>   But there are some situations where immediate gratification is
>   paramount.
>
> On the other hand, if the client is a system participating in a
> long-running BPM flow, then we have a different cost/benefit equation.
>
> * For Option 2, we don't have a "pissed-off" user. But the
>   transaction times out into an "error bucket" and is left in an
>   indeterminate state. We must spend time and effort (usually costly
>   human effort) to determine where that request got to and to
>   remediate that particular process. This can be very costly.
> * Option 3 once again has no user impact, and we minimise the risk
>   of indeterminate requests. But what happens to the halted
>   processes? Either they error out and must be restarted - which is
>   expensive - or they must be queued up in some way, in which case
>   Option 3 becomes equivalent to Option 4.
> * In the BPM scenario, Option 4 represents the smoothest path.
>   Requests are queued up and acted upon when the service can get to
>   them. All we need is patience, and the process will eventually
>   complete without the need for unusual process rollbacks or error
>   handling. If the queue is persistent then we can even handle a
>   complete outage and restoration of the service.
>
> So if I am a service designer planning to handle service capacity
> constraints, for human clients I would probably choose (in order)
> Options 3 and 4, and consider the costs
> <http://www.soabloke.com/2008/01/13/cost-vs-benefit-occams-razor-for-the-enterprise-architect/>
> of Option 2. For BPM processes, where clients are "machines", I would
> prefer Option 4 every time. Why make work for myself handling timeouts?
>
> One problem I see so often is that solution designers go for Option 2
> by default - the worst of all the options available to them.>>
>
> You can read this blog at:
> http://www.soabloke.com/2009/05/05/the-power-of-later/
>
> Gervas
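The contrast between Option 2 (client timeout, indeterminate state) and Option 4 (queue proxy, carry on with other work) can be sketched in a few lines of code. This is a minimal, hypothetical illustration, not anything from the blog - `slow_service` and all the other names are invented, and `queue.Queue` stands in for a real (persistent) message queue:

```python
# Hypothetical sketch of Options 2 and 4 from the post. `slow_service`
# stands in for any service under load; `queue.Queue` stands in for a
# real message-queue proxy (a production queue would be persistent).
import queue
import threading
import time

def slow_service(request):
    """A service that takes longer than the client is willing to wait."""
    time.sleep(0.2)
    return f"done:{request}"

# --- Option 2: synchronous call with a client-side timeout ------------
def call_with_timeout(request, timeout):
    result = {}
    worker = threading.Thread(
        target=lambda: result.setdefault("value", slow_service(request)))
    worker.start()
    worker.join(timeout)
    if "value" not in result:
        # The client gives up, but the service may still complete the
        # work later -- the request is now in an indeterminate state.
        return None
    return result["value"]

# --- Option 4: submit to a queue proxy and move on --------------------
requests = queue.Queue()          # stand-in for a persistent message queue
responses = {}

def service_worker():
    while True:
        request = requests.get()
        if request is None:       # shutdown sentinel
            break
        responses[request] = slow_service(request)
        requests.task_done()

if __name__ == "__main__":
    # Option 2: the client times out and learns nothing.
    print(call_with_timeout("order-1", timeout=0.05))   # None

    # Option 4: the client enqueues and carries on; the service
    # responds when it can get to it.
    threading.Thread(target=service_worker, daemon=True).start()
    requests.put("order-2")
    requests.join()               # only so we can show the result here
    print(responses["order-2"])   # done:order-2
```

In the Option 2 branch the work may still finish after the client has gone, which is exactly the "indeterminate state" cost the post describes; in the Option 4 branch nothing is lost under load, requests just wait their turn.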
