About the ratio of writing data: I usually tend to look more at the read/write
ratio. It allows me to write better caching gadgets and to plan network trips
more carefully. I say plan because, on many occasions, obtaining proper metrics
is beyond my reach (I have many, many problems realistically simulating the
actual load of some of these systems, as some have to process millions of
not-so-simple transactions within a few minutes, and I just don't have access to
hundreds of PCs in my dev environment).

The actual read/write ratio for some of the examples you provide is in the
range of 25 to 50 reads per write. I have checked for books/DVDs on Amazon many
times, yet I have embarked on the actual process of purchasing only a few times.
The logic executed most of the time is merely presentation logic; hence, the
RDBMS only needs to serve reads and only needs to manage small TX contexts (the
level of isolation needed is minimal). Hence this logic is best kept outside the
RDBMS, and I think we'd agree on that.

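To make the read-heavy case concrete, here is a minimal sketch of a read-through
cache that also counts reads and writes, so the ratio can be measured rather
than guessed. Every name in it (CountingCache, the loader function, the
"book-42" key) is hypothetical and only for illustration; the loader stands in
for whatever read actually hits the RDBMS.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical read-through cache that also tracks the read/write ratio. */
class CountingCache<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> loader; // stands in for a database read
    private long reads;
    private long writes;
    private long loads;

    CountingCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Read path: answered from memory unless the entry is missing. */
    V get(K key) {
        reads++;
        V value = cache.get(key);
        if (value == null) {
            loads++; // only a cache miss costs a trip to the database
            value = loader.apply(key);
            cache.put(key, value);
        }
        return value;
    }

    /** Write path: count the write and drop the stale cached copy. */
    void invalidate(K key) {
        writes++;
        cache.remove(key);
    }

    double readWriteRatio() {
        return writes == 0 ? Double.POSITIVE_INFINITY : (double) reads / writes;
    }

    long loads() {
        return loads;
    }
}

class CountingCacheDemo {
    public static void main(String[] args) {
        CountingCache<String, String> catalog =
                new CountingCache<>(key -> "details for " + key);
        for (int i = 0; i < 50; i++) {
            catalog.get("book-42"); // fifty people browse the book...
        }
        catalog.invalidate("book-42"); // ...and one of them buys it
        System.out.println("ratio=" + catalog.readWriteRatio()
                + " loads=" + catalog.loads()); // ratio=50.0 loads=1
    }
}
```

With a ratio like this, fifty reads cost the database a single load; the sketch
is single-threaded and ignores expiry, both of which a real cache would need.
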
What happens when I purchase a book? Here's where I _must_ agree with you, but
only for the simplest systems (and I think the model used at Amazon is one of
these, for reasons I'll state below). A fetch/calculate/store procedure, if
simple and without the need to lock important resources (table/page locks), is
the way to go. But at most this approach yields faster execution times, not
greater scalability. In these cases, moving the business logic out of the RDBMS
yields longer processing times, but the extra time is usually LAN latency, which
may not make a noticeable difference to the end user, and in turn it allows the
same RDBMS (both hardware and software) to process more concurrent queries,
hence scaling better.

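The trade-off above can be put into rough numbers. The sketch below is a
deliberately crude model, and every figure in it is a made-up assumption rather
than a measurement: 4 ms of database time per transaction when the logic runs
as an SP, versus 2 ms of database time plus 2 ms in the app tier and two LAN
round trips of 0.5 ms each when the logic is moved out.

```java
/** Crude, hypothetical cost model; all numbers are illustrative assumptions. */
class TierCostModel {
    /** Transactions per second one DB core can sustain at a given DB cost per tx. */
    static double dbThroughput(double dbMillisPerTx) {
        return 1000.0 / dbMillisPerTx;
    }

    /** End-to-end response time of one transaction, in milliseconds. */
    static double responseMillis(double dbMillis, double appMillis,
                                 int roundTrips, double lanMillisPerTrip) {
        return dbMillis + appMillis + roundTrips * lanMillisPerTrip;
    }
}

class TierCostModelDemo {
    public static void main(String[] args) {
        // Logic inside the RDBMS (SP): everything is DB time, no extra hops.
        System.out.println(TierCostModel.responseMillis(4.0, 0.0, 0, 0.5)); // 4.0
        System.out.println(TierCostModel.dbThroughput(4.0));                // 250.0

        // Logic in the app tier: slower per request, but less DB time per tx.
        System.out.println(TierCostModel.responseMillis(2.0, 2.0, 2, 0.5)); // 5.0
        System.out.println(TierCostModel.dbThroughput(2.0));                // 500.0
    }
}
```

Under these assumed numbers each request gets 1 ms slower while the same
database core handles twice as many transactions. If the business logic is a
much smaller share of the database work, the gain shrinks accordingly.
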
If the assumption is that the split is too expensive (in response time), then
by all means use SPs.

I would also like to add that we're talking about really big systems here. If
you can determine the load and it is easily reproducible in the dev environment,
then you're probably in the best position to decide whether SPs are enough for
you or not. I only tend to walk away from them when the load (or its
consequences) is unknown at the time of building the system, which is most of
the time for me, and that's why I get the big bucks ;-).

Now, what if we add data-warehousing? You know, Amazon not selling you (and
charging you for) items that it cannot deliver because they're not in stock?

Usually, these systems need to implement either a higher isolation level or a
transactional saga (both are very scary to me, as I've had to do such things in
the past, and you shouldn't get started without a lot of aspirin and sending a
lot of flowers to your soulmate in advance ;-) ). BTW, Amazon doesn't do this;
in fact, they have an availability ranking for each product instead of handling
stock more directly. Some logistics systems cannot afford this, particularly
when tracking orders from contractors in the automobile and construction
industries: if car lights do not reach the assembly line on time, the whole
assembly line must stop. In construction the problem worsens, as the work is
always project-oriented. Many companies are now selling mixed concrete right on
time and making lots of profit from it (the dynamics are pretty much the same as
those of IT projects: the project gets budgeted, then it costs more money and
lasts longer, then you renegotiate, etc.).

Whenever these impediments are present, my first instinct is to marshal the
business logic out of the RDBMS as much as possible.

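For readers who haven't met the term, a transactional saga can be sketched as a
list of steps, each paired with a compensation that undoes it; if a step fails,
the completed steps are compensated in reverse order. The code below is a toy,
in-memory illustration only. The Saga and SagaStep classes and the reserve and
charge steps are hypothetical names, not part of any library, and a real saga
would also have to persist its progress and cope with compensations that fail.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** One saga step: an action plus the compensation that undoes it. */
class SagaStep {
    final String name;
    final Runnable action;
    final Runnable compensation;

    SagaStep(String name, Runnable action, Runnable compensation) {
        this.name = name;
        this.action = action;
        this.compensation = compensation;
    }
}

/** Toy saga runner (hypothetical, in-memory, no persistence). */
class Saga {
    private final List<SagaStep> steps = new ArrayList<>();

    Saga step(String name, Runnable action, Runnable compensation) {
        steps.add(new SagaStep(name, action, compensation));
        return this;
    }

    /** Runs steps in order; on failure undoes completed steps in reverse. */
    boolean run() {
        Deque<SagaStep> done = new ArrayDeque<>();
        for (SagaStep s : steps) {
            try {
                s.action.run();
                done.push(s);
            } catch (RuntimeException e) {
                while (!done.isEmpty()) {
                    done.pop().compensation.run(); // most recent step first
                }
                return false;
            }
        }
        return true;
    }
}

class SagaDemo {
    public static void main(String[] args) {
        int[] stock = {1}; // one copy left in the warehouse
        boolean ok = new Saga()
                .step("reserve", () -> {
                    if (stock[0] == 0) throw new IllegalStateException("out of stock");
                    stock[0]--;
                }, () -> stock[0]++)
                .step("charge", () -> {
                    throw new IllegalStateException("card declined");
                }, () -> { })
                .run();
        // The charge failed, so the reservation is released again.
        System.out.println("ok=" + ok + " stock=" + stock[0]); // ok=false stock=1
    }
}
```
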
Reading data and sending it over the wire is really very inexpensive if you
have set up your LAN properly; regrettably, this doesn't happen as often as I'd
like (damn NATs). Also, this is particularly slow in COM+, as you're always
using ADO to fetch and the underlying protocol is usually DCOM itself, first
going in and out of MS DTC, which altogether is a performance hog (yes, even
worse than J2EE comparables, mostly because of the location-transparency magic
COM and COM+ have to perform).

So, the bottom line for me is to try to avoid SPs if there's a choice and
they're not the best fit for the project. Many, many systems don't need to scale
at all, as they're deployed in a controlled environment. Many just cannot afford
to handle BL outside the RDBMS because no app server technology was available
when they were written, not to mention I'm not keen to trash systems that work
and are stable after years of continued service.

I'd like to add that it's always a pleasure to discuss these subjects with so
many bright, experienced and polite engineers. I owe lots to this forum and the
brilliant gentlemen who so frequently visit it.

My 2c,
Juan Pablo Lorandi
Chief Software Architect
Code Foundry Ltd.
Barberstown, Straffan, Co. Kildare, Ireland.
Tel: +353-1-6012050 Fax: +353-1-6012051
Mobile: +353-86-2157900
www.codefoundry.com
Disclaimer:
Opinions expressed are entirely personal and bear no relevance to opinions held
by my employer. Code Foundry Ltd.'s opinion is that I should get back to work.
-----Original Message-----
From: A mailing list for Enterprise JavaBeans development [mailto:[EMAIL PROTECTED]] On Behalf Of Craig McMurtry
Sent: Wednesday, July 17, 2002 7:06 PM
To: [EMAIL PROTECTED]
Subject: Re: Why Ejb?
Your response is certainly not flame-bait, Juan, for it is as careful and well-informed as your posts usually are.
It seems that the nub of the issue is the "weight" of the business logic. May I propose that the ratio of the work involved in executing the business logic versus the work involved in reading and writing the data varies between systems, and that in many systems that ratio is high, whereas in many others it is relatively low. If that ratio is relatively low, then would you agree that if the DBMS is approaching the point of choking in cases where it is executing the business logic in the form of stored procedures, then it would be very close to choking even if it were merely reading and writing data at the behest of component-based middleware that was doing the "thinking"? Then is it possible that how appropriate it would be to increase the amount of data shipping (moving data back and forth to the component-based middleware so that it can apply the processing) would depend on how sophisticated the business logic is?
The inclination that I have developed based on my own experience is that typical business systems, as opposed to scientific applications, do not tend to manipulate data in very sophisticated ways (which is why COBOL was the perfect tool for so many years). So, if one considers the paradigmatic modern e-commerce applications, Amazon.COM, eBay, etc., then except where one is going off to validate credit cards, for example, I cannot see what one would gain by data shipping.
--Craig McMurtry
Juan Pablo Lorandi <[EMAIL PROTECTED]>
Sent by: A mailing list for Enterprise JavaBeans development <[EMAIL PROTECTED]>
07/17/2002 11:12 AM
Please respond to Juan Pablo Lorandi
To: [EMAIL PROTECTED]
cc:
Subject: Re: Why Ejb?
