At 02:44 PM 8/1/2001 -0700, Jeffrey W. Baker wrote:


>On Thu, 2 Aug 2001, Gunther Birznieks wrote:
>
> > When you've had your fill of wrestling over mySQL vs PostGres and stored
> > procs versus inline SQL.... (I know I have long ago)
> >
> > You guys should definitely read the following:
> >
> > http://www.ambysoft.com/persistenceLayer.html
> >
> > One of my current coworkers turned me on to this. I have found it to be one
> > of the best series of articles related towards what it takes to abstract
> > database away from your object layer and the various levels at which it
> > makes sense to do so.
> >
> > You may find the design a little complex, but Scott pretty explicitly
> > states that this is what is necessary for a *large* system. You can always
> > go down a less complex path by choice if you feel your programs aren't
> > complex enough to need the full Persistence Layer structure he advocates.
>
>I've worked with Scott Ambler, and I could record everything Scott Ambler
>knows about actually devleloping large systems on the head of a pin, using
>a magic marker.  That guy is a hopeless academic without the slightest
>clue of how to actually make software happen.

I suppose I can't comment on your opinion as I do not personally know him. 
But I find his statements to be worthy (as explained further below) 
regardless of what you say about his real-world knowledge.

So I can only imagine that he has taken in many comments from users over 
the years and made up his articles based on feedback since I think this one 
is particular is reasonable. Although I've never had to implement all 6 or 
so object abstractions in the ultimate persistence layer he recommends. :)

>Here's the brutal truth about persistance "abstractions" using an RDBMS
>backing store.  At some point, your DBA is going to come up to you and
>tell you that you code is too slow.  You need to rewrite some SQL queries
>to use a different index, or some sorting hints, or whatever.  You will
>realize that you need to pass some extra information down through your
>abstraction layers to make it all happen.  After that happens twice or
>thrice, you will slowly come to realize that your abstraction is really no
>abstraction at all: every time the schema changes, the top level interface
>needs to change as well.

I can't say that I agree.

It depends on what you are coding for. Are you coding for performance or 
are you coding for getting a product out there that is easy to maintain?

In many cases, these two requirements are quite at odds. This thread was 
originally sparked by someone getting annoyed that SQL was embedded 
throughout the code and finding it hard to grasp how to deal with this.

While it's true that the best performance comes from hand-coding the SQL, 
and if you hand-code the SQL, it should arguably be close to the section of 
code that requires this SQL, not all programs require this. In fact, very 
few in my experience. Those that have required speed have required it for a 
small subset of operations in a larger project.

I strongly believe many apps can get away without having SQL embedded. I've 
been doing it for the last several years and definitely coding and 
maintenance time improves with some persistence layer abstraction. But yes, 
you run the risk of having to go back and code a SQL statement or two, and 
you run the risk of somewhat lower performance, but as Scott mentions in 
his article, these should be the well-documented exception, not the rule.

Nick Tonkin posted a very clear and well written post a few minutes ago 
about embedding SQL close to the code which may demonstrate the opposite of 
what I am trying to say. But on the other hand, I could understand that a 
company such as ValueClick really have to make sure their databases and the 
SQL that accesses them are completely tweaked.

So I think given speed requirements, making a HERE document and using other 
clean-coding techniques to keep the SQL close to the code that needs it is 
quite reasonable.

However, in my experience...

Of the things that are harder to duplicate in a persistence layer to one 
degree or another...

Not all applications require transactions
Not all applications require aggregation beyond count
Not all applications require blinding speed (just decent speed)
Not all applications require joins
Not all applications require unions
Not all applications require subselects

And even if you would argue that taking into account a union of 
probabilities an application may need at least one of the above, I have 
found it simply is not true. Usually when an application has a fairly 
complex data model then they need more than one of the above and that's 
when you have to move to SQL.

In other words, if the probability that an app needs each of the features 
above is 5%, then rather than the union of the probabilities being 5 + 5 + 
5 + 5 + 5 + 5, it is really more like 8% where the majority of the 5% is 
really in applications that needs more than one of the above advanced SQL 
list. Note that these percentages are not actual percentages, I just am 
throwing out a number to demonstrate the point.

In simpler terms, most applications really need some place to just dump a 
set of data and if it were flat files, the application and it's users would 
be perfectly happy (robustness and speed of flat file searching not 
withstanding).

It's also telling that mySQL is so successful and before that miniSQL had 
been. Simply put, the features Sybase and Oracle offers about sub-selects 
etc are just not frequently needed. I do think they are useful. I was a 
Sybase DBA for 5 years from Sybase 4.2 on OS/2 and VMS and then evolved to 
Sybase ASE when it came out a 3 years ago (and I quit doing DBA work).

Although I had to spend some of my time helping people performance tune 
their queries, most of the applications I saw accessing "my dbms systems" 
really could have been on mySQL for all the fanciness of their SQL and what 
their applications actually required.

The one thing I would say, is that those of you that have been burned by 
persistence layers are perhaps burned for a parallel reason people have 
been burned by writing their own templating system. They start off simple 
and then they try to accommodate complexities in an automatic way. The best 
persistence layers for SQL that I Have worked with start out simple and 
STAY simple. If you need a complexity, you expose $dbh and then run with it 
and document that anomaly.

Following this rule, few of my programs have the equivalent of $dbh exposed 
and those that do are the exception. This allows me to teach a new 
programmer the persistence layer in half a day and then that's it and the 
programs tend to be easier to maintain and write. It's really not that 
different from utility functions that I see DBI programmers being used to. 
It's just an argument of where you place this code in the program.

Later,
     Gunther


Reply via email to