[sqlite] SQL code vs application code (was: Insert or Update, was: ON DELETE CASCADE along with ON CONFLICT REPLACE)

BareFeetWare Tue, 19 Oct 2010 16:27:18 -0700

On 20/10/2010, at 1:23 AM, Pavel Ivanov wrote:

>> 2. I suggest that you're better off doing the logic entirely in SQL, rather 
>> than application code, for the sake of portability, data integrity and speed.
> 
> I'd say this is a very bad advice for the developer using SQLite.


Thanks for your input. I thought someone might bite ;-)

> First of all "insert or ignore" and "insert or replace" are not portable SQL 
> structures, so we can forget about portability.

I should clarify that by "portable", I meant that the SQLite file itself is 
portable. If all the logic is contained in the SQL itself, then the SQLite file 
can be moved from a dedicated app to a generic GUI app to an iPhone app to a 
PHP driven website. If the logic is in the application code (instead of the 
SQL), then that logic would have to be recreated in Objective-C, PHP and 
whatever other application code is used. In the case of moving the SQLite file 
between SQLite GUI apps, the user has no control over the code, so the logic 
would no longer work. Since SQLite is the one constant of all situations I 
describe, then it seems efficient to me to put as much of the logic into it as 
possible.

Even if you're looking at portability across SQL flavors, the same principle 
holds. We're much better off writing as close to standard SQL as possible and 
adapting only what the new flavor might require/omit, than leaving logic, data 
integrity and data optimization in the hands of an external coding environment. 
You will also pick up for free whatever optimizations the new SQL flavor 
provides, as long as the SQL includes the logic and constraints, otherwise the 
RDMS can't see it.

> Secondly SQL and application are completely separated creatures only in case 
> of some server-side RDBMS. In case of SQLite they are the same
> application and if you add complexity to SQL, you add complexity to your 
> application and it happens very often that complexity on SQL side
> is much worse in performance than some additional application bits on the 
> other side. Just test different approaches and see it for
> yourself.

I guess this is where our experience or approach differs. I've heard others 
vouch for the same perspective that you suggest. In my case, however, I have 
moved SQLite databases from fledgling creation using the command line tool and 
a rudimentary GUI built in AppleScript, some PERL, to PHP driven web sites, a 
desktop application using an SQLite wrapper written in C and most recently 
iPad/iPhone apps written in Objective C. Within any one of those particular 
environments, I've used various wrappers of frameworks to talk to the SQLite 
databases. There are probably a dozen combinations of application environments 
used. Thankfully, since I designed the logic of the databases in SQLite itself, 
I was able to move the SQLite database file from one environment to the next 
with no change to the SQL itself and without any customisation of the 
application code to cater for a particular database. I can to this day open my 
iPad SQLite database using a desktop command line tool, for example, and know 
that any inserts, updates etc I perform will be internally checked and 
consistent since the logic is within the SQL. This is very handy for debugging 
as well.

In the case of the OP (original poster), he was looking through "several GUI 
editors for SQLite". He is not writing his own application code. So in his case 
also, there is a clear line between application code and SQLite code.

I would further argue that even if you have full control of the application 
code and somehow magically know that you'll never need to move it over to a new 
platform (ie portability is not a concern), that you're still better off 
keeping the logic in the SQLite code itself. Queries and aggregates and tests 
are generally much faster when executed as a single transaction, than when 
handed back and forward or in a loop between application code and SQL, 
reinjecting the results of one query back into some other SQL. I know that from 
a procedural mindset (including modern OO languages), I was initially tempted 
to stick to if-then and loop type logic, making several SQL calls throughout. 
But I obtained better performance by thinking in sets and moving the logic to 
SQL. This approach also forced me to look at optimising queries which often 
simplified greatly in a purely SQL context.

And, just touching on the "data integrity" aspect I mentioned: It seems 
intuitively and is practically less error prone to let the SQL take care of the 
data structure that it knows best, rather than create the data integrity rules 
in an application wrapper. Constraints, foreign keys, triggers, views are all 
tools that are designed for this task and since they are closer to the data, 
are more accurate and efficient than us re-inventing the wheel in another layer 
of code.

I'll leave it to someone smarter than I to have the last say:

Quoting from "The Art of SQL":

>> Chapter 1
>> 
>> Define all the constraints you can... Constraints have two major impacts:
>> •    They contribute to ensuring the integrity of your data, guaranteeing 
>> that everything, as far as defined rules are concerned, is consistent with 
>> those rules.
>> •    They provide valuable information about your data to the DBMS kernel, 
>> and more specifically to the optimizer. Even if today the optimizer does not 
>> make full use of all available constraint data, it is likely that in future 
>> releases of the database system, that constraint data will become used for 
>> more sophisticated processing by the kernel.
>> ... semantic information is lost to the database by a fundamentally weak 
>> design... everything that is defined in the database is defined only once, 
>> thus guaranteeing that no program will use the data inconsistently. Implicit 
>> rules ... must be coded into every program accessing the data... Data 
>> semantics belong in the DBMS, not in the application programs.

>> Chapter 2


>> Leave as much as you possibly can to the database optimizer to sort out.


>> Maximize each visit to the database to complete as much work as can 
>> reasonably be achieved for every visit.


>> The nearer to the DBMS kernel your code can execute, the faster it will run. 
>> This is where the true strength of the database lies.


>> Even when using a well-integrated host language within which SQL statements 
>> are embedded, it is still preferable to embed as much procedural logic as 
>> possible within an actual SQL statement, rather than in the host language. 
>> Of the two alternatives, embedding logic in the SQL statement will yield 
>> higher performance than embedding it in the application.


>> Wherever possible, try to embed your conditional logic within your SQL 
>> statements rather than in an associated host language.

>> Chapter 4

>> Any decent SQL engine relies heavily on its query optimizer, which very 
>> often performs an excellent job. However, there are many aspects of the way 
>> optimizers work that you must keep in mind:


>> Where important semantic information relating to the data relations is 
>> embedded in ... application program code, that vital information will be 
>> totally unavailable to the optimizer.

>> Whatever its efforts, if the bulk of your program is fetching data inside 
>> procedural code just to feed into subsequent queries, the optimizer will not 
>> be able to do anything for you.


>> Chapter 9

>> The other key point is to try to pack as much "action" as possible into an 
>> SQL query, and it is in respect to this second key point that aggregate 
>> functions can be particularly useful... When, in a program, you need a lot 
>> of intermediate variables to hold values you get from the database before 
>> reinjecting them into the database as input to other queries, and if you 
>> perform against those variables nothing but very simple tests, you can bet 
>> that you have the algorithm wrong. And it is a striking feature of poorly 
>> written SQL programs to see the high number of lines of code outside of SQL 
>> queries that are simply devoted to summing up, multiplying, dividing, and 
>> subtracting inside loops what is painfully returned from the database. This 
>> is a totally useless and utterly inefficient job: we have SQL aggregate 
>> functions for that sort of work.


Thanks,
Tom
BareFeetWare

 --
Comparison of SQLite GUI tools:
http://www.barefeetware.com/sqlite/compare/?ml

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

[sqlite] SQL code vs application code (was: Insert or Update, was: ON DELETE CASCADE along with ON CONFLICT REPLACE)

Reply via email to