Hello All,
 
Not trying to be antagonistic, but I'm curious to know how many of you agree 
with Darren's sentiments on this issue. To restate briefly, ANSI SQL-92 
specifies that when comparing two character fields, trailing spaces should be 
ignored. Correct me if I'm wrong Darren, but you feel this is a bad decision, 
and in fact SQLite's implementation of character comparison (respecting 
trailing spaces) is superior to ANSI's specs. Keep in mind this is not some 
obscure issue that can be subject to different interpretations by different 
vendors; it's very clearly stated: "The ANSI standard requires padding for the 
character strings used in comparisons so that their lengths match before 
comparing them."
 
Does anyone know of another RDBMS (ANSI or no) that respects trailing spaces 
when comparing character data? We have both Oracle 10g and SQL Server 2005 in 
house and they both work "correctly" according to the specification. Has anyone 
tried it with DB2 or Informix? What about PostGres and MySQL? Although I 
haven't asked him, I'm guessing Zbigniew's suggestion a while back for an 
auto-trim feature stemmed from this issue. Other than saving space, would there 
be a need to trim data if WHERE, HAVING clauses and joins followed the spec? 
Also, other than performance (which seems to be the primary concern), would 
anyone would be negatively impacted if the current behavior were changed?
 
Our app creates SQLite tables dynamically based on the output from user-defined 
queries that run against data warehouses (of practically any "flavor") we have 
no control over, and we insert the results into SQLite. Sure - we can handle 
this situation by writing more code looking for spaces everywhere they might 
occur. But to me (and maybe only to me?), it makes sense for SQLite -- where 
reasonably possible -- to attempt to follow clear ANSI guidelines, allowing 
developers to override it only in cases where this adherence produces 
undesirable results. I can't see where this is undesirable from an applications 
standpoint, although I can see where performance may be a concern.
 
Regards,
 
Jeff Fowler
 
________________________________

From: Darren Duncan [mailto:[EMAIL PROTECTED]
Sent: Fri 1/18/2008 4:47 PM
To: sqlite-users@sqlite.org
Subject: RE: [sqlite] SQLite character comparisons



At 4:11 PM -0500 1/18/08, Fowler, Jeff wrote:
>"Better" depends on who you ask - I'd say it's worse, and I bet most
>DBA's would agree. The ANSI standard is to ignore trailing spaces when
>comparing character strings in a WHERE clause, a HAVING clause, or a
>join. So I can take the exact same data, run the exact same query, yet
>get a different answer from SQLite vs. Oracle or SQL Server. In fact, we
>found this issue because we DID get a different answer.

And every SQL implementation already has many differences from every
other one, even with fundamentals, making them all generally
incompatible and proprietary, and this example is right in line with
the kinds of differences they have.  Other implementations have
distinct var-char and fixed-char types, while SQLite only has the
former.

Part of the problem here is that the ANSI standard is itself flawed.

(Maybe this particular case of trailing spaces is an exception, but
many parts of ANSI SQL, AFAIK, are more focused on what the syntax of
SQL is, and they say that the semantics, even of some fundamental
operations, is left up to the implementation to decide for itself.
So what good does it do you if SQL of the same syntax will compile on
different DBMSs if it behaves differently in each one?  And the
standard considers this valid.)

How many other programming langauges besides ANSI SQL treat trailing
spaces as insignificant.

>Regarding whether by extension it should be impossible to create strings
>with trailing spaces; I side with the SQLite developers who say it isn't
>the engine's job to trim blanks in data. Most other engines I've used do
>not trim spaces either, even if the field is a varchar.

And rightly so, you should not trim spaces, because spaces are significant.

Herein lies a wider part of the problem.  The ANSI SQL is
inconsistent in how it treats trailing spaces in strings.  On one
hand it wants to preserve them, but on the other hand it wants to
ignore them in its most fundamental operation other than preserving.

(With my "it should not be possible" sentence, I was not saying that
spaces should not be trimmed in the fictional scenario where a
character string does by definition not contain trailing spaces, but
that code specifying them should produce an error rather than
succeed.  The matter is analagous to what would happen if you write
code that tries to treat the character string literal 'foo' as a
number.)

>But - whether ANSI compliance is considered "bloat" is not really my
>place to comment upon.  I guess it's really is up to the SQLite team.

Yes it is up to the developers.  And they have already demonstrated
willingness to do some things differently than ANSI SQL because they
considered the differences to be improvements, or alternately
reasonable feature cutting.

>Purely from a business usability standpoint (not a programming one), I
>would say there's no question that it's far more useful to do
>comparisons the ANSI way.

And why is it more useful to ignore trailing spaces than respect
them.  And if ignoring them is more useful, why do most programming
languages (AFAIK) respect them?

>  If for some reason I truly want to compare &
>respect trailing spaces, I can still do that using a function such as
>HEX(A) = HEX(B) or something better.

I would argue that it is rediculous to do such ugly things in order
to do something that should be fundamental, and is simple and
fundamental in any other language.  Better for basic '=' comparison
to test that the values are the same, and have some other operator or
function like 'equal_when_trimmed( v1, v2 )' when you want various
exceptional comparisons.

-- Darren Duncan

-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------



Reply via email to