Hello All, Not trying to be antagonistic, but I'm curious to know how many of you agree with Darren's sentiments on this issue. To restate briefly, ANSI SQL-92 specifies that when comparing two character fields, trailing spaces should be ignored. Correct me if I'm wrong Darren, but you feel this is a bad decision, and in fact SQLite's implementation of character comparison (respecting trailing spaces) is superior to ANSI's specs. Keep in mind this is not some obscure issue that can be subject to different interpretations by different vendors; it's very clearly stated: "The ANSI standard requires padding for the character strings used in comparisons so that their lengths match before comparing them." Does anyone know of another RDBMS (ANSI or no) that respects trailing spaces when comparing character data? We have both Oracle 10g and SQL Server 2005 in house and they both work "correctly" according to the specification. Has anyone tried it with DB2 or Informix? What about PostGres and MySQL? Although I haven't asked him, I'm guessing Zbigniew's suggestion a while back for an auto-trim feature stemmed from this issue. Other than saving space, would there be a need to trim data if WHERE, HAVING clauses and joins followed the spec? Also, other than performance (which seems to be the primary concern), would anyone would be negatively impacted if the current behavior were changed? Our app creates SQLite tables dynamically based on the output from user-defined queries that run against data warehouses (of practically any "flavor") we have no control over, and we insert the results into SQLite. Sure - we can handle this situation by writing more code looking for spaces everywhere they might occur. But to me (and maybe only to me?), it makes sense for SQLite -- where reasonably possible -- to attempt to follow clear ANSI guidelines, allowing developers to override it only in cases where this adherence produces undesirable results. I can't see where this is undesirable from an applications standpoint, although I can see where performance may be a concern. Regards, Jeff Fowler ________________________________
From: Darren Duncan [mailto:[EMAIL PROTECTED] Sent: Fri 1/18/2008 4:47 PM To: sqlite-users@sqlite.org Subject: RE: [sqlite] SQLite character comparisons At 4:11 PM -0500 1/18/08, Fowler, Jeff wrote: >"Better" depends on who you ask - I'd say it's worse, and I bet most >DBA's would agree. The ANSI standard is to ignore trailing spaces when >comparing character strings in a WHERE clause, a HAVING clause, or a >join. So I can take the exact same data, run the exact same query, yet >get a different answer from SQLite vs. Oracle or SQL Server. In fact, we >found this issue because we DID get a different answer. And every SQL implementation already has many differences from every other one, even with fundamentals, making them all generally incompatible and proprietary, and this example is right in line with the kinds of differences they have. Other implementations have distinct var-char and fixed-char types, while SQLite only has the former. Part of the problem here is that the ANSI standard is itself flawed. (Maybe this particular case of trailing spaces is an exception, but many parts of ANSI SQL, AFAIK, are more focused on what the syntax of SQL is, and they say that the semantics, even of some fundamental operations, is left up to the implementation to decide for itself. So what good does it do you if SQL of the same syntax will compile on different DBMSs if it behaves differently in each one? And the standard considers this valid.) How many other programming langauges besides ANSI SQL treat trailing spaces as insignificant. >Regarding whether by extension it should be impossible to create strings >with trailing spaces; I side with the SQLite developers who say it isn't >the engine's job to trim blanks in data. Most other engines I've used do >not trim spaces either, even if the field is a varchar. And rightly so, you should not trim spaces, because spaces are significant. Herein lies a wider part of the problem. The ANSI SQL is inconsistent in how it treats trailing spaces in strings. On one hand it wants to preserve them, but on the other hand it wants to ignore them in its most fundamental operation other than preserving. (With my "it should not be possible" sentence, I was not saying that spaces should not be trimmed in the fictional scenario where a character string does by definition not contain trailing spaces, but that code specifying them should produce an error rather than succeed. The matter is analagous to what would happen if you write code that tries to treat the character string literal 'foo' as a number.) >But - whether ANSI compliance is considered "bloat" is not really my >place to comment upon. I guess it's really is up to the SQLite team. Yes it is up to the developers. And they have already demonstrated willingness to do some things differently than ANSI SQL because they considered the differences to be improvements, or alternately reasonable feature cutting. >Purely from a business usability standpoint (not a programming one), I >would say there's no question that it's far more useful to do >comparisons the ANSI way. And why is it more useful to ignore trailing spaces than respect them. And if ignoring them is more useful, why do most programming languages (AFAIK) respect them? > If for some reason I truly want to compare & >respect trailing spaces, I can still do that using a function such as >HEX(A) = HEX(B) or something better. I would argue that it is rediculous to do such ugly things in order to do something that should be fundamental, and is simple and fundamental in any other language. Better for basic '=' comparison to test that the values are the same, and have some other operator or function like 'equal_when_trimmed( v1, v2 )' when you want various exceptional comparisons. -- Darren Duncan ----------------------------------------------------------------------------- To unsubscribe, send email to [EMAIL PROTECTED] -----------------------------------------------------------------------------