Simon Slavin wrote: > On 9 Nov 2011, at 8:03pm, Yuriy Kaminskiy wrote: > >> Look at: SELECT hex(X'1245005679'),hex(X'1245001234'); >> >> And compare: SELECT X'1245005679' LIKE X'1245001234'; 1 -- incorrect SELECT >> X'1245005679' = X'1245001234'; 0 -- correct SELECT X'1245005679' > >> X'1245001234'; 1 -- correct >> >> "LIKE" (in both "native" and "icu" implementations) ignores value length >> (sqlite_value_bytes) and stops at NUL character. Compare that with "=" and >> ">" that compares full value. >> >> Arguable SQL_BLOB should be illegal operand for LIKE/GLOB, but then this >> should be documented and they should return error when called with BLOB >> argument instead of returning nonsense. > > The definition of LIKE on the expression page states (not with complete > clarity) that LIKE operates on strings. So we're back to an old favourite: a > decision as to what constitutes a string inside SQLite. One definition would > say that a string would terminate with the first 0x00 no matter how many > bytes are stored (C style strings). Another would say that the string > terminates with the last byte stored (Pascal style strings).
One way or other, "=", "LIKE" and "GLOB" results should be consistent. If string is NUL-terminated, "=" should ignore everything after NUL. If string is length-terminated, "LIKE" should not ignore bytes after NUL. If devs have no time to change code and/or make decisions now, those cases should be at least documented as "function/operation A,B,C (currently) have undefined behavior on BLOB". Not very nice, but randomly stumbling over such surprises is certainly worse. ... and few related strangeness: UPPER/LOWER/*TRIM/REPLACE also accept BLOB argument, but return *TEXT* instead. With somewhat unexpected result: sqlite> SELECT length( X'41424300414243313233'); 10 sqlite> SELECT length( lower(X'41424300414243313233')); 3 sqlite> SELECT length(CAST(lower(X'41424300414243313233') AS BLOB)); 10 sqlite> .mode insert sqlite> select rtrim(X'4142430061626333',X'334363'); INSERT INTO table VALUES('ABC'); sqlite> select CAST(rtrim(X'4142430061626333',X'334363') AS BLOB) INSERT INTO table VALUES(X'414243006162'); (BTW, one can consider that *TRIM, LIKE and GLOB semantic on BLOB should be different - with BLOB they should operate on *bytes*, not *utf-8 chars*; one more reason to state "undefined behavior" for now). Same with || operator (OP_Concat): it takes BLOB, but make result TEXT (leading to [arguable illegal] TEXT with embedded NUL {or with broken UTF-X encoding}: X'1234' || X'004567' [also look at <http://permalink.gmane.org/gmane.comp.db.sqlite.general/68473>, where this problem triggered invalid .dump]) SUBSTR also documented to operate *on string*, but, unlike above functions, when supplied with BLOB argument it correctly returns BLOB value. Not sure if it worth changing code, but certainly should be somehow mentioned in documentation (including clearly stated "undocumented behavior on BLOB" as option). > I don't think any official definition is stated anywhere in the documentation > intended for users. (I haven't looked at comments in the source code.) And > I suspect that if DRH wants to decide one way or another, this may point up > some inconsistencies in the codebase which should be quickly fixed. _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users