Re: [sqlite] builtin functions and strings with embedded nul characters

James K. Lowden Thu, 07 Jul 2016 18:29:59 -0700

On Mon, 4 Jul 2016 13:07:18 +0200
R Smith <rsm...@rsweb.co.za> wrote:


> I think you are missing an important bit in all of this - the strings
> in C is the problem, they think a Null character indicates
> termination. It has nothing to do with how SQL stores data - SQLite
> will store it with all bytes intact, but you typically retrieve or
> set it via some C calls using a C api.. and this is where the problem
> is. 

Dijkstra: On anthropomorphism in science
https://www.cs.utexas.edu/users/EWD/transcriptions/EWD09xx/EWD936.html

C doesn't have strings, and they don't think.  

C has some standard functions that by convention treat byte arrays as
strings.  The convention is to signify EOS with a NUL byte.  Using
those functions on arrays with embedded (non-terminating) NULs will
probably lead to undesired results.  
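
To make the convention concrete, here is a minimal sketch in C.  The
struct byte_span, the function c_string_length and the sample value are
names I made up for illustration; only memchr(3) and strlen(3) are
standard.  It shows the length the C string convention reports for a
byte array whose real length is known separately.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A byte array with an explicit length.  Hypothetical type, for
 * illustration only. */
struct byte_span {
    const char *data;
    size_t      len;
};

/* Length as the C string convention sees it: the number of bytes
 * before the first NUL, or the full length if there is no NUL. */
static size_t c_string_length(struct byte_span s)
{
    assert(s.data != NULL);
    /* memchr, unlike strlen, never reads past s.len bytes. */
    const char *nul = memchr(s.data, '\0', s.len);
    return nul ? (size_t)(nul - s.data) : s.len;
}

/* Sample value: 7 bytes of payload, "abc", one NUL, "def".  The
 * literal's own terminating NUL is excluded from the length. */
static const char sample_bytes[] = "abc\0def";
static const struct byte_span sample = {
    sample_bytes, sizeof sample_bytes - 1
};
```

For this sample, sample.len is 7, while c_string_length(sample) and
strlen(sample.data) both report 3: everything after the embedded NUL is
invisible to the convention, although the bytes are still there.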

IIUC, there are some string functions in SQLite (including in the SQL
itself) that behave unpredictably if presented with strings that
include embedded NULs.  That needs no defense: it is a defect.  The
DBMS keeps (as it should) explicit lengths for all data.  Treating NUL
specially is only problematic.  Saying their behavior is undefined in
that case at least gives the user fair warning; better would be to make
it defined.  

A few people in this thread mentioned something along the lines of
"SQLite data are encoded as UTF-8".  That's not true.  It does not
check that text is correctly or uniformly encoded, so the encoding
declared for a database is not enforced on the bytes stored in it.  It
would be more accurate to say that SQLite supports several Unicode
encodings (UTF-8, and UTF-16 in either byte order).  

The default collation is BINARY, which is to say text is compared as
uninterpreted bytes.  If the comparison function is memcmp(3), NUL
needs no special treatment.  
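
As a rough illustration of that last point, here is a sketch of a
byte-wise comparison that uses explicit lengths.  The name
compare_bytes is hypothetical, and this is not copied from the SQLite
source; it only shows that memcmp(3) with lengths orders values with
embedded NULs sensibly, where strcmp(3) cannot see past the first NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compare two byte arrays of known length.  Returns <0, 0 or >0.
 * The common prefix is compared with memcmp; if it is identical,
 * the shorter array sorts first. */
static int compare_bytes(const void *a, size_t a_len,
                         const void *b, size_t b_len)
{
    assert(a != NULL && b != NULL);
    size_t n = a_len < b_len ? a_len : b_len;
    int rc = memcmp(a, b, n);
    if (rc != 0)
        return rc;
    /* Equal prefix: order by length without risking overflow. */
    return (a_len > b_len) - (a_len < b_len);
}
```

Given the 7-byte values "abc\0def" and "abc\0xyz", compare_bytes
returns a negative number because the bytes after the NUL take part in
the comparison, whereas strcmp on the same two arrays returns 0.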

--jkl
