Re: [sqlite] NUL handling bugs (was Re: c-api document suggestion)

Roger Binns Sat, 24 Sep 2011 09:38:29 -0700

On 09/23/2011 05:51 PM, David Garfield wrote:
>> SQLite's API supports both (mostly).  Internally, you must use one or
>> the other (or hideously duplicate code),


Not really.  If your own code only uses NUL termination then use that form
of the APIs.  If you use counted strings then use that form.  As a developer
using SQLite you do not have to use both.  You can also mix and match, using
whichever is most convenient at each call, although supporting embedded NUL
requires the counted form, since a NUL terminated string by definition ends
at its first NUL.
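
A minimal sketch of the difference, using Python's standard sqlite3 module
rather than the C API discussed in this thread.  It assumes the module binds
text with an explicit byte count (current CPython releases do), so the value
is handed to SQLite in counted form.  The variable names are illustrative
only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
text = "a\0b"  # three characters, the middle one is NUL

# The parameter is bound with an explicit length, so the NUL survives
# the trip into SQLite and back out again.
round_tripped = conn.execute("SELECT ?", (text,)).fetchone()[0]

# The built-in length() counts characters only up to the first NUL...
sql_length = conn.execute("SELECT length(?)", (text,)).fetchone()[0]

# ...while the value itself still holds all three bytes.
byte_length = conn.execute(
    "SELECT length(CAST(? AS BLOB))", (text,)
).fetchone()[0]

print(repr(round_tripped), sql_length, byte_length)
```

The value is stored intact; only code that treats it as NUL terminated
(here the built-in length()) sees a shorter string.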

>> and SQLite uses the second --
>> except for some functions (which use the hybrid model).  That
>> exception is the bug.

The "bug" is that a performance optimisation is mentioned in the doc.  The
internal SQL parsing code always stops at a NUL, and requires that a string
be NUL terminated.  If you do not explicitly provide one then it will copy
the string in order to NUL terminate it.  Sure, this is a little messy and
could be explained a little better, but it isn't a bug.  The internal code
could also change in the future to avoid the NUL requirement but I'd expect
that to be *really* low on the list of priorities.

>> Correction: with the exception of a number of BUILT IN functions.

I meant user defined functions in the sense of components of a SQL statement
(like verbs, operators and collations are components).  Yet another pesky
ambiguity introduced by the word "user"!

Note that you can override all built in user defined functions - just
register one with the same name.  You do however have to ensure that you
register variants for the different Unicode encodings.

I.e. you can make your installation of SQLite behave exactly how you want.
Should the built in implementations be fixed?  IMHO yes, but it isn't a
priority.  In the 6 years that SQLite 3 has been available you are only the
second person to complain.  (I was the first :-)
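
As a rough illustration of overriding a built in function, here is a sketch
using Python's standard sqlite3 module.  The replacement counted_length and
the calls list are hypothetical names made up for this example.  The module
registers a single UTF-8 variant only, and whether an embedded NUL reaches
the Python callback intact may depend on the Python version, so the override
is exercised with a plain string here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Built-in behaviour: counting stops at the first NUL.
builtin_result = conn.execute("SELECT length(?)", ("a\0b",)).fetchone()[0]

calls = []  # records every argument the replacement receives


def counted_length(value):
    """Hypothetical replacement: count everything Python was handed."""
    calls.append(value)
    if value is None:
        return None
    if isinstance(value, (str, bytes)):
        return len(value)
    return len(str(value))


# Same name and argument count as the built-in, so SQLite now calls this
# function wherever a statement uses length(X).
conn.create_function("length", 1, counted_length)

override_result = conn.execute("SELECT length('abc')").fetchone()[0]
print(builtin_result, override_result, calls)
```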

>> sqlite3_value_*() and sqlite3_result_*() are fully capable of using
>> the counted model,

Indeed.  It is how I ensure my code is NUL safe/correct.  A far bigger bug
for those functions is that they use int for the size of data rather than
size_t.  I did a survey a few years back using Google Code Search and every
instance I could find where -1 was not passed in as the length treated the
parameter as though it were size_t, which results in (silent) truncation on
64 bit machines.  My own code explicitly makes sure the values about to be
passed in are less than 2GB.
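
The truncation and the guard can be sketched in Python with ctypes standing
in for a C int parameter.  Both helper names (as_c_int, checked_length) are
hypothetical, and the sketch assumes a 32 bit signed int, which is what the
common 64 bit platforms use.

```python
import ctypes

INT_MAX = 2**31 - 1  # largest value a 32-bit signed C int can hold


def as_c_int(n):
    """Return what a 32-bit signed int keeps of n (ctypes does not check
    for overflow, so the high bits are silently dropped)."""
    return ctypes.c_int32(n).value


def checked_length(n):
    """Hypothetical guard: refuse lengths a C int cannot represent."""
    if not 0 <= n <= INT_MAX:
        raise OverflowError(f"length {n} does not fit in a C int")
    return n


# A 4GB + 5 byte length squeezed into an int looks like a 5 byte length.
print(as_c_int(2**32 + 5))
# Exactly 2GB wraps around to a negative number.
print(as_c_int(2**31))
```

Calling checked_length before handing a size to an int parameter turns the
silent truncation into an explicit error.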

>> Of course, the SQLite shell does it anyway.  So "cannot" is not really
>> correct.

Well you can always spew arbitrary bytes to stdout which generally works for
people who only ever use ASCII.  But the rule really is that bytes cannot be
converted to characters without knowing the encoding.
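
A small Python sketch of why the encoding matters: the same bytes decode to
different characters depending on which encoding is assumed.  The sample
string and variable names are illustrative only.

```python
# Five bytes: 'c', 'a', 'f', then the two byte UTF-8 sequence for e-acute.
raw = "caf\u00e9".encode("utf-8")

as_utf8 = raw.decode("utf-8")      # four characters
as_latin1 = raw.decode("latin-1")  # five characters, the last two are wrong

print(len(raw), len(as_utf8), len(as_latin1))
```

Pure ASCII bytes decode identically under both encodings, which is why
spewing raw bytes appears to work for ASCII-only users.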

> The SQLite shell isn't particularly well structured for easy developer
> extension.
>> I've seen that...  ouch.

It is best to think of the shell as a convenience tool for the SQLite
developers to throw commands at the library as they add them, not as some
formal tool for SQL access.  It works reasonably well.

>> And your python wrapper is probably implemented using the counted
>> string form exclusively.  :-)

Originally it used whichever forms were most convenient at that place in the
code.  This is further complicated by Python being compilable with two
different Unicode character sizes and SQLite having UTF8 and UTF16 APIs.
Then one day I discovered that SQLite allows embedded NUL in string values
and made sure my code always works correctly - I'm OCD like that with my
wrapper.

Roger