On Fri, Jan 26, 2018 at 4:36 PM, Keith Medcalf <kmedc...@dessus.com> wrote:
> > I do not understand this at all. > > If the definition of a C-String is a > "bunch-a-non-zero-byes-terminated-by-a-zero-byte", > then how is it possible to have a zero/null byte "embedded" within a > C-Style String? > this is about SQL not C. C is easy \0. Lots of \\, support. \x00 char thing[] = "Hello\0world"; #define strlen_thing(a) ( sizeof(a)/(sizeof(a[0])-1 ) ) > > Similarly, if a C-Style-Wide-String is defined as a > "bunch-a-non-zero-words-terminated-by-a-zero-word", then how is it > possible to have a zero/null word "embedded" within a C-Style-Wide-String? > > Given that SQLite3 is written in C and uses C-Strings or > C-Style-Wide-Strings, then you cannot have zero/null bytes embedded in > those strings. > > It certainly can. It's isolated from the language by working on tokens and expression pieces for which it has(or can have) the length. > You may of course argue that perhaps SQLite3 should use something other > than C-Style-Strings, however, this is not what seems to be proposed. It > seems to be proposing the use of some magical C-Style-String that is not > actually a C-Style-String, without explicitly stating this. > > They don't use c style strings though, the strings passed are SQL, which only have a single escape for the type of quote it starts with. If it used 'c style strings' it would behave like MySQL/TSQL but it's not; it's PSQL. "Hello""World" 'Hello''world' > SQLite3 does handle non-C-Ctyle-Strings. They are called "blobs". > > It does, and has for a decade or more as char. > --- > The fact that there's a Highway to Hell but only a Stairway to Heaven says > a lot about anticipated traffic volume. > > > >-----Original Message----- > >From: sqlite-users [mailto:sqlite-users- > >boun...@mailinglists.sqlite.org] On Behalf Of J Decker > >Sent: Friday, 26 January, 2018 17:18 > >To: SQLite mailing list > >Subject: Re: [sqlite] UTF8 and NUL > > > >On Fri, Jan 26, 2018 at 3:56 PM, Peter Da Silva < > >peter.dasi...@flightaware.com> wrote: > > > >> On 2018-01-26, at 17:05, J Decker <d3c...@gmail.com> wrote: > >> > On Fri, Jan 26, 2018 at 1:21 PM, Peter Da Silva < > >> > peter.dasi...@flightaware.com> wrote: > >> >> Sqlite uses NUL as the string terminator internally, the > >published API > >> >> specifies has stuff like this all over the place: > >> > >> >>> In those routines that have a fourth argument, its value is the > >number > >> of bytes in the parameter. To be clear: the value is the number of > >bytes in > >> the value, not the number of characters. If the fourth parameter to > >> sqlite3_bind_text() or sqlite3_bind_text16() is negative, then the > >length > >> of the string is the number of bytes UP TO THE FIRST ZERO > >TERMINATOR. > >> > >> > You stressed the wrong part there - *IS NEGATIVE* > >> > >> Why? Passing -1 as the length is a common way to tell sqlite3 to > >calculate > >> the length itself. It's a documented and widely used part of the > >API. > > > > > >Exactly, so on neither side, input or output is there a problem > >storing a > >length of valid characters. > >The deficiency is 1) the command line tool for diagnostics > >2) always scanning for a nul in prepare() unless the length is before > >that. It's simple to add an option that could change that behavior; > >or > >move the string measuring up to prepare[_v2,_v3,_v4] and even add a > >V5 that > >just passes the length passed without a scan. > > > >The input is read by a tokenizer that returns in-buffer references to > >the > >next SQL token by length. > >Some tokens can be quoted, and those end up being a copy of the > >original; > >but the length of the SQL statement should already be known, so it > >doesn't > >need to scan for 0. > > > >Once tokenized it's converted into expressions; those expressions > >(have > >previously) stored only the char*. It's not a lot of places to > >change to > >include storing the length; which is often known unless the mprintf > >internals are used; then any token passed through that does not pass > >%s. > >So %s cannot be used for UTF8 strings; but rather the literal string > >fwrite( buf, 1, stringlen, <output) ; gets all the standard character > >treatment as the file was opened with (O_BINARY or not, "b" or "t" > >specifiers for fopen, or stderr ). > > > >fprintf( out, "%s", (string) ); > >is exactly the same as > >fwrite( out, 1, strlen( string ), string ); > > > >(Can anyone dispute that? I doubt that's specified) > > > >Other than, the fwrite will include outputing the NUL character and > >trust > >the length given to it. \n will still get promoted to \r\n depending > >on > >platform and C library personality. > > > > > > > > > > > >> Therefore: > >> > > > > > >> > >> >> It would be a huge push-up to change this, it would break > >everything, > >> >> including extensions. I don't think it would be possible until > >something > >> >> like sqlite4. > >> > > > >maybe I don't understand what you're saying 'it' is. > > > > > >> _______________________________________________ > >> sqlite-users mailing list > >> sqlite-users@mailinglists.sqlite.org > >> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite- > >users > >> > >_______________________________________________ > >sqlite-users mailing list > >sqlite-users@mailinglists.sqlite.org > >http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > > > > _______________________________________________ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users