On Thu, Apr 19, 2018 at 4:07 PM, Keith Medcalf <kmedc...@dessus.com> wrote:
>
> And what makes you think a "javascript string" is a "C string"?  While the
> "string" part may be the same, "javascript" certainly does not equal "C".
> Just like you do not have issues with embedded zero-bytes in "pascal
> strings".  Note that "pascal" != "C" even though "string" == "string".
>

by the same reasoning that you apply saying SQL strings are C strings.

> Note that the sqlite3_value_text returns the data (including embedded zero
> bytes), but not the length.  sqlite3_value_bytes() returns the number of bytes. <invalidated>
> If you pass the data returned thereby to a function expecting a C string
> (zero terminated), it will terminate at the first zero byte encountered.
> If you retrieve the length and the data separately and construct
> pascal-style strings and pass them to functions expecting "pascal" style
> strings, then the embedded zero is just "string data" (NB: pascal is used
> only as an example -- many X strings contain an embedded length for any
> given value of X -- C strings do not).
>
> </invalidated> Obviously "javascript" strings contain a length indicator and are not
> zero-terminated.
>
> ---
> The fact that there's a Highway to Hell but only a Stairway to Heaven says
> a lot about anticipated traffic volume.
>
>
> >-----Original Message-----
> >From: sqlite-users [mailto:sqlite-users-
> >boun...@mailinglists.sqlite.org] On Behalf Of J Decker
> >Sent: Thursday, 19 April, 2018 16:41
> >To: SQLite mailing list
> >Subject: Re: [sqlite] SQLite3 - Search on text field with \0 binary data
> >
> >On Thu, Apr 19, 2018 at 3:37 PM, J Decker <d3c...@gmail.com> wrote:
> >
> >>
> >>
> >> On Thu, Apr 19, 2018 at 3:22 PM, Keith Medcalf <kmedc...@dessus.com>
> >> wrote:
> >>
> >>>
> >>> Actually, nothing in the C or C++ world will "go past" the NULL byte
> >>> since the very definition of a C string is a "bunch-o-bytes that are
> >>> non-zero followed by one that is".
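A minimal sketch of the "length plus data" idea described above, in plain C. The counted_str type and the helpers counted_from, c_view_len and counted_equal are hypothetical names made up for illustration. The byte buffer stands in for what sqlite3_value_text() would hand back and the length for what sqlite3_value_bytes() would report; no SQLite call is actually made here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string: a pointer plus an explicit byte length,
   in the spirit of the "pascal-style" strings mentioned in the thread. */
typedef struct {
    const char *data;  /* may contain embedded zero bytes */
    size_t      len;   /* number of bytes, independent of any terminator */
} counted_str;

/* Build a counted string from data and a separately obtained length.
   Assumption: with SQLite these two values would come from
   sqlite3_value_text() and sqlite3_value_bytes(). */
static counted_str counted_from(const char *data, size_t len)
{
    counted_str s = { data, len };
    return s;
}

/* Length as a C-string function would see it: stops at the first zero byte.
   memchr bounds the scan to s.len, so nothing past the data is read. */
static size_t c_view_len(counted_str s)
{
    const char *nul = memchr(s.data, '\0', s.len);
    return nul ? (size_t)(nul - s.data) : s.len;
}

/* Compare two counted strings byte for byte, embedded zeros included. */
static int counted_equal(counted_str a, counted_str b)
{
    return a.len == b.len && memcmp(a.data, b.data, a.len) == 0;
}
```

For the 7-byte value "abc\0def", c_view_len() reports 3 (what strlen would see) while the len field keeps all 7 bytes, so a comparison against "abc\0xyz" still notices the difference after the zero byte.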
> >>>
> >
> >And sorry for the double response; but if C/C++ couldn't handle a NUL
> >character (the character is 1 L) then spidermonkey/chakra/V8 would have
> >problems with NUL characters in javascript strings.  But they don't.  Why
> >is that?
> >
> >
> >> that doesn't mean you can't use a custom token structure that contains
> >> both the pointer and length of the data.  (which it already has)
> >> sure, using standard C api - strlen, etc sure... but sqlite uses a custom
> >> function internally, sqlite3Strlen30, which can easily be extended to take
> >> the length of the string; but wait, if the length was saved, it wouldn't
> >> need to be called, and an overall performance gain is created.
> >>
> >> the biggest problem is really the internal function '(something)printf'
> >> which returns a char *, and has no space to return the length, like
> >> snprintf would.
> >>
> >> and I can easily put nuls into a string....
> >>
> >> char buf[256];
> >> int len = snprintf( buf, 256, "put a nul %c here and here %c", 0, 0 );
> >> and the length returned would be 27.
> >>
> >>
> >>> If you want to embed non UTF8 text you should be using a BLOB not TEXT.
> >>> Text means "an array of non-zero characters terminated by a zero byte"
> >>> and a BLOB means a "bag-o-bytes" of a specific size.
> >>>
> >>>
> >> Blob means binary; having to deal with a binary structure to convert to
> >> a string and back is ridiculous when the interface already supports
> >> storing and getting strings with \0 in them.
> >>
> >>
> >>> Things meant to work on C "strings" should always stop at the zero
> >>> terminator.  Failure to do so can lead to AHBL.
> >>>
> >>>
> >> So don't use the standard library.  That was one of the first things I
> >> created for my MUD client; a smart text string class.
> >> (I say class in the generic term, not the literal, since it was
> >> written in C)
> >>
> >>
> >>> (Note, this applies to "wide" (as in word) and "fat" (as in double word)
> >>> and obese (as in quad word) strings as well.  They are a sequence of
> >>> words/double-words/quad-words/ten-words (whatever) that are non-zero
> >>> followed by one that is zero -- and the narrow/wide/fat/obese string
> >>> ends at the zero value).
> >>>
> >>>
> >> utf8everywhere.org
> >> No reason to use wide char.
> >>
> >>
> >> get good, son.  (sorry if that's overly offensive)
> >>
> >> ---
> >>> The fact that there's a Highway to Hell but only a Stairway to Heaven
> >>> says a lot about anticipated traffic volume.
> >>>
> >>
> >_______________________________________________
> >sqlite-users mailing list
> >sqlite-users@mailinglists.sqlite.org
> >http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
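The snprintf snippet quoted earlier in the thread can be checked with a small sketch. format_with_nuls and count_zero_bytes are hypothetical helper names chosen for illustration, not anything from SQLite. The point is only that snprintf reports the full formatted length, while a zero-terminated view of the same buffer stops early.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format the example string from the thread into buf and return the
   count reported by snprintf (terminator not included). */
static int format_with_nuls(char *buf, size_t cap)
{
    /* %c with the value 0 writes a zero byte into the output */
    return snprintf(buf, cap, "put a nul %c here and here %c", 0, 0);
}

/* Count the zero bytes within the first len bytes of buf. */
static size_t count_zero_bytes(const char *buf, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\0')
            n++;
    }
    return n;
}
```

With a 256-byte buffer, format_with_nuls() returns 27, strlen() on the same buffer returns 10 (it stops at the first embedded zero byte), and count_zero_bytes() over the first 27 bytes finds 2. Keeping the returned length around is what lets the bytes after the first zero remain reachable.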