> The "datatype" that comes closest to being a suitable native byte array
> these days is Stdio.Buffer.
Is this true? If the binding were to interpret any Buffer object as
binary data, would that cause surprise? My reading suggests that it is a
grey area and I'd rather solve the problem unambiguously if it's
possible.
> I presume you mean SQLite with typed queries here?
I'm not sure what a typed query is; this has to do with using bound
parameters and the way the Pike glue to SQLite binds a given parameter.
> doing this and hasn't run into this problem?), as any 8-bit strings
> that weren't binary data would be stored as blobs, even if they were
> in a text typed field. These records would need to be re-stored with
> the proper text type, which could be done with a query to update the
> table. My sense is that this is the proper thing to do, as blob fields
> should be reserved for data that's actually binary data (as opposed
> to text).
> - The contortions you describe to get the queries right with the current
> behaviour would indicate that anyone who would have tried to do the same
> would likely have ended up complaining here while trying to get it right.
I have a hard time imagining that anyone has tried it. Not only does it
cause all kinds of complications within Pike, it also means that queries
executed directly with the sqlite command-line tool would fail in similar
ways, as there could be mixed types of data within a given column, and
writing a string as a blob literal means converting it to hex first,
which isn't something many people can do on the fly.
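
To illustrate the hex conversion, here is a minimal sketch using Python's
standard sqlite3 module as a stand-in for the Pike glue (an assumption made
only so the example is runnable). The helper name blob_literal is
hypothetical. It shows that a value stored as a blob in a text column is not
matched by an ordinary text literal, only by the hex blob literal.

```python
import sqlite3


def blob_literal(data: bytes) -> str:
    """Render bytes as an SQLite blob literal such as X'14EC24' (hypothetical helper)."""
    return "X'" + data.hex().upper() + "'"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT)")

# Bound as bytes, so SQLite stores a blob even though the column is declared TEXT.
conn.execute("INSERT INTO t VALUES (?)", (b"hello",))

# An ordinary text literal does not match the stored blob.
text_hits = conn.execute(
    "SELECT count(*) FROM t WHERE name = 'hello'"
).fetchone()[0]

# The same bytes written as a hex blob literal do match.
literal = blob_literal(b"hello")
blob_hits = conn.execute(
    f"SELECT count(*) FROM t WHERE name = {literal}"
).fetchone()[0]

print(literal, text_hits, blob_hits)
```

Anyone typing queries by hand would have to produce that hex string
themselves for every value stored this way.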
> - In general I've noticed that very few, if any, people are actually using
> *typed* queries from Sql.Sql.
> So that would suggest that your change would be beneficial, and would
> (for safety) require a bit of compat code to get it right for anyone
> unfortunate enough to rely on older behaviour.
I would argue the opposite: the existing behavior is so broken that
trying to come up with compat for it would just make existing code more
brittle, because the current code effectively corrupts the data in a
column.
Example:
A column of type text may have individual elements that are text or blob
(or even number) values depending on whether the data was inserted using
a text literal 'some text', a blob literal X'14EC24', or a bound
parameter from Pike that happened to be a wide string (stored as text)
or a narrow string (stored as a blob). So any query run against such a
column would likely return incorrect results unless the stored values and
the query values were always cast to one type or the other.
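
The mixed-type column described above can be reproduced with a short sketch.
It again assumes Python's sqlite3 module in place of the Pike binding, with
str standing in for a string bound as text and bytes for a string bound as a
blob; the table and column names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (val TEXT)")

conn.execute("INSERT INTO t VALUES ('some text')")          # text literal
conn.execute("INSERT INTO t VALUES (X'14EC24')")            # blob literal
conn.execute("INSERT INTO t VALUES (?)", ("some text",))    # bound as text
conn.execute("INSERT INTO t VALUES (?)", (b"some text",))   # bound as blob

# The declared column type does not stop blobs from being stored as blobs.
types = [row[0] for row in
         conn.execute("SELECT typeof(val) FROM t ORDER BY rowid")]

# A plain comparison only sees the rows actually stored as text.
matches = conn.execute(
    "SELECT count(*) FROM t WHERE val = 'some text'"
).fetchone()[0]

# Casting the stored values to text also picks up the blob-stored copy.
cast_matches = conn.execute(
    "SELECT count(*) FROM t WHERE CAST(val AS TEXT) = 'some text'"
).fetchone()[0]

print(types, matches, cast_matches)
```

The uncast query silently misses the row whose bytes are identical but were
bound as a blob, which is the kind of incorrect result meant here.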
I can't think of a scenario where this is desirable. Even if you ignore
applications written for ASCII code only, I imagine there are lots of
narrow values in languages that also have wide characters.
My proposal would be to fix this so that all strings are stored as text
and all values of some other object type (Stdio.Buffer or, preferably,
some Sql datatype wrapper for bytestrings) are stored as blobs. As a
workaround, anyone crazy enough to rely on the existing broken
functionality can just wrap their narrow strings in the object mentioned
above and have that value bound as a blob... the perfect use of a
release note.
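
As a rough sketch of the proposed rule (plain strings always bound as text,
wrapped values bound as blobs), the following uses Python's sqlite3 adapter
mechanism purely as an analogy. The ByteString class is a hypothetical
stand-in for whatever wrapper the Pike side would end up using; it is not an
existing Pike or Sql.Sql type.

```python
import sqlite3


class ByteString:
    """Hypothetical wrapper that marks a value as binary data."""

    def __init__(self, data: bytes):
        self.data = data


# Tell the binding that wrapped values are to be bound as blobs.
sqlite3.register_adapter(ByteString, lambda wrapped: wrapped.data)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (val)")

# A plain string is always bound as text, narrow or not.
conn.execute("INSERT INTO t VALUES (?)", ("abc",))
# The same characters, explicitly wrapped, are bound as a blob.
conn.execute("INSERT INTO t VALUES (?)", (ByteString(b"abc"),))

types = [row[0] for row in
         conn.execute("SELECT typeof(val) FROM t ORDER BY rowid")]
print(types)
```

With a rule like this, the storage type depends on what the caller asked
for rather than on the width of the string's characters.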
I'd also argue that this ought to be fixed in 8.0 as well... the
behavior is so bad that I honestly can't imagine anyone has used it
successfully.
Bill