On 08/22/2013 03:36 PM, Justin Ross wrote:
On Thu, Aug 22, 2013 at 8:41 AM, Gordon Sim <[email protected]> wrote:
On 08/21/2013 10:43 PM, Justin Ross wrote:
"If I put a binary value in a map and encoded it some of the time it
might be valid utf8, other times not."  This shouldn't be allowed to
happen, IMO.  You meant it to be a binary value--we have to find a way
to capture and preserve that information.

I believe the point was that for an application sending binary data via
the ambiguous string type (between two processes in languages that have
such a type), encoding it on the wire as str16 (i.e. utf8) could lead
to subtle bugs.

Testing could appear to work until the actual binary payload changed in
some way such that it was no longer valid utf8.
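
To make that concrete, here is a minimal standalone sketch of the
failure mode (plain Python, no Qpid/Proton API assumed); the same code
path succeeds or fails purely depending on the bytes that happen to be
in the payload:

    payload_a = b"\x01\x02\x7f"  # happens to be valid utf8 (all ASCII)
    payload_b = b"\xff\xfe\x01"  # 0xff can never appear in valid utf8

    for payload in (payload_a, payload_b):
        try:
            payload.decode("utf-8")  # the check a str16 encoder must make
            print(repr(payload), "-> goes out as str16 without complaint")
        except UnicodeDecodeError:
            print(repr(payload), "-> fails as str16")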

Right.  I'm saying that sucks, so don't do that.  For instance, we
could ask our users to use a 'Data' class to input arbitrary bytes,
and otherwise treat ambiguous strings as textual.
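
As a rough sketch of what that could look like (the 'Data' name just
follows the suggestion above, and the dispatch is written in Python 3
style for brevity; none of this is actual Qpid/Proton API):

    class Data(object):
        """Marks a value as arbitrary bytes, to be encoded as vbin."""
        def __init__(self, value):
            self.value = value

    def encode_value(value):
        # Hypothetical dispatch: an explicit Data wrapper goes out as
        # vbin; anything string-like is assumed textual and goes out as
        # str16 (utf8).
        if isinstance(value, Data):
            return ("vbin", value.value)
        if isinstance(value, str):
            return ("str16", value.encode("utf-8"))
        raise TypeError("unsupported type: %r" % type(value))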

The point is that it is easy for people to miss that, just as it is easy
for them to miss the fact that they should choose the explicit utf8 type
for textual data.

An explicit type is always preferable. The question is how to handle an
ambiguous type. If it is encoded as a str16 it may work in some cases
and fail in others, i.e. a subtle bug that testing may not catch,
depending on the payloads actually tested. By contrast, if it is encoded
as a vbin the behaviour - even though admittedly unexpected for many -
will at least be the same each time you try it, independent of the
actual contents of the string.
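
To illustrate the contrast (a sketch only; the framing bytes below are
placeholders, not the real AMQP type codes):

    def encode_as_vbin(payload):
        # Copies the bytes verbatim: deterministic for any payload.
        return b"\x01" + len(payload).to_bytes(4, "big") + payload

    def encode_as_str16(payload):
        # Content-dependent: the validation step fails only for payloads
        # that do not happen to be valid utf8.
        payload.decode("utf-8")
        return b"\x02" + len(payload).to_bytes(4, "big") + payload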

