On Wed, 9 Dec 2020 at 19:21, Sam Whited <s...@samwhited.com> wrote:

> I believe this is a mischaracterization of my argument. My argument is
> "everything will have a way to get at the underlying bytes, not
> everything will have them pre-converted into code points".

I think this, in particular, is not correct. The counter-argument - that everything can obtain a sequence of codepoints, but might not be able to get at a sequence of octets - is more accurate. In particular, I think anything based on Python would only receive text nodes as `str` objects, which are codepoint-based, and the {de|en}coding to UTF-8 is part and parcel of the XML [de]serialization.

If we're counting codepoints and we only have the UTF-8, though, this should be fairly easy without formal decoding, assuming we do not require normalization.

> Also "this
> gives us the option to do certain optimizations on systems that support
> them, but using code points doesn't so we should do the thing that is
> the most flexible".

Oh, I agree with this, as a broad principle. But I don't think it's viable in this case.

> —Sam
>
> On Wed, Dec 9, 2020, at 19:09, Tedd Sterr wrote:
> > Regardless, your argument is still "bytes is more convenient for me,
> > so everyone else should do what's best for me." I don't think that's a
> > good argument.
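
To illustrate the point about counting codepoints when only the UTF-8 is available: a rough sketch, assuming the input is already well-formed UTF-8 and that no normalization is required. The function name `count_codepoints_utf8` is made up for this example and is not taken from any XEP or library. It relies on the fact that UTF-8 continuation octets always have the form 10xxxxxx, so every other octet starts a codepoint.

```python
def count_codepoints_utf8(data: bytes) -> int:
    """Count codepoints in well-formed UTF-8 without decoding it.

    Assumption: `data` is valid UTF-8. Malformed input is not detected.
    """
    # Continuation octets match 0b10xxxxxx; any other octet begins a codepoint.
    return sum(1 for octet in data if (octet & 0xC0) != 0x80)


text = "na\u00efve caf\u00e9 \u2615"
octets = text.encode("utf-8")

# The octet-level count agrees with Python's codepoint-based len(str).
assert count_codepoints_utf8(octets) == len(text)

# Without normalization, precomposed and decomposed forms count differently.
assert count_codepoints_utf8("\u00e9".encode("utf-8")) == 1
assert count_codepoints_utf8("e\u0301".encode("utf-8")) == 2
```

The last two assertions are why the normalization caveat matters: the same visible character yields a different count depending on which form the sender used.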
_______________________________________________ Standards mailing list Info: https://mail.jabber.org/mailman/listinfo/standards Unsubscribe: standards-unsubscr...@xmpp.org _______________________________________________