Re: [Development] [Qt5-feedback] A micro API review: for V3(md5) and V5(sha1) in QUuid
Hi João

On Fri, Dec 23, 2011 at 6:31 PM, wrote:
> [ Re-trying after the previous massive quoting and line-wrap fail :-/ ]
>
> Denis Dzyubenko wrote:
>> 2011/12/9 João Abecasis :
>> >> inline QUuid QUuid::createFromName(const QUuid &ns, const
>> >> QString &name)
>> >> {
>> >>     return createFromName(ns, name.toUtf8());
>> >> }
>> >
>> > would only be updated to call the right implementations, as
>> > appropriate.
>>
>> I like the current status of the patch very much.
>>
>> However I have one question - where does utf8 come from? Shouldn't it
>> be defined by the RFC? If not, imo we shouldn't arbitrarily choose an
>> encoding, and should maybe leave the default one in - which is utf-16
>> for QString
>
> This is my reasoning:
>
> 1) As you mention the RFC doesn't specify encodings. In fact, it says
> the owner of a namespace is free to decide how it should be used. For
> this reason it's important that we support QByteArray as the canonical
> form and let users make conscious decisions.

absolutely agree with that. I would even add an overload that takes
(char *, int len) to avoid mallocing a d-pointer for QByteArray.

> 2) In Qt, strings of text are represented as QString so it would be nice
> to support QString-based names. This is the reason for adding those
> overloads as convenience API, but doesn't tell us how QString-based
> names should be translated to "a canonical sequence of octets" (quoting
> the standard).
>
> 3) The point of name-based UUIDs is that you can regenerate the UUIDs
> knowing only the namespace UUID and a particular name. If you use the
> QByteArray version, it's up to you to ensure this. When using the QString
> version Qt needs to ensure it for you.
>
> This excludes locale- and system-dependent conversions, like
> toLocal8Bit(). It also excludes straightforward utf16() as it is
> dependent on endianness, and thus platform.
>
> 4) UTF-8 is a good candidate because it is one possible "canonical
> sequence of octets". But it's mostly that, a good candidate.

that is a very good reason indeed! I didn't think about endianness of
utf-16. Another alternative would be to always use utf-16 little endian
(since this is the most common system) in a canonical form (e.g. D-form
to make it cheap on mac).

> So, there isn't a reason why it *has* to be utf-8, but I haven't seen
> better alternatives. Other alternatives are toAscii or toLatin1, but
> they're lossy encodings. Network-byte order UTF-16?...

Denis.
___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development
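A small sketch of the endianness point, written in Python rather than Qt so that it runs with the standard library alone. The helper name `name_uuid_v5` is made up for illustration and is not part of any proposed API. It assumes the RFC 4122 scheme for version 5: SHA-1 over the namespace bytes followed by the name bytes. The same text gives a different UUID for each byte encoding, and the normalization form (C vs D) changes the bytes as well, so whichever form is picked has to be fixed once and for all.

```python
import hashlib
import unicodedata
import uuid


def name_uuid_v5(ns: uuid.UUID, name: bytes) -> uuid.UUID:
    """Hypothetical helper: version 5 UUID from raw name bytes."""
    digest = hashlib.sha1(ns.bytes + name).digest()
    # SHA-1 yields 20 bytes; a UUID uses the first 16, and UUID() then
    # stamps the version and variant bits.
    return uuid.UUID(bytes=digest[:16], version=5)


name = "www.widgets.com"
by_encoding = {
    enc: name_uuid_v5(uuid.NAMESPACE_DNS, name.encode(enc))
    for enc in ("utf-8", "utf-16-le", "utf-16-be")
}
for enc, value in by_encoding.items():
    print(enc, value)

# Normalization matters too: "é" is one code point in NFC, two in NFD.
nfc = unicodedata.normalize("NFC", "caf\u00e9")
nfd = unicodedata.normalize("NFD", nfc)
print(nfc.encode("utf-8") == nfd.encode("utf-8"))  # False
```

The three printed UUIDs differ from each other; only the UTF-8 one matches what Python 3's own `uuid.uuid5()` returns for the same string.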
Re: [Development] [Qt5-feedback] A micro API review: for V3(md5) and V5(sha1) in QUuid
Hi,

2011/12/9 João Abecasis :
>> inline QUuid QUuid::createFromName(const QUuid &ns, const QString &name)
>> {
>>     return createFromName(ns, name.toUtf8());
>> }
>
> would only be updated to call the right implementations, as appropriate.

I like the current status of the patch very much.

However I have one question - where does utf8 come from? Shouldn't it be
defined by the RFC? If not, imo we shouldn't arbitrarily choose an
encoding, and should maybe leave the default one in - which is utf-16 for
QString

Denis.
Re: [Development] [Qt5-feedback] A micro API review: for V3(md5) and V5(sha1) in QUuid
On 12/23/11 6:31 PM, "ext joao.abeca...@nokia.com" wrote:

>[ Re-trying after the previous massive quoting and line-wrap fail :-/ ]
>
>Denis Dzyubenko wrote:
>> 2011/12/9 João Abecasis :
>> >> inline QUuid QUuid::createFromName(const QUuid &ns, const
>> >> QString &name)
>> >> {
>> >>     return createFromName(ns, name.toUtf8());
>> >> }
>> >
>> > would only be updated to call the right implementations, as
>> > appropriate.
>>
>> I like the current status of the patch very much.
>>
>> However I have one question - where does utf8 come from? Shouldn't it
>> be defined by the RFC? If not, imo we shouldn't arbitrarily choose an
>> encoding, and should maybe leave the default one in - which is utf-16
>> for QString
>
>This is my reasoning:
>
>1) As you mention the RFC doesn't specify encodings. In fact, it says
>the owner of a namespace is free to decide how it should be used. For
>this reason it's important that we support QByteArray as the canonical
>form and let users make conscious decisions.
>
>2) In Qt, strings of text are represented as QString so it would be nice
>to support QString-based names. This is the reason for adding those
>overloads as convenience API, but doesn't tell us how QString-based
>names should be translated to "a canonical sequence of octets" (quoting
>the standard).
>
>3) The point of name-based UUIDs is that you can regenerate the UUIDs
>knowing only the namespace UUID and a particular name. If you use the
>QByteArray version, it's up to you to ensure this. When using the QString
>version Qt needs to ensure it for you.
>
>This excludes locale- and system-dependent conversions, like
>toLocal8Bit(). It also excludes straightforward utf16() as it is
>dependent on endianness, and thus platform.
>
>4) UTF-8 is a good candidate because it is one possible "canonical
>sequence of octets". But it's mostly that, a good candidate.
>
>So, there isn't a reason why it *has* to be utf-8, but I haven't seen
>better alternatives. Other alternatives are toAscii or toLatin1, but
>they're lossy encodings. Network-byte order UTF-16?...
>
>Anyway, one use case mentioned in the standard makes this convenience
>approach very nice:
>
>    QUrl url;
>
>    // ...
>
>    // NameSpace_DNS from RFC4122
>    // {6ba7b810-9dad-11d1-80b4-00c04fd430c8}
>    QUuid nsDns(0x6ba7b810, 0x9dad, 0x11d1, 0x80, 0xb4,
>                0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8);
>
>    QUuid uuidForUrl = QUuid::createFromName(nsDns, url.toString());
>
>With the added benefit that in that use case it interoperates with
>Python.
>
>("And what does python do?", you ask. Well, it avoids the decision
>altogether and bails out on unicode strings. It only accepts
>byte strings:
>
>    $ python
>    Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49)
>    [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
>    Type "help", "copyright", "credits" or "license" for more information.
>    >>> import uuid
>    >>> uuid.NAMESPACE_DNS
>    UUID('6ba7b810-9dad-11d1-80b4-00c04fd430c8')
>    >>> uuid.uuid3(uuid.NAMESPACE_DNS, "www.widgets.com")
>    UUID('3d813cbb-47fb-32ba-91df-831e1593ac29')
>    >>> uuid.uuid3(uuid.NAMESPACE_DNS, u"www.widgets.com")
>    Traceback (most recent call last):
>      File "<stdin>", line 1, in <module>
>      File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/uuid.py", line 512, in uuid3
>        hash = md5(namespace.bytes + name).digest()
>    UnicodeDecodeError: 'ascii' codec can't decode byte 0xa7 in position
>    1: ordinal not in range(128)
>
>)
>
>What do others think?

I can see only two options that make sense. Either accept only ascii
(i.e. code points smaller than 0x80), or use utf-8. The first option is a
subset of the second one.

Cheers,
Lars
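To illustrate the "subset" remark: for ASCII-only names the two options feed identical bytes to the hash, so they cannot disagree on the resulting UUID. They only differ in what happens to everything else. A minimal Python sketch, where `name_uuid_v3` is a hypothetical helper (MD5 over namespace bytes plus name bytes, as in the quoted Python session) and the accented host name is an invented example:

```python
import hashlib
import uuid


def name_uuid_v3(ns: uuid.UUID, name: bytes) -> uuid.UUID:
    """Hypothetical helper: version 3 (MD5) UUID from raw name bytes."""
    digest = hashlib.md5(ns.bytes + name).digest()
    return uuid.UUID(bytes=digest, version=3)


plain = "www.widgets.com"
# ASCII-only text encodes to the same bytes under both options...
same_bytes = plain.encode("ascii") == plain.encode("utf-8")
print(same_bytes)
plain_uuid = name_uuid_v3(uuid.NAMESPACE_DNS, plain.encode("utf-8"))
print(plain_uuid)

# ...while the ASCII-only option has to reject anything else.
accented = "www.b\u00fccher.example"  # invented example name
try:
    accented.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)
print(name_uuid_v3(uuid.NAMESPACE_DNS, accented.encode("utf-8")))
```

The second printed line should be the same 3d813cbb-47fb-32ba-91df-831e1593ac29 that the quoted session shows, since both hash exactly the same bytes.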
Re: [Development] [Qt5-feedback] A micro API review: for V3(md5) and V5(sha1) in QUuid
[ Re-trying after the previous massive quoting and line-wrap fail :-/ ]

Denis Dzyubenko wrote:
> 2011/12/9 João Abecasis :
> >> inline QUuid QUuid::createFromName(const QUuid &ns, const
> >> QString &name)
> >> {
> >>     return createFromName(ns, name.toUtf8());
> >> }
> >
> > would only be updated to call the right implementations, as
> > appropriate.
>
> I like the current status of the patch very much.
>
> However I have one question - where does utf8 come from? Shouldn't it
> be defined by the RFC? If not, imo we shouldn't arbitrarily choose an
> encoding, and should maybe leave the default one in - which is utf-16
> for QString

This is my reasoning:

1) As you mention the RFC doesn't specify encodings. In fact, it says
the owner of a namespace is free to decide how it should be used. For
this reason it's important that we support QByteArray as the canonical
form and let users make conscious decisions.

2) In Qt, strings of text are represented as QString so it would be nice
to support QString-based names. This is the reason for adding those
overloads as convenience API, but doesn't tell us how QString-based
names should be translated to "a canonical sequence of octets" (quoting
the standard).

3) The point of name-based UUIDs is that you can regenerate the UUIDs
knowing only the namespace UUID and a particular name. If you use the
QByteArray version, it's up to you to ensure this. When using the QString
version Qt needs to ensure it for you.

This excludes locale- and system-dependent conversions, like
toLocal8Bit(). It also excludes straightforward utf16() as it is
dependent on endianness, and thus platform.

4) UTF-8 is a good candidate because it is one possible "canonical
sequence of octets". But it's mostly that, a good candidate.

So, there isn't a reason why it *has* to be utf-8, but I haven't seen
better alternatives. Other alternatives are toAscii or toLatin1, but
they're lossy encodings. Network-byte order UTF-16?...

Anyway, one use case mentioned in the standard makes this convenience
approach very nice:

    QUrl url;

    // ...

    // NameSpace_DNS from RFC4122
    // {6ba7b810-9dad-11d1-80b4-00c04fd430c8}
    QUuid nsDns(0x6ba7b810, 0x9dad, 0x11d1, 0x80, 0xb4,
                0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8);

    QUuid uuidForUrl = QUuid::createFromName(nsDns, url.toString());

With the added benefit that in that use case it interoperates with
Python.

("And what does python do?", you ask. Well, it avoids the decision
altogether and bails out on unicode strings. It only accepts
byte strings:

    $ python
    Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49)
    [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import uuid
    >>> uuid.NAMESPACE_DNS
    UUID('6ba7b810-9dad-11d1-80b4-00c04fd430c8')
    >>> uuid.uuid3(uuid.NAMESPACE_DNS, "www.widgets.com")
    UUID('3d813cbb-47fb-32ba-91df-831e1593ac29')
    >>> uuid.uuid3(uuid.NAMESPACE_DNS, u"www.widgets.com")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/uuid.py", line 512, in uuid3
        hash = md5(namespace.bytes + name).digest()
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xa7 in position 1: ordinal not in range(128)

)

What do others think?

Cheers,

João
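For comparison with the Python 2.6 session above: Python 3's `uuid.uuid3()` takes a `str` name and, as far as the standard library source goes, encodes it as UTF-8 before hashing. That is the same convention as the `toUtf8()` overload. The sketch below checks this against a manual hash; `manual_uuid3` is a hypothetical helper and the second host name is an invented non-ASCII example.

```python
import hashlib
import uuid


def manual_uuid3(ns: uuid.UUID, name: bytes) -> uuid.UUID:
    """Hypothetical helper: MD5 over namespace bytes + name bytes."""
    return uuid.UUID(bytes=hashlib.md5(ns.bytes + name).digest(), version=3)


names = ["www.widgets.com", "www.b\u00fccher.example"]
matches = [
    # Library call on str vs. manual hash of the UTF-8 bytes.
    uuid.uuid3(uuid.NAMESPACE_DNS, n)
    == manual_uuid3(uuid.NAMESPACE_DNS, n.encode("utf-8"))
    for n in names
]
print(matches)
```

If both entries come out equal, a Qt implementation that hashes `name.toUtf8()` would interoperate with Python 3 for arbitrary text, not just for ASCII names.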
Re: [Development] [Qt5-feedback] A micro API review: for V3(md5) and V5(sha1) in QUuid
On 9 Dec 2011, at 17:10, ext Denis Dzyubenko wrote:
> 2011/12/9 João Abecasis :
>> This has my vote:
>>
>>     QUuid QUuid::createFromNameV3(const QUuid &, const QByteArray &);
>>     QUuid QUuid::createFromNameV5(const QUuid &, const QByteArray &);
>>
>>     inline QUuid QUuid::createFromName(const QUuid &ns, const QByteArray &name)
>>     {
>>         // SHA1 (v5) is recommended
>>         return createFromNameV5(ns, name);
>>     }
>>
>>     inline QUuid QUuid::createFromName(const QUuid &ns, const QString &name)
>>     {
>>         return createFromName(ns, name.toUtf8());
>>     }
>
> I like the createFromNameV3() names! I also think we should have api
> that takes QString (i.e. operates on utf-16 data) and QByteArray (i.e.
> raw data), and maybe even an overload that takes const char * and int
> size - for passing raw data.

Above, I already suggested versions taking QByteArray that operate on
the byte data, or did you mean something different?

And what's wrong with QByteArray::fromRawData(const char *data, int
size)? (Can we do something in QByteArray to improve it, for instance?
That would keep everyone from having to add that one extra overload...)

Cheers,

João
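A loose Python analogue of the raw-data question, offered only as an illustration and not as a statement about how QByteArray behaves: a `memoryview` slice refers to existing bytes without copying them, and the hash can consume the namespace and the name as two separate updates. The function name `name_uuid_v5_from_buffer` and the sample packet are invented for this sketch.

```python
import hashlib
import uuid


def name_uuid_v5_from_buffer(ns: uuid.UUID, buf, offset: int, size: int) -> uuid.UUID:
    """Hypothetical helper: hash `size` bytes of `buf` starting at `offset`."""
    view = memoryview(buf)[offset:offset + size]  # no copy of the name bytes
    h = hashlib.sha1()
    h.update(ns.bytes)  # namespace first...
    h.update(view)      # ...then the name, without concatenating the two
    return uuid.UUID(bytes=h.digest()[:16], version=5)


packet = bytearray(b"HOST:www.widgets.com\r\n")
result = name_uuid_v5_from_buffer(uuid.NAMESPACE_DNS, packet, 5, 15)
print(result)
```

The result is the same UUID as hashing a standalone copy of the name, which is the property a (pointer, length) entry point would need to preserve.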
Re: [Development] [Qt5-feedback] A micro API review: for V3(md5) and V5(sha1) in QUuid
On 9 Dec 2011, at 13:27, João Abecasis wrote:
> Lars wrote:
>> On 12/9/11 12:28 PM, "ext liang...@nokia.com" wrote:
>>>
>>> The original task is:
>>> http://bugreports.qt.nokia.com/browse/QTBUG-23071
>>>
>>> And the change is:
>>> http://codereview.qt-project.org/10803
>>>
>>> For the API name, we need a micro API review:
>>> Set 1:
>>> createUuidMd5()
>>> createUuidSha1()
>>>
>>> or
>>>
>>> createUuidMd5OrSha1()
>>>
>>> Set2:
>>> createUuidV3()
>>> createUuidV5()
>>>
>>> or
>>>
>>> createUuidV3OrV5()
>>>
>>> Any other suggestion is also welcome.
>>
>> These names look ugly. Why not simply QUuid::createUuid(const QUuid &ns,
>> const QByteArray &baseData, Version v); ?
>
> I don't like that one since the namespace and name version only makes sense
> for v3(Md5) and v5(Sha1), making all other options useless. I would prefer
> one name that makes explicit either the version (v3/v5), the approach
> (fromName) or the hash function (Md5, Sha1).
>
> This has my vote:
>
>     QUuid QUuid::createFromNameV3(const QUuid &, const QByteArray &);
>     QUuid QUuid::createFromNameV5(const QUuid &, const QByteArray &);

Thinking over it some more, the two function names above could instead
be createUuidV3 and createUuidV5. In that vein, we could introduce
overloads for v1, v2 and v4 taking no arguments. We currently have an
implementation for v4, with QUuid::createUuid(), making that trivial to
implement:

    static inline QUuid createUuidV4() { return QUuid::createUuid(); }

The other overloads,

>     inline QUuid QUuid::createFromName(const QUuid &ns, const QByteArray &name)
>     {
>         // SHA1 (v5) is recommended
>         return createFromNameV5(ns, name);
>     }
>
>     inline QUuid QUuid::createFromName(const QUuid &ns, const QString &name)
>     {
>         return createFromName(ns, name.toUtf8());
>     }

would only be updated to call the right implementations, as appropriate.

Cheers,

João
Re: [Development] [Qt5-feedback] A micro API review: for V3(md5) and V5(sha1) in QUuid
On 12/9/11 1:27 PM, "João Abecasis" wrote:

>
>Lars wrote:
>> On 12/9/11 12:28 PM, "ext liang...@nokia.com" wrote:
>>>
>>> The original task is:
>>> http://bugreports.qt.nokia.com/browse/QTBUG-23071
>>>
>>> And the change is:
>>> http://codereview.qt-project.org/10803
>>>
>>> For the API name, we need a micro API review:
>>> Set 1:
>>> createUuidMd5()
>>> createUuidSha1()
>>>
>>> or
>>>
>>> createUuidMd5OrSha1()
>>>
>>> Set2:
>>> createUuidV3()
>>> createUuidV5()
>>>
>>> or
>>>
>>> createUuidV3OrV5()
>>>
>>> Any other suggestion is also welcome.
>>
>> These names look ugly. Why not simply QUuid::createUuid(const QUuid &ns,
>> const QByteArray &baseData, Version v); ?
>
>I don't like that one since the namespace and name version only makes
>sense for v3(Md5) and v5(Sha1), making all other options useless. I would
>prefer one name that makes explicit either the version (v3/v5), the
>approach (fromName) or the hash function (Md5, Sha1).
>
>This has my vote:
>
>    QUuid QUuid::createFromNameV3(const QUuid &, const QByteArray &);
>    QUuid QUuid::createFromNameV5(const QUuid &, const QByteArray &);
>
>    inline QUuid QUuid::createFromName(const QUuid &ns, const QByteArray &name)
>    {
>        // SHA1 (v5) is recommended
>        return createFromNameV5(ns, name);
>    }
>
>    inline QUuid QUuid::createFromName(const QUuid &ns, const QString &name)
>    {
>        return createFromName(ns, name.toUtf8());
>    }

A lot better. The other option (if you want to avoid having two symbols)
is to only have createFromName(..., Version = Sha1), and document that
anything that's not Md5 will use Sha1.

I don't have strong opinions on either option though.

Cheers,
Lars
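The single-entry-point option could look roughly like this in Python. This is a sketch of the idea only: the `Version` enum and its member names are assumptions, not anything from the patch under review. SHA-1 is the default, and anything that is not explicitly MD5 falls through to SHA-1, as described above.

```python
import enum
import hashlib
import uuid


class Version(enum.Enum):
    """Hypothetical version selector for name-based UUIDs."""
    MD5 = 3
    SHA1 = 5


def create_from_name(ns: uuid.UUID, name: bytes,
                     version: Version = Version.SHA1) -> uuid.UUID:
    if version is Version.MD5:
        digest = hashlib.md5(ns.bytes + name).digest()
        return uuid.UUID(bytes=digest, version=3)
    # Anything that is not MD5 uses SHA-1 (first 16 of its 20 bytes).
    digest = hashlib.sha1(ns.bytes + name).digest()
    return uuid.UUID(bytes=digest[:16], version=5)


print(create_from_name(uuid.NAMESPACE_DNS, b"www.widgets.com"))
print(create_from_name(uuid.NAMESPACE_DNS, b"www.widgets.com", Version.MD5))
```

The trade-off is one symbol instead of two, at the cost of a parameter whose other possible values (v1, v2, v4) have no meaning for this call.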
Re: [Development] [Qt5-feedback] A micro API review: for V3(md5) and V5(sha1) in QUuid
Lars wrote:
> On 12/9/11 12:28 PM, "ext liang...@nokia.com" wrote:
>>
>> The original task is:
>> http://bugreports.qt.nokia.com/browse/QTBUG-23071
>>
>> And the change is:
>> http://codereview.qt-project.org/10803
>>
>> For the API name, we need a micro API review:
>> Set 1:
>> createUuidMd5()
>> createUuidSha1()
>>
>> or
>>
>> createUuidMd5OrSha1()
>>
>> Set2:
>> createUuidV3()
>> createUuidV5()
>>
>> or
>>
>> createUuidV3OrV5()
>>
>> Any other suggestion is also welcome.
>
> These names look ugly. Why not simply QUuid::createUuid(const QUuid &ns,
> const QByteArray &baseData, Version v); ?

I don't like that one since the namespace and name version only makes
sense for v3(Md5) and v5(Sha1), making all other options useless. I would
prefer one name that makes explicit either the version (v3/v5), the
approach (fromName) or the hash function (Md5, Sha1).

This has my vote:

    QUuid QUuid::createFromNameV3(const QUuid &, const QByteArray &);
    QUuid QUuid::createFromNameV5(const QUuid &, const QByteArray &);

    inline QUuid QUuid::createFromName(const QUuid &ns, const QByteArray &name)
    {
        // SHA1 (v5) is recommended
        return createFromNameV5(ns, name);
    }

    inline QUuid QUuid::createFromName(const QUuid &ns, const QString &name)
    {
        return createFromName(ns, name.toUtf8());
    }

Cheers,

João
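The shape of this proposal, transcribed into Python as a rough sketch so it can be run without Qt. The function names mirror the ones voted for above; the bodies are an assumption based on the RFC 4122 scheme (hash the namespace bytes followed by the name bytes), not the code from the review. Python has no overloading, so the QString/QByteArray pair becomes a type check.

```python
import hashlib
import uuid
from typing import Union


def create_from_name_v3(ns: uuid.UUID, name: bytes) -> uuid.UUID:
    # Version 3: MD5 digest is exactly 16 bytes.
    return uuid.UUID(bytes=hashlib.md5(ns.bytes + name).digest(), version=3)


def create_from_name_v5(ns: uuid.UUID, name: bytes) -> uuid.UUID:
    # Version 5: SHA-1 digest is 20 bytes, only the first 16 are used.
    return uuid.UUID(bytes=hashlib.sha1(ns.bytes + name).digest()[:16], version=5)


def create_from_name(ns: uuid.UUID, name: Union[bytes, str]) -> uuid.UUID:
    # Text is a convenience: it is turned into bytes via UTF-8 first.
    if isinstance(name, str):
        name = name.encode("utf-8")
    # SHA1 (v5) is recommended
    return create_from_name_v5(ns, name)


print(create_from_name(uuid.NAMESPACE_DNS, "python.org"))
print(create_from_name(uuid.NAMESPACE_DNS, b"python.org"))
```

Both calls print the same UUID, because the text form is only a thin layer over the byte form, which stays the canonical one.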
Re: [Development] [Qt5-feedback] A micro API review: for V3(md5) and V5(sha1) in QUuid
Hi Liang,

please send these to development@qt-project.org.

On 12/9/11 12:28 PM, "ext liang...@nokia.com" wrote:

>Hi, all,
>
>The original task is:
>http://bugreports.qt.nokia.com/browse/QTBUG-23071
>
>And the change is:
>http://codereview.qt-project.org/10803
>
>For the API name, we need a micro API review:
>Set 1:
>createUuidMd5()
>createUuidSha1()
>
>or
>
>createUuidMd5OrSha1()
>
>Set2:
>createUuidV3()
>createUuidV5()
>
>or
>
>createUuidV3OrV5()
>
>Any other suggestion is also welcome.

These names look ugly. Why not simply QUuid::createUuid(const QUuid &ns,
const QByteArray &baseData, Version v); ?

Lars

>
>Regards,
>Liang
>
>
>___
>Qt5-feedback mailing list
>qt5-feedb...@qt.nokia.com
>http://lists.qt.nokia.com/mailman/listinfo/qt5-feedback