Re: [Development] QString and related changes for Qt 6
On Thu, 14 May 2020 at 15:46, Marc Mutz via Development <development@qt-project.org> wrote:
> On 2020-05-13 17:17, Matthew Woehlke wrote:
> [...]
> > Non-owning QString is dangerous. QStringLiteral is less dangerous
> > because it is almost never used with non-rodata storage (and indeed, I
> > would consider any such usage highly suspect, if not outright broken).
> > QString::fromRawData is dangerous, but "obviously" so.
> >
> > We should not implement any way of creating a non-owning QString that
> > is not explicit, and if we adhere to that, I don't see us *not*
> > wanting QStringView in many instances.
>
> I must be crazy, but ... +1!

*chuckle* :)

> Thanks,
> Marc

___
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development
Re: [Development] QString and related changes for Qt 6
On 2020-05-13 17:17, Matthew Woehlke wrote:
[...]
> Non-owning QString is dangerous. QStringLiteral is less dangerous
> because it is almost never used with non-rodata storage (and indeed, I
> would consider any such usage highly suspect, if not outright broken).
> QString::fromRawData is dangerous, but "obviously" so.
>
> We should not implement any way of creating a non-owning QString that
> is not explicit, and if we adhere to that, I don't see us *not*
> wanting QStringView in many instances.

I must be crazy, but ... +1!

Thanks,
Marc
Re: [Development] QString and related changes for Qt 6
On 2020-05-13 20:48, Jaroslaw Kobus wrote:
> From: Development on behalf of Thiago Macieira
> Sent: Wednesday, May 13, 2020 6:21 PM
> To: development@qt-project.org
> Subject: Re: [Development] QString and related changes for Qt 6
>
> On Tuesday, 12 May 2020 22:57:31 PDT Jaroslaw Kobus wrote:
> > > That's why I've mentioned the better option: aggregation: QStringView could
> > > be a member of QString. However, the downside would be that every time you
> > > want to call a const method for QString, you would need to first get access
> > > to the QStringView member. The advantage is that in this way you may easily
> > > integrate different interfaces inside one class.
> >
> > This is more or less what we want to do. QString in Qt 6 is {begin, size, d}
> > and QStringView has always been {begin, size}. So, yeah, it can be done.
> >
> > The idea is indeed to offload the majority of the non-mutating methods to the
> > same functions, from inline code. There's no reason to have both
> > QString::indexOf and QStringView::indexOf entry points in the library.
>
> Good to hear. And I hope that Marc will resurrect soon after his veto.

Had you looked into qstring.cpp (I know it hurts!), you'd've seen that it's already implemented that way. But QString neither aggregates a QStringView nor inherits from it.

So, there's no resurrection coming because no death was caused.

Thanks,
Marc
Re: [Development] QString and related changes for Qt 6
> From: Development on behalf of Thiago Macieira
> Sent: Wednesday, May 13, 2020 6:21 PM
> To: development@qt-project.org
> Subject: Re: [Development] QString and related changes for Qt 6
>
> On Tuesday, 12 May 2020 22:57:31 PDT Jaroslaw Kobus wrote:
> > That's why I've mentioned the better option: aggregation: QStringView could
> > be a member of QString. However, the downside would be that every time you
> > want to call a const method for QString, you would need to first get access
> > to the QStringView member. The advantage is that in this way you may easily
> > integrate different interfaces inside one class.
>
> This is more or less what we want to do. QString in Qt 6 is {begin, size, d}
> and QStringView has always been {begin, size}. So, yeah, it can be done.
>
> The idea is indeed to offload the majority of the non-mutating methods to the
> same functions, from inline code. There's no reason to have both
> QString::indexOf and QStringView::indexOf entry points in the library.

Good to hear. And I hope that Marc will resurrect soon after his veto.

Regards
Jarek
Re: [Development] QString and related changes for Qt 6
On Tuesday, 12 May 2020 22:57:31 PDT Jaroslaw Kobus wrote:
> That's why I've mentioned the better option: aggregation: QStringView could
> be a member of QString. However, the downside would be that every time you
> want to call a const method for QString, you would need to first get access
> to the QStringView member. The advantage is that in this way you may easily
> integrate different interfaces inside one class.

This is more or less what we want to do. QString in Qt 6 is {begin, size, d} and QStringView has always been {begin, size}. So, yeah, it can be done.

The idea is indeed to offload the majority of the non-mutating methods to the same functions, from inline code. There's no reason to have both QString::indexOf and QStringView::indexOf entry points in the library.

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel System Software Products
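The "same functions, from inline code" idea can be sketched roughly as follows. This is only an illustration with hypothetical stand-in types (`View`, `Owning`, `sharedIndexOf`), not Qt's actual classes or the code in qstring.cpp: both types forward inline to one out-of-line function, so the library exports a single entry point.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// The single out-of-line entry point shared by both classes (hypothetical name).
std::ptrdiff_t sharedIndexOf(const char16_t *begin, std::size_t size, char16_t ch) noexcept
{
    for (std::size_t i = 0; i < size; ++i) {
        if (begin[i] == ch)
            return static_cast<std::ptrdiff_t>(i);
    }
    return -1; // not found
}

// Stand-in for a non-owning view: {begin, size}.
struct View {
    const char16_t *begin;
    std::size_t size;

    // Inline forwarder: no separate View-specific symbol in the library.
    std::ptrdiff_t indexOf(char16_t ch) const noexcept
    { return sharedIndexOf(begin, size, ch); }
};

// Stand-in for an owning string: {begin, size, d}.
class Owning {
public:
    explicit Owning(std::u16string s)
        : d(std::move(s)), begin(d.data()), size(d.size()) {}

    // Copying would leave begin pointing into the source object, so it is
    // disabled in this sketch.
    Owning(const Owning &) = delete;
    Owning &operator=(const Owning &) = delete;

    // Inline forwarder to the very same function.
    std::ptrdiff_t indexOf(char16_t ch) const noexcept
    { return sharedIndexOf(begin, size, ch); }

private:
    std::u16string d;        // owns the storage (declared first, initialised first)
    const char16_t *begin;   // points into d
    std::size_t size;
};
```

Neither type inherits from or contains the other here; they only share the implementation function.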
Re: [Development] QString and related changes for Qt 6
On 13/05/2020 11.49, Giuseppe D'Angelo wrote:
> On 13/05/20 16:44, Matthew Woehlke wrote:
>>> Note that adding the QString(char16_t*) constructor
>>
>> Pedantic, but surely you meant `char16_t const*`.
>
> Hey, you can't nitpick here ...
>
>> This can be solved with a third overload:
>>
>>   template <size_t N> void foo(char16_t (&s)[N])
>>   { foo(QStringView{s, N}); }
>
> ... and then make the same mistake in the same email >:-)

Touché :-D. I fixed it in my godbolt experiment, but yup, missed it here.

--
Matthew
Re: [Development] QString and related changes for Qt 6
On 13/05/20 16:44, Matthew Woehlke wrote:
>> Note that adding the QString(char16_t*) constructor
>
> Pedantic, but surely you meant `char16_t const*`.

Hey, you can't nitpick here ...

> This can be solved with a third overload:
>
>   template <size_t N> void foo(char16_t (&s)[N])
>   { foo(QStringView{s, N}); }

... and then make the same mistake in the same email >:-)

--
Giuseppe D'Angelo | giuseppe.dang...@kdab.com | Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, http://www.kdab.com
KDAB - The Qt, C++ and OpenGL Experts
Re: [Development] QString and related changes for Qt 6
On 12/05/2020 17.21, Thiago Macieira wrote:
> On Tuesday, 12 May 2020 08:42:28 PDT Matthew Woehlke wrote:
>> How will this work? As I understand, the main advantage to QStringLiteral is that it statically encodes the *length* as well as the data. This isn't possible with raw literals, which are merely NUL-terminated.
>
> Black magic! I mean, templates and constexpr.

Yeah... I'm not sure what I was thinking when I wrote that... Oh, wait...

>> I don't see us ever getting rid of some form of QString literal short of templatizing *everything* that takes a T* (for T in char, char16_t, etc.) to take a T(&)[N] instead.

...I was thinking this. You might be able to escape this for methods that don't take both QString *and* QStringView. Otherwise, well, see my later message on that point. And on that note...

> But QStringView(u"foo") should call that first constructor. Doesn't it? I never remember if the literal decays to pointer before the overload resolution.

Uh... no, actually it doesn't. (Which TBH smells a bit like a defect to me, but we're stuck with it for now.)

Note: https://godbolt.org/z/FbjQkM

(That was experimenting with QString/QStringView overload disambiguation, but also includes the relevant ctors. Comment out the templated overload of `foo` and one of the others, and you'll see that the invocation with a literal calls the "wrong" ctor.)

So, we either need to retain literals in some form, or, as I was saying, every method needs to have a templated flavor for string literals.

>> The "nice" thing about QStringView is that it does not have ownership; you have to be careful about how long you hold onto it lest it turn into a dangling pointer. You can't construct a QString from any old bag of byt^Wcharacters because a QString is implicitly valid until it is destroyed.
>
> That's the problem we've had with QStringLiteral and QString::fromRawData(). You *can* create it from read-only data and tell it never to try to modify. The trick is guaranteeing that it remains valid until the last user finished using it. Because of copy-on-write, that last user can be much later than the statement that created the QString in the first place.

Right, but if you're using QStringLiteral / QString::fromRawData, you "know" you're taking on that responsibility. (And for QStringLiteral, you only run into problems in some instances with library unloading, which is a non-issue for many applications.)

What I worry about with trying to avoid QStringView is that we either lose the ability to avoid copies when the input is a *temporary* (e.g. stack-allocated) buffer, or else we silently accept such uses and produce broken programs. Note that you can't rely on adding non-const overloads as a work-around; the string might be coming from an intermediate function that doesn't have a non-const overload, but was called with a (non-const) temporary buffer. Example:

  void foo(char const* s)
  {
    ...
    method_taking_qt_string(s);
    ...
  }

  void bar()
  {
    char buffer[MAX_SIZE];
    ...do stuff to put data in buffer...
    foo(buffer);
  }

Non-owning QString is dangerous. QStringLiteral is less dangerous because it is almost never used with non-rodata storage (and indeed, I would consider any such usage highly suspect, if not outright broken). QString::fromRawData is dangerous, but "obviously" so.

We should not implement any way of creating a non-owning QString that is not explicit, and if we adhere to that, I don't see us *not* wanting QStringView in many instances.

--
Matthew
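A rough, self-contained sketch of the foo()/bar() scenario above, using standard-library stand-ins (`std::string_view`, `std::string`) rather than Qt types; `Registry` and `remember` are made-up names. It shows the safe variant: the callee accepts a non-owning view and takes an explicit owning copy because it keeps the text, so reusing the caller's stack buffer afterwards is harmless.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical consumer that keeps the text beyond the call.
class Registry {
public:
    // Takes a non-owning view, but stores an owning copy. The caller's
    // buffer may change or go out of scope after this returns.
    void remember(std::string_view name) { stored_ = std::string(name); }

    const std::string &stored() const { return stored_; }

private:
    std::string stored_;
};

// Intermediate function with only a const-pointer signature, as in the
// example above: it cannot tell whether s points at a literal or at a
// temporary buffer.
inline void foo(Registry &r, const char *s)
{
    r.remember(s);
}

inline std::string bar()
{
    Registry r;
    char buffer[16];
    std::strcpy(buffer, "temporary");
    foo(r, buffer);
    std::strcpy(buffer, "overwritten"); // the stack buffer is reused
    return r.stored();                  // the stored copy is unaffected
}
```

Had `remember` silently kept a non-owning reference to the caller's storage instead, the stored text would have tracked (or outlived) the buffer, which is the hazard described above.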
Re: [Development] QString and related changes for Qt 6
On 13/05/2020 02.33, Lars Knoll wrote:
> On 12 May 2020, at 23:09, Thiago Macieira wrote:
>> I want rules that determine what the API should be without looking at the implementation of those two functions.

You may be disappointed, at least as far as parameters.

> This is one reason why I think we should simply use QString in most of those cases.
>
> Additionally, QString is a class that owns its data, making it the class that's easiest to use and safest. QStringView doesn't own its data and as such there are always lifetime considerations that need to be taken into account when using it. So using it would make using the API harder and more error prone.

That might be true for return values. For parameters, if the *user* needs to care whether the function takes a QString vs. QStringView, we're doing something wrong. The onus to properly handle a QStringView in that case should be entirely on the *implementer* of the API.

...but yeah, if we're talking about return values, that's a whole other kettle of fish.

--
Matthew
Re: [Development] QString and related changes for Qt 6
On 12/05/2020 13.48, Giuseppe D'Angelo via Development wrote:
> On 5/12/20 6:12 PM, Иван Комиссаров wrote:
>> So the question is - is it possible to allow to construct QString from unicode literal?
>
> "Not yet", but adding a constructor from char16_t to QString makes sense.
>
> This creates a problem down the line: today you have a f(QString) and you call it with f(u"whatever"). Then, later on, you realize that QString is not needed and QStringView suffices. (This is the case all over existing Qt code.)
>
> What do you do? Adding a QStringView overload will make calls ambiguous, removing the QString one will be an ABI break. We need an established solution for these cases as they'll pop up during the Qt 6 lifetime.

This can be solved with a third overload:

  template <size_t N> void foo(char16_t (&s)[N])
  { foo(QStringView{s, N}); }

Of course, this isn't quite right; we actually want:

  QStringView{s, s[N - 1] ? N : N - 1}

...so that we correctly handle both NUL-terminated literals and also raw arrays (which may not be NUL-terminated!). There is the slight caveat that we will ignore a final NUL in a raw array, but a) I think that's reasonable, and b) I don't see a way around that short of a language change to give string literals a distinct type.

Also note that reasonable compilers should optimize away the conditional, so there is no added overhead.

> Note that adding the QString(char16_t*) constructor

Pedantic, but surely you meant `char16_t const*`. (Also, please provide the templated overload so calling strlen is not needed!)

--
Matthew
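The third-overload idea, including the trailing-NUL adjustment, might look roughly like this when written against standard-library stand-ins (`std::u16string` for the owning type, `std::u16string_view` for the view). `viewOf` and the integer return values of `foo` are made up for illustration, and the array parameter is taken as `const` per the nitpick above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: view over a char16_t array, dropping one trailing
// NUL if present, so literals and unterminated raw arrays both work.
template <std::size_t N>
constexpr std::u16string_view viewOf(const char16_t (&s)[N]) noexcept
{
    return std::u16string_view{s, s[N - 1] ? N : N - 1};
}

constexpr char16_t raw[3] = {u'a', u'b', u'c'}; // not NUL-terminated

static_assert(viewOf(u"abc").size() == 3); // literal: terminator dropped
static_assert(viewOf(raw).size() == 3);    // raw array: nothing dropped

// Overload set: the return value just reports which overload ran.
inline int foo(const std::u16string &) { return 1; } // owning overload
inline int foo(std::u16string_view) { return 2; }    // view overload

// Third overload: an exact match for arrays and literals, so it wins over
// the two overloads above, which both need a user-defined conversion.
template <std::size_t N>
int foo(const char16_t (&s)[N])
{
    return foo(viewOf(s));
}
```

With the template removed, a call such as `foo(u"abc")` is ambiguous between the first two overloads for these stand-in types, which mirrors the problem being discussed.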
Re: [Development] QString and related changes for Qt 6
On Wed, 13 May 2020 at 12:50, Tor Arne Vestbø wrote:
> > On 13 May 2020, at 10:12, Edward Welbourne wrote:
> >
> >> Note that adding the QString(char16_t*) constructor introduces this
> >> ambiguity for the functions that are already overloaded on
> >> QString+QStringView (and thus today are using QStringView).
> >
> > Would it suffice to skip the QString(char16_t *) constructor and,
> > instead, have a QString(QStringView) constructor ?
> >
> > I guess calls to functions taking QString would have to make one of the
> > steps explicit, when passing a u"...", i.e. either call
> > f(QString(u"...")) or f(QStringView(u"...")), preferring the latter (as
> > it's future-proof against f changing signature from QString to
> > QStringView later; note that this concern applies to Qt-using code, which
> > may allow itself such ABI-breaks, not just Qt itself, which wouldn't, at
> > least not once the old API has appeared in a public release). I suppose
> > both forms are capable of exploiting constexpr and happening at
> > compile-time, when the compiler deigns to make it so.
>
> Whatever we end up with, _please_ avoid the
> explicitness/verboseness/boilerplate of having to wrap every "foo" in some
> QPreferredStringTypeOfTheWeek("foo")
>
> I expect my code to look like this:
>
>   foo.bar("baz")
>
> Or if the allocations and conversions are really a performance issue for
> this particular piece of code:
>
>   foo.bar(u"baz")
>
> Anything else should be reserved for corner cases where the explicitness is
> warranted.

That's all well and good, but if foo.bar(a) and foo.bar(b) have different semantics on whether the class copies or views what I pass in, I am going to hurt you. :)

Meaning that if it sometimes stores a copy, then it should always store a copy, instead of sometimes storing a copy and sometimes storing a view, in which case I need to be insanely careful about calling such functionality.

If the class doesn't store the argument, I don't care. If it does, it should decide whether it stores a copy or a view. Overloads in an overload set should have the same semantics, otherwise that API is a vector, where for some incoming types it does A and for others it does B, and the code can no longer be read without looking at the API documentation for every call.
Re: [Development] QString and related changes for Qt 6
> On 13 May 2020, at 10:12, Edward Welbourne wrote:
>
>> Note that adding the QString(char16_t*) constructor introduces this
>> ambiguity for the functions that are already overloaded on
>> QString+QStringView (and thus today are using QStringView).
>
> Would it suffice to skip the QString(char16_t *) constructor and,
> instead, have a QString(QStringView) constructor ?
>
> I guess calls to functions taking QString would have to make one of the
> steps explicit, when passing a u"...", i.e. either call
> f(QString(u"...")) or f(QStringView(u"...")), preferring the latter (as
> it's future-proof against f changing signature from QString to
> QStringView later; note that this concern applies to Qt-using code, which
> may allow itself such ABI-breaks, not just Qt itself, which wouldn't, at
> least not once the old API has appeared in a public release). I suppose
> both forms are capable of exploiting constexpr and happening at
> compile-time, when the compiler deigns to make it so.

Whatever we end up with, _please_ avoid the explicitness/verboseness/boilerplate of having to wrap every "foo" in some QPreferredStringTypeOfTheWeek("foo")

I expect my code to look like this:

  foo.bar("baz")

Or if the allocations and conversions are really a performance issue for this particular piece of code:

  foo.bar(u"baz")

Anything else should be reserved for corner cases where the explicitness is warranted.

Tor Arne
Re: [Development] QString and related changes for Qt 6
On Tue, May 12, 2020 at 02:09:21PM -0700, Thiago Macieira wrote:
> On Tuesday, 12 May 2020 10:48:24 PDT Giuseppe D'Angelo via Development wrote:
> > What do you do? Adding a QStringView overload will make calls ambiguous,
> > removing the QString one will be an ABI break. We need an established
> > solution for these cases as they'll pop up during the Qt 6 lifetime.
>
> Indeed.
>
> And the API policy must be one such that it doesn't depend on what the method
> does *today* and it doesn't create a mess. Functions change.
>
> [Good regexp example snipped]
>
> I want rules that determine what the API should be without looking at the
> implementation of those two functions.

Same for me.

And I think this is an important point, even to the degree that a clear, uniform API is worth more than a handful of cycles.

Most of the API changes that are currently discussed or even done "for performance reasons" *do not matter in practice*.

If a real world Qt application has a performance problem, this is *not* solved by changing QRegularExpression::pattern() from returning a QString to returning QStringView.

There are very few cases in repeatedly used low level functions where it actually *does* make sense, but there it's actually ok to have a duplicated interface. The "overload" problem would also be solvable, by not using overloads, but differently named functions, e.g. by something like .midView() instead of .mid().

Andre'
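As a rough illustration of the "differently named functions" suggestion, here is a sketch with a hypothetical owning type `Text` built on `std::u16string`. It is not QString, and `midView` is only the name floated above, not an existing Qt function. The return type is visible in the function name, so there is no overload set to disambiguate.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical owning string type for illustration only.
class Text {
public:
    explicit Text(std::u16string s) : data_(std::move(s)) {}

    // Owning result: remains valid after *this is destroyed.
    Text mid(std::size_t pos, std::size_t n) const
    { return Text(data_.substr(pos, n)); }

    // Non-owning result: valid only while *this is alive and unmodified.
    std::u16string_view midView(std::size_t pos, std::size_t n) const
    { return std::u16string_view(data_).substr(pos, n); }

    const std::u16string &str() const { return data_; }

private:
    std::u16string data_;
};
```

The cost is a duplicated interface, which the message above argues is acceptable for the few low-level functions where the difference matters.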
Re: [Development] QString and related changes for Qt 6
On 5/12/20 6:12 PM, Иван Комиссаров wrote:
>> So the question is - is it possible to allow to construct QString from
>> unicode literal?

Giuseppe D'Angelo (12 May 2020 19:48) replied:
> "Not yet", but adding a constructor from char16_t to QString makes sense.
>
> This creates a problem down the line: today you have a
>
>   f(QString)
>
> and you call it with f(u"whatever"). Then, later on, you realize that
> QString is not needed and QStringView suffices. (This is the case all
> over existing Qt code.)
>
> What do you do? Adding a QStringView overload will make calls ambiguous,
> removing the QString one will be an ABI break. We need an established
> solution for these cases as they'll pop up during the Qt 6 lifetime.
>
> Note that adding the QString(char16_t*) constructor introduces this
> ambiguity for the functions that are already overloaded on
> QString+QStringView (and thus today are using QStringView).

Would it suffice to skip the QString(char16_t *) constructor and, instead, have a QString(QStringView) constructor ?

I guess calls to functions taking QString would have to make one of the steps explicit, when passing a u"...", i.e. either call f(QString(u"...")) or f(QStringView(u"...")), preferring the latter (as it's future-proof against f changing signature from QString to QStringView later; note that this concern applies to Qt-using code, which may allow itself such ABI-breaks, not just Qt itself, which wouldn't, at least not once the old API has appeared in a public release). I suppose both forms are capable of exploiting constexpr and happening at compile-time, when the compiler deigns to make it so.

Eddy.
Re: [Development] QString and related changes for Qt 6
> On 12 May 2020, at 23:09, Thiago Macieira wrote:
>
> On Tuesday, 12 May 2020 10:48:24 PDT Giuseppe D'Angelo via Development wrote:
>> What do you do? Adding a QStringView overload will make calls ambiguous,
>> removing the QString one will be an ABI break. We need an established
>> solution for these cases as they'll pop up during the Qt 6 lifetime.
>
> Indeed.
>
> And the API policy must be one such that it doesn't depend on what the method
> does *today* and it doesn't create a mess. Functions change.
>
> Let's take an example with QRegularExpression's pattern (not picking on
> Giuseppe). Today it is:
>
>    QString pattern() const;
>    void setPattern(const QString &pattern);
>
> QString QRegularExpression::pattern() const
> {
>    return d->pattern;
> }
>
> Since this is returning a stored QString, someone might feel that it should
> instead return a QStringView. But if it's storing, then the setter should
> remain const QString &. That would be:
>
>    QStringView pattern() const;
>    void setPattern(const QString &pattern);
>
> But suppose that there's a pcre2_get_pattern_16() function. Then someone might
> be tempted to say that since PCRE stores the pattern, we don't need to. That
> would mean QRegularExpression::pattern() ought to be written as:
>
> QString QRegularExpression::pattern() const
> {
>    qsizetype len = pcre2_get_pattern_length_16(d->compiledPattern);
>    QString retval(Qt::Uninitialized, len);
>    pcre2_get_pattern_16(d->compiledPattern, retval.data(), len);
>    return retval;
> }
>
> But if PCRE is going to store the pattern and PCRE doesn't use QString, then
> setPattern could take a QStringView instead. That would be:
>
>    QString pattern() const;
>    void setPattern(QStringView pattern);
>
> That's the opposite of the previous one.
>
> I want rules that determine what the API should be without looking at the
> implementation of those two functions.

This is one reason why I think we should simply use QString in most of those cases.

Additionally, QString is a class that owns its data, making it the class that's easiest to use and safest. QStringView doesn't own its data and as such there are always lifetime considerations that need to be taken into account when using it. So using it would make using the API harder and more error prone.

Cheers,
Lars
Re: [Development] QString and related changes for Qt 6
> On 13 May 2020, at 08:14, André Somers wrote:
>
> On 12-05-20 22:42, Thiago Macieira wrote:
>>
>> QStringView::mid(), for example, returns QStringView, but QString::mid()
>> returns QString.
> _Should_ QString::mid be returning a QString though? Perhaps it should return
> a QStringView?

That's a separate question, but I agree it's something we should investigate. Most likely it would break a large amount of code however (mid() being a method that's extremely widely used).

Cheers,
Lars
Re: [Development] QString and related changes for Qt 6
> On 12 May 2020, at 23:21, Thiago Macieira wrote:
>
> On Tuesday, 12 May 2020 08:42:28 PDT Matthew Woehlke wrote:
>> How will this work? As I understand, the main advantage to
>> QStringLiteral is that it statically encodes the *length* as well as the
>> data. This isn't possible with raw literals, which are merely
>> NUL-terminated.
>
> Black magic!
>
> I mean, templates and constexpr. QStringView has these two constructors:
>
>    template <typename Char, size_t N>
>    Q_DECL_CONSTEXPR QStringView(const Char (&array)[N]) noexcept;
>
>    template <typename Char>
>    Q_DECL_CONSTEXPR QStringView(const Char *str) noexcept;
>
> The first one has a clear-cut size and can be initialised from a character
> literal. The second one can attempt to determine at constexpr time what the
> string length is.
>
> It can't do so today (5.15) because of the lack of if constexpr. But Qt 6.0
> will require C++17, so it can use if constexpr and implement a scan-for-NUL at
> constexpr time if the payload is also constexpr. If it isn't, then it falls
> back to calling qustrlen().
>
>> Even std::string wants literals for this reason. A UDL would obviously
>> be superior, but I don't see us ever getting rid of some form of QString
>> literal short of templatizing *everything* that takes a T* (for T in
>> char, char16_t, etc.) to take a T(&)[N] instead.
>
>    u"foo"_qs
>    u"foo"_qsv;
>
> But QStringView(u"foo") should call that first constructor. Doesn't it? I
> never remember if the literal decays to pointer before the overload
> resolution.
>
>>> In most other places we should by default only use QString, unless
>>> there are very significant performance benefits to be had from using
>>> QStringView. This helps us keep an API that's both easy to use and
>>> maintain. With the ideas above, you can still create a read-only
>>> string, so data copies can in many cases be avoided if required.
>>
>> Really? How?
>>
>> The "nice" thing about QStringView is that it does not have ownership;
>> you have to be careful about how long you hold onto it lest it turn into
>> a dangling pointer. You can't construct a QString from any old bag of
>> byt^Wcharacters because a QString is implicitly valid until it is destroyed.
>
> That's the problem we've had with QStringLiteral and QString::fromRawData().
>
> You *can* create it from read-only data and tell it never to try to modify.
> The trick is guaranteeing that it remains valid until the last user finished
> using it. Because of copy-on-write, that last user can be much later than the
> statement that created the QString in the first place.
>
> One way to ensure that guarantee is to never unload/free the memory block in
> the first place. We already don't unload plugins for this and similar reasons.

I have partial patches (they still need some more work) where we can create a QString from read-only data. This is possible because QString in Qt 6 has a begin/end pointer in the class itself (not in the d-pointer). So a read-only QString would contain a null d-pointer plus the pointer to data and size/end.

To avoid problems with plugins, we have two options. Either we continue not unloading them (safe bet), or we disable those constructors when compiling plugin code, and enforce a copy of the data in that case.

> One thing Lars and I agree is that those literals must be null-terminated,
> unlike QStringView. Whether it's simply an API contract or whether we test/
> enforce remains to be seen. On the platforms where Qt runs, we can almost
> always read past the end of the string to see if the terminator is there, even
> if it means writing assembly code.

Ideally, we can check this at compile time for most cases. We have been making that assumption, but not checking it in Qt 5's QString (you could get a non-zero-terminated string by using fromRawData()).

Cheers,
Lars

>> That said, I think I understand the reasoning here; make it up front
>> that the input is going to wind up in *a* QString. If the user's input
>> is *already* a QString, the function can make a shared copy rather than
>> constructing a brand new one. However, it would be nice for such
>> functions to offer r-value reference overloads for cases where a QString
>> needs to be created, or if the user is done with their copy. (Actually,
>> a possibly-owning reference wrapper could be useful here...)
>
> --
> Thiago Macieira - thiago.macieira (AT) intel.com
> Software Architect - Intel System Software Products
Re: [Development] QString and related changes for Qt 6
> On 12 May 2020, at 22:42, Thiago Macieira wrote:
>
> On Tuesday, 12 May 2020 09:34:40 PDT Marc Mutz via Development wrote:
>> On 2020-05-12 11:31, Jaroslaw Kobus wrote:
>>> So, just an idea: instead of repeating the common API part in QString
>>> and QStringView, what about making it one common? E.g. what about:
>>> - deriving QString from QStringView (and adding mutator API)
>>> or (maybe even better):
>>> - aggregating QStringView object as a part of QString API and giving
>>> accesor for it, like:
>>
>> Vetoed. Over my dead body™. No inheriting of non-polymorphic types from
>> each other. What we have is static polymorphism, and that's what we
>> should continue to have.
>
> Agreed, but also because many of the methods in QStringView are not applicable
> to QString.
>
> QStringView::mid(), for example, returns QStringView, but QString::mid()
> returns QString.
>
> QString is neither a specialisation nor a broadening of QStringView.

Agreed as well. Those are two separate classes, but they can share the implementation of many methods.

Lars
Re: [Development] QString and related changes for Qt 6
On 12-05-20 22:42, Thiago Macieira wrote:
> QStringView::mid(), for example, returns QStringView, but QString::mid() returns QString.

_Should_ QString::mid be returning a QString though? Perhaps it should return a QStringView?

André
Re: [Development] QString and related changes for Qt 6
> From: Development on behalf of Thiago Macieira
> Sent: Tuesday, May 12, 2020 10:42 PM
> To: development@qt-project.org
> Subject: Re: [Development] QString and related changes for Qt 6
>
> On 2020-05-12 11:31, Jaroslaw Kobus wrote:
> > > So, just an idea: instead of repeating the common API part in QString
> > > and QStringView, what about making it one common? E.g. what about:
[...]
> > > or (maybe even better):
> > > - aggregating QStringView object as a part of QString API and giving
[...]
>
> QStringView::mid(), for example, returns QStringView, but QString::mid()
> returns QString.
>
> QString is neither a specialisation nor a broadening of QStringView.

The first option (inheritance) just gives the idea of a simple, imperfect solution.

That's why I've mentioned the better option: aggregation: QStringView could be a member of QString. However, the downside would be that every time you want to call a const method for QString, you would need to first get access to the QStringView member. The advantage is that in this way you may easily integrate different interfaces inside one class.

Anyway, if you are saying the APIs of QString and QStringView are not the same, and they should still differ, then forget about the above.

Regards
Jarek
Re: [Development] QString and related changes for Qt 6
On Tuesday, 12 May 2020 08:42:28 PDT Matthew Woehlke wrote:
> How will this work? As I understand, the main advantage to
> QStringLiteral is that it statically encodes the *length* as well as the
> data. This isn't possible with raw literals, which are merely
> NUL-terminated.

Black magic! I mean, templates and constexpr.

QStringView has these two constructors:

    template <typename Char, size_t N>
    Q_DECL_CONSTEXPR QStringView(const Char (&array)[N]) noexcept;
    template <typename Char>
    Q_DECL_CONSTEXPR QStringView(const Char *str) noexcept;

The first one has a clear-cut size and can be initialised from a string literal. The second one can attempt to determine at constexpr time what the string length is. It can't do so today (5.15) because of the lack of if constexpr. But Qt 6.0 will require C++17, so it can use if constexpr and implement a scan-for-NUL at constexpr time if the payload is also constexpr. If it isn't, then it falls back to calling qustrlen().

> Even std::string wants literals for this reason. A UDL would obviously
> be superior, but I don't see us ever getting rid of some form of QString
> literal short of templatizing *everything* that takes a T* (for T in
> char, char16_t, etc.) to take a T(&)[N] instead.

    u"foo"_qs
    u"foo"_qsv;

But QStringView(u"foo") should call that first constructor. Doesn't it? I never remember if the literal decays to pointer before the overload resolution.

> > In most other places we should by default only use QString, unless
> > there are very significant performance benefits to be had from using
> > QStringView. This helps us keep an API that's both easy to use and
> > maintain. With the ideas above, you can still create a read-only
> > string, so data copies can in many cases be avoided if required.
>
> Really? How?
>
> The "nice" thing about QStringView is that it does not have ownership;
> you have to be careful about how long you hold onto it lest it turn into
> a dangling pointer. You can't construct a QString from any old bag of
> byt^Wcharacters because a QString is implicitly valid until it is destroyed.

That's the problem we've had with QStringLiteral and QString::fromRawData(). You *can* create it from read-only data and tell it never to try to modify that data. The trick is guaranteeing that the data remains valid until the last user has finished using it. Because of copy-on-write, that last user can come much later than the statement that created the QString in the first place.

One way to ensure that guarantee is to never unload/free the memory block in the first place. We already don't unload plugins for this and similar reasons.

One thing Lars and I agree on is that those literals must be null-terminated, unlike QStringView. Whether it's simply an API contract or whether we test/enforce it remains to be seen. On the platforms where Qt runs, we can almost always read past the end of the string to see if the terminator is there, even if it means writing assembly code.

> That said, I think I understand the reasoning here; make it up front
> that the input is going to wind up in *a* QString. If the user's input
> is *already* a QString, the function can make a shared copy rather than
> constructing a brand new one. However, it would be nice for such
> functions to offer r-value reference overloads for cases where a QString
> needs to be created, or if the user is done with their copy. (Actually,
> a possibly-owning reference wrapper could be useful here...)

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel System Software Products
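The two-constructor idea described above can be sketched without Qt. This is a minimal illustration only: `Utf16View` and `lengthOf` are hypothetical names, not Qt API, and the pointer overload is constrained with `std::enable_if_t` (an assumption of this sketch) so that a literal, which is an array, can only select the array overload. It uses a plain constexpr loop and does not reproduce the qustrlen() runtime fallback mentioned in the message.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a UTF-16 string view; this is not QStringView.
class Utf16View
{
public:
    // Array overload: the array bound is part of the argument's type, so the
    // size is known without scanning. N counts the terminating NUL, hence N - 1.
    template <std::size_t N>
    constexpr Utf16View(const char16_t (&array)[N]) noexcept
        : m_data(array), m_size(N - 1) {}

    // Pointer overload: only participates when the argument really is a
    // pointer, so a string literal never reaches it.
    template <typename Pointer,
              std::enable_if_t<std::is_same_v<Pointer, const char16_t *>
                                   || std::is_same_v<Pointer, char16_t *>,
                               bool> = true>
    constexpr Utf16View(const Pointer &str) noexcept
        : m_data(str), m_size(lengthOf(str)) {}

    constexpr const char16_t *data() const noexcept { return m_data; }
    constexpr std::size_t size() const noexcept { return m_size; }

private:
    // Scan for the NUL terminator; usable at compile time when the pointed-to
    // data is itself a constant expression.
    static constexpr std::size_t lengthOf(const char16_t *str) noexcept
    {
        std::size_t n = 0;
        while (str[n] != u'\0')
            ++n;
        return n;
    }

    const char16_t *m_data;
    std::size_t m_size;
};

// Literal: the array overload is chosen and the size comes from the type.
static_assert(Utf16View(u"foo").size() == 3, "size taken from the array bound");

// constexpr pointer: the pointer overload scans for NUL at compile time.
constexpr const char16_t *kGreeting = u"hello";
static_assert(Utf16View(kGreeting).size() == 5, "size found by constexpr scan");
```

In this sketch the question of whether the literal decays before overload resolution does not arise, because the pointer overload is removed from the candidate set for array arguments.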
Re: [Development] QString and related changes for Qt 6
On Tuesday, 12 May 2020 10:48:24 PDT Giuseppe D'Angelo via Development wrote:
> What do you do? Adding a QStringView overload will make calls ambiguous,
> removing the QString one will be an ABI break. We need an established
> solution for these cases as they'll pop up during the Qt 6 lifetime.

Indeed. And the API policy must be one that doesn't depend on what the method does *today* and doesn't create a mess. Functions change.

Let's take an example with QRegularExpression's pattern (not picking on Giuseppe). Today it is:

    QString pattern() const;
    void setPattern(const QString &pattern);

    QString QRegularExpression::pattern() const
    {
        return d->pattern;
    }

Since this is returning a stored QString, someone might feel that it should instead return a QStringView. But if it's storing, then the setter should remain const QString &. That would be:

    QStringView pattern() const;
    void setPattern(const QString &pattern);

But suppose that there's a pcre2_get_pattern_16() function. Then someone might be tempted to say that since PCRE stores the pattern, we don't need to. That would mean QRegularExpression::pattern() ought to be written as:

    QString QRegularExpression::pattern() const
    {
        qsizetype len = pcre2_get_pattern_length_16(d->compiledPattern);
        QString retval(Qt::Uninitialized, len);
        pcre2_get_pattern_16(d->compiledPattern, retval.data(), len);
        return retval;
    }

But if PCRE is going to store the pattern and PCRE doesn't use QString, then setPattern could take a QStringView instead. That would be:

    QString pattern() const;
    void setPattern(QStringView pattern);

That's the opposite of the previous one.

I want rules that determine what the API should be without looking at the implementation of those two functions.

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel System Software Products
Re: [Development] QString and related changes for Qt 6
On Tuesday, 12 May 2020 09:34:40 PDT Marc Mutz via Development wrote:
> On 2020-05-12 11:31, Jaroslaw Kobus wrote:
> > So, just an idea: instead of repeating the common API part in QString
> > and QStringView, what about making it one common? E.g. what about:
> > - deriving QString from QStringView (and adding mutator API)
> > or (maybe even better):
> > - aggregating QStringView object as a part of QString API and giving
> > accesor for it, like:
>
> Vetoed. Over my dead body™. No inheriting of non-polymorphic types from
> each other. What we have is static polymorphism, and that's what we
> should continue to have.

Agreed, but also because many of the methods in QStringView are not applicable to QString.

QStringView::mid(), for example, returns QStringView, but QString::mid() returns QString.

QString is neither a specialisation nor a broadening of QStringView.

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel System Software Products
Re: [Development] QString and related changes for Qt 6
On Tuesday, 12 May 2020 02:04:35 PDT Tor Arne Vestbø wrote:
> During the contributor summit we were talking about just assuming "foo" is
> utf-8, now that our source code is utf-8. Is that not possible?

We've been doing that since 5.0.

But UTF-8 to UTF-16 requires a conversion. u"" wouldn't, and in some cases we would be able to use it without memory allocations either -- that is, QStringLiteral().

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel System Software Products
Re: [Development] QString and related changes for Qt 6
On Tuesday, 12 May 2020 02:01:45 PDT Edward Welbourne wrote:
> I largely agree, with the exception of: supporting an 8-bit string view
> type for comparisons (including startsWith(), find()/indexOf() and
> similar) can save client code a factor of two on the size of many string
> literals. I'm fine with limiting its use to the QString(View) API,
> though. So QUtf8View would replace QLatin1String as that 8-bit view
> type, with a much more limited scope.
>
> While we can simply ask folk to stick a u on the front of their strings,
> doubling the size of each, it would be a kindness to those with lots of
> string literals to allow them to use u8 instead and avoid that doubling.
> Meanwhile, the many situations where data from an outside source arrives
> in UTF-8 make a case for providing a view type that can wrap such data
> and make it "presentable" for interaction with QString(View), tagged
> with the right semantics (i.e. the knowledge that it's UTF-8) in the
> type system.

I think we need some more data before we do that.

First of all, char8_t doesn't exist before C++20. u8"" has existed since C++11, but it didn't produce char8_t literals until C++20. So we have to be careful with recommending people use it. The APIs we add using char8_t, if any, will exist with C++20 only.

But for Qt, everything char is already UTF-8, so we don't need char8_t.

The problem with QUtf8View is how it may be used. Unlike with QLatin1String, direct UTF-16-to-UTF-8 comparisons aren't as easy, so the QString methods that would take QUtf8View are necessarily slower. If space is a constraint but runtime isn't, it might be best to just use the QString constructor.

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel System Software Products
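To illustrate why comparing UTF-16 against UTF-8 costs more than comparing against Latin-1, here is a minimal standalone sketch. The function names `equalsLatin1` and `equalsUtf8` are hypothetical and none of this is Qt code. The UTF-8 decoder is deliberately simplified: it rejects truncated or malformed sequences but does not check for overlong encodings, which a production implementation would have to do.

```cpp
#include <cstddef>
#include <string_view>

// Latin-1: every byte maps to exactly one UTF-16 code unit with the same
// value, so lengths must match and the loop is a plain element-wise compare.
inline bool equalsLatin1(std::u16string_view utf16, std::string_view latin1)
{
    if (utf16.size() != latin1.size())
        return false;
    for (std::size_t i = 0; i < utf16.size(); ++i) {
        if (utf16[i] != static_cast<unsigned char>(latin1[i]))
            return false;
    }
    return true;
}

// UTF-8: each code point occupies 1 to 4 bytes and maps to 1 or 2 UTF-16
// code units, so the lengths say nothing and every code point is decoded.
inline bool equalsUtf8(std::u16string_view utf16, std::string_view utf8)
{
    std::size_t i = 0; // index into utf16
    std::size_t j = 0; // index into utf8
    while (j < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[j]);
        std::size_t extra = 0; // number of continuation bytes expected
        char32_t cp = 0;       // decoded code point
        if (lead < 0x80) {
            cp = lead;
        } else if ((lead >> 5) == 0x06) {
            extra = 1;
            cp = lead & 0x1F;
        } else if ((lead >> 4) == 0x0E) {
            extra = 2;
            cp = lead & 0x0F;
        } else if ((lead >> 3) == 0x1E) {
            extra = 3;
            cp = lead & 0x07;
        } else {
            return false; // invalid lead byte
        }
        ++j;
        if (utf8.size() - j < extra)
            return false; // truncated sequence
        for (std::size_t k = 0; k < extra; ++k, ++j) {
            const unsigned char c = static_cast<unsigned char>(utf8[j]);
            if ((c & 0xC0) != 0x80)
                return false; // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < 0x10000) {
            // One UTF-16 code unit.
            if (i >= utf16.size() || utf16[i] != cp)
                return false;
            ++i;
        } else {
            // Surrogate pair.
            if (cp > 0x10FFFF || utf16.size() - i < 2)
                return false;
            const char32_t v = cp - 0x10000;
            if (utf16[i] != 0xD800 + (v >> 10) || utf16[i + 1] != 0xDC00 + (v & 0x3FF))
                return false;
            i += 2;
        }
    }
    return i == utf16.size();
}
```

The Latin-1 path can reject on length alone and is easy to vectorise; the UTF-8 path has data-dependent branches for every code point, which is the extra cost the message refers to.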
Re: [Development] QString and related changes for Qt 6
On 5/12/20 6:12 PM, Иван Комиссаров wrote:
> So the question is - is it possible to allow to construct QString from unicode literal?

"Not yet", but adding a constructor from char16_t* to QString makes sense.

This creates a problem down the line: today you have a f(QString) and you call it with f(u"whatever"). Then, later on, you realize that QString is not needed and QStringView suffices. (This is the case all over existing Qt code.)

What do you do? Adding a QStringView overload will make calls ambiguous, removing the QString one will be an ABI break. We need an established solution for these cases as they'll pop up during the Qt 6 lifetime.

Note that adding the QString(char16_t*) constructor introduces this ambiguity for the functions that are already overloaded on QString+QStringView (and thus today are using QStringView).

Thanks,
--
Giuseppe D'Angelo | giuseppe.dang...@kdab.com | Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, http://www.kdab.com
KDAB - The Qt, C++ and OpenGL Experts
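The ambiguity described above can be reproduced with two stand-in types. This is a sketch under stated assumptions: `OwningString` and `StringView` are hypothetical placeholders for a QString with the proposed char16_t* constructor and for QStringView, and `CanCallF` is a small detection helper introduced here only so the ambiguity can be checked without a compile error.

```cpp
#include <type_traits>
#include <utility>

// Stand-in for QString with the proposed implicit constructor.
struct OwningString
{
    OwningString(const char16_t *) {}
};

// Stand-in for QStringView, which is also implicitly constructible
// from a UTF-16 pointer.
struct StringView
{
    StringView(const char16_t *) {}
};

// An API that only ever took the owning type: f-style calls with a literal work.
inline int g(const OwningString &) { return 1; }

// An API overloaded on both types.
inline int f(const OwningString &) { return 1; }
inline int f(StringView) { return 2; }

// Detects whether the expression f(arg) is well-formed for a given argument type.
template <typename Arg, typename = void>
struct CanCallF : std::false_type {};

template <typename Arg>
struct CanCallF<Arg, std::void_t<decltype(f(std::declval<Arg>()))>> : std::true_type {};

// Each overload is reachable when the caller names the type explicitly...
static_assert(CanCallF<OwningString>::value, "owning argument selects f(const OwningString &)");
static_assert(CanCallF<StringView>::value, "view argument selects f(StringView)");

// ...but a bare UTF-16 pointer needs one user-defined conversion for either
// overload, so neither is better and the call is ambiguous.
static_assert(!CanCallF<const char16_t *>::value, "f(u\"...\") would not compile");
```

A caller can always disambiguate by writing `f(StringView(u"whatever"))`, but that changes source code that previously compiled, which is the compatibility problem raised in the message.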
Re: [Development] QString and related changes for Qt 6
On 2020-05-12 16:12, Giuseppe D'Angelo via Development wrote: On 5/12/20 12:20 PM, Иван Комиссаров wrote: * Exceptions can be done where significant performance gains can be demonstrated and the API will by design not require a copy of the data (e.g. XML writer, stream writers, date time handling) Let me disagree here. The decision should be taken on the fact if the object takes ownership of the string (and thus QString is used) or it only «looks» into it. I agree. This however leaves us with questions regarding the API. E.g.: class Attribute { public: // OK: takes ownership void addAttribute(const QString , const QString ); Such code can take QAnyStringView which would be, essentially, std::variantchar32_t)>. And while I still think that char[] should be deprecated once we can depend on char8_t, for the time being, that would work with "foo" (and convert to QUtf8StringView). Thanks, Marc ___ Development mailing list Development@qt-project.org https://lists.qt-project.org/listinfo/development
Re: [Development] QString and related changes for Qt 6
On 2020-05-12 11:31, Jaroslaw Kobus wrote:
> So, just an idea: instead of repeating the common API part in QString
> and QStringView, what about making it one common? E.g. what about:
> - deriving QString from QStringView (and adding mutator API)
> or (maybe even better):
> - aggregating QStringView object as a part of QString API and giving
> accesor for it, like:

Vetoed. Over my dead body™. No inheriting of non-polymorphic types from each other. What we have is static polymorphism, and that's what we should continue to have.

Sorry,
Marc
Re: [Development] QString and related changes for Qt 6
Good question! Personally, I think that both should accept u"foo" as input. However, the following code does not compile:

    QString s(u"foo");

I have no idea if this is intentional or not and if there will be problems with QString/QStringView overloads. However, since the overloads are going to be revisited anyway, maybe it is possible to remove some QString overloads in favor of the QStringView ones and thus allow accepting a unicode literal in QString as well.

I don't think that accepting char* should be the desired use-case. Yes, it works in the first case because QT_NO_CAST_FROM_ASCII is disabled by default, but I don't think we should encourage that use-case - if the unicode literal works in both cases, that should become the «right way» to go.

So the question is - is it possible to allow to construct QString from unicode literal?

Ivan

> On 12 May 2020, at 16:12, Giuseppe D'Angelo via Development wrote:
>
> On 5/12/20 12:20 PM, Иван Комиссаров wrote:
>>> * Exceptions can be done where significant performance gains can be
>>> demonstrated and the API will by design not require a copy of the data
>>> (e.g. XML writer, stream writers, date time handling)
>> Let me disagree here. The decision should be taken on the fact if the object
>> takes ownership of the string (and thus QString is used) or it only «looks»
>> into it.
>
> I agree. This however leaves us with questions regarding the API. E.g.:
>
> class Attribute {
> public:
>     // OK: takes ownership
>     void addAttribute(const QString &, const QString &);
>
>     // does not take ownership
>     bool hasAttribute(QStringView key) const;
> };
>
> Is it OK that you can call addAttribute("foo", "bar") but not
> hasAttribute("foo")? (And similar)
>
> Thanks,
> --
> Giuseppe D'Angelo | giuseppe.dang...@kdab.com | Senior Software Engineer
> KDAB (France) S.A.S., a KDAB Group company
> Tel. France +33 (0)4 90 84 08 53, http://www.kdab.com
> KDAB - The Qt, C++ and OpenGL Experts
Re: [Development] QString and related changes for Qt 6
On 12/05/2020 03.49, Lars Knoll wrote:
> * QStringLiteral should turn into a small wrapper around u"…", and
> probably also get deprecated. Maybe we could add a user defined literal
> for it instead that returns a read-only QString (QString s = "…"_q;).
> So u"…" would lead to a QStringView, u"…"_q to a read-only QString.

How will this work? As I understand, the main advantage to QStringLiteral is that it statically encodes the *length* as well as the data. This isn't possible with raw literals, which are merely NUL-terminated.

Even std::string wants literals for this reason. A UDL would obviously be superior, but I don't see us ever getting rid of some form of QString literal short of templatizing *everything* that takes a T* (for T in char, char16_t, etc.) to take a T(&)[N] instead.

> In most other places we should by default only use QString, unless
> there are very significant performance benefits to be had from using
> QStringView. This helps us keep an API that's both easy to use and
> maintain. With the ideas above, you can still create a read-only
> string, so data copies can in many cases be avoided if required.

Really? How?

The "nice" thing about QStringView is that it does not have ownership; you have to be careful about how long you hold onto it lest it turn into a dangling pointer. You can't construct a QString from any old bag of byt^Wcharacters because a QString is implicitly valid until it is destroyed.

That said, I think I understand the reasoning here; make it up front that the input is going to wind up in *a* QString. If the user's input is *already* a QString, the function can make a shared copy rather than constructing a brand new one. However, it would be nice for such functions to offer r-value reference overloads for cases where a QString needs to be created, or if the user is done with their copy. (Actually, a possibly-owning reference wrapper could be useful here...)

--
Matthew
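The r-value reference overload suggested above might look like the following sketch. `Label` and `setText` are hypothetical names, and `std::u16string` stands in for QString. One difference matters: `std::u16string` is not implicitly shared, so the const-reference overload here makes a deep copy, whereas with QString it would be a cheap shared copy.

```cpp
#include <string>
#include <utility>

// Hypothetical class that stores the string it is given.
class Label
{
public:
    // The caller keeps its copy; the storage is copied (or, for an
    // implicitly shared type such as QString, shared).
    void setText(const std::u16string &text) { m_text = text; }

    // The caller is done with its copy, or the argument is a temporary
    // that had to be created anyway; steal its buffer instead of copying.
    void setText(std::u16string &&text) noexcept { m_text = std::move(text); }

    const std::u16string &text() const noexcept { return m_text; }

private:
    std::u16string m_text;
};
```

Calling `setText(u"literal")` creates a temporary `std::u16string`, which binds to the r-value overload, so the newly created string is moved into place rather than copied a second time.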
Re: [Development] QString and related changes for Qt 6
On 5/12/20 12:20 PM, Иван Комиссаров wrote:
>> * Exceptions can be done where significant performance gains can be
>> demonstrated and the API will by design not require a copy of the data
>> (e.g. XML writer, stream writers, date time handling)
> Let me disagree here. The decision should be taken on the fact if the object
> takes ownership of the string (and thus QString is used) or it only «looks»
> into it.

I agree. This however leaves us with questions regarding the API. E.g.:

    class Attribute {
    public:
        // OK: takes ownership
        void addAttribute(const QString &, const QString &);

        // does not take ownership
        bool hasAttribute(QStringView key) const;
    };

Is it OK that you can call addAttribute("foo", "bar") but not hasAttribute("foo")? (And similar)

Thanks,
--
Giuseppe D'Angelo | giuseppe.dang...@kdab.com | Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, http://www.kdab.com
KDAB - The Qt, C++ and OpenGL Experts
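The ownership-based split discussed above can be tried out with standard library types. In this sketch `Attributes` is a hypothetical class, with `std::u16string` standing in for QString and `std::u16string_view` for QStringView. It shows the situation where a UTF-16 literal is accepted by both the owning and the non-owning function, which is the behaviour being asked for.

```cpp
#include <functional>
#include <map>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical attribute store; std::u16string / std::u16string_view stand in
// for QString / QStringView.
class Attributes
{
public:
    // Takes ownership: both strings end up stored in the map.
    void addAttribute(std::u16string key, std::u16string value)
    {
        m_attributes.insert_or_assign(std::move(key), std::move(value));
    }

    // Does not take ownership: only looks at the key, so a view suffices.
    // std::less<> enables lookup by view without building a temporary string.
    bool hasAttribute(std::u16string_view key) const
    {
        return m_attributes.find(key) != m_attributes.end();
    }

private:
    std::map<std::u16string, std::u16string, std::less<>> m_attributes;
};
```

Both `addAttribute(u"foo", u"bar")` and `hasAttribute(u"foo")` compile here because both stand-in types convert implicitly from a `const char16_t *`. Narrow `"foo"` literals would compile for neither, since neither type converts from `const char *`.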
Re: [Development] QString and related changes for Qt 6
On 2020-05-12 12:36, Lars Knoll wrote:
> ...
> Leaving things behind simplifies our lives and in the longer term also our users life. And yes, non unicode encodings are legacy in todays world. They need to disappear, and most people are working towards that goal. We can and should do our part.
>
> Lars

+1

I still have some .exe files around that I wrote for Windows 3.0 in 1991. While they run nicely today (GDI graphics) on my Windows 10 PC, the Swedish characters display wrong, because somewhere along the way Microsoft decided that the kosher codepage for Windows programs would cease to be 850 and instead be 1252. Yes, in 1991 CP 850 was hot, today not so much. So I'd prefer if Qt would require UTF-8 even on Windows.

P.S. Consider a similar type of "technical debt" being settled by Qt: I'm thinking of the "DPI awareness" setting in 5.14, i.e. for a default widgets program, Qt nowadays tells Windows that it's "DPI aware" and wants the truth about screen coordinates, even on those portable PCs with high DPIs that have Scale set to 125% or 150%.

On the Qt forum I've seen a lot of heat/complaints about QLabels being shoehorned in with the QLineEdits because the fonts are too big for those 125% or 150% screens. I'd answer: create a qt.conf file with the contents:

    [Platforms]
    WindowsArguments = dpiawareness=0

and your legacy widgets program will go back to displaying fine, albeit a bit blurry and bloated.

But! If you're asking (with that qt.conf file present) what the screen size is (e.g. QGuiApplication::screens().at(0)->geometry() etc.), Windows will lie to you and scale "backwards", so that a normal 2560x1440 screen is reported as "QRect(0,0,1707,960)". So using dpiawareness=0 is a bad long-term solution :-(
Re: [Development] QString and related changes for Qt 6
> On 12 May 2020, at 11:34, André Pönitz wrote:
>
> On Tue, May 12, 2020 at 07:49:06AM +, Lars Knoll wrote:
>> I believe it's important to leave the non Unicode world behind us [...]
>
> Is that meant to be a convincing technical argument?

This is nothing technical per se.

>> * We have extensive support for legacy text encodings in Qt Core, that
>> should not be there anymore in 2020
>
> To clarify, since this kind of things is easily misread:
>
> "It should not be _in Qt Core_, but it should be somewhere else _in Qt_."
>
> Getting easy access to encodings is a valuable feature of Qt.

A separate library that uses ICU behind the scenes is something I agree with. QTextCodec in its current form not so much.

>> * We offer options to generate HTML or XML in legacy encodings, even
>> though the standard clearly says that those are deprecated
>>
>> * to/fromLocal8Bit() should be equivalent to to/fromUtf8() on all but
>> Windows (where we're still a few years away from fully getting rid of
>> this)
>>
>> * source code encoding is undefined
>>
>> Cleaning this up has progressed quite a bit, and a lot of changes in
>> various classes have been merged. There's a large set of changes
>> currently being reviewed that remove QTextCodec as a dependency in Qt
>> (it'll get moved to libQt5Compat), and introduce a new QStringConverter
>> class, that can handle transcoding between Unicode encodings, Latin1
>> and the system locale. For all systems except Windows, we make the
>> additional assumption that the system locale is UTF-8 (see also my
>> other mail about UTF-8 as System locale on Windows).
>
> libQt5Compat is something that's likely to go away in Qt 7. I don't see the
> general need for text codecs going away. So it would make more sense to have
> them in a module of their own from the beginning.

See above, and the new QStringEncoder/Decoder can support additional encodings (though that's not yet implemented).

>> A next step is to change the build system, so that it (by default)
>> assumes that source code is encoded in UTF-8. We already do set
>> compiler flags to ensure this when building Qt itself, but are not
>> doing this yet for user code.
>
> Which makes sense, because it's not up to a library to dictate how user
> code has to look like.

Funny, how most other programming languages actually ‘dictate’ that. gcc and clang have both made this the default for quite some time already (even if your system locale happens to not be utf8). I've not seen complaints about this anywhere.

>> But gcc and clang do already treat all source code as UTF-8 by default
>> (and I believe ICC does the same at least on platforms other than
>> Windows). MSVC will require a /utf-8 flag to enable this, something
>> that I want to add to the default config for both qmake and cmake when
>> compiling a Qt app. Without it, MSVC will still assume the source code
>> is encoded in the current ANSI code page and u"…" or u8"…" will result
>> in garbage. Worse it'll lead to non portable code, that might compile
>> correctly on one developer machine and create garbage on the next one
>> (as it uses a different locale).
>
>> Changing this also for our users will make source code written for Qt
>> more portable and bring Qt on par with most other programming languages
>> in the world that already mandate utf8 as the source encoding (JS,
>> Swift, Java, etc).
>
> "Bringing on par" by cutting functionality that is.
>
> To me it is unclear how relevant citing other languages here is. If anything
> at all, Standard C++ would be relevant, which does *not* mandate UTF-8.

We are talking about what the default is. If someone really wants a different encoding, they can still do that. And the default is already utf8 on all but Windows (where it's the current ANSI code page, which means anything but ASCII is not well defined at all).

>> [...]
>
>> Comments are welcome, [...]
>
> I buy a "codecs are too big for Qt Core, they should be separate" argument
> (that was not made here unless I overlooked it) and I buy the "there should
> not be multiple overloads for the mass of string-taking functions in the API"
> argument. I'd even buy a "we don't have resources to even keep it around".
>
> I don't understand the motivation for the "legacy", "believe", "important to
> leave behind" line of reasoning.

Leaving things behind simplifies our lives and in the longer term also our users' lives. And yes, non-Unicode encodings are legacy in today's world. They need to disappear, and most people are working towards that goal. We can and should do our part.

Lars
Re: [Development] QString and related changes for Qt 6
> On 12 May 2020, at 09:49, Lars Knoll wrote:
>
> Hi all,

First of all, the plan sounds great!

> Most other classes:
>
> * Only take and return QString
> * Exceptions can be done where significant performance gains can be
> demonstrated and the API will by design not require a copy of the data (e.g.
> XML writer, stream writers, date time handling)

Let me disagree here. The decision should be taken on the fact if the object takes ownership of the string (and thus QString is used) or it only «looks» into it. Otherwise, QString gets propagated all over the place:

    void addSuffix(const QString &suffix) // can't use view here!
    {
        m_memberString.append(suffix); // no QStringView overload, can't use QStringView in the API
    }

OK, we aim to have a QString::append(QStringView) overload, so the example is not that good. Another one:

    QMimeType findMimeType(const QString &name) // can't use view here!
    {
        return QMimeDatabase().mimeTypeForName(name); // no QStringView overload, the API propagates QString through all the code
    }

I hope the idea is clear.

PS: it is not that easy to fix QMimeDatabase to take QStringView (I looked into the possibility), but the aim should be to take QStringView where it is possible, not where it is *faster*.

Ivan
Re: [Development] QString and related changes for Qt 6
> On 12 May 2020, at 11:04, Tor Arne Vestbø wrote:
>
>> On 12 May 2020, at 09:49, Lars Knoll wrote:
>>
>> * Our QLatin1String uses are in most cases about pure ASCII strings. In any
>> case, we should consider mass porting them over to u"…" instead.
>
> During the contributor summit we were talking about just assuming "foo" is
> utf-8, now that our source code is utf-8. Is that not possible?

It is, but we'd need to copy the data to create a QString. With 16-bit data, we could avoid many of the copies.

Cheers,
Lars
Re: [Development] QString and related changes for Qt 6
> From: Development on behalf of Lars Knoll
> Sent: Tuesday, May 12, 2020 9:49 AM
> To: Qt development mailing list
> Subject: [Development] QString and related changes for Qt 6
>
> * QStringView and QByteArrayView need to be completed to implement all const
> methods of QString/QByteArray

Wondering about this point. Looks like we aim for:

    QString API = QStringView API (const API) + mutator API

So, just an idea: instead of repeating the common API part in QString and QStringView, what about making it a common one? E.g. what about:

- deriving QString from QStringView (and adding mutator API)

or (maybe even better):

- aggregating a QStringView object as a part of the QString API and giving an accessor for it, like:

    QStringView QString::stringView();

In this way we get access to the read-only part of the QString API, and we no longer have to worry about manually keeping the const part of the QString API in sync with the QStringView API.

The same of course applies to QByteArray & QByteArrayView...

Jarek
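The accessor variant of this idea can be sketched with standard library types. `OwningText` and `view()` are hypothetical names, with `std::u16string` standing in for QString and `std::u16string_view` for QStringView. The owning class exposes only mutators plus a single accessor, and every read-only operation goes through the view.

```cpp
#include <string>
#include <string_view>
#include <utility>

// Hypothetical owning string that delegates its whole read-only API to a view.
class OwningText
{
public:
    explicit OwningText(std::u16string text) : m_storage(std::move(text)) {}

    // Read-only access: find(), substr(), comparison etc. all live on the view,
    // so they do not need to be repeated on the owning class.
    std::u16string_view view() const noexcept { return m_storage; }

    // Mutator API stays on the owning class.
    void append(std::u16string_view suffix) { m_storage.append(suffix); }

private:
    std::u16string m_storage;
};
```

Two consequences raised elsewhere in the thread show up directly in this sketch: `view().substr(...)` returns a view rather than an owning string, and any view obtained before `append()` may dangle afterwards because the storage can reallocate.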
Re: [Development] QString and related changes for Qt 6
> On 12 May 2020, at 09:49, Lars Knoll wrote:
>
> * Our QLatin1String uses are in most cases about pure ASCII strings. In any
> case, we should consider mass porting them over to u"…" instead.

During the contributor summit we were talking about just assuming "foo" is utf-8, now that our source code is utf-8. Is that not possible?

Tor Arne
Re: [Development] QString and related changes for Qt 6
Lars Knoll (12 May 2020 09:49) wrote:
> My high level goal for the string classes in Qt 6 was to complete the
> Unicode related changes that we started for Qt 5.0, where we made utf8
> and utf16 the main encodings, and simplify things. I believe it's
> important to leave the non Unicode world behind us, and offer an as
> consistent cross-platform story here as we can.

+1

> A next step is to change the build system, so that it (by default)
> assumes that source code is encoded in UTF-8. We [already] do set
> compiler flags to ensure this when building Qt itself,

... except on Windows, which we plan to fix; good plan.

> Our string handling classes currently consist of the following
> classes: QByteArray, QString, QStringView, QStringRef, QStringLiteral
> and QLatin1String. The set is too large, inconsistent and needs
> cleaning up:

Indeed, we recently documented which should be used when; doing so made it clear that we'd be better off with fewer of these, if only for the sake of making it easier to explain!

> * QByteArray's methods like toUpper() will only handle ASCII
> characters (they assume Latin1 in Qt5).

We should document that doing even this is under sufferance and we wish folk would stop using QByteArray for it. It's an operation that implicates the semantics of the bytes, so should be done using a class that believes it knows the semantics of the bytes - which QByteArray should steadfastly refuse to do. Aim to remove at Qt 7.

> This would leave us with 4 string-related classes: QByteArray(View)
> and QString(View).

Sounds much better; and clearer.

> One open question is whether we should add a QUtf8String with a
> char8_t. I am not yet convinced that we actually need the class
> though.

How about a QUtf8View, replacing QLatin1String, as the way to pass single-byte-encoded literals into our string APIs? See below.

> The next question is what we do with our API methods. Currently we
> have many places where we have three to 4 overloads for the same
> methods (taking a QString, a QStringView, a QStringRef and a
> QLatin1String). We can't have 4 overloads for each method in all of
> Qt, so we need to restrict overloads to the places where it is
> required. IMO this is mainly the string related classes
> themselves. And even there we can probably cut down on the number of
> overloads.

I largely agree, with the exception of: supporting an 8-bit string view type for comparisons (including startsWith(), find()/indexOf() and similar) can save client code a factor of two on the size of many string literals. I'm fine with limiting its use to the QString(View) API, though. So QUtf8View would replace QLatin1String as that 8-bit view type, with a much more limited scope.

While we can simply ask folk to stick a u on the front of their strings, doubling the size of each, it would be a kindness to those with lots of string literals to allow them to use u8 instead and avoid that doubling. Meanwhile, the many situations where data from an outside source arrives in UTF-8 make a case for providing a view type that can wrap such data and make it "presentable" for interaction with QString(View), tagged with the right semantics (i.e. the knowledge that it's UTF-8) in the type system.

Eddy.
[Development] QString and related changes for Qt 6
Hi all,

I've had a longer chat with Thiago about how to evolve QString for Qt 6 last week. Some work has already happened, so both QString and QByteArray now share the data structure with QList/QVector, enabling zero-copy conversion between the types. There are also some pending changes to transition those classes to qsizetype and remove the 32-bit limitations we currently have.

My high level goal for the string classes in Qt 6 was to complete the Unicode related changes that we started for Qt 5.0, where we made utf8 and utf16 the main encodings, and simplify things. I believe it's important to leave the non-Unicode world behind us, and offer as consistent a cross-platform story here as we can.

Qt 5.x still has some left-overs from the pre-Unicode world:

* QTextStream encodes in Latin1 by default, so do a couple of classes in some places
* While we assume Utf8 as the source encoding for Qt, we still use QLatin1String all over the place
* We have extensive support for legacy text encodings in Qt Core, that should not be there anymore in 2020
* We offer options to generate HTML or XML in legacy encodings, even though the standard clearly says that those are deprecated
* to/fromLocal8Bit() should be equivalent to to/fromUtf8() on all but Windows (where we're still a few years away from fully getting rid of this)
* source code encoding is undefined

Cleaning this up has progressed quite a bit, and a lot of changes in various classes have been merged. There's a large set of changes currently being reviewed that remove QTextCodec as a dependency in Qt (it'll get moved to libQt5Compat), and introduce a new QStringConverter class, that can handle transcoding between Unicode encodings, Latin1 and the system locale. For all systems except Windows, we make the additional assumption that the system locale is UTF-8 (see also my other mail about UTF-8 as System locale on Windows).

A next step is to change the build system, so that it (by default) assumes that source code is encoded in UTF-8. We already do set compiler flags to ensure this when building Qt itself, but are not doing this yet for user code. But gcc and clang do already treat all source code as UTF-8 by default (and I believe ICC does the same at least on platforms other than Windows). MSVC will require a /utf-8 flag to enable this, something that I want to add to the default config for both qmake and cmake when compiling a Qt app. Without it, MSVC will still assume the source code is encoded in the current ANSI code page and u"…" or u8"…" will result in garbage. Worse, it'll lead to non-portable code that might compile correctly on one developer machine and create garbage on the next one (as it uses a different locale).

Changing this also for our users will make source code written for Qt more portable and bring Qt on par with most other programming languages in the world that already mandate utf8 as the source encoding (JS, Swift, Java, etc).

Our string handling classes currently consist of the following classes: QByteArray, QString, QStringView, QStringRef, QStringLiteral and QLatin1String. The set is too large, inconsistent and needs cleaning up:

* With the source code encoding being utf8, QLatin1String makes a lot less sense, and my goal is to deprecate/deprioritize it in Qt 6. Instead, I would like to advocate the use of u"…" to directly encode the string as utf-16.
* QStringRef has been superseded by QStringView and should get deprecated. The main hurdle here is its use in QXmlStream. The plan is to extend QXmlStreamStringRef (yes, that one exists as well…) to cover the use case. Both QXmlStreamStringRef and QStringRef will get a cast operator to QStringView. With that we can then remove all API that takes a QStringRef and replace it with API taking either a QString or a QStringView
* QStringLiteral should turn into a small wrapper around u"…", and probably also get deprecated. Maybe we could add a user defined literal for it instead that returns a read-only QString (QString s = "…"_q;). So u"…" would lead to a QStringView, u"…"_q to a read-only QString.
* We should add a QByteArrayView to keep symmetry between the QString and QByteArray APIs. This is somewhat independent from the rest though and lower priority.
* QStringView and QByteArrayView need to be completed to implement all const methods of QString/QByteArray
* A basic difference between QString and QStringView will be that the view class can contain non-zero-terminated data and is read-only, while QString will guarantee zero termination (I checked whether we can remove that enforcement, but it will break too much code). Sidenote: Currently, fromRawData() together with utf16() can break this assumption, we should fix this
* QByteArray's methods like toUpper() will only handle ASCII characters (they assume Latin1 in Qt5).

This would leave us with 4 string related classes: QByteArray(View) and QString(View). Another step that is