Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Thiago Macieira
On Wednesday 14 October 2015 20:09:56 Marc Mutz wrote:
> On Wednesday 14 October 2015 12:41:12 Allan Sandfeld Jensen wrote:
> > Why not a QCharArray? With external data constructor, that should be the
> > same,  shouldn't it?
> 
> If you propose something like QString/QByteArray::fromRawData(), then that 
> allocates the control block, so no, not really an option.

Which is also solved by the null d-pointer.

In other words
QStringLiteral("foo") === QString::fromRawData(u"foo", 3);

In theory. In practice, there may be some dragons hidden somewhere.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center

___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development


Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Thiago Macieira
On Wednesday 14 October 2015 20:04:12 Marc Mutz wrote:
> On Wednesday 14 October 2015 18:11:26 Thiago Macieira wrote:
> > and the fact that QStringLiterals don't share will cause the
> > innocent-looking  above code require 64 bytes of read-only data.
> 
> They are shared, because it seems that lambdas within the same function have
> the same type. At least last I checked, that was what GCC implemented.

GCC 5.2, 6: 2 lambdas, data duplicated
Clang 3.7, 3.8: 2 lambdas, data duplicated
ICC 16: 2 lambdas, data duplicated

You can see from the disassembly that they are two different types.
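
The observation can be reproduced without Qt. The following is a minimal sketch in standard C++ that assumes, as QStringLiteral does in Qt 5, that each literal lives in a static array inside its own lambda; the function name lambdasShareData is made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Returns true if two textually identical lambdas share one static array.
inline bool lambdasShareData()
{
    auto first  = [] { static const char16_t data[] = u"foo"; return &data[0]; };
    auto second = [] { static const char16_t data[] = u"foo"; return &data[0]; };

    // Every lambda expression has its own unique closure type, even when
    // the two expressions are spelled identically in the same function.
    static_assert(!std::is_same<decltype(first), decltype(second)>::value,
                  "closure types are distinct");

    // Each closure type owns its own static array object, so the two
    // arrays are distinct objects with distinct addresses.
    return first() == second();
}
```

The function returns false, which matches the "data duplicated" results listed above.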

> > movq    _ZN10QArrayData18shared_static_dataE@GOTPCREL(%rip), %rax
> 
> And you want the nullptr to get rid of this relocation.

Yes, but more importantly because it speeds up the check for when reference 
counting should be done. Right now, it needs to check bit 9 inside d->flags, 
which means dereferencing the pointer (hitting another cacheline) and the 
compiler never knows that test is constant with QStringLiterals.

With a null pointer, the check is very trivial (a TEST instruction, for both 
the null and the ~1 check) and the compiler should be able to optimise the 
destructor away.
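
The difference between the two checks can be modelled in a few lines of standard C++. This is only a sketch under simplifying assumptions: Header, Str and both functions are hypothetical stand-ins, not Qt's QArrayData layout; only the flag value (bit 9, i.e. 512) is taken from the description above.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the array header and its flag.
struct Header {
    std::atomic<int> ref{1};
    unsigned flags = 0;
};
constexpr unsigned ImmutableHeader = 0x200; // bit 9, the $512 tested in the disassembly

struct Str {
    Header *d;          // nullptr for literals in the "null d-pointer" scheme
    const char16_t *b;  // start of the character data
    std::size_t size;
};

// Current scheme: the flag lives behind the pointer, so the header has to
// be loaded (possibly from another cache line) before the test.
inline bool needsRefCountToday(const Str &s)
{
    return (s.d->flags & ImmutableHeader) == 0;
}

// Null-pointer scheme: a literal has no header at all, so the test looks
// only at the pointer value that is already in the string object.
inline bool needsRefCountNullD(const Str &s)
{
    return s.d != nullptr;
}
```

In the second form, a compiler that can see the nullptr being stored has a chance to fold the test to a constant, which is the effect described for GCC further down.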

Here's the entire function, as it is today with one QStringLiteral only:
(compiled with GCC 6 -fno-exceptions, rearranged/edited for clarity)

; load the literal:
movq    _ZN10QArrayData18shared_static_dataE@GOTPCREL(%rip), %rax  ; d
movl    $3, 16(%rsp)            ; str.d.size = 3
movq    %rax, (%rsp)            ; str.d.d = &QArrayData::shared_static_data
leaq    .LC0(%rip), %rax        ; u"foo"
movq    %rax, 8(%rsp)           ; str.d.b = u"foo"
; make the call:
movq    %rsp, %rdi
call    _Z1fRK7QString@PLT
; inlined QString::~QString
movq    (%rsp), %rax            ; reload the d pointer
testl   $512, (%rax)            ; d->flags & QArrayData::ImmutableHeader
je      .L8
addq    $40, %rsp
ret
; this is the dead code, it never gets run:
.L8:
lock subl $1, 4(%rax)           ; d->ref_.deref()
jne     .L5
movq    (%rsp), %rdi            ; load d pointer
movl    $16, %edx               ; alignof(QTypedArrayData)
movl    $2, %esi                ; sizeof(QChar)
call    _ZN10QArrayData10deallocateEPS_mm@PLT
addq    $40, %rsp
ret

A hacky implementation that uses a null pointer instead:

; load the literal:
leaq    .LC0(%rip), %rax        ; u"foo"
movq    $0, (%rsp)              ; str.d.d = nullptr
movq    %rax, 8(%rsp)           ; str.d.b = u"foo"
movl    $3, 16(%rsp)            ; str.d.size = 3
; make the call
movq    %rsp, %rdi
call    _Z1fRK7QString@PLT
addq    $40, %rsp
ret

The QString::~QString destructor expanded to empty with GCC. Unfortunately, 
Clang and ICC retained the check (they must be assuming the callee modified the 
const parameter).

Unfortunately, if I change the isStatic to check for LSB set for the SSO case, 
even GCC gets thrown off and brings back the dead code.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center



Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Marc Mutz
On Wednesday 14 October 2015 15:59:21 Matthew Woehlke wrote:
> On 2015-10-14 07:15, Knoll Lars wrote:
> > In addition, it might be tricky to use QStringView in signals and
> > slots.
> 
> As I previously stated, I'm pretty sure you *CAN'T* use QStringView to
> call slots, except for direct call. In any other case, you risk the
> backing data being modified or, worse, deallocated, before the slot
> dispatches (this is *especially* dangerous with cross-thread dispatch,
> since now you have thread safety to worry about). The only way around
> that would be for QStringView to take a COW reference to the underlying
> data.
> 
> We already have a class like that. It's called QString.
> 
> What *might* work is if the event dispatcher, when it makes copies of
> the arguments, makes a deep copy of QStringView into a QString. I'm not
> sure if this is possible, though, and anyway then you're in the same
> boat of making a (potentially) unnecessary deep copy if you had a
> QString in the first place.

This is nothing new. You cannot pass reference types through cross-thread 
signal/slot connections. In fact, you cannot pass any non-reentrant type, 
either. That doesn't prevent API such as QPrintPreviewDialog::paintRequested() 
from cropping up, and still being useful.
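
To make the lifetime problem from the quoted text concrete, here is a small standard-C++ model of the "deep copy when queuing" idea. It is a sketch only: EventQueue and postOwning are hypothetical names, std::u16string_view stands in for QStringView, and nothing here reflects how Qt's event dispatcher actually copies arguments.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// A toy queue of deferred calls, standing in for a queued connection.
class EventQueue {
public:
    // Promote the non-owning view to an owning string at post time, so the
    // deferred call no longer depends on the caller's buffer.
    void postOwning(std::u16string_view text,
                    std::function<void(const std::u16string &)> slot)
    {
        pending_.push_back(
            [owned = std::u16string(text), slot = std::move(slot)] { slot(owned); });
    }

    // Run everything that was queued so far.
    void dispatch()
    {
        std::vector<std::function<void()>> calls;
        calls.swap(pending_);
        for (auto &call : calls)
            call();
    }

private:
    std::vector<std::function<void()>> pending_;
};
```

Storing the view itself in the queue instead of the owned copy would leave the deferred call reading a buffer that may already have been modified or freed, which is the hazard described above. The copy is also exactly the potentially unnecessary allocation the quoted text worries about.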

Thanks,
Marc

-- 
Marc Mutz  | Senior Software Engineer
KDAB (Deutschland) GmbH & Co.KG, a KDAB Group Company
Tel: +49-30-521325470
KDAB - The Qt Experts


Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Marc Mutz
On Wednesday 14 October 2015 12:41:12 Allan Sandfeld Jensen wrote:
> Why not a QCharArray? With external data constructor, that should be the
> same,  shouldn't it?

If you propose something like QString/QByteArray::fromRawData(), then that 
allocates the control block, so no, not really an option.

> Anyway, I doubt this is really something that needs optimizing, QString is 
> neat because it is simple and easy to remember.  If anything we need to
> use  QByteArray in more places where QStrings are only 8-bit strings.

I'm not optimising. I'm decoupling the concept of a "QString" from the owning 
implementation "QString", so that we don't need to either convert from/to 
QString quite so often or you can use "foreign types" 
(std::basic_string, char16_t[], ...) in lieu of QString. That is 
important when you need to interface with 3rd-party libraries.
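
As a rough illustration of that decoupling, the sketch below shows a non-owning view that accepts several "foreign" sources without converting them to one owning type first. It is plain standard C++; StringView and indexOf are hypothetical names chosen for this example and are not the proposed QStringView API. The size is signed here, in line with the preference voiced elsewhere in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// A non-owning (pointer, length) pair over UTF-16 code units.
class StringView {
public:
    constexpr StringView(const char16_t *data, std::ptrdiff_t size)
        : data_(data), size_(size) {}

    // From an owning standard string.
    StringView(const std::u16string &s)
        : data_(s.data()), size_(static_cast<std::ptrdiff_t>(s.size())) {}

    // From a char16_t array; assumes it ends in a terminating NUL.
    template <std::size_t N>
    constexpr StringView(const char16_t (&literal)[N])
        : data_(literal), size_(static_cast<std::ptrdiff_t>(N) - 1) {}

    constexpr const char16_t *data() const { return data_; }
    constexpr std::ptrdiff_t size() const { return size_; }

private:
    const char16_t *data_;
    std::ptrdiff_t size_;
};

// A read-only consumer: one signature serves every source type above.
inline std::ptrdiff_t indexOf(StringView haystack, char16_t needle)
{
    for (std::ptrdiff_t i = 0; i < haystack.size(); ++i)
        if (haystack.data()[i] == needle)
            return i;
    return -1;
}
```

A call such as indexOf(someU16String, u'l') or indexOf(u"foo", u'o') then needs neither an allocation nor a separate overload per source type.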

Thanks,
Marc

-- 
Marc Mutz  | Senior Software Engineer
KDAB (Deutschland) GmbH & Co.KG, a KDAB Group Company
Tel: +49-30-521325470
KDAB - The Qt Experts


Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Marc Mutz
On Wednesday 14 October 2015 18:11:26 Thiago Macieira wrote:
> and the fact that QStringLiterals don't share will cause the
> innocent-looking  above code require 64 bytes of read-only data.

They are shared, because it seems that lambdas within the same function have 
the same type. At least last I checked, that was what GCC implemented.

> movq    _ZN10QArrayData18shared_static_dataE@GOTPCREL(%rip), %rax

And you want the nullptr to get rid of this relocation.

I like it!

Thanks,
Marc

-- 
Marc Mutz  | Senior Software Engineer
KDAB (Deutschland) GmbH & Co.KG, a KDAB Group Company
Tel: +49-30-521325470
KDAB - The Qt Experts


Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Knoll Lars
On 13/10/15 22:46, "Matthew Woehlke"  wrote:

>On 2015-10-13 15:59, Jake Petroules wrote:
>> On Oct 13, 2015, at 1:46 PM, Marc Mutz wrote:
>>> I would therefore like to propose to abandon QString for new API (and
>>> over time phase it out of existing API), and only provide (const QChar*,
>>> size_t) as the most general form. I would propose to package the two
>>> into a class, called - you guessed it - QStringView.
>> 
>> In general this sounds like a dangerous idea because it carries over
>> all the old API concepts (i.e. (QChar *, size_t) is an extremely
>> broken abstraction). You need to read and truly comprehend
>> https://developer.apple.com/swift/blog/?id=30 before suggesting any
>> changes to string-related APIs for the next major version of Qt,
>> because if anything, THAT is what it should look like. Anything but
>> that is a near-useless wrapper around binary data, not a true string
>> class.

From a conceptual point of view I fully agree with the article. Handling
unicode data is difficult, and that is what’s required to make it as
seamless as possible. But the approach Swift is taking is not trivial or
even 100% unambiguous. Afaik they always work with a certain normalization
form (composed). 

It poses certain problems as well, though. With their API, you can always add
a combining character (like an accent) to an existing letter in the
string, but you can never remove it. This creates a certain asymmetry that
can be awkward in some cases.

>
>While I don't necessarily disagree with that article, I think that the
>points being made are orthogonal to what Marc is proposing.

Yes, to a good degree it’s orthogonal. What both Marc’s proposal and the
article above show is that we should rethink some parts of our unicode
handling with Qt 6. QString is very good in many ways, but it still shows
its history as a vector of UTF-16 code units.
>
>The idea of QStringView would, I presume, be similar to that of
>std::string_view; namely, to provide an abstraction over a bag of
>"characters" (using that term rather loosely). It does NOT in any way
>relate to doing any sort of operations (besides slicing) on a "string".
>The idea is to be able to inexpensively pass around "text", whether it
>comes from QString, QStringRef, wchar_t*, or what have you, without
>having to perform superfluous memory allocations to convert to One True
>Form (i.e. QString) when the consumer doesn't actually care.
>
>That said... I note that slots probably still need to take QString,
>because a queued call with a QStringView is horribly broken (for reasons
>which I hope are obvious). At least unless the event dispatcher is
>clever enough to promote these to QString in the event.

Yes, as will many other methods. QStringView would IMO mainly be something we
can use in places where we use the data in a read-only fashion and where
performance is critical.

Cheers,
Lars



Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Knoll Lars
I’m not a huge fan of having different overloads with QString, QStringRef
and QLatin1String, and in some cases (QChar *, int), for many methods
either. But while your proposal solves some problems, it introduces others.

A QStringView class would only work for methods that read the data
contained in it, but don’t try to modify it or take a copy (as Thiago
pointed out). And you certainly can’t keep the pointer to the data around
for longer than the lifetime of the QStringView, so it’s to some extent an
advanced class you have to be careful when using in your own APIs.

So it can work nicely for methods such as QString::indexOf and similar,
but will never be good for methods that need to copy the string (e.g.
QUrl::setHostName).


Another thing I wonder about is whether we shouldn’t deprecate
QLatin1String moving forward. We have QStringLiteral, and even though its
implementation is not ideal, we should be able to get it working
everywhere now with Qt 5.7. Let’s think about how and whether we can
improve its implementation to fix the remaining issues. Then we could
remove/deprecate QLatin1String.

On 13/10/15 23:01, "Thiago Macieira"  wrote:

>On Tuesday 13 October 2015 22:46:36 Marc Mutz wrote:
>> Q: What mistakes do you refer to?
>> 
>> A: The fact that it has copy ctor and assignment operator, so it's not a
>> trivially-copyable type and thus cannot efficiently be passed by-value. It
>> may also be too large for pass-by-value due to the rather useless QString
>> pointer (should have been QStringData*, if any). Neither can be fixed
>> before Qt 6.
>
>Not even in Qt 6. The reason why it uses a QString pointer is that it
>follows the QString through reallocations. If the QString is mutated, the
>QStringRef will still be valid (provided it isn't shortened beyond the
>substring the QStringRef points to). There's a lot of code that depends on
>this, so we can't change it.

Only by deprecating QStringRef and not using it ourselves anymore. But
it’s used quite a lot in Qt, so this is no easy job and will certainly
break source compatibility in places such as the XML stream reader.
>
>> Q: Why size_t?
>> 
>> A: The intent of QStringView (and std::experimental::string_view) is to
>> act as an interface between modules written with different compilers and
>> different flags. A std::string will never be compatible between
>> compilers or even just different flags, but a simple struct {char*, size_t}
>> will always be, by way of its C compatibility.
>> 
>> So the goal is not just to accept QString, QStringRef, and (QChar*,int)
>> (and QVarLengthArray!) as input to QStringView, but also
>> std::basic_string and std::vector.
>
>The C++ committee's current stance on signed vs unsigned is that you
>should use signed for everything, except when you want to have modulo-2^n
>overflows. We're not overflowing, so it should be signed.

Yes, signed please. We can discuss whether it should be 64bit for Qt 6.
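
A two-line illustration of why the signed/unsigned choice matters for a size type, in standard C++. The function names are invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Index of the last element, or -1 for an empty range.
inline std::ptrdiff_t lastIndexSigned(std::ptrdiff_t size)
{
    return size - 1;
}

// Same computation with an unsigned size: for an empty range the result
// wraps around (modulo 2^n) to the largest representable value.
inline std::size_t lastIndexUnsigned(std::size_t size)
{
    return size - 1;
}
```

With the unsigned variant, a loop such as "for (i = lastIndexUnsigned(n); i >= 0; --i)" never terminates, which is the kind of accident the "signed unless you want wrap-around" advice is meant to avoid.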

>
>> Q: What future do you have in mind for QStringRef?
>> 
>> A: None in particular, though I have found a need for an owning
>> QStringRef in some places. But I expect Qt 6's QString to be able to
>> provide a restricted view on shared data, such that it would subsume
>> QStringRef completely.
>
>We should deprecate it if QStringView comes into being.

Agree. 
>
>> Q: What about QLatin1String?
>> 
>> A: Once QString is backed by UTF-8, latin-1 ceases to be a special
>> charset. We might want something like QUsAsciiString, but it would just
>> be a UTF-8 string, so it could be packed into QStringView.
>
>Since QString will not be backed by UTF-8, the answer is irrelevant.

Agree here as well. We can’t make QString utf-8 backed without breaking
way too much code. I also don’t see the need for it. The native encoding
on Windows and Mac (Cocoa) is utf-16 as well, on Linux it’s utf-8. So no
matter which platform we’re on, we won’t avoid some conversions.

And I will strongly oppose any attempts to make QString some sort of
hybrid supporting both. The added complexity in maintaining the code base
is simply not worth it.

>
>> Q: What about QByteArray, QVector?
>> 
>> A: I'm unsure about QByteArrayView. It might not pull its weight
>> compared to std::(experimental::)string_view, but I also note that we're
>> currently missing a QByteArrayRef, so a QBAView might make sense while we
>> wait for the std one to become available to us.
>
>Given the mistakes that you and I are pointing out in QStringRef, we
>should not add QByteArrayRef. Instead, it should be in the new-style, in
>which case I wonder whether we should add a class in the first place. And
>moreover, how often is this needed? std::array_view should be plenty for
>QByteArray and QVector where needed.

Agreed as well.
>
>> I'm actively opposed to a QArrayView, because I don't think it provides
>> us with anything std::(experimental::)array_view doesn't already.
>
>Right.
>
>> Q: What do you mean when you say "abandon 

Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Olivier Goffart
On Tuesday 13. October 2015 22:46:36 Marc Mutz wrote:
> I would therefore like to propose to abandon QString for new API (and over
> time phase it out of existing API), and only provide (const QChar*, size_t)
> as the most general form. I would propose to package the two into a class,
> called - you guessed it - QStringView.

+1

I think we indeed need QStringView, QByteArrayView and even QVectorView.

And functions that take strings without taking ownership of them (i.e: not 
setters) should use that.

-- 
Olivier 

Woboq - Qt services and support - http://woboq.com - http://code.woboq.org


Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Marc Mutz
On Wednesday 14 October 2015 08:37:19 Knoll Lars wrote:
> I’m not a huge fan of having different overloads with QString, QStringRef
> and QLatin1String and in some cases (QChar *, int) for many methods
> neither. But while your proposal solves some problems it introduces others.
> 
> A QStringView class would only work for methods that read the data
> contained in it, but don’t try to modify it or take a copy (as Thiago
> pointed out).

I do not agree with that statement.

First, afaiu from what Thiago mentions in reviews, Q6String will have SSO 
(small-string-optimisation) which makes many short strings expensive to copy 
(if you think that copying 24 bytes is slower than upping an atomic int 
through an indirection) or cheap to copy (if you think the opposite). In any 
case, small strings will be very cheap to create (no allocation), so for many 
strings there will be not much difference between passing a QStringView or 
passing a QString.

Second, upon modification, the QString will detach (make a copy), and _then_ 
perform the operation. With a QStringView and an efficient base of operations, 
those two operations can be folded into one (basically as the const versions 
of QString methods, where they exist, do (or should do)). Unless the operation 
can and will actually be done in the original allocation (ie. incl. no 
detach), the const methods should be faster. That will never be the case when 
you pass QString by const-&, because there will always be the lvalue parameter 
attached to the QString instance. For modification in-place to work, you need 
to pass by rvalue ref. So for typical functions modifying the string, there's 
also no difference between QString and QStringView.

That leaves classes which simply store the string. You cited QUrl. I don't see 
a problem providing QString overloads for these, esp. considering that we're 
starting out with an all-QString API here. Then again, once we have 
QStringView overloads, we can simply disable the QString overloads and see the 
effect.

BTW: functions storing a passed QString as-is should provide a QString&& 
overload, and that might be a good idea even when otherwise using QStringView 
only.
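
The overload pair suggested here could look roughly like the following. This is a sketch in standard C++ with std::u16string and std::u16string_view standing in for QString and QStringView; the Url class is a hypothetical example, not QUrl.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>

// A class that stores the string it is given.
class Url {
public:
    // General form: accepts any view and copies the characters once.
    void setHostName(std::u16string_view host)
    {
        host_.assign(host.data(), host.size());
    }

    // Storage-friendly form: takes over an expiring string without copying.
    void setHostName(std::u16string &&host)
    {
        host_ = std::move(host);
    }

    const std::u16string &hostName() const { return host_; }

private:
    std::u16string host_;
};
```

An lvalue string or a sub-view selects the first overload; std::move(someString) or a temporary selects the second. One caveat of this particular pair with the standard types: a bare u"..." literal converts equally well to both parameter types, so the call is ambiguous unless the caller names the type explicitly.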

> And you certainly can’t keep the pointer to the data around
> for longer than the lifetime of the QStringView, so it’s to some extent an
> advanced class you have to be careful when using in your own APIs.

It's like the distinction between QModelIndex and QPersistentModelIndex. The 
first is an interface type, the latter the storage type. Neither is more 
"advanced" than the other. They are complements.
 
> So it can work nicely for methods such as QString::indexOf and similar,
> but will never be good for methods that need to copy the string (e.g.
> QUrl::setHostName).
> 
> 
> Another thing I wonder about is whether we shouldn’t deprecate
> QLatin1String moving forward. We have QStringLiteral, and even though it’s
> implementation is not ideal, we should be able to get it working
> everywhere now with Qt 5.7. Let’s think about how and whether we can
> improve it’s implementation to fix the remaining issues. Then we could
> remove/deprecate QLatin1String.

There are problems in QStringLiteral that cannot be solved. Common data 
sharing will never happen with the current syntax. I'd suggest a 
QStaticString, a fully constexpr wrapper around QStaticStringData<N>, 
basically to determine the N transparently, which can be used as a variable at 
namespace scope in lieu of the current need to pack all QStringLiterals into 
static inline functions. But that's outside the scope of this thread, so let's 
not go there.
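
For readers unfamiliar with the technique hinted at, this is one way a constexpr helper can deduce N from a literal so that the data can live in a namespace-scope variable. It is a C++14 sketch with invented names (StaticStringData, makeStaticString); it is not Qt's QStaticStringData and not the QStaticString being suggested.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Fixed-size storage for N code units, including the terminating NUL.
template <std::size_t N>
struct StaticStringData {
    char16_t data[N];
    constexpr std::size_t size() const { return N - 1; }
};

// Copies the literal element by element via an index pack.
template <std::size_t N, std::size_t... I>
constexpr StaticStringData<N> makeStaticStringImpl(const char16_t (&s)[N],
                                                   std::index_sequence<I...>)
{
    return {{s[I]...}};
}

// N is deduced from the array reference, so callers never spell it out.
template <std::size_t N>
constexpr StaticStringData<N> makeStaticString(const char16_t (&s)[N])
{
    return makeStaticStringImpl(s, std::make_index_sequence<N>{});
}

// Usable directly at namespace scope, with no enclosing function or lambda.
constexpr auto greeting = makeStaticString(u"foo");
static_assert(greeting.size() == 3, "N is deduced from the literal");
```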

> On 13/10/15 23:01, "Thiago Macieira"  wrote:
> >On Tuesday 13 October 2015 22:46:36 Marc Mutz wrote:
> >> Q: What mistakes do you refer to?
> >> 
> >> A: The fact that it has copy ctor and assignment operator, so it's not a
> >> trivially-copyable type and thus cannot efficiently be passed by-value. It
> >> may also be too large for pass-by-value due to the rather useless QString
> >> pointer (should have been QStringData*, if any). Neither can be fixed
> >> before Qt 6.
> >
> >Not even in Qt 6. The reason why it uses a QString pointer is that it
> >follows the QString through reallocations. If the QString is mutated, the
> >QStringRef will still be valid (provided it isn't shortened beyond the
> >substring the QStringRef points to). There's a lot of code that depends on
> >this, so we can't change it.

QString foo = "foo";
QStringRef ref = foo.midRef(1); // ref == "oo";
foo = "bar"; // oops, ref == "ar";

We could change it to hold QString::Data* instead, though, right? And make it 
share ownership of the QString::Data, in which case we have a QString that has 
position and size inline. Or, if it doesn't participate in the ownership, we 
can start returning QStringRef from QStringLiteral(Ref?), killing one major 
QSL problem (out-of-line QString dtor litter).
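
The "follows the string" behaviour under discussion can be modelled without Qt. In this sketch, StringRef and midRef are hypothetical stand-ins that keep a pointer to the owning string object (as QStringRef does with its QString pointer) rather than to its character buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// A reference to a substring that tracks the owning string object.
struct StringRef {
    const std::u16string *string; // the owner itself, not its buffer
    std::size_t position;
    std::size_t size;

    std::u16string toString() const { return string->substr(position, size); }
};

// Refers to everything from 'position' to the end of 's'.
inline StringRef midRef(const std::u16string &s, std::size_t position)
{
    return StringRef{&s, position, s.size() - position};
}
```

Because the reference goes through the owner, it stays valid across reallocation, but it also silently picks up whatever the owner holds next: after assigning u"bar" to a string that held u"foo", a ref created with midRef(foo, 1) reads u"ar" instead of u"oo", mirroring the snippet above.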

> Only by deprecating 

Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Bubke Marco
Hi Lars

Knoll Lars 
> Agree here as well. We can’t make QString utf-8 backed without breaking
> way too much code. I also don’t see the need for it. The native encoding
> on Windows and Mac (Cocoa) is utf-16 as well, on Linux it’s utf-8. So no
> matter which platform we’re on, we won’t avoid some conversions.

By native, do you mean the OS APIs? There are many other APIs that
prefer UTF-8 for performance and/or size reasons, like databases.
Most text from the web is in UTF-8 because the overhead for Chinese characters
is still lower than the savings for the embedded tags around them.
I don't think we should orient ourselves on the OS APIs but rather on the most
performance-demanding ones. So why do we not provide a QUtf8String and use it,
for example, in networking? We don't need to change everything at once, but
we should provide UTF-8 support so that our users do not have to reinvent
the wheel again and again like we do in Creator.



Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Thiago Macieira
On Wednesday 14 October 2015 17:55:34 Bubke Marco wrote:
> Think about a locale-aware compare which is called very, very often. You don't
> want malloc in between. In most cases you get a const char* or const
> short*. In these cases it would be nice if your interface would support UTF-8
> and not only UTF-16.

Three of the four implementations of QString::localeAwareCompare operate on 
UTF-16 (Win32 CompareStringW, CoreFoundation's CFStringCompare and ICU 
ucol_strcoll). That's another reason for keeping QString as UTF-16.

I don't think any of those even allocates memory, but it's impossible to tell 
for sure with the CoreFoundation API.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center



Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread André Pönitz
On Wed, Oct 14, 2015 at 06:37:19AM +, Knoll Lars wrote:
> Agree here as well. We can’t make QString utf-8 backed without breaking
> way too much code. I also don’t see the need for it. The native encoding
> on Windows and Mac (Cocoa) is utf-16 as well, on Linux it’s utf-8. So no
> matter which platform we’re on, we won’t avoid some conversions.

I am afraid that "the native encoding on Windows and Mac (Cocoa)
is UTF-16" argument does not carry much weight in my daily work.

I read/write files a lot, talk to processes/services/whatever.
Almost all of that uses some 8-bit encoding, often enough something
compatible with UTF-8, also on Windows and Mac. Even small fry like
settings keys is usually plain English 7-bit clean.

QString's UTF-16 is pretty much the antithesis of a good compromise in
that area. It generates line noise in the sources and wastes cycles at
runtime.

> And I will strongly oppose any attempts to make QString some sort of
> hybrid supporting both. The added complexity in maintaining the code base
> is simply not worth it.

I don't think a hybrid would be better, either. But that is not
part of this RFC.

I think Marc's proposal of using *View classes in interfaces has some merits.
How much exactly I am unsure about. I only know that the (non-)performance
of QString based interfaces has bitten me often enough to justify at least
some experiments.


That's why I'd like to propose the following: 

Since experiments within Qt proper are difficult due to the BC
and SC guarantees we give and the practical impossibility to un-do
additions we should simply not do it there.

Instead, we could (and should) use part of Qt Creator's code base,
specifically some of 'leaf' plugins (i.e. plugins with no known
downstream users), to play with the idea, and develop a solid
understanding of the pros and cons of the idea of using *View
classes in interfaces until Qt 6 comes.

The way forward could be to add e.g. 'Utils::[Q]StringView'
and 'Utils::[Q]ByteArrayView' in implementation src/libs/utils
and start using these in a few 'harmless' plugins.

The advantages here are fewer restrictions due to lower compatibility
guarantees, fewer restrictions imposed by older compilers, less harm
done if the experiment fails (i.e. if the *Views turn out to not be
beneficial) and generally more flexibility when e.g. comparing competing
implementations.

Opinions?

Andre'


Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Thiago Macieira
On Wednesday 14 October 2015 22:56:15 André Pönitz wrote:
> > I think there’s actually quite a few of those. In addition, it might be
> > tricky to use QStringView in signals and slots.
> 
> One could try to be clever and go through an intermediate QString
> object at least in queued connections. Or even always. 

-2 on any signal-slot special-casing for some types.

It's possible that the string in question *is* retained and there's no need to 
copy. We can't know that in QObject::activate, so we shouldn't try.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center



Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Thiago Macieira
On Wednesday 14 October 2015 21:51:23 Bubke Marco wrote:
> On October 14, 2015 23:10:26 Thiago Macieira wrote:
> > Do it on your own. You just said that ICU has the function you want, so
> > use
> > it.
> 
> So Qt is always shipping with ICU?

It can be disabled on Windows. On OS X there's no point since it's part of the 
system. On Linux, if you disable it, you're going to have some other features 
reduced, so don't disable it.

> > Qt does not have to provide a comparator that operates on something other
> > than its native string type.
> 
> Isn't Qt a framework to help developers? Sorry, your argumentation does not
> sound very empirical.

Yes, it is. But Qt's goal is not to support every single use-case and
corner-case out there. Qt should make 90% easy and 9% possible. That means
there's a 1% of the realm of possibilities that Qt does not address. If your
use-case falls into this group, use the fact that Qt is native code and just
call other libraries.

That's one of the two main advantages of native code. There's no sandbox to 
escape from.

Qt already supports doing locale-aware comparison. We even have a class for 
it, so it can be done efficiently: QCollator and it supports our native string 
type (QString).

Providing extra support for a character encoding that is not what QString uses 
falls in that 1%. Just use ICU.

> >> Maybe windows and mac os will bring support to the standard library so we
> >> don't need it but in the mean time it would be very helpful.
> >> 
> >> A utf 8 based QTextDocument would be maybe nice too.
> > 
> > What for? It needs to keep a lot of extra structures, so the cost of
> > conversion and extra memory is minimal. And besides, QTextDocument really
> > needs a seekable string, not UTF-8.
> 
> Is UTF 16 seekable? You still have surrogates and you can merge code
> points.

Seekable enough. It's much easier to deal with than UTF-8. A surrogate pair, 
as its name says, appears *only* in pairs, so you always know if you're on the 
first or on the second. Moreover, all living languages are encoded in the Basic 
Multilingual Plane, so no surrogate pairs are required for any of them. 
Handling of surrogate pairs can be moved to non-critical codepaths.
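
The "you always know if you're on the first or on the second" property is easy to show in code. The sketch below uses standard C++ only; the helper names are made up for this example, and the bit masks follow the UTF-16 surrogate ranges (0xD800..0xDBFF for the first unit, 0xDC00..0xDFFF for the second).

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// First half of a surrogate pair: 0xD800..0xDBFF.
constexpr bool isHighSurrogate(char16_t unit)
{
    return (unit & 0xFC00) == 0xD800;
}

// Second half of a surrogate pair: 0xDC00..0xDFFF.
constexpr bool isLowSurrogate(char16_t unit)
{
    return (unit & 0xFC00) == 0xDC00;
}

// Snap an arbitrary code-unit index to the start of its code point.
// At most one step back is ever needed, and only the unit at the index
// and its predecessor have to be inspected.
inline std::size_t codePointStart(std::u16string_view text, std::size_t index)
{
    if (index > 0 && index < text.size()
        && isLowSurrogate(text[index]) && isHighSurrogate(text[index - 1]))
        return index - 1;
    return index;
}
```

In UTF-8, by comparison, the same operation may have to step back over up to three continuation bytes.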

As for combining code points, that's something different and usually one or 
more layers removed from the seeking, along-side zero- and full-width code 
points. QTextDocument also handles fonts with variable width glyphs, so you 
can never simply convert a byte index to pixel just like that. (not to mention 
those pesky line breaks...)

> Let's describe an example. I send the QTextDocument content to a library
> which expects UTF-8 content and gives me back positions. This gets
> interesting if you use non-ASCII characters. Actually, the new clang code model
> works that way.

That example shows how UTF-16 is better. See above on seekability of UTF-16 vs 
UTF-8.

The solution for this is to fix the library to accept UTF-16. When we were 
doing Qt 5.0, we needed PCRE to support UTF-16. Their developers were very 
welcoming and wrote the version that supports UTF-16, so Qt does not need to 
reallocate.
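
When the library cannot be changed, the positions it reports have to be mapped back by the caller. The following standard-C++ sketch shows one way to translate a UTF-8 byte offset into a UTF-16 code-unit index; the function name is hypothetical, and the code assumes well-formed UTF-8 and an offset that falls on a character boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Counts how many UTF-16 code units precede the given UTF-8 byte offset.
inline std::size_t utf16IndexFromUtf8Offset(std::string_view utf8,
                                            std::size_t byteOffset)
{
    std::size_t units = 0;
    for (std::size_t i = 0; i < byteOffset && i < utf8.size(); ++i) {
        const unsigned char byte = static_cast<unsigned char>(utf8[i]);
        if ((byte & 0xC0) == 0x80)
            continue;                    // continuation byte, already counted
        units += (byte >= 0xF0) ? 2 : 1; // 4-byte sequences need a surrogate pair
    }
    return units;
}
```

The scan is linear in the offset, so an editor that does this for many positions would want to cache per-line results; that detail is left out here.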

> > Even if we provide UTF-8 support classes, those will not propagate to the
> > GUI. Forget it.
> 
> What about compressing UTF 16 like Python is doing it for UTF 32. If you are
> only using ASCII you set a flag and you can remove all those useless zeros.
> It would have implications for data() but maybe we should not provide
> access to the internal representation. If you use UTF 32 as a base you
> don't need surrogates anymore.

That's what Lars called a "hybrid solution" and vetoed. I second that.

Way too much code would break if we did that because we allow people access to 
the data pointer in QString and to iterate directly (std::{,w,u16}string don't 
allow that, which makes parsing them actually a lot more cumbersome).

As for UTF-32/UCS-4, it occupies twice as much space as it needs for all text 
written with living languages.
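
The factor of two is simple arithmetic, shown here as a standard-C++ sketch with invented helper names. It assumes the usual 2-byte char16_t and 4-byte char32_t.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Storage needed for the code units of a UTF-16 string.
inline std::size_t bytesUtf16(const std::u16string &s)
{
    return s.size() * sizeof(char16_t);
}

// Storage needed for the code units of a UTF-32 string.
inline std::size_t bytesUtf32(const std::u32string &s)
{
    return s.size() * sizeof(char32_t);
}
```

For text that stays inside the Basic Multilingual Plane, every character is one unit in either encoding, so a five-character word takes 10 bytes in UTF-16 and 20 bytes in UTF-32.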

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center



Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Bubke Marco
On October 14, 2015 22:13:11 Thiago Macieira  wrote:

> On Wednesday 14 October 2015 17:55:34 Bubke Marco wrote:
>> Think about a locale-aware compare which is called very, very often. You don't
>> want a malloc in between. In most cases you get a const char* or const
>> short* in these cases. It would be nice if your interface would support UTF-8
>> and not only UTF-16.
>
> Three of the four implementations of QString::localeAwareCompare operate on 
> UTF-16 (Win32 CompareStringW, CoreFoundation's CFStringCompare and ICU 
> ucol_strcoll). That's another reason for keeping QString as UTF-16.
>

Thiago, to my understanding ICU supports UTF-8 too. I'm not asking for UTF-8 
support because I like it, but because I need it. And I don't want a 
UTF-8-backed QString. For my use cases implicit sharing is overkill. Move 
semantics would be enough. I want localeAwareCompare(const char *s1, const 
char *s2). Maybe Windows and Mac OS will bring support to the standard library 
so we don't need it, but in the meantime it would be very helpful.

A UTF-8-based QTextDocument would maybe be nice too.




Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Thiago Macieira
On Wednesday 14 October 2015 20:52:12 Bubke Marco wrote:
> On October 14, 2015 22:13:11 Thiago Macieira wrote:
> > On Wednesday 14 October 2015 17:55:34 Bubke Marco wrote:
> >> Think about a locale-aware compare which is called very, very often. You
> >> don't want a malloc in between. In most cases you get a const char* or
> >> const short* in these cases. It would be nice if your interface would
> >> support UTF-8 and not only UTF-16.
> > 
> > Three of the four implementations of QString::localeAwareCompare operate
> > on UTF-16 (Win32 CompareStringW, CoreFoundation's CFStringCompare and ICU
> > ucol_strcoll). That's another reason for keeping QString as UTF-16.
> 
> Thiago, to my understanding ICU supports UTF-8 too. I'm not asking for
> UTF-8 support because I like it, but because I need it.

There's ucol_strcollUTF8 since ICU 50, indeed. Quite a few systems are still 
running older versions today, but that wouldn't be an argument for Qt 6.

> And I don't want a UTF-8-backed
> QString. For my use cases implicit sharing is overkill. Move semantics
> would be enough. I want localeAwareCompare(const char *s1, const char *s2).

Do it on your own. You just said that ICU has the function you want, so use 
it.

Qt does not have to provide a comparator that operates on something other than 
its native string type.

> Maybe Windows and Mac OS will bring support to the standard library so we
> don't need it, but in the meantime it would be very helpful.
> 
> A UTF-8-based QTextDocument would maybe be nice too.

What for? It needs to keep a lot of extra structures, so the cost of 
conversion and extra memory is minimal. And besides, QTextDocument really 
needs a seekable string, not UTF-8.

Even if we provide UTF-8 support classes, those will not propagate to the GUI. 
Forget it.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center



Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Bubke Marco
On October 14, 2015 23:10:26 Thiago Macieira  wrote:

> On Wednesday 14 October 2015 20:52:12 Bubke Marco wrote:
>> On October 14, 2015 22:13:11 Thiago Macieira wrote:
>> And I don't want a UTF-8-backed
>> QString. For my use cases implicit sharing is overkill. Move semantics
>> would be enough. I want localeAwareCompare(const char *s1, const char *s2).
>
> Do it on your own. You just said that ICU has the function you want, so use 
> it.

So does Qt always ship with ICU?

> Qt does not have to provide a comparator that operates on something other 
> than its native string type.

Isn't Qt a framework to help developers? Sorry, your argumentation does not 
sound very empirical.

>
>> Maybe Windows and Mac OS will bring support to the standard library so we
>> don't need it, but in the meantime it would be very helpful.
>> 
>> A UTF-8-based QTextDocument would maybe be nice too.
>
> What for? It needs to keep a lot of extra structures, so the cost of 
> conversion and extra memory is minimal. And besides, QTextDocument really 
> needs a seekable string, not UTF-8.

Is UTF-16 seekable? You still have surrogates, and you can merge code 
points.

Let's describe an example. I send the QTextDocument content to a library which 
expects UTF-8 content and gives me back positions. This gets interesting if you 
use non-ASCII characters. Actually the new clang code model works that way.
>
> Even if we provide UTF-8 support classes, those will not propagate to the 
> GUI. Forget it.

What about compressing UTF-16 like Python does for UTF-32? If you are only 
using ASCII you set a flag and you can remove all those useless zeros. It 
would have implications for data(), but maybe we should not provide access to 
the internal representation. If you use UTF-32 as a base you don't need 
surrogates anymore.


Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread André Pönitz
On Wed, Oct 14, 2015 at 11:15:55AM +, Knoll Lars wrote:
> >That leaves classes which simply store the string. You cited QUrl. I
> >don't see a problem providing QString overloads for these, esp.
> >considering that we're starting out with an all-QString API here. Then
> >again, once we have QStringView overloads, we can simply disable the
> >QString overloads and see the effect.
> 
> I think there’s actually quite a few of those. In addition, it might be
> tricky to use QStringView in signals and slots.

One could try to be clever and go through an intermediate QString
object at least in queued connections. Or even always. 

> [...]
> Of course we don’t know all its uses. But many uses outside of QtCore are
> clearly less critical. QLineEdit::setText is clearly not called in tight
> loops, and once you set the text it has to do lots of other work. There
> are many similar APIs in Qt, where I don’t think we’ll ever see a benefit
> of a QStringView, and the simplicity of passing in a const QString ref is
> probably preferable.

Right. OTOH there are instances where it provably *does* matter, e.g.
everything in the vicinity of QFileInfo, or:

> >Take QDateTime as a warning.

...

> I am certainly in favor of experimenting with this. Let’s start in a
> branch or behind an ifdef.

Or in a safer place. See my other mail.

Andre'



Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Allan Sandfeld Jensen
On Tuesday 13 October 2015, Marc Mutz wrote:
> Hi,
> 
> After looking quite a bit into the current state of string handling in Qt
> for my QtWS talk last week, I have become frustrated by the state of
> string handling in Qt.
> 
> We have such powerful tools for string handling (QStringRef,
> QStringBuilder), but all APIs outside QString and its immediate
> surroundings only deal in QString. The correct way would be to overload
> every function taking QString with QLatin1String and QStringRef versions,
> and then, for some other rare cases, const QChar *, int size. Let alone
> std::basic_string.
> 
> I would therefore like to propose to abandon QString for new API (and over
> time phase it out of existing API), and only provide (const QChar*, size_t)
> as the most general form. I would propose to package the two into a class,
> called - you guessed it - QStringView.
> 
Why not a QCharArray? With external data constructor, that should be the same, 
shouldn't it?

Anyway, I doubt this is really something that needs optimizing, QString is 
neat because it is simple and easy to remember.  If anything we need to use 
QByteArray in more places where QStrings are only 8-bit strings.

`Allan


Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Marc Mutz
On Thursday 15 October 2015 00:27:14 Thiago Macieira wrote:
> Way too much code would break if we did that because we allow people access
> to  the data pointer in QString and to iterate directly
> (std::{,w,u16}string don't allow that, which makes parsing them actually a
> lot more cumbersome).

Just chiming in to say: It does: 
http://en.cppreference.com/w/cpp/string/basic_string/data

-- 
Marc Mutz  | Senior Software Engineer
KDAB (Deutschland) GmbH & Co.KG, a KDAB Group Company
Tel: +49-30-521325470
KDAB - The Qt Experts


Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Marc Mutz
Hi Andre,

On Wednesday 14 October 2015 22:37:01 André Pönitz wrote:
> That's why I'd like to propose the following: 
> 
> Since experiments within Qt proper are difficult due to the BC
> and SC guarantees we give and the practical impossibility to un-do
> additions we should simply not do it there.
> 
> Instead, we could (and should) use part of Qt Creator's code base,
> specifically some of 'leaf' plugins (i.e. plugins with no known
> downstream users), to play with the idea, and develop a solid
> understanding of the pros and cons of the idea of using *View
> classes in interfaces until Qt 6 comes.
> 
> The way forward could be to add e.g. 'Utils::[Q]StringView'
> and 'Utils::[Q]ByteArrayView' in implementation src/libs/utils
> and start using these in a few 'harmless' plugins.
> 
> The advantages here are less restrictions due to lower compatibility
> guarantees, less restrictions imposed by older compilers, less harm
> done if the experiment fails (i.e. if the *Views turn out to not be
> beneficial) and generally more flexibility when e.g. comparing competing
> implementations.
> 
> Opinions?

I disagree that QtC is a better place to try out QStringView.

The user base of Qt APIs is orders of magnitude larger than that of QtC APIs, 
and we should encourage outside experiments, not prevent them.

Just like QStringBuilder, we can make QStringView opt-in for now (which means 
providing a QString overload :( - but we can start with existing API).

Thanks,
Marc

-- 
Marc Mutz  | Senior Software Engineer
KDAB (Deutschland) GmbH & Co.KG, a KDAB Group Company
Tel: +49-30-521325470
KDAB - The Qt Experts


Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Oswald Buddenhagen
On Wed, Oct 14, 2015 at 11:15:55AM +, Knoll Lars wrote:
> >> >> A: Once QString is backed by UTF-8, [...]
> 
> It’s worthwhile discussing, but any such change would have huge
> implications on our QString API.
> 
indeed.

> In any case, it’s nothing we can do in Qt 5.
>
i don't think this is true. see
http://lists.qt-project.org/pipermail/development/2015-February/020111.html
and the surrounding discussion.



Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Thiago Macieira
On Thursday 15 October 2015 02:22:50 Marc Mutz wrote:
> On Thursday 15 October 2015 00:27:14 Thiago Macieira wrote:
> > Way too much code would break if we did that because we allow people
> > access to the data pointer in QString and to iterate directly
> > (std::{,w,u16}string don't allow that, which makes parsing them actually a
> > lot more cumbersome).
> 
> Just chiming in to say: It does:
> http://en.cppreference.com/w/cpp/string/basic_string/data

Ah, right. The mutable pointer is the one missing...
-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center



Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Bubke Marco
Marc Mutz 
> I'm not optimising. I'm decoupling the concept of a "QString" from the owning
> implementation "QString", so that we don't need to either convert from/to
> QString quite so often or you can use "foreign types"
> (std::basic_string, char16_t[], ...) in lieu of QString. That is
> important when you need to interface with 3rd-party libraries.

Think about a locale-aware compare which is called very, very often. You don't 
want a malloc in between. In most cases you get a const char* or const short* 
in these cases. It would be nice if your interface would support UTF-8 and not 
only UTF-16.

Incorporating ideas of http://utfcpp.sourceforge.net/ could be useful.


Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Matthew Woehlke
On 2015-10-14 07:15, Knoll Lars wrote:
> In addition, it might be tricky to use QStringView in signals and
> slots.

As I previously stated, I'm pretty sure you *CAN'T* use QStringView to
call slots, except for direct call. In any other case, you risk the
backing data being modified or, worse, deallocated, before the slot
dispatches (this is *especially* dangerous with cross-thread dispatch,
since now you have thread safety to worry about). The only way around
that would be for QStringView to take a COW reference to the underlying
data.

We already have a class like that. It's called QString.

What *might* work is if the event dispatcher, when it makes copies of
the arguments, makes a deep copy of QStringView into a QString. I'm not
sure if this is possible, though, and anyway then you're in the same
boat of making a (potentially) unnecessary deep copy if you had a
QString in the first place.
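
The promotion idea could be sketched roughly as follows, in plain C++ rather than Qt's real event dispatcher. TextView and CallQueue are hypothetical names invented for this illustration. The point is only that the characters are copied into an owning string when the call is posted, so the deferred call never reads the caller's buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>

// Hypothetical non-owning view: just a pointer and a length.
struct TextView {
    const char *data;
    std::size_t size;
};

// Hypothetical stand-in for a queued-connection dispatcher.
class CallQueue {
public:
    // Promote the view to an owning std::string at post() time. The deep
    // copy happens here, while the caller's buffer is still valid.
    void post(TextView text, std::function<void(const std::string &)> slot)
    {
        assert(text.data != nullptr || text.size == 0);
        std::string owned(text.data, text.size);
        pending_.push([owned, slot] { slot(owned); });
    }

    // Run every queued call; the original buffers may be long gone by now.
    void dispatch()
    {
        while (!pending_.empty()) {
            std::function<void()> call = pending_.front();
            pending_.pop();
            call();
        }
    }

private:
    std::queue<std::function<void()>> pending_;
};
```

The cost is the one described above: if the sender already held a shared, owning string, this sketch still pays for a full copy.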

-- 
Matthew



Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-14 Thread Matthew Woehlke
On 2015-10-14 06:16, Marc Mutz wrote:
> First, afaiu from what Thiago mentions in reviews, Q6String will have SSO 
> (small-string-optimisation) which makes many short strings expensive to copy 
> (if you think that copying 24 bytes is slower than upping an atomic int 
> through an indirection) or cheap to copy (if you think the opposite). In any 
> case, small strings will be very cheap to create (no allocation), so for many 
> strings there will be not much difference between passing a QStringView or 
> passing a QString.

Atomic operations are expensive (I think I heard once 'on the order of
100 instruction cycles', but that's highly apocryphal), mainly I would
guess due to the need to maintain cache coherency. A small copy might
happen entirely in local hot cache. 24 bytes is a whole three registers
on a modern 64-bit machine. That's probably not going to be very slow.

(Mind, atomics still blow full mutexes out of the water, but they're
still an order of magnitude slower than small stack allocations and most
single machine instructions.)

>> Yes, signed please. We can discuss whether it should be 64bit for Qt 6.
> 
> The current std API uses size_t. Do you (= both of you) expect that ever to 
> change? If it doesn't, Qt will forever be the odd one out, until we finally 
> drop QVector etc for std::vector etc and then porting will be a horror 
> because of MSVC's annoying warnings.

STL should change. In Qt and Python, you can use negative indices to
refer to a distance (length) relative to the end (length) of the string.
In STL you can't do that, which is a significant limitation by
comparison. Please don't drop this useful functionality!
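
As a small illustration of why a signed index type matters for this, here is a sketch of a Python-style lookup where a negative index counts from the end. The function charAt is a hypothetical helper written for this example, not an existing Qt or standard library call.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: index -1 refers to the last character, -2 to the one
// before it, and so on. This only works because the index type is signed.
char charAt(const std::string &s, std::ptrdiff_t index)
{
    const std::ptrdiff_t size = static_cast<std::ptrdiff_t>(s.size());
    if (index < 0)
        index += size; // translate "distance from the end" to a position
    assert(index >= 0 && index < size);
    return s[static_cast<std::size_t>(index)];
}
```

With an unsigned size_t parameter, the value -1 would silently become a huge positive number, and the function could not tell the two cases apart.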

> array_view cannot compete with QByteArray's API. E.g. there's no toInt().

...and it *shouldn't*. Never mind that you're talking about a function
that deals with *strings*, it's debatable whether that sort of thing
belongs as class methods at all. Anyway, they aren't "missing" in the
standard library; they're free functions.
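
For illustration, a conversion in the free-function style might look like the sketch below. It wraps std::strtol from the C++ standard library; the wrapper name parseInt and its bool-plus-out-parameter shape are assumptions made for this example.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical free function: parse the whole string as a base-10 integer.
// Returns false on empty input, trailing garbage, or out-of-range values.
bool parseInt(const std::string &text, long *out)
{
    assert(out != nullptr);
    if (text.empty())
        return false;
    errno = 0;
    char *end = nullptr;
    const long value = std::strtol(text.c_str(), &end, 10);
    if (errno == ERANGE || end != text.c_str() + text.size())
        return false; // overflow, or not every character was consumed
    *out = value;
    return true;
}
```

Because the function is not a member, it works with any type that can hand over its characters, without that type having to grow a toInt() method.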

(That said, the CSL could use better flavors, and there was some talk of
that, but AFAIK it didn't get anywhere. I can pretty well guarantee you
the committee isn't going to be adding that sort of thing to array_view,
or even string_view, any time soon.)

-- 
Matthew



Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-13 Thread Jake Petroules

> On Oct 13, 2015, at 1:46 PM, Marc Mutz  wrote:
> 
> Hi,
> 
> After looking quite a bit into the current state of string handling in Qt for 
> my QtWS talk last week, I have become frustrated by the state of string 
> handling in Qt.
> 
> We have such powerful tools for string handling (QStringRef, QStringBuilder), 
> but all APIs outside QString and its immediate surroundings only deal in 
> QString. The correct way would be to overload every function taking QString 
> with QLatin1String and QStringRef versions, and then, for some other rare 
> cases, const QChar *, int size. Let alone std::basic_string.
> 
> I would therefore like to propose to abandon QString for new API (and over 
> time phase it out of existing API), and only provide (const QChar*, size_t) 
> as 
> the most general form. I would propose to package the two into a class, 
> called 
> - you guessed it - QStringView.
> 
> =FAQ=
> 
> Q: Why not just use QStringRef?
> 
> A: QStringRef is tied to QString. E.g. you can't create a QStringRef from a 
> pair of QChar*, int. It also is kind of stuck in historic mistakes making it 
> undesireable as a cheap-to-pass parameter type.
> 
> Q: What mistakes do you refer to?
> 
> A: The fact that it has copy ctor and assignment operator, so it's not a 
> trivally-copyable type and thus cannot efficiently passed by-value. It may 
> also 
> be too large for pass-by-value due to the rather useless QString pointer 
> (should have been QStringData*, if any). Neither can be fixed before Qt 6.
> 
> Q: Why size_t?
> 
> A: The intent of QStringView (and std::experimental::string_view) is to act 
> as 
> an interface between modules written with different compilers and different 
> flags. A std::string will never be compatible between compilers or even just 
> different flags, but a simple struct {char*, size_t} will always be, by way 
> of 
> it's C compatibility.
> 
> So the goal is not just to accept QString, QStringRef, and (QChar*,int) (and 
> QVarLengthArray!) as input to QStringView, but also 
> std::basic_string and std::vector.
> 
> Q: What about the plans to make QString UTF-8-backed?
> 
> A: QStringView-using code will need to be ported just as QString-using code 
> will.
> 
> Q: What future do you have in mind for QStringRef?
> 
> A: None in particular, though I have found a need for an owning QStringRef in 
> some places. But I expect Qt 6' QString to be able to provide a restricted 
> view on shared data, such that it would subsume QStringRef completely.
> 
> Q: What about QLatin1String?
> 
> A: Once QString is backed by UTF-8, latin-1 ceases to be a special charset. 
> We 
> might want something like QUsAsciiString, but it would just be a UTF-8 
> string, 
> so it could be packed into QStringView.
> 
> Q: What about QByteArray, QVector?
> 
> A: I'm unsure about QByteArrayView. It might not pull its weight compared to 
> std::(experimental::)string_view, but I also note that we're currently 
> missing 
> a QByteArrayRef, so a QBAView might make sense while we wait for the std one 
> to become available to us.
> 
> I'm actively opposed to a QArrayView, because I don't think it provides us 
> with anything std::(experimental::)array_view doesn't already.
> 
> Q: What about a rope?
> 
> A: A rope is a more complex string that can provide complex views on existing 
> data as well as store rules for generating stretches of data (as opposed to 
> the data itself).
> 
> A rope is a very complex data structure and would not work as a universal 
> interface type. It would be cool if Qt had a rope, but that is outside the 
> scope of my proposal.
> 
> Q: What do you mean when you say "abandon QString"?
> 
> A: I mean that functions should not take QStrings as arguments, but 
> QStringViews. Then users can transparently pass QString, QStringRef and any 
> of 
> a number of other "string" types without overloading the function on each of 
> them.
> 
> I do not mean to abandon QString, the class. Only QString, the interface type.
> 
> Q: What API should QStringView have?
> 
> A: Since it's mainly an interface type, it should have implicit conversions 
> from all kinds of "string" types, but explicit conversion _to_ those string 
> types. It should carry all the API from QString that can be implemented on 
> just a (QChar*, size_t) (e.g. trimmed(), left(), mid(), section(), split(), 
> but not append(), replace() (except maybe the (QChar,QChar) overload. 
> Corresponding QString/Ref API could (eventually) just forward to the 
> QStringView one.
> 
> Thanks, now fire away,
> Marc
> 
> -- 
> Marc Mutz  | Senior Software Engineer
> KDAB (Deutschland) GmbH & Co.KG, a KDAB Group Company
> Tel: +49-30-521325470
> KDAB - The Qt Experts


In general this sounds like a dangerous idea because it carries over all the 

Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-13 Thread Bubke Marco
I like the idea of dividing the jobs of manipulating data and sending data 
around into different classes. Many times I get strings from different sources 
in different formats with different ownership. And for performance reasons you 
don't want to copy or convert those strings. Many sources like databases 
provide UTF-8 for performance reasons, so we should definitely support it. How 
do you want to handle ownership? I think it should not be included in the 
type. What about move semantics? If the data is moved around, it doesn't need 
to be copied before it can be manipulated.


[Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-13 Thread Marc Mutz
Hi,

After looking quite a bit into the current state of string handling in Qt for 
my QtWS talk last week, I have become frustrated by the state of string 
handling in Qt.

We have such powerful tools for string handling (QStringRef, QStringBuilder), 
but all APIs outside QString and its immediate surroundings only deal in 
QString. The correct way would be to overload every function taking QString 
with QLatin1String and QStringRef versions, and then, for some other rare 
cases, const QChar *, int size. Let alone std::basic_string.

I would therefore like to propose to abandon QString for new API (and over 
time phase it out of existing API), and only provide (const QChar*, size_t) as 
the most general form. I would propose to package the two into a class, called 
- you guessed it - QStringView.

=FAQ=

Q: Why not just use QStringRef?

A: QStringRef is tied to QString. E.g. you can't create a QStringRef from a 
pair of QChar*, int. It also is kind of stuck in historic mistakes making it 
undesirable as a cheap-to-pass parameter type.

Q: What mistakes do you refer to?

A: The fact that it has copy ctor and assignment operator, so it's not a 
trivially-copyable type and thus cannot efficiently be passed by-value. It may 
also be too large for pass-by-value due to the rather useless QString pointer 
(should have been QStringData*, if any). Neither can be fixed before Qt 6.

Q: Why size_t?

A: The intent of QStringView (and std::experimental::string_view) is to act as 
an interface between modules written with different compilers and different 
flags. A std::string will never be compatible between compilers or even just 
different flags, but a simple struct {char*, size_t} will always be, by way of 
its C compatibility.

So the goal is not just to accept QString, QStringRef, and (QChar*,int) (and 
QVarLengthArray!) as input to QStringView, but also 
std::basic_string and std::vector.

Q: What about the plans to make QString UTF-8-backed?

A: QStringView-using code will need to be ported just as QString-using code 
will.

Q: What future do you have in mind for QStringRef?

A: None in particular, though I have found a need for an owning QStringRef in 
some places. But I expect Qt 6's QString to be able to provide a restricted 
view on shared data, such that it would subsume QStringRef completely.

Q: What about QLatin1String?

A: Once QString is backed by UTF-8, latin-1 ceases to be a special charset. We 
might want something like QUsAsciiString, but it would just be a UTF-8 string, 
so it could be packed into QStringView.

Q: What about QByteArray, QVector?

A: I'm unsure about QByteArrayView. It might not pull its weight compared to 
std::(experimental::)string_view, but I also note that we're currently missing 
a QByteArrayRef, so a QBAView might make sense while we wait for the std one 
to become available to us.

I'm actively opposed to a QArrayView, because I don't think it provides us 
with anything std::(experimental::)array_view doesn't already.

Q: What about a rope?

A: A rope is a more complex string that can provide complex views on existing 
data as well as store rules for generating stretches of data (as opposed to 
the data itself).

A rope is a very complex data structure and would not work as a universal 
interface type. It would be cool if Qt had a rope, but that is outside the 
scope of my proposal.

Q: What do you mean when you say "abandon QString"?

A: I mean that functions should not take QStrings as arguments, but 
QStringViews. Then users can transparently pass QString, QStringRef and any of 
a number of other "string" types without overloading the function on each of 
them.

I do not mean to abandon QString, the class. Only QString, the interface type.

Q: What API should QStringView have?

A: Since it's mainly an interface type, it should have implicit conversions 
from all kinds of "string" types, but explicit conversion _to_ those string 
types. It should carry all the API from QString that can be implemented on 
just a (QChar*, size_t) (e.g. trimmed(), left(), mid(), section(), split(), 
but not append() or replace(), except maybe the (QChar,QChar) overload). 
Corresponding QString/Ref API could (eventually) just forward to the 
QStringView one.
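
To make the shape of such a type concrete, here is a rough sketch in standard C++ only. It is not the proposed class itself: StringViewSketch, its members and countChar are hypothetical names, char16_t stands in for QChar, and only a couple of the read-only operations are shown. It assumes implicit conversions in, explicit conversion out, and a trivially copyable (pointer, size) layout, as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical sketch of a (pointer, size) string view; not Qt's class.
class StringViewSketch {
public:
    StringViewSketch(const char16_t *data, std::size_t size)
        : m_data(data), m_size(size) {}
    // Implicit conversion *from* an owning string type.
    StringViewSketch(const std::u16string &s)
        : m_data(s.data()), m_size(s.size()) {}
    // Implicit conversion from a literal; N counts the terminating NUL.
    template <std::size_t N>
    StringViewSketch(const char16_t (&literal)[N])
        : m_data(literal), m_size(N - 1) {}

    std::size_t size() const { return m_size; }
    char16_t at(std::size_t i) const { assert(i < m_size); return m_data[i]; }

    // Slicing only adjusts pointer and length; no characters are copied.
    StringViewSketch mid(std::size_t pos, std::size_t n) const
    {
        assert(pos <= m_size);
        const std::size_t rest = m_size - pos;
        return StringViewSketch(m_data + pos, n < rest ? n : rest);
    }
    StringViewSketch left(std::size_t n) const { return mid(0, n); }

    // Explicit conversion *to* an owning type: the only place that copies.
    std::u16string toString() const { return std::u16string(m_data, m_size); }

private:
    const char16_t *m_data;
    std::size_t m_size;
};

// Cheap to pass by value because it is just two trivially copyable words.
static_assert(std::is_trivially_copyable<StringViewSketch>::value,
              "view should be trivially copyable");

// One signature accepts literals, std::u16string and (pointer, size) pairs.
std::size_t countChar(StringViewSketch text, char16_t ch)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < text.size(); ++i)
        if (text.at(i) == ch)
            ++n;
    return n;
}
```

A function like countChar needs no overloads per string type, which is the main benefit claimed for the interface type. The view does not own its data, so it must not outlive the string it was created from.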

Thanks, now fire away,
Marc

-- 
Marc Mutz  | Senior Software Engineer
KDAB (Deutschland) GmbH & Co.KG, a KDAB Group Company
Tel: +49-30-521325470
KDAB - The Qt Experts


Re: [Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

2015-10-13 Thread Matthew Woehlke
On 2015-10-13 15:59, Jake Petroules wrote:
> On Oct 13, 2015, at 1:46 PM, Marc Mutz wrote:
>> I would therefore like to propose to abandon QString for new API (and over 
>> time phase it out of existing API), and only provide (const QChar*, size_t) 
>> as the most general form. I would propose to package the two into a class, 
>> called - you guessed it - QStringView.
> 
> In general this sounds like a dangerous idea because it carries over
> all the old API concepts (i.e. (QChar *, size_t) is an extremely
> broken abstraction). You need to read and truly comprehend
> https://developer.apple.com/swift/blog/?id=30 before suggesting any
> changes to string-related APIs for the next major version of Qt,
> because if anything, THAT is what it should look like. Anything but
> that is a near-useless wrapper around binary data, not a true string
> class.

While I don't necessarily disagree with that article, I think that the
points being made are orthogonal to what Marc is proposing.

The idea of QStringView would, I presume, be similar to that of
std::string_view; namely, to provide an abstraction over a bag of
"characters" (using that term rather loosely). It does NOT in any way
relate to doing any sort of operations (besides slicing) on a "string".
The idea is to be able to inexpensively pass around "text", whether it
comes from QString, QStringRef, wchar_t*, or what have you, without
having to perform superfluous memory allocations to convert to One True
Form (i.e. QString) when the consumer doesn't actually care.

That said... I note that slots probably still need to take QString,
because a queued call with a QStringView is horribly broken (for reasons
which I hope are obvious). At least unless the event dispatcher is
clever enough to promote these to QString in the event.

-- 
Matthew


