Re: [Development] [SPAM] How bad QList really is

2020-05-12 Thread Ville Voutilainen
On Sat, 25 Apr 2020 at 17:45, André Pönitz  wrote:
> What I see here is a general-purpose random-access container with cheaper
> insertion and deletion at front and in the middle than *vector provides for
> 61.3% of the types, augmented by a small-object optimization that kicks in
> with zero overhead for 98.5% of the actually created instances.
>
> If such a container did not exist it would need to be invented.

I wonder, to what extent those properties apply to the proposal for std::colony,
here: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0447r10.html
std::colony doesn't have random access, but it's somewhat telling that there
are attempts to add a standard container that otherwise fills somewhat
similar sweet spot
niches (that might not be small niches) as Qt5's QList does. The
purposes and goals
are a tad different from Qt5's QList, but they certainly reveal
another perspective
according to which vectors and lists and maps don't necessarily cover
all reasonable
uses.
___
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development


Re: [Development] [SPAM] How bad QList really is

2020-04-29 Thread Matthew Woehlke

On 28/04/2020 19.41, Giuseppe D'Angelo via Development wrote:

Il 28/04/20 21:45, Matthew Woehlke ha scritto:

* QList gets adapted so that its internal array allocates 3 *
sizeof(void*) per element, so that e.g. Q6StringList won't require a
per-item allocation. This would then also avoid allocations for other
datatypes, e.g. QStringView, QImage, maybe QVariant and QColor; but of
course waste a ton of space for the ones which remain small (most of Qt
implicitly shared datatypes).

Uh... can't it allocate sizeof(T) if T meets some criteria? IOW, I don't
see the second case penalizing smaller types unless the implementation
is poorly done.


This way of working is the *key* of QList design. QList is always a 
vector of void* (so it's nice for Qt pimpled types).


...?


This allows to type erase the entire management of the vector (it's
always a vector of void*), reducing the amount of template code that
needs to be instantiated.
Okay, I'm going to *assume* that what you're trying to say, poorly, is 
that QList for relocatable T, sizeof(T) ≤ sizeof(void*), is 
implemented in terms of casting the bits of T into an intptr_t, which is 
further mangled into a void*.


Just because that's how it's implemented *now* doesn't preclude a new, 
ABI-breaking version from being implemented as void*[N] for N less than 
some small value. It would increase the code overhead *somewhat*, but 
not unmanageably so (all allowable instances could still be explicitly 
instantiated), and that is probably worth it for the difference in space 
efficiency.


--
Matthew
___
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development


Re: [Development] [SPAM] How bad QList really is

2020-04-28 Thread Giuseppe D'Angelo via Development

Il 28/04/20 21:45, Matthew Woehlke ha scritto:

* QList gets adapted so that its internal array allocates 3 *
sizeof(void*) per element, so that e.g. Q6StringList won't require a
per-item allocation. This would then also avoid allocations for other
datatypes, e.g. QStringView, QImage, maybe QVariant and QColor; but of
course waste a ton of space for the ones which remain small (most of Qt
implicitly shared datatypes).

Uh... can't it allocate sizeof(T) if T meets some criteria? IOW, I don't
see the second case penalizing smaller types unless the implementation
is poorly done.


This way of working is the *key* of QList design. QList is always a 
vector of void* (so it's nice for Qt pimpled types). This allows to type 
erase the entire management of the vector (it's always a vector of 
void*), reducing the amount of template code that needs to be instantiated.


The only type specific operations are setting/retrieving/deleting a 
value in that vector, thus giving us the rule:


- if the type is small and relocatable, just put it in the vector
- otherwise, heap allocate and put the pointer in the vector

There are no exceptions (which is why e.g. Q5List on 32 bits is 
wasteful), again by design. And, because noone is volunteering to do the 
work while QVector is already there...


My 2 c,
--
Giuseppe D'Angelo | giuseppe.dang...@kdab.com | Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, http://www.kdab.com
KDAB - The Qt, C++ and OpenGL Experts



smime.p7s
Description: Firma crittografica S/MIME
___
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development


Re: [Development] [SPAM] How bad QList really is

2020-04-28 Thread Matthew Woehlke

On 28/04/2020 03.56, Giuseppe D'Angelo via Development wrote:
1) First and foremost, in Qt 6 QString, QByteArray, QVector, are bigger 
than a pointer (3 times a pointer size). So, in Qt 6:


* either QList stays unchanged, and now we heap allocate each element 
for those cases too (thus it's necessary to know how the above 
statistics change); or


* QList gets adapted so that its internal array allocates 3 * 
sizeof(void*) per element, so that e.g. Q6StringList won't require a 
per-item allocation. This would then also avoid allocations for other 
datatypes, e.g. QStringView, QImage, maybe QVariant and QColor; but of 
course waste a ton of space for the ones which remain small (most of Qt 
implicitly shared datatypes).


Uh... can't it allocate sizeof(T) if T meets some criteria? IOW, I don't 
see the second case penalizing smaller types unless the implementation 
is poorly done.


--
Matthew
___
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development


Re: [Development] [SPAM] How bad QList really is

2020-04-28 Thread Thiago Macieira
On Tuesday, 28 April 2020 00:56:35 PDT Giuseppe D'Angelo via Development 
wrote:
> 1) First and foremost, in Qt 6 QString, QByteArray, QVector, are bigger
> than a pointer (3 times a pointer size). So, in Qt 6:

And QVariant is 4 pointers, but Qt 5's QVariant is already too big for Qt 5's 
QList.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel System Software Products



___
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development


Re: [Development] [SPAM] How bad QList really is

2020-04-28 Thread Giuseppe D'Angelo via Development

Il 28/04/20 11:16, Kevin Kofler ha scritto:

* or the SSO gets reverted.


It's not SSO.

Is there any actual evidence that it is a win in practice? 


Yes. But it's not SSO.


And on what hardware?


Enough to justify the adoption in all std::string implementations and in 
3rd party ones as well (e.g. Folly).



It is very hardware-dependent whether copying several bytes is
faster than incrementing one atomic. That may well be the case on today's
hardware, but what about tomorrow's? 


Even more.


And what about older hardware that is still in use?


Right, let's establish the foundations for the Qt 6 lifetime (2020-2035) 
by looking at 2002 hardware.



The rule of thumb should be that copying less is better than copying more.


But: is copying less AND allocating better than just copying more?



The confirmation bias comes from the fact that Qt datatypes are designed
to work nicely with QList; this for at least two good reasons:

a) Qt value types are normally pimpled (because of refcounting, or ABI
stability) so they fit exactly in QList's array slots, but user-defined
types aren't necessarily pimpled. Thus the bet: they're almost always
the wrong size for QList.

b) We've spent an awful amount of time reviewing Qt's source code and
tagging down every value type as movable (where it made sense). And I'd
like to extend a big thank you to Sérgio for clazy, Marc for tirelessly
fixing the code, Thiago for giving us relocatable types before Qt 6 in
QVector-but-not-QList-as-that's-BIC. Users simply forget the dreaded
typeinfo macro and so, for their types, QList is always in
array-of-pointers mode no matter the size of the datatype.


I use implicitly-shared types and Q_MOVABLE_TYPE annotations in my code.


Good.

While we're at empirical evidence, do you want evidence of the contrary? 
Endless other users don't. Qt examples contain several instances of 
QList with Foo untagged and too big to fit anyhow.



These can also usually be added later when needed. User code is not bound by
as strict binary compatibility rules as Qt itself.


Can be != will be. Those examples have been around for a decade.


In conclusion, if QList is a good choice for Qt's own types, and even if
we then expose it at the API level as the Qt container of choice, its
only advantage then over QVector is going to be prepending? How often is
that a use case to justify the semantic burden of this extra container?


The prepending optimization alone is a necessary feature. Replacing QList
with an unmodified QVector, without that prepending optimization, is
entirely unacceptable.


And the evidence for this need is?


This cannot be claimed as a closed result: for insertion, it's ignoring
the cost of the individual allocation of the newly inserted item, that
needs to be traded off the moving of more bytes in memory.


That is highly dependent on the hardware, the operating system, and the
malloc implementation (where there is more than one for the operating
system). So the results you obtain will depend on what you run the benchmark
on.


That's why I didn't provide numbers, but still pointed out that the 
result, alone, is wrong.


--
Giuseppe D'Angelo | giuseppe.dang...@kdab.com | Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, http://www.kdab.com
KDAB - The Qt, C++ and OpenGL Experts



smime.p7s
Description: Firma crittografica S/MIME
___
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development


Re: [Development] [SPAM] How bad QList really is

2020-04-28 Thread Kevin Kofler
Giuseppe D'Angelo via Development wrote:
> 1) First and foremost, in Qt 6 QString, QByteArray, QVector, are bigger
> than a pointer (3 times a pointer size). So, in Qt 6:
> 
> * either QList stays unchanged, and now we heap allocate each element
> for those cases too (thus it's necessary to know how the above
> statistics change); or
> 
> * QList gets adapted so that its internal array allocates 3 *
> sizeof(void*) per element, so that e.g. Q6StringList won't require a
> per-item allocation. This would then also avoid allocations for other
> datatypes, e.g. QStringView, QImage, maybe QVariant and QColor; but of
> course waste a ton of space for the ones which remain small (most of Qt
> implicitly shared datatypes).

* or the SSO gets reverted.

Is there any actual evidence that it is a win in practice? And on what 
hardware? It is very hardware-dependent whether copying several bytes is 
faster than incrementing one atomic. That may well be the case on today's 
hardware, but what about tomorrow's? And what about older hardware that is 
still in use?

The rule of thumb should be that copying less is better than copying more.

> The confirmation bias comes from the fact that Qt datatypes are designed
> to work nicely with QList; this for at least two good reasons:
> 
> a) Qt value types are normally pimpled (because of refcounting, or ABI
> stability) so they fit exactly in QList's array slots, but user-defined
> types aren't necessarily pimpled. Thus the bet: they're almost always
> the wrong size for QList.
> 
> b) We've spent an awful amount of time reviewing Qt's source code and
> tagging down every value type as movable (where it made sense). And I'd
> like to extend a big thank you to Sérgio for clazy, Marc for tirelessly
> fixing the code, Thiago for giving us relocatable types before Qt 6 in
> QVector-but-not-QList-as-that's-BIC. Users simply forget the dreaded
> typeinfo macro and so, for their types, QList is always in
> array-of-pointers mode no matter the size of the datatype.

I use implicitly-shared types and Q_MOVABLE_TYPE annotations in my code.

These can also usually be added later when needed. User code is not bound by 
as strict binary compatibility rules as Qt itself.

> In conclusion, if QList is a good choice for Qt's own types, and even if
> we then expose it at the API level as the Qt container of choice, its
> only advantage then over QVector is going to be prepending? How often is
> that a use case to justify the semantic burden of this extra container?

The prepending optimization alone is a necessary feature. Replacing QList 
with an unmodified QVector, without that prepending optimization, is 
entirely unacceptable.

> This cannot be claimed as a closed result: for insertion, it's ignoring
> the cost of the individual allocation of the newly inserted item, that
> needs to be traded off the moving of more bytes in memory.

That is highly dependent on the hardware, the operating system, and the 
malloc implementation (where there is more than one for the operating 
system). So the results you obtain will depend on what you run the benchmark 
on.

Kevin Kofler

___
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development


Re: [Development] [SPAM] How bad QList really is

2020-04-28 Thread Ville Voutilainen
On Tue, 28 Apr 2020 at 10:59, Giuseppe D'Angelo via Development
 wrote:

> > What I see here is a general-purpose random-access container with cheaper
> > insertion and deletion at front and in the middle than *vector provides for
> > 61.3% of the types,
>
> This cannot be claimed as a closed result: for insertion, it's ignoring
> the cost of the individual allocation of the newly inserted item, that
> needs to be traded off the moving of more bytes in memory.
>
>
> Thanks for the scientific approach,

Here's what I'm curious about.  With Qt5 and QList, and with Qt6-like
"QList is QVector", if we run the
scenario André depicted and measure wall clock time, which one wins?
___
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development


Re: [Development] [SPAM] How bad QList really is

2020-04-28 Thread Giuseppe D'Angelo via Development

On 4/25/20 4:49 PM, André Pönitz wrote:

Spam detection software, running on the system "mx.qt-project.org",
has identified this incoming email as possible spam.  The original


[*SIGHS*]



We all know the story that began with

"We knew for a long time that QList is not a good default
container, despite what the documentation claims. The problem
boils down to the fact that for a lot of types T, QList is
needlessly inefficient by allocating elements on the heap and
storing pointers to them instead of storing the elements
in-place, like e.g. QVector does. Sometimes, for large and
complex types, that might be exactly what you want, but a
conservative estimate would put that chance at less than 5%." [1]


I was curious how this "conservative" estimate of "less than 5%" 
manifests in real QList object instances in a real world example.  



My random picks of real world examples usually end up with "Loading
the Qt Creator project in Qt Creator itself", and "statistics" is
something I make up myself, in this case it is looking at QList objects
at destruction time, and sorting them a bit according to size of
the items in the list, number of items still present at that time etc.

Besides being simple to do it's rather close to the typical use I see,
where containers are filled up, accessed/used a few times and destroyed
without much intermediate size changes.


Here is what I get:

* total QList objects created  :51.631.363   
  - destructors that only bump ref :38.015.673  73.6%
  - destructors that actually dealloc  :13.615.690  26.4% 



The  13.615.690  instances with actual deallocations ordered by the 
size of the item type:


size occurences

 4 1.656#  0.01 %   with internal padding
   
 8 13.424.228   # 98.59 %   objects with ideal size, 



 12 3\
 1674.979 |
 2460.560 |
 28   126 |
 3222.358 |
 40 2.054 |
 4818.786 |
 5671 >  # 1.40 %  with indirection
 64 3 |
 7246 |
 80 3.484 |
 96 1 | 
 1127.264 |
 128   52 | 
 1842 |

 2112  17/


From the 8-byte objects we have
 
13.420.883  stored directly  #  99.975 %  of cases

 3.345  stored indirectly#   0.025 %

i.e. in 98.57% of all cases, QList objects behave optimally.


Before we enter confirmation bias here: could you also offer a breakdown 
of the actual types involved? Basically, there's some important things 
to discuss here:


1) First and foremost, in Qt 6 QString, QByteArray, QVector, are bigger 
than a pointer (3 times a pointer size). So, in Qt 6:


* either QList stays unchanged, and now we heap allocate each element 
for those cases too (thus it's necessary to know how the above 
statistics change); or


* QList gets adapted so that its internal array allocates 3 * 
sizeof(void*) per element, so that e.g. Q6StringList won't require a 
per-item allocation. This would then also avoid allocations for other 
datatypes, e.g. QStringView, QImage, maybe QVariant and QColor; but of 
course waste a ton of space for the ones which remain small (most of Qt 
implicitly shared datatypes).



2) Please re-run the count also for the QVector instances in Creator. 
Under the "typical use" defined above, they could be converted to 
QLists, couldn't they? How much would the totals change?



3) How many datatypes are defined by Creator (so they could be 
considered "user defined", for the purposes of this exercise; although I 
believe Creator also employs pimpl quite aggressively to keep BC, or am 
I wrong here?), and how many are coming from Qt?


The confirmation bias comes from the fact that Qt datatypes are designed 
to work nicely with QList; this for at least two good reasons:


a) Qt value types are normally pimpled (because of refcounting, or ABI 
stability) so they fit exactly in QList's array slots, but user-defined 
types aren't necessarily pimpled. Thus the bet: they're almost always 
the wrong size for QList.


b) We've spent an awful amount of time reviewing Qt's source code and 
tagging down every value type as movable (where it made sense). And I'd 
like to extend a big thank you to Sérgio for clazy, Marc for tirelessly 
fixing the code, Thiago for giving us relocatable types before Qt 6 in 
QVector-but-not-QList-as-that's-BIC. Users simply forget the dreaded 
typeinfo macro and so, for their types, QList is always in 
array-of-pointers mode no matter the size of the datatype.



This is the area I fundamentally consider the "original sin" more than 
anything else: QList can be a terrible default choice for end users and 
their datatypes, while it's possibly still a good 

Re: [Development] [SPAM] How bad QList really is

2020-04-27 Thread André Somers

Hi,

On 25-04-20 16:49, André Pönitz wrote:

We all know the story that began with

 "We knew for a long time that QList is not a good default
 container, despite what the documentation claims. The problem
 boils down to the fact that for a lot of types T, QList is
 needlessly inefficient by allocating elements on the heap and
 storing pointers to them instead of storing the elements
 in-place, like e.g. QVector does. Sometimes, for large and
 complex types, that might be exactly what you want, but a
 conservative estimate would put that chance at less than 5%." [1]


I was curious how this "conservative" estimate of "less than 5%"
manifests in real QList object instances in a real world example.

My random picks of real world examples usually end up with "Loading
the Qt Creator project in Qt Creator itself", and "statistics" is
something I make up myself, in this case it is looking at QList objects
at destruction time, and sorting them a bit according to size of
the items in the list, number of items still present at that time etc.

Besides being simple to do it's rather close to the typical use I see,
where containers are filled up, accessed/used a few times and destroyed
without much intermediate size changes.


Wouldn't you think that by now most cases where QList was *not* great to 
use have been replaced in QtC already, exactly because it's downsides 
have been known for quite a while now? I am going to assert that will 
skew your statistics, and thus make it basically impossible to draw any 
kind of conclusion from them.


André



___
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development


Re: [Development] [SPAM] How bad QList really is

2020-04-27 Thread Matthew Woehlke

On 25/04/2020 10.49, André Pönitz wrote:

We all know the story that began with

 "We knew for a long time that QList is not a good default
 container, despite what the documentation claims. The problem
 boils down to the fact that for a lot of types T, QList is
 needlessly inefficient by allocating elements on the heap and
 storing pointers to them instead of storing the elements
 in-place, like e.g. QVector does. Sometimes, for large and
 complex types, that might be exactly what you want, but a
 conservative estimate would put that chance at less than 5%." [1]


I was curious how this "conservative" estimate of "less than 5%"
manifests in real QList object instances in a real world example.


(Remainder elided)

Thanks for the analysis! However, it seems the major take-away is that 
"when QList acts like QVector, all is well"? It would be interesting to 
see a comparison that ignores all instances where QList is behaving like 
QVector.


The other problem is that performance isn't the only story. Sometimes, 
reference stability is desired. AFAIK, the only containers in Qt¹ that 
provide this currently are QLinkedList and QMap.


(¹ I speak of Qt6, specifically. In Qt5, QHash could be added to that 
list, but no more.)


--
Matthew
___
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development