[Development] [SPAM] How bad QList really is

André Pönitz Sat, 25 Apr 2020 07:44:44 -0700

Spam detection software, running on the system "mx.qt-project.org",
has identified this incoming email as possible spam.  The original
message has been attached to this so you can view it or label
similar future email.  If you have any questions, see
the administrator of that system for details.


Content preview:  We all know the story that began with "We knew for a long
  time that QList is not a good default container, despite what the 
documentation
   claims. The problem boils down to the fact that for a lot of types T, 
QList<T>
   is needlessly ineffi [...] 

Content analysis details:   (5.3 points, 4.6 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.8 BAYES_50               BODY: Bayes spam probability is 40 to 60%
                            [score: 0.5000]
 0.0 FREEMAIL_FROM          Sender email is commonly abused enduser mail
                            provider (apoenitz[at]t-online.de)
 0.0 SPF_HELO_NONE          SPF: HELO does not publish an SPF Record
-0.0 RCVD_IN_MSPIKE_H2      RBL: Average reputation (+2)
                            [194.25.134.80 listed in wl.mailspike.net]
 0.0 URIBL_BLOCKED          ADMINISTRATOR NOTICE: The query to URIBL was
                            blocked.  See
                            
http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block
                             for more information.
                            [URIs: marcmutz.wordpress.com]
 2.0 SPOOFED_FREEMAIL       No description available.
 2.5 TO_NO_BRKTS_PCNT       To: lacks brackets + percentage

--- Begin Message ---

We all know the story that began with

    "We knew for a long time that QList is not a good default
    container, despite what the documentation claims. The problem
    boils down to the fact that for a lot of types T, QList<T> is
    needlessly inefficient by allocating elements on the heap and
    storing pointers to them instead of storing the elements
    in-place, like e.g. QVector does. Sometimes, for large and
    complex types, that might be exactly what you want, but a
    conservative estimate would put that chance at less than 5%." [1]


I was curious how this "conservative" estimate of "less than 5%" 
manifests in real QList object instances in a real world example.  

My random picks of real world examples usually end up with "Loading
the Qt Creator project in Qt Creator itself", and "statistics" is
something I make up myself, in this case it is looking at QList objects
at destruction time, and sorting them a bit according to size of
the items in the list, number of items still present at that time etc.

Besides being simple to do it's rather close to the typical use I see,
where containers are filled up, accessed/used a few times and destroyed
without much intermediate size changes.


Here is what I get:

* total QList objects created          :    51.631.363   
  - destructors that only bump ref     :    38.015.673      73.6%
  - destructors that actually dealloc  :    13.615.690      26.4% 


The  13.615.690  instances with actual deallocations ordered by the 
size of the item type:

    size     occurences

     4         1.656        #  0.01 %   with internal padding
                           
     8     13.424.228       # 98.59 %   objects with ideal size, 
                                    

     12             3    \
     16        74.979     |
     24        60.560     |
     28           126     |
     32        22.358     |
     40         2.054     |
     48        18.786     |
     56            71     >  # 1.40 %  with indirection
     64             3     |
     72            46     |
     80         3.484     |
     96             1     |     
     112        7.264     |
     128           52     |     
     184            2     |
     2112          17    /


>From the 8-byte objects we have
 
        13.420.883  stored directly      #  99.975 %  of cases
             3.345  stored indirectly    #   0.025 %

i.e. in 98.57% of all cases, QList objects behave optimally.

This high percentage differs from the artificial benchmark scenarios not
by accident. Almost all of these lists are lists of pointers, a natural
ingredient of tree-like structures, common in user interfaces.


The "padding problem" for types with less than ideal size has been touted
as major problem. Let's see how hard it hits me:

>From the 4-byte objects we have
 
               947  stored directly      #   57.2 %  of cases
               709  stored indirectly    #   42.8 %

In the first case I use "conservatively estimated" 32 byte on the heap, for
data and any accounting, 4 more for the gap itself, the second has only
4 for the gap, summing up to a total of 29.312 bytes. 

>From the 8-byte objects we have
 
        13.420.883  stored directly      #  99.975 %  of cases
             3.345  stored indirectly    #   0.025 %

i.e. another 53.520 bytes due to the indirection. Total waste is 82.832 bytes
accumlated extra RAM use during application lifetime, out of 112.899.252 total
"top level" QList payload. That is 0.073% "problems", or, in other words, it
takes 623 QList objects to waste *one* byte.


Overall there are 333 different item types, grouped by size of type there is:

          7    <  8    #  2.1 %  of types
        122   ==  8    # 36.6 %  of types
        204    >  8    # 61.3 %  of types



What I see here is a general-purpose random-access container with cheaper
insertion and deletion at front and in the middle than *vector provides for
61.3% of the types, augmented by a small-object optimization that kicks in
with zero overhead for 98.5% of the actually created instances.

If such a container did not exist it would need to be invented.


Andre'


[1] 
https://marcmutz.wordpress.com/2010/07/29/sneak-preview-qlist-considered-harmful

--- End Message ---

_______________________________________________
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development

[Development] [SPAM] How bad QList really is

Reply via email to