On Wednesday 14 October 2015 20:04:12 Marc Mutz wrote:
> On Wednesday 14 October 2015 18:11:26 Thiago Macieira wrote:
> > and the fact that QStringLiterals don't share will cause the
> > innocent-looking above code require 64 bytes of read-only data.
>
> They are shared, because it seems that lambdas within the same function have
> the same type. At least last I checked, that was what GCC implemented.
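
For reference, the C++ standard gives every lambda-expression its own unique
closure type, so two lambdas in the same function cannot share a
function-local static. A minimal sketch of the effect, assuming a simplified
stand-in for the macro (STATIC_LITERAL and literalsAreDistinct are hypothetical
names for illustration, not Qt API and not the real QStringLiteral expansion):

```cpp
#include <type_traits>

// Hypothetical stand-in for a QStringLiteral-like macro: a lambda that owns
// a function-local static array and returns its address.
#define STATIC_LITERAL(str) \
    ([]() -> const char16_t * { \
        static const char16_t data[] = str; \
        return data; \
    })

bool literalsAreDistinct()
{
    auto first = STATIC_LITERAL(u"foo");
    auto second = STATIC_LITERAL(u"foo");

    // Each lambda-expression has its own closure type, even though both
    // expansions are textually identical and live in the same function.
    static_assert(!std::is_same<decltype(first), decltype(second)>::value,
                  "two lambda-expressions never share a type");

    // Different closure types mean different operator() functions, so each
    // one has its own static array: two distinct objects, two addresses.
    return first() != second();
}
```

Since the two arrays are distinct objects, the two calls return different
addresses, and the same characters are stored twice in read-only data.
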

GCC 5.2, 6:      2 lambdas, data duplicated
Clang 3.7, 3.8:  2 lambdas, data duplicated
ICC 16:          2 lambdas, data duplicated

You can see from the disassembly that they are two different types.

> > movq    _ZN10QArrayData18shared_static_dataE@GOTPCREL(%rip), %rax
>
> And you want the nullptr to get rid of this relocation.

Yes, but more importantly because it speeds up the check for when reference
counting should be done. Right now, it needs to check bit 9 inside d->flags,
which means dereferencing the pointer (hitting another cacheline) and the
compiler never knows that test is constant with QStringLiterals.

With a null pointer, the check is very trivial (a TEST instruction, for both
the null and the ~1 check) and the compiler should be able to optimise the
destructor away.

Here's the entire function, as it is today with one QStringLiteral only:
(compiled with GCC 6 -fno-exceptions, rearranged/edited for clarity)

        ; load the literal:
        movq    _ZN10QArrayData18shared_static_dataE@GOTPCREL(%rip), %rax ; d
        movl    $3, 16(%rsp)            ; str.d.size = 3
        movq    %rax, (%rsp)            ; str.d.d = &QArrayData::shared_static_data
        leaq    .LC0(%rip), %rax        ; u"foo"
        movq    %rax, 8(%rsp)           ; str.d.b = u"foo"

        ; make the call:
        movq    %rsp, %rdi
        call    _Z1fRK7QString@PLT

        ; inlined QString::~QString
        movq    (%rsp), %rax            ; reload the d pointer
        testl   $512, (%rax)            ; d->flags & QArrayData::ImmutableHeader
        je      .L8
        addq    $40, %rsp
        ret

        ; this is the dead code, it never gets run:
.L8:
        lock subl       $1, 4(%rax)     ; d->ref_.deref()
        jne     .L5
        movq    (%rsp), %rdi            ; load d pointer
        movl    $16, %edx               ; alignof(QTypedArrayData<QChar>)
        movl    $2, %esi                ; sizeof(QChar)
        call    _ZN10QArrayData10deallocateEPS_mm@PLT
        addq    $40, %rsp
        ret

A hacky implementation that uses a null pointer instead:

        ; load the literal:
        leaq    .LC0(%rip), %rax        ; u"foo"
        movq    $0, (%rsp)              ; str.d.d = nullptr
        movq    %rax, 8(%rsp)           ; str.d.b = u"foo"
        movl    $3, 16(%rsp)            ; str.d.size = 3

        ; make the call
        movq    %rsp, %rdi
        call    _Z1fRK7QString@PLT
        addq    $40, %rsp
        ret

The QString::~QString destructor expanded to empty
with GCC. Unfortunately, Clang and ICC retained the check (they must be
assuming the callee modified the const parameter).

Unfortunately, if I change the isStatic to check for LSB set for the SSO case,
even GCC gets thrown off and brings back the dead code.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center