Hi,
TL;DR:
- QT_NO_SINGLE_ARG_QHASH_OVERLOAD changes semantics in 6.11.1 and dev
- delayed from strict mode 6.11 to 6.12 to give everyone more time
- check impact by buiding with macro enabled and qtbase past
abb639bea4df011b19607df4509187ebe91bbb59
(https://codereview.qt-project.org/c/qt/qtbase/+/729320)
- do not push a fix for public types without approval of the likes of Thiago,
Peppe, Ahmad, Ivan, Mårten, or me: BC is involved
- see Action Points at end of email
Long version:
We identified a problem with the way QT_NO_SINGLE_ARG_QHASH_OVERLOAD works, a
macro that aims to find qHash() overloads that are missing the seed argument.
In order to make such overloads work, Qt has provided a "1-to-2-arg qHash()
adapter" that, given a 1-arg qHash() overload, provides a 2-arg version that
calls the 1-arg one and then xors in the seed, simplified for illustration:
constexpr size_t qHash(const T &t, size_t seed) noexcept(noexcept(qHash(t)))
{ return qHash(t) ^ seed; }
There are two things to observe here:
First, this template is always a perfect match for any type T, so, in
particular, a _better_ match than one where a qHash() call would be resolved by
implicit conversion.. As such, it will be instantiated not just for any type
who only has a 1-arg qHash() overload, as intended, but for any qHash() call
that requires implicit conversion on either of the two arguments.
In addition to the intended usecase where the qHash() is lacking a seed
argument, this can happen if the type doesn't have a qHash() at all, but
implicitly converts to one that has (e.g. QRgba64 → quint64), or if the seed
argument has type != size_t (like Qt 5's uint, or, also seen, muscle memory
qsizetype).
If the adapter was simply removed, those calls would call the same qHash()
overloads as before, so it would still compile. Only qHash() functions without
a seed argument cause errors here (and even those may be overridden by a better
match that has a seed parameter, and a key that implicitly converts from the
type in question).
This is, btw, why std::hash has a struct to specialize rather than a function
to overload: to prevent implicit conversions.
None of this were a problem if not for the second thing to observe, which is
that qHash(x, seed) is a different calculation from qHash(x, 0) ^ seed! So the
two will, in general, give different results!
To understand why, note that any hashing function essentially hashes
chunk-by-chunk, using what's called a "combiner" to essentially do seed =
combine(seed, next-chunk) over and over again. These combiners are chosen such
that the sequence of chunks _matters_ (you want {1, 2, 3} to hash to something
different than {3, 2, 1}, because they're not equal¹). So, without knowledge of
the _particular_ combiner used in Qt, the following should be obvious:
It makes a difference whether I start with a seed != 0 and then combine all the
chunks or whether I start with a seed == 0, combine all the chunks, and then
xor in seed at the end.
What this means is that if the 1-to-2-arg qHash() adapter is involved, even if
a qHash() overload eventually gets called that _has_ a seed argument, then the
value of qHash(x, seed) when using the adapter will differ from the value of
qHash(x, seed) when the underlying qHash() is called with implicit conversion.
Since the adapter is a template, therefore inline, any use of it was compiled
into user code, and we can't change it for old code.
That is _silently_ BiC (old and new code don't agree on the hash value of a
given element, one inserts into a QHash, the other can't find it anymore).
So any change we make must preserve the compatibility of the new way with what
the 1-to-2-arg qHash() adapter did, essentially: we mustn't replace missing or
"faulty" qHash() overloads with one that does
return qHash(to-underlying(key), seed); // WRONG, if adapter was used before
but need to emulate the adapter instead:
return qHash(to-underlying(key)) ^ seed;
at least until Qt 7 and for public types. For private API, we don't give such
guarantees, so we can just change the qHash() to the idiomatic way there.
For in-tree users, there's a QHashPrivate::ex1to2arg(key, seed) that you can
use, which encapsulates the version differences.
Now, we finally come to the main point of this email: The
QT_NO_SINGLE_ARG_QHASH_OVERLOAD macro that is in strict mode 6.11 in 6.11.0
simply removes the adapter. This was very dangerous, since, as described above,
it means most qHash() calls that used to go through the adapter now go to the
underlying qHash(), which changes the hash value.
After we identified this problem, we have taken two mitigation measures: First,
we delayed the macro to strict mode 6.12 from strict mode 6.11.
For you, this means: DO NOT USE strict mode 6.12, yet, if you are a Qt module.²
Second, instead of removing the adapter, we now =delete it. This means the
overload set is not changed, and the use of the adapter is causing a compile
error instead of a silent change in hash value.
By the end of this week, I should have added the macro to all module's
.cmake.conf, at least on Gerrit, and fixed the issues the CI pointed out.
So far so good.
Now the bad part: This does NOT suffice. The macro can only detect issues if a
2-arg qHash() call _is_ made. That means if you have a faulty qHash() in your
module for a type X that never gets stuck into a QSet or a QHash or into a
qHashRange/Multi, then the macro will not complain.
Action points:
- (me) ensure the qt5.git modules compile with the macro against a sufficiently
recent qtbase
- (minimum) review _all_ qHash functions in your module that they are exactly
of the form
size_t qHash(T key/const T&key, size_t seed = 0)
(or overload for seed/non-seed, which is necessary for out-of-line hidden
friends)
When fixing, take special care to not change the value for public types)
- (eventually) for all equality-comparable types in your module, check that
exactly one one of the following is true:
- qHash(x, size_t{42}) does not compile (with or without the adapter), or
- qHash(x, size_t{42}) compiles and doesn't use the adapter, and
- a QHash filled in a TU with the macro set can be read from a TU that has
the adapter enabled
How to prevent the linker from just picking one of the other QHash
instantiations is still a matter of research, so unless you have a good idea to
contribute here, this isn't something we have infrastructure for, so nothing to
be done.
Once we have fixed all of our own qHash functions, we can, of course, consider
making adapter unconditionally =deleted, then we wouldn't need to do quite so
much testing at the code of breaking code for our users.
¹ unless they are, e.g. in an unordered container, then the combiner needs to
be commutative
² no module should enable strict mode 6.12 before 6.12 FF, anyway, because it's
a moving target that can lead to issues cropping up at dependency update time,
and we don't want that. Use individual QT_NO_ macros until FF.
Sorry for the bad news, but this just goes to show that BC in a huge library
like Qt will only ever be a probabilistics game, at least until AI takes over.
Thanks,
Marc
--
Marc Mutz <[email protected]><mailto:[email protected]> (he/his)
Principal Software Engineer
The Qt Company
Erich-Thilo-Str. 10 12489
Berlin, Germany
www.qt.io<http://www.qt.io>
Geschäftsführer: Mika Pälsi, Juha Varelius, Jouni Lintunen
Sitz der Gesellschaft: Berlin,
Registergericht: Amtsgericht Charlottenburg,
HRB 144331 B
Public
--
Development mailing list
[email protected]
https://lists.qt-project.org/listinfo/development