Hi everyone,
Crash signatures, derived from the functions on the stack, are how we
group ("bucket") crashes as belonging to a certain issue or bug. They
should be precise enough to identify this "bucket" but also short enough
that we can use them as labels in lists and when talking about those
issues. For some time, we have seen that our signatures are very
long-winded and contain parts that sometimes even make it harder to
bucket correctly. To fix that, we set out to shorten our crash signatures.
We already completed a first step of this effort in June: after we found
that the template parts of signatures often varied widely between crashes
that belonged to the same bug, all <sometemplate> parts of crash
signatures were replaced by just <T>.
That made a signature like this (from bug 1045509, the [@ …] are our
customary delimiters for signatures, not really part of the signature
itself though):
[@ nsTArray_base<nsTArrayFallibleAllocator,
nsTArray_CopyWithMemutils>::UsesAutoArrayBuffer() |
nsTArray_Impl<unsigned char,
nsTArrayFallibleAllocator>::SizeOfExcludingThis(unsigned int (*)(void
const*)) ]
be shortened to:
[@ nsTArray_base<T>::UsesAutoArrayBuffer() |
nsTArray_Impl<T>::SizeOfExcludingThis(unsigned int (*)(void const*)) ]
This is definitely somewhat easier to read and to put in tables like
topcrash reports, and as far as we know it did not lump different bugs
together under the same signature any more than before.
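
For illustration, here is a minimal Python sketch of the template
collapsing described above. It is not the actual Socorro code; the
function name collapse_templates is made up for this example, and the
sketch deliberately ignores corner cases such as operator< or operator<<
in function names.

```python
def collapse_templates(signature: str) -> str:
    """Replace every top-level <...> group, including nested ones, with <T>.

    Hypothetical helper for illustration only; it does not special-case
    names like operator< that contain an unpaired angle bracket.
    """
    out = []
    depth = 0  # current nesting depth inside angle brackets
    for ch in signature:
        if ch == "<":
            if depth == 0:
                out.append("<T>")  # emit the placeholder once per group
            depth += 1
        elif ch == ">":
            if depth > 0:
                depth -= 1  # closes a group we are currently skipping
            else:
                out.append(ch)  # unpaired '>' (e.g. in operator->) is kept
        elif depth == 0:
            out.append(ch)  # keep everything outside template groups
    return "".join(out)


if __name__ == "__main__":
    long_signature = (
        "nsTArray_base<nsTArrayFallibleAllocator, "
        "nsTArray_CopyWithMemutils>::UsesAutoArrayBuffer() | "
        "nsTArray_Impl<unsigned char, "
        "nsTArrayFallibleAllocator>::SizeOfExcludingThis("
        "unsigned int (*)(void const*))"
    )
    print(collapse_templates(long_signature))
```

Counting nesting depth instead of using a simple regular expression
keeps nested templates such as Foo<Bar<int> > from leaving stray
brackets behind.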
But we found that we can go even further: as far as I remember,
different argument lists of functions (mostly due to overloading) have
not helped us distinguish any bugs in the more than 4 years I have been
working with crashes, whereas patches that change argument types or add
an argument to a function have often made us lose the connection between
a bug and its signature.
Therefore, we are removing argument lists from the signatures.
The signature listed above will then become:
[@ nsTArray_base<T>::UsesAutoArrayBuffer |
nsTArray_Impl<T>::SizeOfExcludingThis ]
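
As a rough sketch of what removing argument lists amounts to, the
following Python snippet drops every parenthesized group from each frame
of a signature. Again, this is not the Socorro implementation; the name
strip_argument_lists is invented for this example, and the sketch assumes
that frames are separated by " | " and that parentheses only ever enclose
argument lists (so it would mishandle names like "(anonymous namespace)"
or leave a trailing "const" qualifier in place).

```python
def strip_argument_lists(signature: str) -> str:
    """Remove all parenthesized groups from each frame of a signature.

    Hypothetical helper for illustration only; it assumes parentheses are
    used for argument lists and nothing else.
    """
    frames = []
    for frame in signature.split(" | "):
        out = []
        depth = 0  # current nesting depth inside parentheses
        for ch in frame:
            if ch == "(":
                depth += 1
            elif ch == ")":
                if depth > 0:
                    depth -= 1
                # a stray ')' outside any group is silently dropped
            elif depth == 0:
                out.append(ch)  # keep only text outside parentheses
        frames.append("".join(out).strip())
    return " | ".join(frames)


if __name__ == "__main__":
    signature = (
        "nsTArray_base<T>::UsesAutoArrayBuffer() | "
        "nsTArray_Impl<T>::SizeOfExcludingThis("
        "unsigned int (*)(void const*))"
    )
    print(strip_argument_lists(signature))
```

Tracking the nesting depth matters here because function pointer
arguments such as "unsigned int (*)(void const*)" contain parentheses of
their own.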
Today, we ran a script on Bugzilla (see bug 1178094) that updated all
affected bugs, adding the new shortened signature to the Crash Signatures
field without sending a ton of bugmail.
Over the last few weeks, we have tested on the staging setup that Socorro
crash-stats creates the new shortened signatures correctly, and that
generation of the special "shutdownhang | …" signatures for browser
processes that took more than 60s to shut down and of the "OOM | …"
signatures for out-of-memory crashes still works in all cases where it
worked before.
As all preparation has been done, we will flip the switch on production
Socorro crash-stats in the next few days, and from then on those
shortened signatures will be created everywhere.
Note that this will disrupt some stats that compare signatures across
days, even though we will reprocess some crashes so that the switchover
falls on a UTC day boundary and as few stats as possible are disturbed
by the change.
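
To make the UTC day boundary idea concrete, here is a small Python
sketch of how one might pick the crashes to reprocess. It rests on an
assumption of mine, namely that crashes received between midnight UTC and
the moment of the switch on the same day get reprocessed; the function
names and the timestamps in the example are hypothetical and do not
describe the actual Socorro procedure or schedule.

```python
from datetime import datetime, timezone


def utc_day_start(moment: datetime) -> datetime:
    """Return midnight UTC of the day that contains the given aware datetime."""
    return moment.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )


def needs_reprocessing(crash_time: datetime, switch_time: datetime) -> bool:
    """True if the crash falls between midnight UTC and the switch.

    Assumption for this sketch: reprocessing exactly these crashes moves
    the boundary between old and new signatures to a UTC day boundary.
    Both arguments must be timezone-aware datetimes.
    """
    return utc_day_start(switch_time) <= crash_time < switch_time


if __name__ == "__main__":
    # Hypothetical switch time, chosen only for the example.
    switch = datetime(2015, 7, 1, 15, 30, tzinfo=timezone.utc)
    early = datetime(2015, 7, 1, 2, 0, tzinfo=timezone.utc)
    previous_day = datetime(2015, 6, 30, 23, 59, tzinfo=timezone.utc)
    print(needs_reprocessing(early, switch))
    print(needs_reprocessing(previous_day, switch))
```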
Please let me know of any issues with those changes (as well as any
other questions about or issues with crash analysis), and thanks to
Lars, Byron (glob) and others who helped with those changes!
Thanks,
KaiRo
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform