Hello all,

based on the feedback I received, it seems that the place where most people would like to see extra information is the crash signature. We can certainly improve that by flagging more crashes. Here's my proposal for things that should go in the signature:
* Extend the "bad hardware" signatures to flag more data. For example, we currently only flag Windows crashes where the exception contains `STATUS_DEVICE_DATA_ERROR`. Looking through our crashes, it seems that all crashes where the reason is a variant of `STATUS_IN_PAGE_ERROR` are non-actionable: they cover network disconnections, corrupt data, the disk being full, etc. These are not necessarily instances of "bad hardware", so we might want to split them up but still flag them as clearly non-actionable.
* Surface bit-flip detection. We're not yet 100% confident about our bit-flip detection heuristic (though the false positives appear to be few). We could put the bit-flip detection into a dedicated field which can be searched and thus made visible in the aggregations tab. This would make it easy to spot crash signatures with a high number of potential bit-flips. It's worth noting that early testing indicates that crashes caused by bit-flips represent double-digit percentages of reported crashes.
* Always replace the crash address with the adjusted address, including for NULL pointers. This will make crashes easier to understand; we can put the raw crash address in a separate field.
* Last but not least I'd like to add a field providing a high-level description of the crash, possibly obtained by cross-referencing the address, the crash reason, the platform and other information. The idea is to have a platform-independent place to store what kind of crash we're dealing with (stack overflow, assertion, NULL-dereference, UAF, misaligned access, etc...).
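
To illustrate the first item, here is a minimal Python sketch of how in-page errors might be split into sub-categories while all of them stay flagged as non-actionable. The NTSTATUS values are the ones defined in `ntstatus.h`; the category names, the mapping, and the `classify_in_page_error` and `non_actionable_label` helpers are assumptions made for illustration, not what the stack-walking tool or crash-stats actually does.

```python
from enum import Enum
from typing import Optional

# NTSTATUS codes as defined in ntstatus.h.
STATUS_IN_PAGE_ERROR = 0xC0000006
STATUS_DEVICE_DATA_ERROR = 0xC000009C
STATUS_CRC_ERROR = 0xC000003F
STATUS_DISK_FULL = 0xC000007F


class InPageCause(Enum):
    """Hypothetical sub-categories; the names are illustrative only."""
    BAD_HARDWARE = "bad hardware"
    CORRUPT_DATA = "corrupt data"
    DISK_FULL = "disk full"
    OTHER = "other in-page error"


# Illustrative mapping from the underlying status to a sub-category.
_CAUSE_BY_STATUS = {
    STATUS_DEVICE_DATA_ERROR: InPageCause.BAD_HARDWARE,
    STATUS_CRC_ERROR: InPageCause.CORRUPT_DATA,
    STATUS_DISK_FULL: InPageCause.DISK_FULL,
}


def classify_in_page_error(
    exception_code: int, underlying_status: Optional[int]
) -> Optional[InPageCause]:
    """Return a sub-category for in-page errors, or None for other crashes."""
    if exception_code != STATUS_IN_PAGE_ERROR:
        return None
    # Unknown or missing underlying statuses still count as in-page errors.
    return _CAUSE_BY_STATUS.get(underlying_status, InPageCause.OTHER)


def non_actionable_label(
    exception_code: int, underlying_status: Optional[int] = None
) -> Optional[str]:
    """Return a hypothetical signature label, or None if nothing is flagged."""
    cause = classify_in_page_error(exception_code, underlying_status)
    return None if cause is None else f"non-actionable ({cause.value})"
```

Every in-page error gets a label here, including ones with an unrecognized underlying status, which matches the idea of flagging the whole family while keeping "bad hardware" as only one of the buckets.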

WDYT?

 Gabriele

On 02/02/23 14:14, Gabriele Svelto wrote:
[cross-posting to dev-platform]

Hello everybody,

last year we replaced the tool used to extract stacks and additional information from crashes with a new Rust-based implementation. One of the goals behind the change was to extend it beyond what the legacy tool could do and to surface richer information about crashes.

As we've implemented a few different analyses, we were left wondering what would be the best way to surface this information in crash reports. Here are a few things we can detect now:

* We accessed a dead object (i.e. the crash is a UAF)
* We jumped into a dead object (same as above, but we were probably going through the vtable or a function pointer)
* The crash was a NULL pointer access even though it might not look like one (e.g. it was NULL plus a fixed offset, like when reading a field of a structure)
* The crash was caused by bad hardware (corrupted data on disk, bad memory with stuck bits, etc.)

And here are a few more we'll be able to detect soon:

* The crash was caused by the data being misaligned when accessed (this happens sometimes with SIMD/vector instructions)
* The crash is impossible, likely caused by a CPU bug (e.g. the crash is a segmentation fault but the crashing instruction doesn't access memory)
* The crash is a stack overflow (these have a specific crash reason on Windows, but not on macOS and Linux)

So the question is, where would you like to see this data?
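
To make the address-based checks mentioned above more concrete (NULL plus a fixed offset, misaligned access, a stuck or flipped bit), here is a minimal Python sketch. It is not the heuristic used by the Rust tool: the 4 KiB guard size, the 64-bit address width, the labels, and all function names are assumptions chosen for illustration.

```python
# Assumption: any address in the first 4 KiB is treated as "NULL plus offset".
NULL_GUARD_SIZE = 0x1000
# Assumption: 64-bit addresses.
ADDRESS_BITS = 64


def is_null_like(address: int, guard_size: int = NULL_GUARD_SIZE) -> bool:
    """True if the address is NULL or NULL plus a small offset."""
    return 0 <= address < guard_size


def is_misaligned(address: int, alignment: int) -> bool:
    """True if the address is not a multiple of the required alignment."""
    return address % alignment != 0


def null_like_after_single_bit_flip(address: int) -> bool:
    """True if flipping exactly one bit turns the address into a NULL-like one.

    This is only one simplistic way to spot a candidate bit-flip.
    """
    if is_null_like(address):
        return False
    return any(
        is_null_like(address ^ (1 << bit)) for bit in range(ADDRESS_BITS)
    )


def describe_access(address: int, alignment: int = 1) -> str:
    """Return a hypothetical high-level label for a faulting address."""
    if is_null_like(address):
        return "NULL-dereference"
    if null_like_after_single_bit_flip(address):
        return "possible bit-flip"
    if is_misaligned(address, alignment):
        return "misaligned access"
    return "unknown"
```

A real analysis would also need the crash reason, the crashing instruction and the memory map; the sketch only shows how several independent checks could feed one platform-independent description.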

In some cases we already surface some of it in the crash signature; see this crash as an example:

https://crash-stats.mozilla.org/report/index/2ccae112-2e4e-4904-ba18-aaba60230202

This isn't the best way for all types of crashes though. For example, accesses to dead objects can't be detected all the time, so putting that information in the signature would bucket crashes in unwanted ways depending on what was in the object.

Another possibility is to add a new field to the crashes that would then be shown as a column when looking at a list of crash reports (something like the crash reason, but richer since we know more about the crash).

What do you think? What would help you more when looking at crashes?

 Gabriele
