Hi all,
I've been analyzing the telemetry probes we recently added for Sync
and FxA. The analysis is in a Google Doc so that it's easy for people
to add comments:
https://docs.google.com/a/mozilla.com/document/d/1NaowX6dzh0iVHf5xYwHodj5c8pstyTdBOkkXApTAc6M/edit?usp=sharing
The tl;dr section from that doc:
In general, Sync appears quite healthy. Many probes we added to track
key health indicators are showing relatively nice numbers.
Unfortunately, many of the probes that indicate failure scenarios are
too broad and fail to differentiate expected and unexpected failures
(for example, a sync that starts while the network is disconnected, or
Firefox being shut down during a sync, is counted as an error). But despite
these limitations, the failure numbers are small enough that no
particular concerns have been identified.
This report recommends that some of these probes be reworked (e.g., to
handle the “false-positive” error counts mentioned above), that some be
removed (as they show no particular problem and there’s no reason to
believe they will offer ongoing value), that some be investigated
further (bugs have been opened and are referenced here in those cases),
and that the rest be kept and used within an evolution dashboard to
measure ongoing health and potential regressions (bug 1234415 has been
opened for the creation of this dashboard).
All comments welcome!
Mark
_______________________________________________
Sync-dev mailing list
https://mail.mozilla.org/listinfo/sync-dev