Caolan McNamara wrote:
> So when OOo crashes, what are the strategies that distros employ here if
> any?


Sun has a SOAP service that receives crash reports, a database to store them, a daemon that tries to find similar stack traces based on the information sent in the XML file (the matching works with a certain fuzziness), a web frontend for developers to look at the data stored in the database, and an automated process that, based on the frequency of crashes, generates tasks for developers in a Sun-internal bug tracking system. The task autosubmission and the web frontend provide the means to resolve the data sent with a crash report against the debug information kept for released versions, using platform-dependent utilities, e.g. kd.exe on Windows and addr2line on Unix. The autosubmission merely guesses at a potential developer who could become the first owner, by applying a random function to the library owners of the first 5 stack frames; mostly such tasks get reassigned by the first owner to a better one, though sometimes the one initially getting the task is already the right one. In the database each report has its own reportid, and similar reports are grouped together under a common stackmatchid.
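Just to make the grouping step a bit more concrete: the following is only a rough sketch of how such fuzzy matching can work in principle, it is not our actual implementation, and the frame format, the number of compared frames and the threshold are assumptions:

    # Rough sketch only -- not Sun's actual matching code. Assumes a
    # report's stack is a list of "library!function" strings.
    TOP_FRAMES = 5        # assumed: only the topmost frames are compared
    MIN_SIMILARITY = 0.6  # assumed fuzziness threshold

    def similarity(stack_a, stack_b):
        """Fraction of frame-by-frame matches among the topmost frames."""
        a, b = stack_a[:TOP_FRAMES], stack_b[:TOP_FRAMES]
        matches = sum(1 for x, y in zip(a, b) if x == y)
        return matches / max(len(a), len(b), 1)

    def assign_stackmatchid(new_stack, known_groups):
        """Return the stackmatchid of the best matching group, or start a new one."""
        best_id, best_score = None, 0.0
        for match_id, representative in known_groups.items():
            score = similarity(new_stack, representative)
            if score > best_score:
                best_id, best_score = match_id, score
        if best_id is not None and best_score >= MIN_SIMILARITY:
            return best_id
        new_id = max(known_groups, default=0) + 1
        known_groups[new_id] = new_stack
        return new_id

The real thing obviously does more than comparing the first few frames, but the principle of a tolerance threshold is what later causes the false positives and false negatives mentioned under 6.) below.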

> For RH we don't build the crashreporter (as it's basically unusable info
> for Sun),

This is because Sun would not have the debug information created during the build of those releases, which would be needed to 'resolve' the crash report, i.e. to get stack information with source code file names, classes and line numbers out of it. The information in the XML file is not enough for a developer to locate the origin of the problem; it is only enough for checking possible similarity.
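For illustration, the 'resolve' step boils down to something like the following; the report format, the example library name and the path to the unstripped libraries are made up, only the addr2line invocation itself is real:

    # Sketch: map a raw frame (library + offset from the XML report) to
    # function, file and line using addr2line. Paths and the example
    # library below are hypothetical.
    import subprocess

    DEBUG_DIR = "/srv/ooo-debug/2.0.4"  # assumed location of unstripped libraries

    def resolve_frame(library, offset):
        """Return 'function at file:line' for an address inside a library."""
        out = subprocess.run(
            ["addr2line", "-f", "-C", "-e", f"{DEBUG_DIR}/{library}", offset],
            capture_output=True, text=True, check=True,
        ).stdout.splitlines()
        return f"{out[0]} at {out[1]}"

    # e.g. resolve_frame("libsw680li.so", "0x1a2b3c")

Without the matching debug information such a lookup yields nothing useful (addr2line just prints '??'), which is why we cannot do anything with reports coming from differently built binaries.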

> but we do configure to enable using it, and replace it in the
> install set with a simple replacement that tests for the set of common
> problems that historically caused crashes, e.g. nvidia drivers, a11y
> enabled, selinux settings and insane local library butchering, and
> prompts the unfortunate user to log the supplied text to the RH
> bugzilla, and we map the stack back to source with
> http://people.redhat.com/caolanm/ooocvs/ooomapstack after throwing out
> the nvidia driver using reports


This ooomapstack utility will most likely also need access to debug information kept from the build, won't it?

> At some stage in the past the ooobuild OOo would spawn off the gnome
> bug-buddy, but that's no longer the case is it?

> So do other distros have various solutions here, or just simply crash
> out and/or dump core?

> What I'm thinking about aiming at is a shared cross-distro crash
> repository where we can auto submit the distro OOo crashes, and the
> distros can plug in their various stack mappers, with quick and dirty
> gnomebugzilla-alike tooling to merge the duped backtraces together.

I see a few potential problems with that:

1.) Crash data contains a dump of memory which in some cases might contain personal information that end users would not want others to be able to find on public websites. There is stuff like http://www.sun.com/privacy/ and I am quite sure others have similar terms to adhere to. This means that whatever we create cannot be fully public; it must have limited access for restricted group(s) of developers who agree on common terms for handling possible private data in a secure manner.

2.) Who should host this repository? Or are you thinking about some kind of distributed or partly distributed repository?

3.) Resources have to be available to call something like RH's ooomapstack utility or Sun's crashdebug utility on all supported platforms.

4.) Resources have to be available to keep the debug information for all builds, which is needed for 3.)

5.) 3.) and 4.) have to be on the same network, meaning that either one contributor would have to provide disk space and computing resources for everyone else, or everyone has to keep their own debug information and provide the computing resources for the community to 'resolve' the crash reports sent in for their distribution.

6.) At least what we (Sun) currently use as 'stack matching' (in terms of grouping similar reports together) often produces false mappings in both directions: the system sometimes thinks two reports are similar, but the developer later finds out that they have a different root cause, and sometimes the system thinks two reports are not similar, but the developer later finds out that they do in fact have the same root cause. Considering our 'fuzzy' algorithm, that is where I start to wonder what that gnomebugzilla-alike tooling to merge the duped backtraces together might look like! Handling these false positives / false negatives has already become a time-consuming task, and it would get worse with a larger repository containing data for all distros.

7.) With the amount of data we (Sun) already have in the database, stack matching has become a resource hog. In a distributed system that keeps information for ALL distributions it would be even more complicated and time consuming to find matches. This can quickly go past the point where the matching service(s) can only handle X incoming reports per hour while we in fact get Y incoming reports per hour, with Y > X. We already came close to such a situation in the past, which resulted in several new implementations of the matching algorithm and the database design.

8.) I don't understand the approach of everyone plugging in their own stack mapper. Why would distro-A use algorithm A for guessing similarity while distro-B uses algorithm B, and what would the overall quality of things matched into the same stackmatchid then look like with regard to false positives / false negatives? Or wait, maybe you were not talking about grouping reports together under one stackmatchid but about resolving them with something like RH's ooomapstack or Sun's crashdebug utility. Well then yes, see 5.), and we would at least need a common interface, e.g. somewhere on a web application service, for calling that crash 'resolving' for any developer who has the access rights to analyse crash reports.
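Just to sketch what such a common interface might look like (everything here is hypothetical: the host, the parameters and the response format are invented for illustration), it could be as thin as an HTTP endpoint that takes the raw report plus the distro and build id and hands back the resolved stack:

    # Hypothetical sketch of a common crash-'resolving' interface; the URL,
    # parameters and response format are invented, not an existing service.
    import urllib.parse
    import urllib.request

    def resolve_report(report_xml, distro, build_id):
        """Send a raw crash report to the distro's resolver and get back
        the resolved (file/line annotated) stack as plain text."""
        data = urllib.parse.urlencode({
            "distro": distro,      # selects which stack mapper / debug info to use
            "build": build_id,     # which build's debug information to look up
            "report": report_xml,  # the XML crash report as submitted
        }).encode("utf-8")
        with urllib.request.urlopen("https://crash.example.org/resolve",
                                    data) as response:
            return response.read().decode("utf-8")

Behind such an endpoint each distro could then run whatever it likes (ooomapstack, crashdebug, ...), as long as access stays restricted to developers who agreed to the terms mentioned under 1.).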

>
> C.
>

Kind regards,

Bernd Eilers
Maintainer of Sun's report handling backend infrastructure services and corresponding developer web application.
