On 10 January 2014 09:15, Alexander Sack <a...@canonical.com> wrote:
> On Fri, Jan 10, 2014 at 6:23 AM, Paul Larson <paul.lar...@canonical.com>
> wrote:
>> = Mako =
>> 100% pass (no reruns for anything)
>> But we saw several crashes - dialer-app (which has been going on for a
>> while) as well as unity8 crash in default and in click-image-tests.
>> Default tests also saw a crash in whoopsie:
>> http://ci.ubuntu.com/smokeng/trusty/touch/mako/120:20140109.1:20140107.1/5973/click_image_tests/
>
>
> Why do we see whoopsie crashes? Thought we disabled it to not auto
> process crashes on the phone sometimes last cycle.

I'm being pedantic, but they're technically apport crashes. Whoopsie is just the daemon that shovels .crash files to https://daisy.ubuntu.com.

The phone does not presently do second-phase processing of crash files (adding package information, hooks, etc.), nor does it feed the crash files to whoopsie (using whoopsie-upload-all), as Steve established that the upstart job that ran whoopsie-upload-all was busted:

https://bugs.launchpad.net/ubuntu/+source/apport/+bug/1235436

However, Brian Murray has fixed the bug in question, so we should be largely ready to start accepting crash reports from phones. The last remaining piece is getting armhf retracers online as part of the move of the retracing infrastructure to Prodstack. I've asked Brian to take this task from me and finish it up. All that's needed is working with webops to verify that the stagingstack deployment is functional:

https://rt.admin.canonical.com//Ticket/Display.html?id=58019

Now, to your question of why we're seeing whoopsie-upload-all crashes collected in the CI infrastructure. As Michał points out, that script is being run over a corrupted crash file. I've filed this bug to better deal with that particular case:

https://bugs.launchpad.net/ubuntu/+source/apport/+bug/1267774

There's a deeper problem here. Didier informs me that they were seeing a lot of crashes in unity8 with a smashed stacktrace. They realised the dying unity8 process was getting reaped and restarted by upstart while apport was still processing it, because collecting and processing the core file takes a long time. They set a timeout of 30s (data/unity8.override in unity8-autopilot), which seemed to work around the problem, but perhaps that value is being exceeded.

We need a better solution than increasing a timeout. James, does upstart provide us with a better mechanism for telling it not to kill a process in this state? Can we add one if not? :)

Thanks everyone.
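
For illustration only, here is a rough Python sketch of the kind of guard that would avoid running an uploader over a corrupted crash file. It is not the apport or whoopsie-upload-all code. The function names are hypothetical, and the check assumes a simplified view of the report format: "Key: value" lines, with indented continuation lines for multi-line values. The real parser in apport is stricter than this.

```python
import re
from pathlib import Path

# Assumption: a simplified view of the crash report format, not apport's parser.
# A key line looks like "ProblemType: Crash" or a bare "Stacktrace:".
_KEY_LINE = re.compile(rb"^[A-Za-z0-9._-]+:( |$)")


def looks_like_crash_report(data: bytes) -> bool:
    """Return True if data has the rough 'Key: value' shape of a crash report."""
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece left by a trailing newline
    if not lines:
        return False  # empty or truncated-to-nothing file

    seen_key = False
    for line in lines:
        if line.startswith(b" "):
            # A continuation line is only valid after some key line.
            if not seen_key:
                return False
        elif _KEY_LINE.match(line):
            seen_key = True
        else:
            return False  # neither a key line nor a continuation: corrupted
    return seen_key


def uploadable_reports(crash_dir):
    """Yield .crash files that pass the structural check (hypothetical helper)."""
    for path in sorted(Path(crash_dir).glob("*.crash")):
        try:
            if looks_like_crash_report(path.read_bytes()):
                yield path
        except OSError:
            continue  # unreadable file: skip it instead of crashing
```

The point of the sketch is only that a corrupted file gets skipped rather than taking down the uploader and producing a second crash report of its own.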
-- 
Mailing list: https://launchpad.net/~ubuntu-phone
Post to     : ubuntu-phone@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-phone
More help   : https://help.launchpad.net/ListHelp