Re: Enabling (many) assertions in opt builds locally and eventually Nightly
On Mon, Sep 24, 2018 at 04:05:22PM -0400, Randell Jesup wrote: On 9/20/18 5:59 PM, Andrew McCreight wrote: On Wed, Sep 19, 2018 at 5:44 PM Kris Maglione wrote: On Wed, Sep 19, 2018 at 05:37:46PM -0700, Bobby Holley wrote: So, I don't think we need to do anything fancy with forking - we'd just need to capture stacks and send them via telemetry rather than as a crash report. This was the idea behind bug 1209131, which got pretty far along but never made it to the finish line. This would actually potentially even give us better information than fork-and-crash, since we should be able to include JS stacks in that setup too. We've never been able to do that in ordinary crash reports, since breakpad doesn't know how to unwind JS stacks, and we can't safely ask the JS runtime to do it while we're crashing. Though keep in mind that any stack that includes content JS is going to likely count as PII, so it would have to be hidden by default on Soccorro. Please note that it would be illegal to collect such data without asking for explicit user consent first. The GDPR requires a "positive opt-in" mechanism: https://ico.org.uk/for-organisations/guide-to-the-general-data-protection-regulation-gdpr/lawful-basis-for-processing/consent/ Our current Telemetry permission is an opt-out mechanism. Right - this would need to be handled in a similar way to real crashes - pop a crashreporter dialog to let the user submit it. We just wouldn't kill the browser (and probably disable future semi-assertions until restart once we hit and report one to avoid bugging the user too much). For telemetry assertion reporting, I think BHR is a closer analog. That already collects JS stacks. ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
>On 9/20/18 5:59 PM, Andrew McCreight wrote: >> On Wed, Sep 19, 2018 at 5:44 PM Kris Maglione wrote: >> >>> On Wed, Sep 19, 2018 at 05:37:46PM -0700, Bobby Holley wrote: So, I don't think we need to do anything fancy with forking - we'd just need to capture stacks and send them via telemetry rather than as a crash report. This was the idea behind bug 1209131, which got pretty far along but never made it to the finish line. >>> >>> This would actually potentially even give us better information >>> than fork-and-crash, since we should be able to include JS >>> stacks in that setup too. We've never been able to do that in >>> ordinary crash reports, since breakpad doesn't know how to >>> unwind JS stacks, and we can't safely ask the JS runtime to do >>> it while we're crashing. >>> >> >> Though keep in mind that any stack that includes content JS is going to >> likely count as PII, so it would have to be hidden by default on Soccorro. > > >Please note that it would be illegal to collect such data >without asking for explicit user consent first. >The GDPR requires a "positive opt-in" mechanism: >https://ico.org.uk/for-organisations/guide-to-the-general-data-protection-regulation-gdpr/lawful-basis-for-processing/consent/ >Our current Telemetry permission is an opt-out mechanism. Right - this would need to be handled in a similar way to real crashes - pop a crashreporter dialog to let the user submit it. We just wouldn't kill the browser (and probably disable future semi-assertions until restart once we hit and report one to avoid bugging the user too much). -- Randell Jesup, Mozilla Corp remove "news" for personal email ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
On 9/20/18 5:59 PM, Andrew McCreight wrote: On Wed, Sep 19, 2018 at 5:44 PM Kris Maglione wrote: On Wed, Sep 19, 2018 at 05:37:46PM -0700, Bobby Holley wrote: So, I don't think we need to do anything fancy with forking - we'd just need to capture stacks and send them via telemetry rather than as a crash report. This was the idea behind bug 1209131, which got pretty far along but never made it to the finish line. This would actually potentially even give us better information than fork-and-crash, since we should be able to include JS stacks in that setup too. We've never been able to do that in ordinary crash reports, since breakpad doesn't know how to unwind JS stacks, and we can't safely ask the JS runtime to do it while we're crashing. Though keep in mind that any stack that includes content JS is going to likely count as PII, so it would have to be hidden by default on Soccorro. Please note that it would be illegal to collect such data without asking for explicit user consent first. The GDPR requires a "positive opt-in" mechanism: https://ico.org.uk/for-organisations/guide-to-the-general-data-protection-regulation-gdpr/lawful-basis-for-processing/consent/ Our current Telemetry permission is an opt-out mechanism. /Mats ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
Cameron McCormack writes: > (I wonder if we could collect all the same data, and use the same > crash reporting infrastructure, for non-crashing crash reports like > this.) For what it's worth, I've done something very close to this *accidentally*, on Linux, by manually sending a crash signal to test crash reporting and hitting bugs in how the signal is re-raised, such that it just returns to the not-really-crashed code. (It does unhook the signal handlers, so the next crash will kill the process, but that could be changed.) I never tried to make it an actual feature -- I had concerns about rate limiting, and maybe some other things I've forgotten, and then the work that's been mentioned here about telemetry seemed like it was solving the same problem in a more generally useful way. It would have been useful for desktop Linux sandboxing, because there are cases where we don't need to crash, and want to avoid crashing if possible on release, but we'll likely need at least a crash report's worth of info to figure out what's going on; instead we settled for crashing only on Nightly. --Jed [1] https://searchfox.org/mozilla-central/rev/0b8ed772d24605d7cb44c1af6d59e4ca023bd5f5/toolkit/crashreporter/breakpad-client/linux/handler/exception_handler.cc#398 ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
On Wed, Sep 19, 2018 at 5:44 PM Kris Maglione wrote: > On Wed, Sep 19, 2018 at 05:37:46PM -0700, Bobby Holley wrote: > >So, I don't think we need to do anything fancy with forking - we'd just > >need to capture stacks and send them via telemetry rather than as a crash > >report. This was the idea behind bug 1209131, which got pretty far along > >but never made it to the finish line. > > This would actually potentially even give us better information > than fork-and-crash, since we should be able to include JS > stacks in that setup too. We've never been able to do that in > ordinary crash reports, since breakpad doesn't know how to > unwind JS stacks, and we can't safely ask the JS runtime to do > it while we're crashing. > Though keep in mind that any stack that includes content JS is going to likely count as PII, so it would have to be hidden by default on Soccorro. > >On Wed, Sep 19, 2018 at 5:05 PM Randell Jesup > wrote: > > > >> >On Thu, Sep 20, 2018 at 09:22:12AM +1000, Cameron McCormack wrote: > >> >> On Thu, Sep 20, 2018, at 1:52 AM, Ehsan Akhgari wrote: > >> >> > While it may be the case that we may need to be more stable for > >> >> > MOZ_RELEASE_ASSERTs, I very much doubt that every single MOZ_ASSERT > >> in our > >> >> > codebase is actually a guaranteed to never fail, so promoting them > >> all to > >> >> > be enabled in something like Nightly is extremely dangerous in > terms > >> of the > >> >> > stability of Nightly for users who are trying to use the browser to > >> get > >> >> > their normal work done. > >> >> > >> >> If it's truly useful to know whether Nightly users are failing these > >> >> MOZ_ASSERT assertions, we could use Telemetry to report their failure > >> >> instead of crashing.> > >> >> (I wonder if we could collect all the same data, and use the same > crash > >> reporting infrastructure, for non-crashing crash reports like this.) > >> > > >> >On Linux and Mac, we could make MOZ_ASSERT fork and crash in the child, > >> >and continue in the parent process. > >> > >> Oooh, tricky. (though threads aren't preserved in forks, but the > >> current thread's stack is, so if we care about all threads, we might > >> need to grab pointers to their stacks/etc and store pointers to the > >> stacks before fork()). But even without that it would be good. > >> > >> And yes, we don't have to have it crash the browser (though if it later > >> crashes we'd want to know about assertions we'd hit). Telemetry isn't > >> quite appropriate for reporting it; it could however save a crash report > >> at ASSERT(), then (probably) disable MOZ_ASSERT reports and let the > >> browser continue (and it may well now really crash). This disabling > >> would avoid the cascade effect, but still get us crashreports. This > >> would be a bunch(?) of work in crashreporter - ted? It might be easy > >> invoke a crashreport sampling, and continue, but I suspect it's not, and > >> might be quite hard. > >> > >> -- > >> Randell Jesup, Mozilla Corp > >> remove "news" for personal email > >> ___ > >> dev-platform mailing list > >> dev-platform@lists.mozilla.org > >> https://lists.mozilla.org/listinfo/dev-platform > >> > >___ > >dev-platform mailing list > >dev-platform@lists.mozilla.org > >https://lists.mozilla.org/listinfo/dev-platform > > -- > Kris Maglione > Senior Firefox Add-ons Engineer > Mozilla Corporation > > Common sense is the collection of prejudices acquired by age 18. > --Albert Einstein > > ___ > dev-platform mailing list > dev-platform@lists.mozilla.org > https://lists.mozilla.org/listinfo/dev-platform > ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
On Wed, Sep 19, 2018 at 8:38 PM Bobby Holley wrote: > I spoke with Georg about it recently, and got the impression that his team > could get it finished if we had a current use-case. Hooking it up to > MOZ_ASSERTs on nightly builds seems like a pretty good one. > That would indeed be wonderful to have! > On Wed, Sep 19, 2018 at 5:05 PM Randell Jesup > wrote: > > > >On Thu, Sep 20, 2018 at 09:22:12AM +1000, Cameron McCormack wrote: > > >> On Thu, Sep 20, 2018, at 1:52 AM, Ehsan Akhgari wrote: > > >> > While it may be the case that we may need to be more stable for > > >> > MOZ_RELEASE_ASSERTs, I very much doubt that every single MOZ_ASSERT > > in our > > >> > codebase is actually a guaranteed to never fail, so promoting them > > all to > > >> > be enabled in something like Nightly is extremely dangerous in terms > > of the > > >> > stability of Nightly for users who are trying to use the browser to > > get > > >> > their normal work done. > > >> > > >> If it's truly useful to know whether Nightly users are failing these > > >> MOZ_ASSERT assertions, we could use Telemetry to report their failure > > >> instead of crashing.> > > >> (I wonder if we could collect all the same data, and use the same > crash > > reporting infrastructure, for non-crashing crash reports like this.) > > > > > >On Linux and Mac, we could make MOZ_ASSERT fork and crash in the child, > > >and continue in the parent process. > > > > Oooh, tricky. (though threads aren't preserved in forks, but the > > current thread's stack is, so if we care about all threads, we might > > need to grab pointers to their stacks/etc and store pointers to the > > stacks before fork()). But even without that it would be good. > > > > And yes, we don't have to have it crash the browser (though if it later > > crashes we'd want to know about assertions we'd hit). Telemetry isn't > > quite appropriate for reporting it; it could however save a crash report > > at ASSERT(), then (probably) disable MOZ_ASSERT reports and let the > > browser continue (and it may well now really crash). This disabling > > would avoid the cascade effect, but still get us crashreports. This > > would be a bunch(?) of work in crashreporter - ted? It might be easy > > invoke a crashreport sampling, and continue, but I suspect it's not, and > > might be quite hard. > > > > -- > > Randell Jesup, Mozilla Corp > > remove "news" for personal email > > ___ > > dev-platform mailing list > > dev-platform@lists.mozilla.org > > https://lists.mozilla.org/listinfo/dev-platform > > > ___ > dev-platform mailing list > dev-platform@lists.mozilla.org > https://lists.mozilla.org/listinfo/dev-platform > -- Ehsan ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
On Wed, Sep 19, 2018 at 05:37:46PM -0700, Bobby Holley wrote: So, I don't think we need to do anything fancy with forking - we'd just need to capture stacks and send them via telemetry rather than as a crash report. This was the idea behind bug 1209131, which got pretty far along but never made it to the finish line. This would actually potentially even give us better information than fork-and-crash, since we should be able to include JS stacks in that setup too. We've never been able to do that in ordinary crash reports, since breakpad doesn't know how to unwind JS stacks, and we can't safely ask the JS runtime to do it while we're crashing. On Wed, Sep 19, 2018 at 5:05 PM Randell Jesup wrote: >On Thu, Sep 20, 2018 at 09:22:12AM +1000, Cameron McCormack wrote: >> On Thu, Sep 20, 2018, at 1:52 AM, Ehsan Akhgari wrote: >> > While it may be the case that we may need to be more stable for >> > MOZ_RELEASE_ASSERTs, I very much doubt that every single MOZ_ASSERT in our >> > codebase is actually a guaranteed to never fail, so promoting them all to >> > be enabled in something like Nightly is extremely dangerous in terms of the >> > stability of Nightly for users who are trying to use the browser to get >> > their normal work done. >> >> If it's truly useful to know whether Nightly users are failing these >> MOZ_ASSERT assertions, we could use Telemetry to report their failure >> instead of crashing.> >> (I wonder if we could collect all the same data, and use the same crash reporting infrastructure, for non-crashing crash reports like this.) > >On Linux and Mac, we could make MOZ_ASSERT fork and crash in the child, >and continue in the parent process. Oooh, tricky. (though threads aren't preserved in forks, but the current thread's stack is, so if we care about all threads, we might need to grab pointers to their stacks/etc and store pointers to the stacks before fork()). But even without that it would be good. And yes, we don't have to have it crash the browser (though if it later crashes we'd want to know about assertions we'd hit). Telemetry isn't quite appropriate for reporting it; it could however save a crash report at ASSERT(), then (probably) disable MOZ_ASSERT reports and let the browser continue (and it may well now really crash). This disabling would avoid the cascade effect, but still get us crashreports. This would be a bunch(?) of work in crashreporter - ted? It might be easy invoke a crashreport sampling, and continue, but I suspect it's not, and might be quite hard. -- Randell Jesup, Mozilla Corp remove "news" for personal email ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform -- Kris Maglione Senior Firefox Add-ons Engineer Mozilla Corporation Common sense is the collection of prejudices acquired by age 18. --Albert Einstein ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
So, I don't think we need to do anything fancy with forking - we'd just need to capture stacks and send them via telemetry rather than as a crash report. This was the idea behind bug 1209131, which got pretty far along but never made it to the finish line. I spoke with Georg about it recently, and got the impression that his team could get it finished if we had a current use-case. Hooking it up to MOZ_ASSERTs on nightly builds seems like a pretty good one. On Wed, Sep 19, 2018 at 5:05 PM Randell Jesup wrote: > >On Thu, Sep 20, 2018 at 09:22:12AM +1000, Cameron McCormack wrote: > >> On Thu, Sep 20, 2018, at 1:52 AM, Ehsan Akhgari wrote: > >> > While it may be the case that we may need to be more stable for > >> > MOZ_RELEASE_ASSERTs, I very much doubt that every single MOZ_ASSERT > in our > >> > codebase is actually a guaranteed to never fail, so promoting them > all to > >> > be enabled in something like Nightly is extremely dangerous in terms > of the > >> > stability of Nightly for users who are trying to use the browser to > get > >> > their normal work done. > >> > >> If it's truly useful to know whether Nightly users are failing these > >> MOZ_ASSERT assertions, we could use Telemetry to report their failure > >> instead of crashing.> > >> (I wonder if we could collect all the same data, and use the same crash > reporting infrastructure, for non-crashing crash reports like this.) > > > >On Linux and Mac, we could make MOZ_ASSERT fork and crash in the child, > >and continue in the parent process. > > Oooh, tricky. (though threads aren't preserved in forks, but the > current thread's stack is, so if we care about all threads, we might > need to grab pointers to their stacks/etc and store pointers to the > stacks before fork()). But even without that it would be good. > > And yes, we don't have to have it crash the browser (though if it later > crashes we'd want to know about assertions we'd hit). Telemetry isn't > quite appropriate for reporting it; it could however save a crash report > at ASSERT(), then (probably) disable MOZ_ASSERT reports and let the > browser continue (and it may well now really crash). This disabling > would avoid the cascade effect, but still get us crashreports. This > would be a bunch(?) of work in crashreporter - ted? It might be easy > invoke a crashreport sampling, and continue, but I suspect it's not, and > might be quite hard. > > -- > Randell Jesup, Mozilla Corp > remove "news" for personal email > ___ > dev-platform mailing list > dev-platform@lists.mozilla.org > https://lists.mozilla.org/listinfo/dev-platform > ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
>On Thu, Sep 20, 2018 at 09:22:12AM +1000, Cameron McCormack wrote: >> On Thu, Sep 20, 2018, at 1:52 AM, Ehsan Akhgari wrote: >> > While it may be the case that we may need to be more stable for >> > MOZ_RELEASE_ASSERTs, I very much doubt that every single MOZ_ASSERT in our >> > codebase is actually a guaranteed to never fail, so promoting them all to >> > be enabled in something like Nightly is extremely dangerous in terms of the >> > stability of Nightly for users who are trying to use the browser to get >> > their normal work done. >> >> If it's truly useful to know whether Nightly users are failing these >> MOZ_ASSERT assertions, we could use Telemetry to report their failure >> instead of crashing.> >> (I wonder if we could collect all the same data, and use the same crash >> reporting infrastructure, for non-crashing crash reports like this.) > >On Linux and Mac, we could make MOZ_ASSERT fork and crash in the child, >and continue in the parent process. Oooh, tricky. (though threads aren't preserved in forks, but the current thread's stack is, so if we care about all threads, we might need to grab pointers to their stacks/etc and store pointers to the stacks before fork()). But even without that it would be good. And yes, we don't have to have it crash the browser (though if it later crashes we'd want to know about assertions we'd hit). Telemetry isn't quite appropriate for reporting it; it could however save a crash report at ASSERT(), then (probably) disable MOZ_ASSERT reports and let the browser continue (and it may well now really crash). This disabling would avoid the cascade effect, but still get us crashreports. This would be a bunch(?) of work in crashreporter - ted? It might be easy invoke a crashreport sampling, and continue, but I suspect it's not, and might be quite hard. -- Randell Jesup, Mozilla Corp remove "news" for personal email ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
On 9/19/2018 5:30 PM, Mike Hommey wrote: On Linux and Mac, we could make MOZ_ASSERT fork and crash in the child, and continue in the parent process. There is kind of a way to do this on Windows 8.1+ too if we wanted [1], though it obviously doesn't quite work the same way. [1] https://msdn.microsoft.com/en-us/windows/desktop/dn457828 ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
On Thu, Sep 20, 2018 at 09:22:12AM +1000, Cameron McCormack wrote: > On Thu, Sep 20, 2018, at 1:52 AM, Ehsan Akhgari wrote: > > While it may be the case that we may need to be more stable for > > MOZ_RELEASE_ASSERTs, I very much doubt that every single MOZ_ASSERT in our > > codebase is actually a guaranteed to never fail, so promoting them all to > > be enabled in something like Nightly is extremely dangerous in terms of the > > stability of Nightly for users who are trying to use the browser to get > > their normal work done. > > If it's truly useful to know whether Nightly users are failing these > MOZ_ASSERT assertions, we could use Telemetry to report their failure instead > of crashing. > > (I wonder if we could collect all the same data, and use the same crash > reporting infrastructure, for non-crashing crash reports like this.) On Linux and Mac, we could make MOZ_ASSERT fork and crash in the child, and continue in the parent process. Mike ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
On Thu, Sep 20, 2018, at 1:52 AM, Ehsan Akhgari wrote: > While it may be the case that we may need to be more stable for > MOZ_RELEASE_ASSERTs, I very much doubt that every single MOZ_ASSERT in our > codebase is actually a guaranteed to never fail, so promoting them all to > be enabled in something like Nightly is extremely dangerous in terms of the > stability of Nightly for users who are trying to use the browser to get > their normal work done. If it's truly useful to know whether Nightly users are failing these MOZ_ASSERT assertions, we could use Telemetry to report their failure instead of crashing. (I wonder if we could collect all the same data, and use the same crash reporting infrastructure, for non-crashing crash reports like this.) ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
Ehsan wrote: >On Tue, Sep 18, 2018 at 1:30 AM Randell Jesup wrote: >> We already *need* to be stable in that case, given MOZ_RELEASE_ASSERT >> (and normal just-because crashes). And as best I can tell, we are stable >> (with regards to user profiles). Once upon a time we weren't (5(?) >> years ago?) > >I do come across MOZ_ASSERTs that trigger while browsing every once in a >while (using a debug build). I almost never file bugs for these, because >the issues are rarely reproducible, I often have little information that >would be helpful as to how exactly the assertion was triggered, and I often >am very actively working on something else and the assertion failure is >just getting in the way of me getting my work done. I sometimes have to >comment out a broken MOZ_ASSERT in my local build to be able to proceed. Good point, though if we have people actually browsing with these, we'll find these bogus MOZ_ASSERTs - and a bogus MOZ_ASSERT may mean we have a sec bug that we assume "can't happen", since it doesn't fail in mochitests. >While it may be the case that we may need to be more stable for >MOZ_RELEASE_ASSERTs, I very much doubt that every single MOZ_ASSERT in our >codebase is actually a guaranteed to never fail, so promoting them all to >be enabled in something like Nightly is extremely dangerous in terms of the >stability of Nightly for users who are trying to use the browser to get >their normal work done. Again, good point -- though I would absolutely start with "get developers to use this". If we're kicking a MOZ_ASSERT, someone should be either fixing it or removing the assert and dealing with it. So please file against whomever is maintaining that bit of code, and let them sort out why it would fail. >This is, I believe, the way that we advertise >Nightly these days: we advertise it as a mostly stable experimental version >of Firefox and we encourage people to download it to try out the latest and >greatest. Exposing that population to crashes as a result of promoting all >MOZ_ASSERTs to become MOZ_RELEASE_ASSERTs seems like a proposition that >needs to be backed by some data demonstrating that such a build would match >the current stability requirements of Nightly. Sure. >FWIW, I think your original proposal, having a way to opt into assertions >without a slow build locally, is extremely valuable. I could see myself >using that option intentionally even with the pain points I described >above, but for dogfooding rather than while working on a patch and such. Cool, thanks. I have the start of a patch to enable a version of this. -- Randell Jesup, Mozilla Corp remove "news" for personal email ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
On Wed, Sep 19, 2018 at 11:52:07AM -0400, Ehsan Akhgari wrote: FWIW, I think your original proposal, having a way to opt into assertions without a slow build locally, is extremely valuable. I could see myself using that option intentionally even with the pain points I described above, but for dogfooding rather than while working on a patch and such. I agree. A separate build flag to enable most assertions without doing a full debug build would be extremely useful. I would probably leave that enabled most of the time, even though I rarely do debug builds in normal development. ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
On Tue, Sep 18, 2018 at 1:30 AM Randell Jesup wrote: > >On 9/17/2018 1:31 PM, Botond Ballo wrote: > >> One potential issue here is that an assertion failure in Nightly > >> builds might be simultaneously very annoying (because it crashes the > >> browser of Nightly users every time it occurs) and not very actionable > >> (because no reliable STR can be found for it). > > > >I would also be concerned about data loss and profile corruption when an > >assertion crashes the browser. I don't think we'll have a very easy time > >convincing people to do anything non-trivial with Nightly if there's a > >pretty good chance that they completely lose the fruits of their labour. > > > >I'd hope that this wouldn't be too frequent a problem, but it probably > >depends on the assertion, and the state of the durability of our user > >profiles. > > We already *need* to be stable in that case, given MOZ_RELEASE_ASSERT > (and normal just-because crashes). And as best I can tell, we are stable > (with regards to user profiles). Once upon a time we weren't (5(?) > years ago?) > I do come across MOZ_ASSERTs that trigger while browsing every once in a while (using a debug build). I almost never file bugs for these, because the issues are rarely reproducible, I often have little information that would be helpful as to how exactly the assertion was triggered, and I often am very actively working on something else and the assertion failure is just getting in the way of me getting my work done. I sometimes have to comment out a broken MOZ_ASSERT in my local build to be able to proceed. While it may be the case that we may need to be more stable for MOZ_RELEASE_ASSERTs, I very much doubt that every single MOZ_ASSERT in our codebase is actually a guaranteed to never fail, so promoting them all to be enabled in something like Nightly is extremely dangerous in terms of the stability of Nightly for users who are trying to use the browser to get their normal work done. This is, I believe, the way that we advertise Nightly these days: we advertise it as a mostly stable experimental version of Firefox and we encourage people to download it to try out the latest and greatest. Exposing that population to crashes as a result of promoting all MOZ_ASSERTs to become MOZ_RELEASE_ASSERTs seems like a proposition that needs to be backed by some data demonstrating that such a build would match the current stability requirements of Nightly. Note that with MOZ_DIAGNOSTIC_ASSERT, we have had tons of cases where people have put in assertions that were *surely* correct and would never ever fail, only to find out that the assertion turned out to be crashers and in some cases top crashers in Nightly. It is true that sometimes that is the only tool in our toolbox for hunting down bugs, but using a macro intentionally to opt in to such a scenario seems entirely different than promoting all of our existing assertions to Nightly without any way to assess their quality beforehand. FWIW, I think your original proposal, having a way to opt into assertions without a slow build locally, is extremely valuable. I could see myself using that option intentionally even with the pain points I described above, but for dogfooding rather than while working on a patch and such. Cheers, -- Ehsan ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
>On 9/17/2018 1:31 PM, Botond Ballo wrote: >> One potential issue here is that an assertion failure in Nightly >> builds might be simultaneously very annoying (because it crashes the >> browser of Nightly users every time it occurs) and not very actionable >> (because no reliable STR can be found for it). > >I would also be concerned about data loss and profile corruption when an >assertion crashes the browser. I don't think we'll have a very easy time >convincing people to do anything non-trivial with Nightly if there's a >pretty good chance that they completely lose the fruits of their labour. > >I'd hope that this wouldn't be too frequent a problem, but it probably >depends on the assertion, and the state of the durability of our user >profiles. We already *need* to be stable in that case, given MOZ_RELEASE_ASSERT (and normal just-because crashes). And as best I can tell, we are stable (with regards to user profiles). Once upon a time we weren't (5(?) years ago?) -- Randell Jesup, Mozilla Corp remove "news" for personal email ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
On 9/17/2018 1:31 PM, Botond Ballo wrote: One potential issue here is that an assertion failure in Nightly builds might be simultaneously very annoying (because it crashes the browser of Nightly users every time it occurs) and not very actionable (because no reliable STR can be found for it). I would also be concerned about data loss and profile corruption when an assertion crashes the browser. I don't think we'll have a very easy time convincing people to do anything non-trivial with Nightly if there's a pretty good chance that they completely lose the fruits of their labour. I'd hope that this wouldn't be too frequent a problem, but it probably depends on the assertion, and the state of the durability of our user profiles. ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
With webrender we've had pretty good results from always defaulting to MOZ_RELEASE_ASSERT, as users are often mildly decent at producing an STR. But perhaps we're luckier in that everyone dogfooding webrender was more willing to figure things out than a nightly user? On Mon, Sep 17, 2018 at 3:31 PM Botond Ballo wrote: > One potential issue here is that an assertion failure in Nightly > builds might be simultaneously very annoying (because it crashes the > browser of Nightly users every time it occurs) and not very actionable > (because no reliable STR can be found for it). That's the situation > I'm in with bug 1457603, for example, where I may have to end up > _downgrading_ a diagnostic assert that I added, to a regular assert. > > On Mon, Sep 17, 2018 at 3:25 PM, Jeff Gilbert > wrote: > > I would be excited for something between MOZ_ASSERT and MOZ_CRASH, but > > I think raising MOZ_ASSERTs to MOZ_ASSERT_NIGHTLY should be done by > > hand and reviewed. > > > > On Mon, Sep 17, 2018 at 11:46 AM, Kris Maglione > wrote: > >> There are some non-trivial snags for this proposal. A lot of assertions > rely > >> on other code gated on #ifdef DEBUG or DebugOnly<...> to avoid defining > data > >> members or storing values in non-debug builds. We could replace those > with > >> more fine-grained macros, of course, but particularly in the case of > data > >> members, we'd probably also take a significant memory usage regression. > >> > >> There's also the snag of NS_ASSERTION macros which are... weird. > >> > >> This is nothing we can't deal with, of course, but it will probably > require > >> a lot of manual work if we want to do it correctly. > >> > >> > >> On Mon, Sep 17, 2018 at 02:38:14PM -0400, Randell Jesup wrote: > >>> > >>> There are quite a few things that may be caught by assertions by > >>> developers before committing, especially sec issues - but most > >>> developers don't run debug builds as their main local test builds, or > >>> for local browsing use, because they are almost unusably slow. And > >>> turning on compiler optimization doesn't actually help much here - the > >>> problem is mostly debug-only assertions and code that's debug only, > such > >>> as bug 1441052 ("Don't do full grey-node checking in local debug > >>> builds"). > >>> > >>> These added checks may be useful for CI tests, but they make the builds > >>> unusable, so the vast majority of assertions don't run in the builds > our > >>> developers are using. So enabling most of the existing MOZ_ASSERTs for > >>> local opt builds (minus the worst performance-limiting ones) would help > >>> developers catch bugs before landing them (and perhaps shipping > >>> them). It's a lot like using ASAN builds to browse - but with much less > >>> perf degradation on hopes. > >>> > >>> Even better, if we can make the overall hit to perf low enough (1%?), > we > >>> could enable it for some/all Nightly users/builds. Or make "early > >>> nightly" (and developer builds) enable them, and late nightly and beta > >>> not. > >>> > >>> If we moved some of the most expensive checks to an alternative macro > >>> (MOZ_ASSERT_DEBUG()), we could (selectively?) enable MOZ_ASSERT checks > >>> in some opt builds. Alternatively we could promote some MOZ_ASSERTs to > >>> MOZ_ASSERT_NIGHTLY() (or MOZ_DIAGNOSTIC_ASSERT), which would assert in > >>> (all) debug builds, and in nightly opt builds - and maybe promote some > >>> to MOZ_ASSERT_RELEASE if we can take the perf hit, and they're in an > >>> important spot. > >>> > >>> And as a stepping-stone to the above, having local opt builds enable > >>> (most) MOZ_ASSERTs (excluding really expensive ones, like bug 1441052) > >>> even if the hit is larger (5, 10%) would also increase the likelihood > >>> that we'll catch things before we land them, or before they get to > beta. > >>> (Note: --enable-release would turn this local-build behavior off - and > >>> anyone doing speed tests/profiling/etc needs that already, and probably > >>> we should have a specific enable/disable like > >>> --disable-extra-assertions). This wouldn't be all that hard - most of > >>> the work would be finding "expensive" assertions like bug 1441052. > >>> > >>> (and perhaps we should not continue to overload --enable-release for "I > >>> want to benchmark/profile this build") > >>> > >>> Enabling most MOZ_ASSERTs in automation tests on opt builds would also > >>> slightly increase our odds of finding problems - though this may not > >>> help much, as the same tests are being run in debug builds. The only > >>> advantage would be races may be more likely in the faster opt builds. > >>> It might be worth trying once we have this for a few weeks, and see if > >>> it helps or not. > >>> > >>> Note: I've discussed this concept with various people including bz in > >>> the past, and mostly have had agreement that this would be useful - > thus > >>> my filing of bug 1441052. If we agree this is worth doing, we should > >>> file bugs on it and
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
On 9/17/18 3:25 PM, Jeff Gilbert wrote: I would be excited for something between MOZ_ASSERT and MOZ_CRASH Note that we already have MOZ_DIAGNOSTIC_ASSERT in between those. We should perhaps be using it a bit more. -Boris ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
One potential issue here is that an assertion failure in Nightly builds might be simultaneously very annoying (because it crashes the browser of Nightly users every time it occurs) and not very actionable (because no reliable STR can be found for it). That's the situation I'm in with bug 1457603, for example, where I may have to end up _downgrading_ a diagnostic assert that I added, to a regular assert. On Mon, Sep 17, 2018 at 3:25 PM, Jeff Gilbert wrote: > I would be excited for something between MOZ_ASSERT and MOZ_CRASH, but > I think raising MOZ_ASSERTs to MOZ_ASSERT_NIGHTLY should be done by > hand and reviewed. > > On Mon, Sep 17, 2018 at 11:46 AM, Kris Maglione wrote: >> There are some non-trivial snags for this proposal. A lot of assertions rely >> on other code gated on #ifdef DEBUG or DebugOnly<...> to avoid defining data >> members or storing values in non-debug builds. We could replace those with >> more fine-grained macros, of course, but particularly in the case of data >> members, we'd probably also take a significant memory usage regression. >> >> There's also the snag of NS_ASSERTION macros which are... weird. >> >> This is nothing we can't deal with, of course, but it will probably require >> a lot of manual work if we want to do it correctly. >> >> >> On Mon, Sep 17, 2018 at 02:38:14PM -0400, Randell Jesup wrote: >>> >>> There are quite a few things that may be caught by assertions by >>> developers before committing, especially sec issues - but most >>> developers don't run debug builds as their main local test builds, or >>> for local browsing use, because they are almost unusably slow. And >>> turning on compiler optimization doesn't actually help much here - the >>> problem is mostly debug-only assertions and code that's debug only, such >>> as bug 1441052 ("Don't do full grey-node checking in local debug >>> builds"). >>> >>> These added checks may be useful for CI tests, but they make the builds >>> unusable, so the vast majority of assertions don't run in the builds our >>> developers are using. So enabling most of the existing MOZ_ASSERTs for >>> local opt builds (minus the worst performance-limiting ones) would help >>> developers catch bugs before landing them (and perhaps shipping >>> them). It's a lot like using ASAN builds to browse - but with much less >>> perf degradation on hopes. >>> >>> Even better, if we can make the overall hit to perf low enough (1%?), we >>> could enable it for some/all Nightly users/builds. Or make "early >>> nightly" (and developer builds) enable them, and late nightly and beta >>> not. >>> >>> If we moved some of the most expensive checks to an alternative macro >>> (MOZ_ASSERT_DEBUG()), we could (selectively?) enable MOZ_ASSERT checks >>> in some opt builds. Alternatively we could promote some MOZ_ASSERTs to >>> MOZ_ASSERT_NIGHTLY() (or MOZ_DIAGNOSTIC_ASSERT), which would assert in >>> (all) debug builds, and in nightly opt builds - and maybe promote some >>> to MOZ_ASSERT_RELEASE if we can take the perf hit, and they're in an >>> important spot. >>> >>> And as a stepping-stone to the above, having local opt builds enable >>> (most) MOZ_ASSERTs (excluding really expensive ones, like bug 1441052) >>> even if the hit is larger (5, 10%) would also increase the likelihood >>> that we'll catch things before we land them, or before they get to beta. >>> (Note: --enable-release would turn this local-build behavior off - and >>> anyone doing speed tests/profiling/etc needs that already, and probably >>> we should have a specific enable/disable like >>> --disable-extra-assertions). This wouldn't be all that hard - most of >>> the work would be finding "expensive" assertions like bug 1441052. >>> >>> (and perhaps we should not continue to overload --enable-release for "I >>> want to benchmark/profile this build") >>> >>> Enabling most MOZ_ASSERTs in automation tests on opt builds would also >>> slightly increase our odds of finding problems - though this may not >>> help much, as the same tests are being run in debug builds. The only >>> advantage would be races may be more likely in the faster opt builds. >>> It might be worth trying once we have this for a few weeks, and see if >>> it helps or not. >>> >>> Note: I've discussed this concept with various people including bz in >>> the past, and mostly have had agreement that this would be useful - thus >>> my filing of bug 1441052. If we agree this is worth doing, we should >>> file bugs on it and see what the effort to get this would be, and if we >>> could ship some of this to users. Much like nightly-asan, this would >>> shake out bugs we'll never find from crash-stats reports, and which can >>> lead to critical sec bugs, and turn them into frequently-actionable >>> bugs. If needed this can be enabled incrementally once the build/config >>> infra and macros are in place; there are several options for that. >> >> ___ >> dev-platform mailing list >>
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
I would be excited for something between MOZ_ASSERT and MOZ_CRASH, but I think raising MOZ_ASSERTs to MOZ_ASSERT_NIGHTLY should be done by hand and reviewed. On Mon, Sep 17, 2018 at 11:46 AM, Kris Maglione wrote: > There are some non-trivial snags for this proposal. A lot of assertions rely > on other code gated on #ifdef DEBUG or DebugOnly<...> to avoid defining data > members or storing values in non-debug builds. We could replace those with > more fine-grained macros, of course, but particularly in the case of data > members, we'd probably also take a significant memory usage regression. > > There's also the snag of NS_ASSERTION macros which are... weird. > > This is nothing we can't deal with, of course, but it will probably require > a lot of manual work if we want to do it correctly. > > > On Mon, Sep 17, 2018 at 02:38:14PM -0400, Randell Jesup wrote: >> >> There are quite a few things that may be caught by assertions by >> developers before committing, especially sec issues - but most >> developers don't run debug builds as their main local test builds, or >> for local browsing use, because they are almost unusably slow. And >> turning on compiler optimization doesn't actually help much here - the >> problem is mostly debug-only assertions and code that's debug only, such >> as bug 1441052 ("Don't do full grey-node checking in local debug >> builds"). >> >> These added checks may be useful for CI tests, but they make the builds >> unusable, so the vast majority of assertions don't run in the builds our >> developers are using. So enabling most of the existing MOZ_ASSERTs for >> local opt builds (minus the worst performance-limiting ones) would help >> developers catch bugs before landing them (and perhaps shipping >> them). It's a lot like using ASAN builds to browse - but with much less >> perf degradation on hopes. >> >> Even better, if we can make the overall hit to perf low enough (1%?), we >> could enable it for some/all Nightly users/builds. Or make "early >> nightly" (and developer builds) enable them, and late nightly and beta >> not. >> >> If we moved some of the most expensive checks to an alternative macro >> (MOZ_ASSERT_DEBUG()), we could (selectively?) enable MOZ_ASSERT checks >> in some opt builds. Alternatively we could promote some MOZ_ASSERTs to >> MOZ_ASSERT_NIGHTLY() (or MOZ_DIAGNOSTIC_ASSERT), which would assert in >> (all) debug builds, and in nightly opt builds - and maybe promote some >> to MOZ_ASSERT_RELEASE if we can take the perf hit, and they're in an >> important spot. >> >> And as a stepping-stone to the above, having local opt builds enable >> (most) MOZ_ASSERTs (excluding really expensive ones, like bug 1441052) >> even if the hit is larger (5, 10%) would also increase the likelihood >> that we'll catch things before we land them, or before they get to beta. >> (Note: --enable-release would turn this local-build behavior off - and >> anyone doing speed tests/profiling/etc needs that already, and probably >> we should have a specific enable/disable like >> --disable-extra-assertions). This wouldn't be all that hard - most of >> the work would be finding "expensive" assertions like bug 1441052. >> >> (and perhaps we should not continue to overload --enable-release for "I >> want to benchmark/profile this build") >> >> Enabling most MOZ_ASSERTs in automation tests on opt builds would also >> slightly increase our odds of finding problems - though this may not >> help much, as the same tests are being run in debug builds. The only >> advantage would be races may be more likely in the faster opt builds. >> It might be worth trying once we have this for a few weeks, and see if >> it helps or not. >> >> Note: I've discussed this concept with various people including bz in >> the past, and mostly have had agreement that this would be useful - thus >> my filing of bug 1441052. If we agree this is worth doing, we should >> file bugs on it and see what the effort to get this would be, and if we >> could ship some of this to users. Much like nightly-asan, this would >> shake out bugs we'll never find from crash-stats reports, and which can >> lead to critical sec bugs, and turn them into frequently-actionable >> bugs. If needed this can be enabled incrementally once the build/config >> infra and macros are in place; there are several options for that. > > ___ > dev-platform mailing list > dev-platform@lists.mozilla.org > https://lists.mozilla.org/listinfo/dev-platform ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
There are some non-trivial snags for this proposal. A lot of assertions rely on other code gated on #ifdef DEBUG or DebugOnly<...> to avoid defining data members or storing values in non-debug builds. We could replace those with more fine-grained macros, of course, but particularly in the case of data members, we'd probably also take a significant memory usage regression. There's also the snag of NS_ASSERTION macros which are... weird. This is nothing we can't deal with, of course, but it will probably require a lot of manual work if we want to do it correctly. On Mon, Sep 17, 2018 at 02:38:14PM -0400, Randell Jesup wrote: There are quite a few things that may be caught by assertions by developers before committing, especially sec issues - but most developers don't run debug builds as their main local test builds, or for local browsing use, because they are almost unusably slow. And turning on compiler optimization doesn't actually help much here - the problem is mostly debug-only assertions and code that's debug only, such as bug 1441052 ("Don't do full grey-node checking in local debug builds"). These added checks may be useful for CI tests, but they make the builds unusable, so the vast majority of assertions don't run in the builds our developers are using. So enabling most of the existing MOZ_ASSERTs for local opt builds (minus the worst performance-limiting ones) would help developers catch bugs before landing them (and perhaps shipping them). It's a lot like using ASAN builds to browse - but with much less perf degradation on hopes. Even better, if we can make the overall hit to perf low enough (1%?), we could enable it for some/all Nightly users/builds. Or make "early nightly" (and developer builds) enable them, and late nightly and beta not. If we moved some of the most expensive checks to an alternative macro (MOZ_ASSERT_DEBUG()), we could (selectively?) enable MOZ_ASSERT checks in some opt builds. Alternatively we could promote some MOZ_ASSERTs to MOZ_ASSERT_NIGHTLY() (or MOZ_DIAGNOSTIC_ASSERT), which would assert in (all) debug builds, and in nightly opt builds - and maybe promote some to MOZ_ASSERT_RELEASE if we can take the perf hit, and they're in an important spot. And as a stepping-stone to the above, having local opt builds enable (most) MOZ_ASSERTs (excluding really expensive ones, like bug 1441052) even if the hit is larger (5, 10%) would also increase the likelihood that we'll catch things before we land them, or before they get to beta. (Note: --enable-release would turn this local-build behavior off - and anyone doing speed tests/profiling/etc needs that already, and probably we should have a specific enable/disable like --disable-extra-assertions). This wouldn't be all that hard - most of the work would be finding "expensive" assertions like bug 1441052. (and perhaps we should not continue to overload --enable-release for "I want to benchmark/profile this build") Enabling most MOZ_ASSERTs in automation tests on opt builds would also slightly increase our odds of finding problems - though this may not help much, as the same tests are being run in debug builds. The only advantage would be races may be more likely in the faster opt builds. It might be worth trying once we have this for a few weeks, and see if it helps or not. Note: I've discussed this concept with various people including bz in the past, and mostly have had agreement that this would be useful - thus my filing of bug 1441052. If we agree this is worth doing, we should file bugs on it and see what the effort to get this would be, and if we could ship some of this to users. Much like nightly-asan, this would shake out bugs we'll never find from crash-stats reports, and which can lead to critical sec bugs, and turn them into frequently-actionable bugs. If needed this can be enabled incrementally once the build/config infra and macros are in place; there are several options for that. ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform
Re: Enabling (many) assertions in opt builds locally and eventually Nightly
It may be possible to narrow down the performance-regressing MOZ_ASSERTs by pushing a series of Talos try runs with MOZ_ASSERT enabled for specific top-level directories - then take those with significant regressions, and recurse down a layer, lather, rinse, repeat to at least the leaf-dir level, or (if there are a lot in the dir) the file level. Then we can look with eyeballs and figure out what's up. It might also be possible to use trace APIs to measure the cost of each assertion... though the traces themselves might swamp most assertion costs. -- Randell Jesup, Mozilla Corp remove "news" for personal email ___ dev-platform mailing list dev-platform@lists.mozilla.org https://lists.mozilla.org/listinfo/dev-platform