Re: To bump mochitest's timeout from 45 seconds to 90 seconds
On 9 February 2016 at 14:51, Marco Bonardo wrote:

>> On Tue, Feb 9, 2016 at 6:54 PM, Ryan VanderMeulen wrote:
>>
>>> I'd have a much easier time accepting that argument if my experience
>>> didn't tell me that nearly every single "Test took longer than expected"
>>> or "Test timed out" intermittent ends with a requestLongerTimeout as the fix
>
> this sounds equivalent to saying "Since we don't have enough resources (or
> a plan) to investigate why some tests take so long, let's give up"... But
> then maybe we should have that explicit discussion, rather than assuming
> it's a truth.
>
> Since we are focused on quality I don't think it's acceptable to say we
> are fine if a test takes an unexpected amount of time to run. The fact
> that those bugs end up being resolved by bumping the timeout without any
> kind of investigation (and it happens, I know) is worrisome.

I agree. However, this has traditionally been a very difficult area for
Release Engineering and Engineering Productivity to make progress in. Who
can we work with to understand these timing characteristics in more depth?

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
Memory Usage on Perfherder & Memory Reduction
Hi All,

Recently Geoff Brown landed an AWSY-like system [1] for tracking memory
usage on Perfherder. This is awesome. It's one of my pinned tabs.

I was happy to see two recent "drops" in memory usage:

1. A ~3% drop in "Resident Memory Tabs closed [+30s]", likely due to bug
   990916, which expires displayports:
   https://treeherder.mozilla.org/perf.html#/graphs?series=[mozilla-inbound,f9cdadf297fd409c043e8114ed0fa656334e7fad,1]&zoom=1454516622714.927,1454583882842.8733,181623181.32925725,250028978.43070653

2. A ~2% drop across all memory tracking sometime on Feb 8. It's hard to
   pick a changeset, but the drop happened when inbound was merged to
   fx-team:
   https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-inbound,f9cdadf297fd409c043e8114ed0fa656334e7fad,1%5D

Great to see drops in memory usage!

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1233220
Re: To bump mochitest's timeout from 45 seconds to 90 seconds
On Tue, Feb 9, 2016 at 6:54 PM, Ryan VanderMeulen wrote:

> I'd have a much easier time accepting that argument if my experience
> didn't tell me that nearly every single "Test took longer than expected"
> or "Test timed out" intermittent ends with a requestLongerTimeout as the fix

This sounds equivalent to saying "Since we don't have enough resources (or
a plan) to investigate why some tests take so long, let's give up"... But
then maybe we should have that explicit discussion, rather than assuming
it's a truth.

Since we are focused on quality I don't think it's acceptable to say we are
fine if a test takes an unexpected amount of time to run. The fact that
those bugs end up being resolved by bumping the timeout without any kind of
investigation (and it happens, I know) is worrisome.
Re: To bump mochitest's timeout from 45 seconds to 90 seconds
On Tue, Feb 9, 2016 at 8:09 PM, Gijs Kruitbosch wrote:

> I concur with Ryan here, and I'd add that IME 90% if not more of these
> timeouts (where they are really timeouts because the test is long, rather
> than just brokenness in the test that leaves it hanging until the timeout)
> happen on debug/asan builds

The current timeout was already set up when we had debug tests, so it is
already accounting for that. If anything, we should reduce the timeout on
opt if that matters. I think it would be fine to have much bigger timeouts
on ASan or other builds that are not our primary target; debug and opt are
close enough to what we ship and what devs use every day that performance
matters there too.
TCW Soft Close: Tree Closing Maintenance Window, Sat February 13 2016, 06:00-10:00a PST
FYI. We do not expect any significant impact to platform operations.

"Soft Close" means we'll leave the trees open, but devs who push:

- can expect issues
- are personally responsible for managing their jobs (retries, etc.)

-- Forwarded message --
From:
Date: Tue, Feb 9, 2016 at 11:20 AM
Subject: [Planned] Scheduled Tree Closing Maintenance Window, Sat February 13 2016, 06:00-10:00a PST
To: all-moco-m...@mozilla.com

Issue Status: Upcoming

Short Summary: IT will be performing the following work during the Feb 13,
2016 TCW:

- 1232033 - Delete trunking of releng vlans to switch1.r601-1.ops.scl3.mozilla.net
- 1239378 - Upgrade java on production Elasticsearch cluster
- 1240821 - Upstream EPEL mirror needs to be corrected and resynch'd

Mozilla IT Maintenance Notification:

Issue Status: Upcoming
Bug IDs: 1239400
Start Date: 2016-02-13
Start Time: 06:00 PST
Site: All
Services: Tree Closure
Impact of Work: Minimal disruption to Mozilla sites and services is
expected. Elasticsearch availability for SUMO will be impacted during
change 1239378.

If you have any questions or concerns, please address them to...@mozilla.com
or visit #moc in IRC. Also, visit whistlepig.mozilla.org for all
notifications.
Re: Gecko/Firefox stats and diagrams wanted
On Tue, Feb 9, 2016 at 12:31 PM, Nicholas Alexander wrote:

>> I also wanted to try to find some diagrams to show how Firefox and Gecko
>> work/their architecture, from a high level perspective (not too insane a
>> level of detail, but reasonable).
>
> Nathan Froyd worked up a very high-level slide deck for his onboarding
> sessions; they're amazing. I'm not sure how public those slides are, so
> I've CCed him and he may choose to link to those. I would really love to
> see these worked up into a document rather than a presentation.

The presentation is public:

https://docs.google.com/presentation/d/1ZHUkNzZK2TrF5_4MWd_lqEq7Ph5B6CDbNsizIkBxbnQ/edit?usp=sharing

I've tried to include links into wikis and whatnot where possible. We have
https://wiki.mozilla.org/Gecko:Overview which includes jumping-off points
for exploration of major subsystems, as well. If folks have suggestions of
diagrams, links, etc. that should go in, I'd love to hear about them.

Thanks,
-Nathan
Re: To bump mochitest's timeout from 45 seconds to 90 seconds
I concur with Ryan here, and I'd add that IME 90% if not more of these
timeouts (where they are really timeouts because the test is long, rather
than just brokenness in the test that leaves it hanging until the timeout)
happen on debug/ASan builds, where "perf regressions" isn't really a
meaningful concept to regression-analyze for, compared to the debug and
ASan overhead.

~ Gijs

On 09/02/2016 17:54, Ryan VanderMeulen wrote:

> I'd have a much easier time accepting that argument if my experience
> didn't tell me that nearly every single "Test took longer than expected"
> or "Test timed out" intermittent ends with a requestLongerTimeout as the
> fix.
>
> -Ryan
>
> On 2/9/2016 12:50 PM, Haik Aftandilian wrote:
>
>> On Tue, Feb 9, 2016 at 2:47 AM, Marco Bonardo wrote:
>>
>>> Based on that, bumping the timeout may have 2 downsides, long term:
>>> - slower tests for everyone
>>> - sooner or later 90 seconds won't be enough again. Are we going to
>>>   bump to 180 then?
>>
>> Essentially restating Marco's concern, increasing timeouts has the side
>> effect that performance regressions are not noticed, i.e., a new bug
>> that causes a test to take longer, but still pass, is not detected. With
>> the original lower timeouts, the test would fail with a timeout. So a
>> little bit of the value of the tests is lost, and it's difficult to
>> address later.
>>
>> Haik
Re: Gecko/Firefox stats and diagrams wanted
+ Sotaro, who has over the years created a lot of different
architecture/class diagrams of different parts of Gecko. They might be too
detailed for your needs, but worth checking.

On Tue, Feb 9, 2016 at 12:31 PM, Nicholas Alexander wrote:

> +Kyle, +Nathan
>
> On Tue, Feb 9, 2016 at 9:00 AM, Chris Mills wrote:
>
>> Hi all,
>>
>> I’m writing a presentation about browsers, standards implementation, and
>> cross-browser coding to give at some universities. As a part of it, I
>> wanted to present some stats about Firefox/Gecko to show how many people
>> on average commit to it (say, every month, every year?), how many people
>> work on localising the content strings, how many people work on
>> platform/UI features, etc.
>
> Kyle Lahnakoski has done some work in this area -- he set up a neat
> contributor dashboard. Perhaps Kyle has more data about paid activity
> too. I'm CCing him to see if he can say more. I imagine Mike Hoye has
> much to say here.
>
>> I also wanted to try to find some diagrams to show how Firefox and Gecko
>> work/their architecture, from a high level perspective (not too insane a
>> level of detail, but reasonable).
>
> Nathan Froyd worked up a very high-level slide deck for his onboarding
> sessions; they're amazing. I'm not sure how public those slides are, so
> I've CCed him and he may choose to link to those. I would really love to
> see these worked up into a document rather than a presentation.
>
> Thanks for doing this work!
> Nick
Re: To bump mochitest's timeout from 45 seconds to 90 seconds
Just to clarify: you're *only* talking about browser-chrome mochitests
here, correct? (Not other mochitest suites like mochitest-plain.) It looks
like this is the case, based on the bug, but your dev-platform post here
made it sound like this change affected all mochitests.

Thanks,
~Daniel

On 02/08/2016 02:51 PM, Armen Zambrano G. wrote:

> Hello,
> In order to help us have fewer timeouts when running mochitests under
> Docker, we've decided to double mochitests' gTimeoutSeconds and reduce
> large multipliers in half.
>
> Here's the patch if you're curious:
> https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=1246152&attachment=8717111
>
> If you have any comments or concerns please raise them in the bug.
>
> regards,
> Armen
Re: To bump mochitest's timeout from 45 seconds to 90 seconds
I'd have a much easier time accepting that argument if my experience didn't
tell me that nearly every single "Test took longer than expected" or "Test
timed out" intermittent ends with a requestLongerTimeout as the fix.

-Ryan

On 2/9/2016 12:50 PM, Haik Aftandilian wrote:

> On Tue, Feb 9, 2016 at 2:47 AM, Marco Bonardo wrote:
>
>> Based on that, bumping the timeout may have 2 downsides, long term:
>> - slower tests for everyone
>> - sooner or later 90 seconds won't be enough again. Are we going to bump
>>   to 180 then?
>
> Essentially restating Marco's concern, increasing timeouts has the side
> effect that performance regressions are not noticed, i.e., a new bug that
> causes a test to take longer, but still pass, is not detected. With the
> original lower timeouts, the test would fail with a timeout. So a little
> bit of the value of the tests is lost, and it's difficult to address
> later.
>
> Haik
Re: To bump mochitest's timeout from 45 seconds to 90 seconds
On Tue, Feb 9, 2016 at 2:47 AM, Marco Bonardo wrote:

> Based on that, bumping the timeout may have 2 downsides, long term:
> - slower tests for everyone
> - sooner or later 90 seconds won't be enough again. Are we going to bump
>   to 180 then?

Essentially restating Marco's concern, increasing timeouts has the side
effect that performance regressions are not noticed, i.e., a new bug that
causes a test to take longer, but still pass, is not detected. With the
original lower timeouts, the test would fail with a timeout. So a little
bit of the value of the tests is lost, and it's difficult to address later.

Haik

> I think that's the main reason the default timeout was set to a low
> value, while still allowing the multipliers as a special case for tests
> that really require bigger timeouts, because there's no other way out.
>
> Is Docker doubling the time for every test? From the bug it looks like it
> may add 20-30% of overhead, so why are we not bumping the timeout by 30%
> (let's say to 60s) and investigating the original cause (the bug that
> takes 80s to run) to figure out if something can be done to make it
> finish sooner?
>
> -m
>
> On Mon, Feb 8, 2016 at 11:51 PM, Armen Zambrano G. wrote:
>
>> Hello,
>> In order to help us have fewer timeouts when running mochitests under
>> Docker, we've decided to double mochitests' gTimeoutSeconds and reduce
>> large multipliers in half.
>>
>> Here's the patch if you're curious:
>> https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=1246152&attachment=8717111
>>
>> If you have any comments or concerns please raise them in the bug.
>>
>> regards,
>> Armen
>>
>> --
>> Zambrano Gasparnian, Armen
>> Automation & Tools Engineer
>> http://armenzg.blogspot.ca
Re: Gecko/Firefox stats and diagrams wanted
+Kyle, +Nathan

On Tue, Feb 9, 2016 at 9:00 AM, Chris Mills wrote:

> Hi all,
>
> I’m writing a presentation about browsers, standards implementation, and
> cross-browser coding to give at some universities. As a part of it, I
> wanted to present some stats about Firefox/Gecko to show how many people
> on average commit to it (say, every month, every year?), how many people
> work on localising the content strings, how many people work on
> platform/UI features, etc.

Kyle Lahnakoski has done some work in this area -- he set up a neat
contributor dashboard. Perhaps Kyle has more data about paid activity too.
I'm CCing him to see if he can say more. I imagine Mike Hoye has much to
say here.

> I also wanted to try to find some diagrams to show how Firefox and Gecko
> work/their architecture, from a high level perspective (not too insane a
> level of detail, but reasonable).

Nathan Froyd worked up a very high-level slide deck for his onboarding
sessions; they're amazing. I'm not sure how public those slides are, so
I've CCed him and he may choose to link to those. I would really love to
see these worked up into a document rather than a presentation.

Thanks for doing this work!
Nick
Gecko/Firefox stats and diagrams wanted
Hi all,

I’m writing a presentation about browsers, standards implementation, and
cross-browser coding to give at some universities. As a part of it, I
wanted to present some stats about Firefox/Gecko to show how many people on
average commit to it (say, every month, every year?), how many people work
on localising the content strings, how many people work on platform/UI
features, etc.

I also wanted to try to find some diagrams to show how Firefox and Gecko
work/their architecture, from a high level perspective (not too insane a
level of detail, but reasonable).

Has anyone got anything like these, or ideas on how I can get such
information? If so, I’d love to hear from you.

thanks,

Chris Mills
Senior tech writer || Mozilla
developer.mozilla.org || MDN
cmi...@mozilla.com || @chrisdavidmills
Re: MozReview/Autoland in degraded state
Try integration is now restored. Autoland to inbound will be available
pending some further testing.

On Fri, Feb 5, 2016 at 5:34 PM, Gregory Szorc wrote:

> r+ carry forward/"status" column is now working again.
>
> Autoland / Try integration is still offline.
>
> On Fri, Feb 5, 2016 at 12:13 PM, Mark Côté wrote:
>
>> And a little longer than planned, but we're back. All users, regardless
>> of level, can once again push code to MozReview.
>>
>> As noted, Autoland and r+ carry forward/"status" column will remain
>> disabled a little while longer, as there are some unrelated issues to
>> sort out. We'll report back here when they're back, hopefully Monday.
>>
>> Mark
>>
>> On 2016-02-05 1:42 PM, Mark Côté wrote:
>>
>>> We will be deploying a fix for the ssh-level restrictions to MozReview
>>> shortly, around 2:30 pm EST/11:30 am PST. MozReview will be down for
>>> about 10 minutes if all goes smoothly. We'll be able to roll back not
>>> long after that if there are unresolvable issues. You can follow along
>>> in #mozreview.
>>>
>>> Other fixes to LDAP and Autoland will follow Mondayish.
>>>
>>> Thank you for your patience.
>>>
>>> Mark
>>>
>>> On 2016-02-03 3:02 AM, Gregory Szorc wrote:
>>>
>>>> MozReview and Autoland are currently in a degraded state:
>>>>
>>>> * HTTP pushes are disabled
>>>> * SSH pushes require LDAP SCM Level 3 access
>>>> * Autoland is disabled
>>>> * r+ carry forward has been disabled / the overall "status" column in
>>>>   the commits list may not turn green
>>>>
>>>> The last bullet point is particularly troubling, as we had to disable
>>>> something that wasn't designed to be disabled. There may be some weird
>>>> fallout with review flag / state as a result.
>>>>
>>>> Bug 1244835 tracks restoring push access. Bug 1245412 tracks the other
>>>> issues.
>>>>
>>>> Please understand that additional technical details cannot be provided
>>>> at this time.
>>>>
>>>> We apologize for the inconvenience and hope to have full service
>>>> restored as soon as possible.
Re: To bump mochitest's timeout from 45 seconds to 90 seconds
I will try 60 seconds and see how it goes.

On 16-02-09 05:47 AM, Marco Bonardo wrote:

> 90 seconds for a simple test sounds like a lot of time and a huge bump
> from the current situation (45). The risk is that people will start
> writing much bigger tests instead of splitting them into smaller and more
> manageable tests. Plus, when a test depends on a long timeout in the
> product, developers are used to figuring out ways to reduce those
> (through hidden prefs or such) so that the test can finish sooner and not
> time out.
>
> Based on that, bumping the timeout may have 2 downsides, long term:
> - slower tests for everyone
> - sooner or later 90 seconds won't be enough again. Are we going to bump
>   to 180 then?
>
> I think that's the main reason the default timeout was set to a low
> value, while still allowing the multipliers as a special case for tests
> that really require bigger timeouts, because there's no other way out.
>
> Is Docker doubling the time for every test? From the bug it looks like it
> may add 20-30% of overhead, so why are we not bumping the timeout by 30%
> (let's say to 60s) and investigating the original cause (the bug that
> takes 80s to run) to figure out if something can be done to make it
> finish sooner?
>
> -m
>
> On Mon, Feb 8, 2016 at 11:51 PM, Armen Zambrano G. wrote:
>
>> Hello,
>> In order to help us have fewer timeouts when running mochitests under
>> Docker, we've decided to double mochitests' gTimeoutSeconds and reduce
>> large multipliers in half.
>>
>> Here's the patch if you're curious:
>> https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=1246152&attachment=8717111
>>
>> If you have any comments or concerns please raise them in the bug.
>>
>> regards,
>> Armen
>>
>> --
>> Zambrano Gasparnian, Armen
>> Automation & Tools Engineer
>> http://armenzg.blogspot.ca

--
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca
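For readers following the thread: browser-chrome mochitests have a base
per-test timeout (gTimeoutSeconds) that individual tests can multiply via
requestLongerTimeout(n). A standalone sketch of the arithmetic behind
"double gTimeoutSeconds and halve large multipliers" (the function is my
own illustration, not the harness code): tests that requested a larger
multiplier keep the same overall budget, while plain tests get twice as
long before timing out.

```javascript
// Effective per-test timeout: base gTimeoutSeconds times the factor a test
// requests via requestLongerTimeout(n). Illustrative model only.
function effectiveTimeout(gTimeoutSeconds, multiplier = 1) {
  return gTimeoutSeconds * multiplier;
}

// Before the change: 45s base; a slow test calling requestLongerTimeout(4).
const before = effectiveTimeout(45, 4); // 180s

// After: base doubled to 90s, the large multiplier halved (4 -> 2).
const after = effectiveTimeout(90, 2); // still 180s

// Net effect: multiplier tests keep the same budget; tests with no
// multiplier go from 45s to 90s.
console.log({ before, after, plainBefore: effectiveTimeout(45), plainAfter: effectiveTimeout(90) });
```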
Re: Reftests moving to structured logging
This is now live on central.

On 04/02/16 01:28 PM, Andrew Halberstadt wrote:

> Reftest is the last major test harness still not using structured logs,
> but that should change by the end of the week. See bug 1034290 [1] for
> more details.
>
> I've tried my best to make sure things like reftest-analyzer,
> leak/assertion checks, crash detection, etc. all continue to work. But
> due to the sad lack of tests for the harnesses themselves, it's possible
> that I missed something. So if you see anything not working like it
> should, please file a bug blocking bug 1034290 [1] and CC me.
>
> What does this change mean for reftest? In the short term, nothing should
> be different, save that reftests will start working with tools that
> depend on structured logging (e.g. ActiveData, auto-starring, etc.). In
> the medium term, it means we'll be able to tweak the log format without
> breaking anything (once consumers that are still parsing the formatted
> log get updated). In the long term, structured logging will be a
> foundation upon which new data-driven tools will be built.
>
> Let me know if you have any questions or concerns,
> -Andrew
>
> [1] https://bugzilla.mozilla.org/show_bug.cgi?id=1034290
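For anyone unfamiliar with the terminology: a "structured" log replaces
free-form text lines with machine-readable records. The sketch below is a
rough illustration loosely modeled on mozlog-style field names (action,
test, status, expected); the helper and sample values are mine, not the
actual reftest harness output.

```javascript
// Build a structured (JSON) test-status event, the kind of record
// consumers like auto-starring or ActiveData can match on by field
// instead of regex-parsing formatted "TEST-UNEXPECTED-FAIL | ..." lines.
function structuredStatus(test, status, expected) {
  return JSON.stringify({
    action: "test_status", // mozlog-style event type (illustrative)
    source: "reftest",
    test,
    status,
    expected,
  });
}

const line = structuredStatus(
  "layout/reftests/bugs/123456-1.html", // hypothetical test path
  "FAIL",
  "PASS"
);

// A consumer just parses and inspects fields; no brittle string matching.
const event = JSON.parse(line);
console.log(event.action, event.test, event.status, event.expected);
```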
Re: To bump mochitest's timeout from 45 seconds to 90 seconds
90 seconds for a simple test sounds like a lot of time and a huge bump from
the current situation (45). The risk is that people will start writing much
bigger tests instead of splitting them into smaller and more manageable
tests. Plus, when a test depends on a long timeout in the product,
developers are used to figuring out ways to reduce those (through hidden
prefs or such) so that the test can finish sooner and not time out.

Based on that, bumping the timeout may have 2 downsides, long term:
- slower tests for everyone
- sooner or later 90 seconds won't be enough again. Are we going to bump to
  180 then?

I think that's the main reason the default timeout was set to a low value,
while still allowing the multipliers as a special case for tests that
really require bigger timeouts, because there's no other way out.

Is Docker doubling the time for every test? From the bug it looks like it
may add 20-30% of overhead, so why are we not bumping the timeout by 30%
(let's say to 60s) and investigating the original cause (the bug that takes
80s to run) to figure out if something can be done to make it finish
sooner?

-m

On Mon, Feb 8, 2016 at 11:51 PM, Armen Zambrano G. wrote:

> Hello,
> In order to help us have fewer timeouts when running mochitests under
> Docker, we've decided to double mochitests' gTimeoutSeconds and reduce
> large multipliers in half.
>
> Here's the patch if you're curious:
> https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=1246152&attachment=8717111
>
> If you have any comments or concerns please raise them in the bug.
>
> regards,
> Armen
>
> --
> Zambrano Gasparnian, Armen
> Automation & Tools Engineer
> http://armenzg.blogspot.ca
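The arithmetic behind the 60s suggestion, as a throwaway calculation
(assuming the 20-30% Docker overhead figure cited from the bug; the
round-up step is my own guess at how 60 was reached):

```javascript
// Scale the current 45s default by the worst-case reported Docker overhead,
// then round up to a round number of seconds.
const base = 45;               // current gTimeoutSeconds
const worstOverhead = 0.3;     // 30%, upper end of the 20-30% range
const scaled = base * (1 + worstOverhead);      // 58.5s
const rounded = Math.ceil(scaled / 15) * 15;    // 60s
console.log(scaled, rounded);
```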