Re: Several leak failures have slipped past continuous integration

2016-12-29 Thread Eric Rahm
Also note that this has happened before. mccr8 was looking into similar
leak-checking-is-totally-busted-but-nobody-noticed issues a few years ago
in https://bugzilla.mozilla.org/show_bug.cgi?id=1045316

Glad to hear you're looking into end-to-end tests!

-e

On Thu, Dec 29, 2016 at 8:37 AM, Andrew Halberstadt <ahalberst...@mozilla.com> wrote:

> Over the holidays, we noticed that leaks in mochitest and reftest were not
> turning jobs orange, and that the test harnesses had been running in that
> state for quite some time. During this time, several leak-related test
> failures have landed; they can be tracked with this dependency tree:
> https://bugzilla.mozilla.org/showdependencytree.cgi?id=1325148&hide_resolved=0
>
> The issue causing jobs to remain green has been fixed; however, the known
> leak regressions had to be whitelisted to allow the fix to land. So while
> future leak regressions will properly fail, the existing ones (in the
> dependency tree) still need to be fixed. For mochitest, the whitelist can
> be found here:
> https://dxr.mozilla.org/mozilla-central/source/testing/mochitest/runtests.py#2218
>
> Other than that, leak checking is disabled only on Linux crashtests.
>
> Please take a quick look to see if there is a leak in a component you
> could help out with. I will continue to help with triage and bisection for
> the remaining issues until they are all fixed. Also, a big thanks to
> everyone who is currently working on a fix or has already landed one.
>
> Read on only if you are interested in the details.
>
>
>
> *Why wasn't this caught earlier?*
> The short answer is that we do not have adequate testing of our CI.
>
> The problem happened at the intersection between mozharness and the test
> harnesses. Basically, a change in mozharness exposed a latent bug in the
> test harnesses, and it was able to land because it appeared as if nothing
> had gone wrong. Catching errors like this is tricky because regular unit
> tests would not have detected it either. It requires integration tests of
> the CI system as a whole (spanning test harnesses, mozharness and
> buildbot/taskcluster).
>
>
> *How will we prevent this in the future?*
>
> Historically, integration testing our test harnesses has been a hard
> problem. However, with recent work in taskcluster, python tests, and some
> refactoring on the build frontend, I believe there is a path forward that
> will let us stand up this kind of test. I will commit some of my time to
> fixing this and hope to have *something* running by the end of Q1 that
> would have caught this.
>
> I would also like to stand up a test harness designed to test command-line
> applications in CI, which would provide another avenue for writing test
> harness unit and integration tests. Bug 1311991 will track this work.
>
> It is important that developers are able to trust our tests, and when bugs
> like this happen, that trust is eroded. For that I'd like to apologize and
> express my hope that a major test result bug like this never happens again.
> At the very least, we need the ability to add a regression test when a bug
> like this happens in the future.
>
> Thanks for your help and understanding.
> - Andrew
>


Re: Several leak failures have slipped past continuous integration

2016-12-29 Thread Boris Zbarsky

On 12/29/16 11:49 AM, ahalberst...@mozilla.com wrote:

>> Have we done any sort of audit to see whether any other tests got broken
>> by the structured logging changes?  Had we done such an audit after bug
>> 1321127 was fixed?

> Once bug 1321127 was fixed, any other tests that were broken would turn the
> job orange and so would have prevented the fix from landing.


No, what I meant was this: we now have two separate instances of 
harnesses not showing failures because they were relying on the 
TEST-UNEXPECTED-FAIL string parsing, not the structured logger.  Have we 
done any auditing for _other_ cases that are relying on said string parsing?
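
For illustration, a minimal sketch of the two approaches (the field names
below are simplified stand-ins, not the real mozlog schema or mozharness
code):

    import json

    def failed_by_string_scraping(log_text):
        # Fragile: silently misses failures if the magic string is renamed,
        # reformatted, or never emitted -- exactly the failure mode above.
        return "TEST-UNEXPECTED-FAIL" in log_text

    def failed_by_structured_log(log_lines):
        # More robust: each line is a JSON record with explicit fields, so
        # the consumer keys off status values rather than free-form text.
        for line in log_lines:
            record = json.loads(line)
            if (record.get("action") == "test_status"
                    and record.get("status") in ("FAIL", "ERROR", "CRASH")):
                return True
        return False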


-Boris


Re: Several leak failures have slipped past continuous integration

2016-12-29 Thread ahalberstadt
On Thursday, December 29, 2016 at 2:04:28 PM UTC-5, Boris Zbarsky wrote:
> Andrew, thank you for catching this!  This sounds a _lot_ like 
> https://bugzilla.mozilla.org/show_bug.cgi?id=1321127 in terms of cause 
> and effects.

Yes, that bug is a similar failure. In both cases we should have had some sort
of automated testing to catch regressions in mozharness' log parsing. When I
start working on getting this stood up, I will make sure we check test
failures, assertions, leaks and crashes across all our harnesses.

> Have we done any sort of audit to see whether any other tests got broken 
> by the structured logging changes?  Had we done such an audit after bug 
> 1321127 was fixed?

Once bug 1321127 was fixed, any other tests that were broken would turn the job
orange and so would have prevented the fix from landing. It's possible (though
highly unlikely) that a failure was "accidentally fixed" on central but not on
aurora before we detected the problem. I can push the fix to aurora too, just
to be sure.

> Have the bugs in that dependency tree you link above been nominated for 
> tracking for the relevant branches?  I _think_ the leak detection has 
> been broken since sometime in the 51 timeframe at least for some test 
> harnesses (based on the target milestone on bug 1261199, for example), 
> so presumably we need to backport the fix and figure out which leak bugs 
> are happening on aurora and beta, then set tracking flags, etc, right?

Yes, there are a couple of leaks on the 52 branch, and their tracking flags
have been set accordingly. I haven't noticed any on 51 yet, though it's
possible. Joel and I are still working through the list to triage them better,
and as they get fixed we are double-checking that they aren't on aurora/beta
as well.

> 
> Again, thank you for catching this.
> 
> -Boris



Re: Several leak failures have slipped past continuous integration

2016-12-29 Thread Boris Zbarsky

On 12/29/16 8:37 AM, Andrew Halberstadt wrote:

Over the holidays, we noticed that leaks in mochitest and reftest were
not turning jobs orange, and that the test harnesses had been running in
that state for quite some time. During this time, several leak-related
test failures have landed; they can be tracked with this dependency tree:
https://bugzilla.mozilla.org/showdependencytree.cgi?id=1325148&hide_resolved=0


Andrew, thank you for catching this!  This sounds a _lot_ like 
https://bugzilla.mozilla.org/show_bug.cgi?id=1321127 in terms of cause 
and effects.


Have we done any sort of audit to see whether any other tests got broken 
by the structured logging changes?  Had we done such an audit after bug 
1321127 was fixed?


Have the bugs in that dependency tree you link above been nominated for 
tracking for the relevant branches?  I _think_ the leak detection has 
been broken since sometime in the 51 timeframe at least for some test 
harnesses (based on the target milestone on bug 1261199, for example), 
so presumably we need to backport the fix and figure out which leak bugs 
are happening on aurora and beta, then set tracking flags, etc, right?


Again, thank you for catching this.

-Boris



Re: tier-2 Windows clang-cl static analysis builds running on inbound

2016-12-29 Thread Ehsan Akhgari
On 2016-12-29 9:34 AM, Ryan VanderMeulen wrote:
> On 12/23/2016 3:21 PM, Nicholas Nethercote wrote:
>> Hooray!
>>
>> What is the name of the job on Treeherder? I see a "Windows 2012 x64 opt
>> Executed by TaskCluster build-win64-clang/opt tc(Bcl)" job (and some
>> minor
>> variations) but I suspect that's not the new static analysis one?
> 
> There's a tc(S) job on the Windows 2012 lines now. That said, filtering
> Treeherder on "static" doesn't show the jobs, which is unfortunate. Any
> chance we can adjust the job name so it does?

I have already filed bug 1325723 to rename these jobs so that the
trychooser syntax works for them.  The "static" part in the name of the
jobs comes from buildbot, so we need to do something different for these
jobs.  That said, I'm not familiar with the naming conventions here, so if
someone has a better idea of how we can name the jobs for this use case
while also making the trychooser syntax work, please comment there.



Several leak failures have slipped past continuous integration

2016-12-29 Thread Andrew Halberstadt
Over the holidays, we noticed that leaks in mochitest and reftest were
not turning jobs orange, and that the test harnesses had been running in
that state for quite some time. During this time, several leak-related
test failures have landed; they can be tracked with this dependency tree:

https://bugzilla.mozilla.org/showdependencytree.cgi?id=1325148&hide_resolved=0

The issue causing jobs to remain green has been fixed; however, the known
leak regressions had to be whitelisted to allow the fix to land. So while
future leak regressions will properly fail, the existing ones (in the
dependency tree) still need to be fixed. For mochitest, the whitelist can
be found here:

https://dxr.mozilla.org/mozilla-central/source/testing/mochitest/runtests.py#2218

Other than that, leak checking is disabled only on Linux crashtests.
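
A hypothetical sketch (not the actual runtests.py code) of how such a
whitelist lets known, tracked regressions through while still failing any
new leak:

    # Hypothetical illustration only -- the real whitelist lives in
    # testing/mochitest/runtests.py and has its own structure.
    KNOWN_LEAK_THRESHOLDS = {
        # process type -> maximum leaked bytes tolerated
        "default": 0,    # any leak in an unlisted process fails the job
        "tab": 10000,    # placeholder value for a known, tracked regression
    }

    def leak_is_failure(process, leaked_bytes):
        """Return True if this leak should turn the job orange."""
        allowed = KNOWN_LEAK_THRESHOLDS.get(process,
                                            KNOWN_LEAK_THRESHOLDS["default"])
        return leaked_bytes > allowed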

Please take a quick look to see if there is a leak in a component you
could help out with. I will continue to help with triage and bisection
for the remaining issues until they are all fixed. Also, a big thanks to
everyone who is currently working on a fix or has already landed one.


Read on only if you are interested in the details.


_Why wasn't this caught earlier?_

The short answer is that we do not have adequate testing of our CI.

The problem happened at the intersection between mozharness and the test
harnesses. Basically, a change in mozharness exposed a latent bug in the
test harnesses, and it was able to land because it appeared as if nothing
had gone wrong. Catching errors like this is tricky because regular unit
tests would not have detected it either. It requires integration tests of
the CI system as a whole (spanning test harnesses, mozharness and
buildbot/taskcluster).
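
A minimal sketch of the kind of end-to-end check meant here, with an
entirely made-up harness entry point (the real invocation would go through
mozharness and taskcluster): run a harness against a test that deliberately
leaks and assert that the run is reported as failed.

    import subprocess

    def test_deliberate_leak_fails_the_run():
        # Hypothetical entry point and flag; the point is only that the
        # harness is driven the same way CI drives it, end to end.
        result = subprocess.run(
            ["python", "harness_runner.py",
             "--test", "deliberately_leaking_test"],
            capture_output=True, text=True,
        )
        # A leak must produce a failing exit status; a zero exit here is
        # exactly the class of bug described in this thread.
        assert result.returncode != 0, "leak did not fail the run"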



_How will we prevent this in the future?_

Historically, integration testing our test harnesses has been a hard
problem. However, with recent work in taskcluster, python tests, and some
refactoring on the build frontend, I believe there is a path forward that
will let us stand up this kind of test. I will commit some of my time to
fixing this and hope to have /something/ running by the end of Q1 that
would have caught this.


I would also like to stand up a test harness designed to test command-line
applications in CI, which would provide another avenue for writing test
harness unit and integration tests. Bug 1311991 will track this work.


It is important that developers are able to trust our tests, and when
bugs like this happen, that trust is eroded. For that I'd like to
apologize and express my hope that a major test result bug like this
never happens again. At the very least, we need the ability to add a
regression test when a bug like this happens in the future.


Thanks for your help and understanding.
- Andrew


Re: Intent to Implement: adding vector effects non-scaling-size, non-rotation and fixed-position to SVG

2016-12-29 Thread Jeff Muizelaar
I'm concerned about the complexity this will add to the SVG implementation
as we're looking to transition to WebRender.

Can the desired effects be achieved by interleaving HTML and SVG content
today? For example, it seems like the introductory-notes example could just
use a separate SVG element with fixed positioning instead of needing to
build fixed-position into SVG.

Do other browsers intend to implement these features?

-Jeff

On Sun, Dec 25, 2016 at 9:47 PM, Ramin  wrote:

> Intent to Implement: adding vector effects  non-scaling-size, non-rotation
> and fixed-position to SVG
>
> Contact emails
> 
> te-fuk...@kddi-tech.com, g-ra...@kddi-tech.com, sa-...@kddi-tech.com
>
> Summary
> 
> To offer vector effects for special coordinate transformations and graphic
> drawing, as described in the spec link below, SVG Tiny 1.2 introduced the
> vector-effect property. Although SVG Tiny 1.2 introduced only the
> non-scaling-stroke behavior, SVG 2 introduces a number of additional effects.
>
> We now intend to implement non-scaling-size, non-rotation and
> fixed-position, as well as their combinations, in Gecko/SVG.
>
> Motivation
> 
> Many SVG content providers are interested in letting the outline of an
> object keep its original size, or in keeping the position of an object
> fixed, regardless of the transforms applied to it. For example, in a map
> with a 2px-wide line representing roads, it is of interest to keep the
> roads 2px wide even when the user zooms into the map, or to keep
> introductory notes fixed on a chart in which panning is possible.
> Therefore, there is a strong need for supporting these features in browsers.
>
>
> Spec(Link to standard)
> 
> https://svgwg.org/svg2-draft/coords.html#VectorEffects
>
> Platform coverage
> 
> Starting from Windows and Linux.
>
> Bug
> 
> https://bugzilla.mozilla.org/show_bug.cgi?id=1318208
>
> Estimated or target release
> 
> 2017/1/30
>
> Requesting approval to ship?
> 
> No.  Implementation is expected to take some time.
>
> Tests
> 
> Coming soon.


Re: tier-2 Windows clang-cl static analysis builds running on inbound

2016-12-29 Thread Ryan VanderMeulen

On 12/23/2016 3:21 PM, Nicholas Nethercote wrote:

Hooray!

What is the name of the job on Treeherder? I see a "Windows 2012 x64 opt
Executed by TaskCluster build-win64-clang/opt tc(Bcl)" job (and some minor
variations) but I suspect that's not the new static analysis one?


There's a tc(S) job on the Windows 2012 lines now. That said, filtering 
Treeherder on "static" doesn't show the jobs, which is unfortunate. Any 
chance we can adjust the job name so it does?


-Ryan