+qa-b2g

TL;DR: Graphics smoketests are not well covered because of the tools we
use. Also, automated end-to-end tests are costly to run and flaky. We
probably need to focus more on unit tests, and to know where to add those
tests, we need coverage reports.


From the QA standpoint, we used to automate as many smoketests as possible
in Python. However, since marionette-js became ready to run on device, our
main effort has been to migrate what we have in Gip to Gij.

In other words, our team is currently focused on:

   - Making sure marionette-js runs on device the way we need,
   - Migrating the tests that can be run on Treeherder from Gip to Gij,
   - Maintaining the Gip suite while the tests are being migrated,
   - Triaging the flaky tests,
   - Performing manual testing on recently found regressions.

Another issue I'd like to call out regarding the smoke testing of graphics:
most of our automation relies on Marionette. Since it essentially works
through the DOM, it usually doesn't catch rendering regressions; that's how
we end up finding them in manual testing. To mitigate that, No-Jun created
the imagecompare suite, which runs on a daily basis. I don't know how many
tests we have in there.
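
To illustrate the difference (just a sketch: the selector, the reference
shot, and the pixelDiff() helper below are hypothetical, not the actual
imagecompare API), a DOM-level assertion can pass while rendering is
broken, whereas a screenshot comparison catches it:

    // Sketch only: pixelDiff() and the reference shot are hypothetical,
    // not part of imagecompare or marionette-js-client.
    var assert = require('assert');

    marionette('homescreen rendering', function() {
      var client = marionette.client();

      test('DOM check: passes even when rendering is broken', function() {
        var icon = client.findElement('.icon[data-identifier="phone"]');
        assert.ok(icon.displayed());  // asserts on the DOM, not on pixels
      });

      test('pixel check: catches a graphics regression', function() {
        var shot = client.screenshot();  // base64 PNG of the current frame
        // Compare against a known-good reference; threshold is arbitrary.
        assert.ok(pixelDiff(shot, 'refs/homescreen.png') < 0.01);
      });
    });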

On another point, Greg is right about one thing: most of our end-to-end
tests (that is to say, the ones running on device) are unreliable. As said
above, we're triaging these tests; we're on the right track to get at least
two suites: one for the tests known to be 100% reliable, and one for the
flaky ones.

In parallel, since an end-to-end test is costly to run (currently 1-2
minutes per test on a Flame) and since it's hard to craft a totally
reliable one, the best way to prevent regressions remains to write unit
tests [1]. It's currently hard to know which areas are unit tested. One way
to start could be to automatically generate coverage reports on each push.
I know Sylvestre Ledru started a discussion along these lines on the
Firefox release-drivers mailing list, and I do think we need a tool like
this for both Gecko and Gaia. At first glance, and correct me if I'm wrong,
coverage analysis should be achievable since we use mocha to run the tests.
We could use a tool like istanbul [2] and plug it into Coveralls [3]. From
there, devs and QA would have an overview of which areas need to be tested
(automatically or manually).
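
As a rough sketch of what that could look like (the paths and options
below are assumptions, not a tested setup for Gaia), istanbul can wrap the
existing mocha invocation and produce an lcov report that the coveralls
client then uploads from CI:

    # Run the existing mocha suite under istanbul instead of calling mocha
    # directly (paths are illustrative; Gaia's real test targets live in
    # the Makefile):
    ./node_modules/.bin/istanbul cover ./node_modules/.bin/_mocha -- \
        --reporter spec --recursive test/unit

    # istanbul writes coverage/lcov.info; the coveralls npm client uploads
    # it, picking up the repo token from the CI environment:
    cat coverage/lcov.info | ./node_modules/coveralls/bin/coveralls.js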

What do you all think?

Johan

[1]
http://googletesting.blogspot.fr/2015/04/just-say-no-to-more-end-to-end-tests.html
[2] https://github.com/gotwarlost/istanbul
[3] http://coveralls.io/


On Mon, Jun 15, 2015 at 4:17 AM, Greg Weng <snowma...@gmail.com> wrote:

> I think the problem is that our infrastructure or test methods are not
> powerful enough to run those tests in automation. For example, one
> smoketest I've broken is about the LockScreen not updating the time
> correctly if the user pulled the battery before rebooting (Bug 1119097).
> In such cases, I think the key is that our automation can't cover some
> hardware-related cases like the battery or RIL, although I've heard some
> work, like allowing simulators to dial each other, is actually ongoing.
> However, I believe this is tough work for our Gecko and Gonk teams, since
> the latest news I have is that they are planning to make a new
> abstraction on top of HAL and to make it more decoupled from the real
> devices. And some teams even have no plan for that yet. Maybe we, I mean
> MoCo/MoFo, should consider this one of the highest-priority issues, since
> being tricked by inaccurate CI results (compared to real devices) and
> waiting for patches to become stable enough always tortures us (the Gaia
> team). And the idea (to make a new abstraction for testing and porting
> purposes) had already been mentioned at least a year ago, as far as I know.
>
> By the way, for regressions I'm used to bisecting to find the exact
> broken patch. However, from the information I've got, for Gecko or the
> whole of B2G bisecting is impractical given the build time. I wonder if
> we could, or need to, ease that pain so that finding the actual broken
> part becomes easier and more automatic. I believe this would help.
> On Jun 15, 2015 at 3:55 AM, "Kartikaya Gupta" <kgu...@mozilla.com> wrote:
>
>> Is there any effort under way to make the smoketests automated and run
>> as part of our regular automation testing? The B2G QA team does a
>> great job of identifying regressions and tracking down the regressing
>> changeset, but AFAIK this is an entirely manual process that happens
>> after the change has landed. Ideally we should catch this on try
>> pushes or on landing though, and for that we need to automate the
>> smoketests.
>>
>> There have been a lot of complaints (and rightly so) about all sorts
>> of B2G-breaking changesets landing. I myself have landed quite a few
>> of them. I think it's unrealistic to expect every Gecko developer to
>> run through all of the smoketests manually for every change they want
>> to make (even just for the main devices/configurations we support).
>> It's also unrealistic to expect them to reliably identify "high risk"
>> changes for explicit pre-landing QA testing, because even small
>> changes can break things badly on B2G given the variety of
>> configurations we have there.
>>
>> I think the only reasonable long-term solution is to automate the
>> smoketests, and I would like to know if there's any planned or
>> in-progress effort to do that.
>>
>> Cheers,
>> kats