Re: Split Gecko in standalone fuzzing-friendly programs.

2016-03-10 Thread Kurt Roeckx

On 2016-03-09 22:17, Boris Zbarsky wrote:
> On 3/9/16 3:47 PM, decoder...@googlemail.com wrote:
>> Actually no. I adapted our gtests in less than an hour.
>
> Does this have to do with the set of things they're testing, or the
> style the tests are written in?


I think the point is that some tests make it easy to set up fuzzing
based on them.  I think what makes it easy is that the test has some input
data that it puts through some API.  For instance, read some JS and either
just parse it or even try to run it.  Or read some image file and decode
it, maybe even calling the needed functions to display it.


Clearly, anywhere you get (untrusted) data from somewhere, it
should be relatively easy to set up fuzzing for it, but maybe it should
start with writing a test suite that takes that input and does something
with it, and probably tries to expose it to as many things as possible,
as the real application would.


So maybe the question isn't about having standalone programs, but more
about a test suite that tries to deal with input data the way a program
that uses the API would.
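
To make that concrete, here is a minimal sketch of such a test, written in
Python only for brevity (in Gecko it would be a compiled gtest). The names
`run_one` and `run_corpus` are made up for illustration, and the standard
`json` parser is just a stand-in for "some API that takes untrusted input":

```python
import json


def run_one(data: bytes) -> str:
    """Push one untrusted input through the API under test.

    Returns "ok" if the input parsed and "rejected" if the parser refused
    it cleanly. Any other exception propagates and counts as a finding.
    """
    try:
        text = data.decode("utf-8")
        json.loads(text)
    except (UnicodeDecodeError, json.JSONDecodeError):
        # A clean rejection of bad input is expected behavior, not a bug.
        return "rejected"
    return "ok"


def run_corpus(inputs) -> dict:
    """Run every input in a corpus and count the outcomes."""
    counts = {}
    for data in inputs:
        outcome = run_one(data)
        counts[outcome] = counts.get(outcome, 0) + 1
    return counts
```

A fuzzer only has to supply the bytes passed to `run_one`; everything else
is the same code path the regular test suite already exercises.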



Kurt

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Split Gecko in standalone fuzzing-friendly programs.

2016-03-09 Thread Andrew McCreight
On Wed, Mar 9, 2016 at 12:17 PM, Boris Zbarsky  wrote:

> Just to satisfy my curiosity, what is AFL?
>

AFL is American Fuzzy Lop, a fuzzer that uses a combination of compiled-in
code coverage and genetic algorithms (http://lcamtuf.coredump.cx/afl/). It
has found a ton of errors in all sorts of programs, but it requires pretty
deterministic behavior (e.g. so it can implicitly learn that tweaking the nth
bit will cause a different branch to be taken).
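
As a rough illustration of the coverage-feedback idea (this is a toy loop,
not AFL's actual algorithm), the sketch below keeps any mutated input that
reaches a branch not seen before. `target`, `mutate` and `fuzz` are
hypothetical names, and the "coverage" is simply a set of labels returned by
the toy target rather than compiled-in instrumentation:

```python
import random


def target(data: bytes) -> set:
    """Toy program under test: returns the set of branch labels it took."""
    branches = {"entry"}
    if len(data) > 0 and data[0] == ord("F"):
        branches.add("F")
        if len(data) > 1 and data[1] == ord("U"):
            branches.add("FU")
            if len(data) > 2 and data[2] == ord("Z"):
                branches.add("FUZ")
    return branches


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Replace one byte of the input with a random value."""
    if not data:
        return data
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seed_input: bytes, iterations: int, seed: int = 0):
    """Keep every mutated input that reaches a branch not seen before."""
    rng = random.Random(seed)  # fixed seed keeps the run repeatable
    corpus = [seed_input]
    seen = set(target(seed_input))
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        new_branches = target(candidate) - seen
        if new_branches:
            seen |= new_branches
            corpus.append(candidate)
    return corpus, seen
```

The loop only makes progress because `target` gives the same answer for the
same bytes every time; if coverage changed from run to run, the fuzzer could
not tell which byte tweak opened the new branch.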



>
>> but that still doesn't solve the problem that people have to write the
>> necessary code that we can fuzz then.
>>
>
> OK.  This is a problem, certainly, and pretty independent of both the
> "split Gecko" thing and the existence of shells, right?
>
> What are the necessary qualities for things you can fuzz?
>
> -Boris
>


Re: Split Gecko in standalone fuzzing-friendly programs.

2016-03-09 Thread twsmith
On Wednesday, March 9, 2016 at 12:47:08 PM UTC-8, decod...@googlemail.com wrote:
> > 
> > > the sample tests (xpcshell-tests) are extremely complicated to adapt
> > 
> > That seems like it would be a problem in any new thing too, right?
> 
> Actually no. I adapted our gtests in less than an hour.
> 
> > 
> > > and we can't easily use it with AFL.
> > 
> > Just to satisfy my curiosity, what is AFL?
> 
> http://lcamtuf.coredump.cx/afl/
> 
> > 
> > > but that still doesn't solve the problem that people have to write the 
> > > necessary code that we can fuzz then.
> > 
> > OK.  This is a problem, certainly, and pretty independent of both the 
> > "split Gecko" thing and the existence of shells, right?
> 
> Not really no. Because some shells and tests we have are very straightforward 
> to use and we can figure it out ourselves. xpcshell is not such an example.
> 
> > 
> > What are the necessary qualities for things you can fuzz?
> 
> 
> It depends on the type of fuzzing. Let's stick to AFL:
> 
> - Program is easy to start (doesn't need profiles or long initialization) and 
> can be packaged
> - Has AFL persistent mode support (requires support on C++ level)
> - Exercises the targeted feature in a similar way compared to how Firefox 
> would do it
> - Optionally has some extra testing features (e.g. gczeal, ion-eager, 
> extra-checks for the JS shell) that make bug finding easier
> - Can be compiled with all sanitizer types (although MSan is not going to 
> work for some stuff even in shells)
> 
> 
> That's just a dump out of my head, might be missing some stuff.

More qualities that benefit fuzzing:
- Fast
- Deterministic (very useful for feedback-driven fuzzing)
- Automatable, easily run from a script from the command line
- Distributable, in parallel and side by side. So ideally statically linked,
and multiple instances can run at the same time in the same environment.
- Lots of assertions
- Targeted, only meant to test one area of code (very useful for
feedback-driven fuzzing)
- Cross-platform, unless the functionality is exactly the same, in which
case target Linux
- Accepts data from a file or stdin; having to depend on web servers etc.
does complicate things, but is sometimes required
- Simple build system, now I feel like I'm just stating the obvious :)
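
For the "file or stdin" point, a minimal sketch of the entry point such a
program could have. It is in Python for brevity; `read_input` and `run` are
hypothetical names, and treating "-" or a missing argument as stdin is an
assumed convention, not something an existing shell is known to do:

```python
import sys
from typing import BinaryIO, Callable, Optional, Sequence


def read_input(path: Optional[str], stdin: Optional[BinaryIO] = None) -> bytes:
    """Read the test case from a file, or from stdin for None or "-"."""
    if path is None or path == "-":
        stream = stdin if stdin is not None else sys.stdin.buffer
        return stream.read()
    with open(path, "rb") as handle:
        return handle.read()


def run(argv: Sequence[str], target: Callable[[bytes], object],
        stdin: Optional[BinaryIO] = None) -> int:
    """Feed exactly one input to the target and return an exit status."""
    data = read_input(argv[1] if len(argv) > 1 else None, stdin)
    target(data)  # an uncaught exception here is the "crash" a fuzzer sees
    return 0
```

A script could then call something like `sys.exit(run(sys.argv, decode))`,
where `decode` stands for whatever function is being fuzzed, and the same
binary works both with a file path and with piped input.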

 


Re: Split Gecko in standalone fuzzing-friendly programs.

2016-03-09 Thread David Rajchenbach-Teller
On 09/03/16 22:17, Bobby Holley wrote:
> I think splitting Gecko into multiple repos is an anti-goal, because we
> derive enormous productivity benefits from using a mono-repo.
> 
> If you're serious about proposing that, I would suggest starting a new
> thread to avoid derailing this one.

I'm actually not suggesting it, just extrapolating, so yeah, let's not
derail the thread.


Re: Split Gecko in standalone fuzzing-friendly programs.

2016-03-09 Thread Bobby Holley
On Wed, Mar 9, 2016 at 12:58 PM, David Rajchenbach-Teller <
dtel...@mozilla.com> wrote:

> I'm assuming that, for the sake of this discussion, Gecko ==
> mozilla-central minus stuff that can be built and distributed on its
> own. In particular, most of toolkit/ is covered by this definition.
>
> Just looking at things I use/work on, I believe that we could relatively
> easily split in distinct modules, with distinct repos and separate
> fuzzing/testing:
> - the crash reporter;
> - the crash manager;
> - the shutdown terminator;
> - AsyncShutdown;
> - Telemetry;
> - OS.File;
> - Task.jsm.
>

Modularizing these things would be great, but I don't see anything on this
list that I think we would realistically fuzz given our current resourcing.


>
>
> Certainly, this is just a few ‱ of Firefox, but, hey, you need to start
> somewhere. I believe that chipping away at the nightmare of implicit
> dependencies that Gecko has become would be a good idea. Fwiw, 90% of my
> work on (Async)Shutdown is finding out which implicit and undocumented
> hypotheses about execution order have been introduced in the code and
> shaving as much yak as needed to make them explicit, well-behaved and
> not-hangy/crashy.
>
> If splitting Gecko Cargo-style, across repos with clear dependencies, is
> the best way to do this, I'm happy to head that way.
>

I think splitting Gecko into multiple repos is an anti-goal, because we
derive enormous productivity benefits from using a mono-repo.

If you're serious about proposing that, I would suggest starting a new
thread to avoid derailing this one.


> More generally, I have recently come to realize how scared I am of my
> own code, in particular JS-based. I know how it is supposed to behave,
> but I have no idea how it is going to behave if, say, someone has the
> brilliant idea of throwing an Error that cannot be converted to a String
> (yes, this has happened to me, much fun ensued), or stopping a process
> while I'm communicating to it, etc.
>
> If someone finds a way to introduce fuzz-testing that can simulate all
> of that, I'm all for it.
>
>
> Cheers,
>  David
>
> On 09/03/16 19:14, Bobby Holley wrote:
> > Can you elaborate on which Gecko components you're hoping to fuzz
> > separately? A lot of the core is pretty heavily-intertwined, so I'm pretty
> > skeptical that we'd ever be able to separate out DOM, style, and layout
> > from each other (for example). There are basically two barriers:
> >
> > (1) These components are enormous, and were not built to be very modular.
> > Any such efforts would require a huge amount of engineering resources,
> > which we would probably not spend just to make the component fuzzable.
> > (2) The performance cost of adding an abstraction layer between
> > tightly-coupled components in C++11 would probably be prohibitive (the
> > situation is different for Rust/Servo because of Traits - we could
> > potentially do this with C++ Concepts/Modules in around a decade).
> >
> > This is all to say that I think a general call to "modularize Gecko" isn't
> > really helpful. But if there are specific leaf-y components that you want
> > to fuzz separately but can't, that might be a good starting point.
> >
> > On Wed, Mar 9, 2016 at 9:54 AM,  wrote:
> >
> >> On Wednesday, March 9, 2016 at 9:38:55 AM UTC-8, Nicolas B. Pierron wrote:
> >>> This discussion is a follow-up discussion to some emails sent privately by
> >>> accident.
> >>>
> >>> If you have not followed, I will quote David Bryant:
> >>>  > Improving release quality is one of the three fundamental goals Platform
> >>>  > Engineering committed to this year. To this end, lmandel built a Bugzilla
> >>>  > dashboard that allows us to track regressions found in any given release
> >>>  > cycle. This dashboard [...] can
> >>>  > also be found at: http://mozilla.github.io/releasehealth/
> >>>
> >>> To David's email, I answered the following:
> >>>
> >>> --
> >>> tl;dr: If we want to improve the quality of our products we should
> >>> split Gecko in standalone programs which are fuzzing-friendly.
> >>>
> >>> One thing which strikes me is the ratio of regressions per component
> >>> that we have for each version, and moreover who the people
> >>> opening these bugs are:
> >>>   - Release:
> >>>
> >>
> https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox44&f2=OP&f3=cf_status_firefox43&f4=cf_status_firefox43&f5=cf_status_firefox43&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898533&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
> >>>   - Beta:
> >>>
> >>
> https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox45&f2=OP&f3=cf_status_firefox44&f4=cf_status_firefox44&f5=cf_status_firefox44&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwo

Re: Split Gecko in standalone fuzzing-friendly programs.

2016-03-09 Thread Boris Zbarsky

On 3/9/16 3:47 PM, decoder...@googlemail.com wrote:
> Actually no. I adapted our gtests in less than an hour.

Does this have to do with the set of things they're testing, or the
style the tests are written in?

> - Program is easy to start (doesn't need profiles or long initialization) and
> can be packaged
> - Has AFL persistent mode support (requires support on C++ level)
> - Exercises the targeted feature in a similar way compared to how Firefox would
> do it
> - Optionally has some extra testing features (e.g. gczeal, ion-eager,
> extra-checks for the JS shell) that make bug finding easier
> - Can be compiled with all sanitizer types (although MSan is not going to work
> for some stuff even in shells)


Thank you, that's helpful.

-Boris


Re: Split Gecko in standalone fuzzing-friendly programs.

2016-03-09 Thread David Rajchenbach-Teller
I'm assuming that, for the sake of this discussion, Gecko ==
mozilla-central minus stuff that can be built and distributed on its
own. In particular, most of toolkit/ is covered by this definition.

Just looking at things I use/work on, I believe that we could relatively
easily split in distinct modules, with distinct repos and separate
fuzzing/testing:
- the crash reporter;
- the crash manager;
- the shutdown terminator;
- AsyncShutdown;
- Telemetry;
- OS.File;
- Task.jsm.


Certainly, this is just a few ‱ of Firefox, but, hey, you need to start
somewhere. I believe that chipping away at the nightmare of implicit
dependencies that Gecko has become would be a good idea. Fwiw, 90% of my
work on (Async)Shutdown is finding out which implicit and undocumented
hypotheses about execution order have been introduced in the code and
shaving as much yak as needed to make them explicit, well-behaved and
not-hangy/crashy.

If splitting Gecko Cargo-style, across repos with clear dependencies, is
the best way to do this, I'm happy to head that way.

More generally, I have recently come to realize how scared I am of my
own code, in particular JS-based. I know how it is supposed to behave,
but I have no idea how it is going to behave if, say, someone has the
brilliant idea of throwing an Error that cannot be converted to a String
(yes, this has happened to me, much fun ensued), or stopping a process
while I'm communicating to it, etc.

If someone finds a way to introduce fuzz-testing that can simulate all
of that, I'm all for it.


Cheers,
 David

On 09/03/16 19:14, Bobby Holley wrote:
> Can you elaborate on which Gecko components you're hoping to fuzz
> separately? A lot of the core is pretty heavily-intertwined, so I'm pretty
> skeptical that we'd ever be able to separate out DOM, style, and layout
> from each other (for example). There are basically two barriers:
> 
> (1) These components are enormous, and were not built to be very modular.
> Any such efforts would require a huge amount of engineering resources,
> which we would probably not spend just to make the component fuzzable.
> (2) The performance cost of adding an abstraction layer between
> tightly-coupled components in C++11 would probably be prohibitive (the
> situation is different for Rust/Servo because of Traits - we could
> potentially do this with C++ Concepts/Modules in around a decade).
> 
> This is all to say that I think a general call to "modularize Gecko" isn't
> really helpful. But if there are specific leaf-y components that you want
> to fuzz separately but can't, that might be a good starting point.
> 
> On Wed, Mar 9, 2016 at 9:54 AM,  wrote:
> 
>> On Wednesday, March 9, 2016 at 9:38:55 AM UTC-8, Nicolas B. Pierron wrote:
>>> This discussion is a follow-up discussion to some emails sent privately by
>>> accident.
>>>
>>> If you have not followed, I will quote David Bryant:
>>>  > Improving release quality is one of the three fundamental goals Platform
>>>  > Engineering committed to this year. To this end, lmandel built a Bugzilla
>>>  > dashboard that allows us to track regressions found in any given release
>>>  > cycle. This dashboard [...] can
>>>  > also be found at: http://mozilla.github.io/releasehealth/
>>>
>>> To David's email, I answered the following:
>>>
>>> --
>>> tl;dr: If we want to improve the quality of our products we should
>>> split Gecko in standalone programs which are fuzzing-friendly.
>>>
>>> One thing which strikes me is the ratio of regressions per component
>>> that we have for each version, and moreover who the people
>>> opening these bugs are:
>>>   - Release:
>>>
>> https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox44&f2=OP&f3=cf_status_firefox43&f4=cf_status_firefox43&f5=cf_status_firefox43&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898533&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
>>>   - Beta:
>>>
>> https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox45&f2=OP&f3=cf_status_firefox44&f4=cf_status_firefox44&f5=cf_status_firefox44&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898534&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
>>>   - Aurora:
>>>
>> https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox46&f2=OP&f3=cf_status_firefox45&f4=cf_status_firefox45&f5=cf_status_firefox45&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898536&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
>>>   - Nightly:
>>>
>> https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2

Re: Split Gecko in standalone fuzzing-friendly programs.

2016-03-09 Thread decoder . oh
> 
> > the sample tests (xpcshell-tests) are extremely complicated to adapt
> 
> That seems like it would be a problem in any new thing too, right?

Actually no. I adapted our gtests in less than an hour.

> 
> > and we can't easily use it with AFL.
> 
> Just to satisfy my curiosity, what is AFL?

http://lcamtuf.coredump.cx/afl/

> 
> > but that still doesn't solve the problem that people have to write the 
> > necessary code that we can fuzz then.
> 
> OK.  This is a problem, certainly, and pretty independent of both the 
> "split Gecko" thing and the existence of shells, right?

Not really no. Because some shells and tests we have are very straightforward 
to use and we can figure it out ourselves. xpcshell is not such an example.

> 
> What are the necessary qualities for things you can fuzz?


It depends on the type of fuzzing. Let's stick to AFL:

- Program is easy to start (doesn't need profiles or long initialization) and 
can be packaged
- Has AFL persistent mode support (requires support on C++ level)
- Exercises the targeted feature in a similar way compared to how Firefox would 
do it
- Optionally has some extra testing features (e.g. gczeal, ion-eager, 
extra-checks for the JS shell) that make bug finding easier
- Can be compiled with all sanitizer types (although MSan is not going to work 
for some stuff even in shells)
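
To illustrate what persistent mode asks of a target, here is a conceptual
sketch only. With AFL's afl-clang-fast the real thing is a
`while (__AFL_LOOP(n))` loop in C/C++; the Python below just mirrors the
shape, and `Decoder` and `persistent_loop` are hypothetical names:

```python
class Decoder:
    """Hypothetical target with per-input state that must be reset."""

    def __init__(self):
        # Imagine expensive one-time initialization happening here.
        self.bytes_seen = 0

    def reset(self):
        """Drop all state left over from the previous input."""
        self.bytes_seen = 0

    def feed(self, data: bytes) -> int:
        self.bytes_seen += len(data)
        return self.bytes_seen


def persistent_loop(inputs, max_iterations: int = 1000):
    """Process many inputs in one process, paying for setup only once."""
    decoder = Decoder()  # initialization cost is paid once per process
    results = []
    for count, data in enumerate(inputs):
        if count >= max_iterations:
            break  # give up the process after a bounded number of runs
        decoder.reset()  # without this, one input would influence the next
        results.append(decoder.feed(data))
    return results
```

The reset step is the part that needs support in the target itself: if state
leaks between iterations, results stop being reproducible from a single
input.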


That's just a dump out of my head, might be missing some stuff.


Re: Split Gecko in standalone fuzzing-friendly programs.

2016-03-09 Thread Boris Zbarsky

On 3/9/16 3:06 PM, decoder...@googlemail.com wrote:
> Not at all. xpcshell is not very useful for fuzzing. It is slow

OK, fair.

> the sample tests (xpcshell-tests) are extremely complicated to adapt

That seems like it would be a problem in any new thing too, right?

> and we can't easily use it with AFL.

Just to satisfy my curiosity, what is AFL?

> but that still doesn't solve the problem that people have to write the
> necessary code that we can fuzz then.

OK.  This is a problem, certainly, and pretty independent of both the
"split Gecko" thing and the existence of shells, right?

What are the necessary qualities for things you can fuzz?

-Boris


Re: Split Gecko in standalone fuzzing-friendly programs.

2016-03-09 Thread decoder . oh

> To what extent are we fuzzing things like our url parser and other necko 
> bits via our existing shell (xpcshell) that exposes all that stuff?
> 
> -Boris

Not at all. xpcshell is not very useful for fuzzing. It is slow, the sample 
tests (xpcshell-tests) are extremely complicated to adapt and we can't easily 
use it with AFL.

As discussed in #developers, we could maybe have xpcshell work with AFL 
persistent mode (although it would be complicated I assume), but that still 
doesn't solve the problem that people have to write the necessary code that we 
can fuzz then. xpcshell-tests are not that code (I did some fuzzing with an 
xpcshell-test when we added the new X509 validation code and it was horrible).


Right now, the best approach to me seems to be having gtests and other compiled
tests. They have the advantage that they are relatively simple to adapt (I was
able to transform some tests for image and mp3 decoding in less than 30 minutes)
but they often still reflect what we do in the browser. For example, for image
decoding I saw we have multi-chunk decoding. If we just fuzzed libjpeg or libpng
we would miss this part.
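
A small sketch of why the chunking matters, using Python's standard `zlib`
streaming decompressor as a stand-in for an image decoder (this is not the
Gecko image code; `decode_in_chunks` and `check_chunking` are hypothetical
names). The harness feeds the same bytes in different chunk sizes and checks
that the outcome does not depend on where the chunks were cut:

```python
import zlib


def decode_in_chunks(data: bytes, chunk_size: int) -> bytes:
    """Feed the input to a streaming decoder chunk_size bytes at a time."""
    decoder = zlib.decompressobj()
    out = bytearray()
    for start in range(0, len(data), chunk_size):
        out += decoder.decompress(data[start:start + chunk_size])
    out += decoder.flush()
    return bytes(out)


def check_chunking(data: bytes, chunk_sizes=(1, 3, 64)) -> bool:
    """Return True if every chunking gives the same outcome as one shot."""

    def outcome(chunk_size):
        try:
            return ("ok", decode_in_chunks(data, chunk_size))
        except zlib.error:
            return ("error", b"")

    reference = outcome(max(len(data), 1))  # whole input as a single chunk
    return all(outcome(size) == reference for size in chunk_sizes)
```

Fuzzing only the one-shot call would never exercise the code that carries
decoder state across chunk boundaries, which is the part the browser
actually relies on.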

As for speed, on my gtest I get 500-1000 iterations per second on the GIF 
decoder for example. That's a good speed for fuzzing and very useful.

The only thing that I haven't looked into yet is how I can package the 
resulting tests easily so they can be sent to fuzzing machines, but that's 
solvable.


Re: Split Gecko in standalone fuzzing-friendly programs.

2016-03-09 Thread Boris Zbarsky

On 3/9/16 12:54 PM, twsm...@mozilla.com wrote:
> I think Nicolas is right on the mark! JS shell is a good example of using a
> shell for fuzzing and I think media and graphics could also get a good deal out
> of a shell.


To what extent are we fuzzing things like our url parser and other necko 
bits via our existing shell (xpcshell) that exposes all that stuff?


-Boris


Re: Split Gecko in standalone fuzzing-friendly programs.

2016-03-09 Thread Bobby Holley
On Wed, Mar 9, 2016 at 10:45 AM,  wrote:

> This seems to be the general response whenever this topic is brought up,
> with good reason. I don't think modularizing Gecko just for the sake of it
> makes much sense. What I think the takeaway from this should be is: when
> developing a new component or maintaining an old one, keep in mind that
> making it accessible to fuzzers is valuable. In the short term this may
> mean making gtests that we can leverage.
>
> Decoder has started work on support for gtests to help use it as an
> alternative to the full browser. I believe he does have a couple of simple
> fuzzers working atm; however, I'm not sure what they have been finding. I
> have tried myself with varying levels of success.
>
> I think a good starting point for this work would be media and image
> processing (codecs, parsers, demuxers, decoders, etc.). We (fuzzing team)
> have already talked to the media team about this.


FWIW, the Servo team is interested in using the Gecko media stack, so there
is already some momentum behind modularizing that. I agree that it makes
sense for large and complex leaf subsystems with minimal coupling to the
rest of Gecko (media is a good example, possibly the best).


> Christoph and I recently fuzzed the new BMP decoder and had many
> situations where we hit issues that were not reproducible that would have
> been easily reproducible in a shell. The issues were in the image
> processing code but obscured by the rest of the system. The main issue in
> these cases was threading/timing.
>
> Maybe others have ideas for how we can make code more fuzz-able?
>
>
> On Wednesday, March 9, 2016 at 10:15:16 AM UTC-8, Bobby Holley wrote:
> > Can you elaborate on which Gecko components you're hoping to fuzz
> > separately? A lot of the core is pretty heavily-intertwined, so I'm pretty
> > skeptical that we'd ever be able to separate out DOM, style, and layout
> > from each other (for example). There are basically two barriers:
> > (1) These components are enormous, and were not built to be very modular.
> > Any such efforts would require a huge amount of engineering resources,
> > which we would probably not spend just to make the component fuzzable.
> > (2) The performance cost of adding an abstraction layer between
> > tightly-coupled components in C++11 would probably be prohibitive (the
> > situation is different for Rust/Servo because of Traits - we could
> > potentially do this with C++ Concepts/Modules in around a decade).
> >
> > This is all to say that I think a general call to "modularize Gecko" isn't
> > really helpful. But if there are specific leaf-y components that you want
> > to fuzz separately but can't, that might be a good starting point.
> >
> > On Wed, Mar 9, 2016 at 9:54 AM,  wrote:
> >
> > > On Wednesday, March 9, 2016 at 9:38:55 AM UTC-8, Nicolas B. Pierron wrote:
> > > > This discussion is a follow-up discussion to some emails sent privately by
> > > > accident.
> > > >
> > > > If you have not followed, I will quote David Bryant:
> > > >  > Improving release quality is one of the three fundamental goals Platform
> > > >  > Engineering committed to this year. To this end, lmandel built a Bugzilla
> > > >  > dashboard that allows us to track regressions found in any given release
> > > >  > cycle. This dashboard [...] can
> > > >  > also be found at: http://mozilla.github.io/releasehealth/
> > > >
> > > > To David's email, I answered the following:
> > > >
> > > > --
> > > > tl;dr: If we want to improve the quality of our products we should
> > > > split Gecko in standalone programs which are fuzzing-friendly.
> > > >
> > > > One thing which strikes me is the ratio of regressions per component
> > > > that we have for each version, and moreover who the people
> > > > opening these bugs are:
> > > >   - Release:
> > > >
> > >
> https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox44&f2=OP&f3=cf_status_firefox43&f4=cf_status_firefox43&f5=cf_status_firefox43&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898533&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
> > > >   - Beta:
> > > >
> > >
> https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox45&f2=OP&f3=cf_status_firefox44&f4=cf_status_firefox44&f5=cf_status_firefox44&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898534&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
> > > >   - Aurora:
> > > >
> > >
> https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox46&f2=OP&f3=cf_status_firefox45&f4=cf_status_firefox45&f5=cf_status_firefox45&f6=CP&include_fields=id&j2=OR&keywords=regression

Re: Split Gecko in standalone fuzzing-friendly programs.

2016-03-09 Thread twsmith
This seems to be the general response whenever this topic is brought up, with
good reason. I don't think modularizing Gecko just for the sake of it makes
much sense. What I think the takeaway from this should be is: when developing
a new component or maintaining an old one, keep in mind that making it
accessible to fuzzers is valuable. In the short term this may mean making
gtests that we can leverage.

Decoder has started work on support for gtests to help use it as an alternative
to the full browser. I believe he does have a couple of simple fuzzers working
atm; however, I'm not sure what they have been finding. I have tried myself with
varying levels of success.

I think a good starting point for this work would be media and image processing
(codecs, parsers, demuxers, decoders, etc.). We (fuzzing team) have already
talked to the media team about this. Christoph and I recently fuzzed the new
BMP decoder and had many situations where we hit issues that we could not
reproduce but that would have been easily reproducible in a shell. The issues
were in the image processing code but obscured by the rest of the system. The
main issue in these cases was threading/timing.
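
One cheap way to catch this kind of flakiness early is to run the target
several times on the same input and compare the outcomes. The sketch below
assumes the target is a plain function from bytes to some result;
`is_reproducible` is a hypothetical helper, shown in Python for brevity:

```python
def is_reproducible(target, data: bytes, runs: int = 5) -> bool:
    """Run the target repeatedly on one input and compare the outcomes."""

    def outcome():
        try:
            return ("ok", repr(target(data)))
        except Exception as exc:
            # Record only the exception type so messages containing
            # addresses or timestamps do not cause false mismatches.
            return ("error", type(exc).__name__)

    first = outcome()
    return all(outcome() == first for _ in range(runs - 1))
```

A target that fails this check in isolation will be even harder to reproduce
once threading and timing in the full browser are involved.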

Maybe others have ideas for how we can make code more fuzz-able?


On Wednesday, March 9, 2016 at 10:15:16 AM UTC-8, Bobby Holley wrote:
> Can you elaborate on which Gecko components you're hoping to fuzz
> separately? A lot of the core is pretty heavily-intertwined, so I'm pretty
> skeptical that we'd ever be able to separate out DOM, style, and layout
> from each other (for example). There are basically two barriers:
> (1) These components are enormous, and were not built to be very modular.
> Any such efforts would require a huge amount of engineering resources,
> which we would probably not spend just to make the component fuzzable.
> (2) The performance cost of adding an abstraction layer between
> tightly-coupled components in C++11 would probably be prohibitive (the
> situation is different for Rust/Servo because of Traits - we could
> potentially do this with C++ Concepts/Modules in around a decade).
> 
> This is all to say that I think a general call to "modularize Gecko" isn't
> really helpful. But if there are specific leaf-y components that you want
> to fuzz separately but can't, that might be a good starting point.
> 
> On Wed, Mar 9, 2016 at 9:54 AM,  wrote:
> 
> > On Wednesday, March 9, 2016 at 9:38:55 AM UTC-8, Nicolas B. Pierron wrote:
> > > This discussion is a follow-up discussion to some emails sent privately by
> > > accident.
> > >
> > > If you have not followed, I will quote David Bryant:
> > >  > Improving release quality is one of the three fundamental goals Platform
> > >  > Engineering committed to this year. To this end, lmandel built a Bugzilla
> > >  > dashboard that allows us to track regressions found in any given release
> > >  > cycle. This dashboard [...] can
> > >  > also be found at: http://mozilla.github.io/releasehealth/
> > >
> > > To David's email, I answered the following:
> > >
> > > --
> > > tl;dr: If we want to improve the quality of our products we should
> > > split Gecko in standalone programs which are fuzzing-friendly.
> > >
> > > One thing which strikes me is the ratio of regressions per component
> > > that we have for each version, and moreover who the people
> > > opening these bugs are:
> > >   - Release:
> > >
> > https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox44&f2=OP&f3=cf_status_firefox43&f4=cf_status_firefox43&f5=cf_status_firefox43&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898533&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
> > >   - Beta:
> > >
> > https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox45&f2=OP&f3=cf_status_firefox44&f4=cf_status_firefox44&f5=cf_status_firefox44&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898534&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
> > >   - Aurora:
> > >
> > https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox46&f2=OP&f3=cf_status_firefox45&f4=cf_status_firefox45&f5=cf_status_firefox45&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898536&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
> > >   - Nightly:
> > >
> > https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox47&f2=OP&f3=cf_status_firefox46&f4=cf_status_firefox46&f5=cf_status_firefox46&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=1

Re: Split Gecko in standalone fuzzing-friendly programs.

2016-03-09 Thread Bobby Holley
Can you elaborate on which Gecko components you're hoping to fuzz
separately? A lot of the core is pretty heavily-intertwined, so I'm pretty
skeptical that we'd ever be able to separate out DOM, style, and layout
from each other (for example). There are basically two barriers:

(1) These components are enormous, and were not built to be very modular.
Any such efforts would require a huge amount of engineering resources,
which we would probably not spend just to make the component fuzzable.
(2) The performance cost of adding an abstraction layer between
tightly-coupled components in C++11 would probably be prohibitive (the
situation is different for Rust/Servo because of Traits - we could
potentially do this with C++ Concepts/Modules in around a decade).

This is all to say that I think a general call to "modularize Gecko" isn't
really helpful. But if there are specific leaf-y components that you want
to fuzz separately but can't, that might be a good starting point.
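
To make "leaf-y component" a bit more concrete, here is a minimal sketch of
the shape that tends to be easy to fuzz: one entry point that takes raw bytes
and hands them to the component, with no other state involved. Everything in
it is an assumption made for illustration. `parse_header` is a made-up
stand-in for a leaf component (it is not a Gecko API), and the mutation loop
is a naive random bit flipper, not a coverage-guided fuzzer.

```python
import random


def parse_header(data: bytes) -> dict:
    """Hypothetical leaf component: parse b"key: value" lines into a dict.

    Stand-in for any component that consumes untrusted bytes; not a Gecko API.
    Malformed input is reported with ValueError.
    """
    fields = {}
    for line in data.split(b"\n"):
        if not line.strip():
            continue  # skip blank lines
        key, sep, value = line.partition(b":")
        if not sep or not key.strip():
            raise ValueError("malformed header line")
        fields[key.strip().lower()] = value.strip()
    return fields


def fuzz_one_input(data: bytes) -> None:
    """Single entry point a fuzzer can call: bytes in, no other state."""
    try:
        parse_header(data)
    except ValueError:
        pass  # rejecting bad input is expected; any other exception is a bug


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Flip one random bit of the seed (naive mutation, assumed for the sketch)."""
    if not seed:
        return bytes([rng.randrange(256)])
    buf = bytearray(seed)
    pos = rng.randrange(len(buf))
    buf[pos] ^= 1 << rng.randrange(8)
    return bytes(buf)


def run(seed: bytes, iterations: int, rng_seed: int = 0) -> int:
    """Drive the entry point with mutated inputs; return how many were tried."""
    rng = random.Random(rng_seed)  # fixed seed keeps runs reproducible
    current = seed
    for _ in range(iterations):
        current = mutate(current, rng)
        fuzz_one_input(current)
    return iterations


if __name__ == "__main__":
    print(run(b"Content-Type: text/html\nHost: example.org\n", 1000))
```

The point of the sketch is the boundary, not the fuzzer: a component that can
be reached through something like `fuzz_one_input` is a candidate, and a
component that needs DOM, style, and layout all set up first is not.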

On Wed, Mar 9, 2016 at 9:54 AM,  wrote:

> On Wednesday, March 9, 2016 at 9:38:55 AM UTC-8, Nicolas B. Pierron wrote:
> > This discussion is a follow-up discussion to some emails sent privately by
> > accident.
> >
> > If you have not followed, I will quote David Bryant:
> >  > Improving release quality is one of the three fundamental goals Platform
> >  > Engineering committed to this year. To this end, lmandel built a Bugzilla
> >  > dashboard that allows us to track regressions found in any given release
> >  > cycle. This dashboard [...] can
> >  > also be found at: http://mozilla.github.io/releasehealth/
> >
> > To David's email, I answered the following:
> >
> > --
> > tl;dr: If we want to improve the quality of our products we should
> > split Gecko into standalone programs which are fuzzing-friendly.
> >
> > One thing that strikes me is the ratio of regressions per component
> > that we have for each version, and moreover who is
> > opening these bugs:
> >   - Release:
> >
> https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox44&f2=OP&f3=cf_status_firefox43&f4=cf_status_firefox43&f5=cf_status_firefox43&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898533&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
> >   - Beta:
> >
> https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox45&f2=OP&f3=cf_status_firefox44&f4=cf_status_firefox44&f5=cf_status_firefox44&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898534&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
> >   - Aurora:
> >
> https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox46&f2=OP&f3=cf_status_firefox45&f4=cf_status_firefox45&f5=cf_status_firefox45&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898536&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
> >   - Nightly:
> >
> https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox47&f2=OP&f3=cf_status_firefox46&f4=cf_status_firefox46&f5=cf_status_firefox46&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898528&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
> >
> > To be more precise:
> >   - The small number of regressions we have in the JS engine on the
> > release channel, versus the Extremely Huge number of regressions we
> > have on nightly.
> >   - And the fact that (almost) all the bugs opened against the JS
> > engine are opened by our fuzzing team.
> >
> > What I want to point out is that our automated fuzzing is better
> > at finding recently introduced regressions.  And as far as I know,
> > Alice is not a bot.
> >
> > From what I know, the reason the fuzzing team is so efficient on the JS
> > engine is that we have a *standalone* JS shell.
> > The *standalone* JS shell is also the reason why our build time is
> > below 2 minutes as opposed to 18 minutes.
> >
> > So, I think that if we want to improve our quality we should focus on
> > making fuzzing-friendly standalone programs for the different
> > components of the platform.
> > This would reduce the compilation time, reduce the test suite time, and
> > improve the ability of the fuzzing team to find recently added
> > regressions.
> >
> > Maybe I am wrong, in which case the other alternative might be to
> > staff the JS Team to get rid of all these nightly issues before they
> > ride the train to release.
> > --
> >
> > To which I got the following 

Re: Split Gecko in standalone fuzzing-friendly programs.

2016-03-09 Thread twsmith
On Wednesday, March 9, 2016 at 9:38:55 AM UTC-8, Nicolas B. Pierron wrote:
> This discussion is a follow-up discussion to some emails sent privately by 
> accident.
> 
> If you have not followed, I will quote David Bryant:
>  > Improving release quality is one of the three fundamental goals Platform
>  > Engineering committed to this year. To this end, lmandel built a Bugzilla
>  > dashboard that allows us to track regressions found in any given release
>  > cycle. This dashboard [...] can
>  > also be found at: http://mozilla.github.io/releasehealth/
> 
> To David's email, I answered the following:
> 
> --
> tl;dr: If we want to improve the quality of our products we should
> split Gecko into standalone programs which are fuzzing-friendly.
> 
> One thing that strikes me is the ratio of regressions per component
> that we have for each version, and moreover who is
> opening these bugs:
>   - Release: 
> https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox44&f2=OP&f3=cf_status_firefox43&f4=cf_status_firefox43&f5=cf_status_firefox43&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898533&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
>   - Beta: 
> https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox45&f2=OP&f3=cf_status_firefox44&f4=cf_status_firefox44&f5=cf_status_firefox44&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898534&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
>   - Aurora: 
> https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox46&f2=OP&f3=cf_status_firefox45&f4=cf_status_firefox45&f5=cf_status_firefox45&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898536&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
>   - Nightly: 
> https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox47&f2=OP&f3=cf_status_firefox46&f4=cf_status_firefox46&f5=cf_status_firefox46&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898528&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
> 
> To be more precise:
>   - The small number of regressions we have in the JS engine on the
> release channel, versus the Extremely Huge number of regressions we
> have on nightly.
>   - And the fact that (almost) all the bugs opened against the JS
> engine are opened by our fuzzing team.
> 
> What I want to point out is that our automated fuzzing is better
> at finding recently introduced regressions.  And as far as I know,
> Alice is not a bot.
> 
> From what I know, the reason the fuzzing team is so efficient on the JS
> engine is that we have a *standalone* JS shell.
> The *standalone* JS shell is also the reason why our build time is
> below 2 minutes as opposed to 18 minutes.
> 
> So, I think that if we want to improve our quality we should focus on
> making fuzzing-friendly standalone programs for the different
> components of the platform.
> This would reduce the compilation time, reduce the test suite time, and
> improve the ability of the fuzzing team to find recently added
> regressions.
> 
> Maybe I am wrong, in which case the other alternative might be to
> staff the JS Team to get rid of all these nightly issues before they
> ride the train to release.
> --
> 
> To which I got the following replies:
> 
> On Wed, Mar 9, 2016 at 3:05 PM, Kyle Huey wrote:
>  > The ratio of engineers to code in the js engine is so much higher than
>  > the rest of the product that I'm not sure this is a sensible comparison.
>  > The js engine also doesn't depend on things like 3rd party gfx drivers ...
> 
> On Wed, Mar 9, 2016 at 3:05 PM, Olli Pettay wrote:
>  > Fuzzing captures only a fraction of issues.
> 
> On Wed, Mar 9, 2016 at 3:42 PM, Chris Hofmann wrote:
>  > On Wed, Mar 9, 2016 at 7:05 AM, Kyle Huey wrote:
>  >>
>  >> The ratio of engineers to code in the js engine is so much higher than the
>  >> rest of the product that I'm not sure this is a sensible comparison.  The js
>  >> engine also doesn't depend on things like 3rd party gfx drivers ...
>  >
>  > This is probably not the only step that we need to take to substantially
>  > improve quality so setting up a place to have those discussions is good.  It
>  > really is worth some time and effort to brainstorm about all the things we
>  > might do to raise the bar, poke some holes in those ideas, then decide on
>  > and push forward on a few more in the next few 

Split Gecko in standalone fuzzing-friendly programs.

2016-03-09 Thread Nicolas B. Pierron
This discussion is a follow-up discussion to some emails sent privately by 
accident.


If you have not followed, I will quote David Bryant:
> Improving release quality is one of the three fundamental goals Platform
> Engineering committed to this year. To this end, lmandel built a Bugzilla
> dashboard that allows us to track regressions found in any given release
> cycle. This dashboard […] can
> also be found at: http://mozilla.github.io/releasehealth/

To David's email, I answered the following:

--
tl;dr: If we want to improve the quality of our products we should
split Gecko into standalone programs which are fuzzing-friendly.

One thing that strikes me is the ratio of regressions per component
that we have for each version, and moreover who is
opening these bugs:
 - Release: 
https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox44&f2=OP&f3=cf_status_firefox43&f4=cf_status_firefox43&f5=cf_status_firefox43&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898533&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
 - Beta: 
https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox45&f2=OP&f3=cf_status_firefox44&f4=cf_status_firefox44&f5=cf_status_firefox44&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898534&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
 - Aurora: 
https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox46&f2=OP&f3=cf_status_firefox45&f4=cf_status_firefox45&f5=cf_status_firefox45&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898536&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=
 - Nightly: 
https://bugzilla.mozilla.org/buglist.cgi?columnlist=product%2Ccomponent%2Creporter&f1=cf_status_firefox47&f2=OP&f3=cf_status_firefox46&f4=cf_status_firefox46&f5=cf_status_firefox46&f6=CP&include_fields=id&j2=OR&keywords=regression%2C&keywords_type=allwords&list_id=12898528&o1=equals&o3=equals&o4=equals&o5=equals&query_format=advanced&resolution=---&v1=affected&v3=unaffected&v4=%3F&v5=---&query_based_on=


To be more precise:
 - The small number of regressions we have in the JS engine on the
release channel, versus the Extremely Huge number of regressions we
have on nightly.
 - And the fact that (almost) all the bugs opened against the JS
engine are opened by our fuzzing team.

What I want to point out is that our automated fuzzing is better
at finding recently introduced regressions.  And as far as I know,
Alice is not a bot.

From what I know, the reason the fuzzing team is so efficient on the JS
engine is that we have a *standalone* JS shell.
The *standalone* JS shell is also the reason why our build time is
below 2 minutes as opposed to 18 minutes.

So, I think that if we want to improve our quality we should focus on
making fuzzing-friendly standalone programs for the different
components of the platform.
This would reduce the compilation time, reduce the test suite time, and
improve the ability of the fuzzing team to find recently added
regressions.

Maybe I am wrong, in which case the other alternative might be to
staff the JS Team to get rid of all these nightly issues before they
ride the train to release.
--

To which I got the following replies:

On Wed, Mar 9, 2016 at 3:05 PM, Kyle Huey wrote:
> The ratio of engineers to code in the js engine is so much higher than
> the rest of the product that I'm not sure this is a sensible comparison.
> The js engine also doesn't depend on things like 3rd party gfx drivers ...

On Wed, Mar 9, 2016 at 3:05 PM, Olli Pettay wrote:
> Fuzzing captures only a fraction of issues.

On Wed, Mar 9, 2016 at 3:42 PM, Chris Hofmann wrote:
> On Wed, Mar 9, 2016 at 7:05 AM, Kyle Huey wrote:
>>
>> The ratio of engineers to code in the js engine is so much higher than the
>> rest of the product that I'm not sure this is a sensible comparison.  The js
>> engine also doesn't depend on things like 3rd party gfx drivers ...
>
> This is probably not the only step that we need to take to substantially
> improve quality so setting up a place to have those discussions is good.  It
> really is worth some time and effort to brainstorm about all the things we
> might do to raise the bar, poke some holes in those ideas, then decide on
> and push forward on a few more in the next few quarters.

On Wed, Mar 9, 2016 at 5:23 PM, Al Billings wrote:
> On 3/9/16 6:58 AM, Nicolas B. Pierron wrote:
>> So, I think that if we want to improve our quality we should focus on
>> making fuzzing-friendly standalone programs for the different
>> components of