Dear Lasse,

Thanks for your feedback; my reply is inline. This also seems like a good
time to discuss oss-fuzz integration as we apply the final touches to the
test harness :)

I have a few questions for you:

- oss-fuzz requires a Google-linked email address for the maintainer.
Could you please provide me with one?

- It would be better for the test harness and related config (dictionary,
other fuzzer options) to reside in the xz source repo; a rough sketch of
what these files could look like is below. Are you okay with maintaining
these in the long run?
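
For reference, something along these lines is what I have in mind (the
file names and values below are placeholders, not a final proposal):

# xz.dict -- libFuzzer/AFL-style dictionary; the single entry here is
# just the .xz stream header magic as an example
header_magic="\xFD7zXZ\x00"

# xz_fuzz.options -- OSS-Fuzz per-target options file
[libfuzzer]
dict = xz.dict
max_len = 8192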

Thank you :)

On 10/29/18 9:27 PM, Lasse Collin wrote:
> On 2018-10-29 Bhargava Shastry wrote:
>> Thanks for providing two versions for me to test. Here are the
>> results:
>>
>> - version 1 decompresses the whole of fuzzed (compressed) data
>> - version 2 decompresses in chunks of size (input=13 bytes)
>>
>> ### Executions per second
>>
>> I ran both versions a total of 96 times (I have 16 cores :-))
>>
>> - version 1 averaged 1757.20 executions per second
>> - version 2 averaged 429.10 executions per second
>>
>> So, clearly version 1 is faster
> 
> Yes, and the difference is bigger than I hoped.
> 
>> Regarding coverage
>>
>> - version 1 covered 950.26 CFG edges on average
>> - version 2 covered 941.11 CFG edges on average
> 
> I assume you had the latest xz.git that supports
> FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION.

That's correct.

> Did you run the same number of fuzzing rounds on both (so the second
> version took over four times longer) or did you run them for the same
> time (so the second version ran only 1/4 of rounds)?

It's the latter: I ran each version for a fixed duration of 2 minutes. In
this time,

- version 1 was fuzzed 212677 times on average, i.e., the harness was
executed with that many fuzzed inputs
- version 2 was fuzzed 51986 times on average

So, as you say, v2 ran roughly 1/4 as many rounds as v1.

> If both versions saw the same number of rounds, I would expect the
> second version to have the same or better coverage. But if the
> comparison was based on time, then it's no surprise if the first
> version has better apparent coverage even if it is impossible for it to
> hit certain code paths that are possible with the second version. It
> might also depend on which input file is used as a starting point for
> the fuzzer.

As a starting point (seed corpus), I used all files with the ".xz"
extension that I could find in the source repo (63 files in total).

I also ran the following experiment:

- I ran version 1 overnight (over 16 hours in total)
- The coverage saturated at about 996 CFG edges

Then, I took the corpus generated during v1 fuzzing and fed it to v2. My
hope is that this quickly tells me how much better (coverage-wise) v2
would be if it were run for as long as v1.

- I found that v2 covers 1004 CFG edges, i.e., only 8 CFG edges more than v1

However, to be sure I would need to keep v2 running for as long as v1; my
guess is that this saturation will persist.
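
(For reference, one way to replay the v1 corpus against v2 is to pass the
corpus directory to the v2 binary with mutation disabled, e.g. something
like

./fuzz_v2 -runs=0 corpus_v1/

so that libFuzzer only executes the existing inputs and reports the
coverage counters on exit; the binary and directory names here are just
placeholders.)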

>> Overall, version 1 is superior imho.
> 
> I don't know yet. Increasing the input and output chunk sizes is
> probably needed to make the second version faster. You could try
> some odd values between 100 and 250, or maybe even up to 500.

Okay, I can try this out once the current experiment completes.
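
For concreteness, here is a rough sketch of how I picture the chunked
variant, with the chunk size as the tunable knob. This is only an
illustration of the idea (not the actual harness in xz.git), and
CHUNK_SIZE is a placeholder value:

#include <stdint.h>
#include <stddef.h>
#include <lzma.h>

/* Placeholder; the values to try would be the odd ones suggested above,
 * somewhere between 100 and 250. */
#define CHUNK_SIZE 113

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    lzma_stream strm = LZMA_STREAM_INIT;
    if (lzma_stream_decoder(&strm, UINT64_MAX, LZMA_CONCATENATED) != LZMA_OK)
        return 0;

    uint8_t outbuf[CHUNK_SIZE];
    size_t pos = 0;
    lzma_ret ret = LZMA_OK;

    while (ret == LZMA_OK) {
        if (strm.avail_in == 0 && pos < size) {
            /* Refill with at most CHUNK_SIZE new input bytes. */
            size_t n = size - pos < CHUNK_SIZE ? size - pos : CHUNK_SIZE;
            strm.next_in = data + pos;
            strm.avail_in = n;
            pos += n;
        }

        /* Decompressed output is discarded; the same small buffer is
         * reused on every call. */
        strm.next_out = outbuf;
        strm.avail_out = sizeof(outbuf);

        ret = lzma_code(&strm, pos == size ? LZMA_FINISH : LZMA_RUN);
    }

    lzma_end(&strm);
    return 0;
}

The idea is that the input chunk size and the output buffer size grow
together with CHUNK_SIZE, so a single define is enough to experiment with
the values you mention.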

Regards,
Bhargava
