On 17.07.2018 19:44, Jussi Judin wrote:
Hi,

I have been fuzzing[1] various parts of the Python standard library for Python 3.7
with python-afl[2] to find internal implementation issues in the
library. I have mainly been looking for the following:

* Exceptions other than the documented ones. These usually
indicate an internal implementation issue. For example, one would not expect a
UnicodeDecodeError from the netrc.netrc() function when the documentation[3]
promises netrc.NetrcParseError and there is no way to pass a properly sanitized
file object to netrc.netrc().
* Differences between values returned by the C and Python versions of some
functions. The quopri module may have these.
* Unexpected performance and memory allocation issues. These can be somewhat
controversial to fix, if they get fixed at all, but at least in some cases, from the
end-user perspective, it can be really nasty if, for example,
fractions.Fraction("1.64E6646466664") results in hundreds of megabytes of memory
being allocated and takes very long to calculate. I gave up waiting for that
function call to finish after 5 minutes.
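
To make the first bullet concrete, here is a minimal sketch of the check, not the
code from the fuzzers themselves. The names toy_parse, ParseError and classify are
hypothetical and only illustrate the idea: a parser documents one exception type,
but an internal decoding step lets a different one escape, much like the
netrc.netrc() case.

```python
class ParseError(Exception):
    """Hypothetical exception that the toy parser documents."""


def toy_parse(data: bytes) -> dict:
    """Toy "key=value" parser with a deliberate flaw (illustration only)."""
    text = data.decode("utf-8")  # flaw: UnicodeDecodeError can escape from here
    if "=" not in text:
        raise ParseError("missing '='")
    key, _, value = text.partition("=")
    return {key: value}


def classify(target, data, documented):
    """Run target(data) and report whether it raised a documented exception."""
    try:
        target(data)
    except documented:
        return "documented"
    except Exception as exc:
        # Anything else is the kind of finding described in the first bullet.
        return "undocumented: " + type(exc).__name__
    return "ok"


print(classify(toy_parse, b"a=b", ParseError))   # ok
print(classify(toy_parse, b"ab", ParseError))    # documented
print(classify(toy_parse, b"\xff", ParseError))  # undocumented: UnicodeDecodeError
```

The third call is the interesting one: the input is invalid, but the failure
surfaces as an exception the caller was never told to expect.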

As this is going to result in a decent number of bug reports (so far I have filed only
one[4], although that audio processing area has many more issues to file), I would
like to ask your opinion on filing these bug reports. Should I report all issues
in a specific module in one bug report, or split them further into
finer-grained reports that may be related? These different types of errors are
especially noticeable in the zipfile module, which shows a lot of different exception
types and behaviors on invalid data
<https://github.com/Barro/python-stdlib-fuzzers/tree/master/zipfile/crashes>.
And in the case of the sndhdr module, there are multiple modules with issues (aifc, sunau,
wave) that then also show up in sndhdr when they are used. Or are some of you willing
to go through the crashes that pop up and help with filing the reports?

I'm not on the core team, so I'll just recite best practices from my own experience.

Bugs should be reported "one per root cause", i.e. 1 bug report = 1 fix. It's permissible to report related issues separately, especially if you're not sure whether they are the same bug (in that case, add a prominent link between the tickets), but since this is a volunteer project, you really should do any duplicate checks _before_ reporting. Since you'll be checking existing tickets before filing each new one anyway, that will automatically include _your own_ previous tickets ;-) For the same bug showing up in multiple places, it's better to err on the side of fewer tickets: this is less work for everyone and gives a more complete picture. If something proves to warrant a separate ticket, it can be split off later.

The code and a more verbose description are available from
<https://github.com/Barro/python-stdlib-fuzzers>. By default it works only on some
GNU/Linux systems (I use Debian testing), as it relies on /dev/shm/ being
available and uses shell scripts as wrappers that rely on various tools that may not
be installed on all systems by default.

As a bonus, since this uses coverage-based fuzzing, it also opens up the possibility of
automatically creating a regression test suite for each of the fuzzed modules. Such a
suite would ensure that existing functionality (input files under the
<fuzz-target>/corpus/ directory) does not suddenly start raising additional exceptions,
and would make it easier to test potential bug fixes (crash-inducing files under the
<fuzz-target>/crashes/ directory).
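
A rough sketch of how such a replay could look, assuming nothing about the
repository beyond the corpus/ and crashes/ directory names mentioned above. The
helper replay_directory and the toy_target function are hypothetical. The helper
feeds every file in a directory to a target and collects the inputs that raise
anything other than the documented exceptions.

```python
import tempfile
from pathlib import Path


def replay_directory(target, directory, documented=()):
    """Run target on each file in directory; return (file name, exception name) pairs."""
    failures = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        try:
            target(path.read_bytes())
        except documented:
            pass  # documented rejection of bad input: not a regression
        except Exception as exc:
            failures.append((path.name, type(exc).__name__))
    return failures


def toy_target(data: bytes) -> None:
    """Hypothetical target: ValueError is "documented", KeyError is a leak."""
    if data.startswith(b"bad"):
        raise ValueError("documented rejection")
    if data.startswith(b"boom"):
        raise KeyError("internal bug leaking out")


# Demo on a throwaway directory standing in for <fuzz-target>/corpus/.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a-good").write_bytes(b"fine")
    Path(tmp, "b-bad").write_bytes(b"bad input")
    Path(tmp, "c-boom").write_bytes(b"boom input")
    demo_failures = replay_directory(toy_target, tmp, documented=ValueError)

print(demo_failures)  # [('c-boom', 'KeyError')]
```

Run against a corpus directory, an empty result would mean no new exceptions have
appeared. Run against a crashes directory after a fix, it would show which
crash-inducing inputs still fail.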

As a downside, this uses two quite specific tools (afl, python-afl) that have
further dependencies (Cython), so I doubt the viability of integrating
this type of testing into the normal Python verification process. Unlike the
libFuzzer-based fuzzing that is already integrated in Python[5],
this instruments the actual Python code, and only the Python code, not the actions
that the interpreter performs in the background. So this should result in better
fuzzer coverage for the Python code that is used, with the downside that when C
functions are called, they are complete black boxes to the fuzzer.
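
For readers who have not seen python-afl, a harness is roughly shaped like the
sketch below. This is an illustration and not one of the harnesses from the
repository: the choice of quopri.decodestring as the target and the names
fuzz_one and main are assumptions, and the import is guarded only so the file can
also be imported without python-afl installed. afl.init() and the final
os._exit(0) follow the pattern described in the python-afl README.

```python
import os
import quopri
import sys

try:
    import afl  # third-party package python-afl, not in the standard library
except ImportError:
    afl = None  # allows importing and calling fuzz_one() by hand without it


def fuzz_one(data: bytes) -> bytes:
    """Feed one input to the target; an uncaught exception is a finding."""
    return quopri.decodestring(data)


def main() -> None:
    if afl is not None:
        afl.init()  # the fuzzer's fork server starts at this point
    fuzz_one(sys.stdin.buffer.read())
    os._exit(0)  # skip interpreter teardown to speed up each run


# A real harness script would end by calling main().
```

Such a script would typically be started through python-afl's py-afl-fuzz
wrapper with an input corpus directory and an output directory. Only the Python
code executed under fuzz_one() is instrumented, so a target that ends up in a C
implementation gives the fuzzer no coverage feedback for that part.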

I have mainly run these fuzzers for at most several hours per module
with 4 instances, and stopped running modules with no issues once no
new coverage had been discovered for more than 10 minutes. Also, I have not really
created high-quality initial input files, so I wouldn't be surprised if there
are more issues lurking around that could be found by throwing more CPU and
higher-quality fuzzers at the problem.

[1]: https://en.wikipedia.org/wiki/Fuzzing
[2]: https://github.com/jwilk/python-afl
[3]: https://docs.python.org/3/library/netrc.html
[4]: https://bugs.python.org/issue34088
[5]: https://github.com/python/cpython/tree/3.7/Modules/_xxtestfuzz


--
Regards,
Ivan

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev