Ma Lin <malin...@163.com> added the comment:

I investigated this problem.

Here are the trigger conditions:

- The format is FORMAT_ALONE, the legacy .lzma container format.
- The file's header records the "Uncompressed Size".
- The file has no "End of Payload Marker" or "End of Stream Marker".

Otherwise, liblzma's internal state doesn't hold any bytes that can still be output.
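
As a rough illustration of the second condition, the sketch below reads the
13-byte .lzma header and reports whether an uncompressed size is recorded. The
function name `alone_uncompressed_size` and the constant `UNKNOWN_SIZE` are
made up for this example. It only inspects the header; it does not detect
whether the end marker is present.

```python
import lzma

UNKNOWN_SIZE = 0xFFFFFFFFFFFFFFFF  # all-ones size field means "not recorded"


def alone_uncompressed_size(data):
    """Return the size recorded in a .lzma header, or None if not recorded."""
    if len(data) < 13:
        raise ValueError("a .lzma header is 13 bytes long")
    # Byte 0: lc/lp/pb properties; bytes 1-4: dictionary size;
    # bytes 5-12: uncompressed size, little-endian.
    size = int.from_bytes(data[5:13], "little")
    return None if size == UNKNOWN_SIZE else size


# A FORMAT_ALONE stream produced by the lzma module, for comparison.
blob = lzma.compress(b"hello", format=lzma.FORMAT_ALONE)
print(alone_uncompressed_size(blob))
```

A file that can trigger the problem would report an integer here rather than
None.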

The good news is:

- The lzma module's default compression format is FORMAT_XZ, not FORMAT_ALONE.
- Even FORMAT_ALONE files generated by the lzma module (via the underlying xz
library) always have an "End of Payload Marker".
- The FORMAT_ALONE format is probably falling out of use.

The attached file test_bad_files.py tests the `DecompressReader.read(size=-1)`
function [1] with different max_length values (from -1 to 1000, excluding 0),
to ensure that the needs_input mechanism works properly.
Usage: set the `DIR` variable to the folder containing the bad files.
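
For readers without the attachment, here is a minimal sketch of the same idea.
It is not the attached script: it uses the public `lzma.LZMADecompressor` API
instead of `DecompressReader`, a smaller max_length range, and data generated
in memory rather than the bad files. The helper name `decompress_in_chunks`
and the `input_chunk` parameter are hypothetical.

```python
import lzma


def decompress_in_chunks(data, max_length, fmt=lzma.FORMAT_AUTO, input_chunk=64):
    """Decompress data, asking for at most max_length bytes per call."""
    d = lzma.LZMADecompressor(format=fmt)
    out = []
    pos = 0
    while not d.eof:
        if d.needs_input:
            # The decompressor has no pending output; feed more input.
            chunk = data[pos:pos + input_chunk]
            pos += len(chunk)
            if not chunk:
                raise EOFError("compressed data ended before end of stream")
        else:
            # Pending output remains; drain it without new input.
            chunk = b""
        out.append(d.decompress(chunk, max_length))
    return b"".join(out)


payload = b"abc" * 1000
for fmt in (lzma.FORMAT_XZ, lzma.FORMAT_ALONE):
    blob = lzma.compress(payload, format=fmt)
    for max_length in [-1] + list(range(1, 50)):
        assert decompress_in_chunks(blob, max_length, fmt) == payload
```

If the needs_input mechanism misbehaves on a given file, a loop like this
either raises EOFError or returns truncated output.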

[1] https://github.com/python/cpython/blob/v3.8.0b1/Lib/_compression.py#L72-L111

----------
Added file: https://bugs.python.org/file48425/test_bad_files.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue21872>
_______________________________________