[issue41210] LZMADecompressor.decompress(FORMAT_RAW) truncate output when input is paticular LZMA+BCJ data

2020-07-11 Thread Hiroshi Miura
Hiroshi Miura added the comment: Here is a BCJ-only CFFI test project: https://github.com/miurahr/bcj-cffi It imports two bcj_x86 C sources: one is from liblzma (src/xz_bcj_x86.c), which is bound to Python's lzma module, and the other is from the xz-embedded project for Linux

[issue41210] LZMADecompressor.decompress(FORMAT_RAW) truncate output when input is paticular LZMA+BCJ data

2020-07-07 Thread Hiroshi Miura
Hiroshi Miura added the comment: Thank you for the information about the similar problem. This problem was observed and reported in a 7-zip library project: https://github.com/miurahr/py7zr/issues/178. py7zr depends heavily on the lzma FORMAT_RAW interface. Fortunately, the 7-zip container format has size

[issue41210] LZMADecompressor.decompress(FORMAT_RAW) truncate output when input is paticular LZMA+BCJ data

2020-07-07 Thread Hiroshi Miura
Change by Hiroshi Miura: Added file: https://bugs.python.org/file49301/0001-lzma-support-LZMA1-with-FORMAT_RAW.patch

[issue41210] LZMADecompressor.decompress(FORMAT_RAW) truncate output when input is paticular LZMA+BCJ data

2020-07-07 Thread Ma Lin
Ma Lin added the comment: There was a similar issue (issue21872). When decompressing lzma.FORMAT_ALONE data that has no end marker (but has the correct "Uncompressed Size" in the .lzma header), sometimes the last one to several dozen bytes are not output. issue21872 fixed

[issue41210] LZMADecompressor.decompress(FORMAT_RAW) truncate output when input is paticular LZMA+BCJ data

2020-07-06 Thread Hiroshi Miura
Hiroshi Miura added the comment: I think FORMAT_RAW is only tested with LZMA2 in Lib/test/test_lzma.py. Since there is no test for LZMA1, the documentation states that FORMAT_RAW is for LZMA2. I'd like to add tests for LZMA1 and change the wording in the documentation. -- keywords: +patch Added

[issue41210] LZMADecompressor.decompress(FORMAT_RAW) truncate output when input is paticular LZMA+BCJ data

2020-07-06 Thread Hiroshi Miura
Hiroshi Miura added the comment: > Compression filters: > FILTER_LZMA1 (for use with FORMAT_ALONE) > FILTER_LZMA2 (for use with FORMAT_XZ and FORMAT_RAW) I looked into the past discussion in BPO-6715, from when the lzma module was proposed: https://bugs.python.org/issue6715 There is an

[issue41210] LZMADecompressor.decompress(FORMAT_RAW) truncate output when input is paticular LZMA+BCJ data

2020-07-06 Thread Ma Lin
Ma Lin added the comment: The docs[1] say: Compression filters: FILTER_LZMA1 (for use with FORMAT_ALONE) FILTER_LZMA2 (for use with FORMAT_XZ and FORMAT_RAW) But your code uses a combination of `FILTER_LZMA1` and `FORMAT_RAW`; is this OK? [1]
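
For readers unfamiliar with the combination being questioned here, the following is a minimal sketch of a round trip with `FILTER_LZMA1` under `FORMAT_RAW`, using only the standard-library `lzma` module. The sample payload is made up for illustration and is not the data attached to the issue; the stream is produced by Python itself, so this does not reproduce the reported truncation.

```python
import lzma

# A raw stream carries no header, so the same filter chain must be
# supplied explicitly to both the compressor and the decompressor.
filters = [{"id": lzma.FILTER_LZMA1}]

# Illustrative payload only (an assumption, not the issue's test data).
data = b"FORMAT_RAW with LZMA1 " * 100

compressed = lzma.compress(data, format=lzma.FORMAT_RAW, filters=filters)

decompressor = lzma.LZMADecompressor(format=lzma.FORMAT_RAW, filters=filters)
restored = decompressor.decompress(compressed)
```

If the combination is accepted by the module, `restored` should equal `data`.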

[issue41210] LZMADecompressor.decompress(FORMAT_RAW) truncate output when input is paticular LZMA+BCJ data

2020-07-05 Thread Ma Lin
Change by Ma Lin: -- components: +Library (Lib) -Extension Modules nosy: +malin

[issue41210] LZMADecompressor.decompress(FORMAT_RAW) truncate output when input is paticular LZMA+BCJ data

2020-07-04 Thread Hiroshi Miura
New submission from Hiroshi Miura: When decompressing a particular archive, the result is truncated, losing the last word. The attached test data has an uncompressed size of 12800 bytes and is compressed with the LZMA1+BCJ algorithm into 11327 bytes. The data is the payload of a 7zip archive. Here is a pytest
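
The pytest from the submission is cut off in this listing, so the following is only a rough sketch of the kind of length check that would expose such a truncation. It assumes an LZMA1+BCJ chain expressed as `FILTER_X86` followed by `FILTER_LZMA1`; the helper name `raw_decompress` and the sample payload are hypothetical. Because the payload is compressed by Python here rather than taken from a 7zip archive, the check is expected to pass and does not reproduce the bug.

```python
import lzma

# Assumed filter chain for "LZMA1+BCJ": x86 BCJ filter, then LZMA1.
FILTERS = [{"id": lzma.FILTER_X86}, {"id": lzma.FILTER_LZMA1}]


def raw_decompress(payload: bytes) -> bytes:
    """Hypothetical helper: decompress a raw LZMA1+BCJ payload in one call."""
    decompressor = lzma.LZMADecompressor(format=lzma.FORMAT_RAW, filters=FILTERS)
    return decompressor.decompress(payload)


# Illustrative data with the same uncompressed size as in the report.
original = bytes(range(256)) * 50  # 12800 bytes
payload = lzma.compress(original, format=lzma.FORMAT_RAW, filters=FILTERS)

result = raw_decompress(payload)
# A truncated result would be shorter than the known uncompressed size.
is_complete = len(result) == len(original)
```

With the archive payload from the report substituted for `payload`, a `False` value of `is_complete` would indicate the truncation described above.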