Hi there,

I found a memory leak in the PdfParser::ReadXRefSubsection() function while fuzzing the podofotxtextract tool at revision r1849. The attached PoC reproduces the bug.
According to the Valgrind output below, this PoC actually triggers two memory bugs:

The 7,800 bytes reported as definitely lost follow the same execution trace that I described in my previous email, "Memory leak in PdfMemDocument::Load()".

The 16,800,360 bytes reported as possibly lost are caused by the failure of the resize operation in PdfParser::ReadXRefSubsection(). It seems that the resize operation can fail and throw an error when its argument is very large, and PoDoFo's error handler does not release the memory allocated by the resize operation.

root@ec0f70831dcf:/data/podofo-code-1849-podofo-trunk/build/crashes# valgrind --leak-check=full ../tools/podofotxtextract/podofotxtextract _xref-NoObject--PdfMemDocument-198
==353== Memcheck, a memory error detector
==353== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==353== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==353== Command: ../tools/podofotxtextract/podofotxtextract _xref-NoObject--PdfMemDocument-198
==353==
WARNING: There are more objects (700015) in this XRef table than specified in the size key of the trailer directory (7)!
<</ID[<A561A9EF46EDD0F343AC23CCA1BD3E61><A561A9EF46EDD0F343AC23CCA1BD3E61DFEAEBA0>]/Info 2 0 R/Root 1 0 R/Size 7>>
Error: An error 16 ocurred during processing the pdf file.

PoDoFo encountered an error. Error: 16 ePdfError_NoObject
Error Description: A object was expected but not found.
Callstack:
#0 Error Source: /data/podofo-code-1849-podofo-trunk/src/doc/PdfMemDocument.cpp:198
Information: Catalog object not found!
==353==
==353== HEAP SUMMARY:
==353==     in use at exit: 16,881,416 bytes in 31 blocks
==353==   total heap usage: 22,137 allocs, 22,106 frees, 34,359,513 bytes allocated
==353==
==353== 7,800 (696 direct, 7,104 indirect) bytes in 1 blocks are definitely lost in loss record 20 of 22
==353==    at 0x4C2E0EF: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==353==    by 0x4D2D9D: PoDoFo::PdfMemDocument::Load(char const*, bool) (PdfMemDocument.cpp:255)
==353==    by 0x4D2BE1: PoDoFo::PdfMemDocument::PdfMemDocument(char const*, bool) (PdfMemDocument.cpp:102)
==353==    by 0x43B5EC: TextExtractor::Init(char const*) (TextExtractor.cpp:41)
==353==    by 0x44040A: main (podofotxtextract.cpp:52)
==353==
==353== 16,800,360 bytes in 1 blocks are possibly lost in loss record 22 of 22
==353==    at 0x4C2E0EF: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==353==    by 0x56CABA: allocate (new_allocator.h:104)
==353==    by 0x56CABA: allocate (alloc_traits.h:182)
==353==    by 0x56CABA: _M_allocate (stl_vector.h:170)
==353==    by 0x56CABA: std::vector<PoDoFo::PdfParser::TXRefEntry, std::allocator<PoDoFo::PdfParser::TXRefEntry> >::_M_fill_insert(__gnu_cxx::__normal_iterator<PoDoFo::PdfParser::TXRefEntry*, std::vector<PoDoFo::PdfParser::TXRefEntry, std::allocator<PoDoFo::PdfParser::TXRefEntry> > >, unsigned long, PoDoFo::PdfParser::TXRefEntry const&) (vector.tcc:491)
==353==    by 0x566BD6: insert (stl_vector.h:1073)
==353==    by 0x566BD6: resize (stl_vector.h:716)
==353==    by 0x566BD6: PoDoFo::PdfParser::ReadXRefSubsection(long&, long&) (PdfParser.cpp:788)
==353==    by 0x55E183: PoDoFo::PdfParser::ReadXRefContents(long, bool) (PdfParser.cpp:722)
==353==    by 0x55AA0B: PoDoFo::PdfParser::ReadDocumentStructure() (PdfParser.cpp:334)
==353==    by 0x5593FA: PoDoFo::PdfParser::ParseFile(PoDoFo::PdfRefCountedInputDevice const&, bool) (PdfParser.cpp:217)
==353==    by 0x558708: PoDoFo::PdfParser::ParseFile(char const*, bool) (PdfParser.cpp:164)
==353==    by 0x4D2DE4: PoDoFo::PdfMemDocument::Load(char const*, bool) (PdfMemDocument.cpp:256)
==353==    by 0x4D2BE1: PoDoFo::PdfMemDocument::PdfMemDocument(char const*, bool) (PdfMemDocument.cpp:102)
==353==    by 0x43B5EC: TextExtractor::Init(char const*) (TextExtractor.cpp:41)
==353==    by 0x44040A: main (podofotxtextract.cpp:52)
==353==
==353== LEAK SUMMARY:
==353==    definitely lost: 696 bytes in 1 blocks
==353==    indirectly lost: 7,104 bytes in 27 blocks
==353==      possibly lost: 16,800,360 bytes in 1 blocks
==353==    still reachable: 73,256 bytes in 2 blocks
==353==         suppressed: 0 bytes in 0 blocks
==353== Reachable blocks (those to which a pointer was found) are not shown.
==353== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==353==
==353== For counts of detected and suppressed errors, rerun with: -v
==353== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

Thanks,
--
Liang Cheng
Institute of Software, Chinese Academy of Sciences
4# South Fourth Street, Zhongguancun
Beijing 100190, China
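
For illustration only, the ownership problem in this report (a large vector is resized while reading the XRef table, a later error is thrown, and the storage is never released) can be sketched in a few lines of C++. This is a minimal sketch under stated assumptions, not PoDoFo's code: Parser, XRefEntry, LoadDocument, RequireCatalog and the liveInstances counter are hypothetical stand-ins, and the "Catalog object not found" check is simulated with a flag. The sketch shows that if the object owning the resized vector is held by a std::unique_ptr, an exception thrown after (or during) the resize still releases the vector's storage.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for PoDoFo::PdfParser::TXRefEntry; the real layout may differ.
struct XRefEntry {
    long offset = 0;
    long generation = 0;
    bool parsed = false;
};

// Hypothetical, heavily simplified parser that owns the XRef table.
class Parser {
public:
    static int liveInstances;  // demonstration only: counts Parser objects still alive

    Parser() { ++liveInstances; }
    ~Parser() { --liveInstances; }
    Parser(const Parser&) = delete;
    Parser& operator=(const Parser&) = delete;

    // Grow the table so it can hold entries [first, first + count).
    void ReadXRefSubsection(long first, long count) {
        if (first < 0 || count < 0) {
            throw std::out_of_range("negative xref index or count");
        }
        const std::size_t needed =
            static_cast<std::size_t>(first) + static_cast<std::size_t>(count);
        if (needed > entries_.size()) {
            // May throw std::bad_alloc or std::length_error for huge values.
            entries_.resize(needed);
        }
        assert(entries_.size() >= needed);
    }

    // Simulates the later failure seen in the log ("Catalog object not found!").
    void RequireCatalog(bool hasCatalog) const {
        if (!hasCatalog) {
            throw std::runtime_error("Catalog object not found!");
        }
    }

private:
    std::vector<XRefEntry> entries_;
};

int Parser::liveInstances = 0;

// Returns true on success. The parser is owned by a unique_ptr, so it and its
// vector are destroyed on every exit path, including the error path.
bool LoadDocument(long first, long count, bool hasCatalog) {
    std::unique_ptr<Parser> parser(new Parser());
    try {
        parser->ReadXRefSubsection(first, count);
        parser->RequireCatalog(hasCatalog);
    } catch (const std::exception&) {
        return false;  // parser goes out of scope here and frees the table
    }
    return true;
}
```

Calling LoadDocument(0, 700015, false) mirrors the shape of the PoC: the table is resized to 700015 entries, the simulated catalog check throws, and Parser::liveInstances is back to 0 once the function returns. Whether the same ownership change is the right fix for PdfMemDocument::Load() is for the maintainers to judge.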
_xref-NoObject--PdfMemDocument-198
Description: Binary data