If the big-image patches are committed, is it worth adding a command-line option to enable or disable big images, or to set the maximum image size, so applications that should never see big images don't have to worry about DoS?
________________________________
From: poppler <poppler-boun...@lists.freedesktop.org> on behalf of Martin (gzlist) <gzl...@googlemail.com>
Sent: Thursday, August 20, 2020 7:43 AM
To: poppler@lists.freedesktop.org <poppler@lists.freedesktop.org>
Subject: [poppler] Big image splash support

I have a patchset to support writing images with more than 2GiB of pixel data using the splash backend, which raises a few questions worth running past this list.

Rendering a PDF (using, for instance, the pdftoppm utility) allocates a bitmap to draw into and assemble the output. The bitmap dimensions and coordinates are of type int, and their product is also treated as an int, both in the gmem allocation and in the drawing logic. On an LP64 system, this means the maximum image size is under 27500x27500, quite a bit less than could fit in memory. Larger outputs result in this error:

$ pdftoppm -scale-to 32768 -png ~/sample.ai > /tmp/out.png
Bogus memory allocation size
$ file /tmp/out.png
/tmp/out.png: PNG image data, 1 x 1, 8-bit/color RGB, non-interlaced

The logged line comes from the allocator-level check, after which there is a slightly surprising fallback to 1x1 pixel output. Judging from other [allocator errors], this fallback probably exists so *some* output is produced for documents with only partially bogus content?

There is an argument that all widths, heights, and coordinates should be size_t instead of int, but to limit the size and pain of the change, keeping the current interfaces using int seems reasonable. So there are two distinct changes to make:

* Allocations larger than 2GiB must be allowed
* Buffer offsets must use a type larger than int

Also, optionally:

* Handle failed allocations a little better

For allocation, what I have currently is (A) change the uses of int in gmem.c to long so all allocations can be >2GiB on 64-bit machines. Alternatively, I could (B) find just the uses of the allocator that may need large areas and make them use a new gmem function. Do you prefer (A) or (B) here?
Given existing [pdf deflate bombs], there's probably not much DoS protection provided by the 2GiB limit?

For the splash code, what I've implemented and propose landing is casting to size_t on indexing into a buffer, generally where multiplication happens. The tradeoff is that it's harder to be sure of correctness: my method of finding where to make changes was a combination of grep, and gdb with real-world inputs. Is this acceptable? It's possible this method will introduce memory-safety issues specifically for large output sizes.

Note the [cairo max size] is explicitly (1<<15)-1 for surface dimensions, and is not addressed by my changes.

Martin

[allocator errors]: Some problems currently blocked at gmalloc level?
https://gitlab.freedesktop.org/poppler/poppler/-/issues/532
https://gitlab.freedesktop.org/poppler/poppler/-/issues/825
[pdf deflate bombs]: PDF Deflate bombs may cause crashes or resource exhaustion
https://gitlab.freedesktop.org/poppler/poppler/-/issues/878
[cairo max size]: Limits for the cairo backend
https://gitlab.freedesktop.org/cairo/cairo/-/blob/1.16/src/cairo-image-surface.c#L59
There's been some work in pixman, but it's incomplete?
https://gitlab.freedesktop.org/pixman/pixman/-/issues/6

_______________________________________________
poppler mailing list
poppler@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/poppler