Hi Ben.

(mirrored to the list, who knows when it'll get delivered)

I'm a maintainer of Pillow, an actively maintained successor to PIL with bug fixes and new features.
I've taken a quick look, and the bug appears to be in Pillow as well.

It appears that once the image gets over about 23 MP, specifically at 8285x2780, the save fails. It works at 8284x2780, and also if the selected subregion is panned. I've traced it down to the calls to zlib: Pillow/PIL calls it to compress the PNG-formatted scanlines, and somewhere in there (but not at the end of the image) the compressed data stream is getting corrupted.

However, there is a small workaround that appears to help. If a compress_level is passed into the save call, requesting anything from 0 (no compression) to 9 (max), it saves properly. Passing in -1 (the default compression level, equivalent to 6) triggers the failure.
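For reference, the "equivalent to 6" part can be checked against plain zlib, outside of Pillow. This is only a minimal sketch: the sample data is an arbitrary stand-in for scanlines (my assumption, not the real image), and it does not reproduce the failure. It only shows that zlib itself treats -1 and 6 the same way.

```python
import zlib

# Arbitrary, repetitive sample data standing in for PNG scanlines
# (an assumption for illustration, not the data from data.npz).
data = bytes(range(256)) * 64

default_stream = zlib.compress(data, -1)  # -1 is Z_DEFAULT_COMPRESSION
level6_stream = zlib.compress(data, 6)    # explicit level 6

same = default_stream == level6_stream
round_trip_ok = zlib.decompress(default_stream) == data

print("level -1 and level 6 give identical output:", same)
print("round trip ok:", round_trip_ok)
```

If the two streams match here, the difference seen in the save call is more likely in how the level is passed down to zlib than in zlib's own handling of the default level.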

I've been testing it with the following:
(If ImageFile.MAXBLOCK is big enough, all of the image data ends up in one IDAT block, and the decompression can be tested with the plain zlib decompression call. That part is commented out here.)


from PIL import Image, ImageFile
import numpy
import zlib

npz = numpy.load("data.npz")
imagedata = npz['arr_0']
palette = npz['arr_1']

Image.DEBUG = 1
#ImageFile.MAXBLOCK = 512*1024

print(imagedata.shape)

def test(p):
    i1 = imagedata[0:2780,0:p]
    im = Image.fromarray(i1, 'P')
    im.putpalette(palette)
    print (im)
    im.save('tmp.png',compress_level=9 )
    im2 = Image.open('tmp.png')
    print (im2)
    print ("Verify: %s" %im2.verify())
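    try:
        # verify() leaves the image object unusable, so reopen before loading
        im2 = Image.open('tmp.png')
        # ImageFile.LOAD_TRUNCATED_IMAGES stays at its default (False) so load() is strict
        im2.load()
        print ("%s success" %p)
    except Exception as e:
        print ("FAIL: %s (%s)" % (p, e))
        #raise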

    #with open('tmp.png','rb') as f:
    #    f.seek(821)
    #    s = zlib.decompress(f.read(385128))
    #    print ("successful decompress")


test(8284)
test(8285)
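
The commented-out zlib check above depends on hard-coded offsets (821 and 385128) and on everything fitting in one IDAT block. A rough sketch of a more general check follows: walk the PNG chunks, join all IDAT payloads, and hand them to zlib directly. The helper names (make_chunk, iter_chunks, idat_payload) and the tiny hand-built 4x2 test image are made up for illustration. This is not how Pillow does it internally.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def make_chunk(ctype, body):
    """Build one PNG chunk: length, type, body, CRC over type + body."""
    crc = zlib.crc32(ctype + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)

def iter_chunks(png_bytes):
    """Yield (type, body) for every chunk after the 8-byte signature."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(png_bytes):
        length, ctype = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        yield ctype, png_bytes[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + body + 4 CRC

def idat_payload(png_bytes):
    """Concatenate all IDAT bodies; together they form one zlib stream."""
    return b"".join(body for ctype, body in iter_chunks(png_bytes)
                    if ctype == b"IDAT")

# Hand-built 4x2 palette image (hypothetical test data), with the
# compressed stream deliberately split across two IDAT chunks.
width, height = 4, 2
raw = b"".join(b"\x00" + bytes(range(width)) for _ in range(height))
compressed = zlib.compress(raw)
half = len(compressed) // 2
png = (PNG_SIGNATURE
       + make_chunk(b"IHDR", struct.pack(">IIBBBBB", width, height, 8, 3, 0, 0, 0))
       + make_chunk(b"PLTE", bytes(12))
       + make_chunk(b"IDAT", compressed[:half])
       + make_chunk(b"IDAT", compressed[half:])
       + make_chunk(b"IEND", b""))

# zlib.decompress raises zlib.error if the joined stream is corrupt.
scanlines = zlib.decompress(idat_payload(png))
print(len(scanlines), "bytes of filtered scanline data")
```

To run it against a real file, replace the hand-built bytes with open('tmp.png', 'rb').read(); a corrupt stream should then fail in zlib.decompress regardless of how MAXBLOCK split the data.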

The error I'm getting, "zlib.error: Error -3 while decompressing data: invalid distances set", would indicate that the compressed data stream is either corrupt or was compressed with different settings, perhaps a larger compression window.
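
As an illustration of the "different settings" case only: plain zlib does refuse a stream when the decompressor is given a smaller window than the one the stream was written with. This is a minimal sketch with made-up sample data. The message it produces is not necessarily the same as the one above, so it shows the general failure mode rather than reproducing this bug.

```python
import zlib

# Arbitrary sample data (an assumption for illustration).
data = bytes(range(256)) * 512

# Compress with the largest window (wbits=15, i.e. 32 KiB).
compressor = zlib.compressobj(6, zlib.DEFLATED, 15)
stream = compressor.compress(data) + compressor.flush()

# Matching window size: decompresses cleanly.
round_trip_ok = zlib.decompressobj(15).decompress(stream) == data

# Smaller window (wbits=9, i.e. 512 bytes): zlib rejects the stream.
try:
    zlib.decompressobj(9).decompress(stream)
    mismatch_error = None
except zlib.error as exc:
    mismatch_error = str(exc)

print("round trip with matching window:", round_trip_ok)
print("error with smaller window:", mismatch_error)
```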

I'm hoping to find the actual bug here, rather than a vague workaround.

eric


On 07/26/2013 04:55 AM, Ben Taylor wrote:
Hi all

I've got some data that should be fine (came from a known-good netCDF
file) but causes PIL to write an invalid image if it is used to save in
PNG format (it's happy writing GIF format).

The same code writes PNGs quite happily 99% of the time, but just
occasionally we generate a netCDF that causes this bug - we don't know
why. We're not sure if the problem is actually in PIL or if it might be
in libpng. Anyone mind taking a look please?

Test data at ftp://rsg.pml.ac.uk/rsg/benj/pil_problem/:
data.npz - Numpy npz data with data that causes the issue
working_data.npz - npz file from another source that works fine
test_harness.py - Run this with the two test files to demonstrate the
problem - the PNG generated from data.npz is corrupt.

Test harness code below (just to demonstrate it's nothing complicated).

TIA
Ben

#!/usr/bin/env python

import numpy
import Image

def test(infile, prefix):

    npz = numpy.load(infile)
    imagedata = npz['arr_0']
    palette = npz['arr_1']

    out_image = Image.fromarray(imagedata, 'P')
    out_image.putpalette(palette)

    out_image.save(prefix + ".gif") # ok
    out_image.save(prefix + ".png") # bust
# end function

test("data.npz", "broken")
test("working_data.npz", "working")


_______________________________________________
Image-SIG maillist  -  Image-SIG@python.org
http://mail.python.org/mailman/listinfo/image-sig
