Hi,

We've encountered some problems rendering PDF files to JPEG. I've
narrowed them down to a short set of test cases, which turn out to
expose several different issues depending on which platform we test
on. I'd like to submit my test case to the issue tracker, but I'm not
sure whether to submit it as one bug or four. Do the developers have
a preference?

List of issues, and test code, below... a .zip file with sample code
and pdf files is ready to be submitted.

The original problem we were coping with was the fact that pdf scans
from our departmental networked copier always render to a black page.
The other issues were just encountered while testing.

BTW, it is *entirely* possible that I'm just doing something wrong;
I'm new to PDFBox. Is there something obviously wrong with my test
code?

Thanks,

Sarah



    Problem #1: The file "ItDoesntWorkScan.pdf" renders to an empty
    black page. This file is a copy of "ItDoesntWorkPrinted.pdf"
    that has been printed on paper, and then scanned with
    a Xerox WorkCentre 5030 scanner, which then emails a pdf file
    back to the user.
    Tested On:
        - Mac OS 10.6
        - Windows 7
        - Ubuntu 10.10
    Unfortunately, the WorkCentre 5030 doesn't appear to have
    many user-settable options for scanning to PDF, so we weren't
    really able to try scanning with settings other than the defaults.


    Problem #2: On MacOS, running the headless tests ("ant run-headless")
    generates multiple instances of messages like this:

        *** __NSAutoreleaseNoPool(): Object 0x10b60a5a0 of class
        NSConcreteMapTableValueEnumerator autoreleased with no pool
        in place - just leaking


    Problem #3: Rendering TestRender.pdf adds an odd-looking (different
    font?) question mark to the end of every line. These are not present
    in the original PDF file.
    Tested On:
        - Mac OS 10.6
        - Windows 7
        - Ubuntu 10.10


    Problem #4: On a plain vanilla Ubuntu 10.10 install, running
    run-all failed to render any text, and threw lots of exceptions:

        org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.getawtFont(PDTrueTypeFont.java:425)

    ...however, installing the package "ttf-mscorefonts-installer"
    made those exceptions go away.
    (ubuntu1010_output.txt shows the exceptions;
    ubuntu1010_try2_output.txt is a run after the extra fonts are installed)

    Might be able to fix this one by setting UNKNOWN_FONT in
    Resources/PDFBox_External_Fonts.properties, but it would seem like
    it should choose some reasonable default if it isn't set...
    shouldn't it?




/* $Id$ */

import java.io.*;
import java.util.*;
import java.awt.image.*;

import org.apache.pdfbox.exceptions.*;
import org.apache.pdfbox.pdmodel.*;
import org.apache.pdfbox.util.*;

import javax.imageio.*;
import javax.imageio.stream.*;
import javax.imageio.metadata.*;

/**
 * Test scan from Xerox WorkCentre 5030.
 */
public class TestRender {
    File infile;
    File outfile;
    private PDDocument document;
    private int imageType = BufferedImage.TYPE_INT_RGB;
    private int resolutionDPI = 96;
    private float imageQuality = 0.75f;

    public TestRender(String filename,String outdir) {
        infile = new File(filename);
        // Strip only a trailing ".pdf" before appending the new extension
        outfile = new File(outdir,
            infile.getName().replaceAll("\\.pdf$","")+".jpeg");
    }

    public void render() throws IOException {
        System.out.println();
        System.out.println();
        System.out.println("Rendering "+infile+" to "+outfile);
        document = PDDocument.load(infile);
        List pages = document.getDocumentCatalog().getAllPages();
        PDPage page = (PDPage)(pages.get(0));
        BufferedImage image = page.convertToImage(imageType,resolutionDPI);

        BufferedOutputStream outStream = new BufferedOutputStream(
            new FileOutputStream(outfile));
        ImageOutputStream imageOutStream
            = ImageIO.createImageOutputStream(outStream);
        ImageWriter iWriter = null;
        try {
            iWriter = ImageIO.getImageWritersByFormatName("JPEG").next();
            ImageWriteParam params = iWriter.getDefaultWriteParam();

            params.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            params.setCompressionType("JPEG");
            params.setCompressionQuality(imageQuality);

            iWriter.setOutput(imageOutStream);

            IIOMetadata meta = null;

            // object to aggregate image, thumbnails, and metadata
            IIOImage iioImage = new IIOImage(image,null,meta);

            iWriter.write(
                null,           // optional stream metadata. null=use default
                iioImage,       // object w/image, thumbnails, and metadata
                params);        // params for writing process
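        } finally {
            if (iWriter!=null) iWriter.dispose();
            document.close();
            // Close the ImageOutputStream first so any cached bytes are
            // flushed into outStream before outStream itself is closed.
            imageOutStream.close();
            outStream.close();
        }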

    }

    public static void main(String[] args) {
        try {
            String outdir = args[0];

            // File exported as PDF from OmniGraffle 5.2.3 (OS X 10.6)
            new TestRender("pdf/ItDoesntWorkExport.pdf",outdir).render();

            // File printed as PDF (OS X 10.6)
            new TestRender("pdf/ItDoesntWorkPrinted.pdf",outdir).render();

            // File printed on paper, then scanned w/Xerox WorkCentre 5030
            new TestRender("pdf/ItDoesntWorkScan.pdf",outdir).render();

            // This code, printed as PDF from BBEdit 9.6.3 (OS X 10.6)
            new TestRender("pdf/TestRender.pdf",outdir).render();

        } catch(Throwable t) {
            t.printStackTrace();
        }
    }

}
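
For Problem #1, it may help to tell whether the page is already black
when it comes back from convertToImage(), or only after JPEG encoding.
Below is a minimal sketch of such a check, using nothing but
java.awt.image. The class name BlackPageCheck and its methods are
hypothetical helpers, not part of PDFBox, and "black" is assumed to
mean an RGB value of exactly 0x000000.

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper (not part of PDFBox): detects an all-black image. */
public class BlackPageCheck {

    /** Returns the fraction of pixels whose RGB value is exactly 0x000000. */
    public static double blackFraction(BufferedImage image) {
        int width = image.getWidth();
        int height = image.getHeight();
        long black = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                // Mask off the alpha byte so only the RGB channels are compared
                if ((image.getRGB(x, y) & 0x00FFFFFF) == 0) {
                    black++;
                }
            }
        }
        return (double) black / ((long) width * height);
    }

    /** True when every pixel in the image is black. */
    public static boolean isAllBlack(BufferedImage image) {
        return blackFraction(image) == 1.0;
    }

    public static void main(String[] args) {
        // A freshly created TYPE_INT_RGB image is zero-filled, i.e. black
        BufferedImage blank = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);

        // Same size, but with every pixel set to white
        BufferedImage white = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 4; y++) {
            for (int x = 0; x < 4; x++) {
                white.setRGB(x, y, 0xFFFFFF);
            }
        }

        System.out.println("blank all black: " + isAllBlack(blank));
        System.out.println("white all black: " + isAllBlack(white));
    }
}
```

In TestRender.render(), this could be called on the BufferedImage
right after convertToImage() and before the JPEG is written. If it
reports true for ItDoesntWorkScan.pdf, the page is already black
before the ImageIO code runs.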
