Hi All, I'm using PDFBox 1.6 to generate PDF files. These files contain some simple text and JPEG images. The JPEGs are small (~157x200) and are thumbnails of other documents.
The problem is that only about half of my images display. The rest show a blank box where the image should be. Also, if I run a viewer such as pdfedit or evince from the command line, I see errors:

    jason@butters:~/Desktop$ evince msg4.pdf
    Error: Could not find start of jpeg data
    Error: Could not find start of jpeg data
    Error: Could not find start of jpeg data
    Error: Could not find start of jpeg data
    Error: Could not find start of jpeg data

Looking at PDJpeg, it appears to read my JPEG into a BufferedImage and then recompress it to the stream. The problem (I think) is that, going by the PDF spec, the stream should really be just the raw DCT data. However, when I look at the PDFs generated by PDFBox, I see the JPEG headers (e.g. 0xff, ... "JFIF") in the stream. It seems the PDF viewers are being lenient and trying to find the DCT data, but giving up on some of my images.

Does this sound correct?

Thanks,
Jason
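
In case it helps anyone reproduce this, below is a minimal sketch of how the bytes of one image stream could be inspected once they have been dumped to a file. It only checks for the JPEG start-of-image marker (0xFF 0xD8) and looks for the "JFIF" tag. The class name, method names, and the idea of passing a dump file as the first argument are all hypothetical and not part of PDFBox.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of PDFBox: inspects raw bytes dumped from an image stream.
public class JpegMarkerCheck {

    // True if the data begins with the JPEG start-of-image marker (0xFF 0xD8).
    static boolean startsWithSoi(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xFF) == 0xFF
                && (data[1] & 0xFF) == 0xD8;
    }

    // Index of the first ASCII "JFIF" tag in the data, or -1 if it is absent.
    static int indexOfJfif(byte[] data) {
        byte[] tag = "JFIF".getBytes(StandardCharsets.US_ASCII);
        for (int i = 0; i + tag.length <= data.length; i++) {
            boolean match = true;
            for (int j = 0; j < tag.length; j++) {
                if (data[i + j] != tag[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: java JpegMarkerCheck <dumped-stream-file>");
            return;
        }
        // Assumption: args[0] is a file holding the raw bytes of one image stream.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("starts with SOI marker: " + startsWithSoi(data));
        System.out.println("offset of JFIF tag:     " + indexOfJfif(data));
    }
}
```

Running this against the streams of a working image and a blank one might show whether the two actually differ at the start of the data.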

