Re: Replacing images contents

Julien Plée Sat, 28 Aug 2010 04:47:08 -0700

Hi all,

I found a solution for embedding images back into the PDF document,
replacing the old image streams.

Quick notes about the process:

- I embed images that were extracted from the PDF with
  PDXObjectImage.write2file(). The files are edited in place, though I'm
  sure that the file format will be the same as the embedded stream;
- When I edit the files in place, I do not change the image configuration
  (width, height, etc. stay the same);
- The document is not saved between the extracting and embedding phases.

With regard to the example below, these notes are important because:

- I copy the raw file stream and do not update the stream dictionary;
- I use the COSObject number to identify each stream, and saving the PDF
  may change those numbers.


It may help someone else though it may not be the best solution.

/**
 * Embeds back every bitmap image of the document, reading the image
 * files from the specified directory.
 * @param doc Document to embed the images into
 * @param dir Directory containing the image files to embed
 * @throws Exception
 */
public static void embedImages( PDDocument doc, File dir )
throws Exception {
    if( dir.exists() ) {
        if( !dir.isDirectory() ) {
            // The path is a regular file: fall back to "<path>-img"
            dir = new File( dir.getCanonicalPath()+"-img" );
            embedImages( doc, dir );
            return;
        }
    }
    else {
        dir.mkdirs();
    }
    // Walk the xref table and pick every object whose subtype is Image
    Iterator<Entry<COSObjectKey, Integer>> xrefEntriesIt =
        doc.getDocument().getXrefTable().entrySet().iterator();
    while( xrefEntriesIt.hasNext() ) {
        COSObject object = doc.getDocument().getObjectFromPool(
                xrefEntriesIt.next().getKey() );
        if( object.getDictionaryObject( COSName.SUBTYPE )
                == COSName.IMAGE )
            embedSingleImage( object, dir );
    }
}

/**
 * Embeds back the image file matching a COSObject, read from the
 * specified directory.
 * The image may be a vector path, which is not handled yet. This is
 * guessed from the imageMask flag. However, there may be better
 * indicators.
 * IMPORTANT NOTICE: The file stream is directly embedded in the old
 * stream. If the image size changes, the final display will show
 * distortion.
 * @param imObj The COSObject referencing the image stream
 * @param dir The directory where the image file is read from
 * @throws Exception
 */
protected static void embedSingleImage( COSObject imObj, File dir )
throws Exception {
    PDXObjectImage im = (PDXObjectImage) PDXObject.createXObject(
            (COSStream) imObj.getObject() );
    if( im.getImageMask() ) return;
    // The file is named "<object number>.<suffix>"
    File inFile = new File( dir.getCanonicalPath()+File.separator
            +imObj.getObjectNumber().intValue()+"."+im.getSuffix() );
    if( !inFile.exists() )
        throw new Exception( "The file `"+inFile.getCanonicalPath()
        +"` doesn't exist and cannot be embedded." );
    InputStream newStream = new FileInputStream( inFile );
    // Raw copy: the stream dictionary is not updated
    OutputStream embeddedStream =
        im.getCOSStream().createFilteredStream();
    try {
        int bSize = 10240;
        byte[] b = new byte[bSize];
        int bytesRead = 0;
        while( ( bytesRead = newStream.read( b, 0, bSize ) ) > -1 )
            embeddedStream.write( b, 0, bytesRead );
    }
    finally {
        // Close both streams even if the copy fails
        embeddedStream.close();
        newStream.close();
    }
}
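
Since the stream dictionary is not updated, an edited file whose width or
height differs from the original would be displayed distorted. A possible
guard, as a minimal sketch only: check the file's pixel dimensions before
embedding it. The class and method names below (ImageSizeCheck,
hasDimensions) are hypothetical and not part of PDFBox; the expected values
would presumably come from the image object itself. The sketch relies on
javax.imageio, which may have no reader for some formats (for example CCITT
TIFF files on older JVMs); in that case it simply reports false.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of PDFBox.
public class ImageSizeCheck {

    /**
     * Returns true only if the file can be decoded by ImageIO and has
     * exactly the expected pixel dimensions.
     */
    public static boolean hasDimensions( File imageFile,
            int expectedWidth, int expectedHeight ) throws IOException {
        BufferedImage img = ImageIO.read( imageFile );
        if( img == null ) {
            // No registered ImageIO reader understands this file
            return false;
        }
        return img.getWidth() == expectedWidth
            && img.getHeight() == expectedHeight;
    }
}
```

In embedSingleImage(), such a check could run right after the inFile.exists()
test, so that a resized file is rejected before the stream is overwritten.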


Julien PLÉE

On 26 Aug 2010, at 00:47, [email protected] wrote:

Julien,
Doesn't this code[1] create a new image object which is in no way
attached to the PDF? Modifying "(COSStream) obj.getObject()" seems like
it'd do what you intend. I'm not familiar with
PDXObject.createXObject(), but it seems like that'd be creating a copy
of the data passed to it (similar to a copy constructor). Obviously,
modifying a copy isn't going to affect the original.

I'm pretty sure that's your problem, but I've never done anything with
streams or images in PDFs, so I'm afraid I don't know the way it's
supposed to be done.

Another thing which might be important: some PDF programs don't write
out anything in the xref table. This doesn't follow the spec, but Adobe
Reader opens them fine either way, so many people don't realize they're
out of spec (and thus expect your code to process them the same as a
proper PDF).

[1] PDXObjectImage image = (PDXObjectImage)
PDXObject.createXObject((COSStream) obj.getObject() );

----
Thanks,
Adam





From: Julien Plée <[email protected]>
To: [email protected]
Date: 08/25/2010 15:00
Subject: Replacing images contents



Hello,

I have to put a watermark stamp on images stored in PDF files and I'm
having a hard time embedding the images back into the PDF.
I'm using the XrefTable to filter images. For embedding, I'm trying to
replace the stream of the original object, but with no luck: the saved
PDF always looks the same.
Here is my method, focused on the PDXObjectImage:

/**
 * Replaces a PDF image content with content from an image file on the
 * file system identified by the object id.
 *
 * (this.doc : PDDocument)
 * @param obj
 * @throws IOException
 */
protected void embedImageBack(COSObject obj) throws IOException
{
    String path = "img/";
    PDXObjectImage image = (PDXObjectImage) PDXObject.createXObject(
            (COSStream) obj.getObject() );
    File inputFile = new File(
            path+obj.getObjectNumber()+"."+image.getSuffix() );
    PDXObjectImage newImage = null;
    if (image.getSuffix().equals("jpg"))
        newImage = new PDJpeg( this.doc,
                new FileInputStream(inputFile) );
    else
        newImage = new PDCcitt( this.doc,
                (RandomAccess) new RandomAccessFile( inputFile, "r" ) );
    image.getCOSStream().replaceWithStream(newImage.getCOSStream());
    this.shouldSaveDoc = true;
}

After all images have been processed, I save the document in a new
file, but apart from the file size changing, nothing visible happens.
Thanks for any help.

Julien PLÉE

