Re: possible to extract/find refence to image data in word files

Peter Brouwer Fri, 29 Jul 2005 02:24:02 -0700

Srinivas,

Thanks for your info.

I have now found that my previous information was not completely correct. In version 2.5.1 the FIBAbstractType is complete.

I am on the same path as you; this is what I have right now. I am still struggling because I get an array index out of bounds exception. This exception is thrown because _fib.getLcbDggInfo() gives me a wrong length for the office art object table data, or because I am interpreting the data wrongly. I do now at least get an EscherRecord which points me more or less to the correct place where the data (JPEG, in my case) is stored. However, the data is preceded by 25 bytes whose purpose I do not know yet.
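
One way to get at the image without understanding the leading bytes is to scan for the JPEG start marker (the bytes FF D8 FF) and keep everything from there on. The sketch below uses plain Java only; the class and method names are hypothetical and are not part of POI. If the picture is stored as an Escher/OfficeArt JPEG blip, 25 bytes would be consistent with an 8-byte record header, a 16-byte UID and a 1-byte tag as laid out in Microsoft's OfficeArt format documentation, but that is an assumption about this file and has not been verified here.

```java
import java.util.Arrays;

// Hypothetical helper, not part of POI: finds the JPEG payload inside a
// byte array that may start with an unknown prefix.
final class JpegStart {

    // Returns the index of the first FF D8 FF sequence at or after 'from',
    // or -1 if there is none.
    static int indexOfJpegStart(byte[] data, int from) {
        for (int i = Math.max(from, 0); i + 2 < data.length; i++) {
            if ((data[i] & 0xFF) == 0xFF
                    && (data[i + 1] & 0xFF) == 0xD8
                    && (data[i + 2] & 0xFF) == 0xFF) {
                return i;
            }
        }
        return -1;
    }

    // Returns a copy of the bytes from the JPEG start marker to the end,
    // or an empty array if no marker is found.
    static byte[] stripPrefix(byte[] recordBytes) {
        int start = indexOfJpegStart(recordBytes, 0);
        if (start < 0) {
            return new byte[0];
        }
        return Arrays.copyOfRange(recordBytes, start, recordBytes.length);
    }
}
```

Calling JpegStart.stripPrefix(bytes) on the raw record bytes and writing the result to a .jpg file is a quick way to check whether the data is intact. The prefix could in principle contain FF D8 FF by chance, so this is a diagnostic aid rather than a proper parser.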

Another thing I found is that in my case the file has both a 0Table and a 1Table stream. The _fib.isFWhichTblStm() flag should say which table to use. I wonder now whether, because my file seems to be a complex file, I should still use the 1Table (it is bigger and might not give an ArrayIndexOutOfBoundsException).
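
To double-check what the FIB says independently of POI, the flag word can be read straight from the main stream. The sketch below assumes the layout given in Microsoft's Word binary format documentation: a little-endian 16-bit flag field at offset 0x0A, with fComplex at 0x0004, fHasPic at 0x0008 and fWhichTblStm at 0x0200. The class and method names are hypothetical. As far as that documentation describes it, fWhichTblStm alone selects the table stream, whether or not the file is complex.

```java
// Hypothetical helper, not part of POI: reads the FIB flag word directly
// from the bytes of the "WordDocument" stream.
final class FibFlags {
    static final int FLAGS_OFFSET = 0x0A;       // assumed offset of the flag word
    static final int F_COMPLEX = 0x0004;        // assumed bit for fComplex
    static final int F_HAS_PIC = 0x0008;        // assumed bit for fHasPic
    static final int F_WHICH_TBL_STM = 0x0200;  // assumed bit for fWhichTblStm

    // Reads the 16-bit little-endian flag word.
    static int readFlags(byte[] mainStream) {
        if (mainStream.length < FLAGS_OFFSET + 2) {
            throw new IllegalArgumentException("main stream too short for a FIB");
        }
        return (mainStream[FLAGS_OFFSET] & 0xFF)
                | ((mainStream[FLAGS_OFFSET + 1] & 0xFF) << 8);
    }

    static boolean isComplex(byte[] mainStream) {
        return (readFlags(mainStream) & F_COMPLEX) != 0;
    }

    static boolean hasPic(byte[] mainStream) {
        return (readFlags(mainStream) & F_HAS_PIC) != 0;
    }

    // Picks the table stream name from fWhichTblStm only.
    static String tableStreamName(byte[] mainStream) {
        return (readFlags(mainStream) & F_WHICH_TBL_STM) != 0 ? "1Table" : "0Table";
    }
}
```

Printing FibFlags.tableStreamName(_mainStream) next to the value POI reports shows whether the two agree for a given file.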



       POIFSFileSystem filesystem = new POIFSFileSystem(istream);

       // read in the main stream.
       DocumentEntry documentProps =
           (DocumentEntry) filesystem.getRoot().getEntry("WordDocument");
       byte[] _mainStream = new byte[documentProps.getSize()];
       filesystem.createDocumentInputStream("WordDocument").read(_mainStream);

       // use the fib to determine the name of the table stream.
       org.apache.poi.hdf.model.hdftypes.FileInformationBlock _fib =
           new org.apache.poi.hdf.model.hdftypes.FileInformationBlock(_mainStream);

       // always seems to be false, something to do with complex file?
       System.out.println("has image:" + _fib.isFHasPic());

       String name = "0Table";
       System.out.println("use table1: " + _fib.isFWhichTblStm());
       if (_fib.isFWhichTblStm()) {
           name = "1Table";
       }

       // read in the table stream.
       DocumentEntry tableProps =
           (DocumentEntry) filesystem.getRoot().getEntry(name);
       byte[] _tableStream = new byte[tableProps.getSize()];
       filesystem.createDocumentInputStream(name).read(_tableStream);

       DefaultEscherRecordFactory factory = new DefaultEscherRecordFactory();

       int pos = _fib.getFcDggInfo();
       while (pos < (_fib.getFcDggInfo() + _fib.getLcbDggInfo())) {
           try {
               org.apache.poi.ddf.EscherRecord r =
                   factory.createRecord(_tableStream, pos);
               int bytesRead = r.fillFields(_tableStream, pos, factory);
               System.out.println("bytes read" + bytesRead);
               System.out.println(r.toString());
               pos += bytesRead;
           } catch (Exception e) {
               e.printStackTrace();
               // stop the loop after the first failure
               pos = _fib.getFcDggInfo() + _fib.getLcbDggInfo();
           }
       }
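
To narrow down where the out of bounds exception comes from, it can help to walk the record headers by hand with explicit bounds checks before handing the data to the record factory. The sketch below is plain Java and does not use POI. It assumes the usual 8-byte Escher record header (2 bytes of version/instance bits, a 2-byte record id and a 4-byte body length, all little-endian) and only visits top-level records, skipping over whatever a container holds. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of POI: lists top-level record
// headers in a byte range without ever reading past the end of the buffer.
final class EscherHeaderWalk {
    static final int HEADER_SIZE = 8; // assumed Escher record header size

    static int u16(byte[] b, int off) {
        return (b[off] & 0xFF) | ((b[off + 1] & 0xFF) << 8);
    }

    static long u32(byte[] b, int off) {
        return (b[off] & 0xFFL)
                | (b[off + 1] & 0xFFL) << 8
                | (b[off + 2] & 0xFFL) << 16
                | (b[off + 3] & 0xFFL) << 24;
    }

    // Returns one {offset, recordId, bodyLength} triple per record whose
    // header and body fit inside [start, start + length) and the buffer.
    static List<int[]> walk(byte[] buf, int start, int length) {
        List<int[]> records = new ArrayList<>();
        if (start < 0 || length < 0) {
            return records;
        }
        long end = Math.min((long) start + length, (long) buf.length);
        long pos = start;
        while (pos + HEADER_SIZE <= end) {
            int p = (int) pos;
            int recordId = u16(buf, p + 2);
            long bodyLength = u32(buf, p + 4);
            long next = pos + HEADER_SIZE + bodyLength;
            if (next > end) {
                break; // the header claims more data than is available
            }
            records.add(new int[] {p, recordId, (int) bodyLength});
            pos = next;
        }
        return records;
    }
}
```

Calling EscherHeaderWalk.walk(_tableStream, _fib.getFcDggInfo(), _fib.getLcbDggInfo()) and printing the triples shows whether the headers line up with the length reported by the FIB. If the walk stops early, either the offset, the length or the assumed record layout is off for that file.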

----- Original Message -----
From: "Srinivas" <[EMAIL PROTECTED]>
To: "POI Users List" <poi-user@jakarta.apache.org>
Sent: Friday, July 29, 2005 9:48 AM
Subject: Re: possible to extract/find refence to image data in word files

Hi Peter,

I am also trying out code to solve this problem.
I found a link that may be useful to us:
http://jakarta.apache.org/poi/apidocs/org/apache/poi/ddf/EscherBlipWMFRecord.html

thanks
srinivas

Peter Brouwer <[EMAIL PROTECTED]> wrote:
Hi Srinivas,

I am going through the source now and it looks like POI doesn't support
image extraction right now. If I look at the FIBAbstractType class (extended
by FileInformationBlock) I see that a lot of tags are missing. I am trying
now to add some of them and get the image stuff working. It would be great if
you tried to get the stuff working too; so far I haven't succeeded.

Peter

----- Original Message -----
From: "Srinivas"
To: "POI Users List"
Sent: Wednesday, July 27, 2005 8:10 AM
Subject: Re: possible to extract/find refence to image data in word files

Hi friends,

I am also following this track. Please suggest what the ways are to
extract the images from a Word document.


Thanks
Srinivas

Peter Brouwer wrote:

Hi there,

I have been trying for a few days now to extract an image from a Word
document. I haven't succeeded so far. I am afraid now that POI is not able
to do this.

Has anybody succeeded in this, or does anybody know whether it is possible
to extract image data from MS Word documents using POI?

Thanks in advance,
Peter





---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
Mailing List: http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta Poi Project: http://jakarta.apache.org/poi/



