Has anyone had the same problem, or does anyone have experience with this?


jonycus wrote:
> 
> Hi all,
> 
> I am trying to extract a whole .doc document and have had good results
> with text, tables and bullets, but I am still stuck on the images.
> AFAIK the images in the MS Word file are stored as .emz, which is a
> gzip-compressed EMF file. This is my code:
> 
> 
>         List picList = picTable.getAllPictures();
>         Picture picture = (Picture) picList.get(picC);
>         String folderPath = PATH;
>         String emzPath = folderPath+picture.suggestFullFileName()+".emz";
>         OutputStream image = new FileOutputStream(emzPath);
>         picture.writeImageContent(image);
>         image.close();
>         InputStream is = new FileInputStream(new File(emzPath));
>         GZIPInputStream gzipis = new GZIPInputStream(is);
>         OutputStream emfos = new FileOutputStream(
>             new File(folderPath+picture.suggestFullFileName()+".emf"));
>         byte[] buf = new byte[1024];
>         int len;
>         while ((len = gzipis.read(buf)) > 0) {
>           emfos.write(buf, 0, len);
>         }
>         gzipis.close();
>         emfos.close();
> 
> This should extract the EMF image file from the .emz. However, my code
> fails because gzipis (the supposed gzip InputStream) is not a gzip stream
> at all! It seems that the extracted image is not an .emz file. I tried
> another approach: saving the Word file as HTML (which stores the images
> in a separate folder), which gave me the images as .emz and .gif. The
> .emz file from that export and the one from my extraction differ in size,
> which suggests that my extraction is wrong? I can open the .emz file from
> the HTML export with gzip, but not my extracted file; for that one I get
> an error saying it is not a valid gzip file.
> 
> Any help with this?
> 
> Best regards,
> Vasko
> 
> 
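
In case it helps anyone debugging the same thing: a gzip stream always starts with the two magic bytes 0x1f 0x8b, so it is cheap to check whether the extracted bytes are gzip at all before handing them to GZIPInputStream. Below is a minimal sketch using only the JDK. The class and method names (GzipSniffer, looksLikeGzip, unwrapIfGzip) are made up for illustration, and it assumes the picture bytes have already been read into a byte array, for example by passing a ByteArrayOutputStream to writeImageContent. It does not explain why the extracted bytes differ from the HTML export; it only avoids the failure when they are not gzip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of POI: sniff for gzip before decompressing.
final class GzipSniffer {

    private GzipSniffer() {
    }

    /** Returns true if the data starts with the gzip magic bytes 0x1f 0x8b. */
    static boolean looksLikeGzip(byte[] data) {
        return data != null
                && data.length >= 2
                && (data[0] & 0xff) == 0x1f
                && (data[1] & 0xff) == 0x8b;
    }

    /**
     * Decompresses the data if it looks like gzip; otherwise returns the
     * same array unchanged, on the assumption that it is already raw
     * image content.
     */
    static byte[] unwrapIfGzip(byte[] data) throws IOException {
        if (!looksLikeGzip(data)) {
            return data;
        }
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(data));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[4096];
            int len;
            // read() returns -1 at end of stream
            while ((len = in.read(buf)) != -1) {
                out.write(buf, 0, len);
            }
            return out.toByteArray();
        }
    }
}
```

If looksLikeGzip returns false for the bytes written by writeImageContent, it may be worth inspecting the first few bytes in a hex editor to see what format they actually are before choosing a file extension.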

-- 
View this message in context: 
http://old.nabble.com/HWPF-image-extraction-problem-tp26300123p26498551.html
Sent from the POI - User mailing list archive at Nabble.com.

