Nick Burch wrote:
On Thu, 12 Jan 2006, Daniel Noll wrote:
But, is it possible to use the API to extract embedded images from these documents as well?

It's not currently possible to extract images from PowerPoint files. It's on the todo list, but I probably won't get around to it for another month or two (need to finish rich text first).

That sounds good, actually (my personal bias would put images before rich text, but that's not really the point).

What about Word and Excel? I've had a bit of a navigate around the HWPF and HSSF APIs but I can't easily see how to get to the images with those either.
If you fancied adding the support, I'd be happy to give you some pointers on where to start.

:-)

Actually, how long do you think it would take? We only really need to get at the binary data for the embedded image files (e.g. source PNGs). Being able to render the vector drawings would be a bonus, but not necessarily a requirement at this point.

If I implemented text now, then waited two months and implemented images, our own backwards-compatibility requirements might add quite a bit of work to my plate. If the work to rip out the images in all formats isn't "too much", it might be simpler to get it out of the way earlier and suffer fewer changes to our own code in the future.

Daniel


--
Daniel Noll

Nuix Australia Pty Ltd
Suite 79, 89 Jones St, Ultimo NSW 2007, Australia
Phone: (02) 9280 0699
Fax:   (02) 9212 6902



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
Mailing List:     http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta Poi Project:  http://jakarta.apache.org/poi/
