Nick Burch wrote:
> On Thu, 12 Jan 2006, Daniel Noll wrote:
>> But, is it possible to use the API to extract embedded images from
>> these documents as well?
>
> It's not currently possible to extract images from PowerPoint files.
> It's on the todo list, but I probably won't get around to it for
> another month or two (need to finish rich text first).

That sounds good, actually (although my personal bias would put images
before rich text, but that's beside the point).

What about Word and Excel? I've had a look around the HWPF and HSSF
APIs, but I can't easily see how to get to the images with those
either.

> If you fancied adding the support, I'd be happy to give you some
> pointers on where to start :-)

Actually, how long do you think it would take? We only really need to
get at the binary data for the embedded image files (e.g. source PNGs).
Being able to render the vector drawings would be a bonus, but it is
not a requirement at this point.
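
To make "binary data only" concrete, here is a rough sketch of the kind
of thing I mean. It is plain Java and does not use any POI API:
PngCarver and extractPngs are names I made up for illustration. It
assumes the bytes of a stream have already been read into memory and
that the PNG is stored there uncompressed and contiguous, which I have
not verified for any of the three formats. It finds the PNG signature
and walks the chunks up to IEND, without checking CRCs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: carves complete PNG files out of a block of bytes. */
public class PngCarver {
    // Eight-byte signature that starts every PNG file.
    private static final byte[] PNG_SIGNATURE = {
        (byte) 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'
    };

    /** Returns a copy of every complete PNG found in data, in order of appearance. */
    public static List<byte[]> extractPngs(byte[] data) {
        List<byte[]> images = new ArrayList<>();
        int pos = 0;
        while (pos <= data.length - PNG_SIGNATURE.length) {
            if (!matchesSignature(data, pos)) {
                pos++;
                continue;
            }
            int end = findEnd(data, pos + PNG_SIGNATURE.length);
            if (end < 0) {
                // Signature without a complete chunk sequence: skip it.
                pos++;
                continue;
            }
            images.add(Arrays.copyOfRange(data, pos, end));
            pos = end;
        }
        return images;
    }

    private static boolean matchesSignature(byte[] data, int offset) {
        for (int i = 0; i < PNG_SIGNATURE.length; i++) {
            if (data[offset + i] != PNG_SIGNATURE[i]) {
                return false;
            }
        }
        return true;
    }

    // Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
    // Returns the offset just past the IEND chunk, or -1 if the data runs out first.
    private static int findEnd(byte[] data, int offset) {
        long pos = offset;
        while (pos + 12 <= data.length) {
            int p = (int) pos;
            long length = ((data[p] & 0xFFL) << 24) | ((data[p + 1] & 0xFFL) << 16)
                    | ((data[p + 2] & 0xFFL) << 8) | (data[p + 3] & 0xFFL);
            long next = pos + 12 + length;
            if (next > data.length) {
                return -1;
            }
            boolean isEnd = data[p + 4] == 'I' && data[p + 5] == 'E'
                    && data[p + 6] == 'N' && data[p + 7] == 'D';
            if (isEnd) {
                return (int) next;
            }
            pos = next;
        }
        return -1;
    }
}
```

Something at this level would already cover our immediate need; proper
support in the API would obviously be preferable to carving bytes.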

If I implemented text now, then waited two months and implemented
images, our own backwards compatibility requirements would add yet more
work to my plate. If ripping the images out of all formats isn't "too
much" work, it might be simpler to get it out of the way earlier and
suffer fewer changes to our own code in the future.

Daniel
--
Daniel Noll
Nuix Australia Pty Ltd
Suite 79, 89 Jones St, Ultimo NSW 2007, Australia
Phone: (02) 9280 0699
Fax: (02) 9212 6902