Re: help with POI & Co.

Nick Burch Wed, 10 Jan 2007 02:12:11 -0800

On Tue, 9 Jan 2007, Joerg Hohwiller wrote:

Besides I used the official POI release which is very old. I did NOT try the
HEAD from svn.

You should probably try with the svn head, you will generally have more luck with HWPF and HSLF from there.

I did NOT even open most of the documents. The constructor caused an exception. Something like illegal file format or magic number or something.

I use hslf for a web spider that tries lots of random documents, and it's ok on almost all of them, so it's odd that you're having such problems.

(Normally you want to catch CorruptPowerPointFileException and
EncryptedPowerPointFileException, and skip over them, and catch
ArrayIndexOutOfBoundsException, and report bugs for those)

If an ArrayIndexOutOfBoundsException is thrown by a method where the user did not supply an index as a parameter, the implementation looks like a hack to me. The same applies to NullPointerExceptions.

These two are caused by powerpoint files containing things that we didn't know they might, and which our test documents don't. If you report bugs for them, and include the problem document, we can try and figure out which of our assumptions on the file format are wrong, and work to fix them.
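
A minimal sketch of that catch-and-skip pattern, for anyone wiring it into a spider. To keep it compilable without POI on the classpath, the two exception classes below are local stand-ins named after the hslf ones, and `TextExtractor`, `Outcome` and `process` are hypothetical names invented for the illustration, not part of hslf.

```java
import java.util.List;

class SpiderSketch {
    // Stand-ins for the hslf exceptions of the same name (assumption: in real
    // code you would import the POI classes instead of declaring these).
    static class CorruptPowerPointFileException extends IllegalStateException {
        CorruptPowerPointFileException(String msg) { super(msg); }
    }
    static class EncryptedPowerPointFileException extends IllegalStateException {
        EncryptedPowerPointFileException(String msg) { super(msg); }
    }

    // Hypothetical hook: whatever opens the file and pulls the text out.
    interface TextExtractor {
        String extract(String path);
    }

    enum Outcome { OK, SKIPPED, BUG }

    // Skip files that are known-bad, and record the ones that look like bugs
    // so they can be reported along with the problem document.
    static Outcome process(String path, TextExtractor extractor, List<String> bugReports) {
        try {
            extractor.extract(path);
            return Outcome.OK;
        } catch (CorruptPowerPointFileException | EncryptedPowerPointFileException e) {
            // Nothing to be done with these files, move on to the next one.
            return Outcome.SKIPPED;
        } catch (ArrayIndexOutOfBoundsException | NullPointerException e) {
            // Probably a wrong assumption about the file format: keep the
            // path so the document can be attached to a bug report.
            bugReports.add(path + ": " + e);
            return Outcome.BUG;
        }
    }
}
```

The point of the split is that only the second catch block ends up in the bug list, so the spider keeps running and you are left with a short list of documents worth attaching to a report.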

My problem is that I extract many parts of text twice from the file. It seems to me that they are really in there twice even though not visible to the powerpoint application user.

Yup, that's to be expected on quicksaved files. QuickButCruddyTextExtractor will do something similar.

Your only option if you want to avoid that is to implement all the PersistPtr stuff, then parse SlideListWithTexts, and DoTheRightThing(tm) with it all. At which point, you've re-implemented most of hslf....
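
To give a rough idea of why the text shows up twice: as far as I understand the format, a quicksave appends new copies of the changed records plus a PersistPtr table pointing at them, and leaves the old copies sitting in the file. The "latest entry wins" part can be sketched as below. The `liveOffsets` helper and the offsets in the usage note are made up for illustration, and none of this is hslf's actual API.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PersistPtrSketch {
    // edits: one map per save, oldest first. Each map goes from a persist id
    // to the byte offset of the record that save wrote for that id.
    // Returns the offset that is still "live" for every persist id.
    static Map<Integer, Integer> liveOffsets(List<Map<Integer, Integer>> edits) {
        Map<Integer, Integer> live = new TreeMap<>();
        for (Map<Integer, Integer> edit : edits) {
            // A later save overrides the offset an earlier save recorded
            // for the same persist id; untouched ids keep their old offset.
            live.putAll(edit);
        }
        return live;
    }
}
```

For example, if the first save wrote ids 1 and 2 at offsets 100 and 200, and a quicksave then rewrote id 2 at offset 900, the live table maps 1 to 100 and 2 to 900. The record at offset 200 is still in the file, which is what a scan-everything extractor picks up as the duplicate.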


Nick
