Hi Andreas,

I appreciate the offer.  After some more digging I have found that the 
assumption made by this code snippet (from 
Ole10Native.createFromEmbeddedOleObject) is not 100% reliable:

        try {
            directory.getEntry("\u0001Ole10ItemName");
            plain = true;
        } catch (FileNotFoundException ex) {
            plain = false;
        }

What I have found is that some documents that do not contain this entry 
(i.e. plain=false) are nevertheless extractable if you set plain=true.

So I have made the following (very similar) method to replace the call:

    private Ole10Native resilientCreateFromEmbeddedOleObject(DirectoryNode directory)
            throws IOException, Ole10NativeException {

        final String OLE10_NATIVE = "\u0001Ole10Native";
        boolean plain;

        // Same heuristic as POI: an item-name entry suggests a "plain" stream.
        try {
            directory.getEntry("\u0001Ole10ItemName");
            plain = true;
        } catch (FileNotFoundException ex) {
            plain = false;
        }

        DocumentEntry nativeEntry = (DocumentEntry) directory.getEntry(OLE10_NATIVE);
        byte[] data = new byte[nativeEntry.getSize()];

        // readFully, because a single read() is not guaranteed to fill the buffer.
        DocumentInputStream in = directory.createDocumentInputStream(nativeEntry);
        try {
            in.readFully(data);
        } finally {
            in.close();
        }

        // Have 2 goes at this - 'plain' can lie!
        try {
            return new Ole10Native(data, 0, plain);
        } catch (Ole10NativeException e) {
            // Second go with the flag flipped; if this fails too, the
            // exception propagates to the caller.
            return new Ole10Native(data, 0, !plain);
        }
    }

This gives a higher success rate.  I will let you know what else I find :-)

Kind regards,

- Chris

On 16 Jul 2014, at 23:00, Andreas Beeker <andreas.bee...@gmx.de> wrote:

Hi Chris,

> On 16.07.2014 15:24, Chris Bamford wrote:
> Looking in the source of Ole10Native at the offending line I see:
> if (totalSize < ofs) {
> throw new Ole10NativeException("Invalid Ole10Native");
> }
>
> Can anyone shed any light on what this means and why it happens?

The MS docs [1] say very little about that stream, so the code is largely 
guesswork :|
There are Ole10Native streams without an actual data part, e.g. (some) 
equation editor objects come without one and somehow encode their data in 
the filename.
But the OLE objects I have looked at so far all had a label, a filename and 
a command, or at least 3 length-prefixed byte arrays.
So this line checks whether something went wrong with the length prefixes.
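
To illustrate the kind of check described above, here is a minimal sketch. 
It assumes a simplified, hypothetical layout (a 4-byte little-endian total 
size followed by a fixed number of fields, each with its own 4-byte length 
prefix). This is not the real Ole10Native format and not POI's parser; the 
class name LengthPrefixSketch and the method readFields are made up for the 
example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified layout; NOT the real Ole10Native format. */
final class LengthPrefixSketch {

    /**
     * Assumed layout: int32 LE totalSize (number of bytes after the prefix),
     * then fieldCount fields, each an int32 LE length plus that many bytes.
     * Throws IllegalArgumentException if any prefix overruns the total size.
     */
    static List<byte[]> readFields(byte[] data, int fieldCount) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.remaining() < 4) {
            throw new IllegalArgumentException("Missing total-size prefix");
        }
        int totalSize = buf.getInt();
        if (totalSize < 0 || totalSize > buf.remaining()) {
            throw new IllegalArgumentException("Total size exceeds buffer");
        }
        int end = buf.position() + totalSize;

        List<byte[]> fields = new ArrayList<>();
        for (int i = 0; i < fieldCount; i++) {
            // The length prefix itself must fit inside the declared size.
            if (end - buf.position() < 4) {
                throw new IllegalArgumentException("Length prefix past declared size");
            }
            int len = buf.getInt();
            // The field body must fit as well; otherwise a prefix is wrong.
            if (len < 0 || len > end - buf.position()) {
                throw new IllegalArgumentException("Field " + i + " overruns declared size");
            }
            byte[] field = new byte[len];
            buf.get(field);
            fields.add(field);
        }
        return fields;
    }
}
```

A buffer whose prefixes are consistent yields the requested fields; a buffer 
where any prefix points past the declared total size is rejected up front, 
which is the same kind of failure the "Invalid Ole10Native" exception 
reports.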

If you can share your file, please open a bug entry or, alternatively, send 
it to my private email.
I would then try to figure out how the bin object could be handled.
Currently I don't have much time and my priority is to finish that XML 
signature stuff, so that may take some time ... sorry

Andi.


[1] http://msdn.microsoft.com/en-us/library/dd942447.aspx




Chris Bamford
Senior Developer
m: +44 7860 405292
p: +44 207 847 8700
w: www.mimecast.com
Address click here: www.mimecast.com/About-us/Contact-us/





