Hi,

Sorry, I can't answer any of these questions, for privacy / social reasons. I'm aware that this lack of information may prevent the problem from being fixed, which is why I didn't open an issue.

Tilman

On 06.09.2016 at 19:27, Javen O'Neal wrote:
Can you describe how the file is produced? Or any way of producing a
sample-file that does not contain sensitive information?

Was this file created in Outlook for Windows or Mac? Which version? If not,
some other software? Does Outlook report that the data file is corrupted?

On Sep 6, 2016 9:51 AM, "Tilman Hausherr" <thaush...@t-online.de> wrote:
Sorry I can't, due to confidentiality. That's why I had hoped that the
debug output and the HMEFDumper output would help. If your (or my)
mailreader broke up the lines so that you see a mess like below, see it
here: http://pastebin.com/C1B7F4v0 . I could also run some test code if
needed (I still have the mail)... OTOH, this is not really a big deal.
Tilman


On 05.09.2016 at 15:43, Dominik Stadler wrote:
Hi,

I did take a quick look. It seems the developer who implemented this
thought a properly formatted file would never miss the attachment data and
thus used an IllegalArgumentException. We can surely document this a bit
better, though, and if there really are files out there that Microsoft
tools produce and can read back in that show this, then we should adjust
the code to handle it more gracefully.

However, the few sample-files that we have available all work fine. Can you
describe how the file is produced? Or any way of producing a sample-file
that does not contain sensitive information? We always try to add
unit-tests which verify fixes instead of doing them blindly.

Dominik.

On Wed, Aug 31, 2016 at 7:06 PM, Tilman Hausherr <thaush...@t-online.de> wrote:

Hello,

I'm getting an IllegalArgumentException when calling
org.apache.poi.hmef.Attachment.getContents(). I'm using 3.14. My source
code and log output:

          <dependency>
              <groupId>org.apache.poi</groupId>
              <artifactId>poi-scratchpad</artifactId>
              <version>3.14</version>
          </dependency>


          HMEFMessage hmefMessage = new HMEFMessage(is);
          for (Attachment attachment : hmefMessage.getAttachments())
          {
               ......
                  // German log text: "File <name> was not processed, as it is not a PDF"
                  logger.warn("Datei " + attachment.getLongFilename() + " wurde nicht verarbeitet, da kein PDF");
                  logger.info("Info using code from https://poi.apache.org/hmef/ :");
                  for (TNEFAttribute attr : attachment.getAttributes())
                  {
                      logger.info("A.TNEF : " + attr);
                  }
                  for (MAPIAttribute attr : attachment.getMAPIAttributes())
                  {
                      logger.info("A.MAPI : " + attr);
                  }
                  if (logRejectedStream(attachment.getContents(), attachment.getLongFilename()))
                  {
                      rejectedFiles.add(attachment.getLongFilename()); // IllegalArgumentException :-(
                  }

31.08.2016 12:16:44 WARN  processimappdf.MessageProcessor:183 - Datei null wurde nicht verarbeitet, da kein PDF
31.08.2016 12:16:44 INFO  processimappdf.MessageProcessor:184 - Info using code from https://poi.apache.org/hmef/ :
31.08.2016 12:16:44 INFO  processimappdf.MessageProcessor:187 - A.TNEF : Attribute AttachRenderData [36866] (attAttachRenddata), type=6, data length=14
31.08.2016 12:16:44 INFO  processimappdf.MessageProcessor:187 - A.TNEF : Attribute AttachTitle [32784] (PR_ATTACH_FILENAME), type=1, data=Untitled Attachment
31.08.2016 12:16:44 INFO  processimappdf.MessageProcessor:187 - A.TNEF : Attribute AttachModifyDate [32787] (PR_LAST_MODIFICATION_TIME), type=3, date=Wed Aug 31 12:04:45 UTC 2016
31.08.2016 12:16:44 INFO  processimappdf.MessageProcessor:187 - A.TNEF : Attribute AttachMetaFile [32785] (PR_ATTACH_RENDERING), type=6, data length=3512
31.08.2016 12:16:44 INFO  processimappdf.MessageProcessor:187 - A.TNEF : Attribute Attachment [36869], type=6, 16 MAPI Attributes
31.08.2016 12:16:45 INFO  processimappdf.MessageProcessor:191 - A.MAPI : AttachNum [3617] (PR_ATTACH_NUM) [01, 00, 00, 00]
31.08.2016 12:16:45 INFO  processimappdf.MessageProcessor:191 - A.MAPI : ObjectType [4094] (PR_Object_TYPE) [07, 00, 00, 00]
31.08.2016 12:16:45 INFO  processimappdf.MessageProcessor:191 - A.MAPI : AttachMethod [14085] (PR_ATTACH_METHOD) [06, 00, 00, 00]
31.08.2016 12:16:45 INFO  processimappdf.MessageProcessor:191 - A.MAPI : RenderingPosition [14091] (PR_RENDERING_POSITION) [33, 00, 00, 00]
31.08.2016 12:16:45 INFO  processimappdf.MessageProcessor:191 - A.MAPI : StoreSupportMask [13325] (PR_STORE_SUPPORT_MASK) [79, 0E, 04, 00]
31.08.2016 12:16:45 INFO  processimappdf.MessageProcessor:191 - A.MAPI : AttachData [14081] (PR_ATTACH_DATA_OBJ) [0B, 00, 00, 00, 00, 00, 00, 00, C0, 00, 00, 00, 00, 00, 00, 46, ....]
31.08.2016 12:16:45 INFO  processimappdf.MessageProcessor:191 - A.MAPI : (unknown 7ffb) [32763] Sat Jan 01 00:00:00 UTC 4501
31.08.2016 12:16:45 INFO  processimappdf.MessageProcessor:191 - A.MAPI : AttachEncoding [14082] (PR_ATTACH_ENCODING) []
31.08.2016 12:16:45 INFO  processimappdf.MessageProcessor:191 - A.MAPI : (unknown 7ffc) [32764] Sat Jan 01 00:00:00 UTC 4501
31.08.2016 12:16:45 INFO  processimappdf.MessageProcessor:191 - A.MAPI : (unknown 7ffd) [32765] [00, 00, 00, 00]
31.08.2016 12:16:45 INFO  processimappdf.MessageProcessor:191 - A.MAPI : DisplayName [12289] (PR_DISPLAY_NAME) Picture (Device Independent Bitmap)
31.08.2016 12:16:45 INFO  processimappdf.MessageProcessor:191 - A.MAPI : (unknown 7ffa) [32762] [00, 00, 00, 00]
31.08.2016 12:16:45 INFO  processimappdf.MessageProcessor:191 - A.MAPI : AttachFlags [14100] (PR_ATTACH_FLAGS) [00, 00, 00, 00]
31.08.2016 12:16:45 INFO  processimappdf.MessageProcessor:191 - A.MAPI : (unknown 7ffe) [32766] [00, 00]
31.08.2016 12:16:45 INFO  processimappdf.MessageProcessor:191 - A.MAPI : (unknown 7fff) [32767] [00, 00]
31.08.2016 12:16:45 INFO  processimappdf.MessageProcessor:191 - A.MAPI : AttachTag [14090] (PR_ATTACH_TAG) [2A, 86, 48, 86, F7, 14, 03, 0A, 03, 02, 01]
Exception in thread "main" java.lang.IllegalArgumentException: Attachment corrupt - no Data section
      at org.apache.poi.hmef.Attachment.getContents(Attachment.java:144)
      at com.XXX.processimappdf.MessageProcessor.processWinmailMessage(MessageProcessor.java:193)
      at com.XXX.processimappdf.MessageProcessor.extractMimeMimeMultipart(MessageProcessor.java:337)
      at com.XXX.processimappdf.MessageProcessor.processMessage(MessageProcessor.java:215)
      at com.XXX.processimappdf.MessageProcessor.processMessage(MessageProcessor.java:111)
      at com.XXX.processimappdf.Main.doStuff(Main.java:162)
      at com.XXX.processimappdf.Main.main(Main.java:69)

The poi source code is:

https://svn.apache.org/viewvc/poi/trunk/src/scratchpad/src/org/apache/poi/hmef/Attachment.java?view=markup
138        /**
139         * Returns the contents of the attachment.
140         */
141        public byte[] getContents() {
142           TNEFAttribute contents = getAttribute(TNEFProperty.ID_ATTACHDATA);
143           if(contents == null) {
144              throw new IllegalArgumentException("Attachment corrupt - no Data section");
145           }
146           return contents.getData();
147        }

From my understanding, IllegalArgumentException is rather for programming
errors.

I've now changed my code to check getAttribute(TNEFProperty.ID_ATTACHDATA)
myself, but shouldn't getContents() rather throw a checked exception? Or
have a javadoc that explains what to do first before calling the method?
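
To make the checked-exception idea concrete, here is a minimal, self-contained sketch. It does not use POI at all: the class AttachmentDataGuard, the exception MissingAttachmentDataException and the attribute map keyed by the string "ID_ATTACHDATA" are hypothetical stand-ins for Attachment, its attributes and TNEFProperty.ID_ATTACHDATA, so treat it as an illustration of the pattern rather than as the library's API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an attachment's attribute lookup; not POI code.
final class AttachmentDataGuard {

    /** Hypothetical checked exception signalling a missing data section. */
    static class MissingAttachmentDataException extends Exception {
        MissingAttachmentDataException(String message) {
            super(message);
        }
    }

    /** Assumed key, standing in for TNEFProperty.ID_ATTACHDATA. */
    static final String ID_ATTACHDATA = "ID_ATTACHDATA";

    /** Lets the caller test for a data section before asking for it. */
    static boolean hasContents(Map<String, byte[]> attributes) {
        return attributes.get(ID_ATTACHDATA) != null;
    }

    /**
     * Returns the data section, or throws a checked exception so the
     * caller is forced by the compiler to handle the "no data" case.
     */
    static byte[] getContents(Map<String, byte[]> attributes)
            throws MissingAttachmentDataException {
        byte[] data = attributes.get(ID_ATTACHDATA);
        if (data == null) {
            throw new MissingAttachmentDataException(
                    "Attachment has no data section");
        }
        return data;
    }

    public static void main(String[] args) {
        Map<String, byte[]> attributes = new HashMap<>();
        try {
            getContents(attributes);
        } catch (MissingAttachmentDataException e) {
            // An input problem is reported and skipped, not treated as a bug.
            System.out.println("Skipping attachment: " + e.getMessage());
        }
    }
}
```

With this shape a caller either checks hasContents() first or catches the exception, which is the same choice the javadoc question above is asking about.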

Sorry I can't offer the Winmail.dat file, it has confidential data. So
maybe we'll never know for sure if the file was corrupt.

The "lost" attachments are not really important, they are company /
department logos. I was downloading them to decide whether they are
important or not, and then calculate an MD5 digest and then ignore them in
the future.
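
For illustration, the "digest once, ignore later" idea could be sketched roughly as follows. This uses only the JDK (java.security.MessageDigest); the class name KnownAttachmentFilter and its methods are hypothetical and are not taken from my actual code.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: remembers MD5 digests of attachments to be ignored.
final class KnownAttachmentFilter {

    private final Set<String> ignoredDigests = new HashSet<>();

    /** Returns the MD5 digest of the content as a lowercase hex string. */
    static String md5Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Marks this content (e.g. a department logo) as unimportant. */
    void ignore(byte[] content) {
        ignoredDigests.add(md5Hex(content));
    }

    /** True if identical content was marked as unimportant before. */
    boolean isIgnored(byte[] content) {
        return ignoredDigests.contains(md5Hex(content));
    }
}
```

MD5 is only used here to recognise byte-identical files, not for anything security related.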

I also tried HMEFDumper... here's an output for a "corrupt" attachment:


Attachment # 2

Level 2 : Type 6 : ID AttachRenderData [36866] (attAttachRenddata)
    Data of length 14
    00000000 03 00 33 00 00 00 FF FF FF FF 00 00 00 00 ..3...........

Level 2 : Type 1 : ID AttachTitle [32784] (PR_ATTACH_FILENAME)
        Untitled Attachment
    Data of length 20
    00000000 55 6E 74 69 74 6C 65 64 20 41 74 74 61 63 68 6D Untitled Attachm
    00000010 65 6E 74 00                                     ent.

Level 2 : Type 3 : ID AttachModifyDate [32787] (PR_LAST_MODIFICATION_TIME)
        Wed Aug 31 14:04:45 CEST 2016
    Data of length 14
    00000000 E0 07 08 00 1F 00 0C 00 04 00 2D 00 03 00 à.........-...

Level 2 : Type 6 : ID AttachMetaFile [32785] (PR_ATTACH_RENDERING)
    Data of length 3512
    00000000 01 00 09 00 00 03 DC 06 00 00 00 00 21 06 00 00 ......Ü.....!...
    00000000 00 00 05 00 00 00 09 02 00 00 00 00 05 00 00 00 ................
    00000000 01 02 FF FF FF 00 A5 00 00 00 41 0B C6 00 88 00 ......¥...A.Æ...

Level 2 : Type 6 : ID Attachment [36869]
    Data of length 3804
    00000000 10 00 00 00 03 00 21 0E 01 00 00 00 03 00 FE 0F ......!.......þ.
    00000000 07 00 00 00 03 00 05 37 06 00 00 00 03 00 0B 37 .......7.......7
    00000000 33 00 00 00 03 00 0D 34 79 0E 04 00 0D 00 01 37 3......4y......7
      AttachNum [3617] (PR_ATTACH_NUM) [01, 00, 00, 00]
      ObjectType [4094] (PR_Object_TYPE) [07, 00, 00, 00]
      AttachMethod [14085] (PR_ATTACH_METHOD) [06, 00, 00, 00]
      RenderingPosition [14091] (PR_RENDERING_POSITION) [33, 00, 00, 00]
      StoreSupportMask [13325] (PR_STORE_SUPPORT_MASK) [79, 0E, 04, 00]
      AttachData [14081] (PR_ATTACH_DATA_OBJ) [0B, 00, 00, 00, 00, 00, 00, 00, C0, 00, 00, 00, 00, 00, 00, 46, ....]
      (unknown 7ffb) [32763] Sat Jan 01 00:00:00 UTC 4501
      AttachEncoding [14082] (PR_ATTACH_ENCODING) []
      (unknown 7ffc) [32764] Sat Jan 01 00:00:00 UTC 4501
      (unknown 7ffd) [32765] [00, 00, 00, 00]
      DisplayName [12289] (PR_DISPLAY_NAME) Picture (Device Independent Bitmap)
      (unknown 7ffa) [32762] [00, 00, 00, 00]
      AttachFlags [14100] (PR_ATTACH_FLAGS) [00, 00, 00, 00]
      (unknown 7ffe) [32766] [00, 00]
      (unknown 7fff) [32767] [00, 00]
      AttachTag [14090] (PR_ATTACH_TAG) [2A, 86, 48, 86, F7, 14, 03, 0A, 03, 02, 01]

Surprisingly, "ID Attachment" is not empty. I tried saving the
attr.getData() contents, but the result isn't an image file. It is some
uncompressed data and has "Picture (Device Independent Bitmap)" near the
end.

Tilman


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@poi.apache.org
For additional commands, e-mail: user-h...@poi.apache.org


