https://issues.apache.org/bugzilla/show_bug.cgi?id=51901
Bug #: 51901
Summary: [PATCH] StringChunk.parseAs7BitData - Encoding not found - US-ASCII; format=flowed
Product: POI
Version: 3.8-dev
Platform: PC
Status: NEW
Severity: major
Priority: P2
Component: HSMF
AssignedTo: [email protected]
ReportedBy: [email protected]
Classification: Unclassified
Created attachment 27616
--> https://issues.apache.org/bugzilla/attachment.cgi?id=27616
Patch for issue
Some message files appear to carry additional parameters after the charset
name in the Content-Type header, at least for some US-ASCII messages.
The attached patch looks for a semicolon in the charset string and, if one is
present, keeps only the part before it. NOTE: this won't work if a valid
charset name can itself contain a semicolon. Another option would be to modify
the Pattern used to extract the charset.
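For illustration only, the semicolon approach could look roughly like the
sketch below. This is not the attached patch: CharsetParamStripper and its
methods are hypothetical names, and the sketch assumes that no valid charset
name contains a semicolon.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of POI: drops trailing parameters
// such as "; format=flowed" from a charset value.
final class CharsetParamStripper {
    private CharsetParamStripper() {}

    /** Returns the trimmed text before the first semicolon, or the trimmed input if there is none. */
    static String stripParameters(String rawCharset) {
        int semicolon = rawCharset.indexOf(';');
        String name = (semicolon >= 0) ? rawCharset.substring(0, semicolon) : rawCharset;
        return name.trim();
    }

    /** Decodes the bytes with the cleaned charset name; throws if the JVM does not know that charset. */
    static String decode(byte[] data, String rawCharset) {
        return new String(data, Charset.forName(stripParameters(rawCharset)));
    }
}
```

Calling CharsetParamStripper.stripParameters on "US-ASCII; format=flowed"
leaves "US-ASCII", which the JVM can resolve.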
Actual m.group(1) string returned from Content-Type: "US-ASCII; format=flowed;
delsp=yes"
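As a sketch of the Pattern alternative: a capture group that stops at the
first semicolon, quote, or whitespace would never pick up the trailing
parameters. The regex and the CharsetHeaderPattern class below are assumptions
made for illustration, not the Pattern POI currently uses, and the full header
line in the usage note is a guess at what the message contains.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of POI.
final class CharsetHeaderPattern {
    // Captures the charset value up to the first ';', quote, or whitespace.
    private static final Pattern CHARSET = Pattern.compile(
            "charset\\s*=\\s*[\"']?([^;\"'\\s]+)", Pattern.CASE_INSENSITIVE);

    private CharsetHeaderPattern() {}

    /** Returns the charset name found in the header text, or null if none is present. */
    static String extractCharset(String contentTypeHeader) {
        Matcher m = CHARSET.matcher(contentTypeHeader);
        return m.find() ? m.group(1) : null;
    }
}
```

With a header such as "Content-Type: text/plain; charset=US-ASCII;
format=flowed; delsp=yes", extractCharset would yield "US-ASCII" only.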
Unable to attach a sample file because its contents are sensitive.
Exception message and stack trace (POI-3.8-beta4):
BaseTextExtractionService - Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@2ddd595d
org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@2ddd595d
Caused by: java.lang.RuntimeException: Encoding not found - US-ASCII; format=flowed
    at org.apache.poi.hsmf.datatypes.StringChunk.parseAs7BitData(StringChunk.java:155)
    at org.apache.poi.hsmf.datatypes.StringChunk.parseString(StringChunk.java:86)
    at org.apache.poi.hsmf.datatypes.StringChunk.set7BitEncoding(StringChunk.java:74)
    at org.apache.poi.hsmf.MAPIMessage.set7BitEncoding(MAPIMessage.java:413)
    at org.apache.poi.hsmf.MAPIMessage.guess7BitEncoding(MAPIMessage.java:373)
    at org.apache.tika.parser.microsoft.OutlookExtractor.parse(OutlookExtractor.java:73)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:219)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    ... 49 more
Caused by: java.io.UnsupportedEncodingException: US-ASCII; format=flowed
    at java.lang.StringCoding.decode(StringCoding.java:170)
    at java.lang.String.<init>(String.java:443)
    at java.lang.String.<init>(String.java:515)
    at org.apache.poi.hsmf.datatypes.StringChunk.parseAs7BitData(StringChunk.java:153)
    ... 56 more