[jira] [Commented] (TIKA-1191) ForkParser / ClassLoaderProxy does not define package

2018-01-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16321839#comment-16321839
 ] 

Hudson commented on TIKA-1191:
--

SUCCESS: Integrated in Jenkins build Tika-trunk #1420 (See [https://builds.apache.org/job/Tika-trunk/1420/])
fix for TIKA-1191 contributed by BenRomberg (ben: [https://github.com/apache/tika/commit/6a398bd3f6245543091fd7c0e9e4facb34a26882])
* (edit) tika-core/src/test/java/org/apache/tika/fork/ForkTestParser.java
* (add) tika-core/src/test/java/org/apache/tika/fork/unusedpackage/ClassInUnusedPackage.java
* (edit) tika-core/src/test/java/org/apache/tika/fork/ForkParserTest.java
* (edit) tika-core/src/main/java/org/apache/tika/fork/ClassLoaderProxy.java


> ForkParser / ClassLoaderProxy does not define package
> -
>
> Key: TIKA-1191
> URL: https://issues.apache.org/jira/browse/TIKA-1191
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.4, 1.5
> Reporter: Nicolas Belisle
> Fix For: 1.18
>
> Attachments: ClassLoaderProxy.java.patch, Test.java, test.eml
>
>
> ForkParser will throw an Exception in some cases:
> org.apache.tika.exception.TikaException: Invalid embedded resource
>   at org.apache.tika.parser.microsoft.AbstractPOIFSExtractor.handleEmbeddedOfficeDoc(AbstractPOIFSExtractor.java:189)
>   at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:135)
>   at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:186)
>   at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:161)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.tika.fork.ForkServer.call(ForkServer.java:144)
>   at org.apache.tika.fork.ForkServer.processRequests(ForkServer.java:124)
>   at org.apache.tika.fork.ForkServer.main(ForkServer.java:69)
> Caused by: java.lang.NullPointerException
>   at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:136)
>   at org.apache.tika.mime.MimeTypes.getDefaultMimeTypes(MimeTypes.java:499)
>   at org.apache.tika.config.TikaConfig.getDefaultMimeTypes(TikaConfig.java:60)
>   at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:169)
>   at org.apache.tika.config.TikaConfig.getDefaultConfig(TikaConfig.java:268)
>   at org.apache.tika.parser.microsoft.AbstractPOIFSExtractor.getTikaConfig(AbstractPOIFSExtractor.java:72)
>   at org.apache.tika.parser.microsoft.AbstractPOIFSExtractor.getDetector(AbstractPOIFSExtractor.java:79)
>   at org.apache.tika.parser.microsoft.AbstractPOIFSExtractor.handleEmbeddedOfficeDoc(AbstractPOIFSExtractor.java:176)
>   ... 10 more
> A patch will follow
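
The issue title, together with the NullPointerException raised in MimeTypesFactory.create, suggests that code running in the forked JVM dereferences a Package that the proxy class loader never defined. Assuming that is the cause, the sketch below shows a defensive lookup on the caller's side: when Class.getPackage() returns null, the package name is derived from the class name instead. PackageNameSketch and its methods are hypothetical names chosen for illustration; this is not Tika's code and not the fix that was applied.

```java
/** Hypothetical helper, not part of Tika: null-safe package lookup. */
final class PackageNameSketch {

    private PackageNameSketch() {
    }

    /** Returns the package name of a class even if its loader defined no Package. */
    static String packageNameOf(Class<?> type) {
        Package pkg = type.getPackage();
        if (pkg != null) {
            return pkg.getName();
        }
        // Fall back to the class name; no dot means the unnamed package.
        String name = type.getName();
        int dot = name.lastIndexOf('.');
        return dot > 0 ? name.substring(0, dot) : "";
    }

    /** Converts the package name into a classpath resource prefix such as "java/util/". */
    static String resourcePrefixOf(Class<?> type) {
        String pkg = packageNameOf(type);
        return pkg.isEmpty() ? "" : pkg.replace('.', '/') + "/";
    }

    public static void main(String[] args) {
        System.out.println(resourcePrefixOf(java.util.Map.class));
    }
}
```

A lookup like this only hides the symptom in one place; defining the package in the class loader, as the attached patch does, covers every caller.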



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (TIKA-1191) ForkParser / ClassLoaderProxy does not define package

2018-01-10 Thread Ben Romberg (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16321797#comment-16321797
 ] 

Ben Romberg commented on TIKA-1191:
---

Thank you! Feels good to contribute at least a little to such a great project!



[jira] [Commented] (TIKA-1191) ForkParser / ClassLoaderProxy does not define package

2018-01-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16321769#comment-16321769
 ] 

ASF GitHub Bot commented on TIKA-1191:
--

Gagravarr closed pull request #215: TIKA-1191 fix package access in ForkParser
URL: https://github.com/apache/tika/pull/215
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/tika-core/src/main/java/org/apache/tika/fork/ClassLoaderProxy.java b/tika-core/src/main/java/org/apache/tika/fork/ClassLoaderProxy.java
index 920926d74..01b0ba548 100644
--- a/tika-core/src/main/java/org/apache/tika/fork/ClassLoaderProxy.java
+++ b/tika-core/src/main/java/org/apache/tika/fork/ClassLoaderProxy.java
@@ -112,7 +112,9 @@ protected synchronized URL findResource(String name) {
 // Receive the response
 if (input.readBoolean()) {
 byte[] data = readStream();
-return defineClass(name, data, 0, data.length);
+Class clazz = defineClass(name, data, 0, data.length);
+definePackageIfNecessary(name, clazz);
+return clazz;
 } else {
 throw new ClassNotFoundException("Unable to find class " + name);
 }
@@ -121,6 +123,21 @@ protected synchronized URL findResource(String name) {
 }
 }
 
+private void definePackageIfNecessary(String className, Class clazz) {
+String packageName = toPackageName(className);
+if (packageName != null && getPackage(packageName) == null) {
+definePackage(packageName, null, null, null, null, null, null, null);
+}
+}
+
+private String toPackageName(String className) {
+int packageEndIndex = className.lastIndexOf('.');
+if (packageEndIndex > 0) {
+return className.substring(0, packageEndIndex);
+}
+return null;
+}
+
 private byte[] readStream() throws IOException {
 ByteArrayOutputStream stream = new ByteArrayOutputStream();
 byte[] buffer = new byte[0x];
diff --git a/tika-core/src/test/java/org/apache/tika/fork/ForkParserTest.java b/tika-core/src/test/java/org/apache/tika/fork/ForkParserTest.java
index 5883c75d0..01e08d9d5 100644
--- a/tika-core/src/test/java/org/apache/tika/fork/ForkParserTest.java
+++ b/tika-core/src/test/java/org/apache/tika/fork/ForkParserTest.java
@@ -218,4 +218,21 @@ public void testPulse() throws Exception {
 }
 }
 
+@Test
+public void testPackageCanBeAccessed() throws Exception {
+ForkParser parser = new ForkParser(
+ForkParserTest.class.getClassLoader(),
+new ForkTestParser.ForkTestParserAccessingPackage());
+try {
+Metadata metadata = new Metadata();
+ContentHandler output = new BodyContentHandler();
+InputStream stream = new ByteArrayInputStream(new byte[0]);
+ParseContext context = new ParseContext();
+parser.parse(stream, output, metadata, context);
+assertEquals("Hello, World!", output.toString().trim());
+assertEquals("text/plain", metadata.get(Metadata.CONTENT_TYPE));
+} finally {
+parser.close();
+}
+}
 }
diff --git a/tika-core/src/test/java/org/apache/tika/fork/ForkTestParser.java b/tika-core/src/test/java/org/apache/tika/fork/ForkTestParser.java
index 0948cdd64..7e9c0bf2f 100644
--- a/tika-core/src/test/java/org/apache/tika/fork/ForkTestParser.java
+++ b/tika-core/src/test/java/org/apache/tika/fork/ForkTestParser.java
@@ -22,11 +22,13 @@
 import java.util.Set;
 
 import org.apache.tika.exception.TikaException;
+import org.apache.tika.fork.unusedpackage.ClassInUnusedPackage;
 import org.apache.tika.metadata.Metadata;
 import org.apache.tika.mime.MediaType;
 import org.apache.tika.parser.AbstractParser;
 import org.apache.tika.parser.ParseContext;
 import org.apache.tika.sax.XHTMLContentHandler;
+import org.junit.Assert;
 import org.xml.sax.ContentHandler;
 import org.xml.sax.SAXException;
 
@@ -54,4 +56,12 @@ public void parse(
 xhtml.endDocument();
 }
 
+static class ForkTestParserAccessingPackage extends ForkTestParser {
+@Override
+public void parse(InputStream stream, ContentHandler handler, Metadata metadata,
+ParseContext context) throws IOException, SAXException, TikaException {
+Assert.assertNotNull(ClassInUnusedPackage.class.getPackage());
+super.parse(stream, handler, metadata, context);
+}
+}
 }
\ No newline at end of file
diff --git 
a/tika-core/src/test/java/org/apache/tika/fork/unusedpackage/ClassInUnusedPackage.java
 
b/tika-core/src/test/

[jira] [Commented] (TIKA-1191) ForkParser / ClassLoaderProxy does not define package

2018-01-10 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16320956#comment-16320956
 ] 

Tim Allison commented on TIKA-1191:
---

+1 I've been meaning to do this.  Looks good to me.  Thank you!



[jira] [Commented] (TIKA-1191) ForkParser / ClassLoaderProxy does not define package

2018-01-09 Thread Nick Burch (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16319839#comment-16319839
 ] 

Nick Burch commented on TIKA-1191:
--

[~talli...@mitre.org] I'm minded to apply Ben Romberg's patch from pull #215, 
any thoughts/comments/objections?



[jira] [Commented] (TIKA-1191) ForkParser / ClassLoaderProxy does not define package

2018-01-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16307557#comment-16307557
 ] 

ASF GitHub Bot commented on TIKA-1191:
--

BenRomberg opened a new pull request #215: TIKA-1191 fix package access in 
ForkParser
URL: https://github.com/apache/tika/pull/215
 
 
   `ForkParser` currently cannot be used when `AutoDetectParser` is combined 
with the optional `jai-imageio-core` dependency.
   
   This fix enhances the patch provided in TIKA-1191 with unit tests.
   
   Thanks for the great work with Apache Tika! It would be really helpful for 
us to be able to use `ForkParser` with all optional dependencies in a future 
version.




[jira] [Commented] (TIKA-1191) ForkParser / ClassLoaderProxy does not define package

2015-08-19 Thread Eric Biggers (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703807#comment-14703807
 ] 

Eric Biggers commented on TIKA-1191:


I am using Tika 1.7 and I encountered this problem while testing ForkParser on 
the files in the test-documents directory distributed with the Tika sources.  
An example of a file that causes the problem is "testBinControlWord.rtf".  
Applying the ClassLoaderProxy.java.patch attached to this ticket appears to 
solve the problem (or at least work around it, since the packages won't be 
defined with their full original metadata).

The stacktrace given above for Tika 1.8-SNAPSHOT looks like an unrelated 
problem.
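
To illustrate the caveat about metadata, here is a minimal sketch of a class loader that defines a package by name only, in the spirit of the attached patch. BarePackageLoader, ensurePackage and the org.example.demo names are hypothetical, and no class is actually loaded; the point is that a package defined this way exists, but its specification and implementation attributes are all null.

```java
/** Hypothetical loader for illustration, not Tika's ClassLoaderProxy. */
final class BarePackageLoader extends ClassLoader {

    /** "org.example.demo.Foo" -> "org.example.demo"; null for the unnamed package. */
    static String toPackageName(String className) {
        int end = className.lastIndexOf('.');
        return end > 0 ? className.substring(0, end) : null;
    }

    /** Defines the package of className with empty metadata unless it is already known. */
    @SuppressWarnings("deprecation") // getPackage(String) is deprecated since Java 9
    Package ensurePackage(String className) {
        String packageName = toPackageName(className);
        if (packageName == null) {
            return null;
        }
        Package existing = getPackage(packageName);
        if (existing != null) {
            return existing;
        }
        // Title, version, vendor and seal base are all left null.
        return definePackage(packageName, null, null, null, null, null, null, null);
    }

    public static void main(String[] args) {
        Package pkg = new BarePackageLoader().ensurePackage("org.example.demo.Foo");
        System.out.println(pkg.getName() + " version=" + pkg.getImplementationVersion());
    }
}
```

Code that only needs Package.getName() is satisfied by this; code that reads version or vendor information from the package would still see null values.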



[jira] [Commented] (TIKA-1191) ForkParser / ClassLoaderProxy does not define package

2015-03-15 Thread Tyler Palsulich (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362588#comment-14362588
 ] 

Tyler Palsulich commented on TIKA-1191:
---

Here is an updated stacktrace for Tika 1.8-SNAPSHOT. It looks like something is 
trying to mark/reset a stream that doesn't support it:
{code}
➜  trunk  tika -z https://issues.apache.org/jira/secure/attachment/12657409/test.eml
Exception in thread "main" org.apache.tika.exception.TikaException: Failed to parse an email message
    at org.apache.tika.parser.mail.RFC822Parser.parse(RFC822Parser.java:79)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:270)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:270)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:153)
    at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:450)
    at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:123)
Caused by: java.io.IOException: mark/reset not supported
    at java.io.InputStream.reset(InputStream.java:347)
    at org.apache.tika.parser.microsoft.POIFSContainerDetector.detect(POIFSContainerDetector.java:161)
    at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61)
    at org.apache.tika.cli.TikaCLI$FileEmbeddedDocumentExtractor.parseEmbedded(TikaCLI.java:918)
    at org.apache.tika.parser.mail.MailContentHandler.body(MailContentHandler.java:110)
    at org.apache.james.mime4j.parser.MimeStreamParser.parse(MimeStreamParser.java:133)
    at org.apache.tika.parser.mail.RFC822Parser.parse(RFC822Parser.java:76)
    ... 6 more
{code}
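
The "mark/reset not supported" message comes from the default InputStream.reset() implementation. As a minimal sketch of that behaviour, using only the JDK, the example below builds a stream without mark support, shows that a mark/read/reset sequence fails on it, and shows that wrapping it in a BufferedInputStream makes the same sequence succeed. MarkResetSketch and its method names are hypothetical; this illustrates the symptom and one common workaround, not how Tika resolved it.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical demo, not Tika code. */
final class MarkResetSketch {

    /** Wraps data in a stream that inherits InputStream's default lack of mark support. */
    static InputStream unmarkable(byte[] data) {
        final ByteArrayInputStream source = new ByteArrayInputStream(data);
        return new InputStream() {
            @Override
            public int read() {
                return source.read();
            }
        };
    }

    /** Returns the stream itself if it supports mark/reset, else a buffered wrapper. */
    static InputStream ensureMarkSupported(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    /** Reads one byte and rewinds; throws IOException on streams without mark support. */
    static int peekFirstByte(InputStream in) throws IOException {
        in.mark(1);
        int first = in.read();
        in.reset();
        return first;
    }

    /** Reports whether peekFirstByte succeeds on the given stream. */
    static boolean canPeek(InputStream in) {
        try {
            peekFirstByte(in);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] data = {42, 7};
        System.out.println(canPeek(unmarkable(data)));
        System.out.println(canPeek(ensureMarkSupported(unmarkable(data))));
    }
}
```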



[jira] [Commented] (TIKA-1191) ForkParser / ClassLoaderProxy does not define package

2014-07-23 Thread Nicolas Belisle (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072104#comment-14072104
 ] 

Nicolas Belisle commented on TIKA-1191:
---

I was able to reproduce a similar issue with another file using Tika 1.5. 
See the attached test.eml and the test (Test.java).
The exception:

Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.mail.RFC822Parser@6743bc0f
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.tika.fork.ForkServer.call(ForkServer.java:144)
    at org.apache.tika.fork.ForkServer.processRequests(ForkServer.java:124)
    at org.apache.tika.fork.ForkServer.main(ForkServer.java:69)
Caused by: java.lang.NullPointerException
    at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:158)
    at org.apache.tika.mime.MimeTypes.getDefaultMimeTypes(MimeTypes.java:516)
    at org.apache.tika.config.TikaConfig.getDefaultMimeTypes(TikaConfig.java:60)
    at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:169)
    at org.apache.tika.config.TikaConfig.getDefaultConfig(TikaConfig.java:268)
    at org.apache.tika.parser.AutoDetectParser.<init>(AutoDetectParser.java:51)
    at org.apache.tika.parser.mail.RFC822Parser.adaptedExtractMultipart(RFC822Parser.java:167)
    at org.apache.tika.parser.mail.RFC822Parser.adaptedExtractMultipart(RFC822Parser.java:156)
    at org.apache.tika.parser.mail.RFC822Parser.parse(RFC822Parser.java:101)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    ... 9 more




[jira] [Commented] (TIKA-1191) ForkParser / ClassLoaderProxy does not define package

2013-11-04 Thread Nicolas Belisle (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813484#comment-13813484
 ] 

Nicolas Belisle commented on TIKA-1191:
---

Unfortunately, I cannot upload an example (in my case, a Word 97-2003 document) 
that triggers the issue.

