[ 
https://issues.apache.org/jira/browse/TIKA-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984251#comment-13984251
 ] 

Rupert Westenthaler commented on TIKA-1276:
-------------------------------------------

Personally I am happy with having Tika running within an OSGI environment. 
AFAIK Tika uses the Java ServiceLoader for locating different parsers. 
ServiceLoader can only locate services within the same bundle. So the fact that 
the ClassLoader of the tika-bundle is parsed to the TikaConfig by the Activator 
is the reason why the current solution works.

As soon as one would like to have different parsers in different bundles one 
would need to also provide an alternative implementation to the DefaultParser 
for OSGI. Implementing a CompositeParser that tracks available parsers via the 
OSGI service registry would for sure be the way to go. 

To register single parsers as OSGI service one would only need to add 
@Component and @Service annotations. As this are not runtime annotation this 
would not even add any runtime dependencies.

Implementing it that way would even allow to dynamically add/remove parsers at 
runtime. Components with a dependency to the CompositeParser could even keep 
using their service object. 

> Missing embedded dependencies in tika-bundle
> --------------------------------------------
>
>                 Key: TIKA-1276
>                 URL: https://issues.apache.org/jira/browse/TIKA-1276
>             Project: Tika
>          Issue Type: Bug
>          Components: packaging
>    Affects Versions: 1.5
>         Environment: OSGI, Apache Felix via Apache Sling Launcher
>            Reporter: Rupert Westenthaler
>             Fix For: 1.6
>
>         Attachments: TIKA-1276_20140423_rwesten.diff, 
> TIKA-1276_20140428_2_rwesten.diff, TIKA-1276_20140428_3_rwesten.diff, 
> TIKA-1276_20140428_rwesten.diff
>
>
> While updating from tika 1.2 to 1.5 I that the 
> `org.apache.tika:tika-bundle:1.5` module has some missing dependences.
> 1. `com.uwyn:jhighlight:1.0` is not embedded
> Because of that installing the bundle results in the following exception
> {code}
> org.osgi.framework.BundleException: Unresolved constraint in bundle 
> org.apache.tika.bundle [103]: Unable to resolve 103.0: missing requirement 
> [103.0] osgi.wiring.package; 
> (osgi.wiring.package=com.uwyn.jhighlight.renderer))
> org.osgi.framework.BundleException: Unresolved constraint in bundle 
> org.apache.tika.bundle [103]: Unable to resolve 103.0: missing requirement 
> [103.0] osgi.wiring.package; 
> (osgi.wiring.package=com.uwyn.jhighlight.renderer)
>       at 
> org.apache.felix.framework.Felix.resolveBundleRevision(Felix.java:3962)
>       at org.apache.felix.framework.Felix.startBundle(Felix.java:2025)
>       at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1279)
>       at 
> org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:304)
>       at java.lang.Thread.run(Thread.java:744)
> {code}
> 2. `org.ow2.asm:asm:4.1` is not embedded because 
> `org.apache.tika:tika-core:1.5` uses `org.ow2.asm-debug-all:asm:4.1` and 
> therefore the `Embed-Dependency` directive `asm` does not match any 
> dependency. 
> Because of that one do get the following exception (after fixing (1))
> {code}
> org.osgi.framework.BundleException: Unresolved constraint in bundle 
> org.apache.tika.bundle [96]: Unable to resolve 96.0: missing requirement 
> [96.0] osgi.wiring.package; 
> (&(osgi.wiring.package=org.objectweb.asm)(version>=4.1.0)(!(version>=5.0.0))))
> org.osgi.framework.BundleException: Unresolved constraint in bundle 
> org.apache.tika.bundle [96]: Unable to resolve 96.0: missing requirement 
> [96.0] osgi.wiring.package; 
> (&(osgi.wiring.package=org.objectweb.asm)(version>=4.1.0)(!(version>=5.0.0)))
>       at 
> org.apache.felix.framework.Felix.resolveBundleRevision(Felix.java:3962)
>       at org.apache.felix.framework.Felix.startBundle(Felix.java:2025)
>       at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1279)
>       at 
> org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:304)
>       at java.lang.Thread.run(Thread.java:744)
> {code}
> There are two possibilities to fix this (a) change the `Embed-Dependency` to 
> `asm-debug-all` or adding a dependency to `org.ow2.asm:asm:4.1` to the 
> tika-bundle pom file.
> 3. `edu.ucar:netcdf:4.2-min` is not embedded
> Because of that one does get the following exception (after fixing (1) and 
> (2))
> {code}
> org.osgi.framework.BundleException: Unresolved constraint in bundle 
> org.apache.tika.bundle [96]: Unable to resolve 96.0: missing requirement 
> [96.0] osgi.wiring.package; (osgi.wiring.package=ucar.ma2))
> org.osgi.framework.BundleException: Unresolved constraint in bundle 
> org.apache.tika.bundle [96]: Unable to resolve 96.0: missing requirement 
> [96.0] osgi.wiring.package; (osgi.wiring.package=ucar.ma2)
>       at 
> org.apache.felix.framework.Felix.resolveBundleRevision(Felix.java:3962)
>       at org.apache.felix.framework.Felix.startBundle(Felix.java:2025)
>       at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1279)
>       at 
> org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:304)
>       at java.lang.Thread.run(Thread.java:744)
> {code}
> 4. The `com.adobe.xmp:xmpcore:5.1.2` dependency is required at runtime
> After fixing the above issues the tika-bundle was started successfully. 
> However when extracting EXIG metadata from a jpeg image I got the following 
> exception.
> {code}
> java.lang.NoClassDefFoundError: com/adobe/xmp/XMPException
>       at 
> com.drew.imaging.jpeg.JpegMetadataReader.extractMetadataFromJpegSegmentReader(JpegMetadataReader.java:112)
>       at 
> com.drew.imaging.jpeg.JpegMetadataReader.readMetadata(JpegMetadataReader.java:71)
>       at 
> org.apache.tika.parser.image.ImageMetadataExtractor.parseJpeg(ImageMetadataExtractor.java:91)
>       at org.apache.tika.parser.jpeg.JpegParser.parse(JpegParser.java:56)
>       [..]
> {code}
> Embedding xmpcore in the tika-bundle solved this issue.
> NOTES:
> * The Apache Stanbol integration tests only covers PDF, JPEG, DOCX. So there 
> might be additional issues with other not tested parsers. 
> * I was updating Tika from version 1.2 to 1.5. This means that all versions > 
> 1.2 might also be affected by this.
> * The following dependencies embedded by the tika-bundle are in fact OSGI 
> bundles and would not be needed to be embedded: commons-compress, xz, 
> commons-codec, commons-io



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to