Hi,

I looked at tika-server in more detail, and I'm a bit concerned about
the number of dependencies it pulls in for JAX-RS support:

  +- org.apache.cxf:cxf-rt-frontend-jaxrs:jar:2.5.2
     +- org.apache.cxf:cxf-common-utilities:jar:2.5.2
     |  +- org.apache.ws.xmlschema:xmlschema-core:jar:2.0.1
     |  \- org.codehaus.woodstox:woodstox-core-asl:jar:4.1.1
     |   \- org.codehaus.woodstox:stax2-api:jar:3.1.1
     +- org.apache.cxf:cxf-api:jar:2.5.2
     |  +- org.apache.neethi:neethi:jar:3.0.1
     |  \- wsdl4j:wsdl4j:jar:1.6.2
     +- org.apache.cxf:cxf-rt-core:jar:2.5.2
     |  +- com.sun.xml.bind:jaxb-impl:jar:2.2.4-1
     |  \- org.apache.geronimo.specs:geronimo-javamail_1.4_spec:jar:1.7.1
     +- org.springframework:spring-core:jar:3.0.6.RELEASE
     |  \- org.springframework:spring-asm:jar:3.0.6.RELEASE
     +- javax.ws.rs:jsr311-api:jar:1.1.1
     +- org.apache.cxf:cxf-rt-bindings-xml:jar:2.5.2
     +- org.apache.cxf:cxf-rt-transports-http:jar:2.5.2
     |  +- org.apache.cxf:cxf-rt-transports-common:jar:2.5.2
     |  \- org.springframework:spring-web:jar:3.0.6.RELEASE
     |   +- aopalliance:aopalliance:jar:1.0
     |   +- org.springframework:spring-beans:jar:3.0.6.RELEASE
     |   \- org.springframework:spring-context:jar:3.0.6.RELEASE
     |    +- org.springframework:spring-aop:jar:3.0.6.RELEASE
     |    \- org.springframework:spring-expression:jar:3.0.6.RELEASE
     \- org.codehaus.jettison:jettison:jar:1.3.1
  +- org.apache.cxf:cxf-rt-transports-http-jetty:jar:2.5.2
     +- org.eclipse.jetty:jetty-server:jar:7.5.4.v20111024
     |  +- org.eclipse.jetty:jetty-continuation:jar:7.5.4.v20111024
     |  \- org.eclipse.jetty:jetty-http:jar:7.5.4.v20111024
     |   \- org.eclipse.jetty:jetty-io:jar:7.5.4.v20111024
     |    \- org.eclipse.jetty:jetty-util:jar:7.5.4.v20111024
     +- org.eclipse.jetty:jetty-security:jar:7.5.4.v20111024
     \- org.apache.geronimo.specs:geronimo-servlet_2.5_spec:jar:1.1.2

That's about 7MB of middleware code. Do we really need all this? If
yes, who's going to review the licensing of all these dependencies and
come up with appropriate LICENSE/NOTICE files to include in the
tika-server jar?

The services exposed by tika-server are pretty simple and
straightforward, so I'm wondering if we could replace all of the
above with an embedded Jetty server, or even just the HttpCore
library [1].

[1] http://hc.apache.org/httpcomponents-core-ga/
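
To give a rough idea of how little code a single-endpoint service needs,
here is a minimal sketch. It is only an illustration, not a proposed
implementation: it uses the JDK's built-in com.sun.net.httpserver classes
as a stand-in for embedded Jetty or HttpCore, so that it runs without any
extra jars. The class name MiniServer, the /process path, the port number
and the process() method are all hypothetical; process() is a placeholder
for whatever Tika call the real endpoint would make.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class MiniServer {

    /** Hypothetical placeholder for the real work (e.g. a Tika call). */
    public static String process(byte[] body) {
        return "received " + body.length + " bytes\n";
    }

    /** Starts a server on the given port (0 picks any free port). */
    public static HttpServer start(int port) throws IOException {
        HttpServer server =
                HttpServer.create(new InetSocketAddress("localhost", port), 0);
        server.createContext("/process", new HttpHandler() {
            @Override
            public void handle(HttpExchange exchange) throws IOException {
                try {
                    if (!"PUT".equals(exchange.getRequestMethod())) {
                        // 405 Method Not Allowed; -1 means no response body
                        exchange.sendResponseHeaders(405, -1);
                        return;
                    }
                    byte[] reply = process(readAll(exchange.getRequestBody()))
                            .getBytes(StandardCharsets.UTF_8);
                    exchange.getResponseHeaders()
                            .set("Content-Type", "text/plain; charset=UTF-8");
                    exchange.sendResponseHeaders(200, reply.length);
                    OutputStream out = exchange.getResponseBody();
                    out.write(reply);
                    out.close();
                } finally {
                    exchange.close();
                }
            }
        });
        server.start();
        return server;
    }

    /** Reads the whole stream into a byte array. */
    public static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // The port number is arbitrary and chosen only for this sketch.
        HttpServer server = start(9998);
        System.out.println("Listening on " + server.getAddress());
    }
}
```

With the server running, a PUT to /process on the chosen port returns the
placeholder reply, and any other method gets a 405. An embedded Jetty or
HttpCore version would look different in its details, but should be of
comparable size.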

BR,

Jukka Zitting
