Hi All,
On 01/07/12 17:27, Mattmann, Chris A (388J) wrote:
Hey Jukka,

On Jul 1, 2012, at 5:09 AM, Jukka Zitting wrote:

Hi,

I looked at tika-server in a bit more detail, and I'm a bit concerned
about the dependency overhead it needs for the JAX-RS support:

  +- org.apache.cxf:cxf-rt-frontend-jaxrs:jar:2.5.2
     +- org.apache.cxf:cxf-common-utilities:jar:2.5.2
     |  +- org.apache.ws.xmlschema:xmlschema-core:jar:2.0.1
     |  \- org.codehaus.woodstox:woodstox-core-asl:jar:4.1.1
     |   \- org.codehaus.woodstox:stax2-api:jar:3.1.1
     +- org.apache.cxf:cxf-api:jar:2.5.2
     |  +- org.apache.neethi:neethi:jar:3.0.1
     |  \- wsdl4j:wsdl4j:jar:1.6.2
     +- org.apache.cxf:cxf-rt-core:jar:2.5.2
     |  +- com.sun.xml.bind:jaxb-impl:jar:2.2.4-1
     |  \- org.apache.geronimo.specs:geronimo-javamail_1.4_spec:jar:1.7.1
     +- org.springframework:spring-core:jar:3.0.6.RELEASE
     |  \- org.springframework:spring-asm:jar:3.0.6.RELEASE
     +- javax.ws.rs:jsr311-api:jar:1.1.1
     +- org.apache.cxf:cxf-rt-bindings-xml:jar:2.5.2
     +- org.apache.cxf:cxf-rt-transports-http:jar:2.5.2
     |  +- org.apache.cxf:cxf-rt-transports-common:jar:2.5.2
     |  \- org.springframework:spring-web:jar:3.0.6.RELEASE
     |   +- aopalliance:aopalliance:jar:1.0
     |   +- org.springframework:spring-beans:jar:3.0.6.RELEASE
     |   \- org.springframework:spring-context:jar:3.0.6.RELEASE
     |    +- org.springframework:spring-aop:jar:3.0.6.RELEASE
     |    \- org.springframework:spring-expression:jar:3.0.6.RELEASE
     \- org.codehaus.jettison:jettison:jar:1.3.1
  +- org.apache.cxf:cxf-rt-transports-http-jetty:jar:2.5.2
     +- org.eclipse.jetty:jetty-server:jar:7.5.4.v20111024
     |  +- org.eclipse.jetty:jetty-continuation:jar:7.5.4.v20111024
     |  \- org.eclipse.jetty:jetty-http:jar:7.5.4.v20111024
     |   \- org.eclipse.jetty:jetty-io:jar:7.5.4.v20111024
     |    \- org.eclipse.jetty:jetty-util:jar:7.5.4.v20111024
     +- org.eclipse.jetty:jetty-security:jar:7.5.4.v20111024
     \- org.apache.geronimo.specs:geronimo-servlet_2.5_spec:jar:1.1.2

That's about 7MB of middleware code. Do we really need all this?

That's a good question. My goal in moving us away from the Jersey
code that was doing this was to move us away from Sun licensed code,
and on to Apache CXF, which I knew from OODT provided JAX-RS
support. Also I wanted to consume the vetted Apache CXF code which
I figured would be a ton safer license wise than Jersey.

Sergey Beryozkin (who I'm CC'ing on this email since I'm not sure
he's subscribed to dev@) helped by providing guidance on the CXF
side while I was working on this with Max. If you scope out [1], Max
brought up the large # of dependencies too and Sergey's response
was that in 2.6 there are only a few required dependencies:

*************
[INFO] +- org.apache.cxf:cxf-api:jar:2.6.0-SNAPSHOT:compile
[INFO] |  +- org.codehaus.woodstox:woodstox-core-asl:jar:4.1.2:runtime
[INFO] |  |  \- org.codehaus.woodstox:stax2-api:jar:3.1.1:runtime
[INFO] |  +- org.apache.ws.xmlschema:xmlschema-core:jar:2.0.1:compile
[INFO] |  +- org.apache.geronimo.specs:geronimo-javamail_1.4_spec:jar:1.7.1:compile
[INFO] |  \- wsdl4j:wsdl4j:jar:1.6.2:compile
[INFO] +- org.apache.cxf:cxf-rt-core:jar:2.6.0-SNAPSHOT:compile
[INFO] |  \- com.sun.xml.bind:jaxb-impl:jar:2.1.13:compile
**************

Maybe we should try upgrading to 2.6 if it's out?


Yes, CXF 2.6.0 and 2.6.1 are both out.


If yes, who's going to review the licensing of all these dependencies
and come up with appropriate LICENSE/NOTICE files to include in the
tika-server jar?

These are CXF dependencies, and I'm sure Apache CXF already has the
relevant LICENSE/NOTICE entries for them, no? And aren't we eating
our own dog food here?


The services exposed by tika-server are pretty simple and
straightforward, so I'm wondering if we could just replace all of the
above with just an embedded Jetty server, or even just the HttpCore
library [1].

[1] http://hc.apache.org/httpcomponents-core-ga/

Does HTTP core provide JAX-RS support?


Not that I'm aware of. Generally speaking, one can obviously use a lower-level HTTP API, but IMHO using JAX-RS is better, and JAX-RS is a specification that is evolving and will continue to evolve.
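To make the trade-off concrete, here is a rough sketch of what a single endpoint looks like when written against a lower-level HTTP API. It is only an illustration under stated assumptions: it uses the JDK's built-in com.sun.net.httpserver package as a stand-in for embedded Jetty or HttpCore (neither of which is used here), and the class name, the /echo-size path and the describe() helper are all hypothetical, not part of tika-server.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: one PUT endpoint on the JDK's built-in HTTP server.
final class LowLevelEndpointSketch {

    // Stand-in for the real work (for example, passing the bytes to a parser).
    static String describe(byte[] body) {
        return "received " + body.length + " bytes\n";
    }

    // Binds to localhost on the given port and registers the single handler.
    static HttpServer start(int port) throws IOException {
        HttpServer server =
                HttpServer.create(new InetSocketAddress("localhost", port), 0);
        server.createContext("/echo-size", LowLevelEndpointSketch::handle);
        server.start();
        return server;
    }

    // Method dispatch, body reading and headers are all done by hand here.
    static void handle(HttpExchange exchange) throws IOException {
        try {
            if (!"PUT".equals(exchange.getRequestMethod())) {
                exchange.getResponseHeaders().set("Allow", "PUT");
                exchange.sendResponseHeaders(405, -1); // -1: no response body
                return;
            }
            byte[] body;
            try (InputStream in = exchange.getRequestBody()) {
                body = in.readAllBytes(); // requires Java 9 or later
            }
            byte[] reply = describe(body).getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders()
                    .set("Content-Type", "text/plain; charset=UTF-8");
            exchange.sendResponseHeaders(200, reply.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(reply);
            }
        } finally {
            exchange.close();
        }
    }
}
```

Calling LowLevelEndpointSketch.start(8080) from a main method (the port is arbitrary) would serve PUT requests on /echo-size. With JAX-RS, the method check, path matching and content-type handling above would instead be declared through annotations such as @PUT, @Path and @Produces on a resource method, which is the convenience being weighed against the extra dependencies.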

As far as CXF is concerned, I'm biased, but I also believe that moving Tika server to CXF was the right decision. CXF, like JAX-RS, is only going to get better, and features such as its good support for attachments, advanced search capabilities, and OAuth2 seem to be of possible use in the project. However, I will not attempt to defend CXF strongly if the rest of the Tika team thinks otherwise.

Cheers
Sergey


Thanks!

Cheers,
Chris

[1] http://s.apache.org/0I

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++



--
Sergey Beryozkin

Talend Community Coders
http://coders.talend.com/

Blog: http://sberyozkin.blogspot.com
