Hey Guys,

I just found one of the coolest things in the world: Aptivate Tika [1], which is, for all intents and purposes, a full port of Tika from Java to Python. It is built with JCC [2], which the PyLucene community wrote to automatically generate C++ code that links against Python, so the Java library's functionality can be called directly from Python.

This got me thinking: why not use JCC to do the same thing with OODT? I'm sure we can do this, and I'll do some initial benchmarking and testing to try it out.

The next question is where precisely it makes sense to use JCC. I could see using JCC to create a Python File Manager, Workflow Manager, and Resource Manager. This would basically allow folks to write some of the core server extensions in Python if they so desire. The flip side is that all of those servers are already easy to interact with from Python using xmlrpclib [3].

Anyway, just wanted to share my thoughts. I'll keep plugging away at this. I'm also working on figuring out how to do something like Aptivate Tika in Apache Tika and ship a PyTika :)

Cheers,
Chris

[1] https://github.com/aptivate/python-tika
[2] http://lucene.apache.org/pylucene/jcc/
[3] http://docs.python.org/2/library/xmlrpclib.html

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
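
P.S. To make the xmlrpclib point concrete, here is a minimal sketch of talking to an XML-RPC server from Python. It is only an illustration under some assumptions: it uses the Python 3 module name (xmlrpc.client, the successor to xmlrpclib), it starts a local stand-in server so the snippet runs on its own, and the method name "filemgr.isAlive" is assumed for the demo. For a real OODT server, the URL and the method names would need to be checked against that server's documentation.

```python
import threading
from xmlrpc.client import ServerProxy           # Python 3 name for xmlrpclib
from xmlrpc.server import SimpleXMLRPCServer


def start_stub_server():
    """Start a local stand-in XML-RPC server on a free port.

    This is NOT an OODT server; it only exists so the client call below
    has something to talk to. Returns (server, url).
    """
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    # Assumed method name, registered here purely for illustration.
    server.register_function(lambda: True, "filemgr.isAlive")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    return server, "http://%s:%d" % (host, port)


def is_alive(url):
    """Ask the XML-RPC server at `url` whether it is up."""
    proxy = ServerProxy(url)
    # Dotted attribute access becomes the XML-RPC method name "filemgr.isAlive".
    return proxy.filemgr.isAlive()


server, url = start_stub_server()
alive = is_alive(url)
server.shutdown()
server.server_close()
print("alive:", alive)
```

Against a real server, only the is_alive() part would be needed, pointed at that server's URL. That is the appeal of the XML-RPC route: no code generation or JVM embedding on the Python side, at the cost of only being a client rather than writing server extensions.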