Hi,

With Tika 1.0 almost done (how cool is that!), I think it's time to start looking forward to what we'll be doing during the 1.x cycle. One thing I've had in mind for a long time is to make Tika more easily usable in programming languages other than Java.
The tika-app jar already helps with that, and I know there are people using Tika in .NET with IKVM, but it would be nice to see tighter Tika integration with languages like Python, Ruby, JavaScript, Perl and PHP as well. Could we, for example, make a Ruby Gem out of Tika?

The Tika facade class provides a pretty nice set of basic functionality that should be reasonably easy to port to other languages. More advanced Tika constructs like the SAX event mechanism or things like the ParseContext are probably trickier to port, so as a first step I'd be interested in looking at simply providing a basic set of Tika.py, Tika.rb, Tika.js, Tika.pm and Tika.php bindings (plus whatever else people may be interested in) that just reflect the key functionality found in Tika.java.

Anyone interested in joining such an effort? Any pointers to existing work along similar lines?

BR,

Jukka Zitting