On Fri, Jul 8, 2011 at 9:41 PM, Jürgen Jakobitsch <[email protected]> wrote:
> hi all,
>
> i just wanted to let you know, that we had troubles with aperture in the past
> and skipped to tika.
>
> besides being complicated to use, aperture wasn't able to extract from pdfs
> which were no problem for tika.

One of our committers, Walter Kasper, knows Aperture very well. I think this
is the reason why Aperture is currently used. We are also aware of Apache
Tika, but have not yet had time to write an engine for it. However, we will
definitely start to use the language detection of Apache Tika in the near
future, because the currently used component has an LGPL dependency and
therefore cannot be used.

> if it's just to get rdf out of some sorts of documents, tika (all the deps
> are already there)
> will do it. a content handler that makes rdf out of metadata is a matter of
> an hour...

Yeah, that is basically what would be needed.

> please also consider that aperture is a one man show according to sourceforge
> svn browse with last release
> about a year ago...

Maybe Walter can say something about that, because he should know at least
some of the developers.

best
Rupert

--
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11             ++43-699-11108907
| A-5500 Bischofshofen
