from:"\"Christian Reuschling\""

Re: How to index the parsed content effectively

2014-07-02 Thread Christian Reuschling

>> initial attempt was to index the output from ToTextContentHandler.toString() as a Lucene Text field. This is unlikely to be effective for large files. So I wonder what strategies exist for a more effective indexing/tokenization of t

Re: How to index the parsed content effectively

2014-07-02 Thread Christian Reuschling

riment with. > > The feedback will be appreciated Cheers, Sergey - -- ______ Christian Reuschling, Dipl.-Ing.(BA) Software Engineer Knowledge Management Department German Research Center for Artificial Intelligence DFKI GmbH Trippstadter Straße 122, D-67663 Kaiserslautern, Germany Phone: +49.631.2057

Re: Reply: Re: hello , how to utilize tika inside lucene ?

2013-02-25 Thread Christian Reuschling

McCandless >> To: user@tika.apache.org, >> sdr...@sina.com Subject: Re: hello , how to utilize tika inside lucene ? >> Date: 2013-02-25 03:55 >> > - -- __ Christian Reuschling, Dipl.-Ing.(BA) Software Engineer Knowledge Management Depart

Leech crawler 1.3 released!

2013-02-05 Thread Christian Reuschling

Migrated to Tika 1.3, for those that use Tika and need further crawling capabilities. https://github.com/leechcrawler/leech Enjoy! :) Christian - -- __ Christian

4 matches
