Re: Analysing a document sections with Apache Tika

2017-05-04 Thread Thamme Gowda
.org> >> *Date: *Thursday, May 4, 2017 at 8:38 AM >> *To: *"user@tika.apache.org" <user@tika.apache.org> >> *Cc: *"thammego...@apache.org" <thammego...@apache.org> >> *Subject: *Re: Analysing a document sections with Apache Tika >> >

Re: Analysing a document sections with Apache Tika

2017-05-04 Thread Chris Mattmann
@tika.apache.org> Cc: "thammego...@apache.org" <thammego...@apache.org> Subject: Re: Analysing a document sections with Apache Tika Dear Thamme, Thanks for your reply and the suggestions. I built Grobid using the instructions from http://grobid.readthedocs.io/en/latest/Insta

Re: Analysing a document sections with Apache Tika

2017-05-04 Thread tesm...@gmail.com
Dear Thamme, Thanks for your reply and the suggestions. I built Grobid using the instructions from http://grobid.readthedocs.io/en/latest/Install-Grobid/ I am trying to run the following example code from the GitHub repository (https://github.com/kermitt2/grobid-example) = import

Re: Analysing a document sections with Apache Tika

2017-05-03 Thread Thamme Gowda
Hello, There is a nice project called Grobid [1] that does most of what you are describing. Tika has a Grobid parser built in (it calls Grobid over its REST API). Check out [2] for details. I have a project that makes use of Tika with Grobid and NER support. It also builds a search index using Solr.
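
To make the "calls Grobid over its REST API" part concrete, here is a minimal sketch of sending a PDF straight to a running Grobid service, without going through Tika. It is not how Tika's own parser is configured. The host, port, and the `/api/processHeaderDocument` path are assumptions that depend on the Grobid version and how the service was started (older releases used a different port and no `/api` prefix), and the helper names `build_grobid_request` and `extract_header_tei` are made up for illustration.

```python
import os
import uuid
import urllib.request

# Assumption: a Grobid service is reachable here; adjust host, port and path
# to match your Grobid version and deployment.
GROBID_URL = "http://localhost:8070/api/processHeaderDocument"


def build_grobid_request(pdf_bytes, filename="paper.pdf", url=GROBID_URL):
    """Build a multipart/form-data POST that carries the PDF in the 'input' field."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="input"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(
        url, data=head + pdf_bytes + tail, headers=headers, method="POST"
    )


def extract_header_tei(pdf_path, url=GROBID_URL):
    """Send one PDF to Grobid and return the response body (TEI XML) as text."""
    with open(pdf_path, "rb") as fh:
        request = build_grobid_request(
            fh.read(), filename=os.path.basename(pdf_path), url=url
        )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With a Grobid service running, `extract_header_tei("some-paper.pdf")` would return the header metadata as TEI XML, which can then be parsed to pull out the title, authors, and abstract.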