Re: Solr indexing with Tika DIH local vs network share

2019-04-04 Thread neilb
Thank you Erick, this is very helpful!

Re: Solr indexing with Tika DIH local vs network share

2019-03-29 Thread Erick Erickson
So just try adding the autoCommit and autoSoftCommit settings. All of the example configs have these entries, and you can copy/paste/change them.
> On Mar 29, 2019, at 10:35 AM, neilb wrote:
>
> Hi Erick, I am using solrconfig.xml from samples only and has very few
> entries. I have attached my config

Re: Solr indexing with Tika DIH local vs network share

2019-03-29 Thread neilb
Hi Erick, I am using the solrconfig.xml from the samples only, and it has very few entries. I have attached my config files for review along with this reply. Thanks. Attachments: solrconfig.xml, tika-data-config.xml

Re: Solr indexing with Tika DIH local vs network share

2019-03-29 Thread Erick Erickson
My suspicion is that your autocommit settings in solrconfig.xml are something like:
hard commit: has openSearcher set to “false”
soft commit: has the interval set to -1 (never)
That means that until an external commit is executed, you won’t see any documents. Try setting your soft commit to somet
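The "external commit" mentioned above can be issued by hand against Solr's update handler, which is a quick way to check whether missing documents are only a visibility problem. The following is a minimal sketch, not the poster's setup: the base URL `http://localhost:8983/solr` and the core name `tika` are assumptions, and only the standard `commit=true` and `softCommit=true` update parameters are used.

```python
import urllib.parse
import urllib.request


def commit_url(base_url, core, soft=False):
    """Build the update-handler URL that triggers an explicit commit.

    base_url and core are placeholders (assumptions), e.g.
    "http://localhost:8983/solr" and "tika".
    """
    params = {"softCommit": "true"} if soft else {"commit": "true"}
    query = urllib.parse.urlencode(params)
    return f"{base_url.rstrip('/')}/{urllib.parse.quote(core)}/update?{query}"


def send_commit(base_url, core, soft=False, timeout=30):
    """Ask Solr to commit so that already-indexed documents become searchable."""
    with urllib.request.urlopen(commit_url(base_url, core, soft), timeout=timeout) as resp:
        return resp.status
```

Calling `send_commit("http://localhost:8983/solr", "tika")` while a long import is running should make the documents indexed so far visible to searches, assuming the core exists under that name.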

Re: Solr indexing with Tika DIH local vs network share

2019-03-29 Thread neilb
Hi Erick, thanks a lot for your suggestions. I will look into it. But to answer my own query: I was a little impatient and was checking the indexing status every minute. What I found is that after a few hours, the status started updating with the document count, and the indexing process finished in around 5 hours. Do you

Re: Solr indexing with Tika DIH local vs network share

2019-03-26 Thread Erick Erickson
Not quite an answer to your specific question, but… There are a number of reasons why it’s better to run your Tika process outside of Solr and DIH. Here’s the long form: https://lucidworks.com/2012/02/14/indexing-with-solrj/ Ignore the RDBMS parts. It’s somewhat old, but should be easily adaptable.
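To illustrate what running Tika outside of Solr can look like, here is a rough sketch. It is not the approach from the linked article, which uses SolrJ with Tika embedded in a Java client; this version swaps in a standalone Tika Server reached over HTTP. Everything concrete in it is an assumption: the Tika Server address `http://localhost:9998`, the Solr update URL passed by the caller, and the field names `id` and `text`, which must match whatever the target schema defines.

```python
import json
import os
import urllib.request


def find_pdfs(base_dir):
    """Yield every .pdf file under base_dir, recursing into subfolders."""
    for root, _dirs, files in os.walk(base_dir):
        for name in sorted(files):
            if name.lower().endswith(".pdf"):
                yield os.path.join(root, name)


def extract_text(path, tika_url="http://localhost:9998"):
    """Send one file to a Tika Server (assumed address) and return plain text."""
    with open(path, "rb") as fh:
        payload = fh.read()
    req = urllib.request.Request(
        f"{tika_url}/tika",
        data=payload,
        method="PUT",
        headers={"Accept": "text/plain"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return resp.read().decode("utf-8", errors="replace")


def build_doc(path, text):
    """Map a file to a Solr document; the field names are hypothetical."""
    return {"id": path, "text": text}


def index_docs(docs, solr_update_url):
    """POST a batch of documents as JSON to Solr's update handler and commit."""
    body = json.dumps(list(docs)).encode("utf-8")
    req = urllib.request.Request(
        f"{solr_update_url}?commit=true",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return resp.status
```

A driver would loop over `find_pdfs(base_dir)`, call `extract_text` on each path, and pass the resulting documents to `index_docs` in batches. Because extraction runs in the client process, a malformed PDF fails there rather than inside Solr.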

Solr indexing with Tika DIH local vs network share

2019-03-26 Thread neilb
Hi, I am trying to set up Solr for our project so that it can return full-text search results on PDF documents. I am able to run the sample Tika DIH example locally on my Windows Server machine. It can index all PDF documents recursively in "baseDir" of config xml. Presently "baseDir" points to local folder o