solr query to return matched text to regex with default schema

2020-07-07 Thread Phillip Wu
Hi, I want to search Solr for server names in a set of Microsoft Word documents, PDFs, and image files such as JPG and GIF. Server names are given by the regular expressions (regex) INFP[a-zA-Z0-9]{3,9} TRKP[a-zA-Z0-9]{3,9} PLCP[a-zA-Z0-9]{3,9} SQRP[a-zA-Z0-9]{3,9} Problem === I want to get the
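To make the four patterns concrete, here is a minimal Python sketch that pulls matching names out of already-extracted text. It runs outside Solr and is not a Solr query. It assumes the character class is meant to cover ASCII letters and digits only (`[a-zA-Z0-9]`); the word boundaries (`\b`), the function name `find_server_names`, and the sample sentence are illustrative additions, not part of the original post.

```python
import re

# Assumption: the character class is intended to be ASCII letters and digits.
# The \b word boundaries are an added assumption so that names embedded in
# longer tokens, or names longer than 9 trailing characters, are not matched.
SERVER_NAME = re.compile(r"\b(?:INFP|TRKP|PLCP|SQRP)[a-zA-Z0-9]{3,9}\b")


def find_server_names(text: str) -> list[str]:
    """Return every server name found in text, in order of appearance."""
    return SERVER_NAME.findall(text)


# Hypothetical sample text, for illustration only.
sample = "Backups run on INFPdb01 and TRKPweb7; PLCPx is too short."
print(find_server_names(sample))  # ['INFPdb01', 'TRKPweb7']
```

Combining the four prefixes into one alternation keeps the suffix rule in a single place, so a change to the allowed length only has to be made once.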

Solr query not returning all results

2017-10-17 Thread Phillip Wu
Hi, I've indexed a lot of documents (*.docx & *.vsd). When I run a query from the website it returns only a small proportion of the data in the index: { "responseHeader":{ "status":0, "QTime":66, "params":{ "q":"NS Finance 9.2", "fl":"id,date", "start":"0", "_":"1508193512223"}},
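The request parameters shown include `start` but no `rows`, and Solr returns 10 documents per request when `rows` is not set. Whether that explains the behaviour in this thread cannot be told from the truncated snippet. As a rough sketch, the following Python builds a select URL with explicit paging parameters. The host, port, and core name (`mycore`) are hypothetical, and the helper `build_select_url` is illustrative only.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, fields, start=0, rows=10):
    """Build a Solr /select URL with explicit paging parameters."""
    params = {"q": query, "fl": fields, "start": start, "rows": rows}
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical core URL; the query and field list are taken from the snippet.
url = build_select_url(
    "http://localhost:8983/solr/mycore",
    "NS Finance 9.2",
    "id,date",
    start=0,
    rows=100,
)
print(url)
```

Increasing `start` by `rows` on each request steps through the rest of the result set.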

Solr running on Windows2012r2 under task scheduler

2017-10-08 Thread Phillip Wu
Hi, I'm trying to run Solr 7.0.0 on Windows 2012r2 like a Unix daemon. I thought I'd do this by running the commands to start Solr interactively as a task scheduler job. If I don't do this then when I log out of the server Windows stops any running processes. I'm running Solr successfully from

Solr fields for Microsoft files, image files, PDF, text files

2017-09-24 Thread Phillip Wu
Hi, I'm starting out with Solr on a Windows box. I want to index the following documents: doc;docx xls;xlsx ppt vsd pdf txt gif;jpeg;tiff I understand that Solr uses Apache Tika to read these file types and return an XML stream back to Solr. For Tika image processing, I've loaded Tesseract.