Re: Indexing and ExtractingRequestHandler

2010-08-12 Thread Lance Norskog
This is probably true about Luke. The trunk has a new Lucene format and does not read any previous format. The trunk is a busy code base. The 3.1 branch is slated to be the next Solr release, and is probably a better base for your testing. Best of all is to use the Solr 1.4.1 binary release. On W

Re: Indexing and ExtractingRequestHandler

2010-08-11 Thread Harry Hochheiser
Thanks. I've used the Tika command line to parse the Excel file, and I see contents in it that don't appear to be indexed. I've also tried using Tika to parse the Excel file and then using the ExtractingRequestHandler to index the resulting text, and that doesn't work either. As far as Luke goes, I'v

Re: Indexing and ExtractingRequestHandler

2010-08-11 Thread Jan Høydahl / Cominvent
Hi, You can try the Tika command line to parse your Excel file. You will then see the exact textual output from it, which is what will be indexed into Solr, and can inspect whether something is missing. Are you sure you are using a version of Luke which supports your version of Lucene? -- Jan Høydahl, search so
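
For reference, a minimal sketch of the Tika command-line check suggested above, written in Python. It assumes a standalone tika-app jar is available locally; the jar and spreadsheet file names are hypothetical placeholders, and `--text` is the Tika app option for plain-text output.

```python
import subprocess


def tika_text_command(jar_path, document_path):
    """Build the command that asks the Tika app to print a file's plain text."""
    # --text selects plain-text output from the Tika command-line app.
    return ["java", "-jar", jar_path, "--text", document_path]


def extract_text(jar_path, document_path):
    """Run Tika and return the extracted text (requires Java and the jar)."""
    result = subprocess.run(
        tika_text_command(jar_path, document_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical file names, used only to show the resulting command.
command = tika_text_command("tika-app.jar", "spreadsheet.xls")
print(" ".join(command))
```

Comparing the text returned by `extract_text` against what shows up in the index should narrow down whether content is lost during extraction or later in the indexing chain.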

Indexing and ExtractingRequestHandler

2010-08-11 Thread Harry Hochheiser
I'm trying to use Solr to index the contents of an Excel file, using the ExtractingRequestHandler (CSV handler won't work for me - I need to consider the whole spreadsheet as one document), and I'm running into some trouble. Is there any way to see what's going on during the indexing process? I'm
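
One way to see what the handler extracts without indexing anything is the ExtractingRequestHandler's `extractOnly=true` parameter, which returns the Tika output in the response instead of adding a document. The sketch below only builds the request; it assumes the handler is mounted at `/update/extract` and that Solr runs at `http://localhost:8983/solr`, both of which depend on your solrconfig.xml and deployment.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_extract_only_request(solr_base, payload,
                               content_type="application/vnd.ms-excel"):
    """Build a POST that asks Solr to extract, but not index, a document."""
    # extractOnly=true makes the handler return the extracted content
    # rather than adding it to the index.
    query = urlencode({"extractOnly": "true"})
    url = f"{solr_base.rstrip('/')}/update/extract?{query}"
    return Request(
        url,
        data=payload,
        headers={"Content-Type": content_type},
        method="POST",
    )


# Placeholder bytes; in practice read the spreadsheet with open(path, "rb").
request = build_extract_only_request("http://localhost:8983/solr", b"placeholder")
print(request.full_url)
# To send it against a running Solr instance:
#   from urllib.request import urlopen
#   print(urlopen(request).read().decode("utf-8"))
```

If the response already lacks the expected cell contents, the problem is on the Tika side rather than in the field mapping or schema.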