Hi, I'm starting out with Solr on a Windows box. I want to index the following document types: doc/docx, xls/xlsx, ppt, vsd, pdf, txt, and gif/jpeg/tiff.

I understand that Solr uses Apache Tika to read these file types and return an XML stream back to Solr. For Tika image processing, I've installed Tesseract.

To be able to search the documents, I need to define "fields" in a file called managed-schema. How do I get a list of all valid field names for each file type? For example, for *.doc files, which fields exist, so that I can choose what to store? I'm assuming that *.doc files contain metadata written by Microsoft Word (e.g. author, date) as well as the "free form" text.

So where is the list of valid fields per file type? Also, how do I search the "free form" text for a word or pattern in the Solr search tool?