: I am a new Solr user, and would like to create a new custom field that is
: then populated with text extracted from each document when I crawl my file
: system.
What are you using to do the crawling? Typically people feed Solr structured data -- there are some things in Solr (like the ExtractingRequestHandler) that help you pull structure out of unstructured or semi-structured files, and there are things like DIH (the DataImportHandler) that can help you pull data from structured (or semi-structured) sources, but those aren't end-all-be-all solutions to all problems -- they aim to meet the 80/20 rule of simple common cases.

If you have special requirements about parsing special files, e.g.:

    text text text... Received: 04 Jan 2002 17:31:40 ...text text text

...you'll need to write your own special code for parsing those files to extract the structure you want. Where/how you use your custom code depends on your use case -- maybe you write a custom extractor for Tika and then use the ExtractingRequestHandler, maybe you write a custom EntityProcessor and then use the DataImportHandler, or maybe you just parse the files in the client language of your choice and POST the results to Solr over HTTP ... it all depends on your use case and what you are comfortable with.

BTW: Since you definitely seem to be interested in using Solr, you should consider sending subsequent questions to the solr-user@lucene mailing list (general@lucene is generally for discussions about the overall Lucene project, and/or questions from people who really have no idea what they want to use).

-Hoss
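To make the "parse in your own client code and POST to Solr" option concrete, here is a minimal Python sketch. It pulls a "Received:" date out of a semi-structured file (like the example above) and sends it to Solr's JSON update endpoint. The collection name ("mycollection") and the field names ("received_dt", "body_txt") are assumptions for illustration; adjust them to match your own schema and URL.

```python
import json
import re
import urllib.request

def extract_received_date(text):
    """Return the first 'Received: <date>' value found in the text, or None."""
    m = re.search(r"Received\s*:\s*(\d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2})", text)
    return m.group(1) if m else None

def build_solr_doc(doc_id, text):
    """Build a Solr document dict, putting the extracted date in a custom field.

    Field names here are hypothetical -- they must exist in your schema.
    """
    return {
        "id": doc_id,
        "received_dt": extract_received_date(text),
        "body_txt": text,
    }

def post_to_solr(doc, solr_url="http://localhost:8983/solr/mycollection/update?commit=true"):
    """POST a single document to Solr's JSON update endpoint over HTTP."""
    req = urllib.request.Request(
        solr_url,
        data=json.dumps([doc]).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)

if __name__ == "__main__":
    sample = "text text text... Received : 04 Jan 2002 17:31:40 ...text text text"
    doc = build_solr_doc("doc-1", sample)
    print(doc["received_dt"])
    # post_to_solr(doc)  # uncomment when you have a running Solr instance
```

The same extraction logic could instead live inside a custom Tika extractor or a DIH EntityProcessor; the client-side approach shown here is just the one with the fewest moving parts.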
