I've been using the Lucene demo from http://lucene.apache.org/java/2_1_0/demo.html
I have a set of documents whose filenames give a good indication of their content.

A filename of 12 digits (I think this is [0-9]{12} as a regular expression) with the extension html is a troubleshooting guide; the number is an error code.

A filename with two or three letters, then a minus (which would be [a-z]{2,3}- I think), then a known string means the document is about a particular subject. I have a list of the known strings matched to subjects.

What I would like is for my indexer to create a field named "category", populated with either the string "troubleshooting" or the known string extracted from the filename.

Examples:

For a file named 0000000000111.html the indexer adds the field "category" with the value "troubleshooting".
For a file named xxx-cal-123.html the indexer adds the field "category" with the value "cal".
For a file named xx-qv-(9).html the indexer adds the field "category" with the value "qv".

Is there a way to do that?

Beef.
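The filename rules described above could be sketched in plain Java roughly as follows. This is only an illustration under stated assumptions: the class name CategoryExtractor, the method categoryFor, and the two-entry KNOWN set are hypothetical stand-ins for the real list of known strings. The sample name 0000000000111.html has 13 digits as written, so the sketch accepts 12 or more digits; tighten the pattern if the error codes are always exactly 12 digits long.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CategoryExtractor {
    // Hypothetical known strings; replace with the real list of subjects.
    private static final Set<String> KNOWN =
            new HashSet<>(Arrays.asList("cal", "qv"));

    // Assumption: 12 or more digits followed by ".html" marks a troubleshooting guide.
    private static final Pattern TROUBLESHOOTING =
            Pattern.compile("[0-9]{12,}\\.html");

    // Two or three lowercase letters, a minus, then a run of letters
    // (the candidate known string), then anything up to ".html".
    private static final Pattern PREFIXED =
            Pattern.compile("[a-z]{2,3}-([a-z]+).*\\.html");

    /** Returns the category for a filename, or null if no rule applies. */
    static String categoryFor(String fileName) {
        if (TROUBLESHOOTING.matcher(fileName).matches()) {
            return "troubleshooting";
        }
        Matcher m = PREFIXED.matcher(fileName);
        if (m.matches() && KNOWN.contains(m.group(1))) {
            return m.group(1);
        }
        return null; // unknown pattern or unknown string: add no category
    }
}
```

With Lucene 2.1, the returned value could then be attached to the Document the demo builds for each file, for example with doc.add(new Field("category", category, Field.Store.YES, Field.Index.UN_TOKENIZED)) when category is not null. Indexing the field untokenized keeps the value as a single term, which suits exact category lookups.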