parsing document title

2010-06-17 Thread Mango
I'm supposed to index documents which do not have all the information I need stored in the Metadata fields. I would like to extract the document title from the document body when the Title Metadata field contains no information. In addition, many of the documents contain a table with information o

Re: Adding a new field to existing Index

2010-06-29 Thread Mango
Unfortunately, I don't think it is possible to add a new field without re-indexing. As for extracting content from the field, it should be possible to retrieve the data if the term vectors were stored with positions and offsets (Field.TermVector.WITH_POSITIONS_OFFSETS). If not, I don't think it's possible.
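
To illustrate why the positions matter, here is a minimal sketch in plain Java. It does not use the Lucene API: the term-to-positions map is a hypothetical stand-in for what a stored term vector would give you, and the class and method names are made up for this example. Given each term and the positions where it occurs, the token sequence can be put back in order. Without positions, only the set of terms (and their frequencies) would be known.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Lucene: rebuilds token order
// from a term -> positions map such as a term vector could supply.
class TermVectorRebuild {

    public static String rebuild(Map<String, int[]> termPositions) {
        // A TreeMap keyed by position sorts the tokens back into document order.
        TreeMap<Integer, String> byPosition = new TreeMap<>();
        for (Map.Entry<String, int[]> entry : termPositions.entrySet()) {
            for (int position : entry.getValue()) {
                byPosition.put(position, entry.getKey());
            }
        }
        // Join the terms in ascending position order.
        return String.join(" ", byPosition.values());
    }

    public static void main(String[] args) {
        // Assumed sample data: terms of "to be or not to be" with their positions.
        Map<String, int[]> termVector = new HashMap<>();
        termVector.put("to", new int[] {0, 4});
        termVector.put("be", new int[] {1, 5});
        termVector.put("or", new int[] {2});
        termVector.put("not", new int[] {3});
        System.out.println(rebuild(termVector));
    }
}
```

Note that this would only give back the analyzed terms (for example lowercased, with stop words removed, depending on the analyzer), not the original text, so it is a rough reconstruction at best.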