gautamworah96 commented on PR #2342: URL: https://github.com/apache/james-project/pull/2342#issuecomment-2285055678
Hi @uschindler, I didn't understand a few details.

> Lucene does not know how the document was indexed originally, the information is not stored anywhere in index. When you then reindex the document it will apply default settings for all the fields in the Document (which is analyzed/tokenized with the default analyzer). This will transform all StringField instances to TextField.

Let's take the code [here](https://github.com/tigase/james-project/commit/e5fe4010131f085754cfadcffdb14224612bb848) as a reference for this discussion.

In that code, on L1293 of [LuceneMessageSearchIndex.java](https://github.com/tigase/james-project/commit/e5fe4010131f085754cfadcffdb14224612bb848#diff-a7c2a3c5cdb7e4a2914c899409991e27df6b25ad54488f197bc533193e3a03d0), they tried a TermQuery on the ID field, and that failed too. That at least should have worked, right? Sure, Lucene does not store indexing information, but it would still have stored the untokenized ID (just without any record of whether it was tokenized), right? All the TermQuery had to do was match against the terms indexed in the ID field. Even the updates made on L1282 used the same StringField.
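For illustration, the behaviour described in the quoted statement can be sketched with a toy model. This is plain Java and deliberately not Lucene's API: the class `ToyIndex`, its methods, the simplified "analysis" (lowercase, split on non-alphanumerics), and the ID value `Mailbox-42` are all hypothetical. It only shows why an exact-term lookup could stop matching once a stored value is re-added with analysis applied. Whether this is what happens at the lines referenced above is exactly the open question.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy model only, NOT Lucene's API. It mimics one idea from the discussion:
// the index keeps terms and stored values, but no record of whether a field
// was originally indexed verbatim or analyzed.
class ToyIndex {
    // field name -> term -> ids of documents containing that term
    private final Map<String, Map<String, Set<Integer>>> postings = new HashMap<>();
    // document id -> field name -> stored value
    private final Map<Integer, Map<String, String>> stored = new HashMap<>();

    // Index the value as one exact term (roughly the role of a StringField).
    void addVerbatim(int docId, String field, String value) {
        store(docId, field, value);
        post(docId, field, value);
    }

    // Index the value after a simplified analysis: lowercase, then split on
    // anything that is not a letter or digit (an assumption of this sketch).
    void addAnalyzed(int docId, String field, String value) {
        store(docId, field, value);
        for (String token : value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                post(docId, field, token);
            }
        }
    }

    // Remove a document's stored values and all of its postings.
    void delete(int docId) {
        stored.remove(docId);
        for (Map<String, Set<Integer>> terms : postings.values()) {
            for (Set<Integer> ids : terms.values()) {
                ids.remove(docId);
            }
        }
    }

    // Return a copy of the stored values; nothing here says how they were indexed.
    Map<String, String> storedFields(int docId) {
        return new HashMap<>(stored.getOrDefault(docId, Map.of()));
    }

    // Exact term lookup: no analysis is applied to the query term.
    Set<Integer> termQuery(String field, String term) {
        return new HashSet<>(postings.getOrDefault(field, Map.of()).getOrDefault(term, Set.of()));
    }

    // Read the stored values back and re-add them with "default" (analyzed) settings.
    static void reindexWithDefaults(ToyIndex index, int docId) {
        Map<String, String> fields = index.storedFields(docId);
        index.delete(docId);
        for (Map.Entry<String, String> e : fields.entrySet()) {
            index.addAnalyzed(docId, e.getKey(), e.getValue());
        }
    }

    private void store(int docId, String field, String value) {
        stored.computeIfAbsent(docId, k -> new HashMap<>()).put(field, value);
    }

    private void post(int docId, String field, String term) {
        postings.computeIfAbsent(field, k -> new HashMap<>())
                .computeIfAbsent(term, k -> new HashSet<>())
                .add(docId);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addVerbatim(1, "id", "Mailbox-42");
        System.out.println(index.termQuery("id", "Mailbox-42")); // [1]

        reindexWithDefaults(index, 1);
        System.out.println(index.termQuery("id", "Mailbox-42")); // []
        System.out.println(index.termQuery("id", "mailbox"));    // [1]
    }
}
```

In this model the exact lookup succeeds on the first write and fails only after the stored value has been read back and re-added. If the real index behaves the same way, a TermQuery on the ID would be expected to work until the first such round trip.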
