Hello, I need to do a search that is capable to also match on substrings, for example:
*oo bar the qu* should find a document that contains 'foo bar the quux' and 'foo bar the qux'. Now, should I index the text as UN_TOKENIZED also, and do a WildCardQuery on this field? Obviously, then every blobtext is added as a single term in lucene. Clearly, this doesn't scale at all, and searching becomes very slow. Does anybody know a more efficient way? A PhraseQuery might get me somewhere, isn't? Does PhraseQuery allow wildcards in the phrase? But, as a phrase is analyzed according some analyzer it might strip the 'the' as a stopword, implying that *oo bar qu* would also match, right? I know the requirements is a little strange, but it is part of the JSR-170 specification (sql 'like' or xpath 'jcr:like' which mimics the sql like in db) Thanks for any pointers Ard -- Hippo Oosteinde 11 1017WT Amsterdam The Netherlands Tel +31 (0)20 5224466 ------------------------------------------------------------- [EMAIL PROTECTED] / [EMAIL PROTECTED] / http://www.hippo.nl -------------------------------------------------------------- --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]