>> documents contain very long strings for chemical substances, users are >> interested in certain parts of the string e.g. find all documents that >> comprise "*foo*" be it "1-foo-bar" or "rab-oof-13-foonyl-naphthalene").
> So you're saying you want users to be able to search for "of-13" and > match that second one? User's really are demanding that? Yes and yes. Users range from Information Professionals to "naive" end users. If there's a string like "N-(t-Butyl)-N-(3,5-dinitrobenzoyl)-nitroxyl" users can be expected to search for "dinitro", "3,5-dinitro", "nitrobenz" etc. There are also sequences of amino acids or DNA that users might want to match partially. > But, keep in mind that WildcardQuery itself does support "*oo*" and it > would work as expected (although with the performance caveat if the > index is huge). If you want QueryParser to support a leading wildcard > character, you would have to customize it yourself. That's what I have been pondering about the whole morning and I'm going to give it a try. As far as I have tested, Lucene carries out these queries far quicker than your average relational DB text retrieval tool :-) Thanks a lot for your comments! Best regards, René --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]