Michael,

Yes, you're describing pretty much what I was thinking of, but:

a) If I index "function()" as "function()" rather than "function", does that mean a search for "function" won't find it? The problem is that in some cases the user will want to find function(), and in others just function. Can I accommodate both?

b) I understand about QueryParser.escape at search time. At indexing time, though, do I still need to escape the indexed values (e.g. keyword values) and store them in escaped form, e.g. function\(), or is function() OK?

Thanks,
- Dmitry
________________________________
From: Michael D. Curtin [mailto:[EMAIL PROTECTED]
Sent: Fri 1/27/2006 2:14 PM
To: java-user@lucene.apache.org
Subject: Re: How to find "function()" - ?

Dmitry Goldenberg wrote:

> Hi,
>
> I'm trying to figure out a way to locate tokens which include special
> characters. The actual text in the file being indexed is something like
> "function() { statement1; statement2; }"
>
> The query I'm using is "function\()" since I want to locate precisely
> "function()" - the query succeeds but what it finds is actually "function",
> not "function()". If I run the same query against "function { statement1,
> statement2 } it still succeeds and I get "function" in best fragments.
>
> How can I enforce () to be included?

I think you're going to have to write your own Analyzer subclass that keeps special characters in the terms. Then, use that Analyzer during indexing. The included Analyzers drop parentheses and the like.

If you're using Lucene's QueryParser, then use your new Analyzer there, too, and escape things like parentheses in the query text you submit to parse().

I think there's a discussion of custom Analyzers in the Lucene book, but I don't know where. Maybe somebody else on this list knows???

Good luck!

--MDC

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
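For illustration only, here is a rough sketch of the "keep the special characters in the terms" idea, extended so that each token is indexed in both its exact and its bare form. This is not Lucene code: DualFormIndex, terms(), add() and search() are hypothetical names for a toy in-memory index, and the whitespace-only splitting is an assumption made to keep the example short. A real Analyzer subclass would be built on Lucene's own tokenizer classes instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index, not Lucene code: it only illustrates indexing each
// token in two forms so that both "function()" and "function" can be found.
class DualFormIndex {
    // term -> ids of the documents that contain that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Split on whitespace only, so punctuation such as "()" stays in the token.
    static List<String> terms(String text) {
        List<String> out = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (raw.isEmpty()) continue;
            out.add(raw);                          // exact form, e.g. "function()"
            String bare = raw.replaceAll("[^A-Za-z0-9_]", "");
            if (!bare.isEmpty() && !bare.equals(raw)) {
                out.add(bare);                     // bare form, e.g. "function"
            }
        }
        return out;
    }

    void add(int docId, String text) {
        for (String term : terms(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Exact term lookup; the query string is neither analyzed nor escaped here.
    Set<Integer> search(String term) {
        return postings.getOrDefault(term, Set.of());
    }

    public static void main(String[] args) {
        DualFormIndex index = new DualFormIndex();
        index.add(1, "function() { statement1; statement2; }");
        index.add(2, "function { statement1, statement2 }");
        System.out.println(index.search("function()")); // prints [1]
        System.out.println(index.search("function"));   // prints [1, 2]
    }
}
```

In this sketch the stored terms are the plain strings "function()" and "function"; nothing is escaped at indexing time, since the backslash only matters to a query parser's syntax. Whether emitting both forms is worth the extra index size depends on how often users search for the punctuated form.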