Michael,
 
Yes, that's pretty much what I had in mind, but I have two follow-up questions:
 
a) If I index "function()" as "function()" rather than "function", does that
mean a search for "function" will no longer find it? The problem is that
sometimes the user will want to find function() and sometimes just function.
Can I accommodate both?
 
b) I understand using QueryParser.escape at search time. At indexing time,
though, do I still need to escape the indexed values (e.g. keyword values) and
store them in escaped form, e.g. function\(), or is function() OK?
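
To make question (a) concrete, here is a minimal sketch in plain Java (no Lucene dependency) of one possible approach: index each token in two forms, the raw text and a copy with leading and trailing punctuation trimmed. The class name DualFormIndexSketch and the method termsFor are hypothetical, invented for illustration. In Lucene the same idea would probably be a custom token filter that emits the second form at the same position, but that part is an assumption and is not shown here. Note that the terms carry no backslashes: escaping is a matter of query syntax, not of what is stored in the index.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene: decides which terms to index
// for a single whitespace-delimited token.
class DualFormIndexSketch {

    // Returns the raw token plus, when it differs, the token with leading
    // and trailing non-alphanumeric characters trimmed off.
    static Set<String> termsFor(String token) {
        Set<String> terms = new LinkedHashSet<>();
        terms.add(token); // raw form, e.g. "function()" with no escaping
        String stripped = token.replaceAll("^[^\\p{Alnum}]+|[^\\p{Alnum}]+$", "");
        if (!stripped.isEmpty()) {
            terms.add(stripped); // bare form, e.g. "function"
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(termsFor("function()"));
        System.out.println(termsFor("function"));
    }
}
```

With both forms indexed, a search for either the bare word or the word with parentheses has a term to match. The cost is a larger index and hits for the bare word in documents that only contain the parenthesized form.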
 
Thanks,
- Dmitry

________________________________

From: Michael D. Curtin [mailto:[EMAIL PROTECTED]
Sent: Fri 1/27/2006 2:14 PM
To: java-user@lucene.apache.org
Subject: Re: How to find "function()" - ?



Dmitry Goldenberg wrote:

> Hi,
> 
> I'm trying to figure out a way to locate tokens which include special 
> characters.  The actual text in the file being indexed is something like 
> "function() { statement1; statement2; }"
> 
> The query I'm using is "function\()" since I want to locate precisely 
> "function()". The query succeeds, but what it finds is actually "function", 
> not "function()".  If I run the same query against "function { statement1, 
> statement2 }", it still succeeds and I get "function" in best fragments.
> 
> How can I enforce () to be included?

I think you're going to have to write your own Analyzer subclass that
keeps special characters in the terms.  Then, use that Analyzer during
indexing.  The included Analyzers drop parentheses and the like.
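
The difference between the two tokenizing styles can be sketched in plain Java without any Lucene classes. This is an illustration only: TokenizerSketch, lettersOnly, and whitespaceOnly are hypothetical names, and the real Analyzer API looks different. Lucene's own WhitespaceAnalyzer splits on whitespace only, so it may be worth trying before writing a subclass.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of two tokenizing strategies; not Lucene code.
class TokenizerSketch {

    // Keeps only letters and digits in a token, so "function()" becomes "function".
    static List<String> lettersOnly(String text) {
        return split(text, false);
    }

    // Breaks on whitespace only, so "function()" survives as a single token.
    static List<String> whitespaceOnly(String text) {
        return split(text, true);
    }

    private static List<String> split(String text, boolean keepSymbols) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            boolean partOfToken = keepSymbols
                    ? !Character.isWhitespace(c)
                    : Character.isLetterOrDigit(c);
            if (partOfToken) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A delimiter ends the token collected so far.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "function() { statement1; statement2; }";
        System.out.println(lettersOnly(text));
        System.out.println(whitespaceOnly(text));
    }
}
```

With the first strategy the parentheses never reach the index, which would explain why a query for "function()" ends up matching plain "function".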

If you're using Lucene's QueryParser, then use your new Analyzer there,
too, and escape things like parentheses in the query text you submit to
parse().
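
As a rough sketch of what escaping does to the query text, the plain-Java helper below puts a backslash before each character in an assumed set of query-syntax characters and then shows the reverse step a parser performs conceptually. QueryEscapeSketch and its SPECIAL set are hypothetical. In real code, call Lucene's QueryParser.escape rather than a hand-rolled version, since the actual character set depends on the Lucene version.

```java
// Hypothetical stand-in for query escaping; not the Lucene implementation.
class QueryEscapeSketch {

    // Assumed set of query-syntax characters; the real set varies by parser version.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?";

    // Prefixes every special character with a backslash.
    static String escape(String raw) {
        StringBuilder sb = new StringBuilder();
        for (char c : raw.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Conceptual parser step: drop each backslash and keep the next character literally.
    static String unescape(String escaped) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < escaped.length(); i++) {
            char c = escaped.charAt(i);
            if (c == '\\' && i + 1 < escaped.length()) {
                sb.append(escaped.charAt(++i));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String escaped = escape("function()");
        System.out.println(escaped);
        System.out.println(unescape(escaped));
    }
}
```

The round trip is the point of the sketch: the backslashes exist only in the query string, and the term that is finally looked up is the unescaped text, which then goes through the same Analyzer used at indexing time.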

I think there's a discussion of custom Analyzers in the Lucene book, but
I don't know where.  Maybe somebody else on this list knows?

Good luck!

--MDC

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



