Thanks, that did work.
On Tue, Jun 17, 2014 at 8:49 PM, Jack Krupansky <j...@basetechnology.com> wrote:
> Yeah, this is kind of tricky and confusing! Here's what happens:
>
> 1. The query parser "parses" the input string into individual source
> terms, each delimited by white space. The escape is removed in this
> process, but... no analyzer has been called at this stage.
>
> 2. The query parser (generator) calls the analyzer for each source term.
> Your analyzer is called at this stage, but... the escape is already gone,
> so... the <backslash><slash> mapping rule is not triggered, leaving the
> slash recorded in the source term from step 1.
>
> You do need the backslash in your original query because a slash
> introduces a regex query term. It is added by the escape method you call,
> but the escaping will be gone by the time your analyzer is called.
>
> So, just try a simple, unescaped slash in your char mapping table.
>
> -- Jack Krupansky
>
> -----Original Message----- From: Luis Pureza
> Sent: Tuesday, June 17, 2014 1:43 PM
> To: java-user@lucene.apache.org
> Subject: Lucene QueryParser/Analyzer inconsistency
>
> Hi,
>
> I'm experiencing some puzzling behaviour with the QueryParser and was
> hoping someone around here could help me.
>
> I have a very simple Analyzer that tries to replace forward slashes (/)
> with spaces. Because QueryParser forces me to escape strings with slashes
> before parsing, I added a MappingCharFilter to the analyzer that replaces
> "\/" with a single space.
> The analyzer is defined as follows:
>
> @Override
> protected TokenStreamComponents createComponents(String field, Reader in) {
>     NormalizeCharMap.Builder builder = new NormalizeCharMap.Builder();
>     builder.add("\\/", " ");
>     Reader mappingFilter = new MappingCharFilter(builder.build(), in);
>
>     Tokenizer tokenizer = new WhitespaceTokenizer(version, mappingFilter);
>     return new TokenStreamComponents(tokenizer);
> }
>
> Then I use this analyzer in the QueryParser to parse a string containing
> a slash:
>
> String text = QueryParser.escape("one/two");
> QueryParser parser = new QueryParser(Version.LUCENE_48, "f",
>     new MyAnalyzer(Version.LUCENE_48));
> System.err.println(parser.parse(text));
>
> The expected output would be
>
> f:one f:two
>
> However, I get:
>
> f:one/two
>
> The puzzling thing is that when I debug the analyzer, it tokenizes the
> input string correctly, returning two tokens instead of one.
>
> What is going on?
>
> Many thanks,
>
> Luís Pureza
>
> P.S.: I was able to fix this issue temporarily by creating my own
> tokenizer that tokenizes on whitespace and slashes. However, I still
> don't understand what's going on.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
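[Editor's note: the two-stage ordering Jack describes can be illustrated without Lucene at all. The sketch below is a plain-Java simulation (class and method names are my own invention, not Lucene APIs): stage 1 stands in for the query parser stripping the escape, stage 2 for the MappingCharFilter running afterwards. It shows why a rule keyed on backslash-slash never fires, while a rule keyed on a bare slash does.]

```java
// Hypothetical simulation of the escape/analyze ordering, not Lucene code.
public class EscapeOrderDemo {

    // Stage 1: the query parser removes escapes while splitting terms,
    // before any analyzer runs.
    static String stripEscapes(String s) {
        return s.replace("\\/", "/");
    }

    // Stage 2: the char-mapping rule runs on the already-unescaped term.
    static String applyMapping(String term, String from, String to) {
        return term.replace(from, to);
    }

    public static void main(String[] args) {
        String escaped = "one\\/two";        // what QueryParser.escape("one/two") yields
        String term = stripEscapes(escaped); // "one/two" -- the backslash is already gone

        // A rule keyed on "\\/" never matches: the analyzer never sees a backslash.
        System.out.println(applyMapping(term, "\\/", " ")); // prints one/two

        // A rule keyed on a plain "/" fires as intended.
        System.out.println(applyMapping(term, "/", " "));   // prints one two
    }
}
```

So in the real analyzer, changing `builder.add("\\/", " ")` to `builder.add("/", " ")` is Jack's suggested fix: by the time the char filter sees the term, there is no backslash left to match.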