Zied Hamdi created LUCENE-4247:
----------------------------------

             Summary: QueryParser doesn't call Analyzer
                 Key: LUCENE-4247
                 URL: https://issues.apache.org/jira/browse/LUCENE-4247
             Project: Lucene - Core
          Issue Type: Bug
          Components: core/queryparser
    Affects Versions: 3.6
            Reporter: Zied Hamdi


I'm trying to strip diacritics from Czech characters through the 
ASCIIFoldingFilter. This works fine at index time, since I can retrieve the 
content I indexed by searching for its non-diacritic form. But searching with 
diacritics always returns 0 results.

In debug mode I can clearly see that the Analyzer is not called. I also put a 
breakpoint in my analyzer to check whether it is called later, and it is 
never hit.


                searchText = "příLIš*";
                Analyzer analyzer = (Analyzer) factory.getBean("analyzer");
                Query q = new QueryParser((Version) factory.getBean("version"),
                                DestinationPlaceProperties.NAME, analyzer).parse(searchText);


The query q has these values in debug:
prefix  Term  (id=90)   
        field   "name" (id=100) 
        text    "příliš" (id=101)       

--- more details ----
q       PrefixQuery  (id=65)    
        boost   1.0     
        numberOfTerms   0       
        prefix  Term  (id=90)   
        rewriteMethod   MultiTermQuery$2  (id=92)       
---------------------

My analyzer is quite simple; I include its code here just for reference:

public class DestinationAnalyser extends Analyzer {

        /**
         * 
         */
        private final Version   luceneVersion;

        public DestinationAnalyser(Version lucene_version) {
                super();
                this.luceneVersion = lucene_version;
        }

        /*
         * (non-Javadoc)
         * 
         * @see org.apache.lucene.analysis.Analyzer#tokenStream(java.lang.String,
         * java.io.Reader)
         */
        @Override
        public TokenStream tokenStream(String fieldName, Reader reader) {
                TokenStream result = new StandardTokenizer(luceneVersion, reader);
                result = new StandardFilter(luceneVersion, result);
                result = new LowerCaseFilter(luceneVersion, result);
                result = new ASCIIFoldingFilter(result);
                return result;
        }
}


--------- WORKAROUND ---------
To avoid the problem, I'm currently using this method to transform the search 
text:
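        /**
         * Uses {@link ASCIIFoldingFilter} to transform diacritical text to its
         * ASCII counterpart.
         * 
         * @param text
         *          to transform
         * @return ascii text
         */
        public static String foldToASCII(String text) {
                int length = text.length();
                // One input char can fold to several ASCII chars (for example a
                // ligature), so the output buffer is sized at 4 * length.
                char[] folded = new char[4 * length];
                // foldToASCII returns the number of chars written to the output.
                int foldedLength = ASCIIFoldingFilter.foldToASCII(text.toCharArray(), 0,
                                folded, 0, length);
                return new String(folded, 0, foldedLength);
        }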
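
For illustration, the same idea of pre-folding the search text before it 
reaches QueryParser can be sketched with the JDK alone. This is only a sketch 
under assumptions: it uses java.text.Normalizer instead of ASCIIFoldingFilter, 
so it strips combining marks but does not cover every mapping the Lucene 
filter handles (ligatures, for example), and the class and method names are 
hypothetical.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: pre-folds query text with the JDK only.
final class QueryTextFolder {

    private QueryTextFolder() {
    }

    /**
     * Lowercases the text and strips combining diacritical marks.
     * Wildcard characters such as '*' and '?' pass through unchanged.
     */
    static String foldDiacritics(String text) {
        // NFD splits each accented letter into a base letter plus combining marks.
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // \p{M} matches the combining marks, which are dropped here.
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        return stripped.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // The search text from this report, written with Unicode escapes.
        System.out.println(foldDiacritics("p\u0159\u00edLI\u0161*")); // prints "prilis*"
    }
}
```

The folded string can then be passed to QueryParser.parse in place of the raw 
search text.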
