Hi Alessandro,

yes, I want the user to be able to surround the query with double quotes ("") so that it runs as a phrase query in which the quoted text is NOT tokenized.
What do I have to do?

Thanks and kind regards

On Tue, Jul 21, 2015 at 2:47 PM, Alessandro Benedetti <benedetti.ale...@gmail.com> wrote:

> Hey Jack, reading the doc:
>
> "Set to true if phrase queries will be automatically generated when the
> analyzer returns more than one term from whitespace delimited text. NOTE:
> this behavior may not be suitable for all languages.
>
> Set to false if phrase queries should only be generated when surrounded by
> double quotes."
>
> In the user's case, I guess he's likely to use double quotes.
>
> The only problem he sees so far is that the phrase query uses the query
> time analyser to actually split the tokens.
>
> First we need feedback from him, but I guess he would like the phrase
> query not to tokenise the text within the double quotes.
>
> In that case we should find a way.
>
> Cheers
>
> 2015-07-21 13:12 GMT+01:00 Jack Krupansky <jack.krupan...@gmail.com>:
>
> > If you don't explicitly enable automatic phrase queries, the Lucene query
> > parser will assume an OR operator on the sub-terms when a white
> > space-delimited term analyzes into a sequence of terms.
> >
> > See:
> >
> > https://lucene.apache.org/core/5_2_0/queryparser/org/apache/lucene/queryparser/classic/QueryParserBase.html#setAutoGeneratePhraseQueries(boolean)
> >
> > -- Jack Krupansky
> >
> > On Fri, Jul 17, 2015 at 4:41 AM, Diego Socaceti <socac...@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > I'm new to Lucene and tried to write my own analyzer to support
> > > hyphenated words like wi-fi, jean-pierre, etc.
> > > For our customer it is important to find the word
> > > - wi-fi by wi, fi, wifi, wi-fi
> > > - jean-pierre by jean, pierre, jean-pierre, jean-*
> > >
> > > The analyzer:
> > >
> > > public class SupportHyphenatedWordsAnalyzer extends Analyzer {
> > >
> > >   protected NormalizeCharMap charConvertMap;
> > >
> > >   // Constructor name must match the class name.
> > >   public SupportHyphenatedWordsAnalyzer() {
> > >     initCharConvertMap();
> > >   }
> > >
> > >   protected void initCharConvertMap() {
> > >     NormalizeCharMap.Builder builder = new NormalizeCharMap.Builder();
> > >     builder.add("\"", "");
> > >     charConvertMap = builder.build();
> > >   }
> > >
> > >   @Override
> > >   protected TokenStreamComponents createComponents(final String fieldName) {
> > >
> > >     final Tokenizer src = new WhitespaceTokenizer();
> > >
> > >     TokenStream tok = new WordDelimiterFilter(src,
> > >         WordDelimiterFilter.PRESERVE_ORIGINAL
> > >             | WordDelimiterFilter.GENERATE_WORD_PARTS
> > >             | WordDelimiterFilter.GENERATE_NUMBER_PARTS
> > >             | WordDelimiterFilter.CATENATE_WORDS,
> > >         null);
> > >     tok = new LowerCaseFilter(tok);
> > >     tok = new LengthFilter(tok, 1, 255);
> > >     tok = new StopFilter(tok, StopAnalyzer.ENGLISH_STOP_WORDS_SET);
> > >
> > >     return new TokenStreamComponents(src, tok);
> > >   }
> > >
> > >   @Override
> > >   protected Reader initReader(String fieldName, Reader reader) {
> > >     return new MappingCharFilter(charConvertMap, reader);
> > >   }
> > > }
> > >
> > > The analyzer seems to work except for exact phrase match queries.
> > >
> > > E.g. the following words are indexed:
> > >
> > > FD-A320-REC-SIM-1
> > > FD-A320-REC-SIM-10
> > > FD-A320-REC-SIM-11
> > > MIA-FD-A320-REC-SIM-1
> > > SIN-FD-A320-REC-SIM-1
> > >
> > > The (exact) query "FD-A320-REC-SIM-1" returns
> > >
> > > FD-A320-REC-SIM-1
> > > MIA-FD-A320-REC-SIM-1
> > > SIN-FD-A320-REC-SIM-1
> > >
> > > For our customer this is wrong, because this exact phrase match
> > > query should only return the single entry FD-A320-REC-SIM-1.
> > >
> > > Do you have any ideas or tips on how we have to change our current
> > > analyzer to support this requirement?
> > >
> > > Thanks and kind regards
> > > Diego
>
> --
> --------------------------
>
> Benedetti Alessandro
> Visiting card - http://about.me/alessandro_benedetti
> Blog - http://alexbenedetti.blogspot.co.uk
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience - 1794 England
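
To make the behaviour discussed above concrete, here is a minimal plain-Java sketch. It is not Lucene code: `tokenize` is a hypothetical stand-in that only lowercases and splits on hyphens, so it ignores the extra tokens produced by PRESERVE_ORIGINAL and CATENATE_WORDS as well as token positions. Under that assumption it shows why the quoted query also hits the MIA- and SIN- entries (the query's parts occur contiguously inside them), and what an untokenized, whole-value comparison would return instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class PhraseMatchSketch {

    // Hypothetical, simplified analyzer: lowercase and split on hyphens only.
    public static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).split("-"));
    }

    // Phrase semantics: the query tokens must occur contiguously and in order.
    public static List<String> phraseHits(List<String> values, String query) {
        List<String> phrase = tokenize(query);
        List<String> hits = new ArrayList<>();
        for (String value : values) {
            if (Collections.indexOfSubList(tokenize(value), phrase) >= 0) {
                hits.add(value);
            }
        }
        return hits;
    }

    // Untokenized semantics: the whole lowercased value must equal the query.
    public static List<String> exactHits(List<String> values, String query) {
        String wanted = query.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String value : values) {
            if (value.toLowerCase(Locale.ROOT).equals(wanted)) {
                hits.add(value);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> indexed = Arrays.asList(
                "FD-A320-REC-SIM-1", "FD-A320-REC-SIM-10", "FD-A320-REC-SIM-11",
                "MIA-FD-A320-REC-SIM-1", "SIN-FD-A320-REC-SIM-1");
        // Tokenized phrase match: three hits, as reported in the thread.
        System.out.println(phraseHits(indexed, "FD-A320-REC-SIM-1"));
        // Whole-value match: only FD-A320-REC-SIM-1.
        System.out.println(exactHits(indexed, "FD-A320-REC-SIM-1"));
    }
}
```

In Lucene terms, one possible direction (an assumption on my side, not something confirmed in this thread) would be to index the value a second time in a field that is not split into parts and to send quoted input to that field, while unquoted input keeps using the analyzer above.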