[ https://issues.apache.org/jira/browse/LUCENE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562520#comment-14562520 ]
Michael McCandless commented on LUCENE-6459:
--------------------------------------------

Thanks [~areek], I committed the last patch!

I ran the PMD linter and it spotted a few issues:

{noformat}
lucene/suggest/src/java/org/apache/lucene/search/suggest/document/CompletionWeight.java:52: Avoid unused private fields such as 'maxWeight'.
lucene/suggest/src/java/org/apache/lucene/search/suggest/document/CompletionWeight.java:53: Avoid unused private fields such as 'minWeight'.
lucene/suggest/src/java/org/apache/lucene/search/suggest/document/NRTSuggester.java:97: Avoid unused private fields such as 'endByte'.
lucene/suggest/src/java/org/apache/lucene/search/suggest/document/NRTSuggester.java:173: Avoid unused local variables such as 'search'.
{noformat}

Can you look into them?

Also, the API to add ContextSuggestField is somewhat annoying with Java 7, e.g.:

{noformat}
document.add(new ContextSuggestField("context_suggest_field", Collections.<CharSequence>singletonList("type1"), "suggestion1", 4));
{noformat}

Maybe we can improve this to take varargs?

> [suggest] Query Interface for suggest API
> -----------------------------------------
>
>                 Key: LUCENE-6459
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6459
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/search
>    Affects Versions: 5.1
>            Reporter: Areek Zillur
>            Assignee: Areek Zillur
>             Fix For: Trunk, 5.x, 5.1
>
>         Attachments: LUCENE-6459.patch, LUCENE-6459.patch, LUCENE-6459.patch,
> LUCENE-6459.patch, LUCENE-6459.patch, LUCENE-6459.patch, LUCENE-6459.patch,
> LUCENE-6459.patch, LUCENE-6459.patch, LUCENE-6459.patch, LUCENE-6459.patch,
> LUCENE-6459.patch, LUCENE-6459.patch, LUCENE-6459.patch
>
>
> This patch factors out common indexing/search API used by the recently
> introduced [NRTSuggester|https://issues.apache.org/jira/browse/LUCENE-6339].
> The motivation is to provide a query interface for FST-based fields
> (*SuggestField* and *ContextSuggestField*)
> to enable suggestion scoring and more powerful automaton queries.
> Previously, only prefix ‘queries’ with index-time weights were supported, but
> we can also support:
> * Prefix queries expressed as regular expressions: get suggestions that match multiple prefixes
> ** *Example:* _star\[wa\|tr\]_ matches _starwars_ and _startrek_
> * Fuzzy Prefix queries supporting scoring: get typo-tolerant suggestions scored by how close they are to the query prefix
> ** *Example:* querying for _seper_ will score _separate_ higher than _superstitious_
> * Context Queries: get suggestions boosted and/or filtered based on their indexed contexts (metadata)
> ** *Boost example:* get typo-tolerant suggestions on song names with prefix _like a roling_, boosting songs with genre _rock_ and _indie_
> ** *Filter example:* get suggestions on all file names starting with _finan_ only for _user1_ and _user2_
>
> h3. Suggest API
> {code}
> SuggestIndexSearcher searcher = new SuggestIndexSearcher(reader);
> CompletionQuery query = ...
> TopSuggestDocs suggest = searcher.suggest(query, num);
> {code}
>
> h3. CompletionQuery
> *CompletionQuery* is used to query *SuggestField* and *ContextSuggestField*.
> A *CompletionQuery* produces a *CompletionWeight*,
> which allows *CompletionQuery* implementations to pass in an automaton that
> will be intersected with an FST and allows boosting and
> metadata extraction from the intersected partial paths. A *CompletionWeight*
> produces a *CompletionScorer*. A *CompletionScorer*
> executes a Top N search against the FST with the provided automaton, scoring
> and filtering all matched paths.
>
> h4. PrefixCompletionQuery
> Return documents with values that match the prefix of an analyzed term text.
> Documents are sorted according to their suggest field weight.
> {code}
> PrefixCompletionQuery(Analyzer analyzer, Term term)
> {code}
>
> h4. RegexCompletionQuery
> Return documents with values that match the prefix of a regular expression.
> Documents are sorted according to their suggest field weight.
> {code}
> RegexCompletionQuery(Term term)
> {code}
>
> h4. FuzzyCompletionQuery
> Return documents with values that have prefixes within a specified edit
> distance of an analyzed term text.
> Documents are ‘boosted’ by the number of matching prefix letters of the
> suggestion with respect to the original term text.
> {code}
> FuzzyCompletionQuery(Analyzer analyzer, Term term)
> {code}
>
> h5. Scoring
> {{suggestion_weight * boost}}
> where {{suggestion_weight}} and {{boost}} are both integers.
> {{boost = # of prefix characters matched}}
>
> h4. ContextQuery
> Return documents that match a {{CompletionQuery}} filtered and/or boosted by
> provided context(s).
> {code}
> ContextQuery(CompletionQuery query)
> contextQuery.addContext(CharSequence context, int boost, boolean exact)
> {code}
> *NOTE:* {{ContextQuery}} should be used with {{ContextSuggestField}} to query
> suggestions boosted and/or filtered by contexts.
> Running {{ContextQuery}} against a {{SuggestField}} will error out.
>
> h5. Scoring
> {{suggestion_weight * context_boost}}
> where {{suggestion_weight}} and {{context_boost}} are both integers.
> When used with {{FuzzyCompletionQuery}},
> {{suggestion_weight * (context_boost + fuzzy_boost)}}
>
> h3. Context Suggest Field
> To use {{ContextQuery}}, use {{ContextSuggestField}} instead of
> {{SuggestField}}. Any {{CompletionQuery}} can be used with
> {{ContextSuggestField}}; the default behaviour is to return suggestions from
> *all* contexts. {{Context}} for every completion hit
> can be accessed through {{SuggestScoreDoc#context}}.
> {code}
> ContextSuggestField(String name, Collection<CharSequence> contexts, String value, int weight)
> {code}
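
To make the varargs idea from the comment concrete, here is a rough sketch of a helper that turns a varargs list of contexts into the {{Collection<CharSequence>}} that the constructor above expects. {{ContextArgs}} and {{contexts(...)}} are hypothetical names used only for illustration and are not part of Lucene; the actual fix would more likely be a varargs overload on {{ContextSuggestField}} itself.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical helper, not part of Lucene: shows how a varargs parameter
// avoids the explicit type witness that Java 7 needs for
// Collections.<CharSequence>singletonList("type1").
final class ContextArgs {

    private ContextArgs() {}

    // Because the parameter type is CharSequence..., String arguments are
    // packed into a CharSequence[] and the result is already typed as
    // Collection<CharSequence>, with no generic hint at the call site.
    static Collection<CharSequence> contexts(CharSequence... values) {
        return Arrays.asList(values);
    }

    public static void main(String[] args) {
        Collection<CharSequence> single = contexts("type1");
        Collection<CharSequence> several = contexts("type1", "type2");
        System.out.println(single.size());   // prints 1
        System.out.println(several);         // prints [type1, type2]
    }
}
```

With such a helper, the call from the comment would read {{new ContextSuggestField("context_suggest_field", ContextArgs.contexts("type1"), "suggestion1", 4)}}, assuming the constructor signature quoted above.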