[jira] [Commented] (LUCENE-6459) [suggest] Query Interface for suggest API

Michael McCandless (JIRA) Thu, 28 May 2015 01:41:33 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562520#comment-14562520 ]


Michael McCandless commented on LUCENE-6459:
--------------------------------------------

Thanks [~areek], I committed the last patch!

I ran the PMD linter and it flagged a few issues:

{noformat}
lucene/suggest/src/java/org/apache/lucene/search/suggest/document/CompletionWeight.java:52:
    Avoid unused private fields such as 'maxWeight'.
lucene/suggest/src/java/org/apache/lucene/search/suggest/document/CompletionWeight.java:53:
    Avoid unused private fields such as 'minWeight'.

lucene/suggest/src/java/org/apache/lucene/search/suggest/document/NRTSuggester.java:97:
    Avoid unused private fields such as 'endByte'.
lucene/suggest/src/java/org/apache/lucene/search/suggest/document/NRTSuggester.java:173:
    Avoid unused local variables such as 'search'.
{noformat}

Can you look into them?

Also, the API to add a ContextSuggestField is somewhat annoying with
Java 7, e.g.:

{noformat}
    document.add(new ContextSuggestField("context_suggest_field",
        Collections.<CharSequence>singletonList("type1"), "suggestion1", 4));
{noformat}

Maybe we can improve this to take varargs?
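
As a rough sketch of what a varargs overload might look like: Java requires the varargs parameter to come last, so the contexts would have to move to the end of the parameter list. {{ContextFieldSketch}} below is a hypothetical stand-in class written only for illustration. It is not the Lucene class, and the overload's parameter order is an assumption.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical stand-in for illustration only; this is NOT the Lucene class.
class ContextFieldSketch {
    final String name;
    final Collection<CharSequence> contexts;
    final String value;
    final int weight;

    // Mirrors the parameter order quoted in the issue description.
    ContextFieldSketch(String name, Collection<CharSequence> contexts, String value, int weight) {
        this.name = name;
        this.contexts = contexts;
        this.value = value;
        this.weight = weight;
    }

    // Assumed varargs overload: the varargs parameter must be last in Java,
    // so the contexts move to the end of the parameter list.
    ContextFieldSketch(String name, String value, int weight, CharSequence... contexts) {
        this(name, Arrays.<CharSequence>asList(contexts), value, weight);
    }
}
```

With an overload along these lines, the call site would no longer need the explicit type witness, e.g. {{new ContextFieldSketch("context_suggest_field", "suggestion1", 4, "type1")}}.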


> [suggest] Query Interface for suggest API
> -----------------------------------------
>
>                 Key: LUCENE-6459
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6459
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/search
>    Affects Versions: 5.1
>            Reporter: Areek Zillur
>            Assignee: Areek Zillur
>             Fix For: Trunk, 5.x, 5.1
>
>         Attachments: LUCENE-6459.patch, LUCENE-6459.patch, LUCENE-6459.patch, 
> LUCENE-6459.patch, LUCENE-6459.patch, LUCENE-6459.patch, LUCENE-6459.patch, 
> LUCENE-6459.patch, LUCENE-6459.patch, LUCENE-6459.patch, LUCENE-6459.patch, 
> LUCENE-6459.patch, LUCENE-6459.patch, LUCENE-6459.patch
>
>
> This patch factors out common indexing/search API used by the recently 
> introduced [NRTSuggester|https://issues.apache.org/jira/browse/LUCENE-6339]. 
> The motivation is to provide a query interface for FST-based fields 
> (*SuggestField* and *ContextSuggestField*) 
> to enable suggestion scoring and more powerful automaton queries. 
> Previously, only prefix ‘queries’ with index-time weights were supported but 
> we can also support:
> * Prefix queries expressed as regular expressions:  get suggestions that 
> match multiple prefixes
>       ** *Example:* _star\[wa\|tr\]_ matches _starwars_ and _startrek_
> * Fuzzy Prefix queries supporting scoring: get typo tolerant suggestions 
> scored by how close they are to the query prefix
>     ** *Example:* querying for _seper_ will score _separate_ higher than 
> _superstitious_
> * Context Queries: get suggestions boosted and/or filtered based on their 
> indexed contexts (metadata)
>     ** *Boost example:* get typo tolerant suggestions on song names with 
> prefix _like a roling_ boosting songs with 
> genre _rock_ and _indie_
>     ** *Filter example:* get suggestions for all file names starting with 
> _finan_ only for _user1_ and _user2_
> h3. Suggest API
> {code}
> SuggestIndexSearcher searcher = new SuggestIndexSearcher(reader);
> CompletionQuery query = ...
> TopSuggestDocs suggest = searcher.suggest(query, num);
> {code}
> h3. CompletionQuery
> *CompletionQuery* is used to query *SuggestField* and *ContextSuggestField*. 
> A *CompletionQuery* produces a *CompletionWeight*, 
> which allows *CompletionQuery* implementations to pass in an automaton that 
> will be intersected with an FST and allows boosting and 
> metadata extraction from the intersected partial paths. A *CompletionWeight* 
> produces a *CompletionScorer*. A *CompletionScorer* 
> executes a Top N search against the FST with the provided automaton, scoring 
> and filtering all matched paths. 
> h4. PrefixCompletionQuery
> Return documents with values that match the prefix of an analyzed term text. 
> Documents are sorted according to their suggest field weight. 
> {code}
> PrefixCompletionQuery(Analyzer analyzer, Term term)
> {code}
> h4. RegexCompletionQuery
> Return documents with values that match the prefix of a regular expression.
> Documents are sorted according to their suggest field weight.
> {code}
> RegexCompletionQuery(Term term)
> {code}
> h4. FuzzyCompletionQuery
> Return documents with values that have prefixes within a specified edit 
> distance of an analyzed term text.
> Documents are ‘boosted’ by the number of matching prefix letters of the 
> suggestion with respect to the original term text.
> {code}
> FuzzyCompletionQuery(Analyzer analyzer, Term term)
> {code}
> h5. Scoring
> {{suggestion_weight * boost}}
> where {{suggestion_weight}} and {{boost}} are both integers. 
> {{boost = # of prefix characters matched}}
> h4. ContextQuery
> Return documents that match a {{CompletionQuery}} filtered and/or boosted by 
> provided context(s). 
> {code}
> ContextQuery(CompletionQuery query)
> contextQuery.addContext(CharSequence context, int boost, boolean exact)
> {code}
> *NOTE:* {{ContextQuery}} should be used with {{ContextSuggestField}} to query 
> suggestions boosted and/or filtered by contexts.
> Running {{ContextQuery}} against a {{SuggestField}} will error out.
> h5. Scoring
> {{suggestion_weight  * context_boost}}
> where {{suggestion_weight}} and {{context_boost}} are both integers.
> When used with {{FuzzyCompletionQuery}},
> {{suggestion_weight * (context_boost + fuzzy_boost)}}
> h3. Context Suggest Field
> To use {{ContextQuery}}, use {{ContextSuggestField}} instead of 
> {{SuggestField}}. Any {{CompletionQuery}} can be used with 
> {{ContextSuggestField}}; the default behaviour is to return suggestions from 
> *all* contexts. The {{Context}} for every completion hit 
> can be accessed through {{SuggestScoreDoc#context}}.
> {code}
> ContextSuggestField(String name, Collection<CharSequence> contexts, String 
> value, int weight) 
> {code}
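
The scoring rules quoted in the description above can be worked through as plain arithmetic. The sketch below is a standalone illustration, not code from the patch: the class and method names are hypothetical, and it assumes that "# of prefix characters matched" means the length of the common leading run between the query text and the suggestion. Products are widened to {{long}} only to avoid overflow in the example.

```java
// Hypothetical helper for illustration only; not part of the Lucene patch.
class SuggestScoringSketch {

    // Assumption: the fuzzy boost is the number of leading characters
    // shared by the query text and the suggestion.
    static int fuzzyBoost(String queryPrefix, String suggestion) {
        int limit = Math.min(queryPrefix.length(), suggestion.length());
        int matched = 0;
        while (matched < limit && queryPrefix.charAt(matched) == suggestion.charAt(matched)) {
            matched++;
        }
        return matched;
    }

    // FuzzyCompletionQuery: suggestion_weight * boost
    static long fuzzyScore(int suggestionWeight, int boost) {
        return (long) suggestionWeight * boost;
    }

    // ContextQuery: suggestion_weight * context_boost
    static long contextScore(int suggestionWeight, int contextBoost) {
        return (long) suggestionWeight * contextBoost;
    }

    // ContextQuery wrapping a FuzzyCompletionQuery:
    // suggestion_weight * (context_boost + fuzzy_boost)
    static long contextFuzzyScore(int suggestionWeight, int contextBoost, int fuzzyBoost) {
        return (long) suggestionWeight * (contextBoost + fuzzyBoost);
    }
}
```

Under that assumption, the query _seper_ shares three leading characters with _separate_ and one with _superstitious_, so with equal index-time weights _separate_ gets the higher score, in line with the example in the description.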



