[ 
https://issues.apache.org/jira/browse/LUCENE-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906751#comment-13906751
 ] 

ASF GitHub Bot commented on LUCENE-5205:
----------------------------------------

GitHub user PaulElschot opened a pull request:

    https://github.com/apache/lucene-solr/pull/38

    Deprecate Surround parser 

    LUCENE-5205
    
    Nothing much to say :)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/PaulElschot/lucene-solr depr-surround

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucene-solr/pull/38.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #38
    
----
commit 44b504f070c888bf7124939d352b2f47f77f2e8e
Author: Robert Muir <rm...@apache.org>
Date:   2014-02-19T23:03:33Z

    LUCENE-5205: create branch
    
    git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene5205@1569953 13f79535-47bb-0310-9956-ffa450edef68

commit 97bebc0896fcb47c3f8059c32b25cc288e84b53d
Author: Robert Muir <rm...@apache.org>
Date:   2014-02-19T23:20:19Z

    LUCENE-5205: latest patch, synced to trunk, whitespace and braces and javadocs fixes only
    
    git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene5205@1569969 13f79535-47bb-0310-9956-ffa450edef68

commit cfb1f36875fd3643c84b3c495641692f6c5f9a26
Author: Robert Muir <rm...@apache.org>
Date:   2014-02-20T03:09:47Z

    LUCENE-5205: clean up some test code dup
    
    git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene5205@1570063 13f79535-47bb-0310-9956-ffa450edef68

commit 7d3974d2cd4645a8933e4a667459f9baf0cda34f
Author: Robert Muir <rm...@apache.org>
Date:   2014-02-20T03:18:33Z

    LUCENE-5205: reduce visibility of internal classes to pkg-private
    
    git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene5205@1570066 13f79535-47bb-0310-9956-ffa450edef68

commit e5dda4a2e13dc9419f61871561a5ca9bb54f9940
Author: Robert Muir <rm...@apache.org>
Date:   2014-02-20T03:29:28Z

    LUCENE-5205: clean up some craziness in these asserts
    
    git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene5205@1570073 13f79535-47bb-0310-9956-ffa450edef68

commit 020c91223f534a6ccb61745ff90bc273021bd60a
Author: Robert Muir <rm...@apache.org>
Date:   2014-02-20T03:37:58Z

    LUCENE-5205: clean up formatting, wildcard imports
    
    git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene5205@1570074 13f79535-47bb-0310-9956-ffa450edef68

commit 53a9649d778e5887796980b51dfd6b644c27b443
Author: Robert Muir <rm...@apache.org>
Date:   2014-02-20T03:43:50Z

    LUCENE-5205: uncomment+fix broken assertions
    
    git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene5205@1570075 13f79535-47bb-0310-9956-ffa450edef68

commit 068a17bf03620ccd4f25f619ff118495023f8ef5
Author: Robert Muir <rm...@apache.org>
Date:   2014-02-20T03:51:33Z

    LUCENE-5205: nuke ancient commented-out stuff, clean up exception tests
    
    git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene5205@1570076 13f79535-47bb-0310-9956-ffa450edef68

commit 02c98224356d6b0b41acb572d559ab0f4e9dd716
Author: Robert Muir <rm...@apache.org>
Date:   2014-02-20T04:04:27Z

    LUCENE-5205: clean up dead code, asserts, remove warnings
    
    git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene5205@1570080 13f79535-47bb-0310-9956-ffa450edef68

commit a5921753a832b79e6c11e51f1af9ad6aa1d47c96
Author: Paul Elschot <paul.j.elsc...@gmail.com>
Date:   2014-02-20T08:11:25Z

    Deprecate Surround parser, use SpanQueryParser instead.

----


> [PATCH] SpanQueryParser with recursion, analysis and syntax very similar to 
> classic QueryParser
> -----------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-5205
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5205
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/queryparser
>            Reporter: Tim Allison
>              Labels: patch
>             Fix For: 4.7
>
>         Attachments: LUCENE-5205.patch.gz, LUCENE-5205.patch.gz, 
> LUCENE_5205.patch, SpanQueryParser_v1.patch.gz, patch.txt
>
>
> This parser extends QueryParserBase and includes functionality from:
> * Classic QueryParser: most of its syntax
> * SurroundQueryParser: recursive parsing for "near" and "not" clauses.
> * ComplexPhraseQueryParser: can handle "near" queries that include multiterms 
> (wildcard, fuzzy, regex, prefix).
> * AnalyzingQueryParser: has an option to analyze multiterms.
> At a high level, there's a first pass BooleanQuery/field parser and then a 
> span query parser handles all terminal nodes and phrases.
> Same as classic syntax:
> * term: test 
> * fuzzy: roam~0.8, roam~2
> * wildcard: te?t, test*, t*st
> * regex: /[mb]oat/
> * phrase: "jakarta apache"
> * phrase with slop: "jakarta apache"~3
> * default "or" clause: jakarta apache
> * grouping "or" clause: (jakarta apache)
> * boolean and +/-: (lucene OR apache) NOT jakarta; +lucene +apache -jakarta
> * multiple fields: title:lucene author:hatcher
>  
> Main additions in SpanQueryParser syntax vs. classic syntax:
> * Can require "in order" for phrases with slop with the ~> operator: 
> "jakarta apache"~>3
> * Can specify "not near": "fever bieber"!~3,10 ::
>     find "fever" but not if "bieber" appears within 3 words before or 10 
> words after it.
> * Fully recursive phrasal queries with [ and ]; as in: [[jakarta 
> apache]~3 lucene]~>4 :: 
>     find "jakarta" within 3 words of "apache", and that hit has to be within 
> four words before "lucene"
> * Can also use [] for single level phrasal queries instead of " as in: 
> [jakarta apache]
> * Can use "or grouping" clauses in phrasal queries: "apache (lucene solr)"~3 
> :: find "apache" and then either "lucene" or "solr" within three words.
> * Can use multiterms in phrasal queries: "jakarta~1 ap*che"~2
> * Did I mention full recursion: [[jakarta~1 ap*che]~2 (solr~ 
> /l[ou]+[cs][en]+/)]~10 :: Find something like "jakarta" within two 
> words of "ap*che" and that hit has to be within ten words of something like 
> "solr" or that "lucene" regex.
> * Can require at least x number of hits at boolean level: apache AND (lucene 
> solr tika)~2
> * Can use negative only query: -jakarta :: Find all docs that don't contain 
> "jakarta"
> * Can use an edit distance > 2 for fuzzy query via SlowFuzzyQuery (beware of 
> potential performance issues!).
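
To make the "not near" window concrete, here is a minimal Python sketch of the semantics as the list above describes them: keep a hit on "fever" unless "bieber" occurs within 3 tokens before or 10 tokens after it. It works on a plain token list and is not how the parser implements it. The name `not_near` and its parameters are hypothetical, and treating the window boundaries as inclusive is an assumption.

```python
from typing import List


def not_near(tokens: List[str], include: str, exclude: str,
             pre: int, post: int) -> List[int]:
    """Return positions of `include` that have no `exclude` token within
    `pre` positions before or `post` positions after (inclusive bounds,
    which is an assumption of this sketch)."""
    exclude_positions = [i for i, tok in enumerate(tokens) if tok == exclude]
    hits = []
    for i, tok in enumerate(tokens):
        if tok != include:
            continue
        # The hit is blocked if any excluded token falls inside the window.
        blocked = any(i - pre <= j <= i + post for j in exclude_positions)
        if not blocked:
            hits.append(i)
    return hits
```

For the query "fever bieber"!~3,10 this would be called as `not_near(tokens, "fever", "bieber", 3, 10)`. The window is asymmetric, so a "bieber" five tokens before "fever" does not block the hit, while one five tokens after it does.
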
> Trivial additions:
> * Can specify prefix length in fuzzy queries: jakarta~1,2 (edit distance = 1, 
> prefix = 2)
> * Can specify Optimal String Alignment (OSA) vs Levenshtein for distance 
> <=2: jakarta~1 (OSA) vs jakarta~>1 (Levenshtein)
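
The fuzzy suffixes listed above can be read as a tiny grammar: term, `~`, an optional `>`, a distance, and an optional `,prefix`. The following Python sketch parses that shape to show how the pieces relate. It is an illustration only, not the parser's code: `FuzzySpec` and `parse_fuzzy` are hypothetical names, the default prefix length of 0 is an assumption, and the bare `solr~` form (no distance) is deliberately not handled.

```python
import re
from typing import NamedTuple, Optional


class FuzzySpec(NamedTuple):
    term: str
    distance: float        # edit distance, or a classic similarity such as 0.8
    prefix_length: int     # assumed to default to 0 when omitted
    transpositions: bool   # True = OSA ("~"), False = Levenshtein ("~>")


_FUZZY = re.compile(
    r"^(?P<term>[^~\s]+)~(?P<lev>>)?"
    r"(?P<dist>\d+(?:\.\d+)?)(?:,(?P<prefix>\d+))?$"
)


def parse_fuzzy(token: str) -> Optional[FuzzySpec]:
    """Parse tokens such as 'jakarta~1,2' or 'jakarta~>1'.
    Returns None when the token does not have this shape."""
    m = _FUZZY.match(token)
    if m is None:
        return None
    return FuzzySpec(
        term=m.group("term"),
        distance=float(m.group("dist")),
        prefix_length=int(m.group("prefix") or 0),
        transpositions=m.group("lev") is None,
    )
```

With this reading, `parse_fuzzy("jakarta~1,2")` gives distance 1, prefix length 2, and OSA, while `parse_fuzzy("jakarta~>1")` gives distance 1 with plain Levenshtein.
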
> This parser can be very useful for concordance tasks (see also LUCENE-5317 
> and LUCENE-5318) and for analytical search.  
> Until LUCENE-2878 is closed, this might have a use for fans of SpanQuery.
> Most of the documentation is in the javadoc for SpanQueryParser.
> Any and all feedback is welcome.  Thank you.
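
For readers who want to see what "fully recursive" means for the bracket syntax in the description above, here is a small Python sketch that turns a query like [[jakarta apache]~3 lucene]~>4 into a nested structure. It is a toy recursive-descent parser written for this note, not the SpanQueryParser implementation. `Near` and `parse` are hypothetical names, and only brackets, bare terms, and the `~N` / `~>N` suffixes are handled (no quotes, no parenthesised "or" groups, no regexes containing brackets). Treating a bracket group without a suffix as an exact, in-order phrase is an assumption.

```python
import re
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Near:
    clauses: List[Union["Near", str]]
    slop: int = 0
    in_order: bool = False


_TOKEN = re.compile(
    r"(?P<open>\[)"
    r"|(?P<close>\](?:~(?P<ordered>>)?(?P<slop>\d+))?)"
    r"|(?P<term>[^\s\[\]]+)"
)


def parse(query: str) -> Near:
    tokens = list(_TOKEN.finditer(query))
    pos = 0

    def parse_group() -> Near:
        # Called just after an opening bracket has been consumed.
        nonlocal pos
        clauses: List[Union[Near, str]] = []
        while pos < len(tokens):
            tok = tokens[pos]
            pos += 1
            if tok.group("open"):
                clauses.append(parse_group())   # recurse into nested group
            elif tok.group("close"):
                slop = tok.group("slop")
                # Assumption: no suffix means an exact in-order phrase.
                in_order = tok.group("ordered") is not None or slop is None
                return Near(clauses, int(slop or 0), in_order)
            else:
                clauses.append(tok.group("term"))
        raise ValueError("unbalanced '['")

    if not tokens or not tokens[0].group("open"):
        raise ValueError("query must start with '['")
    pos = 1
    result = parse_group()
    if pos != len(tokens):
        raise ValueError("unexpected input after closing ']'")
    return result
```

Parsing [[jakarta apache]~3 lucene]~>4 yields an outer ordered group with slop 4 whose first clause is itself an unordered group with slop 3. Multiterms such as jakarta~1 or ap*che pass through as raw term strings in this sketch.
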



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
