[ https://issues.apache.org/jira/browse/SOLR-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13043399#comment-13043399 ]
Jan Høydahl commented on SOLR-1980:
-----------------------------------

I'm sure I can get it working the way I started, using a CharFilter. However, perhaps it's possible to implement this with a more generic, Lucene-like query syntax that uses position info from the index:
{code}
title:"quick fox"@N:M
{code}
This would mean that the phrase must be anchored between the N'th and M'th token positions in the field. Negative values for N/M would mean relative to the end. Thus "^quick fox$" could be written
{code}
title:"quick fox"@0:-0
{code}
Or, if you require the phrase to be within the first 10 words OR the last 10 words:
{code}
title:("quick fox"@0:10 OR "quick fox"@-10:-0)
{code}
Requiring a term to be exactly at position 3 would be:
{code}
title:fox@3:3
{code}

If this syntax is feasible, we could use the same syntax in eDisMax's pf param to tell it to add a position constraint when forming the pf part of the query:
{code}
pf=title@0:-0
{code}
This would only generate a phrase match on title if the phrase is an exact match of the whole field.

Potential issues with multi-valued fields? Is the field delimiter clearly marked, or is it only an increment gap?

Would it be easy to parse such a syntax and generate a Lucene query with the position constraints?

> Implement boundary match support
> --------------------------------
>
>         Key: SOLR-1980
>         URL: https://issues.apache.org/jira/browse/SOLR-1980
>     Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>    Reporter: Jan Høydahl
>
> Sometimes you need to specify that a query should match only at the start or
> end of a field, or be an exact match.
> Example content:
> 1) a quick fox is brown
> 2) quick fox is brown
> Example queries:
> "^quick fox" -> should only match 2)
> "brown$" -> should match 1) and 2)
> "^quick fox is brown$" -> should only match 2)
> The proposed implementation is a new BoundaryMatchTokenFilter
> which behaves like this:
> On the index side, it inserts special unique tokens at the beginning and end of
> the field. These could be some weird Unicode sequence.
> On the query side, it looks for a first character matching "^" or a last
> character matching "$" and replaces them with the special tokens.
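
For illustration, the sentinel-token behaviour described in the issue can be sketched in a few lines of Python. This is not Solr or Lucene code: the names START, END, index_tokens, query_tokens, phrase_match and search are hypothetical, the sentinels are arbitrary private-use Unicode characters, and plain whitespace tokenization stands in for a real analysis chain. It only shows that a contiguous phrase match over sentinel-wrapped tokens gives the results listed in the example queries.

```python
# Hypothetical sentinel tokens. The issue only says "some weird unicode
# sequence", so private-use code points are an assumption made here.
START = "\ue000"
END = "\ue001"


def index_tokens(text):
    """Index side: whitespace-tokenize and wrap the field in sentinel tokens."""
    return [START] + text.lower().split() + [END]


def query_tokens(query):
    """Query side: replace a leading '^' and a trailing '$' with the sentinels."""
    tokens = []
    if query.startswith("^"):
        tokens.append(START)
        query = query[1:]
    anchored_end = query.endswith("$")
    if anchored_end:
        query = query[:-1]
    tokens.extend(query.lower().split())
    if anchored_end:
        tokens.append(END)
    return tokens


def phrase_match(doc_tokens, phrase):
    """Return True if phrase occurs as a contiguous run inside doc_tokens."""
    n = len(phrase)
    return any(doc_tokens[i:i + n] == phrase
               for i in range(len(doc_tokens) - n + 1))


def search(docs, query):
    """Return the ids of all documents whose field matches the query phrase."""
    phrase = query_tokens(query)
    return [doc_id for doc_id, text in docs.items()
            if phrase_match(index_tokens(text), phrase)]


# The example content from the issue description.
docs = {1: "a quick fox is brown", 2: "quick fox is brown"}

if __name__ == "__main__":
    for q in ('^quick fox', 'brown$', '^quick fox is brown$'):
        print(q, "->", search(docs, q))
```

Because the anchors become ordinary tokens, an unanchored query such as "quick fox" still matches both documents; only queries that carry "^" or "$" are restricted to the field boundaries.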