[jira] Commented: (LUCENE-2003) Highlighter has problems when you use StandardAnalyzer with LUCENE_29 or simplier StopFilter with stopWordsPosIncr mode switched on

2009-10-23 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12769452#action_12769452
 ] 

Michael McCandless commented on LUCENE-2003:


Mark, is this one done?

> Highlighter has problems when you use StandardAnalyzer with LUCENE_29 or 
> simplier StopFilter with stopWordsPosIncr mode switched on
> ---
>
> Key: LUCENE-2003
> URL: https://issues.apache.org/jira/browse/LUCENE-2003
> Project: Lucene - Java
> Issue Type: Bug
> Affects Versions: 2.9, 3.0
> Reporter: Uwe Schindler
> Assignee: Mark Miller
> Fix For: 2.9.1, 3.0
>
> Attachments: LUCENE-2003.patch, LUCENE-2003.patch
>
>
> This is a follow-up to LUCENE-1987:
> If you change the constant static final Version TEST_VERSION = 
> Version.LUCENE_24 in HighlighterTest to LUCENE_29 or LUCENE_CURRENT, the test 
> testSimpleQueryScorerPhraseHighlighting fails. Note that currently 
> (before LUCENE-2002 is fixed), you must also set the QueryParser to respect 
> posIncr.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2003) Highlighter has problems when you use StandardAnalyzer with LUCENE_29 or simplier StopFilter with stopWordsPosIncr mode switched on

2009-10-22 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768866#action_12768866
 ] 

Mark Miller commented on LUCENE-2003:
-

bq. The "total" would almost seem to tip the ambiguity toward meaning that it's 
the total slop between all clauses.

Yeah, I think it needs to be changed. "Total" just appears to be wrong. Perhaps 
something more along the lines of:

Matches spans matching a span from each clause, with up to slop 
unmatched positions between each of them





[jira] Commented: (LUCENE-2003) Highlighter has problems when you use StandardAnalyzer with LUCENE_29 or simplier StopFilter with stopWordsPosIncr mode switched on

2009-10-22 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768863#action_12768863
 ] 

Yonik Seeley commented on LUCENE-2003:
--

bq. You only need to add to the slop the largest inc, because the SpanQuery 
slop is the dist allowed between each span.

Learn something new every day :-)

Is this javadoc incorrect, or simply ambiguous, or am I reading it wrong:
{code}
  /** Construct a SpanNearQuery.  Matches spans matching a span from each
   * clause, with up to slop total unmatched positions between
   * them.  When inOrder is true, the spans from each clause
   * must be ordered as in clauses. */
  public SpanNearQuery(SpanQuery[] clauses, int slop, boolean inOrder) {
    this(clauses, slop, inOrder, true);
  }
{code}

The "total" would almost seem to tip the ambiguity toward meaning that it's the 
total slop between all clauses.

> Highlighter has problems when you use StandardAnalyzer with LUCENE_29 or 
> simplier StopFilter with stopWordsPosIncr mode switched on
> ---
>
> Key: LUCENE-2003
> URL: https://issues.apache.org/jira/browse/LUCENE-2003
> Project: Lucene - Java
> Issue Type: Bug
> Affects Versions: 2.9, 3.0
> Reporter: Uwe Schindler
> Assignee: Michael McCandless
> Fix For: 2.9.1, 3.0
>
> Attachments: LUCENE-2003.patch, LUCENE-2003.patch
>
>
> This is a follow-up to LUCENE-1987:
> If you change the constant static final Version TEST_VERSION = 
> Version.LUCENE_24 in HighlighterTest to LUCENE_29 or LUCENE_CURRENT, the test 
> testSimpleQueryScorerPhraseHighlighting fails. Note that currently 
> (before LUCENE-2002 is fixed), you must also set the QueryParser to respect 
> posIncr.




[jira] Commented: (LUCENE-2003) Highlighter has problems when you use StandardAnalyzer with LUCENE_29 or simplier StopFilter with stopWordsPosIncr mode switched on

2009-10-22 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768862#action_12768862
 ] 

Mark Miller commented on LUCENE-2003:
-

Okay - I think this is the way to go. maxPos-minPos+1-numTokens is too much 
slop, because the slop only has to be the largest posInc. I forgot that's how 
SpanQueries work when I did the original patch.





[jira] Commented: (LUCENE-2003) Highlighter has problems when you use StandardAnalyzer with LUCENE_29 or simplier StopFilter with stopWordsPosIncr mode switched on

2009-10-22 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768853#action_12768853
 ] 

Mark Miller commented on LUCENE-2003:
-

Hmm - well now you have me worried - never seen you be wrong.

I just tried a test like that and it appeared to work though.

Ah - I should have looked closer at the MultiPhraseQuery code - it is wrong - 
just happens to work.

You only need to add to the slop the largest inc, because the SpanQuery slop is 
the dist allowed between *each* span.

So that's why it works - it finds 3 the first time and doesn't add any more for 
the rest, but 3 is enough. I'll fix it.
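
The "largest inc" idea can be sketched as follows, in plain Java with no Lucene dependency. This is only an illustration, not the committed fix: the class and method names are hypothetical, the positions are assumed to be sorted, and "inc" is taken here to mean the number of positions skipped between two consecutive terms.

```java
// Hypothetical helper, not Lucene API: picks the largest gap between
// consecutive phrase-term positions as the extra slop.
final class LargestGapSlop {

  /** Returns the largest number of skipped positions between neighbours. */
  static int largestGap(int[] positions) {
    int largest = 0;
    for (int i = 1; i < positions.length; i++) {
      // Adjacent terms differ by exactly 1, so the gap is the difference minus 1.
      int gap = positions[i] - positions[i - 1] - 1;
      largest = Math.max(largest, gap);
    }
    return largest;
  }

  public static void main(String[] args) {
    // "A ??? B ??? C ??? D": terms assumed at positions 0, 4, 8, 12.
    System.out.println(largestGap(new int[] {0, 4, 8, 12})); // 3
    // Gaps of 0, 1 and 2: only the largest one counts.
    System.out.println(largestGap(new int[] {0, 1, 3, 6}));  // 2
  }
}
```

If the slop really is applied between each pair of spans, as described above, then the largest single gap is all that needs to be added, however many gaps there are.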

> Highlighter has problems when you use StandardAnalyzer with LUCENE_29 or 
> simplier StopFilter with stopWordsPosIncr mode switched on
> ---
>
> Key: LUCENE-2003
> URL: https://issues.apache.org/jira/browse/LUCENE-2003
> Project: Lucene - Java
> Issue Type: Bug
> Affects Versions: 2.9, 3.0
> Reporter: Uwe Schindler
> Assignee: Michael McCandless
> Fix For: 2.9.1, 3.0
>
> Attachments: LUCENE-2003.patch
>
>
> This is a follow-up to LUCENE-1987:
> If you change the constant static final Version TEST_VERSION = 
> Version.LUCENE_24 in HighlighterTest to LUCENE_29 or LUCENE_CURRENT, the test 
> testSimpleQueryScorerPhraseHighlighting fails. Note that currently 
> (before LUCENE-2002 is fixed), you must also set the QueryParser to respect 
> posIncr.




[jira] Commented: (LUCENE-2003) Highlighter has problems when you use StandardAnalyzer with LUCENE_29 or simplier StopFilter with stopWordsPosIncr mode switched on

2009-10-22 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768843#action_12768843
 ] 

Yonik Seeley commented on LUCENE-2003:
--

Could you explain this part?
{code}
+  if (inc > lastInc) {
+    slop += inc;
+  }
{code}

Seems like that would cause "A ??? B ??? C ??? D" to only have a slop of 3 (? 
represents a gap of 1).

Couldn't slop just be maxPos-minPos+1-numTokens?
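
To make the question concrete, here is a rough sketch in plain Java that compares the two computations for the "A ??? B ??? C ??? D" case. It assumes the terms sit at positions 0, 4, 8, 12, that `inc` is the number of skipped positions between consecutive terms, and that `lastInc` is set to `inc` after every term. The loop around the quoted patch lines is not shown in this thread, so that part is a guess, and all names are hypothetical.

```java
// Hypothetical comparison, not Lucene code.
final class SlopComparison {

  /** Mimics the quoted patch lines inside an assumed surrounding loop. */
  static int patchStyleSlop(int[] positions) {
    int slop = 0;
    int lastInc = 0;
    for (int i = 1; i < positions.length; i++) {
      int inc = positions[i] - positions[i - 1] - 1; // assumed meaning of inc
      if (inc > lastInc) {
        slop += inc;
      }
      lastInc = inc; // assumption: the update is not part of the quoted snippet
    }
    return slop;
  }

  /** maxPos - minPos + 1 - numTokens, for sorted positions. */
  static int spanFormulaSlop(int[] positions) {
    if (positions.length == 0) {
      return 0;
    }
    int minPos = positions[0];
    int maxPos = positions[positions.length - 1];
    return maxPos - minPos + 1 - positions.length;
  }

  public static void main(String[] args) {
    int[] positions = {0, 4, 8, 12};
    System.out.println(patchStyleSlop(positions));  // 3
    System.out.println(spanFormulaSlop(positions)); // 9
  }
}
```

Under these assumptions the patch lines add only the first gap of 3, while the formula counts all nine skipped positions.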






[jira] Commented: (LUCENE-2003) Highlighter has problems when you use StandardAnalyzer with LUCENE_29 or simplier StopFilter with stopWordsPosIncr mode switched on

2009-10-22 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768829#action_12768829
 ] 

Mark Miller commented on LUCENE-2003:
-

Well, no crap - MultiPhraseQuery already does that. Someone else contributed 
that. Guess they are ahead of me - it would have saved some thought to look at 
it :)

> Highlighter has problems when you use StandardAnalyzer with LUCENE_29 or 
> simplier StopFilter with stopWordsPosIncr mode switched on
> ---
>
> Key: LUCENE-2003
> URL: https://issues.apache.org/jira/browse/LUCENE-2003
> Project: Lucene - Java
> Issue Type: Bug
> Affects Versions: 2.9, 3.0
> Reporter: Uwe Schindler
> Assignee: Michael McCandless
> Fix For: 2.9.1, 3.0
>
>
> This is a follow-up to LUCENE-1987:
> If you change the constant static final Version TEST_VERSION = 
> Version.LUCENE_24 in HighlighterTest to LUCENE_29 or LUCENE_CURRENT, the test 
> testSimpleQueryScorerPhraseHighlighting fails. Note that currently 
> (before LUCENE-2002 is fixed), you must also set the QueryParser to respect 
> posIncr.




[jira] Commented: (LUCENE-2003) Highlighter has problems when you use StandardAnalyzer with LUCENE_29 or simplier StopFilter with stopWordsPosIncr mode switched on

2009-10-22 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768827#action_12768827
 ] 

Mark Miller commented on LUCENE-2003:
-

Umm - it's hard to emulate the positions stuff from PhraseQuery with a 
SpanQuery. A limitation I hadn't really thought much of. Should be doc'd.

One - uh - sloppy fix - is to count up all of the extra positions and add that 
to the slop.

i.e., if the positions for a PhraseQuery are 0, 1, 3 (stop word removed at 2), 
you would add 1 to the slop. For 0, 1, 3, 5, add 2 to the slop.

I think that keeps a fairly good approximation.
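
A minimal sketch of the "count up the extra positions" idea above, in plain Java with no Lucene dependency. The class and method names are hypothetical and are not part of Lucene or of any patch on this issue; the input is assumed to be the sorted term positions of a phrase query.

```java
// Hypothetical helper, not Lucene API: models the "sloppy fix" described above.
final class ExtraPositionSlop {

  /**
   * Counts the positions skipped between consecutive phrase terms.
   * Assumes positions are sorted ascending with no duplicates.
   */
  static int extraPositions(int[] positions) {
    int extra = 0;
    for (int i = 1; i < positions.length; i++) {
      // Adjacent terms differ by 1, so anything beyond 1 is a skipped position.
      extra += positions[i] - positions[i - 1] - 1;
    }
    return extra;
  }

  public static void main(String[] args) {
    System.out.println(extraPositions(new int[] {0, 1, 3}));    // 1
    System.out.println(extraPositions(new int[] {0, 1, 3, 5})); // 2
  }
}
```

The returned value is what would be added on top of the phrase query's own slop.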

Haven't thought about how that would work with MultiPhraseQuery yet.

