[jira] Commented: (LUCENE-2287) Unexpected terms are highlighted within nested SpanQuery instances

2011-02-25 Thread Salman Akram (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12999311#comment-12999311
 ] 

Salman Akram commented on LUCENE-2287:
--

Hi,

It seems the last patch was committed with a couple of failures still 
outstanding. Any update on this? Do you think this is still better than the 
default highlighter?

Thanks!

 Unexpected terms are highlighted within nested SpanQuery instances
 --

 Key: LUCENE-2287
 URL: https://issues.apache.org/jira/browse/LUCENE-2287
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/highlighter
Affects Versions: 2.9.1
 Environment: Linux, Solaris, Windows
Reporter: Michael Goddard
Priority: Minor
 Attachments: LUCENE-2287.patch, LUCENE-2287.patch, LUCENE-2287.patch, 
 LUCENE-2287.patch, LUCENE-2287.patch, LUCENE-2287.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 I haven't yet been able to resolve why I'm seeing spurious highlighting in 
 nested SpanQuery instances.  Briefly, the issue is illustrated by the second 
 instance of Lucene being highlighted in the test below, when it doesn't 
 satisfy the inner span.  There's been some discussion about this on the 
 java-dev list, and I'm opening this issue now because I have made some 
 initial progress on this.
 This new test, added to the  HighlighterTest class in lucene_2_9_1, 
 illustrates this:
 /*
  * Ref: http://www.lucidimagination.com/blog/2009/07/18/the-spanquery/
  */
 public void testHighlightingNestedSpans2() throws Exception {
   String theText = "The Lucene was made by Doug Cutting and Lucene great Hadoop was"; // Problem
   //String theText = "The Lucene was made by Doug Cutting and the great Hadoop was"; // Works okay
   String fieldName = "SOME_FIELD_NAME";
   // Inner span: "lucene" followed by "doug", in order, slop 5
   SpanNearQuery spanNear = new SpanNearQuery(new SpanQuery[] {
       new SpanTermQuery(new Term(fieldName, "lucene")),
       new SpanTermQuery(new Term(fieldName, "doug")) }, 5, true);
   // Outer span: the inner span followed by "hadoop", in order, slop 4
   Query query = new SpanNearQuery(new SpanQuery[] { spanNear,
       new SpanTermQuery(new Term(fieldName, "hadoop")) }, 4, true);
   String expected = "The <B>Lucene</B> was made by <B>Doug</B> Cutting and Lucene great <B>Hadoop</B> was";
   //String expected = "The <B>Lucene</B> was made by <B>Doug</B> Cutting and the great <B>Hadoop</B> was";
   String observed = highlightField(query, fieldName, theText);
   System.out.println("Expected: \"" + expected + "\n" + "Observed: \"" + observed);
   assertEquals("Why is that second instance of the term \"Lucene\" highlighted?", expected, observed);
 }
 Is this an issue that's arisen before?  I've been reading through the source 
 to QueryScorer, WeightedSpanTerm, WeightedSpanTermExtractor, Spans, and 
 NearSpansOrdered, but haven't found the solution yet.  Initially, I thought 
 that the extractWeightedSpanTerms method in WeightedSpanTermExtractor should 
 be called on each clause of a SpanNearQuery or SpanOrQuery, but that didn't 
 get me too far.
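
 The behaviour described above can be pictured with a small model that does 
 not use Lucene at all. This is only a sketch under stated assumptions: the 
 class and method names are hypothetical, tokens come from a plain whitespace 
 split, and the "range-based" strategy is just one plausible way a spurious 
 highlight could arise, not a claim about what WeightedSpanTermExtractor 
 actually does. It contrasts highlighting every query term that falls inside 
 the outer span's position range with highlighting only the positions that 
 took part in a match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical, Lucene-free model of nested ordered span matching.
public class NestedSpanSketch {

    /** A match covering token positions [start, end), plus the exact positions of its terms. */
    static final class Span {
        final int start;
        final int end;
        final TreeSet<Integer> termPositions = new TreeSet<>();

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** One single-token span per occurrence of term (case-insensitive). */
    static List<Span> term(String[] tokens, String term) {
        List<Span> out = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equalsIgnoreCase(term)) {
                Span s = new Span(i, i + 1);
                s.termPositions.add(i);
                out.add(s);
            }
        }
        return out;
    }

    /** Ordered near: a span from first, then a span from second, with at most slop tokens between. */
    static List<Span> orderedNear(List<Span> first, List<Span> second, int slop) {
        List<Span> out = new ArrayList<>();
        for (Span a : first) {
            for (Span b : second) {
                int gap = b.start - a.end;
                if (gap >= 0 && gap <= slop) {
                    Span s = new Span(a.start, b.end);
                    s.termPositions.addAll(a.termPositions);
                    s.termPositions.addAll(b.termPositions);
                    out.add(s);
                }
            }
        }
        return out;
    }

    /** Range-based strategy: any query term located inside a matched range is highlighted. */
    static TreeSet<Integer> highlightByRange(String[] tokens, List<Span> matches, String... queryTerms) {
        TreeSet<Integer> out = new TreeSet<>();
        for (Span m : matches) {
            for (int i = m.start; i < m.end; i++) {
                for (String q : queryTerms) {
                    if (tokens[i].equalsIgnoreCase(q)) {
                        out.add(i);
                    }
                }
            }
        }
        return out;
    }

    /** Exact strategy: only positions that actually took part in a match are highlighted. */
    static TreeSet<Integer> highlightExact(List<Span> matches) {
        TreeSet<Integer> out = new TreeSet<>();
        for (Span m : matches) {
            out.addAll(m.termPositions);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] tokens = "The Lucene was made by Doug Cutting and Lucene great Hadoop was".split(" ");
        List<Span> inner = orderedNear(term(tokens, "lucene"), term(tokens, "doug"), 5);
        List<Span> outer = orderedNear(inner, term(tokens, "hadoop"), 4);
        System.out.println(highlightByRange(tokens, outer, "lucene", "doug", "hadoop")); // [1, 5, 8, 10]
        System.out.println(highlightExact(outer));                                       // [1, 5, 10]
    }
}
```

 In this model the second "Lucene" (position 8) follows "Doug", so it cannot 
 start an ordered inner match, yet it still lies inside the outer range 
 [1, 11) and is picked up by the range-based strategy.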

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-02-03 Thread Salman Akram (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12990038#comment-12990038
 ] 

Salman Akram commented on SOLR-1604:


Reminder! Any updates regarding integration with CommonGrams? Thanks

 Wildcards, ORs etc inside Phrase Queries
 

 Key: SOLR-1604
 URL: https://issues.apache.org/jira/browse/SOLR-1604
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
Reporter: Ahmet Arslan
Priority: Minor
 Fix For: Next

 Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, 
 ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604.patch


 Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports 
 wildcards, ORs, ranges, fuzzies inside phrase queries.




[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-01-24 Thread Salman Akram (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12985771#action_12985771
 ] 

Salman Akram commented on SOLR-1604:


Any updates on integration with CommonGrams? Thanks




[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-01-21 Thread Salman Akram (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12984786#action_12984786
 ] 

Salman Akram commented on SOLR-1604:


Ahmet,

I am waiting for your response on CommonGrams. I would be grateful if you 
could look into it this weekend. Thanks!




[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-01-21 Thread Salman Akram (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12984820#action_12984820
 ] 

Salman Akram commented on SOLR-1604:


I will ask this question on the mailing list as well, but since it is related 
to this patch I wanted to check here: would this patch work with 
SurroundQueryParser, or does Surround already provide this functionality 
itself? The functionality in this patch is really important for me.

Thanks a lot!




[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-01-18 Thread Salman Akram (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983208#action_12983208
 ] 

Salman Akram commented on SOLR-1604:


I tried the patch with the latest non-grayed file, but inOrder still doesn't 
seem to have any impact.

Results for "a b"~5 and "b a"~5 are still different.
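
To make the expectation concrete, here is a minimal Lucene-free sketch of what 
an ordered versus an unordered proximity check would mean. The class and method 
names are hypothetical and the gap counting is a simplification; it does not 
reproduce how Lucene or the patch actually counts slop for sloppy phrases.

```java
// Hypothetical model of ordered vs. unordered proximity on whitespace tokens.
public class InOrderSketch {

    /**
     * True if x and y occur with at most slop tokens between them.
     * When inOrder is true, x must come before y.
     */
    static boolean near(String[] tokens, String x, String y, int slop, boolean inOrder) {
        for (int i = 0; i < tokens.length; i++) {
            if (!tokens[i].equals(x)) {
                continue;
            }
            for (int j = 0; j < tokens.length; j++) {
                if (i == j || !tokens[j].equals(y)) {
                    continue;
                }
                if (inOrder && j < i) {
                    continue; // y occurs before x, which is not allowed when ordered
                }
                int gap = Math.abs(j - i) - 1; // tokens strictly between x and y
                if (gap <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] doc = "b x a".split(" ");
        System.out.println(near(doc, "a", "b", 5, true));  // false: "a" never precedes "b"
        System.out.println(near(doc, "b", "a", 5, true));  // true
        System.out.println(near(doc, "a", "b", 5, false)); // true
        System.out.println(near(doc, "b", "a", 5, false)); // true
    }
}
```

With an unordered check, both query forms agree on the document "b x a", which 
is the behaviour I was expecting from inOrder=false.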

Also, any feedback about CommonGrams integration?

Thanks a lot for all the help!




[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-01-17 Thread Salman Akram (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982651#action_12982651
 ] 

Salman Akram commented on SOLR-1604:


I am trying to use CommonGrams with this patch, but it doesn't seem to work. 

If I don't add {!complexphrase}, it uses CommonGramsQueryFilterFactory and 
proper bi-grams are made, but of course the patch isn't used.

If I add {!complexphrase}, it simply does it the old way, i.e. ignores 
CommonGrams.

Can you please help me combine these two features?






[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-01-12 Thread Salman Akram (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12980660#action_12980660
 ] 

Salman Akram commented on SOLR-1604:


I integrated the patch and it's working fine; however, there were a couple of 
issues. One is already resolved with the above un-ordered proximity parameters.

The issue is that although proximity search works with phrases, it's not very 
accurate. E.g. if I want to search for "a b" within 10 words of "c", the query 
would end up being "a b c"~10, but this will also return cases where "a" is not 
necessarily together with "b". Any scenario where these 3 words are within 10 
words of each other will match.

Is it possible in SOLR to do what I mentioned above? Any other patch? Something 
like  a b c ~10...
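
To illustrate the difference I mean, here is a minimal Lucene-free sketch. The 
class and method names are hypothetical, matching is ordered only, and the slop 
counting is simplified, so it is a model of the two readings rather than of 
what the patch or Lucene's sloppy phrase matching actually computes. The 
"nested" reading is roughly what one span query nested inside another would 
express.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: flat three-term proximity vs. exact phrase "a b" near "c".
public class PhraseNearSketch {

    /** Positions at which term occurs in the token array. */
    static List<Integer> positions(String[] tokens, String term) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(term)) {
                out.add(i);
            }
        }
        return out;
    }

    /** Flat reading: a, b, c occur in that order with at most slop other tokens between a and c. */
    static boolean flatMatch(String[] t, String a, String b, String c, int slop) {
        for (int pa : positions(t, a)) {
            for (int pb : positions(t, b)) {
                for (int pc : positions(t, c)) {
                    if (pa < pb && pb < pc && (pc - pa - 2) <= slop) {
                        return true;
                    }
                }
            }
        }
        return false;
    }

    /** Nested reading: b directly follows a, then c follows with at most slop tokens in between. */
    static boolean nestedMatch(String[] t, String a, String b, String c, int slop) {
        for (int pa : positions(t, a)) {
            if (pa + 1 >= t.length || !t[pa + 1].equals(b)) {
                continue; // "a b" is not an exact phrase at this position
            }
            for (int pc : positions(t, c)) {
                int gap = pc - (pa + 2); // tokens between the phrase and c
                if (gap >= 0 && gap <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] apart = "a x x b x c".split(" ");
        String[] together = "a b x x c".split(" ");
        System.out.println(flatMatch(apart, "a", "b", "c", 10));      // true
        System.out.println(nestedMatch(apart, "a", "b", "c", 10));    // false
        System.out.println(flatMatch(together, "a", "b", "c", 10));   // true
        System.out.println(nestedMatch(together, "a", "b", "c", 10)); // true
    }
}
```

The document "a x x b x c" is the kind of result I would like to exclude: the 
flat reading accepts it, while the nested reading does not.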

Thanks!




[jira] Issue Comment Edited: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-01-12 Thread Salman Akram (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12980660#action_12980660
 ] 

Salman Akram edited comment on SOLR-1604 at 1/12/11 6:37 AM:
-

I integrated the patch and it's working fine; however, there were a couple of 
issues. One is already resolved with the above un-ordered proximity parameters.

The issue is that although proximity search works with phrases, it's not very 
accurate. E.g. if I want to search for "a b" within 10 words of "c", the query 
would end up being "a b c"~10, but this will also return cases where "a" is not 
necessarily together with "b". Any scenario where these 3 words are within 10 
words of each other will match.

Is it possible in SOLR to do what I mentioned above? Any other patch? Something 
like  a b c ~10...

Note: I was going through LUCENE-1486, and there Ahmet mentioned that 
'Specifically : "(john johathon) smith"~10 works perfectly'. For me it seems 
there is no difference whether I put the parentheses or not.

Thanks!




[jira] Issue Comment Edited: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-01-12 Thread Salman Akram (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12980660#action_12980660
 ] 

Salman Akram edited comment on SOLR-1604 at 1/12/11 6:41 AM:
-

I integrated the patch and it's working fine; however, there were a couple of 
issues. 

One is related to un-ordered proximity, which seems to be fixed with the 
inOrder parameter, but it's not working for me (it doesn't give any error, but 
matching is still ordered). I will try to get the patch again because I merged 
it in early Nov, so maybe the parameter was added after that.

The other issue is that although proximity search works with phrases, it's not 
very accurate. E.g. if I want to search for "a b" within 10 words of "c", the 
query would end up being "a b c"~10, but this will also return cases where "a" 
is not necessarily together with "b". Any scenario where these 3 words are 
within 10 words of each other will match.

Is it possible in SOLR to do what I mentioned above? Any other patch? Something 
like  a b c ~10...

Note: I was going through LUCENE-1486, and there Ahmet mentioned that 
'Specifically : "(john johathon) smith"~10 works perfectly'. For me it seems 
there is no difference whether I put the parentheses or not.

Thanks!




[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries

2011-01-12 Thread Salman Akram (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12980853#action_12980853
 ] 

Salman Akram commented on SOLR-1604:


I am using SOLR 1.4.1 but integrated this patch in early Nov, so maybe you 
committed the inOrder parameter after that?

When you say "Regarding parenthesis inside quotes...": if this works and groups 
the words in the phrase together, won't it work for my case, e.g. 
"(a b) c"~10?

I guess if SurroundQuery doesn't use any analyzer, it would be very difficult 
to make the existing queries work (I am using Standard Analyzer).
