[jira] [Commented] (LUCENE-4503) MoreLikeThis supports multiple index readers.

2012-10-24 Thread Ying Andrews (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483322#comment-13483322 ]

Ying Andrews commented on LUCENE-4503:
--------------------------------------

 * Added support for multiple index readers so that MoreLikeThis can generate a
 * similarity query based on multiple indexes.
 * This extends the MoreLikeThis feature to work with Lucene's MultiSearcher.
 * 
 * For example:
 * Because of its size, we may want to split a sales index into
 * sales_1, sales_2, sales_3, ..., sales_n.
 * In this case a ParallelMultiSearcher is the best way to run the search, but
 * the old MoreLikeThis.java doesn't support this scenario. If the document of
 * interest comes from index sales_1, the query returned by like(int) and
 * like(Reader, String) is based only on index sales_1, which does not
 * reflect the whole document population.
 * 
 * Modified:
 * constructors   - MoreLikeThis(IndexReader),
 *                  MoreLikeThis(IndexReader, Similarity)
 * private method - createQueue(Map<String, Int>)
 * 
 * Added:
 * constructors   - MoreLikeThis(IndexReader, IndexReader[]),
 *                  MoreLikeThis(IndexReader, IndexReader[], Similarity)
 *  
 * Notes: 
 * When invoking like(int) on this class, you have to pass in the
 * NORMALIZED document number.
 * You can use the same algorithm as the Lucene MultiSearcher class,
 * specifically the subSearcher(int) and subDoc(int) methods.
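
The mapping mentioned in the notes can be sketched as follows. This is a minimal, standalone illustration and not the MultiSearcher source: the class name DocIdMapper and its methods are hypothetical, and it assumes that "normalized" means the document number relative to the sub-index that holds the document. It keeps an array of cumulative document counts, finds the sub-index whose range contains a global document number, and subtracts that sub-index's starting offset.

```java
/**
 * Hypothetical helper (not part of Lucene) illustrating the idea behind
 * subSearcher(int) and subDoc(int): translate a global document number
 * into (sub-index position, document number local to that sub-index).
 */
final class DocIdMapper {
    // starts[i] is the first global document number of sub-index i;
    // the last element holds the total number of documents.
    private final int[] starts;

    public DocIdMapper(int[] maxDocs) {
        starts = new int[maxDocs.length + 1];
        for (int i = 0; i < maxDocs.length; i++) {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
    }

    /** Position of the sub-index that contains the given global document number. */
    public int subIndex(int globalDoc) {
        int total = starts[starts.length - 1];
        if (globalDoc < 0 || globalDoc >= total) {
            throw new IllegalArgumentException("document number out of range: " + globalDoc);
        }
        // Binary search for the last sub-index whose start is <= globalDoc.
        // Picking the last match skips over empty sub-indexes.
        int lo = 0;
        int hi = starts.length - 2;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (starts[mid] <= globalDoc) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    /** Document number relative to the sub-index that contains it. */
    public int subDoc(int globalDoc) {
        return globalDoc - starts[subIndex(globalDoc)];
    }
}
```

For sub-indexes holding 100, 250 and 50 documents, global document 120 falls in the second sub-index (position 1), and its local number is 120 - 100 = 20.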


 MoreLikeThis supports multiple index readers.
 ---------------------------------------------

 Key: LUCENE-4503
 URL: https://issues.apache.org/jira/browse/LUCENE-4503
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Ying Andrews
Priority: Minor
  Labels: patch
 Attachments: MoreLikeThis.java.patch

   Original Estimate: 72h
  Remaining Estimate: 72h



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4503) MoreLikeThis supports multiple index readers.

2012-10-24 Thread Robert Muir (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483340#comment-13483340 ]

Robert Muir commented on LUCENE-4503:
-------------------------------------

Can't you just pass a MultiReader instead?
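
One reason a combined reader helps: a MultiReader reports document frequencies summed over its sub-readers, so term weights reflect all indexes rather than one. The sketch below is a standalone illustration, not Lucene code; the class ShardStats and its methods are hypothetical. It assumes the classic idf formula 1 + ln(numDocs / (docFreq + 1)) and compares the idf of a term computed from a single sub-index with the idf computed from the summed counts.

```java
/**
 * Hypothetical helper (not part of Lucene) comparing term statistics
 * taken from one sub-index with statistics summed over all sub-indexes.
 */
final class ShardStats {
    private ShardStats() {}

    /** Classic idf formula, assumed here: 1 + ln(numDocs / (docFreq + 1)). */
    public static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (double) (docFreq + 1));
    }

    /** Sum of a per-sub-index statistic, as a combined reader would report it. */
    public static long sum(long[] perShard) {
        long total = 0;
        for (long value : perShard) {
            total += value;
        }
        return total;
    }

    /** idf computed from counts summed over every sub-index. */
    public static double combinedIdf(long[] docFreqs, long[] numDocs) {
        return idf(sum(docFreqs), sum(numDocs));
    }
}
```

With made-up numbers, take a term that appears in 2 of 1000 documents in sales_1 but in 500 and 480 of 1000 documents in two other indexes. `ShardStats.idf(2, 1000)` is then much larger than `ShardStats.combinedIdf(new long[] {2, 500, 480}, new long[] {1000, 1000, 1000})`, so a query built from sales_1 alone would over-weight that term.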




[jira] [Commented] (LUCENE-4503) MoreLikeThis supports multiple index readers.

2012-10-24 Thread Ying Andrews (JIRA)

[ https://issues.apache.org/jira/browse/LUCENE-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483486#comment-13483486 ]

Ying Andrews commented on LUCENE-4503:
--------------------------------------

Thanks for pointing it out, Robert.

In the application I worked on, we had to support a mix of local and remote 
searchers. Because of the large scale and heterogeneous nature of our systems, 
we had to be able to search anything that implements Searchable. We also had 
to take advantage of ParallelMultiSearcher to improve performance. In one 
special case we had a ParallelMultiSearcher consisting of a group of local file 
indexes, a group of remote searchers whose data may in turn come from other 
remote searchers (kind of like a tree), and one searcher that gets its data 
from a SolrServer over the network. Therefore we had to adopt a MultiSearcher 
rather than a MultiReader strategy. We recently added the MoreLikeThis feature 
to our heterogeneous system. As you can see, MultiReader is not an option in 
our environment. The links below roughly explain my situation. Thank you.

http://lucene.472066.n3.nabble.com/MultiSearcher-vs-MultiReader-td546968.html
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200712.mbox/%3cof924d8f48.261c9541-onc22573a5.0077e70d-c22573a5.007a9...@il.ibm.com%3E
