[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-03-30 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693997#action_12693997 ] Shai Erera commented on LUCENE-1575: bq. But what is the plan now for the FieldCompara

Re: Modularization

2009-03-30 Thread Michael Busch
On 3/31/09 1:31 AM, Chris Hostetter wrote: code isolation (by directory hierarchy) is the best way I've seen to ensure modularization, and protect against inadvertent dependency bleeding. +1. That's actually what I meant with "one-to-one mapping between the packaging and the source code" (I didn

Reading document in Lucene

2009-03-30 Thread mitu2009
My indexed document in Lucene has multiple cities assigned to it, i.e. doc.Add(new Field("city", city1.Trim(), Field.Store.YES, Field.Index.TOKENIZED)); doc.Add(new Field("city", city2.Trim(), Field.Store.YES, Field.Index.TOKENIZED)); etc. How do I iterate through them and read the values after e
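
A possible way to picture the question above: when the same field name is added several times, the document holds several values under that name, and reading them back means asking for all values of the field rather than a single one. As far as I know, Lucene's Document offers getValues(String) for this (GetValues in Lucene.NET). The sketch below does not use Lucene at all; SimpleDoc, add and getValues are hypothetical stand-ins written in plain Java, and returning an empty array for a missing field is an assumption of the sketch, not a statement about Lucene's behavior.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MultiValuedFieldSketch {

    // Hypothetical stand-in for a stored document: one field name can hold several values.
    static final class SimpleDoc {
        private final Map<String, List<String>> fields = new LinkedHashMap<>();

        // Adding the same name twice keeps both values, in insertion order.
        void add(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }

        // Returns every value stored under the name (empty array if none; a sketch assumption).
        String[] getValues(String name) {
            List<String> values = fields.get(name);
            return values == null ? new String[0] : values.toArray(new String[0]);
        }
    }

    public static void main(String[] args) {
        SimpleDoc doc = new SimpleDoc();
        doc.add("city", " Boston ".trim());
        doc.add("city", " Austin ".trim());

        // Iterate over all values of the repeated field.
        for (String city : doc.getValues("city")) {
            System.out.println(city);
        }
    }
}
```

With the real library, the loop would presumably run over the array returned by the document's own accessor instead of the hypothetical SimpleDoc.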

Lucene analyzer and dots

2009-03-30 Thread mitu2009
Is there any way I can make the Lucene analyzer not ignore dots in the string? For example, if my search criteria is "A.B.C.D", Lucene should give me only those documents in the search results which have "A.B.C.D" and not "ABCD".
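
To illustrate what the question is asking for, here is a minimal sketch in plain Java, not Lucene code, contrasting a tokenizer that splits only on whitespace (so "A.B.C.D" survives as one token) with one that strips dots (so it collapses to "abcd"). The class and method names are hypothetical. In Lucene itself a whitespace-based analyzer is one option that is sometimes suggested for this, but whether it fits depends on the rest of the analysis chain, so treat that as an assumption to verify.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class DotPreservingTokens {

    // Splits on whitespace only, so dots inside a token are kept.
    static List<String> tokenizeKeepingDots(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Same split, but removes dots from each token, which makes "A.B.C.D" and "ABCD" collide.
    static List<String> tokenizeStrippingDots(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : tokenizeKeepingDots(text)) {
            String stripped = token.replace(".", "");
            if (!stripped.isEmpty()) {
                tokens.add(stripped);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Part A.B.C.D shipped";
        System.out.println(tokenizeKeepingDots(text));
        System.out.println(tokenizeStrippingDots(text));
    }
}
```

The same tokenization has to be applied at index time and at query time; otherwise a query for "A.B.C.D" would be compared against tokens produced by different rules.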

Re: Modularization

2009-03-30 Thread Chris Hostetter
After stirring things up, and then being off-list for ~10 days, I'm in an interesting position coming back to this thread and seeing the discussion *after* it essentially ended, with a lot of semi-consensus but no clear sense of hard and fast resolution or plan of action. FWIW, here are the not

[jira] Commented: (LUCENE-1516) Integrate IndexReader with IndexWriter

2009-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693932#action_12693932 ] Michael McCandless commented on LUCENE-1516: Nice work to you too -- I just pi

[jira] Commented: (LUCENE-1516) Integrate IndexReader with IndexWriter

2009-03-30 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693922#action_12693922 ] Jason Rutherglen commented on LUCENE-1516: -- Mike, nice work! I will hopefully

[jira] Commented: (LUCENE-1425) Add ConstantScore highlighting support to SpanScorer

2009-03-30 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693917#action_12693917 ] Mark Miller commented on LUCENE-1425: - I'd like to commit this soon. > Add ConstantSc

[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693847#action_12693847 ] Michael McCandless commented on LUCENE-1575: bq. And have STFC extend NSTFC? I

[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-03-30 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693843#action_12693843 ] Shai Erera commented on LUCENE-1575: bq. So to be consistent maybe we create ScoringTo

[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693822#action_12693822 ] Michael McCandless commented on LUCENE-1575: bq. How's that sound: That sound

[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-03-30 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693812#action_12693812 ] Shai Erera commented on LUCENE-1575: ok I'll add another package-private ctor to TopDo

[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693796#action_12693796 ] Michael McCandless commented on LUCENE-1575: bq. Or introducing a new ctor whi

[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-03-30 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693787#action_12693787 ] Shai Erera commented on LUCENE-1575: bq. Turning off scoring in TopFieldCollector's ct

[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693778#action_12693778 ] Michael McCandless commented on LUCENE-1575: bq. The question is what to do wi

[jira] Commented: (LUCENE-1039) Bayesian classifiers using Lucene as data store

2009-03-30 Thread Karl Wettin (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693744#action_12693744 ] Karl Wettin commented on LUCENE-1039: - Vaijanath, can you please post a small test ca

[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-03-30 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693743#action_12693743 ] Shai Erera commented on LUCENE-1575: Ok I now understand better where score is used in

[jira] Updated: (LUCENE-1578) InstantiatedIndex supports non-optimized IndexReaders

2009-03-30 Thread Karl Wettin (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wettin updated LUCENE-1578: Attachment: LUCENE-1578.txt Please test this patch using a couple of different unoptimized readers

Re: InstantiatedIndex

2009-03-30 Thread Karl Wettin
On 28 Mar 2009, at 01:21, Jason Rutherglen wrote: I'm thinking InstantiatedIndex needs to implement either clone of all the index data or needs to be able to accept a non-optimized reader, or both. I forget what the obstacles are to implementing the non-optimized reader option? Do you think

[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-03-30 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693732#action_12693732 ] Shai Erera commented on LUCENE-1575: I am not sure what you mean - score is used all o

[jira] Commented: (LUCENE-1575) Refactoring Lucene collectors (HitCollector and extensions)

2009-03-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693729#action_12693729 ] Michael McCandless commented on LUCENE-1575: I think as part of this we should

Re: Bug in TopFieldCollector?

2009-03-30 Thread Shai Erera
I checked where it is used, and this arg is required by FieldValueHitQueue, by its only constructor. The array is passed to each field's getComparator method, which uses it only for CUSTOM field indeed. There, it calls comparatorSource.newComparator, and there's only one implementation now of it, w

RE: Bug in TopFieldCollector?

2009-03-30 Thread Uwe Schindler
You are right, I forgot the sorting. And I also think the most important thing would be to remove the need for the ctor in the custom sort. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Michael McCandl

Re: Bug in TopFieldCollector?

2009-03-30 Thread Michael McCandless
Well, IndexSearcher also sorts its readers biggest to smallest (by .numDocs()) for better performance (so that the queues fill up as much as possible before hitting reader transitions). I think it's the exception, not the rule, for when a custom comparator would require the full array of sub-reade
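
The ordering described above (largest reader first, by document count) can be sketched in plain Java. This is only an illustration of the sort, not IndexSearcher's actual code: SubReader is a hypothetical stand-in holding a name and a numDocs value, and the input array is copied so the caller's order stays untouched.

```java
import java.util.Arrays;
import java.util.Comparator;

public class ReaderOrdering {

    // Hypothetical stand-in for a sub-reader; only the document count matters here.
    static final class SubReader {
        final String name;
        final int numDocs;

        SubReader(String name, int numDocs) {
            this.name = name;
            this.numDocs = numDocs;
        }
    }

    // Returns a copy sorted biggest to smallest by numDocs; the input array is not modified.
    static SubReader[] sortBiggestFirst(SubReader[] readers) {
        SubReader[] sorted = readers.clone();
        Arrays.sort(sorted, Comparator.comparingInt((SubReader r) -> r.numDocs).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        SubReader[] readers = {
            new SubReader("seg_a", 120),
            new SubReader("seg_b", 9000),
            new SubReader("seg_c", 450)
        };
        for (SubReader r : sortBiggestFirst(readers)) {
            System.out.println(r.name + " " + r.numDocs);
        }
    }
}
```

Visiting the largest reader first gives a bounded hit queue the best chance of filling up before the first reader transition, which is the performance argument made in the message. It also means the visiting order can differ from the order in which sub-readers were passed in.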

RE: Bug in TopFieldCollector?

2009-03-30 Thread Uwe Schindler
Why not call IndexSearcher.getIndexReader().getSequentialSubReaders() (see http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc/all/org/apache/lucene/index/IndexReader.html). It's public and documented as this: public

Re: Bug in TopFieldCollector?

2009-03-30 Thread Michael McCandless
I agree, this is not a pleasant migration path forward from 2.4. I think maybe a good fix is to not even require IndexReader[] subReaders to be passed in, in the first place. Tracing downwards, the only reason why we need this array at construction time is for the SortField.CUSTOM case, when it

Re: Bug in TopFieldCollector?

2009-03-30 Thread Shai Erera
Already did! Another question - I think we somehow broke TopFieldCollector ... Previously, in TopFieldDocCollector, it accepted an IndexReader as a parameter, and now it requires IndexReader[], which is called subReaders. Calling the 'fast' search methods with Sort has no problem obtaining that I

Re: Bug in TopFieldCollector?

2009-03-30 Thread Michael McCandless
Looks like quite a bug, Shai! Thanks. It came in with LUCENE-1483. I would say add test case & fix it under 1575. Mike On Mon, Mar 30, 2009 at 3:50 AM, Shai Erera wrote: > Hi > > As I prepared the patch for 1575, I noticed a strange implementation in > TopFieldCollector's topDocs(): > >     Sc

Bug in TopFieldCollector?

2009-03-30 Thread Shai Erera
Hi As I prepared the patch for 1575, I noticed a strange implementation in TopFieldCollector's topDocs(): ScoreDoc[] scoreDocs = new ScoreDoc[queue.size()]; if (fillFields) { for (int i = queue.size() - 1; i >= 0; i--) { scoreDocs[i] = queue.fillFields((FieldValueHitQueue.En