[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2007-02-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12472610 ] Michael McCandless commented on LUCENE-565: --- OK, got it. I will change those 3 to package protection and t

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2007-02-12 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12472499 ] Yonik Seeley commented on LUCENE-565: - OK I moved NewIndexModifier's methods into IndexWriter and did some small

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2007-01-30 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12468687 ] Michael Busch commented on LUCENE-565: -- Thanks for the explanation, Mike. I'd prefer flush() too and the changes

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2007-01-30 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12468640 ] Michael McCandless commented on LUCENE-565: --- The flush() was added to better match the current IndexModifie

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2007-01-30 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12468617 ] Michael Busch commented on LUCENE-565: -- I tried the new patch out and everything looks good to me. One comment t

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2007-01-29 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12468371 ] Yonik Seeley commented on LUCENE-565: - I just reviewed this, and it looks good to me. I like how you managed to e

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2007-01-19 Thread Jeremy F. Kassis (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12466189 ] Jeremy F. Kassis commented on LUCENE-565: - Happy New Year everyone. I'm personally very excited about this im

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-12-18 Thread Ning Li (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12459506 ] Ning Li commented on LUCENE-565: Here is the design overview. Minor changes were made because of lock-less commits. In the current IndexWriter, newly added docume

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-12-18 Thread Ning Li (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12459490 ] Ning Li commented on LUCENE-565: Many versions of the patch were submitted as new code was committed to IndexWriter.java. For each version, all changes made were i

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-12-18 Thread Paul Elschot (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12459482 ] Paul Elschot commented on LUCENE-565: - I'd like to give this a try over the upcoming holidays. Would it be possible to post a single patch? A single patch can b

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-12-13 Thread Ning Li (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12458205 ] Ning Li commented on LUCENE-565: > Minor question... in the places that you use Vector, is there a reason you > aren't using ArrayList? > And in methods that pass

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-12-13 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12458174 ] Yonik Seeley commented on LUCENE-565: - Minor question... in the places that you use Vector, is there a reason you aren't using ArrayList? And in methods that p

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-12-13 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12458170 ] Yonik Seeley commented on LUCENE-565: - > both inserts and deletes - are committed in the same transaction. OK, cool. I agree that's the ideal default behavior

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-12-13 Thread Ning Li (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12458158 ] Ning Li commented on LUCENE-565: > Can the same thing happen with your patch (with a smaller window), or are > deletes applied between writing the new segment and

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-12-12 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12457885 ] Yonik Seeley commented on LUCENE-565: - Hmmm, I see your point... If deletes are deferred, a different reader could go and open the index and see the additions

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-12-12 Thread Ning Li (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12457865 ] Ning Li commented on LUCENE-565: > *or* you could choose to do it before a merge of the lowest level on-disk > segments. If none of the lowest level segments have

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-12-12 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12457833 ] Yonik Seeley commented on LUCENE-565: - On 12/12/06, Ning Li <[EMAIL PROTECTED]> wrote: > > To minimize the number of reader open/closes on large persistent segm

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-12-12 Thread Ning Li
Thanks for the comments Yonik! To minimize the number of reader open/closes on large persistent segments, I think the ability to apply deletes only before a merge is important. That might add a 4th method: doBeforeMerge() I'm not sure I get this. Buffered deletes are only applied (flushed) d

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-12-12 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12457789 ] Yonik Seeley commented on LUCENE-565: - > Good idea! I think this could also be used by reopen (LUCENE-743 ) to re-use > readers. Yes, although reopen() needs

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-12-12 Thread Michael McCandless (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12457764 ] Michael McCandless commented on LUCENE-565: --- > If SegmentInfos had a cached reader, that seems like it would solve both > problems. > I haven't thought a

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-12-12 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12457758 ] Yonik Seeley commented on LUCENE-565: - > It would be nice to not have to continually open and close readers on segments > that aren't involved in a merge. Is t

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-12-11 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12457582 ] Yonik Seeley commented on LUCENE-565: - Lack of committer time... I've been busy enough that I've shied away from complexity and gravitated toward issues that I

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-12-08 Thread Michael Busch (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12456887 ] Michael Busch commented on LUCENE-565: -- What are the reasons to not add the NewIndexModifier to Lucene? This issue has already 6 votes, so it seems to be very

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-11-22 Thread Ning Li (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12452039 ] Ning Li commented on LUCENE-565: With the recent commits to IndexWriter, this patch no longer applies cleanly. The 5 votes for this issue encourage me to submit y

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-11-06 Thread Ning Li (JIRA)

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-14 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12434751 ] Yonik Seeley commented on LUCENE-565: - I also did a quick indexing performance test w/ Solr: maxBufferedDocs=100, mergeFactor=4, did 100K random overwriting ad

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-12 Thread Ning Li
The new code does handle the case. After mergeSegments(...) in maybeMergeSegments(), there is the following code: numSegments -= mergeFactor; if (docCount > upperBound) { minSegment++; exceedsUpperLimit = true; } else if (docCount > 0) {

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-11 Thread Yonik Seeley
A strange case I just thought of. Does the new code handle the case where a merge can drop the resulting segment "down a level" (due to deletions)? Example: M=3, B=10, maxMergeDocs=30 1) segment sizes = 30, 30, 30, 30 2) set maxMergeDocs=100 3) add enough docs to cause a merge 4) the leftmos

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-11 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12434041 ] Yonik Seeley commented on LUCENE-565: - Thanks for separating out the new merge policy Ning! I'm reviewing the patch now... Assuming everything looks good (it

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-06 Thread Marvin Humphrey
On Sep 6, 2006, at 4:23 PM, Ning Li wrote: When do you add "merge-worthy" segments? I'd guess at the end of a session, when it's easy to decide which segments are "merge-worthy". Right. KS sorts the segments by size, then tries to merge the smallest away. The calculation uses the fibonacc

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-06 Thread Ning Li
On 9/6/06, Marvin Humphrey <[EMAIL PROTECTED]> wrote: That's one way of thinking about it. There's only one "thing" though: a big bucket of serialized index entries. At the end of a session, those are sorted, pulled apart, and used to write the tis, tii, frq, and prx files. Interesting. Whe

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-06 Thread Ning Li
So what's left... maxMergeDocs I guess. Capping the segment size breaks the simple invariants a bit. Correct. We also need to be able to handle changes to M and maxMergeDocs between different IndexWriter sessions. When checking for a merge for Hmmm. A change of M could easily break the inva

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-06 Thread Yonik Seeley
On 9/6/06, Ning Li <[EMAIL PROTECTED]> wrote: > So, I *think* most of our hypothetical problems go away with a simple > adjustment to f(n): > > f(n) = floor(log_M((n-1)/B)) Correct. And nice. :-) Equivalently, f(n) = ceil(log_M (n / B)). If f(n) = c, it means B*(M^(c-1)) < n <= B*(M^(c)). So f

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-06 Thread Marvin Humphrey
On Sep 6, 2006, at 12:06 PM, Yonik Seeley wrote: Hmmm, not rewriting stored fields is nice. I guess that could apply to anything that's strictly document specific, such as term vectors. Yes. Remember the old benchmarks I posted a few months ago? KinoSearch's performance was much closer to

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-06 Thread jason rutherglen
On Sep 6, 2006, at 10:30 AM, Yonik Seeley wrote: > So it looks like you have intermediate things that aren't lucene > segments, but end up producing valid lucene segments at the end of a > se

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-06 Thread Yonik Seeley
On 9/6/06, Marvin Humphrey <[EMAIL PROTECTED]> wrote: On Sep 6, 2006, at 10:30 AM, Yonik Seeley wrote: > So it looks like you have intermediate things that aren't lucene > segments, but end up producing valid lucene segments at the end of a > session? That's one way of thinking about it. There

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-06 Thread Marvin Humphrey
On Sep 6, 2006, at 10:30 AM, Yonik Seeley wrote: So it looks like you have intermediate things that aren't lucene segments, but end up producing valid lucene segments at the end of a session? That's one way of thinking about it. There's only one "thing" though: a big bucket of serialized i

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-06 Thread Yonik Seeley
On 9/6/06, Marvin Humphrey <[EMAIL PROTECTED]> wrote: So cut the Gordian Knot? http://wiki.apache.org/jakarta-lucene/KinoSearchMergeModel :-) Interesting stuff... So it looks like you have intermediate things that aren't lucene segments, but end up producing valid lucene segments at the end o

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-06 Thread Ning Li
So, I *think* most of our hypothetical problems go away with a simple adjustment to f(n): f(n) = floor(log_M((n-1)/B)) Correct. And nice. :-) Equivalently, f(n) = ceil(log_M (n / B)). If f(n) = c, it means B*(M^(c-1)) < n <= B*(M^(c)). So f(n) = 0 means n <= B. --
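A minimal sketch of the level calculation in the message above, written with integer arithmetic instead of logarithms so there is no floating-point rounding. It follows the stated range, B*(M^(c-1)) < n <= B*(M^c), with level 0 for n <= B. The class and method names are hypothetical and are not taken from the patch.

```java
// Illustrative only: MergeLevel and level() are assumed names, not part of Lucene.
final class MergeLevel {
    private MergeLevel() {}

    /**
     * Returns the smallest c >= 0 such that n <= B * M^c.
     * For n <= B this is 0; otherwise B*M^(c-1) < n <= B*M^c.
     */
    static int level(int n, int b, int m) {
        if (n < 1 || b < 1 || m < 2) {
            throw new IllegalArgumentException("need n >= 1, B >= 1, M >= 2");
        }
        int c = 0;
        long bound = b;          // holds B * M^c; long avoids int overflow
        while (n > bound) {
            bound *= m;
            c++;
        }
        return c;
    }

    public static void main(String[] args) {
        // B=1000, M=10 are example values only.
        int[] sizes = {1, 1000, 1001, 10000, 10001};
        for (int n : sizes) {
            System.out.println(n + " docs -> level " + level(n, 1000, 10));
        }
    }
}
```

With these example values, any segment of up to 1000 documents sits at level 0, and the level rises by one each time the size passes the next multiple of M.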

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-06 Thread Yonik Seeley
Just brainstorming a little... Assuming B=1000, M=10 (I think better with concrete examples) It seems like we should avoid unnecessary merging, allowing up to 9 segments of 1000 documents or less w/o merging. When we reach 10 segments, they should be merged into a single segment. Let's assume a

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-06 Thread Marvin Humphrey
On Sep 5, 2006, at 3:28 PM, Ning Li wrote: Given M, B and an index which has L (0 < L < M) segments with docs less than B, how many ram docs should be accumulated before a merge is triggered? B is not good. B-sum(L) is the old strategy which has problems. So between B-sum(L) and B? Once there a

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-06 Thread Ning Li
> "Less than M number of segments whose doc count n satisfies B*(M^c) <= > n < B*(M^(c+1)) for any c >= 0." > In other words, less than M number of segments with the same f(n). Ah, I had missed that. But I don't believe that lucene currently obeys this in all cases. I think it does hold for n

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-05 Thread Yonik Seeley
On 9/5/06, Ning Li <[EMAIL PROTECTED]> wrote: > What about an invariant that says the number of main index segments > with the same level (f(n)) should be less than M. That is exactly what the second property says: "Less than M number of segments whose doc count n satisfies B*(M^c) <= n < B*(M^(

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-05 Thread Ning Li
What about an invariant that says the number of main index segments with the same level (f(n)) should be less than M. That is exactly what the second property says: "Less than M number of segments whose doc count n satisfies B*(M^c) <= n < B*(M^(c+1)) for any c >= 0." In other words, less than
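The property quoted above can be checked mechanically. The following is a rough sketch, not code from the patch: it groups segment sizes by the band c with B*(M^c) <= n < B*(M^(c+1)) and reports whether every band holds fewer than M segments. The names are hypothetical, and skipping segments smaller than B is an assumption of this sketch, since the quoted property only covers c >= 0.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: MergeInvariant, band() and holds() are assumed names.
final class MergeInvariant {
    private MergeInvariant() {}

    /** Returns c with B*M^c <= n < B*M^(c+1), or -1 when n < B. */
    static int band(int n, int b, int m) {
        if (b < 1 || m < 2) {
            throw new IllegalArgumentException("need B >= 1 and M >= 2");
        }
        if (n < b) {
            return -1;
        }
        int c = 0;
        long upper = (long) b * m;   // holds B * M^(c+1)
        while (n >= upper) {
            upper *= m;
            c++;
        }
        return c;
    }

    /** True if every band c >= 0 contains fewer than M segments. */
    static boolean holds(int[] docCounts, int b, int m) {
        Map<Integer, Integer> perBand = new HashMap<>();
        for (int n : docCounts) {
            int c = band(n, b, m);
            if (c < 0) {
                continue;            // assumption: segments below B are not checked here
            }
            int count = perBand.merge(c, 1, Integer::sum);
            if (count >= m) {
                return false;
            }
        }
        return true;
    }
}
```

A test along these lines is one way to provide the "code to test those invariants" asked for earlier in the thread.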

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-05 Thread Yonik Seeley
On 9/2/06, Ning Li <[EMAIL PROTECTED]> wrote: Here is an outline of the rest of the email: 1) Current Lucene merge policy. 2) Its strengths and weaknesses. The strengths are good candidates for index invariants. 3) Changed merge behaviour in the patch. 1) Current Lucene merge policy TargetMergeD

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-01 Thread Ning Li
I believe this patch probably also changes the merge behavior. I think we need to discuss what exactly the new merge behavior is, if it's OK, what we think the index invariants should be (no more than x segments of y size, etc), and I'd like to see some code to test those invariants. Yes, the pa

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-01 Thread Doron Cohen (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12432216 ] Doron Cohen commented on LUCENE-565: I agree - I also suspected it might change the merge behavior (and also had reflections from the repeated trials to have t

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-01 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12432164 ] Yonik Seeley commented on LUCENE-565: - I believe this patch probably also changes the merge behavior. I think we need to discuss what exactly the new merge beha

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-09-01 Thread Yonik Seeley (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12432155 ] Yonik Seeley commented on LUCENE-565: - > the new code is *9* times faster! That's a bit apples-and-oranges :-) I don't think people use IndexModifier when t

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-08-30 Thread Jason Rutherglen (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12431757 ] Jason Rutherglen commented on LUCENE-565: - I figured out the problem, the Solr DirectUpdateHandler2 expects to delete only a certain number of documents sp

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-08-29 Thread Jason Rutherglen (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12431475 ] Jason Rutherglen commented on LUCENE-565: - Having trouble reproducing this. Probably something in the other code. Thanks for the help and the patch, I fe

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-08-29 Thread Ning Li
DirectUpdateHandler2. I will create a non-Solr reproduction of the issue. I'm still not clear how you used the patch. So this will definitely help.

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-08-29 Thread Jason Rutherglen (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12431423 ] Jason Rutherglen commented on LUCENE-565: - Good points... I've actually used both NewIndexModifier and the parent. I've tried writing a new UpdateHandler,

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-08-29 Thread Doron Cohen (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12431419 ] Doron Cohen commented on LUCENE-565: Just to make sure on the scenario - are you - (1) using NewIndexModifier at all, or (2) just letting Solr use this IndexW

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-08-29 Thread jason rutherglen
as the deleted documents and so are registered as deleted.

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-08-29 Thread Ning Li
(reopen), then perform a batch addDocuments. Then when a search is executed nothing is returned, and after an optimize the index goes down to 1K. Seems What did you set maxBufferedDocs to? If it is bigger than the number of documents you inserted, the newly added documents haven't reached disk

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-08-29 Thread Jason Rutherglen (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12431400 ] Jason Rutherglen commented on LUCENE-565: - I started to flush the deletes after making them, which opens a new NewIndexModifier afterwards. I still see th

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-08-29 Thread Doron Cohen (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12431354 ] Doron Cohen commented on LUCENE-565: Is it that results that were returned are suddenly (say after updates) not returned anymore (indicating something bad happ

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-08-29 Thread Ning Li
Could you elaborate? Jason Rutherglen commented on LUCENE-565: - It seems this writer works, but then something mysterious happens to the index and the searcher can no longer read it. I am using this in conjunction with Solr. The index files look ok, howev

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-08-29 Thread Jason Rutherglen (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12431342 ] Jason Rutherglen commented on LUCENE-565: - It seems this writer works, but then something mysterious happens to the index and the searcher can no longer read it

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-08-24 Thread Jason Rutherglen (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12430361 ] Jason Rutherglen commented on LUCENE-565: - This IndexWriter seems to work. Thanks. Great work! > Supporting deleteDocuments in IndexWriter (Code and Perf

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-08-23 Thread Ning Li (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12430130 ] Ning Li commented on LUCENE-565: Doron, thank you very much for the review! I want to briefly comment on one of your comments: > (5) deleteDocument(int doc) not im

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-08-22 Thread jason rutherglen
From: Ning Li <[EMAIL PROTECTED]> To: java-dev@lucene.apache.org Sent: Tuesday, August 22, 2006 12:45:00 PM Subject: Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided) > I tested just the IndexWriter from this code base

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-08-22 Thread Ning Li
I tested just the IndexWriter from this code base, it does not seem to work. NewIndexModifier does work. I simply used IndexWriter to create several documents and then search for them. Nothing came back even though it seems something was written to disk. The patch worked until several days ago

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-08-22 Thread Jason Rutherglen (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12429780 ] Jason Rutherglen commented on LUCENE-565: - I tested just the IndexWriter from this code base, it does not seem to work. NewIndexModifier does work. I sim

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-08-14 Thread Doron Cohen (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12428035 ] Doron Cohen commented on LUCENE-565: I tried out this patch (July18), and have a few comments... First, it is nice to be able to add/remove documents with no n

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-17 Thread Yonik Seeley
On 7/14/06, Ning Li <[EMAIL PROTECTED]> wrote: > If one is going to be able to support deleteByQuery, why not a full > IndexSearcher/IndexWriter combination? What do you mean by "a full combination"? An IndexReader/IndexWriter combination that can be used to do searches as well as just deletes

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-14 Thread Ning Li
Hey, you're moving the goalposts ;-) You proposed a specific patch, and it certainly doesn't have support for delete-by-query. The patch makes IndexWriter support delete-by-term, which is what IndexReader supports. Granted, delete-by-term is not as general as delete-by-query so you don't have t

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-14 Thread Yonik Seeley
On 7/14/06, Ning Li <[EMAIL PROTECTED]> wrote: I have viewed "delete by term" (which is supported by IndexReader and NewIndexModifier) as a kind of "delete by query", not "delete by id". If I replace Term in DeleteTerm with Query (or query string), and re-define applyDeletesSelectively() as foll

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-13 Thread Ning Li
Solr's implementation is here: http://svn.apache.org/viewvc/incubator/solr/trunk/src/java/org/apache/solr/update/DirectUpdateHandler2.java?view=markup I read it and I see which point I didn't make clear. :-) I have viewed "delete by term" (which is supported by IndexReader and NewIndexModifier)

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-13 Thread Yonik Seeley
On 7/12/06, Ning Li <[EMAIL PROTECTED]> wrote: > If it can be done in a separate class, using public APIs (or at least > with a minimum of protected access), without a loss in performance, > then that's the way to go IMO. This is exactly what I'm asking. Can it be done using public APIs without

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-12 Thread Ning Li
I'm not sure I understand your question... you mean why would one want to stick to public APIs? No, that's not what I meant. I definitely agree that we should stick to public APIs as much as we can. If it can be done in a separate class, using public APIs (or at least with a minimum of prote

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-12 Thread Yonik Seeley
On 7/12/06, Ning Li <[EMAIL PROTECTED]> wrote: > I'll rephrase my original question: > When implementing NewIndexModifier, what type of efficiencies do we > get by using the new protected methods of IndexWriter vs using the > public APIs of IndexReader and IndexWriter? What do you think can be

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-12 Thread robert engels
You are really starting to lose me here Ning... I think the sentiments are that there are better, less intrusive ways of doing what you are trying to do with the massive changes to IndexWriter. A higher level class that manages the updates can be just as efficient if not more so, and far

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-12 Thread Ning Li
The current implementation makes some assumptions, such as the "unique key" is a single field, not any sort of compound key, and it doesn't allow deletes by query. That, coupled with a more complex implementation makes me wary of putting it in IndexWriter. By "current implementation", you meant

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-12 Thread robert engels
The previous email spelled out the details pretty well. I don't have time to write the code at this time. I was only making a comment that I thought your patch was too involved for what it does and there are better ways of doing it (it seems others have had the same sentiment). On Ju

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-12 Thread Ning Li
I proposed a design of "BufferedWriter" in a previous email that would not have this limitation. It is similar to what others have suggested, which is to handle the buffering in a higher-level class and leave IndexWriter alone. Could you spell out the details, or better, submit the patch? So that we

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-12 Thread robert engels
No, the code does not exist yet... I have a version that requires a unique identifier. I proposed a design of "BufferedWriter" in a previous email that would not have this limitation. It is similar to what others have suggested, which is to handle the buffering in a higher-level class and lev

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-12 Thread Ning Li
Then I submit that my proposed "BufferedWriter" is far simpler and probably performs equally well, if not better, especially for the case where a document can be uniquely identified. Can I find the patch for this already somewhere? Does it require an explicit unique identifier understandable b

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-10 Thread Ning Li
Random comment... ... An alternate implementation could use a HashMap to associate term with maxSegment. ... Very well taken. :-) I won't submit a new version of the patch at this point to avoid too many versions of the patch. Thanks, Ning ---
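The HashMap idea quoted above can be illustrated roughly as follows. This is only a sketch, not code from the patch: the class name, the use of a plain String for the term, and the choice to keep the largest limit when the same term is deleted twice are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; none of these names come from Lucene or the patch.
final class BufferedDeleteTerms {
    // term -> highest limit recorded for that term (assumed meaning of "maxSegment")
    private final Map<String, Integer> limitByTerm = new HashMap<>();

    /** Buffers a delete for term; a repeated term keeps the larger limit (an assumption). */
    void bufferDelete(String term, int limit) {
        limitByTerm.merge(term, limit, Integer::max);
    }

    /** Returns the recorded limit for term, or -1 if no delete is buffered for it. */
    int limitFor(String term) {
        return limitByTerm.getOrDefault(term, -1);
    }

    /** Number of distinct terms with a buffered delete. */
    int size() {
        return limitByTerm.size();
    }
}
```

With a map, deleting the same term repeatedly keeps a single entry instead of growing a list, and the lookup by term is constant time on average.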

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-10 Thread Yonik Seeley
Random comment... when applying deletes you can break out of the loop early.
+ while (docs.next()) {
+   int doc = docs.doc();
+   if (doc <= (((DeleteTerm)deleteTerms.elementAt(i)).maxSegment)) {
+     reader.deleteDocument(doc);
+
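The early-exit suggestion can be sketched in isolation like this. The snippet above is cut off, so what follows is not the patch's code: it is a self-contained stand-in that assumes the doc IDs for a term arrive in ascending order (as a postings enumeration returns them), and it uses a hypothetical sorted int array and a hypothetical maxDoc limit in place of the real reader and DeleteTerm objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the delete loop; not Lucene code.
final class EarlyBreakDeletes {
    /**
     * Returns the doc IDs that would be deleted: those less than or equal to maxDoc.
     * Assumes docIds is sorted in ascending order.
     */
    static List<Integer> docsToDelete(int[] docIds, int maxDoc) {
        List<Integer> deleted = new ArrayList<>();
        for (int doc : docIds) {
            if (doc > maxDoc) {
                break; // every later doc ID is larger too, so stop scanning
            }
            deleted.add(doc);
        }
        return deleted;
    }
}
```

The break only pays off because of the ordering assumption; with unsorted IDs the loop would have to visit every entry.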

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-10 Thread Chuck Williams
Yonik Seeley wrote on 07/10/2006 09:27 AM: > I'll rephrase my original question: > When implementing NewIndexModifier, what type of efficiencies do we > get by using the new protected methods of IndexWriter vs using the > public APIs of IndexReader and IndexWriter? I won't comment on Ning's imp

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-10 Thread Yonik Seeley
On 7/10/06, Ning Li <[EMAIL PROTECTED]> wrote: Almost all the Lucene newbies that I know went through this learning curve of realizing you have to batch inserts and deletes to achieve good performance. I agree that offering users of Lucene the ability to interleave inserts and deletes is a goo
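As a rough illustration of the batching point, the sketch below buffers updates in a higher-level class and applies them in one delete pass followed by one add pass. It is an in-memory toy under stated assumptions, not any poster's implementation: the index is a plain map keyed by a unique ID, and the sessionsOpened counter is a hypothetical stand-in for reader/writer open-close cycles.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory stand-in; none of these names come from Lucene or the patch.
final class BatchedUpdates {
    private final Map<String, String> index = new LinkedHashMap<>();   // unique key -> doc body
    private final Map<String, String> pending = new LinkedHashMap<>(); // buffered updates
    private int sessionsOpened = 0; // simulated reader/writer open-close cycles

    /** Buffers an update; a later update to the same key replaces the earlier one. */
    void update(String key, String body) {
        pending.put(key, body);
    }

    /** Applies all buffered updates: one delete pass, then one add pass. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sessionsOpened++; // one simulated "reader" session performs all deletes
        for (String key : pending.keySet()) {
            index.remove(key);
        }
        sessionsOpened++; // one simulated "writer" session performs all adds
        index.putAll(pending);
        pending.clear();
    }

    /** Returns the flushed body for key, or null; buffered updates are not visible. */
    String get(String key) {
        return index.get(key);
    }

    int sessionsOpened() {
        return sessionsOpened;
    }
}
```

In this toy, any number of buffered updates costs two simulated sessions per flush, whereas alternating a delete and an add for each document would cost two per document.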

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-10 Thread robert engels
Then I submit that my proposed "BufferedWriter" is far simpler and probably performs equally well, if not better, especially for the case where a document can be uniquely identified. On Jul 10, 2006, at 10:47 AM, Ning Li wrote: You keep stating that you never need to close the IndexWrite

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-10 Thread Ning Li
You keep stating that you never need to close the IndexWriter. I don't believe this is the case, and you are possibly misleading people as to the extent of your patch. Don't you need to close (or flush) to get the documents on disk, so a new IndexReader can find them? If not any documents added

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-10 Thread robert engels
You keep stating that you never need to close the IndexWriter. I don't believe this is the case, and you are possibly misleading people as to the extent of your patch. Don't you need to close (or flush) to get the documents on disk, so a new IndexReader can find them? If not any documents

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-09 Thread Ning Li
To clarify, higher level (application level) adds and deletes can be managed at a lower level such that index readers and writers aren't continually opened and closed. ... The big question is, what kind of efficiencies do you get by putting this functionality in IndexWriter vs a higher level cl

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-07 Thread Yonik Seeley
When one interleaves adds and deletes, it isn't the case that indexreaders and indexwriters need to be opened and closed each interleave. To clarify, higher level (application level) adds and deletes can be managed at a lower level such that index readers and writers aren't continually opened an

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-07 Thread Nadav Har'El
On Thu, Jul 06, 2006, Yonik Seeley wrote about "Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)": >.. > When one interleaves adds and deletes, it isn't the case that > indexreaders and indexwriters need

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-06 Thread Chuck Williams
>>>> you can perform the deletion and the add, then close the writer. >>>> >>>> I have asked how this submission optimizes anything, and I still >>>> can't seem to get an answer? >>>> >>>> >>>> On Jul 6, 2006, at 4:27 PM, Otis

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-06 Thread robert engels
m more frequently and in order they really are happening, without grouping them. Otis - Original Message From: robert engels <[EMAIL PROTECTED]> To: java-dev@lucene.apache.org Sent: Thursday, July 6, 2006 3:24:13 PM Subject: Re: [jira] Commented: (LUCENE-565) Supporting deleteDocument

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-06 Thread Chuck Williams
fferent scenario, the one where you >>> can't wait to batch deletes and adds, and want/need to execute them >>> more frequently and in order they really are happening, without >>> grouping them. >>> >>> Otis >>> >>> - Original Message

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-06 Thread robert engels
t/need to execute them more frequently and in order they really are happening, without grouping them. Otis - Original Message From: robert engels <[EMAIL PROTECTED]> To: java-dev@lucene.apache.org Sent: Thursday, July 6, 2006 3:24:13 PM Subject: Re: [jira] Commented: (LUCENE-56

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-06 Thread Ning Li
> Yonik mentioned this in email. It does sound like a better place for > this might be in a higher level class. IndexWriter would really not > be just a writer/appender once delete functionality is added to it, > even if it's the IndexReaders behind the scenes doing the work. So > if you are goi

[jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-06 Thread Otis Gospodnetic (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-565?page=comments#action_12419605 ] Otis Gospodnetic commented on LUCENE-565: - Thanks for all the information about coding changes, that makes it easier to understand the diff. Ideally this will become c

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-06 Thread robert engels
y are happening, without grouping them. Otis - Original Message From: robert engels <[EMAIL PROTECTED]> To: java-dev@lucene.apache.org Sent: Thursday, July 6, 2006 3:24:13 PM Subject: Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-06 Thread Otis Gospodnetic
To: java-dev@lucene.apache.org Sent: Thursday, July 6, 2006 3:24:13 PM Subject: Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided) I guess we just chose a much simpler way to do this... Even with your code changes, to see the m

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-06 Thread Ning Li
> Even with your code changes, to see the modification made using the > IndexWriter, it must be closed, and a new IndexReader opened. That behaviour remains the same. > So a far simpler way is to get the collection of updates first, then > using opened indexreader, > for each doc in collection >

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-06 Thread Chuck Williams
robert engels wrote on 07/06/2006 12:24 PM: > I guess we just chose a much simpler way to do this... > > Even with your code changes, to see the modification made using the > IndexWriter, it must be closed, and a new IndexReader opened. > > So a far simpler way is to get the collection of updates fi

Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

2006-07-06 Thread robert engels
I guess we just chose a much simpler way to do this... Even with your code changes, to see the modification made using the IndexWriter, it must be closed, and a new IndexReader opened. So a far simpler way is to get the collection of updates first, then using opened indexreader, for each doc
