You also have to make sure you test this on non-Windows systems. A
delete on Windows is prevented while the file is open, but non-Windows
systems do not have this limitation, so there is a far greater chance
you will end up with an inconsistent index.
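The platform difference being discussed can be demonstrated directly. The sketch below (modern Java, not code from the thread; the class and file names are arbitrary) opens a file for reading and then tries to delete it: on POSIX filesystems the delete normally succeeds because the inode survives until the last descriptor closes, while classic Windows semantics refuse the delete.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class DeleteWhileOpen {
    // Returns true if the file could be deleted while a stream still holds
    // it open. On POSIX systems this is normally true; on Windows the
    // delete is typically refused, which is the behavior the index code
    // relies on and why non-Windows testing matters.
    public static boolean deleteWhileOpen(File f) throws IOException {
        try (FileOutputStream out = new FileOutputStream(f)) {
            out.write(42);
        }
        try (FileInputStream in = new FileInputStream(f)) {
            return f.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("lucene-demo", ".tmp");
        System.out.println("deleted while open: " + deleteWhileOpen(f));
        f.delete(); // clean up in case the delete was refused
    }
}
```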
Excellent point, will do.
I'm now testing a
I am betting that if your remote locking has issues, you will have
similar problems (since your new code requires accurate reading of the
directory to determine the "latest" files). I also believe that
directory reads like this are VERY inefficient in most cases.
OK, I will test the cost
You also have to make sure you test this on non-Windows systems.
A delete on Windows is prevented while the file is open, but non-
Windows systems do not have this limitation, so there is a far greater
chance you will end up with an inconsistent index.
On Aug 18, 2006, at 5:00 PM, Michael McCan
Also, the commit lock is there to allow the merge process to remove
unused segments. Without it, a reader might get halfway through reading
the segments, only to find some missing, and then have to restart
reading. In a highly interactive environment this would be too
inefficient.
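The restart-on-missing-segments behavior the commit lock prevents amounts to a retry loop around the open. A minimal sketch of that shape (the Callable-based API and names are illustrative, not Lucene's actual reader code):

```java
import java.io.FileNotFoundException;
import java.util.concurrent.Callable;

public class RetryOpen {
    // Keep re-trying an open that can fail with FileNotFoundException when
    // a concurrent merge deletes segment files out from under the reader.
    // Without a commit lock, every reader potentially pays this retry cost.
    public static <T> T openWithRetry(Callable<T> open, int maxRetries) throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return open.call();
            } catch (FileNotFoundException e) {
                if (attempt >= maxRetries) throw e;
                // Segments changed mid-read: re-read the segments file and
                // start over from scratch.
            }
        }
    }
}
```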
OK
[ http://issues.apache.org/jira/browse/LUCENE-635?page=comments#action_12429135 ]
Michael McCandless commented on LUCENE-635:
---
OK, does anyone have a strong opinion one way or another on these
small changes?
I would lean towards keepi
Also, the commit lock is there to allow the merge process to remove
unused segments. Without it, a reader might get halfway through
reading the segments, only to find some missing, and then have to
restart reading. In a highly interactive environment this would
be too inefficient.
I am betting that if your remote locking has issues, you will have
similar problems (since your new code requires accurate reading
of the directory to determine the "latest" files). I also believe
that directory reads like this are VERY inefficient in most cases.
I think these proposed
I don't think these changes are going to work. With multiple writers
and/or readers doing deletes, without serializing the writes you will
have inconsistencies - and the del files will need to be unioned.
That is:
station A opens the index
station B opens the index
station A deletes some do
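The "unioned" del files in this scenario amount to OR-ing deletion bit vectors: station A and station B each record deletes against the same index generation, and reconciling them must keep both sets. A minimal illustration with java.util.BitSet (Lucene's own BitVector class is analogous, but this is just a sketch, not its API):

```java
import java.util.BitSet;

public class DelUnion {
    // Merge two stations' .del bit vectors. Taking anything less than the
    // union would make one station's deleted documents silently reappear.
    public static BitSet union(BitSet a, BitSet b) {
        BitSet merged = (BitSet) a.clone();
        merged.or(b);
        return merged;
    }
}
```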
: You can reproduce OutOfMemory easily. I've attached test files - this is
: an altered DistanceSortingTest example from the LIA book. Also you can
: profile it and see the caching of distances arrays.
An OutOfMemory error is different from a memory leak. Sorting with a
custom Comparator does in fact use a l
I don't think these changes are going to work. With multiple writers
and/or readers doing deletes, without serializing the writes you
will have inconsistencies - and the del files will need to be unioned.
That is:
station A opens the index
station B opens the index
station A deletes some do
It could in theory lead to starvation but this should be rare in
practice unless you have an IndexWriter that's constantly committing.
An index with a small mergeFactor (say 2) and a small maxBufferedDocs
(default 10), would have segments deleted every
mergeFactor*maxBufferedDocs when rapidly
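The cadence described above is simple arithmetic, worked through here as an illustration (not Lucene code): a level-one merge fires after mergeFactor buffered segments of maxBufferedDocs docs each, so with the example settings segments become deletable every 2 * 10 = 20 added documents.

```java
public class MergeCadence {
    // Number of added documents between merges that delete the small
    // buffered-doc segments: mergeFactor segments of maxBufferedDocs each
    // must accumulate before they are merged away.
    public static int docsBetweenSegmentDeletes(int mergeFactor, int maxBufferedDocs) {
        return mergeFactor * maxBufferedDocs;
    }
}
```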
: soon too. Just came across this while writing up documentation on
: scoring and thought it sounded like a reasonable and easy fix. I
: know Hoss has done a lot with Explanations, so he may know best if
: there are issues with skipTo and explain. All tests still pass
I can't think of any reas
[ http://issues.apache.org/jira/browse/LUCENE-388?page=all ]
Doron Cohen updated LUCENE-388:
---
Attachment: doron_2b_IndexWriter.patch
Right... actually it should be like this:
int minSegment = segmentInfos.size() - singleDocSegmentsCount - 1;
But sinc
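A worked instance of the expression in the patch may help (the surrounding semantics are my reading of the snippet, not the full patch): with segmentInfos.size() == 5 and singleDocSegmentsCount == 3, the computation yields 5 - 3 - 1 = 1.

```java
public class MinSegmentExpr {
    // Mirrors the expression from doron_2b_IndexWriter.patch:
    // the lowest segment index considered for merging, counting back over
    // the trailing run of single-doc segments.
    public static int minSegment(int numSegments, int singleDocSegmentsCount) {
        return numSegments - singleDocSegmentsCount - 1;
    }
}
```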
On 8/18/06, Michael McCandless <[EMAIL PROTECTED]> wrote:
It could in theory lead to starvation but this should be rare in
practice unless you have an IndexWriter that's constantly committing.
An index with a small mergeFactor (say 2) and a small maxBufferedDocs
(default 10), would have segment
The basic idea is to change all commits (from SegmentReader or
IndexWriter) so that we never write to an existing file that a reader
could be reading from. Instead, always write to a new file name using
sequentially numbered files. For example, for "segments", on every
commit, write to the s
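The sequential-numbering scheme can be sketched as: scan the directory for "segments_N" files, and commit by writing generation N+1 instead of overwriting. The decimal parsing below is a simplification of the proposal, not the committed Lucene implementation (which ended up using base-36 generations):

```java
public class SegmentsGen {
    // Given a directory listing, find the highest existing generation of
    // "segments_N" and return N+1 as the next commit's file name suffix.
    // A reader always opens the highest N it can see, so no file a reader
    // may be holding open is ever overwritten.
    public static long nextGeneration(String[] files) {
        long max = -1;
        for (String f : files) {
            if (f.startsWith("segments_")) {
                try {
                    max = Math.max(max, Long.parseLong(f.substring("segments_".length())));
                } catch (NumberFormatException ignored) {
                    // not a generation-numbered segments file
                }
            }
        }
        return max + 1;
    }
}
```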
Anyone see any reason why I shouldn't make the following commit to
TermScorer explain per Otis' TODO comment on the method: * @todo
Modify to make use of {@link TermDocs#skipTo(int)}.
public Explanation explain(int doc) throws IOException {
TermQuery query = (TermQuery)we
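The skipTo idiom the TODO refers to means jumping directly to the target doc instead of scanning every posting with next(). Here is a generic sketch of that access pattern over a sorted doc-id array (this is not the TermDocs API, which advances an iterator and returns a boolean; it only illustrates why the jump is cheaper for explain(doc)):

```java
public class SkipToDemo {
    // Advance to the first docId >= target in a sorted postings list and
    // return its index, or -1 if no such doc exists. explain(doc) can use
    // this to reach the target document without touching earlier postings.
    public static int skipTo(int[] docs, int target) {
        int lo = 0, hi = docs.length - 1, ans = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (docs[mid] >= target) {
                ans = mid;
                hi = mid - 1;
            } else {
                lo = mid + 1;
            }
        }
        return ans;
    }
}
```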
[ http://issues.apache.org/jira/browse/LUCENE-388?page=comments#action_12429027 ]
Yonik Seeley commented on LUCENE-388:
-
We could also make the following change to flushRamSegments, right?
private final void flushRamSegments() throws IOExc
The basic idea is to change all commits (from SegmentReader or
IndexWriter) so that we never write to an existing file that a reader
could be reading from. Instead, always write to a new file name using
sequentially numbered files. For example, for "segments", on every
commit, write to the seq
[ http://issues.apache.org/jira/browse/LUCENE-388?page=comments#action_12429012 ]
Yonik Seeley commented on LUCENE-388:
-
Thanks Doron, I caught that too and I was just going to set the count to 0 in
mergeSegments (mergeSegments is always cal
I think it's possible to modify Lucene's commit process so that it
does not require any commit locking at all.
This would be a big win because it would prevent all the various messy
errors (FileNotFound exceptions on instantiating an IndexReader,
Access Denied errors on renaming X.new -> X, Lock
[ http://issues.apache.org/jira/browse/LUCENE-388?page=all ]
Doron Cohen updated LUCENE-388:
---
Attachment: doron_2_IndexWriter.patch
The attached doron_2_IndexWriter.patch fixes the updating of
singleDocSegmentsCount to take place in mergeSegments(minS
[ http://issues.apache.org/jira/browse/LUCENE-650?page=comments#action_12428955 ]
Doron Cohen commented on LUCENE-650:
I reviewed this patch and think that it is valid.
This seems like a real bug:
- In FieldSortedHitQueue, when no locale is
[ http://issues.apache.org/jira/browse/LUCENE-388?page=comments#action_12428953 ]
Doron Cohen commented on LUCENE-388:
Well, there is a problem in the current patch after all... the counter is
not decremented when a merge is triggered b
Hi!
Could you please read the following discussion in java-user mail list
- http://www.gossamer-threads.com/lists/lucene/java-user/35352
You can reproduce OutOfMemory easily. I've attached test files - this is
an altered DistanceSortingTest example from the LIA book. Also you can
profile it and see cachi