> Attempting to add documents past limit can corrupt index
>
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>
with the latest SegmentInfos once the writer has been rolled back.
> Attempting to add documents past limit can corrupt index
>
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/bro
@simonw.
I beasted all Lucene tests 113X times and only hit 3 failures from LUCENE-8073.
+1 to push!
> Attempting to add documents past limit can corrupt index
>
>
> Key: LUCENE-8043
>
can corrupt index
>
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
> Issue Type: Bug
> Components:
unrelated failures; look like Similarity issues ... I'll open
separate issues for those.
> Attempting to add documents past limit can corrupt index
>
>
> Key: LUCENE-8043
> URL: https://issues.
is the right one
> Attempting to add documents past limit can corrupt index
>
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Cor
.
> Attempting to add documents past limit can corrupt index
>
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>
spent quite some time to get
the accounting right now. I think it looks good now but I could use some help
beasting / reviewing it.
> Attempting to add documents past limit can corrupt index
>
>
> Key:
; thanks @simonw and
[~ysee...@gmail.com]!
> Attempting to add documents past limit can corrupt index
>
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
>
ing to add documents past limit can corrupt index
>
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
> Issue Type: Bug
every time and very
quickly. It also trips an assertion in the IW that I had to change since I
think it's not guaranteed especially with the setup I am running in the test.
[~mikemccand] can you take a look?
> Attempting to add documents past limit can corrupt in
!!: reader.maxDoc=2 IW.pendingNumDocs=0
After sleep,commit,close reader.maxDoc=2 IW.pendingNumDocs=0
{code}
Still needs to be turned into a proper unit test, preferably w/o any sleeps.
> Attempting to add documents past limit can corrupt in
forward? Can you share it? I will also try to turn your reproduction into a
testcase but maybe we should push the fix first to not be in the way of a
release, WDYT?
> Attempting to add documents past limit can corrupt in
of updates, even after
my attempted fix, was for 4.10.3 / 4.10.4
I forward-ported that code to master and things no longer fail... so I think
this patch is good for recent Lucene versions. Thanks!
> Attempting to add documents past limit can corrupt in
also exposed pendingNumDocs for testing reasons and then tested it against
expected values, and was able to get tests that reliably failed after a handful
of updates. I'll try digging that up and see if it passes with this patch.
> Attempting to add documents past limit can corrupt in
to be called more than once, I
think, so I think your fix is the right approach! Thanks.
> Attempting to add documents past limit can corrupt index
>
>
> Key: LUCENE-8043
> URL: https://is
[
https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Simon Willnauer reassigned LUCENE-8043:
---
Assignee: Simon Willnauer
> Attempting to add documents past limit can corr
o.maxDoc());
- readerPool.drop(info);
}
}{code}
> Attempting to add documents past limit can corrupt index
>
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
>
to reproduce it.
> Attempting to add documents past limit can corrupt index
>
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>
enough to prevent the issue
from reproducing. It reproduces with assertions enabled ({{-ea}}), but no
assertion breaks.
> Attempting to add documents past limit can corrupt index
>
>
> Key: LUCENE-8043
>
ast limit can corrupt index
>
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
> Issue Type: Bug
>
mit can corrupt index
>
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
> Issue Type: Bug
> Components:
harder with trunk than with 4.10, but still happens.
It may have something to do with deletion accounting (and maybe deletes being
deducted *before* merges are actually finished), but that may not explain all
of the types of failures I see.
> Attempting to add documents past limit can corr
Yonik Seeley created LUCENE-8043:
Summary: Attempting to add documents past limit can corrupt index
Key: LUCENE-8043
URL: https://issues.apache.org/jira/browse/LUCENE-8043
Project: Lucene - Core
!
Corrupt index (missing .si file) on first 4.x commit to a 3.x index
---
Key: LUCENE-6287
URL: https://issues.apache.org/jira/browse/LUCENE-6287
Project: Lucene - Core
Issue Type
[
https://issues.apache.org/jira/browse/LUCENE-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14335512#comment-14335512
]
Simon Willnauer commented on LUCENE-6287:
-
LGTM +1 to commit
Corrupt index
, and then when the commit finishes we incRef them with
IFD again and they are permanent after that.
Corrupt index (missing .si file) on first 4.x commit to a 3.x index
---
Key: LUCENE-6287
URL
enough.
I just created a simple 3.x index with many segments, and in the test case
provoke merging and run a concurrent commit. MDW.close then runs check index
which detects the corruption.
Next I'll try to fix the bug ...
Corrupt index (missing .si file) on first 4.x commit to a 3.x index
Michael McCandless created LUCENE-6287:
--
Summary: Corrupt index (missing .si file) on first 4.x commit to a
3.x index
Key: LUCENE-6287
URL: https://issues.apache.org/jira/browse/LUCENE-6287
[
https://issues.apache.org/jira/browse/LUCENE-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved LUCENE-6287.
Resolution: Fixed
Corrupt index (missing .si file) on first 4.x commit to a 3.x
On Wed, Jun 13, 2012 at 8:45 PM, Itamar Syn-Hershko ita...@code972.com wrote:
Mike,
On Wed, Jun 13, 2012 at 7:31 PM, Michael McCandless
luc...@mikemccandless.com wrote:
Hi Itamar,
One quick question: does Lucene.Net include the fixes done for
LUCENE-1044 (to fsync files on commit)? Those
I think the 0-segment segments_1 file is expected in Lucene.Net since
we changed that later, in 3.1 in Lucene (LUCENE-2386)?
Mike McCandless
http://blog.mikemccandless.com
On Thu, Jun 14, 2012 at 8:40 PM, Itamar Syn-Hershko ita...@code972.com wrote:
I can confirm 2.9.4 had autoCommit, but it
Well, the only thing I see is that there is no place where writer.Commit()
is called in the delegate assigned to corpusReader.OnDocument. I know that
lucene is very transactional, and at least in 3.x, the writer will never
auto commit to the index. You can write millions of documents, but if
I'm quite certain this shouldn't happen also when Commit wasn't called.
Mike, can you comment on that?
On Thu, Jun 14, 2012 at 8:03 PM, Christopher Currens
currens.ch...@gmail.com wrote:
Well, the only thing I see is that there is no place where writer.Commit()
is called in the delegate
If this is the case, 2328 probably made its way to Lucene.Net since we are
using the released sources for porting, and we now need to apply 3418 in
the current version.
Itamar: I confirmed that 2328 is in the latest code.
Thanks,
Troy
On Wed, Jun 13, 2012 at 5:45 PM, Itamar Syn-Hershko
I can confirm 2.9.4 had autoCommit, but it is gone in 3.0.3 already, so
Lucene.Net doesn't have autoCommit.
So I don't have autoCommit set to true, but I can clearly see a segments_1
file there along with the other files. If that helps, it always keeps with
the name segments_1 with 32 bytes,
Right: Lucene never autocommits anymore ...
If you create a new index, add a bunch of docs, and things crash
before you have a chance to commit, then there is no index (not even a
0 doc one) in that directory.
Mike McCandless
http://blog.mikemccandless.com
On Thu, Jun 14, 2012 at 1:41 PM,
Not what I'm seeing. I actually see a lot of segments created and merged
while it operates. Expected?
Reminding you, this is 2.9.4 / 3.0.3
On Fri, Jun 15, 2012 at 3:10 AM, Michael McCandless
luc...@mikemccandless.com wrote:
Right: Lucene never autocommits anymore ...
If you create a new
Hmm, OK: in 2.9.4 / 3.0.x, if you open IW on a new directory, it will
make a zero-segment commit. This was changed/fixed in 3.1 with
LUCENE-2386.
In 2.9.x (not 3.0.x) there is still an autoCommit parameter,
defaulting to false, but if you set it to true then IndexWriter will
periodically commit.
Mike, The codebase for lucene.net should be almost identical to java's
3.0.3 release, and LUCENE-1044 is included in that.
Itamar, are you committing the index regularly? I only ask because I can't
reproduce it myself by forcibly terminating the process while it's
indexing. I've tried both
Christopher,
I used the IndexBuilder app from here
https://github.com/synhershko/Talks/tree/master/LuceneNeatThings with an
8.5GB Wikipedia dump.
After running for 2.5 days I had to forcefully close it (infinite loop in
the wiki-markdown parser at 92%, go figure), and the 40-something GB index
I
Hi Java devs,
I'm a Lucene.Net committer, and there is a chance we have a bug in our
FSDirectory implementation that causes indexes to get corrupted when
indexing is cut off while the IW is still open. As it stems from some
retroactive fixes you made, I'd appreciate your feedback.
Correct me if I'm
Hi Itamar,
One quick question: does Lucene.Net include the fixes done for
LUCENE-1044 (to fsync files on commit)? Those are very important for
an index to be intact after OS/JVM crash or power loss.
More responses below:
On Tue, Jun 12, 2012 at 8:20 PM, Itamar Syn-Hershko ita...@code972.com
Mike,
On Wed, Jun 13, 2012 at 7:31 PM, Michael McCandless
luc...@mikemccandless.com wrote:
Hi Itamar,
One quick question: does Lucene.Net include the fixes done for
LUCENE-1044 (to fsync files on commit)? Those are very important for
an index to be intact after OS/JVM crash or power loss.
Yes, reproduced on the first try. See attached program - I referenced it to
current trunk.
On Thu, Jun 14, 2012 at 3:54 AM, Itamar Syn-Hershko ita...@code972.com wrote:
Christopher,
I used the IndexBuilder app from here
https://github.com/synhershko/Talks/tree/master/LuceneNeatThings with a
pass thru checkindex fine, but
then the problematic term happens to have df=1000... this segment has
skipInterval=10
TestFSTs.testRealTerms produces a corrupt index
---
Key: LUCENE-3037
URL: https://issues.apache.org
-Dtestmethod=testLongPostings
-Dtests.seed=7951749419575219720:5894148324166575802 -Dtests.codec=MockRandom
TestFSTs.testRealTerms produces a corrupt index
---
Key: LUCENE-3037
URL: https://issues.apache.org/jira/browse
docFreq is a skipInterval^N, in this
case docFreq=243, skipInterval=3
TestFSTs.testRealTerms produces a corrupt index
---
Key: LUCENE-3037
URL: https://issues.apache.org/jira/browse/LUCENE-3037
Project
a corrupt index
---
Key: LUCENE-3037
URL: https://issues.apache.org/jira/browse/LUCENE-3037
Project: Lucene - Java
Issue Type: Bug
Reporter: Robert Muir
Attachments: LUCENE-3037
with skipInterval=16) are unaffected.
TestFSTs.testRealTerms produces a corrupt index
---
Key: LUCENE-3037
URL: https://issues.apache.org/jira/browse/LUCENE-3037
Project: Lucene - Java
Issue Type: Bug
the issue open in case someone is
willing to write an integer version here (which would be greatly preferred!)
TestFSTs.testRealTerms produces a corrupt index
---
Key: LUCENE-3037
URL: https://issues.apache.org/jira
produces a corrupt index
---
Key: LUCENE-3037
URL: https://issues.apache.org/jira/browse/LUCENE-3037
Project: Lucene - Java
Issue Type: Bug
Reporter: Robert Muir
Fix
(log(n)/log(m)) = p can be
transformed into a simple equation: find first 'p' for which n = m^p is true.
Assuming m^p grows quickly, this can be computed in a tight loop, quite fast
and with an exact result.
TestFSTs.testRealTerms produces a corrupt index
produces a corrupt index
---
Key: LUCENE-3037
URL: https://issues.apache.org/jira/browse/LUCENE-3037
Project: Lucene - Java
Issue Type: Bug
Reporter: Robert Muir
Fix For: 4.0
particularly important here though: when initting the skipper at this
place we are doing i/o... it's more important that the calculation isn't wrong.
TestFSTs.testRealTerms produces a corrupt index
---
Key: LUCENE-3037
URL
static int log(int x, int base) {
  assert base >= 2;
  int ret = 0;
  long n = base; // needs to be a long to avoid overflow
  while (x >= n) {
    n *= base;
    ret++;
  }
  return ret;
}
{code}
TestFSTs.testRealTerms produces a corrupt index
it?
TestFSTs.testRealTerms produces a corrupt index
---
Key: LUCENE-3037
URL: https://issues.apache.org/jira/browse/LUCENE-3037
Project: Lucene - Java
Issue Type: Bug
Reporter
floating point was good. StrictMath would not
have eliminated all of the errors.
For example log(380204032,52) using StrictMath still returns 4, while the
integer versions correctly return 5.
TestFSTs.testRealTerms produces a corrupt index
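The comparison described above (a floating-point logarithm versus an integer loop) can be tried out with a small sketch. This is only an illustration under stated assumptions: the class name `IntLogSketch` and the helpers `intLog` and `floatLog` are hypothetical and are not Lucene's API. The integer version mirrors the loop posted in this thread; the floating-point version is one plausible way such a calculation might be written, included only for comparison.

```java
final class IntLogSketch {
    // Hypothetical helper (not Lucene's API): returns the largest p
    // such that base^p <= x, using integer arithmetic only.
    static int intLog(int x, int base) {
        if (base < 2) {
            throw new IllegalArgumentException("base must be >= 2");
        }
        int ret = 0;
        long n = base; // long, so that n *= base cannot overflow
        while (x >= n) {
            n *= base;
            ret++;
        }
        return ret;
    }

    // Floating-point variant, shown only for comparison. The truncated
    // quotient can be off by one when x is an exact power of base.
    static int floatLog(int x, int base) {
        return (int) (StrictMath.log(x) / StrictMath.log(base));
    }

    public static void main(String[] args) {
        int x = 380204032; // 52^5, the example quoted in the thread
        System.out.println("integer:  " + intLog(x, 52));
        System.out.println("floating: " + floatLog(x, 52));
    }
}
```

Running `main` prints both results side by side, so the case reported in the thread (and the earlier docFreq=243, skipInterval=3 case) can be checked on a given JVM.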
, as there
aren't any problems with a base of 16.
TestFSTs.testRealTerms produces a corrupt index
---
Key: LUCENE-3037
URL: https://issues.apache.org/jira/browse/LUCENE-3037
Project: Lucene - Java
Issue
;
b_hi = b_mid;
} else if (b_mid < x) {
lo = mid;
b_lo = b_mid;
} else {
return mid;
}
}
return b_hi == x ? hi : lo;
}
{code}
TestFSTs.testRealTerms produces a corrupt index
---
Key
: 1139ms for 1 iterations.
log3: 1697ms for 1 iterations.
{noformat}
TestFSTs.testRealTerms produces a corrupt index
---
Key: LUCENE-3037
URL: https://issues.apache.org/jira/browse/LUCENE-3037
calculation is less than
one-thousandth of the time spent in this skipper init
Yeah, just having fun.
TestFSTs.testRealTerms produces a corrupt index
---
Key: LUCENE-3037
URL: https://issues.apache.org/jira/browse
of the skipping too :)
TestFSTs.testRealTerms produces a corrupt index
---
Key: LUCENE-3037
URL: https://issues.apache.org/jira/browse/LUCENE-3037
Project: Lucene - Java
Issue Type: Bug
too :)
Well, the code I posted is *slower*, so I'm afraid that if history is any guide
here, I can only promise pessimization...
TestFSTs.testRealTerms produces a corrupt index
---
Key: LUCENE-3037
URL: https
TestFSTs.testRealTerms produces a corrupt index
---
Key: LUCENE-3037
URL: https://issues.apache.org/jira/browse/LUCENE-3037
Project: Lucene - Java
Issue Type: Bug
Reporter: Robert Muir
/Lucene-trunk/1533/testReport/junit/org.apache.lucene.util.automaton.fst/TestFSTs/testRealTerms/
for the exception.
TestFSTs.testRealTerms produces a corrupt index
---
Key: LUCENE-3037
URL: https://issues.apache.org/jira
is the broken segment (3 files).
I played around with disabling dpenum/skipper reuse to figure out why we have
a bad offset in the skip, but I didn't get anywhere...
TestFSTs.testRealTerms produces a corrupt index
---
Key: LUCENE-3037
UDIDIndexWriter keeps write lock on corrupt index
-
Key: LUCENE-3015
URL: https://issues.apache.org/jira/browse/LUCENE-3015
Project: Lucene - Java
Issue Type: Bug
Components: Index
no class UDIDIndexWriter, so maybe
that's an external customization.
If this is the case, I will close the issue.
UDIDIndexWriter keeps write lock on corrupt index
-
Key: LUCENE-3015
URL: https://issues.apache.org
are right.
I'll close the ticket.
UDIDIndexWriter keeps write lock on corrupt index
-
Key: LUCENE-3015
URL: https://issues.apache.org/jira/browse/LUCENE-3015
Project: Lucene - Java
Issue Type: Bug
in a newer commit point are sync'd.
Corrupt index: term out of order after forced stop during indexing
---
Key: LUCENE-1037
URL: https://issues.apache.org/jira/browse/LUCENE-1037
Project
Corrupt index: term out of order after forced stop during indexing
---
Key: LUCENE-1037
URL: https://issues.apache.org/jira/browse/LUCENE-1037
Project: Lucene - Java
Issue
two separate
causes that could lead to exceptions like this in 2.0.0.
ArrayIndexOutOfBoundsException in BitVector.get whilst adding a doc resulting
in corrupt index
--
Key: LUCENE-784
[ http://issues.apache.org/jira/browse/LUCENE-702?page=all ]
Michael McCandless resolved LUCENE-702.
---
Fix Version/s: 2.1
Resolution: Fixed
Disk full during addIndexes(Directory[]) can corrupt index
.
* Improved how MockRAMDirectory tracks/enforces max usage.
* Other small fixes to unit test.
Disk full during addIndexes(Directory[]) can corrupt index
--
Key: LUCENE-702
URL: http
, right?
The write once semantics is not kept for segment names or .delN. This
is ok because no reader will read the old versions.
Disk full during addIndexes(Directory[]) can corrupt index
--
Key: LUCENE-702
no segments_N references them).
Disk full during addIndexes(Directory[]) can corrupt index
--
Key: LUCENE-702
URL: http://issues.apache.org/jira/browse/LUCENE-702
Project: Lucene - Java
Issue
addIndexes(Directory[]) can corrupt index
--
Key: LUCENE-702
URL: http://issues.apache.org/jira/browse/LUCENE-702
Project: Lucene - Java
Issue Type: Bug
Components: Index
?
Disk full during addIndexes(Directory[]) can corrupt index
--
Key: LUCENE-702
URL: http://issues.apache.org/jira/browse/LUCENE-702
Project: Lucene - Java
Issue Type: Bug
one commit at the end which captures all the changes.
Disk full during addIndexes(Directory[]) can corrupt index
--
Key: LUCENE-702
URL: http://issues.apache.org/jira/browse/LUCENE-702
index.
Disk full during addIndexes(Directory[]) can corrupt index
--
Key: LUCENE-702
URL: http://issues.apache.org/jira/browse/LUCENE-702
Project: Lucene - Java
Issue Type: Bug
(Directory[]) can corrupt index
--
Key: LUCENE-702
URL: http://issues.apache.org/jira/browse/LUCENE-702
Project: Lucene - Java
Issue Type: Bug
Components: Index
Affects Versions
Disk full during addIndexes(Directory[]) can corrupt index
--
Key: LUCENE-702
URL: http://issues.apache.org/jira/browse/LUCENE-702
Project: Lucene - Java
Issue Type: Bug