[
https://issues.apache.org/jira/browse/LUCENE-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12769946#action_12769946
]
Uwe Schindler edited comment on LUCENE-1960 at 10/26/09 9:05 AM:
-
I do not know if this is a bug in 2.9.0, but it seems that segments with all
documents deleted are not automatically removed:
{noformat}
4 of 14: name=_dlo docCount=5
compound=true
hasProx=true
numFiles=2
size (MB)=0.059
diagnostics = {java.version=1.5.0_21, lucene.version=2.9.0 817268P -
2009-09-21 10:25:09, os=SunOS,
os.arch=amd64, java.vendor=Sun Microsystems Inc., os.version=5.10,
source=flush}
has deletions [delFileName=_dlo_1.del]
test: open reader.OK [5 deleted docs]
test: fields..OK [136 fields]
test: field norms.OK [136 fields]
test: terms, freq, prox...OK [1698 terms; 4236 terms/docs pairs; 0 tokens]
test: stored fields...OK [0 total field count; avg ? fields per doc]
test: term vectorsOK [0 total vector count; avg ? term/freq vector
fields per doc]
{noformat}
Shouldn't such segments be removed automatically during the next
*commit*/close of IndexWriter?
But this would be another issue. In my opinion, we are fine with the current
approach: the longer optimization time is justified by the larger index size
now that there is no compression anymore, and the heavier initial merge without
addRawDocument is only 30% slower (and only once!).
+1 for committing
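For anyone who wants to look for segments like _dlo in their own index, here is a minimal sketch that scans saved CheckIndex text output and reports segments whose deleted-doc count equals docCount. It relies only on the two line shapes visible in the output above ("name=... docCount=..." and "[N deleted docs]"). The class name FullyDeletedSegments and the method find are hypothetical helpers for illustration, not part of Lucene.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Lucene class): scans CheckIndex text output.
public class FullyDeletedSegments {

    // Segment header line, e.g. "4 of 14: name=_dlo docCount=5".
    private static final Pattern SEGMENT =
        Pattern.compile("name=(\\S+)\\s+docCount=(\\d+)");

    // Reader test line, e.g. "test: open reader.OK [5 deleted docs]".
    private static final Pattern DELETED =
        Pattern.compile("\\[(\\d+) deleted docs\\]");

    /** Returns the names of segments in which every document is deleted. */
    public static List<String> find(String checkIndexOutput) {
        List<String> result = new ArrayList<>();
        String name = null;   // segment currently being read
        int docCount = -1;
        for (String line : checkIndexOutput.split("\\R")) {
            Matcher seg = SEGMENT.matcher(line);
            if (seg.find()) {
                name = seg.group(1);
                docCount = Integer.parseInt(seg.group(2));
                continue;
            }
            Matcher del = DELETED.matcher(line);
            if (name != null && del.find()) {
                int deleted = Integer.parseInt(del.group(1));
                if (docCount > 0 && deleted == docCount) {
                    result.add(name);
                }
                name = null;  // wait for the next segment header
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
            "4 of 14: name=_dlo docCount=5",
            "has deletions [delFileName=_dlo_1.del]",
            "test: open reader.OK [5 deleted docs]");
        System.out.println(find(sample)); // prints [_dlo]
    }
}
```

Segments without deletions print no "[N deleted docs]" line, so the sketch simply skips them when the next segment header arrives.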
was (Author: thetaphi):
I do not know if this is a bug in 2.9.0, but it seems that segments with
all documents deleted are not automatically removed:
{code}
2009-10-24 17:08:15,264 INFO org.apache.lucene.index.CheckIndex - 4 of 14:
name=_dlo docCount=5
2009-10-24 17:08:15,264 INFO org.apache.lucene.index.CheckIndex -
compound=true
2009-10-24 17:08:15,264 INFO org.apache.lucene.index.CheckIndex -
hasProx=true
2009-10-24 17:08:15,264 INFO org.apache.lucene.index.CheckIndex - numFiles=2
2009-10-24 17:08:15,265 INFO org.apache.lucene.index.CheckIndex - size
(MB)=0.059
2009-10-24 17:08:15,265 INFO org.apache.lucene.index.CheckIndex -
diagnostics = {java.version=1.5.0_21, lucene.version=2.9.0 817268P - 2009-09-21
10:25:09, os=SunOS, os.arch=amd64, java.vendor=Sun Microsystems Inc.,
os.version=5.10, source=flush}
2009-10-24 17:08:15,265 INFO org.apache.lucene.index.CheckIndex - has
deletions [delFileName=_dlo_1.del]
2009-10-24 17:08:15,356 INFO org.apache.lucene.index.CheckIndex - test:
open reader.OK [5 deleted docs]
2009-10-24 17:08:15,356 INFO org.apache.lucene.index.CheckIndex - test:
fields..OK [136 fields]
2009-10-24 17:08:15,357 INFO org.apache.lucene.index.CheckIndex - test:
field norms.OK [136 fields]
2009-10-24 17:08:15,372 INFO org.apache.lucene.index.CheckIndex - test:
terms, freq, prox...OK [1698 terms; 4236 terms/docs pairs; 0 tokens]
2009-10-24 17:08:15,373 INFO org.apache.lucene.index.CheckIndex - test:
stored fields...OK [0 total field count; avg ? fields per doc]
2009-10-24 17:08:15,373 INFO org.apache.lucene.index.CheckIndex - test:
term vectorsOK [0 total vector count; avg ? term/freq vector fields per
doc]
{code}
Shouldn't such segments be removed automatically during the next merge?
But this would be another issue. In my opinion, we are fine with the current
approach: the longer optimization time is justified by the larger index size
now that there is no compression anymore, and the heavier initial merge without
addRawDocument is only 30% slower (and only once!).
+1 for committing
> Remove deprecated Field.Store.COMPRESS
> --
>
> Key: LUCENE-1960
> URL: https://issues.apache.org/jira/browse/LUCENE-1960
> Project: Lucene - Java
> Issue Type: Task
> Reporter: Michael Busch
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 3.0
>
> Attachments: lucene-1960-1.patch, lucene-1960-1.patch,
> lucene-1960.patch, optimize-time.txt
>
>
> Also remove FieldForMerge and related code.