[jira] Commented: (LUCENE-2010) Remove segments with all documents deleted in commit/flush/close of IndexWriter instead of waiting until a merge occurs.

2011-01-25 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12986602#action_12986602
 ] 

Uwe Schindler commented on LUCENE-2010:
---

Thanks for taking care!

> Remove segments with all documents deleted in commit/flush/close of 
> IndexWriter instead of waiting until a merge occurs.
> 
>
> Key: LUCENE-2010
> URL: https://issues.apache.org/jira/browse/LUCENE-2010
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Affects Versions: 2.9
> Reporter: Uwe Schindler
> Assignee: Michael McCandless
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2010.patch
>
>
> I do not know if this is a bug in 2.9.0, but it seems that segments with all 
> documents deleted are not automatically removed:
> {noformat}
> 4 of 14: name=_dlo docCount=5
>   compound=true
>   hasProx=true
>   numFiles=2
>   size (MB)=0.059
>   diagnostics = {java.version=1.5.0_21, lucene.version=2.9.0 817268P - 2009-09-21 10:25:09, os=SunOS, os.arch=amd64, java.vendor=Sun Microsystems Inc., os.version=5.10, source=flush}
>   has deletions [delFileName=_dlo_1.del]
>   test: open reader.........OK [5 deleted docs]
>   test: fields..............OK [136 fields]
>   test: field norms.........OK [136 fields]
>   test: terms, freq, prox...OK [1698 terms; 4236 terms/docs pairs; 0 tokens]
>   test: stored fields.......OK [0 total field count; avg ? fields per doc]
>   test: term vectors........OK [0 total vector count; avg ? term/freq vector fields per doc]
> {noformat}
> Shouldn't such segments be removed automatically during the next 
> commit/close of IndexWriter?
> *Mike McCandless:*
> Lucene doesn't actually short-circuit this case, ie, if every single doc in a 
> given segment has been deleted, it will still merge it [away] like normal, 
> rather than simply dropping it immediately from the index, which I agree 
> would be a simple optimization. Can you open a new issue? I would think IW 
> can drop such a segment immediately (ie not wait for a merge or optimize) on 
> flushing new deletes.
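
As a rough illustration of the optimization Mike describes above, the following self-contained Java sketch filters out segments whose deleted-doc count equals their doc count. It does not use Lucene: the `Segment` class, the `dropFullyDeleted` helper, and the `_dln` segment name are hypothetical stand-ins, and the real change is whatever the attached LUCENE-2010.patch does inside IndexWriter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DropFullyDeletedSketch {

    /** Toy stand-in for a segment: name, total docs, deleted docs. Not a Lucene class. */
    static final class Segment {
        final String name;
        final int docCount;
        final int delCount;

        Segment(String name, int docCount, int delCount) {
            this.name = name;
            this.docCount = docCount;
            this.delCount = delCount;
        }

        /** True when every document in the segment is marked deleted. */
        boolean fullyDeleted() {
            return delCount == docCount;
        }
    }

    /** Returns a new list without the fully deleted segments (hypothetical helper). */
    static List<Segment> dropFullyDeleted(List<Segment> segments) {
        List<Segment> kept = new ArrayList<>();
        for (Segment s : segments) {
            if (!s.fullyDeleted()) {
                kept.add(s);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Segment> segments = Arrays.asList(
            new Segment("_dln", 12, 3),  // hypothetical segment with live docs
            new Segment("_dlo", 5, 5));  // mirrors the report: 5 docs, 5 deleted
        for (Segment s : dropFullyDeleted(segments)) {
            System.out.println(s.name);  // prints _dln
        }
    }
}
```

In this model the check runs over a plain list; in the issue the equivalent check would presumably run when IndexWriter flushes new deletes, so no merge is needed to reclaim the segment.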

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2010) Remove segments with all documents deleted in commit/flush/close of IndexWriter instead of waiting until a merge occurs.

2011-01-25 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12986346#action_12986346
 ] 

Michael McCandless commented on LUCENE-2010:


bq. Do you want to fix the rest of the tests and remove the test-only 
keepAllSegments method?

It's actually only the QueryUtils test class that uses this... it makes an 
"empty" index by adding N docs and then deleting them all.  So the test-only 
API needs to be public (QueryUtils is in oal.search).  I'll mark it as 
lucene.internal...
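
To make the test scenario concrete, here is a toy model (not Lucene code) of a writer that drops fully deleted segments on commit unless a test-only switch is set. `ToyWriter`, `setKeepFullyDeletedSegments`, and the other names are assumptions for illustration only; the actual signature of keepAllSegments is not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class KeepAllSegmentsSketch {

    /** Toy writer, purely illustrative; it shares nothing with IndexWriter's real API. */
    static final class ToyWriter {
        // Each entry is {docCount, delCount} for one segment.
        private final List<int[]> segments = new ArrayList<>();
        private boolean keepFullyDeletedSegments = false;

        /** Hypothetical test-only switch, standing in for the hook discussed above. */
        void setKeepFullyDeletedSegments(boolean keep) {
            this.keepFullyDeletedSegments = keep;
        }

        void addSegment(int docCount) {
            segments.add(new int[] {docCount, 0});
        }

        /** Marks every document in every segment as deleted. */
        void deleteAllDocs() {
            for (int[] s : segments) {
                s[1] = s[0];
            }
        }

        /** Drops fully deleted segments unless the test-only switch is on. */
        void commit() {
            if (!keepFullyDeletedSegments) {
                segments.removeIf(s -> s[1] == s[0]);
            }
        }

        int segmentCount() {
            return segments.size();
        }
    }

    /** Builds an "empty" index the way the test does: add docs, then delete them all. */
    static int segmentsAfterDeletingEverything(boolean keep) {
        ToyWriter w = new ToyWriter();
        w.setKeepFullyDeletedSegments(keep);
        w.addSegment(5);
        w.deleteAllDocs();
        w.commit();
        return w.segmentCount();
    }

    public static void main(String[] args) {
        System.out.println(segmentsAfterDeletingEverything(false)); // prints 0
        System.out.println(segmentsAfterDeletingEverything(true));  // prints 1
    }
}
```

The sketch shows why a test that relies on a segment full of deleted docs needs such a switch: with the default behavior the segment disappears at commit, so there is nothing left for the test to search.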




[jira] Commented: (LUCENE-2010) Remove segments with all documents deleted in commit/flush/close of IndexWriter instead of waiting until a merge occurs.

2011-01-24 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12986267#action_12986267
 ] 

Uwe Schindler commented on LUCENE-2010:
---

Looks fine to me! It's indeed quite simple. Will test this later.

Do you want to fix the rest of the tests and remove the test-only 
keepAllSegments method? At least this method should be hidden by a 
package-private accessor or, if not possible, @lucene.internal.
