[ 
https://issues.apache.org/jira/browse/SOLR-7820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641543#comment-14641543
 ] 

Ramkumar Aiyengar commented on SOLR-7820:
-----------------------------------------

I agree there are a few issues here, but deleting the current index 
just brushes them all under the carpet and adds risk.

 - The current default of 100 updates for {{UpdateLog}} is insufficient in 
many cases. I made that number configurable, so if the gap is a few thousand 
updates, just tweaking it might work. But I think {{UpdateLog}} has scaling 
limitations, so YMMV. I thought {{CdcrUpdateLog}} came about to overcome this 
scaling limitation, but I haven't looked at it enough to know whether it can 
replace {{UpdateLog}}; perhaps [~erickerickson] or [~ysee...@gmail.com] know.
 - The other thing that could vastly improve this situation, even when a full 
recovery is needed, is synchronizing commits across replicas, since recovery 
skips segments already present in the current index. I believe [~varunthacker] 
was looking at this, but I can't find the issue now.
 - Regardless, I agree that it would be a good enhancement to calculate ahead 
of time how much space is needed for recovery and cleanly abort instead of 
trying and running out of space.
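
As a rough illustration of that last point, a pre-flight check could compare the bytes still to be fetched against the usable space on the index volume and refuse to start if they do not fit. This is only a sketch, not Solr code: the class name {{RecoverySpaceCheck}}, the methods {{fits}} and {{canFetch}}, the 10% safety margin, and the 10 GiB download size are all hypothetical, and how the fetcher would actually obtain the number of bytes to download is left open.

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical pre-flight disk check; not part of Solr's IndexFetcher. */
public class RecoverySpaceCheck {

    /**
     * Pure decision: do bytesToDownload, padded by safetyMargin
     * (0.10 means 10% headroom), fit into usableBytes?
     */
    public static boolean fits(long usableBytes, long bytesToDownload, double safetyMargin) {
        if (usableBytes < 0 || bytesToDownload < 0 || safetyMargin < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        double required = bytesToDownload * (1.0 + safetyMargin);
        return required <= usableBytes;
    }

    /**
     * Checks the volume that holds indexDir. Assumption: bytesToDownload
     * already excludes segments present locally, since recovery skips those.
     */
    public static boolean canFetch(Path indexDir, long bytesToDownload, double safetyMargin)
            throws IOException {
        FileStore store = Files.getFileStore(indexDir);
        return fits(store.getUsableSpace(), bytesToDownload, safetyMargin);
    }

    public static void main(String[] args) throws IOException {
        Path indexDir = Paths.get(args.length > 0 ? args[0] : ".");
        long incoming = 10L * 1024 * 1024 * 1024; // hypothetical: 10 GiB to fetch
        if (!canFetch(indexDir, incoming, 0.10)) {
            // Abort cleanly before any file is downloaded.
            System.err.println("Not enough disk space for recovery; aborting.");
            System.exit(1);
        }
        System.out.println("Enough disk space; recovery could proceed.");
    }
}
```

Keeping the arithmetic in {{fits}} separate from the filesystem lookup makes the decision easy to test without a real volume. The usable-space figure is only a snapshot, so a real implementation would still need to handle running out of space mid-download.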


> IndexFetcher should delete the current index directory before downloading the 
> new index when isFullCopyNeeded==true
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-7820
>                 URL: https://issues.apache.org/jira/browse/SOLR-7820
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication (java)
>            Reporter: Timothy Potter
>
> When a replica is trying to recover and its IndexFetcher decides it needs to 
> pull the full index from a peer (isFullCopyNeeded == true), then the existing 
> index directory should be deleted before the full copy is started to free up 
> disk to pull a fresh index, otherwise the server will potentially need 2x the 
> disk space (old + incoming new). Currently, the IndexFetcher removes the 
> index directory after the new one is downloaded; however, once the fetcher 
> decides a full copy is needed, what is the value of the existing index? It's 
> clearly out-of-date and should not serve queries. Since we're deleting data 
> preemptively, maybe this should be an advanced configuration property, only 
> to be used by those that are disk-space constrained (which I'm seeing more 
> and more with people deploying high-end SSDs - they typically don't have 2x 
> the disk capacity required by an index).
