[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2019-05-15 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840792#comment-16840792
 ] 

ASF subversion and git services commented on SOLR-12999:


Commit bf8c6ea435a39564d2a7de5c37cc9e522ca5b9cb in lucene-solr's branch 
refs/heads/jira/SOLR-13468 from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=bf8c6ea ]

SOLR-12999: Harden TestReplicationHandlerDiskOverFlow against sporadic timing 
failures

 - ensure IndexFetcher injection is reset in @After method
 - replace System.out with Logger
 - Log and fail on any exceptions in any callbacks/threads
 - use CyclicBarrier (instead of CountDownLatch) so the query-thread loop can't
   monopolize the CPU and prevent the IndexFetcher callback from ever being run

(Some of these improvements directly address Jenkins failures we've been seeing)
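For readers unfamiliar with the pattern, here is a minimal, self-contained sketch (not the actual test code) of the hand-off the last bullet describes: with a two-party CyclicBarrier, the query loop and the fetcher callback must rendezvous on every iteration, so neither thread can spin ahead and starve the other. All class and variable names here are illustrative.

{code:java}
import java.util.concurrent.CyclicBarrier;

// Sketch of the CyclicBarrier hand-off described above; names are illustrative.
public class BarrierHandOffSketch {
  private static final CyclicBarrier BARRIER = new CyclicBarrier(2);

  public static void main(String[] args) throws Exception {
    Thread queryThread = new Thread(() -> {
      try {
        for (int i = 0; i < 3; i++) {
          System.out.println("query iteration " + i);
          BARRIER.await();            // block until the callback thread has also arrived
        }
      } catch (Exception e) {
        throw new RuntimeException(e);
      }
    });
    Thread callbackThread = new Thread(() -> {
      try {
        for (int i = 0; i < 3; i++) {
          System.out.println("fetcher callback " + i);
          BARRIER.await();            // release the query loop for its next pass
        }
      } catch (Exception e) {
        throw new RuntimeException(e);
      }
    });
    queryThread.start();
    callbackThread.start();
    queryThread.join();
    callbackThread.join();
  }
}
{code}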


> Index replication could delete segments first
> -
>
> Key: SOLR-12999
> URL: https://issues.apache.org/jira/browse/SOLR-12999
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: replication (java)
>Reporter: David Smiley
>Assignee: Noble Paul
>Priority: Major
> Fix For: 8.1
>
> Attachments: SOLR-12999.patch, SOLR-12999.patch
>
>
> Index replication could optionally delete files that it knows will not be 
> needed _first_.  This would reduce the disk capacity requirements of Solr, and it 
> would reduce some disk fragmentation when space gets tight.
> Solr (IndexFetcher) already grabs the remote file list, and it could see 
> which files it has locally, then delete the others.  Today it asks Lucene to 
> {{deleteUnusedFiles}} at the end.  This new mode would probably only be 
> useful if there is no SolrIndexSearcher open, since an open searcher would 
> prevent the removal of files.
> The motivating scenario is a SolrCloud replica that is going into full 
> recovery.  It ought not to be fielding searches.  The code changes would not 
> depend on SolrCloud, though.
> This option would have some danger the user should be aware of.  If the 
> replication fails, leaving the local files incomplete/corrupt, the only 
> recourse is to try full replication again.  You can't just give up and field 
> queries.
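For illustration only, a hedged sketch of the "see which files it has locally, then delete the others" step the description outlines: diff the remote file list against the local directory and remove local files the master no longer references, before any download starts. The class and method names are hypothetical; the real logic would live in IndexFetcher and is not shown here.

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Stream;

// Hypothetical sketch of the "delete unneeded local files first" step.
// 'remoteFileNames' would come from the master's file-list response.
public final class DeleteFirstSketch {

  static void deleteFilesNotOnMaster(Path localIndexDir, Set<String> remoteFileNames) throws IOException {
    Set<Path> toDelete = new HashSet<>();
    try (Stream<Path> local = Files.list(localIndexDir)) {
      local.filter(p -> !remoteFileNames.contains(p.getFileName().toString()))
           .forEach(toDelete::add);
    }
    for (Path p : toDelete) {
      Files.deleteIfExists(p);   // reclaim space before any segment download begins
    }
  }
}
{code}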






[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2019-05-14 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839813#comment-16839813
 ] 

Hoss Man commented on SOLR-12999:
-

FYI: while fixing TestReplicationHandlerDiskOverFlow to prevent some 
concurrency-related failures that have popped up in recent Jenkins builds, I've 
uncovered 3 related issues that I've filed, which people following this may be 
interested in...

* SOLR-13469 - rejected requests during a full IndexFetch should not use a 403 
response code
* SOLR-13470 - SolrException msg not always propagated to HttpClient (may be 
specific to SOLR-12999?)
* SOLR-13471 - javabin codec can marshal data that it can't unmarshal 
(ClassCastException: class java.lang.Boolean cannot be cast to class 
java.lang.String)
 




[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2019-05-14 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839769#comment-16839769
 ] 

ASF subversion and git services commented on SOLR-12999:


Commit 371d645e6add019675ab3c468b34e8b6ea711dbd in lucene-solr's branch 
refs/heads/branch_8x from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=371d645 ]

SOLR-12999: Harden TestReplicationHandlerDiskOverFlow against sporadic timing 
failures

 - ensure IndexFetcher injection is reset in @After method
 - replace System.out with Logger
 - Log and fail on any exceptions in any callbacks/threads
 - use CyclicBarrier (instead of CountDownLatch) so the query-thread loop can't
   monopolize the CPU and prevent the IndexFetcher callback from ever being run

(Some of these improvements directly address Jenkins failures we've been seeing)

(cherry picked from commit bf8c6ea435a39564d2a7de5c37cc9e522ca5b9cb)





[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2019-05-14 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839768#comment-16839768
 ] 

ASF subversion and git services commented on SOLR-12999:


Commit 85346266f8c1b28f69441542be9b91cceb3bc9e3 in lucene-solr's branch 
refs/heads/branch_8_1 from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8534626 ]

SOLR-12999: Harden TestReplicationHandlerDiskOverFlow against sporadic timing 
failures

 - ensure IndexFetcher injection is reset in @After method
 - replace System.out with Logger
 - Log and fail on any exceptions in any callbacks/threads
 - use CyclicBarrier (instead of CountDownLatch) so the query-thread loop can't
   monopolize the CPU and prevent the IndexFetcher callback from ever being run

(Some of these improvements directly address Jenkins failures we've been seeing)

(cherry picked from commit bf8c6ea435a39564d2a7de5c37cc9e522ca5b9cb)





[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2019-05-14 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839765#comment-16839765
 ] 

ASF subversion and git services commented on SOLR-12999:


Commit bf8c6ea435a39564d2a7de5c37cc9e522ca5b9cb in lucene-solr's branch 
refs/heads/master from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=bf8c6ea ]

SOLR-12999: Harden TestReplicationHandlerDiskOverFlow against sporadic timing 
failures

 - ensure IndexFetcher injection is reset in @After method
 - replace System.out with Logger
 - Log and fail on any exceptions in any callbacks/threads
 - use CyclicBarrier (instead of CountDownLatch) so the query-thread loop can't
   monopolize the CPU and prevent the IndexFetcher callback from ever being run

(Some of these improvements directly address Jenkins failures we've been seeing)





[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2019-02-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16761699#comment-16761699
 ] 

ASF subversion and git services commented on SOLR-12999:


Commit 1d13d3df03a1b7f6513629b2aaab45f4e0d36006 in lucene-solr's branch 
refs/heads/master from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1d13d3d ]

SOLR-12999: Index replication could delete segments before downloading segments 
from master if there is not enough disk space





[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2019-02-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16761698#comment-16761698
 ] 

ASF subversion and git services commented on SOLR-12999:


Commit b061947e9103ddbedd36ff684ce65b39ef263229 in lucene-solr's branch 
refs/heads/master from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b061947 ]

SOLR-12999: Index replication could delete segments before downloading segments 
from master if there is not enough disk space





[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2019-02-06 Thread Jan Høydahl (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16761561#comment-16761561
 ] 

Jan Høydahl commented on SOLR-12999:


Note: This is not yet in the master branch...




[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2019-02-02 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16758983#comment-16758983
 ] 

Noble Paul commented on SOLR-12999:
---

Right, I'll be glad to have somebody take a look at the changes. IndexFetcher 
is really complex.




[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2019-02-01 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16758679#comment-16758679
 ] 

David Smiley commented on SOLR-12999:
-

The fix-version was 8.0, which was incorrect.  Just now I created an "8.x" 
version in JIRA's admin for both Lucene & Solr in accordance with our release 
process [https://wiki.apache.org/lucene-java/ReleaseTodo#Add_New_JIRA_Versions].
[~jimczi] seems like you forgot this?

I am still concerned about the lack of code review for a component as gnarly as 
IndexFetcher, but this isn't a blocker.




[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2019-02-01 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16758664#comment-16758664
 ] 

Noble Paul commented on SOLR-12999:
---

It's not in 8.0; it's in the 8.x branch.




[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2019-02-01 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16758658#comment-16758658
 ] 

David Smiley commented on SOLR-12999:
-

Woah; why did this land in 8.0 and without a code review?  That's a no-no 
during a release.




[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2019-01-31 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16757728#comment-16757728
 ] 

ASF subversion and git services commented on SOLR-12999:


Commit 34da61e863c459ae48103a3f6e1dacb76f14cd23 in lucene-solr's branch 
refs/heads/branch_8x from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=34da61e ]

SOLR-12999: Index replication could delete segments before downloading segments 
from master if there is not enough disk space





[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2019-01-31 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16757727#comment-16757727
 ] 

ASF subversion and git services commented on SOLR-12999:


Commit d3c686aa242e8b6ff8363244dd6267e1e51ff4fa in lucene-solr's branch 
refs/heads/branch_8x from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d3c686a ]

SOLR-12999: Index replication could delete segments before downloading segments 
from master if there is not enough disk space





[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2019-01-15 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743575#comment-16743575
 ] 

Noble Paul commented on SOLR-12999:
---

bq. Can you please clarify what you mean by "full replication"?

A full replication is performed when there are segments in the master index with 
the same name but a different size/checksum. It used to be less common with a 
master-slave setup, but it's now very common with the leader-replica setup in 
SolrCloud.


bq. For example, we often found ourselves in situations where...
Well, in your case it has downloaded the entire leader index to a new directory.
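Purely as illustration of the trigger described above (same file name on both sides, different size or checksum forces a full copy), a hedged sketch follows. The class and method names are illustrative, not IndexFetcher's actual API.

{code:java}
import java.util.Map;

// Hypothetical sketch of the "same name, different size/checksum" check.
public final class FullCopyCheckSketch {

  static final class FileMeta {
    final long size;
    final long checksum;
    FileMeta(long size, long checksum) { this.size = size; this.checksum = checksum; }
  }

  static boolean needsFullCopy(Map<String, FileMeta> masterFiles, Map<String, FileMeta> localFiles) {
    for (Map.Entry<String, FileMeta> e : masterFiles.entrySet()) {
      FileMeta local = localFiles.get(e.getKey());
      if (local != null
          && (local.size != e.getValue().size || local.checksum != e.getValue().checksum)) {
        return true; // same file name, different contents => the local index is incompatible
      }
    }
    return false;
  }
}
{code}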







[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2018-12-29 Thread Jason Baik (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730725#comment-16730725
 ] 

Jason Baik commented on SOLR-12999:
---

[~noble.paul] 
{quote}If a full replication is required
{quote}
Can you please clarify what you mean by "full replication"?

Is it literally the case that 100% of the leader's index needs to be copied 
([https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/IndexFetcher.java#L520]),
 or is it going to be some configurable, heuristic percentage that's deemed 
"close to full"?

If it's the former, the usefulness of this change is reduced, because a tiny 
similarity between the segments of the leader and the replica will cause Solr 
not to consider it a "full replication".

For example, we often found ourselves in situations where:
 * The leader's index size is 100GB.
 * The recovering replica has 90GB of disk left.
 * The leader and the replica segments have a 1% (1GB) similarity.
 * Because of this small similarity, Solr does not consider this a full 
replication, and the "delete local index and free up disk space" logic doesn't 
trigger.
 * The replica ends up copying 99% (99GB) of the leader's index and runs out of 
disk space.

We got around this by conditioning the delete of the local index on a heuristic: 
if at least N% of the leader's index needs to be copied, trigger the logic.

I hope that's the direction you guys are thinking.
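For illustration only (this is not the fork mentioned above), a sketch of that kind of heuristic: only wipe the local index when at least a threshold fraction of the leader's bytes would have to be downloaded anyway. All names and the threshold value are hypothetical.

{code:java}
// Illustrative sketch of an "N% of the leader's index must be copied" heuristic.
public final class DeleteFirstHeuristicSketch {

  /**
   * @param leaderIndexBytes  total size of the leader's index
   * @param bytesToDownload   size of the files the replica does not already have
   * @param thresholdFraction e.g. 0.5 means "treat a 50%+ download as a full replication"
   */
  static boolean shouldDeleteLocalIndexFirst(long leaderIndexBytes,
                                             long bytesToDownload,
                                             double thresholdFraction) {
    if (leaderIndexBytes <= 0) {
      return false;
    }
    return (double) bytesToDownload / (double) leaderIndexBytes >= thresholdFraction;
  }

  public static void main(String[] args) {
    long gb = 1024L * 1024L * 1024L;
    // The scenario above: 100GB leader index, 99GB of it must be copied.
    System.out.println(shouldDeleteLocalIndexFirst(100 * gb, 99 * gb, 0.5)); // true
  }
}
{code}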

 




[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2018-12-29 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730693#comment-16730693
 ] 

Noble Paul commented on SOLR-12999:
---

{quote}IndexFetcher that implements the change being proposed here...
{quote}
[~jason.j.b...@gmail.com] The proposed solution is as follows. If a full 
replication is required:
 # Calculate the disk space required to copy all the segments.
 # Check whether the local disk has enough space, with some ~10% margin. If the 
master index is 100GB and the local disk has at least 110GB free, the system 
should behave exactly as it does today.
 # If the master index is 100GB and the local disk has only 90GB free, the 
segments can't be copied as-is. If deleting the local index would give us at 
least 110GB of free space, delete the local index first and then download the 
segments from the master (see the sketch below).

Yes, there is a possibility of a screw-up, and we need to address that 
separately.
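A minimal sketch, assuming the ~10% margin from the list above, of the decision it describes. Method and parameter names are hypothetical; the real check would live inside IndexFetcher.

{code:java}
// Hypothetical sketch of the free-space check described in the list above.
public final class DiskSpaceCheckSketch {

  static final double SAFETY_MARGIN = 1.10; // require ~10% headroom over the master index size

  /** Returns true if the local index should be deleted before downloading. */
  static boolean deleteLocalIndexFirst(long masterIndexBytes, long freeDiskBytes, long localIndexBytes) {
    long required = (long) (masterIndexBytes * SAFETY_MARGIN);
    if (freeDiskBytes >= required) {
      return false;                                  // enough room: behave exactly as today
    }
    // Not enough room as-is: wipe the local index only if doing so frees enough space.
    return freeDiskBytes + localIndexBytes >= required;
  }

  public static void main(String[] args) {
    long gb = 1024L * 1024L * 1024L;
    System.out.println(deleteLocalIndexFirst(100 * gb, 90 * gb, 95 * gb));  // true: delete first
    System.out.println(deleteLocalIndexFirst(100 * gb, 120 * gb, 95 * gb)); // false: plenty of room
  }
}
{code}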




[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2018-12-28 Thread Jason Baik (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730543#comment-16730543
 ] 

Jason Baik commented on SOLR-12999:
---

I'm from the team [~mbraun688] mentioned in an earlier comment, which is 
currently using a fork of the IndexFetcher that implements the change being 
proposed here (i.e. delete the unnecessary index files BEFORE copying the 
master's remote files, to reduce pressure on disk space).

I want to chime in to report an edge case that must be handled with care. We 
learned it the hard way after losing a few shards this week.

Basically, we encountered a situation that [~erickerickson] anticipated:
{quote}Say replication deletes segments, then _for any reason_, the sync fails 
to complete...
{quote}
 

In a nutshell, 
 * We had a replica that had to sync with the leader via full index replication.
 * Our fork of the IndexFetcher deleted all segments on the replica prior to 
initiating index copying.
 * Before the sync was complete, the entire Solr cluster was shut down (due to 
a human mistake).
 * When the cluster was brought back up, the replica with all segments deleted 
connected to Zookeeper first, and initiated 
org.apache.solr.cloud.ShardLeaderElectionContext#runLeaderProcess() for the 
shard.
 * To our surprise, this replica was elected as the leader b/c:
 ** org.apache.solr.update.PeerSync#sync() inspected the tail of the 
transaction log, and found that this replica had all the updates that the other 
replicas had in their transaction logs.
 ** This led to PeerSyncResult.success = true, and ShardLeaderElectionContext 
saw that as a good enough reason to elect this replica as the leader...
 * After this point, any replica that synced with this new leader also copied 
over the wiped index, and we started losing data in all replicas...

It's a rather extreme case, caused by an unlucky sequence of events where all 
replicas of the shard went down at once and the replica with its segments 
deleted then initiated the leader process. But it does demonstrate a disastrous 
scenario: yet another failure in SolrCloud while a replica is in a wiped state 
can potentially lead to the complete loss of a shard. The implementer of this 
change should consider adding safety measures that ensure a replica whose 
segments have been deleted can never be elected as the leader in the event of a 
failure that causes another round of leader election.
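One purely hypothetical shape such a safety measure could take (none of these names correspond to real Solr APIs): drop a marker before wiping the index, clear it only after a successful fetch, and refuse leader candidacy while the marker exists.

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of a "wiped index" guard; not a real Solr API.
public final class WipedIndexGuardSketch {

  private static final String MARKER = "index-wiped-for-replication";

  static void markIndexWiped(Path coreDataDir) throws IOException {
    Files.createFile(coreDataDir.resolve(MARKER));     // written BEFORE deleting segments
  }

  static void clearWipedMarker(Path coreDataDir) throws IOException {
    Files.deleteIfExists(coreDataDir.resolve(MARKER)); // only after the fetch completes
  }

  static boolean mayStandForLeaderElection(Path coreDataDir) {
    // A replica that died mid-replication still has the marker and must not become leader.
    return !Files.exists(coreDataDir.resolve(MARKER));
  }
}
{code}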

 




[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2018-12-18 Thread Erick Erickson (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724643#comment-16724643
 ] 

Erick Erickson commented on SOLR-12999:
---

Well, there's a third possibility: use whatever remains of your index, possibly 
with CheckIndex and the -fix option. Admittedly you'd only have a partial index, 
but if that somehow was the only replica left, at least you'd have something.

I don't think that's a valid reason to hold up this JIRA, but let's be explicit 
about the design decision. That said, IMO the value of being able to recover a 
partial index from your one remaining replica is outweighed by the utility of 
avoiding disk-full situations.




[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2018-12-18 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724573#comment-16724573
 ] 

Noble Paul commented on SOLR-12999:
---

Let's look at a use case where the disk space left on the replica is less than 
the space required to download the segments from the leader. The only choices 
left are:
# fail replication altogether
# or delete the local segments and fetch the segments from the leader

I think #2 is definitely the better option.




[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2018-11-20 Thread Erick Erickson (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16693689#comment-16693689
 ] 

Erick Erickson commented on SOLR-12999:
---

Or I can just stop commenting on a JIRA I know little about the genesis of ;)




[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2018-11-20 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16693681#comment-16693681
 ] 

David Smiley commented on SOLR-12999:
-

The title & description are intentionally general because it's plausibly useful 
for non-SolrCloud use; it depends.  The code involved would likely not even be 
SolrCloud-centric (it would not care about ZK, etc.).  Nevertheless, I'll 
enhance the description.




[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2018-11-20 Thread Erick Erickson (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16693677#comment-16693677
 ] 

Erick Erickson commented on SOLR-12999:
---

OK, so maybe change the title or description to be explicit about when this is 
being considered? As it stands, there's nothing indicating that there is an 
assumption that the core/replica is unusable in the first place when this 
option is applied. That's twice I've thought someone was talking about 
replication in general, including non-cloud setups.




[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2018-11-20 Thread Michael Braun (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16693670#comment-16693670
 ] 

Michael Braun commented on SOLR-12999:
--

[~erickerickson] Right, it should never be serving queries, as it's in a 
'recovering' rather than 'active' state. If it were to serve queries, it would 
be serving out-of-sync data. The replica has to know it can never serve traffic 
again unless it fully recovers. If the sync fails to complete, it will need to 
start over rather than using what it has as an active replica.




[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2018-11-20 Thread Erick Erickson (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16693648#comment-16693648
 ] 

Erick Erickson commented on SOLR-12999:
---

[~dsmiley] Ah, ok. That makes sense. Deleting them first since you can't search 
anyway seems safe enough.

[~mbraun688]
bq. The other use case for this is if you're using more than 50% hard drive, 
copying index files could result in going out of disk

That's the dangerous case unless, as David indicates, the replica is never 
going to serve queries anyway. Say replication deletes segments, then _for any 
reason_, the sync fails to complete. No new searchers can be opened. Ever. 
Worse, if this cascaded through multiple replicas you could lose data.

You could refuse to start replicating if the total size of the segments you 
_knew_ you had to copy exceeded your available disk space (plus some fudge 
factor, maybe), but there'd still be race conditions where more than one 
core/replica decided to do a full sync.




[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2018-11-20 Thread Michael Braun (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16693497#comment-16693497
 ] 

Michael Braun commented on SOLR-12999:
--

Exactly - in the SolrCloud case where the replica can't PeerSync, index files need 
to be shipped over for recovery, and that recovering replica shouldn't be serving 
traffic (so closing the searcher should be safe; it should already have been closed).

The other use case for this is when you're using more than 50% of the hard drive; 
copying the index files could then run the node out of disk space.

Also, one thing to note: fullCopy=false does not necessarily mean it won't copy 
essentially the entire index over.
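
To make the "delete first" idea concrete, here is a rough sketch of the file-set 
comparison using plain Lucene Directory APIs; it only illustrates the approach, it 
is not the proposed IndexFetcher change, and it assumes no searcher or writer still 
has the old files open:

{code:java}
import java.io.IOException;
import java.util.Set;
import org.apache.lucene.store.Directory;

class DeleteFirstSketch {
  /**
   * Delete every local index file that is not in the leader's file list,
   * reclaiming space before any bytes are downloaded rather than afterwards.
   */
  static void deleteFilesNotOnLeader(Directory local, Set<String> leaderFiles)
      throws IOException {
    for (String file : local.listAll()) {
      if (!leaderFiles.contains(file)) {
        local.deleteFile(file);
      }
    }
  }
}
{code}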


[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2018-11-19 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692606#comment-16692606
 ] 

David Smiley commented on SOLR-12999:
-

As Michael hints at indirectly, the motivating use case is SolrCloud in particular, 
when full replication is needed to get a replica back in sync after it starts up 
from having been down. That is what I have in mind, anyway. In this circumstance, 
I am hoping and assuming a SolrIndexSearcher would not even be open, since the 
replica isn't searchable in this state (right?).


[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2018-11-19 Thread Michael Braun (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692096#comment-16692096
 ] 

Michael Braun commented on SOLR-12999:
--

On my old project we forked IndexFetcher to do something similar. When full 
replication was called for (meaning every existing file would be deleted once it 
finished anyway), we pointed the index at a newly created dummy directory and made 
sure the Searcher was off the existing one. We then triggered a cleanup on the 
original index directory, which caused the now-unneeded underlying files to be 
deleted.

[~jason.j.b...@gmail.com] can probably add more to this.
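
Roughly, that trick looks like the sketch below; it is reconstructed from the 
description using plain Lucene/JDK calls, not the forked IndexFetcher itself, and 
it assumes the searcher on the old directory has already been closed:

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

class DummyDirectorySwap {
  /**
   * Point the core at an empty scratch directory, then delete everything in the
   * original index directory so the space is reclaimed before the full copy starts.
   */
  static Directory swapToScratchAndClean(Path oldIndexDir) throws IOException {
    Path scratch = Files.createTempDirectory("dummy-index");
    Directory dummy = FSDirectory.open(scratch);      // the core temporarily "uses" this
    try (Directory old = FSDirectory.open(oldIndexDir)) {
      for (String file : old.listAll()) {
        old.deleteFile(file);                         // now-unneeded files are removed
      }
    }
    return dummy;
  }
}
{code}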


[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2018-11-19 Thread Shawn Heisey (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692005#comment-16692005
 ] 

Shawn Heisey commented on SOLR-12999:
-

As noted by Erick, these statements should be true:

On *nix, Lucene/Solr can successfully delete any file, but until the current 
searcher closes those files, the space won't be released.  On Windows, trying to 
delete files that a searcher has open will fail.

I like the idea, but for these reasons, I don't think it will actually work.
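
The platform difference is easy to demonstrate with plain JDK code; this is a 
generic illustration of the OS behavior, nothing Solr-specific:

{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class OpenFileDeleteDemo {
  public static void main(String[] args) throws IOException {
    Path f = Files.createTempFile("segment", ".tmp");
    Files.write(f, new byte[1024]);
    try (InputStream in = Files.newInputStream(f)) {  // stand-in for a searcher holding the file
      // On *nix this delete succeeds, but the blocks are not freed until 'in' is closed.
      // On Windows it typically throws because the file is still open.
      Files.delete(f);
    } catch (IOException e) {
      System.out.println("delete failed while the file is open: " + e);
    }
  }
}
{code}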



[jira] [Commented] (SOLR-12999) Index replication could delete segments first

2018-11-19 Thread Erick Erickson (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16691897#comment-16691897
 ] 

Erick Erickson commented on SOLR-12999:
---

I'm a little lost here. What's the definition of "knows will not be needed"? If it 
means local segments that are not present on the leader, and therefore won't be 
necessary, this seems fraught:

> if a new searcher is opened for any reason, wouldn't it fail? Restarts, rogue 
> commits, whatever.

> Wouldn't the files still be there on *nix systems if a searcher had them 
> open?

> On Windows systems, if a searcher has the files open would the delete fail in 
> the first place?

I know you know Lucene details far better than me, so I suspect I'm totally 
missing the point.
