[ https://issues.apache.org/jira/browse/SOLR-830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Otis Gospodnetic updated SOLR-830:
----------------------------------

    Fix Version/s: 1.3.1

> snappuller picks bad snapshot name
> ----------------------------------
>
>                 Key: SOLR-830
>                 URL: https://issues.apache.org/jira/browse/SOLR-830
>             Project: Solr
>          Issue Type: Bug
>          Components: replication (scripts)
>    Affects Versions: 1.2, 1.3
>            Reporter: Hoss Man
>            Assignee: Bill Au
>             Fix For: 1.3.1
>
>
> as mentioned on the mailing list...
> http://www.nabble.com/FileNotFoundException-on-slave-after-replication---script-bug--to20111313.html#a20111313
> {noformat}
> We're seeing strange behavior on one of our slave nodes after replication.
> When the new searcher is created we see FileNotFoundExceptions in the log
> and the index is strangely invalid/corrupted.
> We may have identified the root cause but wanted to run it by the community.
> We figure there is a bug in the snappuller shell script, line 181:
>
>   snap_name=`ssh -o StrictHostKeyChecking=no ${master_host} "ls ${master_data_dir}|grep 'snapshot\.'|grep -v wip|sort -r|head -1"`
>
> This line determines the directory name of the latest snapshot to download
> to the slave from the master. The problem with this line is that it can grab
> the temporary work directory of a snapshot in progress. Those temporary
> directories are prefixed with "temp" and, as far as I can tell, should never
> get pulled from the master, so it's easy to disambiguate. It seems that this
> temp directory, if it exists, will be the newest one, so if present it will
> be the one replicated: FAIL.
> {noformat}
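
For illustration only, the selection problem described above can be reproduced locally without ssh. The sketch below is not the committed fix for this issue. It assumes, hypothetically, that in-progress directories are named like "temp-snapshot.<timestamp>" (the report only says they are prefixed with "temp"), and it shows one possible way to disambiguate: anchoring the grep pattern so that only names beginning with "snapshot." are considered. The function name latest_snapshot and the sample listing are made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: read a directory listing on stdin and print the
# newest completed snapshot name. The '^' anchor is the only change from
# the pipeline quoted in the report; it rejects names that merely
# contain "snapshot." somewhere after a prefix such as "temp".
latest_snapshot() {
    grep '^snapshot\.' | grep -v wip | sort -r | head -1
}

# Simulated output of "ls ${master_data_dir}" on the master. These names
# are assumptions chosen for the example, not taken from a real server.
listing='index
snapshot.20081021120000
snapshot.20081022120000
temp-snapshot.20081022130000'

# Pipeline as quoted in the report: the unanchored pattern also matches
# the temp directory, which then sorts ahead of the finished snapshots.
printf '%s\n' "$listing" | grep 'snapshot\.' | grep -v wip | sort -r | head -1

# Anchored variant: only completed snapshots are candidates.
printf '%s\n' "$listing" | latest_snapshot
```

An alternative with the same effect would be to add another "grep -v" stage that drops names starting with "temp"; the anchored pattern is shown here only because it is the smaller change to the quoted line.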