[ https://issues.apache.org/jira/browse/SOLR-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013348#comment-14013348 ]

Dawid Weiss commented on SOLR-6119:
-----------------------------------

The problem is in the test, here:
{code}
          BackupThread deleteBackupThread = new BackupThread(backupNames[i], ReplicationHandler.CMD_DELETE_BACKUP);
          deleteBackupThread.start();
          int waitCnt = 0;
          CheckDeleteBackupStatus checkDeleteBackupStatus = new CheckDeleteBackupStatus();
          while (true) {
            checkDeleteBackupStatus.fetchStatus();
{code}

You start the delete-backup thread but never wait for it to finish; the loop 
only polls the "delete status". There is a race condition in there -- either 
the status check should report success only after the backup files have 
actually been removed, or the test should wait for the delete operation in 
some other way.

If you add logging around the snapshot deletion (before and after) and to the 
finally block in the test, the faulty interleaving is:
{code}
4752 T60 oash.SnapShooter.deleteNamedSnapshot Deleting snapshot: eyxtuk
4752 T12 oash.TestReplicationHandler.doTestBackup #### --> DELETING (finally block in the test)
4754 T60 oash.SnapPuller.delTree WARN Unable to delete file : C:\Users\dweiss\AppData\Local\Temp\solr.handler.TestReplicationHandler-B751491BC59B33CA-005\solr-instance-001\collection1\data\snapshot.eyxtuk\_0.cfs
4754 T60 oash.SnapShooter.deleteNamedSnapshot WARN Unable to delete snapshot: eyxtuk
4754 T60 oash.SnapShooter.deleteNamedSnapshot Deleting snapshot: eyxtuk (DONE)
{code}

So the test never waits for SnapShooter.deleteNamedSnapshot to finish before 
its finally block starts cleaning up.
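
One way to close this kind of race, sketched below under stated assumptions, is 
to join the worker thread (with a timeout) before any cleanup runs. This is not 
the actual test code or a proposed patch: DeleteWorker is a hypothetical 
stand-in for the test's BackupThread, the sleep merely simulates the time spent 
deleting snapshot files, and runAndWait is an illustrative helper name.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class WaitForDeleteSketch {

    /** Hypothetical stand-in for a thread that issues the delete-backup command. */
    static final class DeleteWorker extends Thread {
        private final long workMillis;
        final AtomicBoolean finished = new AtomicBoolean(false);

        DeleteWorker(long workMillis) {
            this.workMillis = workMillis;
            setDaemon(true); // do not keep the JVM alive if the caller gives up waiting
        }

        @Override
        public void run() {
            try {
                Thread.sleep(workMillis); // simulates the time spent deleting files
                finished.set(true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /**
     * Starts the worker and waits up to timeoutMillis for it to complete.
     * Returns true only if the worker has fully finished, i.e. it would be
     * safe for the caller to start removing directories.
     */
    public static boolean runAndWait(long workMillis, long timeoutMillis) {
        DeleteWorker worker = new DeleteWorker(workMillis);
        worker.start();
        try {
            worker.join(timeoutMillis); // blocks until the thread ends or the timeout expires
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive() && worker.finished.get();
    }

    public static void main(String[] args) {
        boolean safeToCleanUp = runAndWait(50, 5000);
        System.out.println("worker finished before cleanup: " + safeToCleanUp);
    }
}
```

With a check like this in place, the finally block would run only once the 
delete has really completed; a test could fail explicitly when the method 
returns false instead of racing with the worker.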

> TestReplicationHandler attempts to remove open folders
> ------------------------------------------------------
>
>                 Key: SOLR-6119
>                 URL: https://issues.apache.org/jira/browse/SOLR-6119
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Dawid Weiss
>            Priority: Minor
>         Attachments: SOLR-6119.patch, SOLR-6119.patch, SOLR-6119.patch
>
>
> TestReplicationHandler has weird logic around the 'snapDir' variable. It 
> attempts to remove snapshot folders, even though they're not closed yet. My 
> recent patch uncovered the bug but I don't know how to fix it cleanly -- the 
> test itself seems to be very fragile (for example, I don't understand the 
> 'namedBackup' variable, which is always set to true, yet there are 
> conditionals around it).



--
This message was sent by Atlassian JIRA
(v6.2#6252)