[ https://issues.apache.org/jira/browse/HDFS-7496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260446#comment-14260446 ]
Colin Patrick McCabe commented on HDFS-7496:
--------------------------------------------

Thanks for looking at this.

{code}
@@ -125,6 +123,7 @@
   private boolean syncOnClose;
   private long restartBudget;
+  private boolean hasReference = false;

   /**
    * for replaceBlock response
@@ -221,6 +220,11 @@
             " while receiving block " + block + " from " + inAddr);
       }
     }
+    if (replicaInfo instanceof ReplicaInfo) {
+      // Hold a reference to protect IOs on the streams.
+      ((ReplicaInfo) replicaInfo).getVolume().reference();
+      hasReference = true;
+    }
{code}

A few comments:

* Rather than having a {{boolean hasReference}}, let's have an actual pointer to the {{FsVolumeSpi}} object. That makes it clear that we can access the volume whenever we want, because we're holding a refcount.
* We should release the reference count in {{close()}}, not in the finally block, I think. This is consistent with how we release the streams and so forth.
* We don't need this code any more:
{code}
if (lastPacketInBlock) {
  // Finalize the block and close the block file
  try {
    finalizeBlock(startTime);
  } catch (ReplicaNotFoundException e) {
    // Verify that the exception is due to volume removal.
    FsVolumeSpi volume;
    synchronized (datanode.data) {
      volume = datanode.data.getVolume(block);
    }
    if (volume == null) {
      // The ReplicaInfo has been removed because the corresponding data
      // volume has been removed.  No need to check for a disk error.
      LOG.info(myString + ": BlockReceiver is interrupted because the "
          + "block pool " + block.getBlockPoolId() + " has been removed.", e);
      sendAckUpstream(ack, expected, totalAckTimeNanos, 0,
          Status.OOB_INTERRUPTED);
      running = false;
      receiverThread.interrupt();
      continue;
    }
    throw e;
  }
}
{code}
Because it will no longer be possible for the volume to go away while we're using it, we can get rid of that whole code block in the {{catch}} clause, right?
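A minimal sketch of the pattern the first two bullets suggest. The names {{reference()}} and {{unreference()}} follow the patch under review; the simplified {{RefCountedVolume}} and {{BlockReceiverSketch}} classes are hypothetical illustrations, not the actual HDFS code:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for FsVolumeSpi with the reference()/unreference()
// calls discussed in the review.
class RefCountedVolume {
    private final AtomicInteger refCount = new AtomicInteger(0);

    void reference() { refCount.incrementAndGet(); }
    void unreference() { refCount.decrementAndGet(); }
    int refCount() { return refCount.get(); }
}

// Sketch of the suggestion: hold the volume object itself instead of a
// boolean hasReference flag, and release it in close(), alongside the
// streams, rather than in a finally block.
class BlockReceiverSketch implements AutoCloseable {
    private RefCountedVolume volume; // non-null iff we hold a refcount

    BlockReceiverSketch(RefCountedVolume v) {
        v.reference();   // taken when the receiver is set up
        this.volume = v;
    }

    @Override
    public void close() {
        if (volume != null) {
            volume.unreference(); // released with the other resources
            volume = null;
        }
    }
}
```

Holding the {{FsVolumeSpi}} pointer doubles as the {{hasReference}} flag: a non-null field means the refcount is held, so the two can never get out of sync.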
{code}
   @Override
-  public synchronized void removeVolumes(Collection<StorageLocation> volumes) {
-    Set<File> volumeSet = new HashSet<File>();
+  public synchronized void removeVolumes(Collection<StorageLocation> volumes)
+      throws IOException {
+    Set<String> volumeSet = new HashSet<>();
     for (StorageLocation sl : volumes) {
-      volumeSet.add(sl.getFile());
+      volumeSet.add(sl.getFile().getCanonicalPath());
     }
     for (int idx = 0; idx < dataStorage.getNumStorageDirs(); idx++) {
       Storage.StorageDirectory sd = dataStorage.getStorageDir(idx);
-      if (volumeSet.contains(sd.getRoot())) {
+      if (volumeSet.contains(sd.getRoot().getCanonicalPath())) {
{code}
This change seems unrelated to this JIRA... am I missing something? Also, as I've said in the past, I'm strongly against {{removeVolumes}} throwing an {{IOException}}. I don't see how the code is supposed to proceed if removal fails with an exception.

{code}
DataNode getDatanode() {
  return datanode;
}
{code}
This isn't necessary, since {{FsDatasetImpl#datanode}} already has package-private access and this accessor has the same level of access.

{code}
if (dataset.getDatanode() == null) {
  // FsVolumeImpl is used in test.
  return null;
}
{code}
How about using {{Preconditions.checkNotNull}} here... might look nicer.

{code}
File createRbwFile(String bpid, Block b) throws IOException {
  reference();
  try {
    reserveSpaceForRbw(b.getNumBytes());
    return getBlockPoolSlice(bpid).createRbwFile(b);
  } finally {
    unreference();
  }
}
{code}
This is kind of a weird approach, having the volume increment reference counts on itself. What I was envisioning was having {{getNextVolume}} increment the reference count when it retrieved the volume, and having the caller decrement the reference count after the caller was done with the volume object.
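The {{getNextVolume}} scheme described in the last paragraph can be sketched as follows. This is a hypothetical, simplified illustration under the review's assumptions ({{getNextVolume}} comes from the comment; {{Volume}} and {{VolumeChooser}} are invented stand-ins, not the real {{FsVolumeSpi}}/{{FsVolumeList}} classes):

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for FsVolumeSpi; the real interface is much larger.
class Volume {
    private final AtomicInteger refCount = new AtomicInteger(0);

    void reference() { refCount.incrementAndGet(); }
    void unreference() { refCount.decrementAndGet(); }
    int refCount() { return refCount.get(); }
}

// Sketch of the proposed scheme: getNextVolume() increments the refcount
// before handing the volume out, so the volume cannot be removed out from
// under an in-flight operation; the caller owns that reference and
// decrements it when done with the volume object.
class VolumeChooser {
    private final List<Volume> volumes;
    private int next = 0;

    VolumeChooser(List<Volume> volumes) { this.volumes = volumes; }

    // Round-robin selection; the returned volume carries one reference
    // owned by the caller.
    synchronized Volume getNextVolume() {
        Volume v = volumes.get(next);
        next = (next + 1) % volumes.size();
        v.reference();
        return v;
    }
}
```

A caller then pairs the acquire with a release, e.g. {{Volume v = chooser.getNextVolume(); try { /* do I/O on v */ } finally { v.unreference(); } }} — which also makes {{createRbwFile}}'s self-referencing unnecessary, since the caller already holds the reference for the whole call.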
> Fix FsVolume removal race conditions on the DataNode
> -----------------------------------------------------
>
>                 Key: HDFS-7496
>                 URL: https://issues.apache.org/jira/browse/HDFS-7496
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Colin Patrick McCabe
>            Assignee: Lei (Eddy) Xu
>         Attachments: HDFS-7496.000.patch
>
> We discussed a few FsVolume removal race conditions on the DataNode in
> HDFS-7489. We should figure out a way to make removing an FsVolume safe.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)