[ https://issues.apache.org/jira/browse/HDFS-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14187857#comment-14187857 ]
Colin Patrick McCabe commented on HDFS-7035: -------------------------------------------- I think it would be easier to simply: * step #1: add the new DataStorages * step #2: for all the DataStorages that we successfully added, call FsDatasetSpi#addVolumes * step #3: for all the volumes that failed in step #2, remove the DataStorages Then we don't have to change the {{FsDatasetSpi}} interface again. We don't have to create a new callback interface with {{commit}}, and the patch gets a lot smaller. I can see there may be some benefit to parallelizing searching the new directory for replica files, but this seems like an optimization we should do in a separate JIRA. Of course, we can do that inside {{FsDatasetSpi}} without changing the interface, the same way you're doing it here... with Futures. My impression is that the amount of work we do in DataStorage is really small, so there isn't any need to parallelize the DataStorage stuff, only the scanning of the replicas. I'm having trouble figuring out the locking for the callback-based approach here. Right now we're holding the dn lock when calling {{commit}}... but I don't see why. Locking would be clearer if we just avoided callbacks and went for the straight-line approach, I think. What do you think? {code} 506 int numOldDataDirs = 0; 507 synchronized (this) { 508 numOldDataDirs = dataDirs.size(); 509 dataDirs = locations; 510 } {code} Do we still need to set {{DataNode#dataDirs}} here? And I don't see where we correct this on failure, if we fail. > Make adding volume an atomic operation. > --------------------------------------- > > Key: HDFS-7035 > URL: https://issues.apache.org/jira/browse/HDFS-7035 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode > Affects Versions: 2.5.0 > Reporter: Lei (Eddy) Xu > Assignee: Lei (Eddy) Xu > Attachments: HDFS-7035.000.combo.patch, HDFS-7035.000.patch, > HDFS-7035.001.combo.patch, HDFS-7035.001.patch, HDFS-7035.002.patch, > HDFS-7035.003.patch, HDFS-7035.003.patch, HDFS-7035.004.patch, > HDFS-7035.005.patch, HDFS-7035.007.patch, HDFS-7035.008.patch, > HDFS-7035.009.patch, HDFS-7035.010.patch, HDFS-7035.010.patch, > HDFS-7035.011.patch > > > It refactors {{DataStorage}} and {{BlockPoolSliceStorage}} to reduce the > duplicate code and supports atomic adding volume operations. Also it > parallels loading data volume operation: each thread loads one volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)