[jira] [Updated] (HDFS-7496) Fix FsVolume removal race conditions on the DataNode by reference-counting the volume instances

2015-01-21 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7496:
---
Fix Version/s: 2.7.0  (was: 3.0.0)

Committed to 2.7.  Thanks.

> Fix FsVolume removal race conditions on the DataNode by reference-counting 
> the volume instances
> ---
>
> Key: HDFS-7496
> URL: https://issues.apache.org/jira/browse/HDFS-7496
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Colin Patrick McCabe
> Assignee: Lei (Eddy) Xu
> Fix For: 2.7.0
>
> Attachments: HDFS-7496-branch-2.000.patch, HDFS-7496.000.patch, 
> HDFS-7496.001.patch, HDFS-7496.002.patch, HDFS-7496.003.patch, 
> HDFS-7496.003.patch, HDFS-7496.004.patch, HDFS-7496.005.patch, 
> HDFS-7496.006.patch, HDFS-7496.007.patch
>
>
> We discussed a few FsVolume removal race conditions on the DataNode in 
> HDFS-7489.  We should figure out a way to make removing an FsVolume safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7496) Fix FsVolume removal race conditions on the DataNode by reference-counting the volume instances

2015-01-21 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-7496:

Attachment: HDFS-7496-branch-2.000.patch

Uploaded a patch for branch-2.

> Fix FsVolume removal race conditions on the DataNode by reference-counting 
> the volume instances
> ---
>
> Key: HDFS-7496
> URL: https://issues.apache.org/jira/browse/HDFS-7496
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Colin Patrick McCabe
> Assignee: Lei (Eddy) Xu
> Fix For: 3.0.0
>
> Attachments: HDFS-7496-branch-2.000.patch, HDFS-7496.000.patch, 
> HDFS-7496.001.patch, HDFS-7496.002.patch, HDFS-7496.003.patch, 
> HDFS-7496.003.patch, HDFS-7496.004.patch, HDFS-7496.005.patch, 
> HDFS-7496.006.patch, HDFS-7496.007.patch
>
>
> We discussed a few FsVolume removal race conditions on the DataNode in 
> HDFS-7489.  We should figure out a way to make removing an FsVolume safe.





[jira] [Updated] (HDFS-7496) Fix FsVolume removal race conditions on the DataNode by reference-counting the volume instances

2015-01-20 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7496:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
       Status: Resolved  (was: Patch Available)

Committed to trunk.  Can you make a version for branch-2?  The backport isn't 
straightforward here.

In my branch-2 backport, I get this error:
{code}
2015-01-20 19:46:09,386 ERROR datanode.DataNode (DataXceiver.java:run(275)) - 127.0.0.1:33539:DataXceiver error processing WRITE_BLOCK operation  src: /127.0.0.1:53230 dst: /127.0.0.1:33539
java.lang.IllegalStateException
    at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.checkReference(FsVolumeImpl.java:208)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.unreference(FsVolumeImpl.java:175)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.access$100(FsVolumeImpl.java:61)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl$FsVolumeReferenceImpl.close(FsVolumeImpl.java:193)
    at org.apache.hadoop.hdfs.server.datanode.ReplicaHandler.close(ReplicaHandler.java:42)
    at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.close(BlockReceiver.java:344)
    at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:244)
    at org.apache.hadoop.io.IOUtils.closeStream(IOUtils.java:261)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:799)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
{code}
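
For readers following the thread: the trace above shows a state check failing while a volume reference is being closed. The following is only a rough sketch of the general reference-counting idea named in the issue title, not the committed patch and not the real FsVolumeImpl. The names RefCountedVolume, VolumeRef, obtainReference, closeAndWait and RefCountDemo are hypothetical, chosen for illustration. It assumes that removal should wait until every outstanding reference is released, and that releasing more references than were obtained is a bug worth failing loudly on.

```java
import java.io.Closeable;
import java.nio.channels.ClosedChannelException;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a volume; not the actual FsVolumeImpl. */
class RefCountedVolume {
  private final String name;
  private int refCount = 0;        // guarded by "this"
  private boolean closing = false; // set once removal has begun

  RefCountedVolume(String name) {
    this.name = name;
  }

  /** Takes a reference; refuses new users once removal has started. */
  synchronized VolumeRef obtainReference() throws ClosedChannelException {
    if (closing) {
      throw new ClosedChannelException();
    }
    refCount++;
    return new VolumeRef(this);
  }

  /** Drops one reference; an unbalanced call is treated as a bug. */
  synchronized void unreference() {
    if (refCount <= 0) {
      throw new IllegalStateException("unbalanced unreference on " + name);
    }
    refCount--;
    notifyAll(); // wake a remover waiting in closeAndWait()
  }

  /** Blocks new references, then waits for existing ones to drain. */
  synchronized void closeAndWait() throws InterruptedException {
    closing = true;
    while (refCount > 0) {
      wait();
    }
  }

  synchronized int getReferenceCount() {
    return refCount;
  }
}

/** Handle for one reference; closing it twice releases only once. */
class VolumeRef implements Closeable {
  private final RefCountedVolume volume;
  private final AtomicBoolean released = new AtomicBoolean(false);

  VolumeRef(RefCountedVolume volume) {
    this.volume = volume;
  }

  @Override
  public void close() {
    if (released.compareAndSet(false, true)) {
      volume.unreference();
    }
  }
}

class RefCountDemo {
  public static void main(String[] args) throws Exception {
    RefCountedVolume volume = new RefCountedVolume("vol0");
    try (VolumeRef ref = volume.obtainReference()) {
      System.out.println("in use, count = " + volume.getReferenceCount());
    }
    volume.closeAndWait(); // returns at once: no references remain
    System.out.println("safe to remove, count = " + volume.getReferenceCount());
  }
}
```

In this sketch the handle's close() is idempotent, so closing it from two cleanup paths does not trip the balance check, while a direct extra unreference() still throws IllegalStateException. Whether the failure in the trace has a similar cause is not established here.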

> Fix FsVolume removal race conditions on the DataNode by reference-counting 
> the volume instances
> ---
>
> Key: HDFS-7496
> URL: https://issues.apache.org/jira/browse/HDFS-7496
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Colin Patrick McCabe
> Assignee: Lei (Eddy) Xu
> Fix For: 3.0.0
>
> Attachments: HDFS-7496.000.patch, HDFS-7496.001.patch, 
> HDFS-7496.002.patch, HDFS-7496.003.patch, HDFS-7496.003.patch, 
> HDFS-7496.004.patch, HDFS-7496.005.patch, HDFS-7496.006.patch, 
> HDFS-7496.007.patch
>
>
> We discussed a few FsVolume removal race conditions on the DataNode in 
> HDFS-7489.  We should figure out a way to make removing an FsVolume safe.





[jira] [Updated] (HDFS-7496) Fix FsVolume removal race conditions on the DataNode by reference-counting the volume instances

2015-01-20 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7496:
---
Summary: Fix FsVolume removal race conditions on the DataNode by 
reference-counting the volume instances  (was: Fix FsVolume removal race 
conditions on the DataNode)

> Fix FsVolume removal race conditions on the DataNode by reference-counting 
> the volume instances
> ---
>
> Key: HDFS-7496
> URL: https://issues.apache.org/jira/browse/HDFS-7496
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Colin Patrick McCabe
> Assignee: Lei (Eddy) Xu
> Attachments: HDFS-7496.000.patch, HDFS-7496.001.patch, 
> HDFS-7496.002.patch, HDFS-7496.003.patch, HDFS-7496.003.patch, 
> HDFS-7496.004.patch, HDFS-7496.005.patch, HDFS-7496.006.patch, 
> HDFS-7496.007.patch
>
>
> We discussed a few FsVolume removal race conditions on the DataNode in 
> HDFS-7489.  We should figure out a way to make removing an FsVolume safe.


