[ 
https://issues.apache.org/jira/browse/HDFS-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880648#action_12880648
 ] 

Thanh Do commented on HDFS-1239:
--------------------------------

Overall, we see that the client-namenode protocol does not allow
the client to tell the namenode something like "hey, I tried to write to
the datanodes you gave me, but it failed; could you give me other
datanodes, please?" This matters because a cloud deployment should have
more machines available, so it makes more sense for the client to be
given another set of datanodes than to give up.
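
A rough sketch of what such an exchange could look like is below. It is
only an illustration: the Namenode and PipelineWriter interfaces and the
allocateDatanodes call are hypothetical names invented for this example,
not part of the actual HDFS client-namenode protocol.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names are real HDFS classes or RPCs.
class PipelineRetrySketch {

    /** Stand-in for the namenode side of the client-namenode protocol. */
    interface Namenode {
        // Assumed call: return up to `count` datanodes that are not in `excluded`.
        List<String> allocateDatanodes(int count, Set<String> excluded);
    }

    /** Stand-in for a pipeline write; returns the nodes that failed (empty = success). */
    interface PipelineWriter {
        List<String> write(List<String> pipeline);
    }

    /** Keeps asking the namenode for other datanodes until a write succeeds. */
    static List<String> writeBlock(Namenode nn, PipelineWriter writer,
                                   int replication, int maxRounds) {
        Set<String> excluded = new HashSet<>();
        for (int round = 0; round < maxRounds; round++) {
            List<String> pipeline = nn.allocateDatanodes(replication, excluded);
            if (pipeline.isEmpty()) {
                break; // the namenode has nothing left to offer
            }
            List<String> failed = writer.write(pipeline);
            if (failed.isEmpty()) {
                return pipeline; // the write went through on this set
            }
            excluded.addAll(failed); // tell the namenode which nodes to avoid
        }
        throw new IllegalStateException("no usable datanodes after retries");
    }

    /** Four-node toy cluster in which dn1 and dn2 always fail. */
    static List<String> demo() {
        List<String> cluster = Arrays.asList("dn1", "dn2", "dn3", "dn4");
        Set<String> faulty = new HashSet<>(Arrays.asList("dn1", "dn2"));
        Namenode nn = (count, excluded) -> {
            List<String> out = new ArrayList<>();
            for (String dn : cluster) {
                if (out.size() < count && !excluded.contains(dn)) {
                    out.add(dn);
                }
            }
            return out;
        };
        PipelineWriter writer = pipeline -> {
            List<String> failed = new ArrayList<>();
            for (String dn : pipeline) {
                if (faulty.contains(dn)) {
                    failed.add(dn);
                }
            }
            return failed;
        };
        return writeBlock(nn, writer, 2, 3);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [dn3, dn4]
    }
}
```

In this toy run the first pipeline (dn1, dn2) fails, the client reports
both nodes as excluded, and the second allocation (dn3, dn4) succeeds.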

> All datanodes are bad in 2nd phase
> ----------------------------------
>
>                 Key: HDFS-1239
>                 URL: https://issues.apache.org/jira/browse/HDFS-1239
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 0.20.1
>            Reporter: Thanh Do
>
> - Setups:
> number of datanodes = 2
> replication factor = 2
> Type of failure: transient fault (a Java I/O call throws an exception or 
> returns false)
> Number of failures = 2
> when/where failures happen = during the 2nd phase of the pipeline, one at 
> each datanode when it tries to perform I/O 
> (e.g. dataoutputstream.flush())
>  
> - Details:
>  
> This is similar to HDFS-1237.
> In this case, node1 throws an exception, which makes the client create
> a pipeline with only node2 and then redo the whole write,
> which hits another failure. At this point, the client considers
> all datanodes bad and never retries the whole thing again
> (i.e. it never goes back to the namenode to ask for a new set of datanodes).
> In HDFS-1237, the bug is due to a permanent disk fault. In this case, it is 
> about a transient error.
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do (than...@cs.wisc.edu) and
> Haryadi Gunawi (hary...@eecs.berkeley.edu)
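
The failure sequence in the report can be reproduced with a small toy
model. This is a sketch under stated assumptions, not DFSClient code:
FlakyDatanode and writeOnce are hypothetical names, and each datanode is
assumed to fail exactly one I/O call before recovering.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the reported scenario; not actual HDFS client code.
class TransientFaultSketch {

    /** A datanode whose first `faultsLeft` flush calls fail, then succeed. */
    static class FlakyDatanode {
        final String name;
        int faultsLeft;

        FlakyDatanode(String name, int faultsLeft) {
            this.name = name;
            this.faultsLeft = faultsLeft;
        }

        void flush() throws IOException {
            if (faultsLeft > 0) {
                faultsLeft--;
                throw new IOException("transient fault on " + name);
            }
        }
    }

    /** Drops each failing node from the pipeline and never revisits it. */
    static String writeOnce(List<FlakyDatanode> nodes) {
        List<FlakyDatanode> pipeline = new ArrayList<>(nodes);
        while (!pipeline.isEmpty()) {
            FlakyDatanode bad = null;
            for (FlakyDatanode dn : pipeline) {
                try {
                    dn.flush();
                } catch (IOException e) {
                    bad = dn; // first failing node ends this attempt
                    break;
                }
            }
            if (bad == null) {
                return "ok";
            }
            pipeline.remove(bad); // rebuild the pipeline without the bad node
        }
        return "all datanodes are bad";
    }

    public static void main(String[] args) {
        List<FlakyDatanode> nodes = Arrays.asList(
                new FlakyDatanode("node1", 1),
                new FlakyDatanode("node2", 1));
        System.out.println(writeOnce(nodes)); // prints all datanodes are bad
        System.out.println(writeOnce(nodes)); // prints ok
    }
}
```

The second call succeeds on the same two nodes because both faults were
transient, which is why giving up after the first attempt loses a write
that a retry could have completed.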

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
