[ 
https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033346#comment-13033346
 ] 

Todd Lipcon commented on HDFS-1332:
-----------------------------------

Hey Nicholas. How do you feel about the following compromise:
- For the simple case where there are no datanodes in the cluster, we include 
some additional detail in the exception message saying so. This will 
help the common case of a new user whose datanodes failed to start and who is 
confused about why he can't write blocks. This detail should be in the 
IOException itself so that it propagates to the client.
- If debug is enabled, we construct the HashMap as above and log the "failure 
to allocate block" type messages at WARN level.
- If debug is not enabled, we log a message that says something like 
"failure to allocate block ... For more information, please enable DEBUG level 
logging on the o.a.h.BlockPlacementPolicyDefault logger."

This should avoid any performance impact, but also point users down the right 
path to solving the issues.
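
To make the proposal concrete, here is a rough sketch of the three cases above. It is not taken from the attached patch: the class, the method names, the message wording, and the logger name are all hypothetical, nodes are plain strings, and java.util.logging (with FINE standing in for DEBUG) is used only so the example is self-contained.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the placement policy; not the real HDFS class.
class PlacementLoggingSketch {
    // Assumed logger name; FINE plays the role of DEBUG here.
    static final Logger LOG = Logger.getLogger("sketch.BlockPlacementPolicyDefault");

    // Builds the WARN message. A null map means debug was off, so no
    // per-node reasons were collected and we only point at the logger.
    static String failureMessage(int wanted, int found, Map<String, String> reasons) {
        String base = "Failed to allocate block: wanted " + wanted
                + " replicas, found " + found + " targets.";
        if (reasons != null) {
            return base + " Excluded nodes: " + reasons;
        }
        return base + " For more information, please enable DEBUG level logging on the "
                + LOG.getName() + " logger.";
    }

    // knownProblems maps a node name to the reason it is not a good target.
    static List<String> chooseTargets(List<String> nodes,
                                      Map<String, String> knownProblems,
                                      int wanted) throws IOException {
        if (nodes.isEmpty()) {
            // Case 1: the detail goes into the exception so the client sees it.
            throw new IOException("Could not place replicas: there are no datanodes"
                    + " running in the cluster.");
        }
        // Cases 2 and 3: only pay for the map when debug is enabled.
        Map<String, String> reasons =
                LOG.isLoggable(Level.FINE) ? new HashMap<>() : null;
        List<String> chosen = new ArrayList<>();
        for (String node : nodes) {
            String problem = knownProblems.get(node);
            if (problem != null) {
                if (reasons != null) {
                    reasons.put(node, problem);
                }
            } else if (chosen.size() < wanted) {
                chosen.add(node);
            }
        }
        if (chosen.size() < wanted) {
            LOG.warning(failureMessage(wanted, chosen.size(), reasons));
        }
        return chosen;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> problems = new HashMap<>();
        problems.put("dn2", "not enough free space");
        List<String> nodes = new ArrayList<>();
        nodes.add("dn1");
        nodes.add("dn2");
        System.out.println("chosen = " + chooseTargets(nodes, problems, 2));
    }
}
```

With debug off, the only extra cost per rejected node is a null check, which is the point of the compromise.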

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes 
> were excluded
> ------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1332
>                 URL: https://issues.apache.org/jira/browse/HDFS-1332
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Todd Lipcon
>            Assignee: Ted Yu
>            Priority: Minor
>              Labels: newbie
>             Fix For: 0.23.0
>
>         Attachments: HDFS-1332.patch
>
>
> Whenever the block placement policy determines that a node is not a "good 
> target" it could add the reason for exclusion to a list, and then when we log 
> "Not able to place enough replicas" we could say why each node was refused. 
> This would help new users who are having issues on pseudo-distributed clusters 
> (e.g. because their data dir is on /tmp and /tmp is full). Right now it's very 
> difficult to figure out the issue.

