[ 
https://issues.apache.org/jira/browse/HDFS-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627522#comment-13627522
 ] 

Todd Lipcon commented on HDFS-4305:
-----------------------------------

bq. I am not sure if minimum block size is really required. I would rather make 
it a namenode WebUI status to say, your block size is way too small.

What about the case where just one or a few files have the small block size? 
You wouldn't want to surface that on the NN web UI.

The issue I've seen in the past is that some well-meaning but naive user wants 
to get their MR job to generate more splits. They don't know how to do this 
properly within MR, so instead they create the file with a tiny block size like 
1KB, and are then surprised by the really bad performance that results. Having 
some reasonable limit should help keep them from shooting themselves in the foot.
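
To put a number on the 1KB case, here is a rough arithmetic sketch. The 1 GiB file size and the 128 MB comparison block size are assumed values picked for illustration, and the class and method names are hypothetical, not anything from HDFS.

```java
// Illustrative arithmetic only: file size and block sizes are assumed values.
class BlockCountSketch {
    /** Blocks needed to hold fileBytes at the given block size (ceiling division). */
    static long blocksFor(long fileBytes, long blockSizeBytes) {
        if (fileBytes < 0 || blockSizeBytes <= 0) {
            throw new IllegalArgumentException("file size must be >= 0 and block size > 0");
        }
        return (fileBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;           // assumed example file size
        long tinyBlock = 1L << 10;        // the 1KB block size from the comment
        long largerBlock = 128L << 20;    // assumed 128 MB block size for comparison

        System.out.println(blocksFor(oneGiB, tinyBlock));   // 1048576
        System.out.println(blocksFor(oneGiB, largerBlock)); // 8
    }
}
```

Every one of those blocks is something the NN has to track, which is why a single file like this can hurt. If the goal is just more splits, that can be tuned on the MR side (for example via the input format's maximum split size) without touching the block size.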
                
> Add a configurable limit on number of blocks per file, and min block size
> -------------------------------------------------------------------------
>
>                 Key: HDFS-4305
>                 URL: https://issues.apache.org/jira/browse/HDFS-4305
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 1.0.4, 3.0.0, 2.0.2-alpha
>            Reporter: Todd Lipcon
>            Assignee: Andrew Wang
>            Priority: Minor
>         Attachments: hdfs-4305-1.patch
>
>
> We recently had an issue where a user set the block size very low and 
> managed to create a single file with hundreds of thousands of blocks. This 
> caused problems with the edit log since the OP_ADD op was so large 
> (HDFS-4304). I imagine it could also cause efficiency issues in the NN. To 
> prevent users from making such mistakes, we should:
> - introduce a configurable minimum block size, below which requests are 
> rejected
> - introduce a configurable maximum number of blocks per file, above which 
> requests to add another block are rejected (with a default high enough 
> not to block legitimate large files)
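
The two proposed checks could look roughly like the sketch below. This is only an illustration under stated assumptions: the class, method, and field names are hypothetical and are not taken from the HDFS code base or from the attached patch, and the limits are plain constructor arguments rather than real configuration keys.

```java
// Hypothetical sketch of the two limits; all names here are illustrative only.
class FsLimitsSketch {
    private final long minBlockSize;      // assumed configurable minimum, in bytes
    private final long maxBlocksPerFile;  // assumed configurable maximum block count

    FsLimitsSketch(long minBlockSize, long maxBlocksPerFile) {
        this.minBlockSize = minBlockSize;
        this.maxBlocksPerFile = maxBlocksPerFile;
    }

    /** Reject a create request whose block size is below the minimum. */
    void checkCreate(long requestedBlockSize) {
        if (requestedBlockSize < minBlockSize) {
            throw new IllegalArgumentException("Block size " + requestedBlockSize
                + " is below the configured minimum " + minBlockSize);
        }
    }

    /** Reject adding a block to a file that already holds the maximum number. */
    void checkAddBlock(long currentBlockCount) {
        if (currentBlockCount >= maxBlocksPerFile) {
            throw new IllegalStateException("File already has " + currentBlockCount
                + " blocks; the configured maximum is " + maxBlocksPerFile);
        }
    }

    public static void main(String[] args) {
        // Example limits chosen for the demo, not proposed defaults.
        FsLimitsSketch limits = new FsLimitsSketch(1L << 20, 4L);
        limits.checkCreate(64L << 20);  // accepted: above the minimum
        limits.checkAddBlock(3L);       // accepted: a fourth block still fits
        try {
            limits.checkCreate(1024L);  // a 1KB block size is rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Checking the block size when the file is created and the block count each time a block is added means a bad request fails early, before a huge block list can build up for one file.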
