[ https://issues.apache.org/jira/browse/HDFS-5499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819712#comment-13819712 ]
Haohui Mai commented on HDFS-5499:
----------------------------------

It's problematic to do throttling using a cluster-wide configuration. For trusted programs this approach works fine. For untrusted programs, there's no way to guarantee that they will respect the configuration; they can ignore it and take all available I/O bandwidth to make themselves run faster.

Another concern is that you might not be able to make optimal decisions at the filesystem layer. For example, in many cases you don't want any throttling for LocalFileSystem at all. Another example is that touching ByteRangeInputStream can throttle webhdfs accidentally, since the code is shared by Hftp / Hsftp / WebHdfs, etc.

I assume that you're running your jobs in a trusted environment. Does implementing a decorator class for FileSystem solve your problem? The class can wrap every Input/OutputStream in an adaptor that implements throttling. Code that wants throttling has to use the new API, but it seems to me that this is a cleaner and simpler route to take.

> Provide way to throttle per FileSystem read/write bandwidth
> -----------------------------------------------------------
>
>                 Key: HDFS-5499
>                 URL: https://issues.apache.org/jira/browse/HDFS-5499
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Lohit Vijayarenu
>         Attachments: HDFS-5499.1.patch
>
>
> In some cases it might be worth throttling read/write bandwidth on a per-JVM
> basis so that clients do not spawn too many threads and start data movement
> that causes other JVMs to starve. The ability to throttle read/write bandwidth
> per FileSystem would help avoid such issues.
> The challenge seems to be how well this can be fit into the FileSystem code.
> If one enables throttling around the FileSystem APIs, then any hidden data
> transfer within the cluster that uses them might also be affected, e.g. copying
> the job jar during job submission, localizing resources for the distributed
> cache, and such.
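
For illustration, the stream adaptor mentioned in the comment could look roughly like the sketch below. It is not taken from HDFS-5499.1.patch. The class name ThrottlingInputStream, the maxBytesPerSecond parameter, and the sleep-based pacing are all assumptions made for this example. It uses only java.io, so that it stays self-contained. A real FileSystem decorator would hand out such wrappers from its open() calls and would also have to fit Hadoop's own stream types, which is not shown here.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;

/**
 * Hypothetical adaptor: caps the average read rate of the wrapped stream.
 * The rate is averaged from the moment the wrapper is constructed.
 */
class ThrottlingInputStream extends FilterInputStream {
    private final long maxBytesPerSecond;
    private final long startNanos = System.nanoTime();
    private long bytesRead = 0;

    ThrottlingInputStream(InputStream in, long maxBytesPerSecond) {
        super(in);
        if (maxBytesPerSecond <= 0) {
            throw new IllegalArgumentException("maxBytesPerSecond must be positive");
        }
        this.maxBytesPerSecond = maxBytesPerSecond;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            pace(1);
        }
        return b;
    }

    // read(byte[]) in FilterInputStream delegates to this method,
    // so bulk reads are accounted for here as well.
    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            pace(n);
        }
        return n;
    }

    long getBytesRead() {
        return bytesRead;
    }

    /** Records n bytes and sleeps until the average rate is back under the cap. */
    private void pace(int n) throws IOException {
        bytesRead += n;
        // Earliest elapsed time (in ns) at which bytesRead fits the budget.
        long earliestNanos = (long) (bytesRead * 1e9 / maxBytesPerSecond);
        long sleepNanos = earliestNanos - (System.nanoTime() - startNanos);
        if (sleepNanos > 0) {
            try {
                Thread.sleep(sleepNanos / 1_000_000L, (int) (sleepNanos % 1_000_000L));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("interrupted while throttling");
            }
        }
    }
}
```

A caller would opt in explicitly, for example with new ThrottlingInputStream(rawStream, 10L * 1024 * 1024) for roughly 10 MB/s. Code paths that never ask for the wrapper stay unthrottled, which is the point of the decorator route. An output-side adaptor would follow the same pattern around write().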