[ 
https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741622#comment-14741622
 ] 

Colin Patrick McCabe commented on HDFS-8873:
--------------------------------------------

Good work, [~templedf].

{code}
  public static final String  DFS_DATANODE_DIRECTORYSCAN_THROTTLE_KEY = 
"dfs.datanode.directoryscan.throttle";
{code}
Should be "throttle.type"?

{code}
649 <property>
650       <name>dfs.datanode.directoryscan.throttle.limit</name>
651       <value>0</value>
652       <description>The limit setting for the report compiler threads 
throttle. The
653       meaning of this setting is determined by the
654       dfs.datanode.directoryscan.throttle setting. In all cases, setting 
this limit
655       to 0 disables compiler thread throttling. 0 is the default setting.
656       </description>
657     </property>
{code}
I would prefer to see per-type configuration keys that are more descriptive.  
For example, {{dfs.datanode.directoryscan.timeslice.throttle.ms.per.sec}}.  If 
we invent more throttle types later, we can always add more configuration key 
names.

testTimesliceThrottle: please copy the Configuration object and change the copy 
rather than mutating {{TestDirectoryScanner#CONF}}.

{code}
604           // Waiting should be about 4x running.
605           ratio = (float)scanner.timeWaiting.get() / 
scanner.timeRunning.get();
606     
607           LOG.info("RATIO: " + ratio);
608           assertTrue("Throttle is too restrictive", ratio <= 4.5);
609           assertTrue("Throttle is too permissive", ratio >= 3.0);
{code}

I'm a bit concerned that other tests running on the same machine, or GCs could 
cause delays here that would make the test fail.  Perhaps we could do this in a 
loop and keep retrying until we found that the ratio was correct?

{code}
84        @VisibleForTesting
85        final AtomicLong timeRunning = new AtomicLong(0L);
{code}
Should be "timeRunningMs" to reflect the fact that this interval is in 
milliseconds.  Similar with "timeWaiting"

{code}
115         public static ThrottleType parse(String type) {
116           if (type.trim().equalsIgnoreCase(TIMESLICE.toString())) {
117             return TIMESLICE;
118           } else {
119             return NONE;
120           }
121         }
{code}
We should log an ERROR message if we can't understand the throttle type.  
Silently defaulting to doing nothing is not behavior most users will appreciate.

DirectoryScanner.java: there are some unnecessary whitespace changes.

TimesliceThrottler: maybe TimeSliceThrottlerTask is a more appropriate name 
here?  I like to think of executors as scheduling tasks.  Arguably the 
throttler state is all contained outside the class so it's not really "the 
throttler."

{code}
1121          } catch (Throwable t) {
1122            LOG.error("Throttle thread died unexpectedly", t);
1123    
1124            if (t instanceof Error) {
1125              throw t;
1126            }
1127          }
{code}
What's the purpose of rethrowing exceptions here?

{code}
  private static class Throttle extends Semaphore {
{code}
While this works, it feels more natural to use a boolean + condition variable 
here.

{code}
try {
  lock.lock();
  while (blockThreads) {
    cond.wait();
  }
} finally {
  lock.unlock();
}
{code}

{code}
74        private final ThrottleType throttleType;
75        private final int throttleLimit;
76        private ScheduledExecutorService throttleThread;
77        private Semaphore runningThreads = new Semaphore(0);
78        private volatile Throttle throttle;
{code}
It feels like this state should be encapsulated inside the Throttler itself.

{code}
860                 // Give our parent a chance to block us for throttling
861                 if (throttle != null) {
862                   throttle.start();
863                 }
{code}
Can we just say that Throttler is always non-null, but sometimes we have a 
{{NoOpThrottler}}?  I don't like all this checking for null business.

You could get rid of the type enum and all those explicit type checks, and just 
have a factory function inside an interface that creates the appropriate kind 
of Throttler object from the Configuration (no-op, timeslice, etc).

> throttle directoryScanner
> -------------------------
>
>                 Key: HDFS-8873
>                 URL: https://issues.apache.org/jira/browse/HDFS-8873
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.7.1
>            Reporter: Nathan Roberts
>            Assignee: Daniel Templeton
>         Attachments: HDFS-8873.001.patch
>
>
> The new 2-level directory layout can make directory scans expensive in terms 
> of disk seeks (see HDFS-8791) for details. 
> It would be good if the directoryScanner() had a configurable duty cycle that 
> would reduce its impact on disk performance (much like the approach in 
> HDFS-8617). 
> Without such a throttle, disks can go 100% busy for many minutes at a time 
> (assuming the common case of all inodes in cache but no directory blocks 
> cached, 64K seeks are required for full directory listing which translates to 
> 655 seconds) 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to