BELUGA BEHR created HDFS-14292:
----------------------------------

             Summary: Introduce Java ExecutorService to DataNode
                 Key: HDFS-14292
                 URL: https://issues.apache.org/jira/browse/HDFS-14292
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: datanode
    Affects Versions: 3.2.0
            Reporter: BELUGA BEHR


I wanted to investigate {{dfs.datanode.max.transfer.threads}} from 
{{hdfs-site.xml}}.  It is described as "Specifies the maximum number of threads 
to use for transferring data in and out of the DN."   The default value is 
4096.  I found it interesting because 4096 threads sounds like a lot to me.  
I'm not sure how a system with 8-16 cores would react to this large a thread 
count.  Intuitively, I would say that the overhead of context switching would 
be immense.

During mt investigation, I discovered the 
[following|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiverServer.java#L203-L216]
 setup in the {{DataXceiverServer}} class:

# A peer connects to a DataNode
# A new thread is spun up to service this connection
# The thread runs to completion
# The tread dies

It would perhaps be better if we used a thread pool to better manage the 
lifecycle of the service threads and to allow the DataNode to re-use existing 
threads, saving on the need to create and spin-up threads on demand.

In this JIRA, I have added a couple of things:

# Added a thread pool that will always maintain a single thread running, always 
awaiting a new connection should one arrive.  On-demand, it will create up to 
{{dfs.datanode.max.transfer.threads}}.  A thread that has completed its prior 
duties will stay idle for up to 30 seconds, it will be retired if no new work 
has arrived.
# Added new methods to the {{Peer}} Interface to allow for better logging and 
less code within each Thread ({{DataXceiver}}).
# Updated the Thread code ({{DataXceiver}}) regarding its interactions with 
{{blockReceiver}} instance variable



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org

Reply via email to