[ 
https://issues.apache.org/jira/browse/HDFS-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960114#comment-14960114
 ] 

Yi Liu edited comment on HDFS-7964 at 10/16/15 6:40 AM:
--------------------------------------------------------

Thanks [~daryn] for the work.

Further comments:

*1.* In FSEditLogAsync#run
{code}
@Override
  public void run() {
    try {
      while (true) {
        ....
        if (doSync) {
          ...
            logSync(getLastWrittenTxId());
  ...
{code}
I think it would be better to pass the txid of the current edit to {{logSync}}, 
so there is no need to wait for all written txids to be synced. That would be 
more efficient, and the client could get a faster response?
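
To illustrate the suggestion, here is a minimal toy model, not the actual {{FSEditLogAsync}} code. The class {{SyncTargetSketch}} and its fields are hypothetical, and it assumes that a flush persists everything buffered so far and that {{logSync}} returns early when the requested txid is already synced. Under those assumptions, syncing on each edit's own txid can skip a flush that syncing on {{getLastWrittenTxId()}} would force:

```java
// Hypothetical toy model of an edit log; names and behavior are assumptions
// made for illustration, not the HDFS implementation.
class SyncTargetSketch {
    private long lastWrittenTxId = 0; // highest txid buffered so far
    private long syncedTxId = 0;      // highest txid already made durable
    private int flushCount = 0;       // number of (expensive) flushes performed

    // Buffer one edit and return the txid assigned to it.
    long logEdit() {
        return ++lastWrittenTxId;
    }

    long getLastWrittenTxId() {
        return lastWrittenTxId;
    }

    int getFlushCount() {
        return flushCount;
    }

    // Ensure txid is durable. Returns early if an earlier flush already
    // covered it; otherwise flushes everything buffered (assumed behavior).
    void logSync(long txid) {
        if (txid <= syncedTxId) {
            return;
        }
        syncedTxId = lastWrittenTxId;
        flushCount++;
    }

    // Replays the same sequence of edits with either sync target and
    // returns how many flushes were needed.
    static int run(boolean useOwnTxId) {
        SyncTargetSketch log = new SyncTargetSketch();
        long first = log.logEdit();   // txid 1
        long second = log.logEdit();  // txid 2
        log.logSync(useOwnTxId ? first : log.getLastWrittenTxId());
        log.logEdit();                // txid 3 arrives before edit 2 is handled
        log.logSync(useOwnTxId ? second : log.getLastWrittenTxId());
        return log.getFlushCount();
    }

    public static void main(String[] args) {
        System.out.println("own txid flushes: " + run(true));
        System.out.println("last written txid flushes: " + run(false));
    }
}
```

In this model the response for edit 2 does not have to wait for edit 3 to be flushed when its own txid is used as the sync target.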

*2.*
{code}
-log4j.rootLogger=OFF, CONSOLE
+log4j.rootLogger=DEBUG, CONSOLE
{code}
Any reason to change it?

*3.*
{code}
call.abortResponse(syncEx);
{code}
Seems this code is not available?


was (Author: hitliuyi):
Thanks [~daryn] for the work.

Further comments:

*1.* In FSEditLogAsync#run
{code}
@Override
  public void run() {
    try {
      while (true) {
        ....
        if (doSync) {
          ...
            logSync(getLastWrittenTxId());
  ...
{code}
I think it would be better to pass the txid of the current edit to {{logSync}}, 
so there is no need to wait for all written txids to be synced. That would be 
more efficient, and the client could get a faster response?

*2.*
{code}
+          editsBatchedInSync = txid - synctxid - 1;
{code}
Isn't it "txid - synctxid"?   The txid is the max txid written, and synctxid is 
the max txid already synced. Suppose txid = 20 and synctxid = 10; then 
editsBatchedInSync should be (txid - synctxid) = (20 - 10) = 10.   You can also 
see this in the existing log message:
{code}
final String msg =
                "Could not sync enough journals to persistent storage " +
                "due to " + e.getMessage() + ". " +
                "Unsynced transactions: " + (txid - synctxid);
{code}
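
As a quick check of the arithmetic above, the sketch below counts the txids in the range (synctxid, txid] by enumeration and compares the result with the closed form. The class and method names are hypothetical. It follows the reading that every written-but-unsynced edit counts; if {{editsBatchedInSync}} is meant to exclude the edit that triggered the sync, the {{- 1}} would be intentional.

```java
// Hypothetical helper for illustration only; not part of the patch.
class BatchedEditsSketch {
    // Closed form: number of txids written but not yet synced.
    static long unsyncedCount(long txid, long synctxid) {
        return txid - synctxid;
    }

    // Same quantity by enumerating synctxid + 1 .. txid inclusive.
    static long countByEnumeration(long txid, long synctxid) {
        long n = 0;
        for (long t = synctxid + 1; t <= txid; t++) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(unsyncedCount(20, 10));
        System.out.println(countByEnumeration(20, 10));
    }
}
```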

*3.*
{code}
-log4j.rootLogger=OFF, CONSOLE
+log4j.rootLogger=DEBUG, CONSOLE
{code}
Any reason to change it?

*4.*
{code}
call.abortResponse(syncEx);
{code}
Seems this code is not available?

> Add support for async edit logging
> ----------------------------------
>
>                 Key: HDFS-7964
>                 URL: https://issues.apache.org/jira/browse/HDFS-7964
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: 2.0.2-alpha
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HDFS-7964.patch, HDFS-7964.patch
>
>
> Edit logging is a major source of contention within the NN.  LogEdit is 
> called within the namespace write lock, while logSync is called outside of the 
> lock to allow greater concurrency.  The handler thread remains busy until 
> logSync returns to provide the client with a durability guarantee for the 
> response.
> Write-heavy RPC load and/or slow IO causes handlers to stall in logSync.  
> Although the write lock is not held, readers are limited/starved and the call 
> queue fills.  Combining an edit log thread with postponed RPC responses from 
> HADOOP-10300 will provide the same durability guarantee but immediately free 
> up the handlers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
