[ 
https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281537#comment-14281537
 ] 

Jesse Yates commented on HDFS-6440:
-----------------------------------

Some follow-up after actually looking at the code:

bq. Is it possible that doWork throws IOException other than RemoteException?
Yup. In fact, the implementation of doWork at EditLogTailer#ln291 can throw an 
IOException if the call to the proxy for rollEditLog throws an IOException. 
Sure, this is a bit brittle: a RemoteException could be thrown by that call 
(or any other) as an IOException, but that really can't be helped because we 
have no other way of differentiating them right now. 
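
To illustrate the limitation, here is a rough sketch only. The nested 
RemoteException is a stand-in for org.apache.hadoop.ipc.RemoteException (which 
also extends IOException), and classify is a hypothetical helper, not code 
from the patch:

```java
import java.io.IOException;

final class ExceptionKindSketch {
    // Stand-in for org.apache.hadoop.ipc.RemoteException (assumption for this sketch).
    static class RemoteException extends IOException {
        RemoteException(String msg) {
            super(msg);
        }
    }

    // Hypothetical classifier: the runtime type is the only signal we have.
    static String classify(IOException e) {
        return (e instanceof RemoteException) ? "remote" : "other";
    }

    public static void main(String[] args) {
        // Thrown directly as a RemoteException: recognizable by type.
        System.out.println(classify(new RemoteException("failed on the remote side")));
        // A local failure, e.g. a connection problem.
        System.out.println(classify(new IOException("connection refused")));
        // A remote failure surfaced as a plain IOException looks like any other IOException.
        System.out.println(classify(new IOException(new RemoteException("wrapped"))));
    }
}
```

The last case is the brittle one: once a remote failure arrives as a plain 
IOException, a type check cannot tell it apart from a local failure.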

bq. 6. needCheckpoint == true implies sendRequest == true, thus when calling 
doCheckpoint(), sendRequest is always true.

Yup, that was a slight logic bug. I think setting sendRequest should look like:
{code:title=StandbyCheckpointer.java}
          // On all nodes, we build the checkpoint. However, we only ship the
          // checkpoint if we have a rollback request, are the checkpointer,
          // or are outside the quiet period.
          boolean sendRequest = needCheckpoint && (isPrimaryCheckPointer
              || secsSinceLast >= checkpointConf.getQuietPeriod());
{code}
so that we don't send the request every time. It wasn't going to break anything 
before, but now it should actually conserve bandwidth :) 
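
To spell out the condition, here is a minimal, self-contained sketch. The 
class and method names are hypothetical, and the quiet period is passed in as 
a plain number of seconds rather than read from checkpointConf:

```java
final class SendRequestSketch {
    // Hypothetical helper mirroring the proposed condition: ship the checkpoint
    // only when one is needed AND we are either the primary checkpointer or
    // at least quietPeriodSecs have passed since the last upload.
    static boolean shouldSendRequest(boolean needCheckpoint,
                                     boolean isPrimaryCheckPointer,
                                     long secsSinceLast,
                                     long quietPeriodSecs) {
        return needCheckpoint
            && (isPrimaryCheckPointer || secsSinceLast >= quietPeriodSecs);
    }

    public static void main(String[] args) {
        // Non-primary standby still inside the quiet period: build but do not ship.
        System.out.println(shouldSendRequest(true, false, 5, 10));
        // Non-primary standby once the quiet period has elapsed: ship.
        System.out.println(shouldSendRequest(true, false, 10, 10));
        // Primary checkpointer: ship whenever a checkpoint is needed.
        System.out.println(shouldSendRequest(true, true, 0, 10));
    }
}
```

With this shape, sendRequest can be false while needCheckpoint is true, which 
is what keeps the non-primary standbys from uploading on every checkpoint.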

bq. 7. Could you break this line
My IDE has that at 99 chars long - isn't 100 chars the standard line width? 
However, I moved the IOE from the rest of the signature up to the second half 
of the method declaration.

bq. 11. Finally, could you reduce the changes in `MiniDFSCluster.java`, as many 
of them are not changed, e.g. `MiniDFSCluster.java:911-986`.
I think I'm at the minimal number of changes there. Git often reports line 
additions and removals when code merely moves around a bit, as this patch 
necessitates. Fortunately, they should be easy to ignore... but let me know if 
I'm missing what you are getting at.

> Support more than 2 NameNodes
> -----------------------------
>
>                 Key: HDFS-6440
>                 URL: https://issues.apache.org/jira/browse/HDFS-6440
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: auto-failover, ha, namenode
>    Affects Versions: 2.4.0
>            Reporter: Jesse Yates
>            Assignee: Jesse Yates
>         Attachments: Multiple-Standby-NameNodes_V1.pdf, 
> hdfs-6440-cdh-4.5-full.patch, hdfs-6440-trunk-v1.patch, 
> hdfs-multiple-snn-trunk-v0.patch
>
>
> Most of the work is already done to support more than 2 NameNodes (one 
> active, one standby). This would be the last bit to support running multiple 
> _standby_ NameNodes; one of the standbys should be available for fail-over.
> Mostly, this is a matter of updating how we parse configurations, some 
> complexity around managing the checkpointing, and updating a whole lot of 
> tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
