[ 
https://issues.apache.org/jira/browse/HDFS-13767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564576#comment-16564576
 ] 

Chen Liang edited comment on HDFS-13767 at 8/1/18 12:59 AM:
------------------------------------------------------------

Posted the WIP.v004 patch. The main changes are:
 # Changed to {{getCorrectLastAppliedOrWrittenTxId}} as Erik suggested.
 # Added a simple unit test and made some minor changes to {{TestObserverNode}}.

I ran into some issues when running the tests. Here is what I suspect is 
happening in {{TestObserverNode}}:

Currently, with msync, every single call to the Observer needs to catch up on 
the state id. When {{ObserverReadProxyProvider}} is created, it makes 
{{reportBadBlocks}} and {{checkAccess}} calls, and these calls also need to 
catch up on the state id. This means there is a wait every time an 
{{ObserverReadProxyProvider}} is created. And since every unit test creates a 
DFS cluster, every single unit test in {{TestObserverNode}} incurs such a wait. 
I could set the period to a small value, but that introduces other issues, 
because several tests depend on explicitly triggered edit tailing calls. So I 
set the period to a smaller, but not too small, number, and the tests still 
take quite some time to run.
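
To illustrate why each call waits, here is a rough sketch of the catch-up 
behavior described above. It is not code from the patch: the class 
{{StateIdGate}} and the methods {{applyUpTo}} and {{awaitCaughtUp}} are 
hypothetical names, and the sketch assumes the Observer simply blocks a read 
until its last applied transaction id reaches the state id the client has seen.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical model of an Observer's catch-up wait; not actual HDFS code. */
public class StateIdGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition advanced = lock.newCondition();
    private long lastAppliedTxId;

    public StateIdGate(long initialTxId) {
        this.lastAppliedTxId = initialTxId;
    }

    /** Called when edits have been tailed and applied up to txId. */
    public void applyUpTo(long txId) {
        lock.lock();
        try {
            if (txId > lastAppliedTxId) {
                lastAppliedTxId = txId;
                advanced.signalAll(); // wake any reads waiting to catch up
            }
        } finally {
            lock.unlock();
        }
    }

    /**
     * Blocks until the applied txid reaches clientStateId or the timeout
     * expires. Returns true if caught up, false on timeout.
     */
    public boolean awaitCaughtUp(long clientStateId, long timeoutMs)
            throws InterruptedException {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        lock.lock();
        try {
            while (lastAppliedTxId < clientStateId) {
                if (nanos <= 0L) {
                    return false;
                }
                nanos = advanced.awaitNanos(nanos);
            }
            return true;
        } finally {
            lock.unlock();
        }
    }
}
```

In this model, a read issued before the next round of edit tailing stays 
blocked until {{applyUpTo}} is called, which is why a longer period makes each 
test that creates a proxy provider slower.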

I think the right fix would be to have {{ObserverReadProxyProvider}} not make 
those calls when it is created, and instead make a new special call that 
bypasses the catching up (if possible). This is one of those hacky pieces of 
code we are planning to optimize in the future. Also, the ongoing work in 
HDFS-13523 may simplify the tests. For now, I tend to believe that having these 
slow tests is acceptable.


was (Author: vagarychen):
Posted the WIP.v004 patch. The main changes are:
 # Changed to {{getCorrectLastAppliedOrWrittenTxId}} as Erik suggested.
 # Added a simple unit test and made some minor changes to {{TestObserverNode}}.

I ran into some issues when running the tests. Here is what I suspect is 
happening in {{TestObserverNode}}:

Currently, with msync, every single call to the Observer needs to catch up on 
the state id. When {{ObserverReadProxyProvider}} is created, it makes 
{{reportBadBlocks}} and {{checkAccess}} calls. These calls are also reads, so 
they also need to catch up on the state id. This means there is a wait every 
time an {{ObserverReadProxyProvider}} is created. And since every unit test 
creates a DFS cluster, every single unit test in {{TestObserverNode}} incurs 
such a wait. I could set the period to a small value, but that introduces other 
issues, because several tests depend on explicitly triggered edit tailing 
calls. So I set the period to a smaller, but not too small, number, and the 
tests still take quite some time to run.

I think the right fix would be to have {{ObserverReadProxyProvider}} not make 
those read calls when it is created. This is one of those hacky pieces of code 
we are planning to optimize in the future. Also, the ongoing work in HDFS-13523 
may simplify the tests. For now, I tend to believe that having these slow tests 
is acceptable.

> Add msync server implementation.
> --------------------------------
>
>                 Key: HDFS-13767
>                 URL: https://issues.apache.org/jira/browse/HDFS-13767
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>            Priority: Major
>         Attachments: HDFS-13767.WIP.001.patch, HDFS-13767.WIP.002.patch, 
> HDFS-13767.WIP.003.patch, HDFS-13767.WIP.004.patch
>
>
> This is a follow-up to HDFS-13688, where the msync API was introduced to 
> {{ClientProtocol}} but the server-side implementation is missing. This Jira 
> is to implement the server-side logic.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
