[jira] [Comment Edited] (HDFS-12943) Consistent Reads from Standby Node

Brahma Reddy Battula (JIRA) Mon, 17 Dec 2018 06:11:22 -0800


    [ 
https://issues.apache.org/jira/browse/HDFS-12943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16722550#comment-16722550
 ]


Brahma Reddy Battula edited comment on HDFS-12943 at 12/17/18 2:10 PM:
-----------------------------------------------------------------------

{quote}I think when we discuss a "request", we need to differentiate an RPC 
request originating from a Java application (MapReduce task, etc.) vs. a CLI 
request. The former will be the vast majority of operations on a typical 
cluster, so I would argue that optimizing for the performance and efficiency of 
that usage is much more important.
{quote}
Agreed, I could have mentioned that it was the CLI. But the getHAServiceState()
call from ORP took 2s+, as I mentioned above. By the way, my intent was to ask:
when reads and writes are combined in a single application, how much will the
impact be, since the client needs to switch between NameNodes?

Just out of curiosity, do we have write benchmarks with and without ORP? I
didn't find any in HDFS-14058 and HDFS-14059.
{quote}1. Are you running with HDFS-13873? With this patch (only committed 
yesterday so I doubt you have it) the exception thrown should be more 
meaningful.
{quote}
Yes, with the latest HDFS-12943 branch.
{quote}2. Did you remember to enable in-progress edit log tailing?
{quote}
Yes, enabled for the three NNs.
{quote}3. Was this run on an almost completely stagnant cluster (no other 
writes)? This can make the ANN flush its edits to the JNs less frequently, 
increasing the lag time between ANN and Observer.
{quote}
Yes, no other writes.

 
 Tried the following test with and with ORF,Came to know it's(perf impact) 
based on the tailing edits("*dfs.ha.tail-edits.period") which is default 1m.(In 
tests, it's 100MS)..*
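{code:java}
// Fragment of an existing test class; dfs, testPath and assertSentTo()
// come from that class.
@Test
public void testSimpleRead() throws Exception {
  long totalMkdirs = 0;
  long totalFileStatus = 0;
  long totalContentSummary = 0;
  int num = 100;
  for (int i = 0; i < num; i++) {
    Path testPath1 = new Path(testPath, "test1" + i);

    // Write: time mkdirs and check it was sent to NameNode 0.
    long start = System.currentTimeMillis();
    assertTrue(dfs.mkdirs(testPath1, FsPermission.getDefault()));
    long mkdirsTime = System.currentTimeMillis() - start;
    System.out.println("time taken mkdirs: " + i + " : " + mkdirsTime);
    totalMkdirs += mkdirsTime;
    assertSentTo(0);

    // Read: time getContentSummary and check it was sent to NameNode 2.
    start = System.currentTimeMillis();
    dfs.getContentSummary(testPath1);
    long contentSummaryTime = System.currentTimeMillis() - start;
    System.out.println("time taken getContentSummary: " + i + " : "
        + contentSummaryTime);
    totalContentSummary += contentSummaryTime;
    assertSentTo(2);

    // Read: time getFileStatus and check it was sent to NameNode 2.
    start = System.currentTimeMillis();
    dfs.getFileStatus(testPath1);
    long fileStatusTime = System.currentTimeMillis() - start;
    System.out.println("time taken getFileStatus: " + i + " : "
        + fileStatusTime);
    totalFileStatus += fileStatusTime;
    assertSentTo(2);
  }
  // Average latency per operation in milliseconds (integer division).
  System.out.println("AVG: mkdirs: " + totalMkdirs / num
      + " getFileStatus: " + totalFileStatus / num
      + " getContentSummary: " + totalContentSummary / num);
}
{code}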
IMO, configuring a small value (like 100ms) for reading in-progress edits puts
load on the JournalNodes until the log roll happens (2 mins by default), since
it opens a stream to read the edits.
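
To put rough numbers on that concern, here is a back-of-the-envelope sketch.
It assumes one tailing request per tailing period per tailing NameNode and
ignores the time each request takes; the class and method names are
hypothetical and are not part of HDFS.

```java
import java.time.Duration;

public class TailEditsLoad {
    /**
     * Rough number of tailing polls one NameNode issues during a single
     * log-roll interval, assuming exactly one poll per tailing period.
     */
    public static long pollsPerRoll(Duration tailPeriod, Duration rollPeriod) {
        return rollPeriod.toMillis() / tailPeriod.toMillis();
    }

    public static void main(String[] args) {
        // Values taken from the comment above.
        Duration roll = Duration.ofMinutes(2);          // log roll interval
        Duration testPeriod = Duration.ofMillis(100);   // tailing period in tests
        Duration defaultPeriod = Duration.ofMinutes(1); // default tailing period

        System.out.println("100ms period: "
            + pollsPerRoll(testPeriod, roll) + " polls per roll");
        System.out.println("1m period: "
            + pollsPerRoll(defaultPeriod, roll) + " polls per roll");
    }
}
```

Under these assumptions the 100ms setting issues 1200 polls per roll interval
from each tailing NameNode, against 2 with the 1m default. This only counts
requests; it says nothing about how expensive each one is on the JournalNode.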

Apart from the perf, I have the following queries:
 i) Did we try with the C/C++ client?
 ii) Are we planning separate metrics for observer reads (client side)?
Applications like MapReduce might find them helpful for job counters.

 


> Consistent Reads from Standby Node
> ----------------------------------
>
>                 Key: HDFS-12943
>                 URL: https://issues.apache.org/jira/browse/HDFS-12943
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs
>            Reporter: Konstantin Shvachko
>            Priority: Major
>         Attachments: ConsistentReadsFromStandbyNode.pdf, 
> ConsistentReadsFromStandbyNode.pdf, HDFS-12943-001.patch, 
> HDFS-12943-002.patch, TestPlan-ConsistentReadsFromStandbyNode.pdf
>
>
> StandbyNode in HDFS is a replica of the active NameNode. The states of the 
> NameNodes are coordinated via the journal. It is natural to consider 
> StandbyNode as a read-only replica. As with any replicated distributed system 
> the problem of stale reads should be resolved. Our main goal is to provide 
> reads from standby in a consistent way in order to enable a wide range of 
> existing applications running on top of HDFS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
