[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350302#comment-17350302 ]

zhuobin zheng commented on HDFS-13522:

Hi [~hemanthboyina], [^HDFS-13522_WIP.patch] is a nice patch! Since the code has not been updated for a long time, I have tried to update it for the current trunk branch ([^HDFS-13522.002.patch]). I only made the following changes:
# Adapted the code to current trunk.
# Fixed the WebHDFS NPE (patch lines 1085 and 1070: null check before use).
# Added a double cache in MembershipNamenodeResolver, to avoid sorting the NNs on every call (field: observerFirstCacheNS).
# Updated the Observer NN state to unavailable, to avoid accessing an unavailable NN (patch line 875).
# msync is now locked at the NS level, not the global level. Also removed the unlock logic when sync is configured as 0 ms, to reduce the number of msync calls. (This may be an unnecessary optimization; the logic can be added back if it is useful.) (Patch line 1038.)

Could you help review this when you have time? [~csun] [~xkrogen] [~hemanthboyina] [~surendralilhore] [~crh]

> Support observer node from Router-Based Federation
>
> Key: HDFS-13522
> URL: https://issues.apache.org/jira/browse/HDFS-13522
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: federation, namenode
> Reporter: Erik Krogen
> Assignee: Chao Sun
> Priority: Major
> Labels: pull-request-available
> Attachments: HDFS-13522.001.patch, HDFS-13522.002.patch, HDFS-13522_WIP.patch, RBF_ Observer support.pdf, Router+Observer RPC clogging.png, ShortTerm-Routers+Observer.png
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> Changes will need to occur to the router to support the new observer node.
> One such change will be to make the router understand the observer state, e.g. {{FederationNamenodeServiceState}}.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
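The per-nameservice locking mentioned in the last item of the comment above could look roughly like the following. This is only a sketch of the general idea, not the code from the patch: the class name {{PerNameserviceLockSketch}}, its methods, and the {{Runnable}} standing in for the msync call are all hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper: one lock per nameservice instead of one global lock. */
final class PerNameserviceLockSketch {
  private final ConcurrentMap<String, ReentrantLock> locks =
      new ConcurrentHashMap<>();

  /** Returns the lock for a nameservice, creating it on first use. */
  ReentrantLock lockFor(String nsId) {
    return locks.computeIfAbsent(nsId, k -> new ReentrantLock());
  }

  /**
   * Runs the given action (an assumed stand-in for an msync call) while
   * holding only the lock of that nameservice, so a sync against one
   * nameservice does not block a sync against another.
   */
  void runLocked(String nsId, Runnable msync) {
    ReentrantLock lock = lockFor(nsId);
    lock.lock();
    try {
      msync.run();
    } finally {
      lock.unlock();
    }
  }
}
```

With a single global lock, a slow sync against one nameservice would delay reads for every other nameservice; keying the lock by nameservice ID avoids that coupling.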
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192877#comment-17192877 ]

Hemanth Boyina commented on HDFS-13522:

Thanks [~surendrasingh] for the review.
{quote}
# Load balancing between multiple observer.
{quote}
We shuffle the observers so that the same observer doesn't always receive the call:
{code:java}
// shuffle the observers if there is more than one observer,
// so the same observer doesn't always come first
if (observerRead && observerMemberships.size() > 1) {
  Collections.shuffle(observerMemberships);
  Collections
      .sort(nonObserverMemberships, new NamenodePriorityComparator());
}
{code}
{quote}2. Webhdfs call, I think you may get NPE for webhdfs call
{quote}
I will fix this, add a test case, and update the patch.
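For readers without the patch at hand, the shuffle-then-sort idea from the snippet above can be tried in isolation. The sketch below is a standalone illustration under assumptions: {{NamenodeEntry}}, its string states, and the "active first" comparator are hypothetical stand-ins, not the RBF membership classes or {{NamenodePriorityComparator}}.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

/** Hypothetical stand-in for a namenode membership record. */
final class NamenodeEntry {
  final String id;
  final String state; // "ACTIVE", "STANDBY" or "OBSERVER" (illustrative)

  NamenodeEntry(String id, String state) {
    this.id = id;
    this.state = state;
  }
}

final class ObserverOrderingSketch {
  /**
   * For observer reads, returns the observers first in shuffled order,
   * followed by the remaining namenodes with ACTIVE ahead of the others.
   * Without observer reads, observers are treated like any non-active node.
   */
  static List<NamenodeEntry> order(
      List<NamenodeEntry> all, boolean observerRead, Random rnd) {
    List<NamenodeEntry> observers = new ArrayList<>();
    List<NamenodeEntry> others = new ArrayList<>();
    for (NamenodeEntry e : all) {
      if (observerRead && "OBSERVER".equals(e.state)) {
        observers.add(e);
      } else {
        others.add(e);
      }
    }
    // Shuffle so the same observer does not always receive the call.
    if (observers.size() > 1) {
      Collections.shuffle(observers, rnd);
    }
    // List.sort is stable, so non-active nodes keep their relative order.
    others.sort(Comparator.comparingInt(e -> "ACTIVE".equals(e.state) ? 0 : 1));
    List<NamenodeEntry> result = new ArrayList<>(observers);
    result.addAll(others);
    return result;
  }
}
```

A uniform shuffle spreads calls evenly only on average; it does not account for the current load on each observer.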
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191970#comment-17191970 ]

Surendra Singh Lilhore commented on HDFS-13522:

Hi [~hemanthboyina],

In my initial review I found two things that need to be taken care of:
# Load balancing between multiple observers.
# WebHDFS calls: I think you may get an NPE for a WebHDFS call.

I will review this patch in detail.
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191824#comment-17191824 ]

Hemanth Boyina commented on HDFS-13522:

Thanks for the comments [~elgoiri] [~csun] [~crh].
{quote}can you help me understand if the consistency guarantees are same with and without router or router relaxes the consistency guarantees
{quote}
Yes, the router guarantees consistency: for a read call, the router first does an msync on all namespaces.
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190767#comment-17190767 ]

CR Hota commented on HDFS-13522:

[~elgoiri] Thanks for following up.

[~hemanthboyina] Thanks for uploading the patch, and feel free to take this jira. I can also help with the code review. Meanwhile, can you help me understand whether the consistency guarantees are the same with and without the router, or whether the router relaxes them? This was a discussion point when we were last working on this. Please refer to the notes in the thread. The last design doc that was uploaded was intended to allow routers in the middle to still honor the same consistency guarantees that client to Namenode/ObserverNamenode communication honors without routers.
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190483#comment-17190483 ]

Chao Sun commented on HDFS-13522:

[~hemanthboyina] feel free to take over this. I haven't had a chance to work on it, but I think it is an important feature. I may be able to help with code review.
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190360#comment-17190360 ]

Íñigo Goiri commented on HDFS-13522:

Thanks [~hemanthboyina] for the update. The patch has a couple of things that I would try to fix but it looks like the right approach to me. We may want to discuss adding the contexts and so on but I would move forward with that.
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190290#comment-17190290 ]

Hemanth Boyina commented on HDFS-13522:

Thanks everyone for the discussions here. At Huawei, we have developed and have been using the router with observer nodes for quite some time; please check [^HDFS-13522_WIP.patch].
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189824#comment-17189824 ]

Íñigo Goiri commented on HDFS-13522:

[~csun], [~crh], any updates on this? Did you move this forward?
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978082#comment-16978082 ]

Surendra Singh Lilhore commented on HDFS-13522:

{quote}anyone interested taking this ahead?
{quote}
Thanks [~ayushtkn] for the ping.
{quote}I started reading but got an initial doubt, regarding the need to split read and write routers. I think we can use only one kind of routers itself.
{quote}
I am also thinking of using the same router for observer calls instead of adding a new role for the router. A new role would increase the complexity of the cluster, and HDFS is already overloaded with different process roles.
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977594#comment-16977594 ]

Ayush Saxena commented on HDFS-13522:

[~surendrasingh] [~crh] is anyone interested in taking this forward?
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16929700#comment-16929700 ]

Ayush Saxena commented on HDFS-13522:

Thanks [~crh] for the design.

I started reading but have an initial doubt regarding the need to split read and write routers. I think we can use only one kind of router. The reason for the split for observer reads seems to be, here too, to differentiate calls between the active NN for writes and the Observer NN for reads. Can this not be done in the existing routers? We can check whether the stateId is set; if it is, the client is using {{ObserverProxyProvider}} and we can direct the call to the Observer NN. If not, we can follow the normal flow as it is.

Let me know if I missed something here.
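The check proposed in the comment above amounts to a small routing decision inside a single kind of router. The following sketch only illustrates that decision; {{StateIdRoutingSketch}}, {{Target}}, and the use of {{OptionalLong}} to model "stateId is set" are assumptions made for the example, not router code.

```java
import java.util.OptionalLong;

/** Hypothetical illustration of routing by presence of a client state ID. */
final class StateIdRoutingSketch {
  enum Target { ACTIVE, OBSERVER }

  /**
   * Picks the downstream namenode type for one call.
   *
   * @param isRead whether the call is a read operation
   * @param clientStateId the state ID sent by the client, empty if none was set
   */
  static Target choose(boolean isRead, OptionalLong clientStateId) {
    // A set state ID is taken as the signal that the client opted in to
    // observer reads; everything else follows the normal (active) flow.
    if (isRead && clientStateId.isPresent()) {
      return Target.OBSERVER;
    }
    return Target.ACTIVE;
  }
}
```

Writes always go to the active namenode regardless of the state ID, which is what keeps a single router type sufficient in this proposal.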
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844322#comment-16844322 ]

CR Hota commented on HDFS-13522:

[~ayushtkn] [~elgoiri] [~brahmareddy] [~surendrasingh] I have uploaded the high-level design. Most of the things are already known; the main focus is on state management.
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838266#comment-16838266 ]

CR Hota commented on HDFS-13522:

[~ayushtkn] Thanks for the comments. I am working on a high-level design document and should be able to share it later this week.

At a high level, the reason I brought up HDFS-14090 is that these 2 jiras have to be thought about together to solve the whole Observer support. But surely, we can discuss the isolation design and details in HDFS-14090 and focus only on Observer here.
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16837545#comment-16837545 ]

Ayush Saxena commented on HDFS-13522:

Thanks everyone for the discussion!

I guess we should focus primarily on SBN here and discuss the queue-related work mainly in HDFS-14090.

[~crh], I had a quick look at the proposed design.
* Is the main idea behind splitting the router into two halves that the read-only router appears as an Observer and the write router as Active to the end client, so that an existing SBN client can connect to RBF the same way it connects to a normal NS?
* I guess if we intend to change it this way, we should ensure that the older setup, where routers serve all requests, stays intact. Maybe the write routers stay as regular routers (the present ones) serving all requests (both read and write), and the read-only ones come up as an addition.
* If I understand it correctly, how do you plan to handle the state ID for the client when it connects through the router? In the case of a single NS that was pretty straightforward, but the router connects to multiple NSs. If you plan to handle it at the router here too (msync for every call), I am not sure whether this approach would have any edge over the one already posted by Surendra. Nor would the intent of keeping the client process exactly the same as in the present SBN scenario hold, since for read-your-own-writes the client doesn't require an msync.

It would be great if you could share some more details and plans regarding it. :)
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16834424#comment-16834424 ]

Surendra Singh Lilhore commented on HDFS-13522:

{quote}For implementing a queue similar to FairCallQueue, Handlers have to understand which downstream namenode a call is meant for and also whether its a read/write. These aspects of a RPC can be inspected only once the call is handled at the ClientProtocol implementation layer which callqueue won't have any visibility into.
{quote}
{{org.apache.hadoop.ipc.Server.Call.isCallCoordinated()}}: this API tells whether a call is a read or a write, and it is available in the handlers. Based on this API we can add the call to the respective queue. Currently, namespace info is not available in the handlers.
{quote}Since we haven't yet finalized how to solve these, in the interim we have taken a approach as shown in ShortTerm-Routers+Observer.png.
{quote}
Do you mean that we can give the router the same states as the namenode (ACTIVE, OBSERVER) and use the functionality of {{ObserverReadProxyProvider}}, or are you going to change the client to support read-only/write-only routers?
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16834021#comment-16834021 ]

CR Hota commented on HDFS-13522:

[~surendrasingh] Thanks for your comments.

For implementing a queue similar to FairCallQueue, handlers have to understand which downstream namenode a call is meant for and also whether it is a read or a write. These aspects of an RPC can be inspected only once the call is handled at the ClientProtocol implementation layer, which the call queue won't have any visibility into.

Since we haven't yet finalized how to solve these, in the interim we have taken the approach shown in ShortTerm-Routers+Observer.png.
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832414#comment-16832414 ]

Surendra Singh Lilhore commented on HDFS-13522:

Thanks [~csun] and [~crh].
{quote}Separation of read vs write queuing in routers to begin with to fundamental continue maintaining a fast lane access for read calls. This work should create foundations to help to eventually separate read vs write per nameservice thus helping solve HDFS-14090.
{quote}
This can be done by implementing a new queue similar to {{FairCallQueue}}. The new queue implementation can create two internal queues, one for read calls and one for write calls.
{quote}Honor existing client configurations around whether or not to access Observer NN for certain use cases. For ex: a client can currently continue using ConfiguredFailOverProxy without connecting to Observer. If a client maintains such preference, routers should honor it and connect to Active Namenode for read calls as well.
{quote}
With router federation, the client will not use {{clientStateId}}. We can use this to check whether the client wants to read from the observer or from the active. If the clientStateId is {{Long.MIN_VALUE}}, the read is sent to the observer; otherwise it is sent to the active.
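The "two internal queues" idea from the comment above can be illustrated with a small standalone class. This is a sketch under assumptions, not an implementation of Hadoop's call queue interface and not {{FairCallQueue}} itself: {{ReadWriteCallQueueSketch}} and its alternating poll policy are hypothetical, and the read/write flag is assumed to be known when the call is enqueued.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical queue with separate lanes for read and write calls. */
final class ReadWriteCallQueueSketch<E> {
  private final BlockingQueue<E> reads = new LinkedBlockingQueue<>();
  private final BlockingQueue<E> writes = new LinkedBlockingQueue<>();
  private boolean lastWasRead = false; // guarded by "this"

  /** Enqueues a call into the read lane or the write lane. */
  void put(E call, boolean isRead) {
    (isRead ? reads : writes).add(call);
  }

  /**
   * Non-blocking poll that alternates between the lanes, so a backlog of
   * writes cannot delay every read (and vice versa). Returns null if both
   * lanes are empty.
   */
  synchronized E poll() {
    BlockingQueue<E> first = lastWasRead ? writes : reads;
    BlockingQueue<E> second = lastWasRead ? reads : writes;
    E call = first.poll();
    if (call != null) {
      lastWasRead = (first == reads);
      return call;
    }
    call = second.poll();
    if (call != null) {
      lastWasRead = (second == reads);
    }
    return call;
  }
}
```

Strict alternation is only one possible policy; a weighted scheme that favors reads would serve the "fast lane" goal more directly, at the cost of potentially delaying writes.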
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16831823#comment-16831823 ]

CR Hota commented on HDFS-13522:

[~elgoiri] [~ayushtkn] [~csun] Thanks a lot for the discussion, and [~surendrasingh], many thanks for your initial patch. Great to see more interest in this work.

Based on our understanding, below are the problem statement and the design exit criteria we may want to look at. Attached is a file that represents the problem statement pictorially.

h1. Problem statement

Figure 1 in the attached Router+Observer RPC clogging.png shows a typical RPC mechanism with respect to the active namenode and the observer namenode. In this case, observer namenodes strictly process read-only requests from clients. Since there is no global lock or contention, RPC queue wait times are lower and processing times are significantly faster in observer namenodes than in the active namenode.

With router-based federation, a proxy layer is introduced that serves incoming client traffic on behalf of the client and performs the actual action against the downstream namenode. _This server proxy layer inherits the same server implementation that namenodes utilize_. With a single RPC queue in the router, all reads and writes get intermingled again, substantially diminishing the benefits of the observer namenode. Figure 2 illustrates this behavior. This is particularly problematic for RPC-latency-sensitive real-time engines such as Presto.

h1. Design criteria

At a high level, the design of this feature should help achieve the 2 key objectives below.
# Separation of read vs. write queuing in routers, to begin with, so that a fast lane for read calls is fundamentally maintained. This work should create the foundations to eventually separate read vs. write per nameservice, thus helping solve HDFS-14090.
# Honor existing client configurations around whether or not to access the Observer NN for certain use cases. For example, a client can currently continue using ConfiguredFailOverProxy without connecting to the Observer. If a client maintains such a preference, routers should honor it and connect to the Active Namenode for read calls as well.
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16831447#comment-16831447 ]

Chao Sun commented on HDFS-13522:

Thanks [~surendrasingh] for the patch, and [~ayushtkn] for the comments. It's great that you are interested in this work. [~crh] and I had a look at the patch and had an offline discussion about the design. [~crh] will post some comments on this soon.
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16831439#comment-16831439 ] Surendra Singh Lilhore commented on HDFS-13522:
Thanks [~elgoiri] for the review. I just tried this solution and attached the patch to get others' opinions. I am planning to use this in a production environment. [~csun], this JIRA is assigned to you; if you want to work on it or have a better solution, you can continue :).
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830536#comment-16830536 ] Íñigo Goiri commented on HDFS-13522:
Thank you [~surendrasingh], I think the idea in [^HDFS-13522.001.patch] looks good. I think the counting of the RPC queries is a little out of place. The metrics already track this kind of thing (e.g., calls to the State Store or calls to Standby NNs). We should try to leverage the same. For checking whether it is a READ operation, we also track that in OpCategory in RouterRpcServer. Not sure if we should leverage that or rely on the observer infra. We should also change the UI a little to show the observer NN.
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830075#comment-16830075 ] Surendra Singh Lilhore commented on HDFS-13522:
Thank you all for the discussion. I tried observer reads in the Router and attached an initial patch. The patch does the following:
# Stores the observer state in the State Store.
# Keeps the last transaction ID for every namespace in memory.
# Sends writes directly to the active NN.
# For reads, first does an msync() and then sends the read call to the observer NN.
# Adds two properties to support observer reads in the Router: a) *dfs.federation.router.observer.read.enable*, b) *dfs.federation.router.observer.auto-msync-period*.
# Requires no change on the client side.
This is an initial patch; suggestions to improve it are welcome.
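The read/write routing described in the patch summary could look roughly like the sketch below. It is not taken from the patch: the class `ObserverReadSketch`, its methods, and the `Target` enum are hypothetical, and it assumes that the auto-msync period means "how long a previous msync for a namespace is trusted before a new one is issued", which the comment itself does not spell out. The two constructor arguments stand in for the two properties named above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of router-side observer reads; not the patch's code. */
final class ObserverReadSketch {
  /** Which kind of namenode a call would be sent to. */
  enum Target { ACTIVE, OBSERVER }

  // Stand-in for dfs.federation.router.observer.read.enable.
  private final boolean observerReadEnabled;
  // Stand-in for dfs.federation.router.observer.auto-msync-period (assumed milliseconds).
  private final long autoMsyncPeriodMs;
  // Time of the last msync per namespace, kept in memory only.
  private final Map<String, Long> lastMsyncMs = new HashMap<>();
  private int msyncCount = 0;

  ObserverReadSketch(boolean observerReadEnabled, long autoMsyncPeriodMs) {
    this.observerReadEnabled = observerReadEnabled;
    this.autoMsyncPeriodMs = autoMsyncPeriodMs;
  }

  /** Decides where a call for the given namespace goes at time nowMs. */
  Target route(String nsId, boolean isRead, long nowMs) {
    if (!isRead || !observerReadEnabled) {
      // Writes, and all calls when the feature is off, go to the active NN.
      return Target.ACTIVE;
    }
    Long last = lastMsyncMs.get(nsId);
    if (last == null || nowMs - last >= autoMsyncPeriodMs) {
      // A real router would call msync() on the active NN here.
      msyncCount++;
      lastMsyncMs.put(nsId, nowMs);
    }
    return Target.OBSERVER;
  }

  /** Number of msync calls the sketch would have issued so far. */
  int msyncCount() {
    return msyncCount;
  }
}
```

With a period of 0 this sketch issues an msync before every read; a larger period trades freshness for fewer calls to the active NN.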
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827244#comment-16827244 ] Ayush Saxena commented on HDFS-13522:
Thanks everyone for the discussion here. I see three challenges, two of which [~elgoiri] has already mentioned:
* Collecting the Observer state
* Invoking at the Observer
* Handling the state ID

The first two seem fairly straightforward. The main challenge seems to be handling the state ID. In a non-federated scenario, a client gets the state ID for every operation at the Active, and the client uses that ID when invoking a call at the Observer, which the Observer uses to ensure a non-stale read. The problem with RBF, I feel, is that the Router is mounted to different namespaces and a client call can go to any of them depending on the mount mapping. There are two approaches that I can think of:

First, we store the state ID at the Router end and decide on observer reads at the Router, making the client independent. For each call we check the state ID corresponding to the NS and invoke the call accordingly.

Second, we create a Router state that is sent to the client, as the NN does at present, and that can be decoded to recover the state of each namespace for later use.

The first one seems quite easy, but the major problem I see is syncing the state among all Routers and the overhead this causes during an operation: we would have to read the value from the State Store every time and update it on every write (which may become a bottleneck too). With the second, the mechanism to wrap the state ID remains a challenge.
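To make the two approaches more concrete, here is a minimal sketch of a per-namespace state ID table that can also be wrapped into a single value and unwrapped again. Everything in it is an assumption for illustration: the class `NamespaceStateIds`, the `ns=id` text encoding, and the rule that namespace IDs contain no comma are hypothetical, and a real design would more likely carry this in an RPC header than in a string.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-namespace state ID table; not an existing Hadoop class. */
final class NamespaceStateIds {
  // Last state ID seen for each namespace.
  private final ConcurrentHashMap<String, Long> lastSeen = new ConcurrentHashMap<>();

  /** Records a state ID for a namespace; the stored value never goes backwards. */
  void update(String nsId, long stateId) {
    lastSeen.merge(nsId, stateId, Long::max);
  }

  /** Returns the last state ID for a namespace, or Long.MIN_VALUE if unknown. */
  long get(String nsId) {
    return lastSeen.getOrDefault(nsId, Long.MIN_VALUE);
  }

  /** Wraps all namespace state IDs into one value, sorted by namespace ID. */
  String encode() {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, Long> e : new TreeMap<>(lastSeen).entrySet()) {
      if (sb.length() > 0) {
        sb.append(',');
      }
      sb.append(e.getKey()).append('=').append(e.getValue());
    }
    return sb.toString();
  }

  /** Unwraps a value produced by encode() back into per-namespace state IDs. */
  static NamespaceStateIds decode(String encoded) {
    NamespaceStateIds ids = new NamespaceStateIds();
    if (encoded.isEmpty()) {
      return ids;
    }
    for (String pair : encoded.split(",")) {
      int eq = pair.lastIndexOf('=');
      ids.update(pair.substring(0, eq), Long.parseLong(pair.substring(eq + 1)));
    }
    return ids;
  }
}
```

In the first approach the Router would keep such a table itself and consult `get(nsId)` per call; in the second, the encoded form would travel to the client and back, so no Router-to-Router synchronization is needed.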
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464214#comment-16464214 ] Chao Sun commented on HDFS-13522:
Sounds good [~elgoiri]. I'll finish [HDFS-12976|https://issues.apache.org/jira/browse/HDFS-12976] first.
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464096#comment-16464096 ] Íñigo Goiri commented on HDFS-13522:
Thanks [~csun] for the clarification; that makes sense. As a simplistic approach we could just ignore what the client does and let the Router decide what to use. However, this may not fit the model being envisioned by HDFS-12976. I would finish HDFS-12976 first and based on that we may go fancier and use some caller context or RPC header. Anyway, while doing HDFS-12976, just keep in mind this possible requirement.
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463444#comment-16463444 ] Chao Sun commented on HDFS-13522:
Yes [~elgoiri], I agree that on the {{RouterRpcClient}} side the implementation should be as you said. The issue I'm trying to describe is that the client-side config can enable or disable observer reads, which should affect the router-side behavior accordingly. For instance, if the client uses the default {{ConfiguredFailoverProxyProvider}} for {{dfs.client.failover.proxy.provider}}, then on the router side all read/write requests should go through the active NN. However, if the client uses {{StandbyReadProxyProvider}}, then read requests on the router side should go to the observer NN first.
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462885#comment-16462885 ] Íñigo Goiri commented on HDFS-13522:
bq. One thing I'm not certain is that observer reads can be enabled/disabled on the client side (e.g., via StandbyReadProxyProvider and some config key). How this can be reflected in the RouterRpcClient which sits on the server side?

{{RouterRpcClient}} uses the raw {{ClientProtocol}}. The core of the code is in {{RouterRpcClient#invokeMethod}}, which has a check for {{StandbyException}} and allows a failover internally. This method also gets the namenodes on which to invoke the call, ordered by their current status. The change would be in {{getNamenodesForNameservice(nsId)}}: this method should recognize whether the operation is a READ or a WRITE (we can piggyback this information from the {{RouterRpcServer}}). If it's a WRITE, keep the current order; if it's a READ, allow the OBSERVER NNs to go first. I'm not sure which policies are being designed right now, but we can plug them in here. Examples I can think of: random, observer first...
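The "observer first" ordering suggested for {{getNamenodesForNameservice(nsId)}} can be sketched as follows. This is not the actual {{RouterRpcClient}} code: `ObserverFirstOrdering`, its nested `State` enum (a simplified stand-in for {{FederationNamenodeServiceState}}) and `Namenode` record are hypothetical, and the rank used for reads (OBSERVER, then ACTIVE, then STANDBY) is just one possible policy.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of READ/WRITE-aware namenode ordering. */
final class ObserverFirstOrdering {

  /** Simplified stand-in for FederationNamenodeServiceState. */
  enum State { ACTIVE, OBSERVER, STANDBY, UNAVAILABLE }

  /** Simplified stand-in for a namenode membership record. */
  static final class Namenode {
    final String id;
    final State state;

    Namenode(String id, State state) {
      this.id = id;
      this.state = state;
    }
  }

  /** Rank used for reads: lower values are tried first. */
  private static int readRank(State state) {
    switch (state) {
      case OBSERVER:
        return 0;
      case ACTIVE:
        return 1;
      case STANDBY:
        return 2;
      default:
        return 3;
    }
  }

  /**
   * Returns the namenodes in the order they should be tried.
   * Writes keep the current order; reads move OBSERVER namenodes to the front.
   */
  static List<Namenode> order(List<Namenode> current, boolean isRead) {
    List<Namenode> result = new ArrayList<>(current);
    if (isRead) {
      // List.sort is stable, so namenodes with equal rank keep their relative order.
      result.sort(Comparator.comparingInt((Namenode nn) -> readRank(nn.state)));
    }
    return result;
  }
}
```

Other policies, such as picking a random observer, would only need a different comparator or a shuffle of the OBSERVER group before the rest of the list.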
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461972#comment-16461972 ] Chao Sun commented on HDFS-13522:
Thanks [~elgoiri]. One thing I'm not certain about is that observer reads can be enabled or disabled on the client side (e.g., via {{StandbyReadProxyProvider}} and some config key). How can this be reflected in the {{RouterRpcClient}}, which sits on the server side?
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461766#comment-16461766 ] Erik Krogen commented on HDFS-13522:
Thanks for your helpful comments [~elgoiri]!
{quote} It would be handy to have a setup of the MiniDFSCluster that sets OBSERVER NNs automatically and having some predefined contract that makes sure that requests are going to the OBSERVER. Is there something along those lines already available in HDFS-12943? {quote}
Not yet, but we should definitely add that. Created HDFS-13523.

Just FYI, no one is planning to work on this ticket soon; it was created to make sure it is not forgotten.
[jira] [Commented] (HDFS-13522) Support observer node from Router-Based Federation
[ https://issues.apache.org/jira/browse/HDFS-13522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461763#comment-16461763 ] Íñigo Goiri commented on HDFS-13522:
There are two main changes to make:
* Collect the OBSERVER state in {{NamenodeHeartbeatService}} and store it in the {{MembershipStore}} through the {{FederationNamenodeServiceState}}.
* Allow the {{RouterRpcClient}} to pick OBSERVER NNs to perform the operations.

Both changes should be fairly easy to add. It would be handy to have a MiniDFSCluster setup that configures OBSERVER NNs automatically, along with a predefined contract that makes sure requests are going to the OBSERVER. Is there something along those lines already available in HDFS-12943?