[jira] [Comment Edited] (HDFS-13779) Implement performFailover logic for ObserverReadProxyProvider.
[ https://issues.apache.org/jira/browse/HDFS-13779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597855#comment-16597855 ] Konstantin Shvachko edited comment on HDFS-13779 at 8/30/18 7:53 PM: - Looks good, Erik. Few comments from me: # In ORPP you use raw types: "References to generic type AbstractNNFailoverProxyProvider should be parameterized" # Unused imports (2) in {{TestObserverReadProxyProvider}}. # For {{CachingHAProxyFactory}} I don't think we achieve much here. It still stores the the Proxy twice - one in the {{backingFactory}} and another in the cache. If want to cache proxies we should do it in the RPC engine globally for all proxies. I suggest we skip this optimization for now. # Also {{TestObserverNode}} fails for me with your patch. I can debug if it is only on my box. was (Author: shv): Looks good, Erik. Two comments from me: # In ORPP you use raw types: "References to generic type AbstractNNFailoverProxyProvider should be parameterized" # Unused imports (2) in {{TestObserverReadProxyProvider}}. # For {{CachingHAProxyFactory}} I don't think we achieve much here. It still stores the the Proxy twice - one in the {{backingFactory}} and another in the cache. If want to cache proxies we should do it in the RPC engine globally for all proxies. I suggest we skip this optimization for now. # Also {{TestObserverNode}} fails for me with your patch. I can debug if it is only on my box. > Implement performFailover logic for ObserverReadProxyProvider. > -- > > Key: HDFS-13779 > URL: https://issues.apache.org/jira/browse/HDFS-13779 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Konstantin Shvachko >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-13779-HDFS-12943.000.patch, > HDFS-13779-HDFS-12943.001.patch, HDFS-13779-HDFS-12943.WIP00.patch > > > Currently {{ObserverReadProxyProvider}} inherits {{performFailover()}} method > from {{ConfiguredFailoverProxyProvider}}, which simply increments the index > and switches over to another NameNode. The logic for ORPP should be smart > enough to choose another observer, otherwise it can switch to a SBN, where > reads are disallowed, or to an ANN, which defeats the purpose of reads from > standby. > This was discussed in HDFS-12976. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-13779) Implement performFailover logic for ObserverReadProxyProvider.
[ https://issues.apache.org/jira/browse/HDFS-13779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597827#comment-16597827 ] Erik Krogen edited comment on HDFS-13779 at 8/30/18 7:34 PM: - Just attached v001 patch addressing the review comments as above, filling out the remaining Javadoc, fixing TestObserverNode, and a few other minor fixes. Should be ready to go now pending any other review comments. was (Author: xkrogen): Just attached v001 patch addressing the review comments as above, filling out the remaining Javadoc, fixing TestObserverNode, and a few other minor fixes. > Implement performFailover logic for ObserverReadProxyProvider. > -- > > Key: HDFS-13779 > URL: https://issues.apache.org/jira/browse/HDFS-13779 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Konstantin Shvachko >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-13779-HDFS-12943.000.patch, > HDFS-13779-HDFS-12943.001.patch, HDFS-13779-HDFS-12943.WIP00.patch > > > Currently {{ObserverReadProxyProvider}} inherits {{performFailover()}} method > from {{ConfiguredFailoverProxyProvider}}, which simply increments the index > and switches over to another NameNode. The logic for ORPP should be smart > enough to choose another observer, otherwise it can switch to a SBN, where > reads are disallowed, or to an ANN, which defeats the purpose of reads from > standby. > This was discussed in HDFS-12976. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-13779) Implement performFailover logic for ObserverReadProxyProvider.
[ https://issues.apache.org/jira/browse/HDFS-13779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594309#comment-16594309 ] Erik Krogen edited comment on HDFS-13779 at 8/27/18 10:50 PM: -- Attached v000 patch based on recent changes to {{ConfiguredFailoverProxyProvider}} and {{ObserverReadProxyProvider}}. This takes the approach of scanning through each proxy and checking the state; if it is not correct, try another one. Once a correct (observer) proxy is found, stay on that one until it fails. This is essentially a copy of how {{ConfiguredFailoverProxyProvider}} is implemented. One issue the above approach opens us up to is if we are currently using an observer proxy, but it switches to be an active, we may not notice. I think we can add a period (say, every 1 or 5 minutes) refresh of the state of the proxy we are currently using to ensure it is still an observer. I would prefer to do this in a follow-on patch. I have a few outstanding TODOs about comments in the patch, but otherwise it should be ready for review. was (Author: xkrogen): Attached v000 patch based on recent changes to {{ConfiguredFailoverProxyProvider}} and {{ObserverReadProxyProvider}}. This takes the approach of scanning through each proxy and checking the state; if it is not correct, try another one. Once a correct (observer) proxy is found, stay on that one until it fails. This is essentially a copy of how {{ConfiguredFailoverProxyProvider}} is implemented. I have a few outstanding TODOs about comments in the patch, but otherwise it should be ready for review. > Implement performFailover logic for ObserverReadProxyProvider. > -- > > Key: HDFS-13779 > URL: https://issues.apache.org/jira/browse/HDFS-13779 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Konstantin Shvachko >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-13779-HDFS-12943.000.patch, > HDFS-13779-HDFS-12943.WIP00.patch > > > Currently {{ObserverReadProxyProvider}} inherits {{performFailover()}} method > from {{ConfiguredFailoverProxyProvider}}, which simply increments the index > and switches over to another NameNode. The logic for ORPP should be smart > enough to choose another observer, otherwise it can switch to a SBN, where > reads are disallowed, or to an ANN, which defeats the purpose of reads from > standby. > This was discussed in HDFS-12976. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-13779) Implement performFailover logic for ObserverReadProxyProvider.
[ https://issues.apache.org/jira/browse/HDFS-13779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581593#comment-16581593 ] Erik Krogen edited comment on HDFS-13779 at 8/15/18 8:53 PM: - Thanks for looking [~vagarychen]! For #1, the {{currentIndex}} only increases, then you take the modulo vs. the current list size to get the actual index. So in the case you've discussed with {{currentIndex = 2}} (not 3, since it is 0-indexed), before we have (caret indicating current index): {code:java} 1 2 3 4 5 ^ {code} Then we try 3 and 4, they both throw standby so we remove them, so the remaining list is ({{currentIndex = 2}} still): {code:java} 1 2 5 ^ {code} Next we try 5, is says RETRY, so we increment to get {{currentIndex = 3}}, then we try 1 ({{3 % 3 == 0}}) as expected. I went with {{currentIndex}} increasing and taking modulo, rather than always maintaining an actual index, since the size of the list may change and otherwise we would have to potentially update {{currentIndex}} every time the list size changes (to avoid AIOOBE). Modulus gives us AIOOBE avoidance for free. For #2, good point. For #3 / #4, yes, it needs cleanup, just a WIP for now. was (Author: xkrogen): Thanks for looking [~vagarychen]! For #1, the {{currentIndex}} only increases, then you take the modulo vs. the current list size to get the actual index. So in the case you've discussed with {{currentIndex = 2}} (not 3, since it is 0-indexed), before we have (caret indicating current index): {code} 1 2 3 4 5 ^ {code} Then we try 3 and 4, they both throw standby so we remove them, so the remaining list is ({{currentIndex = 2}} still): {code} 1 2 5 ^ {code} Next we try 5, is says RETRY, so we increment to get {{currentIndex = 3}}, then we try 1 ({{3 % 3 == 0}}) as expected. I went with {{currentIndex}} increasing and taking modulo, rather than always maintaining an actual index, since the size of the list may change and otherwise we would have to potentially update {{currentIndex}} every time the list size changes (to avoid AIOOBE). Modulus gives us AIOOBE avoidance for free. For #2, good point. For #3 / #4, yes, it needs cleanup, just a WIP for now. > Implement performFailover logic for ObserverReadProxyProvider. > -- > > Key: HDFS-13779 > URL: https://issues.apache.org/jira/browse/HDFS-13779 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Konstantin Shvachko >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-13779-HDFS-12943.WIP00.patch > > > Currently {{ObserverReadProxyProvider}} inherits {{performFailover()}} method > from {{ConfiguredFailoverProxyProvider}}, which simply increments the index > and switches over to another NameNode. The logic for ORPP should be smart > enough to choose another observer, otherwise it can switch to a SBN, where > reads are disallowed, or to an ANN, which defeats the purpose of reads from > standby. > This was discussed in HDFS-12976. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org