[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969498#comment-16969498 ]

Íñigo Goiri commented on HDFS-14284:

There must be something with the protobuf that needs to be added.
Anyway, the message works for sure and that one is passed.
In addition, it is easier to make the router id parameter mandatory and then just send it over the message.

In my mind there are two benefits of using this {{RIOE(String msg)}}:
* It makes clear that it's a special issue coming from the Router.
* It makes it mandatory to specify the Router id so the exception will have that in the exception.

Given this I would keep it this way.
I would also try to see if we can tweak the protobuf to be able to carry the routerId separately.
If not possible, the current approach in [^HDFS-14284.008.patch] is fine with me.

> RBF: Log Router identifier when reporting exceptions
>
> Key: HDFS-14284
> URL: https://issues.apache.org/jira/browse/HDFS-14284
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Íñigo Goiri
> Assignee: hemanthboyina
> Priority: Major
> Attachments: HDFS-14284.001.patch, HDFS-14284.002.patch, HDFS-14284.003.patch, HDFS-14284.004.patch, HDFS-14284.005.patch, HDFS-14284.006.patch, HDFS-14284.007.patch, HDFS-14284.008.patch
>
> The typical setup is to use multiple Routers through ConfiguredFailoverProxyProvider.
> In a regular HA Namenode setup, it is easy to know which NN was used.
> However, in RBF, any Router can be the one reporting the exception and it is hard to know which was the one.
> We should have a way to identify which Router/Namenode was the one triggering the exception.
> This would also apply with Observer Namenodes.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
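The trade-off in the comment above (make the router id mandatory on the Router side, then carry it inside the message) might look roughly like the sketch below. This is not the class from HDFS-14284.008.patch: the name {{RouterIOExceptionSketch}}, the {{" from "}} separator, and both constructors are assumptions made for illustration.

```java
import java.io.IOException;
import java.util.Objects;

/** Hypothetical sketch; NOT the RouterIOException from the actual patch. */
class RouterIOExceptionSketch extends IOException {
  private final String routerId;

  /** Router side: the id is mandatory and is folded into the message. */
  RouterIOExceptionSketch(String msg, String routerId) {
    super(msg + " from " + Objects.requireNonNull(routerId, "routerId"));
    this.routerId = routerId;
  }

  /**
   * Client side: rebuilt from the message alone. The id is only available
   * as part of the text, so the separate field stays null.
   */
  public RouterIOExceptionSketch(String fullMessage) {
    super(fullMessage);
    this.routerId = null;
  }

  /** Returns the id if this instance was created on the Router, else null. */
  String getRouterId() {
    return routerId;
  }
}
```

With this shape, a caller on the Router cannot forget the id, and a client that only receives the message string still sees the id in {{getMessage()}}, even though {{getRouterId()}} is empty there.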
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969200#comment-16969200 ]

Ayush Saxena commented on HDFS-14284:

Checked further and observed that, while unwrapping, only the exception message is passed to RouterIOException, not the other parameters, so the routerId can't be forwarded.
Getting the router ID from the exception at the client therefore doesn't seem possible unless we do some string logic to extract the routerId and set it in the {{RIOE(String msg)}} constructor. Otherwise there is no benefit in having a separate RIOE.

[~elgoiri], thoughts?
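The "string logic" mentioned above could be as small as the following sketch. It assumes the message always ends with {{" from <routerId>"}}; that separator, the class name {{RouterIdParser}}, and the method name are all hypothetical and not taken from any patch on this issue.

```java
/** Hypothetical helper; assumes messages end with " from <routerId>". */
final class RouterIdParser {
  /** Separator assumed to precede the router id at the end of the message. */
  static final String MARKER = " from ";

  private RouterIdParser() {
  }

  /**
   * Returns the text after the last MARKER, or null if the message is null,
   * has no MARKER, or has nothing after it.
   */
  static String extractRouterId(String message) {
    if (message == null) {
      return null;
    }
    int idx = message.lastIndexOf(MARKER);
    if (idx < 0) {
      return null;
    }
    String id = message.substring(idx + MARKER.length()).trim();
    return id.isEmpty() ? null : id;
  }
}
```

This kind of parsing is fragile: it breaks if the message wording changes or if a router id itself contains the separator, which is part of why carrying the id as a separate field would be preferable if the wire format allowed it.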
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964271#comment-16964271 ]

Ayush Saxena commented on HDFS-14284:

Thanks [~hemanthboyina] for the patch.
As I said before, please check the RouterIOException by extracting it from the RemoteException. You can get the RemoteException like this:
{code:java}
RemoteException re = LambdaTestUtils.intercept(RemoteException.class,
    "Cannot locate a registered namenode for ns0 from "
        + routerContext.getRouter().getRouterId(),
    () -> routerProtocol.addBlock(testPath, clientName, newBlock, null, 1,
        null, null));
RouterIOException rioe = (RouterIOException) re
    .unwrapRemoteException(RouterIOException.class);
rioe.getMessage();
// Have assertion checks for this and similarly for routerID
{code}
You can do something like this. To unwrap manually, the exception needs a constructor that takes just a {{String}} as its parameter; otherwise it will throw {{NoSuchMethodException}}. You can create one, set the message and the router ID in it, and then try.

I had a quick rough try and it worked. Give it a try; if you face issues, let me know and I will try to help write it.
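The single-{{String}} constructor requirement comes from rebuilding the exception reflectively from its message. The sketch below reproduces that mechanism with plain JDK reflection rather than Hadoop's {{RemoteException}}; all class and method names in it are hypothetical, and it only approximates what the unwrapping step is understood to do.

```java
import java.io.IOException;
import java.lang.reflect.Constructor;

/** Has the (String) constructor that reflective re-creation looks up. */
class WithStringCtor extends IOException {
  public WithStringCtor(String msg) {
    super(msg);
  }
}

/** Only offers (String, String), so the (String) lookup fails. */
class WithoutStringCtor extends IOException {
  public WithoutStringCtor(String msg, String routerId) {
    super(msg + " from " + routerId);
  }
}

/** Hypothetical stand-in for the reflective part of unwrapping. */
final class UnwrapSketch {
  private UnwrapSketch() {
  }

  /** True if cls declares a public constructor taking a single String. */
  static boolean hasStringConstructor(Class<? extends IOException> cls) {
    try {
      cls.getConstructor(String.class);
      return true;
    } catch (NoSuchMethodException e) {
      return false;
    }
  }

  /** Re-creates the exception from its message only; extra fields are lost. */
  static IOException instantiate(Class<? extends IOException> cls, String msg) {
    try {
      Constructor<? extends IOException> ctor = cls.getConstructor(String.class);
      return ctor.newInstance(msg);
    } catch (ReflectiveOperationException e) {
      throw new IllegalArgumentException("Cannot rebuild " + cls.getName(), e);
    }
  }
}
```

Because only the message string is handed to the constructor, anything stored in additional fields on the server side, such as a router id, does not survive the round trip unless it is also part of the message.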
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963211#comment-16963211 ]

hemanthboyina commented on HDFS-14284:

Updated the patch with the comments addressed; please review, [~ayushtkn].
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962301#comment-16962301 ]

Hadoop QA commented on HDFS-14284:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 2m 18s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 21m 19s | trunk passed |
| +1 | compile | 0m 30s | trunk passed |
| +1 | checkstyle | 0m 20s | trunk passed |
| +1 | mvnsite | 0m 40s | trunk passed |
| +1 | shadedclient | 13m 42s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 56s | trunk passed |
| +1 | javadoc | 0m 44s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 28s | the patch passed |
| +1 | compile | 0m 25s | the patch passed |
| +1 | javac | 0m 25s | the patch passed |
| +1 | checkstyle | 0m 15s | the patch passed |
| +1 | mvnsite | 0m 27s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 14m 11s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 15s | the patch passed |
| +1 | javadoc | 0m 49s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 7m 36s | hadoop-hdfs-rbf in the patch passed. |
| +1 | asflicense | 0m 27s | The patch does not generate ASF License warnings. |
| | | 67m 11s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14284 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12984292/HDFS-14284.008.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux ddf4517e923b 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / bd498ba |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28204/testReport/ |
| Max. process+thread count | 2760 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28204/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961893#comment-16961893 ]

hemanthboyina commented on HDFS-14284:

{quote}Derive the RouterIOException from the RemoteException, and check that {{getMessage}} and {{getRouterID}} are giving correct stuff.{quote}
I have checked this with an assert-contains, since the exception arrives still wrapped as a {{RemoteException}}.

All other comments were handled; I will post the patch soon. Thanks.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960790#comment-16960790 ]

Ayush Saxena commented on HDFS-14284:

Thanks [~hemanthboyina] for the patch.
A couple of comments:
{code:java}
-throw new NoNamenodesAvailableException(nsId, ioe);
+throw new NoNamenodesAvailableException(
+    nsId + " from router " + router.getRouterId(), ioe);
{code}
No need to append the text to the nsId argument here; it doesn't make sense to put message text into a variable that is meant to store the nsId. Add a parameter for the router ID and do the message appending inside the exception itself, so the actual code flow stays clean.
{code:java}
-      throw new IOException("No namenodes to invoke " + method.getName() +
-          " with params " + Arrays.deepToString(params) + " from "
-          + router.getRouterId());
+      throw new RouterIOException("No namenodes to invoke " + method.getName()
+          + " with params " + Arrays.deepToString(params),
+          router.getRouterId());
{code}
As I see it, the earlier text was "from ROUTERID", not "from router ROUTERID":
{code:java}
.append(" from router ")
{code}
So we had better keep the text the same and not add "router" here. Somebody parsing the string would fail if we tweak the text.

For the test:
* Derive the RouterIOException from the RemoteException, and check that {{getMessage}} and {{getRouterID}} are giving correct stuff.
* No need to remove the NoNamenodeException test. We are changing that exception too, so better keep its test. If both exceptions go through the same code flow, put them in the same test and give the test a more generic name; otherwise separate them. If you can't keep them in one test, try to reuse the code by refactoring it into a method.
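The first review point (pass the router ID as its own parameter and build the text inside the exception) could be sketched as follows. The class below is a hypothetical stand-in, not Hadoop's {{NoNamenodesAvailableException}}; the message wording is borrowed from the test string quoted elsewhere in this thread, and the constructor signature is an assumption.

```java
import java.io.IOException;

/** Hypothetical stand-in for an exception that formats its own message. */
class NoNamenodesAvailableSketch extends IOException {
  private final String nsId;
  private final String routerId;

  NoNamenodesAvailableSketch(String nsId, String routerId, IOException cause) {
    // The call site passes plain values; the wording lives in one place only.
    super(buildMessage(nsId, routerId), cause);
    this.nsId = nsId;
    this.routerId = routerId;
  }

  /** Single source of truth for the message text that clients may parse. */
  static String buildMessage(String nsId, String routerId) {
    return "No namenodes available under nameservice " + nsId
        + " from router " + routerId;
  }

  String getNsId() {
    return nsId;
  }

  String getRouterId() {
    return routerId;
  }
}
```

Keeping the format in one method also makes it easier to pin the exact wording in a unit test, which matters if other code parses these messages.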
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959178#comment-16959178 ]

Hadoop QA commented on HDFS-14284:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 59s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 19m 28s | trunk passed |
| +1 | compile | 0m 30s | trunk passed |
| +1 | checkstyle | 0m 20s | trunk passed |
| +1 | mvnsite | 0m 33s | trunk passed |
| +1 | shadedclient | 13m 42s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 56s | trunk passed |
| +1 | javadoc | 0m 47s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 29s | the patch passed |
| +1 | compile | 0m 24s | the patch passed |
| +1 | javac | 0m 24s | the patch passed |
| +1 | checkstyle | 0m 15s | the patch passed |
| +1 | mvnsite | 0m 27s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 44s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 0s | the patch passed |
| +1 | javadoc | 0m 44s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 7m 6s | hadoop-hdfs-rbf in the patch failed. |
| +1 | asflicense | 0m 27s | The patch does not generate ASF License warnings. |
| | | 62m 43s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.federation.router.TestRouterFaultTolerant |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14284 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12983954/HDFS-14284.007.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux cff374255ed5 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2eba2624 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28171/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28171/testReport/ |
| Max. process+thread count | 2765 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28171/console |
| Powered by | Apache Yetus 0.8.0
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956337#comment-16956337 ]

hemanthboyina commented on HDFS-14284:

Thanks for the review, [~ayushtkn].
{quote}I don't think that is something correct behavior either{quote}
I will add a null check for the router ID and append it to the message only if it is not null.
{quote}even the message is lost{quote}
{code:java}
this.msg = msg;
{code}
I will set the message this way and add a test case that checks RouterIOException.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955288#comment-16955288 ]

Ayush Saxena commented on HDFS-14284:

I checked: it doesn't throw an NPE, but it just prints {{null from router null}}. Since we aren't setting the {{msg}} variable in the constructor, even the message is lost.
I don't think that is correct behavior. Either we shouldn't have this constructor, or we should handle this in some better way. [~elgoiri], any opinion?

The {{testRouterIOException}} test doesn't test {{RouterIOException}}; it tests {{NoNamenodeException}}. Better change the name.
The {{RouterIOException}} isn't tested anywhere.
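The {{null from router null}} output described above follows from how {{StringBuilder.append(String)}} treats a null argument: it appends the four characters "null" instead of throwing. The sketch below reproduces that behavior and shows one possible fix; both class names are hypothetical and neither is the code from the patch.

```java
import java.io.IOException;

/** Reproduces the problem: fields are never set by the (String) constructor. */
class BrokenRouterIOException extends IOException {
  private String msg;       // stays null: the constructor only calls super(msg)
  private String routerId;  // stays null as well

  BrokenRouterIOException(String msg) {
    super(msg);
  }

  @Override
  public String getMessage() {
    // append(null) appends the text "null"; no NullPointerException is thrown.
    return new StringBuilder().append(msg).append(" from router ")
        .append(this.routerId).toString();
  }
}

/** One possible fix: reuse super.getMessage() and skip a missing router id. */
class FixedRouterIOException extends IOException {
  private final String routerId;

  FixedRouterIOException(String msg) {
    this(msg, null);
  }

  FixedRouterIOException(String msg, String routerId) {
    super(msg);
    this.routerId = routerId;
  }

  @Override
  public String getMessage() {
    String base = super.getMessage();
    return routerId == null ? base : base + " from router " + routerId;
  }
}
```

Relying on {{super.getMessage()}} avoids keeping a second copy of the message in a field that a constructor can forget to assign.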
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955287#comment-16955287 ]

Ayush Saxena commented on HDFS-14284:

I have a doubt. If the RIOE gets created using this:
{code:java}
+
+  public RouterIOException(String msg) {
+    super(msg);
+  }
{code}
I think this can throw an NPE:
{code:java}
+  @Override
+  public String getMessage() {
+    return new StringBuilder().append(msg).append(" from router ")
+        .append(this.routerId).toString();
+  }
{code}
I may be wrong; did someone check it?
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955270#comment-16955270 ]

Íñigo Goiri commented on HDFS-14284:

+1 on [^HDFS-14284.006.patch].
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955130#comment-16955130 ]

hemanthboyina commented on HDFS-14284:

Thanks for confirming, [~elgoiri]. I think all review comments are covered; can you push the patch forward?
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16953043#comment-16953043 ]

Íñigo Goiri commented on HDFS-14284:

Thanks [~ayushtkn], yes, that's it: RemoteException only extracts a few exceptions, not all.
I guess this is fine then.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16952572#comment-16952572 ]

Ayush Saxena commented on HDFS-14284:

I didn't follow the whole discussion, but if you are both asking whether the {{RemoteException}} here:
{code:java}
LambdaTestUtils.intercept(RemoteException.class,
    "No namenodes available under nameservice " + ns0 + " from router "
        + routerContext.getRouter().getRouterId(),
    () -> routerProtocol.mkdirs(dirPath, permission, false));
{code}
should be {{NoNamenodesAvailableException}}, it will not be, neither before nor after the patch. The unwrapping is done manually in DFSClient, per API, for each client operation, and only the exceptions listed there are unwrapped; all others reach the caller still wrapped as {{RemoteException}}:
{code:java}
try {
  return namenode.mkdirs(src, absPermission, createParent);
} catch (RemoteException re) {
  // Only the exception types listed here are unwrapped;
  // anything else stays a RemoteException.
  throw re.unwrapRemoteException(AccessControlException.class,
      InvalidPathException.class,
      FileAlreadyExistsException.class,
      FileNotFoundException.class,
      ParentNotDirectoryException.class,
      SafeModeException.class,
      NSQuotaExceededException.class,
      DSQuotaExceededException.class,
      QuotaByStorageTypeExceededException.class,
      UnresolvedPathException.class,
      SnapshotAccessControlException.class);
}
{code}
I am not entirely sure, but IIRC you can only unwrap exceptions that are instances of IOException.

> RBF: Log Router identifier when reporting exceptions
>
> Key: HDFS-14284
> URL: https://issues.apache.org/jira/browse/HDFS-14284
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Íñigo Goiri
> Assignee: hemanthboyina
> Priority: Major
> Attachments: HDFS-14284.001.patch, HDFS-14284.002.patch, HDFS-14284.003.patch, HDFS-14284.004.patch, HDFS-14284.005.patch, HDFS-14284.006.patch
>
> The typical setup is to use multiple Routers through ConfiguredFailoverProxyProvider.
> In a regular HA Namenode setup, it is easy to know which NN was used.
> However, in RBF, any Router can be the one reporting the exception and it is hard to know which was the one.
> We should have a way to identify which Router/Namenode was the one triggering the exception.
> This would also apply with Observer Namenodes.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
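The unwrapping behavior described in the comment above can be modeled without Hadoop on the classpath. This is a simplified stand-in, not the Hadoop implementation: {{SketchRemoteException}} and {{unwrap}} are hypothetical names, and the sketch assumes, as the comment recalls, that only IOException subclasses with a String constructor can be rebuilt on the client side.

```java
import java.io.IOException;
import java.lang.reflect.Constructor;

/** Hypothetical, simplified model of a remote exception wrapper. */
class SketchRemoteException extends IOException {
  private static final long serialVersionUID = 1L;

  /** Fully qualified name of the exception class thrown on the server. */
  private final String className;

  SketchRemoteException(String className, String msg) {
    super(msg);
    this.className = className;
  }

  /**
   * Rebuild the server-side exception if its class is one of lookupTypes.
   * Anything not listed, or not an IOException, is returned still wrapped.
   */
  IOException unwrap(Class<?>... lookupTypes) {
    for (Class<?> type : lookupTypes) {
      if (!type.getName().equals(className)) {
        continue;
      }
      try {
        // asSubclass throws ClassCastException for non-IOException types.
        Constructor<? extends IOException> ctor =
            type.asSubclass(IOException.class).getConstructor(String.class);
        return ctor.newInstance(getMessage());
      } catch (ReflectiveOperationException | ClassCastException e) {
        return this;
      }
    }
    return this;
  }
}
```

With this model, a call such as {{unwrap(FileNotFoundException.class)}} yields a FileNotFoundException only when the wrapped class name matches; an exception type that is absent from the list comes back as the wrapper itself, which is why the test in the discussion has to intercept {{RemoteException}} and check the message.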
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16952559#comment-16952559 ]

hemanthboyina commented on HDFS-14284:

I don't think it is because of RouterIOException. I tried even without the fix, and the exception was still RemoteException (using the FileSystem interface for mkdirs).
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16952289#comment-16952289 ]

Íñigo Goiri commented on HDFS-14284:

Interesting, there are some exceptions that get unwrapped. I'm guessing this is because of the RouterIOException.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16952173#comment-16952173 ]

hemanthboyina commented on HDFS-14284:

Even if we use the FileSystem interface, we get the exception as:
{code:java}
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RetriableException): org.apache.hadoop.hdfs.server.federation.router.NoNamenodesAvailableException: No namenodes available under nameservice ns0 from router **
{code}
The class of the exception is RemoteException.
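Since the client only sees a RemoteException, the Router identifier has to be read from the message text. A minimal sketch, assuming the message ends with " from router <id>" as in the trace above; {{RouterIdParser}} and {{routerIdOf}} are hypothetical names and not part of Hadoop or of the patch.

```java
import java.util.Optional;

/** Hypothetical helper that pulls the Router id out of an exception message. */
class RouterIdParser {
  /** Marker text assumed to precede the Router id at the end of the message. */
  private static final String MARKER = " from router ";

  static Optional<String> routerIdOf(String message) {
    if (message == null) {
      return Optional.empty();
    }
    // Use the last occurrence so marker-like text earlier in the message is ignored.
    int idx = message.lastIndexOf(MARKER);
    if (idx < 0) {
      return Optional.empty();
    }
    String id = message.substring(idx + MARKER.length()).trim();
    return id.isEmpty() ? Optional.empty() : Optional.of(id);
  }
}
```

Matching on message text is fragile, which is the reason the thread also considers carrying the Router id as a separate field.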
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16951226#comment-16951226 ]

Íñigo Goiri commented on HDFS-14284:

I may be wrong, but I think that if we use the FileSystem interface instead of the ClientProtocol, we can use mkdirs and get the actual exception instead of RemoteException which requires checking the message.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16950241#comment-16950241 ]

Hadoop QA commented on HDFS-14284:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 41s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 26m 13s | trunk passed |
| +1 | compile | 0m 42s | trunk passed |
| +1 | checkstyle | 0m 32s | trunk passed |
| +1 | mvnsite | 0m 40s | trunk passed |
| +1 | shadedclient | 16m 8s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 11s | trunk passed |
| +1 | javadoc | 0m 59s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 35s | the patch passed |
| +1 | compile | 0m 30s | the patch passed |
| +1 | javac | 0m 30s | the patch passed |
| +1 | checkstyle | 0m 19s | the patch passed |
| +1 | mvnsite | 0m 39s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 16m 0s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 21s | the patch passed |
| +1 | javadoc | 0m 54s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 8m 50s | hadoop-hdfs-rbf in the patch failed. |
| +1 | asflicense | 0m 41s | The patch does not generate ASF License warnings. |
| | | 79m 43s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.fs.contract.router.web.TestRouterWebHDFSContractSeek |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.3 Server=19.03.3 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14284 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12982867/HDFS-14284.006.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 7d311b64f128 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 5f4641a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28082/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28082/testReport/ |
| Max. process+thread count | 2708 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28082/console |
| Powered by | Apache Yetus 0.8.0
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946937#comment-16946937 ]

Ayush Saxena commented on HDFS-14284:

[~brahmareddy]'s comment about {{NoNamenodesAvailableException}} needs to be handled. Apart from that, the patch seems fair enough.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946617#comment-16946617 ]

Hadoop QA commented on HDFS-14284:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 15s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 21m 14s | trunk passed |
| +1 | compile | 0m 32s | trunk passed |
| +1 | checkstyle | 0m 22s | trunk passed |
| +1 | mvnsite | 0m 36s | trunk passed |
| +1 | shadedclient | 13m 20s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 5s | trunk passed |
| +1 | javadoc | 0m 51s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 32s | the patch passed |
| +1 | compile | 0m 27s | the patch passed |
| +1 | javac | 0m 27s | the patch passed |
| +1 | checkstyle | 0m 19s | the patch passed |
| +1 | mvnsite | 0m 31s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 27s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 15s | the patch passed |
| +1 | javadoc | 0m 49s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 8m 20s | hadoop-hdfs-rbf in the patch passed. |
| +1 | asflicense | 0m 28s | The patch does not generate ASF License warnings. |
| | | 66m 24s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.2 Server=19.03.2 Image:yetus/hadoop:1dde3efb91e |
| JIRA Issue | HDFS-14284 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12982467/HDFS-14284.005.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux fbf9bf3beb65 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 4fdf016 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28033/testReport/ |
| Max. process+thread count | 2544 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28033/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946108#comment-16946108 ]

Íñigo Goiri commented on HDFS-14284:

For the getMessage(), let's just do:
{code}
@Override
public String getMessage() {
  return new StringBuilder()
      .append(msg)
      .append(" from router ")
      .append(this.routerId)
      .toString();
}
{code}
Then we can just use LambdaTestUtils#intercept() for this.
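A self-contained sketch of the suggested message format, for readers who want to try it outside the Hadoop tree. The class name {{RouterIdException}}, its constructor, and the sample Router id are hypothetical and are not taken from the patch; only the shape of getMessage() follows the comment above.

```java
import java.io.IOException;

/** Hypothetical exception that always carries the id of the reporting Router. */
class RouterIdException extends IOException {
  private static final long serialVersionUID = 1L;

  private final String msg;
  private final String routerId;

  /** Both arguments are required, so the Router id cannot be left out. */
  RouterIdException(String msg, String routerId) {
    super(msg);
    this.msg = msg;
    this.routerId = routerId;
  }

  /** Appends the Router id so it survives being flattened into a message. */
  @Override
  public String getMessage() {
    return msg + " from router " + routerId;
  }
}

class RouterIdExceptionDemo {
  public static void main(String[] args) {
    IOException e = new RouterIdException(
        "No namenodes available under nameservice ns0", "router-1");
    // prints "No namenodes available under nameservice ns0 from router router-1"
    System.out.println(e.getMessage());
  }
}
```

Because the id ends up inside the message, a test can assert on a message substring even when the client receives the exception wrapped in another type.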
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16945586#comment-16945586 ]

hemanthboyina commented on HDFS-14284:

The test case failures seem unrelated; please review the patch, [~elgoiri].
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16945065#comment-16945065 ]

Hadoop QA commented on HDFS-14284:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 47s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 23m 15s | trunk passed |
| +1 | compile | 0m 47s | trunk passed |
| +1 | checkstyle | 0m 30s | trunk passed |
| +1 | mvnsite | 0m 48s | trunk passed |
| +1 | shadedclient | 15m 56s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 11s | trunk passed |
| +1 | javadoc | 0m 55s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 33s | the patch passed |
| +1 | compile | 0m 28s | the patch passed |
| +1 | javac | 0m 28s | the patch passed |
| +1 | checkstyle | 0m 17s | the patch passed |
| +1 | mvnsite | 0m 31s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 27s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 3s | the patch passed |
| +1 | javadoc | 0m 44s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 7m 24s | hadoop-hdfs-rbf in the patch failed. |
| +1 | asflicense | 0m 28s | The patch does not generate ASF License warnings. |
| | | 70m 45s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.federation.router.TestRouterHeartbeatService |
| | hadoop.hdfs.server.federation.store.driver.TestStateStoreZK |
| | hadoop.hdfs.server.federation.router.TestRouterMountTableCacheRefresh |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.2 Server=19.03.2 Image:yetus/hadoop:1dde3efb91e |
| JIRA Issue | HDFS-14284 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12982276/HDFS-14284.004.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux ea70d30b9d69 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b8086bf |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28014/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28014/testReport/ |
| Max. process+thread count | 2438 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U:
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16945046#comment-16945046 ]

hemanthboyina commented on HDFS-14284:

Updated the patch with the checkstyle issues fixed; the test failures are unrelated.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16945043#comment-16945043 ]

Hadoop QA commented on HDFS-14284:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 54s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 24m 8s | trunk passed |
| +1 | compile | 0m 37s | trunk passed |
| +1 | checkstyle | 0m 25s | trunk passed |
| +1 | mvnsite | 0m 50s | trunk passed |
| +1 | shadedclient | 15m 56s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 24s | trunk passed |
| +1 | javadoc | 0m 57s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 39s | the patch passed |
| +1 | compile | 0m 34s | the patch passed |
| +1 | javac | 0m 34s | the patch passed |
| -0 | checkstyle | 0m 20s | hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | mvnsite | 0m 36s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 15m 36s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 18s | the patch passed |
| +1 | javadoc | 1m 1s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 8m 42s | hadoop-hdfs-rbf in the patch failed. |
| +1 | asflicense | 0m 41s | The patch does not generate ASF License warnings. |
| | | 77m 0s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.federation.router.TestRouterHeartbeatService |
| | hadoop.hdfs.server.federation.store.driver.TestStateStoreZK |
| | hadoop.hdfs.server.federation.router.TestRouterMountTableCacheRefresh |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.2 Server=19.03.2 Image:yetus/hadoop:1dde3efb91e |
| JIRA Issue | HDFS-14284 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12982274/HDFS-14284.003.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 83cdf4851223 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / fb1ecff |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28013/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt |
| unit |
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16943111#comment-16943111 ]

Brahma Reddy Battula commented on HDFS-14284:

OK. I just want to confirm that when the Router can't access the State Store, we can shut down the Router.
{quote}This shouldn't break compatibility as it would be a new field in the new remote exception.
{quote}
I was talking about "new NoNamenodesAvailableException", where we are going to add one more field (and this exception was introduced b. I was concerned about this.

[~ayushtkn] and [~inigoiri], if you are both OK with it, then I am OK.

[~hemanthboyina], you can update the patch as [~crh] suggested.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942989#comment-16942989 ]
Íñigo Goiri commented on HDFS-14284:
In this particular case, we had issues with our secure ZK client; it lost access to the trust store and that left the Router in an intermediate state. This particular instance shouldn't be an issue anymore, as we now catch that exception and detect it properly. However, it showed that knowing which particular Router was having issues was pretty important. This shouldn't break compatibility as it would be a new field in the new remote exception. In any case, we should constrain this to RBF.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942477#comment-16942477 ]
Brahma Reddy Battula commented on HDFS-14284:
{quote}We will try to cover as many cases as possible but not easy to get all of them down. {quote}
[~elgoiri], can you highlight why the Router did not have access to the State Store? Do you think the Router should allow requests in such a case?
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942437#comment-16942437 ]
Ayush Saxena commented on HDFS-14284:
[~brahmareddy] I agree we should shut down the Router, considering the {{fail-fast}} mechanism. But identifying all the scenarios (I mean all) where we should do this doesn't seem to be a trivial task. Even after this many years, we don't have all the cases covered where the Namenode should terminate if it isn't able to serve requests (I remember fixing one such missed case for the NN a couple of months back). From personal experience, with the fix in front of your eyes these issues may appear simple, but finding the root cause is quite difficult in such cases. At least in the case of the Namenode we know where to check, since there is only one active NN, which is unlikely for an RBF deployment. With 40+ Routers, as Inigo mentioned, finding the culprit Router would be quite a time-consuming affair. IMO propagating back the routerID is well worth it.
bq. and won't it be incompatible if there is some automation.
Are you talking about scripts parsing the message, which might fail due to the addition of the routerID? If so, we can cover that with a config and keep it false by default; an Admin who has a big deployment and wants this can enable it. If something else bothers you, let us know; we should ensure we don't violate the Compat guidelines in any way.
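The config-gated behaviour suggested in the comment above could look roughly like the sketch below. It is only an illustration under stated assumptions: the key name `example.rbf.exception.include-router-id` is hypothetical (it is not an existing Hadoop property), and `java.util.Properties` stands in for Hadoop's configuration object so that the snippet compiles on its own.

```java
import java.util.Properties;

/** Sketch only: gates the router id suffix behind a flag that defaults to false. */
class RouterIdToggleSketch {
  /** Hypothetical key, invented for this example; not a real Hadoop setting. */
  static final String INCLUDE_ROUTER_ID_KEY = "example.rbf.exception.include-router-id";

  /** Returns the message unchanged unless the flag is explicitly enabled. */
  static String format(Properties conf, String msg, String routerId) {
    boolean include =
        Boolean.parseBoolean(conf.getProperty(INCLUDE_ROUTER_ID_KEY, "false"));
    return include ? msg + " from " + routerId : msg;
  }
}
```

With the flag left at its default, scripts that parse the existing message text would see no change; only deployments that opt in get the extra suffix.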
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942391#comment-16942391 ]
Íñigo Goiri commented on HDFS-14284:
Correct, that's one case. We will try to cover as many cases as possible but not easy to get all of them down. I think if we set this new RouterIOException as a default which always reports the router id, it will be much easier to debug future issues. This is orthogonal to other efforts.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942360#comment-16942360 ]
Brahma Reddy Battula commented on HDFS-14284:
{quote}Router that had issues (no access to the store) and it was very hard to find it. {quote}
I think we should shut down the Router in that case. One example, which is not handled currently, is shown below.
{code:java}
if (!zkManager.getCurator().isStarted()) {
  throw new StateStoreUnavailableException(
      "Cannot get data, " + "ZKCurator is STOPPED:" + e.getMessage());
}
{code}
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942313#comment-16942313 ]
Íñigo Goiri commented on HDFS-14284:
[~brahmareddy], the main reason for me to open this JIRA was that we had one Router that had issues (no access to the store) and it was very hard to find it. We have 40+ Routers in our deployment and I had to go one by one. Currently, the client assumes one knows the Namenode (or Router) that one is interacting with. This is not true for RBF, where any Router can be the one replying.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942294#comment-16942294 ]
Brahma Reddy Battula commented on HDFS-14284:
Agree with [~crh]. Maybe we can handle the rest in a separate Jira and handle only "NoNamenodesAvailableException" here? I have one question: what are we going to achieve by adding the routerID to the following? Will the client retry against another Router? (And won't it be incompatible if there is some automation?)
Exception in thread "main" org.apache.hadoop.ipc.RemoteException(java.io.IOException): No namenode available under nameservice ns0
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938979#comment-16938979 ]
CR Hota commented on HDFS-14284:
[~hemanthboyina] [~inigoiri] [~ayushtkn] Thanks for the discussion so far. The overall approach looks fine. Can we keep RIOEx out of hadoop-common and have StandbyException not extend RIOEx? It's best not to change hadoop-common directly for this feature. RIOEx can be added in the hdfs-rbf project, and for standby we can construct the error message containing the router id before creating the StandbyException. In any case, the client side already has failover logic for StandbyException, so its log will automatically output the router id that was used when the exception was created on the server.
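The suggestion above, building the message with the router id first and then creating the standby exception unchanged, might be sketched as follows. This is an assumption-laden illustration: `StandbyExceptionStandIn` is a local stand-in for `org.apache.hadoop.ipc.StandbyException` so the snippet compiles without Hadoop on the classpath, and the helper `withRouterId` and its message format are hypothetical.

```java
import java.io.IOException;

/** Local stand-in for org.apache.hadoop.ipc.StandbyException (assumption). */
class StandbyExceptionStandIn extends IOException {
  private static final long serialVersionUID = 1L;

  StandbyExceptionStandIn(String msg) {
    super(msg);
  }
}

/** Sketch only: the exception type stays as is, only its message changes. */
class StandbyMessageSketch {
  /** Hypothetical helper that appends the router id to the message text. */
  static String withRouterId(String msg, String routerId) {
    return msg + " (router: " + routerId + ")";
  }

  /** Creates the standby exception with the router id already in the message. */
  static StandbyExceptionStandIn standby(String msg, String routerId) {
    return new StandbyExceptionStandIn(withRouterId(msg, routerId));
  }
}
```

Because the class hierarchy is untouched, client-side failover that keys on the exception type would behave as before, and the router id only shows up in the logged message.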
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938155#comment-16938155 ]
Iñigo Goiri commented on HDFS-14284:
As I mentioned before, I think routerId should be stored as a field in RouterIOException, with a getRouterId method to read it.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938093#comment-16938093 ]
Hadoop QA commented on HDFS-14284:
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 48s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 7s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 55s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 16s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 52s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 27s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 52s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 23m 17s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 42s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}136m 36s{color} | {color:black} {color} |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:efed4450bf1 |
| JIRA Issue | HDFS-14284 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12981361/HDFS-14284.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 8b500c5752ab 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / bdaaa3b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| whitespace |
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937848#comment-16937848 ]
Ayush Saxena commented on HDFS-14284:
IMO changing from IOE to RIOE won't be a problem. Anyone catching IOE would still be able to catch it, since RIOE extends IOE. As said, RIOE is also the better option because the client can easily call getRouterId, and it differentiates a general IOE from the ones coming from a Router.
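The compatibility argument above relies on ordinary Java subclassing, which a small sketch can make concrete. Everything here is illustrative: `SketchRouterIOException` is a hypothetical stand-in for the RIOE under discussion, and the router id "router-7" is made up.

```java
import java.io.IOException;

/** Hypothetical stand-in for the RouterIOException (RIOE) discussed here. */
class SketchRouterIOException extends IOException {
  private static final long serialVersionUID = 1L;
  private final String routerId;

  SketchRouterIOException(String msg, String routerId) {
    super(msg);
    this.routerId = routerId;
  }

  String getRouterId() {
    return routerId;
  }
}

class CatchCompatibilitySketch {
  /** Simulates a call that fails on a Router; the id is invented. */
  static void failingCall() throws IOException {
    throw new SketchRouterIOException("Cannot locate a registered namenode", "router-7");
  }

  /** A caller that only declares IOException still catches the subclass. */
  static String describeFailure() {
    try {
      failingCall();
      return "no failure";
    } catch (IOException e) {
      // Callers that know about the subtype can read the id directly.
      if (e instanceof SketchRouterIOException) {
        return "router=" + ((SketchRouterIOException) e).getRouterId();
      }
      return "generic IOException";
    }
  }
}
```

Existing `catch (IOException e)` blocks keep compiling and keep catching the new type; only code that wants the router id needs the extra `instanceof` check.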
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937813#comment-16937813 ]
hemanthboyina commented on HDFS-14284:
I have some questions, [~elgoiri].
{code:java}
throw new IOException("Cannot locate a registered namenode for " + bpId + " from " + router.getRouterId());
{code}
* In some places the router id is already appended to an IOException message. Since RouterIOException extends IOException, should we change those IOExceptions to RIOE?
* There is no static method for RIOE to get the router id, so we need to pass the router id wherever an RIOE is created.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937157#comment-16937157 ]
Ayush Saxena commented on HDFS-14284:
Thanks [~elgoiri], makes sense. We can extend the same RIOE for all the exceptions where we want to get the router id.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937146#comment-16937146 ]
Íñigo Goiri commented on HDFS-14284:
{quote} do we have any advantage of having a RouterIOException, rather than directly calling super(msg + "from" + routerId); in the new constructors? {quote}
It is easier to handle from the client side. If we know it is a subtype of RouterIOException, we can just call getRouterId() instead of having to parse messages. I think it is a better design overall.
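The design described above can be sketched as follows. This is a minimal illustration and not the code from any attached patch: only the names RouterIOException, routerId and getRouterId() come from this thread, while the constructor signature and the message format are assumptions.

```java
import java.io.IOException;

/**
 * Hypothetical sketch of the exception discussed in this thread.
 * The constructor signature and message format are assumptions.
 */
class RouterIOException extends IOException {
  private static final long serialVersionUID = 1L;

  /** Identifier of the Router that raised the exception. */
  private final String routerId;

  RouterIOException(String msg, String routerId) {
    // Keep the id in the message too, so plain log output still shows it.
    super(msg + " from router " + routerId);
    this.routerId = routerId;
  }

  /** Lets callers read the id without parsing the message text. */
  String getRouterId() {
    return routerId;
  }
}
```

Keeping the id both as a field and in the message is a deliberate choice in this sketch: the getter serves code that holds the typed exception, and the message covers readers who only see the logged text.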
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937111#comment-16937111 ]
Ayush Saxena commented on HDFS-14284:
Thanks [~hemanthboyina] for the patch. I think the trace we started with, as reported, was:
{code:java}
Exception in thread "main" org.apache.hadoop.ipc.RemoteException(java.io.IOException): No namenode available under nameservice ns0
{code}
This seems to be a "no namenode available" scenario, that is, a {{NoNamenodeException}}. We can add the router id to it in the same way. There aren't many occurrences of this; some already have the router id appended as a string, so we just need to migrate them to use the new constructor. [~elgoiri] do we have any advantage of having a RouterIOException, rather than directly calling {{super(msg + "from" + routerId);}} in the new constructors?
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937105#comment-16937105 ]
hemanthboyina commented on HDFS-14284:
_is there any static method we can get?_
There isn't any static method as such.
_we should move other exception to use this_
I think we have missed the router id only here. Should we modify the existing exceptions that already carry the router id (IOException, StandbyException) to use RouterIOException?
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937085#comment-16937085 ]
Íñigo Goiri commented on HDFS-14284:
Let's make routerId a field and add a method {{getRouterId()}} to RouterIOException. In addition, the message needs some spaces. We would need to have a unit test here too. It would be nice to be able to get the router Id automatically; is there any static method we can get? As follow ups, we should move other exception to use this. [~ayushtkn] do you mind chiming in?
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937025#comment-16937025 ]
hemanthboyina commented on HDFS-14284:
Attached the patch; please review, [~elgoiri].
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936177#comment-16936177 ]

Hadoop QA commented on HDFS-14284:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 44s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 19m 49s | trunk passed |
| +1 | compile | 0m 33s | trunk passed |
| +1 | checkstyle | 0m 22s | trunk passed |
| +1 | mvnsite | 0m 35s | trunk passed |
| +1 | shadedclient | 13m 20s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 12s | trunk passed |
| +1 | javadoc | 0m 52s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 32s | the patch passed |
| +1 | compile | 0m 28s | the patch passed |
| +1 | javac | 0m 28s | the patch passed |
| +1 | checkstyle | 0m 18s | the patch passed |
| +1 | mvnsite | 0m 33s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 27s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 16s | the patch passed |
| +1 | javadoc | 0m 48s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 23m 39s | hadoop-hdfs-rbf in the patch failed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
| | | 79m 43s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.federation.router.TestRouterFaultTolerant |

|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:efed4450bf1 |
| JIRA Issue | HDFS-14284 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12981096/HDFS-14284.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 31b12bca9121 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c30e495 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/27942/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/27942/testReport/ |
| Max. process+thread count | 1578 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936081#comment-16936081 ]

Íñigo Goiri commented on HDFS-14284:

[~hemanthboyina], it would be something like that, but that one is too specific. I would like to have a RouterIOException (or something similar) that takes the routerId as a parameter (or even gets it automatically). ConnectionNullException would then be a subclass of it and do:

{code}
throw new ConnectionNullException("Cannot get a connection to " + rpcAddress, router.getRouterId());
{code}

(Or just the {{new}}, inferring the Router id by itself using a static or similar.)
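To make the proposal above concrete, here is a minimal, self-contained sketch of such an exception hierarchy. It is not taken from any attached patch: the constructor shape, the {{getRouterId()}} accessor and the " from Router " suffix are assumptions made for illustration, and the {{ConnectionNullException}} shown here is a simplified stand-in rather than the class in hadoop-hdfs-rbf.

```java
import java.io.IOException;
import java.util.Objects;

/**
 * Hypothetical base class: an IOException that always carries the id of the
 * Router that raised it. Names and message format are illustrative only.
 */
class RouterIOException extends IOException {
  private static final long serialVersionUID = 1L;

  private final String routerId;

  RouterIOException(String msg, String routerId) {
    // Append the id to the message by default so it is visible in any
    // stack trace, and reject a missing id so it is effectively mandatory.
    super(msg + " from Router " + Objects.requireNonNull(routerId, "routerId"));
    this.routerId = routerId;
  }

  /** The id is also kept as a field so callers need not parse the message. */
  String getRouterId() {
    return routerId;
  }
}

/** Simplified stand-in for a more specific Router-side failure. */
class ConnectionNullException extends RouterIOException {
  private static final long serialVersionUID = 1L;

  ConnectionNullException(String msg, String routerId) {
    super(msg, routerId);
  }
}

final class RouterIOExceptionDemo {
  public static void main(String[] args) {
    RouterIOException e = new ConnectionNullException(
        "Cannot get a connection to nn1.example.com:8020", "router-1");
    System.out.println(e.getMessage());
    System.out.println(e.getRouterId());
  }
}
```

Keeping the id both in the message and in a field means existing log scraping keeps working while new code can read the id directly.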
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936055#comment-16936055 ]

hemanthboyina commented on HDFS-14284:

[~elgoiri], would appending the router id to the message solve the issue?

{code:java}
throw new ConnectionNullException("Cannot get a connection to "
    + rpcAddress + " from " + router.getRouterId());
{code}
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16888225#comment-16888225 ]

Íñigo Goiri commented on HDFS-14284:

OK, we can add it when processing the exception. I wonder if we should create a special IOException that has the router id as a field instead of only in the message itself. By default, we can then append it to the message.
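As a rough illustration of appending the Router id while an exception is being processed, the helper below tags an {{IOException}} message in one central place. It is a sketch under stated assumptions: {{RouterExceptionTagger}} and {{tagWithRouterId}} are hypothetical names, and this is not the actual {{processException()}} code in RouterRpcClient.

```java
import java.io.IOException;

/** Hypothetical helper; not part of Hadoop. */
final class RouterExceptionTagger {
  private RouterExceptionTagger() {
  }

  /**
   * Returns an IOException whose message ends with " from <routerId>".
   * An exception that is already tagged is returned unchanged.
   */
  static IOException tagWithRouterId(IOException ioe, String routerId) {
    String suffix = " from " + routerId;
    String msg = ioe.getMessage();
    if (msg != null && msg.endsWith(suffix)) {
      return ioe; // already tagged, e.g. by the code that threw it
    }
    String base = (msg == null) ? ioe.getClass().getSimpleName() : msg;
    // Keep the original as the cause and reuse its stack trace so the
    // tagged exception still points at the original failure site.
    IOException tagged = new IOException(base + suffix, ioe);
    tagged.setStackTrace(ioe.getStackTrace());
    return tagged;
  }

  public static void main(String[] args) {
    IOException original = new IOException("No namenodes to invoke getFileInfo");
    System.out.println(tagWithRouterId(original, "router-3").getMessage());
  }
}
```

One limitation of this simple form is that the tagged exception is always a plain {{IOException}}, so a subclass such as a standby or access-control exception would lose its type; a real implementation would need to preserve it.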
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16888071#comment-16888071 ]

Ayush Saxena commented on HDFS-14284:

I guess we are processing all exceptions using {{processException()}} in RouterRpcClient, so we can append the RouterId there.

A couple of other exceptions already append the router id, for example:

{code:java}
if (namenodes == null || namenodes.isEmpty()) {
  throw new IOException("No namenodes to invoke " + method.getName() +
      " with params " + Arrays.deepToString(params) + " from "
      + router.getRouterId());
}
{code}

{code:java}
// All namenodes were unavailable or in standby
String msg = "No namenode available to invoke " + method.getName() + " " +
    Arrays.deepToString(params) + " in " + namenodes + " from " +
    router.getRouterId();
{code}

{code:java}
throw new StandbyException(
    "Router " + router.getRouterId() + " is overloaded: " + msg);
{code}

{code:java}
if (namenodes == null || namenodes.isEmpty()) {
  throw new IOException("Cannot locate a registered namenode for " +
      nsId + " from " + router.getRouterId());
}
{code}

Almost everywhere the Router itself is at fault, the exception already carries the id, barring the one you hit. Maybe we can add it there too, and that may solve your problem?

Moreover, if the exception comes from the Namenode, I don't think you need the routerId, because that would be independent of the Router. Still, if we want it, that should also be easy to handle in {{processException}}.

And since you have 16 Routers, returning the routerId should be far easier than going through audit logs, with less overhead too.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880692#comment-16880692 ]

Íñigo Goiri commented on HDFS-14284:

Anything that can tell me, from the client, which Router triggered the exception is fine. Both the audit log and reporting it in the exception would do. Technically, even the client knows the Router, so the client could add the address of the Namenode (the Router in this case) that replied.
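The client-side option mentioned above could look roughly like the following: the caller already knows which address it sent the request to, so it can attach that address when a call fails. This is only a sketch; {{SourceTaggingCaller}}, its {{call}} method and the "(reply from ...)" format are hypothetical and do not correspond to an existing Hadoop client or proxy-provider API.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical client-side wrapper; not part of Hadoop. */
final class SourceTaggingCaller {
  private SourceTaggingCaller() {
  }

  /**
   * Runs the given call and, if it fails with an IOException, rethrows it
   * with the address of the server (Router or Namenode) that replied.
   */
  static <T> T call(String serverAddress, Callable<T> rpc) throws IOException {
    try {
      return rpc.call();
    } catch (IOException e) {
      // Keep the original exception as the cause so nothing is lost.
      throw new IOException(
          e.getMessage() + " (reply from " + serverAddress + ")", e);
    } catch (RuntimeException e) {
      throw e; // programming errors pass through untouched
    } catch (Exception e) {
      throw new IOException("Unexpected failure calling " + serverAddress, e);
    }
  }

  public static void main(String[] args) {
    try {
      call("router2.example.com:8888", () -> {
        throw new IOException("No namenode available under nameservice BN2");
      });
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Compared with tagging on the Router, this needs no server-side change, but every client has to opt in and the address is only as precise as what the client resolved.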
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880596#comment-16880596 ]

Ayush Saxena commented on HDFS-14284:

IIUC, don't we unwrap every exception in the Router before throwing it to the client? We have an unwrap mechanism that changes the ns path to the mount-level path for the user, and RE to IOE. Doesn't this just require adding the Router identifier to the exception when we process it in that same method? Or is something else intended?
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876389#comment-16876389 ]

Íñigo Goiri commented on HDFS-14284:

Sure, let's see how HDFS-13270 goes.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875483#comment-16875483 ]

hemanthboyina commented on HDFS-14284:

[~elgoiri], if we implement HDFS-13270 (Router Audit Logger), that may solve your issue, as we would then know from which Router (IP) the exception occurred.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16771453#comment-16771453 ]

Chao Sun commented on HDFS-14284:

Agree this could be useful. Perhaps we can make this optional by adding another constructor and field to {{RemoteException}}?

cc [~crh] and [~fengnanli].
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769806#comment-16769806 ]

Íñigo Goiri commented on HDFS-14284:

We are getting the remote exception as this:

{code}
Exception in thread "main" org.apache.hadoop.ipc.RemoteException(java.io.IOException): No namenode available under nameservice BN2
    at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.shouldRetry(RouterRpcClient.java:309)
    at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invoke(RouterRpcClient.java:464)
    at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invoke(RouterRpcClient.java:471)
    at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeMethod(RouterRpcClient.java:367)
    at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.invokeSequential(RouterRpcClient.java:734)
    at org.apache.hadoop.hdfs.server.federation.router.RouterClientProtocol.getFileInfo(RouterClientProtocol.java:699)
    at org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getFileInfo(RouterRpcServer.java:731)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:881)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:871)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:817)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2621)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error
    at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
    at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2040)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1449)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3076)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1127)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:881)
{code}

Maybe we can extend RemoteException to include the source (e.g., IP) of the exception.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769816#comment-16769816 ]

Erik Krogen commented on HDFS-14284:

Makes sense to me. IIRC that information may not be available where the {{RemoteException}} is created, but if it is, +1.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769714#comment-16769714 ]

Erik Krogen commented on HDFS-14284:

Interesting problem. Can you maybe post a sample stack trace now, and what you hope for it to look like in the future? In particular, I am curious if you're thinking specifically of a {{RemoteException}}, or something else.

The {{RemoteException}} may be a good place to store such information. It wouldn't catch IO exceptions, but I think these typically log their destination address anyhow.

Agreed that something like this can be useful for Observer Nodes as well.
[jira] [Commented] (HDFS-14284) RBF: Log Router identifier when reporting exceptions
[ https://issues.apache.org/jira/browse/HDFS-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769697#comment-16769697 ]

Íñigo Goiri commented on HDFS-14284:

One of the Routers is triggering some exceptions and it is very hard to know which of the Routers (currently 16) to check for more detailed logs. The easy solution is to add the Router identifier to the exception that we throw from the Router.

However, this might be a common scenario in general, and we may also want to change the ConfiguredFailoverProxyProvider to identify the source of the exception. I think this might be similar for the Observer Namenodes if we have several of them.

[~xkrogen], [~csun], any thoughts on doing this generically? In any case, I think we should add this on the Router side too.