[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043677#comment-14043677
 ] 

Hudson commented on HDFS-6475:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1812 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1812/])
HDFS-6475. WebHdfs clients fail without retry because incorrect handling of 
StandbyException. Contributed by Yongjun Zhang. (atm: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605217)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/ExceptionHandler.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDelegationTokensWithHA.java


> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Fix For: 2.5.0
>
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, 
> HDFS-6475.008.patch, HDFS-6475.009.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043493#comment-14043493
 ] 

Hudson commented on HDFS-6475:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1785 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1785/])
HDFS-6475. WebHdfs clients fail without retry because incorrect handling of 
StandbyException. Contributed by Yongjun Zhang. (atm: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605217)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/ExceptionHandler.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDelegationTokensWithHA.java


> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Fix For: 2.5.0
>
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, 
> HDFS-6475.008.patch, HDFS-6475.009.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043325#comment-14043325
 ] 

Hudson commented on HDFS-6475:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #594 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/594/])
HDFS-6475. WebHdfs clients fail without retry because incorrect handling of 
StandbyException. Contributed by Yongjun Zhang. (atm: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605217)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/ExceptionHandler.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDelegationTokensWithHA.java


> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Fix For: 2.5.0
>
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, 
> HDFS-6475.008.patch, HDFS-6475.009.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-24 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043082#comment-14043082
 ] 

Yongjun Zhang commented on HDFS-6475:
-

Thanks a lot [~atm]!

Many thanks to [~daryn] and [~jingzhao] for the review and comments. I will 
follow up with the getTrueCause issue in HDFS-6588.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Fix For: 2.5.0
>
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, 
> HDFS-6475.008.patch, HDFS-6475.009.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042829#comment-14042829
 ] 

Hudson commented on HDFS-6475:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5774 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5774/])
HDFS-6475. WebHdfs clients fail without retry because incorrect handling of 
StandbyException. Contributed by Yongjun Zhang. (atm: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605217)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/ExceptionHandler.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDelegationTokensWithHA.java


> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Fix For: 2.5.0
>
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, 
> HDFS-6475.008.patch, HDFS-6475.009.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-24 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042801#comment-14042801
 ] 

Aaron T. Myers commented on HDFS-6475:
--

+1, the latest patch looks good to me. I'm going to commit this momentarily.

We can do the follow-up to clean up the code as Daryn suggests in the other 
JIRA that Yongjun has filed.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, 
> HDFS-6475.008.patch, HDFS-6475.009.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-23 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041771#comment-14041771
 ] 

Aaron T. Myers commented on HDFS-6475:
--

Hey Daryn, given Yongjun's investigation, seems like this is a bit more complex 
than originally suggested. Unless you object, I'm going to go ahead and check 
this in tomorrow so we can get this fixed and move on. 

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, 
> HDFS-6475.008.patch, HDFS-6475.009.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-23 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041171#comment-14041171
 ] 

Yongjun Zhang commented on HDFS-6475:
-

HI [~daryn],

Would like to check with you, you mentioned 
{quote}
If it turns out to be a lot more complicated, then perhaps a followup jira is ok
{quote}
Based on the information we have so far, the work involved is to remove 
getTrueCause, and replace the retrievePassword with retriableRetrievePassword, 
changing the interface spec of relevant methods because 
retriableRetrievePassword throws more exceptions,  removing getTrueCause 
method, fixing the test failures reported in HDFS-6588. 

I hope you'd agree that it's appropriate to dedicate HDFS-6588 for the above 
mentioned work, and use the lastest patch I posted for HDFS-6475 to handle the 
SecurityException that UserProvider throws.  Would you please comment again? 
Thanks.

BTW, thanks [~jingzhao] for clarifying things in HDFS-6588 (sorry I had a typo 
in last update as 6589). 


> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, 
> HDFS-6475.008.patch, HDFS-6475.009.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-23 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040984#comment-14040984
 ] 

Yongjun Zhang commented on HDFS-6475:
-

HI [~daryn],

Thanks a lot for the comments. Calling 
{{DelegationTokenSecretManager#retrievePassword}} is the sole place I have seen.

And, the following method in AbstractDelegationTokenSecretManager is where 
retrievePassword is called,
{code}
  public synchronized void verifyToken(TokenIdent identifier, byte[] password)
  throws InvalidToken {
byte[] storedPassword = retrievePassword(identifier);
if (!Arrays.equals(password, storedPassword)) {
  throw new InvalidToken("token (" + identifier
  + ") is invalid, password doesn't match");
}
  }
{code}

I wonder whether we can just replace the above retrievePassword call with 
retriableRetrievePassword here. I will give it a try. 

The failed tests are reported in HDFS-6589, related to HDFS-5322. Hi 
[~jingzhao], I put a question in HDFS-6589. I wonder if the failed tests are 
designed to cover real user scenarios?  Thanks for clarifying.

Best regards.


> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, 
> HDFS-6475.008.patch, HDFS-6475.009.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-23 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040790#comment-14040790
 ] 

Daryn Sharp commented on HDFS-6475:
---

bq. Your earlier suggestion indicated that we should use 
SecretManager#retriableRetrievePassword instead of 
SecretManager#retrievePassword, does that mean client code has to be modified?

If I understand the question: The methods are only used server-side so no 
client-side changes should be required, so no incompatibility concerns.

Did you happen to trace how/where the {{StandbyException}} is wrapped in an 
{{InvalidToken}}? It looks like 
{{DelegationTokenSecretManager#retrievePassword}} is the only place it occurs, 
but {{DelegationTokenSecretManager#retriableRetrievePassword}} does not wrap 
exceptions in {{InvalidToken}}.

Is this maybe just a test case issue?  Which testcase is failing?

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, 
> HDFS-6475.008.patch, HDFS-6475.009.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-22 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040344#comment-14040344
 ] 

Yongjun Zhang commented on HDFS-6475:
-

Filed HDFS-6589 for the flaky test 
TestDistributedFileSystem.testAllWithNoXmlDefaults.


> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, 
> HDFS-6475.008.patch, HDFS-6475.009.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040329#comment-14040329
 ] 

Hadoop QA commented on HDFS-6475:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12651893/HDFS-6475.009.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestDistributedFileSystem

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7208//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7208//console

This message is automatically generated.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, 
> HDFS-6475.008.patch, HDFS-6475.009.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This me

[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040313#comment-14040313
 ] 

Hadoop QA commented on HDFS-6475:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12651890/HDFS-6475.008.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7207//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7207//console

This message is automatically generated.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, 
> HDFS-6475.008.patch, HDFS-6475.009.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-22 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040264#comment-14040264
 ] 

Yongjun Zhang commented on HDFS-6475:
-

HI [~daryn], [~jingzhao] and [~atm],

Many thanks to you guys for earlier review and comments. I just uploaded a new 
revision (008) to address the comments and testing errors.
In summary,

- Per Daryn's suggestion, I attempted to remove getTrueCause() method from 
Server.java as a whole, ran into test failures. After spending quite some time 
to look into, I personally really removing the existing getTrueCauseMethod 
really deserves a new JIRA, I filed HDFS-6588 with details and questions. I 
hope you'd agree based on the information I provided there, but I'm certainly 
open for further discussion.

- The new patch I just uploaded for HDFS-6475 is limited to handle the case 
reported in this JIRA. It is only a few lines in ExceptionHandler.java, plus 
the testcase I added. It no longer calls getTrueCause() method defined in 
Server.java.

Thanks a lot for follow-up.


> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, 
> HDFS-6475.008.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039749#comment-14039749
 ] 

Hadoop QA commented on HDFS-6475:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12651811/HDFS-6475.007.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.ipc.TestSaslRPC
  org.apache.hadoop.hdfs.web.TestWebHdfsTokens

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7198//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7198//console

This message is automatically generated.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(

[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-20 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039718#comment-14039718
 ] 

Yongjun Zhang commented on HDFS-6475:
-

Hi, 

I uploaded a new patch per [~daryn]'s suggestion. On top of what Daryn 
suggested, and because of the exceptions I described in previous update (with 
stack information), I added the logic for InvalidToken in ExceptionHandler:
{code}
   if (e instanceof SecurityException) {
  e = toCause(e);
}
if (e instanceof InvalidToken) {
  e = toCause(e);
}
{code}
This logic is essentially what I wanted to share the original getTrueCause 
method for.

Hi Daryn, would you please help review again?
BTW,  refer to your comment:
{quote}
In saslProcess, just throw the exception instead of running it through 
getTrueCause since it's not a "InvalidToken wrapping another exception" anymore.
{quote}
I did what you suggested, but I'm still getting InvalidToken exception (see the 
stack described above). So it seems that the exception that saslProcess tries 
to handle comes from different source than what I'm running into. 

Thanks a lot.



> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-20 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039675#comment-14039675
 ] 

Yongjun Zhang commented on HDFS-6475:
-

Hello Guys,

Thank you all for the review and discussion.

As a follow-up, the first thing I did was to check that 
retriableRetrievePassword does give StandbyException.

Then I tried what [~daryn] suggested how we could fix this jira, and still run 
into the same symptom (client side),:

{code}
org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at 
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at 
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:348)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:108)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:580)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:546)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
at kclient1.kclient1$2.run(kclient1.java:80)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:356)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
at kclient1.kclient1.main(kclient1.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
 StandbyException
at 
org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:108)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
at 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
{code}

I traced the server side and found the following:

UserProvider.getValue caught 
{{org.apache.hadoop.security.token.SecretManager$InvalidToken: 
StandbyException}}
with the following stack

{code}
UserProvider.getValue(HttpContext) line: 56
UserProvider.getValue(HttpContext) line: 41
InjectableValuesProvider.getInjectableValues(HttpContext) line: 46
AbstractResourceMethodDispatchProvider$ResponseOutInvoker(AbstractResourceMethodDispatchProvider$EntityParamInInvoker).getParams(HttpContext)
 line: 153
AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(Object, 
HttpContext) line: 203
AbstractResourceMethodDispatchProvider$ResponseOutInvoker(ResourceJavaMethodDispatcher).dispatch(Object,
 HttpContext) line: 75
HttpMethodRule.accept(CharSequence, Object, UriRuleContext) line: 288
ResourceClassRule.accept(CharSequence, Object, UriRuleContext) line: 108
RightHandPathRule.accept(CharSequence, Object, UriRuleContext) line: 147
RootResourceClassesRule.accept(CharSequence, Object, UriRuleContext) line: 84
WebApplicationImpl._handleRequest(WebApplicationContext, ContainerRequest) 
line: 1469
WebApplicationImpl._handleRequest(WebApplicationContext, ContainerRequest, 
ContainerResponse) line: 1400
WebApplicationImpl.handleRequest(ContainerRequest, ContainerResponse) line: 1349
WebApplicationImpl.handleRequest(ContainerRequest, ContainerResponseWriter) 
line: 1339
ServletContainer$InternalWebComponent(WebComponent).service(URI, URI, 
HttpServletRequest, HttpServletResponse) line: 416
ServletContainer.service(URI, URI, HttpServletRequest, HttpServletResponse) 
line: 537
ServletContainer.service(HttpServletRequest, HttpServletResponse) line: 699
ServletContainer(HttpServlet).service(ServletRequest, ServletResponse) line:

[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-20 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039425#comment-14039425
 ] 

Yongjun Zhang commented on HDFS-6475:
-

HI [~daryn], 

While I'm working on a revised version, I have a question. Your earlier 
suggestion indicated that we should use 
{{SecretManager#retriableRetrievePassword}} instead of 
{{SecretManager#retrievePassword}}, does that mean client code has to be 
modified? If so, is it a compatibility issue? thanks a lot.


> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-20 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039361#comment-14039361
 ] 

Yongjun Zhang commented on HDFS-6475:
-

Many thanks [~daryn], I will work on a new revision per your suggestion.


> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039330#comment-14039330
 ] 

Daryn Sharp commented on HDFS-6475:
---

I'm +0 and defer to atm if the fix is harder than what I just suggested.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039327#comment-14039327
 ] 

Daryn Sharp commented on HDFS-6475:
---

What I'm saying is I think the patch adds too much unnecessary code.  Filing an 
improvement to delete all but a few lines of the code changed in this patch 
seems a bit odd.  I think you just need to:
# Delete {{getTrueCause}} entirely instead of moving it elsewhere
# In {{saslProcess}}, just throw the exception instead of running it through 
{{getTrueCause}} since it's not a "InvalidToken wrapping another exception" 
anymore.
# Keep your 3-line change to unwrap {{SecurityException}} in {{toResponse}}

If it turns out to be a lot more complicated, then perhaps a followup jira is 
ok, but since what should be the cleanest fix is 4 lines of change it would be 
nice to try that first.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-20 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039156#comment-14039156
 ] 

Yongjun Zhang commented on HDFS-6475:
-

HI [~daryn], 

Thanks a lot for the review and the in-depth comments. Since the 
{{getTrueCause}} method is already there, and the patch I posted reused it to 
solve the problem. What you suggested would be an improvement of the solution, 
would you please consider that we commit this patch first, and file a follow-up 
jira to improve the solution? Really appreciate your help.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039128#comment-14039128
 ] 

Daryn Sharp commented on HDFS-6475:
---

I don't think {{getTrueCause}} is necessary anymore and can be removed.  It was 
born when {{SaslDigestCallbackHandler}} used to call 
{{SecretManager#retrievePassword}} which could only throw {{InvalidToken}}.  A 
standby/retriable/etc exception had to be tunneled out via an {{InvalidToken}} 
cause.  Now there's apparently a {{SecretManager#retriableRetrievePassword}} 
that can throw those exceptions directly.

The problem comes down to an abuse of the injectable {{UserProvider}} to do 
authentication instead of via an authentication filter...  Perhaps 
{{SecurityException}} is the only exception for which the cause needs to be 
extracted.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039049#comment-14039049
 ] 

Daryn Sharp commented on HDFS-6475:
---

Taking a look now.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-19 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038040#comment-14038040
 ] 

Aaron T. Myers commented on HDFS-6475:
--

+1, the latest patch looks good to me.

I'll commit this tomorrow unless anyone else has comments in the meantime.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14035626#comment-14035626
 ] 

Hadoop QA commented on HDFS-6475:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12651015/HDFS-6475.006.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7160//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7160//console

This message is automatically generated.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch, HDFS-6475.006.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-17 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033990#comment-14033990
 ] 

Daryn Sharp commented on HDFS-6475:
---

Is a static method in InvalidToken really the right place to check and handle 
other non-InvalidToken exceptions?

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-16 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033406#comment-14033406
 ] 

Jing Zhao commented on HDFS-6475:
-

Yeah, +1

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-16 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033372#comment-14033372
 ] 

Aaron T. Myers commented on HDFS-6475:
--

Latest patch looks pretty good to me. [~jingzhao]?

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-16 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033099#comment-14033099
 ] 

Yongjun Zhang commented on HDFS-6475:
-

BTW, the testLargeFile failure is intermittent and I updated HDFS_6543 about it.


> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-16 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032627#comment-14032627
 ] 

Yongjun Zhang commented on HDFS-6475:
-

Running the two failed test locally, I'm seeing one passed and the other failed 
with today's trunk without my change. Filed HDFS-6543.

Specifically:
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer.testEncryptedBalancer2
 passed, and
org.apache.hadoop.hdfs.web.TestWebHDFS.testLargeFile failed with and without 
the fix, I filed HDFS-6543.

Thanks.




> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032178#comment-14032178
 ] 

Hadoop QA commented on HDFS-6475:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12650507/HDFS-6475.005.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.web.TestWebHDFS
  
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7126//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7126//console

This message is automatically generated.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.j

[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-15 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032096#comment-14032096
 ] 

Yongjun Zhang commented on HDFS-6475:
-

Hi [~jingzhao], 

Thanks a lot for the review and the suggestion! I agree InvalidToken is indeed 
a good place to hold this util method. I made this change and uploaded patch 
005.


> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, 
> HDFS-6475.005.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-15 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032067#comment-14032067
 ] 

Jing Zhao commented on HDFS-6475:
-

Thanks for the fix, [~yzhangal]! The patch looks good to me. My only concern is 
that if we can put the getTrueCause method into some Util class so that we do 
not need to call a method defined in ipc.Server from a class only used by 
webhdfs. I guess we can even move it into the InvalidToken class?

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031507#comment-14031507
 ] 

Hadoop QA commented on HDFS-6475:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12650438/HDFS-6475.004.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7121//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7121//console

This message is automatically generated.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-13 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031461#comment-14031461
 ] 

Yongjun Zhang commented on HDFS-6475:
-

Hi ATM,

Thanks a lot for the review! Sorry I didn't make it clear earlier. The change 
in ExceptionHandler does happen on the server side. Basically the 
ExceptionHandler class processes the original exception thrown at server side 
and pass a possibly revised exception to client. The original exception thrown 
at the server side is SecurityException (from UserProvider class) which has 
cause InvalidToken which in turn has cause StandbyException, The 
ExceptionHandler processes and and pass StandbyException to client.

I'm uploading a revised version to address both of your comments. Hopefully the 
revised comments made it more clear. Thanks in advance for reviewing the new 
revision!


> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-13 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031415#comment-14031415
 ] 

Aaron T. Myers commented on HDFS-6475:
--

The latest patch looks pretty good to me. Two very small comments:

# I think that the method comment for 
{{testDelegationTokenStandbyNNAppearFirst}} is a bit misleading. Seems like 
it's implying that the Standby NN is now throwing a different exception, when 
in fact I believe that the exception that's thrown is not changed by this 
patch, but rather that the client-side handling of the unwrapping of the 
exception is changed.
# There should be no need to restore the state of the standby/active NNs at the 
end of the test, since the cluster is always shut down at the end of every test 
in this class.

+1 from me once the above are addressed.

[~daryn] and [~jingzhao] - does the latest patch look OK to you?

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-11 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028154#comment-14028154
 ] 

Yongjun Zhang commented on HDFS-6475:
-

Filed  HDFS-6518 (TestCacheDirectives#testExceedsCapacity fails intermittently).


> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028087#comment-14028087
 ] 

Hadoop QA commented on HDFS-6475:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12649805/HDFS-6475.003.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7082//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7082//console

This message is automatically generated.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch, HDFS-6475.003.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14027605#comment-14027605
 ] 

Hadoop QA commented on HDFS-6475:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12649756/HDFS-6475.003.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7080//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7080//console

This message is automatically generated.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-11 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14027515#comment-14027515
 ] 

Yongjun Zhang commented on HDFS-6475:
-

Hi Guys,

I managed to create a testcase and uploaded patch version 003. Would you please 
help review it again? 

Hi [~daryn], I found that ExceptionHandler already did the following import 
from ipc area:
{code}
 import org.apache.hadoop.ipc.RemoteException;
{code}
which means the webhdfs servlet already has dependency on ipc package. So I 
didn't duplicate the code.
would you please let me know whether that make sense?

Thanks a lot.


> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, 
> HDFS-6475.003.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-10 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026642#comment-14026642
 ] 

Yongjun Zhang commented on HDFS-6475:
-

HI [~daryn],

Thanks a lot for the review and comments. I attempted to let UserProvider class 
throw different exception earlier, and found it inherits from classes of jersey 
package, which we won't be able  to change the interface spec. I could just 
make a duplicated copy of the IPC server code to the ExceptionHandler area so 
they are not tied to each other. I'm open for that.






> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-10 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026440#comment-14026440
 ] 

Daryn Sharp commented on HDFS-6475:
---

I had to hack around this problem in HDFS-6222...  I'm a bit uneasy about tying 
the webhdfs servlet to the IPC server.  I'd rather see the logic contained 
within webhdfs.  I think {{UserProvider}} should throw a different exception 
that its {{ExceptionHandler}} specially knows to unwrap.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026288#comment-14026288
 ] 

Hadoop QA commented on HDFS-6475:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12649538/HDFS-6475.002.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.web.TestWebHDFS

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7071//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7071//console

This message is automatically generated.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.

[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-10 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026198#comment-14026198
 ] 

Yongjun Zhang commented on HDFS-6475:
-

Hello [~jingzhao], 

Thanks for your earlier suggestion. Sorry for a bit delay to take care of it. 

I just uploaded a patch. I was able to verify it works with a real cluster 
where I saw the problem and to see the patch fixed the issue. However, I was 
not successful creating a testcase for it. Since this new patch reused the 
method getTrueCause() in Server.java, the remaining thing to be checked by a 
unit test would be the change I made in ExceptionHandler. 

The change in ExceptionHandler is, for ContainerException and 
SecurityException, call getTrueCause() to find the real exception based on the 
"cause" chain of the ContainerException/SecurityException. The original code in 
ExceptionHandler only does one level of cause-seeking for ContainerException.

Would you please help take a look at the patch to see if this patch can be 
committed without a unit testcase? or if you have any other advice?

Thanks a lot.



> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-03 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14016289#comment-14016289
 ] 

Yongjun Zhang commented on HDFS-6475:
-

Hello Jing, many thanks for the very helpful info, I'm behind with other 
schedule, will look at this a bit later.


> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-02 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14016068#comment-14016068
 ] 

Jing Zhao commented on HDFS-6475:
-

Hi [~yzhangal], thanks for the response and figuring out the issue. Maybe we 
can put the true cause logic into ExceptionHandler#toResponse?

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-02 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14016002#comment-14016002
 ] 

Yongjun Zhang commented on HDFS-6475:
-

Hi Jing,

Thanks a lot for the info!  I took a quick look, the issue is similar but seems 
there is an important difference here. That is,
In HDFS-5322 fix, the method (and all caller hierarchy)
{code}
  private void saslProcess(RpcSaslProto saslMessage)
throws WrappedRpcServerException, IOException, InterruptedException {
{code}
is allowed to throw IOException, so your HDFS-5322 solution work well. 

For HDFS-6475, the involved class UserProvider  is not allowed to throw 
IOException. In fact, 
UserProvider is only throwing unchecked exception, e.g., SecurityException 
here to include the StandbyException info in the message and cause:
{code}
/** Inject user information to http operations. */
@Provider
public class UserProvider
extends AbstractHttpContextInjectable
implements InjectableProvider {
  @Context HttpServletRequest request;
  @Context ServletContext servletcontext;
   ..
  @Override
  public UserGroupInformation getValue(final HttpContext context) {
final Configuration conf = (Configuration) servletcontext
.getAttribute(JspHelper.CURRENT_CONF);
try {
  return JspHelper.getUGI(servletcontext, request, conf,
  AuthenticationMethod.KERBEROS, false);
} catch (IOException e) {
  throw new SecurityException(
  "Failed to obtain user group information: " + e, e);
}
  }
{code}

This means we can't throw StandbyException (inherited from IOException) from 
here.
So my uploaded patch tried to parse the message string of the SecurityException 
thrown
here.

UserProvider class inherits from classes of jersey package, which we won't be 
able 
to change the interface spec.

We might be able to change the client/server interface: we detect this kind of 
case
at the interface, then instead of throwing the RemoteException that wraps 
SecurityException,
we throw RemoteException that wraps the cause of StandbyException. I'n not sure 
whether
we should go this route though. 

Would you please comment again? thanks.





> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclien

[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-02 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015634#comment-14015634
 ] 

Jing Zhao commented on HDFS-6475:
-

Looks like the similar issue on the RPC side has been fixed in HDFS-5322. Maybe 
we can follow the same pattern to fix this (see Server#getTrueCause).

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException

2014-06-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015186#comment-14015186
 ] 

Hadoop QA commented on HDFS-6475:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12647850/HDFS-6475.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7026//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7026//console

This message is automatically generated.

> WebHdfs clients fail without retry because incorrect handling of 
> StandbyException
> -
>
> Key: HDFS-6475
> URL: https://issues.apache.org/jira/browse/HDFS-6475
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, webhdfs
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Attachments: HDFS-6475.001.patch
>
>
> With WebHdfs clients connected to a HA HDFS service, the delegation token is 
> previously initialized with the active NN.
> When clients try to issue request, the NN it contacts is stored in a map 
> returned by DFSUtil.getNNServiceRpcAddresses(conf). And the client contact 
> the NN based on the order, so likely the first one it runs into is StandbyNN. 
> If the StandbyNN doesn't have the updated client crediential, it will throw a 
> s SecurityException that wraps StandbyException.
> The client is expected to retry another NN, but due to the insufficient 
> handling of SecurityException mentioned above, it failed.
> Example message:
> {code}
> {RemoteException={message=Failed to obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: 
> StandbyException, javaCl
> assName=java.lang.SecurityException, exception=SecurityException}}
> org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to 
> obtain user group information: 
> org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
> at 
> org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
> at kclient1.kclient$1.run(kclient.java:64)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
> at kclient1.kclient.main(kclient.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)