[ https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15880143#comment-15880143 ]
ASF GitHub Bot commented on HADOOP-13817:
-----------------------------------------

Github user jojochuang commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/161#discussion_r102665699

    --- Diff: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ShellBasedUnixGroupsMapping.java ---
    @@ -133,8 +177,26 @@ protected ShellCommandExecutor createGroupIDExecutor(String userName) {
               groups = resolvePartialGroupNames(user, e.getMessage(), executor.getOutput());
             } catch (PartialGroupNameException pge) {
    -          LOG.warn("unable to return groups for user " + user, pge);
    -          return new LinkedList<>();
    +          LOG.warn("unable to return groups for user {}", user, pge);
    +          return EMPTY_GROUPS;
    +        }
    +      } catch (IOException ioe) {
    +        // If its a shell executor timeout, indicate so in the message
    +        // but treat the result as empty instead of throwing it up,
    +        // similar to how partial resolution failures are handled above
    +        if (executor.isTimedOut()) {
    +          LOG.warn(
    +              "Unable to return groups for user '{}' as shell group lookup "
    +              + "command '{}' ran longer than the configured timeout limit of "
    +              + "{} seconds.",
    +              user,
    +              Arrays.asList(executor.getExecString()),
    --- End diff --

    I am +1 pending this and the Jenkins precommit build. Somehow Jenkins is never run for your patches.

> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -----------------------------------------------------------------
>
>                 Key: HADOOP-13817
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13817
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: security
>   Affects Versions: 2.6.0
>            Reporter: Harsh J
>            Assignee: Harsh J
>            Priority: Minor
>
> ShellBasedUnixGroupsMapping runs various {{id}} commands via the
> ShellCommandExecutor modules without a timeout set (it's set to 0, which
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive
> groups backend or other reasons, it also blocks the handlers that use it on
> the NameNode (or other services that use this class). That inadvertently
> causes odd timeout troubles on the client end, where it's forced to retry
> (only to likely run into such hangs again with every attempt until at least
> one command returns).
> It would be helpful to have a finite command timeout after which we may give
> up on the command and return the result equivalent of no groups found.
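
For illustration only, the behaviour requested above (give up after a finite timeout and return the equivalent of "no groups found") could be sketched roughly as follows. This is not the code from the pull request: it uses the JDK's ProcessBuilder rather than Hadoop's ShellCommandExecutor, and the names TimedGroupLookup and lookupGroups are hypothetical. EMPTY_GROUPS only mirrors the constant name visible in the diff.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; not part of Hadoop and not the patch under review.
final class TimedGroupLookup {
  // Name mirrors the constant seen in the diff; the definition is assumed.
  private static final List<String> EMPTY_GROUPS = Collections.emptyList();

  /**
   * Runs a group lookup command (for example: id -Gn someuser) and returns
   * its whitespace-separated output, or an empty list if the command fails
   * or runs longer than timeoutSeconds.
   */
  static List<String> lookupGroups(List<String> command, long timeoutSeconds) {
    Path out = null;
    Process process = null;
    try {
      // Send output to a temp file so a chatty command cannot block on a
      // full pipe while we wait for it to exit.
      out = Files.createTempFile("groups", ".out");
      process = new ProcessBuilder(command)
          .redirectErrorStream(true)
          .redirectOutput(out.toFile())
          .start();

      // Finite wait: this is the timeout the issue asks for.
      if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
        process.destroyForcibly();
        System.err.printf(
            "Unable to return groups: command %s ran longer than %d seconds%n",
            command, timeoutSeconds);
        return EMPTY_GROUPS;
      }
      if (process.exitValue() != 0) {
        return EMPTY_GROUPS;
      }
      String text =
          new String(Files.readAllBytes(out), StandardCharsets.UTF_8).trim();
      if (text.isEmpty()) {
        return EMPTY_GROUPS;
      }
      return new ArrayList<>(Arrays.asList(text.split("\\s+")));
    } catch (IOException e) {
      // Treat launch or read failures as "no groups found".
      return EMPTY_GROUPS;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      if (process != null) {
        process.destroyForcibly();
      }
      return EMPTY_GROUPS;
    } finally {
      if (out != null) {
        try {
          Files.deleteIfExists(out);
        } catch (IOException ignored) {
          // Best-effort cleanup of the temp file.
        }
      }
    }
  }
}
```

A caller might invoke lookupGroups(Arrays.asList("id", "-Gn", user), 60), where the 60-second limit is an arbitrary example value. Returning an empty list on timeout, instead of throwing, keeps a hung groups backend from holding the calling handler indefinitely, which is the design choice the diff comment describes.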