[ https://issues.apache.org/jira/browse/HDFS-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eugene Koifman updated HDFS-8663: --------------------------------- Assignee: (was: Eugene Koifman) > sys cpu usage high on namenode server > ------------------------------------- > > Key: HDFS-8663 > URL: https://issues.apache.org/jira/browse/HDFS-8663 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs, namenode > Affects Versions: 2.3.0 > Environment: hadoop 2.3.0 centos5.8 > Reporter: tangjunjie > > sys cpu usage high on namenode server lead to run job very slow. > I use ps -elf see many zombie process. > I check hdfs log I found many exceptions like: > org.apache.hadoop.util.Shell$ExitCodeException: id: sem_410: No such user > at org.apache.hadoop.util.Shell.runCommand(Shell.java:505) > at org.apache.hadoop.util.Shell.run(Shell.java:418) > at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650) > at org.apache.hadoop.util.Shell.execCommand(Shell.java:739) > at org.apache.hadoop.util.Shell.execCommand(Shell.java:722) > at > org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:83) > at > org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:52) > at org.apache.hadoop.security.Groups.getGroups(Groups.java:139) > at > org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1409) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.<init>(FSPermissionChecker.java:81) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getPermissionChecker(FSNamesystem.java:3310) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3491) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:764) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:764) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980) > Then I create all user such as sem_410 appear in exception.Then the sys cpu > usage on namenode down. > BTW, my hadoop 2.3.0 enaable hadoop acl. -- This message was sent by Atlassian JIRA (v6.3.4#6332)