Hi all,

I recently pulled the latest source and ran a full build. The command line was:

mvn compile -Pnative
I was confronted with this:

[INFO] Requested user cmccabe has id 500, which is below the minimum allowed 1000
[INFO] FAIL: test-container-executor
[INFO] ================================================
[INFO] 1 of 1 test failed
[INFO] Please report to mapreduce-...@hadoop.apache.org
[INFO] ================================================
[INFO] make[1]: *** [check-TESTS] Error 1
[INFO] make[1]: Leaving directory `/home/cmccabe/hadoop4/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/native/container-executor'

Needless to say, this didn't do much to improve my mood. I was even less happy when I discovered that -DskipTests has no effect on native tests (they always run); see HADOOP-8480.

Unfortunately, this problem seems to be popping up more and more in our native code. It first appeared in test-task-controller (see MAPREDUCE-2376) and later in test-container-executor (HADOOP-8499). The basic problem is the hardcoded assumption that all user IDs below 1000 are system IDs. There are configuration files that can be changed to alter the minimum user ID, but unfortunately the unit tests don't read them.

So anyone developing on a platform where user IDs start at 500 is now a second-class citizen, unable to run the unit tests. That includes anyone running Red Hat, MacOS, Fedora, etc. Personally, I can change my user ID. It's a time-consuming process, because I need to re-uid all of my files, but I can do it. That luxury may not be available to everyone, though -- developers who don't have root on their machines, or who use a pre-assigned user ID to connect to NFS, come to mind.

It's true that we could hack around this with environment variables. It might even be possible to have Maven set those environment variables automatically from the current user ID. The larger question, though, is whether this UID validation scheme makes sense at all. I have a user named "nobody" whose user ID is 65534. Surely I should not be able to run map-reduce jobs as that user? Yet under the current system, I can do exactly that.

The root of the problem seems to be that there are both a default minimum and a default maximum for "automatic" user IDs. This configuration is stored in /etc/login.defs. On my system it has:

SYSTEM_UID_MIN 100
SYSTEM_UID_MAX 499
UID_MIN 500
UID_MAX 60000

So anything over 60000 (like nobody) is not considered a valid user ID for regular users. We could potentially read this file (at least on Linux) and get more sensible defaults. I am also curious whether we could simply check that the user we're trying to run the job as has a valid login shell; system users almost always have their shell set to /bin/false or /sbin/nologin.

Thoughts?

Colin
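P.S. To make the login.defs idea concrete, here is a rough, untested sketch of how the native code could pick up UID_MIN and UID_MAX on Linux. The key names and file location come from shadow-utils; strtol with base 0 is used because the file permits octal and hex values.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read UID_MIN and UID_MAX from /etc/login.defs.  Falls back to the
 * compiled-in defaults if the file or the keys are missing. */
static int read_uid_range(long *uid_min, long *uid_max)
{
  FILE *f;
  char line[256];

  *uid_min = 1000;   /* shadow-utils defaults */
  *uid_max = 60000;
  f = fopen("/etc/login.defs", "r");
  if (f == NULL)
    return -1;
  while (fgets(line, sizeof(line), f) != NULL) {
    char key[64], val[64];
    if (line[0] == '#')
      continue;                         /* skip comments */
    if (sscanf(line, "%63s %63s", key, val) != 2)
      continue;                         /* skip blank or malformed lines */
    if (strcmp(key, "UID_MIN") == 0)
      *uid_min = strtol(val, NULL, 0);  /* base 0: decimal, octal, or hex */
    else if (strcmp(key, "UID_MAX") == 0)
      *uid_max = strtol(val, NULL, 0);
  }
  fclose(f);
  return 0;
}

The container-executor could then check uid_min <= uid && uid <= uid_max instead of only enforcing a minimum, which would also reject "nobody".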
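P.P.S. And a rough sketch of the login-shell check, using getpwnam() plus the BSD getusershell() interface, which iterates over /etc/shells. Again untested, just to illustrate: /bin/false and /sbin/nologin should never appear in /etc/shells, so system accounts would fail this check.

#include <pwd.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if the user exists and their login shell is listed in
 * /etc/shells, 0 otherwise. */
static int has_valid_login_shell(const char *username)
{
  struct passwd *pw = getpwnam(username);
  char *shell;
  int ok = 0;

  if (pw == NULL || pw->pw_shell == NULL || pw->pw_shell[0] == '\0')
    return 0;              /* unknown user, or no shell at all */
  setusershell();          /* rewind the /etc/shells list */
  while ((shell = getusershell()) != NULL) {
    if (strcmp(shell, pw->pw_shell) == 0) {
      ok = 1;              /* the user's shell is a real login shell */
      break;
    }
  }
  endusershell();
  return ok;
}

One caveat: on systems without /etc/shells, getusershell() falls back to a built-in list of /bin/sh and /bin/csh, so this would need testing on the platforms we care about.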