[ https://issues.apache.org/jira/browse/YARN-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16137515#comment-16137515 ]
Eric Badger commented on YARN-6623: ----------------------------------- Hey [~vvasudev], I’ve finished my comments on the production code portion of the patch. This doesn’t include test code. I think I’ll wait for your response to this before I review the tests. I tried to be quite thorough and so I have quite a few comments. I hope that they are all correct interpretations, but it’s quite a big patch so I might have misunderstood some stuff and/or missed some stuff. ---- Should we fix the no newline at the end of file warnings? The apply tool complains about them. ---- {noformat:title=DockerCommand:getCommandArguments()} public Map<String, List<String>> getCommandArguments() { return Collections.unmodifiableMap(commandArguments); } {noformat} This will return the command as well as the arguments. Unless we are considering the {{/usr/docker}} to be the actual command and {{inspect}} to be one of the arguments. If that’s what we’re expecting to happen here, then the name is a little bit misleading. This might be more of a problem with how {{commandArguments}} is named than how this function is named. ---- {noformat:title=container-executor.c:construct_docker_command()} +char *construct_docker_command(const char *command_file) { + int ret = 0; + char *buffer = (char *) malloc(EXECUTOR_PATH_MAX * sizeof(char)); {noformat} This should use {{_SC_ARG_MAX}} as we did in YARN-6988 {{size_t command_size = MIN(sysconf(_SC_ARG_MAX), 128*1024);}} Also, why not use {{calloc()}} instead of {{malloc()}} and then {{memset()}}? ---- {noformat:title=container-executor.c:run_docker()} + docker_command = construct_docker_command(command_file); + docker_binary = get_docker_binary(&CFG); {noformat} I don’t see these getting freed. Am I missing this invocation somewhere? ---- {noformat:title=container-executor.c:run_docker()} + memset(docker_command_with_binary, 0, EXECUTOR_PATH_MAX); {noformat} Is this necessary? We allocate the memory with {{calloc()}} which already clears all of the memory upon allocation. ---- {noformat:title=container-executor.h} // get the executable's filename char* get_executable(char *argv0); {noformat} Do we need this declaration (in container-executor.h) since we have moved that declaration into get_executable.h? We should also add get_executable.h in the appropriate places. Looks like main.c and test-container-executor.c both call {{get_executable}}. ---- {noformat:title=main.c:assert_valid_setup()} - fprintf(ERRORFILE,"realpath of executable: %s\n", - errno != 0 ? strerror(errno) : "unknown"); - flush_and_close_log_files(); - exit(-1); + fprintf(ERRORFILE, "realpath of executable: %s\n", + errno != 0 ? strerror(errno) : "unknown"); + exit(INVALID_CONFIG_FILE); {noformat} Why don’t we want to flush the log files anymore? ---- {noformat:title=util.c:alloc_and_clear_memory()} +void* alloc_and_clear_memory(size_t num, size_t size) { + void *ret = calloc(num, size); + if (ret == NULL) { + exit(OUT_OF_MEMORY); + } + return ret; +} {noformat} Should we inline this? Also, if we’re going to use this function, then all calloc calls should be replaced with this (like the ones I mentioned above) ---- {noformat:title=util.h} // DOCKER_CONTAINER_NAME_INVALID = 41, {noformat} Should add {{(NOT USED)}} to follow convention ---- {noformat:title=docker-util.c:read_and_verify_command_file()} if (command == NULL || (strcmp(command, docker_command) != 0)) { ret = INCORRECT_COMMAND; } {noformat} Is {{command}} guaranteed to be null-terminated? It comes from the configuration file, which is a Java creation and I don’t think Java null-terminates. This comment is probably relevant for quite a few other places that are doing string operations. We need to be very safe about this or else we risk possibly overrunning into random regions of the heap. A safe alternative would be to use the “n” version of all the string operations. This patch uses a mixed bag of the regular versions and their accompanying “n” versions. I don’t quite understand the reasoning behind the usage of each. If we guarantee that the string is null terminated (and always null terminated) then we don’t need the “n” versions. But even if we guarantee that the input string is null terminated, it may accidentally have the null character chopped off by an off by 1 error in a strdup or something like that. So my preference here would be to use the “n” versions of all of the string functions. Thoughts? ---- {noformat:title=docker-util.c:read_and_verify_command_file()} + if (current_len + string_len < bufflen - 1) { + strncpy(buff + current_len, string, bufflen - current_len); + buff[current_len + string_len] = '\0'; + return 0; + } {noformat} Here it is copying over {{bufflen - current_len}} bytes, when we really only have {{string_len}} bytes to copy. It’s not going to overflow {{buff}}, but we might copy some extra garbage past the end of {{string}} if it’s not null terminated. ---- {noformat:title=docker-util.c: add_docker_config_param()} +static int add_docker_config_param(const struct configuration *command_config, char *out, const size_t outlen) { + int ret = 0; + size_t tmp_buffer_size = 4096; + char *tmp_buffer; + char *config = get_configuration_value("docker-config", DOCKER_COMMAND_FILE_SECTION, command_config); + if (config != NULL) { + tmp_buffer = (char *) alloc_and_clear_memory(tmp_buffer_size, sizeof(char)); + quote_and_append_arg(&tmp_buffer, &tmp_buffer_size, "--config=", config); + ret = add_to_buffer(out, outlen, tmp_buffer); + free(tmp_buffer); + free(config); + } + return ret; {noformat} Is this function necessary? Can’t it be done in a call to {{add_param_to_command}}? ---- {noformat:title=docker-util.c: get_docker_load_command()} + image_name = get_configuration_value("image", DOCKER_COMMAND_FILE_SECTION, &command_config); + if (image_name == NULL) { + return INVALID_DOCKER_IMAGE_NAME; + } + + memset(out, 0, outlen); {noformat} Maybe being paranoid is a good thing, but {{get_docker_load_command}} and any of the other commands from {{get_docker_command}} are always passed in a buffer that has already been nulled out, so the memset here is redundant. I imagine it will be similar in the other {{get_*_comand}} functions. Maybe we should put a comment saying that it is the caller’s responsibility to null out the buffer that they pass in and get rid of these memsets? Either that or keep the memset here and get rid of the resets higher up in {{get_docker_command}}? ---- {noformat:title=docker-util.c: get_docker_load_command()} + image_name = get_configuration_value("image", DOCKER_COMMAND_FILE_SECTION, &command_config); + if (image_name == NULL) { + return INVALID_DOCKER_IMAGE_NAME; + } {noformat} We should validate the image name here as we do in {{get_docker_pull_command}} below. ---- {noformat:title=docker-util.c: get_docker_rm_command()} + container_name = get_configuration_value("name", DOCKER_COMMAND_FILE_SECTION, &command_config); + if (container_name == NULL) { + return INVALID_DOCKER_CONTAINER_NAME; + } {noformat} We validate the container_id name in docker inspect, but not in docker rm, stop, or run ---- {noformat:title=docker-util.c} +static int add_param_to_command(const struct configuration *command_config, const char *key, const char *param, + const int with_argument, char *out, const size_t outlen) { + size_t tmp_buffer_size = 1024; {noformat} What’s the reasoning behind the buffer size here? It’s hardcoded to 1024 in a bunch of places. This doesn’t follow the system path limit, the {{EXECUTOR_PATH_MAX}} or the system argument length. ---- {noformat:title=docker-util.c:set_network()} + } else if (value == NULL) { + ret = 0; + goto free_and_exit; {noformat} Since we didn’t find any network in the configuration, we didn’t set a network. That seems like it should print out an error and/or fail. Right now it will just silently do nothing. ---- {noformat:title=docker-util.c:set_network()} + if (permitted_values != NULL) { + free(permitted_values[0]); + free(permitted_values); + } {noformat} We need to loop through {{permitted_values}} and free everything or else we will leak. ---- {noformat:title=docker-util.c:check_mount_permitted()} + // directory check + permitted_mount_len = strlen(permitted_mounts[i]); + if (permitted_mount_len > 0 + && permitted_mounts[i][permitted_mount_len - 1] == '/') { + if (strncmp(requested, permitted_mounts[i], permitted_mount_len) == 0) { + return 1; {noformat} This looks potentially dangerous. Though we are normalizing the paths in all of the current invocations, someone in the future may add in another invocation that doesn’t normalize. I think at a bare minimum we need a comment saying that this function requires all symlinks to be resolved in the path that is passed to this function. Because if we allow symlinks to be followed then we will get different behavior because of how docker treats symlinks. A possible fix for this could be to combine {{get_src_mount()}} and {{get_real_src_mount()}} into a single function that parses the mount string to get the source and then resolves it. ---- {noformat:title=docker-util.c:normalize_mounts()} + char *alloc_ptr = mounts[0]; + for (i = 0; mounts[i] != NULL; ++i) { + tmp = normalize_mount(mounts[i]); + if (tmp == NULL) { + return -1; + } + mounts[i] = tmp; + } + free(alloc_ptr); {noformat} Why are we freeing {{mounts\[0\]}} or anything at all here? ---- {noformat:title=docker-util.c:get_real_src_mount()} + char *tmp = strdup(src_mount); + if (tmp == NULL) { + fprintf(ERRORFILE, "Could not allocate memory\n"); + exit(OUT_OF_MEMORY); + } {noformat} We should be consistent on our error handling. Some calls to strdup we exit with OOM and some we let it return error. This is inconsistent between {{get_src_mount()}} and {{get_real_src_mount()}}. I’m not sure I have a preference on whether to exit here or not, but we should make sure all places are consistent. On another note, is this function necessary given that we have {{normalize_mount()}}? ---- {noformat:title=docker-util.c:add_mounts()} + if(permitted_ro_mounts != NULL) { + free(permitted_ro_mounts[0]); + free(permitted_ro_mounts); + } + if (permitted_rw_mounts != NULL) { + for (i = 0; permitted_rw_mounts[i] != NULL; ++i) { + free(permitted_rw_mounts[i]); + } + free(permitted_rw_mounts); + } + if (values != NULL) { + free(values[0]); + free(values); + } {noformat} You’ve done this above too, so I might be missing something. But I don’t see why we only want to free the 0th element of {{permitted_ro_mounts}}. Is there something that guarantees that {{permitted_ro_mounts}} will only have the first element of the pointer array populated? The pointer arrays are getting created by {{get_configuration_values_delimiter()}}, which ends up doing a strdup inside of a while loop via {{split_delimiter()}}. ---- {noformat:title=docker-util.c:add_mounts()} + if (real_src_mount != NULL) { + free(real_src_mount); + } + if (src_mount != NULL) { + free(src_mount); + } + if(normalized_src_mount != NULL) { + free(normalized_src_mount); + } + if(container_executor_cfg_path != NULL) { + free((char *) container_executor_cfg_path); + } {noformat} We don’t need to check for null. {{free}} will do nothing if a null pointer is passed, so it’s safe and makes the check unnecessary. ---- {noformat:title=docker-util.c:add_mounts()} + free(real_src_mount); + free(src_mount); + free(normalized_src_mount); + src_mount = NULL; + real_src_mount = NULL; + normalized_src_mount = NULL; {noformat} Is the clear necessary here? The next time they are used in the loop they will be written instead of read. And the convention of the code in this file is to not clear after free (though maybe it should be?) ---- {noformat:title=docker-util.c:add_mounts()} + real_src_mount = get_real_src_mount(src_mount); + if (real_src_mount == NULL) { + ret = INVALID_DOCKER_MOUNT; + goto free_and_exit; + } + normalized_src_mount = normalize_mount(real_src_mount); + if(normalized_src_mount == NULL) { + ret = INVALID_DOCKER_MOUNT; + goto free_and_exit; + } {noformat} Do we need to call both {{get_real_src_mount()}} and {{normalize_mount()}}? As I mentioned in an earlier comment, {{get_real_src_mount()}} seems to be a subset of the functionality of {{normalize_mount()}}. They both call {{realpath()}}, but {{normalize_mount()}} does some extra directory stuff. ---- {noformat:title=docker-util.c:add_mounts()} + // determine if the user can modify the container-executor.cfg file + tmp_path_buffer[0] = normalized_src_mount; + // just re-use the function, flip the args to check if the container-executor path is in the requested + // mount point + ret = check_mount_permitted(tmp_path_buffer, container_executor_cfg_path); + if (ret == 1) { + fprintf(ERRORFILE, "Attempting to mount a parent directory '%s' of container-executor.cfg as read-write\n", + values[i]); + ret = INVALID_DOCKER_RW_MOUNT; + goto free_and_exit; + } {noformat} I understand the idea here, but isn’t this at the discretion of the admin? What if clusters have container-executor.cfg in a conf directory with a bunch of other configs? This will cause them to have to make major changes in the structure of their configs and deployments. Especially if they don’t have security enabled, they might not care about these potential vulnerabilities. Plus, unix permissions have the container-executor.cfg file as only readable by root. This would be a change that the administrator couldn’t override to relax the constraint. So while I like the idea of trying to protect container-executor.cfg, I think this should be something that can be relaxed by the admins at their discretion. ---- {noformat:title=docker-util.c:set_privileged()} + if (value != NULL) { + free(value); + } + if (privileged_container_enabled != NULL) { + free(privileged_container_enabled); + } {noformat} Don’t need to do the null check before {{free}} ---- {noformat:title=docker-util.c:set_capabilities()} + if (values != NULL) { + if(permitted_values != NULL) { + for (i = 0; values[i] != NULL; ++i) { + memset(tmp_buffer, 0, tmp_buffer_size); + permitted = 0; + for (j = 0; permitted_values[j] != NULL; ++j) { + if (strcmp(values[i], permitted_values[j]) == 0) { + permitted = 1; + break; + } + } {noformat} I think this is the 3rd time I’ve seen this structure. I think it would be nice if we had a single function that took in a double array for permitted_values, and the value to be checked to see whether that value is permitted. That way we don’t have to duplicate all of the looping and can minimize the amount of code that we need to maintain. We could use this for networks, capabilities, mounts, devices, etc. ---- {noformat:title=docker-util.c:set_capabilities()} + if(values != NULL) { + free(values[0]); + free(values); + } + if(permitted_values != NULL) { + free(permitted_values[0]); + free(permitted_values); + } {noformat} Same issue with not looping when freeing ---- {noformat:title=docker-util.c:set_devices()} + if (permitted_devices != NULL) { + free(permitted_devices[0]); + free(permitted_devices); + } + if (values != NULL) { + free(values[0]); + free(values); {noformat} Same issue with not looping when freeing ---- {noformat:title=docker-util.c:get_docker_run_command()} + memset(out, 0, outlen); {noformat} Is there a reason we need to memset {{out}} here but not in the other {{get_*_command}} functions? ---- {noformat:title=docker-util.c:get_docker_run_command()} + quote_and_append_arg(&tmp_buffer, &tmp_buffer_size, "", image); + ret = add_to_buffer(out, outlen, tmp_buffer); + if (ret != 0) { + return BUFFER_TOO_SMALL; + } + memset(tmp_buffer, 0, tmp_buffer_size); + + launch_command = get_configuration_values_delimiter("launch-command", DOCKER_COMMAND_FILE_SECTION, &command_config, + ","); + if (launch_command != NULL) { + for (i = 0; launch_command[i] != NULL; ++i) { + memset(tmp_buffer, 0, tmp_buffer_size); {noformat} Don’t think we need the first memset here since the first thing we do in the for loop is memset. ---- {noformat:title=docker-util.c:get_docker_run_command()} + ret = add_to_buffer(out, outlen, DOCKER_RUN_COMMAND); + if (ret == 0) { … + } + return BUFFER_TOO_SMALL; {noformat} Rather than put this whole block in an if statement, we could invert the if statement and have an early return on failure, similar to what we do a few lines up with {{add_docker_config_param()}} ---- {noformat:title=docker-util.c:get_docker_run_command()} + if (ret != 0) { + free(launch_command[0]); + free(launch_command); + free(tmp_buffer); {noformat} Same issue with not looping when freeing for {{launch_command}} ---- As a general comment for docker-util.c, it would be nice if we could consolidate some of the code in the {{get\_\*\_command}} functions and in the {{set\_\*}} functions. There is a lot of redundant code in there. I’m worried that someone in the future will change part of the flow in one function (possibly a bug fix) that will then not get propagated to the rest of the similar code. This wouldn't necessarily have to be a part of this patch, but it might get kicked down the line and never get in if we don't do it now. > Add support to turn off launching privileged containers in the > container-executor > --------------------------------------------------------------------------------- > > Key: YARN-6623 > URL: https://issues.apache.org/jira/browse/YARN-6623 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager > Reporter: Varun Vasudev > Assignee: Varun Vasudev > Priority: Blocker > Attachments: YARN-6623.001.patch, YARN-6623.002.patch, > YARN-6623.003.patch, YARN-6623.004.patch, YARN-6623.005.patch > > > Currently, launching privileged containers is controlled by the NM. We should > add a flag to the container-executor.cfg allowing admins to disable launching > privileged containers at the container-executor level. -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org