[ https://issues.apache.org/jira/browse/YARN-8207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467488#comment-16467488 ]
Jason Lowe commented on YARN-8207: ---------------------------------- {quote}Struct args is still evolving. I think it would be safer to keep the data structure private for opaque data structure and deep copy to caller. {quote} Having an opaque data structure that the caller must deep copy to use correctly doesn't make sense. Deep copies by clients is the antithesis of opacity, since the client must intimately understand the structure to properly perform the deep copy. Even if the structure isn't changing, clients are likely to get this copy wrong (as has already occurred), especially if each use needs it done separately. At a bare minimum there should be a utility method, e.g: {{extract_execv_args(args* args)}}, that encapsulates the extraction of execv arguments so each use of args for execv doesn't require a separate, tricky deep copy loop with a trailing NULL appended. That utility method could also avoid the full deep copy and simply transfer ownership of the individual strings to the shallow-copied vector that is returned. But in the end I think avoiding copies completely by heap allocating the args is a much more efficient and elegant solution. Please create an init function, e.g.: init_args(args* args), or a macro to encapsulate initialization of the structure. If the structure is likely to change then it's important to put this in one place rather than chase down every separate initialization code instance. In {{flatten}} this code: {code:java} to = '\0'; {code} is setting the pointer to NULL rather than terminating the string with a NUL character. It needs to be: {code:java} *to = '\0'; {code} {quote}At this time, add_to_args returns no opts to avoid having to check for null on make_string. {quote} I'm not sure what is meant by "returns no opts to avoid having to check for null on make_string." If add_to_args is passed a NULL string then it will segfault because the first statement in add_to_args tries to dereference it: {code:java} static int add_to_args(args *args, const char *string) { if (!string[0]) { {code} {quote}I think the proposal of making the reverse change will add more null pointer check, which makes the code harder to read again. {quote} What I'm proposing would make the code easier to read without triggering the segfault. Most uses of add_to_args look something like this: {code:java} char* x = make_string(fmt, ...); ret = add_to_args(args, x); free(x); if (ret != 0) { ... {code} As add_to_args works today, the lack of a NULL check on the make_string result will cause the program to crash. I'm proposing to add this check at the beginning of add_to_args: {code:java} static int add_to_args(args *args, const char *string) { if (string == NULL) { return -1; } {code} Then the existing make_string followed by add_to_args code pattern works because add_to_args will properly handle a NULL argument and return an error. That precludes the need to add NULL checks everywhere make_string is called when immediately followed by add_to_args. That improves readability while avoiding a segfault. The length != 0 check in reset_args is unnecessary. This code: {code:java} size_t offset = tmp_ptr - values[i] + 1; dst = (char *) alloc_and_clear_memory(offset, sizeof(char)); strncpy(dst, values[i], offset); dst[tmp_ptr - values[i]] = '\0'; pattern = (char *) alloc_and_clear_memory((size_t)(strlen(permitted_values[j]) - 5), sizeof(char)); strcpy(pattern, permitted_values[j] + 6); {code} would be safer and easier to understand written with strdup/strndup, e.g.: {code:java} dst = strndup(values[i], tmp_ptr - values[i]); pattern = strdup(permitted_values[j] + 6); {code} make_string is still not checking for vsnprintf failure. If the first vsnprintf fails and returns -1, the code will allocate a 0-byte buffer. The second vsnprintf may then actually succeed, returning the length required to store all of the string without updating the zero-length buffer. Then make_string will "succeed" by returning an unterminated buffer. It would be good to fix the tab character issue that's been flagged in the past few patches (in create_container_directories). > Docker container launch use popen have risk of shell expansion > -------------------------------------------------------------- > > Key: YARN-8207 > URL: https://issues.apache.org/jira/browse/YARN-8207 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-native-services > Affects Versions: 3.0.0, 3.1.0, 3.0.1, 3.0.2 > Reporter: Eric Yang > Assignee: Eric Yang > Priority: Blocker > Labels: Docker > Attachments: YARN-8207.001.patch, YARN-8207.002.patch, > YARN-8207.003.patch, YARN-8207.004.patch, YARN-8207.005.patch, > YARN-8207.006.patch, YARN-8207.007.patch, YARN-8207.008.patch > > > Container-executor code utilize a string buffer to construct docker run > command, and pass the string buffer to popen for execution. Popen spawn a > shell to run the command. Some arguments for docker run are still vulnerable > to shell expansion. The possible solution is to convert from char * buffer > to string array for execv to avoid shell expansion. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org