[ https://issues.apache.org/jira/browse/YARN-8259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16483038#comment-16483038 ]
Eric Badger commented on YARN-8259:
-----------------------------------

bq. If hidepid option is used by system administrator, yarn user might not have rights to check if /proc/[pid] exists.

This might be a concern, but there is a workaround that lets the admin whitelist the NM user: https://linux-audit.com/linux-system-hardening-adding-hidepid-to-proc/

bq. Also, the reacquisition code runs signalContainer once per second until the application finishes, this resulted in many docker inspect and container-executor calls, which are expensive operations.

This worries me the most. Especially on nodes where many containers are running concurrently, this could be pretty devastating for rolling upgrades. I don't have a strong opinion one way or the other on retries vs. /proc for correctness, but I am worried about overloading the docker daemon with a large number of inspect/ps calls.

> Revisit liveliness checks for Docker containers
> -----------------------------------------------
>
>                 Key: YARN-8259
>                 URL: https://issues.apache.org/jira/browse/YARN-8259
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 3.0.2, 3.2.0, 3.1.1
>            Reporter: Shane Kumpf
>            Assignee: Shane Kumpf
>            Priority: Major
>              Labels: Docker
>         Attachments: YARN-8259.001.patch
>
>
> As privileged containers may execute as a user that does not match the YARN
> run as user, sending the null signal for liveliness checks could fail. We
> need to reconsider how liveliness checks are handled in the Docker case.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
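
For readers unfamiliar with the two liveliness probes discussed in this issue, here is a minimal Python sketch of a null-signal check and a /proc check. It is not the YARN or container-executor implementation; the function names `alive_by_signal` and `alive_by_proc` are hypothetical, and the /proc variant assumes a Linux procfs mounted at /proc.

```python
import os


def alive_by_signal(pid: int) -> bool:
    """Null-signal probe: signal 0 tests for existence without delivering a signal."""
    if pid <= 0:
        # Non-positive pids address process groups in kill(2), not one process.
        raise ValueError("pid must be positive")
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # ESRCH: no such process
    except PermissionError:
        # EPERM: the process exists but is owned by a user we may not signal.
        # A caller that treats every failure as "dead" would misreport here.
        return True
    return True


def alive_by_proc(pid: int) -> bool:
    """/proc probe (assumes Linux procfs at /proc)."""
    # If /proc is mounted with hidepid, entries owned by other users can be
    # hidden, so this may return False for a process that is still running.
    return os.path.isdir(f"/proc/{pid}")
```

Both probes are cheap local calls. Neither reduces the cost of polling through docker inspect, which is the concern raised in the comment above.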