[ 
https://issues.apache.org/jira/browse/YARN-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18052575#comment-18052575
 ] 

ASF GitHub Bot commented on YARN-11920:
---------------------------------------

edwardcapriolo opened a new pull request, #8184:
URL: https://github.com/apache/hadoop/pull/8184

   
   ### Description of PR
   
   1. The container executor creates directories incorrectly, so YARN is unable to 
create the app dir.
   2. The code frees memory incorrectly in the method create_app_dirs: a pointer to 
app_dir is created, then primary_app_dir is set to app_dir, then app_dir is 
free()ed. This results in undefined behavior.
   3. No unit test was calling create_app_dirs (one was added).
   4. Better comments were added explaining the different partial failures 
possible in the function.
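
The aliasing problem described in item 2 can be sketched roughly as follows. This is a minimal illustration, not the actual container-executor code: `make_app_dir` and `pick_primary_app_dir` are hypothetical names invented here, and the fix shown (handing ownership of the first allocation to the caller) is only one possible approach, not necessarily the one taken in the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the Hadoop source): builds "<base>/<app_id>"
 * on the heap. Returns NULL if allocation fails. */
static char *make_app_dir(const char *base, const char *app_id) {
  assert(base != NULL && app_id != NULL);
  size_t len = strlen(base) + 1 + strlen(app_id) + 1;
  char *path = malloc(len);
  if (path == NULL) {
    return NULL;
  }
  strcpy(path, base);
  strcat(path, "/");
  strcat(path, app_id);
  return path;
}

/* Buggy shape described in the PR (shown as a comment, do not use):
 *
 *   char *app_dir = make_app_dir(base, app_id);
 *   primary_app_dir = app_dir;   // alias, not a copy
 *   free(app_dir);               // primary_app_dir now dangles
 *   ... primary_app_dir used later: undefined behavior ...
 */

/* One possible fix: the first successfully built path is kept and its
 * ownership moves to the caller; every other path is freed right away. */
static char *pick_primary_app_dir(const char *const *bases, size_t n,
                                  const char *app_id) {
  char *primary = NULL;
  for (size_t i = 0; i < n; i++) {
    char *app_dir = make_app_dir(bases[i], app_id);
    if (app_dir == NULL) {
      continue;              /* partial failure: try the next base dir */
    }
    if (primary == NULL) {
      primary = app_dir;     /* ownership moves; must not be freed here */
    } else {
      free(app_dir);         /* not retained, so freeing is safe */
    }
  }
  return primary;            /* caller is responsible for free() */
}
```

A caller would pass the list of local directories, use the returned path, and call `free()` on it exactly once when done.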
   
   ### How was this patch tested?
   Unit tests pass. Unit tests were added.
   
   ### For code changes:
   
   - [ Y] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ NA] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ NA] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ NA] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   ### AI Tooling
   
   This PR was made by hand the 'old school way'
   If an AI tool was used:
   
   - [ NA ] The PR includes the phrase "Contains content generated by <tool>"
         where <tool> is the name of the AI tool used.
   - [ NA ] My use of AI contributions follows the ASF legal policy
         https://www.apache.org/legal/generative-tooling.html




> linux-container-executor requires flexible directory permissions.
> -----------------------------------------------------------------
>
>                 Key: YARN-11920
>                 URL: https://issues.apache.org/jira/browse/YARN-11920
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Edward Capriolo
>            Priority: Major
>
> When running the YARN NodeManager, directories are created that the NodeManager 
> cannot write to. I discussed this on the mailing list:  
> {quote}Hello. I am trying to run linux-container-executor in a setup without 
> Kerberos. I want to see it "change user" and run a MapReduce job.
> I have a fork of linux-container-executor with some gratuitous println:
> main : command provided 0
> 2026-01-12T19:51:25.467740715Z main : run as user is auser
> 2026-01-12T19:51:25.467750476Z main : requested yarn user is auser
> 2026-01-12T19:51:25.467760225Z main : validate_container_id 
> 2026-01-12T19:51:25.467771148Z main : huh 
> 2026-01-12T19:51:25.467784131Z validated command: INITIALIZE_CONTAINER
> 2026-01-12T19:51:25.467795274Z init : set_user 
> 2026-01-12T19:51:25.467805332Z maybe free_user 
> 2026-01-12T19:51:25.467815142Z going to check user 
> 2026-01-12T19:51:25.467824798Z min id 
> 2026-01-12T19:51:25.467833618Z min id 1000
> 2026-01-12T19:51:25.467842685Z Get user info 
> 2026-01-12T19:51:25.467851066Z init : set_user done 
> 2026-01-12T19:51:25.467860879Z initialize_app( 
> 2026-01-12T19:51:25.467871118Z create user dirs 
> 2026-01-12T19:51:25.467881131Z initialize_user.
> 2026-01-12T19:51:25.467890384Z created 
> 2026-01-12T19:51:25.467900435Z create_log_dirs().
> 2026-01-12T19:51:25.467911090Z create container log 
> 2026-01-12T19:51:25.467920790Z create_container_log_dirs
> 2026-01-12T19:51:25.467931683Z open_file_as_nm.
> 2026-01-12T19:51:25.467941717Z change_user 
> 2026-01-12T19:51:25.467952667Z change_user.
> *2026-01-12T19:51:25.467962032Z Can't create directory 
> /yarn-root/nm-local-dir/usercache/auser/appcache - Permission denied*
> 2026-01-12T19:51:25.467973350Z Did not create any app directories
> I am creating users like this:
>   RUN addgroup -S hadoop
>   RUN addgroup -S hdfs && adduser -S -G hdfs -H -D hdfs
>   RUN addgroup -S yarn && adduser -S -G yarn -H -D yarn
>   RUN addgroup yarn hadoop
>   RUN addgroup -S auser && adduser -S -G auser -H -D auser
> I am launching a wordcount as "auser" like so:
> [https://github.com/edwardcapriolo/edgy-ansible/blob/main/imaging/hadoop/compositions/ha_rm_zk_pki_tls/enter_auser.sh]
> This is what the directory inside the node manager looks like:
> nm1:/yarn-root/nm-local-dir/usercache# rm -rf auser/
> nm1:/yarn-root/nm-local-dir/usercache# ld -lahd /yarn-root/
> nm1:/yarn-root/nm-local-dir/usercache# ls -lahd /yarn-root/
> drwxr-xr-x    1 yarn     root          24 Jan 12 19:32 /yarn-root/
> nm1:/yarn-root/nm-local-dir/usercache# ls -lahd /yarn-root/nm-local-dir/
> drwxr-xr-x    1 yarn     hadoop        54 Jan 12 19:32 
> /yarn-root/nm-local-dir/
> nm1:/yarn-root/nm-local-dir/usercache# ls -lahd /yarn-root/nm-local-dir/
> filecache/ nmPrivate/ usercache/ 
> nm1:/yarn-root/nm-local-dir/usercache# ls -lahd 
> /yarn-root/nm-local-dir/usercache/
> drwxr-sr-x    1 yarn     hadoop        10 Jan 12 20:38 
> /yarn-root/nm-local-dir/usercache/
> nm1:/yarn-root/nm-local-dir/usercache# ls -lahd 
> /yarn-root/nm-local-dir/usercache/auser/
> drwxr-s---    1 auser    hadoop         0 Jan 12 20:38 
> /yarn-root/nm-local-dir/usercache/auser/
> My node manager is running as yarn
> nm1:/$ ps -ef | grep yarn
>     1 yarn      0:20 /usr/bin/java -Dproc_nodemanager 
> nm1:/$ id -u yarn
> 101
> nm1:/$ id -g yarn
> 103
> nm1:/$ id -G yarn
> 103 101
> nm1:/$ id -G yarn -n
> yarn hadoop
> nm1:/$ umask 
> 0022
> I am guessing that the issue is 
> drwxr-s---    1 auser    hadoop         0 Jan 12 20:38 auser
> This directory gets owned by auser/hadoop, but the group write bit is off?
> My yarn config is here:
> [https://github.com/edwardcapriolo/edgy-ansible/blob/main/imaging/hadoop/compositions/ha_rm_zk_pki_tls/hd_conf/yarn-site.xml#L126]
> Also, when I change it manually, it just gets put back:
> nm1:/yarn-root/nm-local-dir/usercache# chmod g+w auser/
> nm1:/yarn-root/nm-local-dir/usercache# ls -lah
> total 0      
> drwxr-sr-x    1 yarn     hadoop        10 Jan 12 20:38 .
> drwxr-xr-x    1 yarn     hadoop        54 Jan 12 19:32 ..
> drwxrws---    1 auser    hadoop         0 Jan 12 20:38 auser
> nm1:/yarn-root/nm-local-dir/usercache# ls -lah
> total 0      
> drwxr-sr-x    1 yarn     hadoop        10 Jan 12 20:38 .
> drwxr-xr-x    1 yarn     hadoop        54 Jan 12 19:32 ..
> drwxr-s---    1 auser    hadoop         0 Jan 12 20:38 auser
> {quote}
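
The reporter's guess about the `drwxr-s---` mode (octal 02750) can be checked with a small sketch. This is a simplified model of the classic owner/group/other permission check, assuming no ACLs, no capabilities, and a non-root caller; `can_create_in_dir` is a hypothetical name used only for illustration. It shows only what the mode bits permit and does not establish the actual cause of the "Permission denied" in the log above.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Simplified check: creating an entry in a directory needs both the write
 * and the search (execute) bit of the permission class that applies to the
 * caller. Exactly one class applies: owner first, then group, then other. */
static bool can_create_in_dir(mode_t mode, bool is_owner, bool in_group) {
  mode_t w, x;
  if (is_owner) {
    w = S_IWUSR; x = S_IXUSR;
  } else if (in_group) {
    w = S_IWGRP; x = S_IXGRP;
  } else {
    w = S_IWOTH; x = S_IXOTH;
  }
  return (mode & w) != 0 && (mode & x) != 0;
}
```

Under this model, with mode 02750 the owner (`auser`) may create entries, while a caller that is only a member of the group (`yarn` via `hadoop`) may not; with 02770, as after `chmod g+w`, the group member may as well.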



