@hadoop folks,

https://github.com/apache/hadoop/pull/8177#issue-comment-box

https://github.com/apache/hadoop/pull/8184

I know this work stream appeared from nowhere :) sorry. Please take some
time to review this.

My assessment, which could be wrong, is that LCE (linux-container-executor)
does not work as described. It creates directories with permissions like 750
owned by the user, and the YARN NodeManager running as yarn:hadoop cannot
write to them.
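
To make the permission problem concrete, here is a minimal sketch of the
classic POSIX write check for a non-root process. The function name
may_write and the uid 1000 for auser are assumptions for illustration only;
the group ids 103 (yarn) and 101 (hadoop) come from the id output quoted
further down. It ignores ACLs, capabilities, and root.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper, not part of container-executor.
 * Simplified POSIX write check for a non-root process: exactly one
 * permission class (owner, then group, then other) is consulted. */
static bool may_write(mode_t mode, uid_t dir_uid, gid_t dir_gid,
                      uid_t proc_uid, const gid_t *proc_gids, size_t ngids) {
  /* The owner class wins if the process uid matches the directory owner. */
  if (proc_uid == dir_uid) {
    return (mode & S_IWUSR) != 0;
  }
  /* Otherwise the group class applies if any process group matches. */
  for (size_t i = 0; i < ngids; i++) {
    if (proc_gids[i] == dir_gid) {
      return (mode & S_IWGRP) != 0;
    }
  }
  /* Everyone else falls through to the "other" bits. */
  return (mode & S_IWOTH) != 0;
}
```

With a directory of mode 02750 owned by auser:hadoop, a process in groups
{103, 101} lands in the group class, and since the group write bit is clear
the check fails. With mode 02770 the same process would be allowed to write.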

Also, the C code has some inaccuracies: it allocates buffers incorrectly and
returns pointers to memory that has already been freed.
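
The PRs are the reference for the specific defects. As a generic
illustration of the safe pattern for these two classes of bug (not code
taken from container-executor; join_path is a hypothetical name), a helper
can size the buffer from its inputs and hand ownership of the allocation to
the caller:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not taken from container-executor.
 * Returns a newly allocated "root/leaf" string, or NULL on allocation
 * failure. The caller owns the result and must free() it. */
static char *join_path(const char *root, const char *leaf) {
  /* +2: one byte for the '/' separator, one for the terminating NUL. */
  size_t len = strlen(root) + strlen(leaf) + 2;
  char *out = malloc(len);
  if (out == NULL) {
    return NULL;
  }
  snprintf(out, len, "%s/%s", root, leaf);
  /* Do not free here: returning a pointer after free() is the
   * use-after-free pattern this sketch is meant to avoid. */
  return out;
}
```

A caller would use it as char *p = join_path(root, "usercache"); and call
free(p) once it is done with the path.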

If you look at the second PR, I actually have a bit more work to do.
Changing the appcache dir isn't enough. The same problem exists for the log
directory. However, even if YARN fails to write the logs, it will still mark
the job as successful, but there can be 4 attempt directories.


On Wednesday, January 14, 2026, Edward Capriolo <[email protected]>
wrote:

> Hadoop devs.
>
> I created this bug/feature:
> https://issues.apache.org/jira/browse/YARN-11920
>
> All I can guess is that people are running yarn as yarn:hadoop and then
> adding all the users to the hadoop group (which is against what the
> documents suggest). But there just isn't any way to make this work that I can
> see.
>
> On Tue, Jan 13, 2026 at 10:08 AM Edward Capriolo <[email protected]>
> wrote:
>
>> No. If you look inside the code of container-executor, it checks and
>> constantly rewrites the permissions:
>>
>> /**
>>  * Ensure that the given path and all of the parent directories are created
>>  * with the desired permissions.
>>  */
>> int mkdirs(const char* path, mode_t perm) {
>>   struct stat sb;
>>
>>
>> New runs rewrite all permissions when any folder is created, so user abc
>> will recreate the tree.
>>
>> On Mon, Jan 12, 2026 at 10:42 PM Balaji Radhakrishnan <
>> [email protected]> wrote:
>>
>>> Hello Edward,
>>>
>>> I think you should be able to give write permissions 'rx' to group
>>> manually.
>>>
>>> Thanks
>>> R Balaji
>>>
>>> ------------------------------
>>> *From:* Edward Capriolo <[email protected]>
>>> *Sent:* Tuesday, January 13, 2026 2:47:18 AM
>>> *To:* [email protected] <[email protected]>
>>> *Subject:* Re: Question on linux-container-executor
>>>
>>> src/main/native/container-executor/test/test-container-executor.c
>>>
>>> void create_nm_roots(char ** nm_roots) {
>>>   char** nm_root;
>>>   for(nm_root=nm_roots; *nm_root != NULL; ++nm_root) {
>>>     if (mkdir(*nm_root, 0755) != 0) {
>>>       printf("FAIL: Can't create directory %s - %s\n", *nm_root,
>>>              strerror(errno));
>>>       exit(1);
>>>     }
>>>     char buffer[100000];
>>>     sprintf(buffer, "%s/usercache", *nm_root);
>>>     if (mkdir(buffer, 0755) != 0) {
>>>       printf("FAIL: Can't create directory %s - %s\n", buffer,
>>>              strerror(errno));
>>>       exit(1);
>>>     }
>>>   }
>>> }
>>>
>>>
>>> The test here creates directories with mode 755, which on the surface
>>> seems to differ from what I am seeing.
>>>
>>> On Mon, Jan 12, 2026 at 3:53 PM Edward Capriolo <[email protected]>
>>> wrote:
>>>
>>> Hello. I am trying to run linux-container-executor in a setup without
>>> Kerberos. I want to see it "change user" and run a MapReduce job.
>>>
>>> I have a fork of linux-container-executor with some gratuitous println:
>>>
>>> main : command provided 0
>>> 2026-01-12T19:51:25.467740715Z main : run as user is auser
>>> 2026-01-12T19:51:25.467750476Z main : requested yarn user is auser
>>> 2026-01-12T19:51:25.467760225Z main : validate_container_id
>>> 2026-01-12T19:51:25.467771148Z main : huh
>>> 2026-01-12T19:51:25.467784131Z validated command: INITIALIZE_CONTAINER
>>> 2026-01-12T19:51:25.467795274Z init : set_user
>>> 2026-01-12T19:51:25.467805332Z maybe free_user
>>> 2026-01-12T19:51:25.467815142Z going to check user
>>> 2026-01-12T19:51:25.467824798Z min id
>>> 2026-01-12T19:51:25.467833618Z min id 1000
>>> 2026-01-12T19:51:25.467842685Z Get user info
>>> 2026-01-12T19:51:25.467851066Z init : set_user done
>>> 2026-01-12T19:51:25.467860879Z initialize_app(
>>> 2026-01-12T19:51:25.467871118Z create user dirs
>>> 2026-01-12T19:51:25.467881131Z initialize_user.
>>> 2026-01-12T19:51:25.467890384Z created
>>> 2026-01-12T19:51:25.467900435Z create_log_dirs().
>>> 2026-01-12T19:51:25.467911090Z create container log
>>> 2026-01-12T19:51:25.467920790Z create_container_log_dirs
>>> 2026-01-12T19:51:25.467931683Z open_file_as_nm.
>>> 2026-01-12T19:51:25.467941717Z change_user
>>> 2026-01-12T19:51:25.467952667Z change_user.
>>> *2026-01-12T19:51:25.467962032Z Can't create directory
>>> /yarn-root/nm-local-dir/usercache/auser/appcache - Permission denied*
>>> 2026-01-12T19:51:25.467973350Z Did not create any app directories
>>>
>>> I am creating users like this:
>>>
>>>   RUN addgroup -S hadoop
>>>   RUN addgroup -S hdfs && adduser -S -G hdfs -H -D hdfs
>>>   RUN addgroup -S yarn && adduser -S -G yarn -H -D yarn
>>>   RUN addgroup yarn hadoop
>>>   RUN addgroup -S auser && adduser -S -G auser -H -D auser
>>>
>>> I am launching a wordcount as "auser" like so:
>>>
>>> https://github.com/edwardcapriolo/edgy-ansible/blob/main/imaging/hadoop/
>>> compositions/ha_rm_zk_pki_tls/enter_auser.sh
>>>
>>> This is what the directory inside the node manager looks like:
>>>
>>> nm1:/yarn-root/nm-local-dir/usercache# rm -rf auser/
>>> nm1:/yarn-root/nm-local-dir/usercache# ld -lahd /yarn-root/
>>> nm1:/yarn-root/nm-local-dir/usercache# ls -lahd /yarn-root/
>>> drwxr-xr-x    1 yarn     root          24 Jan 12 19:32 /yarn-root/
>>> nm1:/yarn-root/nm-local-dir/usercache# ls -lahd /yarn-root/nm-local-dir/
>>> drwxr-xr-x    1 yarn     hadoop        54 Jan 12 19:32
>>> /yarn-root/nm-local-dir/
>>> nm1:/yarn-root/nm-local-dir/usercache# ls -lahd /yarn-root/nm-local-dir/
>>> filecache/ nmPrivate/ usercache/
>>> nm1:/yarn-root/nm-local-dir/usercache# ls -lahd /yarn-root/nm-local-dir/
>>> usercache/
>>> drwxr-sr-x    1 yarn     hadoop        10 Jan 12 20:38
>>> /yarn-root/nm-local-dir/usercache/
>>> nm1:/yarn-root/nm-local-dir/usercache# ls -lahd /yarn-root/nm-local-dir/
>>> usercache/auser/
>>> drwxr-s---    1 auser    hadoop         0 Jan 12 20:38
>>> /yarn-root/nm-local-dir/usercache/auser/
>>>
>>> My node manager is running as yarn
>>> nm1:/$ ps -ef | grep yarn
>>>     1 yarn      0:20 /usr/bin/java -Dproc_nodemanager
>>>
>>> nm1:/$ id -u yarn
>>> 101
>>> nm1:/$ id -g yarn
>>> 103
>>> nm1:/$ id -G yarn
>>> 103 101
>>> nm1:/$ id -G yarn -n
>>> yarn hadoop
>>>
>>> nm1:/$ umask
>>> 0022
>>>
>>> I am guessing that the issue is
>>>
>>> drwxr-s---    1 auser    hadoop         0 Jan 12 20:38 auser
>>>
>>> This directory gets owned by auser/hadoop but the group write is off?
>>>
>>> My yarn config is here:
>>> https://github.com/edwardcapriolo/edgy-ansible/blob/main/imaging/hadoop/
>>> compositions/ha_rm_zk_pki_tls/hd_conf/yarn-site.xml#L126
>>>
>>> Also, when I change it manually, it just gets put back:
>>>
>>> nm1:/yarn-root/nm-local-dir/usercache# chmod g+w auser/
>>> nm1:/yarn-root/nm-local-dir/usercache# ls -lah
>>> total 0
>>> drwxr-sr-x    1 yarn     hadoop        10 Jan 12 20:38 .
>>> drwxr-xr-x    1 yarn     hadoop        54 Jan 12 19:32 ..
>>> drwxrws---    1 auser    hadoop         0 Jan 12 20:38 auser
>>> nm1:/yarn-root/nm-local-dir/usercache# ls -lah
>>> total 0
>>> drwxr-sr-x    1 yarn     hadoop        10 Jan 12 20:38 .
>>> drwxr-xr-x    1 yarn     hadoop        54 Jan 12 19:32 ..
>>> drwxr-s---    1 auser    hadoop         0 Jan 12 20:38 auser
>>>
>>> Any help would be appreciated. Thanks!
>>>
>>>
>>>
>>>
>>>

-- 
Sorry this was sent from mobile. Will do less grammar and spell check than
usual.
