Gergo Repas created YARN-7796:
---------------------------------

             Summary: Container-executor fails with segfault for certain OS configurations
                 Key: YARN-7796
                 URL: https://issues.apache.org/jira/browse/YARN-7796
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager
    Affects Versions: 3.0.0
            Reporter: Gergo Repas
            Assignee: Gergo Repas
         Attachments: YARN-7796.000.patch

A relatively large (128K) buffer is allocated on the stack in copy_file() in container-executor.c for the purpose of copying files. As the gdb stack trace below shows, this allocation can fail with SIGSEGV. This happens only on certain OS configurations; I can reproduce the issue on RHEL 6.9:

{code:java}
[Thread debugging using libthread_db enabled]
main : command provided 0
main : run as user is systest
main : requested yarn user is systest

Program received signal SIGSEGV, Segmentation fault.
0x00000000004069bc in copy_file (input=7, in_filename=0x7ffd669fd2d6 "/yarn/nm/nmPrivate/container_1516711246952_0001_02_000001.tokens", out_filename=0x932930 "/yarn/nm/usercache/systest/appcache/application_1516711246952_0001/container_1516711246952_0001_02_000001.tokens", perm=384) at /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c:966
966       char buffer[buffer_size];
(gdb) bt
#0  0x00000000004069bc in copy_file (input=7, in_filename=0x7ffd669fd2d6 "/yarn/nm/nmPrivate/container_1516711246952_0001_02_000001.tokens", out_filename=0x932930 "/yarn/nm/usercache/systest/appcache/application_1516711246952_0001/container_1516711246952_0001_02_000001.tokens", perm=384) at /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c:966
#1  0x0000000000409a81 in initialize_app (user=<value optimized out>, app_id=0x7ffd669fd2b7 "application_1516711246952_0001", nmPrivate_credentials_file=0x7ffd669fd2d6 "/yarn/nm/nmPrivate/container_1516711246952_0001_02_000001.tokens", local_dirs=0x9331c8, log_roots=<value optimized out>, args=0x7ffd669fb168) at /root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c:1122
#2  0x0000000000403f90 in main (argc=<value optimized out>, argv=<value optimized out>) at
/root/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c:558
{code}
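
For illustration, one way to avoid depending on the available stack space is to take the copy buffer from the heap, so that an allocation failure surfaces as an error return that the caller can handle. The sketch below is only an assumption about what such a change could look like; it is not taken from YARN-7796.000.patch. The function name copy_fd_heap and the macro COPY_BUFFER_SIZE are hypothetical and do not appear in container-executor.c.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Same size as the buffer described in the report (128K). */
#define COPY_BUFFER_SIZE (128 * 1024)

/*
 * Hypothetical helper, NOT the actual copy_file() from container-executor.c.
 * Copies everything readable from in_fd to out_fd through a heap buffer.
 * Returns 0 on success and -1 on any allocation, read, or write failure.
 */
int copy_fd_heap(int in_fd, int out_fd) {
  char *buffer = malloc(COPY_BUFFER_SIZE);
  if (buffer == NULL) {
    return -1; /* allocation failure is reported to the caller */
  }

  int result = 0;
  while (result == 0) {
    ssize_t len = read(in_fd, buffer, COPY_BUFFER_SIZE);
    if (len == 0) {
      break; /* end of input */
    }
    if (len < 0) {
      if (errno == EINTR) {
        continue; /* interrupted by a signal, retry the read */
      }
      result = -1;
      break;
    }

    /* write() may accept fewer bytes than requested, so loop until done. */
    ssize_t done = 0;
    while (done < len) {
      ssize_t written = write(out_fd, buffer + done, (size_t)(len - done));
      if (written < 0) {
        if (errno == EINTR) {
          continue; /* interrupted by a signal, retry the write */
        }
        result = -1;
        break;
      }
      done += written;
    }
  }

  free(buffer);
  return result;
}
```

A caller would open the source and destination files itself and pass the two descriptors in. Whether the real fix uses malloc, a static buffer, or a smaller stack buffer is for the attached patch to decide.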