Nathan Falk created AMBARI-12974:
------------------------------------

             Summary: fast-hdfs-resource fails when sticky bit is used for chmod
                 Key: AMBARI-12974
                 URL: https://issues.apache.org/jira/browse/AMBARI-12974
             Project: Ambari
          Issue Type: Bug
          Components: contrib
    Affects Versions: 2.1.0
         Environment: x86 or power, any OS
IBM Open Platform
            Reporter: Nathan Falk
IBM Open Platform version 4.1 uses the permission 01777 for Spark's event log directory:

{code}
[root@compute000 ~]# grep spark_eventlog_dir_mode /var/lib/ambari-server/resources/stacks/BigInsights/4.1/services/SPARK/package/scripts/params.py
spark_eventlog_dir_mode = 01777
{code}

As a result, the Spark History Server fails to start with an IllegalArgumentException:

{code}
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/BigInsights/4.1/services/SPARK/package/scripts/job_history_server.py", line 167, in <module>
    JobHistoryServer().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 218, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/BigInsights/4.1/services/SPARK/package/scripts/job_history_server.py", line 73, in start
    self.create_historyServer_directory()
  File "/var/lib/ambari-agent/cache/stacks/BigInsights/4.1/services/SPARK/package/scripts/job_history_server.py", line 120, in create_historyServer_directory
    params.HdfsResource(None, action="execute")
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 157, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 396, in action_execute
    self.get_hdfs_resource_executor().action_execute(self)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 117, in action_execute
    logoutput=logoutput,
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 157, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 258, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'hadoop --config /usr/iop/current/hadoop-client/conf jar /var/lib/ambari-agent/lib/fast-hdfs-resource.jar /var/lib/ambari-agent/data/hdfs_resources.json' returned 1.
WARNING: Use "yarn jar" to launch YARN applications.
Using filesystem uri: hdfs://localhost:8020
Creating: Resource [source=null, target=/user/spark, type=directory, action=create, owner=spark, group=hadoop, mode=755, recursiveChown=false, recursiveChmod=false, changePermissionforParents=false]
Creating: Resource [source=null, target=hdfs://localhost:8020/iop/apps/4.1.0.0/spark/logs/history-server, type=directory, action=create, owner=spark, group=hadoop, mode=1777, recursiveChown=false, recursiveChmod=false, changePermissionforParents=false]
Exception in thread "main" java.lang.IllegalArgumentException: 1777
	at org.apache.hadoop.fs.permission.PermissionParser.<init>(PermissionParser.java:60)
	at org.apache.hadoop.fs.permission.UmaskParser.<init>(UmaskParser.java:42)
	at org.apache.hadoop.fs.permission.FsPermission.<init>(FsPermission.java:106)
	at org.apache.ambari.fast_hdfs_resource.Resource.setMode(Resource.java:217)
	at org.apache.ambari.fast_hdfs_resource.Runner.main(Runner.java:78)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{code}

The problem is in fast-hdfs-resource itself; when WebHDFS is used, a different code path is taken and the error does not occur. Since some of our users run IBM Spectrum Scale instead of HDFS, enabling WebHDFS is not possible, so fast-hdfs-resource is used for all Hadoop file operations.

A JIRA was opened for this problem previously, and a patch was provided (AMBARI-11351). That JIRA was closed because the problem appeared to have gone away. In reality, the problem was still present, but it was masked by the use of WebHDFS.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
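For context on the failure mode: the stack trace shows that Resource.setMode goes through FsPermission(String) -> UmaskParser -> PermissionParser, and that parser rejects the four-digit mode string "1777". A minimal standalone sketch of a numeric workaround follows. The helper name parseOctalMode is hypothetical and this is not the actual AMBARI-11351 patch; it only illustrates that parsing the mode as base-8 (which would suit the FsPermission(short) constructor) keeps the sticky bit intact:

{code}
// Hypothetical sketch, not the Ambari patch: parse an octal mode string such
// as "1777" numerically instead of handing it to FsPermission(String), whose
// PermissionParser only accepts three-digit octal or symbolic modes.
public class ModeParseSketch {

    // "1777" parsed as base-8 is 01777 (1023 decimal), which fits in a short
    // and preserves the sticky bit in the high octal digit.
    static short parseOctalMode(String mode) {
        return Short.parseShort(mode, 8);
    }

    public static void main(String[] args) {
        short m = parseOctalMode("1777");
        System.out.println(Integer.toOctalString(m)); // prints 1777
        boolean sticky = (m & 01000) != 0;            // 01000 is the sticky bit
        System.out.println(sticky);                   // prints true
    }
}
{code}

A short obtained this way could then be passed to the FsPermission(short) constructor, bypassing the string parser that throws the IllegalArgumentException above.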