Nathan Falk created AMBARI-12974:
------------------------------------

             Summary: fast-hdfs-resource fails when sticky bit is used for chmod
                 Key: AMBARI-12974
                 URL: https://issues.apache.org/jira/browse/AMBARI-12974
             Project: Ambari
          Issue Type: Bug
          Components: contrib
    Affects Versions: 2.1.0
         Environment: x86 or power, any OS
IBM Open Platform
            Reporter: Nathan Falk


IBM Open Platform version 4.1 uses the permission 01777 for Spark's event log 
directory:

{code}
[root@compute000 ~]# grep spark_eventlog_dir_mode /var/lib/ambari-server/resources/stacks/BigInsights/4.1/services/SPARK/package/scripts/params.py

spark_eventlog_dir_mode = 01777
{code}
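
For context, 01777 is a standard octal mode: the leading 1 is the sticky bit, which on a world-writable directory restricts deletion to the file's owner. A quick illustration of how the digits decompose (plain Python, not Ambari code):

```python
# spark_eventlog_dir_mode = 01777 in params.py; 0o1777 is the Python 3 spelling.
mode = 0o1777

sticky_bit = bool(mode & 0o1000)   # the leading "1" digit sets the sticky bit
permissions = mode & 0o777         # rwxrwxrwx for owner/group/other

print(sticky_bit, oct(permissions))   # True 0o777
```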

In our case, the symptom is that the Spark History Server fails to start with an 
IllegalArgumentException:

{code}
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/BigInsights/4.1/services/SPARK/package/scripts/job_history_server.py", line 167, in <module>
    JobHistoryServer().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 218, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/BigInsights/4.1/services/SPARK/package/scripts/job_history_server.py", line 73, in start
    self.create_historyServer_directory()
  File "/var/lib/ambari-agent/cache/stacks/BigInsights/4.1/services/SPARK/package/scripts/job_history_server.py", line 120, in create_historyServer_directory
    params.HdfsResource(None, action="execute")
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 157, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 396, in action_execute
    self.get_hdfs_resource_executor().action_execute(self)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 117, in action_execute
    logoutput=logoutput,
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 157, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 258, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'hadoop --config /usr/iop/current/hadoop-client/conf jar /var/lib/ambari-agent/lib/fast-hdfs-resource.jar /var/lib/ambari-agent/data/hdfs_resources.json' returned 1. WARNING: Use "yarn jar" to launch YARN applications.
Using filesystem uri: hdfs://localhost:8020
Creating: Resource [source=null, target=/user/spark, type=directory, action=create, owner=spark, group=hadoop, mode=755, recursiveChown=false, recursiveChmod=false, changePermissionforParents=false]
Creating: Resource [source=null, target=hdfs://localhost:8020/iop/apps/4.1.0.0/spark/logs/history-server, type=directory, action=create, owner=spark, group=hadoop, mode=1777, recursiveChown=false, recursiveChmod=false, changePermissionforParents=false]
Exception in thread "main" java.lang.IllegalArgumentException: 1777
    at org.apache.hadoop.fs.permission.PermissionParser.<init>(PermissionParser.java:60)
    at org.apache.hadoop.fs.permission.UmaskParser.<init>(UmaskParser.java:42)
    at org.apache.hadoop.fs.permission.FsPermission.<init>(FsPermission.java:106)
    at org.apache.ambari.fast_hdfs_resource.Resource.setMode(Resource.java:217)
    at org.apache.ambari.fast_hdfs_resource.Runner.main(Runner.java:78)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{code}


The problem is in fast-hdfs-resource itself. When WebHDFS is used instead, a 
different code path is taken and the error does not occur.
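
The failure mode can be shown in miniature. Below is a hedged sketch (plain Python, not the actual Ambari or Hadoop code) of a mode parser that, like the parser in the stack trace, accepts only three octal digits, next to a lenient variant that tolerates the optional leading setuid/setgid/sticky digit; the function names and regexes are illustrative assumptions, not the real fast-hdfs-resource API:

```python
import re

# Hypothetical stand-in for a strict mode parser: exactly three octal
# digits, so a four-digit mode such as "1777" (sticky bit set) is
# rejected, analogous to the IllegalArgumentException: 1777 above.
def parse_mode_strict(mode):
    if not re.match(r"^[0-7]{3}$", mode):
        raise ValueError(mode)
    return int(mode, 8)

# Lenient variant: three or four octal digits; the optional leading
# digit carries the setuid (4), setgid (2), and sticky (1) bits.
def parse_mode_lenient(mode):
    if not re.match(r"^[0-7]{3,4}$", mode):
        raise ValueError(mode)
    return int(mode, 8)

parse_mode_strict("755")       # accepted: 0o755
parse_mode_lenient("1777")     # accepted: 0o1777, sticky bit preserved
# parse_mode_strict("1777")    # raises ValueError, mirroring the bug
```

The sketch only illustrates why a four-digit mode slips past a validator written for three digits; the actual fix would need to be made where fast-hdfs-resource constructs the permission object (Resource.setMode in the trace above).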

Since some of our users run IBM Spectrum Scale instead of HDFS, WebHDFS cannot 
be enabled, and so fast-hdfs-resource is used for all Hadoop file operations.

A JIRA was opened for this problem previously, and a patch was provided 
(AMBARI-11351). That JIRA was closed because the problem appeared to have gone 
away. In reality, the problem was still present, but it was masked by the use 
of WebHDFS.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)