-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43948/
-----------------------------------------------------------
Review request for Ambari, Alejandro Fernandez, Andrew Onischuk, Sumit Mohanty,
and Sid Wagle.
Bugs: AMBARI-15158
https://issues.apache.org/jira/browse/AMBARI-15158
Repository: ambari
Description
-------
If ATS is installed, then Resource Manager will check after starting whether
the directories where ATS will store timeline data for active and completed
applications exist in DFS. There might be cases when RM comes up well before
ATS has created these directories. In these situations RM will stop with an
"IOException: /ats/active does not exist" error message.
In order to avoid this situation, the Python script responsible for starting
the RM component has been modified to check the existence of these directories
upfront, before the RM process is started. This check is performed only if ATS
is installed and has either
yarn.timeline-service.entity-group-fs-store.active-dir or
yarn.timeline-service.entity-group-fs-store.done-dir set.
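For illustration only, the check described above could be sketched roughly as
follows. This is not the code from the diff: the helper names
(ats_dirs_to_check, wait_for_dirs), the dir_exists callback, and the
retry/delay defaults are hypothetical, and polling with retries is an
assumption about how the check might wait for ATS. Only the two property
names come from this description.

```python
import time

# Property names taken from the description above.
ACTIVE_DIR_KEY = "yarn.timeline-service.entity-group-fs-store.active-dir"
DONE_DIR_KEY = "yarn.timeline-service.entity-group-fs-store.done-dir"


def ats_dirs_to_check(yarn_site, ats_installed):
    """Return the DFS directories that should exist before RM is started.

    yarn_site is a plain dict of yarn-site properties (hypothetical input).
    Nothing is checked when ATS is not installed or neither property is set.
    """
    if not ats_installed:
        return []
    dirs = []
    for key in (ACTIVE_DIR_KEY, DONE_DIR_KEY):
        value = yarn_site.get(key)
        if value:
            dirs.append(value)
    return dirs


def wait_for_dirs(dirs, dir_exists, retries=8, delay_seconds=20,
                  sleep=time.sleep):
    """Poll until every directory exists; raise if any is still missing.

    dir_exists is a caller-supplied callable (hypothetical) that reports
    whether a single DFS path exists, for example via webhdfs or the
    hdfs command line.
    """
    missing = list(dirs)
    if not missing:
        return
    for attempt in range(retries):
        # Only re-check the directories that were missing last time.
        missing = [d for d in missing if not dir_exists(d)]
        if not missing:
            return
        if attempt < retries - 1:
            sleep(delay_seconds)
    raise RuntimeError("DFS directories still missing: " + ", ".join(missing))
```

The start script would call wait_for_dirs(ats_dirs_to_check(...), ...) before
launching the RM process, so a missing directory surfaces as a start failure
in Ambari rather than as an RM crash.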
Diffs
-----
ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/scripts/params_linux.py
2ef404d
ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/scripts/resourcemanager.py
ec7799e
Diff: https://reviews.apache.org/r/43948/diff/
Testing
-------
Manual testing:
1. Created secure/non-secure clusters with Blueprint where NN, RM and ATS were
deployed to different nodes. This was tested with both cases when HDFS has
webhdfs enabled and disabled.
2. Created a cluster using the UI where NN, RM and ATS were deployed to
different nodes. The cluster was then kerberized and tested with both
cases when HDFS has webhdfs enabled and disabled.
Python tests results:
----------------------------------------------------------------------
Total run:902
Total errors:0
Total failures:0
OK
Thanks,
Sebastian Toader