GitHub user mattf-horton opened a pull request:

    https://github.com/apache/incubator-metron/pull/425

    METRON-609 Enhance Mpack to handle single-node and small-cluster installs 
of Elasticsearch

    This PR is not ready for prime time, but is provided for ease of access to 
work-in-progress for:
    - METRON-609 Enhance Mpack to handle single-node and small-cluster installs 
of Elasticsearch, and 
    - METRON-634 Mpack bug fixes and improvements (not related to singlenode 
install).  
    
    These are presented as two separate commits, so you can look at them 
separately if you wish.
    
    These are the included enhancements from METRON-609:
    - Enable 1-, 2-, and 3-node clusters to have a working Elasticsearch 
install via the Mpack.
        - Change constraints from 1+ Masters and 3+ Slaves, to 1+ and 0+.
        - Allow non-dedicated master/datanodes via boolean 
"masters_also_are_datanodes".
        - Allow use of alternative single-node template via 
"single_node_elasticsearch" boolean.
        - Only the 1-node and 4-node clusters have been tested, and that was 
last month.
    - Improve various mouse-over Description fields in the GUI.
    - I included the attempted validation check that (Storm) num_slots = 
slots_per_supervisor * num_supervisors.  It doesn't currently work because of a 
pre-existing bug in other parts of the validation check, so I haven't been able 
to test it.
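
A minimal sketch of the slot check described above, as standalone Python. The function name and the strict-equality rule are assumptions for illustration; this is not the code in the PR, and it is not tied to the Ambari validation framework.

```python
def validate_storm_slots(num_slots, slots_per_supervisor, num_supervisors):
    """Return a list of problem descriptions; an empty list means the values agree.

    Hypothetical helper: assumes num_slots must equal
    slots_per_supervisor * num_supervisors exactly.
    """
    problems = []
    if slots_per_supervisor < 1 or num_supervisors < 1:
        problems.append(
            "slots_per_supervisor and num_supervisors must both be at least 1")
        return problems

    expected = slots_per_supervisor * num_supervisors
    if num_slots != expected:
        problems.append(
            "num_slots is %d but slots_per_supervisor * num_supervisors is %d"
            % (num_slots, expected))
    return problems
```

Returning a list of messages rather than raising keeps the check easy to fold into a larger validation pass that reports several problems at once.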
    
    These are the included enhancements and bug fixes from METRON-634:    
    NOT AFFECTING THE AMBARI DATABASE:
    - ES pid_dir specification and usage:
        - Currently pid_dir is specified in both elastic-env.xml and 
params.py. The config parameter should not be overridden in params.py.
        - PID_DIR failed to be included in /etc/sysconfig/elasticsearch. It 
needs to be added to the template in elastic-sysconfig, as it must be provided 
to ES at launch-time (else the default directory will be used).
        - pid_file is specified in params.py, but is not used anywhere. (The ES 
internal launcher synthesizes it from PID_DIR, and this is appropriate.)
    - JAVA_HOME needs to be provided in /etc/sysconfig/elasticsearch (templated 
in elastic-sysconfig.xml). Its absence causes CentOS 7 systemctl to fail the ES 
launch, unless /bin/java is defined (which is not guaranteed).
    - Also in the /etc/sysconfig/elasticsearch template in 
elastic-sysconfig.xml, the value of ES_JAVA_OPTS incorrectly spans 3 lines. The 
lines must be terminated with backslashes to effectively become a single line. 
The current inclusion of newlines in the long string value is acceptable 
(although unusual) in shellscript, but not in a systemd EnvironmentFile. 
/etc/sysconfig/elasticsearch must function as both.
    - Also in ES_JAVA_OPTS, the two instances of log_dir need to be followed 
by a slash '/'.
    - In elastic.py, when directories are being pre-created and permissions 
set, the directory $CONF_DIR/scripts should also be pre-created. I 
intermittently hit permissions issues with this directory being created later 
by root and not properly assigned to elastic_user.
    - In several places in elastic.py, "params.elastic_user" is incorrectly 
used when "params.user_group" should be used.
    - The undefined "format()" method is used unnecessarily in elastic.py, in 
File(format("/etc/sysconfig/elasticsearch")...
    - The undefined "format()" method is similarly used unnecessarily in 
several places in elastic_master.py.
    - Multiple improvements are suggested for the comments and descriptions in 
elastic-site.xml.
    - Provide Quick Links in Ambari service page for Elasticsearch to the 
self-report pages for ES health and ES node list. (very useful for debugging)
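
To illustrate the /etc/sysconfig/elasticsearch points above (PID_DIR and JAVA_HOME present, ES_JAVA_OPTS joined with trailing backslashes, log_dir followed by a slash), here is a rough standalone sketch. The function name, the "{log_dir}" placeholder, and the sample option strings are all hypothetical; the real template in elastic-sysconfig.xml may look quite different.

```python
def render_sysconfig(java_home, pid_dir, log_dir, java_opts):
    """Render a minimal sysconfig file as a string (illustrative only).

    java_opts is a list of option strings; any "{log_dir}" placeholder
    (a made-up convention for this sketch) is replaced by the log
    directory with exactly one trailing slash.
    """
    # Guarantee the trailing slash so "{log_dir}gc.log" becomes a valid path.
    log_prefix = log_dir.rstrip("/") + "/"
    opts = [opt.replace("{log_dir}", log_prefix) for opt in java_opts]

    # End every line but the last with a backslash so the value reads as a
    # single logical line, both when sourced by a shell and when loaded as
    # a systemd EnvironmentFile.
    joined = " \\\n".join(opts)

    lines = [
        "JAVA_HOME=" + java_home,
        "PID_DIR=" + pid_dir,
        'ES_JAVA_OPTS="' + joined + '"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_sysconfig(
        "/usr/jdk64/jdk",              # assumed path, for illustration
        "/var/run/elasticsearch",
        "/var/log/elasticsearch",
        ["-verbose:gc", "-Xloggc:{log_dir}gc.log"],
    ))
```

The trailing-backslash form is used because the same file has to serve as both a shell fragment and a systemd EnvironmentFile, as noted above.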
    
    CHANGES THAT DO AFFECT THE AMBARI DATABASE:
    - pid_dir SHOULD be specified in elastic-sysconfig.xml, rather than 
elastic-env.xml, as it is a parameter that must be provided to ES at 
launch-time, but is not something there's any reason for the admin to change in 
usual circumstances.
    - conf_dir SHOULD be specified in elastic-env.xml or elastic-site.xml, not 
in elastic-sysconfig.xml. While it too is a parameter that must be provided to 
ES at launch-time, it is typically left to the installing admin where to put 
the config files.
    - The Ambari configuration parameter names in elastic-site.xml should be 
improved in several instances to make the semantics more obvious to the human 
reader (who may not be very familiar with Elasticsearch configuration). 
Mouse-over documentation will continue to provide the ES config parameter 
equivalents. In particular, suggest:
        - cluster_name -> es_cluster_name  (to distinguish ES cluster from 
Stack cluster)
        - zen_discovery_ping_unicast_hosts -> es_cluster_hosts
        - network_host -> network_bindings  (these are in fact interface names, 
not host names)
    - There are at least two places in elasticsearch.master.yaml.j2 
(zen_discovery_ping_unicast_hosts and network_host) where the needed square 
brackets are either missing or have to be included in the configuration string. 
To be consistent with other usages, and less prone to human error, the square 
brackets should not be in the configuration string but rather should be 
provided in the template text.
    - In METRON/0.3.0/configuration/metron-env.xml and 
METRON/0.3.0/package/scripts/params/params_linux.py, the value 
"metron_apps_indexed_hdfs_dir" does not need to be settable by admin; it is 
appropriate to require it to be subordinate to "metron_apps_hdfs_dir". Thus it 
can be removed from metron-env.xml and set to 
"{metron_apps_hdfs_dir}/indexing/indexed" in params_linux.py. This also 
eliminates a really unacceptable use of "double format".
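
Two of the suggestions above can be sketched in a few lines of standalone Python. Both helper names are hypothetical, and tolerating stray brackets in the input is my own assumption rather than something the PR does; the intent is only to show brackets coming from the rendering side and the indexed directory being derived from its parent.

```python
def yaml_list(raw):
    """Render a comma-separated config string as a bracketed YAML list.

    Hypothetical helper: the admin types "host1, host2" and the square
    brackets are supplied at render time. Brackets already present in the
    input are stripped first, so the output never doubles them.
    """
    items = [item.strip() for item in raw.strip().strip("[]").split(",")]
    return "[ " + ", ".join(item for item in items if item) + " ]"


def indexed_hdfs_dir(metron_apps_hdfs_dir):
    """Derive the indexed directory as a fixed child of metron_apps_hdfs_dir."""
    return metron_apps_hdfs_dir.rstrip("/") + "/indexing/indexed"


if __name__ == "__main__":
    print(yaml_list("host1, host2"))
    print(indexed_hdfs_dir("/apps/metron"))
```

Deriving the path in one place means there is no separate admin-settable value to drift out of sync with the parent directory.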
    
    NOTE that these changes, because they affect the database, should properly 
be accompanied by a database update script and an increment of the Mpack 
version number.  This is not currently implemented.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mattf-horton/incubator-metron METRON-609

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-metron/pull/425.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #425
    
----
commit 0fd12a5bab2745e7c496657ef92b792b60faf2bf
Author: mattf-horton <mfo...@hortonworks.com>
Date:   2017-01-25T22:39:23Z

    METRON-609 Enhance Mpack to handle single-node and small-cluster installs 
of Elasticsearch.  Work in Progress, at request of David Lyle.

commit 1af5376d59fe4c1812bda519e9b960dc74fdb0d6
Author: mattf-horton <mfo...@hortonworks.com>
Date:   2017-01-26T07:41:04Z

    METRON-634 Mpack bug fixes and improvements (not related to singlenode 
install). Partial: all improvements from METRON-634 already proved out in 
METRON-608.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
