GitHub user anandsubbu commented on a diff in the pull request:

    https://github.com/apache/metron/pull/869#discussion_r159361598
  
    --- Diff: metron-deployment/README.md ---
    @@ -15,178 +15,134 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 
express or implied.
     See the License for the specific language governing permissions and
     limitations under the License.
     -->
    -# Overview
    -This set of playbooks can be used to deploy an Ambari-managed Hadoop 
cluster containing Metron services using Ansible. These playbooks target 
RHEL/CentOS 6.x operating
    -systems.
     
    -Installation consists of -
    -- Building Metron tarballs, RPMs and the Ambari MPack
    -- Deploying Ambari
    -- Leveraging Ambari to install:
    -  * The required Hadoop Components
    -  * Core Metron (Parsing, Enrichment, Indexing)
    -  * Elasticsearch
    -  * Kibana
    -- Starting All Services
    +This project contains tools for building, packaging, and deploying Apache 
Metron.  Please refer to the following sections for more information on how to 
get Apache Metron running in your environment.
     
    -## Prerequisites
    -The following tools are required to run these scripts:
    -
    -- [Maven](https://maven.apache.org/)
    -- [Git](https://git-scm.com/)
    -- [Ansible](http://www.ansible.com/) (2.0.0.2 or 2.2.2.0)
    -- [Docker](https://www.docker.com/) (Docker for Mac on OSX)
    -
    -These scripts depend on two files for configuration:
    -
    -- hosts - declares which Ansible roles will be run on which hosts
    -- group_vars/all - various configuration settings needed to install Metron
    -
    -For production use, it is recommended that Metron be installed on an 
existing cluster managed by Ambari as described in the Installing Management 
Pack section below.
    -## Ambari
    -The Ambari playbook will install a Hadoop cluster including the Metron 
Services (Parsing, Enrichment, Indexing). Ambari will also install 
Elasticsearch and Kibana.
    -
    -Currently, the playbooks supports building a local development cluster 
running on one node or deploying to a 10 node cluster on AWS EC2.
    -
    -## Vagrant
    -There is a development environment based on Vagrant that is referred to as 
"Full Dev".  This installs the entire Ambari/Metron stack. This is useful in 
testing out changes to the installation procedure.
    -
    -### Prerequsities
    -- Install [Vagrant](https://www.vagrantup.com/) (5.0.16+)
    -- Install the Hostmanager plugin for vagrant - Run `vagrant plugin install 
vagrant-hostmanager` on the machine where Vagrant is
    -installed
    -
    -### Full-Dev
    -Navigate to `metron/metron-deployment/vagrant/full-dev-platform` and run 
`vagrant up`.
    -
    -## Ambari Management Pack
    -An Ambari Management Pack can be built in order to make the Metron service 
available on top of an existing stack, rather than needing a direct stack 
update.
    -
    -This will set up
    -- Metron Parsers
    -- Enrichment
    -- Indexing
    -- GeoIP data
    -- Optional Elasticsearch
    -- Optional Kibana
    -
    -### Prerequisites
    -- A cluster managed by Ambari 2.4.2+
    -- Metron RPMs available on the cluster in the /localrepo directory.  See 
[RPMs](#rpms) for further information.
    -- [Node.js](https://nodejs.org/en/download/package-manager/) repository 
installed on the Management UI host
    -
    -### Building Management Pack
    -From `metron-deployment` run
    -```
    -mvn clean package
    -```
    -
    -A tar.gz that can be used with Ambari can be found at 
`metron-deployment/packaging/ambari/metron-mpack/target/`
    -
    -### Installing Management Pack
    -Before installing the mpack, update Storm's topology.classpath in Ambari 
to include '/etc/hbase/conf:/etc/hadoop/conf'. Restart Storm service.
    -
    -Place the mpack's tar.gz onto the node running Ambari Server. From the 
command line on this node, run
    -```
    -ambari-server install-mpack --mpack=<mpack_location> --verbose
    -```
    -
    -This will make the services available in Ambari in the same manner as any 
services in a stack, e.g. through Add Services or during cluster install.
    -The Indexing / Parsers/ Enrichment masters should be colocated with a 
Kafka Broker (to create topics) and HBase client (to create the enrichment and 
theatintel tables).
    -This colocation is currently not enforced by Ambari, and should be managed 
by either a Service or Stack advisor as an enhancement.
    -
    -Several configuration parameters will need to be filled in, and should be 
pretty self explanatory (primarily a couple of Elasticsearch configs, and the 
Storm REST URL).  Examples are provided in the descriptions on Ambari.
    -Notably, the URL for the GeoIP database that is preloaded (and is 
prefilled by default) can be set to use a `file:///` location
    -
    -After installation, a custom action is available in Ambari (where stop / 
start services are) to install Elasticsearch templates.  Similar to this, a 
custom Kibana action to Load Template is available.
    -
    -Another custom action is available in Ambari to import Zeppelin 
dashboards. See the [metron-indexing 
documentation](../metron-platform/metron-indexing)
    -
    -#### Offline installation
    -Currently there is only one point that would reach out to the internet 
during an install.  This is the URL for the GeoIP database information.
    -
    -The RPMs DO NOT reach out to the internet (because there is currently no 
hosting for them).  They look on the local filesystem in `/localrepo`.
    -
    -### Current Limitations
    -There are a set of limitations that should be addressed based to improve 
the current state of the mpacks.
    -
    -- There is currently no hosting for RPMs remotely.  They will have to be 
built locally.
    -- Colocation of appropriate services should be enforced by Ambari.  See 
[#Installing Management Pack] for more details.
    -- Storm's topology.classpath is not updated with the Metron service 
install and needs to be updated separately.
    -- Several configuration parameters used when installing the Metron service 
could (and should) be grabbed from Ambari.  Install will require them to be 
manually entered.
    -- Need to handle upgrading Metron
    -
    -## RPMs
    -RPMs can be built to install the components in metron-platform. These RPMs 
are built in a Docker container and placed into `target`.
    -
    -Components in the RPMs:
    -- metron-common
    -- metron-data-management
    -- metron-elasticsearch
    -- metron-enrichment
    -- metron-parsers
    -- metron-pcap
    -- metron-solr
    -- stellar-common
    + * [How do I deploy Metron with 
Ambari?](#how-do-i-deploy-metron-with-ambari)
    + * [How do I deploy Metron on a single 
VM?](#how-do-i-deploy-metron-on-a-single-vm)
    + * [How do I build RPM packages?](#how-do-i-build-rpm-packages)
    + * [How do I build DEB packages?](#how-do-i-build-deb-packages)
    + * [How do I deploy Metron within AWS?](#how-do-i-deploy-metron-within-aws)
    + * [How do I build Metron with Docker?](#how-do-i-build-metron-with-docker)
     
    -### Prerequisites
    -- Docker.  The image detailed in: 
`metron-deployment/packaging/docker/rpm-docker/README.md` will automatically be 
built (or rebuilt if necessary).
    -- Artifacts for metron-platform have been produced.  E.g. `mvn clean 
package -DskipTests` in `metron-platform`
     
    -The artifacts are required because there is a dependency on modules not 
expressed via Maven (we grab the resulting assemblies, but don't need the 
jars).  These are
    -- metron-common
    -- metron-data-management
    -- metron-elasticsearch
    -- metron-enrichment
    -- metron-indexing
    -- metron-parsers
    -- metron-pcap-backend
    -- metron-solr
    -- metron-profiler
    -- metron-config
    +How do I deploy Metron with Ambari?
    +-----------------------------------
     
    -### Building RPMs
    -```
    -cd metron-deployment
    -mvn clean package -Pbuild-rpms
    -```
    -
    -The output RPM files will land in `target/RPMS/noarch`.  They can be 
installed with the standard
    -```
    -rpm -i <package>
    -```
    +This provides a Management Pack (MPack) extension for [Apache 
Ambari](https://ambari.apache.org/) that simplifies the provisioning, 
management and monitoring of Metron on clusters of any size.  
     
    -## Kibana Dashboards
    -
    -The dashboards installed by the Kibana custom action are managed by the 
dashboard.p file.  This file is created by exporting existing dashboards from a 
running Kibana instance.
    -
    -To create a new version of the file, make any necessary changes to Kibana 
(e.g. on full-dev), and export with the appropriate script.
    -
    -```
    -python packaging/ambari/metron-mpack/src/main/resources/common-services/KIBANA/4.5.1/package/scripts/dashboard/dashboardindex.py \
    -$ES_HOST 9200 \
    -packaging/ambari/metron-mpack/src/main/resources/common-services/KIBANA/4.5.1/package/scripts/dashboard/dashboard.p -s
    -```
    +This allows you to easily install Metron using a simple, guided process.  
This also allows you to monitor cluster health and even secure your cluster 
with kerberos.
     
    -Build the Ambari Mpack to get the dashboard updated appropriately.
    +#### What is this good for?
     
    -Once the MPack is installed, run the Kibana service's action "Load 
Template" to install dashboards.  This will completely overwrite the .kibana in 
Elasticsearch, so use with caution.
    +* If you want to see how Metron can really scale by deploying it on your 
own hardware, or even in the cloud, this is the best option for you.
     
    -## Kerberos
    -The MPack can allow Metron to be installed and then Kerberized, or 
installed on top of an already Kerberized cluster.  This is done through 
Ambari's standard Kerberization setup.
    +* If you want to run a proof-of-concept to see how Apache Metron can 
benefit your organization, then this is the way to do it.
     
    -### Caveats
    -* For nodes using a Metron client and a local repo, the repo must exist on 
all nodes (e.g via createrepo). This repo can be empty; only the main Metron 
services need the RPMs.
    -* A Metron client must be installed on each supervisor node in a secured 
cluster.  This is to ensure that the Metron keytab and client_jaas.conf get 
distributed in order to allow reading and writing from Kafka.
    -  * When Metron is already installed on the cluster, this should be done 
before Kerberizing.
    -  * When addding Metron to an already Kerberized cluster, ensure that all 
supervisor nodes receive a Metron client.
    -* Storm (and Metron) must be restarted after Metron is installed on an 
already Kerberized cluster.  Several Storm configs get updated, and Metron will 
be unable to write to Kafka without a restart.
    -  * Kerberizing a cluster with an existing Metron already has restarts of 
all services during Kerberization, so it's unneeded.
    +#### How?
     
    -Instructions for setup on Full Dev can be found at 
[Kerberos-ambari-setup.md](Kerberos-ambari-setup.md).  These instructions 
reference the manual install instructions.
    +To deploy Apache Metron using Ambari, follow the instructions at 
[packaging/ambari/metron-mpack](packaging/ambari/metron-mpack).
     
    -### Kerberos Without an MPack
    -Using the MPack is preferred, but instructions for Kerberizing manually 
can be found at [Kerberos-manual-setup.md](Kerberos-manual-setup.md). These 
instructions are reference by the Ambari Kerberos install instructions and 
include commands for setting up a KDC.
     
    -## TODO
    -- Support Ubuntu deployments
    +How do I deploy Metron on a single VM?
    +--------------------------------------
    +
    +This will deploy Metron and all of its dependencies on a virtual machine 
running on your computer.  
    +
    +#### What is this good for?
    +
    +* If you are new to Metron and want to explore the functionality that it 
offers, this is good place to start.  
    +
    +* If you are a developer contributing to the Apache Metron project, this 
is also a great way to test your changes.  
    +
    +#### What is this **not** good for?
    +
    +* This VM is **not** intended for processing anything beyond the most 
basic, low volume work loads.
    +
    +* Additional services should **not** be installed along side Metron in 
this VM.
    +
    +* This VM should **not** be used to run a proof-of-concept for Apache 
Metron within your organization.
    +
    +Running Metron within the resource constraints of a single VM is 
incredibly challenging. Failing to respect this warning, will cause various 
services to fail mysteriously as the system runs into memory and processing 
limits. 
    +
    +#### How?
    +
    +To deploy Metron in a VM running on your computer, follow the instructions 
at [vagrant/full-dev-platform](vagrant/full-dev-platform)
    +
    +
    +How do I build RPM packages?
    +----------------------------
    +
    +This provides RPM packages that allow you to install Metron on an 
RPM-based operating system like CentOS.
    +
    +#### What is this good for?
    +
    +* If you want to manually install Apache Metron on an RPM-based system 
like CentOS, installation can be simplified by using these packages.  
    +
    +* If you want a guided installation process using Ambari on an RPM-based 
system, then these RPMs are a necessary prerequisite.
    +
    +#### What is this **not** good for?
    +
    +* If you want a complete, guided installation process, use Ambari rather 
than just these packages.  Installing Metron using **only** these RPMs still 
leaves a considerable amount of configuration necessary to get Metron running.  
Installing with Ambari automates these additional steps.
    +
    +#### How?
    +
    +To build the RPM packages, follow the instructions at 
[packaging/docker/rpm-docker](packaging/docker/rpm-docker).
    +
    +
    +How do I build DEB packages?
    +-------------------------------
    +
    +This builds installable DEB packages that allow you to install Metron on 
an APT-based operating system like Ubuntu.
    +
    +#### What is this good for?
    +
    +* If you want to manually install Metron on a APT-based system like 
Ubuntu, installation can be simplified by using these packages.
    +
    +* If you want a guided installation process using Ambari on an APT-based 
system, then these RPMs are a necessary prerequisite.
    --- End diff --
    
    Minor correction: I think this paragraph and the one below should read 
"DEBs" instead of "RPMs", since this section is about DEB packages.
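
    A sketch of the change for the paragraph visible in this excerpt (the README line is shown unwrapped here; the paragraph below it falls outside this excerpt, so it is not shown):

    ```diff
    -* If you want a guided installation process using Ambari on an APT-based system, then these RPMs are a necessary prerequisite.
    +* If you want a guided installation process using Ambari on an APT-based system, then these DEBs are a necessary prerequisite.
    ```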

