[GitHub] [metron] mmiklavc edited a comment on issue #1365: METRON-2050: Automatically populate a list of enrichments from HBase

GitBox Wed, 10 Apr 2019 13:40:23 -0700

mmiklavc edited a comment on issue #1365: METRON-2050: Automatically populate a 
list of enrichments from HBase
URL: https://github.com/apache/metron/pull/1365#issuecomment-481850604
 
 
   # Testing Plan
   
   We need to verify
   
   1. The enrichment coprocessor loads as expected
   1. Normal sensor data flow through the system, from parsers through to indexing, still functions as expected
   1. The new enrichment list table is populated when new enrichment types are 
added to the enrichment HBase table
   
   ## TOC
   
   * Setup Test Environment
   * Verify Basics
   * Flatfile loader
   * Streaming enrichment
   * Final check
   
   ## Setup Test Environment
   
   1. Build full dev: from `metron/metron-deployment/development/centos6`, run `vagrant up`
   1. Log in to full dev with `ssh root@node1`, password "vagrant"
   1. Set some environment variables
       ```
       # the root and metron users will need to do this - add to the user's ~/.bashrc or source each time you switch to the user
       source /etc/default/metron
       ```
   
   ## Verify Basics
   
   ### HBase enrichment table setup with coprocessor
   
   1. Run the following command from the CLI - you should see the coprocessor 
in the table attributes. Ambari should set this up as part of the MPack 
installation.
       ```
       # echo "describe 'enrichment'" | hbase shell
       HBase Shell; enter 'help<RETURN>' for list of supported commands.
       Type "exit<RETURN>" to leave the HBase Shell
       Version 1.1.2.2.6.5.1050-37, r897822d4dd5956ca186974c10382e9094683fa29, Tue Dec 11 02:04:10 UTC 2018
   
       describe 'enrichment'
       Table enrichment is ENABLED
       enrichment, {TABLE_ATTRIBUTES => {METADATA => {'Coprocessor$1' => 'hdfs://node1:8020/apps/metron/coprocessor/metron-hbase-server-0.7.1-uber.jar|org.apache.metron.hbase.coprocessor.EnrichmentCoprocessor||zookeeperUrl=node1:2181'}
       }
       COLUMN FAMILIES DESCRIPTION
       {NAME => 't', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
       1 row(s) in 0.3790 seconds 
       ```
   
   1. Ambari should provide 4 new options for configuring the enrichment list
       1. Enrichment List HBase Column Family
       1. Enrichment List HBase Coprocessor Implementation
       1. Enrichment List HBase Table Provider Implementation
       1. Enrichment List HBase Table
   
![image](https://user-images.githubusercontent.com/658443/55911620-3d205400-5b9e-11e9-8e73-e6aeeea4334a.png)
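
The coprocessor check in step 1 can also be scripted rather than read by eye. The helper below is a minimal sketch, not part of the PR: the function name `has_enrichment_coprocessor` is made up for illustration, and it assumes `hbase shell` prints the coprocessor class name unbroken on a single line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads the output of "describe 'enrichment'" on stdin and
# succeeds (exit 0) only if the EnrichmentCoprocessor class name appears in it.
has_enrichment_coprocessor() {
  # -F treats the class name as a fixed string, so the dots are literal.
  grep -qF 'org.apache.metron.hbase.coprocessor.EnrichmentCoprocessor'
}

# Usage on full dev (requires the hbase CLI):
#   echo "describe 'enrichment'" | hbase shell | has_enrichment_coprocessor \
#     && echo "coprocessor attached" || echo "coprocessor missing"
```

A non-zero exit status means the class name was not found in the table description.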
   
   ### Pipeline still processes to indexing
   
   Verify data is flowing through the system, from parsing to indexing
   
   1. Open Ambari and navigate to the Metron service: http://node1:8080/#/main/services/METRON/summary
   2. Open the Alerts UI
   
![image](https://user-images.githubusercontent.com/658443/55191493-f119ec00-5167-11e9-8444-be77308ccf24.png)
   3. Verify alerts show up in the main UI - click the search icon (you may need to wait a moment for them to appear)
   
![image](https://user-images.githubusercontent.com/658443/55191611-3dfdc280-5168-11e9-90ac-dc949f458b7f.png)
   4. Head back to Ambari and select the Kibana service: http://node1:8080/#/main/services/KIBANA/summary
   5. Open the Kibana dashboard via the "Metron UI" option in the quick links
   
![image](https://user-images.githubusercontent.com/658443/55191670-67b6e980-5168-11e9-9edd-4d346ed90da8.png)
   6. Verify the dashboard is populating
   
![image](https://user-images.githubusercontent.com/658443/55191751-99c84b80-5168-11e9-82eb-d95ce1414478.png)
   
   ## Flatfile loader
   
   ### Preliminaries
   
   1. Before we start adding enrichments, let's verify the enrichment_list 
table is empty
   1. Go to Swagger
   
![image](https://user-images.githubusercontent.com/658443/55909130-78b81f80-5b98-11e9-8666-6cb52ae22a12.png)
   1. Click the `sensor-enrichment-config-controller` option.
   
![image](https://user-images.githubusercontent.com/658443/55909179-94232a80-5b98-11e9-898f-dd6cbfcb20fc.png)
   1. Click the `GET 
/api/v1/sensor/enrichment/config/list/available/enrichments` option.
   1. And finally click the "Try it out!" button. You should see an empty array 
returned in the response body.
   
![image](https://user-images.githubusercontent.com/658443/55909333-f2e8a400-5b98-11e9-85d2-a2d496d4dff8.png)
   1. Now, let's perform an enrichment load. We'll do this as the metron user
       ```
       su - metron
       source /etc/default/metron
       ```
   
   1. Download the Alexa top 1M dataset:
       ```
       wget http://s3.amazonaws.com/alexa-static/top-1m.csv.zip
       unzip top-1m.csv.zip
       ```
   1. Stage import file
       ```
       head -n 10000 top-1m.csv > top-10k.csv
       # plop it on HDFS
       hdfs dfs -put top-10k.csv /tmp
       ```
   1. Create an extractor.json for the CSV data by editing `extractor.json` and 
pasting in these contents:
       ```
       {
         "config": {
           "columns": {
             "domain": 1,
             "rank": 0
           },
           "indicator_column": "domain",
           "separator": ",",
           "type": "alexa"
         },
         "extractor": "CSV"
       }
       ```
   
   The `extractor.json` file is used by `flatfile_loader.sh` in the next step.
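
Before loading, it can help to confirm that the staged CSV matches the layout the extractor expects: two comma-separated fields, rank in column 0 and domain in column 1. The following is a rough sketch; `count_malformed_rows` is a hypothetical helper name, and it only checks field count and emptiness, not whether the values are sensible.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints how many lines on stdin do NOT have exactly two
# non-empty comma-separated fields (rank, domain), per the extractor.json above.
count_malformed_rows() {
  awk -F',' 'NF != 2 || $1 == "" || $2 == "" { bad++ } END { print bad + 0 }'
}

# Usage:
#   count_malformed_rows < top-10k.csv
```

A result of 0 suggests every row fits the `columns` mapping in the extractor config.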
   
   ### Import from HDFS via MR
   
   ```
   # truncate the enrichment table in HBase
   echo "truncate 'enrichment'" | hbase shell
   # import data into hbase
   $METRON_HOME/bin/flatfile_loader.sh -i /tmp/top-10k.csv -t enrichment -c t -e ./extractor.json -m MR
   # count data written and verify it's 10k
   echo "count 'enrichment'" | hbase shell
   ```
   
   You should see a count of 10,000 rows in the enrichment table. We'll add one more enrichment type before checking our enrichment list.
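
To turn the count check into a pass/fail test, the row count can be pulled out of the shell output. This is a sketch under one assumption: that the `count` command ends with a summary line of the form `N row(s) in X seconds`, the same shape as the `1 row(s) in 0.3790 seconds` line shown earlier. The name `extract_row_count` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads hbase shell output on stdin and prints the number
# from the last "N row(s) in ..." summary line, or nothing if no such line exists.
extract_row_count() {
  sed -n 's/^[[:space:]]*\([0-9][0-9]*\) row(s) in .*/\1/p' | tail -n 1
}

# Usage on full dev (requires the hbase CLI):
#   rows="$(echo "count 'enrichment'" | hbase shell | extract_row_count)"
#   [ "$rows" = "10000" ] && echo "OK" || echo "unexpected count: $rows"
```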
   
   ## Streaming Enrichment
   
   1. Switch back to root if you're still the metron user.
       ```
       [metron@node1 ~]$ exit
       ```
   
   1. Pull down latest config from Zookeeper
       ```
       $METRON_HOME/bin/zk_load_configs.sh -m PULL -o ${METRON_HOME}/config/zookeeper -z $ZOOKEEPER -f
       ```
   
   1. Create a file named `user.json` in the parser directory.
       ```
       touch ${METRON_HOME}/config/zookeeper/parsers/user.json
       ```
   
   1. Enter these contents:
       ```
       {
         "parserClassName" : "org.apache.metron.parsers.csv.CSVParser" ,
         "writerClassName" : "org.apache.metron.writer.hbase.SimpleHbaseEnrichmentWriter",
         "sensorTopic":"user",
         "parserConfig": {
           "shew.table" : "enrichment",
           "shew.cf" : "t",
           "shew.keyColumns" : "ip",
           "shew.enrichmentType" : "user",
           "columns" : {
             "user" : 0,
             "ip" : 1
           }
         }
       }
       ```
   
   1. Push the changes back up to Zookeeper
       ```
       $METRON_HOME/bin/zk_load_configs.sh -m PUSH -i $METRON_HOME/config/zookeeper/ -z $ZOOKEEPER
       ```
   
   1. Create the user Kafka topic
       ```
       ${HDP_HOME}/kafka-broker/bin/kafka-topics.sh --create --zookeeper $ZOOKEEPER --replication-factor 1 --partitions 1 --topic user
       ```
   
   1. Start the topology
       ```
       ${METRON_HOME}/bin/start_parser_topology.sh -s user -z $ZOOKEEPER
       ```
   
   1. Create a simple file named `user.csv` that maps a user to an IP, e.g.
       ```
       echo "mmiklavcic,192.168.138.158" > user.csv
       ```
   
   1. Push the data to Kafka
       ```
       tail user.csv | ${HDP_HOME}/kafka-broker/bin/kafka-console-producer.sh --broker-list $BROKERLIST --topic user
       ```
   
   1. Verify data makes it to the enrichment table.
       ```
       echo "count 'enrichment'" | hbase shell
       ```
   
       There should be 10,001 records now.
   
   ## Final check
   
   1. Check the Swagger UI again with our earlier steps. You should now see an 
"alexa" and a "user" enrichment type returned in the enrichment list results
   
![image](https://user-images.githubusercontent.com/658443/55911150-42c96a00-5b9d-11e9-80e7-a6c5c9f01dba.png)
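
The same check can be run from the command line against the REST endpoint used in the Swagger steps. The sketch below assumes the endpoint returns a JSON array of quoted type names (an empty array before any load). `has_enrichment_type` is a hypothetical helper, and `METRON_REST_URL`, `REST_USER`, and `REST_PASS` are placeholder variables you would need to set for your environment; none of them come from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads the enrichment list response on stdin and succeeds
# (exit 0) only if the given type name appears as a quoted JSON string.
has_enrichment_type() {
  grep -qF "\"$1\""
}

# Usage (METRON_REST_URL, REST_USER and REST_PASS are placeholders, not from the PR):
#   curl -s -u "$REST_USER:$REST_PASS" \
#     "$METRON_REST_URL/api/v1/sensor/enrichment/config/list/available/enrichments" \
#     | has_enrichment_type alexa && echo "alexa present"
```

This is a plain substring match on the quoted name, which is enough for a smoke test but is not a real JSON parse.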

