mmiklavc edited a comment on issue #1365: METRON-2050: Automatically populate a list of enrichments from HBase
URL: https://github.com/apache/metron/pull/1365#issuecomment-481850604

# Testing Plan

We need to verify

1. The enrichment coprocessor loads as expected
1. Normal sensor data flow through the system, from parsers through to indexing, still works
1. The new enrichment list table is populated when new enrichment types are added to the enrichment HBase table

## TOC

* Setup Test Environment
* Verify Basics
* Flatfile loader
* Streaming enrichment
* Final check

## Setup Test Environment

1. Build full dev `metron/metron-deployment/development/centos6$ vagrant up`
1. Login to full dev `ssh root@node1`, password "vagrant"
1. Set some environment variables
    ```
    # the root and metron users will need to do this - add to the user's ~/.bashrc or source each time you switch to the user
    source /etc/default/metron
    ```

## Verify Basics

### HBase enrichment table setup with coprocessor

1. Run the following command from the CLI - you should see the coprocessor in the table attributes. Ambari should set this up as part of the MPack installation.
    ```
    # echo "describe 'enrichment'" | hbase shell
    HBase Shell; enter 'help<RETURN>' for list of supported commands.
    Type "exit<RETURN>" to leave the HBase Shell
    Version 1.1.2.2.6.5.1050-37, r897822d4dd5956ca186974c10382e9094683fa29, Tue Dec 11 02:04:10 UTC 2018

    describe 'enrichment'
    Table enrichment is ENABLED
    enrichment, {TABLE_ATTRIBUTES => {METADATA => {'Coprocessor$1' => 'hdfs://node1:8020/apps/metron/coprocessor/metron-hbase-server-0.7.1-uber.jar|org.apache.metron.hbase.coprocessor.EnrichmentCoprocessor||zookeeperUrl=node1:2181'} }
    COLUMN FAMILIES DESCRIPTION
    {NAME => 't', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
    1 row(s) in 0.3790 seconds
    ```
1. Ambari should provide 4 new options for configuring the enrichment list
    1. Enrichment List HBase Column Family
    1. Enrichment List HBase Coprocessor Implementation
    1. Enrichment List HBase Table Provider Implementation
    1. Enrichment List HBase Table

    ![image](https://user-images.githubusercontent.com/658443/55911620-3d205400-5b9e-11e9-8e73-e6aeeea4334a.png)

### Pipeline still processes to indexing

Verify data is flowing through the system, from parsing to indexing

1. Open Ambari and navigate to the Metron service http://node1:8080/#/main/services/METRON/summary
2. Open the Alerts UI
3. ![image](https://user-images.githubusercontent.com/658443/55191493-f119ec00-5167-11e9-8444-be77308ccf24.png)
4. Verify alerts show up in the main UI - click the search icon (you may need to wait a moment for them to appear) ![image](https://user-images.githubusercontent.com/658443/55191611-3dfdc280-5168-11e9-90ac-dc949f458b7f.png)
5. Head back to Ambari and select the Kibana service http://node1:8080/#/main/services/KIBANA/summary
6. Open the Kibana dashboard via the "Metron UI" option in the quick links
7. ![image](https://user-images.githubusercontent.com/658443/55191670-67b6e980-5168-11e9-9edd-4d346ed90da8.png)
8. Verify the dashboard is populating
9. ![image](https://user-images.githubusercontent.com/658443/55191751-99c84b80-5168-11e9-82eb-d95ce1414478.png)

## Flatfile loader

### Preliminaries

1. Before we start adding enrichments, let's verify the enrichment_list table is empty
    1. Go to Swagger ![image](https://user-images.githubusercontent.com/658443/55909130-78b81f80-5b98-11e9-8666-6cb52ae22a12.png)
    1. Click the `sensor-enrichment-config-controller` option. ![image](https://user-images.githubusercontent.com/658443/55909179-94232a80-5b98-11e9-898f-dd6cbfcb20fc.png)
    1. Click the `GET /api/v1/sensor/enrichment/config/list/available/enrichments` option.
    1. And finally click the "Try it out!" button. You should see an empty array returned in the response body. ![image](https://user-images.githubusercontent.com/658443/55909333-f2e8a400-5b98-11e9-85d2-a2d496d4dff8.png)
1. Now, let's perform an enrichment load. We'll do this as the metron user
    ```
    su - metron
    source /etc/default/metron
    ```
1. Download the alexa 1m dataset:
    ```
    wget http://s3.amazonaws.com/alexa-static/top-1m.csv.zip
    unzip top-1m.csv.zip
    ```
1. Stage import file
    ```
    head -n 10000 top-1m.csv > top-10k.csv
    # plop it on HDFS
    hdfs dfs -put top-10k.csv /tmp
    ```
1. Create an extractor.json for the CSV data by editing `extractor.json` and pasting in these contents:
    ```
    {
      "config": {
        "columns": {
          "domain": 1,
          "rank": 0
        },
        "indicator_column": "domain",
        "separator": ",",
        "type": "alexa"
      },
      "extractor": "CSV"
    }
    ```
    The extractor.json will be used by flatfile_loader.sh in the next step

### Import from HDFS via MR

```
# truncate hbase
echo "truncate 'enrichment'" | hbase shell
# import data into hbase
$METRON_HOME/bin/flatfile_loader.sh -i /tmp/top-10k.csv -t enrichment -c t -e ./extractor.json -m MR
# count data written and verify it's 10k
echo "count 'enrichment'" | hbase shell
```

You should see a 10k count in the enrichment table. We'll add one more enrichment type before checking the enrichment list.
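If the count comes back short, a local check of the staged file can help rule out a mismatch with the extractor config before re-running the MR job. The sketch below assumes the layout described in extractor.json (column 0 is a numeric rank, column 1 is the domain, comma separator). `check_alexa_csv` is a hypothetical helper written for this test plan, not part of Metron.

```shell
# Hypothetical helper (not part of Metron): verify that a staged CSV matches
# the layout assumed by extractor.json and has the expected number of rows.
#   $1 = path to the local CSV file
#   $2 = expected row count
check_alexa_csv() {
  local file="$1"
  local expected_rows="$2"
  awk -F',' -v want="$expected_rows" '
    # A row is bad unless it has exactly two fields: a numeric rank and a non-empty domain.
    NF != 2 || $1 !~ /^[0-9]+$/ || $2 == "" { bad++ }
    END {
      if (bad > 0 || NR != want) {
        printf "FAIL rows=%d bad=%d\n", NR, bad + 0
        exit 1
      }
      printf "OK rows=%d\n", NR
    }' "$file"
}
```

For the file staged above, the call would be `check_alexa_csv top-10k.csv 10000`, run from the directory where `top-10k.csv` was created.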
## Streaming Enrichment

1. Switch back to root if you're still the metron user.
    ```
    [metron@node1 ~]$ exit
    ```
1. Pull down the latest config from Zookeeper
    ```
    $METRON_HOME/bin/zk_load_configs.sh -m PULL -o ${METRON_HOME}/config/zookeeper -z $ZOOKEEPER -f
    ```
1. Create a file named `user.json` in the parser directory.
    ```
    touch ${METRON_HOME}/config/zookeeper/parsers/user.json
    ```
1. Enter these contents:
    ```
    {
      "parserClassName" : "org.apache.metron.parsers.csv.CSVParser",
      "writerClassName" : "org.apache.metron.writer.hbase.SimpleHbaseEnrichmentWriter",
      "sensorTopic" : "user",
      "parserConfig" : {
        "shew.table" : "enrichment",
        "shew.cf" : "t",
        "shew.keyColumns" : "ip",
        "shew.enrichmentType" : "user",
        "columns" : {
          "user" : 0,
          "ip" : 1
        }
      }
    }
    ```
1. Push the changes back up to Zookeeper
    ```
    $METRON_HOME/bin/zk_load_configs.sh -m PUSH -i $METRON_HOME/config/zookeeper/ -z $ZOOKEEPER
    ```
1. Create the user Kafka topic
    ```
    ${HDP_HOME}/kafka-broker/bin/kafka-topics.sh --create --zookeeper $ZOOKEEPER --replication-factor 1 --partitions 1 --topic user
    ```
1. Start the topology
    ```
    ${METRON_HOME}/bin/start_parser_topology.sh -s user -z $ZOOKEEPER
    ```
1. Create a simple file named `user.csv` that maps a user to an IP, e.g.
    ```
    echo "mmiklavcic,192.168.138.158" > user.csv
    ```
1. Push the data to Kafka
    ```
    tail user.csv | ${HDP_HOME}/kafka-broker/bin/kafka-console-producer.sh --broker-list $BROKERLIST --topic user
    ```
1. Verify data makes it to the enrichment table.
    ```
    echo "count 'enrichment'" | hbase shell
    ```
    There should be 10,001 records now.

### Final check

1. Check the Swagger UI again with our earlier steps. You should now see an "alexa" and a "user" enrichment type returned in the enrichment list results ![image](https://user-images.githubusercontent.com/658443/55911150-42c96a00-5b9d-11e9-80e7-a6c5c9f01dba.png)
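The final check can also be scripted instead of clicked through in Swagger. The sketch below is a minimal example, assuming the endpoint returns a JSON array of type names as shown in the Swagger response body. `has_enrichment_types` is a hypothetical helper, and the REST host, port, and credentials in the usage comment are assumptions that will depend on your environment.

```shell
# Hypothetical helper (not part of Metron): read a JSON array of enrichment
# type names from stdin and succeed only if every expected type is present.
has_enrichment_types() {
  local body
  local missing=0
  local t
  body="$(cat)"
  for t in "$@"; do
    # Match the type as a quoted JSON string, e.g. "alexa".
    if ! printf '%s' "$body" | grep -qF "\"$t\""; then
      echo "missing: $t"
      missing=1
    fi
  done
  [ "$missing" -eq 0 ] && echo "all present"
  return "$missing"
}

# Usage against the REST endpoint. REST_HOST, REST_PORT, REST_USER, and
# REST_PASS are placeholders (assumptions), not values from this test plan:
# curl -s -u "$REST_USER:$REST_PASS" \
#   "http://$REST_HOST:$REST_PORT/api/v1/sensor/enrichment/config/list/available/enrichments" \
#   | has_enrichment_types alexa user
```

The helper only does a substring match on quoted names, which is enough for a smoke test but is not a real JSON parser.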