GitHub user cestella commented on the pull request:

    https://github.com/apache/incubator-metron/pull/127#issuecomment-222364875
  
    In order to validate this, you can do the following:
    * Configure a new parser. In this example I'll call it a `user` parser and we'll parse some CSV data to map `username` to `ip` by creating a file `/usr/metron/0.1BETA/config/zookeeper/enrichment/user.json` with:
    
    ```
    {
      "parserClassName" : "org.apache.metron.parsers.csv.CSVParser",
      "writerClassName" : "org.apache.metron.writer.hbase.SimpleHbaseEnrichmentWriter",
      "sensorTopic" : "user",
      "parserConfig" : {
        "shew.table" : "enrichment",
        "shew.cf" : "t",
        "shew.keyColumns" : "user",
        "shew.enrichmentType" : "user",
        "columns" : {
          "user" : 0,
          "ip" : 1
        }
      }
    }
    ```
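    It's worth a quick syntax check before pushing (any JSON validator works; `python -m json.tool` ships with Python):
    ```
    python -m json.tool < /usr/metron/0.1BETA/config/zookeeper/enrichment/user.json
    ```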
    * Add a new `user` enrichment type to `bro` data by adding `ip_src_addr` to `hbaseEnrichment` and associating `user` as a field type for `ip_src_addr` in `/usr/metron/0.1BETA/config/zookeeper/enrichment/bro.json`, like so:
    ```
    {
      "index": "bro",
      "batchSize": 5,
      "enrichment": {
        "fieldMap": {
          "geo": [
            "ip_dst_addr",
            "ip_src_addr"
          ],
          "host": [
            "host"
          ],
          "hbaseEnrichment": [
            "ip_src_addr"
          ]
        },
        "fieldToTypeMap": {
          "ip_src_addr": [ "user" ]
        }
      },
      "threatIntel": {
        "fieldMap": {
          "hbaseThreatIntel": [
            "ip_dst_addr",
            "ip_src_addr"
          ]
        },
        "fieldToTypeMap": {
          "ip_dst_addr": [ "malicious_ip" ],
          "ip_src_addr": [ "malicious_ip" ]
        }
      }
    }
    ```
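    As before, a quick check that the edited file still parses and carries the new mapping (a one-liner that works under Python 2 or 3):
    ```
    python -c 'import json; c = json.load(open("/usr/metron/0.1BETA/config/zookeeper/enrichment/bro.json")); print(c["enrichment"]["fieldToTypeMap"])'
    ```
    This should print the `ip_src_addr` to `user` mapping.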
    * Create the Kafka topic as in the tutorials, for example:
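    A sketch, assuming the same HDP Kafka layout as the producer command below (tune partitions/replication for your cluster):
    ```
    /usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper node1:2181 \
      --create --topic user --partitions 1 --replication-factor 1
    ```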
    * Using `/usr/metron/0.1BETA/bin/zk_load_configs.sh`, push up the configs you just created: `/usr/metron/0.1BETA/bin/zk_load_configs.sh -m PUSH -z node1:2181 -i /usr/metron/0.1BETA/config/zookeeper`
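    If your copy of `zk_load_configs.sh` supports a `DUMP` mode, you can confirm the configs landed in ZooKeeper:
    ```
    /usr/metron/0.1BETA/bin/zk_load_configs.sh -m DUMP -z node1:2181
    ```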
    * Create some reference CSV data that looks like `jsirota,192.168.168.1` in a file named `user.csv`.
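    For example:
    ```
    echo "jsirota,192.168.168.1" > user.csv
    ```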
    * Use the Kafka console producer to push the data into the `user` topic via `cat user.csv | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic user`
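    Note that for those records to make it from Kafka into HBase, the `user` parser topology has to be running. The launcher name and flags vary by build; in the trees I've seen it looks something like:
    ```
    /usr/metron/0.1BETA/bin/start_parser_topology.sh -z node1:2181 -k node1:6667 -s user
    ```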
    * You should be able to check that the data gets into HBase by doing a `scan 'enrichment'` from the `hbase shell`.
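    Non-interactively, that is just:
    ```
    echo "scan 'enrichment'" | hbase shell
    ```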
    * You should also be able to check, after new data has been run through, that the data is enriched in Elasticsearch. I would suggest bouncing the enrichment topology to ensure that stale data in the caches gets flushed, but that is not strictly necessary.
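    A quick spot-check, assuming Elasticsearch on its default port on `node1` and the `bro` index pattern from the config above (adjust host, port, and query value to your setup):
    ```
    curl -s 'http://node1:9200/bro*/_search?q=ip_src_addr:192.168.168.1&pretty'
    ```
    Matching documents should carry the enrichment data derived from the `user` type.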

