Multiple Kafka brokers will help a lot. The wizard allows you to add more by 
using the plus symbol next to Kafka on the master selection screen. After the 
fact, you can add more with the add service button on the hosts screen in Ambari.

When adding brokers, don't forget to also alter your topics to have more 
partitions so they can make use of those brokers. Out of the box, the default 
is a pretty useless 1. You should have at least as many partitions as you have 
disk spindles for Kafka.
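
As a rough sketch, adding partitions to an existing topic looks something like 
this (assuming the HDP install path, ZooKeeper on node1:2181, and a topic named 
yaf; adjust all three to your setup):

/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper node1:2181 \
  --alter --topic yaf --partitions 6

Bear in mind Kafka only lets you increase a topic's partition count, never 
decrease it.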

For pulling data from remote sites into Metron, I would suggest something like 
Apache NiFi, using NiFi site-to-site to a NiFi collocated with your Metron 
cluster. That NiFi would then just write to Kafka. So you can think of NiFi as 
being a bit like an agent or a forwarder.
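
As a very rough sketch of the central NiFi end (PublishKafka is a standard NiFi 
processor; the broker address and topic name here are just assumptions to 
adapt):

PublishKafka processor properties:
  Kafka Brokers:      node1:6667
  Topic Name:         bro
  Delivery Guarantee: Guarantee Replicated Delivery

The remote site then runs its own NiFi with a Remote Process Group pointing at 
the site-to-site port of this central instance.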

Good luck!

Simon 

Sent from my iPhone

> On 7 Sep 2017, at 04:01, Frank Horsfall <frankhorsf...@cunet.carleton.ca> 
> wrote:
> 
> I'm on a roll with questions.
> 
> I'm curious to see if I can relieve processing pressure by adding a new VM. 
> 
> Would you know how I would go about it?
> 
> Also,
> I would like to pull data from sources instead of having the sources push data 
> to my site. Have you come across this scenario?
> 
> F
> 
> Sent from my Bell Samsung device over Canada's largest network.
> 
> 
> -------- Original message --------
> From: Frank Horsfall <frankhorsf...@cunet.carleton.ca>
> Date: 2017-09-06 10:51 PM (GMT-05:00)
> To: user@metron.apache.org
> Subject: Re: Clearing of data to start over
> 
> Also 
> 
> Laurens, you recommended creating 3 Kafka brokers, but the install wizard 
> would not let me. 
> 
> As a result, my node1 is currently the only broker. Would this cause a 
> bottleneck?
> 
> If so, is there a method to install and configure the 2 additional brokers 
> after the initial install?
> 
> kindest regards 
> 
> Frank 
> 
> Sent from my Bell Samsung device over Canada's largest network.
> 
> 
> -------- Original message --------
> From: Frank Horsfall <frankhorsf...@cunet.carleton.ca>
> Date: 2017-09-06 10:38 PM (GMT-05:00)
> To: user@metron.apache.org
> Subject: Re: Clearing of data to start over
> 
> Thanks Laurens and Nick.
> 
> I want to let the queues run overnight to give us some possible insights 
> into heap sizes, etc.
> 
> I currently have 3 VMs configured, each with 8 cores, 500 GB of data 
> capacity, and 30 GB of memory.
> 
> Elasticsearch has been configured with a 10 GB Xmx.
> 
> I've set the Storm worker childopts to 7 GB for now, so it takes a while to 
> max out and generate heap errors.
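> 
> (For reference, in Ambari this lives under the Storm configs as 
> worker.childopts; the relevant bit is the heap flag, something like -Xmx7168m, 
> with the exact value still to be tuned.)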
> 
> I deleted approximately 6 million events and shut off the data-generating apps.
> 
> The idea is to see how much will be processed overnight.
> 
> One thing that has me puzzled is why my Bro app isn't emitting events. I 
> double-checked my config against what's recommended, but nothing is coming 
> through. A mystery. lol
> 
> 
> Also, I kept some notes during the whole process and want to share them if 
> you are interested. Let me know.
> 
> Frank
> 
> Sent from my Bell Samsung device over Canada's largest network.
> 
> 
> -------- Original message --------
> From: Laurens Vets <laur...@daemon.be>
> Date: 2017-09-06 6:17 PM (GMT-05:00)
> To: user@metron.apache.org
> Cc: Frank Horsfall <frankhorsf...@cunet.carleton.ca>
> Subject: Re: Clearing of data to start over
> 
> Hi Frank,
> 
> If all your queues (Kafka/Storm) are empty, the following should work:
> 
> - Deleting your Elasticsearch indices: curl -X DELETE 
> 'http://localhost:9200/snort_index_*', curl -X DELETE 
> 'http://localhost:9200/yaf_index_*', etc.
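> 
> (To check which indices exist before deleting, curl 
> 'http://localhost:9200/_cat/indices?v' should list them all; this assumes 
> Elasticsearch is listening locally on the default port 9200.)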
> 
> - Deleting your Hadoop data:
> 
> Become the hdfs user: sudo su - hdfs
> Show what's been indexed in Hadoop: hdfs dfs -ls 
> /apps/metron/indexing/indexed/ 
> The output will probably show the following:
> /apps/metron/indexing/indexed/error
> /apps/metron/indexing/indexed/snort
> /apps/metron/indexing/indexed/yaf
> ...
> 
> You can remove these with:
> hdfs dfs -rmr -skipTrash /apps/metron/indexing/indexed/error/
> hdfs dfs -rmr -skipTrash /apps/metron/indexing/indexed/snort/
> 
> Or individual files with:
> 
> hdfs dfs -rmr -skipTrash /apps/metron/indexing/indexed/error/FILENAME
> 
> 
>> On 2017-09-06 13:59, Frank Horsfall wrote:
>> 
>> Hello all,
>> 
>> I have installed a 3-node system using the bare-metal CentOS 7 guideline.
>> 
>> https://cwiki.apache.org/confluence/display/METRON/Metron+0.4.0+with+HDP+2.5+bare-metal+install+on+Centos+7+with+MariaDB+for+Metron+REST
>> 
>> It has taken me a while to get all components working properly, and I left 
>> the yaf, bro, and snort apps running, so quite a lot of data has been 
>> generated. Currently, I have almost 18 million events identified in Kibana. 
>> 16+ million are yaf-based, and 2+ million are snort; 190 events are my new 
>> squid telemetry. It looks like it still has a while to go before it catches 
>> up to the current day. I recently shut down the apps.
>> 
>> My questions are:
>> 
>> 1. Is there a way to wipe all my data and indices clean so that I may 
>> now begin with a fresh dataset?
>> 
>> 2. Is there a way to configure yaf so that its data is meaningful? It 
>> is currently creating what looks to be test data.
>> 
>> 3. I have commented out the test snort rule, but it is still 
>> generating the odd record, which once again looks like test data. Can 
>> this be stopped as well?
>> 
>> Kindest regards,
>> 
>> Frank
> 
