Excellent, thanks.


Sent from my Bell Samsung device over Canada's largest network.


-------- Original message --------
From: Simon Elliston Ball <si...@simonellistonball.com>
Date: 2017-09-06 11:08 PM (GMT-05:00)
To: user@metron.apache.org
Subject: Re: Clearing of data to start over

Multiple Kafka brokers will help a lot. The wizard allows you to add more by 
using the plus symbol next to Kafka on the master selection screen. After the 
fact, you can add more with the Add Service button on the Hosts screen in Ambari.

When adding brokers, don't forget to also alter your topics to have more 
partitions to make use of those brokers. Out of the box, the default is a pretty 
useless 1. You should have at least as many partitions as you have disk 
spindles for Kafka.
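(For reference, increasing partitions on an existing topic looks something like 
the line below; the path is the usual HDP 2.5 location, and the ZooKeeper host, 
topic name and partition count are placeholders to adjust for your cluster:

/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper node1:2181 --alter --topic yaf --partitions 3

Note that partitions can only be increased after a topic is created, never 
decreased.)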

For pulling data from remote sites into Metron, I would suggest something like 
Apache NiFi, using NiFi site-to-site to a NiFi co-located with your Metron 
cluster. That would then just write to Kafka. So you can think of NiFi as being 
a bit like an agent or a forwarder.
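(A rough sketch of that layout, with the processor choices as illustrative 
assumptions rather than a tested flow: on the remote site, something like 
TailFile or ListenSyslog feeding a Remote Process Group that points at the 
central NiFi; on the Metron side, an Input Port feeding a PublishKafka 
processor, with the broker list set to your Kafka broker, e.g. node1:6667, and 
the topic set to the matching sensor topic such as bro or yaf.)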

Good luck!

Simon

Sent from my iPhone

On 7 Sep 2017, at 04:01, Frank Horsfall <frankhorsf...@cunet.carleton.ca> wrote:

I'm on a roll with questions.

I'm curious to see if I can relieve processing pressure by adding a new VM.

Would you know how I would go about it?

Also
I would like to pull data from sources instead of having the sources push data to 
my site. Have you come across this scenario?

F



Sent from my Bell Samsung device over Canada's largest network.


-------- Original message --------
From: Frank Horsfall <frankhorsf...@cunet.carleton.ca>
Date: 2017-09-06 10:51 PM (GMT-05:00)
To: user@metron.apache.org
Subject: Re: Clearing of data to start over

Also

Laurens, you recommended creating 3 Kafka brokers, but the install wizard would 
not let me.

As a result, node1 is currently the only broker. Would this cause a bottleneck?

If so, is there a method to install and configure the 2 additional brokers after 
the initial install?

kindest regards

Frank



Sent from my Bell Samsung device over Canada's largest network.


-------- Original message --------
From: Frank Horsfall <frankhorsf...@cunet.carleton.ca>
Date: 2017-09-06 10:38 PM (GMT-05:00)
To: user@metron.apache.org
Subject: Re: Clearing of data to start over

Thanks Laurens and Nick.

I want to let the queues run overnight to give us some possible insights into 
heap sizes, etc.

I currently have 3 VMs configured, each with 8 cores, 500 GB of disk capacity, 
and 30 GB of memory.

Elasticsearch has been configured with a 10 GB Xmx heap.

I've set the Storm worker childopts to 7 GB for now, so it takes a while to max 
out and generate heap errors.
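(Concretely, that's the worker.childopts setting under Storm > Configs in 
Ambari, roughly worker.childopts: "-Xmx7168m", and the Elasticsearch heap is the 
usual ES_HEAP_SIZE / -Xms/-Xmx pair on ES 2.x. The exact values here are just 
what I picked for this test, not a recommendation.)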

I deleted approximately 6 million events and shut off the data-generating apps.

The idea is to see how much will be processed overnight.

One thing that has me puzzled is why my bro sensor isn't emitting events. I 
double-checked my config against what's recommended, but nothing is coming 
through. A mystery. lol
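(For what it's worth, the config I'm double-checking is the Kafka plugin section 
of local.bro, roughly the lines below; treat the exact identifiers and the 
broker address as assumptions to verify against the plugin README for your 
version:

@load Bro/Kafka/logs-to-kafka.bro
redef Kafka::logs_to_send = set(HTTP::LOG, DNS::LOG, Conn::LOG);
redef Kafka::topic_name = "bro";
redef Kafka::tag_json = T;
redef Kafka::kafka_conf = table(["metadata.broker.list"] = "node1:6667");

followed by a broctl deploy.)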


Also, I kept some notes during the whole process and would be happy to share 
them if you are interested. Let me know.

Frank


Sent from my Bell Samsung device over Canada's largest network.


-------- Original message --------
From: Laurens Vets <laur...@daemon.be>
Date: 2017-09-06 6:17 PM (GMT-05:00)
To: user@metron.apache.org
Cc: Frank Horsfall <frankhorsf...@cunet.carleton.ca>
Subject: Re: Clearing of data to start over


Hi Frank,

If all your queues (Kafka/Storm) are empty, the following should work:

- Deleting your Elasticsearch indices:
curl -X DELETE 'http://localhost:9200/snort_index_*'
curl -X DELETE 'http://localhost:9200/yaf_index_*'
etc...
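(To check which indices exist before and after, the standard cat API works: 
curl 'http://localhost:9200/_cat/indices?v')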

- Deleting your Hadoop data:

Become the hdfs user: sudo su - hdfs
Show what's been indexed in Hadoop: hdfs dfs -ls /apps/metron/indexing/indexed/
The output will probably show something like the following:
/apps/metron/indexing/indexed/error
/apps/metron/indexing/indexed/snort
/apps/metron/indexing/indexed/yaf
...

You can remove these with:
hdfs dfs -rmr -skipTrash /apps/metron/indexing/indexed/error/
hdfs dfs -rmr -skipTrash /apps/metron/indexing/indexed/snort/

Or remove individual files with:

hdfs dfs -rmr -skipTrash /apps/metron/indexing/indexed/error/FILENAME


On 2017-09-06 13:59, Frank Horsfall wrote:
Hello all,
I have installed a 3-node system using the bare-metal CentOS 7 guideline.

https://cwiki.apache.org/confluence/display/METRON/Metron+0.4.0+with+HDP+2.5+bare-metal+install+on+Centos+7+with+MariaDB+for+Metron+REST

It has taken me a while to get all components working properly, and I left the 
yaf, bro, and snort apps running, so quite a lot of data has been generated. 
Currently, I have almost 18 million events identified in Kibana: 16+ million 
are yaf based, 2+ million are snort, and 190 events are my new squid 
telemetry :). It looks like it still has a while to go before it catches up 
to the current day. I recently shut down the apps.


My questions are:


1. Is there a way to wipe all my data and indices clean so that I may now begin with a fresh dataset?

2. Is there a way to configure yaf so that its data is meaningful? It is currently creating what looks to be test data.

3. I have commented out the test snort rule, but it is still generating the odd record, which once again looks like test data. Can this be stopped as well?

Kindest regards,
Frank



