[GitHub] incubator-metron issue #415: METRON-652: Extract indexing config from enrich...

2017-01-16 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/incubator-metron/pull/415
  
Looks good @cestella . +1 pending Travis


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-metron issue #415: METRON-652: Extract indexing config from enrich...

2017-01-13 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/incubator-metron/pull/415
  
@JonZeolla Yes it is!  Whoops, my bad.  I guess my JIRA search-fu isn't as 
good as I thought.




[GitHub] incubator-metron issue #415: METRON-652: Extract indexing config from enrich...

2017-01-12 Thread dlyle65535
Github user dlyle65535 commented on the issue:

https://github.com/apache/incubator-metron/pull/415
  
@mmiklavc - doesn't Enrichment Master's config do the same thing for 
enrichment.properties?


File(format("{metron_config_path}/enrichment.properties"),
     content=Template("enrichment.properties.j2"),
     owner=params.metron_user,
     group=params.metron_group
     )

And Indexing Master handles elasticsearch.properties (and global.json)?


File("{0}/global.json".format(params.metron_zookeeper_config_path),
 owner=params.metron_user,
 content=InlineTemplate(params.global_json_template)
 )


File("{0}/elasticsearch.properties".format(params.metron_zookeeper_config_path + '/..'),
     owner=params.metron_user,
     content=InlineTemplate(params.global_properties_template))

That was my intention anyway. Am I mistaken?




[GitHub] incubator-metron issue #415: METRON-652: Extract indexing config from enrich...

2017-01-12 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/incubator-metron/pull/415
  
@cestella Yeah, if we do a CONFIG_PUT via the Stellar REPL without updating 
the local file system config copy, that's going to cause syncing problems.
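
To make the syncing problem concrete, here is a minimal sketch, not Metron code. It assumes both copies of the configs can be modeled as plain dicts mapping a config name to its JSON text; `find_drift` and the `indexing/squid` key are hypothetical names used only for illustration.

```python
import json


def find_drift(local_configs, zk_configs):
    """Return the config names whose local and ZooKeeper copies disagree.

    Both arguments are hypothetical stand-ins: dicts mapping a config name
    (e.g. "indexing/squid") to its JSON text. A name present on only one
    side also counts as drift.
    """
    drifted = []
    for name in sorted(set(local_configs) | set(zk_configs)):
        local = local_configs.get(name)
        remote = zk_configs.get(name)
        # Compare parsed JSON so that whitespace differences are ignored.
        local_obj = json.loads(local) if local is not None else None
        remote_obj = json.loads(remote) if remote is not None else None
        if local_obj != remote_obj:
            drifted.append(name)
    return drifted


local = {"indexing/squid": '{"index": "squid", "batchSize": 5}'}
# Simulates a CONFIG_PUT from the REPL that changed only the ZooKeeper copy.
zk = {"indexing/squid": '{"index": "squid", "batchSize": 1}'}
print(find_drift(local, zk))
```

A later PUSH from the stale local directory would silently replace the ZooKeeper value, which is the failure mode described above.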




[GitHub] incubator-metron issue #415: METRON-652: Extract indexing config from enrich...

2017-01-12 Thread mmiklavc
Github user mmiklavc commented on the issue:

https://github.com/apache/incubator-metron/pull/415
  
OK, found this for the global config in metron-env.xml

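```
<property>
    <name>global-json</name>
    <display-name>global.json template</display-name>
    <description>This is the jinja template for global.json file</description>
    <value>
{
"es.clustername": "{{ es_cluster_name }}",
"es.ip": "{{ es_url }}",
"es.date.format": ".MM.dd.HH"
}
    </value>
    <value-attributes>
        <type>content</type>
    </value-attributes>
</property>
```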
This is referenced in params_linux.py
```
global_json_template = config['configurations']['metron-env']['global-json']
```
And then it's used by metron_service.py to lay down the config using jinja 
templates (edited for brevity):
```
def init_config():
    ...
    Execute(ambari_format(
        "{metron_home}/bin/zk_load_configs.sh --mode PUSH -i {metron_zookeeper_config_path} -z {zookeeper_quorum}"),
        path=ambari_format("{java_home}/bin")
    )
    ...
def load_global_config(params):
    ...
    File("{0}/global.json".format(params.metron_zookeeper_config_path),
         owner=params.metron_user,
         content=InlineTemplate(params.global_json_template)
         )
    ...
    init_config()
```
So yes, if you change global.json outside of Ambari, i.e. in the Metron
install config directory, Ambari will rewrite what's on the local FS and
follow up with a load to ZK. As best I can tell, this is _only_ applicable to
the global config, not the individual topology JSON configs. Those are unpacked
on install via Ambari performing an RPM install, but not actually managed on an
ongoing basis. The way you get into hot water here is if you've chosen to
manage your configs in a different directory than where Ambari believes
they're located, because it uses zk_load_configs.sh under the hood. Even then,
this is parameterized via `metron_zookeeper_config_path`. I can't recall if
that path is absolute, relative, or both, but in metron-env it defaults to
`config/zookeeper`.

Hope this clarifies some things.
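
As a rough illustration of the overwrite behavior, here is a self-contained sketch. It is not the Ambari code: `string.Template` stands in for the Jinja `InlineTemplate`, a temporary directory stands in for the Metron config directory, and `lay_down_global_config` is a hypothetical helper.

```python
import os
import tempfile
from string import Template

# Stand-in for the Jinja template held in metron-env; string.Template is used
# only to keep the sketch free of third-party dependencies.
GLOBAL_JSON_TEMPLATE = Template(
    '{\n  "es.clustername": "$es_cluster_name",\n  "es.ip": "$es_url"\n}\n'
)


def lay_down_global_config(config_dir, params):
    """Re-render global.json from the template, replacing whatever is on disk."""
    path = os.path.join(config_dir, "global.json")
    with open(path, "w") as handle:
        handle.write(GLOBAL_JSON_TEMPLATE.substitute(params))
    return path


config_dir = tempfile.mkdtemp()
params = {"es_cluster_name": "metron", "es_url": "node1"}
path = lay_down_global_config(config_dir, params)

# An operator edits the file outside of Ambari...
with open(path, "w") as handle:
    handle.write('{"es.clustername": "hand-edited"}\n')

# ...and the next restart re-renders it, discarding the manual change.
lay_down_global_config(config_dir, params)
with open(path) as handle:
    restored = handle.read()
print("hand-edited" in restored)
```

In the real flow the re-render is followed by a PUSH to ZooKeeper, so the hand edit is lost in both places.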




[GitHub] incubator-metron issue #415: METRON-652: Extract indexing config from enrich...

2017-01-12 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/incubator-metron/pull/415
  
Ok, thanks @dlyle65535. I don't think it's a regression (since the behavior
doesn't appear to differ between when it lived in enrichment and now in the
new configs), but I do think that we should have a broader discussion of
Ambari management in light of the new management UI that @merrimanr
submitted and, for that matter, the Stellar REPL's `CONFIG_GET` and
`CONFIG_PUT` functions.  I'll kick off a dev list discussion to try to figure
out what to do about it.




[GitHub] incubator-metron issue #415: METRON-652: Extract indexing config from enrich...

2017-01-12 Thread dlyle65535
Github user dlyle65535 commented on the issue:

https://github.com/apache/incubator-metron/pull/415
  
Some yes and some no. I'm a bit bleary-eyed today, so I'm not sure if this
is the complete list, but here's my current understanding. Ambari actively
manages (will overwrite) global.json, enrichment.properties and
elasticsearch.properties. It passively manages (will lay down the defaults
from the RPMs) the others.
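
The active/passive split can be written down as a small lookup. This is only a sketch of the rule as stated here; the set below is the list from this comment, which may be incomplete, and `survives_restart` is a hypothetical helper, not part of the MPack.

```python
import os

# Files described in this thread as actively managed by Ambari.
# The list may be incomplete.
ACTIVELY_MANAGED = {
    "global.json",
    "enrichment.properties",
    "elasticsearch.properties",
}


def survives_restart(path):
    """Return True if a manual edit to `path` is expected to survive a restart."""
    return os.path.basename(path) not in ACTIVELY_MANAGED


print(survives_restart("config/zookeeper/global.json"))
print(survives_restart("config/zookeeper/parsers/squid.json"))
```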




[GitHub] incubator-metron issue #415: METRON-652: Extract indexing config from enrich...

2017-01-12 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/incubator-metron/pull/415
  
@dlyle65535 I think that's a great point and thanks for clarifying.
Looking at that file, it appears that Ambari explicitly provides the
ability to modify the global config and certain flux topology properties files,
but not a screen to manage the sensor-specific configuration content (i.e. at
present parsers and enrichment).  Please correct me if I have had a reading
comprehension SNAFU ;)  If so, since we just added a new set of configs under
the same config directory and provided hooks for `zk_load_configs.sh` to know
how to load and get them, it shouldn't be any different than the existing
configs.

That being said, it is a good point you make.  Let me restate it and you
tell me if I have the gist.  If people change the configs on an MPack-installed
cluster via the CLI (via `zk_load_configs.sh`) or the soon-to-be-committed
management UI, Ambari will revert those changes on service restart, correct?




[GitHub] incubator-metron issue #415: METRON-652: Extract indexing config from enrich...

2017-01-12 Thread dlyle65535
Github user dlyle65535 commented on the issue:

https://github.com/apache/incubator-metron/pull/415
  
@merrimanr , @cestella - wrt management of configs from Ambari, the answer 
is kinda. Sensible default configurations are pushed with certain 
user-specified changes that allow the system to function. A complete list can 
be found looking at metron-env.xml.

There is an EXTREMELY important side-effect to this that must not be 
forgotten. Any configuration that has even partial management by Ambari must 
not be modified outside of Ambari if one expects those changes to survive a 
service restart. Ambari will detect a change to the file and overwrite it.

So, I don't know for sure, but I suspect this PR will introduce breaking 
changes to the MPack install which will be corrected during the work on 
METRON-653.




[GitHub] incubator-metron issue #415: METRON-652: Extract indexing config from enrich...

2017-01-12 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/incubator-metron/pull/415
  
Testing Instructions beyond the normal smoke test (i.e. letting data
flow through to the indices and checking them).

## Preliminaries

Since I will use the squid topology to pass data through in a controlled
way, we must install squid and generate one point of data:
* `yum install -y squid`
* `service squid start`
* `squidclient http://www.yahoo.com`

Also, set an environment variable to indicate `METRON_HOME`:
* `export METRON_HOME=/usr/metron/0.3.0` 

## Free Up Space on the virtual machine

First, let's free up some headroom on the virtual machine.  If you are
running this on a multinode cluster, you would not have to do this.
* Kill monit via `service monit stop`
* Kill tcpreplay via `for i in $(ps -ef | grep tcpreplay | awk '{print $2}');do kill -9 $i;done`
* Kill existing parser topologies via 
   * `storm kill snort`
   * `storm kill bro`
* Kill flume via `for i in $(ps -ef | grep flume | awk '{print $2}');do kill -9 $i;done`
* Kill yaf via `for i in $(ps -ef | grep yaf | awk '{print $2}');do kill -9 $i;done`
* Kill bro via `for i in $(ps -ef | grep bro | awk '{print $2}');do kill -9 $i;done`

## Deploy the squid parser
* Create the squid kafka topic: `/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper node1:2181 --create --topic squid --partitions 1 --replication-factor 1`
* Start via `$METRON_HOME/bin/start_parser_topology.sh -k node1:6667 -z node1:2181 -s squid`

### Test Case 1: Adjusting batch sizes
* Delete any squid index that currently exists (if any do).
* Create a file at `$METRON_HOME/config/zookeeper/indexing/squid.json` with 
the following contents:
```
{
  "index" : "squid",
  "batchSize" : 5
}
```
* Send 4 data points through and ensure that there are no data points in the index:
  * `cat /var/log/squid/access.log /var/log/squid/access.log /var/log/squid/access.log /var/log/squid/access.log | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic squid`
  * `curl "http://localhost:9200/squid*/_search?pretty=true&q=*:*" 2> /dev/null | grep "full_hostname" | wc -l` should yield `0`
* Send a final data point through and ensure that we have 5 data points:
  * `cat /var/log/squid/access.log | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic squid`
  * `curl "http://localhost:9200/squid*/_search?pretty=true&q=*:*" 2> /dev/null | grep "full_hostname" | wc -l` should yield `5`
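
The behavior Test Case 1 expects can be sketched with a toy model. This is not the Metron writer: `BatchingIndexWriter` is a hypothetical class, and it assumes the only flush trigger is the batch filling up (no timeout-based flush).

```python
class BatchingIndexWriter:
    """Toy writer that flushes to the index only once batch_size messages are queued."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.pending = []   # messages received but not yet written
        self.indexed = []   # messages visible in the index

    def write(self, message):
        self.pending.append(message)
        if len(self.pending) >= self.batch_size:
            # The batch is full: move everything to the index in one go.
            self.indexed.extend(self.pending)
            self.pending = []


writer = BatchingIndexWriter(batch_size=5)
for n in range(4):
    writer.write({"full_hostname": "www.yahoo.com", "n": n})
after_four = len(writer.indexed)

writer.write({"full_hostname": "www.yahoo.com", "n": 4})
after_five = len(writer.indexed)
print(after_four, after_five)
```

With `batchSize` set to 5, the first four messages stay queued and the fifth makes all five appear at once, which is why the two `curl` checks should report `0` and then `5`.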
 
### Test Case 2: Update configs from the CLI
* Edit the file at `$METRON_HOME/config/zookeeper/indexing/squid.json` to 
the following contents:
```
{
  "index" : "squid",
  "batchSize" : 10
}
```
* Push the configs: `$METRON_HOME/bin/zk_load_configs.sh -m PUSH -i $METRON_HOME/config/zookeeper -z node1:2181`
* Dump the configs and verify the squid indexing config is correct: `$METRON_HOME/bin/zk_load_configs.sh -m DUMP -z node1:2181`
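
If you would rather script the file edit than do it by hand, something like the following sketch could generate the per-sensor file before the PUSH. `write_indexing_config` is a hypothetical helper, and a temporary directory stands in for `$METRON_HOME/config/zookeeper`; the PUSH itself still has to be run with `zk_load_configs.sh`.

```python
import json
import os
import tempfile


def write_indexing_config(config_root, sensor, index, batch_size):
    """Write <config_root>/indexing/<sensor>.json and return its path."""
    if not isinstance(batch_size, int) or batch_size < 1:
        raise ValueError("batchSize must be a positive integer")
    indexing_dir = os.path.join(config_root, "indexing")
    os.makedirs(indexing_dir, exist_ok=True)
    path = os.path.join(indexing_dir, sensor + ".json")
    with open(path, "w") as handle:
        json.dump({"index": index, "batchSize": batch_size}, handle, indent=2)
        handle.write("\n")
    return path


# A temporary directory stands in for $METRON_HOME/config/zookeeper.
config_root = tempfile.mkdtemp()
path = write_indexing_config(config_root, "squid", "squid", 10)
with open(path) as handle:
    written = json.load(handle)
print(written)
```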

### Test Case 3: Stellar Management Functions
* Execute the following in the stellar shell:
```
Stellar, Go!
Please note that functions are loading lazily in the background and will be unavailable until loaded fully.
{es.clustername=metron, es.ip=node1, es.port=9300, es.date.format=.MM.dd.HH}
[Stellar]>>> # Grab the indexing config
[Stellar]>>> squid_config := CONFIG_GET('INDEXING', 'squid', true)
Functions loaded, you may refer to functions now...
[Stellar]>>> # Update the index and batch size
[Stellar]>>> squid_config := INDEXING_SET_BATCH( INDEXING_SET_INDEX(squid_config, 'squid'), 1)
[Stellar]>>> # Push the config to zookeeper
[Stellar]>>> CONFIG_PUT('INDEXING', squid_config, 'squid')
[Stellar]>>> # Grab the updated config from zookeeper
[Stellar]>>> CONFIG_GET('INDEXING', 'squid')
{
  "index" : "squid",
  "batchSize" : 1
}
```
* Confirm that the dump command from `$METRON_HOME/bin/zk_load_configs.sh -m DUMP -z node1:2181` contains the config with batch size of `1`





[GitHub] incubator-metron issue #415: METRON-652: Extract indexing config from enrich...

2017-01-12 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/incubator-metron/pull/415
  
@merrimanr I was under the impression that there was a panel to manage the 
enrichment configurations per sensor in ambari, but if that's not the case and 
it's just initial load, then I don't see a regression.  @dlyle65535 does this 
sound right to you?




[GitHub] incubator-metron issue #415: METRON-652: Extract indexing config from enrich...

2017-01-12 Thread merrimanr
Github user merrimanr commented on the issue:

https://github.com/apache/incubator-metron/pull/415
  
What is the regression we're talking about?  Are we talking about loading 
the indexing configs initially with the Mpack or managing the configs in 
Ambari?  

I don't believe we are currently managing parser or enrichment configs in 
Ambari and indexing configs would also fall into that category since they are 
part of enrichment configs now.  Or is that incorrect?




[GitHub] incubator-metron issue #415: METRON-652: Extract indexing config from enrich...

2017-01-12 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/incubator-metron/pull/415
  
@ottobackwards yeah, agreed.  It's a fairly complex situation at the moment 
and not documented very well.  That might be worth a discussion on the dev list.




[GitHub] incubator-metron issue #415: METRON-652: Extract indexing config from enrich...

2017-01-12 Thread cestella
Github user cestella commented on the issue:

https://github.com/apache/incubator-metron/pull/415
  
@dlyle65535 It definitely is a regression.  I didn't make it clear enough 
(I will do so now), but I very much do not want this PR to be committed before 
the management pack PR is committed.

The consequence of not managing these in Ambari is that you won't be able
to adjust the default configs. The defaults do not stop data from flowing
through, but they are sub-optimal (the index name defaults to the sensor name
and the batch size defaults to 1).
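
The defaulting rule can be sketched as a simple merge. This is an illustration of the rule as stated in this comment, not the Metron implementation; `effective_indexing_config` is a hypothetical name.

```python
def effective_indexing_config(sensor, overrides=None):
    """Merge a sensor's indexing overrides onto the defaults.

    Defaults as described in this thread: the index name is the sensor
    name and the batch size is 1.
    """
    config = {"index": sensor, "batchSize": 1}
    config.update(overrides or {})
    return config


default_config = effective_indexing_config("squid")
tuned_config = effective_indexing_config("squid", {"batchSize": 5})
print(default_config)
print(tuned_config)
```

A batch size of 1 means every message is written on its own, which works but gives up the throughput benefit of batching.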




[GitHub] incubator-metron issue #415: METRON-652: Extract indexing config from enrich...

2017-01-12 Thread ottobackwards
Github user ottobackwards commented on the issue:

https://github.com/apache/incubator-metron/pull/415
  
Just a side note: with the RPMs and Docker and everything, it is more
difficult than ever to have a handle on all the places you need to consider
for changes. I don't know if you want to discuss it on the list, but there
should be a dev guide entry or something.




[GitHub] incubator-metron issue #415: METRON-652: Extract indexing config from enrich...

2017-01-12 Thread dlyle65535
Github user dlyle65535 commented on the issue:

https://github.com/apache/incubator-metron/pull/415
  
Thanks @cestella, makes total sense.




[GitHub] incubator-metron issue #415: METRON-652: Extract indexing config from enrich...

2017-01-12 Thread dlyle65535
Github user dlyle65535 commented on the issue:

https://github.com/apache/incubator-metron/pull/415
  
Hi @cestella,

Can you give me a little detail about the consequences of not exposing the
indexing config to Ambari? It seems like a regression to me; I recall we could
deploy non-default configs prior to this PR.

Thanks!


