[GitHub] incubator-metron issue #518: METRON-799: The MPack should function in a kerb...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/518 I've run this up on an EC2 cluster, and been able to get data from the core topologies from Kafka to ES/HDFS. There are a couple caveats spelled out in the metron-deployment README now in a short Kerberos section mentioning that the mpack supports Ambari's Kerberization process. The client process managed to ensure keytabs existed and that the client_jaas was created appropriately. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-metron issue #518: METRON-799: The MPack should function in a kerb...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/518 Update on this: After a significant amount of pain involved in making this work, we need to have the client_jaas.conf and the metron keytab distributed to the various supervisor nodes so that they can actually be authenticated/authorized as the metron user. The best discovered solution to this is to have a client that essentially just sets them up (so we can actually create things on the various supervisor servers). It's not ideal, but @dlyle65535 and I are testing that it works. To actually set this up, just install Metron clients on all Storm supervisor nodes prior to Kerberization. Docs will be updated once we confirm that this approach works.
[GitHub] incubator-metron issue #532: METRON-634 Mpack bug fixes and improvements, no...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/532 This is great, thanks for cleaning a lot of this up. When I get a chance, I look forward to spinning this up.
[GitHub] incubator-metron issue #518: METRON-799: The MPack should function in a kerb...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/518 Updated to use the built-in check for security in the stack advisor. Old versions caused problems because of typing in Python and my lack of catching what it actually does. Params files should still be fine, because the config objects under the covers in Ambari explicitly convert strings to booleans if appropriate.
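One common form of this kind of typing problem (an assumption about what happened here, not something the comment confirms) is that a flag arriving as the string "false" is truthy in Python. A minimal sketch, with a hypothetical `to_bool` helper that is not part of Ambari or the MPack:

```python
def to_bool(value):
    """Convert a config value that may be a bool or a string into a real bool.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() == "true"


raw_flag = "false"  # assumed shape of a flag read from a config dictionary

# Any non-empty string is truthy, so a naive check takes the wrong branch.
naive_result = bool(raw_flag)

# Explicit conversion gives the intended answer.
converted_result = to_bool(raw_flag)

print(naive_result, converted_result)
```

Converting explicitly at the boundary keeps later `if security_enabled:` checks from silently passing on a string value.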
[GitHub] incubator-metron issue #500: METRON-795: Install Metron REST with Ambari MPa...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/500 "confident this PR", not "confident that". In retrospect, it sounds like I'm referring to your last comment, not just in general.
[GitHub] incubator-metron issue #500: METRON-795: Install Metron REST with Ambari MPa...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/500 @merrimanr I'm pretty confident that is going to collide unpleasantly with https://github.com/apache/incubator-metron/pull/518 (The MPack should function in a kerberized cluster). I haven't thought through the details, but do you think it's potentially sufficient to make this component optional (cardinality 0+, rather than 1)? It's an extra step for the user, but it's only selecting it on the component list. At that point, we say the REST API + UI is only supported on non-kerberized clusters (which is true anyway), and we (hopefully) don't break Kerberizing the core components.
[GitHub] incubator-metron issue #518: METRON-799: The MPack should function in a kerb...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/518 Full dev is able to successfully start up, be Kerberized via Ambari without errors (including Storm service check), and have new data flow through.
[GitHub] incubator-metron issue #518: METRON-799: The MPack should function in a kerb...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/518 Still need to do some additional testing (I'm spiin, but this is updated to pull the topology.auto-credentials down to the topology level. This should let Storm actually run service check (because AutoTGT isn't initializing). At the same time, Metron should be able to set these appropriately and run them. This involves passing the config down to the various files (the .properties files that feed Flux) and updating integration tests slightly. Oddly enough, the use of the tickets in the client_jaas file caused issues, so it's now using the ticket cache and running a kinit before acting on topologies. I strongly dislike having the kinits, but there's no obvious reason for the difference in behavior between ticket cache and keytab.
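The "kinit before acting on topologies" step could be sketched roughly as below. This is not the PR's code: the helper name, keytab path, and principal are hypothetical, and the only thing assumed about the CLI is the standard `kinit -kt <keytab> <principal>` form.

```python
def build_kinit_command(kinit_path, keytab_path, principal):
    """Build the argument list for a keytab-based kinit.

    Running this populates the user's ticket cache, which a JAAS stanza
    with useTicketCache=true can then pick up.
    """
    return [kinit_path, "-kt", keytab_path, principal]


# Hypothetical values for illustration only.
command = build_kinit_command(
    "/usr/bin/kinit",
    "/etc/security/keytabs/metron.headless.keytab",
    "metron@EXAMPLE.COM",
)
print(" ".join(command))
```

In an Ambari script, a command like this would presumably be run as the metron user before any topology start or stop, so the ticket cache is fresh when the Storm client reads it.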
[GitHub] incubator-metron issue #518: METRON-799: The MPack should function in a kerb...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/518 So coming back to the AutoTGT discussion and making it easier to see. There are basically two approaches we can take that (should) work.
1. The current use of AutoTGT. This requires setting up .storm/storm.yaml for the Storm nodes, and users will have to do the same.
2. The AutoHDFS / AutoHBase solution mentioned by @dlyle65535. This requires symlinking some jars and configs to make them available to Storm. Should only be necessary for Nimbuses, but it does mean that HDFS/HBase upgrades can make the symlinks stale.

Either solution requires Ambari acting on a different node (potentially) than the one running the scripts. I don't know if Ambari has any resources for handling that sort of thing. It could potentially be a command over ssh to another node, but presumably that requires passwordless ssh setup or someone to manually create the symlinks (which may be acceptable for this pass). I'm fairly strongly inclined towards the second one, primarily because it requires less effort on the user's part. Ambari work is fairly similar either way.
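The stale-symlink risk in the second approach can at least be detected cheaply. A minimal sketch, assuming the jars and configs are symlinked into a single directory (the directory layout and the function name are hypothetical):

```python
import os


def find_stale_symlinks(directory):
    """Return the symlinks in `directory` whose targets no longer exist."""
    stale = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # os.path.exists follows the link, so it is False for a dangling one.
        if os.path.islink(path) and not os.path.exists(path):
            stale.append(path)
    return stale
```

A check like this could run after an HDFS or HBase upgrade to flag links that point at jar versions that were removed.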
[GitHub] incubator-metron pull request #518: METRON-799: The MPack should function in...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/518#discussion_r110413918 --- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/scripts/metron_security.py --- @@ -0,0 +1,74 @@ +""" +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at +http://www.apache.org/licenses/LICENSE-2.0 +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +""" + +import os.path +from resource_management.core.source import Template +from resource_management.core.resources.system import Directory, File +from resource_management.core import global_lock +from resource_management.core.logger import Logger +from resource_management.core.resources.system import Execute +from resource_management.libraries.functions import format as ambari_format + + +# Convenience function for ensuring home dirs are setup consistently. +def storm_security_setup(params): +if params.security_enabled: +# I don't think there's an Ambari way to get a user's local home dir , so have Python perform tilde expansion. +# Ambari's Directory doesn't do tilde expansion. 
+metron_storm_dir_tilde = '~' + params.metron_user + '/.storm' +metron_storm_dir = os.path.expanduser(metron_storm_dir_tilde) +Directory(metron_storm_dir, + mode=0755, + owner=params.metron_user, + group=params.metron_group + ) + +File(ambari_format('{client_jaas_path}'), + content=Template('client_jaas.conf.j2'), + owner=params.metron_user, + group=params.metron_group, + mode=0755 + ) + +File(metron_storm_dir + '/storm.yaml', + content=Template('storm.yaml.j2'), + owner=params.metron_user, + group=params.metron_group, + mode=0755 + ) + --- End diff -- Yeah, you're right, you should be able to change it via HA. I just meant I can't go and change it to something in full dev. Re: the AutoTGT stuff I'm unsure if HA Storm would be an issue, but assuming Storm handles spinning up secure nimbuses correctly it shouldn't be. New nimbuses are required to make sure they get all code of active topologies. I would assume this includes getting any TGTs (although I haven't verified this) to ensure that active topologies can continue to run. We'd need to test on an actual HA cluster. I disagree with AutoHDFS/AutoHBase being less complex, because if I recall correctly (and I don't, please correct me), that solution required setting up symlinks on each Storm node. I don't even know that we easily have that capability in Ambari as the Metron service. Even if we do work around that, I don't see AutoTGT as complicated, and the Storm docs mention "On a kerberos secure cluster they should be set by default to point to org.apache.storm.security.auth.kerberos.AutoTGT." It seems reasonable to go with Storm's recommendation unless we have a compelling reason not to. The custom storm.yaml only contains 3 configs that we need to run 1. Nimbus seeds (gathered from Storm configs themselves) 2. Jaas file 3. Thrift protocol (which admittedly doesn't grab correctly from Storm, but is essentially a constant for the purpose of Kerberos). 
Of these, only the jaas file really matters for management, and all it does is define named stanzas; it's not anything particularly complicated. Having said that, the Storm service check could make AutoTGT a lot less attractive if it's not easily workable.
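The three settings described for the custom storm.yaml can be sketched as a small renderer. This is an illustration, not the PR's `storm.yaml.j2` template: the function name, host names, and JAAS path are hypothetical, and the key names and the Kerberos transport class are assumed to be the standard Storm ones.

```python
# Assumed Storm class name for Kerberos-secured Thrift; treated as a constant here.
KERBEROS_TRANSPORT = (
    "org.apache.storm.security.auth.kerberos.KerberosSaslTransportPlugin"
)


def render_storm_yaml(nimbus_seeds, client_jaas_path,
                      thrift_transport=KERBEROS_TRANSPORT):
    """Render the three client-side settings as YAML text."""
    seeds = ", ".join("'{0}'".format(host) for host in nimbus_seeds)
    lines = [
        "nimbus.seeds: [{0}]".format(seeds),
        'java.security.auth.login.config: "{0}"'.format(client_jaas_path),
        'storm.thrift.transport: "{0}"'.format(thrift_transport),
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values for illustration only.
print(render_storm_yaml(["node1"], "/usr/metron/client_jaas.conf"))
```

Only the JAAS path varies by deployment choice; the seeds come from Storm's own configuration and the transport is effectively fixed on a Kerberized cluster.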
[GitHub] incubator-metron pull request #518: METRON-799: The MPack should function in...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/518#discussion_r110411025 --- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/scripts/metron_security.py --- @@ -0,0 +1,74 @@ +""" +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at +http://www.apache.org/licenses/LICENSE-2.0 +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +""" + +import os.path +from resource_management.core.source import Template +from resource_management.core.resources.system import Directory, File +from resource_management.core import global_lock +from resource_management.core.logger import Logger +from resource_management.core.resources.system import Execute +from resource_management.libraries.functions import format as ambari_format + + +# Convenience function for ensuring home dirs are setup consistently. +def storm_security_setup(params): +if params.security_enabled: +# I don't think there's an Ambari way to get a user's local home dir , so have Python perform tilde expansion. +# Ambari's Directory doesn't do tilde expansion. 
+metron_storm_dir_tilde = '~' + params.metron_user + '/.storm' +metron_storm_dir = os.path.expanduser(metron_storm_dir_tilde) +Directory(metron_storm_dir, + mode=0755, + owner=params.metron_user, + group=params.metron_group + ) + +File(ambari_format('{client_jaas_path}'), + content=Template('client_jaas.conf.j2'), + owner=params.metron_user, + group=params.metron_group, + mode=0755 + ) + +File(metron_storm_dir + '/storm.yaml', + content=Template('storm.yaml.j2'), + owner=params.metron_user, + group=params.metron_group, + mode=0755 + ) + --- End diff -- It's not possible to change it via Ambari, but it will populate the variable appropriately. There's some translation magic happening somewhere, because Ambari lists the value of nimbus.seeds as 'node1', but it actually passes it down as a list (which is why the new commit doesn't surround it with square brackets). I'm guessing this magic is also why the thrift config doesn't get passed.
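Because the value may show up as a bare string in one place and as a list in another, a template-facing script can normalize it before use. A minimal sketch, assuming only those two shapes (plus a bracketed string); `normalize_seeds` is a hypothetical helper, not an Ambari function:

```python
def normalize_seeds(value):
    """Return nimbus seeds as a list of host names.

    Accepts a list/tuple, a bare host string, or a bracketed string
    such as "['node1', 'node2']".
    """
    if isinstance(value, (list, tuple)):
        return [str(host).strip() for host in value]
    text = str(value).strip()
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]
    return [part.strip().strip("'\"") for part in text.split(",") if part.strip()]


print(normalize_seeds("node1"))
print(normalize_seeds(["node1", "node2"]))
```

With the value normalized to a list, the template decides once how to render the brackets, instead of depending on how the value happened to arrive.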
[GitHub] incubator-metron pull request #518: METRON-799: The MPack should function in...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/518#discussion_r110405590 --- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/scripts/metron_security.py --- @@ -0,0 +1,74 @@ +""" +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at +http://www.apache.org/licenses/LICENSE-2.0 +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +""" + +import os.path +from resource_management.core.source import Template +from resource_management.core.resources.system import Directory, File +from resource_management.core import global_lock +from resource_management.core.logger import Logger +from resource_management.core.resources.system import Execute +from resource_management.libraries.functions import format as ambari_format + + +# Convenience function for ensuring home dirs are setup consistently. +def storm_security_setup(params): +if params.security_enabled: +# I don't think there's an Ambari way to get a user's local home dir , so have Python perform tilde expansion. +# Ambari's Directory doesn't do tilde expansion. 
+metron_storm_dir_tilde = '~' + params.metron_user + '/.storm' +metron_storm_dir = os.path.expanduser(metron_storm_dir_tilde) +Directory(metron_storm_dir, + mode=0755, + owner=params.metron_user, + group=params.metron_group + ) + +File(ambari_format('{client_jaas_path}'), + content=Template('client_jaas.conf.j2'), + owner=params.metron_user, + group=params.metron_group, + mode=0755 + ) + +File(metron_storm_dir + '/storm.yaml', + content=Template('storm.yaml.j2'), + owner=params.metron_user, + group=params.metron_group, + mode=0755 + ) + --- End diff -- nimbus.seeds is updated to grab the actual value. Oddly, storm.thrift.transport doesn't end up flowing correctly to the template. I double checked all the spelling and so on, and it always ended up just being the variable, not the value. I'm inclined to let that one go, since it's only even created in a Kerberized environment anyway and it's a constant.
[GitHub] incubator-metron issue #518: METRON-799: The MPack should function in a kerb...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/518 @dlyle65535 I'll take a look at nimbus.admins. I'm pretty confident it's running as the Storm user, though. Partial output from Ambari:
```
Traceback (most recent call last):
...
user=params.storm_user
```
[GitHub] incubator-metron pull request #518: METRON-799: The MPack should function in...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/518#discussion_r110388704 --- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/scripts/metron_security.py --- @@ -0,0 +1,74 @@ +""" +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at +http://www.apache.org/licenses/LICENSE-2.0 +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +""" + +import os.path +from resource_management.core.source import Template +from resource_management.core.resources.system import Directory, File +from resource_management.core import global_lock +from resource_management.core.logger import Logger +from resource_management.core.resources.system import Execute +from resource_management.libraries.functions import format as ambari_format + + +# Convenience function for ensuring home dirs are setup consistently. +def storm_security_setup(params): +if params.security_enabled: +# I don't think there's an Ambari way to get a user's local home dir , so have Python perform tilde expansion. +# Ambari's Directory doesn't do tilde expansion. 
+metron_storm_dir_tilde = '~' + params.metron_user + '/.storm' +metron_storm_dir = os.path.expanduser(metron_storm_dir_tilde) +Directory(metron_storm_dir, + mode=0755, + owner=params.metron_user, + group=params.metron_group + ) + +File(ambari_format('{client_jaas_path}'), + content=Template('client_jaas.conf.j2'), + owner=params.metron_user, + group=params.metron_group, + mode=0755 + ) + --- End diff -- My understanding, and correct me if I'm wrong, is that nimbus distributes and renews the TGT from Kerberos. I'm honestly not sure how we'd get everything to line up otherwise if we have to distribute keytabs and jaas files and everything else out. This does raise the question of what happens when the max renewal of a Kerberos ticket has passed. I don't know enough about the implementation to know what happens, and I'd be interested in thoughts.
[GitHub] incubator-metron pull request #518: METRON-799: The MPack should function in...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/518#discussion_r110385969 --- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/scripts/enrichment_commands.py --- @@ -131,47 +153,35 @@ def init_geo(self): self.set_geo_configured() def init_kafka_topics(self): -Logger.info('Creating Kafka topics') -command_template = """{0}/kafka-topics.sh \ ---zookeeper {1} \ ---create \ ---topic {2} \ ---partitions {3} \ ---replication-factor {4} \ ---config retention.bytes={5}""" -num_partitions = 1 -replication_factor = 1 -retention_gigabytes = int(self.__params.metron_topic_retention) -retention_bytes = retention_gigabytes * 1024 * 1024 * 1024 - -Logger.info("Creating topics for enrichment") -topics = [self.__enrichment_topic] -for topic in topics: -Logger.info("Creating topic'{0}'".format(topic)) -Execute(command_template.format(self.__params.kafka_bin_dir, -self.__params.zookeeper_quorum, -topic, -num_partitions, -replication_factor, -retention_bytes)) - -Logger.info("Done creating Kafka topics") +Logger.info('Creating Kafka topics for enrichment') +# All errors go to indexing topics, so create it here if it's not already +metron_service.init_kafka_topics(self.__params, [self.__enrichment_topic, self.__params.metron_error_topic]) self.set_kafka_configured() +def init_kafka_acls(self): +Logger.info('Creating Kafka topics') --- End diff -- Yes, yes I did.
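The diff above replaces a per-script command template with a shared `metron_service.init_kafka_topics` call. The body of that shared function is not shown in this thread, so the following is only a sketch of what such a helper might build, reusing the `kafka-topics.sh` flags from the removed lines; the function names and sample values are hypothetical.

```python
def build_create_topic_command(kafka_bin_dir, zookeeper_quorum, topic,
                               retention_gigabytes,
                               partitions=1, replication_factor=1):
    """Build one kafka-topics.sh create command as a single string."""
    retention_bytes = int(retention_gigabytes) * 1024 * 1024 * 1024
    return (
        "{0}/kafka-topics.sh --zookeeper {1} --create --topic {2} "
        "--partitions {3} --replication-factor {4} "
        "--config retention.bytes={5}"
    ).format(kafka_bin_dir, zookeeper_quorum, topic,
             partitions, replication_factor, retention_bytes)


def build_create_topic_commands(kafka_bin_dir, zookeeper_quorum, topics,
                                retention_gigabytes):
    """Build the commands for every topic, so each caller only passes names."""
    return [
        build_create_topic_command(kafka_bin_dir, zookeeper_quorum, topic,
                                   retention_gigabytes)
        for topic in topics
    ]


# Hypothetical values for illustration only.
for cmd in build_create_topic_commands("/usr/hdp/current/kafka-broker/bin",
                                       "node1:2181",
                                       ["enrichments", "indexing"], 10):
    print(cmd)
```

Each service script then supplies only its topic list, which is the duplication the refactor removes.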
[GitHub] incubator-metron pull request #518: METRON-799: The MPack should function in...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/518#discussion_r110385783 --- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/scripts/indexing_commands.py --- @@ -72,55 +93,46 @@ def remote_repo(): raise ValueError("Unsupported repo type '{0}'".format(repo_type)) def init_kafka_topics(self): -Logger.info('Creating Kafka topics') -command_template = """{0}/kafka-topics.sh \ ---zookeeper {1} \ ---create \ ---topic {2} \ ---partitions {3} \ ---replication-factor {4} \ ---config retention.bytes={5}""" -num_partitions = 1 -replication_factor = 1 -retention_gigabytes = int(self.__params.metron_topic_retention) -retention_bytes = retention_gigabytes * 1024 * 1024 * 1024 -Logger.info("Creating topics for indexing") - -Logger.info("Creating topic'{0}'".format(self.__indexing)) -Execute(command_template.format(self.__params.kafka_bin_dir, -self.__params.zookeeper_quorum, -self.__indexing, -num_partitions, -replication_factor, -retention_bytes)) -Logger.info("Done creating Kafka topics") +Logger.info('Creating Kafka topics for indexing') +metron_service.init_kafka_topics(self.__params, [self.__indexing]) + +def init_kafka_acls(self): +Logger.info('Creating Kafka ACLs') +# Indexed topic names matches the group +metron_service.init_kafka_acls(self.__params, [self.__indexing], [self.__indexing]) def init_hdfs_dir(self): -Logger.info('Creating HDFS indexing directory') +Logger.info('Setting up HDFS indexing directory') + +# Non Kerberized Metron runs under 'storm', requiring write under the 'hadoop' group. +# Kerberized Metron runs under it's own user. 
+ownership = 0755 if self.__params.security_enabled else 0775 +Logger.info('HDFS indexing directory ownership is: ' + str(ownership)) self.__params.HdfsResource(self.__params.metron_apps_indexed_hdfs_dir, type="directory", action="create_on_execute", owner=self.__params.metron_user, group=self.__params.hadoop_group, --- End diff -- I decided not to mess with it. If we have a preference on it not being owned by metron:hadoop, we can go ahead and do that, but I think we probably need a more thorough discussion of how we want all that owned and permissioned anyway. Leaving it only readable seemed like a reasonable compromise for now.
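The mode selection in this diff, and a detail of how it is logged, can be illustrated with a short sketch. It uses the `0o755` literal form, which works on Python 2.6+ and Python 3 (the diff's `0755` form is Python 2 only); the function name is hypothetical.

```python
def indexing_dir_mode(security_enabled):
    """Pick the HDFS indexing directory mode.

    Kerberized: 0755 (Metron runs as its own user).
    Non-Kerberized: 0775 (workers run as 'storm' and need group write).
    """
    return 0o755 if security_enabled else 0o775


mode = indexing_dir_mode(True)

# str() on an int prints decimal, so a log line built with str(mode)
# shows the decimal value rather than the octal permission string.
decimal_text = str(mode)

# Formatting as octal gives the familiar permission string.
octal_text = "{0:04o}".format(mode)

print(decimal_text, octal_text)
```

Formatting the mode as octal before logging makes the message match what an operator would see from `ls -l` or `hdfs dfs -ls`.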
[GitHub] incubator-metron pull request #518: METRON-799: The MPack should function in...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/518#discussion_r110385480 --- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/scripts/metron_security.py --- @@ -0,0 +1,74 @@ +""" +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at +http://www.apache.org/licenses/LICENSE-2.0 +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +""" + +import os.path +from resource_management.core.source import Template +from resource_management.core.resources.system import Directory, File +from resource_management.core import global_lock +from resource_management.core.logger import Logger +from resource_management.core.resources.system import Execute +from resource_management.libraries.functions import format as ambari_format + + +# Convenience function for ensuring home dirs are setup consistently. +def storm_security_setup(params): +if params.security_enabled: +# I don't think there's an Ambari way to get a user's local home dir , so have Python perform tilde expansion. +# Ambari's Directory doesn't do tilde expansion. 
+metron_storm_dir_tilde = '~' + params.metron_user + '/.storm' +metron_storm_dir = os.path.expanduser(metron_storm_dir_tilde) +Directory(metron_storm_dir, + mode=0755, + owner=params.metron_user, + group=params.metron_group + ) + +File(ambari_format('{client_jaas_path}'), + content=Template('client_jaas.conf.j2'), + owner=params.metron_user, + group=params.metron_group, + mode=0755 + ) + --- End diff -- All the stuff in (or referenced in the case of client_jaas) ~metron/.storm should just need to be on the Metron node, because everything gets kicked off there.
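The tilde expansion used in the diff to locate `~metron/.storm` has one behavior worth guarding against: `os.path.expanduser` returns the path unchanged when the user cannot be resolved. A minimal sketch with a hypothetical `resolve_storm_dir` helper that fails loudly in that case:

```python
import os.path


def resolve_storm_dir(user):
    """Expand ~<user>/.storm to an absolute path, or raise if the user is unknown."""
    tilde_path = "~" + user + "/.storm"
    expanded = os.path.expanduser(tilde_path)
    if expanded.startswith("~"):
        # expanduser leaves the path untouched when it cannot resolve the user.
        raise ValueError("Cannot resolve home directory for user '{0}'".format(user))
    return expanded
```

Without the guard, a missing local account would result in a literal directory named `~metron` being created relative to the working directory.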
[GitHub] incubator-metron pull request #518: METRON-799: The MPack should function in...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/518#discussion_r110385183 --- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/scripts/metron_security.py --- @@ -0,0 +1,74 @@ +""" +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at +http://www.apache.org/licenses/LICENSE-2.0 +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +""" + +import os.path +from resource_management.core.source import Template +from resource_management.core.resources.system import Directory, File +from resource_management.core import global_lock +from resource_management.core.logger import Logger +from resource_management.core.resources.system import Execute +from resource_management.libraries.functions import format as ambari_format + + +# Convenience function for ensuring home dirs are setup consistently. +def storm_security_setup(params): +if params.security_enabled: +# I don't think there's an Ambari way to get a user's local home dir , so have Python perform tilde expansion. +# Ambari's Directory doesn't do tilde expansion. 
+metron_storm_dir_tilde = '~' + params.metron_user + '/.storm' +metron_storm_dir = os.path.expanduser(metron_storm_dir_tilde) +Directory(metron_storm_dir, + mode=0755, + owner=params.metron_user, + group=params.metron_group + ) + +File(ambari_format('{client_jaas_path}'), + content=Template('client_jaas.conf.j2'), + owner=params.metron_user, + group=params.metron_group, + mode=0755 + ) + +File(metron_storm_dir + '/storm.yaml', + content=Template('storm.yaml.j2'), + owner=params.metron_user, + group=params.metron_group, + mode=0755 + ) + --- End diff -- Just the properties in here. Couple thoughts as you bring this up. We probably want to have a ticket to make sure turning off Kerberos works correctly in the future. The properties in the file (except nimbus.seeds) are essentially set properties. We need our own client_jaas, and the storm.thrift.transport has to be there for some reason and that's pretty much constant on a secure cluster. That should also be made a property that flows down from Storm. nimbus.seeds is wrong and I need to carry that over from the actual Storm property. And I even made sure it was in params_linux and forgot to use it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
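The tilde expansion in `storm_security_setup` leans on `os.path.expanduser`, which hands back its input unchanged when the user cannot be resolved. A minimal sketch of guarding against that case is below; the helper name `storm_dir_for` and the choice to raise are assumptions for illustration and are not part of the PR.

```python
import os.path


def storm_dir_for(user):
    """Return the expanded ~user/.storm path (hypothetical helper, not in the PR)."""
    tilde_path = '~' + user + '/.storm'
    expanded = os.path.expanduser(tilde_path)
    # expanduser() returns the input untouched if the user is unknown,
    # so a leading '~' here means no home directory could be resolved.
    if expanded.startswith('~'):
        raise ValueError('no home directory found for user: ' + user)
    return expanded
```

Failing early like this would keep a literal `~metron/.storm` directory from being created if the metron user does not exist yet on a node.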
[GitHub] incubator-metron pull request #518: METRON-799: The MPack should function in...
GitHub user justinleet opened a pull request: https://github.com/apache/incubator-metron/pull/518

METRON-799: The MPack should function in a kerberized cluster

## Contributor Comments

Allows the Ambari Kerberos wizard to handle Metron setup. Changes include:
- Creation of keytabs
- Running everything as the Metron user, including Storm topology workers (on a Kerberized cluster).
- Setup for the Metron user to actually be able to run (client_jaas setup, home Storm dir setup, etc.)
- Adjusting perms to 0755. The exception is the HDFS output folder on a non-Kerberized cluster, which is left as 0775 because Storm does not run workers as the metron user there. When Kerberizing, the permissions will be restricted down to 0755.
- Kafka ACLs
- HBase ACLs
- Refactored topic creation to use a common function so I didn't have to edit the same thing 3 times.
- Automated updating of Storm configs (the AutoTGT and running workers as user)

There's still more testing I want to do, but this is definitely far enough along to submit a PR. I've spun this up on full dev with the now modified Kerberos setup instructions, with the caveat that Ambari's Storm service check fails (it's harmless, as far as I can tell). See below for more details. As this does not touch the sensors, data will need to be pushed manually (same as the old instructions). I've been able to push data from Kafka to Elasticsearch/HDFS.

### The Bad News

I would love insight on a problem, if anybody has some. I haven't edited the docs to reflect this yet, in the hopes it'll be resolved.

Storm's service check will fail during (and after) Kerberization. Metron can be started immediately without problems. Nothing is actually wrong, but this setup means that the storm user is unable to submit to the cluster (its home directory is not set up with the necessary configs). Unfortunately, Ambari runs the service check as the storm user.
This can be worked around by creating ~storm/.storm/storm.yaml
```
nimbus.seeds : ['node1']
java.security.auth.login.config : '/usr/hdp/current/storm-supervisor/conf/storm_jaas.conf'
storm.thrift.transport : 'org.apache.storm.security.auth.kerberos.KerberosSaslTransportPlugin'
```
`java.security.auth.login.config` can also be `/etc/storm/conf/storm_jaas.conf`, but the value above leads me to my next point. All of these values already exist in storm.yaml. The fact that they need to be specified again in the user's home is really strange. The error it gives is that the TGT found is not renewable, which is not what you'd expect.

I'm unsure if there are restrictions on where Ambari chooses to run the service check, so it's possible this would have to be set up on every node in the cluster where Storm lives. I'm also unsure if we can actually have Ambari automate this if it turns out to be necessary, since we aren't the Storm service.

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron (Incubating). Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides.

In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [x] Has your PR been rebased against the latest commit within the target branch (typically master)?
### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [x] Have you included steps or a guide to how the change may be verified and tested manually? - [ ] Have you ensured that the full suite of tests and checks have been executed in the root incubating-metron folder via: ``` mvn -q clean integration-test install && build_utils/verify_licenses.sh ``` - ~Have you written or updated unit tests and or integration tests to verify your changes?~ - ~If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF
[GitHub] incubator-metron issue #506: METRON-818: Ambari elasticsearch.properties tem...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/506 Thanks for running it up and verifying it.
[GitHub] incubator-metron issue #506: METRON-818: Ambari elasticsearch.properties tem...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/506 @JonZeolla did you have any issues spinning this up?
[GitHub] incubator-metron pull request #505: METRON-817: Customise output file path p...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/505#discussion_r109447598
--- Diff: metron-platform/metron-writer/src/main/java/org/apache/metron/writer/hdfs/HdfsWriter.java --- @@ -74,17 +91,43 @@ public BulkWriterResponse write(String sourceType ) throws Exception { BulkWriterResponse response = new BulkWriterResponse(); -SourceHandler handler = getSourceHandler(configurations.getIndex(sourceType)); +// Currently treating all the messages in a group for pass/failure. try { - handler.handle(messages); -} catch(Exception e) { + // Messages can all result in different HDFS paths, because of Stellar Expressions, so we'll need to iterate through + for(JSONObject message : messages) { +Map val = configurations.getSensorConfig(sourceType); +String path = getHdfsPathExtension( +sourceType, + (String)configurations.getSensorConfig(sourceType).getOrDefault(IndexingConfigurations.OUTPUT_PATH_FUNCTION_CONF, ""), +message +); +SourceHandler handler = getSourceHandler(sourceType, path); +handler.handle(message); + } +} catch (Exception e) { response.addAllErrors(e, tuples); } response.addAllSuccesses(tuples); return response; } + public String getHdfsPathExtension(String sourceType, String stellarFunction, JSONObject message) { +// If no function is provided, just use the sourceType directly +if(stellarFunction == null || stellarFunction.trim().isEmpty()) { + return sourceType; +} + +StellarCompiler.Expression expression = sourceTypeExpressionMap.computeIfAbsent(stellarFunction, s -> stellarProcessor.compile(stellarFunction)); +VariableResolver resolver = new MapVariableResolver(message); --- End diff --
@cestella Made that change. I did make the check `if(objResult != null && !(objResult instanceof String))`, to avoid falling into the IAE when objResult is null.
[GitHub] incubator-metron issue #506: METRON-818: Ambari elasticsearch.properties tem...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/506 Full dev spun up and ran fine, and I see items showing up in ES and HDFS.
[GitHub] incubator-metron pull request #505: METRON-817: Customise output file path p...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/505#discussion_r109438625 --- Diff: metron-platform/metron-writer/src/main/java/org/apache/metron/writer/hdfs/HdfsWriter.java --- @@ -74,17 +91,43 @@ public BulkWriterResponse write(String sourceType ) throws Exception { BulkWriterResponse response = new BulkWriterResponse(); -SourceHandler handler = getSourceHandler(configurations.getIndex(sourceType)); +// Currently treating all the messages in a group for pass/failure. try { - handler.handle(messages); -} catch(Exception e) { + // Messages can all result in different HDFS paths, because of Stellar Expressions, so we'll need to iterate through + for(JSONObject message : messages) { +Map val = configurations.getSensorConfig(sourceType); +String path = getHdfsPathExtension( +sourceType, + (String)configurations.getSensorConfig(sourceType).getOrDefault(IndexingConfigurations.OUTPUT_PATH_FUNCTION_CONF, ""), +message +); +SourceHandler handler = getSourceHandler(sourceType, path); +handler.handle(message); + } +} catch (Exception e) { response.addAllErrors(e, tuples); } response.addAllSuccesses(tuples); return response; } + public String getHdfsPathExtension(String sourceType, String stellarFunction, JSONObject message) { +// If no function is provided, just use the sourceType directly +if(stellarFunction == null || stellarFunction.trim().isEmpty()) { + return sourceType; +} + +StellarCompiler.Expression expression = sourceTypeExpressionMap.computeIfAbsent(stellarFunction, s -> stellarProcessor.compile(stellarFunction)); +VariableResolver resolver = new MapVariableResolver(message); --- End diff -- @cestella I'm mostly concerned about the performance of function compile on every single message that comes through indexing. If we keep the current approach, I would be interested in if there's a way to make things a little cleaner. 
In retrospect, I think this should be an LRU cache, so that we don't keep around a given parse forever. Any thoughts on that, assuming performance would be enough of a concern to not just use your proposal?
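The LRU idea raised above can be sketched briefly. This is an illustrative Python model of the technique only, not Metron's Java implementation; `LruCache`, `max_size`, and the `loader` callback (standing in for the expression compile step) are hypothetical names.

```python
from collections import OrderedDict


class LruCache:
    """Bounded cache that evicts the least recently used entry (illustrative only)."""

    def __init__(self, max_size, loader):
        self._max_size = max_size
        self._loader = loader          # stand-in for the expensive compile step
        self._entries = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key in self._entries:
            # Cache hit: mark the entry as most recently used.
            self._entries.move_to_end(key)
            return self._entries[key]
        # Cache miss: compute once, store, then evict the oldest if over capacity.
        value = self._loader(key)
        self._entries[key] = value
        if len(self._entries) > self._max_size:
            self._entries.popitem(last=False)
        return value
```

Compared with an unbounded map, this keeps the compile cost off the per-message path for expressions in active use while letting stale ones age out.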
[GitHub] incubator-metron pull request #505: METRON-817: Customise output file path p...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/505#discussion_r109432502 --- Diff: metron-platform/metron-writer/src/main/java/org/apache/metron/writer/hdfs/HdfsWriter.java --- @@ -74,17 +91,43 @@ public BulkWriterResponse write(String sourceType ) throws Exception { BulkWriterResponse response = new BulkWriterResponse(); -SourceHandler handler = getSourceHandler(configurations.getIndex(sourceType)); +// Currently treating all the messages in a group for pass/failure. try { - handler.handle(messages); -} catch(Exception e) { + // Messages can all result in different HDFS paths, because of Stellar Expressions, so we'll need to iterate through + for(JSONObject message : messages) { +Map val = configurations.getSensorConfig(sourceType); +String path = getHdfsPathExtension( +sourceType, + (String)configurations.getSensorConfig(sourceType).getOrDefault(IndexingConfigurations.OUTPUT_PATH_FUNCTION_CONF, ""), +message +); +SourceHandler handler = getSourceHandler(sourceType, path); +handler.handle(message); + } +} catch (Exception e) { response.addAllErrors(e, tuples); } response.addAllSuccesses(tuples); return response; } + public String getHdfsPathExtension(String sourceType, String stellarFunction, JSONObject message) { +// If no function is provided, just use the sourceType directly +if(stellarFunction == null || stellarFunction.trim().isEmpty()) { + return sourceType; +} + +StellarCompiler.Expression expression = sourceTypeExpressionMap.computeIfAbsent(stellarFunction, s -> stellarProcessor.compile(stellarFunction)); +VariableResolver resolver = new MapVariableResolver(message); --- End diff -- Unfortunately, I don't think we can, unless we want to do more work to actually look up the function and validate. On top of it, things like MAP_GET essentially return Object anyway, so we'd still want to check if it's a String afterwards. 
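The point above, that the result has to be type-checked after evaluation because some functions return an untyped object, can be sketched as follows. This is a Python illustration rather than the Java in the PR; `hdfs_path_extension` and the `evaluate` callback are hypothetical, Python's `TypeError` stands in for the exception, and falling back to the source type on an empty result is an assumption.

```python
def hdfs_path_extension(source_type, evaluate, message):
    """Pick the output path suffix for one message (illustrative sketch)."""
    # No path function configured: keep the original behaviour.
    if evaluate is None:
        return source_type
    result = evaluate(message)
    # The expression may return any object, so check the type after evaluation.
    # A missing result is not a type error and is handled separately.
    if result is not None and not isinstance(result, str):
        raise TypeError('path function must return a string, got %r' % (result,))
    # Assumption: an empty result falls back to the source type.
    return result if result is not None else source_type
```

For example, `hdfs_path_extension('bro', lambda m: 'ipsrc-' + m['ip_src_addr'], {'ip_src_addr': '192.168.66.1'})` yields `'ipsrc-192.168.66.1'`.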
[GitHub] incubator-metron issue #506: METRON-818: Ambari elasticsearch.properties tem...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/506 @mmiklavc We probably should edit the Solr config. That isn't carried through Ambari, so we don't have the same concern as here. However, it does look like `storm.auto.credentials=[]` got added to solr.properties and I thought it wasn't necessary there. Can we just drop that config and add the 'topology.worker.childopts='? Should I just go ahead and make that change and add it to this PR? And if I do, do we have any testing plan for Solr or are we just making best effort fixes?
[GitHub] incubator-metron pull request #506: METRON-818: Ambari elasticsearch.propert...
GitHub user justinleet opened a pull request: https://github.com/apache/incubator-metron/pull/506 METRON-818: Ambari elasticsearch.properties template is missing topology.worker.childopts ## Contributor Comments Adding the empty config to the Ambari elasticsearch.properties template. To test, spin up in a dev environment. Indexing topology should produce results instead of an error in the logs now. I'm still running this up in dev, but wanted to let people see what's going on. will update shortly. As a workaround, just add this line to a running Ambari instance and restart indexing in Ambari to push the configs. ## Pull Request Checklist Thank you for submitting a contribution to Apache Metron (Incubating). Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following: ### For all changes: - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [x] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [x] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [x] Have you included steps or a guide to how the change may be verified and tested manually? 
- [ ] Have you ensured that the full suite of tests and checks has been executed in the root incubating-metron folder via:
```
mvn -q clean integration-test install && build_utils/verify_licenses.sh
```
- ~Have you written or updated unit tests and or integration tests to verify your changes?~
- ~If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?~
- [ ] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:
- ~Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and then verify changes~

Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/justinleet/incubator-metron ambari_config_fix

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-metron/pull/506.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #506

commit d1fd6cf59433747a3cca503df552fcd0f003f488 Author: justinjleet Date: 2017-04-03T13:34:50Z Adding elasticsearch.properties empty config
[GitHub] incubator-metron pull request #505: METRON-817: Customise output file path p...
GitHub user justinleet opened a pull request: https://github.com/apache/incubator-metron/pull/505 METRON-817: Customise output file path patterns for HDFS indexing ## Contributor Comments Primarily this affects HdfsWriter by changing the output path from a set path (`/apps/metron/.../`), and allow it to be defined via a Stellar Function. Specifically, the base path is still defined the same (The `/apps/metron/.../` portion), but the `` portion is dropped and can now be defined by a Stellar function. By default, the original behavior of `` is used. This is defined in the `.json` file as indicated in the new README.md for metron-writer. ### Notes - This requires adding tracking things a bit more carefully (and if you're reviewing, please validate that it happens correctly). When the outputFile is closed, we remove the sourceHandler from HdfsWriter's map. - I'm slightly concerned about the correctness of the implementation, but it seems necessary to ensure that we don't leave a bunch of SourceHandlers lying around as data changes (and we don't want an enormous number of output files being written to). - If there's a cleaner way to manage this, I'd love to hear it and can refactor pretty easily. It throws off the rotation count (because we kill the SourceHandler from the map itself), but I doubt we care about that since it really only shows up in the output filename anyway. - This also adds an argument for max open files. This is a flux level config. I defaulted this to 500. 500 was chosen because it was an arbitrary round number that wasn't enormous. - If someone has a default with any real reasoning behind it, I'll go ahead and change it. - In HdfsWriter, we iterate through the messages, apply the Stellar function and then call the relevant handler. The entire group of message is treated as one single pass/fail (which is the same as the old behavior), rather than individually. 
The try/catch could potentially be moved into the for loop, but I don't think there's an explicit link between the message and the tuples that we can exploit to fail per message. I don't think it needs to be addressed here, but I'm curious if there's thought on this. ### Testing Unit tests are added to pretty much cover HdfsWriter, and this can be spun up in a dev environment. To test in dev - Spin up a dev environment - Validate that the output matches the old format in HDFS (Nothing has an output function defined) ``` [hdfs@node1 vagrant]$ hdfs dfs -ls /apps/metron/indexing/indexed/ Found 3 items drwxrwxr-x - storm hadoop 0 2017-04-03 13:11 /apps/metron/indexing/indexed/bro drwxrwxr-x - storm hadoop 0 2017-04-03 13:11 /apps/metron/indexing/indexed/error drwxrwxr-x - storm hadoop 0 2017-04-03 13:11 /apps/metron/indexing/indexed/snort ``` - Edit the indexing config for Bro to include an outputPathFunction in the hdfs section, e.g. in `/usr/metron/0.3.1/config/zookeeper/indexing/bro.json` ``` { "hdfs" : { "index": "bro", "batchSize": 5, "enabled" : true, "outputPathFunction": "FORMAT('ipsrc-%s', ip_src_addr)" }, "elasticsearch" : { "index": "bro", "batchSize": 5, "enabled" : true }, "solr" : { "index": "bro", "batchSize": 5, "enabled" : true } } ``` - Push the config configs to ZooKeeper: `/usr/metron/0.3.1/bin/zk_load_configs.sh -z node1:2181 -m PUSH -i /usr/metron/0.3.1/config/zookeeper/` - Let some more data run through and check the output folders, e.g. 
``` [hdfs@node1 vagrant]$ hdfs dfs -ls /apps/metron/indexing/indexed/ Found 5 items drwxrwxr-x - storm hadoop 0 2017-04-03 13:11 /apps/metron/indexing/indexed/bro drwxrwxr-x - storm hadoop 0 2017-04-03 13:11 /apps/metron/indexing/indexed/error drwxrwxr-x - storm hadoop 0 2017-04-03 13:14 /apps/metron/indexing/indexed/ipsrc-192.168.138.158 drwxrwxr-x - storm hadoop 0 2017-04-03 13:14 /apps/metron/indexing/indexed/ipsrc-192.168.66.1 drwxrwxr-x - storm hadoop 0 2017-04-03 13:11 /apps/metron/indexing/indexed/snort [hdfs@node1 vagrant]$ hdfs dfs -ls /apps/metron/indexing/indexed/ipsrc-192.168.138.158 Found 1 items -rw-r--r-- 1 storm hadoop 223182 2017-04-03 13:14 /apps/metron/indexing/indexed/ipsrc-192.168.138.158/enrichment-null-0-0-1491225291377.json ``` ## Pull Request Checklist Thank you for submitting a contributio
[GitHub] incubator-metron issue #486: METRON-793: Migrate to storm-kafka-client kafka...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/486 @cestella My +1 stands with the testing issues ironed out. Thanks for looking into it.
[GitHub] incubator-metron issue #488: METRON-796: Mpack uses wrong group for owning H...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/488 https://issues.apache.org/jira/browse/METRON-349 is updated to be more complete and reflect the current state of things.
[GitHub] incubator-metron issue #488: METRON-796: Mpack uses wrong group for owning H...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/488 @dlyle65535 Failure mode is that HDFS writes from Storm fail. The directories are owned by metron:metron with 775. Storm isn't in the metron group, so it fails to write. The perms exception is thrown in the bolt and no output file is created. Writes to ES work as expected. Validating this is pretty easy, just run up full dev and see if the perms error shows up and if HDFS gets any files.
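The failure mode described above follows from ordinary owner/group/other permission bits. A simplified sketch of that check is below; the function `can_write` is hypothetical, it models only the basic POSIX-style rules (no ACLs or superuser), and the storm user's membership in the `hadoop` group is an assumption made for the example.

```python
import stat


def can_write(mode, owner, group, user, user_groups):
    """Simplified owner/group/other write check (illustrative model only)."""
    if user == owner:
        return bool(mode & stat.S_IWUSR)
    if group in user_groups:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# Directory owned by metron:metron with 775: storm matches neither the owner
# nor the group, so it falls through to "other", which has no write bit.
storm_can_write_metron_dir = can_write(0o775, 'metron', 'metron', 'storm', {'hadoop'})

# Same mode, but group ownership given to a group storm is assumed to belong to.
storm_can_write_hadoop_dir = can_write(0o775, 'metron', 'hadoop', 'storm', {'hadoop'})
```

Under this model the first check is false and the second is true, which matches the perms exception in the bolt and the group change made in the PR.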
[GitHub] incubator-metron issue #488: METRON-796: Mpack uses wrong group for owning H...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/488 @mattf-horton Permissions on the other items are generally 775, so they can be read as needed (and should be scaled back once we have everything lined up with the user as @simonellistonball mentions). I just wanted to touch things as little as possible to get them back in a working state. We should either expand out or replace METRON-349 (including closing off permissions) as the actual solution to the problem.
[GitHub] incubator-metron pull request #488: METRON-796: Mpack uses wrong group for o...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/488#discussion_r107965896 --- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/scripts/params/params_linux.py --- @@ -39,7 +39,7 @@ tmp_dir = Script.get_tmp_dir() hostname = config['hostname'] -metron_group = config['configurations']['cluster-env']['metron_group'] +hadoop_group = config['configurations']['cluster-env']['user_group'] --- End diff -- Updated with a comment to clarify things a bit. Let me know if you think there's anything else we want to add or clarify. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-metron issue #488: METRON-796: Mpack uses wrong group for owning H...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/488 For anybody looking, Matt's review comment is still relevant to discussion, but unfortunately hidden by GitHub thinking it's outdated after the last commit.
[GitHub] incubator-metron pull request #488: METRON-796: Mpack uses wrong group for o...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/488#discussion_r107961358
--- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/scripts/params/params_linux.py ---
@@ -39,7 +39,7 @@ tmp_dir = Script.get_tmp_dir()
 hostname = config['hostname']
-metron_group = config['configurations']['cluster-env']['metron_group']
+hadoop_group = config['configurations']['cluster-env']['user_group']
--- End diff --
I'll go ahead and move the config.

On the group issue, that is the group named 'hadoop'. The cluster level config is named 'user_group'; I have absolutely no idea why. I only called it 'hadoop_group' here so that it was more obvious it shouldn't be killed in the future. If there are objections to calling it 'hadoop_group', I could also carry it through as user_group and add a comment about the meaning in the params file.

For example, in HDP stack https://github.com/apache/ambari/blob/trunk/ambari-server/src/main/resources/stacks/HDP/2.0.6/configuration/cluster-env.xml#L158
```
<property>
  <name>user_group</name>
  <display-name>Hadoop Group</display-name>
  <value>hadoop</value>
  <property-type>GROUP</property-type>
  <description>Hadoop user group.</description>
</property>
```
This declaration carried through a couple other stack definitions that I looked at. The use of this group also seems fairly common, e.g. in https://github.com/apache/ambari/blob/trunk/ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs.py#L60
```
if "hadoop-policy" in params.config['configurations']:
  XmlConfig("hadoop-policy.xml",
            conf_dir=params.hadoop_conf_dir,
            configurations=params.config['configurations']['hadoop-policy'],
            configuration_attributes=params.config['configuration_attributes']['hadoop-policy'],
            owner=params.hdfs_user,
            group=params.user_group
  )
```
[GitHub] incubator-metron issue #486: METRON-793: Migrate to storm-kafka-client kafka...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/486 @cestella Just noticed Travis after I commented. I'm moderately surprised that the most recent PR would break it, do you know what the issue is?
[GitHub] incubator-metron issue #486: METRON-793: Migrate to storm-kafka-client kafka...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/486
+1, was able to follow Mike's instructions, with a couple caveats.
- Group authorization command was missing
```
/usr/hdp/current/kafka-broker/bin/kafka-acls.sh --authorizer kafka.security.auth.SimpleAclAuthorizer --authorizer-properties zookeeper.connect=node1:2181 --add --allow-principal User:storm-metron_cluster --allow-principal User:justin --group jsonMap_parser
```
- Topic authorization command on the enrichments topic side was missing.
```
/usr/hdp/current/kafka-broker/bin/kafka-acls.sh --authorizer kafka.security.auth.SimpleAclAuthorizer --authorizer-properties zookeeper.connect=node1:2181 --add --allow-principal User:storm-metron_cluster --allow-principal User:justin --topic enrichments
```
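The two ACL commands above differ only in the resource being authorized, so they can be generated from one helper. This is a minimal sketch; the function name `kafka_acl_command` is hypothetical, and it uses only the path, flags, and default ZooKeeper address that appear in the commands quoted in this comment.

```python
KAFKA_ACLS = '/usr/hdp/current/kafka-broker/bin/kafka-acls.sh'


def kafka_acl_command(resource_flag, resource_name, principals, zookeeper='node1:2181'):
    """Build the kafka-acls.sh argument list for a consumer group or a topic."""
    if resource_flag not in ('--group', '--topic'):
        raise ValueError('resource_flag must be --group or --topic')
    cmd = [
        KAFKA_ACLS,
        '--authorizer', 'kafka.security.auth.SimpleAclAuthorizer',
        '--authorizer-properties', 'zookeeper.connect=' + zookeeper,
        '--add',
    ]
    # One --allow-principal flag per user, as in the commands above.
    for principal in principals:
        cmd += ['--allow-principal', 'User:' + principal]
    cmd += [resource_flag, resource_name]
    return cmd


users = ['storm-metron_cluster', 'justin']
group_cmd = kafka_acl_command('--group', 'jsonMap_parser', users)
topic_cmd = kafka_acl_command('--topic', 'enrichments', users)
```

Joining either list with spaces reproduces the corresponding command line; the list form can also be passed directly to `subprocess.run` on a host where the script exists.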
[GitHub] incubator-metron pull request #486: METRON-793: Migrate to storm-kafka-clien...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/486#discussion_r107913632

--- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/topology/ParserTopologyBuilder.java ---
@@ -106,19 +105,22 @@ public static TopologyBuilder build(String zookeeperUrl,
   /**
    * Create a spout that consumes tuples from a Kafka topic.
    *
-   * @param zookeeperUrl Zookeeper URL
+   * @param zkQuorum Zookeeper URL
    * @param sensorType Type of sensor
-   * @param offset Kafka topic offset where the topology will start; BEGINNING, END, WHERE_I_LEFT_OFF
-   * @param kafkaSpoutConfigOptions Configuration options for the kafka spout
+   * @param kafkaConfigOptional Configuration options for the kafka spout
    * @param parserConfig Configuration for the parser
    * @return
    */
-  private static KafkaSpout createKafkaSpout(String zookeeperUrl, String sensorType, SpoutConfig.Offset offset, EnumMap kafkaSpoutConfigOptions, SensorParserConfig parserConfig) {
-
+  private static StormKafkaSpout createKafkaSpout(String zkQuorum, String sensorType, Optional> kafkaConfigOptional, SensorParserConfig parserConfig) {
--- End diff --

StormKafkaSpout's return type here will actually be StormKafkaSpout, right? Can we make that explicit, rather than untyped (and also drop the Object.class from the creates assuming the other typing change)?
[GitHub] incubator-metron pull request #486: METRON-793: Migrate to storm-kafka-clien...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/486#discussion_r107912514 --- Diff: metron-platform/metron-storm-kafka/src/main/java/org/apache/metron/storm/kafka/flux/SimpleStormKafkaBuilder.java --- @@ -0,0 +1,234 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.metron.storm.kafka.flux; + +import com.google.common.base.Joiner; +import org.apache.kafka.clients.consumer.Consumer; +import org.apache.kafka.clients.consumer.ConsumerRecord; +import org.apache.kafka.common.serialization.ByteArrayDeserializer; +import org.apache.metron.common.utils.KafkaUtils; +import org.apache.storm.kafka.spout.*; +import org.apache.storm.spout.SpoutOutputCollector; +import org.apache.storm.topology.OutputFieldsDeclarer; +import org.apache.storm.topology.OutputFieldsGetter; +import org.apache.storm.tuple.Fields; +import org.apache.storm.tuple.Values; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; +import java.util.Map; +import java.util.function.Function; + +/** + * This is a convenience layer on top of the KafkaSpoutConfig.Builder available in storm-kafka-client. 
+ * The justification for this class is two-fold. First, there are a lot of moving parts and a simplified + * approach to constructing spouts is useful. Secondly, and perhaps more importantly, the Builder pattern + * is decidedly unfriendly to use inside of Flux. Finally, we can make things a bit more friendly by only requiring + * zookeeper and automatically figuring out the brokers for the bootstrap server. + * + * @param The kafka key type + * @param The kafka value type + */ +public class SimpleStormKafkaBuilder extends KafkaSpoutConfig.Builder { + final static String STREAM = "default"; + + /** + * The fields exposed by the kafka consumer. These will show up in the Storm tuple. + */ + public enum FieldsConfiguration { +KEY("key", record -> record.key()), +VALUE("value", record -> record.value()), +PARTITION("partition", record -> record.partition()), +TOPIC("topic", record -> record.topic()) +; +String fieldName; +Function recordExtractor; + +FieldsConfiguration(String fieldName, Function recordExtractor) { + this.recordExtractor = recordExtractor; + this.fieldName = fieldName; +} + +/** + * Return a list of the enums + * @param configs + * @return + */ +public static List toList(String... configs) { + List ret = new ArrayList<>(); + for(String config : configs) { +ret.add(FieldsConfiguration.valueOf(config.toUpperCase())); + } + return ret; +} + +/** + * Return a list of the enums from their string representation. + * @param configs + * @return + */ +public static List toList(List configs) { + List ret = new ArrayList<>(); + for(String config : configs) { +ret.add(FieldsConfiguration.valueOf(config.toUpperCase())); + } + return ret; +} + +/** + * Construct a Fields object from an iterable of enums. These fields are the fields + * exposed in the Storm tuple emitted from the spout. 
+ * @param configs + * @return + */ +public static Fields getFields(Iterable configs) { + List fields = new ArrayList<>(); + for(FieldsConfiguration config : configs) { +fields.add(config.fieldName); + } + return new Fields(fields); +} + } + + /** + * Build a tuple given the fields and the topic. We want to use our FieldsConfiguration enum + * to define what this tuple looks like. + * @param T
[GitHub] incubator-metron issue #488: METRON-796: Mpack uses wrong group for owning H...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/488 I agree the topologies should run as the metron user, but this is just to get things back in a working state again (and it already used to be this way, so this isn't opening things up more than it was a couple weeks ago). I actually thought there was a separate Jira for running as the Metron user, but the one I was thinking of is https://issues.apache.org/jira/browse/METRON-349. The ticket should really be to consolidate everything under the metron user with appropriate ownership. I don't have a preference between updating that ticket and closing it in favor of a new one (and I'm not sure which the community would prefer).
[GitHub] incubator-metron pull request #488: METRON-796: Mpack uses wrong group for o...
GitHub user justinleet opened a pull request: https://github.com/apache/incubator-metron/pull/488

METRON-796: Mpack uses wrong group for owning HDFS directories

## Contributor Comments
Reverts the group owner of a couple HDFS directories to be the hadoop group, rather than the metron group (whose only member is metron). Right now, the topologies run as the storm user (which belongs to the hadoop group), and therefore don't have permission to write to HDFS (including in quick and full dev).

This sets HDFS ownership to metron:hadoop, so the storm user can write through its hadoop group membership. Other items, such as configs and installation files, were just left as the metron group.

To test, just run up a dev environment and ensure files are being written and ownership makes sense (/apps/metron/indexing/indexed is metron:hadoop with 775 perms). The individual sensors will be owned by storm:hadoop (proving that writes work). For example:
```
[vagrant@node1 ~]$ hdfs dfs -ls /apps/metron/indexing
Found 1 items
drwxrwxr-x   - metron hadoop          0 2017-03-24 12:57 /apps/metron/indexing/indexed
[vagrant@node1 ~]$ hdfs dfs -ls /apps/metron/indexing/indexed
Found 3 items
drwxrwxr-x   - storm hadoop          0 2017-03-24 13:01 /apps/metron/indexing/indexed/bro
drwxrwxr-x   - storm hadoop          0 2017-03-24 13:01 /apps/metron/indexing/indexed/error
drwxrwxr-x   - storm hadoop          0 2017-03-24 13:01 /apps/metron/indexing/indexed/snort
[vagrant@node1 ~]$ hdfs dfs -ls /apps/metron/indexing/indexed/bro
Found 1 items
-rw-r--r--   1 storm hadoop     211393 2017-03-24 13:01 /apps/metron/indexing/indexed/bro/enrichment-null-0-0-1490360489968.json
```

As a note, metron_group existed twice in params_linux.py, so only the first instance is changed to hadoop_group and pulled appropriately. The second is left as-is.

## Pull Request Checklist
Thank you for submitting a contribution to Apache Metron (Incubating).
Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions.
Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides.

In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [x] Has your PR been rebased against the latest commit within the target branch (typically master)?

### For code changes:
- [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been executed in the root incubating-metron folder via:
```
mvn -q clean integration-test install && build_utils/verify_licenses.sh
```
- [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/justinleet/incubator-metron METRON-796

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-metron/pull/488.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #488

commit ec7d070524334603e1712b4649e5600ded450284
Author: justinjleet
Date: 2017-03-24T00:55:23Z

    Updating group perms
[GitHub] incubator-metron pull request #486: METRON-793: Migrate to storm-kafka-clien...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/486#discussion_r107683392 --- Diff: metron-platform/metron-pcap-backend/src/main/java/org/apache/metron/spout/pcap/HDFSWriterCallback.java --- @@ -96,16 +109,18 @@ public HDFSWriterCallback withConfig(HDFSWriterConfig config) { this.config = config; return this; } + @Override public List apply(List tuple, EmitContext context) { - -List keyValue = (List) tuple.get(0); -LongWritable ts = (LongWritable) keyValue.get(0); -BytesWritable rawPacket = (BytesWritable)keyValue.get(1); +byte[] key = (byte[]) tuple.get(0); +byte[] value = (byte[]) tuple.get(1); +if(!config.getDeserializer().deserializeKeyValue(key, value, KeyValue.key.get(), KeyValue.value.get())) { +LOG.debug("Dropping malformed packet..."); --- End diff -- I'm good with that. I had mostly discarded the worry about size because this is in debugging statements anyway and typically with storm you're setting reasonable timeouts on logging levels anyway.
[GitHub] incubator-metron issue #487: METRON-792: Quick Dev should remove/replace RPM...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/487 I'm +1 by inspection
[GitHub] incubator-metron issue #486: METRON-793: Migrate to storm-kafka-client kafka...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/486 @cestella I'm good with keeping the extension points especially after the points you made. I think the TODOs are valuable, I just wanted to know the thought behind potentially building it out. Given the API instability, unfortunately it seems like our dependencies aren't going to provide that insulation layer. I'd rather have that be provided in a stable manner upstream from us, but that's not something we have any control over.
[GitHub] incubator-metron pull request #486: METRON-793: Migrate to storm-kafka-clien...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/486#discussion_r107438798

--- Diff: metron-platform/metron-parsers/src/test/java/org/apache/metron/parsers/integration/components/ParserTopologyComponent.java ---
@@ -97,6 +99,19 @@ public void start() throws UnableToStartException {
   public void stop() {
     if(stormCluster != null) {
       stormCluster.shutdown();
+      if(new File("logs/workers-artifacts").exists()) {
+        Path rootPath = Paths.get("logs");
+        Path destPath = Paths.get("target/logs");
+        try {
+          Files.move(rootPath, destPath);
+          Files.walk(destPath)
--- End diff --

Same deal with FileUtils.deleteDirectory() here
[GitHub] incubator-metron pull request #486: METRON-793: Migrate to storm-kafka-clien...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/486#discussion_r107522830 --- Diff: metron-platform/metron-pcap-backend/src/main/java/org/apache/metron/spout/pcap/HDFSWriterCallback.java --- @@ -96,16 +109,18 @@ public HDFSWriterCallback withConfig(HDFSWriterConfig config) { this.config = config; return this; } + @Override public List apply(List tuple, EmitContext context) { - -List keyValue = (List) tuple.get(0); -LongWritable ts = (LongWritable) keyValue.get(0); -BytesWritable rawPacket = (BytesWritable)keyValue.get(1); +byte[] key = (byte[]) tuple.get(0); +byte[] value = (byte[]) tuple.get(1); +if(!config.getDeserializer().deserializeKeyValue(key, value, KeyValue.key.get(), KeyValue.value.get())) { +LOG.debug("Dropping malformed packet..."); --- End diff -- Is it reasonable to include the key and value we're having issues with in the debug statement?
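One way to include the key and value in the debug statement without flooding the log is to print only a bounded prefix of each byte array. The following is a minimal sketch under that assumption; the `PacketDebug` class and its `describe` method are hypothetical names for illustration and are not part of Metron.

```java
class PacketDebug {
  /** Renders at most maxBytes of data as hex, followed by the total length. */
  static String describe(byte[] data, int maxBytes) {
    if (data == null) {
      return "null";
    }
    StringBuilder sb = new StringBuilder();
    int shown = Math.min(data.length, maxBytes);
    for (int i = 0; i < shown; i++) {
      // Mask to an int so negative bytes print as two hex digits.
      sb.append(String.format("%02x", data[i] & 0xff));
    }
    if (shown < data.length) {
      sb.append("...");
    }
    return sb.append(" (").append(data.length).append(" bytes)").toString();
  }

  public static void main(String[] args) {
    byte[] key = {0x00, 0x01, (byte) 0xff};
    // Stands in for a LOG.debug(...) call in the real callback.
    System.out.println("Dropping malformed packet: key=" + describe(key, 2));
  }
}
```

With a real logger, the call would normally sit behind a debug-level check so the formatting cost is only paid when debug logging is on.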
[GitHub] incubator-metron pull request #486: METRON-793: Migrate to storm-kafka-clien...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/486#discussion_r107524059 --- Diff: metron-platform/metron-pcap-backend/src/main/java/org/apache/metron/spout/pcap/deserializer/Deserializers.java --- @@ -0,0 +1,59 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.metron.spout.pcap.deserializer; + +import org.apache.metron.common.utils.timestamp.TimestampConverters; +import org.apache.metron.common.utils.timestamp.TimestampConverter; + +import java.util.function.Function; + +/** + * Deserializers take the raw bytes from kafka key and value and construct the timestamp and raw bytes for PCAP. + */ +public enum Deserializers { + /** + * Extract the timestamp from the key and the raw packet (global-headerless) from the value + */ + FROM_KEY( converter -> new FromKeyDeserializer(converter)) + /** + * Ignore the key and pull the timestamp directly from the packet itself. Also, assume that the packet isn't global-headerless. 
+ */ + ,FROM_PACKET(converter -> new FromPacketDeserializer()); + ; + Function creator; + Deserializers(Function creator) + { +this.creator = creator; + } + + public static KeyValueDeserializer create(String scheme, TimestampConverter converter) { +try { + Deserializers ts = Deserializers.valueOf(scheme.toUpperCase()); + return ts.creator.apply(converter); +} +catch(IllegalArgumentException iae) { + return Deserializers.FROM_KEY.creator.apply(converter); +} + } + + public static KeyValueDeserializer create(String scheme, String converter) { +return create(scheme, TimestampConverters.valueOf(converter.toUpperCase())); --- End diff -- Shouldn't this be a call to TimestampConverters.getConverter()? And if we're uppercasing things, shouldn't it be in the TimestampConverter (given that it's meant to match an Enum value)
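The point about where the upper-casing belongs can be illustrated with a small, self-contained sketch: the enum owns a case-insensitive lookup, so callers pass the raw configuration string and never normalize it themselves. The `TimeUnits` enum, its constants, and `fromName` are hypothetical stand-ins here; they are not Metron's `TimestampConverters`, whose actual `getConverter()` signature is not shown in this thread.

```java
import java.util.Locale;
import java.util.function.LongUnaryOperator;

// Hypothetical enum used only to illustrate the lookup pattern.
enum TimeUnits {
  MILLISECONDS(ts -> ts * 1_000_000L),
  MICROSECONDS(ts -> ts * 1_000L),
  NANOSECONDS(ts -> ts);

  private final LongUnaryOperator toNanos;

  TimeUnits(LongUnaryOperator toNanos) {
    this.toNanos = toNanos;
  }

  long toNanoseconds(long ts) {
    return toNanos.applyAsLong(ts);
  }

  /** Case-insensitive lookup; throws IllegalArgumentException for unknown names. */
  static TimeUnits fromName(String name) {
    return valueOf(name.trim().toUpperCase(Locale.ROOT));
  }
}

class TimeUnitsDemo {
  public static void main(String[] args) {
    // The caller hands over the configured string as-is.
    System.out.println(TimeUnits.fromName("microseconds").toNanoseconds(5L));
  }
}
```

Keeping the normalization in one place means every call site resolves names the same way, which is the concern raised in the comment above.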
[GitHub] incubator-metron pull request #486: METRON-793: Migrate to storm-kafka-clien...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/486#discussion_r107521465

--- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/topology/ParserTopologyBuilder.java ---
@@ -106,19 +105,26 @@ public static TopologyBuilder build(String zookeeperUrl,
   /**
    * Create a spout that consumes tuples from a Kafka topic.
    *
-   * @param zookeeperUrl Zookeeper URL
+   * @param zkQuorum Zookeeper URL
    * @param sensorType Type of sensor
-   * @param offset Kafka topic offset where the topology will start; BEGINNING, END, WHERE_I_LEFT_OFF
-   * @param kafkaSpoutConfigOptions Configuration options for the kafka spout
+   * @param kafkaConfigOptional Configuration options for the kafka spout
    * @param parserConfig Configuration for the parser
    * @return
    */
-  private static KafkaSpout createKafkaSpout(String zookeeperUrl, String sensorType, SpoutConfig.Offset offset, EnumMap kafkaSpoutConfigOptions, SensorParserConfig parserConfig) {
-
+  private static StormKafkaSpout createKafkaSpout(String zkQuorum, String sensorType, Optional> kafkaConfigOptional, SensorParserConfig parserConfig) {
+    Map kafkaSpoutConfigOptions = kafkaConfigOptional.orElse(new HashMap<>());
     String inputTopic = parserConfig.getSensorTopic() != null ? parserConfig.getSensorTopic() : sensorType;
-    SpoutConfig spoutConfig = new SpoutConfig(new ZkHosts(zookeeperUrl), inputTopic, "", inputTopic).from(offset);
-    SpoutConfigOptions.configure(spoutConfig, kafkaSpoutConfigOptions);
-    return new KafkaSpout(spoutConfig);
+    if(!kafkaSpoutConfigOptions.containsKey(SpoutConfiguration.FIRST_POLL_OFFSET_STRATEGY.key)) {
--- End diff --

Can you make these putIfAbsent() calls? e.g.
```
kafkaSpoutConfigOptions.putIfAbsent(SpoutConfiguration.FIRST_POLL_OFFSET_STRATEGY.key, KafkaSpoutConfig.FirstPollOffsetStrategy.UNCOMMITTED_EARLIEST.toString());
```
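For reference, the `putIfAbsent` suggestion can be tried in isolation. This is a minimal sketch, assuming a plain `HashMap` of options; the `SpoutDefaults` class and the literal key and default strings are hypothetical placeholders for `SpoutConfiguration.FIRST_POLL_OFFSET_STRATEGY.key` and the `UNCOMMITTED_EARLIEST` strategy name.

```java
import java.util.HashMap;
import java.util.Map;

class SpoutDefaults {
  // Hypothetical placeholders for the real configuration key and default value.
  static final String OFFSET_STRATEGY_KEY = "spout.firstPollOffsetStrategy";
  static final String DEFAULT_OFFSET_STRATEGY = "UNCOMMITTED_EARLIEST";

  /** Returns a copy of the options with the default filled in only when no value was supplied. */
  static Map<String, Object> withDefaults(Map<String, Object> options) {
    Map<String, Object> result = new HashMap<>(options);
    // Replaces the containsKey check followed by put.
    result.putIfAbsent(OFFSET_STRATEGY_KEY, DEFAULT_OFFSET_STRATEGY);
    return result;
  }

  public static void main(String[] args) {
    Map<String, Object> userSupplied = new HashMap<>();
    userSupplied.put(OFFSET_STRATEGY_KEY, "LATEST");
    System.out.println(withDefaults(userSupplied));    // the user's value is kept
    System.out.println(withDefaults(new HashMap<>())); // the default is filled in
  }
}
```

One behavioral difference from the `containsKey` form: `putIfAbsent` also writes the default when the key is present but mapped to `null`.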
[GitHub] incubator-metron pull request #486: METRON-793: Migrate to storm-kafka-clien...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/486#discussion_r107534534 --- Diff: pom.xml --- @@ -67,20 +67,44 @@ 1.0.1 1.0.1 -0.10.0.1 +0.10.0 2.7.1 1.1.1 -1.8.0 1.5.2 +1.8.0 4.5 3.7 2.7.1 3.3 -${base_storm_version} +1.0.3 + + 1.0.1.2.5.0.0-1245 --- End diff -- I'm sure you've already thought of this, but assuming we do go with this, please make sure this gets a JIRA associated with it.
[GitHub] incubator-metron pull request #486: METRON-793: Migrate to storm-kafka-clien...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/486#discussion_r107531858 --- Diff: metron-platform/metron-storm-kafka/src/main/java/org/apache/metron/storm/kafka/flux/SimpleStormKafkaBuilder.java --- @@ -0,0 +1,234 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.metron.storm.kafka.flux; + +import com.google.common.base.Joiner; +import org.apache.kafka.clients.consumer.Consumer; +import org.apache.kafka.clients.consumer.ConsumerRecord; +import org.apache.kafka.common.serialization.ByteArrayDeserializer; +import org.apache.metron.common.utils.KafkaUtils; +import org.apache.storm.kafka.spout.*; +import org.apache.storm.spout.SpoutOutputCollector; +import org.apache.storm.topology.OutputFieldsDeclarer; +import org.apache.storm.topology.OutputFieldsGetter; +import org.apache.storm.tuple.Fields; +import org.apache.storm.tuple.Values; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; +import java.util.Map; +import java.util.function.Function; + +/** + * This is a convenience layer on top of the KafkaSpoutConfig.Builder available in storm-kafka-client. 
+ * The justification for this class is two-fold. First, there are a lot of moving parts and a simplified + * approach to constructing spouts is useful. Secondly, and perhaps more importantly, the Builder pattern + * is decidedly unfriendly to use inside of Flux. Finally, we can make things a bit more friendly by only requiring + * zookeeper and automatically figuring out the brokers for the bootstrap server. + * + * @param The kafka key type + * @param The kafka value type + */ +public class SimpleStormKafkaBuilder extends KafkaSpoutConfig.Builder { + final static String STREAM = "default"; + + /** + * The fields exposed by the kafka consumer. These will show up in the Storm tuple. + */ + public enum FieldsConfiguration { +KEY("key", record -> record.key()), +VALUE("value", record -> record.value()), +PARTITION("partition", record -> record.partition()), +TOPIC("topic", record -> record.topic()) +; +String fieldName; +Function recordExtractor; + +FieldsConfiguration(String fieldName, Function recordExtractor) { + this.recordExtractor = recordExtractor; + this.fieldName = fieldName; +} + +/** + * Return a list of the enums + * @param configs + * @return + */ +public static List toList(String... configs) { + List ret = new ArrayList<>(); + for(String config : configs) { +ret.add(FieldsConfiguration.valueOf(config.toUpperCase())); + } + return ret; +} + +/** + * Return a list of the enums from their string representation. + * @param configs + * @return + */ +public static List toList(List configs) { + List ret = new ArrayList<>(); + for(String config : configs) { +ret.add(FieldsConfiguration.valueOf(config.toUpperCase())); + } + return ret; +} + +/** + * Construct a Fields object from an iterable of enums. These fields are the fields + * exposed in the Storm tuple emitted from the spout. 
+ * @param configs + * @return + */ +public static Fields getFields(Iterable configs) { + List fields = new ArrayList<>(); + for(FieldsConfiguration config : configs) { +fields.add(config.fieldName); + } + return new Fields(fields); +} + } + + /** + * Build a tuple given the fields and the topic. We want to use our FieldsConfiguration enum + * to define what this tuple looks like. + * @param T
[GitHub] incubator-metron pull request #486: METRON-793: Migrate to storm-kafka-clien...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/486#discussion_r107533535 --- Diff: metron-platform/metron-storm-kafka/src/main/java/org/apache/metron/storm/kafka/flux/SimpleStormKafkaBuilder.java --- @@ -0,0 +1,234 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.metron.storm.kafka.flux; + +import com.google.common.base.Joiner; +import org.apache.kafka.clients.consumer.Consumer; +import org.apache.kafka.clients.consumer.ConsumerRecord; +import org.apache.kafka.common.serialization.ByteArrayDeserializer; +import org.apache.metron.common.utils.KafkaUtils; +import org.apache.storm.kafka.spout.*; +import org.apache.storm.spout.SpoutOutputCollector; +import org.apache.storm.topology.OutputFieldsDeclarer; +import org.apache.storm.topology.OutputFieldsGetter; +import org.apache.storm.tuple.Fields; +import org.apache.storm.tuple.Values; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; +import java.util.Map; +import java.util.function.Function; + +/** + * This is a convenience layer on top of the KafkaSpoutConfig.Builder available in storm-kafka-client. 
+ * The justification for this class is two-fold. First, there are a lot of moving parts and a simplified + * approach to constructing spouts is useful. Secondly, and perhaps more importantly, the Builder pattern + * is decidedly unfriendly to use inside of Flux. Finally, we can make things a bit more friendly by only requiring + * zookeeper and automatically figuring out the brokers for the bootstrap server. + * + * @param The kafka key type + * @param The kafka value type + */ +public class SimpleStormKafkaBuilder extends KafkaSpoutConfig.Builder { + final static String STREAM = "default"; + + /** + * The fields exposed by the kafka consumer. These will show up in the Storm tuple. + */ + public enum FieldsConfiguration { +KEY("key", record -> record.key()), +VALUE("value", record -> record.value()), +PARTITION("partition", record -> record.partition()), +TOPIC("topic", record -> record.topic()) +; +String fieldName; +Function recordExtractor; + +FieldsConfiguration(String fieldName, Function recordExtractor) { + this.recordExtractor = recordExtractor; + this.fieldName = fieldName; +} + +/** + * Return a list of the enums + * @param configs + * @return + */ +public static List toList(String... configs) { + List ret = new ArrayList<>(); + for(String config : configs) { +ret.add(FieldsConfiguration.valueOf(config.toUpperCase())); + } + return ret; +} + +/** + * Return a list of the enums from their string representation. + * @param configs + * @return + */ +public static List toList(List configs) { + List ret = new ArrayList<>(); + for(String config : configs) { +ret.add(FieldsConfiguration.valueOf(config.toUpperCase())); + } + return ret; +} + +/** + * Construct a Fields object from an iterable of enums. These fields are the fields + * exposed in the Storm tuple emitted from the spout. 
+ * @param configs + * @return + */ +public static Fields getFields(Iterable configs) { + List fields = new ArrayList<>(); + for(FieldsConfiguration config : configs) { +fields.add(config.fieldName); + } + return new Fields(fields); +} + } + + /** + * Build a tuple given the fields and the topic. We want to use our FieldsConfiguration enum + * to define what this tuple looks like. + * @param T
[GitHub] incubator-metron pull request #486: METRON-793: Migrate to storm-kafka-clien...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/486#discussion_r107436680

--- Diff: metron-platform/metron-integration-test/src/main/java/org/apache/metron/integration/components/FluxTopologyComponent.java ---
@@ -133,7 +138,25 @@ public void start() throws UnableToStartException {
   @Override
   public void stop() {
     if (stormCluster != null) {
-      stormCluster.shutdown();
+      try {
+        stormCluster.shutdown();
+        if(new File("logs/workers-artifacts").exists()) {
+          Path rootPath = Paths.get("logs");
+          Path destPath = Paths.get("target/logs");
+          try {
+            Files.move(rootPath, destPath);
+            Files.walk(destPath)
--- End diff --

Could this just be a FileUtils.deleteDirectory(destPath)?
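For comparison, here is roughly what a hand-rolled `Files.walk` cleanup has to do, written with the JDK only. It is a sketch, not the code under review (the quoted diff is cut off after `Files.walk(destPath)`), and the `DirectoryCleanup` class name is hypothetical. Note that Commons IO's `FileUtils.deleteDirectory` takes a `java.io.File`, so the suggested call would need `destPath.toFile()`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DirectoryCleanup {
  /** Deletes a directory tree, children before parents. Does nothing if the path is absent. */
  static void deleteRecursively(Path root) throws IOException {
    if (!Files.exists(root)) {
      return;
    }
    List<Path> deepestFirst;
    // Files.walk keeps directory handles open, so close the stream once collected.
    try (Stream<Path> paths = Files.walk(root)) {
      // Reverse path order puts each child ahead of its parent directory.
      deepestFirst = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : deepestFirst) {
      Files.delete(p);
    }
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("logs-demo");
    Files.createDirectories(root.resolve("workers-artifacts/topology-1"));
    Files.write(root.resolve("workers-artifacts/topology-1/worker.log"), "hello".getBytes());
    deleteRecursively(root);
    System.out.println(Files.exists(root)); // prints "false"
  }
}
```

The amount of ceremony here (ordering, stream closing, exception handling) is the argument for delegating to a library call in test components.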
[GitHub] incubator-metron pull request #482: METRON-791: Add links to website and dow...
GitHub user justinleet opened a pull request: https://github.com/apache/incubator-metron/pull/482

METRON-791: Add links to website and downloads to top level POM

## Contributor Comments

Per the release thread discussion, I'm quick throwing a link to the main page in the first section, and a link to the releases in the "Obtaining Metron" section of the top level POM. Because it's just a README change, it can be validated quickly with Github's "View" button on the changes. If we want to change verbiage, or link something slightly different, let me know and I'll quick update the PR.

## Pull Request Checklist

Thank you for submitting a contribution to Apache Metron (Incubating). Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:

### For all changes:

- [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [x] Has your PR been rebased against the latest commit within the target branch (typically master)?

### For code changes:

- [N/A] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
- [N/A] Have you included steps or a guide to how the change may be verified and tested manually?
- [N/A] Have you ensured that the full suite of tests and checks have been executed in the root incubating-metron folder via:
  ```
  mvn -q clean integration-test install && build_utils/verify_licenses.sh
  ```
- [N/A] Have you written or updated unit tests and or integration tests to verify your changes?
- [N/A] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [N/A] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related changes:

- [x] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and then verify changes via `site-book/target/site/index.html`:
  ```
  cd site-book
  bin/generate-md.sh
  mvn site:site
  ```

Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/justinleet/incubator-metron METRON-791

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-metron/pull/482.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #482

commit 2df1dee9dce130d5d3d654fb400db8fcc0903d86 Author: justinjleet Date: 2017-03-18T19:54:11Z adding links
[GitHub] incubator-metron issue #478: METRON-767: Clean up license
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/478 I'm +1 on this by inspection (pending Travis). Thanks for taking care of the mentor feedback on this so quickly.
[GitHub] incubator-metron issue #478: METRON-767: Clean up license from METRON-622
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/478 Actually, looking into this, it's my fault. I split the MIT license stuff with the geo share alike. It should be moved up. @cestella You want to quick fix that and put up a new PR?
[GitHub] incubator-metron issue #459: METRON-726: Clean up mvn site generation
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/459 @mattf-horton I think this is a pretty good approach and gives us a lot of value especially as we build up more releases. And there is a `-Dmaven.site.skip=true` flag for maven that can be added to skip all the site stuff. For right now, I've seen instability in running tests in Jenkins on this branch after pulling in master. I was hoping to resolve it more quickly, but I might end up having to roll back bits and pieces of the change/integration until I can narrow down what went wrong.
[GitHub] incubator-metron issue #436: METRON-671: Refactor existing Ansible deploymen...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/436 I'm +1. I was just waiting for the EC2 component, but was able to get quick-dev, etc. spun up without issue.
[GitHub] incubator-metron issue #436: METRON-671: Refactor existing Ansible deploymen...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/436 @dlyle65535 METRON-745 is in (as I'm sure you can tell from the conflict list). I already incorporated the Kibana map changes, so you should just be able to accept master's version.
[GitHub] incubator-metron issue #459: METRON-726: Clean up mvn site generation
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/459 Updated to (hopefully) not blow up on Travis. Surefire needs jacoco:prepare-agent to resolve the @{argLine}. It's only done in Travis, so it seems reasonable to just call it directly. Also makes the surefire version a global and sets it up throughout (including in some unversioned spots). Makes sure @{argLine} is in the appropriate tags. It might also be appropriate in reporting tags, but our management of surefire is pretty variable across the board.
[GitHub] incubator-metron issue #459: METRON-726: Clean up mvn site generation
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/459 This PR will need another fix, so I'll update when that's good to go.
[GitHub] incubator-metron issue #459: METRON-726: Clean up mvn site generation
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/459 @mmiklavc @dlyle65535 I just updated the PR to work with metron-interface (since it came in after this PR). I also merged in master to get some miscellaneous fixes (including in the site-book). I also added site-book as a component module, so that it gets built at the same time and gets pulled into the site (and can be clicked into when spun up). We may or may not want to leave that, depending on if we want that in the base mvn clean install. Let me know if there's a preference. It's a one line change to revert that. Finally, I added Javadoc report generation to the top level POM, so it's integrated directly with the site now (feel free to spin it up again and click into Javadocs!)

@mattf-horton This PR now also includes a fix for a fatal Javadoc error. I think at this point, it's mostly the integration (unless something else breaks in the meantime). As far as I can tell, all the reporting works and gets done in one shot. I haven't created the ticket for doing something nice with the outputs of this, just because I didn't know if there would be discussion. I'll go ahead and create this, since it seems like people are on board. Matt, if you have any insight into what the appropriate integrations are (e.g. what other projects do), I'd love to add that to the ticket to give whoever picks it up a little extra guidance.
[GitHub] incubator-metron pull request #475: METRON-745: Create Error Dashboards
GitHub user justinleet opened a pull request: https://github.com/apache/incubator-metron/pull/475

METRON-745: Create Error Dashboards

## Summary

Following Ryan's work in https://github.com/apache/incubator-metron/pull/453, we have the opportunity to present errors from our topologies. It's nothing too complicated, essentially just some high level overviews of the various error fields, along with a pane for viewing the actual errors along with all their fields. Note that they include both raw and unique message counts (via the hash fields) in most things.

This also corrects the error_index.template files. These are supposed to match ErrorFields in Constants.java, but didn't.

I've attached some screenshots, and this can be spun up on quick dev and Ambari (both dashboard.p and the various kibana-index.json are updated). Quick dev automatically passes some data through, so it's a good way to get this spun up with something interesting showing.

Feedback on what else would be useful and if we want to adjust anything would be great. Keep in mind, we don't actually have a lot of fields to work with (because if everything was good, we wouldn't be here in the first place!). See error_index.template for the fields we have.

### Testing

Spun up in quick-dev and Ambari. Quick-dev will automatically put data through.

### Notes

* I'm really not convinced the 'hostname' visualizations are needed. The field is there and useful, but given that it's populated with the Storm host that failed, it seems like it's probably useless most of the time.
* Kibana occasionally rearranges the order of the visualizations (usually swapping a couple of the charts). If I recall correctly, that's a known Kibana bug that we're stuck with.
* Keep in mind the graphs shift with the viewing window, so last 15 minutes vs last 7 days updates accordingly.
* This includes a fix to maps mentioned in https://github.com/apache/incubator-metron/pull/436. If that PR goes in before this one, this PR should take its own copy of the dashboards. If this PR goes in first, that PR should accept this one's dashboards.

Screenshots:

https://cloud.githubusercontent.com/assets/5077341/23672488/e5ae6c9a-033c-11e7-834d-caab26497f0e.png
https://cloud.githubusercontent.com/assets/5077341/23672512/f3ceec5a-033c-11e7-946f-f7d279a4f3b0.png
https://cloud.githubusercontent.com/assets/5077341/23672513/f58b317a-033c-11e7-8a21-b1a927971bba.png
https://cloud.githubusercontent.com/assets/5077341/23672514/f6b6ee90-033c-11e7-9630-0143f2f31fcd.png

The bottom pane extends further down, but we've all seen a table of data before.

Thank you for submitting a contribution to Apache Metron (Incubating). Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions. Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides. In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:

### For all changes:

- [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
- [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [x] Has your PR been rebased against the latest commit within the target branch (typically master)?

### For code changes:

- [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
- [x] Have you included steps or a guide to how the change may be verified and tested manually?
- [x] Have you ensured that the full suite of tests and checks have been executed in the root incubating-metron folder via:
  ```
  mvn -q clean integration-test install && build_utils/verify_licenses.sh
  ```
- [ ] Have you written or updated unit tests and or integration tests to verify your changes?
- [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

### For documentation related chang
[GitHub] incubator-metron pull request #469: DO NOT MERGE METRON-745: Create Error Da...
Github user justinleet closed the pull request at: https://github.com/apache/incubator-metron/pull/469
[GitHub] incubator-metron issue #469: DO NOT MERGE METRON-745: Create Error Dashboard...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/469 I'm going to just close this and open a new, much, much cleaner one.
[GitHub] incubator-metron issue #474: METRON-758: HdfsServiceImplTest should sort fil...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/474 +1, by inspection
[GitHub] incubator-metron pull request #474: METRON-758: HdfsServiceImplTest should s...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/474#discussion_r104725400 --- Diff: metron-interface/metron-rest/src/test/java/org/apache/metron/rest/service/impl/HdfsServiceImplTest.java --- @@ -65,6 +66,7 @@ public void listShouldListFiles() throws Exception { FileUtils.writeStringToFile(new File(testDir, "file2.txt"), "value2"); List paths = hdfsService.list(new Path(testDir)); +Collections.sort(paths, String::compareTo); --- End diff -- This is nitpicky, but why even specify String::compareTo? Collections.sort(paths) uses compareTo by default.
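A minimal sketch of the point above, assuming `paths` is a `List<String>` (the generic type is not visible in the quoted diff). `Collections.sort(list)` orders elements by their natural ordering, i.e. `Comparable.compareTo`, so passing `String::compareTo` explicitly should give the same result. The class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortDefaultExample {

  // Sort a copy with the explicit comparator, as written in the diff.
  static List<String> sortWithComparator(List<String> input) {
    List<String> copy = new ArrayList<>(input);
    Collections.sort(copy, String::compareTo);
    return copy;
  }

  // Sort a copy using natural ordering (Comparable.compareTo).
  static List<String> sortNatural(List<String> input) {
    List<String> copy = new ArrayList<>(input);
    Collections.sort(copy);
    return copy;
  }

  public static void main(String[] args) {
    List<String> paths = Arrays.asList("file2.txt", "file1.txt");
    // Both approaches produce the same ordering for strings.
    System.out.println(sortNatural(paths).equals(sortWithComparator(paths)));
  }
}
```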
[GitHub] incubator-metron issue #436: METRON-671: Refactor existing Ansible deploymen...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/436 @dlyle65535 @ottobackwards Can we link the various Ambari logs so they're visible on the local machine? Anything that blows up in the UI should blow up in the logs, which means everything is searchable like it was before.
[GitHub] incubator-metron issue #436: METRON-671: Refactor existing Ansible deploymen...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/436 @dlyle65535 Perfect, thanks. I'm going to go ahead and just make that change and test in 745. If this goes in first, 745 just takes its dashboard changes entirely. In the less likely event that 745 goes in first, this PR just accepts the changes entirely.
[GitHub] incubator-metron issue #459: METRON-726: Clean up mvn site generation
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/459 Bumping this. There was a dev list discussion and it spun out [METRON-746](https://issues.apache.org/jira/browse/METRON-746) and [METRON-747](https://issues.apache.org/jira/browse/METRON-747). I don't think either of those is a blocker to reviewing and pulling in this ticket.
[GitHub] incubator-metron pull request #436: METRON-671: Refactor existing Ansible de...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/436#discussion_r104678359 --- Diff: metron-deployment/roles/kibana/README.md --- @@ -1,35 +0,0 @@ -Kibana 4 - - -This role installs Kibana along with the default Metron Dashboard. - -### FAQ - - How do I change Metron's default dashboard? --- End diff -- I'm inclined (possibly by my own self interest) to make it a follow on jira that gets resolved either before or after 745. 745 takes care of this file (which is actually why I thought of it). I definitely agree on using the same file though, but I'm not sure off the top of my head how much refactoring happens there.
[GitHub] incubator-metron pull request #436: METRON-671: Refactor existing Ansible de...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/436#discussion_r104673955 --- Diff: metron-deployment/roles/kibana/README.md --- @@ -1,35 +0,0 @@ -Kibana 4 - - -This role installs Kibana along with the default Metron Dashboard. - -### FAQ - - How do I change Metron's default dashboard? --- End diff -- kibana-index.json still exists in the docker stuff. Given that the module is buried in the Ambari stuff, do we still need/want the instructions and an elasticdump done to update that?
[GitHub] incubator-metron issue #436: METRON-671: Refactor existing Ansible deploymen...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/436 @dlyle65535 @nickwallen I'm not sure what the exact fix was with the dashboard. Either approach to fixing will work for me, but I don't know what made the map not work, so I don't know how to fix it.
[GitHub] incubator-metron pull request #436: METRON-671: Refactor existing Ansible de...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/436#discussion_r104672196 --- Diff: metron-deployment/extra_modules/ambari_service_state.py --- @@ -0,0 +1,352 @@ +#!/usr/bin/python +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +DOCUMENTATION = ''' +--- +module: ambari_service_state +version_added: "2.1" +author: Apache Metron (Incubating : https://github.com/apache/incubator-metron ) +short_description: Start/Stop/Change Service or Component State +description: +- Start/Stop/Change Service or Component State +options: + host: +description: + The hostname for the ambari web server + port: +description: + The port for the ambari web server + username: +description: + The username for the ambari web server + password: +description: + The name of the cluster in web server +required: yes + cluster_name: +description: + The name of the cluster in ambari +required: yes + service_name: +description: + The name of the service to alter +required: no + component_name: +description: + The name of the component to alter +required: no + component_host: +description: + The host running the targeted component. Required when component_name is used. 
+required: no + state: +description: + The desired service/component state. + wait_for_complete: +description: + Whether to wait for the request to complete before returning. Default is False. +required: no + requirements: [ 'requests'] +''' + +EXAMPLES = ''' +# must use full relative path to any files in stored in roles/role_name/files/ +- name: Create a new ambari cluster +ambari_cluster_state: + host: localhost + port: 8080 + username: admin + password: admin + cluster_name: my_cluster + cluster_state: present + blueprint_var: roles/my_role/files/blueprint.yml + blueprint_name: hadoop + wait_for_complete: True +- name: Start the ambari cluster + ambari_cluster_state: +host: localhost +port: 8080 +username: admin +password: admin +cluster_name: my_cluster +cluster_state: started +wait_for_complete: True +- name: Stop the ambari cluster + ambari_cluster_state: +host: localhost +port: 8080 +username: admin +password: admin +cluster_name: my_cluster +cluster_state: stopped +wait_for_complete: True +- name: Delete the ambari cluster + ambari_cluster_state: +host: localhost +port: 8080 +username: admin +password: admin +cluster_name: my_cluster +cluster_state: absent +''' + +RETURN = ''' +results: +description: The content of the requests object returned from the RESTful call +returned: success +type: string +''' + +__author__ = 'apachemetron' + +import json + +try: +import requests +except ImportError: +REQUESTS_FOUND = False +else: +REQUESTS_FOUND = True + + +def main(): + +argument_spec = dict( +host=dict(type='str', default=None, required=True), +port=dict(type='int', default=None, required=True), +username=dict(type='str', default=None, required=True), +password=dict(type='str', default=None, required=True), +cluster_name=dict(type='str', default=None, required=True), +state=dict(type='str', default=None, required=True, + choices=['started
[GitHub] incubator-metron pull request #436: METRON-671: Refactor existing Ansible de...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/436#discussion_r104671925 --- Diff: metron-deployment/roles/ambari_config/vars/single_node_vm.yml --- @@ -80,10 +88,32 @@ configurations: - kafka-broker: log.dirs: '{{ kafka_log_dirs }}' delete.topic.enable: "true" + - metron-env: + parsers: "bro,snort" + - elastic-site: + index_number_of_shards: 1 + index_number_of_replicas: 0 + zen_discovery_ping_unicast_hosts: "{{ groups.search | join(',') }}" + gateway_recover_after_data_nodes: 1 + network_host: "_lo_,_eth0_,_eth1_" + masters_also_are_datanodes: "1" --- End diff -- I'm fine with whatever works. It's ES configuration, so if it wants to accept "1", it can be my guest.
[GitHub] incubator-metron pull request #436: METRON-671: Refactor existing Ansible de...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/436#discussion_r104665126 --- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/ELASTICSEARCH/2.3.3/configuration/elastic-site.xml --- @@ -27,6 +27,14 @@ Cluster name identifies your cluster +masters_also_are_datanodes +"false" --- End diff -- No, it's not that important. Can you add to the description that it has to be in quotes for ES compatibility? I can definitely see that causing confusion otherwise.
[GitHub] incubator-metron pull request #436: METRON-671: Refactor existing Ansible de...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/436#discussion_r104659900 --- Diff: metron-deployment/roles/ambari_master/tasks/main.yml --- @@ -38,6 +38,16 @@ register: ambari_server_setup failed_when: ambari_server_setup.stderr +- name: Copy MPack to Ambari Host + copy: +src: "{{ playbook_dir }}/../packaging/ambari/metron-mpack/target/metron_mpack-0.3.1.0.tar.gz" --- End diff -- Can we pull the mpack version out into a variable so there are fewer places to change it?
[GitHub] incubator-metron pull request #436: METRON-671: Refactor existing Ansible de...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/436#discussion_r104654194 --- Diff: metron-deployment/extra_modules/ambari_service_state.py --- @@ -0,0 +1,352 @@ +#!/usr/bin/python +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +DOCUMENTATION = ''' +--- +module: ambari_service_state +version_added: "2.1" +author: Apache Metron (Incubating : https://github.com/apache/incubator-metron ) +short_description: Start/Stop/Change Service or Component State +description: +- Start/Stop/Change Service or Component State +options: + host: +description: + The hostname for the ambari web server + port: +description: + The port for the ambari web server + username: +description: + The username for the ambari web server + password: +description: + The name of the cluster in web server +required: yes + cluster_name: +description: + The name of the cluster in ambari +required: yes + service_name: +description: + The name of the service to alter +required: no + component_name: +description: + The name of the component to alter +required: no + component_host: +description: + The host running the targeted component. Required when component_name is used. 
+required: no + state: +description: + The desired service/component state. + wait_for_complete: +description: + Whether to wait for the request to complete before returning. Default is False. +required: no + requirements: [ 'requests'] +''' + +EXAMPLES = ''' +# must use full relative path to any files in stored in roles/role_name/files/ +- name: Create a new ambari cluster +ambari_cluster_state: + host: localhost + port: 8080 + username: admin + password: admin + cluster_name: my_cluster + cluster_state: present + blueprint_var: roles/my_role/files/blueprint.yml + blueprint_name: hadoop + wait_for_complete: True +- name: Start the ambari cluster + ambari_cluster_state: +host: localhost +port: 8080 +username: admin +password: admin +cluster_name: my_cluster +cluster_state: started +wait_for_complete: True +- name: Stop the ambari cluster + ambari_cluster_state: +host: localhost +port: 8080 +username: admin +password: admin +cluster_name: my_cluster +cluster_state: stopped +wait_for_complete: True +- name: Delete the ambari cluster + ambari_cluster_state: +host: localhost +port: 8080 +username: admin +password: admin +cluster_name: my_cluster +cluster_state: absent +''' + +RETURN = ''' +results: +description: The content of the requests object returned from the RESTful call +returned: success +type: string +''' + +__author__ = 'apachemetron' + +import json + +try: +import requests +except ImportError: +REQUESTS_FOUND = False +else: +REQUESTS_FOUND = True + + +def main(): + +argument_spec = dict( +host=dict(type='str', default=None, required=True), +port=dict(type='int', default=None, required=True), +username=dict(type='str', default=None, required=True), +password=dict(type='str', default=None, required=True), +cluster_name=dict(type='str', default=None, required=True), +state=dict(type='str', default=None, required=True, + choices=['started
[GitHub] incubator-metron pull request #436: METRON-671: Refactor existing Ansible de...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/436#discussion_r104654898 --- Diff: metron-deployment/extra_modules/ambari_service_state.py --- @@ -0,0 +1,352 @@ +#!/usr/bin/python +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +DOCUMENTATION = ''' +--- +module: ambari_service_state +version_added: "2.1" +author: Apache Metron (Incubating : https://github.com/apache/incubator-metron ) +short_description: Start/Stop/Change Service or Component State +description: +- Start/Stop/Change Service or Component State +options: + host: +description: + The hostname for the ambari web server + port: +description: + The port for the ambari web server + username: +description: + The username for the ambari web server + password: +description: + The name of the cluster in web server +required: yes + cluster_name: +description: + The name of the cluster in ambari +required: yes + service_name: +description: + The name of the service to alter +required: no + component_name: +description: + The name of the component to alter +required: no + component_host: +description: + The host running the targeted component. Required when component_name is used. 
+required: no + state: +description: + The desired service/component state. + wait_for_complete: +description: + Whether to wait for the request to complete before returning. Default is False. +required: no + requirements: [ 'requests'] +''' + +EXAMPLES = ''' +# must use full relative path to any files in stored in roles/role_name/files/ +- name: Create a new ambari cluster +ambari_cluster_state: + host: localhost + port: 8080 + username: admin + password: admin + cluster_name: my_cluster + cluster_state: present + blueprint_var: roles/my_role/files/blueprint.yml + blueprint_name: hadoop + wait_for_complete: True +- name: Start the ambari cluster + ambari_cluster_state: +host: localhost +port: 8080 +username: admin +password: admin +cluster_name: my_cluster +cluster_state: started +wait_for_complete: True +- name: Stop the ambari cluster + ambari_cluster_state: +host: localhost +port: 8080 +username: admin +password: admin +cluster_name: my_cluster +cluster_state: stopped +wait_for_complete: True +- name: Delete the ambari cluster + ambari_cluster_state: +host: localhost +port: 8080 +username: admin +password: admin +cluster_name: my_cluster +cluster_state: absent +''' + +RETURN = ''' +results: +description: The content of the requests object returned from the RESTful call +returned: success +type: string +''' + +__author__ = 'apachemetron' + +import json + +try: +import requests +except ImportError: +REQUESTS_FOUND = False +else: +REQUESTS_FOUND = True + + +def main(): + +argument_spec = dict( +host=dict(type='str', default=None, required=True), +port=dict(type='int', default=None, required=True), +username=dict(type='str', default=None, required=True), +password=dict(type='str', default=None, required=True), +cluster_name=dict(type='str', default=None, required=True), +state=dict(type='str', default=None, required=True, + choices=['started
[GitHub] incubator-metron pull request #436: METRON-671: Refactor existing Ansible de...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/436#discussion_r104658438 --- Diff: metron-deployment/roles/ambari_config/vars/single_node_vm.yml --- @@ -80,10 +88,32 @@ configurations: - kafka-broker: log.dirs: '{{ kafka_log_dirs }}' delete.topic.enable: "true" + - metron-env: + parsers: "bro,snort" + - elastic-site: + index_number_of_shards: 1 + index_number_of_replicas: 0 + zen_discovery_ping_unicast_hosts: "{{ groups.search | join(',') }}" + gateway_recover_after_data_nodes: 1 + network_host: "_lo_,_eth0_,_eth1_" + masters_also_are_datanodes: "1" --- End diff -- Wasn't this a boolean earlier? Should this be true? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-metron pull request #436: METRON-671: Refactor existing Ansible de...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/436#discussion_r104655538 --- Diff: metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/ELASTICSEARCH/2.3.3/configuration/elastic-site.xml --- @@ -27,6 +27,14 @@ Cluster name identifies your cluster +masters_also_are_datanodes +"false" --- End diff -- Can we refactor things so that this is just `false`, rather than `"false"`?
[GitHub] incubator-metron pull request #436: METRON-671: Refactor existing Ansible de...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/436#discussion_r104661243 --- Diff: metron-deployment/roles/quick_dev/tasks/main.yml --- @@ -15,23 +15,50 @@ # limitations under the License. # --- -# -# Workaround for Kafka not starting -# Fire off async start followed by -# Sync start -execution will pause until -# final start completes. -# -- name: Start the ambari cluster - no wait - ambari_cluster_state: +- name: Delete the Metron Components from Ambari + ambari_service_state: host: "{{ groups.ambari_master[0] }}" port: "{{ ambari_port }}" username: "{{ ambari_user }}" password: "{{ ambari_password }}" cluster_name: "{{ cluster_name }}" -cluster_state: started -wait_for_complete: False +state: deleted +component_name: "{{ item }}" +component_host: "{{ inventory_hostname }}" + with_items: +- METRON_ENRICHMENT_MASTER +- METRON_INDEXING +- METRON_PARSERS + +- name: Remove the Metron packages + package: +name: "{{ item }}" +state: absent + with_items: +- metron-common +- metron-data-management +- metron-parsers +- metron-enrichment +- metron-indexing +- metron-elasticsearch + +- name: Re-install the Metron Packages via Ambari --- End diff -- Do we have any issues with the configured files still existing after this? I know RPMs don't like to touch files they didn't explicitly create, so I'm a little worried they still exist here. I know guards were added (the `if(is_*_configured` earlier), but just want to call it out.
[GitHub] incubator-metron pull request #436: METRON-671: Refactor existing Ansible de...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/436#discussion_r104653567 --- Diff: metron-deployment/roles/kibana/README.md --- @@ -1,35 +0,0 @@ -Kibana 4 - - -This role installs Kibana along with the default Metron Dashboard. - -### FAQ - - How do I change Metron's default dashboard? --- End diff -- If you know where you want it, I'm actually writing that up as part of METRON-745 (which needed to use that module).
[GitHub] incubator-metron issue #471: METRON-755 Update GitHub PR Template
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/471 I prefer top, but I don't really care that much.
[GitHub] incubator-metron issue #469: DO NOT MERGE METRON-745: Create Error Dashboard...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/469 An alternative, and more sensible/readable, approach to the errors-over-time chart: https://cloud.githubusercontent.com/assets/5077341/23526431/06c36a1a-ff60-11e6-93f1-dd8437fd0688.png
[GitHub] incubator-metron pull request #469: DO NOT MERGE METRON-745: Create Error Da...
GitHub user justinleet opened a pull request: https://github.com/apache/incubator-metron/pull/469 DO NOT MERGE METRON-745: Create Error Dashboards # DO NOT MERGE ## Summary Based on Ryan's work in https://github.com/apache/incubator-metron/pull/453, I went ahead and created a Kibana dashboard for tracking errors. **That PR is not finalized in master so this should not be merged!** However, the data flowing to the index is pretty final, so unless the actual fields or field names change, it doesn't really affect this. All we care about here is the dashboard itself, but unfortunately the 453 changes get pulled along for the ride until that's in. It's nothing too complicated, essentially just some high-level overviews of the various fields output by Ryan (some counts, etc.), along with a pane for viewing the actual errors along with all their fields. Note that they include both raw and unique message counts (via the hash fields) in most things. I've attached some screenshots, but this can also be spun up on an Ambari cluster (and will eventually have to be in order to be validated, given that the file isn't in a readable format). I'm basically looking for feedback on what else would be useful and if we want to adjust anything. Keep in mind, we don't actually have a lot of fields to work with (because if everything was good, we wouldn't be here in the first place!). See error_index.template for the fields we have. ### Notes * I'm really not convinced the 'hostname' visualizations are needed. The field is there and useful, but given that it's populated with the Storm host that failed, it seems like it's probably useless most of the time. * Kibana occasionally rearranges the order of the visualizations (usually swapping a couple of the charts). If I recall correctly, that's a known Kibana bug that we're stuck with. * The graph teaches a lesson of "Don't load all your data at once if you want a pretty graph". Still, it's just a basic graph of the error counts over time. 
* Keep in mind the graph shifts with the viewing window, so last 15 minutes vs. last 7 days updates accordingly. https://cloud.githubusercontent.com/assets/5077341/23518699/52eb58bc-ff42-11e6-912c-cc596fe46a3d.png https://cloud.githubusercontent.com/assets/5077341/23518700/549c3f0a-ff42-11e6-8e26-18553ce804bc.png https://cloud.githubusercontent.com/assets/5077341/23518702/5605c69a-ff42-11e6-8c76-15f485253e8f.png The bottom pane extends further down, but we've all seen a table of data before. ### For all changes: - [] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). - [] Does your PR title start with METRON- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [ ] Has your PR been rebased against the latest commit within the target branch (typically master)? ### For code changes: - [ ] Have you included steps to reproduce the behavior or problem that is being changed or addressed? - [ ] Have you included steps or a guide to how the change may be verified and tested manually? - [ ] Have you ensured that the full suite of tests and checks have been executed in the root incubating-metron folder via: ``` mvn -q clean integration-test install && build_utils/verify_licenses.sh ``` - [ ] Have you written or updated unit tests and or integration tests to verify your changes? - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent? 
### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and then verify changes via site-book/target/site/index.html. ``` cd site-book bin/generate-md.sh mvn site:site ``` ### Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request. You can merge this pull request into a Git
[GitHub] incubator-metron issue #453: METRON-694: Index Errors from Topologies
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/453 I tried running this up and discovered that there's at least one error that doesn't get caught. JSON parsing errors, e.g. if someone gives outright badly formatted messages to indexing (such as a missing closing '}'), don't get caught and indexed right now. I don't believe we ever handled this type of error, because I don't think it ever occurs from our code directly. I'm inclined to not worry about it for this PR given that we never worried about it to begin with, but we may want to create a follow-on Jira to ensure that we handle cases like this well. As we add and increase visibility to extension points, we don't want things like this getting tripped by custom code. Anyone have objections to that?
[GitHub] incubator-metron issue #465: METRON-741: Stellar Field Transformations shoul...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/465 +1, by inspection. Thanks for grabbing this.
[GitHub] incubator-metron issue #464: METRON-740: Normalizing and adding log4j proper...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/464 +1 by inspection. Nice to have this set up.
[GitHub] incubator-metron issue #438: METRON-686 Record Rule Set that Fired During Th...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/438 @nickwallen I have a slight preference towards flattening, fixing, and unflattening. I'd rather conform to convention and keep things consistent for now. I could pretty easily be persuaded to go with 1 if there's enough support for it and we think we'll address it relatively quickly.
[GitHub] incubator-metron issue #463: METRON-728: ReaderSpliteratorTest fails randoml...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/463 Yep, my +1 is still in place.
[GitHub] incubator-metron issue #463: METRON-728: ReaderSpliteratorTest fails randoml...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/463 I figured out why. From the docs:
```
List list = new LinkedList();
List spy = spy(list);

//Impossible: real method is called so spy.get(0) throws IndexOutOfBoundsException (the list is yet empty)
when(spy.get(0)).thenReturn("foo");

//You have to use doReturn() for stubbing
doReturn("foo").when(spy).get(0);
```
So the first call to @cestella's trySplit() happens inside the when(), not in the lambda. Only the subsequent calls go through the lambda, so every count shifts by one. As the docs note, "Sometimes it's impossible or impractical to use when(Object) for stubbing spies. Therefore when using spies please consider doReturn|Answer|Throw() family of methods for stubbing." I just happened to try this, and now I know why it works.
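The off-by-one described above comes down to Java evaluating method arguments before the call is made. The following is a plain-Java sketch of that mechanism only; it does not use Mockito, and `EagerArgumentSketch`, `eagerSetup`, and `deferredSetup` are hypothetical names invented for illustration, standing in for the `when(...)` style and the `doAnswer(...)` style respectively.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for a spied object; not part of Mockito or Metron.
final class EagerArgumentSketch {
  private final AtomicInteger realCalls = new AtomicInteger();

  // Plays the role of the real method behind the spy (e.g. trySplit()).
  String realMethod() {
    realCalls.incrementAndGet();
    return "real";
  }

  int realCalls() {
    return realCalls.get();
  }

  // when(...)-style set-up: the argument has already been evaluated,
  // so the real method ran once before this method was even entered.
  static <T> void eagerSetup(T alreadyEvaluated) {
    // nothing to do; the side effect has already happened
  }

  // doAnswer(...)-style set-up: only a reference to the call is passed,
  // so nothing runs during set-up.
  static <T> void deferredSetup(Supplier<T> notYetEvaluated) {
    // intentionally never calls notYetEvaluated.get()
  }

  public static void main(String[] args) {
    EagerArgumentSketch eager = new EagerArgumentSketch();
    eagerSetup(eager.realMethod());
    System.out.println("real calls after eager set-up: " + eager.realCalls());

    EagerArgumentSketch deferred = new EagerArgumentSketch();
    deferredSetup(deferred::realMethod);
    System.out.println("real calls after deferred set-up: " + deferred.realCalls());
  }
}
```

With a counter inside the real method, the eager style records one call during set-up while the deferred style records none, which is the shift by one seen in the test.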
[GitHub] incubator-metron issue #463: METRON-728: ReaderSpliteratorTest fails randoml...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/463 @cestella spy() syntax ends up working differently than mock() from what I can tell. This worked for me:
```
Spliterator delegatingSpliterator = spy(spliterator);
doAnswer(invocationOnMock -> {
  Spliterator ret = spliterator.trySplit();
  if(ret != null) {
    numSplits.incrementAndGet();
  }
  return ret;
}).when(delegatingSpliterator).trySplit();
```
[GitHub] incubator-metron issue #463: METRON-728: ReaderSpliteratorTest fails randoml...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/463 +1, I appreciate you going ahead and taking this ticket, given that I've been bitten by it twice now. Looks great.
[GitHub] incubator-metron pull request #463: METRON-728: ReaderSpliteratorTest fails ...
Github user justinleet commented on a diff in the pull request: https://github.com/apache/incubator-metron/pull/463#discussion_r102776136 --- Diff: metron-platform/metron-common/src/test/java/org/apache/metron/common/utils/file/ReaderSpliteratorTest.java --- @@ -97,88 +110,73 @@ public void testSequentialStreamLargeBatch() throws FileNotFoundException { Map count = stream.map(s -> s.trim()) .collect(Collectors.toMap(s -> s, s -> 1, Integer::sum)); - Assert.assertEquals(5, count.size()); - Assert.assertEquals(3, (int) count.get("foo")); - Assert.assertEquals(2, (int) count.get("bar")); - Assert.assertEquals(1, (int) count.get("and")); - Assert.assertEquals(1, (int) count.get("the")); + validateMapCount(count); } } - @Test - public void testActuallyParallel() throws ExecutionException, InterruptedException, FileNotFoundException { -//With 9 elements and a batch of 2, we should only ceil(9/2) = 5 batches, so at most min(5, 2) = 2 threads will be used -try( Stream stream = ReaderSpliterator.lineStream(getReader(), 2)) { - ForkJoinPool forkJoinPool = new ForkJoinPool(2); - forkJoinPool.submit(() -> { -Map threads = -stream.parallel().map(s -> Thread.currentThread().getName()) -.collect(Collectors.toMap(s -> s, s -> 1, Integer::sum)); -Assert.assertTrue(threads.size() <= 2); - } - ).get(); -} - } + private int getNumberOfBatches(final ReaderSpliterator spliterator) throws ExecutionException, InterruptedException { +final AtomicInteger numSplits = new AtomicInteger(0); +//we want to wrap the spliterator and count the (valid) splits +Spliterator delegatingSpliterator = new Spliterator() { + @Override + public boolean tryAdvance(Consumer action) { +return spliterator.tryAdvance(action); + } - @Test - public void testActuallyParallel_mediumBatch() throws ExecutionException, InterruptedException, FileNotFoundException { -//With 9 elements and a batch of 2, we should only ceil(9/2) = 5 batches, so at most 5 threads of the pool of 10 will be used -try( Stream stream = 
ReaderSpliterator.lineStream(getReader(), 2)) { - ForkJoinPool forkJoinPool = new ForkJoinPool(10); - forkJoinPool.submit(() -> { -Map threads = -stream.parallel().map(s -> Thread.currentThread().getName()) -.collect(Collectors.toMap(s -> s, s -> 1, Integer::sum)); -Assert.assertTrue(threads.size() <= (int) Math.ceil(9.0 / 2) && threads.size() > 1); - } - ).get(); -} + @Override + public Spliterator trySplit() { +Spliterator ret = spliterator.trySplit(); +if(ret != null) { + numSplits.incrementAndGet(); +} +return ret; + } + + @Override + public long estimateSize() { +return spliterator.estimateSize(); + } + + @Override + public int characteristics() { +return spliterator.characteristics(); + } +}; + +Stream stream = StreamSupport.stream(delegatingSpliterator, true); + +//now run it in a parallel pool and do some calculation that doesn't really matter. +ForkJoinPool forkJoinPool = new ForkJoinPool(10); --- End diff -- Incredibly minor point, but since we no longer care about the actual execution and aren't running it a lot, it seems appropriate to just use ForkJoinPool.commonPool(), and drop the shutdown line. This is entirely up to you if you want to change, I don't consider it blocking by any means.
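A minimal sketch of the suggestion above, under stated assumptions: it wraps a standard-library list spliterator rather than Metron's `ReaderSpliterator`, and `SplitCountingSketch`, `CountingSpliterator`, and `countWords` are hypothetical names. A parallel stream that is not submitted to a custom pool runs on `ForkJoinPool.commonPool()`, so there is no pool to create or shut down. The number of splits observed depends on the runtime's parallelism, so the sketch makes no claim about an exact value.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Spliterator;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

final class SplitCountingSketch {

  // Delegates every call to the wrapped spliterator and counts each
  // successful (non-null) trySplit() made on this wrapper.
  static final class CountingSpliterator<T> implements Spliterator<T> {
    private final Spliterator<T> delegate;
    private final AtomicInteger numSplits;

    CountingSpliterator(Spliterator<T> delegate, AtomicInteger numSplits) {
      this.delegate = delegate;
      this.numSplits = numSplits;
    }

    @Override
    public boolean tryAdvance(Consumer<? super T> action) {
      return delegate.tryAdvance(action);
    }

    @Override
    public Spliterator<T> trySplit() {
      Spliterator<T> ret = delegate.trySplit();
      if (ret != null) {
        numSplits.incrementAndGet();
      }
      return ret;
    }

    @Override
    public long estimateSize() {
      return delegate.estimateSize();
    }

    @Override
    public int characteristics() {
      return delegate.characteristics();
    }
  }

  // Runs a throwaway word count over a parallel stream. No custom pool is
  // created, so the work is scheduled on ForkJoinPool.commonPool().
  static Map<String, Integer> countWords(List<String> lines, AtomicInteger numSplits) {
    Spliterator<String> counting = new CountingSpliterator<>(lines.spliterator(), numSplits);
    return StreamSupport.stream(counting, true)
        .map(String::trim)
        .collect(Collectors.toMap(s -> s, s -> 1, Integer::sum));
  }

  public static void main(String[] args) {
    // Made-up sample data, not the fixture used by the Metron test.
    List<String> lines = Arrays.asList("foo", "bar", "foo", "and", "the", "bar", "foo", "a", "b");
    AtomicInteger numSplits = new AtomicInteger(0);
    Map<String, Integer> counts = countWords(lines, numSplits);
    System.out.println(counts + " after " + numSplits.get() + " counted splits");
  }
}
```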
[GitHub] incubator-metron issue #463: METRON-728: ReaderSpliteratorTest fails randoml...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/463 @cestella that is a much better way of stating it, and exactly what I was alluding to. I'll look through the new commit.
[GitHub] incubator-metron issue #463: METRON-728: ReaderSpliteratorTest fails randoml...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/463 @cestella The more I'm thinking about this, the more I wonder if this test is inherently structured incorrectly. My thinking is that it seems more like we're testing whether or not a stream can run in parallel, rather than that the stream produced by the spliterator meets the appropriate contracts for a stream. Is there a way to restructure this so that it just tests "Does this meet the criteria of a Java stream?", rather than "Can a stream in Java run in parallel?"
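One way to read the question above is as a check on results rather than on threads: whatever the scheduler does, a parallel traversal should produce the same answer as a sequential one. The sketch below illustrates that idea with a standard-library list as the stream source; `StreamContractSketch`, `wordCounts`, and `parallelMatchesSequential` are hypothetical names, and a real test would supply a stream built from the spliterator under test instead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class StreamContractSketch {

  // The same reduction used in the test under discussion: trimmed line -> count.
  static Map<String, Integer> wordCounts(Stream<String> stream) {
    return stream.map(String::trim)
        .collect(Collectors.toMap(s -> s, s -> 1, Integer::sum));
  }

  // Builds two fresh streams from the supplier and compares the results.
  // The outcome does not depend on how many threads actually took part.
  static boolean parallelMatchesSequential(Supplier<Stream<String>> source) {
    Map<String, Integer> sequential = wordCounts(source.get().sequential());
    Map<String, Integer> parallel = wordCounts(source.get().parallel());
    return sequential.equals(parallel);
  }

  public static void main(String[] args) {
    // Made-up sample data for illustration.
    List<String> lines = Arrays.asList("foo", " bar", "foo ", "and", "the", "bar", "foo");
    System.out.println("parallel matches sequential: " + parallelMatchesSequential(lines::stream));
  }
}
```

A check like this cannot fail just because the JVM chose to run everything on one thread, which is the sporadic failure mode discussed in this thread.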
[GitHub] incubator-metron issue #463: METRON-728: ReaderSpliteratorTest fails randoml...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/463 Never mind, I can't read. You ran the whole test 100k times, correct? I'm fine with that.
[GitHub] incubator-metron issue #463: METRON-728: ReaderSpliteratorTest fails randoml...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/463 Are we settling on "less sporadic"? Like I noted in the ticket, I had the original test run for over a minute (~90 seconds) before the JVM decided to actually be single threaded. It's not the usual case, but I probably only ran it 20 or so times before I hit the 90 second case. It seems more likely to fail in Travis, which is fine, but I'm not sure I want my local build failing that often.
[GitHub] incubator-metron issue #462: METRON-734 Builds failing because of MaxMind DB...
Github user justinleet commented on the issue: https://github.com/apache/incubator-metron/pull/462 Apparently https://issues.apache.org/jira/browse/METRON-728 occurs more frequently on Travis than on my local machine. The Travis build running on my personal account already succeeded (https://travis-ci.org/justinleet/incubator-metron/builds/204210174). I'll kick Travis and hopefully we aren't waiting 16 hours for a build.
[GitHub] incubator-metron pull request #462: METRON-734 Builds failing because of Max...
Github user justinleet closed the pull request at: https://github.com/apache/incubator-metron/pull/462