Re: Metron MaaS Issue

2019-12-19 Thread Casey Stella
Sorry for the late reply.  Try adding:

import sys,os
sys.path.append(os.getcwd())
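
To make the effect of those two lines concrete, here is a minimal sketch of the idea, assuming the helper module sits in the directory the model is launched from. The module name `sample_helper` and the functions `import_from_cwd` and `demo` are made up for illustration and are not part of Metron or MaaS.

```python
import importlib
import os
import sys
import tempfile


def import_from_cwd(module_name):
    """Append the current working directory to sys.path, then import the module."""
    cwd = os.getcwd()
    if cwd not in sys.path:
        sys.path.append(cwd)
    # Make sure the import system notices files created after startup.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


def demo():
    """Simulate a collateral directory holding a hypothetical helper module."""
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as collateral:
        helper_path = os.path.join(collateral, "sample_helper.py")
        with open(helper_path, "w") as handle:
            handle.write(
                "def classify(host):\n"
                "    legit = ('yahoo.com', 'amazon.com')\n"
                "    return 'legit' if host in legit else 'malicious'\n"
            )
        os.chdir(collateral)
        try:
            helper = import_from_cwd("sample_helper")
            return helper.classify("yahoo.com"), helper.classify("example.org")
        finally:
            # Leave the directory before it is cleaned up.
            os.chdir(original_cwd)
```

In the Flask script itself, the two lines would go at the very top, before the import of the extra class.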


On Fri, Dec 13, 2019 at 11:53 PM Hema malini 
wrote:

> Hi,
>
> I am not sure whether I am facing an issue or it's a bug. When I deploy
> the sample MaaS script in Metron, it works perfectly. I have now added
> another Python class to the model collateral, which loads the trained
> model pkl file. I gave the absolute path as well as the relative path
> (using the dir path). I want to import that Python class into the Flask
> code. When I submit it, it starts with a "connection closed" error.
>
> So is it not possible to load or import any other Python class? Should the
> model collateral only have one REST script and one Python file?
>
> from flask import Flask
> from flask import request,jsonify
> import socket
> from Sample import sampleclass
> app = Flask(__name__)
> @app.route("/apply", methods=['GET'])
> def predict():
>     h = request.args.get('host')
>     r = {}
>     if h == 'yahoo.com' or h == 'amazon.com':
>         r['is_malicious'] = 'legit'
>     else:
>         r['is_malicious'] = 'malicious'
>     return jsonify(r)
>
>
> In the above code I couldn't import the sample class even though it is
> present in the same folder.
>
> Thanks and Regards,
> Hema
>
>


Re: Issue when trying to load JSON

2019-04-25 Thread Casey Stella
Wait, are we sure that's the case?  Generally speaking, messages coming
into a parser that uses the envelope strategy have a _source field which is
a string, which this isn't (it's JSON).
For instance, the format expected is:

{
  "_index": "indexing",
  "_type": "Event",
  "_id": "AWAkTAefYn0uCUpkHmCy",
  "_score": 1,
  "_source": "{\"dst\": \"127.0.0.1\",\"devTimeEpoch\": \"151243734\",\"dstPort\": \"0\",\"srcPort\": \"80\",\"src\": \"194.51.198.185\"}"
}


I think that Simon is right, if we want to support this, we might consider
a different envelope strategy that takes parsed objects.  I think you can
just, in the *routing* parser, set the "_source" field as "original_string"
via a stellar field transformation.
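
For anyone who controls the producer side, a different workaround is to re-encode the nested object as a string before the message reaches Kafka, so that it matches the format shown above. This is only a sketch of that idea in Python, not the Stellar transformation mentioned here and not part of Metron; the function name `stringify_envelope_field` is made up for illustration.

```python
import json


def stringify_envelope_field(raw_line, field="_source"):
    """Return the message with `field` re-encoded as a JSON string.

    If the field is already a string (or missing), the message is
    returned unchanged apart from JSON re-serialization.
    """
    message = json.loads(raw_line)
    value = message.get(field)
    if isinstance(value, (dict, list)):
        message[field] = json.dumps(value)
    return json.dumps(message)


sample = '{"_index": "indexing", "_source": {"dst": "127.0.0.1", "srcPort": "80"}}'
converted = json.loads(stringify_envelope_field(sample))
```

After this step `converted["_source"]` is a string containing JSON rather than a nested object, which is the shape described above.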

On Thu, Apr 25, 2019 at 12:05 PM Nick Allen  wrote:

> > Otto: I’m not sure this is an enveloped issue, or a new feature for the
> json map parser
>
> This is not an issue with JSONMapParser.  This is an issue with the
> "enveloping" mechanism, prior to when the JSONMapParser gets the message.
>
> The entire message has been parsed as a JSON object including the value of
> the "_source" field.  Since the "_source" field itself contains valid JSON,
> the parser transformed it into a Map, rather than the String that it
> expects.
>
> In my opinion, the ENVELOPE strategy needs to not parse the contents of
> that "_source" field.  The ENVELOPE strategy should work for JSON and
> non-JSON content alike.
>
>
> On Thu, Apr 25, 2019 at 11:31 AM Otto Fowler 
> wrote:
>
>> I’m not sure about the name, I’m more thinking about the case.
>> I’m not sure this is an enveloped issue, or a new feature for the json
>> map parser ( or if you could do it with the jsonMap parser and JSONPath )
>>
>>
>>
>> On April 25, 2019 at 11:23:25, Simon Elliston Ball (
>> si...@simonellistonball.com) wrote:
>>
>> Seems like this would be a good additional strategy, something like
>> ENVELOPE_PARSED? Any thoughts on a good name?
>>
>> On Thu, 25 Apr 2019 at 16:20, Otto Fowler 
>> wrote:
>>
>>> So, the enveloped message doesn’t support getting an already parsed
>>> json object from the enveloped json; we would have to do some work to
>>> support this. Even if we _could_ wrangle it in there now, from what I can
>>> see we would still have to serialize to bytes to pass to the actual parser
>>> and that would be inefficient.
>>> Can you open a jira with the information you provided?
>>>
>>>
>>>
>>> On April 25, 2019 at 11:12:38, Otto Fowler (ottobackwa...@gmail.com)
>>> wrote:
>>>
>>> Raw message in this case assumes that the raw message is a String
>>> embedded in the json field that you supply, not a nested json object, so it
>>> is looking for
>>>
>>>
>>> “_source” : “some other embedded string of some format like syslog in
>>> json”
>>>
>>> There are other message strategies, but I’m not sure they would work in
>>> this instance.  I’ll keep looking. hopefully someone more familiar will
>>> jump in.
>>>
>>>
>>> On April 25, 2019 at 10:48:06, stephane.d...@orange.com (
>>> stephane.d...@orange.com) wrote:
>>>
>>> Hello,
>>>
>>>
>>>
>>> I’m trying to load some JSON data which has the following structure
>>> (this is a sample):
>>>
>>>
>>>
>>> {
>>>   "_index": "indexing",
>>>   "_type": "Event",
>>>   "_id": "AWAkTAefYn0uCUpkHmCy",
>>>   "_score": 1,
>>>   "_source": {
>>>     "dst": "127.0.0.1",
>>>     "devTimeEpoch": "151243734",
>>>     "dstPort": "0",
>>>     "srcPort": "80",
>>>     "src": "194.51.198.185"
>>>   }
>>> }
>>>
>>>
>>>
>>> In my file, everything is on the same line. My parser config is the
>>> following:
>>>
>>>
>>>
>>> {
>>>   "parserClassName": "org.apache.metron.parsers.json.JSONMapParser",
>>>   "filterClassName": null,
>>>   "sensorTopic": "my_topic",
>>>   "outputTopic": null,
>>>   "errorTopic": null,
>>>   "writerClassName": null,
>>>   "errorWriterClassName": null,
>>>   "readMetadata": true,
>>>   "mergeMetadata": true,
>>>   "numWorkers": 2,
>>>   "numAckers": null,
>>>   "spoutParallelism": 1,
>>>   "spoutNumTasks": 1,
>>>   "parserParallelism": 2,
>>>   "parserNumTasks": 2,
>>>   "errorWriterParallelism": 1,
>>>   "errorWriterNumTasks": 1,
>>>   "spoutConfig": {},
>>>   "securityProtocol": null,
>>>   "stormConfig": {},
>>>   "parserConfig": {
>>>   },
>>>   "fieldTransformations": [
>>>     {
>>>       "transformation": "RENAME",
>>>       "config": {
>>>         "dst": "ip_dst_addr",
>>>         "src": "ip_src_addr",
>>>         "srcPort": "ip_src_port",
>>>         "dstPort": "ip_dst_port",
>>>         "devTimeEpoch": "timestamp"
>>>       }
>>>     }
>>>   ],
>>>   "cacheConfig": {},
>>>   "rawMessageStrategy": "ENVELOPE",
>>>   "rawMessageStrategyConfig": {
>>>     "messageField": "_source"
>>>   }
>>> }
>>>
>>>
>>>
>>> But in Storm I get the following errors:
>>>
>>>
>>>
>>> 

Re: MAP Data structure in Stellar to store key/value pairs

2019-01-04 Thread Casey Stella
Hi Anil,

Stefan is quite correct about initializing map objects in stellar.  I would
point out that, given you're using a multiset, you could also initialize
your data structure with MULTISET_INIT() and interact with it via
MULTISET_ADD(), similar to the geographic outliers use-case (we do this in
the profile creation section:
https://github.com/apache/metron/tree/master/use-cases/geographic_login_outliers#create-the-profiles-for-enrichment
).
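
For readers unfamiliar with multisets, the behaviour is roughly that of a map from object to count. The sketch below is a Python analogy only, not Stellar and not Metron code; the helper names `multiset_init` and `multiset_add` are invented here to mirror the Stellar function names.

```python
from collections import Counter


def multiset_init(items=()):
    """Start a multiset, optionally seeded with initial items."""
    return Counter(items)


def multiset_add(multiset, item):
    """Add one occurrence of item and return the multiset."""
    multiset[item] += 1
    return multiset


# Count how often each (hypothetical) domain was seen for a user.
seen = multiset_init()
for domain in ["example.com", "example.org", "example.com"]:
    multiset_add(seen, domain)
```

Here the key is the object and the value is its count, which matches the key/value use described in the question.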

Casey

On Fri, Jan 4, 2019 at 7:10 AM Anil Donthireddy 
wrote:

> Hi Stephen,
>
>
>
> Thanks a lot for your prompt response. Your response is helpful.
>
>
>
> To implement my use case, it seems I need to leverage MULTISET, which stores
> objects and their counts in a Map. In my case the key can be the object and
> the value can be the count.
>
>
>
> Thanks,
>
> Anil.
>
>
>
> *From:* Stefan Kupstaitis-Dunkler [mailto:stefan@gmail.com]
> *Sent:* Friday, January 4, 2019 5:21 PM
> *To:* user@metron.apache.org
> *Cc:* Christopher Berry ; Satish Abburi <
> satish.abb...@sstech.us>
> *Subject:* Re: MAP Data structure in Stellar to store key/value pairs
>
>
>
> Hi Anil,
>
>
>
> the easiest way to define a map in the Stellar language is to define it
> via a variable assignment in a JSON format.
>
>
>
>- For example, below I define a map called kafka_props.
>- Then, with "*MAP_EXISTS*" I check if the map ( => JSON object) has a
>key "security.protocol", which it has
>- Then I try to extract the value of the key "security.protocol" and
>get the correct value of "SASL_PLAINTEXT".
>
>
>
> [Stellar]>>> kafka_props := { "bootstrap.servers": "condla1.field.hortonworks.com:6667", "security.protocol": "SASL_PLAINTEXT"}
>
> {security.protocol=SASL_PLAINTEXT, bootstrap.servers=condla1.field.hortonworks.com:6667}
>
>
>
> [Stellar]>>> MAP_EXISTS("security.protocol", kafka_props)
>
> true
>
>
>
> [Stellar]>>> MAP_GET("security.protocol", kafka_props)
>
> SASL_PLAINTEXT
>
>
>
> As of now, I don't think there is a MAP_PUT .  In most of the Metron use
> cases, the map is either the incoming message or an external enrichment and
> you use MAP_GET to extract information from it.
>
>
>
> Does this answer your question?
>
>
>
> Best,
>
> Stefan
>
>
>
>
>
> On Fri, Jan 4, 2019 at 12:26 PM Anil Donthireddy <
> anil.donthire...@sstech.us> wrote:
>
> Hi,
>
>
>
> I have gone through the Stellar documentation to understand how to store
> key/value pairs in a Stellar object. I can see there are functions
> MAP_EXISTS() and MAP_GET() that operate on key/value pairs.
>
>
>
> But I am unable to find how to initialize a MAP object and add/update
> key/value pairs in the MAP. Please help me store key/value pairs in a
> Stellar object.
>
>
>
> Thanks,
>
> Anil.
>
>


Re: [ANNOUNCE] Apache Metron release 0.7.0

2018-12-17 Thread Casey Stella
+1 to that!!
On Mon, Dec 17, 2018 at 13:16 Michael Miklavcic 
wrote:

> And a big thanks to Justin Leet for being our release manager. Great work
> Justin!
>
> On Mon, Dec 17, 2018 at 10:07 AM Justin Leet  wrote:
>
>> Hi all,
>>
>> I’m pleased to announce the release of Metron 0.7.0! There's been a lot
>> of work on improvements, upgrades, discussion, and more. Thanks to everyone
>> who's contributed, and thank you to our users.
>>
>> Details:
>> The official release source code tarballs may be obtained at any of the
>> mirrors listed in
>> http://www.apache.org/dyn/closer.cgi/metron/0.7.0
>>
>> As usual, the secure signatures and confirming hashes may be obtained at
>> https://dist.apache.org/repos/dist/release/metron/0.7.0
>>
>> The release branch on GitHub is
>> https://github.com/apache/metron/tree/Metron_0.7.0 (tag
>> apache-metron_0.7.0-release)
>>
>> The release doc book is at
>> http://metron.apache.org/current-book/index.html
>> The Apache Metron web site at http://metron.apache.org/ has been
>> updated; please refresh your web browser cache if the new links do not
>> immediately appear.
>>
>> Change lists and Release Notes may be obtained at the same locations as
>> the tarballs.
>> For your reading pleasure, the change list is appended to this message.
>>
>> CHANGES (in reverse chronological order):
>>
>> METRON-1928 Bump Metron version to 0.7.0 for release. (justinleet) 
>> closes apache/metron#1293
>> METRON-1931 Update dev utilities to support new repo location 
>> (rlenferink via justinleet) closes apache/metron#1295
>> METRON-1922 Escaping incorrectly handled in current aesh version 
>> (justinleet) closes apache/metron#1291
>> METRON-1867 Remove `/api/v1/update/replace` endpoint (nickwallen) closes 
>> apache/metron#1284
>> METRON-1810 Storm Profiler Intermittent Test Failure (nickwallen) closes 
>> apache/metron#1289
>> METRON-1909 Remove http filter from release utils changelog generation 
>> (justinleet) closes apache/metron#1283
>> METRON-1869 Unable to Sort an Escalated Meta Alert (nickwallen) closes 
>> apache/metron#1280
>> METRON-1889: Add any missing timestamp fields to unified enrichment 
>> topology (mmiklavc via mmiklavc) closes apache/metron#1286
>> METRON-1913 metron-alert UI - Build broken by missing transitive 
>> dependency (tiborm via sardell) closes apache/metron#1285
>> METRON-1845 Correct Test Data Load in Elasticsearch Integration Tests 
>> (nickwallen) closes apache/metron#1247
>> METRON-1888 Default Topology Settings in MPack Cause Profiler to Stall 
>> (nickwallen) closes apache/metron#1276
>> METRON-1887: Add logging to the ClasspathFunctionResolver (mmiklavc via 
>> mmiklavc) closes apache/metron#1274
>> METRON-1873 Update Bootstrap version in Management UI (sardell) closes 
>> apache/metron#1267
>> METRON-1825 Upgrade bro to 2.5.5 (JonZeolla via nickwallen) closes 
>> apache/metron#1237
>> METRON-1890 Metron Vagrant should disable audio (ottobackwards) closes 
>> apache/metron#1277
>> METRON-1874 Create a Parser Debugger (nickwallen) closes 
>> apache/metron#1265
>> METRON-1880 Use Caffeine for Profiler Caching (nickwallen) closes 
>> apache/metron#1270
>> METRON-1877 Nested IF ELSE statements can cause parse errors in Stellar 
>> (justinleet) closes apache/metron#1268
>> METRON-1872 Move rat plugin away from snapshot version (justinleet) 
>> closes apache/metron#1264
>> METRON-1875 Expose configurable global settings in the Alerts UI 
>> (merrimanr) closes apache/metron#1266
>> METRON-1834: Migrate Elasticsearch from TransportClient to new Java REST 
>> API (mmiklavc via mmiklavc) closes apache/metron#1242
>> METRON-1834: Migrate Elasticsearch from TransportClient to new Java REST 
>> API (cstella via mmiklavc)
>> METRON-1749 Update Angular to latest release in Management UI (sardell 
>> via nickwallen) closes apache/metron#1217
>> METRON-1870 Intermittent Stellar REST test failures (merrimanr via 
>> nickwallen) closes apache/metron#1263
>> METRON-1868 metron-committer-common incorrectly checking REPO_NAME 
>> (JonZeolla via jonzeolla) closes apache/metron#1260
>> METRON-1740 Improve Palo Alto parser to handle CONFIG and SYSTEM syslog 
>> messages (liuy-tnz via nickwallen) closes apache/metron#1171
>> METRON-1847 Create reusable script with functions from prepare-commit 
>> (ottobackwards) closes apache/metron#1248
>> METRON-1850 Stellar REST function (merrimanr) closes apache/metron#1250
>> METRON-1858 BasicFireEyeParser check style cleanup and optimization 
>> (ottobackwards) closes apache/metron#1255
>> METRON-1864 Stellar date format test fails after daylight saving 
>> (ottobackwards) closes apache/metron#1258
>> METRON-1861 METRON-1861: REST fails to start when LDAP enabled and 
>> 'Active Spring profiles' config is empty (anandsubbu via justinleet) closes 
>> apache/metron#1256
>> METRON-1853: Add shutdown hook to Stellar Bas

Re: cisco asa elastic search template

2018-11-13 Thread Casey Stella
I believe that these are the only ones that we ship:
https://github.com/apache/metron/tree/master/metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/files
and ASA isn't part of it.  You should be able to adjust the bro or snort
example to suit your needs, though.

On Tue, Nov 13, 2018 at 9:19 AM Habi S Ravi  wrote:

> Where can i find elastic search template for cisco asa parser?
>
> --
> Regards,
> Habi
>


Re: Indexing topology keep crashing

2018-09-13 Thread Casey Stella
Two questions:
1. How much memory are you giving the workers for the indexing topology?
2. How large are the messages you're sending through?

On Thu, Sep 13, 2018 at 2:00 PM Vets, Laurens  wrote:

> Hello list,
>
> I've installed OS updates on my Metron 0.4.2 yesterday, restarted all
> nodes and now my indexing topology keeps crashing.
>
> This is what I see in the Storm UI for the indexing topology:
>
> Topology stats:
> 10m 0s1304380195352012499.8331320
> 3h 0m 0s1304380195352012499.8331320
> 1d 0h 0m 0s1304380195352012499.8331320
> All time1304380195352012499.8331320
>
> Spouts:
> kafkaSpout111299940194908012499.83313200
> metron36702java.lang.OutOfMemoryError: GC overhead limit
> exceeded at java.lang.Long.valueOf(Long.java:840) at
>
> org.apache.storm.kafka.spout.KafkaSpoutRetryExponentialBackoff$RetryEntryTimeStampComparator.compar
>
> Bolts:
> hdfsIndexingBolt11180018000.2787.0221820
> 38.63318000metron36702java.lang.NullPointerException
> at
> org.apache.metron.writer.hdfs.SourceHandler.handle(SourceHandler.java:80)
> at org.apache.metron.writer.hdfs.HdfsWriter.write(HdfsWriter.java:113)
> at org.apache.metrThur, 13 Sep 2018 07:35:02
> indexingBolt11132013200.2177.6621300
> 47.81513000metron36702java.lang.OutOfMemoryError: GC
> overhead limit exceeded at
> java.util.Arrays.copyOfRange(Arrays.java:3664) at
> java.lang.String.<init>(String.java:207) at
> org.json.simple.parser.Yylex.yytext(Yylex.javThur, 13 Sep 2018 07:37:33
>
> When I check the Kafka topic, I can see that there's at least 3 million
> messages in the kafka indexing topic... I _suspect_ that the indexing
> topology tries to write those but fails, restarts, tries to write,
> fails, etc... Metron is currently not ingesting any additional messages,
> but also can't seem to index the current ones...
>
> Any idea on how to proceed?
>
>


Re: Issue with Enrichment topology: java.lang.OutOfMemoryError: GC overhead limit exceeded

2018-08-21 Thread Casey Stella
What version of Metron are you using?

On Tue, Aug 21, 2018 at 1:46 PM Otto Fowler  wrote:

> So, before you were doing GEO you did not have the problem?  If you took
> the GEO out it would stop?
>
>
> On August 21, 2018 at 11:04:56, Anil Donthireddy (
> anil.donthire...@sstech.us) wrote:
>
> Hi,
>
>
>
> We keep getting the error “java.lang.OutOfMemoryError: GC overhead limit
> exceeded” in the Enrichment topology in Storm, at several of the bolts
> defined. Please find the attached screenshot, which shows the bolts in the
> topology where we are getting this issue.
>
>
>
> I tried a couple of configuration changes to provide more RAM to the Storm
> topologies from the Ambari UI and restarted Storm, but it didn't help. Can I
> get some configuration steps to assign more RAM to Storm topologies to
> resolve the issue?
>
>
>
> I see in the logs that for each record, it is trying to update the GeoIP
> data as below. I wonder if it is causing the issue.
>
> o.a.m.e.a.g.GeoLiteDatabase Thread-16-geoEnrichmentBolt-executor[6 6]
> [INFO] [Metron] Update to GeoIP data started with
> /apps/metron/geo/default/GeoLite2-City.mmdb.gz
>
>
>
> Note: I have been facing the issue since I implemented the geo alerts
> rule as specified in the link.
>
>
>
>
> Thanks,
>
> Anil.
>
>


Re: Metron Not Reading From Kafka?

2018-08-17 Thread Casey Stella
Can you turn on root level logging to DEBUG from the storm UI (see
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_storm-component-guide/content/enabling-storm-log-levels.html)
in the parser topology for 1 minute and look at the logs for anything amiss?

Also, can you do a console consumer on the enrichments topic to make
double-plus sure you're not getting any data through into enrichments?

Casey

On Fri, Aug 17, 2018 at 11:07 AM David McGinnis 
wrote:

> All,
>
> We have a Metron 0.4.3 installation on an HDP cluster which has a sensor
> set up to read from a Kafka topic and write the data out to Elasticsearch.
> Data is being inserted into the Kafka topic, and we can read that through
> Kafka console consumer, but the system is not reporting any data coming
> through. The Storm spout says no data has been processed, and the index
> hasn't even been created in Elastic, despite running for nearly a month
> now.
>
> We've searched the worker logs for Storm, and the only error that comes up
> is a (we think) unrelated error about not being able to find the jmxmetrics
> JAR file. Metron reports that the topic is found, and does not tell us that
> the topic is not emitting, so we suspect it sees the data in there.
>
> Do you all have any ideas on where we can look to determine the cause of
> this issue, or things to try?
>
> Thanks!
>
> --
> David McGinnis
> Staff Hadoop Consultant | Avalon Consulting, LLC
> M: (513) 439-0082
> LinkedIn | Google+ | Twitter
>
> -
> This message (including any attachments) contains confidential information
> intended for a specific individual and purpose, and is protected by law.
> If
> you are not the intended recipient, you should delete this message. Any
> disclosure, copying, or distribution of this message, or the taking of any
> action based on it, is strictly prohibited.
>


Re: Google Cloud Platform

2018-08-09 Thread Casey Stella
Completely agree with Justin here.  I have not used GCP, but I would think
that it is not any different than the approach we have for AWS.

Definitely reach back and let us know.  We always welcome doc PRs ;)

Casey

On Thu, Aug 9, 2018 at 06:29 Justin Leet  wrote:

> Unfortunately, I have no familiarity with GCP at all, but a good place to
> start *may* be by reverse engineering some of our EC2 instructions.
> You might be able to sub in GCP steps as needed for provisioning and more
> or less follow the internal instructions otherwise.  Keep in mind Metron
> itself is deployed via Ambari, so as long as you can get a Hadoop cluster
> up and running, the RPMs out and installed + the mpack, you should at least
> be able to take a good stab at getting things up and running.
>
> I'd be curious if anyone has any GCP experience at all and would know if
> this is a reasonable approach.
>
> If you do make an attempt, I'd definitely like to hear back on how it
> goes, and what issues were hit, etc.
>
>
> Justin
>
> On Thu, Aug 9, 2018 at 1:04 AM Kevin Waterson 
> wrote:
>
>> Was hoping somebody else had.. not sure where to start... :)
>>
>>
>> On Thu, Aug 9, 2018 at 2:00 AM James Sirota  wrote:
>>
>>> Not to my knowledge. Are you trying it?
>>>
>>>
>>> 24.07.2018, 22:19, "Kevin Waterson" :
>>>
>>> Has anybody been able to deploy Metron using GCP?
>>>
>>> Thanks
>>> Kevin
>>>
>>>
>>>
>>> ---
>>> Thank you,
>>>
>>> James Sirota
>>> PMC- Apache Metron
>>> jsirota AT apache DOT org
>>>
>>>


Good press for Metron!

2018-08-09 Thread Casey Stella
https://www.darkreading.com/endpoint/oh-no-not-another-security-product/a/d-id/1332453


Re: Enrichment Topology: Split-Join vs Unified

2018-07-26 Thread Casey Stella
What Nick says is absolutely right, but I want to add just a bit of color
around the architectural differences between split-join vs unified here.

Split-join was the first approach to parallelizing the various enrichment
adapters that we have (e.g. hbase, geo, stellar).  We took a very "stormy"
approach to this (see:
https://groups.google.com/forum/#!topic/storm-user/7Gk34vwUATk).  During
performance evaluation, however, we found that throughput was extremely
pinched with this architecture (see architecture at
https://github.com/apache/metron/tree/master/metron-platform/metron-enrichment#enrichment-architecture).
Specifically, the cost of the network overhead in a split/join topology was
overwhelming us and pinching throughput.

We then moved to the unified topology (architecture:
https://github.com/apache/metron/tree/master/metron-platform/metron-enrichment#unified-enrichment-topology),
which removed the network latency overhead, but did things in a less stormy
way.  Specifically enrichments are done in parallel, but inside of a
threadpool in the enrichment bolt.  This saved us network hops at the
expense of adding a threadpool to storm.  In our tests, we've found this to
be the preferred approach.
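
As a rough illustration of the unified idea (all enrichments for a message run in parallel on a thread pool inside one component, with no network hop between them), here is a small Python sketch. It is an analogy only and is not Metron's Java implementation; the adapter functions and field values are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for enrichment adapters (geo, host lookups, ...).
def geo_adapter(message):
    country = "US" if message["ip_dst_addr"].startswith("10.") else "unknown"
    return {"geo.country": country}


def host_adapter(message):
    return {"host.known": message["ip_src_addr"] == "192.168.1.1"}


ADAPTERS = [geo_adapter, host_adapter]


def enrich(message, pool):
    """Run every adapter on the message in parallel and merge the results."""
    futures = [pool.submit(adapter, message) for adapter in ADAPTERS]
    enriched = dict(message)
    for future in futures:
        # result() blocks until that adapter has finished.
        enriched.update(future.result())
    return enriched


def demo():
    message = {"ip_src_addr": "192.168.1.1", "ip_dst_addr": "10.0.0.5"}
    with ThreadPoolExecutor(max_workers=4) as pool:
        return enrich(message, pool)
```

In the split-join layout, each adapter call in this sketch would instead sit in its own bolt, with the message split, sent over the network, and joined again afterwards.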

Hope this helps add color!

Best,

Casey

On Thu, Jul 26, 2018 at 5:31 AM Stefan Kupstaitis-Dunkler <
stefan@gmail.com> wrote:

> Hi,
>
> what are the key differences of Split-Join and Unified in the enrichment
> topology. Which should be used when and why?
>
> Best,
> Stefan
> --
> Stefan Kupstaitis-Dunkler
> https://datahovel.com/
> https://www.meetup.com/Hadoop-User-Group-Vienna/
> https://twitter.com/StefanDunkler
>


Re: CEF Parser not Indexing data via Nifi (SysLogs)

2018-07-20 Thread Casey Stella
So, I would really love to see METRON-1453 go in, because I'd love to
decouple syslog parsing (very common) from generic grok.

On Fri, Jul 20, 2018 at 10:26 AM Otto Fowler 
wrote:

> Metron does not have a generic Syslog Parser.
>
> Nifi has Syslog parsing ( either Records or standard Processor ), in two
> modes.
>
> ParseSyslog is the original, where regex’s are used to parse the syslog
> RFC3164 and RFC5424, but only extracts the common fields ( so the
> ‘additional info’ like program id, message id, structured data in 5424 is
> in the MSG ). I have recently added a record reader for that method as well
> ( Nifi PR#2900 ).
>
> Syslog5424Reader (records) and ParseSyslog5424 are new and, instead of using
> regexes, they use a new library I wrote, simple-syslog-5424, that parses
> RFC 5424 messages completely (note: properly formatted RFC 5424 messages)
> using an ANTLR grammar; see Nifi PR#2805 and Nifi PR#2816.
>
> You should be able to pick the manner best for you and parse that out in
> Nifi if you choose.
>
> Metron parses syslog as required in specific parsers that have messages
> assumed to be embedded in syslog.
>
> What I have been talking about in METRON-1453 and other places is
> separating out the syslog from the parser, such that the parsers don’t need
> to know that the message is delivered embedded in syslog.
>
> The new parser chaining work would give us an avenue to this, and as you
> can see here, MetronPR#1099, I have put that case forward.
>
> If that hits, I think that we’d be able to : 1. parse plain syslog to
> metron 2. parse plain syslog as a transform and then have less complicated,
> more specific parsers for the msg part.
>
> We may end up having syslog parsers and transforms at the end of this.
>
> In the mean time, if you wish to parse plain syslog in Metron, you will
> have to use grok, which doesn’t get structured data.
>
> If you want the complete 5424 set of data, then you can open a jira for
> creating a parser using simple-syslog-5424.
>
>
>
>
> On July 20, 2018 at 04:23:36, Farrukh Naveed Anjum (
> anjum.farr...@gmail.com) wrote:
>
> Hi,
>
> I am trying to index syslog data using the CEF parser with NiFi.
>
> It does not give any error, though; it transports data to Kafka without
> indexing it. It keeps showing FAILED in the spout.
>
> I believe indexing syslog is the most basic use case of all. But Metron
> fails to do it, even in standard format.
>
> I tried Bro for it, but even that keeps giving a PARSER error.
>
> Any help? A fast response would be appreciated.
>
>
>
>
> --
> With Regards
> Farrukh Naveed Anjum
>
>


Re: CEF Parser not Indexing data via Nifi (SysLogs)

2018-07-20 Thread Casey Stella
I just want to pile in here and recommend taking a look at the parser
chaining use-case, which is a walk-through of pulling in firewall logs over
syslog using grok (
https://github.com/apache/metron/tree/master/use-cases/parser_chaining).
Unfortunately this is in master and not yet in a release, but it will show you
how to use grok to parse syslogs containing some other format inside.

Casey

On Fri, Jul 20, 2018 at 5:34 AM Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> What you need to do is NOT ParseCEF in NiFi. Metron should handle the CEF
> parsing.
>
> Just use NiFi to do the listen syslog (no need to parse in NiFi) then
> SplitText to get one line of CEF per kafka message (if your syslog is
> batching, this may not be necessary). Set up a sensor in Metron using the
> CEF parser and you should be fine.
>
> Simon
>
>
> On 20 Jul 2018, at 09:39, Srikanth Nagarajan 
> wrote:
>
> Hi Farrukh,
>
> You can try using the Grok Parser and search for regular expression
> pattern for your log.  You can customize the regex to meet your needs.
>
>
> https://cwiki.apache.org/confluence/display/METRON/2016/04/25/Metron+Tutorial+-+Fundamentals+Part+1%3A+Creating+a+New+Telemetry
>
> Look at Step-5 on how to create a regex for grok parser. Grok parser
> also allows to validate the fields.
>
> Good luck !
>
> Thanks
> Srikanth
>
> On July 20, 2018 at 4:23 AM Farrukh Naveed Anjum 
> wrote:
>
> Hi,
>
> I am trying to index syslog data using the CEF parser with NiFi.
>
> It does not give any error, though; it transports data to Kafka without
> indexing it. It keeps showing FAILED in the spout.
>
> I believe indexing syslog is the most basic use case of all. But Metron
> fails to do it, even in standard format.
>
> I tried Bro for it, but even that keeps giving a PARSER error.
>
> Any help? A fast response would be appreciated.
>
>
>
>
> --
> With Regards
> Farrukh Naveed Anjum
>
>
> __
>
> *Srikanth Nagarajan *
> *Principal*
>
> *Gandiva Networks Inc*
>
> *732.690.1884* Mobile
>
> s...@gandivanetworks.com
>
> www.gandivanetworks.com
>
> Please consider the environment before printing this. NOTICE: The
> information contained in this e-mail message is intended for addressee(s)
> only. If you have received this message in error please notify the sender.
>
>


Re: [ANNOUNCE] Apache Metron release 0.5.0

2018-06-08 Thread Casey Stella
Great job all!  This was a big release with a lot of good stuff.  I
especially like the performance improvements :)

Casey

On Fri, Jun 8, 2018 at 8:54 AM Justin Leet  wrote:

> Hi All,
>
> I’m happy to announce the release of Metron 0.5.0!  Everyone has put a
> lot of work into improvements, new features, and discussion.  Thanks to
> everyone who contributed, and I look forward to having users enjoy our new
> features and improvements.
>
> Details:
> The official release source code tarballs may be obtained at any of the
> mirrors listed in
> http://www.apache.org/dyn/closer.cgi/metron/0.5.0
>
> As usual, the secure signatures and confirming hashes may be obtained at
> https://dist.apache.org/repos/dist/release/metron/0.5.0
>
> The release branch on GitHub is
> https://github.com/apache/metron/tree/Metron_0.5.0 (tag
> apache-metron-0.5.0-release)
>
> The release doc book is at
> http://metron.apache.org/current-book/index.html
> The Apache Metron web site at http://metron.apache.org/ has been updated;
> please refresh your web browser cache if the new links do not immediately
> appear.
>
> Change lists and Release Notes may be obtained at the same locations as the
> tarballs.
> For your reading pleasure, the change list is appended to this message.
>
> Metron CHANGES (in reverse chronological order):
>
> METRON-1586 Defaulting for the source type field in alerts UI does
> not work (merrimanr via justinleet) closes apache/metron#1038
> METRON-1569: Allow user to change field name conversion when
> indexing to Elasticsearch (nickwallen via mmiklavc) closes
> apache/metron#1022
> METRON-1544 Flaky test:
> org.apache.metron.stellar.common.CachingStellarProcessorTest#testCaching
> (nickwallen) closes apache/metron#1015
> METRON-1580 Release candidate check script requires Bro Plugin
> (nickwallen via ottobackwards) closes apache/metron#1034
> METRON-1532 Getting started documentation improvements (sardell
> via nickwallen) closes apache/metron#1001
> METRON-1576 bundle.css RAT failure for
> metron-interface/metron-alerts (justinleet) closes apache/metron#1029
> METRON-1575 Add leet gpg public key to the KEYS file (justinleet)
> closes apache/metron#1028
> METRON-1574 Update version to 0.5.0 (justinleet) closes
> apache/metron#1026
> METRON-1566 Alert updates are not propagated to metaalert child
> alerts (merrimanr) closes apache/metron#1018
> METRON-1565 Metaalerts fix denormalization after moving to active
> status (merrimanr) closes apache/metron#1017
> METRON-1548 Remove hardcoded source:type from Alerts UI
> (justinleet) closes apache/metron#1010
> METRON-1548 Remove hardcoded source:type from Alerts UI (sardell
> via justinleet) closes apache/metron#1010
> METRON-1564 Full dev kafka has offsets.topic.replication.factor
> set to 3 instead of 1 (justinleet) closes apache/metron#1016
> METRON-1552: Add gzip file validation check to the geo loader
> (mmiklavc via mmiklavc) closes apache/metron#1011
> METRON-1551 Profiler Should Not Use Java Serialization
> (nickwallen) closes apache/metron#1012
> METRON-1549: Add empty object test to WriterBoltIntegrationTest
> implementation (mmiklavc via mmiklavc) closes apache/metron#1009
> METRON-1541 Mvn clean results in git status having deleted files.
> (justinleet via nickwallen) closes apache/metron#1003
> METRON-1461 MIN MAX stellar function should take a stats or list
> object and return min/max (MohanDV via nickwallen) closes
> apache/metron#942
> METRON-1184 EC2 Deployment - Updating control_path to accommodate
> for Linux (Ahmed Shah via ottobackwards) closes apache/metron#754
> METRON-1530 Default proxy config settings in metron-contrib need
> to be updated (sardell via merrimanr) closes apache/metron#998
> METRON-1545 Upgrade Spring and Spring Boot (merrimanr) closes
> apache/metron#1008
> METRON-1543 Unable to Set Parser Output Topic in Sensor Config
> (nickwallen) closes apache/metron#1007
> METRON-1539: Specialized RENAME field transformer closes
> apache/incubator-metron#1002
> METRON-1520: Add caching for stellar field transformations closes
> apache/incubator-metron#990
> METRON-1529 CONFIG_GET Fails to Retrieve Latest Config When Run in
> Zeppelin REPL (nickwallen) closes apache/metron#997
> METRON-1511 Unable to Serialize Profiler Configuration
> (nickwallen) closes apache/metron#982
> METRON-1528: Fix missing file in metron.spec (mmiklavc via
> mmiklavc) closes apache/metron#996
> METRON-1445: Update performance tuning guide with more explicit
> parameter instructions (mmiklavc via mmiklavc) closes
> apache/metron#988
> METRON-1502 Upgrade Doxia plugin to 1.8 (justinleet) closes
> apache/metron#974
> METRON-1527: Remove dead test file sitting in source folder
> (mmiklavc via mmiklavc) closes apache/metron#994
> METRON-1499 Enable Configuration of Unified Enrichment Topology
> via Ambari (nickwallen) closes apache/metron#

Re: Parse Exception at SplitterBolt in Profiler Topology: 2018-04-27 10:57:16.575 o.a.m.p.b.ProfileSplitterBolt Thread-6-splitterBolt-executor[7 7] [ERROR] Unexpected failure: message='null'

2018-04-27 Thread Casey Stella
It can be, depending on the use case.  In your case,
PROFILE_GET('list-of-domains-per-user', ...) returns a list of the values
stored in the 'list-of-domains-per-user' profile, so it's a list of
(probably) lists.  In the REPL, you can make the PROFILE_GET call and see
what it returns, but I suspect it's a list of lists.

You probably want to flatten that list of lists.  I wish I could tell you
that there's a FLATTEN or LIST_MERGE stellar function, but I don't see one
(opportunity for contribution!).  We can cobble one together for you with
REDUCE, however, like so:
"init" : {
  "p" : "PROFILE_GET('list-of-domains-per-user', ...)",
  "lateral_init_set" : "REDUCE(p, (s, x) -> REDUCE(x, (a, b) -> LIST_ADD(a, b), s), [])"
  ...
}

[Aside for the reader: That is hideous, right?  Definitely want that
LIST_MERGE function.  Any ambitious person want to get their feet wet in
open source contribution and provide it?  Just add it right next to
LIST_ADD in DataStructureFunctions.java]

Now, if you were willing for lateral_init_set to be a SET rather than a
list, then we can do the following:
"init" : {
  "p" : "PROFILE_GET('list-of-domains-per-user', ...)",
  "lateral_init_set" : "SET_MERGE(p)"
  ...
}

I suspect the second option would work out better since you don't care
about duplicates in that list of domains.
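
For anyone who finds the nested REDUCE hard to read, here is a rough Python
sketch of what the two "init" expressions above compute.  The sample data is
made up, list_add is only a stand-in for LIST_ADD, and functools.reduce is
only a stand-in for Stellar's REDUCE; this is not how Stellar evaluates the
expressions internally.

```python
from functools import reduce

# Hypothetical result of PROFILE_GET: one list of domains per stored value.
p = [["a.com", "b.com"], ["b.com", "c.com"], []]


def list_add(lst, item):
    # Stand-in for LIST_ADD: append the item and hand the list back.
    lst.append(item)
    return lst


# REDUCE(p, (s, x) -> REDUCE(x, (a, b) -> LIST_ADD(a, b), s), [])
# The outer reduce walks the lists, the inner one appends each element.
flattened = reduce(lambda s, x: reduce(list_add, x, s), p, [])

# SET_MERGE(p): union of all the inner collections, duplicates dropped.
merged = set().union(*p)

print(flattened)
print(sorted(merged))
```

The first form keeps duplicates and order; the second drops duplicates, which
is why it suits a "have we seen this domain" style of check.
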
On Fri, Apr 27, 2018 at 16:44 Anil Donthireddy 
wrote:

> I would like to know whether using PROFILE_GET is appropriate in a profiler
> definition.
>
> I need to use PROFILE_GET in the enrichment configuration as well as in the
> profiler definition, and I am seeing unexpected results when I experiment
> with it in both places.
>
>
>
> Thanks,
>
> Anil.
>
>
>
> *From:* Casey Stella [mailto:ceste...@gmail.com]
> *Sent:* Saturday, April 28, 2018 12:26 AM
> *To:* user@metron.apache.org
> *Cc:* Satish Abburi 
> *Subject:* Re: Parse Exception at SplitterBolt in Profiler Topology:
> 2018-04-27 10:57:16.575 o.a.m.p.b.ProfileSplitterBolt
> Thread-6-splitterBolt-executor[7 7] [ERROR] Unexpected failure:
> message='null'
>
>
>
> That exception appears to me to be a problem in parsing the message coming
> into the profiler as opposed to having trouble parsing the profiler
> config.  That list of integers is the raw characters in the message.  It
> may be worthwhile to turn the array of integers back into a string to see
> exactly what was in the raw message.  From the python REPL that could be as
> simple as ''.join(chr(i) for i in [123, ..., 34, 48, 34, 125]), where the
> "..." stands for the integer values elided from the log excerpt.
>
>
>
> On Fri, Apr 27, 2018 at 2:40 PM Anil Donthireddy <
> anil.donthire...@sstech.us> wrote:
>
> Hi,
>
>
>
> I defined a profile definition for my requirement to perform profile
> statistics. The calculation of profiler statistics for current profiling
> period will depend on the previous profile flushed stats.
>
>
>
> So, I am trying to read the previous stats during initialization ( “init”
> field in json ) part of profile for the current profiling period.
>
>
>
> The definition works fine when there are profile stats from the previous
> period, which means my profile definition works fine for the first 15 minutes
> (my profile period is set to 15 mins). When the profiler flushes the profile
> stats into HBase for the first 15 minutes and starts the next period, I get
> the parse exception shown in the attached doc, in the Profiler topology at
> the SplitterBolt.
>
>
>
> I have executed all lines of code in the Stellar shell (instantiated with
> the zookeeper config: ./bin/stellar -z ). It works fine without any issues
> in all cases, which means I am able to fetch the previous profile
> stats, initialize for the current record and execute all lines of code in
> the profiler definition with the expected result.
>
>
>
> I have not been able to figure out what else is going wrong during
> SplitterBolt execution. Could someone go through the attached logs
> and help me troubleshoot the issue further?
>
>
>
> High Level Exception stack trace:
>
> ***Exception seen at profiler
> topology logs ***
>
> 2018-04-27 10:56:06.483 o.a.h.m.s.s.StormTimelineMetricsSink Thread-23
> [WARN] Unable to send metrics to collector by address:
> http://null:6188/ws/v1/timeline/metrics
>
> 2018-04-27 10:57:16.575 o.a.m.p.b.ProfileSplitterBolt
> Thread-6-splitterBolt-executor[7 

Re: Parse Exception at SplitterBolt in Profiler Topology: 2018-04-27 10:57:16.575 o.a.m.p.b.ProfileSplitterBolt Thread-6-splitterBolt-executor[7 7] [ERROR] Unexpected failure: message='null'

2018-04-27 Thread Casey Stella
That exception appears to me to be a problem in parsing the message coming
into the profiler as opposed to having trouble parsing the profiler
config.  That list of integers is the raw characters in the message.  It
may be worthwhile to turn the array of integers back into a string to see
exactly what was in the raw message.  From the python REPL that could be as
simple as ''.join(chr(i) for i in [123, ..., 34, 48, 34, 125]), where the
"..." stands for the integer values elided from the log excerpt.
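
A slightly fuller version of that idea, as a sketch: the array below is a
short made-up stand-in (the real one is elided in the report), and the
masking step is my assumption that the log prints signed Java bytes, so
non-ASCII bytes would show up as negative numbers.

```python
import json

# Made-up stand-in for the integer array copied from the log line.
raw = [123, 34, 97, 34, 58, 34, 48, 34, 125]

# Mask to 0..255 in case the values are signed bytes, then decode.
# errors="replace" keeps going on bad UTF-8 so the bad spot stays visible.
text = bytes(b & 0xFF for b in raw).decode("utf-8", errors="replace")
print(text)

# Re-parse to find where a JSON parser gives up, with some context.
try:
    json.loads(text)
    print("parsed OK")
except json.JSONDecodeError as e:
    start = max(0, e.pos - 20)
    print(f"parse failed at position {e.pos}: {text[start:e.pos + 20]!r}")
```

If the message fails to parse here too, the text around the reported position
is usually enough to tell what the upstream producer wrote incorrectly.
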

On Fri, Apr 27, 2018 at 2:40 PM Anil Donthireddy 
wrote:

> Hi,
>
>
>
> I defined a profile definition for my requirement to perform profile
> statistics. The calculation of profiler statistics for current profiling
> period will depend on the previous profile flushed stats.
>
>
>
> So, I am trying to read the previous stats during initialization ( “init”
> field in json ) part of profile for the current profiling period.
>
>
>
> The definition works fine when there are profile stats from the previous
> period, which means my profile definition works fine for the first 15 minutes
> (my profile period is set to 15 mins). When the profiler flushes the profile
> stats into HBase for the first 15 minutes and starts the next period, I get
> the parse exception shown in the attached doc, in the Profiler topology at
> the SplitterBolt.
>
>
>
> I have executed all lines of code in the Stellar shell (instantiated with
> the zookeeper config: ./bin/stellar -z ). It works fine without any issues
> in all cases, which means I am able to fetch the previous profile
> stats, initialize for the current record and execute all lines of code in
> the profiler definition with the expected result.
>
>
>
> I have not been able to figure out what else is going wrong during
> SplitterBolt execution. Could someone go through the attached logs
> and help me troubleshoot the issue further?
>
>
>
> High Level Exception stack trace:
>
> ***Exception seen at profiler
> topology logs ***
>
> 2018-04-27 10:56:06.483 o.a.h.m.s.s.StormTimelineMetricsSink Thread-23
> [WARN] Unable to send metrics to collector by address:
> http://null:6188/ws/v1/timeline/metrics
>
> 2018-04-27 10:57:16.575 o.a.m.p.b.ProfileSplitterBolt
> Thread-6-splitterBolt-executor[7 7] [ERROR] Unexpected failure:
> message='null', tuple='{value=[123, ***lot of integer values here in
> the logs, ending with*** 34, 48, 34, 125]}'
>
> org.json.simple.parser.ParseException: null
>
> at org.json.simple.parser.Yylex.yylex(Yylex.java:610)
> ~[stormjar.jar:?]
>
> at
> org.json.simple.parser.JSONParser.nextToken(JSONParser.java:269)
> ~[stormjar.jar:?]
>
> at
> org.json.simple.parser.JSONParser.parse(JSONParser.java:118)
> ~[stormjar.jar:?]
>
> at
> org.json.simple.parser.JSONParser.parse(JSONParser.java:81)
> ~[stormjar.jar:?]
>
> at
> org.json.simple.parser.JSONParser.parse(JSONParser.java:75)
> ~[stormjar.jar:?]
>
> at
> org.apache.metron.profiler.bolt.ProfileSplitterBolt.doExecute(ProfileSplitterBolt.java:108)
> ~[stormjar.jar:?]
>
> at
> org.apache.metron.profiler.bolt.ProfileSplitterBolt.execute(ProfileSplitterBolt.java:94)
> [stormjar.jar:?]
>
> at
> org.apache.storm.daemon.executor$fn__10250$tuple_action_fn__10252.invoke(executor.clj:730)
> [storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235]
>
> at
> org.apache.storm.daemon.executor$mk_task_receiver$fn__10171.invoke(executor.clj:462)
> [storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235]
>
> at
> org.apache.storm.disruptor$clojure_handler$reify__9685.onEvent(disruptor.clj:40)
> [storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235]
>
> at
> org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:472)
> [storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235]
>
> at
> org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:451)
> [storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235]
>
> at
> org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
> [storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235]
>
> at
> org.apache.storm.daemon.executor$fn__10250$fn__10263$fn__10316.invoke(executor.clj:849)
> [storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235]
>
> at
> org.apache.storm.util$async_loop$fn__553.invoke(util.clj:484)
> [storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235]
>
> at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
>
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
>
>
>
> **Exception at
> SplitterBolt:  ***
>
> Unexpected character (.) at position 885. at
> org.json.simple.parser.Yylex.yylex(Yylex

Re: Define a function that can be used in Stellar

2018-02-02 Thread Casey Stella
We use a guava cache to cache the data for 24 hours.  You can see how it's
done here:
https://github.com/apache/metron/blob/master/metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/stellar/ObjectGet.java

We also do something like this in GEO_GET as well, but it's a bit more
complex.
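
To make the idea concrete, here is a minimal Python sketch of an
expire-after-write cache.  It is not the ObjectGet code (that is Java and
uses Guava); the class, the loader and the path are all hypothetical, and
the injectable clock is there only so the expiry can be shown without
waiting a day.

```python
import time


class ExpiringCache:
    """Tiny expire-after-write cache; illustrative only."""

    def __init__(self, loader, ttl_seconds=24 * 60 * 60, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, time it was stored)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[1] < self._ttl:
            return hit[0]  # still fresh, no reload
        value = self._loader(key)
        self._entries[key] = (value, now)
        return value


loads = []


def load_from_hdfs(path):
    # Hypothetical loader; a real one would read and deserialize the file.
    loads.append(path)
    return {"path": path}


fake_now = [0.0]
cache = ExpiringCache(load_from_hdfs, clock=lambda: fake_now[0])
cache.get("/hypothetical/assets.ser")  # first call loads
cache.get("/hypothetical/assets.ser")  # second call is served from memory
fake_now[0] += 24 * 60 * 60 + 1
cache.get("/hypothetical/assets.ser")  # entry expired, loaded again
print(len(loads))
```

The point for a custom function is the same: the expensive load happens on a
miss or after expiry, not on every call.
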

On Fri, Feb 2, 2018 at 9:35 AM, Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> I forgot we added OBJECT_GET. How does the caching work on that?
>
> Simon
>
>
> On 2 Feb 2018, at 14:33, Nick Allen  wrote:
>
> There are many functions that use the global configuration.  For example,
> GEO_GET in org.apache.metron.enrichment.stellar.GeoEnrichmentFunctions.
> There might be a better example, but that one is staring at me at the
> moment.
>
> There is an OBJECT_GET function defined in
> org.apache.metron.enrichment.stellar.ObjectGet
> that was purpose-built to retrieve files from HDFS.  If you wanted to
> retrieve a configuration from HDFS that would be a good example (if you
> can't just use that function directly).
>
> On Fri, Feb 2, 2018 at 8:50 AM Ali Nazemian  wrote:
>
>> Is there any Stellar function already implemented in Metron that has
>> a config file associated with it? I am trying to get an idea of how it
>> works.
>>
>> On 3 Feb. 2018 00:44, "Simon Elliston Ball" 
>> wrote:
>>
>>> Depends how you write the function class, but most likely, yes. Hence
>>> global config option.
>>>
>>> Simon
>>>
>>> On 2 Feb 2018, at 13:42, Ali Nazemian  wrote:
>>>
>>> Does it mean every time the function gets called it will load the
>>> config, but if I use the global one it will only read it one time and it
>>> will be available in memory?
>>>
>>> On 2 Feb. 2018 21:53, "Simon Elliston Ball" 
>>> wrote:
>>>
 Shouldn’t be. The one thing I would point out though is that you don’t
 necessarily know which supervisor you will be running from, so pulling from
 HDFS would make sense. That said, the performance implications are probably
 not great. A good option here would be to have the config available in the
 global config for example and refer to that, since most instances of
 stellar apply global config to their context.

 Simon


 On 2 Feb 2018, at 07:14, Ali Nazemian  wrote:

 Will there be any problem if the Stellar function we want to implement needs
 to load an external config file?

 Cheers,
 Ali

 On Thu, Jan 18, 2018 at 4:58 PM, Ali Nazemian 
 wrote:

> Thanks, All.
>
> Yes, Nick. It is highly related to our use case and the way that we
> are going to enrich events with assets and vulnerability properties. It is
> not a general case at all.
>
> Cheers,
> Ali
>
> On Thu, Jan 18, 2018 at 5:43 AM, Matt Foley  wrote:
>
>> Besides the example code Simon mentioned at
>> https://github.com/apache/metron/tree/master/metron-stellar/stellar-3rd-party-example ,
>> there is some documentation at
>> http://metron.apache.org/current-book/metron-stellar/stellar-common/3rdPartyStellar.html
>>
>>
>>
>> *From: *Nick Allen 
>> *Reply-To: *"user@metron.apache.org" 
>> *Date: *Wednesday, January 17, 2018 at 4:46 AM
>> *To: *"user@metron.apache.org" 
>> *Subject: *Re: Define a function that can be used in Stellar
>>
>>
>>
>>
>>
>>
>>
>> If something we have already does not fit the bill, I would recommend
>> creating that function in Java.   Since you described it as "a bit complex"
>> and "the logic would be complicated" I don't see any value in defining
>> something like this in Stellar with named functions.
>>
>>
>>
>> Best
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Jan 17, 2018 at 7:38 AM Simon Elliston Ball <
>> si...@simonellistonball.com> wrote:
>>
>> Have you looked at the recent TLSH functions in Stellar? We already
>> have that for similarity preserving hashes.
>>
>>
>>
>> Simon
>>
>>
>>
>>
>> On 17 Jan 2018, at 12:35, Ali Nazemian  wrote:
>>
>> It is a bit complex. We want to create a function that accepts a list
>> of arguments for an asset and generates an asset identifier that can be
>> used as a row_key for the enrichment store. The logic would be complicated,
>> though. We may need to include some sort of similarity-aware hash function
>> as a part of this custom function.
>>
>>
>>
>> On Wed, Jan 17, 2018 at 10:32 PM, Nick Allen 
>> wrote:
>>
>> Ali - Can you describe the logic that you are trying to perform? That
>> would be useful as a use case to help drive a discussion around creating
>> named functions in Stellar.
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Jan 17, 2018 at 6:29 AM Ali Nazemian 
>> wrote:
>>
>> Thanks, Simon. We have alread

Re: Metron User Community Meeting Call

2018-01-26 Thread Casey Stella
I can't wait!  This is going to be really cool :)

On Fri, Jan 26, 2018 at 5:25 PM, James Sirota  wrote:

> Yeah very interested in the presentation as well
>
> 26.01.2018, 15:15, "Simon Elliston Ball" :
> > This is going to be a really exciting call. Looking forward to seeing
> how the GCR Canary sings :)
> >
> > I’m going to volunteer https://hortonworks.zoom.us/my/simonellistonball
> as a location for the meeting.
> >
> > I would also support the idea of a quick poll on what people are doing
> with Metron, and maybe if anyone wants to volunteer at the end of the
> meeting it would be great to have an open mic of use cases.
> >
> > Talk to you all Wednesday.
> >
> > Simon
> >
> >>  On 26 Jan 2018, at 22:10, Seal, Steve  wrote:
> >>
> >>  HI all,
> >>
> >>  I have several people on my team that are looking forward to hearing
> about Ahmed’s work.
> >>
> >>  Steve
> >>
> >>  From: Daniel Schafer [mailto:daniel.scha...@sstech.us]
> >>  Sent: Friday, January 26, 2018 5:05 PM
> >>  To: user@metron.apache.org; d...@metron.apache.org
> >>  Subject: Re: Metron User Community Meeting Call
> >>
> >>  My team members and I would like to join as well.
> >>  We can provide Zoom Meeting login if necessary.
> >>
> >>  Thanks
> >>
> >>  Daniel
> >>  7134806608
> >>
> >>  From: Ahmed Shah <AhmedShah@cmail.carleton.ca>
> >>  Reply-To: "user@metron.apache.org" <user@metron.apache.org>
> >>  Date: Friday, January 26, 2018 at 2:06 PM
> >>  To: "d...@metron.apache.org" <d...@metron.apache.org>,
> >>  "user@metron.apache.org" <user@metron.apache.org>
> >>  Subject: Re: Metron User Community Meeting Call
> >>
> >>  Looking forward to presenting!
> >>
> >>  Just a thought...
> >>  In advance, should we create a Google Form to collect survey data on
> >>  who is using Metron, how they are using it, etc., and present the results
> >>  to the group?
> >>
> >>  -Ahmed
> >>  ___
> >>  Ahmed Shah (PMP, M. Eng.)
> >>  Cybersecurity Analyst & Developer
> >>  GCR - Cybersecurity Operations Center
> >>  Carleton University - cugcr.com <https://cugcr.com/tiki/lce/index.php>
> >>
> >>  From: Andrew Psaltis <psaltis.and...@gmail.com>
> >>  Sent: January 26, 2018 1:53 PM
> >>  To: d...@metron.apache.org 
> >>  Subject: Re: Metron User Community Meeting Call
> >>
> >>  Count me in. Very interested to hear about Ahmed's journey.
> >>
> >>  On Fri, Jan 26, 2018 at 8:58 AM, Kyle Richardson <
> kylerichards...@gmail.com >
> >>  wrote:
> >>
> >>  > Thanks! I'll be there. Excited to hear Ahmed's successes and
> challenges.
> >>  >
> >>  > -Kyle
> >>  >
> >>  > On Thu, Jan 25, 2018 at 7:44 PM zeo...@gmail.com <zeo...@gmail.com> wrote:
> >>  >
> >>  > > Thanks Otto, I'm in to attend at that time/place.
> >>  > >
> >>  > > Jon
> >>  > >
> >>  > > On Thu, Jan 25, 2018, 14:45 Otto Fowler wrote:
> >>  > >
> >>  > >> I would like to propose a Metron user community meeting. I propose
> >>  > >> that we set the meeting next week, and will throw out Wednesday,
> >>  > >> January 31st at 09:30AM PST, 12:30 on the East Coast and 5:30 in
> >>  > >> London Towne. This meeting will be held over a web-ex, the details of
> >>  > >> which will be included in the actual meeting notice.
> >>  > >> Topics
> >>  > >>
> >>  > >> We have a volunteer for a community member presentation:
> >>  > >>
> >>  > >> Ahmed Shah (PMP, M. Eng.) Cybersecurity Analyst & Developer GCR -
> >>  > >> Cybersecurity Operations Center Carleton University - cugcr.com <http://cugcr.com>
> >>  > >>
> >>  > >> Ahmed would like to talk to the community about
> >>  > >>
> >>  > >> - Who the GCR group is
> >>  > >> - How they use Metron 0.4.1
> >>  > >> - Walk through their dashboards, UI management screen, nifi
> >>  > >> - Challenges we faced up until now
> >>  > >>
> >>  > >> I would like to thank Ahmed for stepping forward for this meeting.
> >>  > >>
> >>  > >> If you have something you would like to present or talk about, please
> >>  > >> reply here! Maybe we can have people ask for “A better explanation of
> >>  > >> feature X” type things

Re: Stellar on another platform?

2018-01-18 Thread Casey Stella
Yeah, what Otto said :) I'd just add one thing: Stellar really requires
nothing more than:

   1. Existing inside of a JVM environment.  We use it inside of Storm and
   MapReduce, but it could be used inside of Spark or whatever.
   2. Having a VariableResolver implementation which can map your data to
   variable -> value pairs.  The default one that we use in Metron wraps
   Maps, so the implementation is pretty trivial, but we also
   have a VariableResolver in the pcap work that will pull fields from the
   header of raw packets and expose them to Stellar.  All this to say that
   normalizing your data to work with Stellar is as simple as creating a
   VariableResolver which can take your raw data format and allow Stellar to
   query it for variables.

We worked fairly hard to move stellar core into a position where it is
decoupled from Metron, so it shouldn't be too difficult to repurpose it.
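
To illustrate point 2, here is a small Python sketch of the resolver idea.
The real VariableResolver is a Java interface in stellar-common; everything
below (class names, column layout, the toy predicate) is hypothetical and
only mirrors the shape of "ask for a variable by name, get a value back".

```python
class MapResolver:
    """Resolves variables from an already-parsed dict of fields."""

    def __init__(self, fields):
        self._fields = fields

    def exists(self, name):
        return name in self._fields

    def resolve(self, name):
        return self._fields.get(name)


class CsvRecordResolver:
    """Hypothetical resolver over a raw CSV record; parses on first use."""

    COLUMNS = ("ip_src_addr", "ip_dst_addr", "ip_dst_port")

    def __init__(self, line):
        self._line = line
        self._parsed = None

    def _fields(self):
        if self._parsed is None:
            self._parsed = dict(zip(self.COLUMNS, self._line.split(",")))
        return self._parsed

    def exists(self, name):
        return name in self._fields()

    def resolve(self, name):
        return self._fields().get(name)


def dst_is_internal(resolver):
    # Stand-in for an expression engine: it only asks the resolver for
    # variables by name and never sees the underlying data format.
    return (resolver.exists("ip_dst_addr")
            and resolver.resolve("ip_dst_addr").startswith("10."))


from_map = MapResolver({"ip_dst_addr": "10.0.0.7"})
from_csv = CsvRecordResolver("192.168.1.5,8.8.8.8,53")
print(dst_is_internal(from_map), dst_is_internal(from_csv))
```

The same predicate runs against both sources, which is the property that
lets an existing normalization layer sit in front of Stellar.
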

On Thu, Jan 18, 2018 at 7:58 AM, Otto Fowler 
wrote:

> Please comment on the jira.  We can come up with what would be a good
> example program, obviously massively commented to show this.
> Down the line, we could even have archetypes for different application
> types… but that is just me thinking down the line ;)
>
>
> On January 18, 2018 at 07:57:17, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> I would also say that you should look at METRON-876.
>
> This is the umbrella jira for the effort to separate stellar into a more
> independent module.
>
>
>
>
> On January 18, 2018 at 07:54:38, Otto Fowler (ottobackwa...@gmail.com)
> wrote:
>
> I have created METRON-1409.
>
>
> There are several ways to look at hosting stellar to get examples:
>
>    - The unit tests
>    - The shell
>    - The storm bolts and transformer classes
>
> From a high level, to host stellar you need to:
>
>    - Include stellar-common in your pom
>    - Create a Context
>    - Initialize the function resolver
>    - Create the StellarProcessor
>    - Create a variable resolver
>
> Then you set everything up, set the vars for the call in the variable
> resolver, and have the processor execute a statement.
>
> The issue right now, and the reason we need METRON-1409, is that each of
> the things above is *so* integrated into the flow of the host that it
> is not obvious what is going on.
>
> The tests are pretty straightforward, but don’t show the context init
> very well.
>
> I would suggest that you start with the unit tests, as they are the most
> concise. Look through them, debug through them etc.
>
> Then move onto the shell.
>
> I would look at the bolts/transformers last ( although they are the most
> analogous to what I think you want to do ).
>
>
>
>
>
>
> On January 17, 2018 at 17:34:45, Ian Abreu (iab...@wayfair.com) wrote:
>
> Hey all,
>
>
>
> We’ve come across the design decision where we’d like to use Metron
> tooling as a framework to build our SIEM around. This being the case,
> stellar is something that we’d like to use, but we’ve currently got
> different enrichment and normalization layers.
>
>
>
> So my question is this: Has anyone, or could anyone point me to a resource
> that’d help to normalize our data in such a way that Stellar could be used
> downstream from our data manipulation/normalization layer?
>
>
>
> Cheers,
>
> Z0r0
>
>


Re: [ALL] List Replies

2018-01-17 Thread Casey Stella
+1, if it doesn't happen on the list, it doesn't happen in Apache.

On Wed, Jan 17, 2018 at 6:55 AM, Otto Fowler 
wrote:

> The goal of the user list is to foster the Apache Metron community by
> allowing for common discussion of the uses and application of Apache
> Metron.  The list’s archives also provide a valuable resource for people to
> look through for ideas and answers to questions.
>
> Unless someone specifically requests an off-list contact, please keep
> replies and discussion on the list.  That way everyone gets the benefit (
> both now and in the future through the archives ).
>
> ottO
>
>


Re: Motivations for using Apache Storm?

2018-01-12 Thread Casey Stella
At the time, we chose Storm for a few reasons:

   - Metron inherited its codebase from OpenSOC, which chose Storm as it
   predated Flink and Spark Streaming, the two other major contenders in the
   Hadoop stack
   - Storm was battle tested at the time and, at least then, we had some
   concern around the maturity and performance of the other contenders
   - Specifically, alternatives involving microbatching were a concern for
   high-throughput topologies like pcap

Going forward and looking back, I think it was a fine choice, honestly.
Ultimately, it quickly became apparent that it was necessary to produce a
simplified abstraction on top of the streaming technologies rather than
adopting their abstractions as the core point of extension for the
architecture.  So, for instance, we quickly realized that it was a better
experience for people to use Stellar to add enrichments rather than to
either modify code *or* flux files directly and possibly bounce
topologies.  This allowed us to not concern the user with the streaming
technology and also, fwiw, makes it easy to pivot to another streaming
technology if we so desire in the future.

Hope that provides some clarity!  Thanks for the very interesting question.
:)

Casey

On Fri, Jan 12, 2018 at 6:29 PM, M. Aaron Bossert 
wrote:

> Perhaps it might be useful for you to articulate your use case?  Not to
> sound like a generic non-answer, but most of the streaming/CEP frameworks
> are pretty good, narrowing down a short list of which ones to use beyond
> basic requirements can be highly subjective:  does your language of choice
> have a “first class” API?  do you already have a Hadoop environment that is
> more tightly integrated with one framework or another? Do you prefer a
> newer framework with newer features and a less mature code base? Or do you
> prefer a well tested/tried framework?
>
> All of these choices boil down to much more than a quantitative
> evaluation based on benchmark performance...
>
> --
> *From:* Tarik Courdy 
> *Sent:* Friday, January 12, 2018 5:15:13 PM
> *To:* user@metron.apache.org
> *Subject:* Motivations for using Apache Storm?
>
> Good afternoon -
>
> I've started doing research on various stream processing frameworks and it
> seems like there are a ton of them out there.
>
> Out of curiosity what were the underlying motivations to go with Storm as
> opposed to one of the other frameworks out there?
>
> Thank you for your time.
>


Re: Full Dev -> Heartbeat issues

2018-01-08 Thread Casey Stella
I haven't seen that one.  I spun one up from master on Friday and it seemed
ok.  Sorry, "works for me!" isn't super helpful, but it may be relevant
since master is close to 0.4.2 :)
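
If it helps anyone checking for the /etc/hosts condition Otto describes
below, here is a rough Python sketch that flags a hostname mapped to a
loopback address.  The function name is made up, the sample text is copied
from his message, and on a real node you would read /etc/hosts instead.

```python
def loopback_aliases(hosts_text, hostname):
    """Return the loopback addresses that the given hostname is mapped to."""
    hits = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        addr, *names = line.split()
        if hostname in names and (addr.startswith("127.") or addr == "::1"):
            hits.append(addr)
    return hits


sample = """127.0.0.1 node1 node1
127.0.0.1   localhost

## vagrant-hostmanager-start
192.168.66.121 node1

## vagrant-hostmanager-end
"""

print(loopback_aliases(sample, "node1"))
```

A non-empty result for the node's own hostname matches the situation in the
report; after the offending line is removed the list comes back empty.
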

On Mon, Jan 8, 2018 at 11:11 AM, Otto Fowler 
wrote:

> I just started up full dev from the 0.4.2 release tag, and ended up with
> failed heartbeats for all my services in ambari.
> After investigation, I found that my /etc/hosts ( on node1 ) had multiple
> entries for node1 :
>
> [vagrant@node1 ~]$ cat /etc/hosts
> 127.0.0.1 node1 node1
> 127.0.0.1   localhost
>
> ## vagrant-hostmanager-start
> 192.168.66.121 node1
>
> ## vagrant-hostmanager-end
>
> After removing the 127.0.0.1 node1 node1 line and restarting the machine +
> all the services etc my issues are resolved and my board is green.
>
> I am not sure why this may happen.
> Hopefully if you are seeing this, this will help.
>
> Anyone know why this may happen?
>
>
> ottO
>
>
>


Re: [DISCUSS] Dropping support for elastic 2.x

2017-10-04 Thread Casey Stella
I agree, we shouldn't provide documentation for ES upgrade as ES has that
covered.  I also agree that we should provide doc for ES mpack upgrade.

On Wed, Oct 4, 2017 at 3:47 PM, James Sirota  wrote:

> I am in favor of moving to 5.x and dropping support for 2.x. As Justin
> mentioned, Elastic have very good docs around cluster migrations and the
> procedure itself to upgrade from 2.x to 5.x is very simple.
> https://www.elastic.co/guide/en/elasticsearch/reference/current/restart-upgrade.html
>
> I don't agree that we should provide documentation for ES upgrade. I think
> pointing to elastic docs should be good enough. I do agree that we should
> provide documentation for the ES mpack upgrade, which we will.
>
> With us supporting 5.x I see little reason to be backwards compatible to
> 2.x
>
>
> 04.10.2017, 11:59, "Farrukh Naveed Anjum" :
>
> It's better to move to Elasticsearch 5 or 6, as Elasticsearch 2.x is
> really pretty old.
>
> On Wed, Oct 4, 2017 at 9:45 PM, Simon Elliston Ball <
> si...@simonellistonball.com> wrote:
>
> A number of people are currently working on upgrading the ES support in
> Metron to 5.x (including the clients, and the mpack managed install).
>
> Would anyone have any objections to dropping formal support for 2.x as a
> result of this work? In theory the clients should be backward compatible
> against older data stores, so metron could be upgraded without needing an
> elastic upgrade.
>
> In practice, we would need to do pretty extensive testing and I wouldn’t
> want us to have to code around long term support on older clients if no-one
> in the community cares enough about the older ES. Do we think there is a
> case to be made for maintaining long term support for older clients?
>
> Simon
>
>
>
>
> --
> With Regards
> Farrukh Naveed Anjum
>
>
>
> ---
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
>
>


Re: [ANNOUNCE] Apache Metron Release 0.4.1

2017-09-19 Thread Casey Stella
Fantastic!  I'm really proud of this release and a great job was done by
Matt and the community for getting this out!

On Tue, Sep 19, 2017 at 1:24 PM, Frank Horsfall <
frankhorsf...@cunet.carleton.ca> wrote:

> Congrats guys!
>
>
>
> Frank
>
>
>
>
>
> *From:* zeo...@gmail.com [mailto:zeo...@gmail.com]
> *Sent:* Tuesday, September 19, 2017 4:23 PM
> *To:* Matt Foley ; d...@metron.apache.org;
> user@metron.apache.org
> *Subject:* Re: [ANNOUNCE] Apache Metron Release 0.4.1
>
>
>
> Great job everybody, this is a really top notch release.  Well done
>
> Jon
>
>
>
> On Tue, Sep 19, 2017, 15:53 Otto Fowler  wrote:
>
> Congratulations everyone, great job.  Thank you Matt!
>
>
>
>
>
> On September 19, 2017 at 15:22:21, Matt Foley (ma...@apache.org) wrote:
>
> I’m very happy to announce the public release of Apache Metron version
> 0.4.1.
>
> The release, with PGP signature and checksums, may be obtained from our
> public website at http://metron.apache.org/documentation/#releases
> (Note that you may need to refresh this web page in your browser, to
> assure the latest version.)
>
> or the source code may be cloned from our github repository at
> https://github.com/apache/metron.git (tag: apache-metron-0.4.1-release)
>
> The newest documentation book is at
> http://www.apache.org/dyn/closer.cgi/metron/0.4.1/site-book/index.html
>
> This release represents a great deal of work by the developer community,
> including over 100 enhancements, bug fixes, and innovations, since 0.4.0.
> The complete list of changes is at
> http://www.apache.org/dyn/closer.cgi/metron/0.4.1/CHANGES
>
> The release notes are at
> http://www.apache.org/dyn/closer.cgi/metron/0.4.1/RELEASE_NOTES
>
> Many thanks to all who contributed, and enjoy your new release!
>
> Warm regards,
> --Matt Foley
> release manager
>
>
> --
>
> Jon
>


Re: Apache Metron and STIX

2017-08-18 Thread Casey Stella
At the moment, we are dependent upon the STIX library from MITRE, which is
STIX 1.x.  The schemata that we support are at
https://github.com/STIXProject/java-stix/tree/v1.2.0.2/src/main/resources/schemas

On Fri, Aug 18, 2017 at 1:26 PM, Ahmed Shah 
wrote:

> Hello,
>
> Just wondering if Metron is able to ingest STIX 2.0 formatted data.
>
> 
> I came across the following link:
> 
> https://cwiki.apache.org/confluence/display/METRON/Threat+Intel
>
> Is there any other Metron/STIX documentation?
>
>
>
> -Ahmed
>
>


Re: Offset lag tool?

2017-08-14 Thread Casey Stella
It's part of kafka, actually.  You can find it documented at
https://cwiki.apache.org/confluence/display/KAFKA/System+Tools#SystemTools-ConsumerOffsetChecker
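
For anyone unsure what the tool reports: lag per partition is the log end
offset minus the consumer group's committed offset.  A minimal sketch of that
arithmetic follows, with made-up partition numbers and offsets (nothing below
talks to Kafka):

```python
# Hypothetical snapshot of one topic: partition -> (committed offset, log end offset).
# The numbers are invented for illustration only.
snapshot = {
    0: (1500, 1500),
    1: (1200, 1750),
    2: (900, 2400),
}

def partition_lag(committed, log_end):
    """Messages written to the partition that the consumer has not committed yet."""
    return max(log_end - committed, 0)

lags = {pid: partition_lag(c, e) for pid, (c, e) in snapshot.items()}
total_lag = sum(lags.values())

for pid in sorted(lags):
    print(f"partition {pid}: lag {lags[pid]}")
print(f"total lag: {total_lag}")
```

A lag that keeps growing while you tune means the topology is not keeping up
with the topic.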

On Mon, Aug 14, 2017 at 11:32 AM, Laurens Vets  wrote:

> From the Performance-tuning-guide.md: "You will find the offset lag tool
> indispensable while verifying your settings."
>
> Probably because it's Monday, but I can't seem to find this offset lag
> tool anywhere...
>


Re: Threat triage rules using stellar geo enrichment

2017-08-08 Thread Casey Stella
I think you want:

GEO_GET( ip_dst_addr, ['country']) != 'US'
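
In the context of the snort.json you posted, that would make Rule 2 look
roughly like the entry below.  This is only a sketch: it builds and
round-trips the JSON in Python to check the quoting, it doesn't evaluate the
Stellar expression, and the rest of the config is left out.

```python
import json

# Rule 2 from the quoted config, with the comparison moved outside GEO_GET
# and the field list written as a Stellar list: ['country'].
rule_2 = {
    "name": "Rule 2",
    "rule": "GEO_GET( ip_dst_addr, ['country']) != 'US'",
    "score": 20,
}

# Stellar uses single quotes for its strings, so nothing needs escaping
# inside the double-quoted JSON value.
encoded = json.dumps(rule_2, indent=2)
print(encoded)

# Round-trip to confirm the JSON itself is well formed.
decoded = json.loads(encoded)
```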

On Tue, Aug 8, 2017 at 7:29 AM, Anand Subramanian <
asubraman...@hortonworks.com> wrote:

> Hello All,
>
> I am trying to write a triage rule where I would like to set the alert
> score based on Geo enrichment output, as follows.
>
> $ cat $METRON_HOME/config/zookeeper/enrichments/snort.json
> {
>   "enrichment" : {
> "fieldMap":
>   {
>   "geo": ["ip_dst_addr", "ip_src_addr"],
>   "host": ["host"]
> }
>   },
>   "threatIntel" : {
> "fieldMap":
>   {
>   "hbaseThreatIntel": ["ip_src_addr", "ip_dst_addr"]
> },
> "fieldToTypeMap":
>   {
>   "ip_src_addr" : ["malicious_ip"],
>   "ip_dst_addr" : ["malicious_ip"]
> },
> "triageConfig" : {
>   "riskLevelRules" : [
> {
>   "name" : "Rule 1",
>   "rule" : "not(IN_SUBNET(ip_dst_addr, '192.168.0.0/24'))",
>   "score" : 10
> },
> {
>   "name" : "Rule 2",
>   "rule" : "not(GEO_GET(ip_dst_addr, '[country]'), 'US')",
>   "score" : 20
> }
>   ],
>   "aggregator" : "MAX"
> }
>   }
> }
>
> But I am getting the following error when trying to push the configuration
> into zookeeper:
>
> Exception in thread "main" java.lang.RuntimeException: Unable to load {
>   "enrichment" : {
> "fieldMap":
>   {
>   "geo": ["ip_dst_addr", "ip_src_addr"],
>   "host": ["host"]
> }
> 
> at org.apache.metron.common.configuration.ConfigurationType.lambda$
> static$2(ConfigurationType.java:54)
> at org.apache.metron.common.configuration.ConfigurationType.deserialize(
> ConfigurationType.java:93)
> at org.apache.metron.common.configuration.ConfigurationsUtils.
> writeSensorEnrichmentConfigToZookeeper(ConfigurationsUtils.java:123)
> at org.apache.metron.common.configuration.ConfigurationsUtils.
> uploadConfigsToZookeeper(ConfigurationsUtils.java:265)
> at org.apache.metron.common.configuration.ConfigurationsUtils.
> uploadConfigsToZookeeper(ConfigurationsUtils.java:226)
> at org.apache.metron.common.cli.ConfigurationManager.push(
> ConfigurationManager.java:155)
> at org.apache.metron.common.cli.ConfigurationManager.run(
> ConfigurationManager.java:170)
> at org.apache.metron.common.cli.ConfigurationManager.run(
> ConfigurationManager.java:161)
> at org.apache.metron.common.cli.ConfigurationManager.main(
> ConfigurationManager.java:198)
> Caused by: org.apache.metron.jackson.databind.JsonMappingException: N/A
>  at [Source: {
> 
> }
> ; line: 31, column: 7] (through reference chain: org.apache.metron.common.
> configuration.enrichment.SensorEnrichmentConfig["
> threatIntel"]->org.apache.metron.common.configuration.
> enrichment.threatintel.ThreatIntelConfig["triageConfig"]->org.apache.
> metron.common.configuration.enrichment.threatintel.ThreatTriageConfig["
> riskLevelRules"])
> at org.apache.metron.jackson.databind.JsonMappingException.
> from(JsonMappingException.java:262)
> at org.apache.metron.jackson.databind.deser.SettableBeanProperty._
> throwAsIOE(SettableBeanProperty.java:537)
> at org.apache.metron.jackson.databind.deser.SettableBeanProperty._
> throwAsIOE(SettableBeanProperty.java:518)
> at org.apache.metron.jackson.databind.deser.impl.MethodProperty.
> deserializeAndSet(MethodProperty.java:99)
> at org.apache.metron.jackson.databind.deser.BeanDeserializer.
> vanillaDeserialize(BeanDeserializer.java:260)
> at org.apache.metron.jackson.databind.deser.BeanDeserializer.deserialize(
> BeanDeserializer.java:125)
> at org.apache.metron.jackson.databind.deser.SettableBeanProperty.
> deserialize(SettableBeanProperty.java:490)
> at org.apache.metron.jackson.databind.deser.impl.MethodProperty.
> deserializeAndSet(MethodProperty.java:95)
> at org.apache.metron.jackson.databind.deser.BeanDeserializer.
> vanillaDeserialize(BeanDeserializer.java:260)
> at org.apache.metron.jackson.databind.deser.BeanDeserializer.deserialize(
> BeanDeserializer.java:125)
> at org.apache.metron.jackson.databind.deser.SettableBeanProperty.
> deserialize(SettableBeanProperty.java:490)
> at org.apache.metron.jackson.databind.deser.impl.MethodProperty.
> deserializeAndSet(MethodProperty.java:95)
> at org.apache.metron.jackson.databind.deser.BeanDeserializer.
> vanillaDeserialize(BeanDeserializer.java:260)
> at org.apache.metron.jackson.databind.deser.BeanDeserializer.deserialize(
> BeanDeserializer.java:125)
> at org.apache.metron.jackson.databind.ObjectMapper._
> readMapAndClose(ObjectMapper.java:3807)
> at org.apache.metron.jackson.databind.ObjectMapper.
> readValue(ObjectMapper.java:2797)
> at org.apache.metron.common.utils.JSONUtils.load(JSONUtils.java:65)
> at org.apache.metron.common.configuration.ConfigurationType.lambda$
> static$2(ConfigurationType.java:52)
> ... 8 more
> Caused by: org.antlr.v4.runtime.NoViableAltException
> at org.antlr.v4.runtime.atn.ParserATNSimulator.noViableAlt(
> ParserATNSimulator.java:1894)
> at org.antlr.v4.runtime.atn.ParserATNSimulator.execATN(
> P

Re: MaaS and Metron Architecture talks at DataWorks Summit SJ 2017

2017-08-03 Thread Casey Stella
Ok, those talks are added.

On Thu, Aug 3, 2017 at 3:44 PM, Casey Stella  wrote:

> Absolutely!
>
> On Thu, Aug 3, 2017 at 3:41 PM, Justin Leet  wrote:
>
>> Could we put these up on the wiki page for tech talks in the community?
>> That page could probably use some love, although I know we've had
>> discussions about what we should do with wiki content.
>>
>> https://cwiki.apache.org/confluence/display/METRON/Tech+Talks
>>
>> On Thu, Aug 3, 2017 at 10:32 AM, Casey Stella  wrote:
>>
>>> The Videos of talks that Simon Ball and I gave at DataWorks Summit are
>>> now up and on youtube:
>>>
>>> * Solving Cyber at Scale (business-level track) -
>>> https://www.youtube.com/watch?v=zVdRhwfum4Q
>>> * Model as a Service (technical track) -
>>> https://www.youtube.com/watch?v=LkrOKvyAc0s
>>> * Metron Architecture (with demo from LANL data) (technical track) -
>>> https://www.youtube.com/watch?v=0LrrAQXhqGY
>>>
>>> These talks are mostly current based on the existing architecture and
>>> the demos reflect the alerting UI that is not committed yet.  There are
>>> blogs coming out in support of this over the next week or so.
>>>
>>> If anyone has any questions about the talks or want any more
>>> information, feel free to ask. :)
>>>
>>> Best,
>>>
>>> Casey
>>>
>>
>>
>


Re: MaaS and Metron Architecture talks at DataWorks Summit SJ 2017

2017-08-03 Thread Casey Stella
Absolutely!

On Thu, Aug 3, 2017 at 3:41 PM, Justin Leet  wrote:

> Could we put these up on the wiki page for tech talks in the community?
> That page could probably use some love, although I know we've had
> discussions about what we should do with wiki content.
>
> https://cwiki.apache.org/confluence/display/METRON/Tech+Talks
>
> On Thu, Aug 3, 2017 at 10:32 AM, Casey Stella  wrote:
>
>> The Videos of talks that Simon Ball and I gave at DataWorks Summit are
>> now up and on youtube:
>>
>> * Solving Cyber at Scale (business-level track) -
>> https://www.youtube.com/watch?v=zVdRhwfum4Q
>> * Model as a Service (technical track) -
>> https://www.youtube.com/watch?v=LkrOKvyAc0s
>> * Metron Architecture (with demo from LANL data) (technical track) -
>> https://www.youtube.com/watch?v=0LrrAQXhqGY
>>
>> These talks are mostly current based on the existing architecture and the
>> demos reflect the alerting UI that is not committed yet.  There are blogs
>> coming out in support of this over the next week or so.
>>
>> If anyone has any questions about the talks or want any more information,
>> feel free to ask. :)
>>
>> Best,
>>
>> Casey
>>
>
>


MaaS and Metron Architecture talks at DataWorks Summit SJ 2017

2017-08-03 Thread Casey Stella
The videos of the talks that Simon Ball and I gave at DataWorks Summit are
now up on YouTube:

* Solving Cyber at Scale (business-level track) -
https://www.youtube.com/watch?v=zVdRhwfum4Q
* Model as a Service (technical track) -
https://www.youtube.com/watch?v=LkrOKvyAc0s
* Metron Architecture (with demo from LANL data) (technical track) -
https://www.youtube.com/watch?v=0LrrAQXhqGY

These talks are mostly current based on the existing architecture and the
demos reflect the alerting UI that is not committed yet.  There are blogs
coming out in support of this over the next week or so.

If anyone has any questions about the talks or wants any more information,
feel free to ask. :)

Best,

Casey


Re: Possible Stellar bug

2017-08-02 Thread Casey Stella
Ok, I think what you've found here is a bug in the REPL.  I take it that
what you're looking for is JOIN( ['a', 'b'], '\\') == 'a\b' right?  That is
a valid stellar expression, BUT because the REPL seems to be trying to
interpret the \\ before it gets to stellar, it's borking something.  When I
run it in the REPL, I get:
[Stellar]>>> JOIN( [ 'a', 'b'], '\\')
>

Notice that > at the end.  That's the REPL asking for more input.  It
behaves like that because we based the REPL off of a shell interpreter.
Of course, in a shell interpreter \ indicates you want to continue the
input on the next line.  We likely need to turn this behavior off in the
REPL.


The actual functionality does work in Stellar outside of the REPL.  We
actually even have unit tests for this case:
https://github.com/apache/metron/blob/e206f2508ef7e7d798510df76ccfeb38b9530e89/metron-stellar/stellar-common/src/test/java/org/apache/metron/stellar/dsl/functions/BasicStellarTest.java#L120

I also just wrote a quick testcase and validated the very specific issue
that you had (note I use \\\\ because it's java.  For you, it'd be \\):
  @Test
  public void testEscapedLiterals_test() {
    Assert.assertEquals("a\\b", run("JOIN(['a', 'b'], '\\\\')", new HashMap<>()));
  }
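
To spell out the layers of escaping: the Java source needs four backslashes
to hand Stellar two, and Stellar reads those two as one literal backslash.  A
rough Python model of that last step follows.  The unescape helper is a
stand-in made up for illustration and is not how Stellar's grammar actually
handles it.

```python
BACKSLASH = chr(92)  # a single backslash, spelled this way to avoid escape confusion

def unescape(body):
    # Hypothetical helper: a backslash makes the next character literal.
    out = []
    chars = iter(body)
    for ch in chars:
        if ch == BACKSLASH:
            # Take the escaped character as-is; a dangling backslash stays put.
            out.append(next(chars, BACKSLASH))
        else:
            out.append(ch)
    return "".join(out)

def join(items, separator_body):
    # Model of JOIN(items, '<separator_body>') after the literal is unescaped.
    return unescape(separator_body).join(items)

two_backslashes = BACKSLASH * 2                   # what Stellar sees between the quotes
print(join(["a", "b"], two_backslashes))          # separator is one backslash
print(join(["a", "b"], BACKSLASH + "'"))          # separator is a single quote
print(join(["a", "b"], two_backslashes + "a"))    # separator is backslash followed by a
```

The three calls mirror the three expressions from the original report.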

tl;dr:
If you used the expression as an enrichment, it should work. There's a bug,
however, in the REPL (filed @
https://issues.apache.org/jira/browse/METRON-1074)

Thanks for your careful testing!

On Tue, Aug 1, 2017 at 8:38 PM, Guillem Mateos  wrote:

> Hi,
>
> I'm trying to do something very simple with stellar and I'm running into
> issues, which I'd say point to a bug in stellar. I'm not very familiar with
> Antlr, so can't really tell exactly what is wrong, but based on the tests
> I've done, I'd say something is not working properly.
>
> Example:
>
> JOIN(['a','b'],'\'')
>
> outputs: a'b
>
> as one would expect. But then, try the following:
>
> JOIN(['a','b'],'\\')
>
> which simply fails and does not output anything when typed on the stellar
> cli (if you type another ' it throws a ParseException - no viable
> alternative).
>
> Funny thing though, if you add another character, it does work. So:
>
> JOIN(['a','b'],'\\a')
>
> will output: a\ab
>
> Am I doing something wrong on how I write the escaping sequence or is this
> something that should be fixed on stellar?
>
> Thanks
>


Re: FW: STIX extractor problem.

2017-07-26 Thread Casey Stella
You are absolutely 100% right.  I missed a cast; sorry about that!  I
updated the PR to be correct and updated the integration test too.  If you
could try the patch again, that'd be appreciated.
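
For anyone following along, the ClassCastException comes from casting a
wrapper to the type of the thing it wraps.  A small Python analogue of the
pattern is below.  The class names echo the ones in the stack trace, but the
bodies and the unwrap helper are made up for illustration; this is not the
code in the PR.

```python
class StixExtractor:
    """Stand-in for the real STIX extractor."""
    def extract(self, line):
        return [line.strip()]

class TransformFilterExtractorDecorator:
    """Stand-in for the decorator: wraps another extractor and delegates to it."""
    def __init__(self, wrapped):
        self.wrapped = wrapped

    def extract(self, line):
        return self.wrapped.extract(line)

def unwrap(extractor):
    # Hypothetical helper: peel off decorators until the underlying extractor is reached.
    while isinstance(extractor, TransformFilterExtractorDecorator):
        extractor = extractor.wrapped
    return extractor

handler = TransformFilterExtractorDecorator(StixExtractor())

# The wrapper is not a StixExtractor, which is what the failed cast was assuming.
print(isinstance(handler, StixExtractor))
# Unwrapping first gives the type check something it can succeed on.
print(isinstance(unwrap(handler), StixExtractor))
```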

On Wed, Jul 26, 2017 at 11:21 AM, Ziaja Aleksander <
aleksander.zi...@coi.gov.pl> wrote:

> You are right:
> 17/07/26 12:18:03 INFO state.ConnectionStateManager: State change:
> CONNECTED
> Exception in thread "main" java.lang.ClassCastException:
> org.apache.metron.dataloads.extractor.TransformFilterExtractorDecorator
> cannot be cast to org.apache.metron.dataloads.extractor.stix.StixExtractor
> at org.apache.metron.dataloads.nonbulk.taxii.TaxiiLoader.
> main(TaxiiLoader.java:204)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
>
> The exception below was after my own changes to the original 0.4.0 code,
> sorry.
> rgds
> az
>
> From: Casey Stella [mailto:ceste...@gmail.com]
> Sent: Wednesday, July 26, 2017 11:56 AM
> To: user@metron.apache.org
> Cc: u...@metron.incubator.apache.org
> Subject: Re: FW: STIX extractor problem.
>
> Are you sure you applied the patch and rebuilt?  Line 189 of TaxiiLoader
> doesn't cast to StixExtractor.  If it were a problem, I'd expect the line
> to be 204 of TaxiiLoader.java
>
> On Wed, Jul 26, 2017 at 10:03 AM, Ziaja Aleksander <aleksander.zi...@coi.gov.pl> wrote:
> Hi again.
> After patching there is an exception:
>
> Exception in thread "main" java.lang.ClassCastException:
> org.apache.metron.dataloads.extractor.TransformFilterExtractorDecorator
> cannot be cast to org.apache.metron.dataloads.extractor.stix.StixExtractor
> at org.apache.metron.dataloads.nonbulk.taxii.TaxiiLoader.
> main(TaxiiLoader.java:189)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
>
> Config files are the same (as below)
> Is there any known problem in the config file, please?
>
> Rgds,
> -aziaja
>
> From: Casey Stella [mailto:ceste...@gmail.com]
> Sent: Tuesday, July 25, 2017 4:52 PM
> To: user@metron.apache.org
> Cc: u...@metron.incubator.apache.org
> Subject: Re: FW: STIX extractor problem.
>
> Yep, unfortunately this is a bug and there's a PR open for it: METRON-1026
> https://github.com/apache/metron/pull/643
>
> On Tue, Jul 25, 2017 at 3:49 PM, Ziaja Aleksander <
> aleksander.zi...@coi.gov.pl> wrote:
> Hi All,
>
> I would like to ask you what this exception means (I still cannot figure
> out why it happens; maybe it is obvious ...):
>
>
>
> Exception in thread "main" java.lang.IllegalStateException: Extractor
> must be a STIX Extractor
> at org.apache.metron.dataloads.nonbulk.taxii.TaxiiLoader.
> main(TaxiiLoader.java:202)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
>
> executed: /usr/metron/0.4.0/bin/threatintel_taxii_load.sh -c
> /opt/taxii/connection.json  -e /opt/taxii/extractor.json
> os: Centos 7
>
> config files:
> cat /opt/taxii/connection.json
> {
>    "endpoint" : "https://x/taxii-discovery-service"
>   ,"type" : "DISCOVER"
>   ,"username" : ""
>   ,"password" : ""
>   ,"collection" : "guest.Abuse_ch"
>   ,"table" : "threat_intel"
>   ,"columnFamily" : "cf"
>   ,"allowedIndicatorTypes" : [ "domainname:FQDN", "address:IPV_4_ADDR" ]
> }
>
> cat /opt/taxi/extractor.json
> {
>   "config" : {
> "stix_address_categories" : "IPV_4_ADDR"
>   }
>   ,"extractor" : "STIX"
> }
>
> So please, any suggestions welcome.
> Thank you in advance.
> -Alex
>
>


Re: FW: STIX extractor problem.

2017-07-26 Thread Casey Stella
Are you sure you applied the patch and rebuilt?  Line 189 of TaxiiLoader
doesn't cast to StixExtractor.  If it were a problem, I'd expect the line
to be 204 of TaxiiLoader.java

On Wed, Jul 26, 2017 at 10:03 AM, Ziaja Aleksander <
aleksander.zi...@coi.gov.pl> wrote:

> Hi again.
> After patching there is an exception:
>
> Exception in thread "main" java.lang.ClassCastException:
> org.apache.metron.dataloads.extractor.TransformFilterExtractorDecorator
> cannot be cast to org.apache.metron.dataloads.extractor.stix.StixExtractor
> at org.apache.metron.dataloads.nonbulk.taxii.TaxiiLoader.
> main(TaxiiLoader.java:189)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
>
> Config files are the same (as below)
> Is there any known problem in the config file, please?
>
> Rgds,
> -aziaja
>
> From: Casey Stella [mailto:ceste...@gmail.com]
> Sent: Tuesday, July 25, 2017 4:52 PM
> To: user@metron.apache.org
> Cc: u...@metron.incubator.apache.org
> Subject: Re: FW: STIX extractor problem.
>
> Yep, unfortunately this is a bug and there's a PR open for it: METRON-1026
> https://github.com/apache/metron/pull/643
>
>
> On Tue, Jul 25, 2017 at 3:49 PM, Ziaja Aleksander <aleksander.zi...@coi.gov.pl> wrote:
> Hi All,
>
> I would like to ask you what this exception means (I still cannot figure
> out why it happens; maybe it is obvious ...):
>
>
>
> Exception in thread "main" java.lang.IllegalStateException: Extractor
> must be a STIX Extractor
> at org.apache.metron.dataloads.nonbulk.taxii.TaxiiLoader.
> main(TaxiiLoader.java:202)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
>
> executed: /usr/metron/0.4.0/bin/threatintel_taxii_load.sh -c
> /opt/taxii/connection.json  -e /opt/taxii/extractor.json
> os: Centos 7
>
> config files:
> cat /opt/taxii/connection.json
> {
>    "endpoint" : "https://x/taxii-discovery-service"
>   ,"type" : "DISCOVER"
>   ,"username" : ""
>   ,"password" : ""
>   ,"collection" : "guest.Abuse_ch"
>   ,"table" : "threat_intel"
>   ,"columnFamily" : "cf"
>   ,"allowedIndicatorTypes" : [ "domainname:FQDN", "address:IPV_4_ADDR" ]
> }
>
> cat /opt/taxi/extractor.json
> {
>   "config" : {
> "stix_address_categories" : "IPV_4_ADDR"
>   }
>   ,"extractor" : "STIX"
> }
>
> So please, any suggestions welcome.
> Thank you in advance.
> -Alex
>
>


Re: FW: STIX extractor problem.

2017-07-25 Thread Casey Stella
Yep, unfortunately this is a bug and there's a PR open for it: METRON-1026
https://github.com/apache/metron/pull/643


On Tue, Jul 25, 2017 at 3:49 PM, Ziaja Aleksander <
aleksander.zi...@coi.gov.pl> wrote:

> Hi All,
>
> I would like to ask you what this exception means (I still cannot figure
> out why it happens; maybe it is obvious ...):
>
>
>
> Exception in thread "main" java.lang.IllegalStateException: Extractor
> must be a STIX Extractor
> at org.apache.metron.dataloads.nonbulk.taxii.TaxiiLoader.
> main(TaxiiLoader.java:202)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
>
> executed: /usr/metron/0.4.0/bin/threatintel_taxii_load.sh -c
> /opt/taxii/connection.json  -e /opt/taxii/extractor.json
> os: Centos 7
>
> config files:
> cat /opt/taxii/connection.json
> {
>    "endpoint" : "https://x/taxii-discovery-service"
>   ,"type" : "DISCOVER"
>   ,"username" : ""
>   ,"password" : ""
>   ,"collection" : "guest.Abuse_ch"
>   ,"table" : "threat_intel"
>   ,"columnFamily" : "cf"
>   ,"allowedIndicatorTypes" : [ "domainname:FQDN", "address:IPV_4_ADDR" ]
> }
>
> cat /opt/taxi/extractor.json
> {
>   "config" : {
> "stix_address_categories" : "IPV_4_ADDR"
>   }
>   ,"extractor" : "STIX"
> }
>
> So please, any suggestions welcome.
> Thank you in advance.
> -Alex
>
>


Re: Treat Triage boost aggregation

2017-06-22 Thread Casey Stella
Actually, and I am shocked to find myself saying this, MaaS won't help you
here. ;)  I don't think the current system can encode your desire.  Just in
case I'm being dense, though, would you give us a concrete example with
some rules and how you'd like the score aggregated?

On Thu, Jun 22, 2017 at 8:07 PM, Ali Nazemian  wrote:

> Thanks, Casey and Nick. Is there any way that we can somehow overcome this
> requirement with the current features, excluding MaaS?
>
> On Thu, Jun 22, 2017 at 11:42 PM, Nick Allen  wrote:
>
>> Ali -
>>
>> Here are some issues in JIRA related to this topic.  Feel free to add
>> commentary or specifics of your use case to either of these issues.
>> Feedback will only help improve the final result.
>>
>> https://issues.apache.org/jira/browse/METRON-683
>> https://issues.apache.org/jira/browse/METRON-685
>>
>>
>> Thanks
>>
>>
>>
>> On Thu, Jun 22, 2017 at 9:31 AM, Casey Stella  wrote:
>>
>>> That's correct that it's the last step.  Honestly, the threat triage
>>> functions were added prior to Stellar really being a thing.  We should
>>> allow arbitrary stellar statements in there rather than a fixed approach,
>>> so it's pluggable.
>>>
>>> On Thu, Jun 22, 2017 at 3:50 AM, Ali Nazemian 
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I know there are four different Threat Triage aggregation functions we
>>>> can use for the case of triggering multiple rules. These functions are
>>>> "max", "min", "mean", "positive mean". I was wondering whether there is any
>>>> way I can implement the following logic with the Threat Triage functions for
>>>> a non-deterministic score.
>>>>
>>>> In the case that a specific rule is triggered, I want to boost the
>>>> final result of the Threat Triage score by a specific value. For example,
>>>> add 20 to the score or multiply it by a specific value.
>>>>
>>>> Threat Triage is the last bolt in the enrichment topology, so it seems I
>>>> cannot have any additional enrichment/transformation based on the score
>>>> value. Is that right?
>>>>
>>>> Regards,
>>>> Ali
>>>>
>>>
>>>
>>
>
>
> --
> A.Nazemian
>


Re: Treat Triage boost aggregation

2017-06-22 Thread Casey Stella
That's correct that it's the last step.  Honestly, the threat triage
functions were added prior to Stellar really being a thing.  We should
allow arbitrary stellar statements in there rather than a fixed approach,
so it's pluggable.
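
To make the limitation concrete, here is a rough Python model of how the
fixed aggregators combine the scores of the rules that fired.  The function
bodies are a paraphrase rather than Metron's code, and treating
POSITIVE_MEAN as the mean of the scores greater than zero is an assumption.

```python
from statistics import mean

def positive_mean(scores):
    # Assumption: ignore zero/negative scores, then average what is left.
    positive = [s for s in scores if s > 0]
    return mean(positive) if positive else 0

# Aggregator names as they would appear in the "aggregator" config field.
AGGREGATORS = {
    "MAX": max,
    "MIN": min,
    "MEAN": mean,
    "POSITIVE_MEAN": positive_mean,
}

def triage_score(fired_scores, aggregator):
    # Every aggregator sees only the list of scores; it has no idea which
    # rule produced which score, so "rule X adds 20 to the total" cannot
    # be expressed here.
    if not fired_scores:
        return 0
    return AGGREGATORS[aggregator](fired_scores)

fired = [10, 20, 0]
for name in AGGREGATORS:
    print(name, triage_score(fired, name))
```

A boost would need the aggregation step to know which rule fired, which is
what allowing arbitrary Stellar there would give you.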

On Thu, Jun 22, 2017 at 3:50 AM, Ali Nazemian  wrote:

> Hi all,
>
> I know there are four different Threat Triage aggregation functions we can
> use for the case of triggering multiple rules. These functions are "max",
> "min", "mean", "positive mean". I was wondering whether there is any way I
> can implement the following logic with the Threat Triage functions for a
> non-deterministic score.
>
> In the case that a specific rule is triggered, I want to boost the final
> result of the Threat Triage score by a specific value. For example, add 20
> to the score or multiply it by a specific value.
>
> Threat Triage is the last bolt in the enrichment topology, so it seems I
> cannot have any additional enrichment/transformation based on the score
> value. Is that right?
>
> Regards,
> Ali
>


Re: Metron in-memory enrichment

2017-06-19 Thread Casey Stella
The stellar enrichments do cache results.  It caches at the bolt level, so
it will associate an input message with computed output.
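
Roughly speaking, the effect is the one below: the first time a given input
is seen the expression is evaluated, and repeats are served from memory.
This sketch uses Python's functools.lru_cache as a stand-in; the key, size
and eviction policy of the real cache are not shown here, and the lookup
function is made up.

```python
from functools import lru_cache

calls = {"count": 0}

@lru_cache(maxsize=1024)   # size chosen arbitrarily for the sketch
def enrich(ip):
    # Stand-in for an expensive enrichment, e.g. a remote lookup.
    calls["count"] += 1
    return "internal" if ip.startswith("10.") else "external"

results = [enrich(ip) for ip in ["10.0.0.1", "8.8.8.8", "10.0.0.1", "10.0.0.1"]]
print(results, "evaluations:", calls["count"])
```

The cache only pays off when the same inputs recur, so the hit rate depends
on how repetitive the traffic is.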

On Mon, Jun 19, 2017 at 6:28 AM, Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> Surely the caching should make this effectively an in memory lookup. Does
> the stellar enrichment function not use the same clientside caching as the
> Hbase bolt?
>
> Simon
>
> On 19 Jun 2017, at 06:21, Casey Stella  wrote:
>
> In order to do that, the easiest thing to do is to create a stellar
> function to load and do in-memory lookups.
>
> On Sun, Jun 18, 2017 at 11:48 PM, Ali Nazemian 
> wrote:
>
>> Hi all,
>>
>> We are using Metron HBase enrichment for a few use cases, but we have
>> noticed the achievable throughput is not very great. I was wondering
>> whether there is a way to load the external enrichment data in-memory and
>> use it with normal Stellar enrichments. In our use cases, the number of
>> rows in the external enrichments that we are dealing with is less than
>> 100k and it is a static list, so it is feasible to load them in-memory and
>> use that for the enrichment. However, I am not sure how that would be
>> achievable from the Metron capabilities.
>>
>> Regards,
>> Ali
>>
>
>


Re: Metron in-memory enrichment

2017-06-19 Thread Casey Stella
That said, I think it'd be really cool to have a set of stellar functions
to interact with reference data stored in MapDB (http://www.mapdb.org/)
which would get localized similar to the geo enrichment stellar functions
for those small-data cases.

On Mon, Jun 19, 2017 at 6:21 AM, Casey Stella  wrote:

> In order to do that, the easiest thing to do is to create a stellar
> function to load and do in-memory lookups.
>
> On Sun, Jun 18, 2017 at 11:48 PM, Ali Nazemian 
> wrote:
>
>> Hi all,
>>
>> We are using Metron HBase enrichment for a few use cases, but we have
>> noticed the achievable throughput is not very great. I was wondering
>> whether there is a way to load the external enrichment data in-memory and
>> use it with normal Stellar enrichments. In our use cases, the number of
>> rows in the external enrichments that we are dealing with is less than
>> 100k and it is a static list, so it is feasible to load them in-memory and
>> use that for the enrichment. However, I am not sure how that would be
>> achievable from the Metron capabilities.
>>
>> Regards,
>> Ali
>>
>
>


Re: Metron in-memory enrichment

2017-06-19 Thread Casey Stella
In order to do that, the easiest thing to do is to create a stellar
function to load and do in-memory lookups.
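
The shape of it would be something like the following, shown in Python only
to keep it short (a real Stellar function would be written in Java).  The
file format, the load_reference and lookup names and the sample rows are all
invented for the example.

```python
import csv
import io

# Invented sample of a small, static reference list: key,value per row.
SAMPLE = "10.0.0.5,printer\n10.0.0.9,build-server\n"

def load_reference(handle):
    # Read the whole list into a dict once, at initialization time.
    return {key: value for key, value in csv.reader(handle)}

REFERENCE = load_reference(io.StringIO(SAMPLE))

def lookup(key, default=None):
    # Per-message work is a single dict access, with no network round trip.
    return REFERENCE.get(key, default)

print(lookup("10.0.0.9"))
print(lookup("192.168.1.1", "unknown"))
```

With a list of under 100k static rows, a plain map like this fits in memory
comfortably.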

On Sun, Jun 18, 2017 at 11:48 PM, Ali Nazemian 
wrote:

> Hi all,
>
> We are using Metron HBase enrichment for a few use cases, but we have
> noticed the achievable throughput is not very great. I was wondering
> whether there is a way to load the external enrichment data in-memory and
> use it with normal Stellar enrichments. In our use cases, the number of
> rows in the external enrichments that we are dealing with is less than
> 100k and it is a static list, so it is feasible to load them in-memory and
> use that for the enrichment. However, I am not sure how that would be
> achievable from the Metron capabilities.
>
> Regards,
> Ali
>


Re: Kafka spout error in the new HCP product

2017-05-16 Thread Casey Stella
Yeah, I've seen the same issue.  It appears that the storm-kafka-client in
versions < 1.1 has significant throughput problems.  We saw a 10x speedup
in moving to the 1.1 version.  There is a PR out for this currently:
https://github.com/apache/metron/pull/584
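
In the meantime, on the question of property names: the two knobs the
exception message itself mentions are session.timeout.ms and
max.poll.records.  A sketch of a JSON file carrying them is below.  The
values are placeholders, the file name is made up, and whether the file
passed with -esc hands these keys through to the Kafka consumer unchanged is
an assumption, so please check it against your version.

```python
import json
import os
import tempfile

# Kafka consumer property names taken from the exception text; values are placeholders.
extra_spout_config = {
    "session.timeout.ms": 30000,
    "max.poll.records": 200,
}

# Hypothetical file name; it would be the file handed to the -esc option.
path = os.path.join(tempfile.gettempdir(), "extra_spout_config.json")
with open(path, "w") as handle:
    json.dump(extra_spout_config, handle, indent=2)

with open(path) as handle:
    written = handle.read()
print(written)
```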

Casey

On Tue, May 16, 2017 at 4:26 AM, Ali Nazemian  wrote:

> I am still facing this issue and couldn't manage to fix it. I would be
> really grateful if somebody could help me.
>
> Thanks,
> Ali
>
> On Sun, May 14, 2017 at 1:58 PM, Ali Nazemian 
> wrote:
>
>> I was wrong. I think I couldn't increase the timeout value for Kafka
>> spout properly. Therefore, I was wondering how I can increase the timeout
>> value for Kafka spout? What is the right "-esc" property name I need to set
>> in this case? Also, what has changed in the newer version since I didn't
>> have this issue with the previous version?
>>
>>
>>
>> On Sun, May 14, 2017 at 3:00 AM, Ali Nazemian 
>> wrote:
>>
>>> Hi,
>>>
>>> I have installed the new version of HCP recently. I can see that the
>>> following error has appeared in Storm UI at Kafka spout section related to
>>> Parser topologies:
>>>
>>> org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot
>>> be completed since the group has already rebalanced and assigned the
>>> partitions to another member. This means that the time between subsequent
>>> calls to poll() was longer than the configured session.timeout.ms,
>>> which typically implies that the poll loop is spending too much time
>>> message processing. You can address this either by increasing the session
>>> timeout or by reducing the maximum size of batches returned in poll() with
>>> max.poll.records.
>>> at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:600)
>>> at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:541)
>>> at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:679)
>>> at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:658)
>>> at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167)
>>> at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
>>> at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
>>> at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:426)
>>> at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:278)
>>> at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
>>> at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
>>> at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
>>> at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:163)
>>> at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:426)
>>> at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1059)
>>> at org.apache.storm.kafka.spout.KafkaSpout.commitOffsetsForAckedTuples(KafkaSpout.java:302)
>>> at org.apache.storm.kafka.spout.KafkaSpout.nextTuple(KafkaSpout.java:204)
>>> at org.apache.storm.daemon.executor$fn__6505$fn__6520$fn__6551.invoke(executor.clj:651)
>>> at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484)
>>> at clojure.lang.AFn.run(AFn.java:22)
>>> at java.lang.Thread.run(Thread.java:748)
>>>
>>>
>>> This error has affected the Parsers throughput significantly!
>>>
>>> I have tried to increase the session timeout, but it didn't affect my
>>> situation. I would be grateful if you can help me to find the source of
>>> this issue. Please be advised that I haven't had this issue with the
>>> previous version of Metron (0.3.1).
>>>
>>> Regards,
>>> Ali
>>>
>>>
>>
>>
>> --
>> A.Nazemian
>>
>
>
>
> --
> A.Nazemian
>


No longer incubating, but newly hatched!

2017-04-24 Thread Casey Stella
Hi All,

Some of you know this already and some of you might not, but as of the last
ASF board meeting we became a top level project with me serving as the Vice
President of Apache Metron.  The good people at the ASF press office
scheduled some press early this morning.

- NASDAQ GlobeNewswire
http://globenewswire.com/news-release/2017/04/24/970127/0/en/The-Apache-Software-Foundation-Announces-Apache-Metron-as-a-Top-Level-Project.html
- ASF "Foundation" blog https://s.apache.org/e4Uh
- @TheASF Twitter feed https://twitter.com/TheASF/status/856447775323631617
- NEW! ASF LinkedIn page
https://www.linkedin.com/company/the-apache-software-foundation

I heartily congratulate the users, committers, contributors and general
Metron community for the achievement.  It's no small feat and I am proud to
serve such a community as we have built together.

Keep up the great work!

Casey


Re: High percentage of failed/timed out tuples after performance tuning!

2017-04-22 Thread Casey Stella
So what is perplexing is that the latency is low and the capacity for each
bolt is less than 1, so it's keeping up. I would have expected this kind of
thing if the latency was high and timeouts were happening.

If you lower topology.max.spout.pending, do you reach a point with no
errors (at an obvious cost to throughput)? Also, how many ackers are
you running?

On Sat, Apr 22, 2017 at 00:50 Ali Nazemian wrote:

> I have disabled the reliability retry by setting the number of
> acker executors to zero. Comparing the number of tuples emitted by the
> indexing topology with the number of documents in Elasticsearch, almost
> no documents are missing. It seems that, for some reason, the acker
> executors cannot pick up the acknowledgements for the indexing and
> enrichment topologies, even though the messages do arrive at the
> destinations of those topologies.
>
> I am also wondering what the best approach would be to find the failed
> tuples. I thought I could find them in the corresponding error topics,
> but that does not seem to be the case.
>
> On Sat, Apr 22, 2017 at 2:36 PM, Ali Nazemian 
> wrote:
>
>> Does the following fact ring any bells?
>>
>> There is no failure at the bolt level acknowledgement, but from the
>> topology status, the rate of failure is very high! This is the same
>> scenario for both indexing and enrichment topologies.
>>
>> On Sat, Apr 22, 2017 at 2:29 PM, Ali Nazemian 
>> wrote:
>>
>>> The value for topology.max.spout.pending is 1000 currently. I did
>>> decrease it previously to understand the effect of that value on my
>>> problem. Clearly, throughput dropped, but still a very high rate of failure!
>>>
>>> On Sat, Apr 22, 2017 at 3:12 AM, Casey Stella 
>>> wrote:
>>>
>>>> Ok, so ignoring the indexing topology, the fact that you're seeing
>>>> failures in the enrichment topology, which has no ES component, is
>>>> telling.  It's also telling that the enrichment topology stats are
>>>> perfectly sensible latency-wise (i.e. it's not sweating).
>>>>
>>>> What's your storm configuration for topology.max.spout.pending?  If
>>>> it's not set, then try setting it to 1000 and bouncing the topologies.
>>>>
>>>> On Fri, Apr 21, 2017 at 12:54 PM, Ali Nazemian 
>>>> wrote:
>>>>
>>>>> No, nothing ...
>>>>>
>>>>> On Sat, Apr 22, 2017 at 2:46 AM, Casey Stella 
>>>>> wrote:
>>>>>
>>>>>> Anything going on in the kafka broker logs?
>>>>>>
>>>>>> On Fri, Apr 21, 2017 at 12:24 PM, Ali Nazemian wrote:
>>>>>>
>>>>>>> Although this is a test platform with a much lower spec than
>>>>>>> production, it should be enough for indexing 600 docs per second. I
>>>>>>> have seen benchmark results of 150-200k docs per second with this
>>>>>>> spec! I haven't played with tuning the template yet, but I still
>>>>>>> think the current rate does not make sense at all.
>>>>>>>
>>>>>>> I have changed the batch size to 100. Throughput has dropped, but
>>>>>>> the failure rate is still very high!
>>>>>>>
>>>>>>> Please find the screenshots for the enrichments:
>>>>>>> http://imgur.com/a/ceC8f
>>>>>>> http://imgur.com/a/sBQwM
>>>>>>>
>>>>>>> On Sat, Apr 22, 2017 at 2:08 AM, Casey Stella 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Ok, yeah, those latencies are pretty high.  I think what's
>>>>>>>> happening is that the tuples aren't being acked fast enough and are
>>>>>>>> timing out.  How taxed is your ES box?  Can you drop the batch size
>>>>>>>> down to maybe 100 and see what happens?
>>>>>>>>
>>>>>>>> On Fri, Apr 21, 2017 at 12:05 PM, Ali Nazemian <
>>>>>>>> alinazem...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Please find the bolt part of Storm-UI related to indexing topology:
>>>>>>>>>
>>>>>>>>> http://imgur.com/a/tFkmO
>>>>>>>>>
>>>>>>>>> As you can see a hdfs error has also appeared

Re: High percentage of failed/timed out tuples after performance tuning!

2017-04-21 Thread Casey Stella
Ok, so ignoring the indexing topology, the fact that you're seeing failures
in the enrichment topology, which has no ES component, is telling.  It's
also telling that the enrichment topology stats are perfectly sensible
latency-wise (i.e. it's not sweating).

What's your storm configuration for topology.max.spout.pending?  If it's
not set, then try setting it to 1000 and bouncing the topologies.
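
As a rough way to reason about this setting: topology.max.spout.pending caps
the number of un-acked tuples per spout task, so by Little's law the
sustainable rate per spout task is at most that cap divided by the complete
latency. A minimal Python sketch follows; the function name and the latency
figures are illustrative assumptions, not numbers from this thread.

```python
def max_throughput_per_spout_task(max_spout_pending, complete_latency_secs):
    """Upper bound on tuples/sec for one spout task (Little's law).

    Once max_spout_pending tuples are in flight, the spout task is not
    asked for more tuples until some are acked or failed, so throughput
    cannot exceed pending / latency.
    """
    if complete_latency_secs <= 0:
        raise ValueError("complete latency must be positive")
    return max_spout_pending / complete_latency_secs


if __name__ == "__main__":
    # Hypothetical complete latencies, in seconds.
    for latency in (0.05, 0.5, 5.0):
        bound = max_throughput_per_spout_task(1000, latency)
        print("latency=%.2fs -> at most %.0f tuples/sec per spout task"
              % (latency, bound))
```

If the bound for your observed complete latency is far above your input rate,
the cap itself is unlikely to be what limits throughput.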

On Fri, Apr 21, 2017 at 12:54 PM, Ali Nazemian 
wrote:

> No, nothing ...
>
> On Sat, Apr 22, 2017 at 2:46 AM, Casey Stella  wrote:
>
>> Anything going on in the kafka broker logs?
>>
>> On Fri, Apr 21, 2017 at 12:24 PM, Ali Nazemian 
>> wrote:
>>
>>> Although this is a test platform with a much lower spec than production,
>>> it should be enough for indexing 600 docs per second. I have seen
>>> benchmark results of 150-200k docs per second with this spec! I haven't
>>> played with tuning the template yet, but I still think the current rate
>>> does not make sense at all.
>>>
>>> I have changed the batch size to 100. Throughput has dropped, but the
>>> failure rate is still very high!
>>>
>>> Please find the screenshots for the enrichments:
>>> http://imgur.com/a/ceC8f
>>> http://imgur.com/a/sBQwM
>>>
>>> On Sat, Apr 22, 2017 at 2:08 AM, Casey Stella 
>>> wrote:
>>>
>>>> Ok, yeah, those latencies are pretty high.  I think what's happening is
>>>> that the tuples aren't being acked fast enough and are timing out.  How
>>>> taxed is your ES box?  Can you drop the batch size down to maybe 100 and
>>>> see what happens?
>>>>
>>>> On Fri, Apr 21, 2017 at 12:05 PM, Ali Nazemian 
>>>> wrote:
>>>>
>>>>> Please find the bolt part of Storm-UI related to indexing topology:
>>>>>
>>>>> http://imgur.com/a/tFkmO
>>>>>
>>>>> As you can see a hdfs error has also appeared which is not important
>>>>> right now.
>>>>>
>>>>> On Sat, Apr 22, 2017 at 1:59 AM, Casey Stella 
>>>>> wrote:
>>>>>
>>>>>> What's curious is the enrichment topology showing the same issues,
>>>>>> but my mind went to ES as well.
>>>>>>
>>>>>> On Fri, Apr 21, 2017 at 11:57 AM, Ryan Merriman 
>>>>>> wrote:
>>>>>>
>>>>>>> Yes which bolt is reporting all those failures?  My theory is that
>>>>>>> there is some ES tuning that needs to be done.
>>>>>>>
>>>>>>> On Fri, Apr 21, 2017 at 10:53 AM, Casey Stella 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Could I see a little more of that screen?  Specifically what the
>>>>>>>> bolts look like.
>>>>>>>>
>>>>>>>> On Fri, Apr 21, 2017 at 11:51 AM, Ali Nazemian <
>>>>>>>> alinazem...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Please find the storm-UI screenshot as follows.
>>>>>>>>>
>>>>>>>>> http://imgur.com/FhIrGFd
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, Apr 22, 2017 at 1:41 AM, Ali Nazemian <
>>>>>>>>> alinazem...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Casey,
>>>>>>>>>>
>>>>>>>>>> - topology.message.timeout: It was 30s at first. I have increased
>>>>>>>>>> it to 300s, no changes!
>>>>>>>>>> - It is a very basic geo-enrichment and simple rule for threat
>>>>>>>>>> triage!
>>>>>>>>>> - No, not at all.
>>>>>>>>>> - I have changed that to find the best value. It is 5000, which is
>>>>>>>>>> about 5 MB.
>>>>>>>>>> - I have changed the number of executors for the Storm acker
>>>>>>>>>> thread, and I have also changed the value of
>>>>>>>>>> topology.max.spout.pending, still no changes!
>>>>>>>>>>
>>>>>>>>>> On Sat, Apr 22, 2017 at 1:24 AM, Casey Stella wrote:
>>>>>

Re: High percentage of failed/timed out tuples after performance tuning!

2017-04-21 Thread Casey Stella
Anything going on in the kafka broker logs?

On Fri, Apr 21, 2017 at 12:24 PM, Ali Nazemian 
wrote:

> Although this is a test platform with a much lower spec than production, it
> should be enough for indexing 600 docs per second. I have seen benchmark
> results of 150-200k docs per second with this spec! I haven't played with
> tuning the template yet, but I still think the current rate does not make
> sense at all.
>
> I have changed the batch size to 100. Throughput has dropped, but the
> failure rate is still very high!
>
> Please find the screenshots for the enrichments:
> http://imgur.com/a/ceC8f
> http://imgur.com/a/sBQwM
>
> On Sat, Apr 22, 2017 at 2:08 AM, Casey Stella  wrote:
>
>> Ok, yeah, those latencies are pretty high.  I think what's happening is
>> that the tuples aren't being acked fast enough and are timing out.  How
>> taxed is your ES box?  Can you drop the batch size down to maybe 100 and
>> see what happens?
>>
>> On Fri, Apr 21, 2017 at 12:05 PM, Ali Nazemian 
>> wrote:
>>
>>> Please find the bolt part of Storm-UI related to indexing topology:
>>>
>>> http://imgur.com/a/tFkmO
>>>
>>> As you can see a hdfs error has also appeared which is not important
>>> right now.
>>>
>>> On Sat, Apr 22, 2017 at 1:59 AM, Casey Stella 
>>> wrote:
>>>
>>>> What's curious is the enrichment topology showing the same issues, but
>>>> my mind went to ES as well.
>>>>
>>>> On Fri, Apr 21, 2017 at 11:57 AM, Ryan Merriman 
>>>> wrote:
>>>>
>>>>> Yes which bolt is reporting all those failures?  My theory is that
>>>>> there is some ES tuning that needs to be done.
>>>>>
>>>>> On Fri, Apr 21, 2017 at 10:53 AM, Casey Stella 
>>>>> wrote:
>>>>>
>>>>>> Could I see a little more of that screen?  Specifically what the
>>>>>> bolts look like.
>>>>>>
>>>>>> On Fri, Apr 21, 2017 at 11:51 AM, Ali Nazemian wrote:
>>>>>>
>>>>>>> Please find the storm-UI screenshot as follows.
>>>>>>>
>>>>>>> http://imgur.com/FhIrGFd
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Apr 22, 2017 at 1:41 AM, Ali Nazemian wrote:
>>>>>>>
>>>>>>>> Hi Casey,
>>>>>>>>
>>>>>>>> - topology.message.timeout: It was 30s at first. I have increased
>>>>>>>> it to 300s, no changes!
>>>>>>>> - It is a very basic geo-enrichment and simple rule for threat
>>>>>>>> triage!
>>>>>>>> - No, not at all.
>>>>>>>> - I have changed that to find the best value. It is 5000, which is
>>>>>>>> about 5 MB.
>>>>>>>> - I have changed the number of executors for the Storm acker
>>>>>>>> thread, and I have also changed the value of
>>>>>>>> topology.max.spout.pending, still no changes!
>>>>>>>>
>>>>>>>> On Sat, Apr 22, 2017 at 1:24 AM, Casey Stella 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Also,
>>>>>>>>> * what's your setting for topology.message.timeout?
>>>>>>>>> * You said you're seeing this in indexing and enrichment, what
>>>>>>>>> enrichments do you have in place?
>>>>>>>>> * Is ES being taxed heavily?
>>>>>>>>> * What's your ES batch size for the sensor?
>>>>>>>>>
>>>>>>>>> On Fri, Apr 21, 2017 at 10:46 AM, Casey Stella wrote:
>>>>>>>>>
>>>>>>>>>> So you're seeing failures in the storm topology but no errors in
>>>>>>>>>> the logs.  Would you mind sending over a screenshot of the indexing
>>>>>>>>>> topology from the storm UI?  You might not be able to paste the
>>>>>>>>>> image on the mailing list, so maybe an imgur link would be in order.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>

Re: High percentage of failed/timed out tuples after performance tuning!

2017-04-21 Thread Casey Stella
Ok, yeah, those latencies are pretty high.  I think what's happening is
that the tuples aren't being acked fast enough and are timing out.  How
taxed is your ES box?  Can you drop the batch size down to maybe 100 and
see what happens?
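
One way to see why the batch size could matter here, assuming the writer only
acks a tuple once the batch it belongs to has been flushed (an assumption
about the indexing writer, not something confirmed in this thread): the first
tuple in a batch has to wait for the whole batch to fill. A minimal Python
sketch with hypothetical function names; the four-writer split is also
illustrative.

```python
def batch_fill_secs(batch_size, msgs_per_sec_per_writer):
    """Seconds needed for one writer to collect a full batch."""
    if msgs_per_sec_per_writer <= 0:
        raise ValueError("message rate must be positive")
    return batch_size / msgs_per_sec_per_writer


def first_tuple_times_out(batch_size, msgs_per_sec_per_writer,
                          message_timeout_secs):
    """True if the oldest tuple in a batch outlives the message timeout."""
    fill = batch_fill_secs(batch_size, msgs_per_sec_per_writer)
    return fill > message_timeout_secs


if __name__ == "__main__":
    # 600 msgs/sec spread over 4 hypothetical writers -> 150 msgs/sec each.
    for batch in (5000, 100):
        print(batch, first_tuple_times_out(batch, 150.0, 30))
```

Under these assumptions a batch of 5000 at 150 messages per second per writer
takes longer than a 30 second timeout to fill, while a batch of 100 does not.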

On Fri, Apr 21, 2017 at 12:05 PM, Ali Nazemian 
wrote:

> Please find the bolt part of Storm-UI related to indexing topology:
>
> http://imgur.com/a/tFkmO
>
> As you can see a hdfs error has also appeared which is not important right
> now.
>
> On Sat, Apr 22, 2017 at 1:59 AM, Casey Stella  wrote:
>
>> What's curious is the enrichment topology showing the same issues, but my
>> mind went to ES as well.
>>
>> On Fri, Apr 21, 2017 at 11:57 AM, Ryan Merriman 
>> wrote:
>>
>>> Yes which bolt is reporting all those failures?  My theory is that there
>>> is some ES tuning that needs to be done.
>>>
>>> On Fri, Apr 21, 2017 at 10:53 AM, Casey Stella 
>>> wrote:
>>>
>>>> Could I see a little more of that screen?  Specifically what the bolts
>>>> look like.
>>>>
>>>> On Fri, Apr 21, 2017 at 11:51 AM, Ali Nazemian 
>>>> wrote:
>>>>
>>>>> Please find the storm-UI screenshot as follows.
>>>>>
>>>>> http://imgur.com/FhIrGFd
>>>>>
>>>>>
>>>>> On Sat, Apr 22, 2017 at 1:41 AM, Ali Nazemian 
>>>>> wrote:
>>>>>
>>>>>> Hi Casey,
>>>>>>
>>>>>> - topology.message.timeout: It was 30s at first. I have increased it
>>>>>> to 300s, no changes!
>>>>>> - It is a very basic geo-enrichment and simple rule for threat triage!
>>>>>> - No, not at all.
>>>>>> - I have changed that to find the best value. It is 5000, which is
>>>>>> about 5 MB.
>>>>>> - I have changed the number of executors for the Storm acker thread,
>>>>>> and I have also changed the value of topology.max.spout.pending, still no
>>>>>> changes!
>>>>>>
>>>>>> On Sat, Apr 22, 2017 at 1:24 AM, Casey Stella 
>>>>>> wrote:
>>>>>>
>>>>>>> Also,
>>>>>>> * what's your setting for topology.message.timeout?
>>>>>>> * You said you're seeing this in indexing and enrichment, what
>>>>>>> enrichments do you have in place?
>>>>>>> * Is ES being taxed heavily?
>>>>>>> * What's your ES batch size for the sensor?
>>>>>>>
>>>>>>> On Fri, Apr 21, 2017 at 10:46 AM, Casey Stella 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> So you're seeing failures in the storm topology but no errors in
>>>>>>>> the logs.  Would you mind sending over a screenshot of the indexing
>>>>>>>> topology from the storm UI?  You might not be able to paste the
>>>>>>>> image on the mailing list, so maybe an imgur link would be in order.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Casey
>>>>>>>>
>>>>>>>> On Fri, Apr 21, 2017 at 10:34 AM, Ali Nazemian <
>>>>>>>> alinazem...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Ryan,
>>>>>>>>>
>>>>>>>>> No, I cannot see any error inside the indexing error topic. Also,
>>>>>>>>> the number of tuples emitted and transferred to the error indexing
>>>>>>>>> bolt is zero!
>>>>>>>>>
>>>>>>>>> On Sat, Apr 22, 2017 at 12:29 AM, Ryan Merriman <
>>>>>>>>> merrim...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Do you see any errors in the error* index in Elasticsearch?
>>>>>>>>>> There are several catch blocks across the different topologies that
>>>>>>>>>> transform errors into json objects and forward them on to the
>>>>>>>>>> indexing topology.  If you're not seeing anything in the worker
>>>>>>>>>> logs it's

Re: High percentage of failed/timed out tuples after performance tuning!

2017-04-21 Thread Casey Stella
What's curious is the enrichment topology showing the same issues, but my
mind went to ES as well.

On Fri, Apr 21, 2017 at 11:57 AM, Ryan Merriman  wrote:

> Yes which bolt is reporting all those failures?  My theory is that there
> is some ES tuning that needs to be done.
>
> On Fri, Apr 21, 2017 at 10:53 AM, Casey Stella  wrote:
>
>> Could I see a little more of that screen?  Specifically what the bolts
>> look like.
>>
>> On Fri, Apr 21, 2017 at 11:51 AM, Ali Nazemian 
>> wrote:
>>
>>> Please find the storm-UI screenshot as follows.
>>>
>>> http://imgur.com/FhIrGFd
>>>
>>>
>>> On Sat, Apr 22, 2017 at 1:41 AM, Ali Nazemian 
>>> wrote:
>>>
>>>> Hi Casey,
>>>>
>>>> - topology.message.timeout: It was 30s at first. I have increased it to
>>>> 300s, no changes!
>>>> - It is a very basic geo-enrichment and simple rule for threat triage!
>>>> - No, not at all.
>>>> - I have changed that to find the best value. It is 5000, which is
>>>> about 5 MB.
>>>> - I have changed the number of executors for the Storm acker thread,
>>>> and I have also changed the value of topology.max.spout.pending, still no
>>>> changes!
>>>>
>>>> On Sat, Apr 22, 2017 at 1:24 AM, Casey Stella 
>>>> wrote:
>>>>
>>>>> Also,
>>>>> * what's your setting for topology.message.timeout?
>>>>> * You said you're seeing this in indexing and enrichment, what
>>>>> enrichments do you have in place?
>>>>> * Is ES being taxed heavily?
>>>>> * What's your ES batch size for the sensor?
>>>>>
>>>>> On Fri, Apr 21, 2017 at 10:46 AM, Casey Stella 
>>>>> wrote:
>>>>>
>>>>>> So you're seeing failures in the storm topology but no errors in the
>>>>>> logs.  Would you mind sending over a screenshot of the indexing topology
>>>>>> from the storm UI?  You might not be able to paste the image on the
>>>>>> mailing list, so maybe an imgur link would be in order.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Casey
>>>>>>
>>>>>> On Fri, Apr 21, 2017 at 10:34 AM, Ali Nazemian wrote:
>>>>>>
>>>>>>> Hi Ryan,
>>>>>>>
>>>>>>> No, I cannot see any error inside the indexing error topic. Also,
>>>>>>> the number of tuples emitted and transferred to the error indexing
>>>>>>> bolt is zero!
>>>>>>>
>>>>>>> On Sat, Apr 22, 2017 at 12:29 AM, Ryan Merriman wrote:
>>>>>>>
>>>>>>>> Do you see any errors in the error* index in Elasticsearch?  There
>>>>>>>> are several catch blocks across the different topologies that transform
>>>>>>>> errors into json objects and forward them on to the indexing topology.
>>>>>>>> If you're not seeing anything in the worker logs it's likely the errors
>>>>>>>> were captured there instead.
>>>>>>>>
>>>>>>>> Ryan
>>>>>>>>
>>>>>>>> On Fri, Apr 21, 2017 at 9:19 AM, Ali Nazemian <
>>>>>>>> alinazem...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> No, everything is fine at the log level. Also, when I checked
>>>>>>>>> resource consumption on the workers, there were still plenty of
>>>>>>>>> resources available!
>>>>>>>>>
>>>>>>>>> On Fri, Apr 21, 2017 at 10:04 PM, Casey Stella wrote:
>>>>>>>>>
>>>>>>>>>> Seeing anything in the storm logs for the workers?
>>>>>>>>>>
>>>>>>>>>> On Fri, Apr 21, 2017 at 07:41 Ali Nazemian 
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> After tuning Metron performance, I noticed that the failure rate
>>>>>>>>>>> for the indexing/enrichment topologies is very high (about 95%).
>>>>>>>>>>> However, I can see the messages in Elasticsearch. I tried
>>>>>>>>>>> increasing the timeout value for the acknowledgement, but it didn't
>>>>>>>>>>> fix the problem. I can set the number of acker executors to 0 to
>>>>>>>>>>> temporarily fix the problem, which is not a good idea at all. Do
>>>>>>>>>>> you have any idea what could have caused this issue? The failure
>>>>>>>>>>> percentage decreases when I reduce the parallelism, but even
>>>>>>>>>>> without any parallelism it is still high!
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Ali
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> A.Nazemian
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> A.Nazemian
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> A.Nazemian
>>>>
>>>
>>>
>>>
>>> --
>>> A.Nazemian
>>>
>>
>>
>


Re: High percentage of failed/timed out tuples after performance tuning!

2017-04-21 Thread Casey Stella
Could I see a little more of that screen?  Specifically what the bolts look
like.

On Fri, Apr 21, 2017 at 11:51 AM, Ali Nazemian 
wrote:

> Please find the storm-UI screenshot as follows.
>
> http://imgur.com/FhIrGFd
>
>
> On Sat, Apr 22, 2017 at 1:41 AM, Ali Nazemian 
> wrote:
>
>> Hi Casey,
>>
>> - topology.message.timeout: It was 30s at first. I have increased it to
>> 300s, no changes!
>> - It is a very basic geo-enrichment and simple rule for threat triage!
>> - No, not at all.
>> - I have changed that to find the best value. It is 5000, which is about
>> 5 MB.
>> - I have changed the number of executors for the Storm acker thread, and
>> I have also changed the value of topology.max.spout.pending, still no
>> changes!
>>
>> On Sat, Apr 22, 2017 at 1:24 AM, Casey Stella  wrote:
>>
>>> Also,
>>> * what's your setting for topology.message.timeout?
>>> * You said you're seeing this in indexing and enrichment, what
>>> enrichments do you have in place?
>>> * Is ES being taxed heavily?
>>> * What's your ES batch size for the sensor?
>>>
>>> On Fri, Apr 21, 2017 at 10:46 AM, Casey Stella 
>>> wrote:
>>>
>>>> So you're seeing failures in the storm topology but no errors in the
>>>> logs.  Would you mind sending over a screenshot of the indexing topology
>>>> from the storm UI?  You might not be able to paste the image on the mailing
>>>> list, so maybe an imgur link would be in order.
>>>>
>>>> Thanks,
>>>>
>>>> Casey
>>>>
>>>> On Fri, Apr 21, 2017 at 10:34 AM, Ali Nazemian 
>>>> wrote:
>>>>
>>>>> Hi Ryan,
>>>>>
>>>>> No, I cannot see any error inside the indexing error topic. Also, the
>>>>> number of tuples emitted and transferred to the error indexing bolt is
>>>>> zero!
>>>>>
>>>>> On Sat, Apr 22, 2017 at 12:29 AM, Ryan Merriman 
>>>>> wrote:
>>>>>
>>>>>> Do you see any errors in the error* index in Elasticsearch?  There
>>>>>> are several catch blocks across the different topologies that transform
>>>>>> errors into json objects and forward them on to the indexing topology.
>>>>>> If you're not seeing anything in the worker logs it's likely the errors
>>>>>> were captured there instead.
>>>>>>
>>>>>> Ryan
>>>>>>
>>>>>> On Fri, Apr 21, 2017 at 9:19 AM, Ali Nazemian 
>>>>>> wrote:
>>>>>>
>>>>>>> No, everything is fine at the log level. Also, when I checked
>>>>>>> resource consumption on the workers, there were still plenty of
>>>>>>> resources available!
>>>>>>>
>>>>>>> On Fri, Apr 21, 2017 at 10:04 PM, Casey Stella 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Seeing anything in the storm logs for the workers?
>>>>>>>>
>>>>>>>> On Fri, Apr 21, 2017 at 07:41 Ali Nazemian 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> After tuning Metron performance, I noticed that the failure rate
>>>>>>>>> for the indexing/enrichment topologies is very high (about 95%).
>>>>>>>>> However, I can see the messages in Elasticsearch. I tried
>>>>>>>>> increasing the timeout value for the acknowledgement, but it didn't
>>>>>>>>> fix the problem. I can set the number of acker executors to 0 to
>>>>>>>>> temporarily fix the problem, which is not a good idea at all. Do
>>>>>>>>> you have any idea what could have caused this issue? The failure
>>>>>>>>> percentage decreases when I reduce the parallelism, but even
>>>>>>>>> without any parallelism it is still high!
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Ali
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> A.Nazemian
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> A.Nazemian
>>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> A.Nazemian
>>
>
>
>
> --
> A.Nazemian
>


Re: High percentage of failed/timed out tuples after performance tuning!

2017-04-21 Thread Casey Stella
Also,
* what's your setting for topology.message.timeout?
* You said you're seeing this in indexing and enrichment, what enrichments
do you have in place?
* Is ES being taxed heavily?
* What's your ES batch size for the sensor?
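
For anyone following along, the Storm-side settings that come up in this
thread can be pulled out of a topology configuration map with something like
the sketch below. The three key names are standard Storm configuration keys;
the helper names and the sample values are hypothetical, and how you obtain
the configuration map is left out. The ES batch size lives in the Metron
sensor configuration and is not covered here.

```python
# Storm configuration keys discussed in this thread.
SETTINGS_OF_INTEREST = (
    "topology.message.timeout.secs",
    "topology.max.spout.pending",
    "topology.acker.executors",
)


def summarize_settings(topology_conf):
    """Return {key: value}, with None for any key that is missing."""
    return {key: topology_conf.get(key) for key in SETTINGS_OF_INTEREST}


def unset_settings(topology_conf):
    """List the keys of interest that are missing or explicitly null."""
    return [key for key in SETTINGS_OF_INTEREST
            if topology_conf.get(key) is None]


if __name__ == "__main__":
    # Hypothetical configuration map for illustration only.
    conf = {
        "topology.message.timeout.secs": 300,
        "topology.max.spout.pending": None,
    }
    print(summarize_settings(conf))
    print("unset:", unset_settings(conf))
```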

On Fri, Apr 21, 2017 at 10:46 AM, Casey Stella  wrote:

> So you're seeing failures in the storm topology but no errors in the
> logs.  Would you mind sending over a screenshot of the indexing topology
> from the storm UI?  You might not be able to paste the image on the mailing
> list, so maybe an imgur link would be in order.
>
> Thanks,
>
> Casey
>
> On Fri, Apr 21, 2017 at 10:34 AM, Ali Nazemian 
> wrote:
>
>> Hi Ryan,
>>
>> No, I cannot see any error inside the indexing error topic. Also, the
>> number of tuples emitted and transferred to the error indexing bolt is
>> zero!
>>
>> On Sat, Apr 22, 2017 at 12:29 AM, Ryan Merriman 
>> wrote:
>>
>>> Do you see any errors in the error* index in Elasticsearch?  There are
>>> several catch blocks across the different topologies that transform errors
>>> into json objects and forward them on to the indexing topology.  If you're
>>> not seeing anything in the worker logs it's likely the errors were captured
>>> there instead.
>>>
>>> Ryan
>>>
>>> On Fri, Apr 21, 2017 at 9:19 AM, Ali Nazemian 
>>> wrote:
>>>
>>>> No, everything is fine at the log level. Also, when I checked resource
>>>> consumption on the workers, there were still plenty of resources
>>>> available!
>>>>
>>>> On Fri, Apr 21, 2017 at 10:04 PM, Casey Stella 
>>>> wrote:
>>>>
>>>>> Seeing anything in the storm logs for the workers?
>>>>>
>>>>> On Fri, Apr 21, 2017 at 07:41 Ali Nazemian 
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> After tuning Metron performance, I noticed that the failure rate
>>>>>> for the indexing/enrichment topologies is very high (about 95%).
>>>>>> However, I can see the messages in Elasticsearch. I tried
>>>>>> increasing the timeout value for the acknowledgement, but it didn't
>>>>>> fix the problem. I can set the number of acker executors to 0 to
>>>>>> temporarily fix the problem, which is not a good idea at all. Do
>>>>>> you have any idea what could have caused this issue? The failure
>>>>>> percentage decreases when I reduce the parallelism, but even
>>>>>> without any parallelism it is still high!
>>>>>>
>>>>>> Cheers,
>>>>>> Ali
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> A.Nazemian
>>>>
>>>
>>>
>>
>>
>> --
>> A.Nazemian
>>
>
>


Re: High percentage of failed/timed out tuples after performance tuning!

2017-04-21 Thread Casey Stella
So you're seeing failures in the storm topology but no errors in the logs.
Would you mind sending over a screenshot of the indexing topology from the
storm UI?  You might not be able to paste the image on the mailing list, so
maybe an imgur link would be in order.

Thanks,

Casey

On Fri, Apr 21, 2017 at 10:34 AM, Ali Nazemian 
wrote:

> Hi Ryan,
>
> No, I cannot see any error inside the indexing error topic. Also, the
> number of tuples emitted and transferred to the error indexing bolt is
> zero!
>
> On Sat, Apr 22, 2017 at 12:29 AM, Ryan Merriman 
> wrote:
>
>> Do you see any errors in the error* index in Elasticsearch?  There are
>> several catch blocks across the different topologies that transform errors
>> into json objects and forward them on to the indexing topology.  If you're
>> not seeing anything in the worker logs it's likely the errors were captured
>> there instead.
>>
>> Ryan
>>
>> On Fri, Apr 21, 2017 at 9:19 AM, Ali Nazemian 
>> wrote:
>>
>>> No, everything is fine at the log level. Also, when I checked resource
>>> consumption on the workers, there were still plenty of resources available!
>>>
>>> On Fri, Apr 21, 2017 at 10:04 PM, Casey Stella 
>>> wrote:
>>>
>>>> Seeing anything in the storm logs for the workers?
>>>>
>>>> On Fri, Apr 21, 2017 at 07:41 Ali Nazemian 
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> After tuning Metron performance, I noticed that the failure rate
>>>>> for the indexing/enrichment topologies is very high (about 95%).
>>>>> However, I can see the messages in Elasticsearch. I tried
>>>>> increasing the timeout value for the acknowledgement, but it didn't
>>>>> fix the problem. I can set the number of acker executors to 0 to
>>>>> temporarily fix the problem, which is not a good idea at all. Do
>>>>> you have any idea what could have caused this issue? The failure
>>>>> percentage decreases when I reduce the parallelism, but even
>>>>> without any parallelism it is still high!
>>>>>
>>>>> Cheers,
>>>>> Ali
>>>>>
>>>>
>>>
>>>
>>> --
>>> A.Nazemian
>>>
>>
>>
>
>
> --
> A.Nazemian
>


Re: High percentage of failed/timed out tuples after performance tuning!

2017-04-21 Thread Casey Stella
Seeing anything in the storm logs for the workers?

On Fri, Apr 21, 2017 at 07:41 Ali Nazemian wrote:

> Hi all,
>
> After tuning Metron performance, I noticed that the failure rate for the
> indexing/enrichment topologies is very high (about 95%). However, I can
> see the messages in Elasticsearch. I tried increasing the timeout value
> for the acknowledgement, but it didn't fix the problem. I can set the
> number of acker executors to 0 to temporarily fix the problem, which is
> not a good idea at all. Do you have any idea what could have caused this
> issue? The failure percentage decreases when I reduce the parallelism,
> but even without any parallelism it is still high!
>
> Cheers,
> Ali
>