Re: Fields with a period/dot in the name

2019-11-17 Thread Владимир Михайлов
> The overall approach with Metron has been to flatten nested fields,
> rather than deal with deeply nested structures.  Most of the parsers are
> built to flatten the data. This presents a "flat" view for all downstream
> functionality like Enrichment, Profiling, etc. I assume in your example
> that whatever parser you are using is not flattening the data.  Is that
> correct?

Yes, we receive ready-made JSON from Auditbeat and Winlogbeat and do not
plan to flatten it. Our goal with Metron is to enrich this data, run it
against TI feeds and the profiler, and then index it in Elasticsearch.

> Using a map literal would simplify your example a bit.

> system := { 'id': MAP_GET('id', MAP_GET('os', host_id)) }


If I understand correctly, this expression creates a 'system' object
(map) with a single 'id' property. But it is often necessary to change
just one property of a more complex object.
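
To illustrate the difference, here is a minimal Python sketch. It is not Metron code: map_get and map_put are hypothetical stand-ins that only model how I understand MAP_GET and MAP_PUT to behave, and the sample message is made up.

```python
# Hypothetical Python stand-ins for the Stellar functions; not Metron code.
def map_get(key, mapping, default=None):
    """Model of MAP_GET(key, map): look up a key, tolerating a missing map."""
    if mapping is None:
        return default
    return mapping.get(key, default)


def map_put(key, value, mapping):
    """Model of MAP_PUT(key, value, map): set one key and return the map."""
    result = dict(mapping or {})
    result[key] = value
    return result


# Made-up sample message with an existing, richer 'system' object.
message = {
    "host_id": {"os": {"id": "abc-123"}},
    "system": {"name": "srv01"},
}

os_id = map_get("id", map_get("os", message["host_id"]))

# Map literal: builds a brand-new object, so 'name' is lost.
replaced = {"id": os_id}

# MAP_PUT style: changes one property and keeps the rest.
updated = map_put("id", os_id, message["system"])
```

With the map literal, 'system' ends up holding only 'id'; with the MAP_PUT form, the existing 'name' property survives.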

> If we complete METRON-2072, adding some syntactic sugar around
> MAP_PUT/GET, then your example could be much simpler.

> system := { 'id': host_id['os']['id'] }


That would be great, especially if something like this were implemented:

system['id'] := host_id['os']['id']
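
The assignment above is only a wish, not existing Stellar syntax. To make the intended semantics concrete, here is a rough Python sketch. The helpers get_path and set_path are hypothetical names introduced for illustration, and the sample message is made up.

```python
from copy import deepcopy


def get_path(mapping, path):
    """Read a nested value, returning None if any step is missing."""
    current = mapping
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def set_path(mapping, path, value):
    """Return a copy of mapping with the nested key at path set to value.

    Intermediate maps are created when they are missing.
    """
    result = deepcopy(mapping)
    current = result
    for key in path[:-1]:
        child = current.get(key)
        if not isinstance(child, dict):
            child = {}
            current[key] = child
        current = child
    current[path[-1]] = value
    return result


# Made-up sample message.
message = {
    "host_id": {"os": {"id": "abc-123"}},
    "system": {"name": "srv01"},
}

# Equivalent in spirit to: system['id'] := host_id['os']['id']
enriched = set_path(
    message, ["system", "id"], get_path(message, ["host_id", "os", "id"])
)
```

The point is that only 'system.id' changes; the other properties of 'system' stay as they were, and a missing intermediate map does not cause an error.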




On Sat, Nov 16, 2019 at 02:09, Nick Allen wrote:

> Hi Vladimir -
>
> > Converting ECS to flat json where the fields take the form {"system.id":
> ""} is not a good option, because the very meaning of its use and
> the convenience of the JSON format are lost.
>
> Right, it just depends on your use case.  My hope is that with the
> facilities in Metron, you can manipulate the data in whatever manner works
> best for you.
>
>
> > And with deep nesting, this generally turns into unreadable,
> hard-to-maintain code.
>
> The overall approach with Metron has been to flatten nested fields, rather
> than deal with deeply nested structures.  Most of the parsers are built to
> flatten the data. This presents a "flat" view for all downstream
> functionality like Enrichment, Profiling, etc. I assume in your example
> that whatever parser you are using is not flattening the data.  Is that
> correct?
>
>
> > And now the question: is there a way to easily work with nested JSON in
> Stellar? Deep diving into the documentation and source code has not yet
> given an answer.
> >
> > system := MAP_PUT('id', MAP_GET('id',MAP_GET('os',host_id)), system)
> >
>
> Using a map literal would simplify your example a bit.
>
> system := { 'id': MAP_GET('id', MAP_GET('os', host_id)) }
>
>
> If we complete METRON-2072, adding some syntactic sugar around
> MAP_PUT/GET, then your example could be much simpler.
>
>
> system := { 'id': host_id['os']['id'] }
>
>
>
> > This is now a fundamentally important issue that affects enrichment,
> TI, profiling, and even simple data changes during parsing.
>
> All that being said, I think this highlights one advantage of using a DSL
> like Stellar.  If you do not want to flatten your data, it should be easy
> enough to add whatever Stellar functions might be required to make the
> task simpler.
>
> I hope this helps.
>
>
>
>
>
> On Fri, Nov 15, 2019 at 5:14 AM Vladimir Mikhailov <
> v.mikhai...@content-media.ru> wrote:
>
>> Hi Nick!
>>
>> We, like Tom, plan to use Elastic Common Schema (ECS) to store events in
>> Metron.
>>
>> A feature of ECS is the nesting of JSON objects, and therefore the
>> "system.id" field implies storage in the form {"system": {"id": ""}}
>>
>> Converting ECS to flat json where the fields take the form {"system.id":
>> ""} is not a good option, because the very meaning of its use and
>> the convenience of the JSON format are lost.
>>
>> Now, in order to work with nested JSON using Stellar, we are forced to
>> use such complex constructs using the MAP_GET and MAP_PUT functions, for
>> example:
>>
>> "fieldTransformations": [
>>   {
>>     "output": ["system"],
>>     "transformation": "STELLAR",
>>     "config": {
>>       "system": "MAP_PUT('id', MAP_GET('id', MAP_GET('os', host_id)), system)"
>>     }
>>   }
>> ]
>>
>> And with deep nesting, this generally turns into unreadable,
>> hard-to-maintain code.
>>
>> And now the question: is there a way to easily work with nested JSON in
>> Stellar? Deep diving into the documentation and source code has not yet
>> given an answer.
>>
>> This is now a fundamentally important issue that affects enrichment,
>> TI, profiling, and even simple data changes during parsing.
>>
>>
>> On 2019/11/01 17:50:29, Nick Allen  wrote:
>> > Hi Tom -
>> >
>> > > In the case of Metron, should we be modifying the field names to
>> replace
>> > dots? Can the Metron STELLAR language handle a field name with a dot in
>> it,
>> > or are there any special steps 

Re: Enable optional fields in csv parser

2019-11-17 Thread Otto Fowler
I think what he is saying is that the CSV files may not always have all the
columns, in which case they won’t parse. That failure happens before Stellar
can do anything.
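
If the data can be preprocessed before it reaches the parser, one possible normalization is to pad short rows up to the configured column count. The Python sketch below is only an illustration under a strong assumption: the missing columns are always the trailing ones. The function name pad_rows and the sample data are hypothetical and not part of Metron.

```python
import csv
import io


def pad_rows(text, expected_columns, fill=""):
    """Pad short CSV rows with empty trailing fields.

    Assumes that any missing columns are the last ones in the row.
    Rows with too many fields are rejected rather than silently cut.
    """
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) > expected_columns:
            raise ValueError(f"row has {len(row)} fields: {row}")
        writer.writerow(row + [fill] * (expected_columns - len(row)))
    return out.getvalue()


# Made-up input: the second row is missing its last column.
normalized = pad_rows("a,b,c\n1,2\n", expected_columns=3)
```

After this step every row has the same number of fields, so the column mapping lines up. If columns can be missing from the middle of a row, padding is not enough and the source has to say which fields are absent.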




On November 16, 2019 at 11:30:25, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

A better way of doing this would be to use the fieldTransformation setting
and the REMOVE method to get rid of the extraneous fields. Docs are
included at
https://metron.apache.org/current-book/metron-platform/metron-parsers/index.html#

That way you don’t need a separate preprocessing step.

Simon

On Sat, 16 Nov 2019 at 16:09, Hema malini  wrote:

> Thanks, we will preprocess the data.
>
> On Sat, 16 Nov, 2019, 9:25 PM Otto Fowler wrote:
>
>> No, there is no way to do this currently.
>>
>> The parser parses the line into an array of strings that must match the
>> size of the columns.
>>
>> The underlying opencsv parser does not support this either.  You may have
>> to do some normalization work on your data if you need to account for this.
>>
>>
>>
>> On November 16, 2019 at 08:49:36, Hema malini (nhemamalin...@gmail.com)
>> wrote:
>>
>> Hi all,
>>
>> Is there any way to mark some columns as optional in the column mapping
>> of the CSV parser?
>>
>> Thanks and Regards,
>> Hema
>>
>> --
--
simon elliston ball
@sireb