Re: Multi Field Aggregation

2014-10-19 Thread Artur Martins
Yup, that's true. It will only be possible to query by that exact set of columns, 
which is an issue for future requirements.
For now it's a quick fix, but I wonder if I'm missing something in the 
"aggregations" functionality.
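
To make the limitation concrete, here is a minimal Python sketch of what a keyed fingerprint over concatenated fields amounts to. It is an illustration only, not Logstash's implementation: the function name `fingerprint` and the `|name|value|` layout are assumptions, and it assumes that a `key` combined with `method => "SHA1"` means an HMAC-SHA1 digest. The sample addresses are made up.

```python
import hashlib
import hmac


def fingerprint(event, sources, key="logstash"):
    """Hash the named fields of one event into a single hex digest.

    Assumption: fields are joined as |name|value|...| before hashing;
    the real Logstash filter may use a different layout.
    """
    joined = "".join(f"|{name}|{event[name]}" for name in sources) + "|"
    return hmac.new(key.encode(), joined.encode(), hashlib.sha1).hexdigest()


fields = ["SourceAddress", "DestinationAddress", "DestinationPort"]
flow_a = {"SourceAddress": "10.0.0.1", "DestinationAddress": "10.0.0.2", "DestinationPort": 443}
flow_b = dict(flow_a, DestinationPort=80)

# Identical 3-tuples map to the same digest, so a terms aggregation on the
# digest field groups by exactly this combination and no other.
print(fingerprint(flow_a, fields) == fingerprint(dict(flow_a), fields))
print(fingerprint(flow_a, fields) == fingerprint(flow_b, fields))
```

Because the digest is one-way, grouping by a different subset of columns would need a separate fingerprint field for that subset, which is why this only works as a quick fix.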

Cheers

On Sunday, October 19, 2014 5:07:31 PM UTC+1, Alastair James wrote:
>
> Hmmm. I don't know much about Logstash, but I suspect that's concatenating 
> the 3 values into one string and taking a hash of it. This would allow 
> you to group by that exact set of 3 columns. However, my use case is that 
> I need to be able to group by any subset of columns, so this could not be 
> pre-defined in that way.
>
> Al
>
> On 19 October 2014 16:48, Artur Martins wrote:
>
>> I heard that it could be done with a fingerprint, but I don't know how to 
>> do this. It's in logstash.conf
>>
>> Have a look:
>>
>> Fingerprint the 3-tuple of source address, destination address, 
>> destination port
>>
>> if [SourceAddress] and [DestinationAddress] {
>>   fingerprint {
>>     concatenate_sources => true
>>     method => "SHA1"
>>     key => "logstash"
>>     source => [ "SourceAddress", "DestinationAddress", "DestinationPort" ]
>>   }
>> }
>>
>> But what exactly will this do? What next?
>> Hope you can understand this and help us both 😊
>>
>> Thanks
>>
>> --
>> You received this message because you are subscribed to a topic in the 
>> Google Groups "elasticsearch" group.
>> To unsubscribe from this topic, visit 
>> https://groups.google.com/d/topic/elasticsearch/gVLNqArGvVA/unsubscribe.
>> To unsubscribe from this group and all its topics, send an email to 
>> elasticsearc...@googlegroups.com .
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/005d8152-9ee0-49bb-a8d5-84ccb9634124%40googlegroups.com
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> -- 
> Dr Alastair James
> CTO Ometria.com
> Skype: al.james
>
> 

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/f23f37e7-35a3-4a8a-9c8b-9334460f7aa7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Multi Field Aggregation

2014-10-19 Thread Artur Martins
I heard that it could be done with a fingerprint, but I don't know how to do 
this. It's in logstash.conf 
 
Have a look:

Fingerprint the 3-tuple of source address, destination address, destination port

if [SourceAddress] and [DestinationAddress] {
  fingerprint {
    concatenate_sources => true
    method => "SHA1"
    key => "logstash"
    source => [ "SourceAddress", "DestinationAddress", "DestinationPort" ]
  }
}

But what exactly will this do? What next? 
Hope you can understand this and help us both 😊 

Thanks



Re: [ANN] Elasticsearch CSV plugin for formatting search responses as CSV

2014-10-17 Thread Artur Martins
This is priceless. Thank you.

On Wednesday, July 16, 2014 12:23:11 AM UTC+1, Jörg Prante wrote:
>
> Hi,
>
> I wrote a little plugin for formatting search responses as CSV 
> (comma-separated values).
>
> This format is useful for extracting some (or all) fields from ES JSON and 
> wrapping them into a tabular display, e.g. for exporting them to spreadsheet 
> tools.
>
> More info:
>
> https://github.com/jprante/elasticsearch-csv
>
> In the hope it's useful,
>
> Jörg
>



Re: Multi Field Aggregation

2014-10-17 Thread Artur Martins
Hello,

I'm having the exact same problem. 
Have you managed to find a solution?

My thread is here: LINK 


Thanks

On Thursday, October 16, 2014 1:57:35 PM UTC+1, Alastair James wrote:
>
> Hi there.
>
> I am trying to create an aggregation that mimics the following SQL query:
>
> SELECT col1, col2, COUNT(*), SUM(metric) FROM table GROUP BY col1, col2 
> ORDER BY SUM(metric) DESC
>
> On the face of it, I could create a terms aggregation for col1, add a 
> terms aggregation for col2 inside it, and the metric aggregations inside 
> that. I could then dynamically build the SQL-result-like grid and sort it 
> myself. However, this breaks down for large result sets, or for a paginated 
> slice of a larger result.
>
> The problem is that the ES aggregation system always returns the top N 
> results for each parent and child bucket. Thus for each value of col1 I 
> have N values of col2.
>
> What I really want is to consider all possible combinations of col1 and 
> col2 in the same way as SQL does it and return the top N based on some 
> other metric. E.g. in ES speak, a single aggregation where the keys are 
> tuples of (col1, col2).
>
> I suppose one way would be to use a script terms aggregation to 
> concatenate each value of col1 and col2; however, that's going to be slow.
>
> Does anyone else have any ideas?
>
> Ideally there would be a tuple aggregation built in, e.g.:
>
> "my_agg":{
>"tuple":{
>   "fields":["col1","col2"]
>}
> }
>
> This would produce keys that are objects like:
>
> {
>"col1":"value1",
>"col2":"value2"
> }
>
> Does anyone know if this would be possible to write as a plugin?
>



How to count tuples of 3 variables, sorted

2014-10-17 Thread Artur Martins
Greetings community,

I'm new to Elasticsearch, so first of all sorry for my questions being so 
basic.

I developed a flow collector which dumps flows to my Elasticsearch server. 
Right now I use Kibana to perform the Top 10 destination and Top 10 source 
IP filters, and such.
But the query I'm having the most difficulty with is finding the Top 10 
combinations of (source + dest + dest_port), so that I can know what the top 
flows are, from which IPs, and to which destinations and protocols.

Example:

{
    "aggs":{
        "tupulo_teste":{
            "value_count":{
                "field":"SRC_ADDR",
                "field":"DST_ADDR",
                "field":"DST_PORT"
            }
        }
    }
}


This does not compute all combinations of (SRC_ADDR, DST_ADDR, DST_PORT), 
nor does it sort them to give the Top 10 hits. If you are familiar with Splunk, I 
need the equivalent of "stats count by a,b,c | sort 10 -count".
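
To make the intended result concrete, here is a minimal Python sketch of the semantics I'm after, computed client-side over a list of flow records. The helper name `top_tuples` and the sample flows are made up for illustration; this is not an Elasticsearch query.

```python
from collections import Counter


def top_tuples(flows, fields, n=10):
    """Count each distinct combination of `fields` and return the n most common."""
    counts = Counter(tuple(flow[f] for f in fields) for flow in flows)
    return counts.most_common(n)


# Made-up sample flows, for illustration only.
flows = [
    {"SRC_ADDR": "10.0.0.1", "DST_ADDR": "10.0.0.9", "DST_PORT": 443},
    {"SRC_ADDR": "10.0.0.1", "DST_ADDR": "10.0.0.9", "DST_PORT": 443},
    {"SRC_ADDR": "10.0.0.2", "DST_ADDR": "10.0.0.9", "DST_PORT": 80},
]

for combo, count in top_tuples(flows, ["SRC_ADDR", "DST_ADDR", "DST_PORT"]):
    print(count, combo)
```

Each row is one (SRC_ADDR, DST_ADDR, DST_PORT) combination with its occurrence count, ordered from most to least frequent.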

I've tried:

{
    "aggs":{
        "src":{
            "terms":{"field": "SRC_ADDR"},
            "aggs":{
                "dst":{
                    "terms":{"field": "DST_ADDR"},
                    "aggs":{
                        "dstprt":{
                            "terms":{"field": "DST_PORT"}
                        }
                    }
                }
            }
        }
    }
}


but this produces a strange and long combination, also without sorting.
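
One possible workaround is to flatten the nested buckets on the client and sort them there. The following Python sketch assumes the response's "aggregations" section nests `buckets` lists under the aggregation names used above (`src`, `dst`, `dstprt`), each bucket carrying `key` and `doc_count`. The helper name `flatten_buckets` and the sample values are made up.

```python
def flatten_buckets(aggs, names):
    """Turn nested terms-aggregation buckets into (key tuple, doc_count) rows.

    `names` lists the aggregation names from outermost to innermost.
    """
    rows = []

    def walk(node, depth, prefix):
        for bucket in node[names[depth]]["buckets"]:
            keys = prefix + (bucket["key"],)
            if depth + 1 == len(names):
                rows.append((keys, bucket["doc_count"]))
            else:
                walk(bucket, depth + 1, keys)

    walk(aggs, 0, ())
    # Sort by occurrence count, largest first.
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hand-written fragment in the assumed shape of the "aggregations" section;
# the values are made up.
sample = {
    "src": {"buckets": [
        {"key": "10.0.0.1", "doc_count": 5, "dst": {"buckets": [
            {"key": "10.0.0.9", "doc_count": 5, "dstprt": {"buckets": [
                {"key": 443, "doc_count": 4},
                {"key": 80, "doc_count": 1},
            ]}},
        ]}},
        {"key": "10.0.0.2", "doc_count": 7, "dst": {"buckets": [
            {"key": "10.0.0.9", "doc_count": 7, "dstprt": {"buckets": [
                {"key": 53, "doc_count": 7},
            ]}},
        ]}},
    ]}
}

for keys, count in flatten_buckets(sample, ["src", "dst", "dstprt"])[:10]:
    print(count, keys)
```

This only gives an exact Top 10 if every level returns all of its buckets; if each terms level is cut to its own top N, some combinations never reach the client.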

Can someone please help me produce this combined result, sorted by 
occurrence count?

Thank you
