Re: Aggregation profiling?

2015-05-28 Thread James Macdonald
I don't have an answer, but I really like this question. I too would love
to see more query and aggregation profiling tools for performance
optimization purposes.

Also, I assume you have already looked at this, but have you made sure you
are not evicting anything from your in-memory fielddata?
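
A minimal sketch of how one might check for evictions, assuming the node
stats response (for example from GET /_nodes/stats/indices/fielddata) has
already been parsed into a dict. The helper name and the sample response
below are made up for illustration; real responses carry many more fields.

```python
def fielddata_evictions(node_stats):
    """Return {node_id: eviction_count} from a parsed node stats response."""
    result = {}
    for node_id, node in node_stats.get("nodes", {}).items():
        fielddata = node.get("indices", {}).get("fielddata", {})
        result[node_id] = fielddata.get("evictions", 0)
    return result


# Hypothetical response fragment, not output from a real cluster.
sample = {
    "nodes": {
        "node-a": {"indices": {"fielddata": {"memory_size_in_bytes": 1024, "evictions": 0}}},
        "node-b": {"indices": {"fielddata": {"memory_size_in_bytes": 2048, "evictions": 7}}},
    }
}

# Keep only the nodes that have evicted fielddata at least once.
evicting = {n: c for n, c in fielddata_evictions(sample).items() if c > 0}
print(evicting)
```

Polling this before and after a slow aggregation might help tie an eviction
to a specific query, although with concurrent users it stays approximate.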

James

On Mon, May 25, 2015 at 4:08 PM, Mike Sukmanowsky 
mike.sukmanow...@gmail.com wrote:

 I don't believe there are any current endpoints in the API that support
 this, but are there plans to add better profiling information to ES
 aggregation queries? We'll see some agg queries return in 11s, then 5s
 then 11s again. Sometimes we can see associated filter cache expirations,
 but it's really hard to line these up to one specific query in our
 production environment since multiple users are executing queries
 simultaneously.

 It'd be really helpful to optionally see where aggregation queries are
 spending the bulk of their time to help us understand what to improve in
 the future.

 Anything we can do here right now?

 --
 Mike Sukmanowsky
 Aspiring Digital Carpenter

 *e*: mike.sukmanow...@gmail.com

 facebook http://facebook.com/mike.sukmanowsky | twitter
 http://twitter.com/msukmanowsky | LinkedIn
 http://www.linkedin.com/profile/view?id=10897143 | github
 https://github.com/msukmanowsky

   --
 Please update your bookmarks! We have moved to https://discuss.elastic.co/
 ---
 You received this message because you are subscribed to the Google Groups
 elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send an
 email to elasticsearch+unsubscr...@googlegroups.com.
 To view this discussion on the web visit
 https://groups.google.com/d/msgid/elasticsearch/CAOH6cu5WSGqQ%2BZ0_qrofXEvwo8JuSH9xoSbZgSwiT90MJ_wxdA%40mail.gmail.com
 https://groups.google.com/d/msgid/elasticsearch/CAOH6cu5WSGqQ%2BZ0_qrofXEvwo8JuSH9xoSbZgSwiT90MJ_wxdA%40mail.gmail.com?utm_medium=emailutm_source=footer
 .
 For more options, visit https://groups.google.com/d/optout.




Re: Aggregation not limited to filter?

2015-04-15 Thread Ivan Brusic
Which version are you using? The old post filter methods, simply named
filter, should have been removed, or at least deprecated.

Cheers,

Ivan
On Apr 13, 2015 1:33 PM, James Green james.mk.gr...@gmail.com wrote:

 Indeed. I had used postFilter to add my filters. The documentation for
 filters doesn't show how to use a query with a matchAll and a bunch of
 filters so I blindly followed IDE auto-complete.

 Lesson learned.

 On 10 April 2015 at 21:17, James Macdonald james.macdon...@geofeedia.com
 wrote:

 I had a similar problem recently and solved it by moving my filter into a
 filtered query (leaving the query as a match_all), see documentation here
 http://www.elastic.co/guide/en/elasticsearch/reference/1.5/query-dsl-filtered-query.html
 .

 I am not certain why filters do not restrict the scope of the aggregates,
 but queries do, but I suspect it interprets the filter (not wrapped in a
 filtered_query) as a post_filter (
 http://www.elastic.co/guide/en/elasticsearch/reference/1.x/search-request-post-filter.html).
 Maybe someone else actually knows why.


 Hope that helps,
 James

 On Fri, Apr 10, 2015 at 11:39 AM, James Green james.mk.gr...@gmail.com
 wrote:

 I must be doing something stupid!

 Using the Java client I can perform a search with a filter and iterate
 over the hits. I see exactly the right source documents.

 If I add an aggregation, I see the expected keyAsText string but the
 docCount reflects the volume if the filter had not been applied.

 I expected the aggregation to be restricted to the results within that
 filter?

 Thanks,

 James



Re: Aggregation not limited to filter?

2015-04-13 Thread James Green
Indeed. I had used postFilter to add my filters. The documentation for
filters doesn't show how to use a query with a matchAll and a bunch of
filters so I blindly followed IDE auto-complete.

Lesson learned.

On 10 April 2015 at 21:17, James Macdonald james.macdon...@geofeedia.com
wrote:

 I had a similar problem recently and solved it by moving my filter into a
 filtered query (leaving the query as a match_all), see documentation here
 http://www.elastic.co/guide/en/elasticsearch/reference/1.5/query-dsl-filtered-query.html
 .

 I am not certain why filters do not restrict the scope of the aggregates,
 but queries do, but I suspect it interprets the filter (not wrapped in a
 filtered_query) as a post_filter (
 http://www.elastic.co/guide/en/elasticsearch/reference/1.x/search-request-post-filter.html).
 Maybe someone else actually knows why.


 Hope that helps,
 James

 On Fri, Apr 10, 2015 at 11:39 AM, James Green james.mk.gr...@gmail.com
 wrote:

 I must be doing something stupid!

 Using the Java client I can perform a search with a filter and iterate
 over the hits. I see exactly the right source documents.

 If I add an aggregation, I see the expected keyAsText string but the
 docCount reflects the volume if the filter had not been applied.

 I expected the aggregation to be restricted to the results within that
 filter?

 Thanks,

 James




Re: Aggregation not limited to filter?

2015-04-10 Thread James Macdonald
I had a similar problem recently and solved it by moving my filter into a
filtered query (leaving the query as a match_all), see documentation here
http://www.elastic.co/guide/en/elasticsearch/reference/1.5/query-dsl-filtered-query.html
.

I am not certain why filters do not restrict the scope of the aggregations
while queries do, but I suspect it interprets the filter (not wrapped in a
filtered query) as a post_filter (
http://www.elastic.co/guide/en/elasticsearch/reference/1.x/search-request-post-filter.html).
Maybe someone else actually knows why.
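
To make the difference concrete, here is a minimal sketch of the two request
bodies, built as Python dicts in the Elasticsearch 1.x query DSL. The field
names (status, category) and the helper names are made up for illustration.
A post_filter is applied to the hits after the aggregations have been
computed, which would explain unfiltered counts.

```python
import json


def filtered_request(filter_clause, aggs):
    """Filter inside a filtered query: hits and aggregations are both restricted."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": filter_clause,
            }
        },
        "aggs": aggs,
    }


def post_filter_request(filter_clause, aggs):
    """Filter as post_filter: only the hits are restricted, not the aggregations."""
    return {
        "query": {"match_all": {}},
        "post_filter": filter_clause,
        "aggs": aggs,
    }


# Hypothetical filter and aggregation, for illustration only.
status_filter = {"term": {"status": "active"}}
by_category = {"by_category": {"terms": {"field": "category"}}}

scoped = filtered_request(status_filter, by_category)
unscoped = post_filter_request(status_filter, by_category)
print(json.dumps(scoped, indent=2))
```

With the first body the bucket counts should reflect only the filtered
documents; with the second they cover everything the query matched.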


Hope that helps,
James

On Fri, Apr 10, 2015 at 11:39 AM, James Green james.mk.gr...@gmail.com
wrote:

 I must be doing something stupid!

 Using the Java client I can perform a search with a filter and iterate
 over the hits. I see exactly the right source documents.

 If I add an aggregation, I see the expected keyAsText string but the
 docCount reflects the volume if the filter had not been applied.

 I expected the aggregation to be restricted to the results within that
 filter?

 Thanks,

 James



Re: Aggregation / Sort and CircuitBreakingException

2015-03-15 Thread joergpra...@gmail.com
Have you considered doc values?

http://www.elastic.co/guide/en/elasticsearch/guide/current/doc-values.html

Jörg

On Sun, Mar 15, 2015 at 11:11 PM, Lindsey Poole lpo...@gmail.com wrote:

 Hey guys,

 I have a question about the mechanics of aggregation and sorting w.r.t.
 the fielddata cache. I know this has been covered in some detail
 previously, and I'm caught up on the advice to use doc_values where
 possible, but we have a use case where we do light analysis on a particular
 set of fields in our document, but also allow sorting on those fields.

 While we'll probably modify our schema to solve the issue, I was first
 wondering whether it is possible to filter the set of documents that ES
 aggregates / sorts over *before* pulling them into the fielddata cache? We
 have extremely high cardinality fields, but very selective queries, and it
 seems very inefficient to pull multiple gigabytes into the fielddata cache
 to select relatively few matching documents.

 Thanks,

 Lindsey



Re: Aggregation / Sort and CircuitBreakingException

2015-03-15 Thread joergpra...@gmail.com
I mean, I do not understand what you mean by "I'm caught up on the advice
to use doc_values where possible, but we have a use case where we do light
analysis on a particular set of fields in our document". What exactly
prevents you from using doc values?

Jörg

On Mon, Mar 16, 2015 at 12:41 AM, joergpra...@gmail.com 
joergpra...@gmail.com wrote:

 Have you considered doc values?

 http://www.elastic.co/guide/en/elasticsearch/guide/current/doc-values.html

 Jörg

 On Sun, Mar 15, 2015 at 11:11 PM, Lindsey Poole lpo...@gmail.com wrote:

 Hey guys,

 I have a question about the mechanics of aggregation and sorting w.r.t.
 the fielddata cache. I know this has been covered in some detail
 previously, and I'm caught up on the advice to use doc_values where
 possible, but we have a use case where we do light analysis on a particular
 set of fields in our document, but also allow sorting on those fields.

 While we'll probably modify our schema to solve the issue, I was first
 wondering whether it is possible to filter the set of documents that ES
 aggregates / sorts over *before* pulling them into the fielddata cache? We
 have extremely high cardinality fields, but very selective queries, and it
 seems very inefficient to pull multiple gigabytes into the fielddata cache
 to select relatively few matching documents.

 Thanks,

 Lindsey



Re: Aggregation / Sort and CircuitBreakingException

2015-03-15 Thread Lindsey Poole
Well, we have a field that is supporting a backward compatibility use case. 
Clients are executing a partial match query on this field, so we used the 
keyword tokenizer instead of not_analyzed. Since this is supporting legacy 
functionality, the clients cannot be updated to change the expectation that 
a partial match will return results.

I can modify the schema and re-index so that we aggregate and sort over a 
not_analyzed subfield instead, while executing any queries on the parent 
field, but I wanted to verify that there is no other way to filter out 
terms prior to loading them into the fielddata cache.
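
A minimal sketch of that schema change, written as a Python dict in the
Elasticsearch 1.x mapping syntax. The type name (document), field name
(title), subfield name (raw) and analyzer name (legacy_partial) are all
hypothetical, and the lowercase filter is only an assumption about what the
"light analysis" involves.

```python
import json


def index_body():
    """Index settings plus mapping: analyzed parent field, not_analyzed subfield."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    # Assumed legacy analyzer: keyword tokenizer plus lowercasing.
                    "legacy_partial": {
                        "type": "custom",
                        "tokenizer": "keyword",
                        "filter": ["lowercase"],
                    }
                }
            }
        },
        "mappings": {
            "document": {
                "properties": {
                    "title": {
                        "type": "string",
                        "analyzer": "legacy_partial",
                        "fields": {
                            # Sort and aggregate on this subfield; doc values
                            # are read from disk instead of heap fielddata.
                            "raw": {
                                "type": "string",
                                "index": "not_analyzed",
                                "doc_values": True,
                            }
                        },
                    }
                }
            }
        },
    }


body = index_body()
sort_clause = {"sort": [{"title.raw": {"order": "asc"}}]}
print(json.dumps(body, indent=2))
```

Legacy clients would keep querying title, while sorts and aggregations
would target title.raw.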

The kind of filtering I'm looking for would be something like, only 
consider terms in field1 from documents where field2=valueA.

-Lindsey

On Sunday, March 15, 2015 at 4:43:56 PM UTC-7, Jörg Prante wrote:

 I mean, I do not understand what you mean by I'm caught up on the advice 
 to use doc_values where possible, but we have a use case where we do light 
 analysis on a particular set of fields in our document - what exactly 
 prevents you from doc values?

 Jörg

 On Mon, Mar 16, 2015 at 12:41 AM, joerg...@gmail.com
 joerg...@gmail.com wrote:

 Have you considered doc values?

 http://www.elastic.co/guide/en/elasticsearch/guide/current/doc-values.html

 Jörg

 On Sun, Mar 15, 2015 at 11:11 PM, Lindsey Poole lpo...@gmail.com wrote:

 Hey guys,

 I have a question about the mechanics of aggregation and sorting w.r.t. 
 the fielddata cache. I know this has been covered in some detail 
 previously, and I'm caught up on the advice to use doc_values where 
 possible, but we have a use case where we do light analysis on a particular 
 set of fields in our document, but also allow sorting on those fields.

 While we'll probably modify our schema to solve the issue, I was first 
 wondering whether it is possible to filter the set of documents that ES 
 aggregates / sorts over *before* pulling them into the fielddata cache? We 
 have extremely high cardinality fields, but very selective queries, and it 
 seems very inefficient to pull multiple gigabytes into the fielddata cache 
 to select relatively few matching documents.

 Thanks,

 Lindsey



Re: Aggregation / Sort and CircuitBreakingException

2015-03-15 Thread Lindsey Poole
Also, if I understand correctly, there are negative implications when 
sorting over a column that has been analyzed - in our case, to remove 
stop-words.

Since the total cardinality of our sort field exceeds the available heap,
we can't sort a single user's documents when using stop-word analysis, since
doc_values do not support analyzed fields.

It seems like we'll have to preprocess the field to remove stop-words?
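
A minimal sketch of what that preprocessing could look like on the client
side before indexing. The stop-word list, the field names (title,
title_sort) and the helper name are hypothetical; the idea is that the
cleaned value goes into a not_analyzed field with doc_values enabled, which
is then used for sorting.

```python
# Illustrative stop-word list, not the one any particular analyzer uses.
STOP_WORDS = frozenset({"a", "an", "the", "of", "and"})


def sort_key(text, stop_words=STOP_WORDS):
    """Lowercase the text, split on whitespace, and drop stop words."""
    tokens = text.lower().split()
    kept = [token for token in tokens if token not in stop_words]
    return " ".join(kept)


# Hypothetical document; title_sort is the extra field used only for sorting.
doc = {"title": "The History of the Elastic Band"}
doc["title_sort"] = sort_key(doc["title"])
print(doc["title_sort"])
```

The original title stays untouched for querying, so the legacy partial-match
behaviour would not be affected.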

On Sunday, March 15, 2015 at 7:01:21 PM UTC-7, Lindsey Poole wrote:

 Well, we have a field that is supporting a backward compatibility use 
 case. Clients are executing a partial match query on this field, so we used 
 the keyword tokenizer instead of not_analyzed. Since this is supporting 
 legacy functionality, the clients cannot be updated to change the 
 expectation that a partial match will return results.

 I can modify the schema and re-index so that we aggregate and sort over a 
 not_analyzed subfield instead, while executing any queries on the parent 
 field, but I wanted to verify that there is no other way to filter out 
 terms prior to loading them into the fielddata cache.

 The kind of filtering I'm looking for would be something like, only 
 consider terms in field1 from documents where field2=valueA.

 -Lindsey

 On Sunday, March 15, 2015 at 4:43:56 PM UTC-7, Jörg Prante wrote:

 I mean, I do not understand what you mean by I'm caught up on the 
 advice to use doc_values where possible, but we have a use case where we do 
 light analysis on a particular set of fields in our document - what 
 exactly prevents you from doc values?

 Jörg

 On Mon, Mar 16, 2015 at 12:41 AM, joerg...@gmail.com joerg...@gmail.com 
 wrote:

 Have you considered doc values?


 http://www.elastic.co/guide/en/elasticsearch/guide/current/doc-values.html

 Jörg

 On Sun, Mar 15, 2015 at 11:11 PM, Lindsey Poole lpo...@gmail.com 
 wrote:

 Hey guys,

 I have a question about the mechanics of aggregation and sorting w.r.t. 
 the fielddata cache. I know this has been covered in some detail 
 previously, and I'm caught up on the advice to use doc_values where 
 possible, but we have a use case where we do light analysis on a particular
 set of fields in our document, but also allow sorting on those fields.

 While we'll probably modify our schema to solve the issue, I was first 
 wondering whether it is possible to filter the set of documents that ES 
 aggregates / sorts over *before* pulling them into the fielddata cache? We 
 have extremely high cardinality fields, but very selective queries, and it 
 seems very inefficient to pull multiple gigabytes into the fielddata cache 
 to select relatively few matching documents.

 Thanks,

 Lindsey



Re: Aggregation - Blank and date aggregation

2015-01-16 Thread buddarapu nagaraju
Hi,

I tried, but the date histogram didn't work, and I'm not sure what mistake
I'm making.

Here is the date histogram request (JSON) I'm passing, along with a sample
document structure.





date histogram request

{
  "aggs": {
    "createddatetime": {
      "date_histogram": {
        "field": "createddatetime",
        "interval": "day"
      }
    }
  }
}

Document in index has fields

{
  "id": 79,
  "rank": 0,
  "dateSort2": "2015-01-15T06:08:06.7091884Z",
  "dateSort3": "0001-01-01T00:00:00",
  "doubleSort1": 118.5,
  "doubleSort2": 67884.18,
  "doubleSort3": 54262.6006,
  "numField": 0,
  "createdDateTime": "2015-01-16T06:08:06.7091884Z"
}


Regards
Nagaraju
908 517 6981

On Thu, Jan 15, 2015 at 12:38 PM, Adrien Grand 
adrien.gr...@elasticsearch.com wrote:

 Then it means that you want to use a date_histogram aggregation with
 interval=day. See
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html
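
A minimal sketch of such a request body, built as a Python dict. The bucket
name (per_day) and the helper name are made up; the field name is written
exactly as createdDateTime, on the assumption that field names are matched
case-sensitively, so a lowercased name would refer to a different field.

```python
import json


def daily_histogram(field):
    """Build a search body that buckets documents by calendar day."""
    return {
        "size": 0,  # return only the aggregation, no hits
        "aggs": {
            "per_day": {
                "date_histogram": {"field": field, "interval": "day"}
            }
        },
    }


body = daily_histogram("createdDateTime")
print(json.dumps(body, indent=2))
```

Each bucket in the response should then carry one day and the number of
documents whose createdDateTime falls on it.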

 On Thu, Jan 15, 2015 at 4:43 PM, buddarapu nagaraju budda08n...@gmail.com
  wrote:

 Hey Adrien, thank you. I have one more question, about aggregating on dates.

 We store a date-time in a field called createdDateTime, but I need to
 aggregate only on the date part of it.

 Any ideas? Sample code would help us.

 Regards
 Nagaraju
 908 517 6981

 On Wed, Jan 14, 2015 at 6:10 AM, Adrien Grand 
 adrien.gr...@elasticsearch.com wrote:



 On Wed, Jan 14, 2015 at 10:37 AM, buddarapu nagaraju 
 budda08n...@gmail.com wrote:

 Does the terms aggregation count blank field values?


 Yes, an empty value "" counts as a term. Note that you need the field to
 be not analyzed for it to work (or to use an analyzer that emits empty
 strings). Otherwise the standard analyzer would analyze "" as an empty
 list of tokens, so a field value of "" would not actually count...
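
A minimal sketch of that setup, as Python dicts in the Elasticsearch 1.x
syntax: a not_analyzed string mapping plus a terms aggregation on it. The
field name (category), bucket name (by_value) and helper names are
hypothetical.

```python
import json


def not_analyzed_string():
    """Mapping for a string field that is indexed as a single, unmodified term."""
    return {"type": "string", "index": "not_analyzed"}


def terms_request(field, size=10):
    """Search body with a terms aggregation and no hits."""
    return {
        "size": 0,
        "aggs": {"by_value": {"terms": {"field": field, "size": size}}},
    }


mapping = {"properties": {"category": not_analyzed_string()}}
request = terms_request("category")
print(json.dumps(request, indent=2))
```

With this mapping an empty string is indexed as a term of its own, so it
should show up as a bucket with an empty key.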


 Is a terms aggregation enough for date aggregation, or are there specific
 aggregations for that? All I need from a date aggregation is the distinct
 dates and their counts.


 A terms aggregation is enough, but a date_histogram aggregation is
 generally more useful on dates as there are lots of unique values and it's
 often more useful to group them based on the year, month or day.

 --
 Adrien Grand





 --
 Adrien Grand



Re: Aggregation - Blank and date aggregation

2015-01-16 Thread buddarapu nagaraju
Index mapping here

"mappings": {
  "document": {
    "properties": {
      "createdDateTime": {
        "format": "dateOptionalTime",
        "type": "date"
      },
      "doubleSort1": { "type": "double" },
      "stringSort3": { "type": "string" },
      "doubleSort2": { "type": "double" },
      "doubleSort3": { "type": "double" },
      "numSort1": { "type": "long" },
      "stringSort2": { "type": "string" },
      "dcn": { "type": "string" },
      "numSort2": { "type": "long" },
      "numSort3": { "type": "long" },
      "path": { "type": "string" },
      "numField": { "type": "long" },
      "dateSort3": {
        "format": "dateOptionalTime",
        "type": "date"
      },
      "dateSort2": {
        "format": "dateOptionalTime",
        "type": "date"
      },
      "rank": { "type": "double" },
      "id": { "type": "long" },
      "text": { "type": "string" },
      "fields": {
        "properties": {
          "isAnalyzed": { "type": "boolean" },
          "name": { "type": "string" },
          "isFullText": { "type": "boolean" },
          "isStored": { "type": "boolean" },
          "value": { "type": "string" }
        }
      }
    }
  }
}


Regards
Nagaraju
908 517 6981

On Fri, Jan 16, 2015 at 3:23 AM, buddarapu nagaraju budda08n...@gmail.com
wrote:

 Hi,

 I tried, but the date histogram didn't work, and I'm not sure what mistake
 I'm making.

 Here is the date histogram request (JSON) I'm sending, followed by a sample
 document structure.

 Date histogram request:

 {
   "aggs": {
     "createddatetime": {
       "date_histogram": {
         "field": "createddatetime",
         "interval": "day"
       }
     }
   }
 }

 A document in the index has these fields:

 {
   "id": 79,
   "rank": 0,
   "dateSort2": "2015-01-15T06:08:06.7091884Z",
   "dateSort3": "0001-01-01T00:00:00",
   "doubleSort1": 118.5,
   "doubleSort2": 67884.18,
   "doubleSort3": 54262.6006,
   "numField": 0,
   "createdDateTime": "2015-01-16T06:08:06.7091884Z",
   ...
 }


 Regards
 Nagaraju
 908 517 6981


Re: Aggregation - Blank and date aggregation

2015-01-16 Thread Adrien Grand
This looks good, what error did you get?


Re: Aggregation - Blank and date aggregation

2015-01-16 Thread buddarapu nagaraju
I was able to figure it out using Fiddler: the date histogram results are
returned in a separate nested object in the response. It works now.
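
For readers hitting the same issue, here is a minimal Python sketch of reading date histogram buckets out of a search response. The response dictionary is hand-written for illustration (it is not output captured from this thread), and the aggregation name `created_per_day` and the helper `daily_counts` are hypothetical. The point it shows is that buckets live under `aggregations.<name>.buckets`, not under `hits`.

```python
# Hypothetical response body, shaped like an Elasticsearch search response
# that contains one date_histogram aggregation named "created_per_day".
response = {
    "hits": {"total": 3, "hits": []},
    "aggregations": {
        "created_per_day": {
            "buckets": [
                {"key_as_string": "2015-01-15T00:00:00.000Z",
                 "key": 1421280000000, "doc_count": 1},
                {"key_as_string": "2015-01-16T00:00:00.000Z",
                 "key": 1421366400000, "doc_count": 2},
            ]
        }
    },
}


def daily_counts(resp, agg_name):
    """Return {day: doc_count} from the buckets of a date_histogram aggregation."""
    buckets = resp.get("aggregations", {}).get(agg_name, {}).get("buckets", [])
    # Keep only the YYYY-MM-DD prefix of the bucket's formatted key.
    return {b["key_as_string"][:10]: b["doc_count"] for b in buckets}


counts = daily_counts(response, "created_per_day")
print(counts)
```

The helper returns an empty dictionary when the aggregation is missing, which makes it easy to tell "no buckets" apart from "looked in the wrong place".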


Re: Aggregation - Blank and date aggregation

2015-01-15 Thread Adrien Grand
Then it means that you want to use a date_histogram aggregation with
interval=day. See
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html
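
As a rough illustration of that suggestion, the Python sketch below builds the request body for a daily date_histogram. The helper name `date_histogram_body`, the aggregation name `per_day`, and the top-level `"size": 0` (to skip returning hits) are assumptions added for the example. The field name is taken from the mapping posted in this thread; field names are case-sensitive, so it should match the mapping exactly. The `interval` parameter is the one used in this thread; newer Elasticsearch versions may expect a different parameter name.

```python
import json


def date_histogram_body(field, interval="day", agg_name="per_day"):
    """Build a search body that buckets documents by date on `field`."""
    return {
        "size": 0,  # assumption: only the aggregation is wanted, not the hits
        "aggs": {
            agg_name: {
                "date_histogram": {"field": field, "interval": interval}
            }
        },
    }


# "createdDateTime" is spelled as in the posted mapping.
body = date_histogram_body("createdDateTime")
print(json.dumps(body, indent=2))
```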


-- 
Adrien Grand



Re: Aggregation - Blank and date aggregation

2015-01-15 Thread buddarapu nagaraju
Hey Adrien, thank you. I have one more question about aggregating on dates.

We store a date-time in a field called createdDateTime, but I need to
aggregate on only the date part of it.

Any ideas? Sample code would help us.

Regards
Nagaraju
908 517 6981



Re: Aggregation - Blank and date aggregation

2015-01-14 Thread Adrien Grand
On Wed, Jan 14, 2015 at 10:37 AM, buddarapu nagaraju budda08n...@gmail.com
wrote:

 Does the terms aggregation count blank field values?


Yes, an empty value "" counts as a term. Note that you need the field to be
not analyzed for it to work (or to use an analyzer that emits empty
strings). Otherwise the standard analyzer would analyze "" as an empty
list of tokens, so a field value of "" would not actually count...
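
The difference can be sketched in plain Python. This is a toy model, not Elasticsearch code: `toy_standard_analyzer` is a hypothetical stand-in that only splits text into lowercase word tokens, and `toy_not_analyzed` models a field whose whole value is indexed as a single term.

```python
import re
from collections import Counter


def toy_standard_analyzer(text):
    """Rough stand-in for a word-splitting analyzer: lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def toy_not_analyzed(text):
    """A not-analyzed field indexes the whole value as one term."""
    return [text]


values = ["", "abc", ""]

# Analyzed: the empty string produces no tokens, so it never becomes a term.
analyzed_counts = Counter(t for v in values for t in toy_standard_analyzer(v))

# Not analyzed: the empty string is itself a term and gets counted.
exact_counts = Counter(t for v in values for t in toy_not_analyzed(v))

print(dict(analyzed_counts))
print(dict(exact_counts))
```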


 Is a terms aggregation enough for aggregating on dates, or is there a more
 specific aggregation? All I need from the date aggregation is the distinct
 dates and their counts.


A terms aggregation is enough, but a date_histogram aggregation is
generally more useful on dates as there are lots of unique values and it's
often more useful to group them based on the year, month or day.
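
To make the contrast concrete, here is a small Python sketch that imitates both groupings locally. It does not call Elasticsearch; the timestamps are made-up sample values, and the two counters only model "one bucket per distinct value" versus "one bucket per day".

```python
from collections import Counter
from datetime import datetime

# Made-up sample timestamps for illustration.
timestamps = [
    "2015-01-15T06:08:06",
    "2015-01-15T09:30:00",
    "2015-01-16T06:08:06",
]

# Terms-style grouping: every distinct timestamp is its own bucket.
by_exact_value = Counter(timestamps)

# date_histogram-style grouping with a one-day interval: truncate to the date.
by_day = Counter(
    datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S").date().isoformat()
    for ts in timestamps
)

print(len(by_exact_value), dict(by_day))
```

With second-level timestamps, the first grouping tends to produce nearly one bucket per document, while the second collapses them into a handful of daily buckets.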

-- 
Adrien Grand



Re: aggregation giving inconsistent results

2014-10-31 Thread Adrien Grand
This is unfortunately a known limitation of the terms aggregation. Note
however that Elasticsearch 1.4 (only a beta version is available today, but
the GA release should be available within a couple of weeks) improves some
heuristics, which gives better accuracy by default, and also reports an
error bound on the document counts that are returned.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation-approximate-counts
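
The effect can be reproduced with a simplified Python model. This is a sketch under stated assumptions, not the actual Elasticsearch implementation: each shard is a plain list of terms, every shard reports only its own top `shard_size` terms, and the coordinating step sums whatever it receives. The function name `terms_agg` and the sample data are hypothetical.

```python
from collections import Counter


def terms_agg(shards, size, shard_size):
    """Merge per-shard top-`shard_size` term counts and return the top `size`."""
    merged = Counter()
    for shard in shards:
        # Each shard only reports its own most frequent terms.
        for term, count in Counter(shard).most_common(shard_size):
            merged[term] += count
    return merged.most_common(size)


shards = [
    ["x"] * 6 + ["y"] * 4 + ["z"] * 3,  # shard 1
    ["z"] * 5 + ["y"] * 3,              # shard 2
]

# True totals are z=8, y=7, x=6.
narrow = terms_agg(shards, size=1, shard_size=1)
wide = terms_agg(shards, size=1, shard_size=3)
print(narrow, wide)
```

With a narrow per-shard window, the merged result names a different top term and a lower count than the true totals; widening the window fixes both. This mirrors why the counts in the question changed when `size` went from 5 to 50.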

On Thu, Oct 30, 2014 at 5:48 PM, Jay Hilden jay.hil...@gmail.com wrote:

 I'm running an aggregation and getting the top 5 results.  When I run the
 exact same aggregation on the top 50 results I'm getting totally different
 results.  I expect that when asking for 50 the top 5 should remain the same
 and an additional 45 should be added to the list.  That is not what's
 happening.

 Note: I'm aggregating on the non_analyzed part of a multi-field
 authInput.userName, I'm not sure if that makes a difference or not.

 *Here is my query:*

 GET prodstarbucks/authEvent/_search
 {
   "size": 0,
   "aggs": {
     "users": {
       "terms": {
         "field": "authInput.userName.userNameNotAnalyzed",
         "size": 5
       }
     }
   },
   "query": {
     "filtered": {
       "query": {
         "match_all": {}
       },
       "filter": {
         "bool": {
           "must": [
             {
               "range": {
                 "authResult.authEventDate": {
                   "gte": "2014-10-01T00:00:00.000",
                   "lte": "2014-10-31T00:00:00.000"
                 }
               }
             }
           ]
         }
       }
     }
   }
 }

 *RESULT:*
 {
   "took": 2171,
   "timed_out": false,
   "_shards": { "total": 5, "successful": 5, "failed": 0 },
   "hits": { "total": 1090455, "max_score": 0, "hits": [] },
   "aggregations": {
     "users": {
       "buckets": [
         { "key": "3D64E4FD-6D25-4E77-966E-A0E059CFD31E", "doc_count": 91 },
         { "key": "3982EC96-DB4C-4A22-AC64-2CFC09D52579", "doc_count": 68 },
         { "key": "674E6691-8A46-4D34-BB31-B78780969311", "doc_count": 24 },
         { "key": "64449480-77AC-4D64-B79D-DDB545BEE472", "doc_count": 23 },
         { "key": "{7CB63FEE-709A-4AD5-AA16-2CFE3282FEE8}", "doc_count": 23 }
       ]
     }
   }
 }

 If I change the aggregation size to 50, these are my top 5 results:
 {
   "took": 2256,
   "timed_out": false,
   "_shards": { "total": 5, "successful": 5, "failed": 0 },
   "hits": { "total": 1090501, "max_score": 0, "hits": [] },
   "aggregations": {
     "users": {
       "buckets": [
         { "key": "3D64E4FD-6D25-4E77-966E-A0E059CFD31E", "doc_count": 109 },
         { "key": "3982EC96-DB4C-4A22-AC64-2CFC09D52579", "doc_count": 84 },
         { "key": "F77E8291-1640-4C3F-8A1A-D6D955AB940A", "doc_count": 59 },
         { "key": "6AC1ED48-8F91-400B-9353-172BB6E1823B", "doc_count": 53 },
         { "key": "52CDF454-92C2-4C32-91F6-AF4F08C8F908", "doc_count": 52 },
         ...


 The doc_counts are all different.  Can someone help explain this to me and
 let me know how I might get the correct doc_count even when only asking for
 the top 5 results.

 Thank you!





-- 
Adrien Grand



Re: aggregation giving inconsistent results

2014-10-31 Thread Jay Hilden
Thanks Adrien, your link was very helpful in understanding why I was
getting the results I'm getting. After some experimentation on our data,
I'm going to set shard_size to 20 times the size. So in my testing, when I
want the top 5 results for a very flat term, I set shard_size to 100 (5*20),
and that gives me accurate results.
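
A minimal Python sketch of that request body follows. The helper name `terms_body` and its `multiplier` argument are hypothetical; the field name and aggregation name come from the query earlier in this thread, and the factor of 20 is simply the value that worked on this data set, not a general recommendation.

```python
def terms_body(field, size, multiplier=20):
    """Build a terms aggregation whose shard_size is a multiple of size."""
    return {
        "size": 0,
        "aggs": {
            "users": {
                "terms": {
                    "field": field,
                    "size": size,
                    # Ask each shard for more candidates than are finally returned.
                    "shard_size": size * multiplier,
                }
            }
        },
    }


body = terms_body("authInput.userName.userNameNotAnalyzed", 5)
print(body["aggs"]["users"]["terms"])
```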

Thanks again!


Re: Aggregation on last element

2014-10-26 Thread vineeth mohan
Hello Michaël,

I can't think of a way to do this in a single call.
Maybe you should try the following:


(Terms aggregation on element) -> (Top N hits aggregation, sorted by date
asc, size = 1) -> (Filter aggregation by type A)
With this you will get the elements that you are looking for. Now filter
on those elements and run a terms aggregation on the element field to
get the results.
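
A hedged Python sketch of the first step and of the intended end result follows. The request body uses a terms aggregation with a top_hits sub-aggregation; the aggregation names `by_element` and `latest` are hypothetical, and descending date order is assumed here so that the newest document per element comes first. The function `count_latest_of_type` is not an Elasticsearch call: it only reproduces the desired counting logic locally on the three sample documents from the question.

```python
# Request body sketch: one bucket per element, newest document per bucket.
body = {
    "size": 0,
    "aggs": {
        "by_element": {
            "terms": {"field": "element"},
            "aggs": {
                "latest": {
                    "top_hits": {
                        "size": 1,
                        "sort": [{"date": {"order": "desc"}}],
                    }
                }
            },
        }
    },
}

# Sample documents from the question.
docs = [
    {"date": "2014-01-01", "element": "abc", "type": "A"},
    {"date": "2014-01-02", "element": "abc", "type": "B"},
    {"date": "2014-01-03", "element": "def", "type": "A"},
]


def count_latest_of_type(documents, wanted_type):
    """Count elements whose most recent document has the wanted type."""
    latest = {}
    for doc in documents:
        current = latest.get(doc["element"])
        # ISO-8601 dates compare correctly as plain strings.
        if current is None or doc["date"] > current["date"]:
            latest[doc["element"]] = doc
    return sum(1 for doc in latest.values() if doc["type"] == wanted_type)


result = count_latest_of_type(docs, "A")
print(result)
```

On the sample data only `def` has a latest document of type A, so the count matches the expected answer in the question.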

Thanks
  Vineeth



On Sun, Oct 26, 2014 at 1:04 AM, Michaël Gallego mich...@maestrooo.com
wrote:

 Hi,

 I have a type whose data looks like this:

 {
   "date": "2014-01-01",
   "element": "abc",
   "type": "A"
 },
 {
   "date": "2014-01-02",
   "element": "abc",
   "type": "B"
 },
 {
   "date": "2014-01-03",
   "element": "def",
   "type": "A"
 }

 I'd like to be able to group the data by element, and count the elements
 whose LAST document by date has a type of A. In this case, I want the
 result to be 1: the second document has the same element as the first
 document and a later date, but as its type is not A, I don't want that
 element to be counted; the last document is the only one with element def,
 and its type is A.

 I'm not sure this is even possible. Please note that the cardinality of
 element can be quite high (up to 20 000 different values).

 Thank you in advance!



Re: Aggregation on last element

2014-10-26 Thread Michaël Gallego
Hi Vineeth,

I'm afraid that this won't work, because as I said element can have high 
cardinality (while it's not bounded in theory, in practice it will range 
from 500 to 4). Therefore if I do a terms on element, then a top hit, 
it will require to generate maybe 4 sub-buckets. I think this will kill 
performance.

For now, I've rethought my format so it now looks like this:

{
  "element": "abc",
  "history": [
    { "type": "A", "date": "2014-01-01" },
    { "type": "B", "date": "2014-01-02" }
  ]
}

Where history is mapped as nested. Now, I can do that:

{
  "aggs": {
    "history": {
      "nested": {
        "path": "history"
      },
      "aggs": {
        "latest-history": {
          "filter": {
            "limit": {
              "value": 1
            }
          },
          "aggs": {
            "by-type": {
              "terms": {
                "field": "history.type",
                "size": 0
              }
            }
          }
        }
      }
    }
  }
}

This gets the nested history, limits it to 1, then groups by type, so I can
get the count of the ones I'm interested in (A type or B type). The only
drawback is that I need to sort the nested history by date in my
application (I have not found any way to sort the nested documents by date
before applying the limit filter...), and that while history is typically
quite small (around 10-200 elements), it is not bounded, and updating is
harder to do...

If anyone has any other idea, don't hesitate to share!




Re: Aggregation on last element

2014-10-26 Thread Michaël Gallego
After some testing, it appears that my solution does not work, but I'm not
sure I understand why. The filter returns fewer results than expected.



Re: Aggregation buckets, with an additional key:value inside.

2014-10-25 Thread Ivan Brusic
I maintain a mapping on the client side to do the lookups. Thankfully my
taxonomy is static (but somewhat large). There is a PR to do server-side
mappings, but it is quite old and I don't think it would apply to
aggregations.

An alternative solution would be to create compound values such as
58:Car Rental and decompose the value on the client side, but this
would create a string aggregation, which could have slower performance.
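
A minimal sketch of the client-side decomposition, assuming the compound
key is built as category_id:category_name (for example 58:Car Rental) and
that the buckets arrive in the usual terms-aggregation shape with key and
doc_count. The helper names are hypothetical.

```python
def split_compound_key(key, sep=":"):
    """Split 'id:name' into (int id, name); only the first separator counts."""
    category_id, _, category_name = key.partition(sep)
    return int(category_id), category_name


def to_category_stats(buckets):
    """Turn terms-aggregation buckets into the desired category_stats list."""
    stats = []
    for bucket in buckets:
        category_id, category_name = split_compound_key(bucket["key"])
        stats.append({
            "category_name": category_name,
            "category_id": category_id,
            "offer_count": bucket["doc_count"],
        })
    return stats


# Sample buckets built from the ids, names and counts in the question.
buckets = [
    {"key": "58:Car Rental", "doc_count": 48885},
    {"key": "1008:Fast Food", "doc_count": 44530},
]
category_stats = to_category_stats(buckets)
```

Splitting on the first separator only keeps category names that themselves
contain a colon intact.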

Cheers,

Ivan

On Fri, Oct 24, 2014 at 5:50 PM, Cody Stringham cs.nega...@gmail.com
wrote:

 Hey everyone,

 These aggregations are working out great, but I need to return more than
 one value in the bucket so we can use them in our API. The basic idea is
 that we aggregate all of the category id's, but we also want the
 category_name to be included in that same bucket for ease of use.


 *Mapping:*
 "categories": {
   "properties": {
     "category_name": {
       "analyzer": "keyword",
       "type": "string"
     },
     "category_id": {
       "type": "integer"
     },
     "parent_id": {
       "type": "integer"
     }
   }
 }

 *Aggs:*
 "aggs": {
   "categories": {
     "terms": {
       "size": 130,
       "field": "categories.category_id"
     }
   },

 *Returns (actual):*

 "category_stats": [
   {
     "category_id": 58,
     "offer_count": 48885
   },
   {
     "category_id": 1008,
     "offer_count": 44530
   },

   ...

 *Returns (desired):*

 "category_stats": [
   {
     "category_name": "Car Rental",
     "category_id": 58,
     "offer_count": 48885
   },
   {
     "category_name": "Fast Food",
     "category_id": 1008,
     "offer_count": 44530
   },

   ...



Re: Aggregation framework, Java API

2014-09-15 Thread Emanuel Buzek
Ivan, thanks a lot for the reply, I switched to FilteredQuery (using 
matchAll when no query is submitted) and that simplified my code a lot. It 
also makes more sense than using post filters and filtered aggregations...
best, emanuel


Re: Aggregation framework, Java API

2014-09-09 Thread Emanuel Buzek
Thanks Ivan.

Yes, it was the post filter which was ignored. We use a filtered query only 
when the user sends a query string; otherwise (when only exact filters for 
specific columns are specified) we use the post filter. It seems strange to 
me to use the FilteredQuery when the query string is empty, but perhaps 
that would be the most straightforward way of doing this.

thank you,
emanuel



Re: Aggregation framework, Java API

2014-09-09 Thread Ivan Brusic
A filtered query with no explicit query will ultimately be translated into
a match-all/constant-score query at the Lucene level. I prefer to
explicitly define all my match-all queries and to use the specific post
filter name rather than the old filter name, which was deprecated due to
its ambiguity.

Besides, even if you did not have aggregations, you want to pre-filter as
much as you can. Post filters work on documents that have already been
scored, and there is no need to score documents that will eventually be
filtered out. Post filters have some benefits, but it seems they do not
apply in this case.
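
The cost argument can be illustrated with a toy model that counts how many
documents get scored in each mode. This is a plain Python sketch, not the
Lucene execution path; run_search, keep and score are hypothetical names.

```python
def run_search(docs, keep, score, prefilter):
    """Return (results, number_of_scored_docs) for either filtering mode."""
    scored = 0
    results = []
    for doc in docs:
        if prefilter and not keep(doc):
            continue  # dropped before scoring, so no scoring cost
        value = score(doc)
        scored += 1
        if not prefilter and not keep(doc):
            continue  # post filter: dropped after it was already scored
        results.append((doc, value))
    return results, scored


docs = list(range(10))               # ten toy "documents"
keep = lambda doc: doc % 2 == 0      # the filter keeps even ids
score = lambda doc: 1.0 / (1 + doc)  # arbitrary scoring function

pre_results, pre_scored = run_search(docs, keep, score, prefilter=True)
post_results, post_scored = run_search(docs, keep, score, prefilter=False)
```

Both modes return the same hits here; the only difference is how many
documents were scored on the way.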

-- 
Ivan



Re: Aggregation framework, Java API

2014-09-08 Thread mooky
The aggregation takes a query into account, but not a post-filter. I'm not 
sure of the rationale behind the difference.

The Java API for traversing results is quite painful, but I think a good 
part of that is due to Java and the fact that there is very little 
polymorphic behaviour between aggregation results (some have single 
results, others have buckets, some have sub-aggregations, some don't).
The only alternative that I can think of is a completely type-less 
navigation of the data, which does little more than navigate the JSON 
document.

Hope that helps a bit.



On Monday, 8 September 2014 10:26:44 UTC+1, Emanuel Buzek wrote:

 Hi there,
 I just used the elasticsearch aggregations through the Java API for the 
 first time.

 All I wanted was a simple min/max/sum/avg, so I used the Stats 
 aggregation. However, I was very surprised that the filter in the 
 SearchRequestBuilder is ignored, so I had to wrap the Stats Aggregation 
 into FilterAggregation.

 Getting the aggregation result seems a bit tedious:

 InternalStats stats = 
 (InternalStats)((InternalFilter)a).getAggregations().asList().get(0);

 Maybe I am using the Java API wrong (I hope I am, otherwise it's imho 
 poorly designed.) Can anyone point me to an example how to access the 
 aggregation results from Java better?


 Also, I think that the aggregation should be filtered by default. If I 
 specify the filter with a query or post filter:

 queryBuilder = QueryBuilders.filteredQuery(queryBuilder, filterBuilder);

   searchRequestBuilder.setQuery(queryBuilder);

 and then add an aggregation GET to the same searchRequestBuilder, I think 
 it's very unintuitive if the aggregation is computed globally. Does anyone 
 else feel the same?

 thanks, emanuel

 -- 
 Emanuel Buzek
 Software Engineer, ROKE.cz http://www.roke.cz
 tel: +420 776 54 26 26
  



Re: Aggregation framework, Java API

2014-09-08 Thread Ivan Brusic
Which filter was ignored? I am assuming you meant the post filter (which
might still be called filter in the Java API); in that case the filter is
bypassed by design. Post filters let you filter the documents returned,
but leave the aggregations as they are. It sounds like you are looking for
filtered queries. The method name is ambiguous, which is why it has been
renamed (and should actually be deprecated in the API).
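
The difference can be shown with a small toy model in plain Python (not
the Elasticsearch API): the query narrows both the hits and the
aggregation input, while the post filter narrows only the hits. The
search function, the sample documents and the price field are hypothetical.

```python
def search(docs, query=None, post_filter=None, agg_field=None):
    """Toy search: returns (hits, stats) with post-filter semantics."""
    # The query decides which documents the aggregation sees.
    matched = [d for d in docs if query is None or query(d)]
    stats = None
    if agg_field and matched:
        values = [d[agg_field] for d in matched]
        stats = {
            "min": min(values),
            "max": max(values),
            "sum": sum(values),
            "avg": sum(values) / len(values),
        }
    # The post filter is applied afterwards and only affects the hits.
    hits = [d for d in matched if post_filter is None or post_filter(d)]
    return hits, stats


docs = [
    {"color": "red", "price": 10},
    {"color": "red", "price": 30},
    {"color": "blue", "price": 100},
]
is_red = lambda d: d["color"] == "red"

hits_pf, stats_pf = search(docs, post_filter=is_red, agg_field="price")
hits_q, stats_q = search(docs, query=is_red, agg_field="price")
```

With the post filter the stats still include the blue document; with the
same condition as a query they cover only the red ones.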

The best way to learn the Java API is via the unit tests, but I do agree,
there is no clean way to write elegant code due to the explicit casting.

https://github.com/elasticsearch/elasticsearch/tree/master/src/test/java/org/elasticsearch/search/aggregations

Cheers,

Ivan




Re: aggregation of hierchical elements possible?

2014-09-02 Thread vineeth mohan
Hello Markus,

I can't think of any straightforward method, but you can try the
following:


   1. Apply a source transform script to convert /a/b/c to [ /a, /a/b,
   /a/b/c ] -
   
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-transform.html#mapping-transform
   2. Now apply a normal terms aggregation.
   3. Your queries on this field will then also match /a and /a/b, so
   keep a raw field too.
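
The expansion in step 1 can be sketched in plain Python. Unlike the
transform described above, this version keeps only the folders and drops
the file name, so that the counts match the /a = 3 files, /a/b = 1 file
output asked for in the question. The function names are hypothetical.

```python
import posixpath
from collections import Counter


def ancestor_folders(path):
    """Return every folder above a file, outermost first."""
    folders = []
    folder = posixpath.dirname(path)
    while folder not in ("", "/"):
        folders.append(folder)
        folder = posixpath.dirname(folder)
    return folders[::-1]


def files_per_folder(paths):
    """Count each file once for every folder that contains it."""
    counts = Counter()
    for path in paths:
        counts.update(ancestor_folders(path))
    return counts


paths = ["/a/file1", "/a/file2", "/a/b/file3"]
counts = files_per_folder(paths)
```

Indexing the expanded list as a multi-valued field would let a terms
aggregation produce the same per-folder counts.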

Thanks
  Vineeth


On Mon, Sep 1, 2014 at 9:21 PM, skippi1 skip...@gmx.de wrote:

 The index has a field named path which contains the canonical file name,
 e.g.:

 /a/file1
 /a/file2
 /a/b/file3

 Is it possible to create an bucket aggregation to summarize all file per
 path including subfolders?

 Something like that:

 /a = 3 files
 /a/b = 1 file

 regars,
 markus







Re: aggregation of hierchical elements possible?

2014-09-02 Thread Markus Breuer
Hello Vineeth,

Thanks for your response. Your proposal 1 seems to be similar to the 
path tokenizer which I already use, doesn't it?

"settings": {
  "index": {
    "analysis": {
      "analyzer": {
        "path-analyzer": {
          "type": "custom",
          "tokenizer": "path-tokenizer"
        }
      },
      "tokenizer": {
        "path-tokenizer": {
          "type": "path_hierarchy",
          "delimiter": "/"
        }
      }
    }
  }
}

But when I use the terms aggregation, the result is not correct. The 
following query should aggregate per folder and sum the length of all 
files in that folder and its subfolders. The query returns some results, 
but they do not seem to be complete. Can you explain how you would apply 
point 3 of your proposal?

{
  "aggs": {
    "file_count": {
      "terms": {
        "field": "path",
        "order": {
          "_term": "asc"
        }
      },
      "aggs": {
        "file_size": {
          "sum": {
            "field": "length"
          }
        }
      }
    }
  },
  "size": 0
}

These are my mappings:

{
  "properties": {
    "path": {
      "type": "string",
      "analyzer": "path-analyzer"
    },
    "full_path": {
      "type": "string",
      "index": "not_analyzed"
    },
    "is_dir": {
      "type": "boolean"
    }
  }
}
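
To make the expected buckets concrete, here is a toy model in plain Python 
of what a path_hierarchy tokenizer emits and of the per-token sum the 
query above asks for. It is a sketch, not the Elasticsearch 
implementation; the function names and the length values are made up.

```python
from collections import defaultdict


def path_hierarchy_tokens(path, delimiter="/"):
    """Emit every prefix of the path, the full path included."""
    parts = path.split(delimiter)
    tokens = []
    for end in range(1, len(parts) + 1):
        token = delimiter.join(parts[:end])
        if token:  # skip the empty prefix before the leading delimiter
            tokens.append(token)
    return tokens


def sum_length_per_token(files):
    """Sum the length of every file under each token (bucket)."""
    totals = defaultdict(int)
    for entry in files:
        for token in path_hierarchy_tokens(entry["path"]):
            totals[token] += entry["length"]
    return dict(totals)


# Hypothetical file sizes for the three paths from the original question.
files = [
    {"path": "/a/file1", "length": 10},
    {"path": "/a/file2", "length": 20},
    {"path": "/a/b/file3", "length": 5},
]
totals = sum_length_per_token(files)
```

In this model every file path becomes a bucket of its own next to the 
folder buckets. As a guess only: since a terms aggregation returns a 
limited number of buckets unless size is raised, that could be one reason 
a result looks incomplete.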

regards,
markus




Re: Aggregation across indices

2014-08-26 Thread vineeth mohan
Hello Sandeep,

What you intend is not possible.
Elasticsearch does, however, have some good relational features, which
need to be defined before indexing.
If you can elaborate on your use case, we can help with this.
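
One way to read "defined before indexing" is to combine the two partial
records before they are indexed, so that Field4 lives in the same document
as the rest. The sketch below shows that idea in plain Python; it is an
assumption about a possible approach, not necessarily the relational
feature meant above, and merge_by_id and the sample values are
hypothetical.

```python
from collections import Counter


def merge_by_id(first_parts, second_parts):
    """Merge partial records that share the same ID into one document."""
    merged = {}
    for record in first_parts + second_parts:
        merged.setdefault(record["ID"], {}).update(record)
    return list(merged.values())


# Sample records following the field layout from the question.
index1 = [
    {"ID": "1", "Field1": "x", "Field2": "y"},
    {"ID": "2", "Field1": "p", "Field2": "q"},
]
index2 = [
    {"ID": "1", "Field3": "m", "Field4": "red"},
    {"ID": "2", "Field3": "n", "Field4": "red"},
]

documents = merge_by_id(index1, index2)
# With everything in one document, counting by Field4 needs no join.
field4_counts = Counter(doc["Field4"] for doc in documents if "Field4" in doc)
```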

Thanks
   Vineeth


On Tue, Aug 26, 2014 at 6:04 PM, 'Sandeep Ramesh Khanzode' via
elasticsearch elasticsearch@googlegroups.com wrote:

 Hi,

 If I have two indices each having part of the record and joined using some
 common identifier, can I issue a query across both indices and have
 aggregations apply taking into consideration both indices?

 Example:
 Index 1: Type 1:
 ID: String
 Field1: String
 Field2: String

 Index 2: Type 2:
 ID: String (From above. I can keep this same to behave like a foreign key.)
 Field3: String
 Field4: String

 Can I effect a join across both indices and aggregate on Field4 for
 example?

 Please let me know. Thanks,
 Sandeep



Re: Aggregation

2014-08-15 Thread Yuheng Du



I am using:

https://lh4.googleusercontent.com/-cJH7ZK_Aw_I/U-4rNlfDRwI/ANA/tMq8twaPRcw/s1600/12.jpg

and I got the following errors:

https://lh5.googleusercontent.com/-lQYfexkUZ2U/U-4rqz__L8I/ANI/SNisnl0duhc/s1600/34.jpg

Can anyone tell me what is going wrong?

Thanks!



Re: Aggregation

2014-08-15 Thread chenlin rao
What is the mapping type of your `deviceId` field? Make sure it is numeric,
since you are using it in a percentiles aggregation.




Re: Aggregation query

2014-08-15 Thread vineeth mohan
Hello Ivan,

This is expected.
Only the top N results (N being the size given in the aggregation) are taken
from each shard before the results are reduced.
Because of this, the accuracy is not guaranteed, but the order is guaranteed.
As a fix, you can use this to improve accuracy at the cost of memory:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html#CO16-1
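As a rough illustration of the per-shard truncation described above, here is a small Python simulation. It is not Elasticsearch code: the shard contents are invented, and `top_n_merge` is a hypothetical helper that only mimics "each shard reports its top N terms, then the results are merged".

```python
from collections import Counter

def top_n_merge(shards, shard_size, size):
    """Merge per-shard term counts when every shard only reports
    its own top `shard_size` terms, then keep the overall top `size`."""
    merged = Counter()
    for shard in shards:
        for term, count in Counter(shard).most_common(shard_size):
            merged[term] += count
    return merged.most_common(size)

# Invented term counts on two shards.
shards = [
    {"a": 10, "b": 5, "c": 1},
    {"a": 8, "c": 4, "b": 3},
]

# Every shard reports all of its terms: counts are exact.
exact = top_n_merge(shards, shard_size=3, size=2)
# Every shard reports only its top 2 terms: "b" loses the 3 docs on shard 2.
approx = top_n_merge(shards, shard_size=2, size=2)

print(exact)
print(approx)
```

In this made-up data the order of the top terms survives, but the count for "b" is under-reported once each shard truncates its list, which is the kind of inaccuracy that disappears when there is only one shard.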

Thanks
   Vineeth


On Fri, Aug 15, 2014 at 9:16 PM, Ivan Stone reachout...@gmail.com wrote:

 When I run the following query on a 5-shard ES db, I don't get accurate
 results. I have had to reduce the number of shards on my ES server to 1 to
 get the accuracy I need. Has anyone had a similar issue?

 GET /incidents/_search?search_type=count
 {
   "query": {
     "filtered": {
       "filter": {
         "bool": {
           "must": [
             {
               "range": {
                 "Date": {
                   "from": "2014-08-05T00:00:00.000Z",
                   "to": "2014-08-06T00:00:00.000Z"
                 }
               }
             },
             {
               "exists": { "field": "AttackTypes" }
             }
           ]
         }
       }
     }
   },
   "aggs": {
     "by_attackType": {
       "terms": {
         "field": "AttackTypes",
         "order": { "_count": "desc" },
         "shard_size": 0,
         "size": 10
       },
       "aggs": {
         "by_perpertrator": {
           "terms": {
             "field": "Perpetrators",
             "order": { "_term": "asc" },
             "min_doc_count": 0,
             "shard_size": 0,
             "size": 0
           }
         }
       }
     }
   }
 }



Re: Aggregation

2014-08-14 Thread 饶琛琳
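No problem:

#!/usr/bin/perl
use 5.010;
use strict;
use warnings;
use Data::Dumper;
use Search::Elasticsearch;

# Connect to the cluster nodes.
my $e = Search::Elasticsearch->new(
    nodes => [
        '10.13.57.35:9200',
        '10.13.57.36:9200'
    ]
);

# Run a percentiles aggregation; size => 0 suppresses the hits.
my $r = $e->search(
    index => 'logstash-2014.08.15',
    body  => {
        aggs => {
            aggsname => {
                percentiles => {
                    field => 'h5_view_loadtime'
                }
            }
        },
        size => 0
    }
);

# Explicit dereference instead of the experimental `values $hashref` form.
say for values %{ $r->{'aggregations'}->{'aggsname'} };

On 15 August 2014, at 6:52 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: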

 Does the Perl module for Elasticsearch allow aggregation syntax?
 
 I ran a few tests, but they failed.
 


Re: Aggregation on boolean

2014-08-11 Thread Fabian Köstring
Sorry!
It was caused by different indexes.

On Monday, 11 August 2014 10:18:25 UTC+2, Fabian Köstring wrote:

 Hey there!

 I have one index with two types. I want to run an aggregation query.
 This is my query.


 GET index1/type1,type2/_search
 {
   "query": {
     "match_all": {}
   },
   "size": 0,
   "aggs": {
     "myaggregation": {
       "terms": {
         "field": "boolean_field"
       }
     }
   }
 }

 When I execute this query my results are a bit strange.

 "aggregations": {
   "myaggregation": {
     "buckets": [
       {
         "key": "F",
         "doc_count": 12032
       },
       {
         "key": "T",
         "doc_count": 3388
       },
       {
         "key": 0,
         "doc_count": 1394
       },
       {
         "key": 1,
         "doc_count": 259
       }
     ]
   }
 }


 Why are the buckets split into T, F, 0 and 1 instead of being combined
 (F = 0, T = 1)?





Re: Aggregation on parent/child documents

2014-07-25 Thread Adrien Grand
Hi Thomas,

None of the aggregations that we have today can leverage parent/child
relations. However, there is a `children` aggregation in the pipeline:
https://github.com/elasticsearch/elasticsearch/pull/6936
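For readers following along later, here is a minimal sketch of what a request using the `children` aggregation looks like, built as a Python dict. It follows the syntax in the reference documentation of the release that eventually shipped it, so treat it as an assumption rather than what the pull request contained at the time of this message. `children_agg_body` and the example type and field names are hypothetical.

```python
import json

def children_agg_body(parent_field, child_type, child_field):
    """Bucket parent documents by `parent_field`, then switch to their
    children of type `child_type` and bucket those by `child_field`."""
    return {
        "size": 0,
        "aggs": {
            "by_parent": {
                "terms": {"field": parent_field},
                "aggs": {
                    "to_children": {
                        "children": {"type": child_type},
                        "aggs": {
                            "by_child": {"terms": {"field": child_field}}
                        },
                    }
                },
            }
        },
    }

# Hypothetical mapping: "answer" documents are children of "question" documents.
body = children_agg_body("tags", "answer", "owner")
print(json.dumps(body, indent=2))
```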


On Fri, Jul 25, 2014 at 1:54 PM, Thomas thomas.bo...@gmail.com wrote:

 Hi,

 I wanted to ask whether it is possible to perform aggregations combining
 parent/child documents, something similar to the nested aggregation and
 the reverse nested aggregation. It would be very helpful to be able to
 create, for instance, buckets based on parent document fields and get back
 aggregations that contain fields of both parent and child documents
 combined.

 Any thoughts, or future features planned for upcoming releases, related to
 the above?

 Thank you
 Thomas





-- 
Adrien Grand



Re: Aggregation on parent/child documents

2014-07-25 Thread Thomas
Hi Adrien, and thank you for the reply.

This is exactly what I had in mind, along with a reverse equivalent similar
to reverse_nested. As far as I can see, this is planned for version 1.4.0
onwards; I will keep track of any updates on this. Thanks.

Thomas



Re: [Aggregation] Be able to count number of item in a sub-collection

2014-06-27 Thread Grégoire Pineau
Yes and no ;) I would also like to be able to filter the items in the
collection, and then count them.
In fact, the collection contains orders, and I want to be able to know how
many paid orders I get for a user.
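One possible way to get such a filtered count at query time is a filter aggregation inside a nested aggregation. The sketch below only builds the request body as a Python dict. It assumes the orders are mapped as a nested field, and the names `user_id`, `orders` and `orders.status` are hypothetical, since the thread does not show the mapping.

```python
import json

def paid_orders_per_user_body(user_field="user_id", orders_path="orders",
                              status_field="orders.status", paid_value="paid"):
    """One bucket per user; inside each bucket, the number of nested
    order objects whose status equals `paid_value`."""
    return {
        "size": 0,
        "aggs": {
            "by_user": {
                "terms": {"field": user_field},
                "aggs": {
                    "orders": {
                        "nested": {"path": orders_path},
                        "aggs": {
                            "paid": {
                                "filter": {"term": {status_field: paid_value}}
                            }
                        },
                    }
                },
            }
        },
    }

body = paid_orders_per_user_body()
print(json.dumps(body, indent=2))
```

With a body like this, the `doc_count` of each `paid` bucket would be the number of matching order objects for that user.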

On Friday, June 27, 2014 8:40:49 AM UTC+2, Timber wrote:

 Could you not add a count of items in the collection at index time? In 
 this case you could filter on this value. 



Re: Aggregation Framework, possible to get distribution of requests per user

2014-06-24 Thread David Pilato
Imagine that you have indexed users.
Each user has a numberOfDocs field.

You can build a range aggregation on top of that, which gives back the count
for buckets like:

numberOfDocs < 2
1 < numberOfDocs < 3
…

See 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-range-aggregation.html#search-aggregations-bucket-range-aggregation
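A minimal sketch of such a range aggregation, built as a Python dict. The field name `numberOfDocs` comes from the message above; the helper `range_buckets_body`, the aggregation name `docs_per_user` and the chosen boundaries are assumptions for illustration. In a range aggregation, `from` is inclusive and `to` is exclusive.

```python
import json

def range_buckets_body(field, boundaries):
    """Build a range aggregation with buckets
    (-inf, b0), [b0, b1), ..., [bn, +inf) over `field`."""
    ranges = [{"to": boundaries[0]}]
    ranges += [{"from": lo, "to": hi}
               for lo, hi in zip(boundaries, boundaries[1:])]
    ranges.append({"from": boundaries[-1]})
    return {
        "size": 0,
        "aggs": {
            "docs_per_user": {
                "range": {"field": field, "ranges": ranges}
            }
        },
    }

body = range_buckets_body("numberOfDocs", [2, 3, 10])
print(json.dumps(body, indent=2))
```

Each returned bucket would then carry the number of user documents whose numberOfDocs falls into that range.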


-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 24 June 2014 at 12:32:16, Thomas (thomas.bo...@gmail.com) wrote:

Hi,

I wanted to ask whether the aggregation framework can give me the
distribution of one specific type of document sent per user. I'm interested
in the number of documents per user, e.g.:

1000 users sent 1 document
500 users sent 2 documents
X unique users sent Y documents (each)
etc.

I index the user_id on each document.

Is there a way to support such a query, or partially support it, e.g. to get
the first 10 rows of this kind of list rather than the exhaustive list? Can
you give me some hint?

Thanks


Re: Aggregation Framework, possible to get distribution of requests per user

2014-06-24 Thread Thomas
Hi David 

Thank you for your reply. So, based on your suggestion, I should maintain a
document (e.g. user) with some aggregated values and update it as we go
along indexing our data, correct?

This, though, would only give me totals; I cannot apply something like a
range. I also found a similar discussion here:
https://groups.google.com/forum/#!msg/elasticsearch/UsrCG2Abj-A/IDO9DX_PoQwJ.
Maybe something similar to the terms and histogram aggregations could
support this logic, with a request such as:

{
    "aggs": {
        "requests_distribution": {
            "distribution": {
                "field": "user_id",
                "interval": 50
            }
        }
    }
}

and the result could be:

{
    "aggregations": {
        "requests_distribution": {
            "buckets": [
                {
                    "key": 0,
                    "doc_count": 2
                },
                {
                    "key": 50,
                    "doc_count": 400
                },
                {
                    "key": 150,
                    "doc_count": 30
                }
            ]
        }
    }
}

Here the key would represent a number of unique users, e.g. 0 to 50 users
have 2 documents per user, etc.

Just an idea

Thanks
Thomas



Re: Aggregation Framework, possible to get distribution of requests per user

2014-06-24 Thread David Pilato
I was only thinking out loud. I mean that I don't know what your model looks
like. Maybe you could illustrate your use case with some actual data, and we
can move forward from there?

What kind of documents are you actually indexing and searching for? What
fields do you have?


-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr




Re: Aggregation Framework, possible to get distribution of requests per user

2014-06-24 Thread Thomas
My mistake sorry,

Here is an example:

I have the request document:

"request": {
    "dynamic": "strict",
    "properties": {
        "time": {
            "format": "dateOptionalTime",
            "type": "date"
        },
        "user_id": {
            "index": "not_analyzed",
            "type": "string"
        },
        "country": {
            "index": "not_analyzed",
            "type": "string"
        }
    }
}

I want to find the number of (unique) user_ids that have X number of 
documents, e.g. for country US, and ideally I need the full list e.g.:


1000 users have 43 documents
..
100 users have 234 documents
150 users have 500 documents
etc..

In other words, I want the distribution of documents (requests) per unique
user count. Of course, I understand that this is a pretty heavy operation in
terms of memory, but we could limit it to the top 100 rows, for instance, or
work around it some other way.
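One possible workaround is to compute the distribution on the client side, sketched below in Python. The idea is to run a terms aggregation on `user_id` (one bucket per user, with `doc_count` being that user's number of requests) and then count how many users share each `doc_count`. This is not a built-in Elasticsearch feature. The helper `docs_per_user_distribution`, the aggregation name `per_user` and the sample buckets are hypothetical, and `"size": 0` on the terms aggregation is used in the same 1.x style as other queries on this list to ask for all terms.

```python
from collections import Counter

def docs_per_user_distribution(user_buckets, top=100):
    """Turn terms-aggregation buckets (one per user_id) into
    (documents_per_user, number_of_users) rows, most common first."""
    histogram = Counter(bucket["doc_count"] for bucket in user_buckets)
    return histogram.most_common(top)

# Sketch of a request body giving one bucket per user for country US.
request_body = {
    "size": 0,
    "query": {"filtered": {"filter": {"term": {"country": "US"}}}},
    "aggs": {"per_user": {"terms": {"field": "user_id", "size": 0}}},
}

# Invented buckets, shaped like response["aggregations"]["per_user"]["buckets"].
sample_buckets = [
    {"key": "u1", "doc_count": 2},
    {"key": "u2", "doc_count": 1},
    {"key": "u3", "doc_count": 2},
    {"key": "u4", "doc_count": 2},
    {"key": "u5", "doc_count": 1},
]

rows = docs_per_user_distribution(sample_buckets)
for docs, users in rows:
    print(f"{users} users have {docs} documents")
```

The memory concern still applies, because the terms aggregation has to return a bucket for every user before the client can fold them into the distribution.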

Thanks again for your time
Thomas



Re: Aggregation equivalent of Facet global : true ?

2014-06-11 Thread Adrien Grand
You can do this by running a `global` aggregation:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-global-aggregation.html
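A minimal sketch of how the two aggregations could be combined for the dropdown described in the question, written in Python. The `global` aggregation ignores the query and lists every term, while the sibling terms aggregation is scoped to the current query. The helper names, the aggregation names `all_values` and `matching_values`, the field name `commodity` and the sample response are assumptions for illustration.

```python
def global_and_contextual_body(user_query, field):
    """Request body with a query-independent terms list (inside `global`)
    and a terms aggregation scoped to the user's query."""
    return {
        "query": user_query,
        "aggs": {
            "all_values": {
                "global": {},
                "aggs": {"values": {"terms": {"field": field}}},
            },
            "matching_values": {"terms": {"field": field}},
        },
    }

def merge_counts(aggregations):
    """List every known term with its count under the current query (0 if absent)."""
    all_keys = [b["key"] for b in aggregations["all_values"]["values"]["buckets"]]
    counts = {b["key"]: b["doc_count"]
              for b in aggregations["matching_values"]["buckets"]}
    return [(key, counts.get(key, 0)) for key in sorted(all_keys)]

body = global_and_contextual_body({"match_all": {}}, "commodity")

# Invented response fragment using the commodities from the question.
sample = {
    "all_values": {"doc_count": 500, "values": {"buckets": [
        {"key": "Aluminium", "doc_count": 40},
        {"key": "Copper", "doc_count": 150},
        {"key": "Gold", "doc_count": 20},
        {"key": "Lead", "doc_count": 260},
        {"key": "Zinc", "doc_count": 30},
    ]}},
    "matching_values": {"buckets": [
        {"key": "Lead", "doc_count": 243},
        {"key": "Copper", "doc_count": 110},
        {"key": "Gold", "doc_count": 6},
    ]},
}

choices = merge_counts(sample)
for name, count in choices:
    print(f"{name} ({count})")
```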


On Wed, Jun 11, 2014 at 12:02 PM, mooky nick.minute...@gmail.com wrote:


 Is there a way of specifying the scope of an aggregation (if there is, I
 can't seem to find it)?

 I want to achieve the equivalent of a Facet global : true.

 Do I need to use facets instead of aggregations in this case?

 I am just doing term aggregations - to give the user a dropdown list to
 filter by, say, commodity - and provide choices like:
 Aluminium (0)
 Copper (110)
 Gold (6)
 Lead (243)
 Zinc (0)

 I want to do a global aggregation to get all the possible terms (ie all
 possible filter values).
 I want to do a contextual (based on current user query) term aggregation
 to get the counts.

 How do I do this with Aggregations?

 Thanks





-- 
Adrien Grand



Re: Aggregation equivalent of Facet global : true ?

2014-06-11 Thread mooky
Aha. I missed that. Many thanks.



Re: Aggregation bug? Or user error?

2014-06-06 Thread mooky
 
OK. I have written a test case that (if run enough times) will reproduce it.
It's an intermittent bug.
I have raised an issue:
https://github.com/elasticsearch/elasticsearch/issues/6435



Re: Aggregation vs Search/Filter discrepancy - caching issue?

2014-06-04 Thread mooky
Turns out it was user error. Please ignore.




Re: Aggregation bug? Or user error?

2014-06-04 Thread mooky
Bah.
I thought I had a simple unit test that was reliably recreating it, but it
would appear not. It's still very intermittent, and my test never seems to
fail when run on its own.





Re: Aggregation vs Search/Filter discrepancy - caching issue?

2014-06-03 Thread mooky
Updated Elasticsearch to 1.2 - still seeing the same issue...



Re: Aggregation bug? Or user error?

2014-06-03 Thread mooky
I have managed to produce a unit test that exposes this (albeit with
different data from the above).
The index is quite small, and the data fictional, so there's no problem
sending you the index.

Here is a result I get, where we can see that the sub-aggregations have
higher counts than the parent:
{
    "sales_quotas": {
        "doc_count": 6,
        "shipmentDate": {
            "buckets": [
                {
                    "key": "Overdue",
                    "to": 1.3989024E12,
                    "to_as_string": "2014-05-01",
                    "doc_count": 2,
                    "nothingAllocated": {
                        "doc_count": 6,
                        "ME": {
                            "doc_count": 0
                        },
                        "NOT_ME": {
                            "doc_count": 6
                        }
                    }
                }
            ]
        }
    }
}
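A check like the following can flag this kind of result automatically in a test. It is only a sketch: `find_count_violations` is a hypothetical helper, and the rule "a sub-aggregation never counts more documents than its enclosing bucket" is assumed to hold for filter-style sub-aggregations such as the ones here, not for aggregations like `global` or `nested` that change the document scope.

```python
def find_count_violations(node, parent_count=None, path=""):
    """Return (path, doc_count, parent_count) for every aggregation node
    whose doc_count exceeds that of its nearest enclosing bucket."""
    violations = []
    count = node.get("doc_count", parent_count)
    if (parent_count is not None and "doc_count" in node
            and node["doc_count"] > parent_count):
        violations.append((path, node["doc_count"], parent_count))
    for name, child in node.items():
        if isinstance(child, dict):
            # Named sub-aggregation.
            violations += find_count_violations(child, count, f"{path}/{name}")
        elif name == "buckets":
            # Bucket list of a multi-bucket aggregation.
            for bucket in child:
                label = f"{path}[{bucket.get('key')}]"
                violations += find_count_violations(bucket, count, label)
    return violations

# The result from the message above.
response = {
    "sales_quotas": {
        "doc_count": 6,
        "shipmentDate": {
            "buckets": [
                {
                    "key": "Overdue",
                    "to": 1.3989024E12,
                    "to_as_string": "2014-05-01",
                    "doc_count": 2,
                    "nothingAllocated": {
                        "doc_count": 6,
                        "ME": {"doc_count": 0},
                        "NOT_ME": {"doc_count": 6},
                    },
                }
            ]
        },
    }
}

problems = find_count_violations(response)
for path, got, limit in problems:
    print(f"{path}: doc_count {got} exceeds parent doc_count {limit}")
```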








On Tuesday, 6 May 2014 10:34:53 UTC+1, mooky wrote:

 I am using elastic 1.1.1.
 The index isn't huge (600m), but it contains financially sensitive data, so
 it would be too problematic legally to allow it offsite. I can try to
 anonymise the data and see if the issue can be reproduced that way; I might
 learn something about what is causing it.





 On Friday, 2 May 2014 14:34:21 UTC+1, Adrien Grand wrote:

 What version of Elasticsearch are you using? If it is small enough, I 
 would also be interested if you could share your index so that I can try to 
 reproduce the issue locally.


 On Fri, May 2, 2014 at 12:07 PM, mooky nick.mi...@gmail.com wrote:

  
 I haven't been able to figure out what is required to recreate it.
 I am doing a number of identical aggregations (just with different values
 for intentMarketCode and intentDate).
 Three aggregations give correct numbers - one doesn't. I haven't figured
 out why.
  

 On Wednesday, 30 April 2014 14:13:00 UTC+1, Adrien Grand wrote:

 This looks wrong indeed. By any chance, would you have a curl 
 recreation of this issue?


 On Tue, Apr 29, 2014 at 7:35 PM, mooky nick.mi...@gmail.com wrote:

 It looks like a bug to me - but if it's user error, then obviously I
 can fix it a lot quicker :)
  

 On Tuesday, 29 April 2014 13:04:53 UTC+1, mooky wrote:

  I am seeing some very odd aggregation results - where the sum of 
 the sub-aggregations is more than the parent bucket.

 Results:
 "CSSX": {
   "doc_count": 24,
   "intentDate": {
     "buckets": [ {
       "key": "Overdue",
       "to": 1.3981248E12,
       "to_as_string": "2014-04-22",
       "doc_count": 1,
       "ME": {
         "doc_count": 0
       },
       "NOT_ME": {
         "doc_count": 24
       }
     }, {
       "key": "May",
       "from": 1.3981248E12,
       "from_as_string": "2014-04-22",
       "to": 1.4006304E12,
       "to_as_string": "2014-05-21",
       "doc_count": 23,
       "ME": {
         "doc_count": 0
       },
       "NOT_ME": {
         "doc_count": 24
       }
     }, {
       "key": "June",
       "from": 1.4006304E12,
       "from_as_string": "2014-05-21",
       "to": 1.4033088E12,
       "to_as_string": "2014-06-21",
       "doc_count": 0,
       "ME": {
         "doc_count": 0
       },
       "NOT_ME": {
         "doc_count": 24
       }
     } ]
   }
 },


 I wouldn't have thought that to be possible at all.
 Here is the request that generated the dodgy results.


 "CSSX": {
   "filter": {
     "and": {
       "filters": [ {
         "type": {
           "value": "inventory"
         }
       }, {
         "term": {
           "isAllocated": false
         }
       }, {
         "term": {
           "intentMarketCode": "CSSX"
         }
       }, {
         "terms": {
           "groupCompanyId": [
             "0D13EF2D0E114D43BFE362F5024D8873",
             "0D593DE0CFBE49BEA3BF5AD7CD965782",
             "1E9C36CC45C64FCAACDEE0AF4FB91FBA",
             "33A946DC2B0E494EB371993D345F52E4",
             "6471AA50DFCF4192B8DD1C2E72A032C7",
             "9FB2FFDC0FF0797FE04014AC6F0616B6",
             "9FB2FFDC0FF1797FE04014AC6F0616B6",
             "9FB2FFDC0FF2797FE04014AC6F0616B6",
             "9FB2FFDC0FF3797FE04014AC6F0616B6",
             "9FB2FFDC0FF5797FE04014AC6F0616B6",
             "9FB2FFDC0FF6797FE04014AC6F0616B6",
             "AFE0FED33F06AFB6E04015AC5E060AA3"
           ]
         }
       }, {
         "not": {
           "filter": {
             "terms": {
               "status": [ "Cancelled", "Completed" ]
             }
           }
         }
       } ]
     }
   },
   "aggregations": {
     "intentDate": {
       "date_range": {
         "field": "intentDate",
         "ranges": [ {
           "key": "Overdue",
           "to": "2014-04-22"
         }, {
           "key": "May",
           "from": "2014-04-22",
           "to": "2014-05-21"
         }, {
           "key": "June",
           "from": "2014-05-21",
           "to": "2014-06-21"
         } ]
       },
   

Re: Aggregation bug? Or user error?

2014-06-03 Thread mooky

By the way this test fails with elastic 1.2 also.

How do I go about uploading an index with aggregation request json, etc?



Re: Aggregation bug? Or user error?

2014-06-03 Thread Adrien Grand
A recreation would be really great! If you can zip it and upload it to any
file sharing service, that would work for me.






-- 
Adrien Grand



Re: Aggregation vs Search/Filter discrepancy - caching issue?

2014-06-03 Thread Adrien Grand
Can you share your test case?


On Tue, Jun 3, 2014 at 1:00 PM, mooky nick.minute...@gmail.com wrote:

 Updated Elasticsearch to 1.2 - still seeing the same issue...





-- 
Adrien Grand



Re: Aggregation-sql like optimization guidance with elasticsearch 1.0.0

2014-05-29 Thread Niko Nyrhila
Hi,

You can nest aggregations, so in this case you'd first use Date Histogram 
aggregation with an interval of one hour:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html

Then you'd aggregate by id field:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html

Here is an example:
http://www.solinea.com/blog/elasticsearch-aggs-save-the-day

This should be very fast, even when running on a single machine.
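
The nesting described above can be sketched as a request body built in Python. This is only an illustration under assumptions: the date field name `timestamp` is hypothetical (the original question has separate `day` and `hour` fields), and the aggregation names `per_hour` and `per_id` are made up for the example.

```python
import json


def build_hourly_rollup(date_field="timestamp", id_field="id",
                        metrics=("views", "clicks", "video_plays")):
    """Build a search body: date_histogram (1 hour) -> terms (id) -> sums.

    `date_field` is an assumed field holding the full event time.
    """
    # One sum sub-aggregation per metric field.
    sums = {"sum_" + m: {"sum": {"field": m}} for m in metrics}
    return {
        "size": 0,  # only aggregations are wanted, no hits
        "aggs": {
            "per_hour": {
                "date_histogram": {"field": date_field, "interval": "hour"},
                "aggs": {
                    "per_id": {
                        "terms": {"field": id_field},
                        "aggs": sums,
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_hourly_rollup(), indent=2))
```

Note that a terms aggregation only returns the top buckets by default, so its `size` would need raising if every id per hour is required.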


On Friday, January 31, 2014 3:36:20 AM UTC+2, Maxime Nay wrote:

 Hi,

 We are experimenting with Elasticsearch 1.0.0, and are particularly excited 
 about the new aggregation feature.

 Here is one of our use cases that we would like to optimize:

 Right now, to imitate a basic SQL GROUP BY query that would look like: 
 SELECT day, hour, id, SUM(views), SUM(clicks), SUM(video_plays) FROM 
 events GROUP BY day, hour, id

 we are issuing this kind of query:

 {
   "size" : 0,
   "query" : { "match_all" : {} },
   "aggs" : {
     "test_aggregation" : {
       "terms" : {
         "script" : "doc['day'].date + '-' + doc['hour'].value + '-' + doc['id'].value",
         "order" : { "_term" : "asc" },
         "size" : 
       },
       "aggs" : {
         "sum_click" : { "sum" : { "field" : "clicks" } },
         "sum_views" : { "sum" : { "field" : "views" } },
         "sum_video_plays" : { "sum" : { "field" : "video_plays" } }
       }
     }
   }
 }

 But the performance of this kind of query is rather poor. Thus, we would 
 like to know if there is a more optimized way to get what we want.

 Thanks !
 Maxime




Re: aggregation with average value in java api

2014-05-25 Thread Subhadip Bagui
Thank you Adrien, I'm getting the aggregation value now.

One question here. I have a field that stores values like CPU_USED : 
0.04%
and I want to aggregate on it. Can I do any string manipulation on 
the field passed to AggregationBuilders? I tried the following, but it is not working. 
Please suggest.
addAggregation(AggregationBuilders.avg("cpu_used_avg").script("doc['CPU_USED'].value.replace('%', '')"))

Thanks
Subhadip 
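
One option that avoids scripting altogether is to store the value as a number before it is indexed. The sketch below is a hedged illustration, not something proposed in this thread: the helper names `parse_percent` and `normalize_doc` are hypothetical, and it assumes the client controls the documents before they are sent to Elasticsearch.

```python
def parse_percent(value):
    """Convert a string such as '0.04%' to the float 0.04."""
    text = value.strip()
    if text.endswith("%"):
        text = text[:-1]  # drop the trailing percent sign
    return float(text)


def normalize_doc(doc, field="CPU_USED"):
    """Return a copy of doc with the percent string replaced by a number."""
    cleaned = dict(doc)
    cleaned[field] = parse_percent(doc[field])
    return cleaned


if __name__ == "__main__":
    print(normalize_doc({"CPU_USED": "0.04%"}))
```

With a numeric field, a plain avg aggregation on `CPU_USED` needs no script.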



Re: aggregation with average value in java api

2014-05-23 Thread Adrien Grand
Hi,

You should be able to fix your problem by replacing the call to
setAggregations with:

.addAggregation(AggregationBuilders.avg("memory_average").field("MEMORY"))
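
For comparison, here is a rough sketch of the JSON body that such an avg aggregation corresponds to, written as a Python dict, plus a helper that reads the single value back out of a response. The helper names are hypothetical and the sample response is made up for illustration; it is not output from the poster's cluster.

```python
import json


def avg_request(agg_name, field):
    """Build a match_all search body with one avg aggregation."""
    return {
        "query": {"match_all": {}},
        "aggs": {agg_name: {"avg": {"field": field}}},
    }


def read_avg(response, agg_name):
    """Read the single value of an avg aggregation from a search response."""
    return response["aggregations"][agg_name]["value"]


# Made-up response fragment, shaped like a single-value metric result.
sample_response = {"aggregations": {"memory_average": {"value": 512.0}}}

if __name__ == "__main__":
    print(json.dumps(avg_request("memory_average", "MEMORY"), indent=2))
    print(read_avg(sample_response, "memory_average"))
```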


On Fri, May 23, 2014 at 7:30 PM, Subhadip Bagui i.ba...@gmail.com wrote:

 Hi,

 I'm trying to do an aggregation using the Java API, but I can't get the
 SearchResponse correctly; I get the error below. I've written the method
 below using XContentBuilder.
 I couldn't find any working examples on Google. Please suggest a working
 example of a single-value aggregation.

 public static Aggregations searchAggregation() throws IOException {
     Client client = ESClientFactory.getInstance();
     XContentBuilder contentBuilder = XContentFactory.jsonBuilder()
             .startObject("aggs").startObject("memory_average")
             .startObject("avg").field("field", "MEMORY").endObject()
             .endObject().endObject();

     System.out.println(contentBuilder.string());
     SearchResponse response = client.prepareSearch("virtualmachines")
             .setTypes("nodes").setQuery(QueryBuilders.matchAllQuery())
             .setAggregations(contentBuilder).execute().actionGet();
     client.close();
     System.out.println(response);
     return response.getAggregations();
 }

 Error:

 Exception in thread "main"
 org.elasticsearch.transport.TransportSerializationException: Failed to
 deserialize exception response from stream
   at org.elasticsearch.transport.netty.MessageChannelHandler.handlerResponseError(MessageChannelHandler.java:169)
   at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:123)





-- 
Adrien Grand



Re: Aggregation bug? Or user error?

2014-05-06 Thread mooky
I am using elastic 1.1.1.
The index isn't huge (600m), but it contains financially sensitive data, so 
it would be too problematic legally to allow it offsite. I can try to anonymise 
the data and see if the issue can be reproduced that way; I might learn something 
about what is causing it.





On Friday, 2 May 2014 14:34:21 UTC+1, Adrien Grand wrote:

 What version of Elasticsearch are you using? If it is small enough, I 
 would also be interested if you could share your index so that I can try to 
 reproduce the issue locally.


 On Fri, May 2, 2014 at 12:07 PM, mooky nick.mi...@gmail.com wrote:

 I haven't been able to figure out what is required to recreate it.
 I am doing a number of identical aggregations (just with different values for 
 intentMarketCode and intentDate).
 Three aggregations give correct numbers; one doesn't. I haven't figured out 
 why.
  

 On Wednesday, 30 April 2014 14:13:00 UTC+1, Adrien Grand wrote:

 This looks wrong indeed. By any chance, would you have a curl recreation 
 of this issue?


 On Tue, Apr 29, 2014 at 7:35 PM, mooky nick.mi...@gmail.com wrote:

 It looks like a bug to me - but if it's user error, then obviously I can 
 fix it a lot quicker :)
  

 On Tuesday, 29 April 2014 13:04:53 UTC+1, mooky wrote:

  I am seeing some very odd aggregation results - where the sum of the 
 sub-aggregations is more than the parent bucket.

 Results:
 "CSSX" : {
   "doc_count" : 24,
   "intentDate" : {
     "buckets" : [ {
       "key" : "Overdue",
       "to" : 1.3981248E12,
       "to_as_string" : "2014-04-22",
       "doc_count" : 1,
       "ME" : {
         "doc_count" : 0
       },
       "NOT_ME" : {
         "doc_count" : 24
       }
     }, {
       "key" : "May",
       "from" : 1.3981248E12,
       "from_as_string" : "2014-04-22",
       "to" : 1.4006304E12,
       "to_as_string" : "2014-05-21",
       "doc_count" : 23,
       "ME" : {
         "doc_count" : 0
       },
       "NOT_ME" : {
         "doc_count" : 24
       }
     }, {
       "key" : "June",
       "from" : 1.4006304E12,
       "from_as_string" : "2014-05-21",
       "to" : 1.4033088E12,
       "to_as_string" : "2014-06-21",
       "doc_count" : 0,
       "ME" : {
         "doc_count" : 0
       },
       "NOT_ME" : {
         "doc_count" : 24
       }
     } ]
   }
 },


 I wouldn't have thought that to be possible at all.
 Here is the request that generated the dodgy results.


 "CSSX" : {
   "filter" : {
     "and" : {
       "filters" : [ {
         "type" : {
           "value" : "inventory"
         }
       }, {
         "term" : {
           "isAllocated" : false
         }
       }, {
         "term" : {
           "intentMarketCode" : "CSSX"
         }
       }, {
         "terms" : {
           "groupCompanyId" : [
             "0D13EF2D0E114D43BFE362F5024D8873",
             "0D593DE0CFBE49BEA3BF5AD7CD965782",
             "1E9C36CC45C64FCAACDEE0AF4FB91FBA",
             "33A946DC2B0E494EB371993D345F52E4",
             "6471AA50DFCF4192B8DD1C2E72A032C7",
             "9FB2FFDC0FF0797FE04014AC6F0616B6",
             "9FB2FFDC0FF1797FE04014AC6F0616B6",
             "9FB2FFDC0FF2797FE04014AC6F0616B6",
             "9FB2FFDC0FF3797FE04014AC6F0616B6",
             "9FB2FFDC0FF5797FE04014AC6F0616B6",
             "9FB2FFDC0FF6797FE04014AC6F0616B6",
             "AFE0FED33F06AFB6E04015AC5E060AA3"
           ]
         }
       }, {
         "not" : {
           "filter" : {
             "terms" : {
               "status" : [ "Cancelled", "Completed" ]
             }
           }
         }
       } ]
     }
   },
   "aggregations" : {
     "intentDate" : {
       "date_range" : {
         "field" : "intentDate",
         "ranges" : [ {
           "key" : "Overdue",
           "to" : "2014-04-22"
         }, {
           "key" : "May",
           "from" : "2014-04-22",
           "to" : "2014-05-21"
         }, {
           "key" : "June",
           "from" : "2014-05-21",
           "to" : "2014-06-21"
         } ]
       },
       "aggregations" : {
         "ME" : {
           "filter" : {
             "term" : {
               "trafficOperatorSid" : "S-1-5-21-20xx
 ...





 -- 
 Adrien Grand
  

Re: Aggregation bug? Or user error?

2014-05-02 Thread mooky
 
I haven't been able to figure out what is required to recreate it.
I am doing a number of identical aggregations (just with different values for 
intentMarketCode and intentDate).
Three aggregations give correct numbers; one doesn't. I haven't figured out 
why.
 



Re: Aggregation bug? Or user error?

2014-05-02 Thread Adrien Grand
What version of Elasticsearch are you using? If it is small enough, I would
also be interested if you could share your index so that I can try to
reproduce the issue locally.



Re: Aggregation problem

2014-05-01 Thread Adrien Grand
Hi,

Your request is not valid; it should be:

GET /ckdocuments/msv/_search
{
  "aggregations": {
    "MSV": {
      "terms": {
        "field": "MSV.country"
      }
    }
  },
  "size": 0
}


The next release of Elasticsearch will be less lenient with malformed
aggregations, so you will hopefully get an error instead of just getting
empty aggregations.
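
Until the server rejects such requests itself, a client-side check can catch this particular mistake. The sketch below is a naive illustration, not part of Elasticsearch: the function `find_untyped_aggs` is hypothetical, and it only checks that each named aggregation declares some key besides its sub-aggregations.

```python
SUB_AGG_KEYS = ("aggs", "aggregations")


def find_untyped_aggs(aggs, path=""):
    """Return paths of named aggregations that declare no aggregation type."""
    problems = []
    for name, body in aggs.items():
        here = path + "/" + name if path else name
        if not isinstance(body, dict):
            problems.append(here)
            continue
        # Anything that is not a sub-aggregation container counts as a type.
        type_keys = [key for key in body if key not in SUB_AGG_KEYS]
        if not type_keys:
            problems.append(here)
        # Recurse into sub-aggregations, if any.
        for key in SUB_AGG_KEYS:
            if isinstance(body.get(key), dict):
                problems.extend(find_untyped_aggs(body[key], here))
    return problems


# The request as originally posted: "terms" is wrapped in an extra "aggs".
malformed = {"MSV": {"aggs": {"terms": {"field": "MSV.country"}}}}
# The corrected form from the reply above.
corrected = {"MSV": {"terms": {"field": "MSV.country"}}}

if __name__ == "__main__":
    print(find_untyped_aggs(malformed))
    print(find_untyped_aggs(corrected))
```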



On Thu, May 1, 2014 at 6:37 PM, Niv Penso n...@toonimo.com wrote:

 Hey guys,

 I have this mapping:

 GET ckdocuments/msv/_mapping

 {
   "ckdocuments": {
     "mappings": {
       "msv": {
         "properties": {
           "MSV": {
             "properties": {
               "country": { "type": "string", "index": "not_analyzed" },
               "date": { "type": "date", "format": "yyyy-MM-dd HH:mm:ss" },
               "hits": {
                 "type": "nested",
                 "properties": {
                   "click_type": { "type": "string", "index": "not_analyzed" }
                 }
               }
             }
           },
           "c": { "type": "string", "index": "not_analyzed" },
           "doc_creation_time": { "type": "date", "format": "yyyy-MM-dd HH:mm:ss" },
           "views": {
             "properties": {
               "country": { "type": "string" },
               "date": { "type": "date", "format": "yyyy-MM-dd HH:mm:ss" },
               "hits": {
                 "properties": {
                   "click_type": { "type": "string" }
                 }
               }
             }
           }
         }
       }
     }
   }
 }

 With these three docs:

 PUT ckdocuments/msv/1
 {
   "c": "a",
   "MSV": [
     { "country": "US", "date": "2013-01-01 00:00:00", "hits": [ { "click_type": "click" } ] }
   ],
   "views": [
     { "country": "US", "date": "2013-01-01 00:00:00", "hits": [ { "click_type": "click" } ] },
     { "country": "IL", "date": "2013-01-01 00:00:00", "hits": [ { "click_type": "click" } ] }
   ]
 }

 PUT ckdocuments/msv/2
 {
   "doc_creation_time": "2013-01-01 00:00:00",
   "MSV": [
     { "country": "IL", "date": "2013-01-01 00:00:00", "hits": [ { "click_type": "pixel" } ] },
     { "country": "US", "date": "2013-01-02 00:00:00", "hits": [ { "click_type": "click" } ] }
   ],
   "views": [
     { "country": "US", "date": "2013-01-01 00:00:00", "hits": [] },
     { "country": "US", "date": "2013-01-01 00:00:00", "hits": [ { "click_type": "pixel" }, { "click_type": "pixel" } ] },
     { "country": "US", "date": "2013-01-02 00:00:00", "hits": [ { "click_type": "click" } ] }
   ]
 }

 PUT ckdocuments/msv/3
 {
   "MSV": [
     { "country": "IL", "date": "2013-01-01 00:00:00", "hits": [ { "click_type": "click" } ] }
   ],
   "views": [
     { "country": "US", "date": "2013-01-01 00:00:00", "hits": [ { "click_type": "click" } ] },
     { "country": "IL", "date": "2013-01-01 00:00:00", "hits": [ { "click_type": "click" } ] }
   ]
 }


 When I run:

 GET /ckdocuments/msv/_search
 {
   "aggregations": {
     "MSV": {
       "aggs": {
         "terms": {
           "field": "MSV.country"
         }
       }
     }
   },
   "size": 0
 }
 }

 I get these results:

 {
   "took": 1,
   "timed_out": false,
   "_shards": { "total": 5, "successful": 5, "failed": 0 },
   "hits": { "total": 3, "max_score": 0, "hits": [] }
 }

 I just can't understand why the aggregation doesn't separate the results by
 country.
 Do you have any idea?

 Thanks, Niv





-- 
Adrien Grand


Re: Aggregation bug? Or user error?

2014-04-30 Thread Adrien Grand
This looks wrong indeed. By any chance, would you have a curl recreation of
this issue?
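
The inconsistency reported in this thread (a sub-aggregation counting more documents than the bucket that contains it) can be detected mechanically. The sketch below is only an illustration: the function name `find_inconsistent_buckets` is hypothetical, and the sample dict copies the counts from the reported results while omitting the from/to fields.

```python
def find_inconsistent_buckets(agg_result, bucket_agg, sub_aggs):
    """List (bucket key, sub-agg name, sub count, bucket count) violations.

    A filter sub-aggregation can never match more documents than the
    bucket it is nested in, so sub count > bucket count signals a problem.
    """
    issues = []
    for bucket in agg_result[bucket_agg]["buckets"]:
        for sub in sub_aggs:
            sub_count = bucket[sub]["doc_count"]
            if sub_count > bucket["doc_count"]:
                issues.append((bucket["key"], sub, sub_count, bucket["doc_count"]))
    return issues


# Counts taken from the results quoted in this thread.
reported = {
    "doc_count": 24,
    "intentDate": {
        "buckets": [
            {"key": "Overdue", "doc_count": 1,
             "ME": {"doc_count": 0}, "NOT_ME": {"doc_count": 24}},
            {"key": "May", "doc_count": 23,
             "ME": {"doc_count": 0}, "NOT_ME": {"doc_count": 24}},
            {"key": "June", "doc_count": 0,
             "ME": {"doc_count": 0}, "NOT_ME": {"doc_count": 24}},
        ]
    },
}

if __name__ == "__main__":
    for issue in find_inconsistent_buckets(reported, "intentDate", ["ME", "NOT_ME"]):
        print(issue)
```

Every bucket is flagged because NOT_ME reports the parent filter's total of 24 in each range rather than a count limited to that range.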






-- 
Adrien Grand



Re: Aggregation in Kibana or Elasticsearch

2014-04-24 Thread Prazzy
The below query worked for me.

LogDate:[2013-08-01 TO 2013-08-10] AND (LogDetail:Antenna plate 1
temperature:* AND LogDetail:[70.00 TO 80.00])
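
For the alternative raised in the original question (parsing the temperature out while loading the data), here is a minimal sketch. It assumes log lines shaped like the sample in this thread; the field name `temperature_c` and the helper names are hypothetical.

```python
import re

# Matches e.g. "temperature: 40.00 degC" and captures the number.
TEMPERATURE_RE = re.compile(r"temperature:\s*(-?\d+(?:\.\d+)?)\s*degC")


def extract_temperature(log_detail):
    """Return the temperature as a float, or None if the line has none."""
    match = TEMPERATURE_RE.search(log_detail)
    return float(match.group(1)) if match else None


def enrich(doc):
    """Return a copy of doc with a numeric temperature field added if found."""
    enriched = dict(doc)
    temp = extract_temperature(doc.get("LogDetail", ""))
    if temp is not None:
        enriched["temperature_c"] = temp  # hypothetical field name
    return enriched


# With a numeric field, a range filter can express "greater than 70".
hot_filter = {"range": {"temperature_c": {"gt": 70.0}}}

if __name__ == "__main__":
    print(enrich({"LogDetail": "Antenna plate 1 temperature: 40.00 degC"}))
```

Storing the number separately means the comparison is numeric rather than a term range over the analyzed text.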



On Wed, Apr 23, 2014 at 6:19 PM, Praveen Shilavantar praz...@gmail.com wrote:

 Hi,

 I am new to Elasticsearch and Kibana. I have loaded some log data into
 Elasticsearch, and I have a field called LogDetail whose content looks
 like this:

 Antenna plate 1 temperature: 40.00 degC

 I would like to get the log events/documents for temperature > 70.00 degC.
 This is how we do it in MySQL:

 SELECT substring(log_detail, 30, 5) AS temp
 FROM log_table
 WHERE log_detail like 'temperature: %'
 HAVING temp > 70.00

 Is this something we can do with an Elasticsearch query, or does the data need
 to be parsed to extract the temperature value while loading?


 Thanks






Re: Aggregation error( Java heap space)

2014-04-02 Thread vir . candy
The smaller index has 1 million lines of data. They are the lines from the 
bigger index that match prefix:{ip:100.1}.

On Wednesday, April 2, 2014 at 4:04:27 PM UTC+8, vir@gmail.com wrote:

 I ran an aggregation search on my index (6 nodes). It holds about 200 
 million lines of port-scanning data. Each line looks like this: 
 {"ip":"85.18.68.5", "banner":"cisco-IOS", "country":"IT", "_type":"port-80"}
 As you can imagine, the data is sorted into different types by the port 
 being scanned. Now I want to know which hosts have a lot of ports open at 
 the same time, so I chose to aggregate on the ip field, and I got an OOM 
 error. That may be reasonable: most hosts have only one port open, so 
 perhaps there are too many buckets.


 I then tried an aggregation filter:

 {
   "aggs": {
     "just_name1": {
       "filter": {
         "prefix": {
           "ip": "100.1"
         }
       },
       "aggs": {
         "just_name2": {
           "terms": {
             "field": "ip",
             "execution_hint": "map"
           }
         }
       }
     }
   }
 }
 (Yes, my ip field is mapped as a string.)

 I expected this to narrow down the set that ES aggregates over, but I 
 still get an OOM error, while it works on a smaller index (another cluster, 
 one node). Why would this happen? After filtering, the two clusters should 
 have equal-sized sets. Why did the bigger one fail?





Re: Aggregation error( Java heap space)

2014-04-02 Thread Adrien Grand
Given your description of the problem, I think the issue is that your
Elasticsearch cluster doesn't have enough memory to load field data for the
ip field (which needs to be done for all documents, not only those that
match your query). So you either need to give more nodes to your cluster,
more memory to your nodes, or use doc values for your ip field[1] (the
latter option requires reindexing).

[1]
http://www.elasticsearch.org/blog/disk-based-field-data-a-k-a-doc-values/
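
As a rough illustration of the doc values option, here is a minimal sketch of a mapping body built as a Python dict. It assumes the Elasticsearch 1.0-era syntax (`fielddata.format` set to `doc_values` on a `not_analyzed` string field). The type name `port-80` comes from the example document in the question, and the helper name `doc_values_mapping` is hypothetical. Check the mapping reference for your version before relying on it.

```python
import json


def doc_values_mapping(type_name, field):
    """Build a mapping body that keeps `field` as on-disk doc values.

    Assumption: ES 1.0-era syntax; other releases may spell this differently.
    """
    return {
        type_name: {
            "properties": {
                field: {
                    "type": "string",
                    "index": "not_analyzed",
                    "fielddata": {"format": "doc_values"},
                }
            }
        }
    }


mapping = doc_values_mapping("port-80", "ip")
print(json.dumps(mapping, indent=2))
```

Because the setting is part of the mapping, existing documents would have to be reindexed into an index created with it, which matches the reindexing caveat above.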


-- 
Adrien Grand



Re: Aggregation error( Java heap space)

2014-04-02 Thread 张阳
But I can run the aggregation on the 'banner' field on both clusters. Is that
because the values of 'banner' are far less unique than those of the 'ip' field?




Re: Aggregation error( Java heap space)

2014-04-02 Thread Adrien Grand
On Wed, Apr 2, 2014 at 10:52 AM, 张阳 vir.ca...@gmail.com wrote:

 But I can run the aggregation on the 'banner' field on both clusters. Is that
 because the values of 'banner' are far less unique than those of the 'ip' field?


Very likely, yes. Memory usage of field data is higher on high-cardinality
fields.
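
To make the cardinality effect concrete, here is a toy model in Python. It is not Elasticsearch's actual data structure, only an assumption-laden sketch: field data is modeled as a table of unique terms plus one ordinal per document, so the term table grows with the number of distinct values while the ordinals grow with the number of documents. All names are hypothetical.

```python
def fielddata_footprint(values):
    """Toy model: a table of unique terms plus one ordinal per document."""
    terms = sorted(set(values))
    ordinal_of = {term: i for i, term in enumerate(terms)}
    ordinals = [ordinal_of[v] for v in values]
    term_bytes = sum(len(t.encode("utf-8")) for t in terms)
    return {
        "unique_terms": len(terms),
        "term_bytes": term_bytes,
        "ordinals": len(ordinals),
    }


# Low cardinality: many documents share a handful of banner values.
banners = ["cisco-IOS", "apache", "nginx"] * 1000
# High cardinality: every document has its own IP.
ips = ["10.0.%d.%d" % (i // 256, i % 256) for i in range(3000)]

low = fielddata_footprint(banners)
high = fielddata_footprint(ips)
print(low)
print(high)
```

Both fields cover 3000 documents, but the term table for the IP-like field is far larger, which is the effect described above.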

-- 
Adrien Grand



Re: aggregation with conditions

2014-04-01 Thread Binh Ly
The first one is not available directly. However, a terms aggregation sorted
by _count ascending will bubble up the least frequent terms (emails), and you
can then filter out the ones you want yourself. The second one sounds like a
simple terms aggregation on the email field (just make sure the email field
is not_analyzed):

{
  "aggs": {
    "group_by_email": {
      "terms": {
        "field": "email"
      }
    }
  }
}
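
The client-side filtering mentioned above can be sketched in Python. This is only an illustration under assumptions: the `buckets` list is made-up data in the shape a terms aggregation returns (`key` and `doc_count`), the helper names are hypothetical, and the histogram counts emails that occur exactly N times rather than N times or more.

```python
from collections import Counter

# Request body: same aggregation as above, sorted so rare emails come first.
request_body = {
    "aggs": {
        "group_by_email": {
            "terms": {"field": "email", "order": {"_count": "asc"}}
        }
    }
}


def emails_seen_once(buckets):
    """Case 1: keep only the emails that occur in exactly one document."""
    return [b["key"] for b in buckets if b["doc_count"] == 1]


def occurrence_histogram(buckets):
    """Case 2: map each doc_count to the number of emails with that count."""
    return dict(Counter(b["doc_count"] for b in buckets))


# Hypothetical buckets, as they might appear in the aggregation response.
buckets = [
    {"key": "a@example.com", "doc_count": 1},
    {"key": "c@example.com", "doc_count": 1},
    {"key": "e@example.com", "doc_count": 2},
    {"key": "b@example.com", "doc_count": 3},
    {"key": "d@example.com", "doc_count": 3},
]

once = emails_seen_once(buckets)
histogram = occurrence_histogram(buckets)
print(once)
print(histogram)
```

Keep in mind that a terms aggregation only returns a limited number of buckets by default, so the post-processing only sees the terms that were actually returned.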



Re: aggregation with conditions

2014-04-01 Thread lvalbuena


On Tuesday, April 1, 2014 6:17:30 PM UTC+8, lval...@egg.ph wrote:

 Hi,

 I have 2 cases.

 Given the structure
 {
   "email": value,
   "points": value
 }

 Case 1:
 I have 1000 rows, where multiple rows can have the same value for the
 email field.
 {"email": "s...@email.com", "points": 5}
 {"email": "s...@email.com", "points": 2}
 ...

 How do I tell Elasticsearch to find all emails that appear only
 *once* in the data set?

 Case 2:
 Also using aggregations, how can I tell Elasticsearch to return how many
 emails appear each number of times in the data set?
 e.g.
 emails = 5, occurrences = 5 // There are 5 emails that appeared 5 times
 or more in the data set
 emails = 6, occurrences = 4
 emails = 23, occurrences = 3
 emails = 2, occurrences = 2
 emails = 12, occurrences = 1

 Or is it even possible?

 Thanks




Re: Aggregation on big data

2014-03-31 Thread Adrien Grand
This kind of use-case requires memory for two main reasons:
 - field data,
 - counting values (aggregations).

Field data memory usage can be reduced by using doc values[1] which will
effectively store data on disk instead of memory and rely on the filesystem
cache.

Aggregations memory usage is more complicated to improve. In case you are
storing your IPs as string fields, you might want to use the `map`
execution hint[2], which requires less memory than the `ordinals` execution
hint (please note, however, that we are working on improving the efficiency
of ordinals on high-cardinality fields, so it might improve in future
versions).

[1]
http://www.elasticsearch.org/blog/disk-based-field-data-a-k-a-doc-values/
[2]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/search-aggregations-bucket-terms-aggregation.html#_execution_hint
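
As a sketch of how the `map` hint might be combined with the original goal (IPs seen on more than one port), the request body below is built as a Python dict. It assumes that each document is one (ip, port) record, so a bucket's doc_count approximates the number of open ports, and that the terms aggregation in your version accepts `min_doc_count`. The aggregation name and helper name are hypothetical.

```python
import json


def multi_port_ip_query(field="ip", min_ports=2, size=10):
    """Terms aggregation that only keeps IPs seen in at least `min_ports` docs."""
    return {
        "size": 0,  # no search hits needed, only the aggregation
        "aggs": {
            "ips_with_many_ports": {
                "terms": {
                    "field": field,
                    "execution_hint": "map",
                    "min_doc_count": min_ports,
                    "size": size,
                    "order": {"_count": "desc"},
                }
            }
        },
    }


query = multi_port_ip_query()
print(json.dumps(query, indent=2))
```

This only trims the buckets that are returned. It does not remove the need to load field data for the ip field, so the doc values advice above still applies.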

On Mon, Mar 31, 2014 at 11:01 AM, vir.ca...@gmail.com wrote:

 I have 200 million lines of port-scanning data. I want ES to return the IPs
 that have more than one port open at the same time, ordered by count.
 But given the volume of data, and the fact that very few documents share the
 same value in the ip field, I predictably get an out-of-memory error. Is
 there any way to complete this query?





-- 
Adrien Grand



Re: Aggregation on big data

2014-03-31 Thread vir . candy
Thank you!



Re: Aggregation for query facets?

2014-02-27 Thread Binh Ly
You'll want the filter aggregation:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filter-aggregation.html



Re: Aggregation on parent/child documents

2014-02-25 Thread Augusto Uehara
We run 4 instances of ES 1.0.0, using 30G for the JVM. We run 64-bit OpenJDK
1.7.0_25 on Ubuntu servers.

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 515139
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 64000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 515139
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

I have also disabled swap on Linux.

You can use this gist to simulate the issue we have: 
https://gist.github.com/chaos-generator/9143655



Re: Aggregation question

2014-02-21 Thread mooky
Excellent. Thanks!



Re: Aggregation on parent/child documents

2014-02-21 Thread Binh Ly
I'm wondering if the filter aggregation will work for you:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filter-aggregation.html

However, it does not support parent child, but if you have the children 
embedded directly inside the parent document, I think it should be similar 
in functionality to your _msearch solution.

BTW, if you are only doing aggregations or counts and don't really need 
search hits returned, you can further optimize by using the count search 
type:

_search?search_type=count



Re: Aggregation on parent/child documents

2014-02-21 Thread Augusto Uehara
Thank you for your reply Binh,

I've tried the bucket filter, but had problems with parent/child 
relationships.

I've modified the multi-search query to use search_type=count, but the
performance didn't change much: it still took about 40 seconds to return the
results. That is almost 20% faster, but it is not yet the performance we
want.



Re: Aggregation question

2014-02-18 Thread Binh Ly
Yes, the correct way would be to index intentLocationDescription as a
multi-field. You don't have to introduce it as multiple fields in your
source document. All you need to do is set that field to a multi-field in
the ES mapping: once with whatever analyzer you want, and once as
not_analyzed. You can see an example here:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#_multi_fields_3

Wherein you have 2 fields in the index derived from 1 single field in your 
JSON source. The name field is analyzed. And then the name.raw field is 
not_analyzed which is what you want to aggregate on.
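
A minimal sketch of such a mapping, written as a Python dict, might look like the following. It assumes the 1.0-style `fields` syntax from the linked page. The type name `intent`, the `standard` analyzer, and the helper names are assumptions made for illustration; only the field name intentLocationDescription and the `raw` sub-field convention come from the discussion above.

```python
import json


def multi_field_mapping(type_name, field, analyzer="standard"):
    """One source field, indexed twice: analyzed, and not_analyzed as `.raw`."""
    return {
        type_name: {
            "properties": {
                field: {
                    "type": "string",
                    "analyzer": analyzer,
                    "fields": {
                        "raw": {"type": "string", "index": "not_analyzed"}
                    },
                }
            }
        }
    }


def terms_on_raw(field):
    """Aggregate on the not_analyzed sub-field rather than the analyzed one."""
    return {"aggs": {"by_" + field: {"terms": {"field": field + ".raw"}}}}


mapping = multi_field_mapping("intent", "intentLocationDescription")
agg = terms_on_raw("intentLocationDescription")
print(json.dumps(mapping, indent=2))
print(json.dumps(agg, indent=2))
```

Searches can keep targeting the analyzed field, while the aggregation targets the `.raw` sub-field so that whole values are used as bucket keys.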



Re: Aggregation on nested document always takes 2-3 seconds?

2014-02-13 Thread Adrien Grand
Very likely this problem is not related to nested documents but to
fielddata loading because of the integer field. Field data is a
column-oriented view of the data that is, by default, lazily loaded from
the inverted index on the first time that it is needed, and then cached
until the end of life of the segment it belongs to. So only the first
request that needs it is supposed to be slow.

It is possible to load field data eagerly[1] in order to make sure that
field data loading is never going to impact response times. This way you
should not get such slow response times on the first queries.

Another option would be to use doc values[2] that will store field data on
disk instead of loading it from the inverted index. Since data will already
be stored in a column-oriented way, there will be no need to uninvert data
from the inverted index (which is costly and probably the reason of your
slow queries).

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/fielddata-formats.html#_fielddata_loading
[2]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/fielddata-formats.html#_numeric_field_data_types
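
For illustration, the two options could be expressed on the `integer` field from the mapping quoted below roughly as follows. This is a sketch in Python that assumes the 1.0-era `fielddata` settings described on the linked pages (`loading: eager` and `format: doc_values`). The helper name and constants are hypothetical, and the exact syntax should be checked against the reference for your version.

```python
import json

# Option 1 (assumed syntax): load field data eagerly instead of on first use.
EAGER = {"loading": "eager"}
# Option 2 (assumed syntax): keep the column on disk as doc values.
DOC_VALUES = {"format": "doc_values"}


def numeric_field(fielddata_option):
    """Build the mapping entry for a long field with a given fielddata setting."""
    return {"type": "long", "fielddata": dict(fielddata_option)}


eager_field = numeric_field(EAGER)
doc_values_field = numeric_field(DOC_VALUES)
print(json.dumps({"integer": eager_field}, indent=2))
print(json.dumps({"integer": doc_values_field}, indent=2))
```

Eager loading moves the cost from the first query to segment refresh time, while doc values avoid the uninverting step altogether at the price of reindexing.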



On Thu, Feb 13, 2014 at 7:34 PM, Luke Scott l...@visionlaunchers.com wrote:

 I have an index that uses 1 level of nested documents. When I run a query
 on it the result comes back in about 20-200 milliseconds. When I add a
 facet or an aggregation involving the nested documents the uncached
 response always takes 2-3 seconds, regardless of how many documents have
 been selected, even zero.

 My map looks like this:

 {
   "document": {
     "dynamic": "strict",
     "properties": {
       "account_id": {
         "type": "long"
       },
       "data": {
         "type": "nested",
         "properties": {
           "key": {
             "type": "string",
             "index": "not_analyzed"
           },
           "string": {
             "type": "string",
             "index": "not_analyzed",
             "fields": {
               "token": {
                 "type": "string"
               }
             }
           },
           "integer": {
             "type": "long"
           },
           "date": {
             "type": "date",
             "format": "dateOptionalTime"
           }
         }
       }
     }
   }
 }

 There are 3.6 million documents in this index. My query looks like this:

 {
   "query": {
     "bool": {
       "must": [
         {"term": {"account_id": 1}},
         {
           "nested": {
             "path": "data",
             "query": {"term": {"key": "amount"}}
           }
         }
       ]
     }
   }
 }

 The result to the above query is 0 documents because account_id 1 doesn't
 have any documents with a key of amount. Uncached this returns in about
 10-150ms:

 {
   "took": 9,
   "timed_out": false,
   "_shards": {
     "total": 5,
     "successful": 5,
     "failed": 0
   },
   "hits": {
     "total": 0,
     "max_score": null,
     "hits": []
   }
 }

 When I add an aggregation to the query:

 {
   ...
   "aggs": {
     "report": {
       "nested": {
         "path": "data"
       },
       "aggs": {
         "amount": {
           "filter": {
             "query": {"term": {"key": "amount"}}
           },
           "aggs": {
             "sum": {
               "sum": {"field": "integer"}
             }
           }
         }
       }
     }
   }
 }

 Uncached the query returns in about 2-3 seconds:

 {
   "took": 2770,
   "timed_out": false,
   "_shards": {
     "total": 5,
     "successful": 5,
     "failed": 0
   },
   "hits": {
     "total": 0,
     "max_score": null,
     "hits": []
   },
   "aggregations": {
     "report": {
       "doc_count": 0,
       "amount": {
         "doc_count": 0,
         "sum": {
           "value": 0
         }
       }
     }
   }
 }

 If I run the same thing a second time (cached) it runs in 26 milliseconds.
 If I clear the cache and run it again it takes 2 seconds.

 Why is this aggregation always taking 2-3 seconds, even though the query
 is returning 0 documents? The same thing happens with a statistical facet.

 -
 Luke


Re: Aggregation-sql like optimization guidance with elasticsearch 1.0.0

2014-01-31 Thread Maxime Nay
For test purposes we currently have an index containing about 50M docs,
distributed across a 4-node cluster with 16 shards.
Do you think that drastically increasing the number of shards would help?

On Friday, January 31, 2014 10:14:08 AM UTC-8, Binh Ly wrote:

 Maxime, forgot to mention, you can also distribute the load out by 
 increasing the shard count and adding more nodes. But precomputing the 
 field is probably the quickest way to improve that performance. Keep in 
 mind that unlike SQL, ES aggregations may return approximate metrics if you 
 have more than 1 shard.
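
The approximation can be illustrated with a small simulation in Python. This is a simplified model, not Elasticsearch's implementation: each "shard" reports only its own top N terms, and the coordinating step merges those partial lists, so a term that misses the cut on one shard is under-counted. The data and function names are made up.

```python
from collections import Counter


def top_terms(counter, n):
    """Top n (term, count) pairs, ties broken by term for determinism."""
    return sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def sharded_top_terms(shards, n):
    """Each shard reports only its own top n; the partial counts are merged."""
    merged = Counter()
    for shard in shards:
        for term, count in top_terms(Counter(shard), n):
            merged[term] += count
    return top_terms(merged, n)


def exact_top_terms(shards, n):
    """What a single-shard (or SQL-style) count over all data would return."""
    total = Counter()
    for shard in shards:
        total.update(shard)
    return top_terms(total, n)


shards = [
    ["a"] * 5 + ["b"] * 4 + ["c"] * 3,
    ["a"] * 5 + ["c"] * 4 + ["b"] * 3,
]
approx = sharded_top_terms(shards, 2)
exact = exact_top_terms(shards, 2)
print(approx)
print(exact)
```

In this example the count for "b" comes back as 4 instead of 7, because "b" was not in the second shard's top 2. Asking each shard for more terms than the final size narrows the gap at the cost of more data transferred.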




Re: Aggregation module - value_count clarification/problem

2014-01-24 Thread watsindename
The link to the gist is
https://gist.github.com/shivprak/8611922#file-es-agg-test-1-sh

On Friday, January 24, 2014 8:56:43 PM UTC-8, watsin...@gmail.com wrote:

 Hi,

 I am trying to get the number of unique values for a given field. From my
 understanding of value_count
 (http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/search-aggregations-metrics-valuecount-aggregation.html),
 it counts the number of values that are extracted from the aggregated
 documents. After running the steps in es-agg-test-1.sh and querying for the
 unique values of field k, the response I get is:

 {
   "took": 16,
   "timed_out": false,
   "_shards": {
     "total": 5,
     "successful": 5,
     "failed": 0
   },
   "hits": {
     "total": 20,
     "max_score": 1,
     "hits": [] // deleted content of hits to shorten the paste
   },
   "aggregations": {
     "k_count": {
       "value": 22
     }
   }
 }

 How can I be getting a value of 22? Shouldn't it be 20, as that is the
 number of unique k values in the documents?

 Another example: when querying for the unique values of field t, I get the
 following response:

 {
   "took": 7,
   "timed_out": false,
   "_shards": {
     "total": 5,
     "successful": 5,
     "failed": 0
   },
   "hits": {
     "total": 20,
     "max_score": 1,
     "hits": [] // deleted content of hits to shorten the paste
   },
   "aggregations": {
     "k_count": {
       "value": 20
     }
   }
 }

 Again, shouldn't the value be 1, as the only value of t in all documents
 is 23?
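
The difference between counting values and counting distinct values can be shown with a small Python sketch. It does not reproduce the gist: the documents are made up, the function names are hypothetical, and the multi-token case is only one possible explanation for the 22 (an assumption that field k is analyzed into more than one token in some documents).

```python
def value_count(docs, field):
    """Count every value extracted for `field`, duplicates included."""
    total = 0
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue
        total += len(value) if isinstance(value, list) else 1
    return total


def distinct_count(docs, field):
    """Count the number of different values seen for `field`."""
    seen = set()
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue
        seen.update(value if isinstance(value, list) else [value])
    return len(seen)


# 20 documents that all carry t = 23.
docs = [{"k": "key-%d" % i, "t": 23} for i in range(20)]
t_values = value_count(docs, "t")
t_distinct = distinct_count(docs, "t")

# Assumed scenario: two documents yield two tokens each for field k.
tokenized = docs[:18] + [{"k": ["foo", "bar"], "t": 23}] * 2
k_values = value_count(tokenized, "k")

print(t_values, t_distinct, k_values)
```

So a result of 20 for field t is consistent with value_count counting one value per document rather than distinct values. For distinct counts, later releases offer a `cardinality` aggregation (introduced after 1.0, as far as I know), which returns an approximate number of unique values.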

