Re: Aggregation profiling?

2015-05-28 Thread James Macdonald
I don't have an answer, but I really like this question. I too would love to see more query and aggregation profiling tools for performance optimization purposes. Also, I assume you have already looked at this, but have you made sure you are not evicting anything from your in-memory field data? J

Re: Aggregation not limited to filter?

2015-04-14 Thread Ivan Brusic
Which version are you using? The old post filter methods, simply named filter, should have been removed, or at least deprecated. Cheers, Ivan On Apr 13, 2015 1:33 PM, "James Green" wrote: > Indeed. I had used postFilter to add my filters. The documentation for > filters doesn't show how to use a

Re: Aggregation not limited to filter?

2015-04-13 Thread James Green
Indeed. I had used postFilter to add my filters. The documentation for filters doesn't show how to use a query with a matchAll and a bunch of filters so I blindly followed IDE auto-complete. Lesson learned. On 10 April 2015 at 21:17, James Macdonald wrote: > I had a similar problem recently and

Re: Aggregation not limited to filter?

2015-04-10 Thread James Macdonald
I had a similar problem recently and solved it by moving my filter into a filtered query (leaving the query as a match_all), see documentation here http://www.elastic.co/guide/en/elasticsearch/reference/1.5/query-dsl-filtered-query.html . I am not certain why filters do not restrict the scope of t
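
A minimal sketch of the request body James describes, written as a Python dict so it can be inspected without a cluster. The `filtered` query is the ES 1.x syntax from the page linked above; the helper name `filtered_search` and the field names `status` and `category` are hypothetical.

```python
import json


def filtered_search(filter_clause, aggs):
    """Build an ES 1.x search body whose filter also scopes the aggregations.

    The filter sits inside a `filtered` query rather than in a top-level
    post filter, so the aggregations only see documents that pass it.
    """
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": filter_clause,
            }
        },
        "aggs": aggs,
    }


# Hypothetical field names, for illustration only.
body = filtered_search(
    {"term": {"status": "active"}},
    {"by_category": {"terms": {"field": "category"}}},
)
print(json.dumps(body, indent=2))
```

The same filter placed in a post filter would trim the returned hits but leave the `by_category` buckets computed over every document matched by the query.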

Re: Aggregation / Sort and CircuitBreakingException

2015-03-16 Thread joergpra...@gmail.com
You should sort over doc values (recommended, it will be the default in next ES version). Sorting over not_analyzed / keyword analyzed fields is old school. Doc values for analyzed strings do not make much sense in my opinion and lead to unwanted results. If you use multifield, then you do not have t

Re: Aggregation / Sort and CircuitBreakingException

2015-03-15 Thread Lindsey Poole
Also, if I understand correctly, there are negative implications when sorting over a column that has been analyzed - in our case, to remove stop-words. Since the total cardinality of our sort field exceeds the heap available, we can't sort a single user's documents when using stop word analysis

Re: Aggregation / Sort and CircuitBreakingException

2015-03-15 Thread Lindsey Poole
Well, we have a field that is supporting a backward compatibility use case. Clients are executing a partial match query on this field, so we used the keyword tokenizer instead of not_analyzed. Since this is supporting legacy functionality, the clients cannot be updated to change the expectation

Re: Aggregation / Sort and CircuitBreakingException

2015-03-15 Thread joergpra...@gmail.com
I mean, I do not understand what you mean by "I'm caught up on the advice to use doc_values where possible, but we have a use case where we do light analysis on a particular set of fields in our document" - what exactly prevents you from doc values? Jörg On Mon, Mar 16, 2015 at 12:41 AM, joergpra

Re: Aggregation / Sort and CircuitBreakingException

2015-03-15 Thread joergpra...@gmail.com
Have you considered doc values? http://www.elastic.co/guide/en/elasticsearch/guide/current/doc-values.html Jörg On Sun, Mar 15, 2015 at 11:11 PM, Lindsey Poole wrote: > Hey guys, > > I have a question about the mechanics of aggregation and sorting w.r.t. > the fielddata cache. I know this has

Re: Aggregation - Blank and date aggregation

2015-01-16 Thread buddarapu nagaraju
I was able to figure it out through Fiddler ... date histograms are returned in a separate nested object in the result. It works now. On Friday, January 16, 2015, Adrien Grand wrote: > This looks good, what error did you get? > > On Fri, Jan 16, 2015 at 9:41 AM, buddarapu nagaraju > wrote: > >> Index m

Re: Aggregation - Blank and date aggregation

2015-01-16 Thread Adrien Grand
This looks good, what error did you get? On Fri, Jan 16, 2015 at 9:41 AM, buddarapu nagaraju wrote: > Index mapping here > > "mappings": { > > "document": { > "properties": { > "createdDateTime": { > "format": "dateOptionalTime", > "type": "dat

Re: Aggregation - Blank and date aggregation

2015-01-16 Thread buddarapu nagaraju
Index mapping here "mappings": { "document": { "properties": { "createdDateTime": { "format": "dateOptionalTime", "type": "date" }, "doubleSort1": { "type": "double" }, "stringSort3": {

Re: Aggregation - Blank and date aggregation

2015-01-16 Thread buddarapu nagaraju
Hi, I tried, but the date histogram didn't work, and I'm not sure what mistake I am making. Here is the date histogram request (JSON) I am passing, along with a sample doc structure. date histogram request { "aggs": { "createddatetime": { "date_histogram": { "field": "createddatetime",

Re: Aggregation - Blank and date aggregation

2015-01-15 Thread Adrien Grand
Then it means that you want to use a date_histogram aggregation with interval=day. See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html On Thu, Jan 15, 2015 at 4:43 PM, buddarapu nagaraju wrote: > Hey Adrien ,Thank yo
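
A rough sketch of what Adrien suggests: the request body for a `date_histogram` with `interval` set to `day` on the `createdDateTime` field from this thread, plus a small local stand-in that mimics the bucketing so the effect can be seen without a cluster. The helper names, the `per_day` aggregation name, and the sample timestamps are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime


def daily_histogram_request(field):
    """ES 1.x body: one bucket per calendar day, no hits returned."""
    return {
        "size": 0,
        "aggs": {
            "per_day": {
                "date_histogram": {"field": field, "interval": "day"}
            }
        },
    }


def bucket_by_day(timestamps):
    """Local stand-in for interval=day: drop the time part, count per date."""
    return Counter(ts.date().isoformat() for ts in timestamps)


request = daily_histogram_request("createdDateTime")

# Hypothetical sample data.
sample = [
    datetime(2015, 1, 15, 9, 30),
    datetime(2015, 1, 15, 17, 5),
    datetime(2015, 1, 16, 8, 0),
]
print(bucket_by_day(sample))
```

The local helper only illustrates the idea of aggregating on the date part of a date-time value; the server-side aggregation works on the indexed date field directly.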

Re: Aggregation - Blank and date aggregation

2015-01-15 Thread buddarapu nagaraju
Hey Adrien, thank you. I have one more question on aggregating on dates. We actually store the date and time in a field called "createdDateTime", but I only need aggregates on the date part. Any ideas? Or sample code that can help us? Regards Nagaraju On Wed, Jan 14, 2015 at 6:10 A

Re: Aggregation - Blank and date aggregation

2015-01-14 Thread Adrien Grand
On Wed, Jan 14, 2015 at 10:37 AM, buddarapu nagaraju wrote: > Does the terms aggregation count blank field values? > Yes, an empty value "" counts as a term. Note that you need the field to be not analyzed for it to work (or to use an analyzer that emits empty strings). Otherwise the standard a

Re: aggregation giving inconsistent results

2014-10-31 Thread Jay Hilden
Thanks Adrien, your link was very helpful in understanding why I was getting the results I'm getting. Doing some experimentation on our data I'm going to use a 20x multiplier on the shard_count against the size. So in my testing when I want the top 5 results for a very flat term I'm going to set

Re: aggregation giving inconsistent results

2014-10-31 Thread Adrien Grand
This is unfortunately a known limitation of the terms aggregation. Note however that elasticsearch 1.4 (only a beta version is available today but the GA release should be available within a couple of weeks) improves some heuristics, which gives better accuracy by default, and also reports

Re: Aggregation on last element

2014-10-26 Thread Michaël Gallego
After some testing, it appears that my solution does not work, but I'm not sure I understand why. The filter returns fewer results than expected. -- You received this message because you are subscribed to the Google Groups "elasticsearch" group. To unsubscribe from this group and stop r

Re: Aggregation on last element

2014-10-26 Thread Michaël Gallego
Hi Vineeth, I'm afraid that this won't work, because as I said "element" can have high cardinality (while it's not bounded in theory, in practice it will range from 500 to 4). Therefore if I do a "terms" on element, then a top hit, it will require to generate maybe 4 sub-buckets. I thin

Re: Aggregation on last element

2014-10-26 Thread vineeth mohan
Hello Michaël, I can't think of a way to do this in a single call. Maybe you should try the following: (Terms aggregation on element) -> (Top N hits aggregation, sort by date asc and size = 1) -> (Filter aggregation by type A). With this you will get the elements that you are looking for.

Re: Aggregation buckets, with an additional key:value inside.

2014-10-25 Thread Ivan Brusic
I maintain a mapping on the client side to do the lookups. Thankfully my taxonomy is static (but somewhat large). There is a PR to do server-side mappings, but I don't think it would apply to aggregations and is quite old. An alternative solution would be to create compound values such as "48885:

Re: Aggregation framework, Java API

2014-09-15 Thread Emanuel Buzek
Ivan, thanks a lot for the reply, I switched to FilteredQuery (using matchAll when no query is submitted) and that simplified my code a lot. It also makes more sense than using post filters and filtered aggregations... best, emanuel On Tuesday, 9 September 2014 18:46:25 UTC+2, Ivan Brusic wrote: >

Re: Aggregation framework, Java API

2014-09-09 Thread Ivan Brusic
A filtered query with no explicit query will ultimately be translated into a match-all/constant-score query at the Lucene level. I prefer to explicitly define all my match all queries and use the specific post filter name, and not the old filter name, which was deprecated due to its ambiguity. Bes

Re: Aggregation framework, Java API

2014-09-09 Thread Emanuel Buzek
Thanks Ivan. Yes, it was the post filter which was ignored. We use filtered query only when the user sends a query string, otherwise (when only exact filters for specific columns are specified) we use the post filter. It seems strange to me to use the FilteredQuery when the query string is empt

Re: Aggregation framework, Java API

2014-09-08 Thread Ivan Brusic
Which filter was ignored? I am assuming you meant the post filter (which might still be called filter in the Java API), in which case the filter is bypassed by design. Post filters allow you to filter the documents returned, but leave the aggregations as is. Sounds like you are looking for fil

Re: Aggregation framework, Java API

2014-09-08 Thread mooky
The aggregation takes into account a query - but not a post-filter. I'm not sure of the rationale behind the difference. The java api for traversing results is quite painful - but I think a good part of that is due to Java & the fact that there is very little polymorphic behaviour between aggre

Re: aggregation of hierchical elements possible?

2014-09-02 Thread vineeth mohan

Re: aggregation of hierchical elements possible?

2014-09-02 Thread Markus Breuer
Hello Vineeth, thanks for your response. Your proposal 1 seems to be similar to the path tokenizer, which I used, isn't it? "settings" : { "index" : { "analysis" : { "analyzer" : { "path-analyzer" : { "type" : "custom"

Re: aggregation of hierchical elements possible?

2014-09-02 Thread vineeth mohan
Hello Markus, I can't think of any straightforward method, but you can try the following: 1. Apply a source transform script to convert /a/b/c => [ /a , /a/b , /a/b/c ] - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-transform.html#mapping-transform
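
The conversion in step 1 can be sketched client-side as follows. This is plain Python run before indexing, not the source transform script itself, and the function name `path_prefixes` is hypothetical.

```python
def path_prefixes(path):
    """Expand a hierarchical path into all of its ancestor prefixes.

    Indexing these prefixes as a multi-valued, not-analyzed field lets a
    terms aggregation count documents at every level of the hierarchy.
    """
    parts = [p for p in path.split("/") if p]
    return ["/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)]


print(path_prefixes("/a/b/c"))  # ['/a', '/a/b', '/a/b/c']
```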

Re: Aggregation across indices

2014-08-26 Thread vineeth mohan
Hello Sandeep, What you are intending is not possible. But then Elasticsearch does have some good relational operations, which need to be defined before indexing. If you can elaborate on your use case, we can help with this. Thanks Vineeth On Tue, Aug 26, 2014 at 6:04 PM, 'Sandeep Ramesh

Re: Aggregation query

2014-08-15 Thread vineeth mohan
Hello Ivan, This is expected. Only the top N (the size mentioned in the aggregation) results are taken from each shard before reducing the result. Due to this, the accuracy is not guaranteed but the order is guaranteed. As a fix, you can use this to improve accuracy at the cost of memory - http://www.elast
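
A toy simulation of the per-shard truncation described above, to show why counts can come back too low. It is not Elasticsearch code: the two "shards" and their terms are made-up data, and the helper name is hypothetical. `shard_size` is modelled on the terms aggregation parameter of the same name, which asks each shard for more candidates than the final `size`.

```python
from collections import Counter


def top_terms_via_shards(shards, size, shard_size=None):
    """Merge each shard's local top terms, then keep the global top `size`."""
    shard_size = shard_size or size
    merged = Counter()
    for shard in shards:
        # Each shard only reports its own top `shard_size` terms.
        for term, count in Counter(shard).most_common(shard_size):
            merged[term] += count
    return merged.most_common(size)


# Made-up data: true totals are a=10, c=8, b=7.
shards = [
    ["a"] * 5 + ["b"] * 4 + ["c"] * 3,
    ["a"] * 5 + ["c"] * 5 + ["b"] * 3,
]

print(top_terms_via_shards(shards, size=2))                # [('a', 10), ('c', 5)]
print(top_terms_via_shards(shards, size=2, shard_size=3))  # [('a', 10), ('c', 8)]
```

In the first call the first shard never reports its three `c` documents, so `c` is undercounted; asking each shard for one more term fixes the count at the cost of shipping and merging more buckets.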

Re: Aggregation

2014-08-15 Thread chenlin rao
What's your `deviceId` mapping type? Make sure it's a number, as required by the percentiles aggregation. 2014-08-15 23:49 GMT+08:00 Yuheng Du : > > I am using: > > > > and I got the following errors: > > >

Re: Aggregation

2014-08-15 Thread Yuheng Du
I am using: and I got the following errors: can anyone tell me what is going w

Re: Aggregation

2014-08-14 Thread 饶琛琳
No problem: #!/usr/bin/perl use 5.010; use Data::Dumper; use Search::Elasticsearch; my $e = Search::Elasticsearch->new( nodes => [ '10.13.57.35:9200', '10.13.57.36:9200' ] ); my $r = $e->search( index => 'logstash-2014.08.15', body => { ag

Re: Aggregation on boolean

2014-08-11 Thread Fabian Köstring
Sorry! It was caused by different indexes. On Monday, 11 August 2014 10:18:25 UTC+2, Fabian Köstring wrote: > > Hey there! > > I got one index with two types. I want to do an aggregation query. > This is my query. > > > GET index1/type1,type2/_search > { >"query": { > "match_all": {} >

Re: Aggregation on parent/child documents

2014-07-25 Thread Thomas
Hi Adrien and thank you for the reply, This is exactly what I had in mind, alongside the reverse search equivalent with reverse_nested. This is planned for version 1.4.0 onwards as I see, so I will keep track of any updates on this, thanks Thomas On Friday, 25 July 2014 14:54:50 UTC+3, Thom

Re: Aggregation on parent/child documents

2014-07-25 Thread Adrien Grand
Hi Thomas, None of the aggregations that we have today can leverage parent/child relations. However, there is a `children` aggregation in the pipeline: https://github.com/elasticsearch/elasticsearch/pull/6936 On Fri, Jul 25, 2014 at 1:54 PM, Thomas wrote: > Hi, > > I wanted to ask whether is p

Re: Aggregation using the results of an aggregation?

2014-07-11 Thread Rémi Nonnon
Hi, If I am not wrong it's not possible yet to use the result of an aggregation in another one. You have to do that outside elasticsearch. Regards, Rémi On Friday, 11 July 2014 02:39:16 UTC+2, Greg Day wrote: > > Hi guys > > I'm wondering if it is possible to use the results of an aggreg

Re: [Aggregation] Be able to count number of item in a sub-collection

2014-06-27 Thread Grégoire Pineau
Yes and no ;) Because I would also like to be able to filter nodes in the collection, and then count. Actually, the collection contains orders, and I want to be able to know how many paid orders I get for a user. On Friday, June 27, 2014 8:40:49 AM UTC+2, Timber wrote: > > Could you not add a coun

Re: Aggregation Framework, possible to get distribution of requests per user

2014-06-24 Thread Thomas
My mistake sorry, Here is an example: I have the request document: "request":{ "dynamic" : "strict", "properties" : { "time" : { "format" : "dateOptionalTime", "type" : "date" }, "user_id" : { "index" : "not_analyzed", "type" : "string" }, "country

Re: Aggregation Framework, possible to get distribution of requests per user

2014-06-24 Thread David Pilato
I was only thinking out loud. I mean that I don't know what your model looks like. Maybe you could illustrate your use case with some actual data and we can move forward from here? What kind of documents are you actually indexing and searching for? What fields do you have? -- David Pilato | Tech

Re: Aggregation Framework, possible to get distribution of requests per user

2014-06-24 Thread Thomas
Hi David Thank you for your reply, so based on your suggestion I should maintain a document (e.g. user) with some aggregated values and I should update it as we move along with our indexing of our data, correct? This though would only give me totals. I cannot apply something like a range. I f

Re: Aggregation Framework, possible to get distribution of requests per user

2014-06-24 Thread David Pilato
Imagine that you have indexed users. User has a numberOfDocs field. You can build a range aggregation on top of that, which gives back the count for buckets like: numberOfDocs < 2 1 < numberOfDocs < 3 … See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-

Re: Aggregation average value is not coming correct

2014-06-16 Thread Alexander Reelsen
Hey, you are setting a post filter, which means, that the aggregations will work without the range filter applied. You may want to use a filtered query and move the filter inside the filter part of that particular query. --Alex On Thu, Jun 5, 2014 at 12:38 PM, Subhadip Bagui wrote: > Hi, > >

Re: Aggregation equivalent of Facet "global" : true ?

2014-06-11 Thread mooky
Aha. I missed that. Many thanks. On Wednesday, 11 June 2014 11:23:57 UTC+1, Adrien Grand wrote: > > You can do this by running a `global` aggregation: > http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-global-aggregation.html > > > On Wed, Jun 11, 2

Re: Aggregation equivalent of Facet "global" : true ?

2014-06-11 Thread Adrien Grand
You can do this by running a `global` aggregation: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-global-aggregation.html On Wed, Jun 11, 2014 at 12:02 PM, mooky wrote: > > Is there a way of specifying the scope of an aggregation (if there is I

Re: Aggregation bug? Or user error?

2014-06-06 Thread mooky
Ok. I have written a test case that (if run enough) will reproduce it. It's an intermittent bug. I have raised an issue: https://github.com/elasticsearch/elasticsearch/issues/6435

Re: Aggregation bug? Or user error?

2014-06-04 Thread mooky
Bah. I thought I had a simple unit test that was reliably recreating it - but it would appear not. It's still very intermittent - and my test never seems to fail when run on its own. On Tuesday, 3 June 2014 21:41:04 UTC+1, Adrien Grand wrote: > > A recreation would be really great! If you c

Re: Aggregation vs Search/Filter discrepancy - caching issue?

2014-06-04 Thread mooky
Turns out it was user error. Please ignore.

Re: Aggregation vs Search/Filter discrepancy - caching issue?

2014-06-03 Thread Adrien Grand
Can you share your test case? On Tue, Jun 3, 2014 at 1:00 PM, mooky wrote: > Update elastic to 1.2 - still seeing the same issue...

Re: Aggregation bug? Or user error?

2014-06-03 Thread Adrien Grand
A recreation would be really great! If you can zip it and upload it to any file sharing service, that would work for me. On Tue, Jun 3, 2014 at 6:41 PM, mooky wrote: > > By the way this test fails with elastic 1.2 also. > > How do I go about uploading an index with aggregation request json, etc

Re: Aggregation bug? Or user error?

2014-06-03 Thread mooky
By the way this test fails with elastic 1.2 also. How do I go about uploading an index with aggregation request json, etc?

Re: Aggregation bug? Or user error?

2014-06-03 Thread mooky
I have managed to produce a unit test that exposes this (albeit different to the data above). The index is quite small - and the data fictional - so theres no problem sending you the index. Here is a result I get - and we can see the sub-aggregations have higher counts than the parent: { "s

Re: Aggregation vs Search/Filter discrepancy - caching issue?

2014-06-03 Thread mooky
Update elastic to 1.2 - still seeing the same issue...

Re: Aggregation-"sql like" optimization guidance with elasticsearch 1.0.0

2014-05-29 Thread Niko Nyrhila
Hi, You can nest aggregations, so in this case you'd first use Date Histogram aggregation with an interval of one hour: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html Then you'd aggregate by "id" field: http://www.e
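
The nesting Niko describes might look like the sketch below: an hourly `date_histogram` with a `terms` sub-aggregation on `id`. The date field name `timestamp`, the aggregation names, and the helper are assumptions, since the thread does not show the mapping.

```python
def hourly_by_id(date_field, id_field="id"):
    """ES 1.x body: bucket by hour, then by id inside each hour."""
    return {
        "size": 0,  # only the aggregation results are needed
        "aggs": {
            "per_hour": {
                "date_histogram": {"field": date_field, "interval": "hour"},
                "aggs": {
                    "per_id": {"terms": {"field": id_field}}
                },
            }
        },
    }


# `timestamp` is a hypothetical field name.
request = hourly_by_id("timestamp")
print(request)
```

Each hourly bucket in the response then carries its own list of `per_id` buckets, roughly the shape of a SQL `GROUP BY hour, id`.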

Re: aggregation with average value in java api

2014-05-24 Thread Subhadip Bagui
Thank you Adrien, I'm getting the aggregation value now. One doubt here. I have a field which stores values like "CPU_USED" : "0.04%" and I want to do aggregation on that. Can I do any string manipulation here on the field passed to AggregationBuilders? I tried like this but it is not working. Please sugg

Re: aggregation with average value in java api

2014-05-23 Thread Adrien Grand
Hi, You should be able to fix your problem by replacing the call to setAggregations with: .addAggregation(AggregationBuilders.avg("memory_average").field("MEMORY")) On Fri, May 23, 2014 at 7:30 PM, Subhadip Bagui wrote: > Hi, > > I'm trying to do aggregation using java api. But couldn't get t

Re: Aggregation bug? Or user error?

2014-05-06 Thread mooky
I am using elastic 1.1.1. The index isn't huge (600m) - but it contains financially sensitive data... will be too problematic legally to allow it offsite. I can try anonymise the data - see if it can be reproduced that way - might learn something about what is causing it. On Friday, 2 May 2

Re: Aggregation bug? Or user error?

2014-05-02 Thread Adrien Grand
What version of Elasticsearch are you using? If it is small enough, I would also be interested if you could share your index so that I can try to reproduce the issue locally. On Fri, May 2, 2014 at 12:07 PM, mooky wrote: > > I havent been able to figure out what is required to recreate it. > I

Re: Aggregation bug? Or user error?

2014-05-02 Thread mooky
I haven't been able to figure out what is required to recreate it. I am doing a number of identical aggregations (just different values for intentMarketCode and intentDate). Three aggregations give correct numbers - one doesn't. I haven't figured out why. On Wednesday, 30 April 2014 14:13:00 UTC+1

Re: Aggregation problem

2014-05-01 Thread Adrien Grand
Hi, Your request is not valid, it should be GET /ckdocuments/msv/_search { "aggregations": { "MSV": { "terms": { "field": "MSV.country" } } }, "size": 0 } The next release of Elasticsearch will be le

Re: Aggregation bug? Or user error?

2014-04-30 Thread Adrien Grand
This looks wrong indeed. By any chance, would you have a curl recreation of this issue? On Tue, Apr 29, 2014 at 7:35 PM, mooky wrote: > It looks like a bug to me - but if its user error, then obviously I can > fix it a lot quicker :) > > > On Tuesday, 29 April 2014 13:04:53 UTC+1, mooky wrote:

Re: Aggregation bug? Or user error?

2014-04-29 Thread mooky
It looks like a bug to me - but if its user error, then obviously I can fix it a lot quicker :) On Tuesday, 29 April 2014 13:04:53 UTC+1, mooky wrote: > > I am seeing some very odd aggregation results - where the sum of the > sub-aggregations is more than the parent bucket. > > Results: > "

Re: Aggregation in Kibana or Elasticsearch

2014-04-23 Thread Prazzy
The below query worked for me. LogDate:[2013-08-01 TO 2013-08-10] AND (LogDetail:"Antenna plate 1 temperature:*" AND LogDetail:[70.00 TO 80.00]) On Wed, Apr 23, 2014 at 6:19 PM, Praveen Shilavantar wrote: > Hi, > > I am new to elasticsearch and kibana. I have loaded some log data into > elasti

Re: Aggregation error( Java heap space)

2014-04-02 Thread Adrien Grand
On Wed, Apr 2, 2014 at 10:52 AM, 张阳 wrote: > But I can do aggregation on 'banner' field on both cluster. Is that > because values of 'banner' are not so unique compared to 'ip' field > Very likely, yes. Memory usage of field data is higher on high-cardinality fields. -- Adrien Grand

Re: Aggregation error( Java heap space)

2014-04-02 Thread 张阳
But I can do aggregation on 'banner' field on both cluster. Is that because values of 'banner' are not so unique compared to 'ip' field 2014-04-02 16:27 GMT+08:00 Adrien Grand : > Given your description of the problem, I think the issue is that your > Elasticsearch cluster doesn't have enough me

Re: Aggregation error( Java heap space)

2014-04-02 Thread Adrien Grand
Given your description of the problem, I think the issue is that your Elasticsearch cluster doesn't have enough memory to load field data for the ip field (which needs to be done for all documents, not only those that match your query). So you either need to give more nodes to your cluster, more me

Re: Aggregation error( Java heap space)

2014-04-02 Thread vir . candy
The smaller index has 1 million lines of data. They are the lines filtered by "prefix":{"ip":"100.1"} from the bigger one. On Wednesday, 2 April 2014 at 16:04:27 UTC+8, vir@gmail.com wrote: > > I do an *aggregation* search on my index(*6 nodes*). There are about *200 > million lines* of data(port scanning). E

Re: aggregation with conditions

2014-04-01 Thread lvalbuena
On Tuesday, April 1, 2014 6:17:30 PM UTC+8, lval...@egg.ph wrote: > > Hi, > > I have 2 cases. > > Given the structure > { >email:value, >points:value > } > > Case 1: > I have 1000 rows, where multiple rows can have the same value for the > email field. > {"email":"s...@email.com","point

Re: aggregation with conditions

2014-04-01 Thread Binh Ly
The first one is not available, however a terms aggregation and sort by _count asc will bubble up the least frequent terms (emails) and you can filter yourself which ones you want. The second one sounds like a simple terms aggregation on the email field (just make sure the email field is not_an

Re: Aggregation on big data

2014-03-31 Thread vir . candy
Thank you! On Monday, 31 March 2014 at 17:26:09 UTC+8, Adrien Grand wrote: > > This kind of use-case requires memory for two main reasons: > - field data, > - counting values (aggregations). > > Field data memory usage can be reduced by using doc values[1] which will > effectively store data on disk instead of

Re: Aggregation on big data

2014-03-31 Thread Adrien Grand
This kind of use-case requires memory for two main reasons: - field data, - counting values (aggregations). Field data memory usage can be reduced by using doc values[1] which will effectively store data on disk instead of memory and rely on the filesystem cache. Aggregations memory usage is mo

Re: Aggregation for query facets?

2014-02-27 Thread Binh Ly
You'll want the filter aggregation: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filter-aggregation.html

Re: Aggregation on parent/child documents

2014-02-25 Thread Augusto Uehara
We run 4 instances of ES 1.0.0 using 30G for JVM. We run 64-bit OpenJDK 1.7.0_25 on ubuntu servers. $ ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signa

Re: Aggregation on parent/child documents

2014-02-21 Thread Augusto Uehara
Thank you for your reply Binh, I've tried the bucket filter, but had problems with parent/child relationships. I've modified the multi-search query to use type = count, but the performance didn't change much, it took about 40 seconds to return the results. It was almost 20% faster indeed, but

Re: Aggregation on parent/child documents

2014-02-21 Thread Binh Ly
I'm wondering if the filter aggregation will work for you: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filter-aggregation.html However, it does not support parent child, but if you have the children embedded directly inside the parent document

Re: Aggregation question

2014-02-21 Thread mooky
Excellent. Thanks! On Tuesday, 18 February 2014 15:28:32 UTC, Binh Ly wrote: > > Yes, the correct way would be to index intentLocationDescription as a > multi-field. You don't have to introduce it as multiple fields in your > source document. All you need to do is on the ES mapping, you set that

Re: Aggregation question

2014-02-18 Thread Binh Ly
Yes, the correct way would be to index intentLocationDescription as a multi-field. You don't have to introduce it as multiple fields in your source document. All you need to do is on the ES mapping, you set that field to a multi-field, once with whatever analyzer you want, and the other as not_an
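
A sketch of the mapping Binh describes, using the ES 1.x `fields` syntax for multi-fields. The sub-field name `raw` and the helper are assumptions; the thread only names `intentLocationDescription`.

```python
def multi_field_mapping(field, sub_field="raw"):
    """ES 1.x mapping: analyzed main field plus a not_analyzed sub-field."""
    return {
        field: {
            "type": "string",  # analyzed by default, used for full-text search
            "fields": {
                # Exact, untokenized copy, used for aggregations and sorting.
                sub_field: {"type": "string", "index": "not_analyzed"}
            },
        }
    }


mapping = multi_field_mapping("intentLocationDescription")

# The aggregation then targets the sub-field via dotted notation.
agg = {"by_location": {"terms": {"field": "intentLocationDescription.raw"}}}
print(mapping, agg)
```

The source document still carries a single `intentLocationDescription` value; the second representation exists only in the index.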

Re: Aggregation on nested document always takes 2-3 seconds?

2014-02-13 Thread Adrien Grand
Very likely this problem is not related to nested documents but to fielddata loading because of the "integer" field. Field data is a column-oriented view of the data that is, by default, lazily loaded from the inverted index on the first time that it is needed, and then cached until the end of life

Re: Aggregation-"sql like" optimization guidance with elasticsearch 1.0.0

2014-01-31 Thread Maxime Nay
For test purposes we currently have an index containing about 50M docs, distributed on a 4 nodes cluster, with 16 shards. Do you think that drastically increasing the number of shards would help ? On Friday, January 31, 2014 10:14:08 AM UTC-8, Binh Ly wrote: > > Maxime, forgot to mention, you ca

Re: Aggregation-"sql like" optimization guidance with elasticsearch 1.0.0

2014-01-31 Thread Binh Ly
Maxime, forgot to mention, you can also distribute the load out by increasing the shard count and adding more nodes. But precomputing the field is probably the quickest way to improve that performance. Keep in mind that unlike SQL, ES aggregations may return approximate metrics if you have more

Re: Aggregation-"sql like" optimization guidance with elasticsearch 1.0.0

2014-01-31 Thread Maxime Nay
Unfortunately, we have about 8 different fields that could serve as aggregation key, and a lot of potential combinations between these fields. Thus, pre-computing all these combinations doesn't seem to be a viable solution. On Friday, January 31, 2014 7:52:40 AM UTC-8, Binh Ly wrote: > > Maxime,

Re: Aggregation Module - value_count problem

2014-01-31 Thread Jun Ohtani
Hi, I think “value_count” counts the number of values, as terms, per doc. Your first question: Why is “k_count” 22 instead of 20? Your example field “k” is analyzed using the standard tokenizer. 2nd doc’s “k” is “a few” and 5th doc’s “k” is “next year”. These texts are divided by the standard to
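
The arithmetic behind 22 versus 20 can be reproduced with a small stand-in. Whitespace splitting is used here as a rough approximation of the standard tokenizer, and only the values "a few" and "next year" come from the thread; the other 18 single-word values are hypothetical filler.

```python
def value_count(docs, field, analyzed=True):
    """Count values the way value_count sees them.

    On an analyzed field every token is a value; on a not_analyzed field
    the whole string is a single value.
    """
    total = 0
    for doc in docs:
        value = doc[field]
        total += len(value.split()) if analyzed else 1
    return total


# 18 hypothetical single-word docs plus the two multi-word docs from the thread.
docs = [{"k": "today"}] * 18 + [{"k": "a few"}, {"k": "next year"}]

print(value_count(docs, "k"))                  # 22
print(value_count(docs, "k", analyzed=False))  # 20
```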

Re: Aggregation-"sql like" optimization guidance with elasticsearch 1.0.0

2014-01-31 Thread Binh Ly
Maxime, your bottleneck is likely in the script part. It has to dynamically compute that per doc just like in sql. However, if you can precompute that at index time (for example, introduce a field that contains the value of date-hour-id), you should be able to improve that aggregation time signi
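
One way to precompute such a key before indexing is sketched below. The field names `date` and `id`, the output field `date_hour_id`, and the key format are all assumptions, since the thread does not show the documents.

```python
from datetime import datetime


def add_bucket_key(doc, date_field="date", id_field="id"):
    """Return a copy of the document with a precomputed date-hour-id key.

    A terms aggregation on the new field replaces the per-document script.
    """
    ts = datetime.strptime(doc[date_field], "%Y-%m-%dT%H:%M:%S")
    enriched = dict(doc)
    enriched["date_hour_id"] = f"{ts:%Y-%m-%d-%H}-{doc[id_field]}"
    return enriched


# Hypothetical document.
doc = add_bucket_key({"date": "2014-01-31T10:14:08", "id": 42})
print(doc["date_hour_id"])  # 2014-01-31-10-42
```

As Maxime notes in his reply, this only pays off when the set of key combinations is small enough to precompute.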

Re: Aggregation module - value_count clarification/problem

2014-01-24 Thread watsindename
The link to gist is https://gist.github.com/shivprak/8611922#file-es-agg-test-1-sh On Friday, January 24, 2014 8:56:43 PM UTC-8, watsin...@gmail.com wrote: > > Hi, > > I am trying to get the unique number of values for a given field. From my > understanding of > "value_count