Re: Inconsistent responses from aggregations (ES1.0.0RC1)
Hi Adrien,

Good news! The problem is solved. Can't wait for the release containing the fix, but for now I will use my own build :)

On Thursday, February 6, 2014 5:25:11 PM UTC+1, Nils Dijk wrote:

Yay! I will try this sometime tomorrow. Thanks for fixing it, much appreciated! It seems it was difficult to find, since it only happens when a 'page' gets recycled internally.

On Thursday, February 6, 2014 3:53:46 PM UTC+1, Adrien Grand wrote:

It took me some time but I finally managed to understand the cause and to write a fix: https://github.com/elasticsearch/elasticsearch/pull/5039

Thanks very much for reporting this and for your help reproducing and debugging this issue!

On Thu, Feb 6, 2014 at 2:08 PM, Nils Dijk m...@thanod.nl wrote:

Good, it is always easier to fix when it's on your own machine.

I tried your .patch, but it did not fix the problem. I also tried your config; I did not really get where to put the setting, so I ended up putting it on the index. This also did not fix the problem. I also tried a bigger shard_size in the agg. Yet again no difference.

To test some more around aggs I loaded a complete production set into both my local ES RC2 (OS X) and one on a Linux server with ES RC2.

I have a hunch it could be in the sorting of the terms. When I do a sub agg and sort on it I see all kinds of weird results that are even lower than the ones I see when I do not sort on the sub agg.

If you need me to test some more, I am keeping a close watch on this thread.

-- Nils

On Thursday, February 6, 2014 1:19:40 PM UTC+1, Adrien Grand wrote:

OK, I finally managed to reproduce it on both Mac and Linux by increasing the number of shards to 20. Will keep you posted.

On Thu, Feb 6, 2014 at 9:29 AM, Adrien Grand adrien...@elasticsearch.com wrote:

On Wed, Feb 5, 2014 at 6:42 PM, Nils Dijk m...@thanod.nl wrote:

> Ok, I was preparing to do a long bisecting session, but I started with the commit you highlighted below (4271d573d60f39564c458e2d3fb7c14afb82d4d8) and the commit before that one (6481a2fde858520988f2ce28c02a15be3fe108e4). And as it turns out, it is the breaking commit.
>
> If I build the commit of yours from December 3 it fails my test suite. If I build the commit of Nik from January 6 it still passes my test.
>
> I also tried reverting your commit on the v1.0.0.RC1 tag, but it gave me all kinds of conflicts, so I could not test RC1 without your commit.
>
> If you would like I can still do a full bisect, but I suspect I will end up at your commit since I tested that one and the one before.
>
> Would it be possible for you to send a .patch without the unsafe stuff, so I can apply that to a commit and make a build?

Thanks Nils for your work, this is much appreciated. Here is a simple patch attached that short-circuits the use of Unsafe to do string comparisons. Maybe you could also try to set the `cache.recycler.page.type` setting to `none` to see if that changes anything.

-- Adrien Grand
Re: Inconsistent responses from aggregations (ES1.0.0RC1)
Excellent news, thanks for checking! RC2 was the last release candidate, so the next release containing the fix should be 1.0 GA. Hopefully it will be out soon.

On Fri, Feb 7, 2014 at 12:39 PM, Nils Dijk m...@thanod.nl wrote:

Hi Adrien, Good news! The problem is solved. Can't wait for the release containing the fix, but for now I will use my own build :)

-- Adrien Grand
Re: Inconsistent responses from aggregations (ES1.0.0RC1)
Yay! I will try this sometime tomorrow. Thanks for fixing it, much appreciated! It seems it was difficult to find, since it only happens when a 'page' gets recycled internally.

On Thursday, February 6, 2014 3:53:46 PM UTC+1, Adrien Grand wrote:

It took me some time but I finally managed to understand the cause and to write a fix: https://github.com/elasticsearch/elasticsearch/pull/5039

Thanks very much for reporting this and for your help reproducing and debugging this issue!
Re: Inconsistent responses from aggregations (ES1.0.0RC1)
Sorry, but your file at https://gist.github.com/8803745.git is broken: it contains invalid JSON, so it cannot be processed. It would be helpful to provide a script with escaped JSON in bulk format.

From what I suspect, you do not use the keyword analyzer for faceting/agg'ing, so you will get all kinds of unwanted results. Whether that explains your fluctuating aggs results, I cannot tell. It is rather uncommon to use facets and aggs side by side.

Jörg

On Tue, Feb 4, 2014 at 3:01 PM, Nils Dijk m...@thanod.nl wrote:

To follow up, I have a contained test suite at https://gist.github.com/thanodnl/8803745 for this problem. It contains two files:

1. aggsbug.sh
2. aggsbug.json

The .json file contains ~1M newline-separated documents to load into the database; I was not able to create a curl request to load them directly into the index. The .sh file (https://gist.github.com/thanodnl/8803745/raw/aggsbug.sh) contains the instructions for recreating this behavior.

I have run these against the following versions:

1. 1.0.0.Beta2
2. 1.0.0.RC1
3. 1.0.0-SNAPSHOT as compiled from the git 1.0 branch on commit 0f8b41ffad9b5ecdfd543d7c73edcf404e6fc763

When run on 1.0.0.Beta2 it gives the same output consistently when I run the _search over and over again. When run on 1.0.0.RC1 it gives me multiple different outcomes, comparable to the numbers I posted earlier in the thread. When run on 1.0.0-SNAPSHOT it behaves the same as on 1.0.0.RC1.

That it still worked on 1.0.0.Beta2 proves to me that it is a bug that got into RC1. I could not find any related ticket on the issues page of the GitHub repository.

Hopefully this is enough information to recreate the problem. The JSON file is quite big and could cause problems when you open the gist in a browser. A local clone of the gist will work best:

    $ git clone https://gist.github.com/8803745.git

I do not really know how to move on from here. Do you want me to open an issue for this problem at github.com/elasticsearch/elasticsearch? It would be nice to fix this problem before the release of 1.0.0, since that is the first release containing the aggregations for analytics.

On Tuesday, February 4, 2014 12:31:10 PM UTC+1, Nils Dijk wrote:

I've loaded the same dataset in ES 1.0.0.Beta2 with the same index configuration as in the topic start. Now the numbers are consistent if I call the same aggregation multiple times in a row AND the numbers match those of the facets.

This leads me to the conclusion that something broke between Beta2 and RC1!

I would like to test this on master, but I could not find any nightly builds of elasticsearch. Is there a location where they are stored, or should I compile it myself?

On Friday, January 31, 2014 6:43:07 PM UTC+1, Nils Dijk wrote:

Hi Binh Ly,

Thanks for the response. I'm aware that the numbers are not exact (hence the link to issue #1305 in my initial post), and I have been advocating slightly incorrect numbers with my colleagues and customers for some time already, to prepare them for the moment we provide analytics with ES.

But what bothers me is that they are *inconsistent*. If you look at my gist you see that I ran the same aggs 3 times right after each other. If we just look at the top item we see the following results:

1. { "key": "totaltrafficbos", "doc_count": 2880 }
2. { "key": "totaltrafficbos", "doc_count": 2552 }
3. { "key": "totaltrafficbos", "doc_count": 2179 }

These results were taken within seconds, without any change to the number of documents in the index. If I run the aggs even more often, the result rotates between a handful of numbers.

Is this also behavior one would expect from the aggs? And if so, why do the facets show the same number over and over again?

Anyway, I will try to work my way through the aggs code this weekend to get a better grasp of what we can and cannot do with it.

-- Nils

On Friday, January 31, 2014 6:18:43 PM UTC+1, Binh Ly wrote:

Nils,

This is just the nature of splitting data around in shards. Actually the terms facet has the same limitation (i.e. it will also give approximate counts). Neither the terms facet nor the terms aggregation is better or worse than the other: they are both approximations (using different implementations). It is correct that if you put all your data in 1 shard, then all the counts are exact. If you need to shard, you can increase the shard_size parameter inside the terms aggregation to improve accuracy. Play with that number until it suits your purposes, but the important thing is that they are just approximations, more so the more documents you have in the index. So just don't expect absolute numbers from them if you have more than 1 shard.

    { "size": 0, "aggs": { "a": { "terms": { "field": "actor.displayName", "shard_size": 1 } } } }
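To make the shard_size discussion above concrete, here is a minimal sketch of how merging per-shard top lists can lose counts. It is a toy model under stated assumptions, not Elasticsearch's implementation: the function names `shard_top_terms` and `terms_agg` and the three-shard data set are invented for illustration. Each shard reports only its `shard_size` most frequent terms, and the coordinating step sums what it receives.

```python
from collections import Counter

def shard_top_terms(docs, shard_size):
    """Return the shard_size most frequent terms on one shard, with counts."""
    counts = Counter(docs)
    # Sort by count descending, then by term, so ties break deterministically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:shard_size]

def terms_agg(shards, size, shard_size):
    """Merge the per-shard top lists and return the global top `size` terms."""
    merged = Counter()
    for docs in shards:
        for term, count in shard_top_terms(docs, shard_size):
            merged[term] += count
    ranked = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:size]

# Hypothetical toy data: three shards, one term per document.
shards = [
    ["a"] * 5 + ["b"] * 4 + ["c"] * 3,
    ["b"] * 5 + ["c"] * 4 + ["a"] * 3,
    ["c"] * 5 + ["d"] * 4 + ["a"] * 3,
]

approx = terms_agg(shards, size=2, shard_size=1)  # each shard reports one term
exact = terms_agg(shards, size=2, shard_size=4)   # each shard reports everything
print(approx)
print(exact)
```

In this model a small shard_size gives wrong counts and even the wrong top terms, but repeating the call on unchanged data always returns the same answer. That is the distinction drawn in the thread: approximation is expected, while results that change between identical requests are not.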
Re: Inconsistent responses from aggregations (ES1.0.0RC1)
Thanks. I tried to reproduce it on 1.0.0.RC2, but without success.

    curl '0:9200/aggsbug/_mapping?pretty'
    { "aggsbug" : { "mappings" : { "messages" : { "properties" : { "a" : { "type" : "string", "analyzer" : "keyword" } } } } } }

Using the keyword analyzer, the aggregation works flawlessly here, with constant results.

    curl '0:9200/aggsbug/_search?pretty' -d '
    { "size": 0, "aggs": { "a": { "terms": { "field": "a", "size": 10 } } } }
    '
    {
      "took" : 669,
      "timed_out" : false,
      "_shards" : { "total" : 1, "successful" : 1, "failed" : 0 },
      "hits" : { "total" : 1060387, "max_score" : 0.0, "hits" : [ ] },
      "aggregations" : {
        "a" : {
          "buckets" : [
            { "key" : "TotalTrafficBOS", "doc_count" : 3599 },
            { "key" : "MAI93thm", "doc_count" : 2517 },
            { "key" : "MAI90thm", "doc_count" : 2207 },
            { "key" : "MAI95thm", "doc_count" : 2207 },
            { "key" : "TotalTrafficNYC", "doc_count" : 1660 },
            { "key" : "incidentreports", "doc_count" : 1468 },
            { "key" : "NJI80thm", "doc_count" : 1180 },
            { "key" : "PAI76thm", "doc_count" : 1142 },
            { "key" : "TXI35thm", "doc_count" : 1064 },
            { "key" : "NYI87thm", "doc_count" : 1029 }
          ]
        }
      }
    }

Jörg

On Wed, Feb 5, 2014 at 2:17 PM, Nils Dijk m...@thanod.nl wrote:

Hi,

I updated the gist now with a file in bulk index format. I also split the loading from the testing phase, so you can run the test multiple times in a row. I also added a README.md with instructions for running the test.

I'm also creating a bug as stated here: http://www.elasticsearch.org/blog/0-90-11-1-0-0-rc2-released/.
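Whether a result is "constant" can be checked mechanically by repeating the same request and counting distinct responses. The sketch below is one hedged way to do that; `distinct_responses`, `flaky_query` and `stable_query` are hypothetical names, and the two stand-in functions only replay numbers quoted in this thread instead of contacting a cluster.

```python
import json

def distinct_responses(run_query, attempts=5):
    """Call run_query() repeatedly and return the distinct results seen."""
    seen = []
    for _ in range(attempts):
        # Serialize with sorted keys so equal results compare equal as strings.
        result = json.dumps(run_query(), sort_keys=True)
        if result not in seen:
            seen.append(result)
    return [json.loads(r) for r in seen]

# Stand-in for a real search call: replays the three top buckets
# reported earlier in the thread for back-to-back runs.
reported = iter([2880, 2552, 2179])

def flaky_query():
    return {"key": "totaltrafficbos", "doc_count": next(reported)}

# Stand-in for a healthy cluster that always returns the same top bucket.
def stable_query():
    return {"key": "totaltrafficbos", "doc_count": 3599}

flaky = distinct_responses(flaky_query, attempts=3)
stable = distinct_responses(stable_query, attempts=3)
print(len(flaky), len(stable))
```

In real use, `run_query` would wrap the `_search` request and return only the aggregation buckets, since fields such as `took` vary between otherwise identical responses.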
Re: Inconsistent responses from aggregations (ES1.0.0RC1)
I just installed 1.7u25 on a Mac with Mavericks to try to reproduce the issue, but without success (on 1.0.0-RC2).

On Wed, Feb 5, 2014 at 4:49 PM, Nils Dijk m...@thanod.nl wrote:

Hi Adrien,

I'm using OS X (Mavericks) and Java (having the issue):

    $ java -version
    java version "1.7.0_25"
    Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
    Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)

My colleague is running OS X (Lion) and Java (having the issue):

    $ java -version
    java version "1.6.0_26"
    Java(TM) SE Runtime Environment (build 1.6.0_26-b03-383-11D50)
    Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02-383, mixed mode)

A server soon to be used for production runs Ubuntu 12.04 LTS with Java (not having the issue):

    $ java -version
    java version "1.7.0_45"
    Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
    Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)

Could this be an issue with Java on OS X then?

On Wednesday, February 5, 2014 4:38:36 PM UTC+1, Adrien Grand wrote:

I didn't manage to reproduce the issue locally either. What JVM / OS are you using? (RC1 introduced Unsafe to perform String comparisons in terms aggs, so I'm wondering if that could be related to your issue.)

On Wed, Feb 5, 2014 at 4:33 PM, Nils Dijk m...@thanod.nl wrote:

I only tested it with 1 and with 10 shards. Indeed, with 1 shard it did not have any issues; with 10 shards it has issues all the time. I also had a colleague test it with the two scripts in the gist (which use 10 shards).

Also, I do not think the analyzer _should_ have an impact, since it would only index more terms on that field if it tokenizes it.

Can you use aggsbug.load.sh to load the data, and then use aggsbug.test.sh to run the test? It should give you a field analyzed with the default analyzer and 10 shards.

I'll try out some different analyzers, and load the data in 3 shards now to see if that changes things.

On Wednesday, February 5, 2014 4:02:54 PM UTC+1, Jörg Prante wrote:

Also the same with shards = 3 and analyzer = standard. Stable results.

    {
      "took" : 240,
      "timed_out" : false,
      "_shards" : { "total" : 3, "successful" : 3, "failed" : 0 },
      "hits" : { "total" : 1060387, "max_score" : 0.0, "hits" : [ ] },
      "aggregations" : {
        "a" : {
          "buckets" : [
            { "key" : "totaltrafficbos", "doc_count" : 3599 },
            { "key" : "mai93thm", "doc_count" : 2517 },
            { "key" : "mai90thm", "doc_count" : 2207 },
            { "key" : "mai95thm", "doc_count" : 2207 },
            { "key" : "totaltrafficnyc", "doc_count" : 1660 },
            { "key" : "confessions", "doc_count" : 1534 },
            { "key" : "incidentreports", "doc_count" : 1468 },
            { "key" : "nji80thm", "doc_count" : 1180 },
            { "key" : "pai76thm", "doc_count" : 1142 },
            { "key" : "txi35thm", "doc_count" : 379 }
          ]
        }
      }
    }

You should examine your log files to see whether your ES cluster was able to process all the docs correctly while indexing or searching; maybe you encountered OOMs or other subtle issues.

Jörg
Re: Inconsistent responses from aggregations (ES1.0.0RC1)
Thanks for the effort. I tried running on 1.7.0_51, and it gave me the same issue.

I was trying to find out if I could disable these unsafe string comparisons, but could not really find where that should be disabled. Is there an easy way for me to switch back that change? Do you know in which commit this was changed, so I can revert that commit in my local clone of the repo and do a build to see if the problem is solved that way?

For reproducing, I do not really see what could impact this besides the OS and Java version. And the other OS X machine had a different version of both the OS AND Java, and still showed the same results.

I am however a bit more relaxed now that the issue does not show up on our production machines; that would have killed the ES migration we are currently doing. It is unfortunate, though, that we cannot test our stuff on our development machines (all showing the issue here).

Do you have any thoughts on what could be different between our setups, such that we are having the issue and you don't?

To make sure: did you use my scripts to load the data? Jörg seemed to load the data in a different way (different shard count and different mapping), which did not show the issue.

On Wednesday, February 5, 2014 5:40:10 PM UTC+1, Adrien Grand wrote:

I just installed 1.7u25 on a mac with maverick to try to reproduce the issue, but without success (on 1.0.0-RC2).
Re: Inconsistent responses from aggregations (ES1.0.0RC1)
On Wed, Feb 5, 2014 at 6:01 PM, Nils Dijk m...@thanod.nl wrote:

> I was trying to find out if I could disable these unsafe string comparisons, but could not really find where that should be disabled. Is there an easy way for me to switch back that change? Do you know in which commit this was changed, so I can revert that commit in my local clone of the repo and do a build to see if the problem is solved that way?

Sure, this was changed in 4271d573d60f39564c458e2d3fb7c14afb82d4d8. However, I also just read that you can't reproduce the issue with one shard, although this shouldn't be relevant.

> For reproducing, I do not really see what could impact this besides the OS and Java version. And the other OS X machine had a different version of both the OS AND Java, and still showed the same results.
>
> I am however a bit more relaxed now that the issue does not show up on our production machines; that would have killed the ES migration we are currently doing. It is unfortunate, though, that we cannot test our stuff on our development machines (all showing the issue here).
>
> Do you have any thoughts on what could be different between our setups, such that we are having the issue and you don't?

I wish I had ideas! :-) Since the issue seems to reproduce consistently for you, something that would be super helpful would be to git bisect in order to find the commit that broke aggregations in your setup (the Beta2 commit is 296cfbe3 and the RC1 commit is 2c8ee3fb).

> To make sure: did you use my scripts to load the data? Jörg seemed to load the data in a different way (different shard count and different mapping), which did not show the issue.

Yes, I used your scripts, exactly as described in the README.

-- Adrien Grand
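For readers unfamiliar with the bisect idea suggested above: it is a binary search over history between a known-good and a known-bad commit. The sketch below models only that search; it is not git's implementation, and `first_bad_commit` plus the 200-commit history are hypothetical. It assumes a linear history in which every commit after the first bad one is also bad.

```python
def first_bad_commit(commits, is_bad):
    """Binary search for the first bad commit.

    Assumes commits are ordered oldest to newest, commits[0] is good,
    commits[-1] is bad, and every commit after the first bad one is bad.
    Returns the first bad commit and the number of commits tested.
    """
    good, bad = 0, len(commits) - 1
    tested = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tested += 1
        if is_bad(commits[mid]):
            bad = mid   # the breakage is at mid or earlier
        else:
            good = mid  # the breakage is after mid
    return commits[bad], tested

# Hypothetical history of 200 commits in which commit 137 introduces the bug.
history = [f"c{i:03d}" for i in range(200)]
culprit, builds = first_bad_commit(history, lambda c: int(c[1:]) >= 137)
print(culprit, builds)
```

Each step costs one build plus one test run, so a range of a few hundred commits needs roughly eight builds instead of hundreds. In practice `is_bad` would be "build this commit and run the reproduction script".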
Re: Inconsistent responses from aggregations (ES1.0.0RC1)
Ok, I was preparing to do a long bisecting session, but I started with the commit you highlighted below (4271d573d60f39564c458e2d3fb7c14afb82d4d8) and the commit before that one (6481a2fde858520988f2ce28c02a15be3fe108e4). And as it turns out, it is the breaking commit.

If I build the commit of yours from December 3 it fails my test suite. If I build the commit of Nik from January 6 it still passes my test.

I also tried reverting your commit on the v1.0.0.RC1 tag, but it gave me all kinds of conflicts, so I could not test RC1 without your commit.

If you would like I can still do a full bisect, but I suspect I will end up at your commit since I tested that one and the one before.

Would it be possible for you to send a .patch without the unsafe stuff, so I can apply that to a commit and make a build?

Thanks in advance,

On Wednesday, February 5, 2014 6:10:35 PM UTC+1, Adrien Grand wrote:

On Wed, Feb 5, 2014 at 6:01 PM, Nils Dijk m...@thanod.nl wrote:

> Do you know on what commit this was changed so I can revert that commit in my local clone of the repo, do a build to see if the problem is solved that way?

Sure, this was changed in 4271d573d60f39564c458e2d3fb7c14afb82d4d8
Re: Inconsistent responses from aggregations (ES1.0.0RC1)
Hi Jörg,

Glad you could reproduce with my updated gist. cb.

On Wednesday, February 5, 2014 8:18:39 PM UTC+1, Jörg Prante wrote:

Nils, I ran the test on my Mac, and I can reproduce the issue. And also on Linux. Unfortunately the Mac locked up and I had to cold reboot, so my copy/paste logs with all the numbers are gone, but anyway.

As a matter of fact, your aggregates demo is daunting. On the Mac, it shows different counts even between the first and the subsequent executions. The counts of the first are lower, and even different terms show up. On Linux, I do not observe different counts between runs.

The issue you describe for Mac is the issue I discussed here.

But what's more bothering is that I observed different results depending on the shard count, and that is on both Mac and Linux. The higher a bucket ranks by hit count, the more the counts match; only the lower buckets differ, so the deviating counts are somewhat hard to notice.

That the counts differ when you change the number of shards is a long-known problem of elasticsearch and was also a problem in faceting. A long thread about the nature of this problem can be found here: https://github.com/elasticsearch/elasticsearch/issues/1305. It is an issue which you can circumvent easily with one of two options:

1. Use the term you aggregate on as a routing key. This forces the same tokens into the same shard, and thus always returns the exact count. However, this only works if you do this kind of analytics over 1 field.
2. Increase the shard_size for the terms aggregation. This way the individual shards create bigger lists, which then have more chance of containing the actual top terms. http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/search-aggregations-bucket-terms-aggregation.html#_size_amp_shard_size

I use Java 8 FCS, but since you observe this issue also on Java 7, I think it is not an issue of Java 8. And it's on both Mac and Linux, but with different symptoms.

This makes Mac OS X the only factor occurring multiple times. And it happens on all Java versions; I tested both 1.7 and 1.6. It is unfortunate that Adrien wasn't able to reproduce it on OSX.

ES 1.0.0.RC2
Mac OS X 10.8.5
Darwin Jorg-Prantes-MacBook-Pro.local 12.5.0 Darwin Kernel Version 12.5.0: Sun Sep 29 13:33:47 PDT 2013; root:xnu-2050.48.12~1/RELEASE_X86_64 x86_64
java version 1.8.0
Java(TM) SE Runtime Environment (build 1.8.0-b128)
Java HotSpot(TM) 64-Bit Server VM (build 25.0-b69, mixed mode)
G1GC enabled

ES 1.0.0.RC2
RHEL 6.3
Linux zephyros 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
java version 1.8.0
Java(TM) SE Runtime Environment (build 1.8.0-b128)
Java HotSpot(TM) 64-Bit Server VM (build 25.0-b69, mixed mode)
G1GC enabled

Here are two Linux examples. Note, the last three terms and counts are different.

shards=10

{
  "took" : 143,
  "timed_out" : false,
  "_shards" : { "total" : 10, "successful" : 10, "failed" : 0 },
  "hits" : { "total" : 1060387, "max_score" : 0.0, "hits" : [ ] },
  "aggregations" : {
    "a" : {
      "buckets" : [
        { "key" : "totaltrafficbos", "doc_count" : 3599 },
        { "key" : "mai93thm", "doc_count" : 2517 },
        { "key" : "mai90thm", "doc_count" : 2207 },
        { "key" : "mai95thm", "doc_count" : 2207 },
        { "key" : "totaltrafficnyc", "doc_count" : 1660 },
        { "key" : "confessions", "doc_count" : 1534 },
        { "key" : "incidentreports", "doc_count" : 1468 },
        { "key" : "nji80thm", "doc_count" : 1071 },
        { "key" : "pai76thm", "doc_count" : 1039 },
        { "key" : "txi35thm", "doc_count" : 357 }
      ]
    }
  }
}

shards=5

{
  "took" : 172,
  "timed_out" : false,
  "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 },
  "hits" : { "total" : 1060387, "max_score" : 0.0, "hits" : [ ] },
  "aggregations" : {
    "a" : {
      "buckets" : [
        { "key" : "totaltrafficbos", "doc_count" : 3599 },
        { "key" : "mai93thm", "doc_count" : 2517 },
        { "key" : "mai90thm", "doc_count" : 2207 },
        { "key" : "mai95thm", "doc_count" : 2207 },
        { "key" : "totaltrafficnyc", "doc_count" : 1660 },
        { "key" : "confessions", "doc_count" : 1534 },
        { "key" : "incidentreports", "doc_count" : 1468 },
        { "key" : "nji80thm", "doc_count" : 1180 },
        { "key" : "pai76thm", "doc_count" : 936 },
        { "key" : "nji78thm", "doc_count" : 422 }
      ]
    }
  }
}

Jörg
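To make the two workarounds from this message concrete, here is a small Python simulation of how a terms aggregation merges per-shard top lists. It is only a sketch of the general idea, not Elasticsearch's actual implementation: the function name `terms_agg`, the terms `a`, `b`, `c` and all counts are invented for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple


def terms_agg(shards: List[Dict[str, int]], size: int, shard_size: int) -> List[Tuple[str, int]]:
    """Merge per-shard top-`shard_size` term lists and return the global top `size`."""
    merged: Counter = Counter()
    for shard in shards:
        # Each shard only reports its own top `shard_size` terms.
        top = sorted(shard.items(), key=lambda kv: (-kv[1], kv[0]))[:shard_size]
        merged.update(dict(top))
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))[:size]


# Hypothetical per-shard term counts. True totals: a=19, b=11, c=11.
spread = [
    {"a": 10, "b": 6, "c": 5},  # shard 0
    {"a": 9, "c": 6, "b": 5},   # shard 1
]

# Small shard_size: shard 1 never reports "b", so "b" is undercounted.
print(terms_agg(spread, size=2, shard_size=2))  # [('a', 19), ('b', 6)]

# Option 2: a larger shard_size lets every shard report "b".
print(terms_agg(spread, size=2, shard_size=3))  # [('a', 19), ('b', 11)]

# Option 1: routing by term puts all documents of a term on one shard,
# so each reported count is already the full count.
routed = [
    {"a": 19, "c": 11},  # shard 0
    {"b": 11},           # shard 1
]
print(terms_agg(routed, size=2, shard_size=2))  # [('a', 19), ('b', 11)]
```

Note that this kind of error is deterministic for a fixed index layout: it explains counts that differ between 5 and 10 shards, but not counts that change between identical runs.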
Re: Inconsistent responses from aggregations (ES1.0.0RC1)
I've loaded the same dataset in ES1.0.0.Beta2 with the same index configuration as in the topic start. However, now the numbers are consistent if I call the same aggregation multiple times in a row AND the numbers match the numbers of the facets.

This leads me to the conclusion that something broke between Beta2 and RC1!

I would like to test this on master, but I could not find any nightly builds of elasticsearch. Is there a location where they are stored, or should I compile it myself?

On Friday, January 31, 2014 6:43:07 PM UTC+1, Nils Dijk wrote:

Hi Binh Ly,

Thanks for the response. I'm aware that the numbers are not exact (hence the link to issue #1305 in my initial post), and I have been advocating slightly incorrect numbers with my colleagues and customers for some time already, to prepare them for the moment we provide analytics with ES.

But what bothers me is that they are *inconsistent*. If you look at my gist you see that I ran the same aggs 3 times right after each other. If we just look at the top item we see the following results:

1. { "key": "totaltrafficbos", "doc_count": 2880 }
2. { "key": "totaltrafficbos", "doc_count": 2552 }
3. { "key": "totaltrafficbos", "doc_count": 2179 }

These results were taken within seconds of each other, without any change to the number of documents in the index. If I run the query even more often, the count rotates between a handful of numbers.

Is this also behavior one would expect from the aggs? And if so, why do the facets show the same number over and over again?

Anyway, I will try to work my way through the aggs code this weekend to get a better grasp of what we can and cannot do with it.

-- Nils

On Friday, January 31, 2014 6:18:43 PM UTC+1, Binh Ly wrote:

Nils, This is just the nature of splitting data around in shards. Actually the terms facet has the same limitations (i.e. it will also give approximate counts). Neither the terms facet nor the terms aggregation is better or worse than the other - they are both approximations (using different implementations).

It is correct that if you put all your data in 1 shard, then all the counts are exact. If you need to shard, you can increase the shard_size parameter inside the terms aggregation to improve accuracy. Play with that number until it suits your purposes, but the important thing is that the counts become approximations as the number of documents in the index grows, so just don't expect absolute numbers from them if you have more than 1 shard.

{ "size": 0, "aggs": { "a": { "terms": { "field": "actor.displayName", "shard_size": 1 } } } }
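The distinction drawn in this message, approximate but stable counts versus counts that change between identical runs, can be checked mechanically. The following Python sketch assumes a hypothetical `run_query` callable that returns a parsed search response shaped like the ones posted in this thread (aggregation name `a`). Here it is fed by a stub that replays the three doc_count values reported above; a real check would send the same `_search` request each time.

```python
from typing import Callable, Dict, Iterable, List


def distinct_top_counts(run_query: Callable[[], Dict], runs: int = 3) -> List[int]:
    """Run the same aggregation `runs` times and collect the distinct
    doc_count values seen for the top bucket of aggregation "a"."""
    seen: List[int] = []
    for _ in range(runs):
        response = run_query()
        top = response["aggregations"]["a"]["buckets"][0]
        if top["doc_count"] not in seen:
            seen.append(top["doc_count"])
    return seen


def replay(counts: Iterable[int]) -> Callable[[], Dict]:
    """Stub: returns a run_query that replays the given top-bucket counts."""
    it = iter(counts)

    def run_query() -> Dict:
        bucket = {"key": "totaltrafficbos", "doc_count": next(it)}
        return {"aggregations": {"a": {"buckets": [bucket]}}}

    return run_query


# The three values reported for RC1 in this thread.
rc1 = distinct_top_counts(replay([2880, 2552, 2179]))
print(rc1, "consistent" if len(rc1) == 1 else "inconsistent")
```

Sharding alone should yield a single distinct value on an unchanged index, however inexact; more than one value points at a bug rather than at the approximation.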
Re: Inconsistent responses from aggregations (ES1.0.0RC1)
To follow up, I have a contained test suite at https://gist.github.com/thanodnl/8803745 for this problem. It contains two files:

1. aggsbug.sh
2. aggsbug.json

The .json file contains ~1M newline-separated documents to load into the database; I was not able to create a curl request to load them directly into the index. The .sh file (https://gist.github.com/thanodnl/8803745/raw/aggsbug.sh) contains the instructions for recreating this behavior.

I have run these against the following versions:

1. 1.0.0.Beta2
2. 1.0.0.RC1
3. 1.0.0-SNAPSHOT as compiled from the git 1.0 branch on commit 0f8b41ffad9b5ecdfd543d7c73edcf404e6fc763

When run on 1.0.0.Beta2 it gives the same output consistently when I run the _search over and over again. When run on 1.0.0.RC1 it gives me multiple different outcomes, comparable to the numbers I posted earlier in the thread. When run on 1.0.0-SNAPSHOT it behaves the same as on 1.0.0.RC1.

That it was still working on 1.0.0.Beta2 proves to me that it is a bug that got into RC1. I could not find any related ticket on the issues page of the github repository.

Hopefully this is enough information to recreate the problem. The json file is quite big and could cause problems when you open the gist in a browser. A local clone of the gist will work best:

$ git clone https://gist.github.com/8803745.git

I do not really know how to move on from here. Do you want me to open an issue for this problem at github.com/elasticsearch/elasticsearch?

It would be nice to fix this problem before a release of 1.0.0 since that is the first release containing the aggregations for analytics.
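For readers who want to load a newline-separated document file like the one described in this message, one option is to turn it into `_bulk` request bodies, where each document line is preceded by an action line. The Python sketch below only builds those bodies and does not send them. The index name `aggsbug`, the type name `doc` and the sample documents are assumptions made for illustration; how aggsbug.sh actually loads the data is not shown in this thread.

```python
import json
from typing import Iterable, Iterator, List


def bulk_bodies(lines: Iterable[str], index: str, doc_type: str,
                batch_size: int = 1000) -> Iterator[str]:
    """Yield _bulk request bodies, each holding up to `batch_size` documents."""
    action = json.dumps({"index": {"_index": index, "_type": doc_type}})
    batch: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        json.loads(line)  # fail early on a malformed document
        # One action line plus one source line, each newline-terminated.
        batch.append(action + "\n" + line + "\n")
        if len(batch) == batch_size:
            yield "".join(batch)
            batch = []
    if batch:
        yield "".join(batch)


# Hypothetical sample documents using the field name from the query above.
sample = [
    '{"actor": {"displayName": "totaltrafficbos"}}',
    '{"actor": {"displayName": "mai93thm"}}',
    '{"actor": {"displayName": "totaltrafficbos"}}',
]

for body in bulk_bodies(sample, index="aggsbug", doc_type="doc", batch_size=2):
    print(body, end="")
```

Batching keeps each request small, which matters for a file of roughly a million documents.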