Re: Inconsistent responses from aggregations (ES1.0.0RC1)

2014-02-07 Thread Nils Dijk
Hi Adrien,

Good news! The problem is solved.
Can't wait for the release containing the fix, but for now I will use my 
own build :)

On Thursday, February 6, 2014 5:25:11 PM UTC+1, Nils Dijk wrote:

 Yay!

 I will try this sometime tomorrow. Thanks for fixing, much appreciated!

 Seems like it was difficult to find, since it only happens when a 'page' 
 gets recycled internally.

 On Thursday, February 6, 2014 3:53:46 PM UTC+1, Adrien Grand wrote:

 It took me some time but I finally managed to understand the cause and to 
 write a fix:
   https://github.com/elasticsearch/elasticsearch/pull/5039

 Thanks very much for reporting this and for your help reproducing and 
 debugging this issue!


 On Thu, Feb 6, 2014 at 2:08 PM, Nils Dijk m...@thanod.nl wrote:

 Good,

 It is always easier to fix when it's on your own machine.

 I tried your .patch, but it did not fix the problem. I also tried your 
 config, although I did not really get where to put the setting; I ended up 
 putting it on the index. This also did not fix the problem.

 I also tried with a bigger shard_size in the agg. Yet again no 
 difference.

 To test some more around aggs I loaded a complete production set into 
 both my local ES RC2 (OS X) and one on a Linux server with ES RC2. I have a 
 hunch it could be in the sorting of the terms. When I do a sub agg and sort 
 on it I see all kinds of weird results that are even lower than the ones I 
 see when I do not sort on the sub agg.

 If you need me to test some more I am keeping a close watch on this 
 thread.

 -- Nils

 On Thursday, February 6, 2014 1:19:40 PM UTC+1, Adrien Grand wrote:

 OK, I finally managed to reproduce it on both mac and linux by 
 increasing the number of shards to 20, will keep you posted


 On Thu, Feb 6, 2014 at 9:29 AM, Adrien Grand adrien...@elasticsearch.com wrote:

 On Wed, Feb 5, 2014 at 6:42 PM, Nils Dijk m...@thanod.nl wrote:

 Ok, I was preparing to do a long bisecting session, but I started 
 with the commit you highlighted below 
 (4271d573d60f39564c458e2d3fb7c14afb82d4d8) 
 and the commit before that one 
 (6481a2fde858520988f2ce28c02a15be3fe108e4). And as it turns out, it is 
 the breaking commit.

 If I build the commit of yours from December 3 it fails my test suite.
 If I build the commit of Nik from January 6 it still passes my test.

 I also tried reverting your commit on the v1.0.0.RC1 tag, but it gave 
 me all kinds of conflicts so I could not test RC1 without your commit.

 If you would like I can still do a full bisect, but I suspect I would end 
 up at your commit since I tested that one, and the one before.

 Would it be possible for you to send a .patch without the unsafe 
 stuff, so I can apply that to a commit and make a build?


 Thanks Nils for your work, this is much appreciated.

 Here is a simple patch attached that short-circuits the use of Unsafe 
 to do string comparisons.

 Maybe you could also try to set the `cache.recycler.page.type` setting 
 to `none` to see if that changes anything.

 -- 
 Adrien Grand
  



 -- 
 Adrien Grand
  
  -- 
 You received this message because you are subscribed to the Google 
 Groups elasticsearch group.
 To unsubscribe from this group and stop receiving emails from it, send 
 an email to elasticsearc...@googlegroups.com.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/elasticsearch/af8e91d8-4a97-42d3-9dd5-8a980ded493e%40googlegroups.com
 .

 For more options, visit https://groups.google.com/groups/opt_out.




 -- 
 Adrien Grand
  


-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/c6acf6ac-3f47-49e4-8240-57c4c697c635%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Inconsistent responses from aggregations (ES1.0.0RC1)

2014-02-07 Thread Adrien Grand
Excellent news, thanks for checking! RC2 was the last release candidate, so
the next release containing the fix should be 1.0 GA. Hopefully it will be
out soon.


-- 
Adrien Grand



Re: Inconsistent responses from aggregations (ES1.0.0RC1)

2014-02-06 Thread Nils Dijk
Yay!

I will try this sometime tomorrow. Thanks for fixing, much appreciated!

Seems like it was difficult to find, since it only happens when a 'page' 
gets recycled internally.
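
The thread only says that the bug appears when a 'page' is recycled internally; it does not spell out the mechanism. As a generic illustration of why recycled buffers can lead to results that change from run to run, here is a toy sketch in Python. `PagePool` and `Slice` are hypothetical names, not Elasticsearch classes, and this is not a description of the actual fix in the pull request.

```python
class PagePool:
    """Toy pool that hands out fixed-size buffers and reuses released ones."""

    def __init__(self, page_size):
        self.page_size = page_size
        self.free = []

    def obtain(self):
        # Reuse a released page if one exists; note that it is NOT cleared.
        return self.free.pop() if self.free else bytearray(self.page_size)

    def release(self, page):
        self.free.append(page)


class Slice:
    """A (buffer, offset, length) reference into a page, not a copy."""

    def __init__(self, buf, offset, length):
        self.buf, self.offset, self.length = buf, offset, length

    def value(self):
        return bytes(self.buf[self.offset:self.offset + self.length])


pool = PagePool(page_size=8)

page = pool.obtain()
page[0:3] = b"abc"
view = Slice(page, 0, 3)      # still points into the page
copy = view.value()           # independent copy of the bytes

pool.release(page)            # page returns to the pool while `view` is live

reused = pool.obtain()        # the same underlying buffer is handed out again
reused[0:3] = b"xyz"

print(view.value())           # the stale reference now sees the new owner's bytes
print(copy)                   # the copy is unaffected
```

If a pool like this is shared between concurrent requests, which page gets reused depends on timing, so a stale reference need not see the same bytes on every run.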




Re: Inconsistent responses from aggregations (ES1.0.0RC1)

2014-02-05 Thread joergpra...@gmail.com
Sorry, but your file at  https://gist.github.com/8803745.git is broken, it
contains invalid JSON, so it can not be processed.

It would be helpful to provide a script with escaped JSON in bulk format.

From what I suspect, you do not use the keyword analyzer for faceting/agg'ing,
so you will get all kinds of unwanted results. Whether that explains your
fluctuating aggs results, I cannot tell. It is rather uncommon to use
facets and aggs side by side.

Jörg



On Tue, Feb 4, 2014 at 3:01 PM, Nils Dijk m...@thanod.nl wrote:

 To follow up,

 I have a contained test suite at https://gist.github.com/thanodnl/8803745 
 for this problem. It contains two files:

1. aggsbug.sh
2. aggsbug.json

 The .json file contains ~1M documents newline separated to load into the
 database, I was not able to create a curl request to load them directly
 into the index.
 The .sh file (https://gist.github.com/thanodnl/8803745/raw/aggsbug.sh)
 contains the instructions for recreating this behavior.

 I have run these against the following versions:

1. 1.0.0.Beta2
2. 1.0.0.RC1
3. 1.0.0-SNAPSHOT as compiled from the git 1.0 branch on commit
0f8b41ffad9b5ecdfd543d7c73edcf404e6fc763

 When run on 1.0.0.Beta2 it gives the same output consistently when I run
 the _search over and over again.
 When run on 1.0.0.RC1 it gives me multiple different outcomes,
 comparable to the numbers I posted earlier in the thread.
 When run on 1.0.0-SNAPSHOT it behaves the same as on 1.0.0.RC1.

 That it still was working on 1.0.0.Beta2 proves to me that it is a bug
 that got into RC1. I could not find any related ticket on the issues page
 of the github repository. Hopefully this is enough information to recreate
 the problem.

 The JSON file is quite big and could cause trouble when you open the gist 
 in a browser. Cloning the gist locally will work best:
 $ git clone https://gist.github.com/8803745.git

 I do not really know how to move on from here. Do you want me to open an
 issue for this problem at github.com/elasticsearch/elasticsearch? It
 would be nice to fix this problem before a release of 1.0.0 since that is
 the first release containing the aggregations for analytics.

 On Tuesday, February 4, 2014 12:31:10 PM UTC+1, Nils Dijk wrote:

 I've loaded the same dataset in ES1.0.0.Beta2 with the same index
 configuration as in the topic start.

 However, now the numbers are consistent if I call the same aggregation
 multiple times in a row AND the numbers match the numbers of the facets.
 This leads me to the conclusion that something broke between Beta2 and RC1!

 I would like to test this on master, but I could not find any nightly
 builds of elasticsearch. Is there a location where they are stored or
 should I compile it myself?

 On Friday, January 31, 2014 6:43:07 PM UTC+1, Nils Dijk wrote:

 Hi Binh Ly,

 Thanks for the response.

 I'm aware that the numbers are not exact (hence the link to issue #1305
 in my initial post), and have been advocating slightly incorrect numbers
 with my colleagues and customers for some time already to prepare them for
 the moment we provide analytics with ES. But what bothers me is that they
 are *inconsistent*.

 If you look at my gist you see that I ran the same aggs 3 times right
 after each other. If we just look at the top item we see the following
 results:

1. { "key": "totaltrafficbos", "doc_count": 2880 }
2. { "key": "totaltrafficbos", "doc_count": 2552 }
3. { "key": "totaltrafficbos", "doc_count": 2179 }

 These results were taken within seconds of each other, without any change to 
 the number of documents in the index. If I run them even more often, the 
 count rotates between a handful of numbers. Is this also behavior one would 
 expect from the aggs? And if so, why do the facets show the same number over 
 and over again?
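
The check described here (run the same request several times and compare the answers) can be sketched in Python. This is a minimal sketch under stated assumptions: `count_distinct_responses` and `fake_query` are hypothetical names, and `fake_query` merely replays the three counts quoted above in place of a real search call.

```python
import itertools
import json
from collections import Counter


def count_distinct_responses(run_query, attempts=10):
    """Call `run_query` repeatedly and tally the distinct results it returns."""
    seen = Counter()
    for _ in range(attempts):
        result = run_query()
        # Serialize with sorted keys so that equal results compare equal.
        seen[json.dumps(result, sort_keys=True)] += 1
    return seen


# Stand-in for a real search call: cycles through the three counts above.
_replayed_counts = itertools.cycle([2880, 2552, 2179])


def fake_query():
    return [{"key": "totaltrafficbos", "doc_count": next(_replayed_counts)}]


tally = count_distinct_responses(fake_query, attempts=6)
print(len(tally), "distinct responses in 6 attempts")
```

Against an unchanged index, more than one distinct response for the same request is the inconsistency being reported; a count that is approximate but stable would still yield exactly one.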

 Anyway, I will try to work myself through the aggs code this weekend to get 
 a better hang of what we could do with it, and what not.

 -- Nils

 On Friday, January 31, 2014 6:18:43 PM UTC+1, Binh Ly wrote:

 Nils,

 This is just the nature of splitting data across shards. Actually
 the terms facet has the same limitation (i.e. it will also give
 approximate counts). Neither the terms facet nor the terms aggregation is
 better or worse than the other; they are both approximations (using
 different implementations). It is correct that if you put all your data in
 1 shard, then all the counts are exact. If you need to shard, you can
 increase the shard_size parameter inside the terms aggregation to
 improve accuracy. Play with that number until it suits your purposes, but
 the important thing is that the counts are approximations, and more so the
 more documents you have in the index, so don't expect absolute numbers
 from them if you have more than 1 shard.

 {
   "size": 0,
   "aggs": {
     "a": {
       "terms": {
         "field": "actor.displayName",
         "shard_size": 1
       }
     }
   }
 }
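
To make the approximation concrete, here is a small Python simulation of the idea: each shard reports only its local top `shard_size` terms and a coordinating step sums what it receives. It is a sketch of the general principle with made-up data and hypothetical function names, not Elasticsearch's implementation.

```python
from collections import Counter


def shard_top_terms(docs, shard_size):
    """Return the shard-local top `shard_size` terms as (term, count) pairs."""
    counts = Counter(docs)
    # Sort by count descending, then term ascending, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:shard_size]


def merged_top_terms(shards, size, shard_size):
    """Sum the per-shard candidates and keep the overall top `size` terms."""
    merged = Counter()
    for docs in shards:
        for term, count in shard_top_terms(docs, shard_size):
            merged[term] += count
    ranked = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:size]


# Made-up data: "b" is the most frequent term overall (12 documents),
# but it is never the top term on any single shard.
shards = [
    ["a"] * 5 + ["b"] * 4,
    ["c"] * 5 + ["b"] * 4,
    ["d"] * 5 + ["b"] * 4,
]

approx = merged_top_terms(shards, size=2, shard_size=1)
exact = merged_top_terms(shards, size=2, shard_size=10)
print(approx)
print(exact)
```

With `shard_size=1` the term "b" is dropped on every shard and never reaches the merge step; with a larger `shard_size` it is counted in full. Note that in this model the approximate answer is still the same on every run for the same data.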


Re: Inconsistent responses from aggregations (ES1.0.0RC1)

2014-02-05 Thread joergpra...@gmail.com
Thanks. I tried to reproduce it on 1.0.0.RC2, but without success.

curl '0:9200/aggsbug/_mapping?pretty'
{
  "aggsbug" : {
    "mappings" : {
      "messages" : {
        "properties" : {
          "a" : {
            "type" : "string",
            "analyzer" : "keyword"
          }
        }
      }
    }
  }
}

Using the keyword analyzer, the aggregation is working flawlessly here,
with constant results.

curl '0:9200/aggsbug/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "a": {
      "terms": {
        "field": "a",
        "size": 10
      }
    }
  }
}
'
{
  "took" : 669,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 1060387,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "a" : {
      "buckets" : [ {
        "key" : "TotalTrafficBOS",
        "doc_count" : 3599
      }, {
        "key" : "MAI93thm",
        "doc_count" : 2517
      }, {
        "key" : "MAI90thm",
        "doc_count" : 2207
      }, {
        "key" : "MAI95thm",
        "doc_count" : 2207
      }, {
        "key" : "TotalTrafficNYC",
        "doc_count" : 1660
      }, {
        "key" : "incidentreports",
        "doc_count" : 1468
      }, {
        "key" : "NJI80thm",
        "doc_count" : 1180
      }, {
        "key" : "PAI76thm",
        "doc_count" : 1142
      }, {
        "key" : "TXI35thm",
        "doc_count" : 1064
      }, {
        "key" : "NYI87thm",
        "doc_count" : 1029
      } ]
    }
  }
}


Jörg

On Wed, Feb 5, 2014 at 2:17 PM, Nils Dijk m...@thanod.nl wrote:

 Hi,

 I updated the gist now with a file in bulkindex format.
 I also split up the loading from the testing phase, so you can do the test
 multiple times in a row.
 I also added a README.md to instruct how to run the test.

 I'm also creating a bug as stated here
 http://www.elasticsearch.org/blog/0-90-11-1-0-0-rc2-released/.





Re: Inconsistent responses from aggregations (ES1.0.0RC1)

2014-02-05 Thread Adrien Grand
I just installed Java 1.7u25 on a Mac running Mavericks to try to reproduce the
issue, but without success (on 1.0.0-RC2).


On Wed, Feb 5, 2014 at 4:49 PM, Nils Dijk m...@thanod.nl wrote:

 Hi Adrien,

 I'm using OSX (Mavericks) and java: (having the issue)

 $ java -version
 java version 1.7.0_25
 Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)

 My colleague is running OSX (Lion) and java: (having the issue)

 $ java -version
 java version 1.6.0_26
 Java(TM) SE Runtime Environment (build 1.6.0_26-b03-383-11D50)
 Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02-383, mixed mode)

 A server soon to be used for production Ubuntu 12.04 LTS with java: (Not
 having the issue)

 $ java -version
 java version 1.7.0_45
 Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
 Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)

 Could this be an issue with Java on OS X, then?

 On Wednesday, February 5, 2014 4:38:36 PM UTC+1, Adrien Grand wrote:

 I didn't manage to reproduce the issue locally either. What JVM / OS are
 you using (RC1 introduced Unsafe to perform String comparisons in terms
 aggs so I'm wondering if that could be related to your issue)?


 On Wed, Feb 5, 2014 at 4:33 PM, Nils Dijk m...@thanod.nl wrote:

 I did only test it with 1 and with 10 shards, indeed with 1 shard it did
 not have any issues, with 10 shards it has issues all the time.
 I also had a colleague testing it with the two scripts in the gist
 (which uses 10 shards).

 Also I do not think the analyzer _should_ have an impact, since it would
 only index more terms on that field if it tokenizes it. Can you use
 aggsbug.load.sh to load the data, and then use aggsbug.test.sh to run
 the test? It should give you a field analyzed with the default analyzer and
 10 shards.
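
The point about analyzers can be illustrated with a rough Python sketch. The two functions are simplified stand-ins with hypothetical names: the real standard analyzer uses Unicode text segmentation rather than this regular expression, so treat the output as indicative only.

```python
import re


def keyword_analyze(text):
    """Keyword-style analysis: the whole value becomes a single term."""
    return [text]


def simple_standard_analyze(text):
    """Simplified standard-style analysis: split on non-alphanumerics, lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


single_word = "TotalTrafficBOS"
multi_word = "Total Traffic BOS"  # made-up value for contrast

print(keyword_analyze(single_word), simple_standard_analyze(single_word))
print(keyword_analyze(multi_word), simple_standard_analyze(multi_word))
```

For a single-word value the two approaches differ only in case, which matches the keys in the two result listings in this thread (TotalTrafficBOS versus totaltrafficbos). A value containing spaces would produce several terms under the tokenizing analyzer, which changes which buckets exist but gives no reason for counts to vary between identical requests.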

 I'll try out some different analyzers, and loading the data in 3 shards
 now to see if that changes things.

 On Wednesday, February 5, 2014 4:02:54 PM UTC+1, Jörg Prante wrote:

 Also the same with shards = 3 and analyzer = standard. Stable results.

 {
   "took" : 240,
   "timed_out" : false,
   "_shards" : {
     "total" : 3,
     "successful" : 3,
     "failed" : 0
   },
   "hits" : {
     "total" : 1060387,
     "max_score" : 0.0,
     "hits" : [ ]
   },
   "aggregations" : {
     "a" : {
       "buckets" : [ {
         "key" : "totaltrafficbos",
         "doc_count" : 3599
       }, {
         "key" : "mai93thm",
         "doc_count" : 2517
       }, {
         "key" : "mai90thm",
         "doc_count" : 2207
       }, {
         "key" : "mai95thm",
         "doc_count" : 2207
       }, {
         "key" : "totaltrafficnyc",
         "doc_count" : 1660
       }, {
         "key" : "confessions",
         "doc_count" : 1534
       }, {
         "key" : "incidentreports",
         "doc_count" : 1468
       }, {
         "key" : "nji80thm",
         "doc_count" : 1180
       }, {
         "key" : "pai76thm",
         "doc_count" : 1142
       }, {
         "key" : "txi35thm",
         "doc_count" : 379
       } ]
     }
   }
 }

 You should examine your log files if your ES cluster was able to
 process all the docs correctly while indexing or searching, maybe you
 encountered OOMs or other subtle issues.

 Jörg






 --
 Adrien Grand





-- 
Adrien Grand



Re: Inconsistent responses from aggregations (ES1.0.0RC1)

2014-02-05 Thread Nils Dijk
Thanks for the effort.

I tried running on 1.7.0_51, and it gave me the same issue.

I was trying to find out if I could disable these unsafe string comparisons, 
but could not really find where that should be disabled. Is there an easy 
way for me to switch back that change? Do you know in what commit this was 
changed, so I can revert that commit in my local clone of the repo and do a 
build to see if the problem is solved that way?

As for reproducing, I do not really see what could impact this besides the 
OS and Java version. And the other OS X machine had a different version of 
both OS and Java, and still showed the same results.

I am however a bit more relaxed now that the issue is not showing up on our 
production machines; that would have killed the ES migration we are 
currently doing. It is unfortunate, though, that we cannot test our stuff 
on our development machines (all showing the issue here).

Do you have any thoughts on what could be different between our setups, 
given that we are having the issue and you are not?

To make sure: did you use my scripts to load the data? Jörg seemed to load 
the data in a different way (different shard count and different mapping), 
which did not show the issue.


Re: Inconsistent responses from aggregations (ES1.0.0RC1)

2014-02-05 Thread Adrien Grand
On Wed, Feb 5, 2014 at 6:01 PM, Nils Dijk m...@thanod.nl wrote:

 I was trying to find out if I could disable these unsafe
 string comparisons, but could not really find where that should be
 disabled. Is there an easy way for me to switch back that change? Do you
 know in what commit this was changed, so I can revert that commit in my
 local clone of the repo and do a build to see if the problem is solved
 that way?


Sure, this was changed in 4271d573d60f39564c458e2d3fb7c14afb82d4d8. However,
I also just read that you can't reproduce the issue with one shard, although
this shouldn't be relevant.


 As for reproducing, I do not really see what could impact this besides
 the OS and Java version. And the other OS X machine had a different version
 of both OS and Java, and still showed the same results.

 I am however a bit more relaxed now that the issue is not showing up on our
 production machines; that would have killed the ES migration we are
 currently doing. It is unfortunate, though, that we cannot test our stuff
 on our development machines (all showing the issue here).

 Do you have any thoughts on what could be different between our setups
 that we are having the issue, and you don't?


I wish I had ideas! :-)

Since the issue seems to reproduce consistently for you, something that
would be super helpful would be to git bisect in order to find the commit
that broke aggregations in your setup (Beta2 commit is 296cfbe3 and rc1
commit is 2c8ee3fb).
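
Bisecting, as suggested here, is a binary search over the history between a known-good and a known-bad commit; in practice `git bisect` performs it against the real repository. The idea can be sketched in Python, assuming a linear history. The commit labels and the `is_bad` check below are hypothetical placeholders, not real hashes or a real test run.

```python
def first_bad_commit(commits, is_bad):
    """Binary-search an ordered commit list for the first failing commit.

    `commits` runs oldest to newest; the first entry is assumed good and
    the last entry is assumed bad.
    """
    lo, hi = 0, len(commits) - 1  # commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical history between the two releases; labels only.
history = ["beta2", "c1", "c2", "c3", "c4", "c5", "rc1"]
broken_from = history.index("c3")
tested = []


def is_bad(commit):
    # Placeholder for "build this commit and run the test suite".
    tested.append(commit)
    return history.index(commit) >= broken_from


culprit = first_bad_commit(history, is_bad)
print(culprit, "found after testing", tested)
```

Each step halves the remaining range, so the number of builds grows with the logarithm of the number of commits between the two releases.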


 To make sure: did you use my scripts to load the data? Jörg seemed to load
 the data in a different way (different shard count and different mapping),
 which did not show the issue.


Yes, I used your scripts, exactly as described in the README.

-- 
Adrien Grand



Re: Inconsistent responses from aggregations (ES1.0.0RC1)

2014-02-05 Thread Nils Dijk
Ok, I was preparing to do a long bisecting session, but I started with the 
commit you highlighted below (4271d573d60f39564c458e2d3fb7c14afb82d4d8) and 
the commit before that one (6481a2fde858520988f2ce28c02a15be3fe108e4). And 
as it turns out, it is the breaking commit.

If I build the commit of yours from December 3 it fails my test suite.
If I build the commit of Nik from January 6 it still passes my test.

I also tried reverting your commit on the v1.0.0.RC1 tag, but it gave me 
all kinds of conflicts so I could not test RC1 without your commit.

If you would like I can still do a full bisect, but I suspect I would end up at 
your commit since I tested that one, and the one before.

Would it be possible for you to send a .patch without the unsafe stuff, so 
I can apply that to a commit and make a build?

Thanks in advance,

On Wednesday, February 5, 2014 6:10:35 PM UTC+1, Adrien Grand wrote:


 On Wed, Feb 5, 2014 at 6:01 PM, Nils Dijk m...@thanod.nl wrote:

 I was trying to find out if I could disable these unsafe 
 string comparisons, but could not really find where that should be 
 disabled. Is there an easy way for me to switch back that change? Do you 
 know in which commit this was changed, so I can revert that commit in my 
 local clone of the repo and do a build to see if the problem is solved that 
 way?


 Sure, this was changed in 4271d573d60f39564c458e2d3fb7c14afb82d4d8. However, 
 I also just read that you can't reproduce the issue with one shard, although 
 this shouldn't be relevant.
  

 For reproducing, I do not really see what could impact this besides 
 the OS and Java version. And the other OS X machine had a different version 
 of both the OS and Java, and still showed the same results.

 I am however a bit more relaxed with the issue not showing up on our 
 production machines, that would have killed the ES migration we are 
 currently doing. Although it is unfortunate that we can not test our stuff 
 on our development machines (all showing the issue here).

 Do you have any thoughts on what could be different between our setups 
 that we are having the issue, and you don't?


 I wish I had ideas! :-)

 Since the issue seems to reproduce consistently for you, something that 
 would be super helpful would be to git bisect in order to find the commit 
 that broke aggregations in your setup (Beta2 commit is 296cfbe3 and rc1 
 commit is 2c8ee3fb).
  

 To make sure, did you use my scripts to load it in? Jörg seemed to load 
 the data in a different way (different shard count and different mapping), 
 which did not show the issue here.


 Yes, I used your scripts, exactly as described in the README.
  
 -- 
 Adrien Grand
  



Re: Inconsistent responses from aggregations (ES1.0.0RC1)

2014-02-05 Thread Nils Dijk
Hi Jörg,

Glad you could reproduce with my updated gist.

cb.

On Wednesday, February 5, 2014 8:18:39 PM UTC+1, Jörg Prante wrote:

 Nils, I ran the test on my Mac, and I can reproduce the issue. And also on 
 Linux.

 Unfortunately the Mac locked up and I had to cold reboot, and my 
 copy/paste logs are gone with all the numbers, but anyway.

 As a matter of fact, your aggregates demo is daunting.

 On the Mac, it shows different counts even between the first and the 
 subsequent executions. The counts of the first are lower, and also, even 
 different terms show up. On Linux, I do not observe different counts 
 between runs.


The issue you describe for Mac is the issue I discussed here.


 But, what's more bothering is, I observed different results in regard to 
 the shard count, and that is both on Mac and Linux. The more the hit count 
 is on top of the buckets, the more the counts match, only the lower buckets 
 differ, so the deviating counts are somewhat hard to notice.


That the counts differ when you change the number of shards is a long-known 
problem of elasticsearch and was also a problem in faceting. A long thread about 
the nature of this problem can be found here: 
https://github.com/elasticsearch/elasticsearch/issues/1305.

It is an issue which you can circumvent easily by one of two options:

   1. Use the term you aggregate on as a routing key. This forces 
   the same tokens into the same shard, and thus always returns the exact 
   count. However, this only works if you do this kind of analytics over 1 
   field.
   2. Increase the shard_size for the terms aggregation. This way the 
   individual shards create bigger lists, which then have more chance of 
   containing the actual top terms. 
   
http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.x/search-aggregations-bucket-terms-aggregation.html#_size_amp_shard_size
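
To make the two options above concrete, here is a toy model in Python. It is a sketch of the general idea only (each shard reports its own top list and the results are summed), not elasticsearch's actual code; `terms_agg`, `route` and the tiny data set are invented for the illustration.

```python
import zlib
from collections import Counter


def terms_agg(shards, size, shard_size):
    """Merge per-shard top lists the way a distributed terms aggregation does.

    Each shard reports only its shard_size most frequent terms; the merge
    step sums whatever it received and keeps the top `size` terms.
    """
    merged = Counter()
    for docs in shards:
        local = Counter(docs)
        # Sort by (-count, term) so ties are broken deterministically.
        top = sorted(local.items(), key=lambda kv: (-kv[1], kv[0]))[:shard_size]
        merged.update(dict(top))
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))[:size]


def route(docs, n_shards):
    """Option 1: use the term itself as routing key, so every occurrence
    of a term ends up on the same shard."""
    shards = [[] for _ in range(n_shards)]
    for term in docs:
        shards[zlib.crc32(term.encode()) % n_shards].append(term)
    return shards


# Two shards where "c" is the global winner (3 + 4 = 7) but is not the
# top term on either shard.
shard0 = ["a"] * 5 + ["b"] * 4 + ["c"] * 3
shard1 = ["d"] * 5 + ["c"] * 4 + ["b"] * 1

exact = Counter(shard0 + shard1).most_common(1)
approx = terms_agg([shard0, shard1], size=1, shard_size=1)
bigger = terms_agg([shard0, shard1], size=1, shard_size=3)
routed = terms_agg(route(shard0 + shard1, 2), size=1, shard_size=1)

print("exact:         ", exact)
print("shard_size=1:  ", approx)
print("shard_size=3:  ", bigger)
print("routed by term:", routed)
```

In this toy data set a shard_size of 1 misses the real top term entirely, while both the larger shard_size and the routing variant recover it with the exact count.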


 I use Java 8 FCS, but since you observe this issue also on Java 7, I think 
 it is not an issue of Java 8. And it's both on Mac and Linux, but with 
 different symptoms.


This makes Mac OS X the only factor that occurs in every case, across all 
Java versions (I tested both 1.7 and 1.6). It is unfortunate that Adrien 
wasn't able to reproduce it on OS X.
 


 ES 1.0.0.RC2
 Mac OS X 10.8.5
 Darwin Jorg-Prantes-MacBook-Pro.local 12.5.0 Darwin Kernel Version 12.5.0: 
 Sun Sep 29 13:33:47 PDT 2013; root:xnu-2050.48.12~1/RELEASE_X86_64 x86_64
 java version 1.8.0
 Java(TM) SE Runtime Environment (build 1.8.0-b128)
 Java HotSpot(TM) 64-Bit Server VM (build 25.0-b69, mixed mode)
 G1GC enabled

 ES 1.0.0.RC2
 RHEL 6.3
 Linux zephyros 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 
 x86_64 x86_64 x86_64 GNU/Linux
 java version 1.8.0
 Java(TM) SE Runtime Environment (build 1.8.0-b128)
 Java HotSpot(TM) 64-Bit Server VM (build 25.0-b69, mixed mode)
 G1GC enabled

 Here are two Linux examples. Note, the last three terms and counts are 
 different.

 shards=10

 {
   "took" : 143,
   "timed_out" : false,
   "_shards" : {
     "total" : 10,
     "successful" : 10,
     "failed" : 0
   },
   "hits" : {
     "total" : 1060387,
     "max_score" : 0.0,
     "hits" : [ ]
   },
   "aggregations" : {
     "a" : {
       "buckets" : [ {
         "key" : "totaltrafficbos",
         "doc_count" : 3599
       }, {
         "key" : "mai93thm",
         "doc_count" : 2517
       }, {
         "key" : "mai90thm",
         "doc_count" : 2207
       }, {
         "key" : "mai95thm",
         "doc_count" : 2207
       }, {
         "key" : "totaltrafficnyc",
         "doc_count" : 1660
       }, {
         "key" : "confessions",
         "doc_count" : 1534
       }, {
         "key" : "incidentreports",
         "doc_count" : 1468
       }, {
         "key" : "nji80thm",
         "doc_count" : 1071
       }, {
         "key" : "pai76thm",
         "doc_count" : 1039
       }, {
         "key" : "txi35thm",
         "doc_count" : 357
       } ]
     }
   }
 }

 shards=5

 {
   "took" : 172,
   "timed_out" : false,
   "_shards" : {
     "total" : 5,
     "successful" : 5,
     "failed" : 0
   },
   "hits" : {
     "total" : 1060387,
     "max_score" : 0.0,
     "hits" : [ ]
   },
   "aggregations" : {
     "a" : {
       "buckets" : [ {
         "key" : "totaltrafficbos",
         "doc_count" : 3599
       }, {
         "key" : "mai93thm",
         "doc_count" : 2517
       }, {
         "key" : "mai90thm",
         "doc_count" : 2207
       }, {
         "key" : "mai95thm",
         "doc_count" : 2207
       }, {
         "key" : "totaltrafficnyc",
         "doc_count" : 1660
       }, {
         "key" : "confessions",
         "doc_count" : 1534
       }, {
         "key" : "incidentreports",
         "doc_count" : 1468
       }, {
         "key" : "nji80thm",
         "doc_count" : 1180
       }, {
         "key" : "pai76thm",
         "doc_count" : 936
       }, {
         "key" : "nji78thm",
         "doc_count" : 422
       } ]
     }
   }
 }


 Jörg




Re: Inconsistent responses from aggregations (ES1.0.0RC1)

2014-02-04 Thread Nils Dijk
I've loaded the same dataset in ES1.0.0.Beta2 with the same index 
configuration as in the topic start.

However, now the numbers are consistent if I call the same aggregation 
multiple times in a row AND the numbers match the numbers of the facets. 
This leads me to the conclusion that something broke between Beta2 and RC1!

I would like to test this on master, but I could not find any nightly 
builds of elasticsearch. Is there a location where they are stored or 
should I compile it myself?

On Friday, January 31, 2014 6:43:07 PM UTC+1, Nils Dijk wrote:

 Hi Binh Ly,

 Thanks for the response.

 I'm aware that the numbers are not exact (hence the link to issue #1305 in 
 my initial post), and have been advocating slightly incorrect numbers with 
 my colleagues and customers for some time already to prepare them for the 
 moment we provide analytics with ES. But what bothers me is that they are 
 *inconsistent*.

 If you look at my gist you see that I ran the same aggs 3 times right 
 after each other. If we just look at the top item we see the following 
 results:

1. { "key": "totaltrafficbos", "doc_count": 2880 }
2. { "key": "totaltrafficbos", "doc_count": 2552 }
3. { "key": "totaltrafficbos", "doc_count": 2179 }

 These results were taken within seconds of each other, without any change to 
 the number of documents in the index. If I run them even more, you see that it 
 rotates between a handful of numbers. Is this also behavior one would expect 
 from the aggs? And if so, why do the facets show the same number over and over 
 again?

 Anyway, I will try to work my way through the aggs code this weekend to get a 
 better grasp of what we can do with it, and what not.

 -- Nils

 On Friday, January 31, 2014 6:18:43 PM UTC+1, Binh Ly wrote:

 Nils,

 This is just the nature of splitting data around in shards. Actually the 
 terms facet has the same limitations (i.e. it will also give approximate 
 counts). Neither the terms facet nor the terms aggregation is better or 
 worse than the other - they are both approximations (using different 
 implementations). It is correct that if you put all your data in 1 shard, 
 then all the counts are exact. If you need to shard, you can increase the 
 shard_size parameter inside the terms aggregation to improve accuracy. 
 Play with that number until it suits your purposes, but the important thing 
 is that they are just approximations, more so the more documents you have in 
 the index, so just don't expect absolute numbers from them if you have more 
 than 1 shard.

 {
   "size": 0,
   "aggs": {
     "a": {
       "terms": {
         "field": "actor.displayName",
         "shard_size": 1
       }
     }
   }
 }





Re: Inconsistent responses from aggregations (ES1.0.0RC1)

2014-02-04 Thread Nils Dijk
To follow up,

I have a self-contained test suite at https://gist.github.com/thanodnl/8803745 for 
this problem. It contains two files:

   1. aggsbug.sh
   2. aggsbug.json

The .json file contains ~1M newline-separated documents to load into the 
database; I was not able to create a curl request to load them directly 
into the index.
The .sh file (https://gist.github.com/thanodnl/8803745/raw/aggsbug.sh) 
contains the instructions for recreating this behavior.

I have run these against the following versions:

   1. 1.0.0.Beta2
   2. 1.0.0.RC1
   3. 1.0.0-SNAPSHOT as compiled from the git 1.0 branch on commit 
   0f8b41ffad9b5ecdfd543d7c73edcf404e6fc763

When run on 1.0.0.Beta2 it gives the same output consistently when I run 
the _search over and over again.
When run on 1.0.0.RC1 it gives me multiple different outcomes, 
comparable to the numbers I posted earlier in the thread.
When run on 1.0.0-SNAPSHOT it behaves the same as in 1.0.0.RC1.
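
The "run it over and over and compare" check can also be automated. The following is a minimal sketch, not part of the gist: `distinct_responses` is a made-up helper, and `run_query` is assumed to be any callable that returns the parsed _search response (for example a function that POSTs the aggregation to the index). The two stand-in functions below only imitate a cluster, using the doc_count values reported earlier in this thread.

```python
import itertools
import json
from typing import Callable, Dict


def distinct_responses(run_query: Callable[[], dict], runs: int = 10) -> Dict[str, int]:
    """Run the same query several times and count how often each distinct
    bucket list comes back. A single key means the responses were consistent."""
    seen: Dict[str, int] = {}
    for _ in range(runs):
        # "a" is the aggregation name used in the request from this thread.
        buckets = run_query()["aggregations"]["a"]["buckets"]
        fingerprint = json.dumps(buckets, sort_keys=True)
        seen[fingerprint] = seen.get(fingerprint, 0) + 1
    return seen


# Stand-in that rotates between the three counts seen on RC1.
_counts = itertools.cycle([2880, 2552, 2179])


def flaky_query() -> dict:
    return {"aggregations": {"a": {"buckets": [
        {"key": "totaltrafficbos", "doc_count": next(_counts)}]}}}


# Stand-in that always answers the same, as Beta2 did.
def stable_query() -> dict:
    return {"aggregations": {"a": {"buckets": [
        {"key": "totaltrafficbos", "doc_count": 2880}]}}}


flaky = distinct_responses(flaky_query, runs=9)
stable = distinct_responses(stable_query, runs=9)
print(len(flaky), "distinct responses from the flaky stand-in")
print(len(stable), "distinct response from the stable stand-in")
```

Anything other than exactly one distinct response for an unchanged index would point at the kind of inconsistency described here, as opposed to the known approximation error.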

That it still was working on 1.0.0.Beta2 proves to me that it is a bug that 
got into RC1. I could not find any related ticket on the issues page of the 
github repository. Hopefully this is enough information to recreate the 
problem.

The json file is quite big and could cause trouble when you open the gist in a 
browser. Cloning the gist locally will work best:
$ git clone https://gist.github.com/8803745.git

I do not really know how to move on from here. Do you want me to open an 
issue for this problem at github.com/elasticsearch/elasticsearch? It would 
be nice to fix this problem before a release of 1.0.0 since that is the 
first release containing the aggregations for analytics.

