Just tested this. When I used a large number to get all of my documents according to some criteria (4926 in the result), I got:

- 13.951s when using a size of 1M
- 43.6s when using scan/scroll (with a size of 100)

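
One factor that plausibly contributes to the difference is the number of round trips a scroll needs at a small batch size. The sketch below is only an illustration under stated assumptions: `scroll_all` and `make_fake_fetch` are hypothetical helpers, not part of any Elasticsearch client, and the fake transport just mimics the shape of a scan/scroll response (an initial reply carrying a scroll id and no hits, then batches until an empty one). In a real client, `fetch` would wrap the initial `search_type=scan` request and the follow-up scroll requests.

```python
from typing import Callable, Dict, Iterator, Optional, Tuple

# Hypothetical transport: takes the previous scroll id (None for the first
# request) and returns a dict shaped like a scroll response.
Fetch = Callable[[Optional[str]], Dict]


def scroll_all(fetch: Fetch) -> Iterator[Dict]:
    """Yield every hit, following scroll ids until a batch comes back empty."""
    scroll_id = None
    first = True
    while True:
        page = fetch(scroll_id)
        scroll_id = page.get("_scroll_id")
        hits = page["hits"]["hits"]
        # A scan-style first response carries no hits, so only stop on an
        # empty batch after the first request.
        if not hits and not first:
            return
        first = False
        yield from hits


def make_fake_fetch(total_docs: int, batch: int) -> Tuple[Fetch, Dict]:
    """In-memory stand-in for the server (assumption: one batch per request)."""
    state = {"pos": 0, "calls": 0}

    def fetch(scroll_id: Optional[str]) -> Dict:
        state["calls"] += 1
        if scroll_id is None:
            # Initial scan request: scroll id only, no documents.
            return {"_scroll_id": "s0", "hits": {"hits": []}}
        start = state["pos"]
        end = min(start + batch, total_docs)
        state["pos"] = end
        docs = [{"_id": str(i)} for i in range(start, end)]
        return {"_scroll_id": "s%d" % end, "hits": {"hits": docs}}

    return fetch, state


if __name__ == "__main__":
    for batch in (100, 1000):
        fetch, state = make_fake_fetch(4926, batch)
        count = sum(1 for _ in scroll_all(fetch))
        print(batch, count, state["calls"])
```

With 4926 documents, the fake transport is called 52 times at a batch of 100 (one initial request, 50 batches, one empty batch that ends the loop) and 7 times at a batch of 1000. This counts requests only; it says nothing about how long each request takes on a real cluster.
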
Looks like I should be using the not-recommended paging. Can I make the scroll better?

Thanks,
Ron

On Wednesday, December 10, 2014 10:53:50 PM UTC+2, David Pilato wrote:
> No, I did not say that. Or I did not mean that. Sorry if it was unclear.
> I said: don't use large sizes:
>
>> Never use size:10000000 or from:10000000.
>
> You should read this:
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html#scroll-scan
>
> --
> David Pilato | Technical Advocate | Elasticsearch.com <http://Elasticsearch.com>
> @dadoonet <https://twitter.com/dadoonet> | @elasticsearchfr <https://twitter.com/elasticsearchfr> | @scrutmydocs <https://twitter.com/scrutmydocs>
>
> On Dec 10, 2014, at 21:16, Ron Sher <ron....@gmail.com> wrote:
>
> So you're saying there's no impact on elasticsearch if I issue a large size?
> If that's the case, then why shouldn't I just call size of 1M if I want to make sure I get everything?
>
> On Wednesday, December 10, 2014 8:22:47 PM UTC+2, David Pilato wrote:
>> Scan/scroll is the best option to extract a huge amount of data.
>> Never use size:10000000 or from:10000000.
>>
>> It's not realtime because you basically scroll over a given set of segments, and any new changes that arrive in new segments won't be taken into account during the scroll.
>> Which is good, because you won't get inconsistent results.
>>
>> About size, I would try and test. It depends on the size of your docs, I believe.
>> Try with 10000 and see how it goes when you increase it. You may discover that getting 10*10000 docs is the same as 1*100000. :)
>>
>> Best
>>
>> David
>>
>> On Dec 10, 2014, at 19:09, Ron Sher <ron....@gmail.com> wrote:
>>
>> Hi,
>>
>> I was wondering about best practices to get all data according to some filters.
>> The options as I see them are:
>>
>> - Use a very big size that will return all accounts, i.e. use some value like 1m to make sure I get everything back (even if I need just a few hundreds or tens of documents). This is the quickest way, development wise.
>> - Use paging - using size and from. This requires looping over the result, and the performance gets worse as we advance to later pages. Also, we need to use preference if we want to get consistent results over the pages. Also, it's not clear what the recommended size for each page is.
>> - Use scan/scroll - this gives consistent paging but also has several drawbacks: if I use search_type=scan then it can't be sorted; using scan/scroll is (maybe) less performant than paging (the documentation says it's not for realtime use); again, it's not clear which size is recommended.
>>
>> So you see - many options and not clear which path to take.
>>
>> What do you think?
>>
>> Thanks,
>> Ron

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/d41729a8-8dfc-48eb-ae7b-1ac16cd05787%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.