Just tested this.
When I used a large size to fetch all of my documents matching some 
criteria (4926 in the result), I got:
13.951s when using a size of 1M
43.6s when using scan/scroll (with a size of 100)

Looks like I should be using the non-recommended large size after all.
Can I make the scroll faster?
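
One likely factor is the number of round trips: at a size of 100, 4926 documents take roughly 50 scroll requests. Below is a minimal Python sketch that counts requests for a few page sizes. The names drain_scroll, FakeScrollBackend and round_trips_for are hypothetical, and the backend is an in-memory stand-in, not an Elasticsearch client; it ignores shards and assumes the first response already carries hits.

```python
def drain_scroll(first_page, next_page):
    """Collect every hit from a scroll-style API.

    first_page() -> (scroll_id, hits)
    next_page(scroll_id) -> (scroll_id, hits)
    An empty batch is treated as the end of the scroll.
    Returns (all_hits, round_trips).
    """
    scroll_id, hits = first_page()
    collected = list(hits)
    round_trips = 1
    while hits:
        # Keep asking for the next batch until an empty one comes back.
        scroll_id, hits = next_page(scroll_id)
        round_trips += 1
        collected.extend(hits)
    return collected, round_trips


class FakeScrollBackend:
    """Hypothetical in-memory stand-in for a cluster (assumption, not a real client)."""

    def __init__(self, docs, page_size):
        self.docs = docs
        self.page_size = page_size

    def first_page(self):
        return self._page(0)

    def next_page(self, scroll_id):
        # The "scroll id" here is simply the offset of the next batch.
        return self._page(scroll_id)

    def _page(self, offset):
        batch = self.docs[offset:offset + self.page_size]
        return offset + len(batch), batch


def round_trips_for(total_docs, page_size):
    """Count the requests needed to drain total_docs at the given page size."""
    backend = FakeScrollBackend(list(range(total_docs)), page_size)
    hits, trips = drain_scroll(backend.first_page, backend.next_page)
    assert len(hits) == total_docs
    return trips


if __name__ == "__main__":
    for size in (100, 1000, 5000):
        print(size, round_trips_for(4926, size))
```

Against this stand-in, 4926 documents need 51 requests at a size of 100, 6 at 1000 and 2 at 5000 (the last request in each case is the empty batch that ends the scroll). It says nothing about real cluster timings, so the actual effect of a larger size still has to be measured.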

Thanks,
Ron

On Wednesday, December 10, 2014 10:53:50 PM UTC+2, David Pilato wrote:
>
> No I did not say that. Or I did not mean that. Sorry if it was unclear.
> I said: don’t use large sizes:
>
>> Never use size:10000000 or from:10000000.
>
> You should read this: 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html#scroll-scan
>
> -- 
> David Pilato | Technical Advocate | Elasticsearch.com <http://Elasticsearch.com>
> @dadoonet <https://twitter.com/dadoonet> | @elasticsearchfr <https://twitter.com/elasticsearchfr> | @scrutmydocs <https://twitter.com/scrutmydocs>
>
>
>  
> On 10 Dec 2014, at 21:16, Ron Sher <ron....@gmail.com> wrote:
>
> So you're saying there's no impact on Elasticsearch if I issue a large 
> size? 
> If that's the case, then why shouldn't I just use a size of 1M if I want to 
> make sure I get everything?
>
> On Wednesday, December 10, 2014 8:22:47 PM UTC+2, David Pilato wrote:
>>
>> Scan/scroll is the best option to extract a huge amount of data.
>> Never use size:10000000 or from:10000000. 
>>
>> It's not realtime because you scroll over a fixed set of segments, so 
>> changes that arrive in new segments are not taken into account during 
>> the scroll.
>> That is a good thing, because you won't get inconsistent results.
>>
>> About size, I would try and test. It depends on your document size, I 
>> believe.
>> Try with 10000 and see how it goes as you increase it. You may 
>> discover that getting 10*10000 docs takes the same time as 1*100000. :)
>>
>> Best
>>
>> David
>>
>> On 10 Dec 2014, at 19:09, Ron Sher <ron....@gmail.com> wrote:
>>
>> Hi,
>>
>> I was wondering about best practices to get all data matching some 
>> filters.
>> The options as I see them are:
>>
>>    - Use a very big size that will return all accounts, i.e. use some 
>>    value like 1m to make sure I get everything back (even if I need just a 
>>    few hundred or a few tens of documents). This is the quickest way, 
>>    development-wise.
>>    - Use paging with size and from. This requires looping over the 
>>    results, and performance gets worse as we advance to later pages. Also, 
>>    we need to use preference if we want consistent results across 
>>    pages, and it's not clear what the recommended size for each page is.
>>    - Use scan/scroll. This gives consistent paging but also has several 
>>    drawbacks: if I use search_type=scan, the results can't be sorted; 
>>    scan/scroll is (maybe) less performant than paging (the documentation 
>>    says it's not for realtime use); and again it's not clear which size is 
>>    recommended.
>>
>> So you see - many options and not clear which path to take.
>>
>> What do you think?
>>
>> Thanks,
>> Ron
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to elasticsearc...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/764a37c5-1fec-48c4-9c66-7835d8141713%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/elasticsearch/764a37c5-1fec-48c4-9c66-7835d8141713%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>>
