I'm reindexing an Elasticsearch index with 50m docs using a scan-type scroll
request to retrieve all docs, but my reindexer program stopped at 30m.
Is there a way to redo the query to retrieve the remaining docs? Like using
an offset?
Would the internal order of the scan query be the same with a second
The scroll stays available for the timeout value you give it. Every time
you scroll, you restart the countdown.
You could track the last scroll id you used and try it again from there?
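
As a rough illustration of "every scroll call restarts the countdown", here is a minimal sketch of how the next-page request might be built. It assumes the 1.x-era REST endpoint (`/_search/scroll` with the keep-alive in the `scroll` query parameter and the scroll_id as the raw request body); `build_scroll_request`, the base URL, and the `5m` default are hypothetical names and values chosen for the example, not something from this thread.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_scroll_request(base_url, scroll_id, keep_alive="5m"):
    """Build (but do not send) the request for the next page of a scroll.

    keep_alive is the timeout that restarts on every scroll call, so it only
    needs to cover the time spent processing one page.
    """
    query = urlencode({"scroll": keep_alive})
    url = f"{base_url.rstrip('/')}/_search/scroll?{query}"
    # Assumption: the scroll_id is sent as the plain request body (1.x style).
    return Request(url, data=scroll_id.encode("utf-8"), method="GET")


if __name__ == "__main__":
    req = build_scroll_request("http://localhost:9200", "some-scroll-id")
    print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would actually send it; the response should carry the hits plus the scroll_id to use for the following call.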
On Thursday, 23 October 2014 12:47:02 UTC-4, Roger de Cordova Farias wrote:
I'm reindexing a ElasticSearch
Hmm, I was using a small ttl, just enough to process each scroll call, but
I could try using a longer time to live and resuming from the last
scroll_id in case of an error.
That is a good idea, thanks.
2014-10-23 17:12 GMT-02:00 John Smith java.dev@gmail.com:
The scroll is available based on a
A small ttl is OK (as long as it is adjusted properly for your process)
because every time you call scroll it resets the ttl. So you don't need to
set a 60m scroll time. It just has to be long enough to process the next
scroll id.
I'm curious if you can re-use the scroll id. It's not specifically
I know it resets the ttl on each scroll call, but since I don't have an
automatic resuming process, I need to manually check the last scroll_id (I
will log it to a file) and restart the reindexing program using it. That is
why I need a longer ttl.
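
The log-and-restart idea above could look roughly like the sketch below. It is only an illustration under assumptions: `fetch_page` and `index_docs` are hypothetical callables standing in for the real scroll request and the bulk indexing step, and the sketch assumes the scroll context is still alive (ttl not expired) when the program is restarted. It saves the scroll_id for the next request only after the current page has been indexed, so a restart continues with the first page that was not yet processed.

```python
import os


def load_checkpoint(path):
    """Return the last saved scroll_id, or None if there is no checkpoint."""
    if not os.path.exists(path):
        return None
    with open(path, "r", encoding="utf-8") as f:
        return f.read().strip() or None


def save_checkpoint(path, scroll_id):
    """Write the scroll_id via a temp file so a crash cannot truncate it."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        f.write(scroll_id)
    os.replace(tmp, path)


def reindex(fetch_page, index_docs, first_scroll_id, checkpoint_path):
    """Scroll through all pages, resuming from the checkpoint if one exists.

    fetch_page(scroll_id) -> (hits, next_scroll_id)   # hypothetical
    index_docs(hits)                                  # hypothetical
    Returns the number of docs indexed during this run.
    """
    scroll_id = load_checkpoint(checkpoint_path) or first_scroll_id
    indexed = 0
    while True:
        hits, next_scroll_id = fetch_page(scroll_id)
        if not hits:
            break  # an empty page means the scroll is exhausted
        index_docs(hits)
        # Only checkpoint after the page has been indexed successfully.
        save_checkpoint(checkpoint_path, next_scroll_id)
        scroll_id = next_scroll_id
        indexed += len(hits)
    return indexed
```

One caveat: if the program dies after a page was requested but before it was indexed, resuming sends the same scroll_id a second time, and whether that returns the same page again is exactly the open question in this thread.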
I just tested the re-use of the scroll_id. Looks