Re: bulk indexing - optimal refresh_interval

2014-07-30 Thread shikhar
Thanks for the explanation! I'll switch over for the next time I need to reindex.

On Tue, Jul 29, 2014 at 6:35 PM, Michael McCandless m...@elasticsearch.com wrote:
Disabling refresh (-1) is a good choice if you are fully maximizing your cluster's CPU/IO resources (using enough bulk client

bulk indexing - optimal refresh_interval

2014-07-29 Thread shikhar
The 1.3.0 release notes state:

- Increase the refresh_interval (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-update-settings.html#bulk) if you are doing heavy bulk indexing, or you are happy with your search results being refreshed less

Re: bulk indexing - optimal refresh_interval

2014-07-29 Thread Mark Walkom
I'd say because if you are inserting a lot of data, you will have a massive hit at the end when you need to index, as opposed to smaller ones along the way.

Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com

On 29 July

Re: bulk indexing - optimal refresh_interval

2014-07-29 Thread joergpra...@gmail.com
There is no more of a massive hit when opening an index for read once than there is every 30 seconds. The only explanation I can think of is that users perform searches while indexing and want up-to-date results as they search. This is not the case when I do bulk indexing, search is

Re: bulk indexing - optimal refresh_interval

2014-07-29 Thread Michael McCandless
Disabling refresh (-1) is a good choice if you are fully maximizing your cluster's CPU/IO resources (using enough bulk client threads or async requests). In that case it should give faster indexing throughput than 30s refresh. But if you are not saturating the cluster's resources, then a refresh
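
For anyone wanting to try the -1 setting discussed above, here is a minimal sketch of how the update-settings call could be built with only the Python standard library. The host (http://localhost:9200), the index name ("myindex"), and the helper names are assumptions for illustration, not anything from this thread; the "1s" restore value is the documented default, so substitute whatever interval your index normally uses.

```python
import json
import urllib.request


def refresh_settings_request(base_url, index, interval):
    """Build a PUT request for the index update-settings endpoint.

    interval is a string such as "-1" (refresh disabled) or "30s".
    base_url and index are placeholders supplied by the caller.
    """
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Hypothetical host and index name; adjust for your cluster.
    base, index = "http://localhost:9200", "myindex"

    # Disable refresh before the bulk load ...
    send(refresh_settings_request(base, index, "-1"))
    # ... run the bulk indexing here ...
    # ... then restore a normal interval afterwards.
    send(refresh_settings_request(base, index, "1s"))
```

Building the request separately from sending it makes it easy to inspect the URL and body before pointing it at a real cluster.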