A few days ago we started receiving a lot of timeouts across our cluster.
They cause shard allocation to fail and leave the cluster in a perpetual
red/yellow state.
Examples:
[2015-04-16 15:04:50,970][DEBUG][action.admin.cluster.node.stats] [coordinator02] failed to execute on node [1rfWT-mXTZmF_NzR_h1IZw]
This was tracked down to a problem with Ubuntu 14.04 running under Xen (in
AWS). The latest Ubuntu kernel resolves the problem, so I had to do a
rolling upgrade: apt-get update; apt-get dist-upgrade; reboot, on all
nodes, one node at a time. This appears to have resolved the issue.
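
A minimal sketch of that rolling upgrade, for anyone who wants to script it.
It only builds the list of commands and does not run anything. The host
names, the cluster URL, the use of ssh and sudo, the -y flag, and the health
check between nodes are all my assumptions and not part of the procedure
described above; the three apt-get/reboot steps are the ones from the post.

```python
import shlex

# The three steps named in the post, run in order on each node.
# The -y flag is an addition so apt-get does not prompt over ssh.
UPGRADE_STEPS = ["apt-get update", "apt-get -y dist-upgrade", "reboot"]


def node_commands(host):
    """Return the ssh command lines that upgrade and reboot one node."""
    return [f"ssh {shlex.quote(host)} sudo {step}" for step in UPGRADE_STEPS]


def health_check_command(es_url):
    """Return a curl command that waits for the cluster to reach yellow.

    Assumption: the cluster health API is used as the gate between nodes.
    """
    url = f"{es_url}/_cluster/health?wait_for_status=yellow&timeout=10m"
    return f"curl -s {shlex.quote(url)}"


def rolling_plan(hosts, es_url):
    """One node at a time: upgrade, reboot, then wait for the cluster."""
    plan = []
    for host in hosts:
        plan.extend(node_commands(host))
        plan.append(health_check_command(es_url))
    return plan


if __name__ == "__main__":
    # Hypothetical host names and URL; substitute your own.
    for line in rolling_plan(["es-data01", "es-data02"], "http://localhost:9200"):
        print(line)
```

Printing the plan first makes it easy to review before running each line by
hand. The health call can return on timeout without the cluster having
recovered, so check the response before moving on to the next node.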
For reference:
Also related: https://github.com/elastic/elasticsearch/issues/10447
On 17 April 2015 at 12:37, Charlie Moad charlie.m...@geofeedia.com wrote: