What do you do when Riak freezes?

2014-03-19 Thread Michael Dillon
I've run into a problem with Riak freezing completely on one node running on Ubuntu 12.04 LTS on a XEN VM (EC2). If I ssh into the node and run "ps ax" that shell session also freezes. I also tried another ssh session with "netstat -lnp" to see if I could find the process ID to kill, but that also

Re: What do you do when Riak freezes?

2014-03-19 Thread Matthew Von-Maszewski
Any chance you are overflowing into swap? Or in the case of XEN have you exceeded the guaranteed RAM for the VM memory and moved into the disk backed portion of "ram"? What backend do you use within riak? Do you have memory statistics from before and after the seizure/freeze? Matthew On M
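One way to answer the swap question without relying on `ps` is to read `/proc/meminfo` directly. The following is a minimal Python sketch, not anything posted in the thread: `parse_meminfo` and `swap_used_kb` are hypothetical helper names, and the sample text uses made-up illustrative values rather than data from the affected node.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: value_in_kB} dict."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Field: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_used_kb(info):
    """Swap currently in use: SwapTotal minus SwapFree (both in kB)."""
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


# Illustrative sample only; these numbers are invented for the example.
SAMPLE = """MemTotal:        8000000 kB
MemFree:          500000 kB
SwapTotal:       2097148 kB
SwapFree:        1048572 kB
"""

info = parse_meminfo(SAMPLE)
print(swap_used_kb(info))  # 1048576

# On a live Linux node, the same functions could be fed the real file:
#   with open("/proc/meminfo") as f:
#       print(swap_used_kb(parse_meminfo(f.read())))
```

A nonzero and growing result sampled before and during a freeze would support the swap theory; a node with no swap configured reports 0 for both fields.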

Re: What do you do when Riak freezes?

2014-03-19 Thread Michael Dillon
We are using Amazon EC2 m3.2xlarge nodes, and while the freeze is occurring free reports:

             total       used       free     shared    buffers     cached
Mem:      30623232    8818792   21804440          0      88092    4411832
-/+ buffers/cache:    4318868   26304364
Swap:            0
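The arithmetic behind those `free` figures can be checked with a short Python sketch. The split of the run-together digits into columns is an assumption, chosen because it is the split for which used + free equals total; `app_memory_kb` and `available_kb` are hypothetical helper names.

```python
# Values in kB, as read from the `free` output above. The column
# boundaries are an assumption; the assert below checks they are
# internally consistent.
mem = {
    "total": 30623232,
    "used": 8818792,
    "free": 21804440,
    "shared": 0,
    "buffers": 88092,
    "cached": 4411832,
}


def app_memory_kb(m):
    """Memory held by processes, excluding kernel buffers and page cache."""
    return m["used"] - m["buffers"] - m["cached"]


def available_kb(m):
    """Free memory plus buffers and cache, which the kernel can reclaim."""
    return m["free"] + m["buffers"] + m["cached"]


# The columns only make sense if used + free adds up to total.
assert mem["used"] + mem["free"] == mem["total"]

print(app_memory_kb(mem))  # 4318868, the "-/+ buffers/cache" used figure
print(available_kb(mem))   # 26304364, the "-/+ buffers/cache" free figure
```

On this reading, processes hold roughly 4.1 GiB of about 29 GiB, which does not look like memory pressure at the moment the snapshot was taken.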

Re: What do you do when Riak freezes?

2014-03-19 Thread Matthew Von-Maszewski
I thought I knew the cause of this problem. I do not. We need to await input from others. My apologies. Other basic questions will be: what version of Riak, what is your app.config, how many servers/nodes, any reason this one node is "different"? Matthew On Mar 19, 2014, at 5:30 PM, Micha

Re: What do you do when Riak freezes?

2014-03-19 Thread Michael Dillon
I'm running Riak 2.0pre11, but I keep mentioning Erlang because I saw a similar hang a couple of years ago with RabbitMQ. I suspect that even if there is a Riak bug involved, there is probably also some Erlang problem as well. Now I have discovered that by using "pstree -p" I

Re: What do you do when Riak freezes?

2014-03-19 Thread Adam Lindsay
I'm trying to remember some of the pain points of running Riak in production on EC2 2.5 years ago. Riak is a different product now, but EC2 is still a challenging environment. Do you have any monitoring on the state of network interfaces? Is it possible the IP of one of the nodes changed from underne

Re: What do you do when Riak freezes?

2014-03-20 Thread Shane McEwan
On 19/03/14 20:56, Michael Dillon wrote:
> I've run into a problem with Riak freezing completely on one node
> running on Ubuntu 12.04 LTS on a XEN VM (EC2). If I ssh into the node
> and run "ps ax" that shell session also freezes. I also tried another
> ssh session with "netstat -lnp" to see if I