Hi,
I'm running a five-node Riak cluster of m2.2xlarge instances on EC2.
Today, two of the nodes crashed.
Running riak console, I found this line, which seems to be related:
21:44:30.449 [error] Failed to start riak_kv_eleveldb_backend Reason:
{db_open,IO error:
Hi, Nam,
On the node that is reporting the LevelDB Manifest error, I would do the
following:
1. Stop the node if it isn't down already.
2. Back up
/var/lib/riak/leveldb/50239118783249787813251666124688006726811648 to
another folder outside of /var/lib/riak/leveldb.
3. Run the erl binary
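For illustration, steps 1 and 2 above could be scripted roughly as follows. This is only a sketch: `backup_partition` is a hypothetical helper, not something shipped with Riak, and the backup location in the usage comment is an assumed path. It does not cover the repair step.

```shell
#!/bin/sh
# Sketch of steps 1 and 2 only. backup_partition is a hypothetical helper,
# not part of Riak.

# Copy one LevelDB partition directory to a backup root that lies outside
# the LevelDB data directory, preserving permissions and timestamps.
backup_partition() {
    src=$1                          # partition directory to back up
    dest_root=$2                    # backup root (must be outside leveldb)
    leveldb_root=$(dirname "$src")  # parent directory holding all partitions

    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }

    # Refuse a backup root that is the LevelDB data directory or inside it.
    case "$dest_root/" in
        "$leveldb_root"/*)
            echo "backup root must be outside $leveldb_root" >&2
            return 1
            ;;
    esac

    mkdir -p "$dest_root" && cp -Rp "$src" "$dest_root/"
}

# Usage on the affected node (the backup path is an assumption):
#   riak stop
#   backup_partition \
#     /var/lib/riak/leveldb/50239118783249787813251666124688006726811648 \
#     /var/backups/riak-leveldb
```

The guard against a destination inside the data directory reflects the advice in step 2 to keep the copy outside of /var/lib/riak/leveldb.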
Hi Justin,
I had essentially done the same thing you suggested. I moved the affected
directory out of leveldb and restarted riak. Though I did not run repair, it
seemed to go well. The nodes are leaving the cluster now.
However, it has been three hours and riak-admin transfers still shows many
Nam,
What is the output of `riak-admin member_status` and `riak-admin
ring_status`?
Justin Shoffstall
Developer Advocate | Basho Technologies, Inc.
Output of the commands:
ubuntu@ip-10-20-2-243:~$ riak-admin member_status
Attempting to restart script through sudo -u riak
= Membership ==
Status     Ring    Pending    Node
Nam,
Thanks for speaking with me tonight. Let us know if the cluster has any more
trouble, or if we can do anything else for you.
Cheers,
Justin Shoffstall
Developer Advocate | Basho Technologies, Inc.
On May 25, 2012, at 10:01 PM, Nam Nguyen-2 [via Riak Users] wrote:
Output of the
Nam,
To recap the upshot of our offline chat tonight:
Though the leave operations on your cluster progressed fine, in the future I
would just take the damaged nodes down, do the repair like I mentioned in
the earlier post in this thread, and bring the nodes back up. No membership
changes should