Hello,

I have a cluster of 8 Riak 1.3.1 nodes. Recently one of the nodes crashed
silently; nothing unusual was reported in the logs.

When I tried to start the node again, it ran for a few seconds and then
crashed silently again. Running 'riak console' showed "Segmentation
fault".

Loading the core dump in gdb shows:

Program terminated with signal 11, Segmentation fault.
#0  0x00007f162547fa30 in MurmurHash64A(void const*, int, unsigned int) ()
   from /tank/riak-1.3.1/lib/eleveldb-1.3.0/priv/eleveldb.so

The backtrace shows that the crash happens during LevelDB compaction,
while a Bloom filter block is being built.

(gdb) bt
#0  0x00007f162547fa30 in MurmurHash64A(void const*, int, unsigned int) ()
   from /tank/riak-1.3.1/lib/eleveldb-1.3.0/priv/eleveldb.so
#1  0x00007f162547833c in leveldb::(anonymous namespace)::BloomFilterPolicy2::CreateFilter(leveldb::Slice const*, int, std::string*) const ()
   from /tank/riak-1.3.1/lib/eleveldb-1.3.0/priv/eleveldb.so
#2  0x00007f162548382d in leveldb::FilterBlockBuilder::GenerateFilter() ()
   from /tank/riak-1.3.1/lib/eleveldb-1.3.0/priv/eleveldb.so
#3  0x00007f1625483a58 in leveldb::FilterBlockBuilder::StartBlock(unsigned long) ()
   from /tank/riak-1.3.1/lib/eleveldb-1.3.0/priv/eleveldb.so
#4  0x00007f1625475175 in leveldb::TableBuilder::Flush() ()
   from /tank/riak-1.3.1/lib/eleveldb-1.3.0/priv/eleveldb.so
#5  0x00007f1625475395 in leveldb::TableBuilder::Add(leveldb::Slice const&, leveldb::Slice const&) ()
   from /tank/riak-1.3.1/lib/eleveldb-1.3.0/priv/eleveldb.so
#6  0x00007f162545b561 in leveldb::DBImpl::DoCompactionWork(leveldb::DBImpl::CompactionState*) ()
   from /tank/riak-1.3.1/lib/eleveldb-1.3.0/priv/eleveldb.so
#7  0x00007f162545bd3b in leveldb::DBImpl::BackgroundCompaction() ()
   from /tank/riak-1.3.1/lib/eleveldb-1.3.0/priv/eleveldb.so
#8  0x00007f162545ca5d in leveldb::DBImpl::BackgroundCall() ()
   from /tank/riak-1.3.1/lib/eleveldb-1.3.0/priv/eleveldb.so
#9  0x00007f162547bb38 in leveldb::(anonymous namespace)::PosixEnv::BGThreadWrapper(void*) ()
   from /tank/riak-1.3.1/lib/eleveldb-1.3.0/priv/eleveldb.so
#10 0x00007f163366ab50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#11 0x00007f16331aca7d in clone () from /lib/x86_64-linux-gnu/libc.so.6
#12 0x0000000000000000 in ?? ()

The full gdb output is in this gist:
https://gist.github.com/vshabanov/5768546

Why is this happening, and how can I bring the node back to life?
