Vladimir,

I can explain what happened, but not how to correct the problem.  The gentleman 
who can walk you through a repair is tied up on another project, but he 
intends to respond as soon as he is able.

We recently discovered / realized that Google's leveldb code does not check the 
CRC of each block rewritten during a compaction.  This means that blocks with 
bad CRCs get read without being flagged as bad, then rewritten to a new file 
with a new, valid CRC.  The corruption is now hidden.
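
The effect described above can be reproduced with a small sketch. This is an illustration only, not leveldb's code: the block layout (payload plus a 4-byte trailer), the helper names write_block, crc_ok and compact, and the use of Python's zlib.crc32 as a stand-in checksum are all assumptions made for the example.

```python
import zlib

# Assumed toy layout: payload followed by a 4-byte CRC trailer.
# zlib.crc32 is only a stand-in for whatever checksum the real store uses.

def write_block(payload: bytes) -> bytes:
    """Return the payload with a freshly computed CRC appended."""
    return payload + zlib.crc32(payload).to_bytes(4, "little")

def crc_ok(block: bytes) -> bool:
    """Check the stored CRC trailer against the payload."""
    payload, stored = block[:-4], int.from_bytes(block[-4:], "little")
    return zlib.crc32(payload) == stored

def compact(block: bytes, verify: bool) -> bytes:
    """Rewrite a block to a 'new file', optionally checking its CRC first."""
    if verify and not crc_ok(block):
        raise ValueError("bad block CRC detected during compaction")
    # A new CRC is computed over whatever bytes were read, good or bad.
    return write_block(block[:-4])

original = write_block(b"riak object bytes")
damaged = bytes([original[0] ^ 0xFF]) + original[1:]  # corrupt the first byte

hidden = compact(damaged, verify=False)  # unchecked rewrite

print("damaged block passes CRC:  ", crc_ok(damaged))
print("rewritten block passes CRC:", crc_ok(hidden))
print("payload still intact:      ", hidden[:-4] == b"riak object bytes")
```

Calling compact(damaged, verify=True) in this sketch raises an error instead of rewriting the block, which is the behavior a CRC check during compaction is meant to provide.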

A more thorough discussion of the problem is found here:

https://github.com/basho/leveldb/wiki/mv-verify-compactions


We added code to the 1.3.2 and 1.4 Riak releases to have the block CRC checked 
during both read (Get) requests and compaction rewrites.  This prevents future 
corruption hiding.  Unfortunately, it does NOTHING for blocks already corrupted 
and rewritten with valid CRCs.  You are encountering this latter condition.  We 
have a developer advocate / client services person who has walked others 
through a fix via the Riak data replicas … 

… please hold and the doctor will be with you shortly.

Matthew


On Jul 24, 2013, at 9:39 PM, Vladimir Shabanov <vshaban...@gmail.com> wrote:

> Hello,
> 
> Recently I've started expanding my Riak cluster and found that handoffs were 
> continuously retried for one partition.
> 
> Here are logs from two nodes
> https://gist.github.com/vshabanov/41282e622479fbe81974
> 
> The most interesting parts of logs are
> "Handoff receiver for partition ... exited abnormally after processing 
> 2860338 objects: {{badarg,[{erlang,binary_to_term,..."
> and
> "bad argument in call to erlang:binary_to_term(<<131,104,...."
> 
> Both nodes are running Riak 1.3.2 (old one was running 1.3.1 previously).
> 
> 
> When I printed the corrupted binary string, I found that it corresponds to one 
> value.
> 
> When I tried to "get" it, it was read OK, but the node with the corrupted value 
> showed the same binary_to_term error.
> 
> When I tried to delete the corrupted value, I got a timeout.
> 
> 
> I'm running machines with ECC memory and ZFS filesystem (which doesn't report 
> any checksum failures) so I doubt data was silently corrupted on disk.
> 
> The LOG from the corresponding LevelDB partition doesn't show any errors. But 
> there is a lost/BLOCKS.bad file in this partition (7kb, created more than a 
> month ago, and it looks like it doesn't contain the corrupted value).
> 
> At the moment I've stopped handoffs using "riak-admin transfer-limit 0".
> 
> Why was the value corrupted? Is there any way to remove it or fix it?
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
