Gary Helmling created HBASE-18072:
-------------------------------------

             Summary: Malformed Cell from client causes Regionserver abort on 
flush
                 Key: HBASE-18072
                 URL: https://issues.apache.org/jira/browse/HBASE-18072
             Project: HBase
          Issue Type: Bug
          Components: regionserver, rpc
    Affects Versions: 1.3.0
            Reporter: Gary Helmling
            Assignee: Gary Helmling
            Priority: Critical


When a client writes a mutation with a Cell with a corrupted value length 
field, it is possible for the corrupt cell to trigger an exception on memstore 
flush, which will trigger regionserver aborts until the region is manually 
recovered.

This boils down to a lack of validation on the client submitted byte[] backing 
the cell.

Consider the following sequence:

1. Client creates a new Put with a cell with value of byte[16]
2. When the backing KeyValue for the Put is created, we serialize 16 for the 
value length field in the backing array
3. Client calls Table.put()
4. RpcClientImpl calls KeyValueEncoder.encode() to serialize the Cell to the 
OutputStream
5. Memory corruption in the backing array changes the serialized contents of 
the value length field from 16 to 48
6. Regionserver handling the put uses KeyValueDecoder.decode() to create a 
KeyValue with the byte[] read directly off the InputStream.  The overall length 
of the array is correct, but the integer value serialized at the value length 
offset has been corrupted from the original value of 16 to 48.
7. The corrupt KeyValue is appended to the WAL and added to the memstore
8. After some time, the memstore flushes.  As HFileWriter is writing out the 
corrupted cell, it reads the serialized int from the value length position in 
the cell's byte[] to determine the number of bytes to write for the value.  
Because value offset + 48 is greater than the length of the cell's byte[], we 
hit an IndexOutOfBoundsException:
{noformat}
java.lang.IndexOutOfBoundsException
        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:151)
        at java.io.DataOutputStream.write(DataOutputStream.java:107)
        at 
org.apache.hadoop.hbase.io.hfile.NoOpDataBlockEncoder.encode(NoOpDataBlockEncoder.java:56)
        at 
org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.write(HFileBlock.java:954)
        at 
org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:284)
        at 
org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87)
        at 
org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:1041)
        at 
org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:138)
        at 
org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:75)
        at 
org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:937)
        at 
org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2413)
        at 
org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2456)
{noformat}
9. Regionserver aborts due to the failed flush
10. The regionserver WAL is split into recovered.edits files, one of these 
containing the same corrupted cell
11. A new regionserver is assigned the region with the corrupted write
12. The new regionserver replays the recovered.edits entries into memstore and 
then tries to flush the memstore to an HFile
13. The flush triggers the same IndexOutOfBoundsException, causing us to go 
back to step #8 and loop on repeat until manual intervention is taken

The corrupted cell basically becomes a poison pill that aborts regionservers 
one at a time as the region with the problem edit is passed around.  This also 
means that a malicious client could easily construct requests allowing a denial 
of service attack against regionservers hosting any tables that the client has 
write access to.

At bare minimum, I think we need to do a sanity check on all the lengths for 
Cells read off the CellScanner for incoming requests.  This would allow us to 
reject corrupt cells before we append them to the WAL and succeed the request, 
putting us in a position where we cannot recover.  This would only detect the 
corruption of length fields which puts us in a bad state.

Whether or not Cells should carry some checksum generated at the time the Cell 
is created, which could then validated on the server-side, is a separate 
question.  This would allow detection of other parts of the backing cell 
byte[], such as within the key fields or the value field.  But the computer 
overhead of this may be too heavyweight to be practical.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to