[ https://issues.apache.org/jira/browse/KUDU-2968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bankim Bhavsar updated KUDU-2968:
---------------------------------
    Status: In Review  (was: Open)

> RleDecoder::GetNextRun() may attempt decoding past the last byte leading to
> assertion failure
> ---------------------------------------------------------------------------------------------
>
>                 Key: KUDU-2968
>                 URL: https://issues.apache.org/jira/browse/KUDU-2968
>             Project: Kudu
>          Issue Type: Bug
>          Components: util
>            Reporter: Bankim Bhavsar
>            Assignee: Bankim Bhavsar
>            Priority: Major
>
> RLE encoding may encode values "literally" when it doesn't find sufficient
> repeated values.
> See
> [https://github.com/apache/kudu/blob/master/src/kudu/util/rle-encoding.h#L28]
> Consider a scenario where consecutive (non-repeated) integers are encoded
> using RLE encoding. In that case the values are encoded in literal fashion.
> The literal count is encoded as well, and it is always a multiple of 8.
> When the number of values is not a multiple of 8, the literal count is
> rounded up to the next multiple of 8.
> For example, if the number of values is 100, then literal_count is 104 but
> max_bytes is correctly set at 100 for the int8 datatype.
> In this scenario, after the last value has been read, {{literal_count_}} is
> still 4 when {{GetNextRun()}} is entered with {{ret}} at 0.
> Hence the next {{GetValue}} returns false, since it is trying to read beyond
> {{max_bytes}}, and the {{DCHECK}} on its result fails.
> https://github.com/apache/kudu/blob/master/src/kudu/util/rle-encoding.h#L319
> {code}
>       DCHECK(literal_count_ > 0);
>       if (ret == 0) {
>         bool has_more = bit_reader_.GetValue(bit_width_, val);
>         DCHECK(has_more);
>         literal_count_--;
>         ret++;
>         rem--;
>       }
>       while (literal_count_ > 0) {
>         bool result = bit_reader_.GetValue(bit_width_, &current_value_);
>         DCHECK(result);
>         if (current_value_ != *val || rem == 0) {
>           bit_reader_.Rewind(bit_width_);
>           return ret;
>         }
>         ret++;
>         rem--;
>         literal_count_--;
>       }
>     }
>   }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
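
The arithmetic in the description can be reproduced with a small standalone sketch. This is not Kudu code: `PaddedLiteralCount`, `ToyReader`, `DrainResult` and `DrainLiterals` are hypothetical names introduced only for illustration, and the sketch assumes one byte per value (the int8 case from the description) and the round-up-to-8 padding rule stated there. It shows that 100 values produce a literal count of 104, and that a reader bounded by the 100-byte buffer runs out while 4 literals are still "pending".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Rounds a literal run length up to the next multiple of 8
// (assumed padding rule, taken from the issue description).
constexpr int PaddedLiteralCount(int num_values) {
  return (num_values + 7) / 8 * 8;
}

// Hypothetical stand-in for a bounded reader; NOT Kudu's BitReader.
// It refuses to read past the end of the buffer, like max_bytes does.
class ToyReader {
 public:
  explicit ToyReader(const std::vector<uint8_t>& buf) : buf_(buf) {}

  // Returns false instead of reading beyond the last byte.
  bool GetValue(uint8_t* out) {
    assert(out != nullptr);
    if (pos_ >= buf_.size()) return false;
    *out = buf_[pos_++];
    return true;
  }

 private:
  const std::vector<uint8_t>& buf_;
  std::size_t pos_ = 0;
};

struct DrainResult {
  int values_read;         // reads that actually succeeded
  int literal_count_left;  // padded count still outstanding at end of buffer
};

// Reads literals until either the padded count or the buffer is exhausted.
DrainResult DrainLiterals(const std::vector<uint8_t>& buf) {
  ToyReader reader(buf);
  int literal_count = PaddedLiteralCount(static_cast<int>(buf.size()));
  int values_read = 0;
  uint8_t value = 0;
  while (literal_count > 0 && reader.GetValue(&value)) {
    ++values_read;
    --literal_count;
  }
  return {values_read, literal_count};
}
```

With a 100-byte input, `DrainLiterals` stops after 100 successful reads with 4 literals left over. A decoder that trusts the padded count alone would attempt the 101st read, which is the situation in which the `DCHECK` in the quoted snippet fires.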