Bryan Cutler created ARROW-2594:
-----------------------------------

             Summary: Vector reallocation does not properly clear reused buffers
                 Key: ARROW-2594
                 URL: https://issues.apache.org/jira/browse/ARROW-2594
             Project: Apache Arrow
          Issue Type: Bug
          Components: Java - Vectors
            Reporter: Bryan Cutler
            Assignee: Bryan Cutler

When reallocating a vector buffer, the code assumes that the first half of the new buffer is either clean or populated from the previous buffer, and zeros out only the second half. This assumption does not hold if the vector has released its buffer and the current capacity is 0 (empty). If the new buffer contains stale values, they will show up as bogus values when the vector is used.

I came across this when looking into SPARK-23030, due to the comment here: https://github.com/apache/spark/pull/21312#issuecomment-389035697
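
To make the failure mode concrete, here is a minimal sketch that uses plain byte arrays instead of Arrow's buffer classes. Everything in it (`ReallocSketch`, `allocateDirty`, `reallocBuggy`, `reallocFixed`) is hypothetical and only mimics the behavior described above; it is not the actual Arrow code, and the real fix may look different.

```java
import java.util.Arrays;

/** Simplified stand-in for a vector's data buffer; NOT Arrow's actual classes. */
final class ReallocSketch {

    /** Assumption: the allocator may hand back recycled, non-zeroed memory. */
    static byte[] allocateDirty(int size) {
        byte[] buf = new byte[size];
        Arrays.fill(buf, (byte) 0x7F); // stale bytes left by a previous user
        return buf;
    }

    /** Mimics the reported behavior: copy old data, zero only the second half. */
    static byte[] reallocBuggy(byte[] current, int newCapacity) {
        byte[] next = allocateDirty(newCapacity);
        System.arraycopy(current, 0, next, 0, current.length);
        Arrays.fill(next, newCapacity / 2, newCapacity, (byte) 0);
        return next;
    }

    /** One possible fix: zero everything past the bytes actually copied. */
    static byte[] reallocFixed(byte[] current, int newCapacity) {
        byte[] next = allocateDirty(newCapacity);
        System.arraycopy(current, 0, next, 0, current.length);
        Arrays.fill(next, current.length, newCapacity, (byte) 0);
        return next;
    }

    /** Returns true if every byte in the buffer is zero. */
    static boolean allZero(byte[] buf) {
        for (byte b : buf) {
            if (b != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] released = new byte[0]; // vector released its buffer, capacity 0
        System.out.println("buggy clean: " + allZero(reallocBuggy(released, 8)));
        System.out.println("fixed clean: " + allZero(reallocFixed(released, 8)));
    }
}
```

When the old buffer is exactly half the new capacity, both variants give the same result, which is presumably why the problem only surfaces once the capacity has dropped to 0.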