[ https://issues.apache.org/jira/browse/ARROW-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952475#comment-16952475 ]
Jacques Nadeau commented on ARROW-6896:
---------------------------------------

I disagree with the issue here. We should probably add a better description of reference count semantics, but having the container close its children makes sense. We depend on this functionality quite a bit. Generally speaking, Vectors are things that shouldn't be handed around; they should be transferred, which has clear reference management semantics. The design was based on the [AttributeSource design pattern|https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/util/AttributeSource.html] in Lucene, where you create an object once and then pass many separate pieces of data through it to minimize heap churn and pointer/reference management. I think if you're hitting the problem you describe, you're misunderstanding the goals of the codebase.

> [Java] Vector schema root should not share vectors
> --------------------------------------------------
>
>                 Key: ARROW-6896
>                 URL: https://issues.apache.org/jira/browse/ARROW-6896
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java
>            Reporter: Liya Fan
>            Assignee: Liya Fan
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Vector schema root should not share vectors. Otherwise, unexpected behavior
> would happen.
> Please note that VectorSchemaRoot is not just a container for vectors, it is
> also a resource (it implements the AutoCloseable interface), and it manages
> the life cycle of its inner vectors.
> When two VectorSchemaRoots share vectors, something unexpected may happen.
> Consider the following scenario, which is frequently encountered in a SQL
> engine.
> 1. We create a batch:
> VectorSchemaRoot oldBatch = ...
> 2. We add a vector to it, which results in a new batch:
> VectorSchemaRoot newBatch = oldBatch.addVector(vector);
> 3. We are done with the old batch, and release the resource:
> oldBatch.close();
> 4. We continue to use the new batch, but get an exception, because some
> inner vectors have been released by the old batch.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
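
For readers following the thread, here is a minimal toy model of the two ownership styles discussed above: a new root that shares the old root's vectors (the reported scenario) versus one that receives them by transfer (the approach suggested in the comment). It is only a sketch. ToyVector, ToyRoot, addVectorShared and addVectorTransferred are hypothetical names invented for illustration and are not Arrow's API; "released" is modelled as a simple boolean flag rather than real reference counting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a vector: it only tracks whether it has been released.
final class ToyVector implements AutoCloseable {
    private final String name;
    private boolean closed;

    ToyVector(String name) { this.name = name; }

    // Reading a released vector fails, mimicking use-after-release.
    String read() {
        if (closed) throw new IllegalStateException(name + " has been released");
        return name;
    }

    // Models a transfer: ownership moves to a fresh vector and this one is released.
    ToyVector transfer() {
        read(); // a released vector cannot be transferred
        closed = true;
        return new ToyVector(name);
    }

    @Override public void close() { closed = true; }
}

// Hypothetical stand-in for a container that closes every vector it holds.
final class ToyRoot implements AutoCloseable {
    private final List<ToyVector> vectors;

    ToyRoot(List<ToyVector> vectors) { this.vectors = new ArrayList<>(vectors); }

    // The new root shares the existing vectors with this root.
    ToyRoot addVectorShared(ToyVector extra) {
        List<ToyVector> next = new ArrayList<>(vectors);
        next.add(extra);
        return new ToyRoot(next);
    }

    // The new root receives the existing vectors by transfer and becomes their sole owner.
    ToyRoot addVectorTransferred(ToyVector extra) {
        List<ToyVector> next = new ArrayList<>();
        for (ToyVector v : vectors) next.add(v.transfer());
        next.add(extra);
        return new ToyRoot(next);
    }

    String readAll() {
        StringBuilder sb = new StringBuilder();
        for (ToyVector v : vectors) sb.append(v.read()).append(' ');
        return sb.toString().trim();
    }

    @Override public void close() { for (ToyVector v : vectors) v.close(); }
}

class SharedRootDemo {
    // Steps 1-4 of the scenario with shared vectors: reading the new batch fails.
    static boolean sharedFails() {
        ToyRoot oldBatch = new ToyRoot(Arrays.asList(new ToyVector("a")));
        ToyRoot newBatch = oldBatch.addVectorShared(new ToyVector("b"));
        oldBatch.close();
        try {
            newBatch.readAll();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // The same steps with transferred vectors: closing the old batch is harmless.
    static String transferredReads() {
        ToyRoot oldBatch = new ToyRoot(Arrays.asList(new ToyVector("a")));
        ToyRoot newBatch = oldBatch.addVectorTransferred(new ToyVector("b"));
        oldBatch.close();
        return newBatch.readAll();
    }

    public static void main(String[] args) {
        System.out.println("shared root fails after old close: " + sharedFails());
        System.out.println("transferred root reads: " + transferredReads());
    }
}
```

In this toy model the shared variant throws once oldBatch is closed, while the transferred variant keeps working because the old root no longer owns anything the new root reads. Whether the real fix belongs in documentation or in the container's behaviour is the question the thread is debating.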