[ 
https://issues.apache.org/jira/browse/ARROW-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956199#comment-16956199
 ] 

Micah Kornfield commented on ARROW-6896:
----------------------------------------

{quote} I think you're mixing up the design of the library. VectorSchemaRoot != 
a record batch.
{quote}
I think it would help to add clarification on this topic to the javadoc for 
VectorSchemaRoot (the documentation is a little sparse). I, at least, thought 
they were synonymous (or could at least be used in both ways).  Is performance 
the reason not to create new VectorSchemaRoots?  

 

CC [~tianchen92] because I think this affects the documentation that he is 
writing, and probably affects the API for a VectorSchemaRootIterator in the 
JDBC and Avro libraries.

> [Java] Vector schema root should not share vectors
> --------------------------------------------------
>
>                 Key: ARROW-6896
>                 URL: https://issues.apache.org/jira/browse/ARROW-6896
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java
>            Reporter: Liya Fan
>            Assignee: Liya Fan
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Vector schema roots should not share vectors. Otherwise, unexpected behavior 
> can occur. 
> Please note that VectorSchemaRoot is not just a container for vectors; it is 
> also a resource (it implements the AutoCloseable interface), and it manages 
> the life cycle of its inner vectors.
> When two VectorSchemaRoots share vectors, something unexpected may happen. 
> Consider the following scenario, which is frequently encountered in a SQL 
> engine.
> 1. We create a batch:
> VectorSchemaRoot oldBatch = ...
> 2. We add a vector to it, which results in a new batch
> VectorSchemaRoot newBatch = oldBatch.addVector(vector);
> 3. We are done with the old batch, and release the resource
> oldBatch.close();
> 4. We continue to use the new batch, but get an exception, because some 
> inner vectors have been released by the old batch. 
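
The four steps in the description above can be reproduced with a small, self-contained sketch. Note that ToyVector, ToyRoot and SharedVectorDemo are hypothetical stand-ins written for illustration only. They are not the Arrow API, and the real addVector signature and buffer management differ. The sketch only assumes what the issue states: a root releases its inner vectors on close(), and adding a vector yields a new root that shares the old vectors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an Arrow vector: it only tracks whether it was released.
final class ToyVector {
    private final String name;
    private boolean released = false;

    ToyVector(String name) {
        this.name = name;
    }

    void release() {
        released = true;
    }

    String read() {
        if (released) {
            throw new IllegalStateException("vector '" + name + "' already released");
        }
        return name;
    }
}

// Hypothetical stand-in for VectorSchemaRoot: a resource that releases its vectors on close().
final class ToyRoot implements AutoCloseable {
    private final List<ToyVector> vectors;

    ToyRoot(List<ToyVector> vectors) {
        this.vectors = vectors;
    }

    // Returns a new root that SHARES the existing vectors and holds one more.
    ToyRoot addVector(ToyVector extra) {
        List<ToyVector> combined = new ArrayList<>(vectors);
        combined.add(extra);
        return new ToyRoot(combined);
    }

    // Reads every vector in order; fails on the first released one.
    String readAll() {
        StringBuilder sb = new StringBuilder();
        for (ToyVector v : vectors) {
            sb.append(v.read()).append(' ');
        }
        return sb.toString().trim();
    }

    @Override
    public void close() {
        for (ToyVector v : vectors) {
            v.release();
        }
    }
}

final class SharedVectorDemo {
    static String run() {
        // 1. Create a batch.
        ToyRoot oldBatch = new ToyRoot(
            new ArrayList<>(Arrays.asList(new ToyVector("a"), new ToyVector("b"))));
        // 2. Add a vector, which results in a new batch sharing "a" and "b".
        ToyRoot newBatch = oldBatch.addVector(new ToyVector("c"));
        // 3. Release the old batch.
        oldBatch.close();
        // 4. Using the new batch now fails, because "a" and "b" were released.
        try {
            return newBatch.readAll();
        } catch (IllegalStateException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this toy model the failure goes away if ownership of the shared vectors is made explicit, for example by having only one root responsible for releasing them. Which approach fits Arrow itself is the open question in this issue.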



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
