[ https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16219836#comment-16219836 ]
Jacques Nadeau commented on ARROW-1710:
---------------------------------------

I'm one of the voices strongly arguing for dropping the additional class objects. (I was also the one who originally introduced the two separate sets when the code was first developed.) My experience has been the following:

* The extra complexity of managing two different runtime classes is very expensive (maintenance, coercing between them, managing runtime code generation, etc.).
* Most source data is declared as nullable but rarely has nulls.

As such, an adaptive interaction, where you look at cells 64 values at a time and adapt your behavior based on actual nullability (as opposed to declared nullability), provides a much better performance lift in real-world use cases than having specialized code for declared non-nullable situations.

FYI [~e.levine]: the updated approach with vectors is moving to a situation where we don't have a bit vector and ultimately also consolidates the buffer for the bits and the fixed bytes in the same buffer. In that case, there is no heap memory overhead and the direct memory overhead is 1 bit per value, far less than necessary.

Also note that in reality, most people focused on super-high-performance Java implementations interact directly with the memory. You can see an example of how we do this here:
https://github.com/dremio/dremio-oss/blob/master/sabot/kernel/src/main/java/com/dremio/sabot/op/common/ht2/Pivots.java#L89

If, in the future, people need the vector classes to have an additional set of methods such as:

allocateNewNoNull()
setSafeIgnoreNull(int index, int value)

let's just add those when someone's use case requires it. No need to have an extra set of vectors for that purpose.
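To make the "64 values at a time" idea concrete, here is a minimal sketch of what such an adaptive loop could look like. It is not taken from the Arrow or Dremio code: the class name AdaptiveNullSum, the method sum, and the use of a plain long[] as the validity bitmap (bit set = value present) are all assumptions made for illustration.

```java
// Hypothetical sketch; AdaptiveNullSum is not part of the Arrow vector API.
final class AdaptiveNullSum {

    /**
     * Sums the non-null entries of values. The validity bitmap is assumed to
     * hold one bit per value: bit (i % 64) of word (i / 64) is 1 when
     * values[i] is non-null. It must contain at least ceil(values.length / 64) words.
     */
    static long sum(int[] values, long[] validity) {
        long total = 0;
        int count = values.length;
        for (int word = 0, base = 0; base < count; word++, base += 64) {
            int limit = Math.min(64, count - base);
            // Ignore bits past the end of the data in the final word.
            long mask = (limit == 64) ? -1L : (1L << limit) - 1;
            long bits = validity[word] & mask;
            if (bits == mask) {
                // Fast path: this batch has no nulls, so skip per-value checks.
                for (int i = 0; i < limit; i++) {
                    total += values[base + i];
                }
            } else {
                // Slow path: visit only the positions whose validity bit is set.
                // If bits == 0 the whole batch is null and the loop does nothing.
                while (bits != 0) {
                    int i = Long.numberOfTrailingZeros(bits);
                    total += values[base + i];
                    bits &= bits - 1; // clear the lowest set bit
                }
            }
        }
        return total;
    }
}
```

The point of the sketch is that the branch is taken once per 64 values rather than once per value, so data that is declared nullable but rarely contains nulls runs almost entirely on the fast path without needing a separate non-nullable class.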
> [Java] Decide what to do with non-nullable vectors in new vector class hierarchy
> ---------------------------------------------------------------------------------
>
>                 Key: ARROW-1710
>                 URL: https://issues.apache.org/jira/browse/ARROW-1710
>             Project: Apache Arrow
>          Issue Type: Sub-task
>          Components: Java - Vectors
>            Reporter: Li Jin
>             Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)