[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262971#comment-16262971
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

BryanCutler commented on issue #1341: [WIP] ARROW-1710: [Java] Remove 
Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#issuecomment-346422375
 
 
   > Do we want to port the missing function (getNullCount() and setRangeToOne) 
to the new bit vector?
   The new vector has `getNullCount()` just that it only counts nulls, maybe 
the equivalent would be `getZeroCount()` but I'm not sure how useful that 
really is.  I think I saw there was also a `setRangeToOne` and I'm ok with 
adding those.
   
   > Can we remove NonNullableMapVector altogether? (git rid of the MapVector 
extends NonNullableMapVector inherence and roll them into a single class )
   
   I was thinking this is what we were going to do, maybe as a followup though, 
so +1 for me.  One thing I'm not sure of is ullability useful for this vector?  
For example, Spark StructType doesn't have a nullable param, it's up to the 
child type definitions.  But maybe it's different in the Arrow sense, I'll have 
to think about that..
   
   Happy Thanksgiving to you all too!  You guys work too hard, I don't want to 
see any PRs coming through for a couple days :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262948#comment-16262948
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

BryanCutler commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152630603
 
 

 ##
 File path: java/vector/src/main/java/org/apache/arrow/vector/BitVector.java
 ##
 @@ -18,342 +18,469 @@
 
 package org.apache.arrow.vector;
 
+import io.netty.buffer.ArrowBuf;
 import org.apache.arrow.memory.BufferAllocator;
-import org.apache.arrow.memory.BaseAllocator;
-import org.apache.arrow.memory.OutOfMemoryException;
+import org.apache.arrow.vector.complex.impl.BitReaderImpl;
 import org.apache.arrow.vector.complex.reader.FieldReader;
 import org.apache.arrow.vector.holders.BitHolder;
 import org.apache.arrow.vector.holders.NullableBitHolder;
-import org.apache.arrow.vector.schema.ArrowFieldNode;
-import org.apache.arrow.vector.types.Types.MinorType;
-import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.FieldType;
 import org.apache.arrow.vector.util.OversizedAllocationException;
 import org.apache.arrow.vector.util.TransferPair;
 
-import io.netty.buffer.ArrowBuf;
-
 /**
- * Bit implements a vector of bit-width values. Elements in the vector are 
accessed by position from the logical start
- * of the vector. The width of each element is 1 bit. The equivalent Java 
primitive is an int containing the value '0'
- * or '1'.
+ * BitVector implements a fixed width (1 bit) vector of
+ * boolean values which could be null. Each value in the vector corresponds
+ * to a single bit in the underlying data stream backing the vector.
  */
-public final class BitVector extends BaseDataValueVector implements 
FixedWidthVector {
-  static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(BitVector.class);
-
-  private final Accessor accessor = new Accessor();
-  private final Mutator mutator = new Mutator();
-
-  int valueCount;
-  private int allocationSizeInBytes = 
getSizeFromCount(INITIAL_VALUE_ALLOCATION);
-  private int allocationMonitor = 0;
+public class BitVector extends BaseFixedWidthVector {
+  private final FieldReader reader;
 
+  /**
+   * Instantiate a BitVector. This doesn't allocate any memory for
+   * the data in vector.
+   *
+   * @param name  name of the vector
+   * @param allocator allocator for memory management.
+   */
   public BitVector(String name, BufferAllocator allocator) {
-super(name, allocator);
+this(name, FieldType.nullable(Types.MinorType.BIT.getType()),
+allocator);
   }
 
-  @Override
-  public void load(ArrowFieldNode fieldNode, ArrowBuf data) {
-// When the vector is all nulls or all defined, the content of the buffer 
can be omitted
-if (data.readableBytes() == 0 && fieldNode.getLength() != 0) {
-  int count = fieldNode.getLength();
-  allocateNew(count);
-  int n = getSizeFromCount(count);
-  if (fieldNode.getNullCount() == 0) {
-// all defined
-// create an all 1s buffer
-// set full bytes
-int fullBytesCount = count / 8;
-for (int i = 0; i < fullBytesCount; ++i) {
-  this.data.setByte(i, 0xFF);
-}
-int remainder = count % 8;
-// set remaining bits
-if (remainder > 0) {
-  byte bitMask = (byte) (0xFFL >>> ((8 - remainder) & 7));
-  this.data.setByte(fullBytesCount, bitMask);
-}
-  } else if (fieldNode.getNullCount() == fieldNode.getLength()) {
-// all null
-// create an all 0s buffer
-zeroVector();
-  } else {
-throw new IllegalArgumentException("The buffer can be empty only if 
there's no data or it's all null or all defined");
-  }
-  this.data.writerIndex(n);
-} else {
-  super.load(fieldNode, data);
-}
-this.valueCount = fieldNode.getLength();
+  /**
+   * Instantiate a BitVector. This doesn't allocate any memory for
+   * the data in vector.
+   *
+   * @param name  name of the vector
+   * @param fieldType type of Field materialized by this vector
+   * @param allocator allocator for memory management.
+   */
+  public BitVector(String name, FieldType fieldType, BufferAllocator 
allocator) {
+super(name, allocator, fieldType, (byte) 0);
+reader = new BitReaderImpl(BitVector.this);
   }
 
+  /**
+   * Get a reader that supports reading values from this vector
+   *
+   * @return Field Reader for this vector
+   */
   @Override
-  public Field getField() {
-throw new UnsupportedOperationException("internal vector");
+  public FieldReader getReader() {
+return reader;
   }
 
+  /**
+   * Get minor type for this vector. The vector holds values belonging
+   * to a particular type.
+   *
+   * @return 

[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262803#comment-16262803
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

icexelloss commented on issue #1341: [WIP] ARROW-1710: [Java] Remove 
Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#issuecomment-346386518
 
 
   @BryanCutler High level looks good to me. Two major comments:
   * Do we want to port the missing function (`getNullCount()` and 
`setRangeToOne`) to the new bit vector?
   * Can we remove `NonNullableMapVector` altogether? (git rid of the 
`MapVector extends NonNullableMapVector` inherence and roll them into a single 
class )
   
   Happy thanksgiving!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262798#comment-16262798
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

icexelloss commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152598542
 
 

 ##
 File path: 
java/vector/src/test/java/org/apache/arrow/vector/TestOversizedAllocationForValueVector.java
 ##
 @@ -112,10 +112,10 @@ public void testVariableVectorReallocation() {
 try {
   vector.allocateNew(expectedAllocationInBytes, 10);
   assertTrue(expectedOffsetSize <= vector.getValueCapacity());
-  assertTrue(expectedAllocationInBytes <= vector.getBuffer().capacity());
+  assertTrue(expectedAllocationInBytes <= 
vector.getDataBuffer().capacity());
 
 Review comment:
   Why this is checking DataBuffer and the check after reAlloc is checking 
OffsetBuffer?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262792#comment-16262792
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

icexelloss commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152597645
 
 

 ##
 File path: java/vector/src/test/java/org/apache/arrow/vector/TestBitVector.java
 ##
 @@ -526,15 +505,17 @@ private void validateRange(int length, int start, int 
count) {
 try (BitVector bitVector = new BitVector("bits", allocator)) {
   bitVector.reset();
   bitVector.allocateNew(length);
-  bitVector.getMutator().setRangeToOne(start, count);
+  for (int i = start; i < start + count; i++) {
+bitVector.set(i, 1);
+  }
 
 Review comment:
   Should we consider add `setRangeToOne` back in `BitVector`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262788#comment-16262788
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

icexelloss commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152596476
 
 

 ##
 File path: java/vector/src/test/java/org/apache/arrow/vector/TestBitVector.java
 ##
 @@ -426,81 +405,81 @@ public void testReallocAfterVectorTransfer2() {
   public void testBitVector() {
 // Create a new value vector for 1024 integers
 try (final BitVector vector = new BitVector(EMPTY_SCHEMA_PATH, allocator)) 
{
-  final BitVector.Mutator m = vector.getMutator();
   vector.allocateNew(1024);
-  m.setValueCount(1024);
+  vector.setValueCount(1024);
 
   // Put and set a few values
-  m.set(0, 1);
-  m.set(1, 0);
-  m.set(100, 0);
-  m.set(1022, 1);
+  vector.set(0, 1);
+  vector.set(1, 0);
+  vector.set(100, 0);
+  vector.set(1022, 1);
 
-  m.setValueCount(1024);
+  vector.setValueCount(1024);
 
-  final BitVector.Accessor accessor = vector.getAccessor();
-  assertEquals(1, accessor.get(0));
-  assertEquals(0, accessor.get(1));
-  assertEquals(0, accessor.get(100));
-  assertEquals(1, accessor.get(1022));
+  assertEquals(1, vector.get(0));
+  assertEquals(0, vector.get(1));
+  assertEquals(0, vector.get(100));
+  assertEquals(1, vector.get(1022));
 
-  assertEquals(1022, accessor.getNullCount());
+  assertEquals(1020, vector.getNullCount());
 
 Review comment:
   For migration purpose, should me provide an alternative method to provide 
the old behavior? Something like `getNullAndUnsetCount()`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262789#comment-16262789
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

icexelloss commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152596476
 
 

 ##
 File path: java/vector/src/test/java/org/apache/arrow/vector/TestBitVector.java
 ##
 @@ -426,81 +405,81 @@ public void testReallocAfterVectorTransfer2() {
   public void testBitVector() {
 // Create a new value vector for 1024 integers
 try (final BitVector vector = new BitVector(EMPTY_SCHEMA_PATH, allocator)) 
{
-  final BitVector.Mutator m = vector.getMutator();
   vector.allocateNew(1024);
-  m.setValueCount(1024);
+  vector.setValueCount(1024);
 
   // Put and set a few values
-  m.set(0, 1);
-  m.set(1, 0);
-  m.set(100, 0);
-  m.set(1022, 1);
+  vector.set(0, 1);
+  vector.set(1, 0);
+  vector.set(100, 0);
+  vector.set(1022, 1);
 
-  m.setValueCount(1024);
+  vector.setValueCount(1024);
 
-  final BitVector.Accessor accessor = vector.getAccessor();
-  assertEquals(1, accessor.get(0));
-  assertEquals(0, accessor.get(1));
-  assertEquals(0, accessor.get(100));
-  assertEquals(1, accessor.get(1022));
+  assertEquals(1, vector.get(0));
+  assertEquals(0, vector.get(1));
+  assertEquals(0, vector.get(100));
+  assertEquals(1, vector.get(1022));
 
-  assertEquals(1022, accessor.getNullCount());
+  assertEquals(1020, vector.getNullCount());
 
 Review comment:
   For migration purpose, should me provide an alternative method to provide 
the old behavior? Something like `getNullAndUnsetCount()`. Seems like a weird 
API though.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262773#comment-16262773
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

icexelloss commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152593713
 
 

 ##
 File path: java/vector/src/main/java/org/apache/arrow/vector/BitVector.java
 ##
 @@ -18,342 +18,469 @@
 
 package org.apache.arrow.vector;
 
+import io.netty.buffer.ArrowBuf;
 import org.apache.arrow.memory.BufferAllocator;
-import org.apache.arrow.memory.BaseAllocator;
-import org.apache.arrow.memory.OutOfMemoryException;
+import org.apache.arrow.vector.complex.impl.BitReaderImpl;
 import org.apache.arrow.vector.complex.reader.FieldReader;
 import org.apache.arrow.vector.holders.BitHolder;
 import org.apache.arrow.vector.holders.NullableBitHolder;
-import org.apache.arrow.vector.schema.ArrowFieldNode;
-import org.apache.arrow.vector.types.Types.MinorType;
-import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.FieldType;
 import org.apache.arrow.vector.util.OversizedAllocationException;
 import org.apache.arrow.vector.util.TransferPair;
 
-import io.netty.buffer.ArrowBuf;
-
 /**
- * Bit implements a vector of bit-width values. Elements in the vector are 
accessed by position from the logical start
- * of the vector. The width of each element is 1 bit. The equivalent Java 
primitive is an int containing the value '0'
- * or '1'.
+ * BitVector implements a fixed width (1 bit) vector of
+ * boolean values which could be null. Each value in the vector corresponds
+ * to a single bit in the underlying data stream backing the vector.
  */
-public final class BitVector extends BaseDataValueVector implements 
FixedWidthVector {
-  static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(BitVector.class);
-
-  private final Accessor accessor = new Accessor();
-  private final Mutator mutator = new Mutator();
-
-  int valueCount;
-  private int allocationSizeInBytes = 
getSizeFromCount(INITIAL_VALUE_ALLOCATION);
-  private int allocationMonitor = 0;
+public class BitVector extends BaseFixedWidthVector {
+  private final FieldReader reader;
 
+  /**
+   * Instantiate a BitVector. This doesn't allocate any memory for
+   * the data in vector.
+   *
+   * @param name  name of the vector
+   * @param allocator allocator for memory management.
+   */
   public BitVector(String name, BufferAllocator allocator) {
-super(name, allocator);
+this(name, FieldType.nullable(Types.MinorType.BIT.getType()),
+allocator);
   }
 
-  @Override
-  public void load(ArrowFieldNode fieldNode, ArrowBuf data) {
-// When the vector is all nulls or all defined, the content of the buffer 
can be omitted
-if (data.readableBytes() == 0 && fieldNode.getLength() != 0) {
-  int count = fieldNode.getLength();
-  allocateNew(count);
-  int n = getSizeFromCount(count);
-  if (fieldNode.getNullCount() == 0) {
-// all defined
-// create an all 1s buffer
-// set full bytes
-int fullBytesCount = count / 8;
-for (int i = 0; i < fullBytesCount; ++i) {
-  this.data.setByte(i, 0xFF);
-}
-int remainder = count % 8;
-// set remaining bits
-if (remainder > 0) {
-  byte bitMask = (byte) (0xFFL >>> ((8 - remainder) & 7));
-  this.data.setByte(fullBytesCount, bitMask);
-}
-  } else if (fieldNode.getNullCount() == fieldNode.getLength()) {
-// all null
-// create an all 0s buffer
-zeroVector();
-  } else {
-throw new IllegalArgumentException("The buffer can be empty only if 
there's no data or it's all null or all defined");
-  }
-  this.data.writerIndex(n);
-} else {
-  super.load(fieldNode, data);
-}
-this.valueCount = fieldNode.getLength();
+  /**
+   * Instantiate a BitVector. This doesn't allocate any memory for
+   * the data in vector.
+   *
+   * @param name  name of the vector
+   * @param fieldType type of Field materialized by this vector
+   * @param allocator allocator for memory management.
+   */
+  public BitVector(String name, FieldType fieldType, BufferAllocator 
allocator) {
+super(name, allocator, fieldType, (byte) 0);
+reader = new BitReaderImpl(BitVector.this);
   }
 
+  /**
+   * Get a reader that supports reading values from this vector
+   *
+   * @return Field Reader for this vector
+   */
   @Override
-  public Field getField() {
-throw new UnsupportedOperationException("internal vector");
+  public FieldReader getReader() {
+return reader;
   }
 
+  /**
+   * Get minor type for this vector. The vector holds values belonging
+   * to a particular type.
+   *
+   * @return 

[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261761#comment-16261761
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

BryanCutler commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152441194
 
 

 ##
 File path: 
java/vector/src/main/java/org/apache/arrow/vector/complex/MapVector.java
 ##
 @@ -21,332 +21,492 @@
 import static com.google.common.base.Preconditions.checkNotNull;
 
 import java.util.ArrayList;
-import java.util.Collection;
-import java.util.Iterator;
+import java.util.Arrays;
+import java.util.Collections;
 import java.util.List;
-import java.util.Map;
 
-import javax.annotation.Nullable;
-
-import com.google.common.base.Preconditions;
-import com.google.common.collect.Ordering;
-import com.google.common.primitives.Ints;
+import com.google.common.collect.ObjectArrays;
 
 import io.netty.buffer.ArrowBuf;
-
+import org.apache.arrow.memory.BaseAllocator;
 import org.apache.arrow.memory.BufferAllocator;
 import org.apache.arrow.vector.*;
-import org.apache.arrow.vector.complex.impl.SingleMapReaderImpl;
-import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.complex.impl.NullableMapReaderImpl;
+import org.apache.arrow.vector.complex.impl.NullableMapWriter;
 import org.apache.arrow.vector.holders.ComplexHolder;
-import org.apache.arrow.vector.types.Types.MinorType;
+import org.apache.arrow.vector.schema.ArrowFieldNode;
 import org.apache.arrow.vector.types.pojo.ArrowType;
-import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.ArrowType.Struct;
+import org.apache.arrow.vector.types.pojo.DictionaryEncoding;
 import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Field;
 import org.apache.arrow.vector.util.CallBack;
-import org.apache.arrow.vector.util.JsonStringHashMap;
+import org.apache.arrow.vector.util.OversizedAllocationException;
 import org.apache.arrow.vector.util.TransferPair;
 
-public class MapVector extends AbstractMapVector {
-  //private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(MapVector.class);
+public class MapVector extends NonNullableMapVector implements FieldVector {
 
 Review comment:
   Yes, the above is correct and what I was referencing in (1) from 
https://github.com/apache/arrow/pull/1341#issuecomment-345871911.  Maybe I 
missed the discussion, so I thought we would be removing the 
`NonNullableMapVector` and would combine the 2 classes.  Why do we need to keep 
both?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261754#comment-16261754
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

siddharthteotia commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152440144
 
 

 ##
 File path: 
java/vector/src/main/java/org/apache/arrow/vector/complex/MapVector.java
 ##
 @@ -21,332 +21,492 @@
 import static com.google.common.base.Preconditions.checkNotNull;
 
 import java.util.ArrayList;
-import java.util.Collection;
-import java.util.Iterator;
+import java.util.Arrays;
+import java.util.Collections;
 import java.util.List;
-import java.util.Map;
 
-import javax.annotation.Nullable;
-
-import com.google.common.base.Preconditions;
-import com.google.common.collect.Ordering;
-import com.google.common.primitives.Ints;
+import com.google.common.collect.ObjectArrays;
 
 import io.netty.buffer.ArrowBuf;
-
+import org.apache.arrow.memory.BaseAllocator;
 import org.apache.arrow.memory.BufferAllocator;
 import org.apache.arrow.vector.*;
-import org.apache.arrow.vector.complex.impl.SingleMapReaderImpl;
-import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.complex.impl.NullableMapReaderImpl;
+import org.apache.arrow.vector.complex.impl.NullableMapWriter;
 import org.apache.arrow.vector.holders.ComplexHolder;
-import org.apache.arrow.vector.types.Types.MinorType;
+import org.apache.arrow.vector.schema.ArrowFieldNode;
 import org.apache.arrow.vector.types.pojo.ArrowType;
-import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.ArrowType.Struct;
+import org.apache.arrow.vector.types.pojo.DictionaryEncoding;
 import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Field;
 import org.apache.arrow.vector.util.CallBack;
-import org.apache.arrow.vector.util.JsonStringHashMap;
+import org.apache.arrow.vector.util.OversizedAllocationException;
 import org.apache.arrow.vector.util.TransferPair;
 
-public class MapVector extends AbstractMapVector {
-  //private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(MapVector.class);
+public class MapVector extends NonNullableMapVector implements FieldVector {
 
 Review comment:
   This is slightly different from changing the name of NullableInt to Int 
since we are going to remove the latter one. 
   
   For the MapVector case, I don't think so we are removing anything and 
instead keeping around both -- base class MapVector and subclass 
NullableMapVector. So probably no need to change name here.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261744#comment-16261744
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

siddharthteotia commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152439861
 
 

 ##
 File path: 
java/vector/src/main/java/org/apache/arrow/vector/complex/MapVector.java
 ##
 @@ -21,332 +21,492 @@
 import static com.google.common.base.Preconditions.checkNotNull;
 
 import java.util.ArrayList;
-import java.util.Collection;
-import java.util.Iterator;
+import java.util.Arrays;
+import java.util.Collections;
 import java.util.List;
-import java.util.Map;
 
-import javax.annotation.Nullable;
-
-import com.google.common.base.Preconditions;
-import com.google.common.collect.Ordering;
-import com.google.common.primitives.Ints;
+import com.google.common.collect.ObjectArrays;
 
 import io.netty.buffer.ArrowBuf;
-
+import org.apache.arrow.memory.BaseAllocator;
 import org.apache.arrow.memory.BufferAllocator;
 import org.apache.arrow.vector.*;
-import org.apache.arrow.vector.complex.impl.SingleMapReaderImpl;
-import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.complex.impl.NullableMapReaderImpl;
+import org.apache.arrow.vector.complex.impl.NullableMapWriter;
 import org.apache.arrow.vector.holders.ComplexHolder;
-import org.apache.arrow.vector.types.Types.MinorType;
+import org.apache.arrow.vector.schema.ArrowFieldNode;
 import org.apache.arrow.vector.types.pojo.ArrowType;
-import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.ArrowType.Struct;
+import org.apache.arrow.vector.types.pojo.DictionaryEncoding;
 import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Field;
 import org.apache.arrow.vector.util.CallBack;
-import org.apache.arrow.vector.util.JsonStringHashMap;
+import org.apache.arrow.vector.util.OversizedAllocationException;
 import org.apache.arrow.vector.util.TransferPair;
 
-public class MapVector extends AbstractMapVector {
-  //private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(MapVector.class);
+public class MapVector extends NonNullableMapVector implements FieldVector {
 
 Review comment:
   If I understand it correctly, the original MapVector has now become 
NonNullableMapVector and NullableMapVector (subclass of of MapVector) has now 
become MapVector. @BryanCutler  can confirm.
   
   I would prefer to not change the naming here since there are essentially two 
classes -- NullableMapVector and MapVector where the former one is a MapVector 
with a validity buffer.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261736#comment-16261736
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

icexelloss commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152439296
 
 

 ##
 File path: 
java/vector/src/main/java/org/apache/arrow/vector/complex/MapVector.java
 ##
 @@ -21,332 +21,492 @@
 import static com.google.common.base.Preconditions.checkNotNull;
 
 import java.util.ArrayList;
-import java.util.Collection;
-import java.util.Iterator;
+import java.util.Arrays;
+import java.util.Collections;
 import java.util.List;
-import java.util.Map;
 
-import javax.annotation.Nullable;
-
-import com.google.common.base.Preconditions;
-import com.google.common.collect.Ordering;
-import com.google.common.primitives.Ints;
+import com.google.common.collect.ObjectArrays;
 
 import io.netty.buffer.ArrowBuf;
-
+import org.apache.arrow.memory.BaseAllocator;
 import org.apache.arrow.memory.BufferAllocator;
 import org.apache.arrow.vector.*;
-import org.apache.arrow.vector.complex.impl.SingleMapReaderImpl;
-import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.complex.impl.NullableMapReaderImpl;
+import org.apache.arrow.vector.complex.impl.NullableMapWriter;
 import org.apache.arrow.vector.holders.ComplexHolder;
-import org.apache.arrow.vector.types.Types.MinorType;
+import org.apache.arrow.vector.schema.ArrowFieldNode;
 import org.apache.arrow.vector.types.pojo.ArrowType;
-import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.ArrowType.Struct;
+import org.apache.arrow.vector.types.pojo.DictionaryEncoding;
 import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Field;
 import org.apache.arrow.vector.util.CallBack;
-import org.apache.arrow.vector.util.JsonStringHashMap;
+import org.apache.arrow.vector.util.OversizedAllocationException;
 import org.apache.arrow.vector.util.TransferPair;
 
-public class MapVector extends AbstractMapVector {
-  //private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(MapVector.class);
+public class MapVector extends NonNullableMapVector implements FieldVector {
 
 Review comment:
   Is `NonNullableMapVector` something new? Can you explain the change of class 
hierarchy here?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261733#comment-16261733
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

icexelloss commented on issue #1341: [WIP] ARROW-1710: [Java] Remove 
Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#issuecomment-346203057
 
 
   @BryanCutler Good work! High level looks good. I will try to look more 
closely tomorrow.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261330#comment-16261330
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

icexelloss commented on issue #1341: [WIP] ARROW-1710: [Java] Remove 
Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#issuecomment-346132977
 
 
   If this is merged first. I would just delete different files, shouldn't be 
hard. If #1330 get merged first, you need to rename NullableTimestampVector -> 
TimestampVector.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261324#comment-16261324
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

icexelloss commented on issue #1341: [WIP] ARROW-1710: [Java] Remove 
Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#issuecomment-346132669
 
 
   @BryanCutler There might be some conflicts but I think we should be fine.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261308#comment-16261308
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

BryanCutler commented on issue #1341: [WIP] ARROW-1710: [Java] Remove 
Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#issuecomment-346128441
 
 
   @icexelloss I didn't see you were already working on #1330, which is also a 
sizable change, before I started on this.  I'm not sure which one would be 
easier to merge first, any thoughts?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261273#comment-16261273
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

icexelloss commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152368299
 
 

 ##
 File path: java/vector/src/test/java/org/apache/arrow/vector/TestBitVector.java
 ##
 @@ -426,81 +405,81 @@ public void testReallocAfterVectorTransfer2() {
   public void testBitVector() {
 // Create a new value vector for 1024 integers
 try (final BitVector vector = new BitVector(EMPTY_SCHEMA_PATH, allocator)) 
{
-  final BitVector.Mutator m = vector.getMutator();
   vector.allocateNew(1024);
-  m.setValueCount(1024);
+  vector.setValueCount(1024);
 
   // Put and set a few values
-  m.set(0, 1);
-  m.set(1, 0);
-  m.set(100, 0);
-  m.set(1022, 1);
+  vector.set(0, 1);
+  vector.set(1, 0);
+  vector.set(100, 0);
+  vector.set(1022, 1);
 
-  m.setValueCount(1024);
+  vector.setValueCount(1024);
 
-  final BitVector.Accessor accessor = vector.getAccessor();
-  assertEquals(1, accessor.get(0));
-  assertEquals(0, accessor.get(1));
-  assertEquals(0, accessor.get(100));
-  assertEquals(1, accessor.get(1022));
+  assertEquals(1, vector.get(0));
+  assertEquals(0, vector.get(1));
+  assertEquals(0, vector.get(100));
+  assertEquals(1, vector.get(1022));
 
-  assertEquals(1022, accessor.getNullCount());
+  assertEquals(1020, vector.getNullCount());
 
 Review comment:
   I agree this is an misnomer. I think this is fine as long as it doesn't 
break Dremio. cc @siddharthteotia.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16261250#comment-16261250
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

BryanCutler commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152364883
 
 

 ##
 File path: java/vector/src/test/java/org/apache/arrow/vector/TestBitVector.java
 ##
 @@ -426,81 +405,81 @@ public void testReallocAfterVectorTransfer2() {
   public void testBitVector() {
 // Create a new value vector for 1024 integers
 try (final BitVector vector = new BitVector(EMPTY_SCHEMA_PATH, allocator)) 
{
-  final BitVector.Mutator m = vector.getMutator();
   vector.allocateNew(1024);
-  m.setValueCount(1024);
+  vector.setValueCount(1024);
 
   // Put and set a few values
-  m.set(0, 1);
-  m.set(1, 0);
-  m.set(100, 0);
-  m.set(1022, 1);
+  vector.set(0, 1);
+  vector.set(1, 0);
+  vector.set(100, 0);
+  vector.set(1022, 1);
 
-  m.setValueCount(1024);
+  vector.setValueCount(1024);
 
-  final BitVector.Accessor accessor = vector.getAccessor();
-  assertEquals(1, accessor.get(0));
-  assertEquals(0, accessor.get(1));
-  assertEquals(0, accessor.get(100));
-  assertEquals(1, accessor.get(1022));
+  assertEquals(1, vector.get(0));
+  assertEquals(0, vector.get(1));
+  assertEquals(0, vector.get(100));
+  assertEquals(1, vector.get(1022));
 
-  assertEquals(1022, accessor.getNullCount());
+  assertEquals(1020, vector.getNullCount());
 
 Review comment:
   Yeah that would make sense, but sort of a misnomer imo.  Should we include 
this change in the release notes somewhere?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260049#comment-16260049
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

BryanCutler commented on issue #1341: [WIP] ARROW-1710: [Java] Remove 
Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#issuecomment-345872196
 
 
   Given the scope of this, I would be nice to get it and resolve the above 
issues later as long as this doesn't break anything - looks like I already need 
to do a rebase..


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260045#comment-16260045
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

BryanCutler commented on issue #1341: [WIP] ARROW-1710: [Java] Remove 
Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#issuecomment-345871911
 
 
   @siddharthteotia @icexelloss please take a look when you have a chance. 
There are still some things I was not sure of so I marked this as a WIP:
   
   1) MapVector still has a non-nullable vector class in it's hierarchy.  Maybe 
I missed some previous discussion from the refactoring, but I couldn't just 
remove the non-nullable class so I renamed them to be consistent with the other 
vectors
   
   2) Holders have both nullable and non-nullable, not sure if there is still a 
use for that so I left it for now... maybe after ARROW-1833 we can fix this up
   
   3) There are still references to NullableMapWriter/Reader, that got a little 
messy trying to remove but I can try again later
   
   cc @jacques-n @wesm 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260040#comment-16260040
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

BryanCutler commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152144518
 
 

 ##
 File path: 
java/vector/src/test/java/org/apache/arrow/vector/TestVectorReAlloc.java
 ##
 @@ -108,7 +107,8 @@ public void testListType() {
   assertEquals(1023, vector.getValueCapacity());
 
   try {
-vector.getOffsetVector().getAccessor().get(2014);
+// TODO: is this right?
+vector.getInnerValueCountAt(2014);
 
 Review comment:
   Is this what the test is checking for?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260037#comment-16260037
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

BryanCutler commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152143688
 
 

 ##
 File path: 
java/vector/src/test/java/org/apache/arrow/vector/TestOversizedAllocationForValueVector.java
 ##
 @@ -112,10 +112,10 @@ public void testVariableVectorReallocation() {
 try {
   vector.allocateNew(expectedAllocationInBytes, 10);
   assertTrue(expectedOffsetSize <= vector.getValueCapacity());
-  assertTrue(expectedAllocationInBytes <= vector.getBuffer().capacity());
+  assertTrue(expectedAllocationInBytes <= 
vector.getDataBuffer().capacity());
 
 Review comment:
   Is this correct? it is checking only the data buffer capacity i believe


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260035#comment-16260035
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

BryanCutler commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152143477
 
 

 ##
 File path: java/vector/src/test/java/org/apache/arrow/vector/TestBitVector.java
 ##
 @@ -526,15 +505,17 @@ private void validateRange(int length, int start, int 
count) {
 try (BitVector bitVector = new BitVector("bits", allocator)) {
   bitVector.reset();
   bitVector.allocateNew(length);
-  bitVector.getMutator().setRangeToOne(start, count);
+  for (int i = start; i < start + count; i++) {
+bitVector.set(i, 1);
+  }
 
 Review comment:
   again, slightly different API, need to check range manually


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260033#comment-16260033
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

BryanCutler commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152143196
 
 

 ##
 File path: java/vector/src/test/java/org/apache/arrow/vector/TestBitVector.java
 ##
 @@ -426,81 +405,81 @@ public void testReallocAfterVectorTransfer2() {
   public void testBitVector() {
 // Create a new value vector for 1024 integers
 try (final BitVector vector = new BitVector(EMPTY_SCHEMA_PATH, allocator)) 
{
-  final BitVector.Mutator m = vector.getMutator();
   vector.allocateNew(1024);
-  m.setValueCount(1024);
+  vector.setValueCount(1024);
 
   // Put and set a few values
-  m.set(0, 1);
-  m.set(1, 0);
-  m.set(100, 0);
-  m.set(1022, 1);
+  vector.set(0, 1);
+  vector.set(1, 0);
+  vector.set(100, 0);
+  vector.set(1022, 1);
 
-  m.setValueCount(1024);
+  vector.setValueCount(1024);
 
-  final BitVector.Accessor accessor = vector.getAccessor();
-  assertEquals(1, accessor.get(0));
-  assertEquals(0, accessor.get(1));
-  assertEquals(0, accessor.get(100));
-  assertEquals(1, accessor.get(1022));
+  assertEquals(1, vector.get(0));
+  assertEquals(0, vector.get(1));
+  assertEquals(0, vector.get(100));
+  assertEquals(1, vector.get(1022));
 
-  assertEquals(1022, accessor.getNullCount());
+  assertEquals(1020, vector.getNullCount());
 
   // test setting the same value twice
-  m.set(0, 1);
-  m.set(0, 1);
-  m.set(1, 0);
-  m.set(1, 0);
-  assertEquals(1, accessor.get(0));
-  assertEquals(0, accessor.get(1));
+  vector.set(0, 1);
+  vector.set(0, 1);
+  vector.set(1, 0);
+  vector.set(1, 0);
+  assertEquals(1, vector.get(0));
+  assertEquals(0, vector.get(1));
 
   // test toggling the values
-  m.set(0, 0);
-  m.set(1, 1);
-  assertEquals(0, accessor.get(0));
-  assertEquals(1, accessor.get(1));
+  vector.set(0, 0);
+  vector.set(1, 1);
+  assertEquals(0, vector.get(0));
+  assertEquals(1, vector.get(1));
 
   // should not change
-  assertEquals(1022, accessor.getNullCount());
+  assertEquals(1020, vector.getNullCount());
 
-  // Ensure unallocated space returns 0
-  assertEquals(0, accessor.get(3));
+  // Ensure null value
+  assertTrue(vector.isNull(3));
 
 Review comment:
   can't check that a null entry has any type of value


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260030#comment-16260030
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

BryanCutler commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152142911
 
 

 ##
 File path: java/vector/src/test/java/org/apache/arrow/vector/TestBitVector.java
 ##
 @@ -426,81 +405,81 @@ public void testReallocAfterVectorTransfer2() {
   public void testBitVector() {
 // Create a new value vector for 1024 integers
 try (final BitVector vector = new BitVector(EMPTY_SCHEMA_PATH, allocator)) 
{
-  final BitVector.Mutator m = vector.getMutator();
   vector.allocateNew(1024);
-  m.setValueCount(1024);
+  vector.setValueCount(1024);
 
   // Put and set a few values
-  m.set(0, 1);
-  m.set(1, 0);
-  m.set(100, 0);
-  m.set(1022, 1);
+  vector.set(0, 1);
+  vector.set(1, 0);
+  vector.set(100, 0);
+  vector.set(1022, 1);
 
-  m.setValueCount(1024);
+  vector.setValueCount(1024);
 
-  final BitVector.Accessor accessor = vector.getAccessor();
-  assertEquals(1, accessor.get(0));
-  assertEquals(0, accessor.get(1));
-  assertEquals(0, accessor.get(100));
-  assertEquals(1, accessor.get(1022));
+  assertEquals(1, vector.get(0));
+  assertEquals(0, vector.get(1));
+  assertEquals(0, vector.get(100));
+  assertEquals(1, vector.get(1022));
 
-  assertEquals(1022, accessor.getNullCount());
+  assertEquals(1020, vector.getNullCount());
 
 Review comment:
   The value checked here seem wrong before, am I missing something?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260031#comment-16260031
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

BryanCutler commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152143001
 
 

 ##
 File path: java/vector/src/test/java/org/apache/arrow/vector/TestBitVector.java
 ##
 @@ -426,81 +405,81 @@ public void testReallocAfterVectorTransfer2() {
   public void testBitVector() {
 // Create a new value vector for 1024 integers
 try (final BitVector vector = new BitVector(EMPTY_SCHEMA_PATH, allocator)) 
{
-  final BitVector.Mutator m = vector.getMutator();
   vector.allocateNew(1024);
-  m.setValueCount(1024);
+  vector.setValueCount(1024);
 
   // Put and set a few values
-  m.set(0, 1);
-  m.set(1, 0);
-  m.set(100, 0);
-  m.set(1022, 1);
+  vector.set(0, 1);
+  vector.set(1, 0);
+  vector.set(100, 0);
+  vector.set(1022, 1);
 
-  m.setValueCount(1024);
+  vector.setValueCount(1024);
 
-  final BitVector.Accessor accessor = vector.getAccessor();
-  assertEquals(1, accessor.get(0));
-  assertEquals(0, accessor.get(1));
-  assertEquals(0, accessor.get(100));
-  assertEquals(1, accessor.get(1022));
+  assertEquals(1, vector.get(0));
+  assertEquals(0, vector.get(1));
+  assertEquals(0, vector.get(100));
+  assertEquals(1, vector.get(1022));
 
-  assertEquals(1022, accessor.getNullCount());
+  assertEquals(1020, vector.getNullCount());
 
   // test setting the same value twice
-  m.set(0, 1);
-  m.set(0, 1);
-  m.set(1, 0);
-  m.set(1, 0);
-  assertEquals(1, accessor.get(0));
-  assertEquals(0, accessor.get(1));
+  vector.set(0, 1);
+  vector.set(0, 1);
+  vector.set(1, 0);
+  vector.set(1, 0);
+  assertEquals(1, vector.get(0));
+  assertEquals(0, vector.get(1));
 
   // test toggling the values
-  m.set(0, 0);
-  m.set(1, 1);
-  assertEquals(0, accessor.get(0));
-  assertEquals(1, accessor.get(1));
+  vector.set(0, 0);
+  vector.set(1, 1);
+  assertEquals(0, vector.get(0));
+  assertEquals(1, vector.get(1));
 
   // should not change
-  assertEquals(1022, accessor.getNullCount());
+  assertEquals(1020, vector.getNullCount());
 
 Review comment:
   again, this value seemed wrong before


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260028#comment-16260028
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

BryanCutler commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152142654
 
 

 ##
 File path: java/vector/src/test/java/org/apache/arrow/vector/TestBitVector.java
 ##
 @@ -242,90 +230,81 @@ public void testReallocAfterVectorTransfer1() {
   int valueCapacity = vector.getValueCapacity();
   assertEquals(4096, valueCapacity);
 
-  final BitVector.Mutator mutator = vector.getMutator();
-  final BitVector.Accessor accessor = vector.getAccessor();
-
   for (int i = 0; i < valueCapacity; i++) {
 if ((i & 1) == 1) {
-  mutator.setToOne(i);
+  vector.set(i, 1);
 }
   }
 
   for (int i = 0; i < valueCapacity; i++) {
-int val = accessor.get(i);
 if ((i & 1) == 1) {
-  assertEquals("unexpected cleared bit at index: " + i, 1, val);
+  assertEquals("unexpected cleared bit at index: " + i, 1, 
vector.get(i));
 }
 else {
-  assertEquals("unexpected set bit at index: " + i, 0, val);
+  assertTrue("unexpected set bit at index: " + i, vector.isNull(i));
 
 Review comment:
   since it's now a nullable `BitVector`, can't get values where a null is set


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260026#comment-16260026
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

BryanCutler commented on a change in pull request #1341: [WIP] ARROW-1710: 
[Java] Remove Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341#discussion_r152142514
 
 

 ##
 File path: java/vector/src/test/java/org/apache/arrow/vector/TestBitVector.java
 ##
 @@ -242,90 +230,81 @@ public void testReallocAfterVectorTransfer1() {
   int valueCapacity = vector.getValueCapacity();
   assertEquals(4096, valueCapacity);
 
-  final BitVector.Mutator mutator = vector.getMutator();
-  final BitVector.Accessor accessor = vector.getAccessor();
-
   for (int i = 0; i < valueCapacity; i++) {
 if ((i & 1) == 1) {
-  mutator.setToOne(i);
+  vector.set(i, 1);
 
 Review comment:
   Slightly different API with the new BitVector


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260016#comment-16260016
 ] 

ASF GitHub Bot commented on ARROW-1710:
---

BryanCutler opened a new pull request #1341: [WIP] ARROW-1710: [Java] Remove 
Non-Nullable Vectors
URL: https://github.com/apache/arrow/pull/1341
 
 
   This removes non-nullable vectors that are no longer part of the vector 
class hierarchy and renames Nullable*Vector classes to remove the Nullable 
prefix.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
>  Labels: pull-request-available
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-19 Thread Bryan Cutler (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258884#comment-16258884
 ] 

Bryan Cutler commented on ARROW-1710:
-

Sounds good to me, I'll work on this and have a PR soon.

> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-18 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258231#comment-16258231
 ] 

Wes McKinney commented on ARROW-1710:
-

+1. See ARROW-1833

> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-18 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16258123#comment-16258123
 ] 

Jacques Nadeau commented on ARROW-1710:
---

Agree to both nullable prefix removable and adding "dirty" accessor/mutator 
methods but i think the latter could come in 0.9.0 since it is enhancement to 
the api.

> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-17 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16257872#comment-16257872
 ] 

Wes McKinney commented on ARROW-1710:
-

I would also propose to remove the Nullable prefix and add "dirty" accessor 
methods for users who are working with data without nulls (or that can be used 
on the hot path when you see that the null count for a vector is 0)

> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
>Assignee: Bryan Cutler
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-11-07 Thread Li Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242687#comment-16242687
 ] 

Li Jin commented on ARROW-1710:
---

Seems we agree to remove non nullable vectors.

The next question is, what do people feel about dropping the "Nullable" prefix 
in new vector classes? [~bryanc] brought this up initially.

I am +1 for dropping the "Nullable" prefix. It makes the code more concise. 

> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-10-26 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16220962#comment-16220962
 ] 

Jacques Nadeau commented on ARROW-1710:
---

The consolidation doesn't inline the values. They are simply prepended. All 
validity bits come before all data bytes. They are just in the same ArrowBuf 
rather than maintaining two ArrowBufs. Normally we use different buffers to 
support resizing only one. However, in the case of validity and data, both will 
get resized at the same time depending on the values held.

> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-10-26 Thread Ethan Levine (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16220392#comment-16220392
 ] 

Ethan Levine commented on ARROW-1710:
-

[~jnadeau]: I'd be interested to learn more about that consolidation. It sounds 
like the validity bits will be stored inline with the data? It seems like 
eliding that could be difficult.

Our use case involves data that's mostly not nullable, and we take care to 
ensure that it's declared that way. If the costs of writing only non-null 
values to a nullable array (in terms of memory and computation) become 
insignificant, then it makes sense to only include nullable arrays. But if 
that's not the case then I think it makes sense to keep both.

> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-10-25 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219836#comment-16219836
 ] 

Jacques Nadeau commented on ARROW-1710:
---

I'm one of the voices strongly arguing for dropping the additional class 
objects. (I also was the one who originally introduced the two separate sets 
when the code was first developed.) My experience has been the following:

* Extra complexity of managing two different runtime classes is very expensive 
(maintenance, coercing between, managing runtime code generation, etc)
* Most source data is actually declared as nullable but rarely has nulls

As such, having an adaptive interaction where you look at cells 64 values at a 
time and adapt your behavior based on actual nullability (as opposed to 
declared nullability) provides a much better performance lift in real world use 
cases than having specialized code for declared non-nullable situations.

FYI: [~e.levine], the updated approach with vectors is moving to a situation 
where we don't have a bit vector and ultimately also consolidates the buffer 
for the bits and the fixed bytes in the same buffer. In that case, there is no 
heap memory overhead and the direct memory overhead is 1 bit per value, far 
less than necessary.

Also note that in reality, most people focused on super high performance Java 
implementations interact directly with the memory. You can see an example of 
how we do this here: 
https://github.com/dremio/dremio-oss/blob/master/sabot/kernel/src/main/java/com/dremio/sabot/op/common/ht2/Pivots.java#L89

If, in the future, if people need the vector classes to have an additional set 
of methods such as: 
allocateNewNoNull()
setSafeIgnoreNull(int index, int value) 

let's just add those when someone's usecase requires it. No need to have an 
extra set of vectors for that purpose.


> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-10-25 Thread Wes McKinney (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218606#comment-16218606
 ] 

Wes McKinney commented on ARROW-1710:
-

See https://github.com/apache/arrow/blob/master/format/Layout.md#null-bitmaps. 
"Arrays having a 0 null count may choose to not allocate the null bitmap.". So 
when there are no nulls, it is not necessary to create a BitVector. It is also 
not necessary to populate the bit vector, so as you say waiting until the first 
null to create the bitmap might be the way to go.

> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-10-24 Thread Ethan Levine (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217560#comment-16217560
 ] 

Ethan Levine commented on ARROW-1710:
-

The BitVector is an extra object that has to be allocated (both in terms of the 
backing data and in terms of the Java objects involved). You'd also need to 
perform bit masking of the underlying data with every write, which could 
involve a cache miss if the data for the BitVector isn't neatly colocated with 
the actual data for the nullable vector.

Perhaps a tracking flag could be added to the nullable vectors, though. It 
would start out "false", and get set to "true" if you ever write a null value. 
That way you could avoid the extra allocation and computation involved with 
tracking the validity of each value in the case where there are no null values. 
This seems like it would be more complicated than just keeping non-nullable 
vectors around, however.

> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ARROW-1710) [Java] Decide what to do with non-nullable vectors in new vector class hierarchy

2017-10-23 Thread Gonzalo Ortiz (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214748#comment-16214748
 ] 

Gonzalo Ortiz commented on ARROW-1710:
--

I'm a little bit concern about performance loss if non-nullable vectors are 
removed. Arrow on Java has quite bad performance so far (compared with plan 
ByteBuffers) and to remove non-nullable vectors can make it even worst.

> [Java] Decide what to do with non-nullable vectors in new vector class 
> hierarchy 
> -
>
> Key: ARROW-1710
> URL: https://issues.apache.org/jira/browse/ARROW-1710
> Project: Apache Arrow
>  Issue Type: Sub-task
>  Components: Java - Vectors
>Reporter: Li Jin
> Fix For: 0.8.0
>
>
> So far the consensus seems to be remove all non-nullable vectors. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)