[jira] [Commented] (ARROW-2368) DecimalVector#setBigEndian is not padding correctly for negative values

2018-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419912#comment-16419912
 ] 

ASF GitHub Bot commented on ARROW-2368:
---

siddharthteotia closed pull request #1809: ARROW-2368: [JAVA] Correctly pad 
negative values in DecimalVector#setBigEndian
URL: https://github.com/apache/arrow/pull/1809
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/java/vector/src/main/java/org/apache/arrow/vector/DecimalVector.java 
b/java/vector/src/main/java/org/apache/arrow/vector/DecimalVector.java
index a04357508..db5af51a7 100644
--- a/java/vector/src/main/java/org/apache/arrow/vector/DecimalVector.java
+++ b/java/vector/src/main/java/org/apache/arrow/vector/DecimalVector.java
@@ -211,7 +211,6 @@ public void set(int index, ArrowBuf buffer) {
* @param value array of bytes containing decimal in big endian byte order.
*/
   public void setBigEndian(int index, byte[] value) {
-assert value.length <= TYPE_WIDTH;
 BitVectorHelper.setValidityBitToOne(validityBuffer, index);
 final int length = value.length;
 int startIndex = index * TYPE_WIDTH;
@@ -223,13 +222,32 @@ public void setBigEndian(int index, byte[] value) {
 valueBuffer.setByte(startIndex + 3, value[i-3]);
 startIndex += 4;
   }
-} else {
+
+  return;
+}
+
+if (length == 0) {
+  valueBuffer.setZero(startIndex, TYPE_WIDTH);
+  return;
+}
+
+if (length < 16) {
   for (int i = length - 1; i >= 0; i--) {
 valueBuffer.setByte(startIndex, value[i]);
 startIndex++;
   }
-  valueBuffer.setZero(startIndex, TYPE_WIDTH - length);
+
+  final byte pad = (byte) (value[0] < 0 ? 0xFF : 0x00);
+  final int maxStartIndex = (index + 1) * TYPE_WIDTH;
+  while (startIndex < maxStartIndex) {
+valueBuffer.setByte(startIndex, pad);
+startIndex++;
+  }
+
+  return;
 }
+
+throw new IllegalArgumentException("Invalid decimal value length. Valid 
length in [1 - 16], got " + length);
   }
 
   /**
@@ -467,4 +485,4 @@ public void copyValueSafe(int fromIndex, int toIndex) {
   to.copyFromSafe(fromIndex, toIndex, DecimalVector.this);
 }
   }
-}
\ No newline at end of file
+}
diff --git 
a/java/vector/src/test/java/org/apache/arrow/vector/TestDecimalVector.java 
b/java/vector/src/test/java/org/apache/arrow/vector/TestDecimalVector.java
index 8c86452fc..25bcf2be9 100644
--- a/java/vector/src/test/java/org/apache/arrow/vector/TestDecimalVector.java
+++ b/java/vector/src/test/java/org/apache/arrow/vector/TestDecimalVector.java
@@ -20,6 +20,7 @@
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
 
 import java.math.BigDecimal;
 import java.math.BigInteger;
@@ -190,4 +191,57 @@ public void testBigDecimalReadWrite() {
   assertEquals(decimal8, decimalVector.getObject(7));
 }
   }
+
+  /**
+   * Test {@link DecimalVector#setBigEndian(int, byte[])} which takes BE 
layout input and stores in LE layout.
+   * Cases to cover: input byte array in different lengths in range [1-16] and 
negative values.
+   */
+  @Test
+  public void decimalBE2LE() {
+try (DecimalVector decimalVector = 
TestUtils.newVector(DecimalVector.class, "decimal", new ArrowType.Decimal(21, 
2), allocator)) {
+  decimalVector.allocateNew();
+
+  BigInteger[] testBigInts = new BigInteger[] {
+  new BigInteger("0"),
+  new BigInteger("-1"),
+  new BigInteger("23"),
+  new BigInteger("234234"),
+  new BigInteger("-234234234"),
+  new BigInteger("234234234234"),
+  new BigInteger("-56345345345345"),
+  new BigInteger("29823462983462893462934679234653456345"), // 
converts to 16 byte array
+  new BigInteger("-3894572983475982374598324598234346536"), // 
converts to 16 byte array
+  new BigInteger("-345345"),
+  new BigInteger("754533")
+  };
+
+  int insertionIdx = 0;
+  insertionIdx++; // insert a null
+  for (BigInteger val : testBigInts) {
+decimalVector.setBigEndian(insertionIdx++, val.toByteArray());
+  }
+  insertionIdx++; // insert a null
+  // insert a zero length buffer
+  decimalVector.setBigEndian(insertionIdx++, new byte[0]);
+
+  // Try inserting a buffer larger than 16bytes and expect a failure
+  try {
+decimalVector.setBigEndian(insertionIdx, new byte[17]);
+fail("above statement should have failed");
+  } catch (IllegalArgumentException ex) {
+assertTrue(ex.getMessage().equals("Invalid decimal value length. Valid 
length in [1 - 16], got 17"));
+  }
+  

[jira] [Commented] (ARROW-2368) DecimalVector#setBigEndian is not padding correctly for negative values

2018-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419800#comment-16419800
 ] 

ASF GitHub Bot commented on ARROW-2368:
---

vkorukanti opened a new pull request #1809: ARROW-2368: [JAVA] Correctly pad 
negative values in DecimalVector#setBigEndian
URL: https://github.com/apache/arrow/pull/1809
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> DecimalVector#setBigEndian is not padding correctly for negative values
> ---
>
> Key: ARROW-2368
> URL: https://issues.apache.org/jira/browse/ARROW-2368
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Java - Vectors
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
>Priority: Blocker
>  Labels: pull-request-available
>
> If the input given is less than 16 bytes, we pad zero for the remaining bytes 
> [1]. This is not correct if the value is a negative number. We should pad 
> 0x00 or 0xFF depending upon the number is a positive or negative number. 
> Later when the value is retrieved we end up with an incorrect value.
>  
> [1] 
> https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/DecimalVector.java#L231



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2368) DecimalVector#setBigEndian is not padding correctly for negative values

2018-03-29 Thread Venki Korukanti (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419737#comment-16419737
 ] 

Venki Korukanti commented on ARROW-2368:


Yes, 
[here|https://github.com/vkorukanti/arrow/commit/a14b263ed442a4a088b1b0e5578993644147a599]
 is the patch. Its on a older version of arrow. Will post once I rebase the 
patch.

> DecimalVector#setBigEndian is not padding correctly for negative values
> ---
>
> Key: ARROW-2368
> URL: https://issues.apache.org/jira/browse/ARROW-2368
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Java - Vectors
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
>Priority: Blocker
>
> If the input given is less than 16 bytes, we pad zero for the remaining bytes 
> [1]. This is not correct if the value is a negative number. We should pad 
> 0x00 or 0xFF depending upon the number is a positive or negative number. 
> Later when the value is retrieved we end up with an incorrect value.
>  
> [1] 
> https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/DecimalVector.java#L231



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-2368) DecimalVector#setBigEndian is not padding correctly for negative values

2018-03-29 Thread Phillip Cloud (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419734#comment-16419734
 ] 

Phillip Cloud commented on ARROW-2368:
--

This seems like a fairly straightforward fix. Care to submit a PR?

> DecimalVector#setBigEndian is not padding correctly for negative values
> ---
>
> Key: ARROW-2368
> URL: https://issues.apache.org/jira/browse/ARROW-2368
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Java - Vectors
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
>Priority: Blocker
>
> If the input given is less than 16 bytes, we pad zero for the remaining bytes 
> [1]. This is not correct if the value is a negative number. We should pad 
> 0x00 or 0xFF depending upon the number is a positive or negative number. 
> Later when the value is retrieved we end up with an incorrect value.
>  
> [1] 
> https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/DecimalVector.java#L231



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)