[GitHub] spark pull request #22101: [SPARK-23207][Core][FOLLOWUP] Fix RecordBinaryCom...

2018-08-14 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22101#discussion_r209863964
  
--- Diff: 
sql/catalyst/src/main/java/org/apache/spark/sql/execution/RecordBinaryComparator.java
 ---
@@ -42,16 +42,16 @@ public int compare(
   while ((leftOff + i) % 8 != 0 && i < leftLen) {
 res = (Platform.getByte(leftObj, leftOff + i) & 0xff) -
 (Platform.getByte(rightObj, rightOff + i) & 0xff);
-if (res != 0) return res;
+if (res != 0) return (int) res;
 i += 1;
   }
 }
 // for architectures that support unaligned accesses, chew it up 8 
bytes at a time
 if (Platform.unaligned() || (((leftOff + i) % 8 == 0) && ((rightOff + 
i) % 8 == 0))) {
   while (i <= leftLen - 8) {
-res = (int) ((Platform.getLong(leftObj, leftOff + i) -
-Platform.getLong(rightObj, rightOff + i)) % 
Integer.MAX_VALUE);
-if (res != 0) return res;
+res = Platform.getLong(leftObj, leftOff + i) -
+Platform.getLong(rightObj, rightOff + i);
+if (res != 0) return res > 0 ? 1 : -1;
--- End diff --

+1 for no subtraction.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22101: [SPARK-23207][Core][FOLLOWUP] Fix RecordBinaryCom...

2018-08-14 Thread mridulm
Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/22101#discussion_r209862610
  
--- Diff: 
sql/catalyst/src/main/java/org/apache/spark/sql/execution/RecordBinaryComparator.java
 ---
@@ -42,16 +42,16 @@ public int compare(
   while ((leftOff + i) % 8 != 0 && i < leftLen) {
 res = (Platform.getByte(leftObj, leftOff + i) & 0xff) -
 (Platform.getByte(rightObj, rightOff + i) & 0xff);
-if (res != 0) return res;
+if (res != 0) return (int) res;
 i += 1;
   }
 }
 // for architectures that support unaligned accesses, chew it up 8 
bytes at a time
 if (Platform.unaligned() || (((leftOff + i) % 8 == 0) && ((rightOff + 
i) % 8 == 0))) {
   while (i <= leftLen - 8) {
-res = (int) ((Platform.getLong(leftObj, leftOff + i) -
-Platform.getLong(rightObj, rightOff + i)) % 
Integer.MAX_VALUE);
-if (res != 0) return res;
+res = Platform.getLong(leftObj, leftOff + i) -
+Platform.getLong(rightObj, rightOff + i);
+if (res != 0) return res > 0 ? 1 : -1;
--- End diff --

The subtraction is buggy due to potential overflow.
Why not simply use:
```
  final long v1 = Platform.getLong(leftObj, leftOff + i);
  final long v2 = Platform.getLong(rightObj, rightOff + i);
  if (v1 != v2) {
return v1 > v2 ? -1 : 1;
  }
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22101: [SPARK-23207][Core][FOLLOWUP] Fix RecordBinaryCom...

2018-08-14 Thread mridulm
Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/22101#discussion_r209860397
  
--- Diff: 
sql/catalyst/src/main/java/org/apache/spark/sql/execution/RecordBinaryComparator.java
 ---
@@ -42,16 +42,16 @@ public int compare(
   while ((leftOff + i) % 8 != 0 && i < leftLen) {
 res = (Platform.getByte(leftObj, leftOff + i) & 0xff) -
 (Platform.getByte(rightObj, rightOff + i) & 0xff);
-if (res != 0) return res;
+if (res != 0) return (int) res;
--- End diff --

We can restrict scope of 'res' as an 'int' : and avoid the type 
promotions/conversions.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22101: [SPARK-23207][Core][FOLLOWUP] Fix RecordBinaryCom...

2018-08-14 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22101#discussion_r209855589
  
--- Diff: 
sql/catalyst/src/main/java/org/apache/spark/sql/execution/RecordBinaryComparator.java
 ---
@@ -60,7 +60,7 @@ public int compare(
 while (i < leftLen) {
   res = (Platform.getByte(leftObj, leftOff + i) & 0xff) -
   (Platform.getByte(rightObj, rightOff + i) & 0xff);
-  if (res != 0) return res;
+  if (res != 0) return (int) res;
--- End diff --

ditto


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22101: [SPARK-23207][Core][FOLLOWUP] Fix RecordBinaryCom...

2018-08-14 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22101#discussion_r209855528
  
--- Diff: 
sql/catalyst/src/main/java/org/apache/spark/sql/execution/RecordBinaryComparator.java
 ---
@@ -42,16 +42,16 @@ public int compare(
   while ((leftOff + i) % 8 != 0 && i < leftLen) {
 res = (Platform.getByte(leftObj, leftOff + i) & 0xff) -
 (Platform.getByte(rightObj, rightOff + i) & 0xff);
-if (res != 0) return res;
+if (res != 0) return (int) res;
--- End diff --

How about the following change to minimize and localize the change?
```
int res = (Platform.getByte(leftObj, leftOff + i) & 0xff) -
 (Platform.getByte(rightObj, rightOff + i) & 0xff);
if (res != 0) return res;
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22101: [SPARK-23207][Core][FOLLOWUP] Fix RecordBinaryCom...

2018-08-14 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22101#discussion_r209855135
  
--- Diff: 
sql/catalyst/src/main/java/org/apache/spark/sql/execution/RecordBinaryComparator.java
 ---
@@ -42,16 +42,16 @@ public int compare(
   while ((leftOff + i) % 8 != 0 && i < leftLen) {
 res = (Platform.getByte(leftObj, leftOff + i) & 0xff) -
 (Platform.getByte(rightObj, rightOff + i) & 0xff);
-if (res != 0) return res;
+if (res != 0) return (int) res;
 i += 1;
   }
 }
 // for architectures that support unaligned accesses, chew it up 8 
bytes at a time
 if (Platform.unaligned() || (((leftOff + i) % 8 == 0) && ((rightOff + 
i) % 8 == 0))) {
   while (i <= leftLen - 8) {
-res = (int) ((Platform.getLong(leftObj, leftOff + i) -
-Platform.getLong(rightObj, rightOff + i)) % 
Integer.MAX_VALUE);
-if (res != 0) return res;
+res = Platform.getLong(leftObj, leftOff + i) -
+Platform.getLong(rightObj, rightOff + i);
+if (res != 0) return res > 0 ? 1 : -1;
--- End diff --

How about the following change to minimize and localize the change?
```
long res = Platform.getLong(leftObj, leftOff + i) -
 Platform.getLong(rightObj, rightOff + i);
if (res != 0) return res > 0 ? 1 : -1;
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22101: [SPARK-23207][Core][FOLLOWUP] Fix RecordBinaryCom...

2018-08-14 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request:

https://github.com/apache/spark/pull/22101

[SPARK-23207][Core][FOLLOWUP] Fix RecordBinaryComparator when subtraction 
between two words is divisible by Integer.MAX_VALUE.

## What changes were proposed in this pull request?

https://github.com/apache/spark/pull/22079#discussion_r209705612 It is 
possible for two objects to be unequal and yet we consider them as equal with 
this code, if the long values are separated by Int.MaxValue.
This PR fixes the issue.

## How was this patch tested?
N/A


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jiangxb1987/spark fix-rbc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22101.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22101


commit 1f6b2594ebe8b50e4cb2fcde15181cfa9a17f48c
Author: Xingbo Jiang 
Date:   2018-08-14T04:04:40Z

fix




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org