[GitHub] spark pull request #15929: [SPARK-18053][SQL] compare unsafe and safe comple...

2016-11-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/15929


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15929: [SPARK-18053][SQL] compare unsafe and safe comple...

2016-11-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/15929#discussion_r88827491
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
 ---
@@ -452,15 +444,7 @@ case class EqualNullSafe(left: Expression, right: 
Expression) extends BinaryComp
 } else if (input1 == null || input2 == null) {
   false
 } else {
-  if (left.dataType == FloatType) {
-Utils.nanSafeCompareFloats(input1.asInstanceOf[Float], 
input2.asInstanceOf[Float]) == 0
-  } else if (left.dataType == DoubleType) {
-Utils.nanSafeCompareDoubles(input1.asInstanceOf[Double], 
input2.asInstanceOf[Double]) == 0
-  } else if (left.dataType != BinaryType) {
-input1 == input2
-  } else {
-java.util.Arrays.equals(input1.asInstanceOf[Array[Byte]], 
input2.asInstanceOf[Array[Byte]])
--- End diff --

yea, we may compare an unsafe and safe row/array/map previously.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15929: [SPARK-18053][SQL] compare unsafe and safe comple...

2016-11-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15929#discussion_r88774814
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
 ---
@@ -452,15 +444,7 @@ case class EqualNullSafe(left: Expression, right: 
Expression) extends BinaryComp
 } else if (input1 == null || input2 == null) {
   false
 } else {
-  if (left.dataType == FloatType) {
-Utils.nanSafeCompareFloats(input1.asInstanceOf[Float], 
input2.asInstanceOf[Float]) == 0
-  } else if (left.dataType == DoubleType) {
-Utils.nanSafeCompareDoubles(input1.asInstanceOf[Double], 
input2.asInstanceOf[Double]) == 0
-  } else if (left.dataType != BinaryType) {
-input1 == input2
-  } else {
-java.util.Arrays.equals(input1.asInstanceOf[Array[Byte]], 
input2.asInstanceOf[Array[Byte]])
--- End diff --

This is a bug, right? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15929: [SPARK-18053][SQL] compare unsafe and safe comple...

2016-11-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15929#discussion_r88774467
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
 ---
@@ -481,8 +481,13 @@ class CodegenContext {
 case FloatType => s"(java.lang.Float.isNaN($c1) && 
java.lang.Float.isNaN($c2)) || $c1 == $c2"
 case DoubleType => s"(java.lang.Double.isNaN($c1) && 
java.lang.Double.isNaN($c2)) || $c1 == $c2"
 case dt: DataType if isPrimitiveType(dt) => s"$c1 == $c2"
+case dt: DataType if dt.isInstanceOf[AtomicType] => s"$c1.equals($c2)"
+case array: ArrayType => genComp(array, c1, c2) + " == 0"
+case struct: StructType => genComp(struct, c1, c2) + " == 0"
--- End diff --

If we do not plan to support `MapType`, could we add a negative test case?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15929: [SPARK-18053][SQL] compare unsafe and safe comple...

2016-11-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15929#discussion_r88770432
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
 ---
@@ -481,8 +481,13 @@ class CodegenContext {
 case FloatType => s"(java.lang.Float.isNaN($c1) && 
java.lang.Float.isNaN($c2)) || $c1 == $c2"
 case DoubleType => s"(java.lang.Double.isNaN($c1) && 
java.lang.Double.isNaN($c2)) || $c1 == $c2"
 case dt: DataType if isPrimitiveType(dt) => s"$c1 == $c2"
+case dt: DataType if dt.isInstanceOf[AtomicType] => s"$c1.equals($c2)"
+case array: ArrayType => genComp(array, c1, c2) + " == 0"
+case struct: StructType => genComp(struct, c1, c2) + " == 0"
--- End diff --

: ) That is a fun discussion. 

Also accidentally found how Preso did it in a PR: 
https://github.com/prestodb/presto/pull/2469/files




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15929: [SPARK-18053][SQL] compare unsafe and safe comple...

2016-11-18 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/15929#discussion_r88764135
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
 ---
@@ -481,8 +481,13 @@ class CodegenContext {
 case FloatType => s"(java.lang.Float.isNaN($c1) && 
java.lang.Float.isNaN($c2)) || $c1 == $c2"
 case DoubleType => s"(java.lang.Double.isNaN($c1) && 
java.lang.Double.isNaN($c2)) || $c1 == $c2"
 case dt: DataType if isPrimitiveType(dt) => s"$c1 == $c2"
+case dt: DataType if dt.isInstanceOf[AtomicType] => s"$c1.equals($c2)"
+case array: ArrayType => genComp(array, c1, c2) + " == 0"
+case struct: StructType => genComp(struct, c1, c2) + " == 0"
--- End diff --

MapType is not comparable. We currently do not support equals() nor 
hashcode() for MapData. See https://issues.apache.org/jira/browse/SPARK-18134 
for a fun discussion on this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15929: [SPARK-18053][SQL] compare unsafe and safe comple...

2016-11-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15929#discussion_r88763155
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
 ---
@@ -481,8 +481,13 @@ class CodegenContext {
 case FloatType => s"(java.lang.Float.isNaN($c1) && 
java.lang.Float.isNaN($c2)) || $c1 == $c2"
 case DoubleType => s"(java.lang.Double.isNaN($c1) && 
java.lang.Double.isNaN($c2)) || $c1 == $c2"
 case dt: DataType if isPrimitiveType(dt) => s"$c1 == $c2"
+case dt: DataType if dt.isInstanceOf[AtomicType] => s"$c1.equals($c2)"
+case array: ArrayType => genComp(array, c1, c2) + " == 0"
+case struct: StructType => genComp(struct, c1, c2) + " == 0"
--- End diff --

How about `MapType`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15929: [SPARK-18053][SQL] compare unsafe and safe comple...

2016-11-18 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/15929#discussion_r88701973
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
 ---
@@ -512,6 +517,9 @@ class CodegenContext {
   val funcCode: String =
 s"""
   public int $compareFunc(ArrayData a, ArrayData b) {
+if (a instanceof UnsafeArrayData && b instanceof 
UnsafeArrayData && a == b) {
--- End diff --

Why the instance checks? This is java, you are comparing by reference.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15929: [SPARK-18053][SQL] compare unsafe and safe comple...

2016-11-18 Thread cloud-fan
GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/15929

[SPARK-18053][SQL] compare unsafe and safe complex-type values correctly

## What changes were proposed in this pull request?

In Spark SQL, some expression may output safe format values, e.g. 
`CreateArray`, `CreateStruct`, `Cast`, etc. When we compare 2 values, we should 
be able to compare safe and unsafe formats.

The `GreaterThan`, `LessThan`, etc. in Spark SQL already handles it, but 
the `EqualTo` doesn't. This PR fixes it.

## How was this patch tested?

new unit test and regression test



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark type-aware

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15929.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15929


commit 00c432a23336d20634f2caf957a4df1c0c23f78e
Author: Wenchen Fan 
Date:   2016-11-18T07:51:44Z

compare unsafe and safe complex-type values correctly




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org