[GitHub] spark pull request #15929: [SPARK-18053][SQL] compare unsafe and safe complex-type values correctly
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15929

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15929#discussion_r88827491

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ---
@@ -452,15 +444,7 @@ case class EqualNullSafe(left: Expression, right: Expression) extends BinaryComp
     } else if (input1 == null || input2 == null) {
       false
     } else {
-      if (left.dataType == FloatType) {
-        Utils.nanSafeCompareFloats(input1.asInstanceOf[Float], input2.asInstanceOf[Float]) == 0
-      } else if (left.dataType == DoubleType) {
-        Utils.nanSafeCompareDoubles(input1.asInstanceOf[Double], input2.asInstanceOf[Double]) == 0
-      } else if (left.dataType != BinaryType) {
-        input1 == input2
-      } else {
-        java.util.Arrays.equals(input1.asInstanceOf[Array[Byte]], input2.asInstanceOf[Array[Byte]])
--- End diff --

Yes, previously we may have compared an unsafe and a safe row/array/map.
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15929#discussion_r88774814

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ---
@@ -452,15 +444,7 @@ case class EqualNullSafe(left: Expression, right: Expression) extends BinaryComp
     } else if (input1 == null || input2 == null) {
       false
     } else {
-      if (left.dataType == FloatType) {
-        Utils.nanSafeCompareFloats(input1.asInstanceOf[Float], input2.asInstanceOf[Float]) == 0
-      } else if (left.dataType == DoubleType) {
-        Utils.nanSafeCompareDoubles(input1.asInstanceOf[Double], input2.asInstanceOf[Double]) == 0
-      } else if (left.dataType != BinaryType) {
-        input1 == input2
-      } else {
-        java.util.Arrays.equals(input1.asInstanceOf[Array[Byte]], input2.asInstanceOf[Array[Byte]])
--- End diff --

This is a bug, right?
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15929#discussion_r88774467

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
@@ -481,8 +481,13 @@ class CodegenContext {
     case FloatType => s"(java.lang.Float.isNaN($c1) && java.lang.Float.isNaN($c2)) || $c1 == $c2"
     case DoubleType => s"(java.lang.Double.isNaN($c1) && java.lang.Double.isNaN($c2)) || $c1 == $c2"
     case dt: DataType if isPrimitiveType(dt) => s"$c1 == $c2"
+    case dt: DataType if dt.isInstanceOf[AtomicType] => s"$c1.equals($c2)"
+    case array: ArrayType => genComp(array, c1, c2) + " == 0"
+    case struct: StructType => genComp(struct, c1, c2) + " == 0"
--- End diff --

If we do not plan to support `MapType`, could we add a negative test case?
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15929#discussion_r88770432

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
@@ -481,8 +481,13 @@ class CodegenContext {
     case FloatType => s"(java.lang.Float.isNaN($c1) && java.lang.Float.isNaN($c2)) || $c1 == $c2"
     case DoubleType => s"(java.lang.Double.isNaN($c1) && java.lang.Double.isNaN($c2)) || $c1 == $c2"
     case dt: DataType if isPrimitiveType(dt) => s"$c1 == $c2"
+    case dt: DataType if dt.isInstanceOf[AtomicType] => s"$c1.equals($c2)"
+    case array: ArrayType => genComp(array, c1, c2) + " == 0"
+    case struct: StructType => genComp(struct, c1, c2) + " == 0"
--- End diff --

: ) That is a fun discussion. I also happened to find how Presto did it, in this PR: https://github.com/prestodb/presto/pull/2469/files
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15929#discussion_r88764135

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
@@ -481,8 +481,13 @@ class CodegenContext {
     case FloatType => s"(java.lang.Float.isNaN($c1) && java.lang.Float.isNaN($c2)) || $c1 == $c2"
     case DoubleType => s"(java.lang.Double.isNaN($c1) && java.lang.Double.isNaN($c2)) || $c1 == $c2"
     case dt: DataType if isPrimitiveType(dt) => s"$c1 == $c2"
+    case dt: DataType if dt.isInstanceOf[AtomicType] => s"$c1.equals($c2)"
+    case array: ArrayType => genComp(array, c1, c2) + " == 0"
+    case struct: StructType => genComp(struct, c1, c2) + " == 0"
--- End diff --

MapType is not comparable. We currently do not support equals() or hashCode() for MapData. See https://issues.apache.org/jira/browse/SPARK-18134 for a fun discussion on this.
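[Editor's note] The `FloatType`/`DoubleType` cases in the diff above exist because plain `==` on IEEE 754 floats treats NaN as unequal to itself, while SQL semantics here want NaN to equal NaN. A minimal standalone sketch of the expression the codegen emits for `DoubleType` (this is an illustration, not Spark's actual generated class):

```java
// Standalone illustration of the NaN-safe equality expression that the
// codegen above emits for DoubleType:
//   (java.lang.Double.isNaN(c1) && java.lang.Double.isNaN(c2)) || c1 == c2
public class NanSafeEq {
    static boolean nanSafeEquals(double c1, double c2) {
        // Plain `c1 == c2` is false when both sides are NaN, so the
        // generated code short-circuits on the double-NaN case first.
        return (Double.isNaN(c1) && Double.isNaN(c2)) || c1 == c2;
    }

    public static void main(String[] args) {
        System.out.println(Double.NaN == Double.NaN);              // false: plain ==
        System.out.println(nanSafeEquals(Double.NaN, Double.NaN)); // true
        System.out.println(nanSafeEquals(1.5, 1.5));               // true
        System.out.println(nanSafeEquals(1.5, 2.5));               // false
    }
}
```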
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15929#discussion_r88763155

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
@@ -481,8 +481,13 @@ class CodegenContext {
     case FloatType => s"(java.lang.Float.isNaN($c1) && java.lang.Float.isNaN($c2)) || $c1 == $c2"
     case DoubleType => s"(java.lang.Double.isNaN($c1) && java.lang.Double.isNaN($c2)) || $c1 == $c2"
     case dt: DataType if isPrimitiveType(dt) => s"$c1 == $c2"
+    case dt: DataType if dt.isInstanceOf[AtomicType] => s"$c1.equals($c2)"
+    case array: ArrayType => genComp(array, c1, c2) + " == 0"
+    case struct: StructType => genComp(struct, c1, c2) + " == 0"
--- End diff --

How about `MapType`?
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15929#discussion_r88701973

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
@@ -512,6 +517,9 @@ class CodegenContext {
     val funcCode: String =
       s"""
       public int $compareFunc(ArrayData a, ArrayData b) {
+        if (a instanceof UnsafeArrayData && b instanceof UnsafeArrayData && a == b) {
--- End diff --

Why the instanceof checks? This is Java; `a == b` already compares by reference.
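[Editor's note] The point above is that in Java, `==` between object references is reference identity regardless of the operands' runtime classes, so guarding the `a == b` short-circuit with `instanceof` checks adds nothing. A small sketch (the `FakeArray` class is a hypothetical stand-in for `ArrayData`, not Spark code):

```java
// Demonstrates that reference equality in Java does not depend on the
// runtime class of the operands: `a == b` is true iff both variables
// point at the same object.
public class RefEq {
    static class FakeArray {}  // hypothetical stand-in for ArrayData

    static boolean sameRef(Object a, Object b) {
        return a == b;  // identity, no instanceof needed
    }

    public static void main(String[] args) {
        FakeArray a = new FakeArray();
        FakeArray b = new FakeArray();
        System.out.println(sameRef(a, a)); // true: same object
        System.out.println(sameRef(a, b)); // false: distinct objects, same class
    }
}
```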
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/15929

[SPARK-18053][SQL] compare unsafe and safe complex-type values correctly

## What changes were proposed in this pull request?

In Spark SQL, some expressions may output values in the safe format, e.g. `CreateArray`, `CreateStruct`, `Cast`, etc. When we compare two values, we should be able to compare the safe and unsafe formats. `GreaterThan`, `LessThan`, etc. in Spark SQL already handle this, but `EqualTo` doesn't. This PR fixes it.

## How was this patch tested?

New unit test and regression test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark type-aware

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15929.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #15929

commit 00c432a23336d20634f2caf957a4df1c0c23f78e
Author: Wenchen Fan
Date: 2016-11-18T07:51:44Z

    compare unsafe and safe complex-type values correctly
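[Editor's note] The class of bug the PR description talks about can be shown in miniature: two values with identical logical content but different physical representations compare unequal under representation-sensitive equality, and only a content-aware comparison gets the right answer. Spark's "safe" and "unsafe" row formats are analogous; the byte arrays below are merely a stand-in for illustration, not Spark code:

```java
import java.util.Arrays;

// Two byte arrays with the same content but distinct physical objects:
// reference equality and Object.equals both say "not equal"; only a
// content-aware comparison (Arrays.equals) says "equal".
public class ContentEq {
    public static void main(String[] args) {
        byte[] safe   = {1, 2, 3};
        byte[] unsafe = {1, 2, 3};  // same content, different object

        System.out.println(safe == unsafe);              // false: reference compare
        System.out.println(safe.equals(unsafe));         // false: arrays inherit Object.equals
        System.out.println(Arrays.equals(safe, unsafe)); // true: content compare
    }
}
```

The analogous fix in the PR is to compare complex-type values by content (via the ordering/`genComp` path) rather than relying on format-dependent `equals`.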