[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10882#issuecomment-174161885 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10882#issuecomment-174161889 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49935/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10882#issuecomment-174163035 **[Test build #2447 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2447/consoleFull)** for PR 10882 at commit [`d5acd17`](https://github.com/apache/spark/commit/d5acd1703f1c5586b4dfe814fa7c5b865d2d9d08). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10882#issuecomment-174166173 **[Test build #49932 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49932/consoleFull)** for PR 10882 at commit [`d5acd17`](https://github.com/apache/spark/commit/d5acd1703f1c5586b4dfe814fa7c5b865d2d9d08). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10882#issuecomment-174166217 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10882#issuecomment-174166218 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49932/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/10882#issuecomment-174159461 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/10845 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10882 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10882#issuecomment-174217310 Merging this in master. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173840961 retest this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173855055 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49915/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173861317 @rxin Sure. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173870840 **[Test build #49918 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49918/consoleFull)** for PR 10845 at commit [`dc05a08`](https://github.com/apache/spark/commit/dc05a083b48a2a1ebfcf5b5a3e15a64fa849e8c0). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10845#discussion_r50514078 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala --- @@ -406,6 +406,105 @@ object HiveTypeCoercion { PromotePrecision(Cast(e, dataType)) } +private def binaryOperator(e: Expression): Expression = e match { + case b @ BinaryComparison(e1 @ DecimalType.Expression(p1, s1), +e2 @ DecimalType.Expression(p2, s2)) if p1 != p2 || s1 != s2 => +val resultType = widerDecimalType(p1, s1, p2, s2) +b.makeCopy(Array(Cast(e1, resultType), Cast(e2, resultType))) + + // Strength reduction for comparisons between an integral column and a decimal literal: + // 1. int_col > decimal_literal => int_col > floor(decimal_literal) + // 2. int_col >= decimal_literal => int_col >= ceil(decimal_literal) + // 3. int_col < decimal_literal => int_col < ceil(decimal_literal) + // 4. int_col <= decimal_literal => int_col <= floor(decimal_literal) + // 5. decimal_literal > int_col => ceil(decimal_literal) > int_col + // 6. decimal_literal >= int_col => floor(decimal_literal) >= int_col + // 7. decimal_literal < int_col => floor(decimal_literal) < int_col + // 8. decimal_literal <= int_col => ceil(decimal_literal) <= int_col + case GreaterThan(i @ IntegralType(), Literal(value: Decimal, _: DecimalType)) => +GreaterThan(i, Literal.create(value.floor.toInt, IntegerType)) --- End diff -- i don't think you can just do a toInt here decimal scope is larger than int. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173853848 @viirya rather than me pointing out all the problem in each iteration, can you try figuring out all the corner cases yourself and then submit this for review? Thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173844374 **[Test build #49915 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49915/consoleFull)** for PR 10845 at commit [`1de0616`](https://github.com/apache/spark/commit/1de06165e0a442c217fc79ad1150d93fbf9f07d3). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173854997 **[Test build #49915 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49915/consoleFull)** for PR 10845 at commit [`1de0616`](https://github.com/apache/spark/commit/1de06165e0a442c217fc79ad1150d93fbf9f07d3). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173855053 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173900617 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49918/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173900615 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173900419 **[Test build #49918 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49918/consoleFull)** for PR 10845 at commit [`dc05a08`](https://github.com/apache/spark/commit/dc05a083b48a2a1ebfcf5b5a3e15a64fa849e8c0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10882#issuecomment-174157512 **[Test build #49932 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49932/consoleFull)** for PR 10882 at commit [`d5acd17`](https://github.com/apache/spark/commit/d5acd1703f1c5586b4dfe814fa7c5b865d2d9d08). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10882#issuecomment-174157457 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49931/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/10882 [SPARK-12904][SQL] Strength reduction for integral and decimal literal comparisons This pull request implements strength reduction for comparing integral expressions and decimal literals, which is more common now You can merge this pull request into a Git repository by running: $ git pull https://github.com/rxin/spark SPARK-12904-1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10882.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10882 commit 8fdaafee2c1510f918309f3c794ebdf4ed54377a Author: Reynold XinDate: 2016-01-23T07:23:55Z [SPARK-12904][SQL] Strength reduction for integral and decimal literal comparisons commit 6bda5bc27bd8f8173db727b28ffe5eb113e6e2f8 Author: Reynold Xin Date: 2016-01-23T07:24:54Z Add decimal precision rule. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10882#issuecomment-174157278 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10882#issuecomment-174157279 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49930/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10882#discussion_r50617297 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala --- @@ -0,0 +1,261 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.analysis + +import org.apache.spark.sql.catalyst.expressions.Literal._ +import org.apache.spark.sql.catalyst.expressions._ +import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan +import org.apache.spark.sql.catalyst.rules.Rule +import org.apache.spark.sql.types._ + + +// scalastyle:off +/** + * Calculates and propagates precision for fixed-precision decimals. Hive has a number of + * rules for this based on the SQL standard and MS SQL: + * https://cwiki.apache.org/confluence/download/attachments/27362075/Hive_Decimal_Precision_Scale_Support.pdf + * https://msdn.microsoft.com/en-us/library/ms190476.aspx + * + * In particular, if we have expressions e1 and e2 with precision/scale p1/s2 and p2/s2 + * respectively, then the following operations have the following precision / scale: + * + * OperationResult PrecisionResult Scale + * + * e1 + e2 max(s1, s2) + max(p1-s1, p2-s2) + 1 max(s1, s2) + * e1 - e2 max(s1, s2) + max(p1-s1, p2-s2) + 1 max(s1, s2) + * e1 * e2 p1 + p2 + 1 s1 + s2 + * e1 / e2 p1 - s1 + s2 + max(6, s1 + p2 + 1) max(6, s1 + p2 + 1) + * e1 % e2 min(p1-s1, p2-s2) + max(s1, s2) max(s1, s2) + * e1 union e2 max(s1, s2) + max(p1-s1, p2-s2) max(s1, s2) + * sum(e1) p1 + 10 s1 + * avg(e1) p1 + 4 s1 + 4 + * + * Catalyst also has unlimited-precision decimals. For those, all ops return unlimited precision. + * + * To implement the rules for fixed-precision types, we introduce casts to turn them to unlimited + * precision, do the math on unlimited-precision numbers, then introduce casts back to the + * required fixed precision. This allows us to do all rounding and overflow handling in the + * cast-to-fixed-precision operator. + * + * In addition, when mixing non-decimal types with decimals, we use the following rules: + * - BYTE gets turned into DECIMAL(3, 0) + * - SHORT gets turned into DECIMAL(5, 0) + * - INT gets turned into DECIMAL(10, 0) + * - LONG gets turned into DECIMAL(20, 0) + * - FLOAT and DOUBLE cause fixed-length decimals to turn into DOUBLE + */ +// scalastyle:on +object DecimalPrecision extends Rule[LogicalPlan] { + import scala.math.{max, min} + + private def isFloat(t: DataType): Boolean = t == FloatType || t == DoubleType + + // Returns the wider decimal type that's wider than both of them + def widerDecimalType(d1: DecimalType, d2: DecimalType): DecimalType = { +widerDecimalType(d1.precision, d1.scale, d2.precision, d2.scale) + } + // max(s1, s2) + max(p1-s1, p2-s2), max(s1, s2) + def widerDecimalType(p1: Int, s1: Int, p2: Int, s2: Int): DecimalType = { +val scale = max(s1, s2) +val range = max(p1 - s1, p2 - s2) +DecimalType.bounded(range + scale, scale) + } + + private def promotePrecision(e: Expression, dataType: DataType): Expression = { +PromotePrecision(Cast(e, dataType)) + } + + def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators { +// fix decimal precision for expressions +case q => q.transformExpressions( + decimalAndDecimal.orElse(integralAndDecimalLiteral).orElse(nondecimalAndDecimal)) --- End diff -- i broke the previous monolithic decimal precision rule into 2 parts, and then added integralAndDecimalLiteral --- If your project is set up for it, you can reply to this email
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10882#discussion_r50617303 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala --- @@ -0,0 +1,261 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.analysis + +import org.apache.spark.sql.catalyst.expressions.Literal._ +import org.apache.spark.sql.catalyst.expressions._ +import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan +import org.apache.spark.sql.catalyst.rules.Rule +import org.apache.spark.sql.types._ + + +// scalastyle:off +/** + * Calculates and propagates precision for fixed-precision decimals. Hive has a number of + * rules for this based on the SQL standard and MS SQL: + * https://cwiki.apache.org/confluence/download/attachments/27362075/Hive_Decimal_Precision_Scale_Support.pdf + * https://msdn.microsoft.com/en-us/library/ms190476.aspx + * + * In particular, if we have expressions e1 and e2 with precision/scale p1/s2 and p2/s2 + * respectively, then the following operations have the following precision / scale: + * + * OperationResult PrecisionResult Scale + * + * e1 + e2 max(s1, s2) + max(p1-s1, p2-s2) + 1 max(s1, s2) + * e1 - e2 max(s1, s2) + max(p1-s1, p2-s2) + 1 max(s1, s2) + * e1 * e2 p1 + p2 + 1 s1 + s2 + * e1 / e2 p1 - s1 + s2 + max(6, s1 + p2 + 1) max(6, s1 + p2 + 1) + * e1 % e2 min(p1-s1, p2-s2) + max(s1, s2) max(s1, s2) + * e1 union e2 max(s1, s2) + max(p1-s1, p2-s2) max(s1, s2) + * sum(e1) p1 + 10 s1 + * avg(e1) p1 + 4 s1 + 4 + * + * Catalyst also has unlimited-precision decimals. For those, all ops return unlimited precision. + * + * To implement the rules for fixed-precision types, we introduce casts to turn them to unlimited + * precision, do the math on unlimited-precision numbers, then introduce casts back to the + * required fixed precision. This allows us to do all rounding and overflow handling in the + * cast-to-fixed-precision operator. + * + * In addition, when mixing non-decimal types with decimals, we use the following rules: + * - BYTE gets turned into DECIMAL(3, 0) + * - SHORT gets turned into DECIMAL(5, 0) + * - INT gets turned into DECIMAL(10, 0) + * - LONG gets turned into DECIMAL(20, 0) + * - FLOAT and DOUBLE cause fixed-length decimals to turn into DOUBLE + */ +// scalastyle:on +object DecimalPrecision extends Rule[LogicalPlan] { + import scala.math.{max, min} + + private def isFloat(t: DataType): Boolean = t == FloatType || t == DoubleType + + // Returns the wider decimal type that's wider than both of them + def widerDecimalType(d1: DecimalType, d2: DecimalType): DecimalType = { +widerDecimalType(d1.precision, d1.scale, d2.precision, d2.scale) + } + // max(s1, s2) + max(p1-s1, p2-s2), max(s1, s2) + def widerDecimalType(p1: Int, s1: Int, p2: Int, s2: Int): DecimalType = { +val scale = max(s1, s2) +val range = max(p1 - s1, p2 - s2) +DecimalType.bounded(range + scale, scale) + } + + private def promotePrecision(e: Expression, dataType: DataType): Expression = { +PromotePrecision(Cast(e, dataType)) + } + + def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators { +// fix decimal precision for expressions +case q => q.transformExpressions( + decimalAndDecimal.orElse(integralAndDecimalLiteral).orElse(nondecimalAndDecimal)) + } + + /** Decimal precision promotion for +, -, *, /, %, pmod, and binary comparison. */ + private val decimalAndDecimal: PartialFunction[Expression, Expression] = { +// Skip
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10882#issuecomment-174156957 This should subsume https://github.com/apache/spark/pull/10845 cc @viirya - I refactored the code so it is a lot cleaner, and looked at overflow cases. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10882#issuecomment-174157423 **[Test build #2447 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2447/consoleFull)** for PR 10882 at commit [`d5acd17`](https://github.com/apache/spark/commit/d5acd1703f1c5586b4dfe814fa7c5b865d2d9d08). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173491009 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49863/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173491008 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173490658 **[Test build #49863 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49863/consoleFull)** for PR 10845 at commit [`7202c54`](https://github.com/apache/spark/commit/7202c546d025fc2c5cf71856c7e64fce8e85444f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10845#discussion_r50497168 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala --- @@ -462,6 +462,32 @@ object HiveTypeCoercion { val resultType = widerDecimalType(p1, s1, p2, s2) b.makeCopy(Array(Cast(e1, resultType), Cast(e2, resultType))) +// Strength reduction for comparisons between an integral column and a decimal literal: +// 1. int_col > decimal_literal => int_col > floor(decimal_literal) +// 2. int_col >= decimal_literal => int_col >= ceil(decimal_literal) +// 3. int_col < decimal_literal => int_col < ceil(decimal_literal) +// 4. int_col <= decimal_literal => int_col <= floor(decimal_literal) +// 5. decimal_literal > int_col => ceil(decimal_literal) > int_col +// 6. decimal_literal >= int_col => floor(decimal_literal) >= int_col +// 7. decimal_literal < int_col => floor(decimal_literal) < int_col +// 8. decimal_literal <= int_col => ceil(decimal_literal) <= int_col +case GreaterThan(i @ IntegerType(), Literal(value: Decimal, _: DecimalType)) => --- End diff -- this should apply for any integral type, not just integer. Also can you make this function shorter by breaking it into two functions? It is pretty long now. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173813686 **[Test build #49902 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49902/consoleFull)** for PR 10845 at commit [`a1271fe`](https://github.com/apache/spark/commit/a1271fea6256c3930713926c560cbcdeea2f5b5f). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173819651 **[Test build #49902 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49902/consoleFull)** for PR 10845 at commit [`a1271fe`](https://github.com/apache/spark/commit/a1271fea6256c3930713926c560cbcdeea2f5b5f). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173834979 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49906/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173834872 **[Test build #49906 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49906/consoleFull)** for PR 10845 at commit [`1de0616`](https://github.com/apache/spark/commit/1de06165e0a442c217fc79ad1150d93fbf9f07d3). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173834978 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173822992 **[Test build #49906 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49906/consoleFull)** for PR 10845 at commit [`1de0616`](https://github.com/apache/spark/commit/1de06165e0a442c217fc79ad1150d93fbf9f07d3). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/10845#discussion_r50361137 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala --- @@ -696,6 +697,43 @@ object HiveTypeCoercion { } /** + * Strength reduction for comparisons between an integral column and a decimal literal: + * + * 1. int_col > decimal_literal => int_col > floor(decimal_literal) + * 2. int_col >= decimal_literal => int_col >= ceil(decimal_literal) + * 3. int_col < decimal_literal => int_col < ceil(decimal_literal) + * 4. int_col <= decimal_literal => int_col <= floor(decimal_literal) + * 5. decimal_literal > int_col => ceil(decimal_literal) > int_col + * 6. decimal_literal >= int_col => floor(decimal_literal) >= int_col + * 7. decimal_literal < int_col => floor(decimal_literal) < int_col + * 8. decimal_literal <= int_col => ceil(decimal_literal) <= int_col + * + */ + object SimplifyIntegerDecimalComparing extends Rule[LogicalPlan] { --- End diff -- Agreed. I also think that. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173473549 **[Test build #49863 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49863/consoleFull)** for PR 10845 at commit [`7202c54`](https://github.com/apache/spark/commit/7202c546d025fc2c5cf71856c7e64fce8e85444f). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10845#discussion_r50292340 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala --- @@ -696,6 +697,43 @@ object HiveTypeCoercion { } /** + * Strength reduction for comparisons between an integral column and a decimal literal: + * + * 1. int_col > decimal_literal => int_col > floor(decimal_literal) + * 2. int_col >= decimal_literal => int_col >= ceil(decimal_literal) + * 3. int_col < decimal_literal => int_col < ceil(decimal_literal) + * 4. int_col <= decimal_literal => int_col <= floor(decimal_literal) + * 5. decimal_literal > int_col => ceil(decimal_literal) > int_col + * 6. decimal_literal >= int_col => floor(decimal_literal) >= int_col + * 7. decimal_literal < int_col => floor(decimal_literal) < int_col + * 8. decimal_literal <= int_col => ceil(decimal_literal) <= int_col + * + */ + object SimplifyIntegerDecimalComparing extends Rule[LogicalPlan] { --- End diff -- In this case I'd consider just putting this in decimal precision - so we don't have the ordering problem. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/10845 [SPARK-12904][SQL] Strength reduction for integer/decimal comparisons JIRA: https://issues.apache.org/jira/browse/SPARK-12904 We can do the following strength reduction for comparisons between an integral column and a decimal literal: 1. int_col > decimal_literal => int_col > floor(decimal_literal) 2. int_col >= decimal_literal => int_col >= ceil(decimal_literal) 3. int_col < decimal_literal => int_col < ceil(decimal_literal) 4. int_col <= decimal_literal => int_col <= floor(decimal_literal) 5. decimal_literal > int_col => ceil(decimal_literal) > int_col 6. decimal_literal >= int_col => floor(decimal_literal) >= int_col 7. decimal_literal < int_col => floor(decimal_literal) < int_col 8. decimal_literal <= int_col => ceil(decimal_literal) <= int_col You can merge this pull request into a Git repository by running: $ git pull https://github.com/viirya/spark-1 int-decimal-compare Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10845.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10845 commit 5a96018c3d4ac193defc15ede6fbf91fe10a8f60 Author: Liang-Chi HsiehDate: 2016-01-20T08:24:56Z Strength reduction for integer/decimal comparisons. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/10845#discussion_r50226790 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala --- @@ -696,6 +697,43 @@ object HiveTypeCoercion { } /** + * Strength reduction for comparisons between an integral column and a decimal literal: + * + * 1. int_col > decimal_literal => int_col > floor(decimal_literal) + * 2. int_col >= decimal_literal => int_col >= ceil(decimal_literal) + * 3. int_col < decimal_literal => int_col < ceil(decimal_literal) + * 4. int_col <= decimal_literal => int_col <= floor(decimal_literal) + * 5. decimal_literal > int_col => ceil(decimal_literal) > int_col + * 6. decimal_literal >= int_col => floor(decimal_literal) >= int_col + * 7. decimal_literal < int_col => floor(decimal_literal) < int_col + * 8. decimal_literal <= int_col => ceil(decimal_literal) <= int_col + * + */ + object SimplifyIntegerDecimalComparing extends Rule[LogicalPlan] { --- End diff -- As I testes before, you will get something like [cast(cast(a, decimal(10, 1), decimal(11, 1)) > cast(4.7, decimal(11.1))] after analysis phase due to `DecimalPrecision` and `ImplicitTypeCasts` rules. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10845#discussion_r50227019 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala --- @@ -696,6 +697,43 @@ object HiveTypeCoercion { } /** + * Strength reduction for comparisons between an integral column and a decimal literal: + * + * 1. int_col > decimal_literal => int_col > floor(decimal_literal) + * 2. int_col >= decimal_literal => int_col >= ceil(decimal_literal) + * 3. int_col < decimal_literal => int_col < ceil(decimal_literal) + * 4. int_col <= decimal_literal => int_col <= floor(decimal_literal) + * 5. decimal_literal > int_col => ceil(decimal_literal) > int_col + * 6. decimal_literal >= int_col => floor(decimal_literal) >= int_col + * 7. decimal_literal < int_col => floor(decimal_literal) < int_col + * 8. decimal_literal <= int_col => ceil(decimal_literal) <= int_col + * + */ + object SimplifyIntegerDecimalComparing extends Rule[LogicalPlan] { --- End diff -- The other way to do this is to put it in decimal precision... although it's best to put it in the optimizer. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10845#discussion_r50227045 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala --- @@ -696,6 +697,43 @@ object HiveTypeCoercion { } /** + * Strength reduction for comparisons between an integral column and a decimal literal: + * + * 1. int_col > decimal_literal => int_col > floor(decimal_literal) + * 2. int_col >= decimal_literal => int_col >= ceil(decimal_literal) + * 3. int_col < decimal_literal => int_col < ceil(decimal_literal) + * 4. int_col <= decimal_literal => int_col <= floor(decimal_literal) + * 5. decimal_literal > int_col => ceil(decimal_literal) > int_col + * 6. decimal_literal >= int_col => floor(decimal_literal) >= int_col + * 7. decimal_literal < int_col => floor(decimal_literal) < int_col + * 8. decimal_literal <= int_col => ceil(decimal_literal) <= int_col + * + */ + object SimplifyIntegerDecimalComparing extends Rule[LogicalPlan] { --- End diff -- btw you can probably simplify the code a little bit by introducing a DecimalLiteral object with an unapply. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/10845#discussion_r50226598 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala --- @@ -696,6 +697,43 @@ object HiveTypeCoercion { } /** + * Strength reduction for comparisons between an integral column and a decimal literal: + * + * 1. int_col > decimal_literal => int_col > floor(decimal_literal) + * 2. int_col >= decimal_literal => int_col >= ceil(decimal_literal) + * 3. int_col < decimal_literal => int_col < ceil(decimal_literal) + * 4. int_col <= decimal_literal => int_col <= floor(decimal_literal) + * 5. decimal_literal > int_col => ceil(decimal_literal) > int_col + * 6. decimal_literal >= int_col => floor(decimal_literal) >= int_col + * 7. decimal_literal < int_col => floor(decimal_literal) < int_col + * 8. decimal_literal <= int_col => ceil(decimal_literal) <= int_col + * + */ + object SimplifyIntegerDecimalComparing extends Rule[LogicalPlan] { --- End diff -- Ok. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/10845#discussion_r50225930 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala --- @@ -49,6 +49,7 @@ object HiveTypeCoercion { InConversion :: WidenSetOperationTypes :: PromoteStrings :: + SimplifyIntegerDecimalComparing :: --- End diff -- We add this rule in analyser instead of optimizer because `DecimalPrecision` and `ImplicitTypeCasts` rules in analyser will add additional `Cast` operator to the integer expression. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173129414 **[Test build #49772 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49772/consoleFull)** for PR 10845 at commit [`5a96018`](https://github.com/apache/spark/commit/5a96018c3d4ac193defc15ede6fbf91fe10a8f60). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/10845#discussion_r50228717 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala --- @@ -696,6 +697,43 @@ object HiveTypeCoercion { } /** + * Strength reduction for comparisons between an integral column and a decimal literal: + * + * 1. int_col > decimal_literal => int_col > floor(decimal_literal) + * 2. int_col >= decimal_literal => int_col >= ceil(decimal_literal) + * 3. int_col < decimal_literal => int_col < ceil(decimal_literal) + * 4. int_col <= decimal_literal => int_col <= floor(decimal_literal) + * 5. decimal_literal > int_col => ceil(decimal_literal) > int_col + * 6. decimal_literal >= int_col => floor(decimal_literal) >= int_col + * 7. decimal_literal < int_col => floor(decimal_literal) < int_col + * 8. decimal_literal <= int_col => ceil(decimal_literal) <= int_col + * + */ + object SimplifyIntegerDecimalComparing extends Rule[LogicalPlan] { --- End diff -- I am thinking, is it okay to simply removing these type cast here? I meant, we can't make sure where these type cast are added. If we do this optimization by removing type casting around the integer attribute and decimal literal, will we introduce possible bugs when users add type casting manually? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10845#discussion_r50226074 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala --- @@ -696,6 +697,43 @@ object HiveTypeCoercion { } /** + * Strength reduction for comparisons between an integral column and a decimal literal: + * + * 1. int_col > decimal_literal => int_col > floor(decimal_literal) + * 2. int_col >= decimal_literal => int_col >= ceil(decimal_literal) + * 3. int_col < decimal_literal => int_col < ceil(decimal_literal) + * 4. int_col <= decimal_literal => int_col <= floor(decimal_literal) + * 5. decimal_literal > int_col => ceil(decimal_literal) > int_col + * 6. decimal_literal >= int_col => floor(decimal_literal) >= int_col + * 7. decimal_literal < int_col => floor(decimal_literal) < int_col + * 8. decimal_literal <= int_col => ceil(decimal_literal) <= int_col + * + */ + object SimplifyIntegerDecimalComparing extends Rule[LogicalPlan] { --- End diff -- This should go into the optimizer rather than the analyzer. Note that you will need to check for a type cast there. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173158563 **[Test build #49772 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49772/consoleFull)** for PR 10845 at commit [`5a96018`](https://github.com/apache/spark/commit/5a96018c3d4ac193defc15ede6fbf91fe10a8f60). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173159490 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49772/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173159483 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org