[GitHub] [spark] AmplabJenkins removed a comment on pull request #28387: [SPARK-29339][R][FOLLOW-UP] remove requireNamespace1 workaround for arrow
AmplabJenkins removed a comment on pull request #28387: URL: https://github.com/apache/spark/pull/28387#issuecomment-620488750 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] wang-zhun commented on pull request #28009: [SPARK-31235][YARN] Separates different categories of applications
wang-zhun commented on pull request #28009: URL: https://github.com/apache/spark/pull/28009#issuecomment-620488942 @tgravescs help look at this PR.
[GitHub] [spark] AmplabJenkins commented on pull request #28389: [SPARK-31592]bufferPoolsBySize in HeapMemoryAllocator should be thread safe
AmplabJenkins commented on pull request #28389: URL: https://github.com/apache/spark/pull/28389#issuecomment-620488475 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on pull request #28387: [SPARK-29339][R][FOLLOW-UP] remove requireNamespace1 workaround for arrow
AmplabJenkins commented on pull request #28387: URL: https://github.com/apache/spark/pull/28387#issuecomment-620488750
[GitHub] [spark] fanyunbojerry opened a new pull request #28389: [SPARK-31592]bufferPoolsBySize in HeapMemoryAllocator should be thread safe
fanyunbojerry opened a new pull request #28389:
URL: https://github.com/apache/spark/pull/28389

### What changes were proposed in this pull request?
Currently, `bufferPoolsBySize` in `HeapMemoryAllocator` is a `Map` whose values are `LinkedList`s. `LinkedList` is not thread safe, so concurrent access may hit the error below:
```
java.util.NoSuchElementException
    at java.util.LinkedList.removeFirst(LinkedList.java:270)
    at java.util.LinkedList.remove(LinkedList.java:685)
    at org.apache.spark.unsafe.memory.HeapMemoryAllocator.allocate(HeapMemoryAllocator.java:57)
```
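For readers following the thread, here is a minimal sketch of one way a size-keyed buffer pool can avoid this failure. It is an illustration only, not the change proposed in the PR and not Spark's actual `HeapMemoryAllocator`. The class name `BufferPoolSketch` and its `allocate`/`free` methods are hypothetical, and it assumes that swapping `LinkedList` for `ConcurrentLinkedQueue` is acceptable for the pool.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical illustration; names and structure are assumptions, not Spark code.
final class BufferPoolSketch {
    // Both the map and the per-size queues are safe for concurrent access.
    private final Map<Integer, ConcurrentLinkedQueue<long[]>> bufferPoolsBySize =
        new ConcurrentHashMap<>();

    // Returns a pooled buffer of the requested length if one is available,
    // otherwise allocates a fresh one.
    long[] allocate(int numWords) {
        ConcurrentLinkedQueue<long[]> pool = bufferPoolsBySize.get(numWords);
        if (pool != null) {
            // poll() returns null on an empty queue instead of throwing
            // NoSuchElementException, so a racing thread cannot trigger the error.
            long[] pooled = pool.poll();
            if (pooled != null) {
                return pooled;
            }
        }
        return new long[numWords];
    }

    // Returns a buffer to the pool keyed by its length.
    void free(long[] buffer) {
        bufferPoolsBySize
            .computeIfAbsent(buffer.length, k -> new ConcurrentLinkedQueue<>())
            .add(buffer);
    }
}
```

The key point is that "check whether the pool is empty, then remove" is replaced by a single atomic `poll()`. Guarding the existing `LinkedList` with a `synchronized` block would be another way to get the same guarantee.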
[GitHub] [spark] SparkQA commented on pull request #28387: [SPARK-29339][R][FOLLOW-UP] remove requireNamespace1 workaround for arrow
SparkQA commented on pull request #28387: URL: https://github.com/apache/spark/pull/28387#issuecomment-620488228 **[Test build #121985 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121985/testReport)** for PR 28387 at commit [`0a7ec3b`](https://github.com/apache/spark/commit/0a7ec3beba0b61e8aebd443e6cc357b1320816f2).
[GitHub] [spark] MichaelChirico commented on a change in pull request #28387: [SPARK-29339][R][FOLLOW-UP] remove requireNamespace1 workaround for arrow
MichaelChirico commented on a change in pull request #28387:
URL: https://github.com/apache/spark/pull/28387#discussion_r416463288

## File path: R/pkg/R/types.R
@@ -88,11 +88,6 @@ specialtypeshandle <- function(type) {
 checkSchemaInArrow <- function(schema) {
   stopifnot(inherits(schema, "structType"))

-  requireNamespace1 <- requireNamespace

Review comment:
In the meantime, given that `sparkR.conf("spark.sql.execution.arrow.sparkr.enabled")[[1]] == "true"` was popping up in a few places, I added a helper to `utils.R` to centralize maintenance of that snippet.
[GitHub] [spark] AmplabJenkins commented on pull request #28387: [SPARK-29339][R][FOLLOW-UP] remove requireNamespace1 workaround for arrow
AmplabJenkins commented on pull request #28387: URL: https://github.com/apache/spark/pull/28387#issuecomment-620485227
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28387: [SPARK-29339][R][FOLLOW-UP] remove requireNamespace1 workaround for arrow
AmplabJenkins removed a comment on pull request #28387: URL: https://github.com/apache/spark/pull/28387#issuecomment-620485227
[GitHub] [spark] SparkQA commented on pull request #28387: [SPARK-29339][R][FOLLOW-UP] remove requireNamespace1 workaround for arrow
SparkQA commented on pull request #28387: URL: https://github.com/apache/spark/pull/28387#issuecomment-620484508 **[Test build #121984 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121984/testReport)** for PR 28387 at commit [`043dafc`](https://github.com/apache/spark/commit/043dafcf78092f6d481c7bd09fc2e20e0a2d).
[GitHub] [spark] yaooqinn commented on pull request #28222: SPARK-31447 Fix issue in ExtractIntervalPart expression
yaooqinn commented on pull request #28222: URL: https://github.com/apache/spark/pull/28222#issuecomment-620484345 Checked PostgreSQL (not ANSI interval type) and Presto (ANSI); both of them return proper days.
[GitHub] [spark] MichaelChirico commented on a change in pull request #28387: [SPARK-29339][R][FOLLOW-UP] remove requireNamespace1 workaround for arrow
MichaelChirico commented on a change in pull request #28387:
URL: https://github.com/apache/spark/pull/28387#discussion_r416459279

## File path: R/pkg/R/types.R
@@ -88,11 +88,6 @@ specialtypeshandle <- function(type) {
 checkSchemaInArrow <- function(schema) {
   stopifnot(inherits(schema, "structType"))

-  requireNamespace1 <- requireNamespace

Review comment:
I checked all the usages of `checkSchemaInArrow`:
```
grep -Fnr "checkSchemaInArrow" R
R/types.R:88:checkSchemaInArrow <- function(schema) {
R/SQLContext.R:277:checkSchemaInArrow(schema)
R/group.R:235: checkSchemaInArrow(schema)
R/DataFrame.R:1211:checkSchemaInArrow(schema(x))
R/DataFrame.R:1509: checkSchemaInArrow(schema)
```
These are all within branches that have checked
```
arrowEnabled <- sparkR.conf("spark.sql.execution.arrow.sparkr.enabled")[[1]] == "true"
```
Since `arrow` is not directly used and this conf check is passed already, I think the `requireNamespace` here is unnecessary.
[GitHub] [spark] MichaelChirico commented on a change in pull request #28387: [SPARK-29339][R][FOLLOW-UP] remove requireNamespace1 workaround for arrow
MichaelChirico commented on a change in pull request #28387:
URL: https://github.com/apache/spark/pull/28387#discussion_r416457657

## File path: R/pkg/R/DataFrame.R
@@ -1226,8 +1226,7 @@ setMethod("collect",
   # empty data.frame with 0 columns and 0 rows
   data.frame()
 } else if (useArrow) {
-  requireNamespace1 <- requireNamespace
-  if (requireNamespace1("arrow", quietly = TRUE)) {
+  if (requireNamespace("arrow", quietly = TRUE)) {
     read_arrow <- get("read_arrow", envir = asNamespace("arrow"), inherits = FALSE)

Review comment:
Yep, I wasn't reading carefully enough.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins removed a comment on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620481044
[GitHub] [spark] AmplabJenkins commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins commented on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620481044
[GitHub] [spark] SparkQA commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
SparkQA commented on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620480549 **[Test build #121983 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121983/testReport)** for PR 28386 at commit [`7f83232`](https://github.com/apache/spark/commit/7f83232b14529901df19ae23743380768a8daca5).
[GitHub] [spark] cloud-fan commented on a change in pull request #28375: [SPARK-30282][SQL][FOLLOWUP] SHOW TBLPROPERTIES should support views
cloud-fan commented on a change in pull request #28375:
URL: https://github.com/apache/spark/pull/28375#discussion_r416448092

## File path: docs/sql-migration-guide.md
@@ -59,7 +59,7 @@ license: |

   - In Spark 3.0, you can use `ADD FILE` to add file directories as well. Earlier you could add only single files using this command. To restore the behavior of earlier versions, set `spark.sql.legacy.addSingleFileInAddFile` to `true`.

-  - In Spark 3.0, `SHOW TBLPROPERTIES` throws `AnalysisException` if the table does not exist. In Spark version 2.4 and below, this scenario caused `NoSuchTableException`. Also, `SHOW TBLPROPERTIES` on a temporary view causes `AnalysisException`. In Spark version 2.4 and below, it returned an empty result.
+  - In Spark 3.0, `SHOW TBLPROPERTIES` throws `AnalysisException` if the table does not exist. In Spark version 2.4 and below, this scenario caused `NoSuchTableException`.

Review comment:
I don't think error message/exception type change is a breaking change. @dongjoon-hyun what do you think?
[GitHub] [spark] cloud-fan commented on pull request #28222: SPARK-31447 Fix issue in ExtractIntervalPart expression
cloud-fan commented on pull request #28222: URL: https://github.com/apache/spark/pull/28222#issuecomment-620473740 cc @yaooqinn can you take a look? This seems like a hard problem as we have a non-standard interval definition. It's interesting to see what results other systems return, like presto, hive, snowflake, etc.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28385: [SPARK-31591][CORE] Fix null name prefix when create directory
AmplabJenkins removed a comment on pull request #28385: URL: https://github.com/apache/spark/pull/28385#issuecomment-620465474
[GitHub] [spark] maropu commented on pull request #28385: [SPARK-31591][CORE] Fix null name prefix when create directory
maropu commented on pull request #28385: URL: https://github.com/apache/spark/pull/28385#issuecomment-620465500 Why is the method called with null in the shuffle case? Is this an issue at the call site?
[GitHub] [spark] AmplabJenkins commented on pull request #28385: [SPARK-31591][CORE] Fix null name prefix when create directory
AmplabJenkins commented on pull request #28385: URL: https://github.com/apache/spark/pull/28385#issuecomment-620465474
[GitHub] [spark] SparkQA commented on pull request #28385: [SPARK-31591][CORE] Fix null name prefix when create directory
SparkQA commented on pull request #28385: URL: https://github.com/apache/spark/pull/28385#issuecomment-620464831 **[Test build #121982 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121982/testReport)** for PR 28385 at commit [`a42f005`](https://github.com/apache/spark/commit/a42f005b8942a4e7d05f8271bf990b46e916c790).
[GitHub] [spark] maropu commented on pull request #28385: [SPARK-31591][CORE] Fix null name prefix when create directory
maropu commented on pull request #28385: URL: https://github.com/apache/spark/pull/28385#issuecomment-620464218 retest this please
[GitHub] [spark] HyukjinKwon commented on a change in pull request #28387: [SPARK-29339][R][FOLLOW-UP] remove requireNamespace1 workaround for arrow
HyukjinKwon commented on a change in pull request #28387:
URL: https://github.com/apache/spark/pull/28387#discussion_r416431223

## File path: R/pkg/R/DataFrame.R
@@ -1226,8 +1226,7 @@ setMethod("collect",
   # empty data.frame with 0 columns and 0 rows
   data.frame()
 } else if (useArrow) {
-  requireNamespace1 <- requireNamespace
-  if (requireNamespace1("arrow", quietly = TRUE)) {
+  if (requireNamespace("arrow", quietly = TRUE)) {
     read_arrow <- get("read_arrow", envir = asNamespace("arrow"), inherits = FALSE)

Review comment:
@MichaelChirico I believe this can be fixed as `arrow::read_arrow`. It was also a workaround. See also https://github.com/apache/spark/pull/25993
[GitHub] [spark] HyukjinKwon commented on a change in pull request #28387: [SPARK-29339][R][FOLLOW-UP] remove requireNamespace1 workaround for arrow
HyukjinKwon commented on a change in pull request #28387:
URL: https://github.com/apache/spark/pull/28387#discussion_r416429707

## File path: R/pkg/R/DataFrame.R
@@ -1226,8 +1226,7 @@ setMethod("collect",
   # empty data.frame with 0 columns and 0 rows
   data.frame()
 } else if (useArrow) {
-  requireNamespace1 <- requireNamespace
-  if (requireNamespace1("arrow", quietly = TRUE)) {
+  if (requireNamespace("arrow", quietly = TRUE)) {

Review comment:
Thanks, I wonder why I missed this ..
[GitHub] [spark] igreenfield commented on pull request #26624: [SPARK-8981][core] Add MDC support in Executor
igreenfield commented on pull request #26624: URL: https://github.com/apache/spark/pull/26624#issuecomment-620460159 The failed test does not seem to be connected to the changes in the code.
[GitHub] [spark] HyukjinKwon commented on pull request #28367: [SPARK-31573][R] Apply fixed=TRUE as appropriate to regex usage in R
HyukjinKwon commented on pull request #28367: URL: https://github.com/apache/spark/pull/28367#issuecomment-620458462 Merged to master and branch-3.0.
[GitHub] [spark] cloud-fan commented on a change in pull request #26141: [SPARK-29492][SQL]Reset HiveSession's SessionState conf's ClassLoader when sync mode
cloud-fan commented on a change in pull request #26141:
URL: https://github.com/apache/spark/pull/26141#discussion_r416423730

## File path: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
@@ -72,6 +72,48 @@ class HiveThriftBinaryServerSuite extends HiveThriftJdbcTest {
     try f(client) finally transport.close()
   }

+  test("SPARK-29492: use add jar in sync mode") {

Review comment:
Can you post the error message when running the test before this patch?
[GitHub] [spark] cloud-fan commented on a change in pull request #26141: [SPARK-29492][SQL]Reset HiveSession's SessionState conf's ClassLoader when sync mode
cloud-fan commented on a change in pull request #26141:
URL: https://github.com/apache/spark/pull/26141#discussion_r416423419

## File path: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala
@@ -72,6 +72,48 @@ class HiveThriftBinaryServerSuite extends HiveThriftJdbcTest {
     try f(client) finally transport.close()
   }

+  test("SPARK-29492: use add jar in sync mode") {

Review comment:
Let's put the new test at the end.
[GitHub] [spark] HyukjinKwon commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
HyukjinKwon commented on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620454937 cc @felixcheung and @shivaram FYI
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins removed a comment on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620453870 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/121980/ Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
SparkQA commented on pull request #28386:
URL: https://github.com/apache/spark/pull/28386#issuecomment-620453844

**[Test build #121980 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121980/testReport)** for PR 28386 at commit [`d0965d5`](https://github.com/apache/spark/commit/d0965d5de4288c5ad83337ca19577ebf10195dc3).
 * This patch **fails R style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
SparkQA removed a comment on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620449564 **[Test build #121980 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121980/testReport)** for PR 28386 at commit [`d0965d5`](https://github.com/apache/spark/commit/d0965d5de4288c5ad83337ca19577ebf10195dc3).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins removed a comment on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620453862 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins commented on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620453862
[GitHub] [spark] xuanyuanking commented on a change in pull request #28326: [SPARK-27340][SS] Alias on TimeWindow expression cause watermark metadata lost
xuanyuanking commented on a change in pull request #28326:
URL: https://github.com/apache/spark/pull/28326#discussion_r416419173

## File path: sql/core/src/main/scala/org/apache/spark/sql/Column.scala
@@ -1040,17 +1034,11 @@ class Column(val expr: Expression) extends Logging {
    *   df.select($"colA".name("colB"))
    * }}}
    *
-   * If the current column has metadata associated with it, this metadata will be propagated
-   * to the new column. If this not desired, use `as` with explicitly empty metadata.

Review comment:
These comments were added together with the changes we just reverted. But it's good to have clear comments, so I'll rephrase them and add them back in the follow-up.
[GitHub] [spark] cloud-fan commented on pull request #28294: [SPARK-31519][SQL] Cast in having aggregate expressions returns the wrong result
cloud-fan commented on pull request #28294: URL: https://github.com/apache/spark/pull/28294#issuecomment-620452258 @xuanyuanking can you send a backport PR for 2.4?
[GitHub] [spark] cloud-fan commented on pull request #28294: [SPARK-31519][SQL] Cast in having aggregate expressions returns the wrong result
cloud-fan commented on pull request #28294: URL: https://github.com/apache/spark/pull/28294#issuecomment-620451860 thanks, merging to master/3.0!
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28366: [WIP][SPARK-31365][SQL] Enable nested predicate pushdown per data sources
AmplabJenkins removed a comment on pull request #28366: URL: https://github.com/apache/spark/pull/28366#issuecomment-620450142
[GitHub] [spark] AmplabJenkins commented on pull request #28366: [WIP][SPARK-31365][SQL] Enable nested predicate pushdown per data sources
AmplabJenkins commented on pull request #28366: URL: https://github.com/apache/spark/pull/28366#issuecomment-620450142
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins removed a comment on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620450040
[GitHub] [spark] AmplabJenkins commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins commented on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620450040
[GitHub] [spark] SparkQA commented on pull request #28366: [WIP][SPARK-31365][SQL] Enable nested predicate pushdown per data sources
SparkQA commented on pull request #28366: URL: https://github.com/apache/spark/pull/28366#issuecomment-620449590 **[Test build #121981 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121981/testReport)** for PR 28366 at commit [`e555a1c`](https://github.com/apache/spark/commit/e555a1c94d6ec7b1a338015a686af63eaec3c8a9).
[GitHub] [spark] SparkQA commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
SparkQA commented on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620449564 **[Test build #121980 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121980/testReport)** for PR 28386 at commit [`d0965d5`](https://github.com/apache/spark/commit/d0965d5de4288c5ad83337ca19577ebf10195dc3).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins removed a comment on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620447533 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/121978/
[GitHub] [spark] SparkQA removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
SparkQA removed a comment on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620442254 **[Test build #121978 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121978/testReport)** for PR 28386 at commit [`abc9bd6`](https://github.com/apache/spark/commit/abc9bd6a1f02796be4940aac228c76f96cd9b49a).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins removed a comment on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620447524 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins commented on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620447524
[GitHub] [spark] SparkQA commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
SparkQA commented on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620447503 **[Test build #121978 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121978/testReport)** for PR 28386 at commit [`abc9bd6`](https://github.com/apache/spark/commit/abc9bd6a1f02796be4940aac228c76f96cd9b49a). * This patch **fails R style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #27978: [SPARK-31127][ML] Implement abstract Selector
AmplabJenkins removed a comment on pull request #27978: URL: https://github.com/apache/spark/pull/27978#issuecomment-620446584
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins removed a comment on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620442764
[GitHub] [spark] AmplabJenkins commented on pull request #27978: [SPARK-31127][ML] Implement abstract Selector
AmplabJenkins commented on pull request #27978: URL: https://github.com/apache/spark/pull/27978#issuecomment-620446584
[GitHub] [spark] SparkQA commented on pull request #27978: [SPARK-31127][ML] Implement abstract Selector
SparkQA commented on pull request #27978: URL: https://github.com/apache/spark/pull/27978#issuecomment-620445888 **[Test build #121979 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121979/testReport)** for PR 27978 at commit [`5eabb62`](https://github.com/apache/spark/commit/5eabb625e452236eff344131345deb2e8aaa8e7b).
[GitHub] [spark] cloud-fan commented on a change in pull request #27627: [WIP][SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow
cloud-fan commented on a change in pull request #27627: URL: https://github.com/apache/spark/pull/27627#discussion_r416408564 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala ## @@ -62,38 +62,113 @@ case class Sum(child: Expression) extends DeclarativeAggregate with ImplicitCast private lazy val sum = AttributeReference("sum", sumDataType)() + private lazy val isEmptyOrNulls = AttributeReference("isEmptyOrNulls", BooleanType, false)() + private lazy val zero = Literal.default(sumDataType) - override lazy val aggBufferAttributes = sum :: Nil + override lazy val aggBufferAttributes = sum :: isEmptyOrNulls :: Nil override lazy val initialValues: Seq[Expression] = Seq( -/* sum = */ Literal.create(null, sumDataType) +/* sum = */ zero, +/* isEmptyOrNulls = */ Literal.create(true, BooleanType) ) + /** + * For decimal types and when child is nullable: + * isEmptyOrNulls flag is a boolean to represent if there are no rows or if all rows that + * have been seen are null. This will be used to identify if the end result of sum in + * evaluateExpression should be null or not. + * + * Update of the isEmptyOrNulls flag: + * If this flag is false, then keep it as is. + * If this flag is true, then check if the incoming value is null and if it is null, keep it + * as true else update it to false. + * Once this flag is switched to false, it will remain false. + * + * The update of the sum is as follows: + * If sum is null, then we have a case of overflow, so keep sum as is. + * If sum is not null, and the incoming value is not null, then perform the addition along + * with the overflow checking. Note, that if overflow occurs, then sum will be null here. 
+ * If the new incoming value is null, we will keep the sum in buffer as is and skip this + * incoming null + */ override lazy val updateExpressions: Seq[Expression] = { if (child.nullable) { - Seq( -/* sum = */ -coalesce(coalesce(sum, zero) + child.cast(sumDataType), sum) - ) + resultType match { +case d: DecimalType => + Seq( +/* sum */ +If(IsNull(sum), sum, + If(IsNotNull(child.cast(sumDataType)), +CheckOverflow(sum + child.cast(sumDataType), d, true), sum)), +/* isEmptyOrNulls */ +If(isEmptyOrNulls, IsNull(child.cast(sumDataType)), isEmptyOrNulls) + ) +case _ => + Seq( +coalesce(sum + child.cast(sumDataType), sum), +If(isEmptyOrNulls, IsNull(child.cast(sumDataType)), isEmptyOrNulls) + ) + } } else { - Seq( -/* sum = */ -coalesce(sum, zero) + child.cast(sumDataType) - ) + resultType match { +case d: DecimalType => + Seq( +/* sum */ +If(IsNull(sum), sum, CheckOverflow(sum + child.cast(sumDataType), d, true)), +/* isEmptyOrNulls */ +false + ) +case _ => Seq(sum + child.cast(sumDataType), false) + } } } + /** + * For decimal type: + * update of the sum is as follows: + * Check if either portion of the left.sum or right.sum has overflowed + * If it has, then the sum value will remain null. + * If it did not have overflow, then add the sum.left and sum.right and check for overflow. + * + * isEmptyOrNulls: Set to false if either one of the left or right is set to false. This + * means we have seen atleast a row that was not null. + * If the value from bufferLeft and bufferRight are both true, then this will be true. 
+ */ override lazy val mergeExpressions: Seq[Expression] = { -Seq( - /* sum = */ - coalesce(coalesce(sum.left, zero) + sum.right, sum.left) -) +resultType match { + case d: DecimalType => +Seq( + /* sum = */ + If(And(IsNull(sum.left), EqualTo(isEmptyOrNulls.left, false)) || +And(IsNull(sum.right), EqualTo(isEmptyOrNulls.right, false)), + Literal.create(null, resultType), + CheckOverflow(sum.left + sum.right, d, true)), + /* isEmptyOrNulls = */ + And(isEmptyOrNulls.left, isEmptyOrNulls.right) + ) + case _ => +Seq( + coalesce(sum.left + sum.right, sum.left), + And(isEmptyOrNulls.left, isEmptyOrNulls.right) +) +} } + /** + * If the isEmptyOrNulls is true, then it means either there are no rows, or all the rows were + * null, so the result will be null. + * If the isEmptyOrNulls is false, then if sum is null that means an overflow has happened. + * So now, if ansi is enabled, then throw exception, if not then return null. + * If sum is not null, then return the sum. Review comment: If we don't check overflow at https://github.com/apache/spark/pull/27627/files#r416407527 , we
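The update and evaluate rules described in the doc comments of this diff can be sketched outside Spark. The following is a minimal Python model, not the Catalyst implementation: `check_overflow`, `initial`, `update`, `evaluate`, and the toy precision of 5 digits are hypothetical names and values chosen for illustration, and `None` stands in for SQL null.

```python
from decimal import Decimal
from typing import Optional, Tuple

# Aggregation buffer: (sum, is_empty_or_nulls). None models SQL null.
Buffer = Tuple[Optional[Decimal], bool]


def check_overflow(value: Decimal, precision: int) -> Optional[Decimal]:
    """Hypothetical stand-in for CheckOverflow with null-on-overflow:
    returns None when the value needs more than `precision` integer digits."""
    limit = Decimal(10) ** precision
    return value if abs(value) < limit else None


def initial() -> Buffer:
    # sum starts at zero, the flag starts at true (no non-null row seen yet).
    return (Decimal(0), True)


def update(buf: Buffer, value: Optional[Decimal], precision: int = 5) -> Buffer:
    total, is_empty_or_nulls = buf
    if total is None:
        new_total = None            # an earlier overflow: keep sum as null
    elif value is not None:
        new_total = check_overflow(total + value, precision)
    else:
        new_total = total           # skip the incoming null
    # Once the flag is false it stays false; while true it tracks "still only nulls".
    new_flag = (value is None) if is_empty_or_nulls else False
    return (new_total, new_flag)


def evaluate(buf: Buffer, ansi: bool = False) -> Optional[Decimal]:
    total, is_empty_or_nulls = buf
    if is_empty_or_nulls:
        return None                 # no rows, or only null rows
    if total is None:
        if ansi:
            raise ArithmeticError("decimal overflow in sum")
        return None                 # overflow, non-ANSI mode
    return total
```

In this model a null result from `evaluate` is ambiguous on its own, which is why the flag is carried in the buffer: `(None, False)` means overflow, while a buffer whose flag is still true means no non-null input was seen.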
[GitHub] [spark] cloud-fan commented on a change in pull request #27627: [WIP][SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow
cloud-fan commented on a change in pull request #27627: URL: https://github.com/apache/spark/pull/27627#discussion_r416407527 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala ## @@ -62,38 +62,113 @@ case class Sum(child: Expression) extends DeclarativeAggregate with ImplicitCast private lazy val sum = AttributeReference("sum", sumDataType)() + private lazy val isEmptyOrNulls = AttributeReference("isEmptyOrNulls", BooleanType, false)() + private lazy val zero = Literal.default(sumDataType) - override lazy val aggBufferAttributes = sum :: Nil + override lazy val aggBufferAttributes = sum :: isEmptyOrNulls :: Nil override lazy val initialValues: Seq[Expression] = Seq( -/* sum = */ Literal.create(null, sumDataType) +/* sum = */ zero, +/* isEmptyOrNulls = */ Literal.create(true, BooleanType) ) + /** + * For decimal types and when child is nullable: + * isEmptyOrNulls flag is a boolean to represent if there are no rows or if all rows that + * have been seen are null. This will be used to identify if the end result of sum in + * evaluateExpression should be null or not. + * + * Update of the isEmptyOrNulls flag: + * If this flag is false, then keep it as is. + * If this flag is true, then check if the incoming value is null and if it is null, keep it + * as true else update it to false. + * Once this flag is switched to false, it will remain false. + * + * The update of the sum is as follows: + * If sum is null, then we have a case of overflow, so keep sum as is. + * If sum is not null, and the incoming value is not null, then perform the addition along + * with the overflow checking. Note, that if overflow occurs, then sum will be null here. 
+ * If the new incoming value is null, we will keep the sum in buffer as is and skip this + * incoming null + */ override lazy val updateExpressions: Seq[Expression] = { if (child.nullable) { - Seq( -/* sum = */ -coalesce(coalesce(sum, zero) + child.cast(sumDataType), sum) - ) + resultType match { +case d: DecimalType => + Seq( +/* sum */ +If(IsNull(sum), sum, + If(IsNotNull(child.cast(sumDataType)), +CheckOverflow(sum + child.cast(sumDataType), d, true), sum)), +/* isEmptyOrNulls */ +If(isEmptyOrNulls, IsNull(child.cast(sumDataType)), isEmptyOrNulls) + ) +case _ => + Seq( +coalesce(sum + child.cast(sumDataType), sum), +If(isEmptyOrNulls, IsNull(child.cast(sumDataType)), isEmptyOrNulls) + ) + } } else { - Seq( -/* sum = */ -coalesce(sum, zero) + child.cast(sumDataType) - ) + resultType match { +case d: DecimalType => + Seq( +/* sum */ +If(IsNull(sum), sum, CheckOverflow(sum + child.cast(sumDataType), d, true)), +/* isEmptyOrNulls */ +false + ) +case _ => Seq(sum + child.cast(sumDataType), false) + } } } + /** + * For decimal type: + * update of the sum is as follows: + * Check if either portion of the left.sum or right.sum has overflowed + * If it has, then the sum value will remain null. + * If it did not have overflow, then add the sum.left and sum.right and check for overflow. Review comment: We don't need to check overflow here. We can do it at the end. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun commented on pull request #27803: [SPARK-31049][SQL] Support nested adjacent generators, e.g., explode(explode(v))
dongjoon-hyun commented on pull request #27803: URL: https://github.com/apache/spark/pull/27803#issuecomment-620444208 Thank you, @maropu .
[GitHub] [spark] cloud-fan commented on a change in pull request #27627: [WIP][SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow
cloud-fan commented on a change in pull request #27627: URL: https://github.com/apache/spark/pull/27627#discussion_r416407135 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala ## @@ -62,38 +62,113 @@ case class Sum(child: Expression) extends DeclarativeAggregate with ImplicitCast private lazy val sum = AttributeReference("sum", sumDataType)() + private lazy val isEmptyOrNulls = AttributeReference("isEmptyOrNulls", BooleanType, false)() + private lazy val zero = Literal.default(sumDataType) - override lazy val aggBufferAttributes = sum :: Nil + override lazy val aggBufferAttributes = sum :: isEmptyOrNulls :: Nil override lazy val initialValues: Seq[Expression] = Seq( -/* sum = */ Literal.create(null, sumDataType) +/* sum = */ zero, +/* isEmptyOrNulls = */ Literal.create(true, BooleanType) ) + /** + * For decimal types and when child is nullable: + * isEmptyOrNulls flag is a boolean to represent if there are no rows or if all rows that + * have been seen are null. This will be used to identify if the end result of sum in + * evaluateExpression should be null or not. + * + * Update of the isEmptyOrNulls flag: + * If this flag is false, then keep it as is. + * If this flag is true, then check if the incoming value is null and if it is null, keep it + * as true else update it to false. + * Once this flag is switched to false, it will remain false. + * + * The update of the sum is as follows: + * If sum is null, then we have a case of overflow, so keep sum as is. + * If sum is not null, and the incoming value is not null, then perform the addition along + * with the overflow checking. Note, that if overflow occurs, then sum will be null here. 
+ * If the new incoming value is null, we will keep the sum in buffer as is and skip this + * incoming null + */ override lazy val updateExpressions: Seq[Expression] = { if (child.nullable) { - Seq( -/* sum = */ -coalesce(coalesce(sum, zero) + child.cast(sumDataType), sum) - ) + resultType match { +case d: DecimalType => + Seq( +/* sum */ +If(IsNull(sum), sum, + If(IsNotNull(child.cast(sumDataType)), +CheckOverflow(sum + child.cast(sumDataType), d, true), sum)), +/* isEmptyOrNulls */ +If(isEmptyOrNulls, IsNull(child.cast(sumDataType)), isEmptyOrNulls) + ) +case _ => + Seq( +coalesce(sum + child.cast(sumDataType), sum), +If(isEmptyOrNulls, IsNull(child.cast(sumDataType)), isEmptyOrNulls) + ) + } } else { - Seq( -/* sum = */ -coalesce(sum, zero) + child.cast(sumDataType) - ) + resultType match { +case d: DecimalType => + Seq( +/* sum */ +If(IsNull(sum), sum, CheckOverflow(sum + child.cast(sumDataType), d, true)), +/* isEmptyOrNulls */ +false + ) +case _ => Seq(sum + child.cast(sumDataType), false) + } } } + /** + * For decimal type: + * update of the sum is as follows: + * Check if either portion of the left.sum or right.sum has overflowed Review comment: we should explain how we check overflow: the `sum` is null and `isEmpty` is false. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
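The overflow test named in this review comment (sum is null while the empty flag is false) and the merge rule from the diff can be sketched as follows. This is a simplified Python model under stated assumptions, not Spark code: `has_overflowed`, `merge`, `check_overflow`, and the 5-digit precision are hypothetical, and `None` models SQL null.

```python
from decimal import Decimal
from typing import Optional, Tuple

# Partial aggregation buffer: (sum, is_empty_or_nulls). None models SQL null.
Buffer = Tuple[Optional[Decimal], bool]


def check_overflow(value: Decimal, precision: int) -> Optional[Decimal]:
    """Hypothetical stand-in for CheckOverflow with null-on-overflow."""
    limit = Decimal(10) ** precision
    return value if abs(value) < limit else None


def has_overflowed(buf: Buffer) -> bool:
    # A null sum only signals overflow if at least one non-null row was seen.
    total, is_empty_or_nulls = buf
    return total is None and not is_empty_or_nulls


def merge(left: Buffer, right: Buffer, precision: int = 5) -> Buffer:
    if has_overflowed(left) or has_overflowed(right):
        merged_sum = None                       # overflow is sticky
    elif left[0] is None or right[0] is None:
        merged_sum = None                       # null + x is null in SQL
    else:
        merged_sum = check_overflow(left[0] + right[0], precision)
    # The merged flag is true only if both sides saw no non-null rows.
    return (merged_sum, left[1] and right[1])
```

The flag is what separates the two meanings of a null sum, so the merge has to consult it before adding the partial sums.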
[GitHub] [spark] AmplabJenkins commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins commented on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620442764
[GitHub] [spark] SparkQA commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
SparkQA commented on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620442254 **[Test build #121978 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121978/testReport)** for PR 28386 at commit [`abc9bd6`](https://github.com/apache/spark/commit/abc9bd6a1f02796be4940aac228c76f96cd9b49a).
[GitHub] [spark] sathyaprakashg commented on pull request #28222: SPARK-31447 Fix issue in ExtractIntervalPart expression
sathyaprakashg commented on pull request #28222: URL: https://github.com/apache/spark/pull/28222#issuecomment-620441968 > @sathyaprakashg Please, take a look at the PRs > #26337 > #27262 Thanks @MaxGekk for the prompt reply. The CalendarInterval change is not required to fix the issue, so I can revert it. How does my proposed change for ExtractIntervalPart look? If it looks good, I will update my PR to include only the ExtractIntervalPart change. We need to change ExtractIntervalPart so that the query below returns 14 instead of 0. Please refer to _Why are the changes needed?_ for more information. SELECT EXTRACT(DAY FROM (cast('2020-01-15 00:00:00' as timestamp) - cast('2020-01-01 00:00:00' as timestamp)))
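For reference, the expected value of 14 is the whole-day gap between the two timestamps in the query. A minimal check using Python's standard `datetime` module (this is plain date arithmetic, not Spark and not the ExtractIntervalPart code path; the variable names are illustrative):

```python
from datetime import datetime

# The two timestamps from the query in the comment above.
start = datetime(2020, 1, 1, 0, 0, 0)
end = datetime(2020, 1, 15, 0, 0, 0)

# Subtracting two datetimes yields a timedelta; .days is its whole-day part.
interval = end - start
days = interval.days
print(days)
```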
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins removed a comment on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620439117 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/121975/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28349: [SPARK-30642][ML][PYSPARK] LinearSVC blockify input vectors
AmplabJenkins removed a comment on pull request #28349: URL: https://github.com/apache/spark/pull/28349#issuecomment-620439260
[GitHub] [spark] SparkQA removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
SparkQA removed a comment on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620438619 **[Test build #121975 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121975/testReport)** for PR 28386 at commit [`155543e`](https://github.com/apache/spark/commit/155543ee0a1ae91469240fef3f9edb67d0d5a998).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins removed a comment on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620439110
[GitHub] [spark] SparkQA commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
SparkQA commented on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620439095 **[Test build #121975 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121975/testReport)** for PR 28386 at commit [`155543e`](https://github.com/apache/spark/commit/155543ee0a1ae91469240fef3f9edb67d0d5a998). * This patch **fails RAT tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins commented on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620439205
[GitHub] [spark] AmplabJenkins commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins commented on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620439110
[GitHub] [spark] AmplabJenkins commented on pull request #28349: [SPARK-30642][ML][PYSPARK] LinearSVC blockify input vectors
AmplabJenkins commented on pull request #28349: URL: https://github.com/apache/spark/pull/28349#issuecomment-620439260
[GitHub] [spark] SparkQA commented on pull request #28349: [SPARK-30642][ML][PYSPARK] LinearSVC blockify input vectors
SparkQA commented on pull request #28349: URL: https://github.com/apache/spark/pull/28349#issuecomment-620438687 **[Test build #121976 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121976/testReport)** for PR 28349 at commit [`a97a8fc`](https://github.com/apache/spark/commit/a97a8fc0058e73e180cd69ce9f9df5a11c6bc03c).
[GitHub] [spark] SparkQA commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
SparkQA commented on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620438619 **[Test build #121975 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121975/testReport)** for PR 28386 at commit [`155543e`](https://github.com/apache/spark/commit/155543ee0a1ae91469240fef3f9edb67d0d5a998).
[GitHub] [spark] SparkQA commented on pull request #27978: [SPARK-31127][ML] Implement abstract Selector
SparkQA commented on pull request #27978: URL: https://github.com/apache/spark/pull/27978#issuecomment-620438664 **[Test build #121977 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121977/testReport)** for PR 27978 at commit [`e5a19c1`](https://github.com/apache/spark/commit/e5a19c19f970868ddc7a95d4d23fe5fa910b33f1).
[GitHub] [spark] cloud-fan commented on a change in pull request #27627: [WIP][SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow
cloud-fan commented on a change in pull request #27627:
URL: https://github.com/apache/spark/pull/27627#discussion_r416399567

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala
## @@ -62,38 +62,113 @@ case class Sum(child: Expression) extends DeclarativeAggregate with ImplicitCast
   private lazy val sum = AttributeReference("sum", sumDataType)()
+  private lazy val isEmptyOrNulls = AttributeReference("isEmptyOrNulls", BooleanType, false)()
+  private lazy val zero = Literal.default(sumDataType)
-  override lazy val aggBufferAttributes = sum :: Nil
+  override lazy val aggBufferAttributes = sum :: isEmptyOrNulls :: Nil

Review comment:
we only need to add it to the buffer attributes for decimal type.
[GitHub] [spark] cloud-fan commented on a change in pull request #27627: [WIP][SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow
cloud-fan commented on a change in pull request #27627:
URL: https://github.com/apache/spark/pull/27627#discussion_r416398914

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Sum.scala
## @@ -62,38 +62,113 @@ case class Sum(child: Expression) extends DeclarativeAggregate with ImplicitCast
   private lazy val sum = AttributeReference("sum", sumDataType)()
+  private lazy val isEmptyOrNulls = AttributeReference("isEmptyOrNulls", BooleanType, false)()
+  private lazy val zero = Literal.default(sumDataType)
-  override lazy val aggBufferAttributes = sum :: Nil
+  override lazy val aggBufferAttributes = sum :: isEmptyOrNulls :: Nil
   override lazy val initialValues: Seq[Expression] = Seq(
-/* sum = */ Literal.create(null, sumDataType)
+/* sum = */ zero,
+/* isEmptyOrNulls = */ Literal.create(true, BooleanType)
   )
+  /**
+   * For decimal types and when child is nullable:
+   * isEmptyOrNulls flag is a boolean to represent if there are no rows or if all rows that
+   * have been seen are null. This will be used to identify if the end result of sum in
+   * evaluateExpression should be null or not.
+   *
+   * Update of the isEmptyOrNulls flag:
+   * If this flag is false, then keep it as is.
+   * If this flag is true, then check if the incoming value is null and if it is null, keep it
+   * as true else update it to false.
+   * Once this flag is switched to false, it will remain false.
+   *
+   * The update of the sum is as follows:
+   * If sum is null, then we have a case of overflow, so keep sum as is.
+   * If sum is not null, and the incoming value is not null, then perform the addition along
+   * with the overflow checking. Note, that if overflow occurs, then sum will be null here.

Review comment:
Is it really necessary? We can let it overflow, and it will become null when we write it out to shuffle files.
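For readers following the review thread, the update rules in the quoted doc comment can be sketched in plain Scala. This is only an illustration under stated assumptions, not the code in the PR: `SumBuffer`, `DecimalSumSketch`, `run` and the `maxPrecision` parameter are hypothetical names, `None` stands in for a null value, the overflow test is simplified to a digit-count check on `java.math.BigDecimal`, and the `evaluate` rule is inferred from the comment's statement that the flag decides whether the final result is null.

```scala
import java.math.{BigDecimal => JBigDecimal}

// Hypothetical stand-in for the two-field aggregation buffer; not Spark's classes.
// sum == None models a null sum, i.e. an overflow that has already happened.
final case class SumBuffer(sum: Option[JBigDecimal], isEmptyOrNulls: Boolean)

object DecimalSumSketch {
  // Mirrors the initial values in the diff: sum = zero, isEmptyOrNulls = true.
  val initial: SumBuffer = SumBuffer(Some(JBigDecimal.ZERO), isEmptyOrNulls = true)

  // maxPrecision is an assumed, simplified overflow limit (maximum number of digits).
  def update(buf: SumBuffer, input: Option[JBigDecimal], maxPrecision: Int): SumBuffer = {
    // The flag stays true only while every value seen so far is null;
    // once it is false it remains false.
    val flag = buf.isEmptyOrNulls && input.isEmpty
    val sum = (buf.sum, input) match {
      case (None, _)       => None    // sum already null: overflow, keep as is
      case (current, None) => current // null input leaves the sum unchanged
      case (Some(s), Some(v)) =>
        val added = s.add(v)          // exact addition, no rounding
        if (added.precision() > maxPrecision) None else Some(added)
    }
    SumBuffer(sum, flag)
  }

  // Assumed final step: null for empty or all-null input, otherwise the sum,
  // which is itself null (None) if an overflow occurred.
  def evaluate(buf: SumBuffer): Option[JBigDecimal] =
    if (buf.isEmptyOrNulls) None else buf.sum

  def run(inputs: Seq[Option[JBigDecimal]], maxPrecision: Int): Option[JBigDecimal] =
    evaluate(inputs.foldLeft(initial)((b, in) => update(b, in, maxPrecision)))
}
```

With this sketch, an empty input and an all-null input both evaluate to `None`, and adding 600 to 600 under `maxPrecision = 3` also yields `None` because the result needs four digits. The flag is what separates "no non-null rows" from "overflowed", which is the distinction the review comment is questioning.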
[GitHub] [spark] viirya commented on pull request #28376: [SPARK-31582] [Yarn] Being able to not populate Hadoop classpath
viirya commented on pull request #28376: URL: https://github.com/apache/spark/pull/28376#issuecomment-620437448 One question about the case mentioned in the description, "One case we have is when a user uses an Apache Spark distribution with its-own embedded hadoop, and submits a job to Cloudera or Hortonworks Yarn clusters": since the embedded Hadoop is incompatible with the cluster, is it generally okay to submit the app there? Even if you don't populate the classpath, will RPC or the protocol be a problem?
[GitHub] [spark] AmplabJenkins removed a comment on pull request #27978: [SPARK-31127][ML] Implement abstract Selector
AmplabJenkins removed a comment on pull request #27978: URL: https://github.com/apache/spark/pull/27978#issuecomment-620435924
[GitHub] [spark] AmplabJenkins removed a comment on pull request #26339: [SPARK-27194][SPARK-29302][SQL] For dynamic partition overwrite operation, fix speculation task conflict issue and FileAlreadyE
AmplabJenkins removed a comment on pull request #26339: URL: https://github.com/apache/spark/pull/26339#issuecomment-620435969
[GitHub] [spark] AmplabJenkins commented on pull request #26339: [SPARK-27194][SPARK-29302][SQL] For dynamic partition overwrite operation, fix speculation task conflict issue and FileAlreadyExistsExc
AmplabJenkins commented on pull request #26339: URL: https://github.com/apache/spark/pull/26339#issuecomment-620435969
[GitHub] [spark] AmplabJenkins commented on pull request #27978: [SPARK-31127][ML] Implement abstract Selector
AmplabJenkins commented on pull request #27978: URL: https://github.com/apache/spark/pull/27978#issuecomment-620435924
[GitHub] [spark] SparkQA commented on pull request #26339: [SPARK-27194][SPARK-29302][SQL] For dynamic partition overwrite operation, fix speculation task conflict issue and FileAlreadyExistsException
SparkQA commented on pull request #26339: URL: https://github.com/apache/spark/pull/26339#issuecomment-620435452 **[Test build #121974 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121974/testReport)** for PR 26339 at commit [`7211e27`](https://github.com/apache/spark/commit/7211e27be13c6c99c18dcf036bedf09dd7340bd1).
[GitHub] [spark] MaxGekk commented on pull request #28222: SPARK-31447 Fix issue in ExtractIntervalPart expression
MaxGekk commented on pull request #28222: URL: https://github.com/apache/spark/pull/28222#issuecomment-620432117 @sathyaprakashg Please, take a look at the PRs https://github.com/apache/spark/pull/26337 https://github.com/apache/spark/pull/27262
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins removed a comment on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620429261 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/121971/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
SparkQA removed a comment on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620428768 **[Test build #121971 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121971/testReport)** for PR 28386 at commit [`e155c0d`](https://github.com/apache/spark/commit/e155c0d631f61b6d84d3f44040ae07d7ff55ec54).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28381: [SPARK-31586][SQL] Replace expression TimeSub(l, r) with TimeAdd(l -r)
AmplabJenkins removed a comment on pull request #28381: URL: https://github.com/apache/spark/pull/28381#issuecomment-620429356
[GitHub] [spark] AmplabJenkins commented on pull request #28387: [SPARK-26924][R] remove requireNamespace1 workaround for arrow
AmplabJenkins commented on pull request #28387: URL: https://github.com/apache/spark/pull/28387#issuecomment-620429212
[GitHub] [spark] SparkQA commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
SparkQA commented on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620429239 **[Test build #121971 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121971/testReport)** for PR 28386 at commit [`e155c0d`](https://github.com/apache/spark/commit/e155c0d631f61b6d84d3f44040ae07d7ff55ec54). * This patch **fails RAT tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins commented on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620429256
[GitHub] [spark] AmplabJenkins commented on pull request #28194: [SPARK-31372][SQL][TEST] Display expression schema for double check.
AmplabJenkins commented on pull request #28194: URL: https://github.com/apache/spark/pull/28194#issuecomment-620429379
[GitHub] [spark] AmplabJenkins commented on pull request #28388: [SPARK-31553][SQL] Revert "[SPARK-29048] Improve performance on Column.isInCollection() with a large size collection"
AmplabJenkins commented on pull request #28388: URL: https://github.com/apache/spark/pull/28388#issuecomment-620429270
[GitHub] [spark] AmplabJenkins commented on pull request #28381: [SPARK-31586][SQL] Replace expression TimeSub(l, r) with TimeAdd(l -r)
AmplabJenkins commented on pull request #28381: URL: https://github.com/apache/spark/pull/28381#issuecomment-620429356
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28387: [SPARK-26924][R] remove requireNamespace1 workaround for arrow
AmplabJenkins removed a comment on pull request #28387: URL: https://github.com/apache/spark/pull/28387#issuecomment-620429212
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
AmplabJenkins removed a comment on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620429256
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28388: [SPARK-31553][SQL] Revert "[SPARK-29048] Improve performance on Column.isInCollection() with a large size collection"
AmplabJenkins removed a comment on pull request #28388: URL: https://github.com/apache/spark/pull/28388#issuecomment-620429270
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28194: [SPARK-31372][SQL][TEST] Display expression schema for double check.
AmplabJenkins removed a comment on pull request #28194: URL: https://github.com/apache/spark/pull/28194#issuecomment-620429379
[GitHub] [spark] SparkQA commented on pull request #28388: [SPARK-31553][SQL] Revert "[SPARK-29048] Improve performance on Column.isInCollection() with a large size collection"
[GitHub] [spark] MaxGekk commented on pull request #28388: [SPARK-31553][SQL] Revert "[SPARK-29048] Improve performance on Column.isInCollection() with a large size collection"
MaxGekk commented on pull request #28388: URL: https://github.com/apache/spark/pull/28388#issuecomment-620428658 also cc @WeichenXu123 @dongjoon-hyun @maropu
[GitHub] [spark] SparkQA commented on pull request #28387: [SPARK-26924][R] remove requireNamespace1 workaround for arrow
SparkQA commented on pull request #28387: URL: https://github.com/apache/spark/pull/28387#issuecomment-620428791 **[Test build #121970 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121970/testReport)** for PR 28387 at commit [`5c255bb`](https://github.com/apache/spark/commit/5c255bbf8ca03cda7b60ae6c99712e803e56078d).
[GitHub] [spark] SparkQA commented on pull request #28194: [SPARK-31372][SQL][TEST] Display expression schema for double check.
SparkQA commented on pull request #28194: URL: https://github.com/apache/spark/pull/28194#issuecomment-620428775 **[Test build #121973 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121973/testReport)** for PR 28194 at commit [`9803e4a`](https://github.com/apache/spark/commit/9803e4a2f1ebfd10c7054bd0a16f2521528d81c7).
[GitHub] [spark] sathyaprakashg commented on pull request #28222: SPARK-31447 Fix issue in ExtractIntervalPart expression
sathyaprakashg commented on pull request #28222: URL: https://github.com/apache/spark/pull/28222#issuecomment-620428630 @cloud-fan @MaxGekk @yaooqinn I am looking for help reviewing this PR, which was created 2 weeks ago. Since you were involved in a PR related to a similar change (https://issues.apache.org/jira/browse/SPARK-31469), I am tagging you to see if you can help review it. Since it is my first PR, please bear with me if I missed anything. I am happy to get guidance to improve it.
[GitHub] [spark] SparkQA commented on pull request #28386: [SPARK-26199][SPARK-31517][R] fix strategy for handling ... names in mutate
SparkQA commented on pull request #28386: URL: https://github.com/apache/spark/pull/28386#issuecomment-620428768 **[Test build #121971 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121971/testReport)** for PR 28386 at commit [`e155c0d`](https://github.com/apache/spark/commit/e155c0d631f61b6d84d3f44040ae07d7ff55ec54).
[GitHub] [spark] SparkQA commented on pull request #28381: [SPARK-31586][SQL] Replace expression TimeSub(l, r) with TimeAdd(l -r)
SparkQA commented on pull request #28381: URL: https://github.com/apache/spark/pull/28381#issuecomment-620428796 **[Test build #121972 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/121972/testReport)** for PR 28381 at commit [`17b0438`](https://github.com/apache/spark/commit/17b0438fc95b4306662c122f255ab0e9e5337425).