[GitHub] [spark] AmplabJenkins removed a comment on pull request #30908: [SPARK-33892][SQL] Display char/varchar in DESC and SHOW CREATE TABLE
AmplabJenkins removed a comment on pull request #30908: URL: https://github.com/apache/spark/pull/30908#issuecomment-750790543 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/12/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
AmplabJenkins removed a comment on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-750790541 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37935/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30898: [SPARK-33884][SQL] Simplify CaseWhen when one clause is null and another is boolean
AmplabJenkins removed a comment on pull request #30898: URL: https://github.com/apache/spark/pull/30898#issuecomment-750790542 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37933/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
AmplabJenkins removed a comment on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750794254 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37934/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30865: [SPARK-33861][SQL] Simplify conditional in predicate
AmplabJenkins removed a comment on pull request #30865: URL: https://github.com/apache/spark/pull/30865#issuecomment-750790540 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37932/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30905: [SPARK-33890][SQL] Improve the implement of trim/trimleft/trimright
AmplabJenkins removed a comment on pull request #30905: URL: https://github.com/apache/spark/pull/30905#issuecomment-750778086 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/13/
[GitHub] [spark] SparkQA commented on pull request #30869: [SPARK-33865][SQL] When HiveDDL, we need check avro schema too
SparkQA commented on pull request #30869: URL: https://github.com/apache/spark/pull/30869#issuecomment-750795522 **[Test build #133348 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133348/testReport)** for PR 30869 at commit [`f42275f`](https://github.com/apache/spark/commit/f42275f890aee250ed93df35b02a3ad636a78435).
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #30869: [SPARK-33865][SQL] When HiveDDL, we need check avro schema too
AngersZhuuuu commented on a change in pull request #30869: URL: https://github.com/apache/spark/pull/30869#discussion_r548436302

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala
## @@ -920,9 +921,12 @@ object DDLUtils {
           serde == Some("parquet.hive.serde.ParquetHiveSerDe") ||
           serde == Some("org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe")) {
           ParquetSchemaConverter.checkFieldNames(colNames)
+        } else if (serde == HiveSerDe.sourceToSerDe("avro").get.serde) {
+          AvroFileFormat.checkFieldNames(colNames)
         }
       case "parquet" => ParquetSchemaConverter.checkFieldNames(colNames)
       case "orc" => OrcFileFormat.checkFieldNames(colNames)
+      case "avro" => AvroFileFormat.checkFieldNames(colNames)

Review comment: Updated per your comment, but many tests failed with:

```
org.apache.spark.sql.AnalysisException: Failed to find data source: avro. Avro is built-in but external data source module since Spark 2.4. Please deploy the application as per the deployment section of "Apache Avro Data Source Guide".
```

Should we then move all these test cases to the Avro module?
[GitHub] [spark] AmplabJenkins commented on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
AmplabJenkins commented on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750794254 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37934/
[GitHub] [spark] SparkQA commented on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
SparkQA commented on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750794245 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37934/
[GitHub] [spark] beliefer commented on pull request #30870: [SPARK-33542][SQL] Group exception messages in catalyst/catalog
beliefer commented on pull request #30870: URL: https://github.com/apache/spark/pull/30870#issuecomment-750792243 cc @HyukjinKwon
[GitHub] [spark] beliefer commented on pull request #30905: [SPARK-33890][SQL] Improve the implement of trim/trimleft/trimright
beliefer commented on pull request #30905: URL: https://github.com/apache/spark/pull/30905#issuecomment-750792203 cc @cloud-fan @wangyum
[GitHub] [spark] SparkQA commented on pull request #30908: [SPARK-33892][SQL] Display char/varchar in DESC and SHOW CREATE TABLE
SparkQA commented on pull request #30908: URL: https://github.com/apache/spark/pull/30908#issuecomment-750791799 **[Test build #133347 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133347/testReport)** for PR 30908 at commit [`057f1ff`](https://github.com/apache/spark/commit/057f1ffe9002df782555b6d7d7eebc5b50521c86).
[GitHub] [spark] cloud-fan closed pull request #30914: [SPARK-33895][SQL] Char and Varchar fail in MetaOperation of ThriftServer
cloud-fan closed pull request #30914: URL: https://github.com/apache/spark/pull/30914
[GitHub] [spark] cloud-fan commented on pull request #30914: [SPARK-33895][SQL] Char and Varchar fail in MetaOperation of ThriftServer
cloud-fan commented on pull request #30914: URL: https://github.com/apache/spark/pull/30914#issuecomment-750790913 thanks, merging to master/3.1!
[GitHub] [spark] AmplabJenkins commented on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
AmplabJenkins commented on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-750790541 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37935/
[GitHub] [spark] AmplabJenkins commented on pull request #30898: [SPARK-33884][SQL] Simplify CaseWhen when one clause is null and another is boolean
AmplabJenkins commented on pull request #30898: URL: https://github.com/apache/spark/pull/30898#issuecomment-750790542 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37933/
[GitHub] [spark] AmplabJenkins commented on pull request #30865: [SPARK-33861][SQL] Simplify conditional in predicate
AmplabJenkins commented on pull request #30865: URL: https://github.com/apache/spark/pull/30865#issuecomment-750790540 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37932/
[GitHub] [spark] AmplabJenkins commented on pull request #30908: [SPARK-33892][SQL] Display char/varchar in DESC and SHOW CREATE TABLE
AmplabJenkins commented on pull request #30908: URL: https://github.com/apache/spark/pull/30908#issuecomment-750790543 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/12/
[GitHub] [spark] SparkQA commented on pull request #30893: [SPARK-33881][SQL][TESTS] Check null and empty string as partition values in DS v1 and v2 tests
SparkQA commented on pull request #30893: URL: https://github.com/apache/spark/pull/30893#issuecomment-750789727 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37937/
[GitHub] [spark] SparkQA removed a comment on pull request #30908: [SPARK-33892][SQL] Display char/varchar in DESC and SHOW CREATE TABLE
SparkQA removed a comment on pull request #30908: URL: https://github.com/apache/spark/pull/30908#issuecomment-750713271 **[Test build #12 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12/testReport)** for PR 30908 at commit [`2adf7f9`](https://github.com/apache/spark/commit/2adf7f94a1b0b2f64525f0062c4916e916176afb).
[GitHub] [spark] SparkQA commented on pull request #30908: [SPARK-33892][SQL] Display char/varchar in DESC and SHOW CREATE TABLE
SparkQA commented on pull request #30908: URL: https://github.com/apache/spark/pull/30908#issuecomment-750788108 **[Test build #12 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12/testReport)** for PR 30908 at commit [`2adf7f9`](https://github.com/apache/spark/commit/2adf7f94a1b0b2f64525f0062c4916e916176afb). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #30898: [SPARK-33884][SQL] Simplify CaseWhen when one clause is null and another is boolean
SparkQA commented on pull request #30898: URL: https://github.com/apache/spark/pull/30898#issuecomment-750786814 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37933/
[GitHub] [spark] SparkQA commented on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
SparkQA commented on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750785864 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37934/
[GitHub] [spark] viirya commented on pull request #30812: [SPARK-33814][SS] Provide preferred locations for stateful operations without reported state store locations
viirya commented on pull request #30812: URL: https://github.com/apache/spark/pull/30812#issuecomment-750781778

> I'm wondering whether this would solve the issue. There are so many factors involved. I feel the initial even task distribution (assuming the executors are free so the task scheduler will respect the preferred locations) would become uneven quickly after some micro batches, caused by, such as, uneven partitions, executor lost, concurrent queries, etc... Did you verify that this would make a real workload that didn't work before become working?

While running stateful streaming queries recently, I ran into trouble caused by bad initial locations of the state stores. I agree that this is not ideal, but it solves the problem I saw while testing SS recently. It is simple and shouldn't cause any negative impact or regression for SS queries. Combined with the task locality configuration, it makes SS queries more stable in my local tests.

> If the issue is about memory, a low memory state store implementation such as https://github.com/qubole/spark-state-store is a better solution.

The problem is that we have a built-in in-memory store, and I don't think a RocksDB-based state store is the answer in all cases. Even with a RocksDB-based store, is it good to have skewed stores on a few executors? Local disk space might then become the next issue.
[GitHub] [spark] SparkQA commented on pull request #30865: [SPARK-33861][SQL] Simplify conditional in predicate
SparkQA commented on pull request #30865: URL: https://github.com/apache/spark/pull/30865#issuecomment-750779938 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37932/
[GitHub] [spark] AmplabJenkins commented on pull request #30905: [SPARK-33890][SQL] Improve the implement of trim/trimleft/trimright
AmplabJenkins commented on pull request #30905: URL: https://github.com/apache/spark/pull/30905#issuecomment-750778086 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/13/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30885: [SPARK-33659][SS] Document the current behavior for DataStreamWriter.toTable API
AmplabJenkins removed a comment on pull request #30885: URL: https://github.com/apache/spark/pull/30885#issuecomment-750775594 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/14/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
AmplabJenkins removed a comment on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-750775294 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37931/
[GitHub] [spark] SparkQA removed a comment on pull request #30885: [SPARK-33659][SS] Document the current behavior for DataStreamWriter.toTable API
SparkQA removed a comment on pull request #30885: URL: https://github.com/apache/spark/pull/30885#issuecomment-750712512 **[Test build #14 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14/testReport)** for PR 30885 at commit [`eff4b9d`](https://github.com/apache/spark/commit/eff4b9d9acfece6c2e7fae4111ce106fb282d13e).
[GitHub] [spark] SparkQA removed a comment on pull request #30905: [SPARK-33890][SQL] Improve the implement of trim/trimleft/trimright
SparkQA removed a comment on pull request #30905: URL: https://github.com/apache/spark/pull/30905#issuecomment-750712497 **[Test build #13 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13/testReport)** for PR 30905 at commit [`21fb9b9`](https://github.com/apache/spark/commit/21fb9b9f1d3ed725b6700421c45c2a2d14d26612).
[GitHub] [spark] SparkQA commented on pull request #30905: [SPARK-33890][SQL] Improve the implement of trim/trimleft/trimright
SparkQA commented on pull request #30905: URL: https://github.com/apache/spark/pull/30905#issuecomment-750777478 **[Test build #13 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13/testReport)** for PR 30905 at commit [`21fb9b9`](https://github.com/apache/spark/commit/21fb9b9f1d3ed725b6700421c45c2a2d14d26612). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #30893: [SPARK-33881][SQL][TESTS] Check null and empty string as partition values in DS v1 and v2 tests
SparkQA commented on pull request #30893: URL: https://github.com/apache/spark/pull/30893#issuecomment-750776425 **[Test build #133346 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133346/testReport)** for PR 30893 at commit [`4e0ca6c`](https://github.com/apache/spark/commit/4e0ca6c203c7624cf2f5b95c0ed88b66fa6ccd41).
[GitHub] [spark] SparkQA commented on pull request #30898: [SPARK-33884][SQL] Simplify CaseWhen when one clause is null and another is boolean
SparkQA commented on pull request #30898: URL: https://github.com/apache/spark/pull/30898#issuecomment-750776236 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37933/
[GitHub] [spark] AmplabJenkins commented on pull request #30885: [SPARK-33659][SS] Document the current behavior for DataStreamWriter.toTable API
AmplabJenkins commented on pull request #30885: URL: https://github.com/apache/spark/pull/30885#issuecomment-750775594 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/14/
[GitHub] [spark] AmplabJenkins commented on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
AmplabJenkins commented on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-750775294 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37931/
[GitHub] [spark] SparkQA commented on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
SparkQA commented on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-750775283 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37931/
[GitHub] [spark] SparkQA commented on pull request #30885: [SPARK-33659][SS] Document the current behavior for DataStreamWriter.toTable API
SparkQA commented on pull request #30885: URL: https://github.com/apache/spark/pull/30885#issuecomment-750774931 **[Test build #14 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14/testReport)** for PR 30885 at commit [`eff4b9d`](https://github.com/apache/spark/commit/eff4b9d9acfece6c2e7fae4111ce106fb282d13e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
SparkQA commented on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750773611 **[Test build #133343 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133343/testReport)** for PR 30912 at commit [`0461b8d`](https://github.com/apache/spark/commit/0461b8da388931891c504308c2e404abcff7195f). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
SparkQA commented on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-750772861 **[Test build #133345 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133345/testReport)** for PR 30387 at commit [`4f18319`](https://github.com/apache/spark/commit/4f183190df6a324364b5f895acaacbfebc16b7ca). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #30893: [SPARK-33881][SQL][TESTS] Check null and empty string as partition values in DS v1 and v2 tests
SparkQA commented on pull request #30893: URL: https://github.com/apache/spark/pull/30893#issuecomment-750772639 **[Test build #133344 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133344/testReport)** for PR 30893 at commit [`fe2a4f8`](https://github.com/apache/spark/commit/fe2a4f86e3401a80340fc3dbb1549ff50af9d471). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
AmplabJenkins removed a comment on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750771518 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
AmplabJenkins commented on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750771519 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] zsxwing commented on pull request #30812: [SPARK-33814][SS] Provide preferred locations for stateful operations without reported state store locations
zsxwing commented on pull request #30812: URL: https://github.com/apache/spark/pull/30812#issuecomment-750770232 I'm wondering whether this would solve the issue. There are so many factors involved. I feel the initial even task distribution (assuming the executors are free so the task scheduler will respect the preferred locations) would quickly become uneven after a few micro-batches because of factors such as uneven partitions, lost executors, and concurrent queries. Did you verify that this makes a real workload work that did not work before? If the issue is about memory, a low memory state store implementation such as https://github.com/qubole/spark-state-store is a better solution. > If we want to draw the ideal picture here, IMO my ideal picture is to pin executors and force these executors to serve these stateful tasks on the lifetime of the query. It's ideal to guarantee these stateful tasks never have to reload the state unless crash. This would not be ideal if the application runs multiple queries where batch and streaming are mixed and streaming queries have longer trigger interval hence the chance to be idle. Either the query should wait to be assigned to the executor, or executor should be allowed to be idle for the query. I think the answer to your ideal picture is continuous processing. Although we have not invested resources in continuous processing for a while, it's unlikely we would pin executors in micro-batch mode, since micro-batch trades latency for throughput. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #30865: [SPARK-33861][SQL] Simplify conditional in predicate
SparkQA commented on pull request #30865: URL: https://github.com/apache/spark/pull/30865#issuecomment-750767838 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37932/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] saikocat commented on pull request #30902: [SPARK-33888][SQL] Add support for TimeMillis logicalType to TimestampType
saikocat commented on pull request #30902: URL: https://github.com/apache/spark/pull/30902#issuecomment-750766426 Noted with thanks. Should I just make a change/fix to `JdbcUtils.scala` then? Should be an easy fix if we choose to go with `IntegerType`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
SparkQA commented on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-750765191 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37931/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on pull request #30902: [SPARK-33888][SQL] Add support for TimeMillis logicalType to TimestampType
cloud-fan commented on pull request #30902: URL: https://github.com/apache/spark/pull/30902#issuecomment-750764030 TIME is a standard SQL data type, which is not TIMESTAMP or INTERVAL. Unfortunately, Spark doesn't support this data type yet, and can only fail (or read as the physical type int). The JDBC mapping is wrong and we should fix it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
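Editorial sketch for the comment above, not part of the original thread: it illustrates why a TIME value is neither a TIMESTAMP nor an INTERVAL. It assumes the physical int holds milliseconds since midnight (the usual meaning of a time-millis logical type); `TimeMillisSketch` and `millisToLocalTime` are hypothetical names, not Spark APIs.

```scala
import java.time.LocalTime

object TimeMillisSketch {
  // Assumption: the physical int is the number of milliseconds since midnight.
  // A time of day has no date and no time zone, so it maps to LocalTime,
  // not to an instant on the timeline (TIMESTAMP) or a duration (INTERVAL).
  def millisToLocalTime(millisOfDay: Int): LocalTime = {
    require(millisOfDay >= 0 && millisOfDay < 24 * 60 * 60 * 1000,
      s"millisOfDay out of range: $millisOfDay")
    LocalTime.ofNanoOfDay(millisOfDay.toLong * 1000000L)
  }

  def main(args: Array[String]): Unit = {
    // 1 h + 2 min + 3 s + 4 ms after midnight
    println(millisToLocalTime(3723004))
  }
}
```

Reading the column as a plain int keeps this value intact; mapping it to a timestamp would attach a date it never had.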
[GitHub] [spark] viirya commented on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
viirya commented on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750761612 I will work on the JSON e2e tests in another PR. Thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
SparkQA removed a comment on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750486073 **[Test build #133324 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133324/testReport)** for PR 30912 at commit [`b500344`](https://github.com/apache/spark/commit/b500344c7a5b0bf72b9b868d3f7a6ff211a328bf). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
SparkQA commented on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750760830 **[Test build #133324 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133324/testReport)** for PR 30912 at commit [`b500344`](https://github.com/apache/spark/commit/b500344c7a5b0bf72b9b868d3f7a6ff211a328bf). * This patch **fails from timeout after a configured wait of `500m`**. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] viirya commented on a change in pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
viirya commented on a change in pull request #30912: URL: https://github.com/apache/spark/pull/30912#discussion_r548401034 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizeCsvExprsSuite.scala ## @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.optimizer + +import org.apache.spark.sql.catalyst.dsl.expressions._ +import org.apache.spark.sql.catalyst.dsl.plans._ +import org.apache.spark.sql.catalyst.expressions._ +import org.apache.spark.sql.catalyst.plans.PlanTest +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan} +import org.apache.spark.sql.catalyst.rules.RuleExecutor +import org.apache.spark.sql.internal.SQLConf +import org.apache.spark.sql.types._ + +class OptimizeCsvExprsSuite extends PlanTest with ExpressionEvalHelper { Review comment: Added e2e tests now. Thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
AmplabJenkins removed a comment on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750750934 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
AmplabJenkins removed a comment on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-750750932 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37928/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30865: [SPARK-33861][SQL] Simplify conditional in predicate
AmplabJenkins removed a comment on pull request #30865: URL: https://github.com/apache/spark/pull/30865#issuecomment-750750931 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/16/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
AmplabJenkins removed a comment on pull request #29966: URL: https://github.com/apache/spark/pull/29966#issuecomment-750750939 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/15/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] beliefer commented on a change in pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
beliefer commented on a change in pull request #30387: URL: https://github.com/apache/spark/pull/30387#discussion_r548399696 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala ## @@ -153,26 +153,24 @@ class FrameLessOffsetWindowFunctionFrame( extends OffsetWindowFunctionFrameBase( target, ordinal, expressions, inputSchema, newMutableProjection, offset) { - assert(expressions.toSeq.filterNot(_.input.isInstanceOf[Attribute]).isEmpty) - - /** The input expression of Lead/Lag. */ - private lazy val inputExpression = expressions.toSeq.map(_.input).head - - /** The index of input expression in the row. */ - private lazy val idx = inputAttrs.zipWithIndex.find(_._1 == inputExpression).map(_._2).head - /** Holder the UnsafeRow where the input operator by function is not null. */ private var nextSelectedRow = EmptyRow // The number of rows skipped to get the next UnsafeRow where the input operator by function // is not null. private var skippedNonNullCount = 0 + /** Create the projection to determine whether input is null. */ + private val project = UnsafeProjection.create(Seq(IsNull(expressions.head.input)), inputSchema) + + /** Check if the output value of the first index is null. */ + private val nullCheck: InternalRow => Boolean = row => project(row).getBoolean(0) Review comment: OK This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #30898: [SPARK-33884][SQL] Simplify CaseWhen when one clause is null and another is boolean
SparkQA commented on pull request #30898: URL: https://github.com/apache/spark/pull/30898#issuecomment-750756902 **[Test build #133342 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133342/testReport)** for PR 30898 at commit [`15c74ca`](https://github.com/apache/spark/commit/15c74ca9c63e80fb70e0d71ae89914c4b8f35272). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30881: [SPARK-33875][SQL] Implement DESCRIBE COLUMN for v2 tables
cloud-fan commented on a change in pull request #30881: URL: https://github.com/apache/spark/pull/30881#discussion_r548399246 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala ## @@ -344,10 +344,11 @@ case class DescribeRelation( */ case class DescribeColumn( relation: LogicalPlan, -colNameParts: Seq[String], +column: NamedExpression, isExtended: Boolean) extends Command { override def children: Seq[LogicalPlan] = Seq(relation) override def output: Seq[Attribute] = DescribeCommandSchema.describeColumnAttributes() + override lazy val references: AttributeSet = AttributeSet.empty Review comment: I agree. About `ResolvedView`, I think eventually it will have output as well, after we have the v2 view API. For now, `ResolvedView` always goes to the v1 command, so we don't need its output. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] LuciferYang commented on a change in pull request #30484: [SPARK-33532][SQL] Remove unreachable branch in SpecificParquetRecordReaderBase.initialize method
LuciferYang commented on a change in pull request #30484: URL: https://github.com/apache/spark/pull/30484#discussion_r548398249 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java ## @@ -93,48 +93,17 @@ public void initialize(InputSplit inputSplit, TaskAttemptContext taskAttemptCont throws IOException, InterruptedException { Configuration configuration = taskAttemptContext.getConfiguration(); ParquetInputSplit split = (ParquetInputSplit)inputSplit; +Preconditions.checkState(split.getRowGroupOffsets() == null, +"rowGroupOffsets in ParquetInputSplit should be null."); this.file = split.getPath(); -long[] rowGroupOffsets = split.getRowGroupOffsets(); - -ParquetMetadata footer; -List blocks; - -// if task.side.metadata is set, rowGroupOffsets is null -if (rowGroupOffsets == null) { Review comment: Add a comment and revert the code change? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #30865: [SPARK-33861][SQL] Simplify conditional in predicate
SparkQA commented on pull request #30865: URL: https://github.com/apache/spark/pull/30865#issuecomment-750754445 **[Test build #133341 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133341/testReport)** for PR 30865 at commit [`878beb0`](https://github.com/apache/spark/commit/878beb0be0ee1f76186a59ec9db900445a52ddfa). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
SparkQA commented on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-750754315 **[Test build #133340 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133340/testReport)** for PR 30387 at commit [`0cd9bcc`](https://github.com/apache/spark/commit/0cd9bcce7475c2b189be32c0209c0ff017a0736e). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30908: [SPARK-33892][SQL] Display char/varchar in DESC and SHOW CREATE TABLE
cloud-fan commented on a change in pull request #30908: URL: https://github.com/apache/spark/pull/30908#discussion_r548395417 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/HiveCharVarcharTestSuite.scala ## @@ -41,6 +41,15 @@ class HiveCharVarcharTestSuite extends CharVarcharTestSuite with TestHiveSinglet } super.afterAll() } + + test("SHOW CREATE TABLE AS SERDE w/ char/varchar") { Review comment: ditto This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30908: [SPARK-33892][SQL] Display char/varchar in DESC and SHOW CREATE TABLE
cloud-fan commented on a change in pull request #30908: URL: https://github.com/apache/spark/pull/30908#discussion_r548395346 ## File path: sql/core/src/test/scala/org/apache/spark/sql/CharVarcharTestSuite.scala ## @@ -443,6 +443,14 @@ trait CharVarcharTestSuite extends QueryTest with SQLTestUtils { ("c1 IN (c2)", true))) } } + + test("DESCRIBE TABLE w/ char/varchar") { Review comment: +1 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
AmplabJenkins commented on pull request #29966: URL: https://github.com/apache/spark/pull/29966#issuecomment-750750939 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/15/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
AmplabJenkins commented on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750750935 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
AmplabJenkins commented on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-750750932 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37928/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #30865: [SPARK-33861][SQL] Simplify conditional in predicate
AmplabJenkins commented on pull request #30865: URL: https://github.com/apache/spark/pull/30865#issuecomment-750750931 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/16/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
cloud-fan commented on a change in pull request #30387: URL: https://github.com/apache/spark/pull/30387#discussion_r548391607 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala ## @@ -153,26 +153,24 @@ class FrameLessOffsetWindowFunctionFrame( extends OffsetWindowFunctionFrameBase( target, ordinal, expressions, inputSchema, newMutableProjection, offset) { - assert(expressions.toSeq.filterNot(_.input.isInstanceOf[Attribute]).isEmpty) - - /** The input expression of Lead/Lag. */ - private lazy val inputExpression = expressions.toSeq.map(_.input).head - - /** The index of input expression in the row. */ - private lazy val idx = inputAttrs.zipWithIndex.find(_._1 == inputExpression).map(_._2).head - /** Holder the UnsafeRow where the input operator by function is not null. */ private var nextSelectedRow = EmptyRow // The number of rows skipped to get the next UnsafeRow where the input operator by function // is not null. private var skippedNonNullCount = 0 + /** Create the projection to determine whether input is null. */ + private val project = UnsafeProjection.create(Seq(IsNull(expressions.head.input)), inputSchema) + + /** Check if the output value of the first index is null. */ + private val nullCheck: InternalRow => Boolean = row => project(row).getBoolean(0) Review comment: this can be a def ``` private def nullCheck(row: InternalRow): Boolean = project(row).getBoolean(0) ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
cloud-fan commented on a change in pull request #30387: URL: https://github.com/apache/spark/pull/30387#discussion_r548391607 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala ## @@ -153,26 +153,24 @@ class FrameLessOffsetWindowFunctionFrame( extends OffsetWindowFunctionFrameBase( target, ordinal, expressions, inputSchema, newMutableProjection, offset) { - assert(expressions.toSeq.filterNot(_.input.isInstanceOf[Attribute]).isEmpty) - - /** The input expression of Lead/Lag. */ - private lazy val inputExpression = expressions.toSeq.map(_.input).head - - /** The index of input expression in the row. */ - private lazy val idx = inputAttrs.zipWithIndex.find(_._1 == inputExpression).map(_._2).head - /** Holder the UnsafeRow where the input operator by function is not null. */ private var nextSelectedRow = EmptyRow // The number of rows skipped to get the next UnsafeRow where the input operator by function // is not null. private var skippedNonNullCount = 0 + /** Create the projection to determine whether input is null. */ + private val project = UnsafeProjection.create(Seq(IsNull(expressions.head.input)), inputSchema) + + /** Check if the output value of the first index is null. */ + private val nullCheck: InternalRow => Boolean = row => project(row).getBoolean(0) Review comment: this can be a def ``` def nullCheck(row: InternalRow): Boolean = project(row).getBoolean(0) ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
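Editorial sketch for the review suggestion above, not part of the original thread: it contrasts a function-typed `val` with a `def` for the same null check. `Row` is a hypothetical stand-in for Spark's `InternalRow`, and the check reads a column directly rather than going through an `UnsafeProjection`, so this only illustrates the language-level difference.

```scala
object NullCheckSketch {
  // Hypothetical stand-in for InternalRow; not a Spark class.
  final case class Row(values: Vector[Any]) {
    def isNullAt(i: Int): Boolean = values(i) == null
  }

  // Function value: a Function1 object is created once and kept in a field.
  val nullCheckVal: Row => Boolean = row => row.isNullAt(0)

  // Method: no extra object or field; callers invoke it directly.
  def nullCheckDef(row: Row): Boolean = row.isNullAt(0)

  def main(args: Array[String]): Unit = {
    val rows = Seq(Row(Vector(null, 1)), Row(Vector("a", 2)))
    // Both can be used where a function is expected;
    // the method is converted to a function at the call site.
    println(rows.map(nullCheckVal))
    println(rows.map(nullCheckDef))
  }
}
```

When the check is only ever called directly, as in the reviewed frame class, the `def` form is the simpler choice because it avoids holding a function object per instance.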
[GitHub] [spark] kozakana commented on pull request #30803: [SPARK-33897][SQL]Can't set option 'cross' in join method.
kozakana commented on pull request #30803: URL: https://github.com/apache/spark/pull/30803#issuecomment-750749214 @HyukjinKwon I created [SPARK-33897](https://issues.apache.org/jira/browse/SPARK-33897) in JIRA and modified the title of the GitHub pull request.
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #30869: [SPARK-33865][SQL] When HiveDDL, we need check avro schema too
AngersZhuuuu commented on a change in pull request #30869: URL: https://github.com/apache/spark/pull/30869#discussion_r54830 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ## @@ -920,9 +921,12 @@ object DDLUtils { serde == Some("parquet.hive.serde.ParquetHiveSerDe") || serde == Some("org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe")) { ParquetSchemaConverter.checkFieldNames(colNames) + } else if (serde == HiveSerDe.sourceToSerDe("avro").get.serde) { +AvroFileFormat.checkFieldNames(colNames) } case "parquet" => ParquetSchemaConverter.checkFieldNames(colNames) case "orc" => OrcFileFormat.checkFieldNames(colNames) +case "avro" => AvroFileFormat.checkFieldNames(colNames) Review comment: Then for the Hive provider part, should we use a similar approach? ``` } else if (serde == HiveSerDe.sourceToSerDe("avro").get.serde) { DataSource.lookupDataSource("avro", conf).checkFieldNames(colNames) } ```
[GitHub] [spark] cloud-fan commented on a change in pull request #30865: [SPARK-33861][SQL] Simplify conditional in predicate
cloud-fan commented on a change in pull request #30865: URL: https://github.com/apache/spark/pull/30865#discussion_r548388560 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/SimplifyConditionalsInPredicate.scala ## @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.optimizer + +import org.apache.spark.sql.catalyst.expressions.{And, CaseWhen, Expression, If, Literal, Not, Or} +import org.apache.spark.sql.catalyst.expressions.Literal.{FalseLiteral, TrueLiteral} +import org.apache.spark.sql.catalyst.plans.logical._ +import org.apache.spark.sql.catalyst.rules.Rule +import org.apache.spark.sql.types.BooleanType +import org.apache.spark.util.Utils + + +object SimplifyConditionalsInPredicate extends Rule[LogicalPlan] { + + def apply(plan: LogicalPlan): LogicalPlan = plan transform { +case f @ Filter(cond, _) => f.copy(condition = simplifyConditional(cond)) +case j @ Join(_, _, _, Some(cond), _) => j.copy(condition = Some(simplifyConditional(cond))) +case d @ DeleteFromTable(_, Some(cond)) => d.copy(condition = Some(simplifyConditional(cond))) +case u @ UpdateTable(_, _, Some(cond)) => u.copy(condition = Some(simplifyConditional(cond))) + } + + private def simplifyConditional(e: Expression): Expression = e match { +case Literal(null, BooleanType) => FalseLiteral +case And(left, right) => And(simplifyConditional(left), simplifyConditional(right)) +case Or(left, right) => Or(simplifyConditional(left), simplifyConditional(right)) +case If(cond, t, FalseLiteral) => And(cond, t) +case If(cond, t, TrueLiteral) => Or(Not(cond), t) +case If(cond, FalseLiteral, f) => And(Not(cond), f) +case If(cond, TrueLiteral, f) => Or(cond, f) +case CaseWhen(Seq((cond, trueValue)), +Some(FalseLiteral) | Some(Literal(null, BooleanType)) | None) => + And(cond, trueValue) +case CaseWhen(Seq((cond, trueValue)), Some(TrueLiteral)) => + Or(Not(cond), trueValue) +case CaseWhen(Seq((cond, FalseLiteral)), elseValue) => + And(Not(cond), elseValue.getOrElse(Literal(null, BooleanType))) +case CaseWhen(Seq((cond, TrueLiteral)), elseValue) => + Or(cond, elseValue.getOrElse(Literal(null, BooleanType))) +case e if e.dataType == BooleanType => e Review comment: I mean something like ``` case e => assert(e.dataType != BooleanType, ...) e ```
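The `If` rewrites in the rule quoted above depend on predicate semantics, where a filter keeps a row only if its condition is true, so NULL and false behave the same at the top level. The following is a minimal self-contained sketch of that reasoning using `Option[Boolean]` as a three-valued boolean. It does not use Catalyst, and `nullAsFalse` is a hypothetical helper introduced only for this sketch. It checks every combination of true, false, and NULL, and shows that the two rewrites built on a plain `Not(cond)` need care when `cond` is NULL.

```scala
object ConditionalRewriteSketch {
  // SQL three-valued boolean: Some(true), Some(false), or None for NULL.
  type B3 = Option[Boolean]

  def not(a: B3): B3 = a.map(!_)

  def and(a: B3, b: B3): B3 = (a, b) match {
    case (Some(false), _) | (_, Some(false)) => Some(false)
    case (Some(true), Some(true))            => Some(true)
    case _                                   => None
  }

  def or(a: B3, b: B3): B3 = (a, b) match {
    case (Some(true), _) | (_, Some(true)) => Some(true)
    case (Some(false), Some(false))        => Some(false)
    case _                                 => None
  }

  // IF(c, t, f): only a true condition selects t; false or NULL selects f.
  def ifElse(c: B3, t: B3, f: B3): B3 = if (c.contains(true)) t else f

  // Hypothetical helper: treat a NULL condition as false before negating it.
  def nullAsFalse(c: B3): B3 = Some(c.contains(true))

  // A filter keeps a row only when its predicate evaluates to true.
  def keeps(p: B3): Boolean = p.contains(true)

  val values: Seq[B3] = Seq(Some(true), Some(false), None)
  val T: B3 = Some(true)
  val F: B3 = Some(false)

  def main(args: Array[String]): Unit = {
    for (c <- values; x <- values) {
      // If(c, x, false) behaves like And(c, x) inside a filter.
      assert(keeps(ifElse(c, x, F)) == keeps(and(c, x)))
      // If(c, true, x) behaves like Or(c, x) inside a filter.
      assert(keeps(ifElse(c, T, x)) == keeps(or(c, x)))
      // The negated forms agree once a NULL condition is mapped to false.
      assert(keeps(ifElse(c, x, T)) == keeps(or(not(nullAsFalse(c)), x)))
      assert(keeps(ifElse(c, F, x)) == keeps(and(not(nullAsFalse(c)), x)))
    }
    // With a plain Not(cond), a NULL condition gives a different filter result.
    assert(keeps(ifElse(None, F, T)) != keeps(or(not(None), F)))
    println("all checked rewrites agree under filter semantics")
  }
}
```

The exhaustive loop is cheap because each operand has only three possible values, which makes this kind of check a quick way to sanity-test a proposed boolean rewrite.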
[GitHub] [spark] cloud-fan commented on a change in pull request #30865: [SPARK-33861][SQL] Simplify conditional in predicate
cloud-fan commented on a change in pull request #30865: URL: https://github.com/apache/spark/pull/30865#discussion_r548388326 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/SimplifyConditionalsInPredicate.scala ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.optimizer + +import org.apache.spark.sql.catalyst.expressions.{And, CaseWhen, Expression, If, Literal, Not, Or} +import org.apache.spark.sql.catalyst.expressions.Literal.{FalseLiteral, TrueLiteral} +import org.apache.spark.sql.catalyst.plans.logical._ +import org.apache.spark.sql.catalyst.rules.Rule +import org.apache.spark.sql.types.BooleanType + +/** + * A rule that converting conditional expressions to predicate expressions, if possible, in the Review comment: `that converting` -> `that converts`
[GitHub] [spark] SparkQA removed a comment on pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
SparkQA removed a comment on pull request #29966: URL: https://github.com/apache/spark/pull/29966#issuecomment-750712775 **[Test build #15 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15/testReport)** for PR 29966 at commit [`4c44dae`](https://github.com/apache/spark/commit/4c44daecc7c77527e150d30051170f7ee8667f70).
[GitHub] [spark] cloud-fan commented on a change in pull request #30869: [SPARK-33865][SQL] When HiveDDL, we need check avro schema too
cloud-fan commented on a change in pull request #30869: URL: https://github.com/apache/spark/pull/30869#discussion_r548388043 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ## @@ -920,9 +921,12 @@ object DDLUtils { serde == Some("parquet.hive.serde.ParquetHiveSerDe") || serde == Some("org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe")) { ParquetSchemaConverter.checkFieldNames(colNames) + } else if (serde == HiveSerDe.sourceToSerDe("avro").get.serde) { +AvroFileFormat.checkFieldNames(colNames) } case "parquet" => ParquetSchemaConverter.checkFieldNames(colNames) case "orc" => OrcFileFormat.checkFieldNames(colNames) +case "avro" => AvroFileFormat.checkFieldNames(colNames) Review comment: What we can do here is: ``` DataSource.lookupDataSource(provider, conf) match { case f: FileFormat => f.checkFieldNames... // a new API case _ => } ```
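The dispatch idea in the comment above (look the source up once, then call a check method on it if it is a file format) can be sketched without Spark as follows. Everything here is a toy: the `FileFormat` trait, its `checkFieldNames` hook, `ToyAvroFormat`, `ToyOtherSource`, and this `lookupDataSource` are hypothetical stand-ins, and the name rule is the one from the Avro specification (a letter or underscore, followed by letters, digits, or underscores).

```scala
object FieldNameCheckSketch {
  // Hypothetical trait with the "new API" hook; the default accepts any names.
  trait FileFormat {
    def checkFieldNames(names: Seq[String]): Unit = ()
  }

  // Toy Avro-like format that enforces the Avro naming rule.
  object ToyAvroFormat extends FileFormat {
    override def checkFieldNames(names: Seq[String]): Unit =
      names.foreach { n =>
        require(n.matches("[A-Za-z_][A-Za-z0-9_]*"), s"Invalid field name: $n")
      }
  }

  // Toy source that is not a FileFormat and therefore has no hook.
  object ToyOtherSource

  // Hypothetical lookup; the real lookup also takes a configuration argument.
  def lookupDataSource(provider: String): Any = provider match {
    case "avro" => ToyAvroFormat
    case _      => ToyOtherSource
  }

  // One code path for all providers instead of one `case` per format.
  def checkDataColNames(provider: String, colNames: Seq[String]): Unit =
    lookupDataSource(provider) match {
      case f: FileFormat => f.checkFieldNames(colNames)
      case _             => () // sources without the hook are not checked
    }

  def main(args: Array[String]): Unit = {
    checkDataColNames("avro", Seq("id", "user_name")) // passes
    checkDataColNames("other", Seq("any name"))       // not checked
  }
}
```

With this shape, adding a new format means implementing the hook on that format rather than extending a `match` on provider names.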
[GitHub] [spark] SparkQA commented on pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
SparkQA commented on pull request #29966: URL: https://github.com/apache/spark/pull/29966#issuecomment-750747800 **[Test build #15 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15/testReport)** for PR 29966 at commit [`4c44dae`](https://github.com/apache/spark/commit/4c44daecc7c77527e150d30051170f7ee8667f70). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] cloud-fan closed pull request #30900: [SPARK-33886][SQL] UnresolvedTable should retain SQL text position for DDL commands
cloud-fan closed pull request #30900: URL: https://github.com/apache/spark/pull/30900
[GitHub] [spark] cloud-fan commented on pull request #30900: [SPARK-33886][SQL] UnresolvedTable should retain SQL text position for DDL commands
cloud-fan commented on pull request #30900: URL: https://github.com/apache/spark/pull/30900#issuecomment-750746667 thanks, merging to master!
[GitHub] [spark] cloud-fan commented on a change in pull request #30864: [SPARK-33857][SQL] Unify the default seed of random functions
cloud-fan commented on a change in pull request #30864: URL: https://github.com/apache/spark/pull/30864#discussion_r548386088 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala ## @@ -83,14 +87,13 @@ trait ExpressionWithRandomSeed { """, since = "1.5.0") // scalastyle:on line.size.limit -case class Rand(child: Expression, hideSeed: Boolean = false) - extends RDG with ExpressionWithRandomSeed { +case class Rand(child: Expression, hideSeed: Boolean = false) extends RDG { - def this() = this(Literal(Utils.random.nextLong(), LongType), true) + def this() = this(UnresolvedSeed, true) def this(child: Expression) = this(child, false) - override def withNewSeed(seed: Long): Rand = Rand(Literal(seed, LongType)) + override def withNewSeed(seed: Long): Rand = Rand(Literal(seed, LongType), hideSeed) Review comment: This is an orthogonal bug fix, but it only affects the `sql` method. I don't have a strong opinion about backporting the fix or not.
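The `withNewSeed` fix in the diff above comes down to a default parameter silently resetting a flag when a new instance is built. A minimal sketch of that effect, using a toy `ToyRand` case class that is not Spark's `Rand` (its `sql` method only mimics the idea of hiding the seed):

```scala
// Toy model of a seeded expression; NOT Spark's Rand.
final case class ToyRand(seed: Long, hideSeed: Boolean = false) {
  // Rendered form: the seed is omitted when hideSeed is set.
  def sql: String = if (hideSeed) "rand()" else s"rand($seed)"

  // Old shape: hideSeed falls back to its default (false) on the new instance.
  def withNewSeedLossy(newSeed: Long): ToyRand = ToyRand(newSeed)

  // Fixed shape: the flag is carried over explicitly.
  def withNewSeed(newSeed: Long): ToyRand = ToyRand(newSeed, hideSeed)
}

object ToyRandDemo {
  def main(args: Array[String]): Unit = {
    val r = ToyRand(1L, hideSeed = true)
    println(r.withNewSeedLossy(42L).sql) // prints rand(42)
    println(r.withNewSeed(42L).sql)      // prints rand()
  }
}
```

Only the rendered string differs between the two variants; the seed itself is the same, which matches the remark that the bug only affects the `sql` method.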
[GitHub] [spark] SparkQA commented on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
SparkQA commented on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750745148 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37929/
[GitHub] [spark] cloud-fan commented on pull request #30776: [SPARK-33787][SQL] Add the `purge` parameter to `dropPartition()` of `SupportsPartitionManagement`
cloud-fan commented on pull request #30776: URL: https://github.com/apache/spark/pull/30776#issuecomment-750744615 This was closed in favor of https://github.com/apache/spark/pull/30886
[GitHub] [spark] beliefer commented on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
beliefer commented on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-750744600 retest this please
[GitHub] [spark] SparkQA removed a comment on pull request #30865: [SPARK-33861][SQL] Simplify conditional in predicate
SparkQA removed a comment on pull request #30865: URL: https://github.com/apache/spark/pull/30865#issuecomment-750726287 **[Test build #16 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16/testReport)** for PR 30865 at commit [`8bd9ef9`](https://github.com/apache/spark/commit/8bd9ef9e3136e67a49dbbfe8574725dfe9f38c94).
[GitHub] [spark] SparkQA commented on pull request #30865: [SPARK-33861][SQL] Simplify conditional in predicate
SparkQA commented on pull request #30865: URL: https://github.com/apache/spark/pull/30865#issuecomment-750743768 **[Test build #16 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16/testReport)** for PR 30865 at commit [`8bd9ef9`](https://github.com/apache/spark/commit/8bd9ef9e3136e67a49dbbfe8574725dfe9f38c94). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] kozakana commented on pull request #30803: [SPARK-30803][SQL]Can't set option 'cross' in join method.
kozakana commented on pull request #30803: URL: https://github.com/apache/spark/pull/30803#issuecomment-750742951 ok! I understand the number and I can post the JIRA.
[GitHub] [spark] SparkQA commented on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
SparkQA commented on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-750742064 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37928/
[GitHub] [spark] SparkQA removed a comment on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
SparkQA removed a comment on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750727016 **[Test build #18 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18/testReport)** for PR 30912 at commit [`58cc2ba`](https://github.com/apache/spark/commit/58cc2ba4fc5a032cefb285a8c35f4bfb8dff9a07).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
AmplabJenkins removed a comment on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750741601 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/18/
[GitHub] [spark] AmplabJenkins commented on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
AmplabJenkins commented on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750741601 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/18/
[GitHub] [spark] SparkQA commented on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
SparkQA commented on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750741485 **[Test build #18 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18/testReport)** for PR 30912 at commit [`58cc2ba`](https://github.com/apache/spark/commit/58cc2ba4fc5a032cefb285a8c35f4bfb8dff9a07). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30865: [SPARK-33861][SQL] Simplify conditional in predicate
AmplabJenkins removed a comment on pull request #30865: URL: https://github.com/apache/spark/pull/30865#issuecomment-750741136 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37927/
[GitHub] [spark] AmplabJenkins commented on pull request #30865: [SPARK-33861][SQL] Simplify conditional in predicate
AmplabJenkins commented on pull request #30865: URL: https://github.com/apache/spark/pull/30865#issuecomment-750741136 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37927/
[GitHub] [spark] SparkQA commented on pull request #30865: [SPARK-33861][SQL] Simplify conditional in predicate
SparkQA commented on pull request #30865: URL: https://github.com/apache/spark/pull/30865#issuecomment-750741123 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37927/
[GitHub] [spark] SparkQA commented on pull request #30912: [SPARK-32968][SQL] Prune unnecessary columns from CsvToStructs
SparkQA commented on pull request #30912: URL: https://github.com/apache/spark/pull/30912#issuecomment-750738156 **[Test build #19 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19/testReport)** for PR 30912 at commit [`fc5dd38`](https://github.com/apache/spark/commit/fc5dd38c1c0f8c90fe083f0e7f603ccdcc6e7b19).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30908: [SPARK-33892][SQL] Display char/varchar in DESC and SHOW CREATE TABLE
AmplabJenkins removed a comment on pull request #30908: URL: https://github.com/apache/spark/pull/30908#issuecomment-750737030 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37923/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30885: [SPARK-33659][SS] Document the current behavior for DataStreamWriter.toTable API
AmplabJenkins removed a comment on pull request #30885: URL: https://github.com/apache/spark/pull/30885#issuecomment-750737026 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37925/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30914: [SPARK-33895][SQL] Char and Varchar fail in MetaOperation of ThriftServer
AmplabJenkins removed a comment on pull request #30914: URL: https://github.com/apache/spark/pull/30914#issuecomment-750737025 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/37921/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29966: [SPARK-33084][CORE][SQL] Add jar support ivy path
AmplabJenkins removed a comment on pull request #29966: URL: https://github.com/apache/spark/pull/29966#issuecomment-750737028 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133329/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30387: [SPARK-33443][SQL] LEAD/LAG should support [ IGNORE NULLS | RESPECT NULLS ]
AmplabJenkins removed a comment on pull request #30387: URL: https://github.com/apache/spark/pull/30387#issuecomment-750737029 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/17/