[GitHub] [spark] viirya commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
viirya commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
URL: https://github.com/apache/spark/pull/27702#discussion_r384961781

## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/SchemaPruningSuite.scala ##
@@ -301,6 +301,24 @@ abstract class SchemaPruningSuite
       checkAnswer(query, Row("Y.", 1) :: Row("X.", 1) :: Row(null, 2) :: Row(null, 2) :: Nil)
   }
+  testSchemaPruning("select explode of nested field of array of struct from Generate output") {

Review comment: The data source currently provided by `SchemaPruningSuite` doesn't have a data structure suitable for this test, so the test doesn't use it. I put the test here because it is closely related to nested column pruning. If there is a more suitable test suite for it, moving it there is also fine.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] yma11 commented on a change in pull request #27546: [SPARK-30773][ML]Support NativeBlas for level-1 routines
yma11 commented on a change in pull request #27546: [SPARK-30773][ML]Support NativeBlas for level-1 routines
URL: https://github.com/apache/spark/pull/27546#discussion_r384959926

## File path: mllib-local/src/main/scala/org/apache/spark/ml/linalg/BLAS.scala ##
@@ -220,20 +240,12 @@ private[spark] object BLAS extends Serializable {
       case sx: SparseVector =>
         f2jBLAS.dscal(sx.values.length, a, sx.values, 1)

Review comment: Done.
[GitHub] [spark] yma11 commented on a change in pull request #27546: [SPARK-30773][ML]Support NativeBlas for level-1 routines
yma11 commented on a change in pull request #27546: [SPARK-30773][ML]Support NativeBlas for level-1 routines
URL: https://github.com/apache/spark/pull/27546#discussion_r384959562

## File path: mllib-local/src/main/scala/org/apache/spark/ml/linalg/BLAS.scala ##
@@ -27,15 +27,35 @@ private[spark] object BLAS extends Serializable {
   @transient private var _f2jBLAS: NetlibBLAS = _
   @transient private var _nativeBLAS: NetlibBLAS = _
+  @transient private val vectorSizeThreshold: Int = 256

-  // For level-1 routines, we use Java implementation.
+  // For level-3 routines, we use the native BLAS.
+  // if native BLAS is not properly configured in system, will automatically fallback to f2jBLAS
+  private[ml] def nativeBLAS: NetlibBLAS = {

Review comment: I added a note to the ml-guide highlighting that nativeBLAS usage depends on proper system configuration; otherwise it falls back to the Java implementation.
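The fallback behaviour described in this comment, together with the `vectorSizeThreshold` field from the diff, could be sketched roughly as follows. This is only an illustration under stated assumptions: `SimpleBlas`, `JavaBlas`, `loadNative`, and `chooseBlas` are hypothetical names, not the netlib or Spark API, and the rule that only vectors of at least `vectorSizeThreshold` elements go to the native backend is a guess at the intent, not something the thread confirms.

```scala
// Hypothetical minimal BLAS interface; the real code uses netlib's BLAS class.
trait SimpleBlas {
  def name: String
  // Level-1 dscal: x := a * x, in place, over the first n elements.
  def dscal(n: Int, a: Double, x: Array[Double]): Unit
}

// Pure-JVM implementation, standing in for f2jBLAS.
object JavaBlas extends SimpleBlas {
  val name = "java"
  def dscal(n: Int, a: Double, x: Array[Double]): Unit = {
    var i = 0
    while (i < n) { x(i) *= a; i += 1 }
  }
}

object BlasDispatch {
  // Value taken from the diff above.
  val vectorSizeThreshold: Int = 256

  // Hypothetical loader: in this sketch no native library exists, so it always fails.
  def loadNative(): SimpleBlas =
    throw new UnsatisfiedLinkError("native BLAS not configured")

  // Fall back to the Java implementation when the native one cannot be loaded.
  // An explicit catch is used because scala.util.Try does not catch LinkageError.
  lazy val nativeOrFallback: SimpleBlas =
    try loadNative()
    catch { case _: UnsatisfiedLinkError => JavaBlas }

  // Assumed rule: small vectors stay on the JVM, large ones use the native backend.
  def chooseBlas(n: Int): SimpleBlas =
    if (n < vectorSizeThreshold) JavaBlas else nativeOrFallback
}
```

Calling `BlasDispatch.chooseBlas(n).dscal(n, a, x)` then behaves the same whether or not a native library is present; only the backend that does the work differs.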
[GitHub] [spark] AmplabJenkins removed a comment on issue #27707: [SPARK-30958][SQL] do not set default era for DateTimeFormatter
AmplabJenkins removed a comment on issue #27707: [SPARK-30958][SQL] do not set default era for DateTimeFormatter URL: https://github.com/apache/spark/pull/27707#issuecomment-591821642 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119008/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27716: [SPARK-30964][Core][WebUI] Accelerate InMemoryStore with a new index
AmplabJenkins removed a comment on issue #27716: [SPARK-30964][Core][WebUI] Accelerate InMemoryStore with a new index URL: https://github.com/apache/spark/pull/27716#issuecomment-591827249 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23762/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27717: [SPARK-23435][INFRA][FOLLOW-UP] Remove unnecessary dependency listing in AppVeyor
AmplabJenkins removed a comment on issue #27717: [SPARK-23435][INFRA][FOLLOW-UP] Remove unnecessary dependency listing in AppVeyor URL: https://github.com/apache/spark/pull/27717#issuecomment-591829374 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27707: [SPARK-30958][SQL] do not set default era for DateTimeFormatter
AmplabJenkins removed a comment on issue #27707: [SPARK-30958][SQL] do not set default era for DateTimeFormatter URL: https://github.com/apache/spark/pull/27707#issuecomment-591821631 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27717: [SPARK-23435][INFRA][FOLLOW-UP] Remove unnecessary dependency listing in AppVeyor
AmplabJenkins removed a comment on issue #27717: [SPARK-23435][INFRA][FOLLOW-UP] Remove unnecessary dependency listing in AppVeyor URL: https://github.com/apache/spark/pull/27717#issuecomment-591829378 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119007/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27707: [SPARK-30958][SQL] do not set default era for DateTimeFormatter
SparkQA removed a comment on issue #27707: [SPARK-30958][SQL] do not set default era for DateTimeFormatter URL: https://github.com/apache/spark/pull/27707#issuecomment-591776664 **[Test build #119008 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119008/testReport)** for PR 27707 at commit [`e26ffc0`](https://github.com/apache/spark/commit/e26ffc0a244acb4a89cde20e6172a48dee676a94).
[GitHub] [spark] AmplabJenkins removed a comment on issue #27716: [SPARK-30964][Core][WebUI] Accelerate InMemoryStore with a new index
AmplabJenkins removed a comment on issue #27716: [SPARK-30964][Core][WebUI] Accelerate InMemoryStore with a new index URL: https://github.com/apache/spark/pull/27716#issuecomment-591827239 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27717: [SPARK-23435][INFRA][FOLLOW-UP] Remove unnecessary dependency listing in AppVeyor
SparkQA removed a comment on issue #27717: [SPARK-23435][INFRA][FOLLOW-UP] Remove unnecessary dependency listing in AppVeyor URL: https://github.com/apache/spark/pull/27717#issuecomment-591773040 **[Test build #119007 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119007/testReport)** for PR 27717 at commit [`94df720`](https://github.com/apache/spark/commit/94df7202531c171a2cca9cd42312e520beb5fd2c).
[GitHub] [spark] AmplabJenkins commented on issue #27717: [SPARK-23435][INFRA][FOLLOW-UP] Remove unnecessary dependency listing in AppVeyor
AmplabJenkins commented on issue #27717: [SPARK-23435][INFRA][FOLLOW-UP] Remove unnecessary dependency listing in AppVeyor URL: https://github.com/apache/spark/pull/27717#issuecomment-591829374 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27717: [SPARK-23435][INFRA][FOLLOW-UP] Remove unnecessary dependency listing in AppVeyor
AmplabJenkins commented on issue #27717: [SPARK-23435][INFRA][FOLLOW-UP] Remove unnecessary dependency listing in AppVeyor URL: https://github.com/apache/spark/pull/27717#issuecomment-591829378 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119007/ Test PASSed.
[GitHub] [spark] Eric5553 commented on issue #27511: [SPARK-30765][SQL] Refine base operator abstraction code style
Eric5553 commented on issue #27511: [SPARK-30765][SQL] Refine base operator abstraction code style URL: https://github.com/apache/spark/pull/27511#issuecomment-591828655 Thanks a lot! @HyukjinKwon @cloud-fan @maropu
[GitHub] [spark] peter-toth commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
peter-toth commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
URL: https://github.com/apache/spark/pull/27702#discussion_r384955295

## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ##
@@ -3394,15 +3395,25 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
     }
   }

-  test("SPARK-30870: Column pruning shouldn't alias a nested column if it means the whole " +
-    "structure") {
-    val df = sql(
-      """
-        |SELECT explodedvalue.field
-        |FROM VALUES array(named_struct('field', named_struct('a', 1, 'b', 2))) AS (value)
-        |LATERAL VIEW explode(value) AS explodedvalue
-      """.stripMargin)
-    checkAnswer(df, Row(Row(1, 2)) :: Nil)
+  test("SPARK-30870: Column pruning shouldn't alias a nested column for the whole structure") {
+    withTable("t") {
+      val df = sql(
+        """
+          |SELECT value
+          |FROM VALUES array(named_struct('field', named_struct('a', 1, 'b', 2))) AS (value)
+        """.stripMargin)
+      df.write.format("parquet").saveAsTable("t")

Review comment: Both PRs fix the concrete SQL in SPARK-30870 independently, because:
- the whole structure was selected (my PR fixed it), and
- a nested column was selected from the output of a Generate (this PR fixes it).

But this PR is better for cases that contain a Generate, as it can handle a nested field of a nested field: https://github.com/apache/spark/pull/27702/files#diff-957112380b0a2ef014abc8227d0b70acR313-R318
[GitHub] [spark] SparkQA commented on issue #27717: [SPARK-23435][INFRA][FOLLOW-UP] Remove unnecessary dependency listing in AppVeyor
SparkQA commented on issue #27717: [SPARK-23435][INFRA][FOLLOW-UP] Remove unnecessary dependency listing in AppVeyor URL: https://github.com/apache/spark/pull/27717#issuecomment-591828509 **[Test build #119007 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119007/testReport)** for PR 27717 at commit [`94df720`](https://github.com/apache/spark/commit/94df7202531c171a2cca9cd42312e520beb5fd2c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] HyukjinKwon closed pull request #27511: [SPARK-30765][SQL] Refine base operator abstraction code style
HyukjinKwon closed pull request #27511: [SPARK-30765][SQL] Refine base operator abstraction code style URL: https://github.com/apache/spark/pull/27511
[GitHub] [spark] HyukjinKwon commented on issue #27511: [SPARK-30765][SQL] Refine base operator abstraction code style
HyukjinKwon commented on issue #27511: [SPARK-30765][SQL] Refine base operator abstraction code style URL: https://github.com/apache/spark/pull/27511#issuecomment-591827725 Merged to master.
[GitHub] [spark] AmplabJenkins commented on issue #27716: [SPARK-30964][Core][WebUI] Accelerate InMemoryStore with a new index
AmplabJenkins commented on issue #27716: [SPARK-30964][Core][WebUI] Accelerate InMemoryStore with a new index URL: https://github.com/apache/spark/pull/27716#issuecomment-591827249 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23762/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27716: [SPARK-30964][Core][WebUI] Accelerate InMemoryStore with a new index
AmplabJenkins commented on issue #27716: [SPARK-30964][Core][WebUI] Accelerate InMemoryStore with a new index URL: https://github.com/apache/spark/pull/27716#issuecomment-591827239 Merged build finished. Test PASSed.
[GitHub] [spark] peter-toth commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
peter-toth commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
URL: https://github.com/apache/spark/pull/27702#discussion_r384955295

## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ##
@@ -3394,15 +3395,25 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
     }
   }

-  test("SPARK-30870: Column pruning shouldn't alias a nested column if it means the whole " +
-    "structure") {
-    val df = sql(
-      """
-        |SELECT explodedvalue.field
-        |FROM VALUES array(named_struct('field', named_struct('a', 1, 'b', 2))) AS (value)
-        |LATERAL VIEW explode(value) AS explodedvalue
-      """.stripMargin)
-    checkAnswer(df, Row(Row(1, 2)) :: Nil)
+  test("SPARK-30870: Column pruning shouldn't alias a nested column for the whole structure") {
+    withTable("t") {
+      val df = sql(
+        """
+          |SELECT value
+          |FROM VALUES array(named_struct('field', named_struct('a', 1, 'b', 2))) AS (value)
+        """.stripMargin)
+      df.write.format("parquet").saveAsTable("t")

Review comment: Both fixes address the concrete SQL in SPARK-30870 independently, because:
- the whole structure was selected (my PR fixes it), and
- a nested column was selected from the output of a Generate (this PR fixes it).

But this PR is better for cases that contain a Generate, as it can handle a nested field of a nested field: https://github.com/apache/spark/pull/27702/files#diff-957112380b0a2ef014abc8227d0b70acR313-R318
[GitHub] [spark] SparkQA commented on issue #27716: [SPARK-30964][Core][WebUI] Accelerate InMemoryStore with a new index
SparkQA commented on issue #27716: [SPARK-30964][Core][WebUI] Accelerate InMemoryStore with a new index URL: https://github.com/apache/spark/pull/27716#issuecomment-591826830 **[Test build #119015 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119015/testReport)** for PR 27716 at commit [`da463e9`](https://github.com/apache/spark/commit/da463e95097f6173a3a9989d09bf054307a8e098).
[GitHub] [spark] gengliangwang commented on issue #27716: [SPARK-30964][Core][WebUI] Accelerate InMemoryStore with a new index
gengliangwang commented on issue #27716: [SPARK-30964][Core][WebUI] Accelerate InMemoryStore with a new index URL: https://github.com/apache/spark/pull/27716#issuecomment-591825428 retest this please.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
HyukjinKwon commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
URL: https://github.com/apache/spark/pull/27702#discussion_r384950585

## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/SchemaPruningSuite.scala ##
@@ -301,6 +301,24 @@ abstract class SchemaPruningSuite
       checkAnswer(query, Row("Y.", 1) :: Row("X.", 1) :: Row(null, 2) :: Row(null, 2) :: Nil)
   }
+  testSchemaPruning("select explode of nested field of array of struct from Generate output") {

Review comment: Also, why is the test case here? Does it relate to anything source-specific?
[GitHub] [spark] AmplabJenkins commented on issue #27707: [SPARK-30958][SQL] do not set default era for DateTimeFormatter
AmplabJenkins commented on issue #27707: [SPARK-30958][SQL] do not set default era for DateTimeFormatter URL: https://github.com/apache/spark/pull/27707#issuecomment-591821642 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119008/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #27707: [SPARK-30958][SQL] do not set default era for DateTimeFormatter
SparkQA commented on issue #27707: [SPARK-30958][SQL] do not set default era for DateTimeFormatter URL: https://github.com/apache/spark/pull/27707#issuecomment-591821473 **[Test build #119008 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119008/testReport)** for PR 27707 at commit [`e26ffc0`](https://github.com/apache/spark/commit/e26ffc0a244acb4a89cde20e6172a48dee676a94).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #27707: [SPARK-30958][SQL] do not set default era for DateTimeFormatter
AmplabJenkins commented on issue #27707: [SPARK-30958][SQL] do not set default era for DateTimeFormatter URL: https://github.com/apache/spark/pull/27707#issuecomment-591821631 Merged build finished. Test FAILed.
[GitHub] [spark] viirya commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
viirya commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
URL: https://github.com/apache/spark/pull/27702#discussion_r384948107

## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ##
@@ -3394,15 +3395,25 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
     }
   }

-  test("SPARK-30870: Column pruning shouldn't alias a nested column if it means the whole " +
-    "structure") {
-    val df = sql(
-      """
-        |SELECT explodedvalue.field
-        |FROM VALUES array(named_struct('field', named_struct('a', 1, 'b', 2))) AS (value)
-        |LATERAL VIEW explode(value) AS explodedvalue
-      """.stripMargin)
-    checkAnswer(df, Row(Row(1, 2)) :: Nil)
+  test("SPARK-30870: Column pruning shouldn't alias a nested column for the whole structure") {
+    withTable("t") {
+      val df = sql(
+        """
+          |SELECT value
+          |FROM VALUES array(named_struct('field', named_struct('a', 1, 'b', 2))) AS (value)
+        """.stripMargin)
+      df.write.format("parquet").saveAsTable("t")

Review comment: The test added by SPARK-30870 is moved to `SchemaPruningSuite`.
[GitHub] [spark] viirya commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
viirya commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
URL: https://github.com/apache/spark/pull/27702#discussion_r384947817

## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ##
@@ -3394,15 +3395,25 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
     }
   }

-  test("SPARK-30870: Column pruning shouldn't alias a nested column if it means the whole " +
-    "structure") {
-    val df = sql(
-      """
-        |SELECT explodedvalue.field
-        |FROM VALUES array(named_struct('field', named_struct('a', 1, 'b', 2))) AS (value)
-        |LATERAL VIEW explode(value) AS explodedvalue
-      """.stripMargin)
-    checkAnswer(df, Row(Row(1, 2)) :: Nil)
+  test("SPARK-30870: Column pruning shouldn't alias a nested column for the whole structure") {
+    withTable("t") {
+      val df = sql(
+        """
+          |SELECT value
+          |FROM VALUES array(named_struct('field', named_struct('a', 1, 'b', 2))) AS (value)
+        """.stripMargin)
+      df.write.format("parquet").saveAsTable("t")

Review comment: I changed the test added by SPARK-30870 because the root cause of the failure was not the selection of the whole structure. The test now better fits the original purpose of SPARK-30870: when the whole structure is selected, column pruning shouldn't alias the nested column.
[GitHub] [spark] viirya commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
viirya commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
URL: https://github.com/apache/spark/pull/27702#discussion_r384946779

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala ##
@@ -104,19 +104,23 @@ object NestedColumnAliasing {
   /**
    * Return two maps in order to replace nested fields to aliases.
    *
+   * If `exclusiveAttrs` is given, any nested field accessors of these attributes
+   * won't be considered in nested fields aliasing.
+   *
    * 1. ExtractValue -> Alias: A new alias is created for each nested field.
    * 2. ExprId -> Seq[Alias]: A reference attribute has multiple aliases pointing it.
    */
-  def getAliasSubMap(exprList: Seq[Expression])
+  def getAliasSubMap(exprList: Seq[Expression], exclusiveAttrs: Seq[Attribute] = Seq.empty)

Review comment: I think so. I will sync #27517 with this later.
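The effect of the `exclusiveAttrs` parameter documented in the diff above can be illustrated with a toy model. This is a minimal sketch, not the Catalyst implementation: `Expr`, `Attr`, `GetField`, `rootAttr`, and `aliasCandidates` are hypothetical stand-ins for Catalyst's expression classes and for the filtering step inside `getAliasSubMap`, and the sketch only shows the exclusion rule, not the alias maps the real method returns.

```scala
// Simplified stand-ins for Catalyst expressions (hypothetical, not the real classes).
sealed trait Expr
final case class Attr(name: String) extends Expr
final case class GetField(child: Expr, field: String) extends Expr

object AliasSketch {
  // Walk down a chain of field accesses to the attribute it starts from.
  @annotation.tailrec
  def rootAttr(e: Expr): Attr = e match {
    case a: Attr            => a
    case GetField(child, _) => rootAttr(child)
  }

  // Collect the nested field accessors that remain eligible for aliasing,
  // skipping any accessor whose root attribute is in exclusiveAttrs.
  def aliasCandidates(
      exprs: Seq[Expr],
      exclusiveAttrs: Seq[Attr] = Seq.empty): Seq[GetField] =
    exprs.collect {
      case g: GetField if !exclusiveAttrs.contains(rootAttr(g)) => g
    }
}
```

For example, with `GetField(Attr("value"), "field")` and `GetField(GetField(Attr("explodedvalue"), "field"), "a")` as input and `Attr("explodedvalue")` excluded (standing in for a Generate output), only the accessor rooted at `value` is kept, including when the excluded accessor is a nested field of a nested field.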
[GitHub] [spark] HyukjinKwon commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
HyukjinKwon commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning URL: https://github.com/apache/spark/pull/27702#discussion_r384946240 ## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ## @@ -3394,15 +3395,25 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark } } - test("SPARK-30870: Column pruning shouldn't alias a nested column if it means the whole " + -"structure") { -val df = sql( - """ -|SELECT explodedvalue.field -|FROM VALUES array(named_struct('field', named_struct('a', 1, 'b', 2))) AS (value) -|LATERAL VIEW explode(value) AS explodedvalue - """.stripMargin) -checkAnswer(df, Row(Row(1, 2)) :: Nil) + test("SPARK-30870: Column pruning shouldn't alias a nested column for the whole structure") { +withTable("t") { + val df = sql( +""" + |SELECT value + |FROM VALUES array(named_struct('field', named_struct('a', 1, 'b', 2))) AS (value) +""".stripMargin) + df.write.format("parquet").saveAsTable("t") Review comment: Sorry if I am being dumb here but I can't fully follow why LocalRelation matters here. How does the current fix relate to the test fix here?
If we are concerned about `ConvertToLocalRelation`, it does not seem to take effect here:
```
== Parsed Logical Plan ==
'Project ['explodedvalue.field]
+- 'Generate 'explode('value), false, as, ['explodedvalue]
   +- 'SubqueryAlias AS
      +- 'UnresolvedInlineTable [value], [List('array('named_struct(field, 'named_struct(a, 1, b, 2]

== Analyzed Logical Plan ==
field: struct
Project [explodedvalue#219.field AS field#220]
+- Generate explode(value#218), false, as, [explodedvalue#219]
   +- SubqueryAlias AS
      +- LocalRelation [value#218]

== Optimized Logical Plan ==
Project [explodedvalue#219.field AS field#220]
+- Generate explode(value#218), [0], false, as, [explodedvalue#219]
   +- LocalRelation [value#218]

== Physical Plan ==
*(1) Project [explodedvalue#219.field AS field#220]
+- Generate explode(value#218), false, [explodedvalue#219]
   +- LocalTableScan [value#218]
```
[GitHub] [spark] AmplabJenkins removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution
AmplabJenkins removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution URL: https://github.com/apache/spark/pull/27616#issuecomment-591815017 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119002/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution
AmplabJenkins removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution URL: https://github.com/apache/spark/pull/27616#issuecomment-591815007 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27670: [SPARK-30937][DOC] Group Hive upgrade guides together
AmplabJenkins removed a comment on issue #27670: [SPARK-30937][DOC] Group Hive upgrade guides together URL: https://github.com/apache/spark/pull/27670#issuecomment-591814471 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27670: [SPARK-30937][DOC] Group Hive upgrade guides together
AmplabJenkins removed a comment on issue #27670: [SPARK-30937][DOC] Group Hive upgrade guides together URL: https://github.com/apache/spark/pull/27670#issuecomment-591814481 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23761/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution
AmplabJenkins commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution URL: https://github.com/apache/spark/pull/27616#issuecomment-591815007 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution
AmplabJenkins commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution URL: https://github.com/apache/spark/pull/27616#issuecomment-591815017 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119002/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution
SparkQA removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution URL: https://github.com/apache/spark/pull/27616#issuecomment-591733607 **[Test build #119002 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119002/testReport)** for PR 27616 at commit [`52fcdf3`](https://github.com/apache/spark/commit/52fcdf3c40cb36928a4cad7b01e9396fbeb9870f).
[GitHub] [spark] AmplabJenkins commented on issue #27670: [SPARK-30937][DOC] Group Hive upgrade guides together
AmplabJenkins commented on issue #27670: [SPARK-30937][DOC] Group Hive upgrade guides together URL: https://github.com/apache/spark/pull/27670#issuecomment-591814471 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution
SparkQA commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution URL: https://github.com/apache/spark/pull/27616#issuecomment-591814237 **[Test build #119002 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119002/testReport)** for PR 27616 at commit [`52fcdf3`](https://github.com/apache/spark/commit/52fcdf3c40cb36928a4cad7b01e9396fbeb9870f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #27670: [SPARK-30937][DOC] Group Hive upgrade guides together
AmplabJenkins commented on issue #27670: [SPARK-30937][DOC] Group Hive upgrade guides together URL: https://github.com/apache/spark/pull/27670#issuecomment-591814481 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23761/ Test PASSed.
[GitHub] [spark] gatorsmile commented on a change in pull request #23495: [SPARK-26503][CORE] Get rid of spark.sql.legacy.timeParser.enabled
gatorsmile commented on a change in pull request #23495: [SPARK-26503][CORE] Get rid of spark.sql.legacy.timeParser.enabled URL: https://github.com/apache/spark/pull/23495#discussion_r384942367 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala ## @@ -1451,109 +1451,6 @@ class JsonSuite extends QueryTest with SharedSQLContext with TestJsonData { }) } - test("backward compatibility") { -withSQLConf(SQLConf.LEGACY_TIME_PARSER_ENABLED.key -> "true") { - // This test we make sure our JSON support can read JSON data generated by previous version - // of Spark generated through toJSON method and JSON data source. - // The data is generated by the following program. - // Here are a few notes: - // - Spark 1.5.0 cannot save timestamp data. So, we manually added timestamp field (col13) - // in the JSON object. - // - For Spark before 1.5.1, we do not generate UDTs. So, we manually added the UDT value to - // JSON objects generated by those Spark versions (col17). - // - If the type is NullType, we do not write data out. Review comment: Based on the latest discussions in https://github.com/apache/spark/pull/27710#discussion_r384584040, we can't silently return wrong results. Backward compatibility is critical.
[GitHub] [spark] SparkQA commented on issue #27670: [SPARK-30937][DOC] Group Hive upgrade guides together
SparkQA commented on issue #27670: [SPARK-30937][DOC] Group Hive upgrade guides together URL: https://github.com/apache/spark/pull/27670#issuecomment-591814025 **[Test build #119014 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119014/testReport)** for PR 27670 at commit [`b7e04b7`](https://github.com/apache/spark/commit/b7e04b7b52b9ff3c14f0863b65f090b973dc4e86).
[GitHub] [spark] bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling.
bmarcott commented on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling. URL: https://github.com/apache/spark/pull/27207#issuecomment-591813279 No problem. I believe what's remaining is just some tests on TaskSetManager. Will remove WIP once I add.
[GitHub] [spark] bmarcott edited a comment on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling.
bmarcott edited a comment on issue #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling. URL: https://github.com/apache/spark/pull/27207#issuecomment-591813279 No problem. I believe what's remaining is just some tests on TaskSetManager. Will remove WIP once I add them.
[GitHub] [spark] xuanyuanking commented on a change in pull request #25620: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files
xuanyuanking commented on a change in pull request #25620: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files URL: https://github.com/apache/spark/pull/25620#discussion_r384940070 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java ## @@ -172,7 +172,7 @@ public ManagedBuffer getBlockData( String appId, String execId, int shuffleId, - int mapId, + long mapId, Review comment: Yes, after this patch we set `mapId` to the `taskAttemptId` of the map task, which is unique within the same SparkContext. See the comment at https://github.com/apache/spark/pull/25620#discussion_r319396089
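To illustrate why the parameter is widened from `int` to `long`, here is a minimal sketch. It assumes only that a task attempt ID is a `Long` that can exceed `Int.MaxValue`; `ShuffleBlockKey` and its `name` format are hypothetical names used for illustration, not the resolver's actual types.

```scala
object ShuffleKeySketch {
  // Hypothetical key type. Only the Int -> Long widening of mapId
  // is taken from the diff above.
  final case class ShuffleBlockKey(shuffleId: Int, mapId: Long, reduceId: Int) {
    def name: String = s"shuffle_${shuffleId}_${mapId}_${reduceId}"
  }

  // Narrowing a large attempt ID to Int silently wraps around,
  // which is why an Int-typed mapId would be unsafe here.
  def narrowed(attemptId: Long): Int = attemptId.toInt
}
```

With a `Long` field, an attempt ID one past `Int.MaxValue` is kept intact in the key, whereas `narrowed` would wrap it to a negative number.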
[GitHub] [spark] AmplabJenkins commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution
AmplabJenkins commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution URL: https://github.com/apache/spark/pull/27616#issuecomment-591811439 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119003/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution
AmplabJenkins commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution URL: https://github.com/apache/spark/pull/27616#issuecomment-591811430 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution
AmplabJenkins removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution URL: https://github.com/apache/spark/pull/27616#issuecomment-591811430 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution
AmplabJenkins removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution URL: https://github.com/apache/spark/pull/27616#issuecomment-591811439 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119003/ Test PASSed.
[GitHub] [spark] cricket007 commented on issue #18209: [SPARK-20992][Scheduler] Add support for Nomad as a scheduler backend
cricket007 commented on issue #18209: [SPARK-20992][Scheduler] Add support for Nomad as a scheduler backend URL: https://github.com/apache/spark/pull/18209#issuecomment-591810902 Follow-up question: what's the `external` folder for, if not non-core packages?
[GitHub] [spark] SparkQA commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution
SparkQA commented on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution URL: https://github.com/apache/spark/pull/27616#issuecomment-591810675 **[Test build #119003 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119003/testReport)** for PR 27616 at commit [`8ff8b71`](https://github.com/apache/spark/commit/8ff8b712725c69f68b6953b22c26896c4e0760b0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution
SparkQA removed a comment on issue #27616: [SPARK-30864] [SQL]add the user guide for Adaptive Query Execution URL: https://github.com/apache/spark/pull/27616#issuecomment-591735908 **[Test build #119003 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119003/testReport)** for PR 27616 at commit [`8ff8b71`](https://github.com/apache/spark/commit/8ff8b712725c69f68b6953b22c26896c4e0760b0).
[GitHub] [spark] HyukjinKwon commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
HyukjinKwon commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning URL: https://github.com/apache/spark/pull/27702#discussion_r384936106 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala ## @@ -104,19 +104,23 @@ object NestedColumnAliasing { /** * Return two maps in order to replace nested fields to aliases. * + * If `exclusiveAttrs` is given, any nested field accessors of these attributes + * won't be considered in nested fields aliasing. + * * 1. ExtractValue -> Alias: A new alias is created for each nested field. * 2. ExprId -> Seq[Alias]: A reference attribute has multiple aliases pointing it. */ - def getAliasSubMap(exprList: Seq[Expression]) + def getAliasSubMap(exprList: Seq[Expression], exclusiveAttrs: Seq[Attribute] = Seq.empty) Review comment: @viirya, just to clarify, we will keep `exclusiveAttrs` instead of `skipAttrs`?
[GitHub] [spark] HyukjinKwon commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
HyukjinKwon commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type URL: https://github.com/apache/spark/pull/27499#issuecomment-591806127 +1 LGTM too
[GitHub] [spark] viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
viirya commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type URL: https://github.com/apache/spark/pull/27499#issuecomment-591804431 Thanks! I will open a JIRA to discuss the typed select API.
[GitHub] [spark] cloud-fan commented on a change in pull request #27710: [SPARK-30960][SQL] add back the legacy date/timestamp format support in CSV/JSON parser
cloud-fan commented on a change in pull request #27710: [SPARK-30960][SQL] add back the legacy date/timestamp format support in CSV/JSON parser URL: https://github.com/apache/spark/pull/27710#discussion_r384933114 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala ## @@ -175,10 +175,30 @@ class UnivocityParser( } case _: TimestampType => (d: String) => - nullSafeDatum(d, name, nullable, options)(timestampFormatter.parse) + nullSafeDatum(d, name, nullable, options) { datum => +try { + timestampFormatter.parse(datum) +} catch { + case NonFatal(e) => +// If fails to parse, then tries the way used in 2.0 and 1.x for backwards +// compatibility. +val str = UTF8String.fromString(datum) +DateTimeUtils.stringToTimestamp(str, options.zoneId).getOrElse(throw e) Review comment: I know they are different implementations, but we care more about the behavior seen by end users. Are there any timestamp strings that can be parsed by `stringToTime` but not by `stringToTimestamp`?
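The fallback pattern in the diff above (try the strict formatter first, fall back to a lenient legacy parser, and rethrow the original error if both fail) can be sketched with plain `java.time`. This is an illustration under stated assumptions: `strict` and `legacyParse` are hypothetical stand-ins, not Spark's `timestampFormatter` or `DateTimeUtils.stringToTimestamp`.

```scala
import java.time.{LocalDate, LocalDateTime}
import java.time.format.DateTimeFormatter
import scala.util.Try
import scala.util.control.NonFatal

object FallbackTimestampParsing {
  // Stand-in for the strict, pattern-based formatter.
  private val strict = DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss")

  // Hypothetical stand-in for a lenient legacy parser:
  // it also accepts a date-only string and maps it to midnight.
  private def legacyParse(s: String): Option[LocalDateTime] =
    Try(LocalDate.parse(s.trim).atStartOfDay()).toOption

  def parse(s: String): LocalDateTime =
    try LocalDateTime.parse(s, strict)
    catch {
      // On failure, try the legacy path; if that fails too,
      // surface the original exception rather than the fallback's.
      case NonFatal(e) => legacyParse(s).getOrElse(throw e)
    }
}
```

Rethrowing the first exception keeps the error message tied to the configured pattern, which is usually what a user needs to see when neither parser accepts the input.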
[GitHub] [spark] mridulm commented on a change in pull request #27583: [SPARK-29149][YARN] Update YARN cluster manager For Stage Level Scheduling
mridulm commented on a change in pull request #27583: [SPARK-29149][YARN] Update YARN cluster manager For Stage Level Scheduling URL: https://github.com/apache/spark/pull/27583#discussion_r384930920 ## File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala ## @@ -166,69 +195,188 @@ private[yarn] class YarnAllocator( private val labelExpression = sparkConf.get(EXECUTOR_NODE_LABEL_EXPRESSION) - // A map to store preferred hostname and possible task numbers running on it. - private var hostToLocalTaskCounts: Map[String, Int] = Map.empty - - // Number of tasks that have locality preferences in active stages - private[yarn] var numLocalityAwareTasks: Int = 0 - // A container placement strategy based on pending tasks' locality preference private[yarn] val containerPlacementStrategy = -new LocalityPreferredContainerPlacementStrategy(sparkConf, conf, resource, resolver) +new LocalityPreferredContainerPlacementStrategy(sparkConf, conf, resolver) + + // The default profile is always present so we need to initialize the datastructures keyed by + // ResourceProfile id to ensure its present if things start running before a request for + // executors could add it. This approach is easier then going and special casing everywhere. 
+ private def initDefaultProfile(): Unit = synchronized { +allocatedHostToContainersMapPerRPId(DEFAULT_RESOURCE_PROFILE_ID) = + new HashMap[String, mutable.Set[ContainerId]]() +runningExecutorsPerResourceProfileId.put(DEFAULT_RESOURCE_PROFILE_ID, mutable.HashSet[String]()) +numExecutorsStartingPerResourceProfileId(DEFAULT_RESOURCE_PROFILE_ID) = new AtomicInteger(0) +targetNumExecutorsPerResourceProfileId(DEFAULT_RESOURCE_PROFILE_ID) = + SchedulerBackendUtils.getInitialTargetExecutorNumber(sparkConf) +rpIdToYarnResource(DEFAULT_RESOURCE_PROFILE_ID) = defaultResource +rpIdToResourceProfile(DEFAULT_RESOURCE_PROFILE_ID) = + ResourceProfile.getOrCreateDefaultProfile(sparkConf) + } + + initDefaultProfile() - def getNumExecutorsRunning: Int = runningExecutors.size() + def getNumExecutorsRunning: Int = synchronized { +runningExecutorsPerResourceProfileId.values.map(_.size).sum + } + + def getNumLocalityAwareTasks: Int = synchronized { +numLocalityAwareTasksPerResourceProfileId.values.sum + } - def getNumReleasedContainers: Int = releasedContainers.size() + def getNumExecutorsStarting: Int = { Review comment: Should this be synchronized on `this`? I was expecting static analysis via `@GuardedBy` to catch this in the build; apparently we don't have that validation. Can you also check the use of the other variables? `targetNumExecutorsPerResourceProfileId`, etc. also seem to have similar issues.
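The concern raised in the review comment above is that every read and write of the per-profile maps should hold the same lock. A minimal sketch of that discipline follows; `ExecutorTracker` and its method names are hypothetical and much simpler than `YarnAllocator`.

```scala
import scala.collection.mutable

// Hypothetical tracker: all state is guarded by `this`, so every
// accessor, including the read-only getters, is synchronized.
class ExecutorTracker {
  private val runningPerProfile = mutable.HashMap[Int, mutable.Set[String]]()
  private val startingPerProfile = mutable.HashMap[Int, Int]()

  def executorStarting(rpId: Int): Unit = synchronized {
    startingPerProfile(rpId) = startingPerProfile.getOrElse(rpId, 0) + 1
  }

  def executorRunning(rpId: Int, execId: String): Unit = synchronized {
    // Move one executor from "starting" to "running" for this profile.
    startingPerProfile(rpId) = math.max(0, startingPerProfile.getOrElse(rpId, 0) - 1)
    runningPerProfile.getOrElseUpdate(rpId, mutable.HashSet[String]()) += execId
  }

  // Getters take the lock too: iterating a mutable map while another
  // thread updates it is unsafe even if each value is atomic.
  def numRunning: Int = synchronized { runningPerProfile.values.map(_.size).sum }
  def numStarting: Int = synchronized { startingPerProfile.values.sum }
}
```

Holding one lock for both maps keeps the starting and running counts consistent with each other, at the cost of serializing all accessors.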
[GitHub] [spark] viirya commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
viirya commented on a change in pull request #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning URL: https://github.com/apache/spark/pull/27702#discussion_r384932661 ## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ## @@ -3394,15 +3395,25 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark } } - test("SPARK-30870: Column pruning shouldn't alias a nested column if it means the whole " + -"structure") { -val df = sql( - """ -|SELECT explodedvalue.field -|FROM VALUES array(named_struct('field', named_struct('a', 1, 'b', 2))) AS (value) -|LATERAL VIEW explode(value) AS explodedvalue - """.stripMargin) -checkAnswer(df, Row(Row(1, 2)) :: Nil) + test("SPARK-30870: Column pruning shouldn't alias a nested column for the whole structure") { +withTable("t") { + val df = sql( +""" + |SELECT value + |FROM VALUES array(named_struct('field', named_struct('a', 1, 'b', 2))) AS (value) +""".stripMargin) + df.write.format("parquet").saveAsTable("t") Review comment: Yes, if the data is not stored as a table, the query becomes a simple LocalRelation.
[GitHub] [spark] AmplabJenkins commented on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration
AmplabJenkins commented on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration URL: https://github.com/apache/spark/pull/27719#issuecomment-591803565 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27567: [SPARK-30822][SQL] Remove semicolon at the end of a sql query
AmplabJenkins removed a comment on issue #27567: [SPARK-30822][SQL] Remove semicolon at the end of a sql query URL: https://github.com/apache/spark/pull/27567#issuecomment-591803577 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23760/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27567: [SPARK-30822][SQL] Remove semicolon at the end of a sql query
AmplabJenkins commented on issue #27567: [SPARK-30822][SQL] Remove semicolon at the end of a sql query URL: https://github.com/apache/spark/pull/27567#issuecomment-591803566 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration
AmplabJenkins removed a comment on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration URL: https://github.com/apache/spark/pull/27719#issuecomment-591803569 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23759/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27567: [SPARK-30822][SQL] Remove semicolon at the end of a sql query
AmplabJenkins commented on issue #27567: [SPARK-30822][SQL] Remove semicolon at the end of a sql query URL: https://github.com/apache/spark/pull/27567#issuecomment-591803577 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23760/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27567: [SPARK-30822][SQL] Remove semicolon at the end of a sql query
AmplabJenkins removed a comment on issue #27567: [SPARK-30822][SQL] Remove semicolon at the end of a sql query URL: https://github.com/apache/spark/pull/27567#issuecomment-591803566 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration
AmplabJenkins removed a comment on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration URL: https://github.com/apache/spark/pull/27719#issuecomment-591803565 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration
AmplabJenkins commented on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration URL: https://github.com/apache/spark/pull/27719#issuecomment-591803569 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23759/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration
SparkQA commented on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration URL: https://github.com/apache/spark/pull/27719#issuecomment-591803189 **[Test build #119012 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119012/testReport)** for PR 27719 at commit [`563ea7d`](https://github.com/apache/spark/commit/563ea7d599ed7b666b8fb54ccb414ae9547371de).
[GitHub] [spark] SparkQA commented on issue #27567: [SPARK-30822][SQL] Remove semicolon at the end of a sql query
SparkQA commented on issue #27567: [SPARK-30822][SQL] Remove semicolon at the end of a sql query URL: https://github.com/apache/spark/pull/27567#issuecomment-591803216 **[Test build #119013 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119013/testReport)** for PR 27567 at commit [`4bca772`](https://github.com/apache/spark/commit/4bca772f5926b75c729e382510ed771d5651bd90).
[GitHub] [spark] cloud-fan commented on issue #27718: [SPARK-30885][SQL][FOLLOW-UP] Fix a hack in ResolveSessionCatalog.TempViewOrV1Table
cloud-fan commented on issue #27718: [SPARK-30885][SQL][FOLLOW-UP] Fix a hack in ResolveSessionCatalog.TempViewOrV1Table URL: https://github.com/apache/spark/pull/27718#issuecomment-591803097
Do we still support `spark_catalog.t`? It seems we still support it, as `SessionCatalogAndTable` will extract the table name `t` from `spark_catalog.t` and pass it to the v1 commands. V1 commands will fill in the default database for the name `t`.
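The resolution path described in this comment can be sketched with a small extractor. This is an illustration only, not Spark's `SessionCatalogAndTable`: `InSessionCatalog` and `toV1Identifier` are hypothetical names, and exact string matching on `spark_catalog` is an assumption made for the sketch.

```scala
object SessionCatalogSketch {
  // Catalog name taken from the comment above.
  val SessionCatalogName = "spark_catalog"

  // Hypothetical extractor: strips a leading `spark_catalog` part and
  // yields the remaining name parts.
  object InSessionCatalog {
    def unapply(nameParts: Seq[String]): Option[Seq[String]] = nameParts match {
      case catalog +: rest if catalog == SessionCatalogName && rest.nonEmpty => Some(rest)
      case _ => None
    }
  }

  // Hypothetical stand-in for a v1 command: fills in the default database
  // when only a table name is left after the catalog part is removed.
  def toV1Identifier(nameParts: Seq[String], defaultDb: String): Option[(String, String)] =
    nameParts match {
      case InSessionCatalog(Seq(table)) => Some((defaultDb, table))
      case InSessionCatalog(Seq(db, table)) => Some((db, table))
      case _ => None
    }
}
```

Under these assumptions, `Seq("spark_catalog", "t")` resolves to the pair of the default database and `t`, which mirrors the behaviour the comment describes.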
[GitHub] [spark] MaxGekk commented on a change in pull request #27710: [SPARK-30960][SQL] add back the legacy date/timestamp format support in CSV/JSON parser
MaxGekk commented on a change in pull request #27710: [SPARK-30960][SQL] add back the legacy date/timestamp format support in CSV/JSON parser URL: https://github.com/apache/spark/pull/27710#discussion_r384931421
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala ##
@@ -175,10 +175,30 @@ class UnivocityParser(
       }
     case _: TimestampType => (d: String) =>
-      nullSafeDatum(d, name, nullable, options)(timestampFormatter.parse)
+      nullSafeDatum(d, name, nullable, options) { datum =>
+        try {
+          timestampFormatter.parse(datum)
+        } catch {
+          case NonFatal(e) =>
+            // If fails to parse, then tries the way used in 2.0 and 1.x for backwards
+            // compatibility.
+            val str = UTF8String.fromString(datum)
+            DateTimeUtils.stringToTimestamp(str, options.zoneId).getOrElse(throw e)

Review comment: `stringToTimestamp` doesn't have the issue, but you implemented a different fallback mechanism. From my point of view, it makes sense to either:
- restore the old function `stringToTime`
- or keep the current code as is.
Fallback to `stringToTimestamp` is a new feature, I think.
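The fallback pattern under discussion (try the primary parser, fall back to a lenient one, rethrow the original error if both fail) can be sketched independently of Spark. The names below are hypothetical and the two parsers are simple `java.time` stand-ins, not `timestampFormatter` or `DateTimeUtils.stringToTimestamp`.

```scala
import java.time.{Instant, LocalDateTime, ZoneId}
import java.time.format.DateTimeFormatter
import scala.util.control.NonFatal

object FallbackTimestampParsing {
  // Stand-in for the strict, pattern-based primary parser.
  private val strict = DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss")

  def parsePrimary(s: String, zone: ZoneId): Instant =
    LocalDateTime.parse(s, strict).atZone(zone).toInstant

  // Stand-in for a lenient legacy parser: accepts a space separator and
  // optional seconds, and returns None instead of throwing.
  def parseLegacy(s: String, zone: ZoneId): Option[Instant] =
    try Some(LocalDateTime.parse(s.trim.replace(' ', 'T')).atZone(zone).toInstant)
    catch { case NonFatal(_) => None }

  // Primary first; on failure try the legacy path, and if that also fails
  // rethrow the primary parser's exception so the error message is unchanged.
  def parse(s: String, zone: ZoneId): Instant =
    try parsePrimary(s, zone)
    catch {
      case NonFatal(e) => parseLegacy(s, zone).getOrElse(throw e)
    }
}
```

Rethrowing the first exception keeps the reported error tied to the configured pattern, which is the same choice the `getOrElse(throw e)` in the diff makes.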
[GitHub] [spark] AmplabJenkins removed a comment on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration
AmplabJenkins removed a comment on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration URL: https://github.com/apache/spark/pull/27719#issuecomment-591801128 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on issue #27650: [SPARK-30902][SQL] Default table provider should be decided by catalog implementations
AmplabJenkins removed a comment on issue #27650: [SPARK-30902][SQL] Default table provider should be decided by catalog implementations URL: https://github.com/apache/spark/pull/27650#issuecomment-591801315 Merged build finished. Test PASSed.
[GitHub] [spark] dongjoon-hyun commented on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration
dongjoon-hyun commented on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration URL: https://github.com/apache/spark/pull/27719#issuecomment-591801910 ok to test
[GitHub] [spark] AmplabJenkins removed a comment on issue #27650: [SPARK-30902][SQL] Default table provider should be decided by catalog implementations
AmplabJenkins removed a comment on issue #27650: [SPARK-30902][SQL] Default table provider should be decided by catalog implementations URL: https://github.com/apache/spark/pull/27650#issuecomment-591801318 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23758/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration
AmplabJenkins removed a comment on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration URL: https://github.com/apache/spark/pull/27719#issuecomment-591800803 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #27650: [SPARK-30902][SQL] Default table provider should be decided by catalog implementations
AmplabJenkins commented on issue #27650: [SPARK-30902][SQL] Default table provider should be decided by catalog implementations URL: https://github.com/apache/spark/pull/27650#issuecomment-591801315 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration
AmplabJenkins commented on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration URL: https://github.com/apache/spark/pull/27719#issuecomment-591801128 Can one of the admins verify this patch?
[GitHub] [spark] tooptoop4 commented on issue #27697: [SPARK-27750] Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service
tooptoop4 commented on issue #27697: [SPARK-27750] Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service URL: https://github.com/apache/spark/pull/27697#issuecomment-591801217 I am doing a data lake, where 1s of files of different schemas get ingested daily. What is the reluctance to merging this?
[GitHub] [spark] AmplabJenkins commented on issue #27650: [SPARK-30902][SQL] Default table provider should be decided by catalog implementations
AmplabJenkins commented on issue #27650: [SPARK-30902][SQL] Default table provider should be decided by catalog implementations URL: https://github.com/apache/spark/pull/27650#issuecomment-591801318 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23758/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27650: [SPARK-30902][SQL] Default table provider should be decided by catalog implementations
SparkQA commented on issue #27650: [SPARK-30902][SQL] Default table provider should be decided by catalog implementations URL: https://github.com/apache/spark/pull/27650#issuecomment-591800947 **[Test build #119011 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119011/testReport)** for PR 27650 at commit [`f724968`](https://github.com/apache/spark/commit/f72496884d51216fbbc2b04b7441815958cb9f29).
[GitHub] [spark] AmplabJenkins commented on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration
AmplabJenkins commented on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration URL: https://github.com/apache/spark/pull/27719#issuecomment-591800803 Can one of the admins verify this patch?
[GitHub] [spark] cloud-fan commented on issue #27650: [SPARK-30902][SQL] Default table provider should be decided by catalog implementations
cloud-fan commented on issue #27650: [SPARK-30902][SQL] Default table provider should be decided by catalog implementations URL: https://github.com/apache/spark/pull/27650#issuecomment-591800454 retest this please
[GitHub] [spark] viirya commented on issue #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning
viirya commented on issue #27702: [SPARK-30955][SQL] Exclude Generate output when aliasing in nested column pruning URL: https://github.com/apache/spark/pull/27702#issuecomment-591800373
> how does this relate to #27517?
This is to fix a bug when pruning nested columns in a Project on top of a Generate. #27517 extends nested column pruning to a single Generate without a Project.
[GitHub] [spark] cloud-fan closed pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
cloud-fan closed pull request #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type URL: https://github.com/apache/spark/pull/27499
[GitHub] [spark] cloud-fan commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type
cloud-fan commented on issue #27499: [SPARK-30590][SQL] Untyped select API cannot take typed column expression that needs input type URL: https://github.com/apache/spark/pull/27499#issuecomment-591800215 thanks, merging to master/3.0!
[GitHub] [spark] iRakson commented on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration
iRakson commented on issue #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration URL: https://github.com/apache/spark/pull/27719#issuecomment-591799948 cc @cloud-fan @dongjoon-hyun
[GitHub] [spark] iRakson opened a new pull request #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration
iRakson opened a new pull request #27719: [SPARK-27619][SQL][FOLLOWUP] Rename Configuration URL: https://github.com/apache/spark/pull/27719
### What changes were proposed in this pull request?
Renamed configuration from `spark.sql.legacy.useHashOnMapType` to `spark.sql.legacy.allowHashOnMapType`.
### Why are the changes needed?
Better readability of configuration.
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
Existing UTs.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27718: [SPARK-30885][SQL][FOLLOW-UP] Fix a hack in ResolveSessionCatalog.TempViewOrV1Table
AmplabJenkins removed a comment on issue #27718: [SPARK-30885][SQL][FOLLOW-UP] Fix a hack in ResolveSessionCatalog.TempViewOrV1Table URL: https://github.com/apache/spark/pull/27718#issuecomment-591799073 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24618: [SPARK-27734][CORE][SQL] Add memory based thresholds for shuffle spill
AmplabJenkins removed a comment on issue #24618: [SPARK-27734][CORE][SQL] Add memory based thresholds for shuffle spill URL: https://github.com/apache/spark/pull/24618#issuecomment-591799071 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27718: [SPARK-30885][SQL][FOLLOW-UP] Fix a hack in ResolveSessionCatalog.TempViewOrV1Table
AmplabJenkins removed a comment on issue #27718: [SPARK-30885][SQL][FOLLOW-UP] Fix a hack in ResolveSessionCatalog.TempViewOrV1Table URL: https://github.com/apache/spark/pull/27718#issuecomment-591799077 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23756/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24618: [SPARK-27734][CORE][SQL] Add memory based thresholds for shuffle spill
AmplabJenkins removed a comment on issue #24618: [SPARK-27734][CORE][SQL] Add memory based thresholds for shuffle spill URL: https://github.com/apache/spark/pull/24618#issuecomment-591799079 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23757/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24618: [SPARK-27734][CORE][SQL] Add memory based thresholds for shuffle spill
AmplabJenkins commented on issue #24618: [SPARK-27734][CORE][SQL] Add memory based thresholds for shuffle spill URL: https://github.com/apache/spark/pull/24618#issuecomment-591799071 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24618: [SPARK-27734][CORE][SQL] Add memory based thresholds for shuffle spill
AmplabJenkins commented on issue #24618: [SPARK-27734][CORE][SQL] Add memory based thresholds for shuffle spill URL: https://github.com/apache/spark/pull/24618#issuecomment-591799079 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23757/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27718: [SPARK-30885][SQL][FOLLOW-UP] Fix a hack in ResolveSessionCatalog.TempViewOrV1Table
AmplabJenkins commented on issue #27718: [SPARK-30885][SQL][FOLLOW-UP] Fix a hack in ResolveSessionCatalog.TempViewOrV1Table URL: https://github.com/apache/spark/pull/27718#issuecomment-591799073 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27718: [SPARK-30885][SQL][FOLLOW-UP] Fix a hack in ResolveSessionCatalog.TempViewOrV1Table
AmplabJenkins commented on issue #27718: [SPARK-30885][SQL][FOLLOW-UP] Fix a hack in ResolveSessionCatalog.TempViewOrV1Table URL: https://github.com/apache/spark/pull/27718#issuecomment-591799077 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23756/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27718: [SPARK-30885][SQL][FOLLOW-UP] Fix a hack in ResolveSessionCatalog.TempViewOrV1Table
SparkQA commented on issue #27718: [SPARK-30885][SQL][FOLLOW-UP] Fix a hack in ResolveSessionCatalog.TempViewOrV1Table URL: https://github.com/apache/spark/pull/27718#issuecomment-591798632 **[Test build #119009 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119009/testReport)** for PR 27718 at commit [`1f48bba`](https://github.com/apache/spark/commit/1f48bbaa4fa874cc58e0e171a07b0a0387fad47a).
[GitHub] [spark] SparkQA commented on issue #24618: [SPARK-27734][CORE][SQL] Add memory based thresholds for shuffle spill
SparkQA commented on issue #24618: [SPARK-27734][CORE][SQL] Add memory based thresholds for shuffle spill URL: https://github.com/apache/spark/pull/24618#issuecomment-591798686 **[Test build #119010 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119010/testReport)** for PR 24618 at commit [`ab410fc`](https://github.com/apache/spark/commit/ab410fc48a039f161693ea81f8d15dbd3041d57f).
[GitHub] [spark] AmplabJenkins removed a comment on issue #24618: [SPARK-27734][CORE][SQL] Add memory based thresholds for shuffle spill
AmplabJenkins removed a comment on issue #24618: [SPARK-27734][CORE][SQL] Add memory based thresholds for shuffle spill URL: https://github.com/apache/spark/pull/24618#issuecomment-574298101 Can one of the admins verify this patch?
[GitHub] [spark] gatorsmile commented on issue #24618: [SPARK-27734][CORE][SQL] Add memory based thresholds for shuffle spill
gatorsmile commented on issue #24618: [SPARK-27734][CORE][SQL] Add memory based thresholds for shuffle spill URL: https://github.com/apache/spark/pull/24618#issuecomment-591798151 add to whitelist