[GitHub] [spark] AmplabJenkins removed a comment on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job
AmplabJenkins removed a comment on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job URL: https://github.com/apache/spark/pull/27784#issuecomment-594335056 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119275/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job
SparkQA removed a comment on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job URL: https://github.com/apache/spark/pull/27784#issuecomment-594318061 **[Test build #119275 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119275/testReport)** for PR 27784 at commit [`930edd6`](https://github.com/apache/spark/commit/930edd68716567dae32d67e9c194c4da8947e01b).
[GitHub] [spark] SparkQA commented on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job
SparkQA commented on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job URL: https://github.com/apache/spark/pull/27784#issuecomment-594334725 **[Test build #119275 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119275/testReport)** for PR 27784 at commit [`930edd6`](https://github.com/apache/spark/commit/930edd68716567dae32d67e9c194c4da8947e01b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] dbtsai commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
dbtsai commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#discussion_r387455194 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ## @@ -635,3 +636,18 @@ object DataSourceStrategy { } } } + +case class PushDownCol(name: String, dataType: DataType) Review comment: WDYT now?
[GitHub] [spark] SparkQA removed a comment on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala
SparkQA removed a comment on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala URL: https://github.com/apache/spark/pull/27783#issuecomment-594284394 **[Test build #119272 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119272/testReport)** for PR 27783 at commit [`1309a0a`](https://github.com/apache/spark/commit/1309a0a966fd83997688ba9c879b0878c9f0b383).
[GitHub] [spark] yma11 commented on a change in pull request #27546: [SPARK-30773][ML]Support NativeBlas for level-1 routines
yma11 commented on a change in pull request #27546: [SPARK-30773][ML]Support NativeBlas for level-1 routines URL: https://github.com/apache/spark/pull/27546#discussion_r387454997 ## File path: mllib-local/pom.xml ## @@ -61,13 +61,17 @@ This spark-tags test-dep is needed even though it isn't used in this module, otherwise testing-cmds that exclude them will yield errors. --> + Review comment: Reverted the configuration-related commits; 256 is now used as the `nativeL1Threshold`.
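A minimal sketch of how a fixed threshold such as the `nativeL1Threshold` of 256 mentioned above might be used to pick a BLAS backend for level-1 routines. Only the name `nativeL1Threshold` and the value 256 come from the comment; the `Backend` types, `chooseBackend`, and the choice of `<` at the boundary are assumptions made for illustration, not the PR's actual code.

```scala
object BlasDispatchSketch {
  // Value taken from the review comment; everything else here is hypothetical.
  val nativeL1Threshold: Int = 256

  sealed trait Backend
  case object JavaBackend extends Backend
  case object NativeBackend extends Backend

  // Assumption: vectors shorter than the threshold stay on the JVM
  // implementation, since the overhead of a native call can outweigh
  // its benefit for small inputs.
  def chooseBackend(vectorSize: Int): Backend =
    if (vectorSize < nativeL1Threshold) JavaBackend else NativeBackend

  def main(args: Array[String]): Unit = {
    println(chooseBackend(16))
    println(chooseBackend(1024))
  }
}
```

Hard-coding the threshold, as the comment describes, avoids exposing a tuning knob; the trade-off is that the value cannot be adjusted per deployment.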
[GitHub] [spark] AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594327792 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala
AmplabJenkins removed a comment on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala URL: https://github.com/apache/spark/pull/27783#issuecomment-594333926 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119272/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594327794 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24017/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala
AmplabJenkins removed a comment on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala URL: https://github.com/apache/spark/pull/27783#issuecomment-594333919 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala
AmplabJenkins commented on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala URL: https://github.com/apache/spark/pull/27783#issuecomment-594333926 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119272/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala
AmplabJenkins commented on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala URL: https://github.com/apache/spark/pull/27783#issuecomment-594333919 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala
SparkQA commented on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala URL: https://github.com/apache/spark/pull/27783#issuecomment-59453 **[Test build #119272 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119272/testReport)** for PR 27783 at commit [`1309a0a`](https://github.com/apache/spark/commit/1309a0a966fd83997688ba9c879b0878c9f0b383). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] dongjoon-hyun edited a comment on issue #27746: [SPARK-30994][CORE] Update xerces to 2.12.0
dongjoon-hyun edited a comment on issue #27746: [SPARK-30994][CORE] Update xerces to 2.12.0 URL: https://github.com/apache/spark/pull/27746#issuecomment-594327760 @srowen. Unfortunately, the five Maven Jenkins jobs skip the Python Packaging test. Only the SBT jobs have Python Packaging test coverage, and those SBT jobs still show 2.9.1 in `master` and `branch-3.0`. It seems that we need to recover this test coverage as a follow-up. ``` copying deps/jars/xercesImpl-2.9.1.jar -> pyspark-3.0.0.dev0/deps/jars ``` We need to choose one of the following: 1. Fix the SBT build. 2. Add test coverage to one of the Maven Jenkins jobs. (1) is the correct way, but (2) is also okay.
[GitHub] [spark] dongjoon-hyun edited a comment on issue #27746: [SPARK-30994][CORE] Update xerces to 2.12.0
dongjoon-hyun edited a comment on issue #27746: [SPARK-30994][CORE] Update xerces to 2.12.0 URL: https://github.com/apache/spark/pull/27746#issuecomment-594327760 @srowen. Unfortunately, the five Maven Jenkins jobs skip the Python Packaging test. Only the SBT jobs have Python Packaging test coverage, and those SBT jobs still show 2.9.1 in `master` and `branch-3.0`. It seems that we need to recover this test coverage as a follow-up. ``` copying deps/jars/xercesImpl-2.9.1.jar -> pyspark-3.0.0.dev0/deps/jars ``` We need to choose one of the following: 1. Fix the SBT build. 2. Add test coverage to one of the Maven Jenkins jobs. (1) is the correct way, but (2) is also okay.
[GitHub] [spark] AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594327792 Merged build finished. Test PASSed.
[GitHub] [spark] dongjoon-hyun edited a comment on issue #27746: [SPARK-30994][CORE] Update xerces to 2.12.0
dongjoon-hyun edited a comment on issue #27746: [SPARK-30994][CORE] Update xerces to 2.12.0 URL: https://github.com/apache/spark/pull/27746#issuecomment-594327760 @srowen. Unfortunately, the five Maven Jenkins jobs skip the Python Packaging test. Only the SBT jobs have Python Packaging test coverage, and those SBT jobs still show 2.9.1 in `master` and `branch-3.0`. It seems that we need to recover this test coverage as a follow-up. ``` copying deps/jars/xercesImpl-2.9.1.jar -> pyspark-3.0.0.dev0/deps/jars ```
[GitHub] [spark] AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594327794 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24017/ Test PASSed.
[GitHub] [spark] dongjoon-hyun commented on issue #27746: [SPARK-30994][CORE] Update xerces to 2.12.0
dongjoon-hyun commented on issue #27746: [SPARK-30994][CORE] Update xerces to 2.12.0 URL: https://github.com/apache/spark/pull/27746#issuecomment-594327760 @srowen. Unfortunately, the five Maven Jenkins jobs skip Python Packaging. Only the SBT jobs have Python Packaging test coverage, and those SBT jobs still show 2.9.1 in `master` and `branch-3.0`. It seems that we need to recover this test coverage as a follow-up. ``` copying deps/jars/xercesImpl-2.9.1.jar -> pyspark-3.0.0.dev0/deps/jars ```
[GitHub] [spark] SparkQA commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
SparkQA commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594327472 **[Test build #119277 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119277/testReport)** for PR 27778 at commit [`4bd58d3`](https://github.com/apache/spark/commit/4bd58d3e798c957c8fe92ea05801996459517e88).
[GitHub] [spark] cloud-fan commented on a change in pull request #27752: [SPARK-30999][SQL] Don't cancel a QueryStageExec which failed before call doMaterialize
cloud-fan commented on a change in pull request #27752: [SPARK-30999][SQL] Don't cancel a QueryStageExec which failed before call doMaterialize URL: https://github.com/apache/spark/pull/27752#discussion_r387449025 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala ## @@ -192,13 +197,12 @@ case class AdaptiveSparkPlanExec( stage.resultOption = Some(res) case StageFailure(stage, ex) => errors.append( - new SparkException(s"Failed to materialize query stage: ${stage.treeString}." + -s" and the cause is ${ex.getMessage}", ex)) + new SparkException(s"Failed to materialize query stage: ${stage.treeString}.", ex)) Review comment: Note: the Java convention is to pass the underlying exception as the "cause", instead of embedding its error message in the current exception's message.
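The convention described in the review comment above can be illustrated with a small, self-contained sketch. `StageException` is a hypothetical stand-in for `SparkException`, and `wrap` is an illustrative helper; neither is Spark's actual code. The only JDK facts relied on are that `Exception(message, cause)` records the cause and that `getCause` returns it.

```scala
// Hypothetical stand-in for SparkException; not Spark's actual class.
class StageException(message: String, cause: Throwable)
  extends Exception(message, cause)

object CauseChainingSketch {
  // Wraps a lower-level failure by passing it as the cause instead of
  // concatenating its message into the new exception's message.
  def wrap(stageDescription: String, ex: Throwable): StageException =
    new StageException(s"Failed to materialize query stage: ${stageDescription}.", ex)

  def main(args: Array[String]): Unit = {
    val root = new IllegalStateException("shuffle fetch failed")
    val wrapped = wrap("ShuffleQueryStage 0", root)
    // The original exception remains reachable through getCause.
    println(wrapped.getMessage)
    println(wrapped.getCause.getMessage)
  }
}
```

Because the cause is attached rather than flattened into a string, callers can still inspect its type and stack trace.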
[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function URL: https://github.com/apache/spark/pull/27759#issuecomment-594326787 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
AmplabJenkins removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function URL: https://github.com/apache/spark/pull/27759#issuecomment-594326795 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119263/ Test PASSed.
[GitHub] [spark] HyukjinKwon closed pull request #27782: [SPARK-30289][FOLLOWUP][DOC] Update the migration guide for `spark.sql.legacy.ctePrecedencePolicy`
HyukjinKwon closed pull request #27782: [SPARK-30289][FOLLOWUP][DOC] Update the migration guide for `spark.sql.legacy.ctePrecedencePolicy` URL: https://github.com/apache/spark/pull/27782
[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function URL: https://github.com/apache/spark/pull/27759#issuecomment-594326787 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
AmplabJenkins commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function URL: https://github.com/apache/spark/pull/27759#issuecomment-594326795 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119263/ Test PASSed.
[GitHub] [spark] HyukjinKwon commented on issue #27782: [SPARK-30289][FOLLOWUP][DOC] Update the migration guide for `spark.sql.legacy.ctePrecedencePolicy`
HyukjinKwon commented on issue #27782: [SPARK-30289][FOLLOWUP][DOC] Update the migration guide for `spark.sql.legacy.ctePrecedencePolicy` URL: https://github.com/apache/spark/pull/27782#issuecomment-594326763 Merged to master and branch-3.0.
[GitHub] [spark] zhengruifeng commented on issue #27546: [SPARK-30773][ML]Support NativeBlas for level-1 routines
zhengruifeng commented on issue #27546: [SPARK-30773][ML]Support NativeBlas for level-1 routines URL: https://github.com/apache/spark/pull/27546#issuecomment-594326474 Should `mllib.linalg.BLAS` also be changed? Some algorithms are currently implemented on the `.mllib` side.
[GitHub] [spark] SparkQA removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function
SparkQA removed a comment on issue #27759: [SPARK-31008][SQL]Support json_array_length function URL: https://github.com/apache/spark/pull/27759#issuecomment-594236428 **[Test build #119263 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119263/testReport)** for PR 27759 at commit [`d8ec950`](https://github.com/apache/spark/commit/d8ec9504e27a4097d7b0997d5601105607f26dec).
[GitHub] [spark] SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function
SparkQA commented on issue #27759: [SPARK-31008][SQL]Support json_array_length function URL: https://github.com/apache/spark/pull/27759#issuecomment-594326269 **[Test build #119263 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119263/testReport)** for PR 27759 at commit [`d8ec950`](https://github.com/apache/spark/commit/d8ec9504e27a4097d7b0997d5601105607f26dec). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class LengthOfJsonArray(child: Expression)`
[GitHub] [spark] AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594325905 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594325912 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24016/ Test PASSed.
[GitHub] [spark] dbtsai commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
dbtsai commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#discussion_r387447770 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ## @@ -437,61 +437,76 @@ object DataSourceStrategy { } } + /** + * Find the column name of an expression that can be pushed down. + */ + private[sql] def pushDownColName(e: Expression): Option[String] = { Review comment: I renamed it to make it cleaner.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
HyukjinKwon commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#discussion_r387447685 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ## @@ -635,3 +636,18 @@ object DataSourceStrategy { } } } + +case class PushDownCol(name: String, dataType: DataType) Review comment: I see. What about doing it as:
```scala
case expressions.Contains(e: Expression, Literal(v: UTF8String, StringType))
    if PushDownCol.unapply(e).isDefined =>
  e.dataType ...
  val Some(name) = PushDownCol.unapply(e)
  ...
```
I faced a similar problem before and worked around it as above.
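A runnable sketch of the guard-plus-`unapply` workaround suggested in the comment above, using a toy expression tree instead of Catalyst's. `Expr`, `Attr`, `Lit`, `Contains`, the extractor body, and the `translate` helper are all hypothetical; only the overall shape (bind `e` in the pattern, check `PushDownCol.unapply(e).isDefined` in the guard, then call `unapply` again to recover the name) follows the comment.

```scala
// Toy expression tree for illustration; not Catalyst's Expression hierarchy.
sealed trait Expr
final case class Attr(name: String) extends Expr
final case class Lit(value: String) extends Expr
final case class Contains(left: Expr, right: Expr) extends Expr

// Hypothetical extractor: yields the column name when `e` can be pushed down.
object PushDownCol {
  def unapply(e: Expr): Option[String] = e match {
    case Attr(name) => Some(name)
    case _          => None
  }
}

object GuardSketch {
  // The guard keeps `e` itself bound (so its other properties stay accessible),
  // and the name is recovered with a second call to unapply.
  def translate(expr: Expr): Option[String] = expr match {
    case Contains(e, Lit(v)) if PushDownCol.unapply(e).isDefined =>
      val name = PushDownCol.unapply(e).get // safe: the guard checked isDefined
      Some(s"StringContains($name, $v)")
    case _ => None
  }
}
```

The cost of this form is that `unapply` runs twice per match; it is a trade made to keep both the original expression and the extracted name in scope.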
[GitHub] [spark] HyukjinKwon commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
HyukjinKwon commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#discussion_r387447772

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ##
@@ -635,3 +636,18 @@ object DataSourceStrategy {
     }
   }
 }
+
+case class PushDownCol(name: String, dataType: DataType)

Review comment:
at https://github.com/apache/spark/blob/0032d85153e34b9ac69598b7dff530094ed0f640/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala#L245-L248
[GitHub] [spark] AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594325905 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594325912 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24016/ Test PASSed.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
HyukjinKwon commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#discussion_r387447685

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ##
@@ -635,3 +636,18 @@ object DataSourceStrategy {
     }
   }
 }
+
+case class PushDownCol(name: String, dataType: DataType)

Review comment:
I see. What about doing it as:

```scala
case expressions.Contains(e: Expression, Literal(v: UTF8String, StringType))
    if PushDownCol.unapply(name).isDefined =>
  e.dataType
  ...
  name
  ...
```

I faced a similar problem before and worked around it as above.
[GitHub] [spark] cloud-fan commented on issue #27782: [SPARK-30289][FOLLOWUP][DOC] Update the migration guide for `spark.sql.legacy.ctePrecedencePolicy`
cloud-fan commented on issue #27782: [SPARK-30289][FOLLOWUP][DOC] Update the migration guide for `spark.sql.legacy.ctePrecedencePolicy` URL: https://github.com/apache/spark/pull/27782#issuecomment-594325199 LGTM
[GitHub] [spark] AmplabJenkins removed a comment on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots
AmplabJenkins removed a comment on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots URL: https://github.com/apache/spark/pull/27780#issuecomment-594324397 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119262/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots
AmplabJenkins removed a comment on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots URL: https://github.com/apache/spark/pull/27780#issuecomment-594324392 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots
AmplabJenkins commented on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots URL: https://github.com/apache/spark/pull/27780#issuecomment-594324397 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119262/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots
AmplabJenkins commented on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots URL: https://github.com/apache/spark/pull/27780#issuecomment-594324392 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594323923 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24015/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594323914 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots
SparkQA removed a comment on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots URL: https://github.com/apache/spark/pull/27780#issuecomment-594234095 **[Test build #119262 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119262/testReport)** for PR 27780 at commit [`f7326f9`](https://github.com/apache/spark/commit/f7326f916ee8b3eba719553b25a0a77b33f5586c).
[GitHub] [spark] AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594323923 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24015/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots
SparkQA commented on issue #27780: [SPARK-31026] [SQL] [test-hive1.2] Parquet predicate pushdown on columns with dots
URL: https://github.com/apache/spark/pull/27780#issuecomment-594323926

**[Test build #119262 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119262/testReport)** for PR 27780 at commit [`f7326f9`](https://github.com/apache/spark/commit/f7326f916ee8b3eba719553b25a0a77b33f5586c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `public final class SparkFilterApi`
[GitHub] [spark] AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594323914 Merged build finished. Test PASSed.
[GitHub] [spark] dbtsai commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
dbtsai commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#discussion_r387446236

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ##
@@ -635,3 +636,18 @@ object DataSourceStrategy {
     }
   }
 }
+
+case class PushDownCol(name: String, dataType: DataType)

Review comment:
We need it to keep the `dataType`.
[GitHub] [spark] SparkQA commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
SparkQA commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable URL: https://github.com/apache/spark/pull/27778#issuecomment-594323662 **[Test build #119276 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119276/testReport)** for PR 27778 at commit [`89fe568`](https://github.com/apache/spark/commit/89fe568062fc42c06ff0fc21889b0da547826529).
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
dongjoon-hyun commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#discussion_r387446016

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ##
@@ -437,61 +437,76 @@ object DataSourceStrategy {
     }
   }
+  /**
+   * Find the column name of an expression that can be pushed down.
+   */
+  private[sql] def pushDownColName(e: Expression): Option[String] = {

Review comment:
Ur, the last commit is quite different from the sample code above. I'm not sure whether the last commit is better.
[GitHub] [spark] liangxs commented on issue #27050: [SPARK-30388][Core] Mark running map stages of finished job as finished, and cancel running tasks
liangxs commented on issue #27050: [SPARK-30388][Core] Mark running map stages of finished job as finished, and cancel running tasks URL: https://github.com/apache/spark/pull/27050#issuecomment-594323104 @tgravescs @jiangxb1987 Thanks very much for your review!
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
dongjoon-hyun commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#discussion_r387445780

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ##
@@ -635,3 +636,18 @@ object DataSourceStrategy {
     }
   }
 }
+
+case class PushDownCol(name: String, dataType: DataType)

Review comment:
Ur, do we need a new case class for this? @HyukjinKwon, is this aligned with your recommendation? It seems to add new complexity.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
dongjoon-hyun commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#discussion_r387445483

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ##
@@ -635,3 +636,18 @@ object DataSourceStrategy {
     }
   }
 }
+
+case class PushDownCol(name: String, dataType: DataType)
+
+/**
+ * Find the column name of an expression that can be pushed down.
+ */
+object PushDownCol {

Review comment:
`PushDownCol` sounds like a `Column` rather than a name.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
HyukjinKwon commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#discussion_r387445339

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ##
@@ -437,61 +437,76 @@ object DataSourceStrategy {
     }
   }
+  /**
+   * Find the column name of an expression that can be pushed down.
+   */
+  private[sql] def pushDownColName(e: Expression): Option[String] = {

Review comment:
Oh, thanks, @dongjoon-hyun. I didn't see your comment before my comment.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
HyukjinKwon commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#discussion_r387445266

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ##
@@ -437,61 +437,76 @@ object DataSourceStrategy {
     }
   }
+  /**
+   * Find the column name of an expression that can be pushed down.
+   */
+  private[sql] def pushDownColName(e: Expression): Option[String] = {

Review comment:
I was thinking the pattern match seems better: it gives a smaller diff and is more consistent. There's also a similar example at https://github.com/apache/spark/blob/0032d85153e34b9ac69598b7dff530094ed0f640/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala#L194-L203

It might be best to rename `PushDownColName` to something like `PushableColumnName`, but no strong preference.
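For readers unfamiliar with the extractor-object style discussed in this comment, here is a minimal, self-contained Scala sketch of the general technique: an object with an `unapply` method that returns `Option[String]` can be used directly inside a `case` pattern. The expression types (`Expr`, `Attr`, `Lit`, `EqualTo`, `GetField`), the `translate` helper and the body of `PushableColumnName` are assumptions made for illustration; they are not Spark's Catalyst classes and not the code in the PR.

```scala
// Toy stand-ins for an expression tree; NOT Spark's classes.
sealed trait Expr
final case class Attr(name: String) extends Expr
final case class Lit(value: Any) extends Expr
final case class EqualTo(left: Expr, right: Expr) extends Expr
final case class GetField(child: Expr, field: String) extends Expr

// Hypothetical extractor: succeeds only for plain attribute references.
object PushableColumnName {
  def unapply(e: Expr): Option[String] = e match {
    case Attr(name) => Some(name)
    case _          => None
  }
}

object ExtractorDemo {
  // Render a predicate as a filter string, or None if it cannot be pushed down.
  def translate(e: Expr): Option[String] = e match {
    case EqualTo(PushableColumnName(name), Lit(v)) => Some(s"$name = $v")
    case EqualTo(Lit(v), PushableColumnName(name)) => Some(s"$name = $v")
    case _                                         => None
  }
}
```

With this shape, each `case` only changes where the attribute pattern used to be, which is one way to read the "smaller diff" argument; supporting more kinds of column reference would mean changing `unapply` only.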
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
dongjoon-hyun commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#discussion_r387444902

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ##
@@ -437,61 +437,76 @@ object DataSourceStrategy {
     }
   }
+  /**
+   * Find the column name of an expression that can be pushed down.
+   */
+  private[sql] def pushDownColName(e: Expression): Option[String] = {

Review comment:
+1 for removal.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
dongjoon-hyun commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#discussion_r387444597

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ##
@@ -437,61 +437,76 @@ object DataSourceStrategy {
     }
   }
+  /**
+   * Find the column name of an expression that can be pushed down.
+   */
+  private[sql] def pushDownColName(e: Expression): Option[String] = {

Review comment:
+1 for @HyukjinKwon's suggestion and @dbtsai's new code.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala
AmplabJenkins removed a comment on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala URL: https://github.com/apache/spark/pull/27783#issuecomment-594318305 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119270/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27537: [SPARK-30668][SQL][FOLLOWUP] Raise exception instead of silent change for new DateFormatter
AmplabJenkins removed a comment on issue #27537: [SPARK-30668][SQL][FOLLOWUP] Raise exception instead of silent change for new DateFormatter URL: https://github.com/apache/spark/pull/27537#issuecomment-594318518 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119268/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
SparkQA removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-594311197 **[Test build #119274 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119274/testReport)** for PR 27245 at commit [`0b0f723`](https://github.com/apache/spark/commit/0b0f7231b00a914843a42f2c02673011536a1e5c).
[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-594316796 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119274/ Test PASSed.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
HyukjinKwon commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#discussion_r387442798

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ##
@@ -437,61 +437,76 @@ object DataSourceStrategy {
     }
   }
+  /**
+   * Find the column name of an expression that can be pushed down.
+   */
+  private[sql] def pushDownColName(e: Expression): Option[String] = {

Review comment:
I understand that `private[sql]` scopes the method explicitly, which makes sense in a way. However, the current convention was adopted across the codebase as of SPARK-16964, which also makes sense. What about sticking to the existing convention, as the current codebase does, and changing it globally later if it turns out to be problematic?
[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-594316791 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala
SparkQA removed a comment on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala URL: https://github.com/apache/spark/pull/27783#issuecomment-594280123 **[Test build #119270 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119270/testReport)** for PR 27783 at commit [`1162d90`](https://github.com/apache/spark/commit/1162d90393d0967679c08c63356fd63d2f2d0075).
[GitHub] [spark] AmplabJenkins removed a comment on issue #27537: [SPARK-30668][SQL][FOLLOWUP] Raise exception instead of silent change for new DateFormatter
AmplabJenkins removed a comment on issue #27537: [SPARK-30668][SQL][FOLLOWUP] Raise exception instead of silent change for new DateFormatter URL: https://github.com/apache/spark/pull/27537#issuecomment-594318513 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job
AmplabJenkins removed a comment on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job URL: https://github.com/apache/spark/pull/27784#issuecomment-594316686 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24014/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27537: [SPARK-30668][SQL][FOLLOWUP] Raise exception instead of silent change for new DateFormatter
SparkQA removed a comment on issue #27537: [SPARK-30668][SQL][FOLLOWUP] Raise exception instead of silent change for new DateFormatter URL: https://github.com/apache/spark/pull/27537#issuecomment-594270431 **[Test build #119268 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119268/testReport)** for PR 27537 at commit [`46dbf9b`](https://github.com/apache/spark/commit/46dbf9b8c80cf607754a35b922a163efe487374d).
[GitHub] [spark] AmplabJenkins removed a comment on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala
AmplabJenkins removed a comment on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala URL: https://github.com/apache/spark/pull/27783#issuecomment-594318299 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job
AmplabJenkins removed a comment on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job URL: https://github.com/apache/spark/pull/27784#issuecomment-594316679 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27537: [SPARK-30668][SQL][FOLLOWUP] Raise exception instead of silent change for new DateFormatter
AmplabJenkins commented on issue #27537: [SPARK-30668][SQL][FOLLOWUP] Raise exception instead of silent change for new DateFormatter URL: https://github.com/apache/spark/pull/27537#issuecomment-594318513 Merged build finished. Test PASSed.
[GitHub] [spark] zhengruifeng commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
zhengruifeng commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-594318560 Merged to master, thanks all!
[GitHub] [spark] zhengruifeng closed pull request #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
zhengruifeng closed pull request #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245
[GitHub] [spark] AmplabJenkins commented on issue #27537: [SPARK-30668][SQL][FOLLOWUP] Raise exception instead of silent change for new DateFormatter
AmplabJenkins commented on issue #27537: [SPARK-30668][SQL][FOLLOWUP] Raise exception instead of silent change for new DateFormatter URL: https://github.com/apache/spark/pull/27537#issuecomment-594318518 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119268/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala
AmplabJenkins commented on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala URL: https://github.com/apache/spark/pull/27783#issuecomment-594318299 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala
AmplabJenkins commented on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala URL: https://github.com/apache/spark/pull/27783#issuecomment-594318305 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119270/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala
SparkQA commented on issue #27783: [SPARK-30913][CORE] Add version information to the configuration of Tests.scala URL: https://github.com/apache/spark/pull/27783#issuecomment-594318065 **[Test build #119270 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119270/testReport)** for PR 27783 at commit [`1162d90`](https://github.com/apache/spark/commit/1162d90393d0967679c08c63356fd63d2f2d0075). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job
SparkQA commented on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one job URL: https://github.com/apache/spark/pull/27784#issuecomment-594318061 **[Test build #119275 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119275/testReport)** for PR 27784 at commit [`930edd6`](https://github.com/apache/spark/commit/930edd68716567dae32d67e9c194c4da8947e01b).
[GitHub] [spark] SparkQA commented on issue #27537: [SPARK-30668][SQL][FOLLOWUP] Raise exception instead of silent change for new DateFormatter
SparkQA commented on issue #27537: [SPARK-30668][SQL][FOLLOWUP] Raise exception instead of silent change for new DateFormatter URL: https://github.com/apache/spark/pull/27537#issuecomment-594317806 **[Test build #119268 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119268/testReport)** for PR 27537 at commit [`46dbf9b`](https://github.com/apache/spark/commit/46dbf9b8c80cf607754a35b922a163efe487374d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-594316796 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119274/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-594316791 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one pass
AmplabJenkins commented on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one pass URL: https://github.com/apache/spark/pull/27784#issuecomment-594316679 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one pass
AmplabJenkins commented on issue #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one pass URL: https://github.com/apache/spark/pull/27784#issuecomment-594316686 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24014/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
SparkQA commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-594316568 **[Test build #119274 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119274/testReport)** for PR 27245 at commit [`0b0f723`](https://github.com/apache/spark/commit/0b0f7231b00a914843a42f2c02673011536a1e5c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] zhengruifeng opened a new pull request #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one pass
zhengruifeng opened a new pull request #27784: [SPARK-31032][ML] GMM compute summary and update distributions in one pass
URL: https://github.com/apache/spark/pull/27784

### What changes were proposed in this pull request?
1. Compute the summary and update the distributions in one pass.
2. Remove the logic that checks `shouldDistributeGaussians`.

### Why are the changes needed?
In the current implementation, GMM needs to trigger two jobs per iteration:
1. one to compute the summary;
2. if `shouldDistributeGaussians = ((k - 1.0) / k) * numFeatures > 25.0`, another to update the distributions.

`shouldDistributeGaussians` is almost always true in practice, since numFeatures is likely to be greater than 25. A single job can perform both computations.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Existing test suites.
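To make the threshold quoted in the description above concrete, here is a minimal standalone sketch that evaluates the same expression for a few values of `k` and `numFeatures`. The object name `GmmHeuristic` and the sample inputs are assumptions made for illustration; only the expression itself comes from the PR description, and this is not the Spark implementation.

```scala
// Illustrative sketch only: GmmHeuristic is a hypothetical name, not a Spark class.
object GmmHeuristic {
  // The expression quoted in the PR description, evaluated as-is.
  def shouldDistributeGaussians(k: Int, numFeatures: Int): Boolean =
    ((k - 1.0) / k) * numFeatures > 25.0

  def main(args: Array[String]): Unit = {
    // Sample (k, numFeatures) pairs chosen for illustration.
    val samples = Seq((2, 50), (2, 51), (10, 28), (1, 1000))
    samples.foreach { case (k, n) =>
      println(s"k=$k numFeatures=$n -> ${shouldDistributeGaussians(k, n)}")
    }
  }
}
```

Note that the factor `(k - 1.0) / k` scales the feature count, so for `k = 2` the expression only becomes true once `numFeatures` exceeds 50, and for `k = 1` it is never true.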
[GitHub] [spark] AmplabJenkins commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-594311520 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24013/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-594311515 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-594311515 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-594311520 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24013/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
SparkQA commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-594311197 **[Test build #119274 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119274/testReport)** for PR 27245 at commit [`0b0f723`](https://github.com/apache/spark/commit/0b0f7231b00a914843a42f2c02673011536a1e5c).
[GitHub] [spark] dbtsai commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
dbtsai commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#discussion_r387434600
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ##
@@ -437,61 +437,76 @@ object DataSourceStrategy {
     }
   }
+  /**
+   * Find the column name of an expression that can be pushed down.
+   */
+  private[sql] def pushDownColName(e: Expression): Option[String] = {
Review comment: With
```scala
object PushDownColName {
  def unapply(e: Expression): Option[String] = {
    def helper(e: Expression) = e match {
      case a: Attribute => Some(a.name)
      case _ => None
    }
    helper(e)
  }
}
```
the following code can be written
```scala
case expressions.EqualTo(PushDownColName(name), Literal(v, t)) =>
  Some(sources.EqualTo(name, convertToScala(v, t)))
```
instead of
```scala
case expressions.EqualTo(e: Expression, Literal(v, t)) =>
  pushDownColName(e).map(sources.EqualTo(_, convertToScala(v, t)))
```
I don't have a strong preference about it. What do others think?
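The extractor-object idea in the review comment above can be tried without a Spark dependency. The sketch below is a simplified illustration: `Expression`, `Attribute`, `Literal`, `EqualTo`, and `SourceEqualTo` are hypothetical stand-ins defined here, not Catalyst's or the data source API's real classes (the real `Literal` also carries a data type, and the real code converts values with `convertToScala`).

```scala
// Hypothetical stand-ins for Catalyst expression classes, so the sketch runs on its own.
sealed trait Expression
final case class Attribute(name: String) extends Expression
final case class Literal(value: Any) extends Expression
final case class EqualTo(left: Expression, right: Expression) extends Expression

// Hypothetical stand-in for a data source filter.
final case class SourceEqualTo(attribute: String, value: Any)

object PushDownColName {
  // Extractor: matches only expressions that name a column.
  def unapply(e: Expression): Option[String] = e match {
    case a: Attribute => Some(a.name)
    case _ => None
  }
}

object ExtractorDemo {
  // The extractor is used directly inside the pattern, so a non-matching
  // left-hand side falls through to the default case.
  def translate(e: Expression): Option[SourceEqualTo] = e match {
    case EqualTo(PushDownColName(name), Literal(v)) => Some(SourceEqualTo(name, v))
    case _ => None
  }

  def main(args: Array[String]): Unit = {
    println(translate(EqualTo(Attribute("age"), Literal(30))))
    println(translate(EqualTo(Literal(1), Literal(30))))
  }
}
```

With an extractor, the column-name check becomes part of the pattern itself; with the `Option`-returning helper, the pattern matches any expression and the check happens in the case body via `map`. Both forms yield the same result here, which is consistent with the reviewer having no strong preference.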
[GitHub] [spark] zhengruifeng commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
zhengruifeng commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-594309976 retest this please
[GitHub] [spark] AmplabJenkins commented on issue #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage
AmplabJenkins commented on issue #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage URL: https://github.com/apache/spark/pull/27690#issuecomment-594305785 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage
AmplabJenkins removed a comment on issue #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage URL: https://github.com/apache/spark/pull/27690#issuecomment-594305785 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage
AmplabJenkins removed a comment on issue #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage URL: https://github.com/apache/spark/pull/27690#issuecomment-594305794 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24012/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage
AmplabJenkins commented on issue #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage URL: https://github.com/apache/spark/pull/27690#issuecomment-594305794 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24012/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage
SparkQA commented on issue #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage URL: https://github.com/apache/spark/pull/27690#issuecomment-594305442 **[Test build #119273 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119273/testReport)** for PR 27690 at commit [`76a189e`](https://github.com/apache/spark/commit/76a189e7070c0e279e599f88f97fe76f218e055b).
[GitHub] [spark] maryannxue commented on a change in pull request #27752: [SPARK-30999][SQL] Don't cancel a QueryStageExec which failed before call doMaterialize
maryannxue commented on a change in pull request #27752: [SPARK-30999][SQL] Don't cancel a QueryStageExec which failed before call doMaterialize
URL: https://github.com/apache/spark/pull/27752#discussion_r387429708
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala ##
@@ -192,13 +197,12 @@ case class AdaptiveSparkPlanExec(
             stage.resultOption = Some(res)
           case StageFailure(stage, ex) =>
             errors.append(
-              new SparkException(s"Failed to materialize query stage: ${stage.treeString}." +
-                s" and the cause is ${ex.getMessage}", ex))
+              new SparkException(s"Failed to materialize query stage: ${stage.treeString}.", ex))
Review comment: This is when we need to change the tests.
[GitHub] [spark] moomindani commented on a change in pull request #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage
moomindani commented on a change in pull request #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage URL: https://github.com/apache/spark/pull/27690#discussion_r387429262 ## File path: sql/hive/pom.xml ## @@ -189,6 +189,11 @@ scalacheck_${scala.binary.version} test + + org.mockito + mockito-core + test Review comment: Thanks, I removed it.