[GitHub] [spark] kiszk commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1
kiszk commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1 URL: https://github.com/apache/spark/pull/27271#issuecomment-575875962 This is not available yet at https://mvnrepository.com/artifact/org.lz4/lz4-java/1.7.1 either. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dongjoon-hyun commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1
dongjoon-hyun commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1 URL: https://github.com/apache/spark/pull/27271#issuecomment-575873811 Jenkins will pass because Google Maven Central is used only in the GitHub Actions build, so we can merge this once GitHub Actions passes.
[GitHub] [spark] dongjoon-hyun edited a comment on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1
dongjoon-hyun edited a comment on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1
URL: https://github.com/apache/spark/pull/27271#issuecomment-575873734

Yes, I've been waiting for this, @maropu. However, it seems that we need to wait one or two days because `Google Maven Central` is not mirroring it yet.

```
[ERROR] Failed to execute goal on project spark-core_2.12: Could not resolve dependencies for project org.apache.spark:spark-core_2.12:jar:3.0.0-SNAPSHOT: Could not find artifact org.lz4:lz4-java:jar:1.7.1 in google-maven-central (https://maven-central.storage-download.googleapis.com/repos/central/data/) -> [Help 1]
```
[GitHub] [spark] AmplabJenkins commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1
AmplabJenkins commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1 URL: https://github.com/apache/spark/pull/27271#issuecomment-575873349 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1
AmplabJenkins commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1 URL: https://github.com/apache/spark/pull/27271#issuecomment-575873351 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21741/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1
SparkQA commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1 URL: https://github.com/apache/spark/pull/27271#issuecomment-575873282 **[Test build #116972 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116972/testReport)** for PR 27271 at commit [`bdf7bfc`](https://github.com/apache/spark/commit/bdf7bfcd2834de8b125d3042c543995721e5e3ea).
[GitHub] [spark] maropu commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1
maropu commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1 URL: https://github.com/apache/spark/pull/27271#issuecomment-575873248 FYI: https://github.com/lz4/lz4-java/issues/156
[GitHub] [spark] maropu commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1
maropu commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1 URL: https://github.com/apache/spark/pull/27271#issuecomment-575873235 cc: @HyukjinKwon @dongjoon-hyun @srowen
[GitHub] [spark] maropu opened a new pull request #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1
maropu opened a new pull request #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1
URL: https://github.com/apache/spark/pull/27271

### What changes were proposed in this pull request?
This PR upgrades lz4-java from 1.7.0 to 1.7.1.

### Why are the changes needed?
This release includes a bug fix for older macOS versions. See the link below for the changes:
https://github.com/lz4/lz4-java/blob/master/CHANGES.md#171

### Does this PR introduce any user-facing change?

### How was this patch tested?
Existing tests.
[GitHub] [spark] guykhazma edited a comment on issue #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing
guykhazma edited a comment on issue #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing URL: https://github.com/apache/spark/pull/27157#issuecomment-575870211 @gengliangwang thanks for reviewing. I agree with your concern; this can be improved in subsequent PRs, which will require a broader change in the V2 file-based DataSources and the V2 API. I'll be glad to help with that.
[GitHub] [spark] guykhazma edited a comment on issue #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing
guykhazma edited a comment on issue #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing URL: https://github.com/apache/spark/pull/27157#issuecomment-575870211 @gengliangwang thanks for reviewing. I agree with your concern; this can be improved in subsequent PRs, which will require a broader change in the V2 file-based DataSources. I'll be glad to help with that.
[GitHub] [spark] guykhazma edited a comment on issue #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing
guykhazma edited a comment on issue #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing URL: https://github.com/apache/spark/pull/27157#issuecomment-575870211 @gengliangwang thanks for reviewing. I agree with your concern; this can be improved in subsequent PRs, which will require a broader change in the V2 DataSource API. I'll be glad to help with that.
[GitHub] [spark] guykhazma commented on issue #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing
guykhazma commented on issue #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing URL: https://github.com/apache/spark/pull/27157#issuecomment-575870211 @gengliangwang thanks for reviewing. I agree with your concern, and also this can be improved in subsequent PRs which will require a broader change in the V2 DataSource API. I'll be glad to help with that.
[GitHub] [spark] guykhazma commented on a change in pull request #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing
guykhazma commented on a change in pull request #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing
URL: https://github.com/apache/spark/pull/27157#discussion_r368208826

## File path: external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala ##
@@ -1575,6 +1576,36 @@ class AvroV2Suite extends AvroSuite {
     }
   }

+  test("Avro source v2: support passing data filters to FileScan without partitionFilters") {
+    withTempPath { dir =>
+      Seq(("a", 1, 2), ("b", 1, 2), ("c", 2, 1))
+        .toDF("value", "p1", "p2")
+        .write
+        .format("avro")
+        .option("header", true)

Review comment:
fixed, missed this by mistake, thanks
[GitHub] [spark] linzebing commented on issue #27223: [SPARK-30511][SPARK-28403][CORE] Don't treat failed/killed speculative tasks as pending in Spark scheduler
linzebing commented on issue #27223: [SPARK-30511][SPARK-28403][CORE] Don't treat failed/killed speculative tasks as pending in Spark scheduler
URL: https://github.com/apache/spark/pull/27223#issuecomment-575865048

> Seem likes the same problem also exists for normal task when speculative task finished before normal task?
>
> Is it possible to check whether there's another task attempt has succeed when we receive a failed taskEnd event. e.g. ask for TaskSchedulerImpl/TaskSetManager or just record those successful tasks in `ExecutorAllocationManager`.

If a speculative task finishes before the normal task, the normal task is killed. I have addressed this case in this PR; see the explanation in https://github.com/apache/spark/pull/27223#discussion_r368204585
[GitHub] [spark] AmplabJenkins commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE
AmplabJenkins commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE URL: https://github.com/apache/spark/pull/27260#issuecomment-575864674 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE
AmplabJenkins commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE URL: https://github.com/apache/spark/pull/27260#issuecomment-575864676 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21740/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE
SparkQA commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE URL: https://github.com/apache/spark/pull/27260#issuecomment-575864592 **[Test build #116971 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116971/testReport)** for PR 27260 at commit [`3c4c84f`](https://github.com/apache/spark/commit/3c4c84fda772cd4c310ed2b0ece853c84b5eb7af).
[GitHub] [spark] linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] Spark marks intentionally killed speculative tasks as pending leads to holding idle executors
linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] Spark marks intentionally killed speculative tasks as pending leads to holding idle executors
URL: https://github.com/apache/spark/pull/27223#discussion_r368204585

## File path: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ##
@@ -614,18 +614,24 @@ private[spark] class ExecutorAllocationManager(
           stageAttemptToNumRunningTask -= stageAttempt
         }
       }
-      // If the task failed, we expect it to be resubmitted later. To ensure we have
-      // enough resources to run the resubmitted task, we need to mark the scheduler
-      // as backlogged again if it's not already marked as such (SPARK-8366)
-      if (taskEnd.reason != Success) {
-        if (totalPendingTasks() == 0) {
-          allocationManager.onSchedulerBacklogged()
-        }
-        if (taskEnd.taskInfo.speculative) {
-          stageAttemptToSpeculativeTaskIndices.get(stageAttempt).foreach {_.remove(taskIndex)}
-        } else {
-          stageAttemptToTaskIndices.get(stageAttempt).foreach {_.remove(taskIndex)}
-        }
+
+      if (taskEnd.taskInfo.speculative) {
+        stageAttemptToSpeculativeTaskIndices.get(stageAttempt).foreach {_.remove{taskIndex}}
+        stageAttemptToNumSpeculativeTasks(stageAttempt) -= 1
+      }
+
+      // If the task failed (not intentionally killed), we expect it to be resubmitted later. To
+      // ensure we have enough resources to run the resubmitted task, we need to mark the
+      // scheduler as backlogged again if it's not already marked as such (SPARK-8366)
+      taskEnd.reason match {
+        case Success | _: TaskKilled =>

Review comment:
Inside the brackets, there are two things:

```
if (totalPendingTasks() == 0) {
  allocationManager.onSchedulerBacklogged()
}
```

This one is straightforward. If a task is intentionally killed, we don't expect it to be resubmitted, so we don't need to mark the scheduler as backlogged.

```
if (!taskEnd.taskInfo.speculative) {
  stageAttemptToTaskIndices.get(stageAttempt).foreach {_.remove(taskIndex)}
}
```

If a non-speculative task is intentionally killed, it means the speculative task has succeeded, and no further task for this task index will be resubmitted. In this case, the task index is complete and we shouldn't remove it from `stageAttemptToTaskIndices`. Otherwise, we would have a pending non-speculative task for the task index.
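The two decisions described in this review comment can be looked at in isolation. The following is a minimal sketch, not the code from the PR: `EndReason`, `TaskEndInfo`, `Decision`, and `TaskEndDecision` are hypothetical names introduced here for illustration, and the failure branch is inferred from the review comment because the quoted diff stops at the first `case`. The real listener reacts to task-end events and mutates per-stage maps rather than returning a value.

```scala
// Hypothetical, simplified model of the task-end handling discussed above.
// None of these types are Spark's; they only mirror the shape of the logic.
sealed trait EndReason
case object Success extends EndReason
case object Killed extends EndReason // stands in for Spark's TaskKilled
case object Failed extends EndReason // any other end reason

final case class TaskEndInfo(reason: EndReason, speculative: Boolean)

// markBacklogged: tell the allocation manager more executors may be needed.
// removeNormalTaskIndex: treat the non-speculative task index as pending again.
final case class Decision(markBacklogged: Boolean, removeNormalTaskIndex: Boolean)

object TaskEndDecision {
  def decide(end: TaskEndInfo, totalPendingTasks: Int): Decision = end.reason match {
    // A successful or intentionally killed task is not resubmitted, so
    // nothing becomes pending again and no backlog is signalled.
    case Success | Killed =>
      Decision(markBacklogged = false, removeNormalTaskIndex = false)
    // A genuinely failed task is expected to be resubmitted (assumed branch).
    case Failed =>
      Decision(
        markBacklogged = totalPendingTasks == 0,
        removeNormalTaskIndex = !end.speculative)
  }
}
```

For example, `TaskEndDecision.decide(TaskEndInfo(Killed, speculative = false), 0)` leaves both flags unset, which corresponds to the case where the speculative attempt won and the normal attempt was killed.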
[GitHub] [spark] gatorsmile commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE
gatorsmile commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE URL: https://github.com/apache/spark/pull/27260#issuecomment-575864324 retest this please
[GitHub] [spark] AmplabJenkins commented on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project
AmplabJenkins commented on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project URL: https://github.com/apache/spark/pull/26978#issuecomment-575864282 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project
AmplabJenkins commented on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project URL: https://github.com/apache/spark/pull/26978#issuecomment-575864284 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116968/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project
SparkQA removed a comment on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project URL: https://github.com/apache/spark/pull/26978#issuecomment-575841050 **[Test build #116968 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116968/testReport)** for PR 26978 at commit [`19f7cd4`](https://github.com/apache/spark/commit/19f7cd4950e88ad79c2a8f7a5758577513261d91).
[GitHub] [spark] SparkQA commented on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project
SparkQA commented on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project
URL: https://github.com/apache/spark/pull/26978#issuecomment-575864178

**[Test build #116968 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116968/testReport)** for PR 26978 at commit [`19f7cd4`](https://github.com/apache/spark/commit/19f7cd4950e88ad79c2a8f7a5758577513261d91).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] Spark marks intentionally killed speculative tasks as pending leads to holding idle executors
linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] Spark marks intentionally killed speculative tasks as pending leads to holding idle executors
URL: https://github.com/apache/spark/pull/27223#discussion_r368203963

## File path: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ##
@@ -614,18 +614,24 @@ private[spark] class ExecutorAllocationManager(
           stageAttemptToNumRunningTask -= stageAttempt
         }
       }
-      // If the task failed, we expect it to be resubmitted later. To ensure we have
-      // enough resources to run the resubmitted task, we need to mark the scheduler
-      // as backlogged again if it's not already marked as such (SPARK-8366)
-      if (taskEnd.reason != Success) {
-        if (totalPendingTasks() == 0) {
-          allocationManager.onSchedulerBacklogged()
-        }
-        if (taskEnd.taskInfo.speculative) {
-          stageAttemptToSpeculativeTaskIndices.get(stageAttempt).foreach {_.remove(taskIndex)}
-        } else {
-          stageAttemptToTaskIndices.get(stageAttempt).foreach {_.remove(taskIndex)}
-        }
+
+      if (taskEnd.taskInfo.speculative) {

Review comment:
Note that we already remove the pending task with `stageAttemptToNumSpeculativeTasks(stageAttempt) -= 1`. @Ngone51
[GitHub] [spark] linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] Spark marks intentionally killed speculative tasks as pending leads to holding idle executors
linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] Spark marks intentionally killed speculative tasks as pending leads to holding idle executors
URL: https://github.com/apache/spark/pull/27223#discussion_r368203860

## File path: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ##

@@ -263,9 +263,15 @@ private[spark] class ExecutorAllocationManager(
    */
   private def maxNumExecutorsNeeded(): Int = {
     val numRunningOrPendingTasks = listener.totalPendingTasks + listener.totalRunningTasks
-    math.ceil(numRunningOrPendingTasks * executorAllocationRatio /
-      tasksPerExecutorForFullParallelism)
-      .toInt
+    val maxNeeded = math.ceil(numRunningOrPendingTasks * executorAllocationRatio /
+      tasksPerExecutorForFullParallelism).toInt
+    if (listener.pendingSpeculativeTasks > 0 && tasksPerExecutorForFullParallelism > 1) {

Review comment:
Because if we are using 1 task per executor, we don't need to allocate an extra executor for locality requirements.
[GitHub] [spark] linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] Spark marks intentionally killed speculative tasks as pending leads to holding idle executors
linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] Spark marks intentionally killed speculative tasks as pending leads to holding idle executors
URL: https://github.com/apache/spark/pull/27223#discussion_r368203775

## File path: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ##

@@ -263,9 +263,15 @@ private[spark] class ExecutorAllocationManager(
    */
   private def maxNumExecutorsNeeded(): Int = {
     val numRunningOrPendingTasks = listener.totalPendingTasks + listener.totalRunningTasks
-    math.ceil(numRunningOrPendingTasks * executorAllocationRatio /
-      tasksPerExecutorForFullParallelism)
-      .toInt
+    val maxNeeded = math.ceil(numRunningOrPendingTasks * executorAllocationRatio /
+      tasksPerExecutorForFullParallelism).toInt
+    if (listener.pendingSpeculativeTasks > 0 && tasksPerExecutorForFullParallelism > 1) {
+      // If we have pending speculative tasks, allocate one more executor to satisfy the
+      // locality requirements of speculative tasks
+      maxNeeded + 1

Review comment:
As specified in the comments, this is to satisfy the locality requirements of speculative tasks. Let's say we have 1 normal task and 1 speculative task (with a setting of 4 tasks per executor). In this case we should allocate 2 executors instead of 1.
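The arithmetic behind the review comment above can be checked with a small standalone sketch. This is not the code from the pull request: `MaxExecutorsSketch` and its parameter names are hypothetical, the listener and configuration values are passed in as plain arguments, and the `else` branch (returning `maxNeeded` unchanged) is an assumption, since the quoted diff stops before it.

```scala
object MaxExecutorsSketch {
  // Hypothetical standalone version of the calculation quoted in the diff.
  // In the diff these inputs come from the listener and from configuration;
  // here they are ordinary parameters so the example can run on its own.
  def maxNumExecutorsNeeded(
      numRunningOrPendingTasks: Int,
      pendingSpeculativeTasks: Int,
      tasksPerExecutor: Int,
      executorAllocationRatio: Double = 1.0): Int = {
    val maxNeeded =
      math.ceil(numRunningOrPendingTasks * executorAllocationRatio / tasksPerExecutor).toInt
    // Assumed else branch: without pending speculative tasks, or with one task
    // per executor, the plain ceiling is returned unchanged.
    if (pendingSpeculativeTasks > 0 && tasksPerExecutor > 1) maxNeeded + 1 else maxNeeded
  }

  def main(args: Array[String]): Unit = {
    // 1 normal + 1 speculative task, 4 tasks per executor: ceil(2 / 4) = 1, plus 1.
    println(maxNumExecutorsNeeded(2, 1, 4)) // 2
    // Same tasks with 1 task per executor: ceil(2 / 1) = 2, no extra executor.
    println(maxNumExecutorsNeeded(2, 1, 1)) // 2
    // No pending speculative task: the plain ceiling applies.
    println(maxNumExecutorsNeeded(2, 0, 4)) // 1
  }
}
```

With one task per executor the ceiling already yields one executor per task, which appears to be why the guard skips the extra executor in that case.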
[GitHub] [spark] linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] Spark marks intentionally killed speculative tasks as pending leads to holding idle executors
linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] Spark marks intentionally killed speculative tasks as pending leads to holding idle executors
URL: https://github.com/apache/spark/pull/27223#discussion_r368203775

## File path: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ##

@@ -263,9 +263,15 @@ private[spark] class ExecutorAllocationManager(
    */
   private def maxNumExecutorsNeeded(): Int = {
     val numRunningOrPendingTasks = listener.totalPendingTasks + listener.totalRunningTasks
-    math.ceil(numRunningOrPendingTasks * executorAllocationRatio /
-      tasksPerExecutorForFullParallelism)
-      .toInt
+    val maxNeeded = math.ceil(numRunningOrPendingTasks * executorAllocationRatio /
+      tasksPerExecutorForFullParallelism).toInt
+    if (listener.pendingSpeculativeTasks > 0 && tasksPerExecutorForFullParallelism > 1) {
+      // If we have pending speculative tasks, allocate one more executor to satisfy the
+      // locality requirements of speculative tasks
+      maxNeeded + 1

Review comment:
As specified in the comments, this is to satisfy the locality requirements of speculative tasks. Let's say we have 1 normal task and 1 speculative task. In this case we should allocate 2 executors instead of 1.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575859310 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116970/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575859307 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
SparkQA removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575857893 **[Test build #116970 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116970/testReport)** for PR 27245 at commit [`d8266c2`](https://github.com/apache/spark/commit/d8266c26b2e793db0d42b708861a5684d2b5adcb).
[GitHub] [spark] AmplabJenkins commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575859307 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575859310 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116970/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
SparkQA commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575859247 **[Test build #116970 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116970/testReport)** for PR 27245 at commit [`d8266c2`](https://github.com/apache/spark/commit/d8266c26b2e793db0d42b708861a5684d2b5adcb). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] vanzin commented on issue #26586: [SPARK-29950][k8s] Blacklist deleted executors in K8S with dynamic allocation.
vanzin commented on issue #26586: [SPARK-29950][k8s] Blacklist deleted executors in K8S with dynamic allocation. URL: https://github.com/apache/spark/pull/26586#issuecomment-575858951 That's the same "Launcher client dependencies" test that seems super flaky, and has failed with the same error in other PRs before this one was merged.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575857986 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21739/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575857983 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575857986 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21739/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575857983 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
SparkQA commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575857893 **[Test build #116970 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116970/testReport)** for PR 27245 at commit [`d8266c2`](https://github.com/apache/spark/commit/d8266c26b2e793db0d42b708861a5684d2b5adcb).
[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575856520 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116969/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575856518 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
SparkQA commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575856476 **[Test build #116969 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116969/testReport)** for PR 27245 at commit [`bc2ebe8`](https://github.com/apache/spark/commit/bc2ebe8e56eb2be61c2d1577a7b12be171a588f8).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class _PredictorParams(HasLabelCol, HasFeaturesCol, HasPredictionCol):`
  * `class Predictor(Estimator, _PredictorParams):`
  * `class PredictionModel(Model, _PredictorParams):`
  * `class _ClassifierParams(HasRawPredictionCol, _PredictorParams):`
  * `class Classifier(Predictor, _ClassifierParams):`
  * `class ClassificationModel(PredictionModel, _ClassifierParams):`
  * `class _ProbabilisticClassifierParams(HasProbabilityCol, HasThresholds, _ClassifierParams):`
  * `class ProbabilisticClassifier(Classifier, _ProbabilisticClassifierParams):`
  * `class ProbabilisticClassificationModel(ClassificationModel,`
  * `class _JavaClassifier(Classifier, JavaPredictor):`
  * `class _JavaClassificationModel(ClassificationModel, JavaPredictionModel):`
  * `class _JavaProbabilisticClassifier(ProbabilisticClassifier, _JavaClassifier):`
  * `class _JavaProbabilisticClassificationModel(ProbabilisticClassificationModel,`
  * `class _LinearSVCParams(_ClassifierParams, HasRegParam, HasMaxIter, HasFitIntercept, HasTol,`
  * `class LinearSVC(_JavaClassifier, _LinearSVCParams, JavaMLWritable, JavaMLReadable):`
  * `class LinearSVCModel(_JavaClassificationModel, _LinearSVCParams, JavaMLWritable, JavaMLReadable):`
  * `class _LogisticRegressionParams(_ProbabilisticClassifierParams, HasRegParam,`
  * `class LogisticRegression(_JavaProbabilisticClassifier, _LogisticRegressionParams, JavaMLWritable,`
  * `class LogisticRegressionModel(_JavaProbabilisticClassificationModel, _LogisticRegressionParams,`
  * `class DecisionTreeClassifier(_JavaProbabilisticClassifier, _DecisionTreeClassifierParams,`
  * `class DecisionTreeClassificationModel(_DecisionTreeModel, _JavaProbabilisticClassificationModel,`
  * `class RandomForestClassifier(_JavaProbabilisticClassifier, _RandomForestClassifierParams,`
  * `class RandomForestClassificationModel(_TreeEnsembleModel, _JavaProbabilisticClassificationModel,`
  * `class GBTClassifier(_JavaProbabilisticClassifier, _GBTClassifierParams,`
  * `class GBTClassificationModel(_TreeEnsembleModel, _JavaProbabilisticClassificationModel,`
  * `class _NaiveBayesParams(_PredictorParams, HasWeightCol):`
  * `class NaiveBayes(_JavaProbabilisticClassifier, _NaiveBayesParams, HasThresholds, HasWeightCol,`
  * `class NaiveBayesModel(_JavaProbabilisticClassificationModel, _NaiveBayesParams, JavaMLWritable,`
  * `class _MultilayerPerceptronParams(_ProbabilisticClassifierParams, HasSeed, HasMaxIter,`
  * `class MultilayerPerceptronClassifier(_JavaProbabilisticClassifier, _MultilayerPerceptronParams,`
  * `class MultilayerPerceptronClassificationModel(_JavaProbabilisticClassificationModel,`
  * `class _OneVsRestParams(_ClassifierParams, HasWeightCol):`
  * `class FMClassifier(_JavaProbabilisticClassifier, _FactorizationMachinesParams, JavaMLWritable,`
  * `class FMClassificationModel(_JavaProbabilisticClassificationModel, _FactorizationMachinesParams,`
  * `class Regressor(Predictor, _PredictorParams):`
  * `class RegressionModel(PredictionModel, _PredictorParams):`
  * `class _JavaRegressor(Regressor, JavaPredictor):`
  * `class _JavaRegressionModel(RegressionModel, JavaPredictionModel):`
  * `class _LinearRegressionParams(_PredictorParams, HasRegParam, HasElasticNetParam, HasMaxIter,`
  * `class LinearRegression(_JavaRegressor, _LinearRegressionParams, JavaMLWritable, JavaMLReadable):`
  * `class LinearRegressionModel(_JavaRegressionModel, _LinearRegressionParams, GeneralJavaMLWritable,`
  * `class DecisionTreeRegressor(_JavaRegressor, _DecisionTreeRegressorParams, JavaMLWritable,`
  * `class DecisionTreeRegressionModel(`
  * `class RandomForestRegressor(_JavaRegressor, _RandomForestRegressorParams, JavaMLWritable,`
  * `class RandomForestRegressionModel(`
  * `class GBTRegressor(_JavaRegressor, _GBTRegressorParams, JavaMLWritable, JavaMLReadable):`
  * `class GBTRegressionModel(`
  * `class _AFTSurvivalRegressionParams(_PredictorParams, HasMaxIter, HasTol, HasFitIntercept,`
  * `class AFTSurvivalRegression(_JavaRegressor, _AFTSurvivalRegressionParams,`
  * `class AFTSurvivalRegressionModel(_JavaRegressionModel, _AFTSurvivalRegressionParams,`
  * `class _GeneralizedLinearRegressionParams(_PredictorParams, HasFitIntercept, HasMaxIter,`
  * `class Generalized
[GitHub] [spark] AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575856520 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116969/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
SparkQA removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575855200 **[Test build #116969 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116969/testReport)** for PR 27245 at commit [`bc2ebe8`](https://github.com/apache/spark/commit/bc2ebe8e56eb2be61c2d1577a7b12be171a588f8).
[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575856518 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575855305 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575855306 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21738/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575855305 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575855306 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21738/ Test PASSed.
[GitHub] [spark] zero323 commented on issue #27241: [SPARK-30533][ML][PYSPARK] Add classes to represent Java Regressors and RegressionModels
zero323 commented on issue #27241: [SPARK-30533][ML][PYSPARK] Add classes to represent Java Regressors and RegressionModels URL: https://github.com/apache/spark/pull/27241#issuecomment-575855169 Thanks @huaxingao @srowen @zhengruifeng
[GitHub] [spark] SparkQA commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
SparkQA commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend URL: https://github.com/apache/spark/pull/27245#issuecomment-575855200 **[Test build #116969 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116969/testReport)** for PR 27245 at commit [`bc2ebe8`](https://github.com/apache/spark/commit/bc2ebe8e56eb2be61c2d1577a7b12be171a588f8).
[GitHub] [spark] dongjoon-hyun commented on issue #27268: [SPARK-30553][DOCS] fix structured-streaming java example error
dongjoon-hyun commented on issue #27268: [SPARK-30553][DOCS] fix structured-streaming java example error URL: https://github.com/apache/spark/pull/27268#issuecomment-575854721 Oh, got it. Thank you for checking.
[GitHub] [spark] fuwhu commented on a change in pull request #26805: [SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions
fuwhu commented on a change in pull request #26805: [SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions URL: https://github.com/apache/spark/pull/26805#discussion_r368196763

## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/PruneHiveTablePartitions.scala

```
@@ -0,0 +1,109 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive.execution
+
+import org.apache.hadoop.hive.common.StatsSetupConst
+
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.catalyst.analysis.CastSupport
+import org.apache.spark.sql.catalyst.catalog.{CatalogStatistics, CatalogTable, CatalogTablePartition, ExternalCatalogUtils, HiveTableRelation}
+import org.apache.spark.sql.catalyst.expressions.{And, AttributeSet, Expression, ExpressionSet, SubqueryExpression}
+import org.apache.spark.sql.catalyst.planning.PhysicalOperation
+import org.apache.spark.sql.catalyst.plans.logical.{Filter, LogicalPlan, Project}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.execution.datasources.DataSourceStrategy
+import org.apache.spark.sql.internal.SQLConf
+
+/**
+ * TODO: merge this with PruneFileSourcePartitions after we completely make hive as a data source.
+ */
+private[sql] class PruneHiveTablePartitions(session: SparkSession)
+  extends Rule[LogicalPlan] with CastSupport {
+
+  override val conf: SQLConf = session.sessionState.conf
+
+  /**
+   * Extract the partition filters from the filters on the table.
+   */
+  private def getPartitionKeyFilters(
+      filters: Seq[Expression],
+      relation: HiveTableRelation): ExpressionSet = {
+    val normalizedFilters = DataSourceStrategy.normalizeExprs(
+      filters.filter(f => f.deterministic && !SubqueryExpression.hasSubquery(f)), relation.output)
+    val partitionColumnSet = AttributeSet(relation.partitionCols)
+    ExpressionSet(normalizedFilters.filter { f =>
+      !f.references.isEmpty && f.references.subsetOf(partitionColumnSet)
+    })
+  }
+
+  /**
+   * Prune the hive table using filters on the partitions of the table.
+   */
+  private def prunePartitions(
+      relation: HiveTableRelation,
+      partitionFilters: ExpressionSet): Seq[CatalogTablePartition] = {
+    if (conf.metastorePartitionPruning) {
+      session.sessionState.catalog.listPartitionsByFilter(
+        relation.tableMeta.identifier, partitionFilters.toSeq)
+    } else {
+      ExternalCatalogUtils.prunePartitionsByFilter(relation.tableMeta,
+        session.sessionState.catalog.listPartitions(relation.tableMeta.identifier),
+        partitionFilters.toSeq, conf.sessionLocalTimeZone)
+    }
+  }
+
+  /**
+   * Update the statistics of the table.
+   */
+  private def updateTableMeta(
+      tableMeta: CatalogTable,
+      prunedPartitions: Seq[CatalogTablePartition]): CatalogTable = {
+    val sizeOfPartitions = prunedPartitions.map { partition =>
+      val rawDataSize = partition.parameters.get(StatsSetupConst.RAW_DATA_SIZE).map(_.toLong)
+      val totalSize = partition.parameters.get(StatsSetupConst.TOTAL_SIZE).map(_.toLong)
+      if (rawDataSize.isDefined && rawDataSize.get > 0) {
+        rawDataSize.get
+      } else if (totalSize.isDefined && totalSize.get > 0L) {
+        totalSize.get
+      } else {
+        0L
+      }
+    }
+    if (sizeOfPartitions.forall(s => s>0)) {
+      val sizeInBytes = sizeOfPartitions.sum
+      tableMeta.copy(stats = Some(CatalogStatistics(sizeInBytes = BigInt(sizeInBytes))))
+    } else {
+      tableMeta
+    }
+  }
+
+  override def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
+    case op @ PhysicalOperation(projections, filters, relation: HiveTableRelation)
+      if filters.nonEmpty && relation.isPartitioned && relation.prunedPartitions.isEmpty =>
+      val partitionKeyFilters = getPartitionKeyFilters(filters, relation)
+      if (partitionKeyFilters.nonEmpty) {
+        val newPartitions = prunePartitions(relation, partitionKeyFilters)
+        val newTableMeta = updateTableMeta(relation.tableMeta, newPartitions)
+        val newRelation = relation.copy(
+          tableMeta = newTableMeta, prunedPartitions = Some(newPartitions))
+        Projec
```
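The size fallback in the quoted `updateTableMeta` can be tried out without a Spark or Hive dependency. What follows is only a minimal sketch of that logic, not the PR's code: `PartitionSizeSketch`, `partitionSize` and `estimatedSize` are hypothetical names, partition parameters are modelled as a plain `Map[String, String]`, and the keys `rawDataSize` and `totalSize` are assumed to be the literal values behind `StatsSetupConst.RAW_DATA_SIZE` and `StatsSetupConst.TOTAL_SIZE`.

```scala
object PartitionSizeSketch {
  // Assumed literal values of StatsSetupConst.RAW_DATA_SIZE / TOTAL_SIZE.
  val RawDataSize = "rawDataSize"
  val TotalSize = "totalSize"

  // Prefer a positive rawDataSize, then a positive totalSize, else 0.
  def partitionSize(params: Map[String, String]): Long = {
    val raw = params.get(RawDataSize).map(_.toLong).filter(_ > 0L)
    val total = params.get(TotalSize).map(_.toLong).filter(_ > 0L)
    raw.orElse(total).getOrElse(0L)
  }

  // Return a summed size only when every pruned partition reports a positive
  // size; otherwise None, standing in for "leave the table metadata as is".
  // As in the quoted code, an empty partition list passes the forall check
  // and therefore yields Some(0).
  def estimatedSize(partitions: Seq[Map[String, String]]): Option[BigInt] = {
    val sizes = partitions.map(partitionSize)
    if (sizes.forall(_ > 0L)) Some(sizes.map(BigInt(_)).sum) else None
  }

  def main(args: Array[String]): Unit = {
    val complete = Seq(Map(RawDataSize -> "10"), Map(RawDataSize -> "0", TotalSize -> "5"))
    val incomplete = Seq(Map(RawDataSize -> "10"), Map.empty[String, String])
    println(estimatedSize(complete))
    println(estimatedSize(incomplete))
  }
}
```

Summing as `BigInt` rather than `Long` is a choice made for this sketch; the quoted code sums `Long` values first and converts afterwards.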
[GitHub] [spark] srowen commented on issue #27241: [SPARK-30533][ML][PYSPARK] Add classes to represent Java Regressors and RegressionModels
srowen commented on issue #27241: [SPARK-30533][ML][PYSPARK] Add classes to represent Java Regressors and RegressionModels URL: https://github.com/apache/spark/pull/27241#issuecomment-575853279 Merged to master
[GitHub] [spark] srowen closed pull request #27241: [SPARK-30533][ML][PYSPARK] Add classes to represent Java Regressors and RegressionModels
srowen closed pull request #27241: [SPARK-30533][ML][PYSPARK] Add classes to represent Java Regressors and RegressionModels URL: https://github.com/apache/spark/pull/27241
[GitHub] [spark] fuwhu commented on issue #27232: [SPARK-30525][SQL]HiveTableScanExec do not need to prune partitions again after pushing down to hive metastore
fuwhu commented on issue #27232: [SPARK-30525][SQL]HiveTableScanExec do not need to prune partitions again after pushing down to hive metastore URL: https://github.com/apache/spark/pull/27232#issuecomment-575852899 cc @cloud-fan
[GitHub] [spark] bettermouse edited a comment on issue #27268: [SPARK-30553][DOCS] fix structured-streaming java example error
bettermouse edited a comment on issue #27268: [SPARK-30553][DOCS] fix structured-streaming java example error URL: https://github.com/apache/spark/pull/27268#issuecomment-575852494 @dongjoon-hyun I have checked it. The class JavaStructuredNetworkWordCountWindowed does not use the withWatermark API, so there is no problem.
[GitHub] [spark] bettermouse commented on issue #27268: [SPARK-30553][DOCS] fix structured-streaming java example error
bettermouse commented on issue #27268: [SPARK-30553][DOCS] fix structured-streaming java example error URL: https://github.com/apache/spark/pull/27268#issuecomment-575852494 I have checked it. The class JavaStructuredNetworkWordCountWindowed does not use the withWatermark API, so there is no problem.
[GitHub] [spark] dongjoon-hyun commented on issue #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect
dongjoon-hyun commented on issue #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect URL: https://github.com/apache/spark/pull/27270#issuecomment-575852458 You are faster than me. :)
[GitHub] [spark] maropu commented on issue #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect
maropu commented on issue #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect URL: https://github.com/apache/spark/pull/27270#issuecomment-575852189 hahaha, I was a bit late ;)
[GitHub] [spark] dongjoon-hyun commented on issue #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect
dongjoon-hyun commented on issue #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect URL: https://github.com/apache/spark/pull/27270#issuecomment-575851846 Thank you, @maropu !
[GitHub] [spark] dongjoon-hyun closed pull request #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect
dongjoon-hyun closed pull request #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect URL: https://github.com/apache/spark/pull/27270
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect
dongjoon-hyun commented on a change in pull request #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect URL: https://github.com/apache/spark/pull/27270#discussion_r368195075 ## File path: docs/sql-migration-guide.md ## @@ -344,6 +344,12 @@ license: | - Since Spark 2.4.5, `TRUNCATE TABLE` command tries to set back original permission and ACLs during re-creating the table/partition paths. To restore the behaviour of earlier versions, set `spark.sql.truncateTable.ignorePermissionAcl.enabled` to `true`. + - Since Spark 2.4.5, `spark.sql.legacy.mssqlserver.numericMapping.enabled` configuration is added in order to support the legacy MsSQLServer dialect mapping behavior using IntegerType and DoubleType for SMALLINT and REAL JDBC types, respectively. To restore the behaviour of 2.4.3 and earlier versions, set `spark.sql.legacy.mssqlserver.numericMapping.enabled` to `true`. + +## Upgrading from Spark SQL 2.4.3 to 2.4.4 Review comment: Thanks!
[GitHub] [spark] dongjoon-hyun commented on issue #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect
dongjoon-hyun commented on issue #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect URL: https://github.com/apache/spark/pull/27270#issuecomment-575851702 Merged to master/2.4.
[GitHub] [spark] dongjoon-hyun closed pull request #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories
dongjoon-hyun closed pull request #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories URL: https://github.com/apache/spark/pull/27130
[GitHub] [spark] viirya commented on a change in pull request #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect
viirya commented on a change in pull request #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect URL: https://github.com/apache/spark/pull/27270#discussion_r368194683 ## File path: docs/sql-migration-guide.md ## @@ -344,6 +344,12 @@ license: | - Since Spark 2.4.5, `TRUNCATE TABLE` command tries to set back original permission and ACLs during re-creating the table/partition paths. To restore the behaviour of earlier versions, set `spark.sql.truncateTable.ignorePermissionAcl.enabled` to `true`. + - Since Spark 2.4.5, `spark.sql.legacy.mssqlserver.numericMapping.enabled` configuration is added in order to support the legacy MsSQLServer dialect mapping behavior using IntegerType and DoubleType for SMALLINT and REAL JDBC types, respectively. To restore the behaviour of 2.4.3 and earlier versions, set `spark.sql.legacy.mssqlserver.numericMapping.enabled` to `true`. + +## Upgrading from Spark SQL 2.4.3 to 2.4.4 Review comment: ok sounds good. Thanks!
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect
dongjoon-hyun commented on a change in pull request #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect URL: https://github.com/apache/spark/pull/27270#discussion_r368193217 ## File path: docs/sql-migration-guide.md ## @@ -344,6 +344,12 @@ license: | - Since Spark 2.4.5, `TRUNCATE TABLE` command tries to set back original permission and ACLs during re-creating the table/partition paths. To restore the behaviour of earlier versions, set `spark.sql.truncateTable.ignorePermissionAcl.enabled` to `true`. + - Since Spark 2.4.5, `spark.sql.legacy.mssqlserver.numericMapping.enabled` configuration is added in order to support the legacy MsSQLServer dialect mapping behavior using IntegerType and DoubleType for SMALLINT and REAL JDBC types, respectively. To restore the behaviour of 2.4.3 and earlier versions, set `spark.sql.legacy.mssqlserver.numericMapping.enabled` to `true`. + +## Upgrading from Spark SQL 2.4.3 to 2.4.4 Review comment: For `2.4.4` release doc, I'll update `spark-website` repository.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect
dongjoon-hyun commented on a change in pull request #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect URL: https://github.com/apache/spark/pull/27270#discussion_r368193120 ## File path: docs/sql-migration-guide.md ## @@ -344,6 +344,12 @@ license: | - Since Spark 2.4.5, `TRUNCATE TABLE` command tries to set back original permission and ACLs during re-creating the table/partition paths. To restore the behaviour of earlier versions, set `spark.sql.truncateTable.ignorePermissionAcl.enabled` to `true`. + - Since Spark 2.4.5, `spark.sql.legacy.mssqlserver.numericMapping.enabled` configuration is added in order to support the legacy MsSQLServer dialect mapping behavior using IntegerType and DoubleType for SMALLINT and REAL JDBC types, respectively. To restore the behaviour of 2.4.3 and earlier versions, set `spark.sql.legacy.mssqlserver.numericMapping.enabled` to `true`. + +## Upgrading from Spark SQL 2.4.3 to 2.4.4 Review comment: IMO, although it's late for 2.4.4, `2.4.3` to `2.4.4` will be correct. When the users upgrade from 1.6.3 to 3.0.0, they need to see all previous migration guides. If there is some regression on 2.4.5, the users can use 2.4.4 instead of 2.4.5.
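For readers following the migration note quoted above, setting the legacy flag might look like the sketch below. It assumes a Spark build that knows `spark.sql.legacy.mssqlserver.numericMapping.enabled` (2.4.5 or later, per the quoted text) is on the classpath; the object name, application name and `local[*]` master are placeholders, and whether the flag should be set at session creation or later is not something this thread states.

```scala
import org.apache.spark.sql.SparkSession

object LegacyMsSqlNumericMapping {
  def main(args: Array[String]): Unit = {
    // Placeholder app name and master; only the config key comes from the
    // quoted migration guide entry.
    val spark = SparkSession.builder()
      .appName("legacy-mssql-numeric-mapping")
      .master("local[*]")
      .config("spark.sql.legacy.mssqlserver.numericMapping.enabled", "true")
      .getOrCreate()

    // Read the setting back to confirm what the session sees.
    println(spark.conf.get("spark.sql.legacy.mssqlserver.numericMapping.enabled"))

    spark.stop()
  }
}
```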
[GitHub] [spark] AmplabJenkins removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories
AmplabJenkins removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories URL: https://github.com/apache/spark/pull/27130#issuecomment-575849751 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116967/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories
AmplabJenkins commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories URL: https://github.com/apache/spark/pull/27130#issuecomment-575849745 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories
AmplabJenkins commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories URL: https://github.com/apache/spark/pull/27130#issuecomment-575849751 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116967/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories
AmplabJenkins removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories URL: https://github.com/apache/spark/pull/27130#issuecomment-575849745 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories
SparkQA removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories URL: https://github.com/apache/spark/pull/27130#issuecomment-575824139 **[Test build #116967 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116967/testReport)** for PR 27130 at commit [`39f271f`](https://github.com/apache/spark/commit/39f271f23278c334a8230408703201276e7292ac).
[GitHub] [spark] SparkQA commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories
SparkQA commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories URL: https://github.com/apache/spark/pull/27130#issuecomment-575849490 **[Test build #116967 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116967/testReport)** for PR 27130 at commit [`39f271f`](https://github.com/apache/spark/commit/39f271f23278c334a8230408703201276e7292ac). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] gengliangwang commented on issue #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing
gengliangwang commented on issue #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing URL: https://github.com/apache/spark/pull/27157#issuecomment-575848049 @guykhazma Sorry for the late reply. I was thinking about another approach, but I can't come up with a better one yet. My major concern is that the filters are supposed to be pushed down in the `FileScanBuilder`. It is weird to push them down again in the `FileScan`. Technically, the partition filters should be pushed down in `FileScanBuilder` as well. However, the current DSV2 API exposes the filters as `Filter` only instead of `Expression`. The coverage of `Filter` is limited. That's why I pushed the partition filters into `FileScan` in https://github.com/apache/spark/pull/27112. Keeping the behavior in V2 is also important. I will merge this one. We can improve the approach in the future.
[GitHub] [spark] gengliangwang commented on a change in pull request #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing
gengliangwang commented on a change in pull request #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing URL: https://github.com/apache/spark/pull/27157#discussion_r368189535 ## File path: external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala ## @@ -1575,6 +1576,36 @@ class AvroV2Suite extends AvroSuite { } } + test("Avro source v2: support passing data filters to FileScan without partitionFilters") { +withTempPath { dir => + Seq(("a", 1, 2), ("b", 1, 2), ("c", 2, 1)) +.toDF("value", "p1", "p2") +.write +.format("avro") +.option("header", true) Review comment: For Avro data source, `.option("header", true)` is not needed.
[GitHub] [spark] gengliangwang commented on a change in pull request #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing
gengliangwang commented on a change in pull request #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing URL: https://github.com/apache/spark/pull/27157#discussion_r368189550 ## File path: external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala ## @@ -1575,6 +1576,36 @@ class AvroV2Suite extends AvroSuite { } } + test("Avro source v2: support passing data filters to FileScan without partitionFilters") { +withTempPath { dir => + Seq(("a", 1, 2), ("b", 1, 2), ("c", 2, 1)) +.toDF("value", "p1", "p2") +.write +.format("avro") +.option("header", true) +.save(dir.getCanonicalPath) + val df = spark +.read +.format("avro") +.option("header", true) Review comment: Ditto.
[GitHub] [spark] github-actions[bot] closed pull request #17234: [SPARK-19892][MLlib] Implement findAnalogies method for Word2VecModel
github-actions[bot] closed pull request #17234: [SPARK-19892][MLlib] Implement findAnalogies method for Word2VecModel URL: https://github.com/apache/spark/pull/17234
[GitHub] [spark] github-actions[bot] closed pull request #18193: [SPARK-15616] [SQL] CatalogRelation should fallback to HDFS size of partitions that are involved in Query for JoinSelection.
github-actions[bot] closed pull request #18193: [SPARK-15616] [SQL] CatalogRelation should fallback to HDFS size of partitions that are involved in Query for JoinSelection. URL: https://github.com/apache/spark/pull/18193
[GitHub] [spark] AmplabJenkins removed a comment on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project
AmplabJenkins removed a comment on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project URL: https://github.com/apache/spark/pull/26978#issuecomment-575841295 Merged build finished. Test PASSed.
[GitHub] [spark] github-actions[bot] closed pull request #17365: [SPARK-19962] [MLlib] add DictVectorizer to ml.feature
github-actions[bot] closed pull request #17365: [SPARK-19962] [MLlib] add DictVectorizer to ml.feature URL: https://github.com/apache/spark/pull/17365
[GitHub] [spark] github-actions[bot] commented on issue #20935: [SPARK-23819][SQL] Fix InMemoryTableScanExec complex type pruning
github-actions[bot] commented on issue #20935: [SPARK-23819][SQL] Fix InMemoryTableScanExec complex type pruning URL: https://github.com/apache/spark/pull/20935#issuecomment-575841406 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!
[GitHub] [spark] github-actions[bot] closed pull request #21006: [SPARK-22256][MESOS] - Introduce spark.mesos.driver.memoryOverhead
github-actions[bot] closed pull request #21006: [SPARK-22256][MESOS] - Introduce spark.mesos.driver.memoryOverhead URL: https://github.com/apache/spark/pull/21006
[GitHub] [spark] github-actions[bot] commented on issue #14431: [SPARK-16258][SparkR] Automatically append the grouping keys in SparkR's gapply
github-actions[bot] commented on issue #14431: [SPARK-16258][SparkR] Automatically append the grouping keys in SparkR's gapply URL: https://github.com/apache/spark/pull/14431#issuecomment-575841442 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!
[GitHub] [spark] github-actions[bot] commented on issue #14936: [SPARK-7877][MESOS] Allow configuration of framework timeout
github-actions[bot] commented on issue #14936: [SPARK-7877][MESOS] Allow configuration of framework timeout URL: https://github.com/apache/spark/pull/14936#issuecomment-575841434 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!
[GitHub] [spark] github-actions[bot] closed pull request #15326: [SPARK-17759] [CORE] Avoid adding duplicate schedulables
github-actions[bot] closed pull request #15326: [SPARK-17759] [CORE] Avoid adding duplicate schedulables URL: https://github.com/apache/spark/pull/15326
[GitHub] [spark] github-actions[bot] commented on issue #21164: [SPARK-24098][SQL] ScriptTransformationExec should wait process exiting before output iterator finish
github-actions[bot] commented on issue #21164: [SPARK-24098][SQL] ScriptTransformationExec should wait process exiting before output iterator finish URL: https://github.com/apache/spark/pull/21164#issuecomment-575841397 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!
[GitHub] [spark] AmplabJenkins removed a comment on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project
AmplabJenkins removed a comment on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project URL: https://github.com/apache/spark/pull/26978#issuecomment-575841301 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21737/ Test PASSed.
[GitHub] [spark] github-actions[bot] commented on issue #15496: [SPARK-17950] [Python] Match SparseVector behavior with DenseVector
github-actions[bot] commented on issue #15496: [SPARK-17950] [Python] Match SparseVector behavior with DenseVector URL: https://github.com/apache/spark/pull/15496#issuecomment-575841424 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!
[GitHub] [spark] github-actions[bot] commented on issue #13650: [SPARK-9623] [ML] Provide conditional variance for RandomForestRegressor
github-actions[bot] commented on issue #13650: [SPARK-9623] [ML] Provide conditional variance for RandomForestRegressor URL: https://github.com/apache/spark/pull/13650#issuecomment-575841447 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!
[GitHub] [spark] github-actions[bot] commented on issue #13379: [SPARK-12431][GraphX] Add local checkpointing to GraphX.
github-actions[bot] commented on issue #13379: [SPARK-12431][GraphX] Add local checkpointing to GraphX. URL: https://github.com/apache/spark/pull/13379#issuecomment-575841455 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!
[GitHub] [spark] AmplabJenkins commented on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project
AmplabJenkins commented on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project URL: https://github.com/apache/spark/pull/26978#issuecomment-575841301 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21737/ Test PASSed.