[GitHub] [spark] kiszk commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1

2020-01-17 Thread GitBox
kiszk commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 
1.7.1
URL: https://github.com/apache/spark/pull/27271#issuecomment-575875962
 
 
   This is not available at https://mvnrepository.com/artifact/org.lz4/lz4-java/1.7.1 yet, either.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] dongjoon-hyun commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1

2020-01-17 Thread GitBox
dongjoon-hyun commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java 
version to 1.7.1
URL: https://github.com/apache/spark/pull/27271#issuecomment-575873811
 
 
   Jenkins will pass because Google Maven Central is only used in GitHub Actions.
   So, we can merge this when the GitHub Actions build passes.





[GitHub] [spark] dongjoon-hyun edited a comment on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1

2020-01-17 Thread GitBox
dongjoon-hyun edited a comment on issue #27271: [SPARK-30486][BUILD] Bump 
lz4-java version to 1.7.1
URL: https://github.com/apache/spark/pull/27271#issuecomment-575873734
 
 
   Yes, I've been waiting for this, @maropu .
   However, it seems that we need to wait for one or two days because `Google Maven Central` is not mirroring it yet.
   ```
   [ERROR] Failed to execute goal on project spark-core_2.12:
   Could not resolve dependencies for project org.apache.spark:spark-core_2.12:jar:3.0.0-SNAPSHOT:
   Could not find artifact org.lz4:lz4-java:jar:1.7.1 in google-maven-central
   (https://maven-central.storage-download.googleapis.com/repos/central/data/) -> [Help 1]
   ```
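
One way to tell whether the mirror has caught up is to request the artifact directly. The sketch below only builds the URL to check. It assumes the standard Maven repository layout (group ID with dots turned into slashes, then artifact ID, version, and `artifact-version.jar`) and reuses the mirror base URL from the error above. `MirrorCheck` and `jarUrl` are hypothetical names and are not part of Spark's build.

```scala
// Hypothetical helper, not part of Spark's build: maps Maven coordinates
// to the location of the jar under the standard Maven repository layout.
object MirrorCheck {
  // Base URL copied from the error message above.
  val googleMirror: String =
    "https://maven-central.storage-download.googleapis.com/repos/central/data/"

  def jarUrl(base: String, groupId: String, artifactId: String, version: String): String = {
    // The group ID becomes a directory path: org.lz4 -> org/lz4
    val groupPath = groupId.replace('.', '/')
    // Tolerate a base URL with or without a trailing slash.
    val root = if (base.endsWith("/")) base else base + "/"
    s"$root$groupPath/$artifactId/$version/$artifactId-$version.jar"
  }
}
```

Calling `MirrorCheck.jarUrl(MirrorCheck.googleMirror, "org.lz4", "lz4-java", "1.7.1")` gives the URL to probe. A "not found" response there would be consistent with the mirror not having synced the release yet.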





[GitHub] [spark] dongjoon-hyun commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1

2020-01-17 Thread GitBox
dongjoon-hyun commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java 
version to 1.7.1
URL: https://github.com/apache/spark/pull/27271#issuecomment-575873734
 
 
   Yes, I've been waiting for this, @maropu .
   However, it seems that we need to wait for one or two days because `Google Maven Central` is not mirroring it yet.
   ```
   [ERROR] Failed to execute goal on project spark-core_2.12: Could not resolve dependencies for project org.apache.spark:spark-core_2.12:jar:3.0.0-SNAPSHOT: Could not find artifact org.lz4:lz4-java:jar:1.7.1 in google-maven-central (https://maven-central.storage-download.googleapis.com/repos/central/data/) -> [Help 1]
   ```





[GitHub] [spark] AmplabJenkins removed a comment on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #27271: [SPARK-30486][BUILD] Bump 
lz4-java version to 1.7.1
URL: https://github.com/apache/spark/pull/27271#issuecomment-575873349
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #27271: [SPARK-30486][BUILD] Bump 
lz4-java version to 1.7.1
URL: https://github.com/apache/spark/pull/27271#issuecomment-575873351
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21741/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1

2020-01-17 Thread GitBox
AmplabJenkins commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java 
version to 1.7.1
URL: https://github.com/apache/spark/pull/27271#issuecomment-575873349
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1

2020-01-17 Thread GitBox
AmplabJenkins commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java 
version to 1.7.1
URL: https://github.com/apache/spark/pull/27271#issuecomment-575873351
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21741/
   Test PASSed.





[GitHub] [spark] SparkQA commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1

2020-01-17 Thread GitBox
SparkQA commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version 
to 1.7.1
URL: https://github.com/apache/spark/pull/27271#issuecomment-575873282
 
 
   **[Test build #116972 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116972/testReport)** for PR 27271 at commit [`bdf7bfc`](https://github.com/apache/spark/commit/bdf7bfcd2834de8b125d3042c543995721e5e3ea).





[GitHub] [spark] maropu commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1

2020-01-17 Thread GitBox
maropu commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 
1.7.1
URL: https://github.com/apache/spark/pull/27271#issuecomment-575873248
 
 
   FYI: https://github.com/lz4/lz4-java/issues/156





[GitHub] [spark] maropu commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1

2020-01-17 Thread GitBox
maropu commented on issue #27271: [SPARK-30486][BUILD] Bump lz4-java version to 
1.7.1
URL: https://github.com/apache/spark/pull/27271#issuecomment-575873235
 
 
   cc: @HyukjinKwon @dongjoon-hyun @srowen 





[GitHub] [spark] maropu opened a new pull request #27271: [SPARK-30486][BUILD] Bump lz4-java version to 1.7.1

2020-01-17 Thread GitBox
maropu opened a new pull request #27271: [SPARK-30486][BUILD] Bump lz4-java 
version to 1.7.1
URL: https://github.com/apache/spark/pull/27271
 
 
   
   
   ### What changes were proposed in this pull request?
   
   This PR upgrades lz4-java from 1.7.0 to 1.7.1.
   
   ### Why are the changes needed?
   
   This release includes a bug fix for older macOS versions. See the link below for the changes:
   https://github.com/lz4/lz4-java/blob/master/CHANGES.md#171
   
   ### Does this PR introduce any user-facing change?
   
   
   
   ### How was this patch tested?
   
   Existing tests.





[GitHub] [spark] guykhazma edited a comment on issue #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing

2020-01-17 Thread GitBox
guykhazma edited a comment on issue #27157: [SPARK-30475][SQL] File source V2: 
Push data filters for file listing
URL: https://github.com/apache/spark/pull/27157#issuecomment-575870211
 
 
   @gengliangwang thanks for reviewing.
   I agree with your concern; this can be improved in subsequent PRs, which will require a broader change in the V2 file-based DataSources and the V2 API. I'll be glad to help with that.





[GitHub] [spark] guykhazma edited a comment on issue #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing

2020-01-17 Thread GitBox
guykhazma edited a comment on issue #27157: [SPARK-30475][SQL] File source V2: 
Push data filters for file listing
URL: https://github.com/apache/spark/pull/27157#issuecomment-575870211
 
 
   @gengliangwang thanks for reviewing.
   I agree with your concern; this can be improved in subsequent PRs, which will require a broader change in the V2 file-based DataSources. I'll be glad to help with that.





[GitHub] [spark] guykhazma edited a comment on issue #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing

2020-01-17 Thread GitBox
guykhazma edited a comment on issue #27157: [SPARK-30475][SQL] File source V2: 
Push data filters for file listing
URL: https://github.com/apache/spark/pull/27157#issuecomment-575870211
 
 
   @gengliangwang thanks for reviewing.
   I agree with your concern; this can be improved in subsequent PRs, which will require a broader change in the V2 DataSource API. I'll be glad to help with that.





[GitHub] [spark] guykhazma commented on issue #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing

2020-01-17 Thread GitBox
guykhazma commented on issue #27157: [SPARK-30475][SQL] File source V2: Push 
data filters for file listing
URL: https://github.com/apache/spark/pull/27157#issuecomment-575870211
 
 
   @gengliangwang thanks for reviewing.
   I agree with your concern, and this can also be improved in subsequent PRs, which will require a broader change in the V2 DataSource API. I'll be glad to help with that.





[GitHub] [spark] guykhazma commented on a change in pull request #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing

2020-01-17 Thread GitBox
guykhazma commented on a change in pull request #27157: [SPARK-30475][SQL] File 
source V2: Push data filters for file listing
URL: https://github.com/apache/spark/pull/27157#discussion_r368208826
 
 

 ##
 File path: external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala
 ##
 @@ -1575,6 +1576,36 @@ class AvroV2Suite extends AvroSuite {
      }
    }
 
+  test("Avro source v2: support passing data filters to FileScan without partitionFilters") {
+    withTempPath { dir =>
+      Seq(("a", 1, 2), ("b", 1, 2), ("c", 2, 1))
+        .toDF("value", "p1", "p2")
+        .write
+        .format("avro")
+        .option("header", true)
 
 Review comment:
   fixed, missed this by mistake, thanks





[GitHub] [spark] linzebing commented on issue #27223: [SPARK-30511][SPARK-28403][CORE] Don't treat failed/killed speculative tasks as pending in Spark scheduler

2020-01-17 Thread GitBox
linzebing commented on issue #27223: [SPARK-30511][SPARK-28403][CORE] Don't 
treat failed/killed speculative tasks as pending in Spark scheduler
URL: https://github.com/apache/spark/pull/27223#issuecomment-575865048
 
 
   > Seems like the same problem also exists for a normal task when the speculative task finishes before the normal task?
   > 
   > Is it possible to check whether another task attempt has succeeded when we receive a failed taskEnd event, e.g. by asking TaskSchedulerImpl/TaskSetManager, or by just recording those successful tasks in `ExecutorAllocationManager`?
   
   If a speculative task finishes before the normal task, then the normal task will be killed. I have addressed this case in this PR; see the explanation in https://github.com/apache/spark/pull/27223#discussion_r368204585





[GitHub] [spark] AmplabJenkins commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE

2020-01-17 Thread GitBox
AmplabJenkins commented on issue #27260: [SPARK-30549][SQL] Fix the subquery 
shown issue in UI When enable AQE
URL: https://github.com/apache/spark/pull/27260#issuecomment-575864674
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE

2020-01-17 Thread GitBox
AmplabJenkins commented on issue #27260: [SPARK-30549][SQL] Fix the subquery 
shown issue in UI When enable AQE
URL: https://github.com/apache/spark/pull/27260#issuecomment-575864676
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21740/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #27260: [SPARK-30549][SQL] Fix the 
subquery shown issue in UI When enable AQE
URL: https://github.com/apache/spark/pull/27260#issuecomment-575864676
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21740/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #27260: [SPARK-30549][SQL] Fix the 
subquery shown issue in UI When enable AQE
URL: https://github.com/apache/spark/pull/27260#issuecomment-575864674
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE

2020-01-17 Thread GitBox
SparkQA commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown 
issue in UI When enable AQE
URL: https://github.com/apache/spark/pull/27260#issuecomment-575864592
 
 
   **[Test build #116971 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116971/testReport)** for PR 27260 at commit [`3c4c84f`](https://github.com/apache/spark/commit/3c4c84fda772cd4c310ed2b0ece853c84b5eb7af).





[GitHub] [spark] linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] Spark marks intentionally killed speculative tasks as pending leads to holding idle executors

2020-01-17 Thread GitBox
linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] 
Spark marks intentionally killed speculative tasks as pending leads to holding 
idle executors
URL: https://github.com/apache/spark/pull/27223#discussion_r368204585
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
 ##
 @@ -614,18 +614,24 @@ private[spark] class ExecutorAllocationManager(
             stageAttemptToNumRunningTask -= stageAttempt
           }
         }
-        // If the task failed, we expect it to be resubmitted later. To ensure we have
-        // enough resources to run the resubmitted task, we need to mark the scheduler
-        // as backlogged again if it's not already marked as such (SPARK-8366)
-        if (taskEnd.reason != Success) {
-          if (totalPendingTasks() == 0) {
-            allocationManager.onSchedulerBacklogged()
-          }
-          if (taskEnd.taskInfo.speculative) {
-            stageAttemptToSpeculativeTaskIndices.get(stageAttempt).foreach {_.remove(taskIndex)}
-          } else {
-            stageAttemptToTaskIndices.get(stageAttempt).foreach {_.remove(taskIndex)}
-          }
+
+        if (taskEnd.taskInfo.speculative) {
+          stageAttemptToSpeculativeTaskIndices.get(stageAttempt).foreach {_.remove{taskIndex}}
+          stageAttemptToNumSpeculativeTasks(stageAttempt) -= 1
+        }
+
+        // If the task failed (not intentionally killed), we expect it to be resubmitted later. To
+        // ensure we have enough resources to run the resubmitted task, we need to mark the
+        // scheduler as backlogged again if it's not already marked as such (SPARK-8366)
+        taskEnd.reason match {
+          case Success | _: TaskKilled =>
 
 Review comment:
   Inside the brackets, there are two things:
   ```
   if (totalPendingTasks() == 0) {
     allocationManager.onSchedulerBacklogged()
   }
   ```
   This one is straightforward. If a task is intentionally killed, then we don't expect it to be resubmitted, and we don't need to mark the scheduler as backlogged.
   ```
   if (!taskEnd.taskInfo.speculative) {
     stageAttemptToTaskIndices.get(stageAttempt).foreach {_.remove(taskIndex)}
   }
   ```
   If a non-speculative task is intentionally killed, it means the speculative task has succeeded, and no further task for this task index will be resubmitted. In this case, the task index is completed and we shouldn't remove it from `stageAttemptToTaskIndices`. Otherwise, we would have a pending non-speculative task for the task index.
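
The rule described in this comment can be modelled in isolation. The following is a minimal sketch under simplifying assumptions, not the code from the PR: it tracks a single stage, and `EndReason`, `Failed`, `PendingTracker`, `startedTaskIndices`, and `backloggedCalls` are hypothetical stand-ins for Spark's task end reasons and for the listener state in `ExecutorAllocationManager`.

```scala
import scala.collection.mutable

// Simplified stand-ins for Spark's task end reasons; not the real classes.
sealed trait EndReason
case object Success extends EndReason
final case class TaskKilled(reason: String) extends EndReason
final case class Failed(message: String) extends EndReason

// Hypothetical single-stage model of the bookkeeping discussed above.
class PendingTracker(numTasks: Int) {
  // Indices of non-speculative tasks that have been started.
  val startedTaskIndices: mutable.HashSet[Int] = mutable.HashSet.empty[Int]
  // Counts how often the scheduler would have been marked as backlogged.
  var backloggedCalls: Int = 0

  def pendingTasks: Int = numTasks - startedTaskIndices.size

  def onTaskStart(taskIndex: Int, speculative: Boolean): Unit =
    if (!speculative) startedTaskIndices += taskIndex

  def onTaskEnd(taskIndex: Int, speculative: Boolean, reason: EndReason): Unit =
    reason match {
      // Finished or intentionally killed: nothing will be resubmitted,
      // so the index stays recorded and no backlog is signalled.
      case Success | _: TaskKilled => ()
      // Genuine failure: a resubmission is expected.
      case _ =>
        if (pendingTasks == 0) backloggedCalls += 1
        if (!speculative) startedTaskIndices -= taskIndex
    }
}
```

With one task in the stage, ending it with `TaskKilled` leaves `pendingTasks` at 0, while ending it with `Failed` raises `pendingTasks` back to 1 and records one backlog signal. That mirrors the distinction the comment draws between killed and failed tasks.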
   
   





[GitHub] [spark] AmplabJenkins removed a comment on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #26978: [SPARK-29721][SQL] Prune 
unnecessary nested fields from Generate without Project
URL: https://github.com/apache/spark/pull/26978#issuecomment-575864284
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116968/
   Test PASSed.





[GitHub] [spark] gatorsmile commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown issue in UI When enable AQE

2020-01-17 Thread GitBox
gatorsmile commented on issue #27260: [SPARK-30549][SQL] Fix the subquery shown 
issue in UI When enable AQE
URL: https://github.com/apache/spark/pull/27260#issuecomment-575864324
 
 
   retest this please





[GitHub] [spark] AmplabJenkins commented on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project

2020-01-17 Thread GitBox
AmplabJenkins commented on issue #26978: [SPARK-29721][SQL] Prune unnecessary 
nested fields from Generate without Project
URL: https://github.com/apache/spark/pull/26978#issuecomment-575864282
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #26978: [SPARK-29721][SQL] Prune 
unnecessary nested fields from Generate without Project
URL: https://github.com/apache/spark/pull/26978#issuecomment-575864282
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project

2020-01-17 Thread GitBox
AmplabJenkins commented on issue #26978: [SPARK-29721][SQL] Prune unnecessary 
nested fields from Generate without Project
URL: https://github.com/apache/spark/pull/26978#issuecomment-575864284
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116968/
   Test PASSed.





[GitHub] [spark] SparkQA removed a comment on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project

2020-01-17 Thread GitBox
SparkQA removed a comment on issue #26978: [SPARK-29721][SQL] Prune unnecessary 
nested fields from Generate without Project
URL: https://github.com/apache/spark/pull/26978#issuecomment-575841050
 
 
   **[Test build #116968 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116968/testReport)** for PR 26978 at commit [`19f7cd4`](https://github.com/apache/spark/commit/19f7cd4950e88ad79c2a8f7a5758577513261d91).





[GitHub] [spark] SparkQA commented on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project

2020-01-17 Thread GitBox
SparkQA commented on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested 
fields from Generate without Project
URL: https://github.com/apache/spark/pull/26978#issuecomment-575864178
 
 
   **[Test build #116968 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116968/testReport)** for PR 26978 at commit [`19f7cd4`](https://github.com/apache/spark/commit/19f7cd4950e88ad79c2a8f7a5758577513261d91).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] Spark marks intentionally killed speculative tasks as pending leads to holding idle executors

2020-01-17 Thread GitBox
linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] 
Spark marks intentionally killed speculative tasks as pending leads to holding 
idle executors
URL: https://github.com/apache/spark/pull/27223#discussion_r368203963
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
 ##
 @@ -614,18 +614,24 @@ private[spark] class ExecutorAllocationManager(
             stageAttemptToNumRunningTask -= stageAttempt
           }
         }
-        // If the task failed, we expect it to be resubmitted later. To ensure we have
-        // enough resources to run the resubmitted task, we need to mark the scheduler
-        // as backlogged again if it's not already marked as such (SPARK-8366)
-        if (taskEnd.reason != Success) {
-          if (totalPendingTasks() == 0) {
-            allocationManager.onSchedulerBacklogged()
-          }
-          if (taskEnd.taskInfo.speculative) {
-            stageAttemptToSpeculativeTaskIndices.get(stageAttempt).foreach {_.remove(taskIndex)}
-          } else {
-            stageAttemptToTaskIndices.get(stageAttempt).foreach {_.remove(taskIndex)}
-          }
+
+        if (taskEnd.taskInfo.speculative) {
 
 Review comment:
   Note that we already remove the pending task with 
`stageAttemptToNumSpeculativeTasks(stageAttempt) -= 1`. @Ngone51 





[GitHub] [spark] linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] Spark marks intentionally killed speculative tasks as pending leads to holding idle executors

2020-01-17 Thread GitBox
linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] 
Spark marks intentionally killed speculative tasks as pending leads to holding 
idle executors
URL: https://github.com/apache/spark/pull/27223#discussion_r368203860
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
 ##
 @@ -263,9 +263,15 @@ private[spark] class ExecutorAllocationManager(
    */
   private def maxNumExecutorsNeeded(): Int = {
     val numRunningOrPendingTasks = listener.totalPendingTasks + listener.totalRunningTasks
-    math.ceil(numRunningOrPendingTasks * executorAllocationRatio /
-      tasksPerExecutorForFullParallelism)
-      .toInt
+    val maxNeeded = math.ceil(numRunningOrPendingTasks * executorAllocationRatio /
+      tasksPerExecutorForFullParallelism).toInt
+    if (listener.pendingSpeculativeTasks > 0 && tasksPerExecutorForFullParallelism > 1) {
 
 Review comment:
   Because if we are using 1 task per executor, we don't need to allocate an 
extra executor for locality requirements.





[GitHub] [spark] linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] Spark marks intentionally killed speculative tasks as pending leads to holding idle executors

2020-01-17 Thread GitBox
linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] 
Spark marks intentionally killed speculative tasks as pending leads to holding 
idle executors
URL: https://github.com/apache/spark/pull/27223#discussion_r368203775
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
 ##
 @@ -263,9 +263,15 @@ private[spark] class ExecutorAllocationManager(
    */
   private def maxNumExecutorsNeeded(): Int = {
     val numRunningOrPendingTasks = listener.totalPendingTasks + listener.totalRunningTasks
-    math.ceil(numRunningOrPendingTasks * executorAllocationRatio /
-      tasksPerExecutorForFullParallelism)
-      .toInt
+    val maxNeeded = math.ceil(numRunningOrPendingTasks * executorAllocationRatio /
+      tasksPerExecutorForFullParallelism).toInt
+    if (listener.pendingSpeculativeTasks > 0 && tasksPerExecutorForFullParallelism > 1) {
+      // If we have pending speculative tasks, allocate one more executor to satisfy the
+      // locality requirements of speculative tasks
+      maxNeeded + 1
 
 Review comment:
   As the code comment says, this is to satisfy the locality requirements of 
speculative tasks. Say we have 1 normal task and 1 speculative task (with a 
setting of 4 tasks per executor): in this case we should allocate 2 executors 
instead of 1.
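
   The arithmetic behind that example can be sketched as below. This is only an 
illustration, not the code in the PR: `MaxExecutorsSketch` and its parameter 
names are hypothetical stand-ins for the listener and configuration values, the 
allocation ratio is assumed to be 1.0, the pending count is assumed to already 
include the speculative task, and the `else` branch (returning `maxNeeded` 
unchanged) is assumed because the quoted hunk stops before it.

```scala
object MaxExecutorsSketch {
  // Hypothetical, self-contained version of the patched calculation.
  // The real method reads these values from the listener and the Spark conf.
  def maxNumExecutorsNeeded(
      pendingTasks: Int,
      runningTasks: Int,
      pendingSpeculativeTasks: Int,
      tasksPerExecutor: Int,
      executorAllocationRatio: Double = 1.0): Int = {
    val numRunningOrPendingTasks = pendingTasks + runningTasks
    val maxNeeded =
      math.ceil(numRunningOrPendingTasks * executorAllocationRatio / tasksPerExecutor).toInt
    // One extra executor only when speculative tasks are pending and several
    // tasks share an executor (assumed else branch: leave maxNeeded as is).
    if (pendingSpeculativeTasks > 0 && tasksPerExecutor > 1) maxNeeded + 1 else maxNeeded
  }

  def main(args: Array[String]): Unit = {
    // 1 running normal task + 1 pending speculative task, 4 tasks per executor:
    // ceil(2 / 4) = 1, plus 1 for the speculative task.
    println(maxNumExecutorsNeeded(1, 1, 1, 4)) // prints 2
    // Same tasks with 1 task per executor: ceil(2 / 1) = 2, no extra executor.
    println(maxNumExecutorsNeeded(1, 1, 1, 1)) // prints 2
    // No speculative task pending, 4 tasks per executor: ceil(2 / 4) = 1.
    println(maxNumExecutorsNeeded(1, 1, 0, 4)) // prints 1
  }
}
```

   With 1 task per executor the ceiling already yields one executor per task, 
which is why the extra executor is only added when 
`tasksPerExecutorForFullParallelism > 1`.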





[GitHub] [spark] linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] Spark marks intentionally killed speculative tasks as pending leads to holding idle executors

2020-01-17 Thread GitBox
linzebing commented on a change in pull request #27223: [SPARK-30511][CORE] 
Spark marks intentionally killed speculative tasks as pending leads to holding 
idle executors
URL: https://github.com/apache/spark/pull/27223#discussion_r368203775
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
 ##
 @@ -263,9 +263,15 @@ private[spark] class ExecutorAllocationManager(
    */
   private def maxNumExecutorsNeeded(): Int = {
     val numRunningOrPendingTasks = listener.totalPendingTasks + listener.totalRunningTasks
-    math.ceil(numRunningOrPendingTasks * executorAllocationRatio /
-      tasksPerExecutorForFullParallelism)
-      .toInt
+    val maxNeeded = math.ceil(numRunningOrPendingTasks * executorAllocationRatio /
+      tasksPerExecutorForFullParallelism).toInt
+    if (listener.pendingSpeculativeTasks > 0 && tasksPerExecutorForFullParallelism > 1) {
+      // If we have pending speculative tasks, allocate one more executor to satisfy the
+      // locality requirements of speculative tasks
+      maxNeeded + 1
 
 Review comment:
   As the code comment says, this is to satisfy the locality requirements of 
speculative tasks. Say we have 1 normal task and 1 speculative task: in this 
case we should allocate 2 executors instead of 1.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add 
common classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575859310
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116970/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add 
common classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575859307
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
SparkQA removed a comment on issue #27245: [SPARK-29212][ML][PYSPARK] Add 
common classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575857893
 
 
   **[Test build #116970 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116970/testReport)**
 for PR 27245 at commit 
[`d8266c2`](https://github.com/apache/spark/commit/d8266c26b2e793db0d42b708861a5684d2b5adcb).





[GitHub] [spark] AmplabJenkins commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
AmplabJenkins commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common 
classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575859307
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
AmplabJenkins commented on issue #27245: [SPARK-29212][ML][PYSPARK] Add common 
classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575859310
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116970/
   Test PASSed.





[GitHub] [spark] SparkQA commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
SparkQA commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common 
classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575859247
 
 
   **[Test build #116970 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116970/testReport)**
 for PR 27245 at commit 
[`d8266c2`](https://github.com/apache/spark/commit/d8266c26b2e793db0d42b708861a5684d2b5adcb).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] vanzin commented on issue #26586: [SPARK-29950][k8s] Blacklist deleted executors in K8S with dynamic allocation.

2020-01-17 Thread GitBox
vanzin commented on issue #26586: [SPARK-29950][k8s] Blacklist deleted 
executors in K8S with dynamic allocation.
URL: https://github.com/apache/spark/pull/26586#issuecomment-575858951
 
 
   That's the same "Launcher client dependencies" test that seems super flaky, 
and has failed with the same error in other PRs before this one was merged.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #27245: 
[WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575857986
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21739/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add 
common classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575857983
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add 
common classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575857986
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21739/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #27245: 
[WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575857983
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
SparkQA commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common 
classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575857893
 
 
   **[Test build #116970 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116970/testReport)**
 for PR 27245 at commit 
[`d8266c2`](https://github.com/apache/spark/commit/d8266c26b2e793db0d42b708861a5684d2b5adcb).





[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #27245: 
[WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575856520
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116969/
   Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add 
common classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575856518
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] SparkQA commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
SparkQA commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common 
classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575856476
 
 
   **[Test build #116969 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116969/testReport)**
 for PR 27245 at commit 
[`bc2ebe8`](https://github.com/apache/spark/commit/bc2ebe8e56eb2be61c2d1577a7b12be171a588f8).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `class _PredictorParams(HasLabelCol, HasFeaturesCol, HasPredictionCol):`
 * `class Predictor(Estimator, _PredictorParams):`
 * `class PredictionModel(Model, _PredictorParams):`
 * `class _ClassifierParams(HasRawPredictionCol, _PredictorParams):`
 * `class Classifier(Predictor, _ClassifierParams):`
 * `class ClassificationModel(PredictionModel, _ClassifierParams):`
 * `class _ProbabilisticClassifierParams(HasProbabilityCol, HasThresholds, _ClassifierParams):`
 * `class ProbabilisticClassifier(Classifier, _ProbabilisticClassifierParams):`
 * `class ProbabilisticClassificationModel(ClassificationModel,`
 * `class _JavaClassifier(Classifier, JavaPredictor):`
 * `class _JavaClassificationModel(ClassificationModel, JavaPredictionModel):`
 * `class _JavaProbabilisticClassifier(ProbabilisticClassifier, _JavaClassifier):`
 * `class _JavaProbabilisticClassificationModel(ProbabilisticClassificationModel,`
 * `class _LinearSVCParams(_ClassifierParams, HasRegParam, HasMaxIter, HasFitIntercept, HasTol,`
 * `class LinearSVC(_JavaClassifier, _LinearSVCParams, JavaMLWritable, JavaMLReadable):`
 * `class LinearSVCModel(_JavaClassificationModel, _LinearSVCParams, JavaMLWritable, JavaMLReadable):`
 * `class _LogisticRegressionParams(_ProbabilisticClassifierParams, HasRegParam,`
 * `class LogisticRegression(_JavaProbabilisticClassifier, _LogisticRegressionParams, JavaMLWritable,`
 * `class LogisticRegressionModel(_JavaProbabilisticClassificationModel, _LogisticRegressionParams,`
 * `class DecisionTreeClassifier(_JavaProbabilisticClassifier, _DecisionTreeClassifierParams,`
 * `class DecisionTreeClassificationModel(_DecisionTreeModel, _JavaProbabilisticClassificationModel,`
 * `class RandomForestClassifier(_JavaProbabilisticClassifier, _RandomForestClassifierParams,`
 * `class RandomForestClassificationModel(_TreeEnsembleModel, _JavaProbabilisticClassificationModel,`
 * `class GBTClassifier(_JavaProbabilisticClassifier, _GBTClassifierParams,`
 * `class GBTClassificationModel(_TreeEnsembleModel, _JavaProbabilisticClassificationModel,`
 * `class _NaiveBayesParams(_PredictorParams, HasWeightCol):`
 * `class NaiveBayes(_JavaProbabilisticClassifier, _NaiveBayesParams, HasThresholds, HasWeightCol,`
 * `class NaiveBayesModel(_JavaProbabilisticClassificationModel, _NaiveBayesParams, JavaMLWritable,`
 * `class _MultilayerPerceptronParams(_ProbabilisticClassifierParams, HasSeed, HasMaxIter,`
 * `class MultilayerPerceptronClassifier(_JavaProbabilisticClassifier, _MultilayerPerceptronParams,`
 * `class MultilayerPerceptronClassificationModel(_JavaProbabilisticClassificationModel,`
 * `class _OneVsRestParams(_ClassifierParams, HasWeightCol):`
 * `class FMClassifier(_JavaProbabilisticClassifier, _FactorizationMachinesParams, JavaMLWritable,`
 * `class FMClassificationModel(_JavaProbabilisticClassificationModel, _FactorizationMachinesParams,`
 * `class Regressor(Predictor, _PredictorParams):`
 * `class RegressionModel(PredictionModel, _PredictorParams):`
 * `class _JavaRegressor(Regressor, JavaPredictor):`
 * `class _JavaRegressionModel(RegressionModel, JavaPredictionModel):`
 * `class _LinearRegressionParams(_PredictorParams, HasRegParam, HasElasticNetParam, HasMaxIter,`
 * `class LinearRegression(_JavaRegressor, _LinearRegressionParams, JavaMLWritable, JavaMLReadable):`
 * `class LinearRegressionModel(_JavaRegressionModel, _LinearRegressionParams, GeneralJavaMLWritable,`
 * `class DecisionTreeRegressor(_JavaRegressor, _DecisionTreeRegressorParams, JavaMLWritable,`
 * `class DecisionTreeRegressionModel(`
 * `class RandomForestRegressor(_JavaRegressor, _RandomForestRegressorParams, JavaMLWritable,`
 * `class RandomForestRegressionModel(`
 * `class GBTRegressor(_JavaRegressor, _GBTRegressorParams, JavaMLWritable, JavaMLReadable):`
 * `class GBTRegressionModel(`
 * `class _AFTSurvivalRegressionParams(_PredictorParams, HasMaxIter, HasTol, HasFitIntercept,`
 * `class AFTSurvivalRegression(_JavaRegressor, _AFTSurvivalRegressionParams,`
 * `class AFTSurvivalRegressionModel(_JavaRegressionModel, _AFTSurvivalRegressionParams,`
 * `class _GeneralizedLinearRegressionParams(_PredictorParams, HasFitIntercept, HasMaxIter,`
 * `class Generalized

[GitHub] [spark] AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add 
common classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575856520
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116969/
   Test FAILed.





[GitHub] [spark] SparkQA removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
SparkQA removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add 
common classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575855200
 
 
   **[Test build #116969 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116969/testReport)**
 for PR 27245 at commit 
[`bc2ebe8`](https://github.com/apache/spark/commit/bc2ebe8e56eb2be61c2d1577a7b12be171a588f8).





[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #27245: 
[WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575856518
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #27245: 
[WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575855305
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add 
common classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575855306
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21738/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
AmplabJenkins commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add 
common classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575855305
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #27245: 
[WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575855306
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21738/
   Test PASSed.





[GitHub] [spark] zero323 commented on issue #27241: [SPARK-30533][ML][PYSPARK] Add classes to represent Java Regressors and RegressionModels

2020-01-17 Thread GitBox
zero323 commented on issue #27241: [SPARK-30533][ML][PYSPARK] Add classes to 
represent Java Regressors and RegressionModels
URL: https://github.com/apache/spark/pull/27241#issuecomment-575855169
 
 
   Thanks @huaxingao @srowen @zhengruifeng 





[GitHub] [spark] SparkQA commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common classes without using JVM backend

2020-01-17 Thread GitBox
SparkQA commented on issue #27245: [WIP][SPARK-29212][ML][PYSPARK] Add common 
classes without using JVM backend
URL: https://github.com/apache/spark/pull/27245#issuecomment-575855200
 
 
   **[Test build #116969 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116969/testReport)**
 for PR 27245 at commit 
[`bc2ebe8`](https://github.com/apache/spark/commit/bc2ebe8e56eb2be61c2d1577a7b12be171a588f8).





[GitHub] [spark] dongjoon-hyun commented on issue #27268: [SPARK-30553][DOCS] fix structured-streaming java example error

2020-01-17 Thread GitBox
dongjoon-hyun commented on issue #27268: [SPARK-30553][DOCS] fix 
structured-streaming java example error
URL: https://github.com/apache/spark/pull/27268#issuecomment-575854721
 
 
   Oh, got it. Thank you for checking.





[GitHub] [spark] fuwhu commented on a change in pull request #26805: [SPARK-15616][SQL] Add optimizer rule PruneHiveTablePartitions

2020-01-17 Thread GitBox
fuwhu commented on a change in pull request #26805: [SPARK-15616][SQL] Add 
optimizer rule PruneHiveTablePartitions
URL: https://github.com/apache/spark/pull/26805#discussion_r368196763
 
 

 ##
 File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/PruneHiveTablePartitions.scala
 ##
 @@ -0,0 +1,109 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive.execution
+
+import org.apache.hadoop.hive.common.StatsSetupConst
+
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.catalyst.analysis.CastSupport
+import org.apache.spark.sql.catalyst.catalog.{CatalogStatistics, CatalogTable, 
CatalogTablePartition, ExternalCatalogUtils, HiveTableRelation}
+import org.apache.spark.sql.catalyst.expressions.{And, AttributeSet, 
Expression, ExpressionSet, SubqueryExpression}
+import org.apache.spark.sql.catalyst.planning.PhysicalOperation
+import org.apache.spark.sql.catalyst.plans.logical.{Filter, LogicalPlan, 
Project}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.execution.datasources.DataSourceStrategy
+import org.apache.spark.sql.internal.SQLConf
+
+/**
+ * TODO: merge this with PruneFileSourcePartitions after we completely make 
hive as a data source.
+ */
+private[sql] class PruneHiveTablePartitions(session: SparkSession)
+  extends Rule[LogicalPlan] with CastSupport {
+
+  override val conf: SQLConf = session.sessionState.conf
+
+  /**
+   * Extract the partition filters from the filters on the table.
+   */
+  private def getPartitionKeyFilters(
+  filters: Seq[Expression],
+  relation: HiveTableRelation): ExpressionSet = {
+val normalizedFilters = DataSourceStrategy.normalizeExprs(
+  filters.filter(f => f.deterministic && 
!SubqueryExpression.hasSubquery(f)), relation.output)
+val partitionColumnSet = AttributeSet(relation.partitionCols)
+ExpressionSet(normalizedFilters.filter { f =>
+  !f.references.isEmpty && f.references.subsetOf(partitionColumnSet)
+})
+  }
+
+  /**
+   * Prune the hive table using filters on the partitions of the table.
+   */
+  private def prunePartitions(
+  relation: HiveTableRelation,
+  partitionFilters: ExpressionSet): Seq[CatalogTablePartition] = {
+if (conf.metastorePartitionPruning) {
+  session.sessionState.catalog.listPartitionsByFilter(
+relation.tableMeta.identifier, partitionFilters.toSeq)
+} else {
+  ExternalCatalogUtils.prunePartitionsByFilter(relation.tableMeta,
+
session.sessionState.catalog.listPartitions(relation.tableMeta.identifier),
+partitionFilters.toSeq, conf.sessionLocalTimeZone)
+}
+  }
+
+  /**
+   * Update the statistics of the table.
+   */
+  private def updateTableMeta(
+  tableMeta: CatalogTable,
+  prunedPartitions: Seq[CatalogTablePartition]): CatalogTable = {
+val sizeOfPartitions = prunedPartitions.map { partition =>
+  val rawDataSize = 
partition.parameters.get(StatsSetupConst.RAW_DATA_SIZE).map(_.toLong)
+  val totalSize = 
partition.parameters.get(StatsSetupConst.TOTAL_SIZE).map(_.toLong)
+  if (rawDataSize.isDefined && rawDataSize.get > 0) {
+rawDataSize.get
+  } else if (totalSize.isDefined && totalSize.get > 0L) {
+totalSize.get
+  } else {
+0L
+  }
+}
+if (sizeOfPartitions.forall(s => s>0)) {
+  val sizeInBytes = sizeOfPartitions.sum
+  tableMeta.copy(stats = Some(CatalogStatistics(sizeInBytes = 
BigInt(sizeInBytes))))
+} else {
+  tableMeta
+}
+  }
+
+  override def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
+case op @ PhysicalOperation(projections, filters, relation: 
HiveTableRelation)
+  if filters.nonEmpty && relation.isPartitioned && 
relation.prunedPartitions.isEmpty =>
+  val partitionKeyFilters = getPartitionKeyFilters(filters, relation)
+  if (partitionKeyFilters.nonEmpty) {
+val newPartitions = prunePartitions(relation, partitionKeyFilters)
+val newTableMeta = updateTableMeta(relation.tableMeta, newPartitions)
+val newRelation = relation.copy(
+  tableMeta = newTableMeta, prunedPartitions = Some(newPartitions))
+Projec

[GitHub] [spark] srowen commented on issue #27241: [SPARK-30533][ML][PYSPARK] Add classes to represent Java Regressors and RegressionModels

2020-01-17 Thread GitBox
srowen commented on issue #27241: [SPARK-30533][ML][PYSPARK] Add classes to 
represent Java Regressors and RegressionModels
URL: https://github.com/apache/spark/pull/27241#issuecomment-575853279
 
 
   Merged to master





[GitHub] [spark] srowen closed pull request #27241: [SPARK-30533][ML][PYSPARK] Add classes to represent Java Regressors and RegressionModels

2020-01-17 Thread GitBox
srowen closed pull request #27241: [SPARK-30533][ML][PYSPARK] Add classes to 
represent Java Regressors and RegressionModels
URL: https://github.com/apache/spark/pull/27241
 
 
   





[GitHub] [spark] fuwhu commented on issue #27232: [SPARK-30525][SQL]HiveTableScanExec do not need to prune partitions again after pushing down to hive metastore

2020-01-17 Thread GitBox
fuwhu commented on issue #27232: [SPARK-30525][SQL]HiveTableScanExec do not 
need to prune partitions again after pushing down to hive metastore
URL: https://github.com/apache/spark/pull/27232#issuecomment-575852899
 
 
   cc @cloud-fan 





[GitHub] [spark] bettermouse edited a comment on issue #27268: [SPARK-30553][DOCS] fix structured-streaming java example error

2020-01-17 Thread GitBox
bettermouse edited a comment on issue #27268: [SPARK-30553][DOCS] fix 
structured-streaming java example error
URL: https://github.com/apache/spark/pull/27268#issuecomment-575852494
 
 
   @dongjoon-hyun I have checked it. The class 
JavaStructuredNetworkWordCountWindowed does not use the withWatermark API, so 
there is no problem.
   





[GitHub] [spark] bettermouse commented on issue #27268: [SPARK-30553][DOCS] fix structured-streaming java example error

2020-01-17 Thread GitBox
bettermouse commented on issue #27268: [SPARK-30553][DOCS] fix 
structured-streaming java example error
URL: https://github.com/apache/spark/pull/27268#issuecomment-575852494
 
 
   I have checked it. The class JavaStructuredNetworkWordCountWindowed does not 
use the withWatermark API, so there is no problem.
   





[GitHub] [spark] dongjoon-hyun commented on issue #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect

2020-01-17 Thread GitBox
dongjoon-hyun commented on issue #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a 
migration guide for MsSQLServer JDBC dialect
URL: https://github.com/apache/spark/pull/27270#issuecomment-575852458
 
 
   You are faster than me. :)





[GitHub] [spark] maropu commented on issue #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect

2020-01-17 Thread GitBox
maropu commented on issue #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration 
guide for MsSQLServer JDBC dialect
URL: https://github.com/apache/spark/pull/27270#issuecomment-575852189
 
 
   hahaha, I was a bit late ;)





[GitHub] [spark] dongjoon-hyun commented on issue #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect

2020-01-17 Thread GitBox
dongjoon-hyun commented on issue #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a 
migration guide for MsSQLServer JDBC dialect
URL: https://github.com/apache/spark/pull/27270#issuecomment-575851846
 
 
   Thank you, @maropu !





[GitHub] [spark] dongjoon-hyun closed pull request #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect

2020-01-17 Thread GitBox
dongjoon-hyun closed pull request #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a 
migration guide for MsSQLServer JDBC dialect
URL: https://github.com/apache/spark/pull/27270
 
 
   





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect

2020-01-17 Thread GitBox
dongjoon-hyun commented on a change in pull request #27270: 
[SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect
URL: https://github.com/apache/spark/pull/27270#discussion_r368195075
 
 

 ##
 File path: docs/sql-migration-guide.md
 ##
 @@ -344,6 +344,12 @@ license: |
 
   - Since Spark 2.4.5, `TRUNCATE TABLE` command tries to set back original 
permission and ACLs during re-creating the table/partition paths. To restore 
the behaviour of earlier versions, set 
`spark.sql.truncateTable.ignorePermissionAcl.enabled` to `true`.
 
+  - Since Spark 2.4.5, `spark.sql.legacy.mssqlserver.numericMapping.enabled` 
configuration is added in order to support the legacy MsSQLServer dialect 
mapping behavior using IntegerType and DoubleType for SMALLINT and REAL JDBC 
types, respectively. To restore the behaviour of 2.4.3 and earlier versions, 
set `spark.sql.legacy.mssqlserver.numericMapping.enabled` to `true`.
+
+## Upgrading from Spark SQL 2.4.3 to 2.4.4
 
 Review comment:
   Thanks!
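   
   A minimal sketch of how the legacy mapping described in the migration note 
above might be enabled, assuming a local `SparkSession`. Only the configuration 
key comes from the guide text; the application name, master URL, and object 
name are illustrative assumptions.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical driver object; only the configuration key is taken from the
// migration guide entry quoted above.
object LegacyMsSqlServerMapping {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("legacy-mssqlserver-mapping") // assumed name
      .master("local[*]")                    // assumed local run
      // Restore the 2.4.3-and-earlier mapping: SMALLINT -> IntegerType,
      // REAL -> DoubleType.
      .config("spark.sql.legacy.mssqlserver.numericMapping.enabled", "true")
      .getOrCreate()

    // Read the setting back to confirm it is visible in the session.
    println(spark.conf.get("spark.sql.legacy.mssqlserver.numericMapping.enabled"))

    spark.stop()
  }
}
```

   The same key could presumably also be passed with `--conf` to 
`spark-submit`; setting it in the session builder keeps the sketch 
self-contained.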





[GitHub] [spark] dongjoon-hyun commented on issue #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect

2020-01-17 Thread GitBox
dongjoon-hyun commented on issue #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a 
migration guide for MsSQLServer JDBC dialect
URL: https://github.com/apache/spark/pull/27270#issuecomment-575851702
 
 
   Merged to master/2.4.





[GitHub] [spark] dongjoon-hyun closed pull request #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories

2020-01-17 Thread GitBox
dongjoon-hyun closed pull request #27130: [SPARK-25993][SQL][TESTS] Add test 
cases for CREATE EXTERNAL TABLE with subdirectories
URL: https://github.com/apache/spark/pull/27130
 
 
   





[GitHub] [spark] viirya commented on a change in pull request #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect

2020-01-17 Thread GitBox
viirya commented on a change in pull request #27270: 
[SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect
URL: https://github.com/apache/spark/pull/27270#discussion_r368194683
 
 

 ##
 File path: docs/sql-migration-guide.md
 ##
 @@ -344,6 +344,12 @@ license: |
 
   - Since Spark 2.4.5, `TRUNCATE TABLE` command tries to set back original 
permission and ACLs during re-creating the table/partition paths. To restore 
the behaviour of earlier versions, set 
`spark.sql.truncateTable.ignorePermissionAcl.enabled` to `true`.
 
+  - Since Spark 2.4.5, `spark.sql.legacy.mssqlserver.numericMapping.enabled` 
configuration is added in order to support the legacy MsSQLServer dialect 
mapping behavior using IntegerType and DoubleType for SMALLINT and REAL JDBC 
types, respectively. To restore the behaviour of 2.4.3 and earlier versions, 
set `spark.sql.legacy.mssqlserver.numericMapping.enabled` to `true`.
+
+## Upgrading from Spark SQL 2.4.3 to 2.4.4
 
 Review comment:
   ok sounds good. Thanks!





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect

2020-01-17 Thread GitBox
dongjoon-hyun commented on a change in pull request #27270: 
[SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect
URL: https://github.com/apache/spark/pull/27270#discussion_r368193217
 
 

 ##
 File path: docs/sql-migration-guide.md
 ##
 @@ -344,6 +344,12 @@ license: |
 
   - Since Spark 2.4.5, `TRUNCATE TABLE` command tries to set back original 
permission and ACLs during re-creating the table/partition paths. To restore 
the behaviour of earlier versions, set 
`spark.sql.truncateTable.ignorePermissionAcl.enabled` to `true`.
 
+  - Since Spark 2.4.5, `spark.sql.legacy.mssqlserver.numericMapping.enabled` 
configuration is added in order to support the legacy MsSQLServer dialect 
mapping behavior using IntegerType and DoubleType for SMALLINT and REAL JDBC 
types, respectively. To restore the behaviour of 2.4.3 and earlier versions, 
set `spark.sql.legacy.mssqlserver.numericMapping.enabled` to `true`.
+
+## Upgrading from Spark SQL 2.4.3 to 2.4.4
 
 Review comment:
   For `2.4.4` release doc, I'll update `spark-website` repository.





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27270: [SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect

2020-01-17 Thread GitBox
dongjoon-hyun commented on a change in pull request #27270: 
[SPARK-28152][DOCS][FOLLOWUP] Add a migration guide for MsSQLServer JDBC dialect
URL: https://github.com/apache/spark/pull/27270#discussion_r368193120
 
 

 ##
 File path: docs/sql-migration-guide.md
 ##
 @@ -344,6 +344,12 @@ license: |
 
   - Since Spark 2.4.5, `TRUNCATE TABLE` command tries to set back original 
permission and ACLs during re-creating the table/partition paths. To restore 
the behaviour of earlier versions, set 
`spark.sql.truncateTable.ignorePermissionAcl.enabled` to `true`.
 
+  - Since Spark 2.4.5, `spark.sql.legacy.mssqlserver.numericMapping.enabled` 
configuration is added in order to support the legacy MsSQLServer dialect 
mapping behavior using IntegerType and DoubleType for SMALLINT and REAL JDBC 
types, respectively. To restore the behaviour of 2.4.3 and earlier versions, 
set `spark.sql.legacy.mssqlserver.numericMapping.enabled` to `true`.
+
+## Upgrading from Spark SQL 2.4.3 to 2.4.4
 
 Review comment:
   IMO, although it's late for 2.4.4, `2.4.3` to `2.4.4` will be correct. When 
the users upgrade from 1.6.3 to 3.0.0, they need to see all previous migration 
guides. If there is some regression on 2.4.5, the users can use 2.4.4 instead 
of 2.4.5.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add 
test cases for CREATE EXTERNAL TABLE with subdirectories
URL: https://github.com/apache/spark/pull/27130#issuecomment-575849751
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116967/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories

2020-01-17 Thread GitBox
AmplabJenkins commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test 
cases for CREATE EXTERNAL TABLE with subdirectories
URL: https://github.com/apache/spark/pull/27130#issuecomment-575849745
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories

2020-01-17 Thread GitBox
AmplabJenkins commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test 
cases for CREATE EXTERNAL TABLE with subdirectories
URL: https://github.com/apache/spark/pull/27130#issuecomment-575849751
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/116967/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add 
test cases for CREATE EXTERNAL TABLE with subdirectories
URL: https://github.com/apache/spark/pull/27130#issuecomment-575849745
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories

2020-01-17 Thread GitBox
SparkQA removed a comment on issue #27130: [SPARK-25993][SQL][TESTS] Add test 
cases for CREATE EXTERNAL TABLE with subdirectories
URL: https://github.com/apache/spark/pull/27130#issuecomment-575824139
 
 
   **[Test build #116967 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116967/testReport)**
 for PR 27130 at commit 
[`39f271f`](https://github.com/apache/spark/commit/39f271f23278c334a8230408703201276e7292ac).





[GitHub] [spark] SparkQA commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for CREATE EXTERNAL TABLE with subdirectories

2020-01-17 Thread GitBox
SparkQA commented on issue #27130: [SPARK-25993][SQL][TESTS] Add test cases for 
CREATE EXTERNAL TABLE with subdirectories
URL: https://github.com/apache/spark/pull/27130#issuecomment-575849490
 
 
   **[Test build #116967 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116967/testReport)**
 for PR 27130 at commit 
[`39f271f`](https://github.com/apache/spark/commit/39f271f23278c334a8230408703201276e7292ac).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] gengliangwang commented on issue #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing

2020-01-17 Thread GitBox
gengliangwang commented on issue #27157: [SPARK-30475][SQL] File source V2: 
Push data filters for file listing
URL: https://github.com/apache/spark/pull/27157#issuecomment-575848049
 
 
   @guykhazma Sorry for the late reply.
   I was thinking about another approach, but I haven't come up with a better 
one yet.
   
   My major concern is that the filters are supposed to be pushed down in the 
`FileScanBuilder`. It is weird to push them down again in the `FileScan`. 
Technically, the partition filters should be pushed down in `FileScanBuilder` 
as well.
   However, the current DSV2 API exposes the filters only as `Filter`, not as 
`Expression`, and the coverage of `Filter` is limited. That's why I pushed the 
partition filters into `FileScan` in https://github.com/apache/spark/pull/27112.
   
   Keeping the behavior in V2 is also important. I will merge this one. We can 
improve the approach in the future.
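   
   A minimal sketch of the partition-filter versus data-filter distinction 
discussed above. It uses a simplified, hypothetical `SimpleFilter` type rather 
than Spark's `Filter` or `Expression` classes, and it does not reproduce the 
`FileScanBuilder` or `FileScan` code paths; it only shows how filters might be 
split by the columns they reference.

```scala
object FilterSplitSketch {
  // Hypothetical stand-in for a filter: a label plus the columns it references.
  final case class SimpleFilter(description: String, references: Set[String])

  // Filters that reference only partition columns can prune partitions;
  // every other filter is treated as a data filter.
  def splitFilters(
      filters: Seq[SimpleFilter],
      partitionColumns: Set[String]): (Seq[SimpleFilter], Seq[SimpleFilter]) =
    filters.partition { f =>
      f.references.nonEmpty && f.references.subsetOf(partitionColumns)
    }

  def main(args: Array[String]): Unit = {
    val filters = Seq(
      SimpleFilter("p1 = 1", Set("p1")),
      SimpleFilter("value = 'a'", Set("value")),
      SimpleFilter("p1 = 1 AND value = 'a'", Set("p1", "value")))

    val (partitionFilters, dataFilters) = splitFilters(filters, Set("p1", "p2"))
    println(s"partition filters: ${partitionFilters.map(_.description)}")
    println(s"data filters: ${dataFilters.map(_.description)}")
  }
}
```

   A filter that mixes partition and data columns lands on the data side in 
this sketch, since it cannot be evaluated from partition values alone.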
   





[GitHub] [spark] gengliangwang commented on a change in pull request #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing

2020-01-17 Thread GitBox
gengliangwang commented on a change in pull request #27157: [SPARK-30475][SQL] 
File source V2: Push data filters for file listing
URL: https://github.com/apache/spark/pull/27157#discussion_r368189535
 
 

 ##
 File path: 
external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala
 ##
 @@ -1575,6 +1576,36 @@ class AvroV2Suite extends AvroSuite {
 }
   }
 
+  test("Avro source v2: support passing data filters to FileScan without 
partitionFilters") {
+withTempPath { dir =>
+  Seq(("a", 1, 2), ("b", 1, 2), ("c", 2, 1))
+.toDF("value", "p1", "p2")
+.write
+.format("avro")
+.option("header", true)
 
 Review comment:
   For Avro data source, `.option("header", true)` is not needed.
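   
   A minimal sketch of the same write and read round trip without the 
`header` option, which is a CSV option and has no role for Avro. It assumes 
the `spark-avro` module is on the classpath; the object name, application 
name, and temporary directory are illustrative assumptions, not part of the 
test under review.

```scala
import java.nio.file.Files

import org.apache.spark.sql.SparkSession

// Hypothetical standalone driver; the test in the PR uses withTempPath instead.
object AvroWithoutHeader {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("avro-without-header") // assumed name
      .master("local[*]")             // assumed local run
      .getOrCreate()
    import spark.implicits._

    // Write into a fresh sub-path of a temporary directory.
    val dir = Files.createTempDirectory("avro-sketch").resolve("out").toString

    Seq(("a", 1, 2), ("b", 1, 2), ("c", 2, 1))
      .toDF("value", "p1", "p2")
      .write
      .format("avro") // no .option("header", true) here
      .save(dir)

    val df = spark.read.format("avro").load(dir)
    df.show()

    spark.stop()
  }
}
```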





[GitHub] [spark] gengliangwang commented on a change in pull request #27157: [SPARK-30475][SQL] File source V2: Push data filters for file listing

2020-01-17 Thread GitBox
gengliangwang commented on a change in pull request #27157: [SPARK-30475][SQL] 
File source V2: Push data filters for file listing
URL: https://github.com/apache/spark/pull/27157#discussion_r368189550
 
 

 ##
 File path: external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala
 ##
 @@ -1575,6 +1576,36 @@ class AvroV2Suite extends AvroSuite {
      }
    }
 
+  test("Avro source v2: support passing data filters to FileScan without partitionFilters") {
+    withTempPath { dir =>
+      Seq(("a", 1, 2), ("b", 1, 2), ("c", 2, 1))
+        .toDF("value", "p1", "p2")
+        .write
+        .format("avro")
+        .option("header", true)
+        .save(dir.getCanonicalPath)
+      val df = spark
+        .read
+        .format("avro")
+        .option("header", true)
 
 Review comment:
   Ditto. For Avro data source, `.option("header", true)` is not needed when reading either.





[GitHub] [spark] github-actions[bot] closed pull request #17234: [SPARK-19892][MLlib] Implement findAnalogies method for Word2VecModel

2020-01-17 Thread GitBox
github-actions[bot] closed pull request #17234: [SPARK-19892][MLlib] Implement 
findAnalogies method for Word2VecModel
URL: https://github.com/apache/spark/pull/17234
 
 
   





[GitHub] [spark] github-actions[bot] closed pull request #18193: [SPARK-15616] [SQL] CatalogRelation should fallback to HDFS size of partitions that are involved in Query for JoinSelection.

2020-01-17 Thread GitBox
github-actions[bot] closed pull request #18193: [SPARK-15616] [SQL] 
CatalogRelation should fallback to HDFS size of partitions that are involved in 
Query for JoinSelection.
URL: https://github.com/apache/spark/pull/18193
 
 
   





[GitHub] [spark] AmplabJenkins removed a comment on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #26978: [SPARK-29721][SQL] Prune 
unnecessary nested fields from Generate without Project
URL: https://github.com/apache/spark/pull/26978#issuecomment-575841295
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] github-actions[bot] closed pull request #17365: [SPARK-19962] [MLlib] add DictVectorizer to ml.feature

2020-01-17 Thread GitBox
github-actions[bot] closed pull request #17365: [SPARK-19962] [MLlib]  add 
DictVectorizer to ml.feature
URL: https://github.com/apache/spark/pull/17365
 
 
   





[GitHub] [spark] github-actions[bot] commented on issue #20935: [SPARK-23819][SQL] Fix InMemoryTableScanExec complex type pruning

2020-01-17 Thread GitBox
github-actions[bot] commented on issue #20935: [SPARK-23819][SQL] Fix 
InMemoryTableScanExec complex type pruning
URL: https://github.com/apache/spark/pull/20935#issuecomment-575841406
 
 
   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!





[GitHub] [spark] github-actions[bot] closed pull request #21006: [SPARK-22256][MESOS] - Introduce spark.mesos.driver.memoryOverhead

2020-01-17 Thread GitBox
github-actions[bot] closed pull request #21006: [SPARK-22256][MESOS] - 
Introduce spark.mesos.driver.memoryOverhead
URL: https://github.com/apache/spark/pull/21006
 
 
   





[GitHub] [spark] github-actions[bot] commented on issue #14431: [SPARK-16258][SparkR] Automatically append the grouping keys in SparkR's gapply

2020-01-17 Thread GitBox
github-actions[bot] commented on issue #14431: [SPARK-16258][SparkR] 
Automatically append the grouping keys in SparkR's gapply
URL: https://github.com/apache/spark/pull/14431#issuecomment-575841442
 
 
   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!





[GitHub] [spark] github-actions[bot] commented on issue #14936: [SPARK-7877][MESOS] Allow configuration of framework timeout

2020-01-17 Thread GitBox
github-actions[bot] commented on issue #14936: [SPARK-7877][MESOS] Allow 
configuration of framework timeout
URL: https://github.com/apache/spark/pull/14936#issuecomment-575841434
 
 
   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!





[GitHub] [spark] github-actions[bot] closed pull request #15326: [SPARK-17759] [CORE] Avoid adding duplicate schedulables

2020-01-17 Thread GitBox
github-actions[bot] closed pull request #15326: [SPARK-17759] [CORE] Avoid 
adding duplicate schedulables
URL: https://github.com/apache/spark/pull/15326
 
 
   





[GitHub] [spark] github-actions[bot] commented on issue #21164: [SPARK-24098][SQL] ScriptTransformationExec should wait process exiting before output iterator finish

2020-01-17 Thread GitBox
github-actions[bot] commented on issue #21164: [SPARK-24098][SQL] 
ScriptTransformationExec should wait process exiting before output iterator 
finish
URL: https://github.com/apache/spark/pull/21164#issuecomment-575841397
 
 
   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!





[GitHub] [spark] AmplabJenkins removed a comment on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project

2020-01-17 Thread GitBox
AmplabJenkins removed a comment on issue #26978: [SPARK-29721][SQL] Prune 
unnecessary nested fields from Generate without Project
URL: https://github.com/apache/spark/pull/26978#issuecomment-575841301
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21737/
   Test PASSed.





[GitHub] [spark] github-actions[bot] commented on issue #15496: [SPARK-17950] [Python] Match SparseVector behavior with DenseVector

2020-01-17 Thread GitBox
github-actions[bot] commented on issue #15496: [SPARK-17950] [Python] Match 
SparseVector behavior with DenseVector
URL: https://github.com/apache/spark/pull/15496#issuecomment-575841424
 
 
   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!





[GitHub] [spark] github-actions[bot] commented on issue #13650: [SPARK-9623] [ML] Provide conditional variance for RandomForestRegressor

2020-01-17 Thread GitBox
github-actions[bot] commented on issue #13650: [SPARK-9623] [ML] Provide 
conditional variance for RandomForestRegressor
URL: https://github.com/apache/spark/pull/13650#issuecomment-575841447
 
 
   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!





[GitHub] [spark] github-actions[bot] commented on issue #13379: [SPARK-12431][GraphX] Add local checkpointing to GraphX.

2020-01-17 Thread GitBox
github-actions[bot] commented on issue #13379: [SPARK-12431][GraphX] Add local 
checkpointing to GraphX.
URL: https://github.com/apache/spark/pull/13379#issuecomment-575841455
 
 
   We're closing this PR because it hasn't been updated in a while. This isn't 
a judgement on the merit of the PR in any way. It's just a way of keeping the 
PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to 
remove the Stale tag!





[GitHub] [spark] AmplabJenkins commented on issue #26978: [SPARK-29721][SQL] Prune unnecessary nested fields from Generate without Project

2020-01-17 Thread GitBox
AmplabJenkins commented on issue #26978: [SPARK-29721][SQL] Prune unnecessary 
nested fields from Generate without Project
URL: https://github.com/apache/spark/pull/26978#issuecomment-575841301
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/21737/
   Test PASSed.




