date:20181211

[GitHub] AmplabJenkins commented on issue #23266: [SPARK-26313][SQL] move read related methods from Table to read related mix-in traits

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23266: [SPARK-26313][SQL] move read related 
methods from Table to read related mix-in traits
URL: https://github.com/apache/spark/pull/23266#issuecomment-446490616
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6003/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23266: [SPARK-26313][SQL] move read related methods from Table to read related mix-in traits

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23266: [SPARK-26313][SQL] move read 
related methods from Table to read related mix-in traits
URL: https://github.com/apache/spark/pull/23266#issuecomment-446490616
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6003/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23266: [SPARK-26313][SQL] move read related methods from Table to read related mix-in traits

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23266: [SPARK-26313][SQL] move read related 
methods from Table to read related mix-in traits
URL: https://github.com/apache/spark/pull/23266#issuecomment-446490610
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23266: [SPARK-26313][SQL] move read related methods from Table to read related mix-in traits

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23266: [SPARK-26313][SQL] move read 
related methods from Table to read related mix-in traits
URL: https://github.com/apache/spark/pull/23266#issuecomment-446490610
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA commented on issue #23266: [SPARK-26313][SQL] move read related methods from Table to read related mix-in traits

2018-12-11 Thread GitBox

SparkQA commented on issue #23266: [SPARK-26313][SQL] move read related methods 
from Table to read related mix-in traits
URL: https://github.com/apache/spark/pull/23266#issuecomment-446490498
 
 
   **[Test build #19 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19/testReport)**
 for PR 23266 at commit 
[`d9ae0fe`](https://github.com/apache/spark/commit/d9ae0fe1c61d4b21205152a2b6ea31ef23baf3b8).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] sadhen commented on issue #23276: [SPARK-26321][SQL] Split a SQL in correct way

2018-12-11 Thread GitBox

sadhen commented on issue #23276: [SPARK-26321][SQL] Split a SQL in correct way
URL: https://github.com/apache/spark/pull/23276#issuecomment-446490065
 
 
   OK, I will polish it later.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensitivity in ML

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23144: [SPARK-26172][ML][WIP] Unify 
String Params' case-insensitivity in ML
URL: https://github.com/apache/spark/pull/23144#issuecomment-446488548
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/17/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensitivity in ML

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23144: [SPARK-26172][ML][WIP] Unify 
String Params' case-insensitivity in ML
URL: https://github.com/apache/spark/pull/23144#issuecomment-446488546
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensitivity in ML

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23144: [SPARK-26172][ML][WIP] Unify String 
Params' case-insensitivity in ML
URL: https://github.com/apache/spark/pull/23144#issuecomment-446488546
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA removed a comment on issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensitivity in ML

2018-12-11 Thread GitBox

SparkQA removed a comment on issue #23144: [SPARK-26172][ML][WIP] Unify String 
Params' case-insensitivity in ML
URL: https://github.com/apache/spark/pull/23144#issuecomment-446475933
 
 
   **[Test build #17 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17/testReport)**
 for PR 23144 at commit 
[`df5559e`](https://github.com/apache/spark/commit/df5559e1b5248844c98be487f3f051d9df6808b7).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensitivity in ML

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23144: [SPARK-26172][ML][WIP] Unify String 
Params' case-insensitivity in ML
URL: https://github.com/apache/spark/pull/23144#issuecomment-446488548
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/17/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA commented on issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensitivity in ML

2018-12-11 Thread GitBox

SparkQA commented on issue #23144: [SPARK-26172][ML][WIP] Unify String Params' 
case-insensitivity in ML
URL: https://github.com/apache/spark/pull/23144#issuecomment-446488365
 
 
   **[Test build #17 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17/testReport)**
 for PR 23144 at commit 
[`df5559e`](https://github.com/apache/spark/commit/df5559e1b5248844c98be487f3f051d9df6808b7).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] 
Keep track of nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446486859
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/11/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] 
Keep track of nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446486855
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep 
track of nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446486855
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep 
track of nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446486859
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/11/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

SparkQA removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep 
track of nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446442398
 
 
   **[Test build #11 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/11/testReport)**
 for PR 19045 at commit 
[`9d4fc23`](https://github.com/apache/spark/commit/9d4fc238d2d4e0984d4880d42c4cf6a1f1a52f8b).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

SparkQA commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of 
nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446486573
 
 
   **[Test build #11 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/11/testReport)**
 for PR 19045 at commit 
[`9d4fc23`](https://github.com/apache/spark/commit/9d4fc238d2d4e0984d4880d42c4cf6a1f1a52f8b).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] kiszk commented on issue #23294: [SPARK-26265][Core][Followup] Put freePage into a finally block

2018-12-11 Thread GitBox

kiszk commented on issue #23294: [SPARK-26265][Core][Followup] Put freePage 
into a finally block
URL: https://github.com/apache/spark/pull/23294#issuecomment-446485354
 
 
   LGTM, thanks


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] felixcheung commented on issue #23263: [SPARK-23674][ML] Adds Spark ML Events

2018-12-11 Thread GitBox

felixcheung commented on issue #23263: [SPARK-23674][ML] Adds Spark ML Events
URL: https://github.com/apache/spark/pull/23263#issuecomment-446483977
 
 
   my 2c - it does seem useful, though sounds like mostly for Atlas for now.
   if you can think of a less intrusive way to do this it might be easier to 
review...


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] felixcheung commented on issue #23263: [SPARK-23674][ML] Adds Spark ML Events

2018-12-11 Thread GitBox

felixcheung commented on issue #23263: [SPARK-23674][ML] Adds Spark ML Events
URL: https://github.com/apache/spark/pull/23263#issuecomment-446484055
 
 
   (oops sorry - too easy to close)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] HyukjinKwon opened a new pull request #23263: [SPARK-23674][ML] Adds Spark ML Events

2018-12-11 Thread GitBox

HyukjinKwon opened a new pull request #23263: [SPARK-23674][ML] Adds Spark ML 
Events
URL: https://github.com/apache/spark/pull/23263
 
 
   ## What changes were proposed in this pull request?
   
   This PR proposes to add ML events so that other developers can track and add 
some actions for them.
   
   ## Introduction
   
   ML events (like SQL events) can be quite useful when people want to track 
and make some actions for corresponding ML operations. For instance, I have 
been working on integrating 
   Apache Spark with [Apache Atlas](https://atlas.apache.org/QuickStart.html). 
With some custom changes with this PR, I can visualise ML pipeline as below:
   
   
![spark_ml_streaming_lineage](https://user-images.githubusercontent.com/6477701/49682779-394bca80-faf5-11e8-85b8-5fae28b784b3.png)
   
   Another good thing that might have to be considered is, that we can interact 
this with other SQL/Streaming events. For instance, where the input `Dataset` 
is originated. For instance, with current Apache Spark, I can visualise SQL 
operations as below:
   
   ![screen shot 2018-12-10 at 9 41 36 
am](https://user-images.githubusercontent.com/6477701/49706269-d9bdfe00-fc5f-11e8-943a-3309d1856ba5.png)
   
   I think we can combine those existing lineages together to easily understand 
where the data comes and goes. Currently, ML side is a hole so the lineages 
can't be connected for the current Apache Spark ..
   
   To add up, I think it's not to mention how useful it is to track the 
SQL/Streaming operations. Likewise, I would like to propose ML events as well 
(as lowest stability `@Unstable` APIs for now - no guarantee about stability).
   
   ## Implementation Details
   
   ### Sends event (but not expose ML specific listener)
   
   **`mllib/src/main/scala/org/apache/spark/ml/events.scala`**
   
   ```scala
   @Unstable
   case class ...StartEvent(caller, input)
   @Unstable
   case class ...EndEvent(caller, output)
   
   object MLEvents {
 // Wrappers to send events:
 // def with...Event(body) = {
 //   body()
 //   SparkContext.getOrCreate().listenerBus.post(event)
 // }
   }
   ```
   
   This way mimics both:
   
   **1. Catalog events (see 
`org/apache/spark/sql/catalyst/catalog/events.scala`)**
   
   - This allows a Catalog specific listener to be added 
`ExternalCatalogEventListener` 
   
   - It's implemented in a way of wrapping whole `ExternalCatalog` named 
`ExternalCatalogWithListener`
   which delegates the operations to `ExternalCatalog`
   
   This is not quite possible in this case because most of instances (like 
`Pipeline`) will be directly created in most of cases. We might be able to do 
that via extending `ListenerBus` for all possible instances but IMHO it's too 
invasive. Also, exposing another ML specific listener sounds a bit too much at 
this stage. Therefore, I simply borrowed file name and structures here
   
   **2. SQL execution events (see 
`org/apache/spark/sql/execution/SQLExecution.scala`)**
   
   - Add an object that wraps a body to send events
   
   Current apporach is rather close to this. It has a `with...` wrapper to send 
events. I borrowed this approach to be consistent.
   
   
   ### Add `...Impl` methods to wrap each to send events
   
   **`mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala`**
   
   ```diff
   - def save(...) = { saveImpl(...) }
   + def save(...) = MLEvents.withSaveInstanceEvent { saveImpl(...) }
 def saveImpl(...): Unit = ...
   ```
   
 Note that `saveImpl` was already implemented unlike other instances below.
   
   
   ```diff
   - def load(...): T
   + def load(...): T = MLEvents.withLoadInstanceEvent { loadImple(...) }
   + def loadImpl(...): T
   ```
   
   **`mllib/src/main/scala/org/apache/spark/ml/Estimator.scala`**
   
   ```diff
   - def fit(...): Model
   + def fit(...): Model = MLEvents.withFitEvent { fitImpl(...) }
   + def fitImpl(...): Model
   ```
   
   **`mllib/src/main/scala/org/apache/spark/ml/Transformer.scala`**
   
   ```diff
   - def transform(...): DataFrame
   + def transform(...): DataFrame = MLEvents.withTransformEvent { 
transformImpl(...) }
   + def transformImpl(...): DataFrame
   ```
   
   This approach follows the existing way as below in ML:
   
   **1. `transform` and `transformImpl`**
   
   
https://github.com/apache/spark/blob/9b1f6c8bab5401258c653d4e2efb50e97c6d282f/mllib/src/main/scala/org/apache/spark/ml/Predictor.scala#L202-L213
   
   
https://github.com/apache/spark/blob/9b1f6c8bab5401258c653d4e2efb50e97c6d282f/mllib/src/main/scala/org/apache/spark/ml/regression/DecisionTreeRegressor.scala#L191-L196
   
   
https://github.com/apache/spark/blob/9b1f6c8bab5401258c653d4e2efb50e97c6d282f/mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala#L1037-L1042
   
   **2. `save` and `saveImpl`**
   
   
https://github.com/apache/spark/blob/9b1f6c8bab5401258c653d4e2efb50e97c6d282f/mllib/src/main/scala/org/apache/spark/ml/util/ReadW

[GitHub] felixcheung closed pull request #23263: [SPARK-23674][ML] Adds Spark ML Events

2018-12-11 Thread GitBox

felixcheung closed pull request #23263: [SPARK-23674][ML] Adds Spark ML Events
URL: https://github.com/apache/spark/pull/23263
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/mllib/src/main/scala/org/apache/spark/ml/Estimator.scala 
b/mllib/src/main/scala/org/apache/spark/ml/Estimator.scala
index 1247882d6c1bd..a3c4db06862f8 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/Estimator.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/Estimator.scala
@@ -65,7 +65,19 @@ abstract class Estimator[M <: Model[M]] extends 
PipelineStage {
* Fits a model to the input data.
*/
   @Since("2.0.0")
-  def fit(dataset: Dataset[_]): M
+  def fit(dataset: Dataset[_]): M = MLEvents.withFitEvent(this, dataset) {
+fitImpl(dataset)
+  }
+
+  /**
+   * `fit()` handles events and then calls this method. Subclasses should 
override this
+   * method to implement the actual fiting a model to the input data.
+   */
+  @Since("3.0.0")
+  protected def fitImpl(dataset: Dataset[_]): M = {
+// Keep this default body for backward compatibility.
+throw new UnsupportedOperationException("fitImpl is not implemented.")
+  }
 
   /**
* Fits multiple models to the input data with multiple sets of parameters.
diff --git a/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala 
b/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
index 103082b7b9766..1c781faff129e 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
@@ -132,7 +132,8 @@ class Pipeline @Since("1.4.0") (
* @return fitted pipeline
*/
   @Since("2.0.0")
-  override def fit(dataset: Dataset[_]): PipelineModel = {
+  override def fit(dataset: Dataset[_]): PipelineModel = super.fit(dataset)
+  override protected def fitImpl(dataset: Dataset[_]): PipelineModel = {
 transformSchema(dataset.schema, logging = true)
 val theStages = $(stages)
 // Search for the last estimator.
@@ -210,7 +211,7 @@ object Pipeline extends MLReadable[Pipeline] {
 /** Checked against metadata when loading model */
 private val className = classOf[Pipeline].getName
 
-override def load(path: String): Pipeline = {
+override protected def loadImpl(path: String): Pipeline = {
   val (uid: String, stages: Array[PipelineStage]) = 
SharedReadWrite.load(className, sc, path)
   new Pipeline(uid).setStages(stages)
 }
@@ -301,7 +302,8 @@ class PipelineModel private[ml] (
   }
 
   @Since("2.0.0")
-  override def transform(dataset: Dataset[_]): DataFrame = {
+  override def transform(dataset: Dataset[_]): DataFrame = 
super.transform(dataset)
+  override protected def transformImpl(dataset: Dataset[_]): DataFrame = {
 transformSchema(dataset.schema, logging = true)
 stages.foldLeft(dataset.toDF)((cur, transformer) => 
transformer.transform(cur))
   }
@@ -344,7 +346,7 @@ object PipelineModel extends MLReadable[PipelineModel] {
 /** Checked against metadata when loading model */
 private val className = classOf[PipelineModel].getName
 
-override def load(path: String): PipelineModel = {
+override protected def loadImpl(path: String): PipelineModel = {
   val (uid: String, stages: Array[PipelineStage]) = 
SharedReadWrite.load(className, sc, path)
   val transformers = stages map {
 case stage: Transformer => stage
diff --git a/mllib/src/main/scala/org/apache/spark/ml/Predictor.scala 
b/mllib/src/main/scala/org/apache/spark/ml/Predictor.scala
index d8f3dfa874439..3731ddae0160c 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/Predictor.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/Predictor.scala
@@ -94,7 +94,10 @@ abstract class Predictor[
   /** @group setParam */
   def setPredictionCol(value: String): Learner = set(predictionCol, 
value).asInstanceOf[Learner]
 
-  override def fit(dataset: Dataset[_]): M = {
+  // Explictly call parent's load. Otherwise, MiMa complains.
+  override def fit(dataset: Dataset[_]): M = super.fit(dataset)
+
+  override protected def fitImpl(dataset: Dataset[_]): M = {
 // This handles a few items such as schema validation.
 // Developers only need to implement train().
 transformSchema(dataset.schema, logging = true)
@@ -199,7 +202,8 @@ abstract class PredictionModel[FeaturesType, M <: 
PredictionModel[FeaturesType,
* @param dataset input dataset
* @return transformed dataset with [[predictionCol]] of type `Double`
*/
-  override def transform(dataset: Dataset[_]): DataFrame = {
+  override def transform(
+  dataset: Dataset[_]): DataFrame = MLEvents.withTransformEvent(this, 
dataset) {
 transformSchema(dataset.schema, logging = true)
 if ($(predictionCol).nonEmpty) {
   transformImpl(dataset)
@@ -210,7 +21

[GitHub] felixcheung commented on a change in pull request #23292: [SPARK-19827][R][FOLLOWUP] spark.ml R API for PIC

2018-12-11 Thread GitBox

felixcheung commented on a change in pull request #23292: 
[SPARK-19827][R][FOLLOWUP] spark.ml R API for PIC
URL: https://github.com/apache/spark/pull/23292#discussion_r240898609
 
 

 ##
 File path: R/pkg/R/mllib_fpm.R
 ##
 @@ -183,8 +183,8 @@ setMethod("write.ml", signature(object = "FPGrowthModel", 
path = "character"),
 #' @return A complete set of frequent sequential patterns in the input 
sequences of itemsets.
 #' The returned \code{SparkDataFrame} contains columns of sequence and 
corresponding
 #' frequency. The schema of it will be:
-#' \code{sequence: ArrayType(ArrayType(T))} (T is the item type)
-#' \code{freq: Long}
+#' \code{sequence: ArrayType(ArrayType(T))}, \code{freq: integer}
+#' where T is the item type
 
 Review comment:
   let's fix both


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23266: [SPARK-26313][SQL] move read related methods from Table to read related mix-in traits

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23266: [SPARK-26313][SQL] move read 
related methods from Table to read related mix-in traits
URL: https://github.com/apache/spark/pull/23266#issuecomment-446481380
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/18/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA commented on issue #23266: [SPARK-26313][SQL] move read related methods from Table to read related mix-in traits

2018-12-11 Thread GitBox

SparkQA commented on issue #23266: [SPARK-26313][SQL] move read related methods 
from Table to read related mix-in traits
URL: https://github.com/apache/spark/pull/23266#issuecomment-446481366
 
 
   **[Test build #18 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18/testReport)**
 for PR 23266 at commit 
[`4a982b6`](https://github.com/apache/spark/commit/4a982b66fb5cdb01a238fce79dd7f3e4d08076d6).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23266: [SPARK-26313][SQL] move read related methods from Table to read related mix-in traits

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23266: [SPARK-26313][SQL] move read related 
methods from Table to read related mix-in traits
URL: https://github.com/apache/spark/pull/23266#issuecomment-446481376
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23266: [SPARK-26313][SQL] move read related methods from Table to read related mix-in traits

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23266: [SPARK-26313][SQL] move read 
related methods from Table to read related mix-in traits
URL: https://github.com/apache/spark/pull/23266#issuecomment-446481376
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA removed a comment on issue #23266: [SPARK-26313][SQL] move read related methods from Table to read related mix-in traits

2018-12-11 Thread GitBox

SparkQA removed a comment on issue #23266: [SPARK-26313][SQL] move read related 
methods from Table to read related mix-in traits
URL: https://github.com/apache/spark/pull/23266#issuecomment-446480621
 
 
   **[Test build #18 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18/testReport)**
 for PR 23266 at commit 
[`4a982b6`](https://github.com/apache/spark/commit/4a982b66fb5cdb01a238fce79dd7f3e4d08076d6).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23266: [SPARK-26313][SQL] move read related methods from Table to read related mix-in traits

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23266: [SPARK-26313][SQL] move read related 
methods from Table to read related mix-in traits
URL: https://github.com/apache/spark/pull/23266#issuecomment-446481380
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/18/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23266: [SPARK-26313][SQL] move read related methods from Table to read related mix-in traits

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23266: [SPARK-26313][SQL] move read 
related methods from Table to read related mix-in traits
URL: https://github.com/apache/spark/pull/23266#issuecomment-446480632
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6002/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23266: [SPARK-26313][SQL] move read related methods from Table to read related mix-in traits

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23266: [SPARK-26313][SQL] move read related 
methods from Table to read related mix-in traits
URL: https://github.com/apache/spark/pull/23266#issuecomment-446480632
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6002/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23266: [SPARK-26313][SQL] move read related methods from Table to read related mix-in traits

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23266: [SPARK-26313][SQL] move read related 
methods from Table to read related mix-in traits
URL: https://github.com/apache/spark/pull/23266#issuecomment-446480626
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] cloud-fan commented on issue #23266: [SPARK-26313][SQL] move read related methods from Table to read related mix-in traits

2018-12-11 Thread GitBox

cloud-fan commented on issue #23266: [SPARK-26313][SQL] move read related 
methods from Table to read related mix-in traits
URL: https://github.com/apache/spark/pull/23266#issuecomment-446480647
 
 
   It's a little weird to have table without schema. I leave `schema` in 
`Table` with comment saying that, implementation can return empty schema if the 
table is not readable.
   
   For a sink that can accept data in any schema, data source API might not be 
a good option. `Dataset.foreach` could be better.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA commented on issue #23266: [SPARK-26313][SQL] move read related methods from Table to read related mix-in traits

2018-12-11 Thread GitBox

SparkQA commented on issue #23266: [SPARK-26313][SQL] move read related methods 
from Table to read related mix-in traits
URL: https://github.com/apache/spark/pull/23266#issuecomment-446480621
 
 
   **[Test build #18 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18/testReport)**
 for PR 23266 at commit 
[`4a982b6`](https://github.com/apache/spark/commit/4a982b66fb5cdb01a238fce79dd7f3e4d08076d6).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23266: [SPARK-26313][SQL] move read related methods from Table to read related mix-in traits

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23266: [SPARK-26313][SQL] move read 
related methods from Table to read related mix-in traits
URL: https://github.com/apache/spark/pull/23266#issuecomment-446480626
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] 
Keep track of nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446477915
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/10/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query

2018-12-11 Thread GitBox

HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] 
Provide option to clean up completed files in streaming query
URL: https://github.com/apache/spark/pull/22952#discussion_r240894917
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala
 ##
 @@ -1494,6 +1512,287 @@ class FileStreamSourceSuite extends 
FileStreamSourceTest {
   newSource.getBatch(None, FileStreamSourceOffset(1))
 }
   }
+
+  test("remove completed files when remove option is enabled") {
+def assertFileIsRemoved(files: Array[String], fileName: String): Unit = {
+  assert(!files.exists(_.startsWith(fileName)))
+}
+
+def assertFileIsNotRemoved(files: Array[String], fileName: String): Unit = 
{
+  assert(files.exists(_.startsWith(fileName)))
+}
+
+withTempDirs { case (src, tmp) =>
+  withSQLConf(
+SQLConf.FILE_SOURCE_LOG_COMPACT_INTERVAL.key -> "2",
+// Force deleting the old logs
+SQLConf.FILE_SOURCE_LOG_CLEANUP_DELAY.key -> "1"
+  ) {
+val option = Map("latestFirst" -> "false", "maxFilesPerTrigger" -> "1",
+  "cleanSource" -> "delete")
+
+val fileStream = createFileStream("text", src.getCanonicalPath, 
options = option)
+val filtered = fileStream.filter($"value" contains "keep")
+
+testStream(filtered)(
+  AddTextFileData("keep1", src, tmp, tmpFilePrefix = "keep1"),
+  CheckAnswer("keep1"),
+  AssertOnQuery("input file removed") { _: StreamExecution =>
+// it doesn't rename any file yet
+assertFileIsNotRemoved(src.list(), "keep1")
+true
+  },
+  AddTextFileData("keep2", src, tmp, tmpFilePrefix = "ke ep2 %"),
+  CheckAnswer("keep1", "keep2"),
+  AssertOnQuery("input file removed") { _: StreamExecution =>
+val files = src.list()
+
+// it renames input file for first batch, but not for second batch 
yet
+assertFileIsRemoved(files, "keep1")
+assertFileIsNotRemoved(files, "ke ep2 %")
+
+true
+  },
+  AddTextFileData("keep3", src, tmp, tmpFilePrefix = "keep3"),
+  CheckAnswer("keep1", "keep2", "keep3"),
+  AssertOnQuery("input file renamed") { _: StreamExecution =>
+val files = src.list()
+
+// it renames input file for second batch, but not third batch yet
+assertFileIsRemoved(files, "ke ep2 %")
+assertFileIsNotRemoved(files, "keep3")
+
+true
+  }
+)
+  }
+}
+  }
+
+  test("move completed files to archive directory when archive option is 
enabled") {
+
+withThreeTempDirs { case (src, tmp, archiveDir) =>
+  withSQLConf(
+SQLConf.FILE_SOURCE_LOG_COMPACT_INTERVAL.key -> "2",
+// Force deleting the old logs
+SQLConf.FILE_SOURCE_LOG_CLEANUP_DELAY.key -> "1"
+  ) {
+val option = Map("latestFirst" -> "false", "maxFilesPerTrigger" -> "1",
+  "cleanSource" -> "archive", "sourceArchiveDir" -> 
archiveDir.getAbsolutePath)
+
+val fileStream = createFileStream("text", 
s"${src.getCanonicalPath}/*/*",
+  options = option)
+val filtered = fileStream.filter($"value" contains "keep")
+
+// src/k %1
+// file: src/k1 %1/keep1
+val dirForKeep1 = new File(src, "k %1")
+// src/k1/k 2
+// file: src/k1/k 2/keep2
+val dirForKeep2 = new File(dirForKeep1, "k 2")
+// src/k3
+// file: src/k3/keep3
+val dirForKeep3 = new File(src, "k3")
+
+val expectedMovedDir1 = new File(archiveDir.getAbsolutePath + 
dirForKeep1.toURI.getPath)
+val expectedMovedDir2 = new File(archiveDir.getAbsolutePath + 
dirForKeep2.toURI.getPath)
+val expectedMovedDir3 = new File(archiveDir.getAbsolutePath + 
dirForKeep3.toURI.getPath)
+
+testStream(filtered)(
+  AddTextFileData("keep1", dirForKeep1, tmp, tmpFilePrefix = "keep1"),
+  CheckAnswer("keep1"),
+  AssertOnQuery("input file archived") { _: StreamExecution =>
+// it doesn't rename any file yet
+assertFileIsNotMoved(dirForKeep1, expectedMovedDir1, "keep1")
+true
+  },
+  AddTextFileData("keep2", dirForKeep2, tmp, tmpFilePrefix = "keep2 
%"),
+  CheckAnswer("keep1", "keep2"),
+  AssertOnQuery("input file archived") { _: StreamExecution =>
+// it renames input file for first batch, but not for second batch 
yet
+assertFileIsMoved(dirForKeep1, expectedMovedDir1, "keep1")
+assertFileIsNotMoved(dirForKeep2, expectedMovedDir2, "keep2 %")
+true
+  },
+  AddTextFileData("keep3", dirForKeep3, tmp, tmpFilePrefix = "keep3"),
+  CheckAnswer("keep1", "keep2", "keep3"),
+  AssertOnQuery("input file archived") { _: StreamExecution =>
+// it renam

[GitHub] AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] 
Keep track of nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446477908
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query

2018-12-11 Thread GitBox

HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] 
Provide option to clean up completed files in streaming query
URL: https://github.com/apache/spark/pull/22952#discussion_r240894769
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala
 ##
 @@ -1494,6 +1512,287 @@ class FileStreamSourceSuite extends 
FileStreamSourceTest {
   newSource.getBatch(None, FileStreamSourceOffset(1))
 }
   }
+
+  test("remove completed files when remove option is enabled") {
+def assertFileIsRemoved(files: Array[String], fileName: String): Unit = {
+  assert(!files.exists(_.startsWith(fileName)))
+}
+
+def assertFileIsNotRemoved(files: Array[String], fileName: String): Unit = 
{
+  assert(files.exists(_.startsWith(fileName)))
+}
+
+withTempDirs { case (src, tmp) =>
+  withSQLConf(
+SQLConf.FILE_SOURCE_LOG_COMPACT_INTERVAL.key -> "2",
+// Force deleting the old logs
+SQLConf.FILE_SOURCE_LOG_CLEANUP_DELAY.key -> "1"
+  ) {
+val option = Map("latestFirst" -> "false", "maxFilesPerTrigger" -> "1",
+  "cleanSource" -> "delete")
+
+val fileStream = createFileStream("text", src.getCanonicalPath, 
options = option)
+val filtered = fileStream.filter($"value" contains "keep")
+
+testStream(filtered)(
+  AddTextFileData("keep1", src, tmp, tmpFilePrefix = "keep1"),
+  CheckAnswer("keep1"),
+  AssertOnQuery("input file removed") { _: StreamExecution =>
+// it doesn't rename any file yet
+assertFileIsNotRemoved(src.list(), "keep1")
+true
+  },
+  AddTextFileData("keep2", src, tmp, tmpFilePrefix = "ke ep2 %"),
+  CheckAnswer("keep1", "keep2"),
+  AssertOnQuery("input file removed") { _: StreamExecution =>
+val files = src.list()
+
+// it renames input file for first batch, but not for second batch 
yet
+assertFileIsRemoved(files, "keep1")
+assertFileIsNotRemoved(files, "ke ep2 %")
+
+true
+  },
+  AddTextFileData("keep3", src, tmp, tmpFilePrefix = "keep3"),
+  CheckAnswer("keep1", "keep2", "keep3"),
+  AssertOnQuery("input file renamed") { _: StreamExecution =>
+val files = src.list()
+
+// it renames input file for second batch, but not third batch yet
+assertFileIsRemoved(files, "ke ep2 %")
+assertFileIsNotRemoved(files, "keep3")
+
+true
+  }
+)
+  }
+}
+  }
+
+  test("move completed files to archive directory when archive option is 
enabled") {
+
+withThreeTempDirs { case (src, tmp, archiveDir) =>
+  withSQLConf(
+SQLConf.FILE_SOURCE_LOG_COMPACT_INTERVAL.key -> "2",
+// Force deleting the old logs
+SQLConf.FILE_SOURCE_LOG_CLEANUP_DELAY.key -> "1"
+  ) {
+val option = Map("latestFirst" -> "false", "maxFilesPerTrigger" -> "1",
+  "cleanSource" -> "archive", "sourceArchiveDir" -> 
archiveDir.getAbsolutePath)
+
+val fileStream = createFileStream("text", 
s"${src.getCanonicalPath}/*/*",
+  options = option)
+val filtered = fileStream.filter($"value" contains "keep")
+
+// src/k %1
+// file: src/k1 %1/keep1
+val dirForKeep1 = new File(src, "k %1")
+// src/k1/k 2
+// file: src/k1/k 2/keep2
+val dirForKeep2 = new File(dirForKeep1, "k 2")
+// src/k3
+// file: src/k3/keep3
+val dirForKeep3 = new File(src, "k3")
+
+val expectedMovedDir1 = new File(archiveDir.getAbsolutePath + 
dirForKeep1.toURI.getPath)
+val expectedMovedDir2 = new File(archiveDir.getAbsolutePath + 
dirForKeep2.toURI.getPath)
+val expectedMovedDir3 = new File(archiveDir.getAbsolutePath + 
dirForKeep3.toURI.getPath)
+
+testStream(filtered)(
+  AddTextFileData("keep1", dirForKeep1, tmp, tmpFilePrefix = "keep1"),
+  CheckAnswer("keep1"),
+  AssertOnQuery("input file archived") { _: StreamExecution =>
+// it doesn't rename any file yet
+assertFileIsNotMoved(dirForKeep1, expectedMovedDir1, "keep1")
+true
+  },
+  AddTextFileData("keep2", dirForKeep2, tmp, tmpFilePrefix = "keep2 
%"),
+  CheckAnswer("keep1", "keep2"),
+  AssertOnQuery("input file archived") { _: StreamExecution =>
+// it renames input file for first batch, but not for second batch 
yet
+assertFileIsMoved(dirForKeep1, expectedMovedDir1, "keep1")
+assertFileIsNotMoved(dirForKeep2, expectedMovedDir2, "keep2 %")
+true
+  },
+  AddTextFileData("keep3", dirForKeep3, tmp, tmpFilePrefix = "keep3"),
+  CheckAnswer("keep1", "keep2", "keep3"),
+  AssertOnQuery("input file archived") { _: StreamExecution =>
+// it renam

[GitHub] AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep 
track of nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446477915
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/10/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep 
track of nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446477908
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query

2018-12-11 Thread GitBox

HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] 
Provide option to clean up completed files in streaming query
URL: https://github.com/apache/spark/pull/22952#discussion_r240894769
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala
 ##
 @@ -1494,6 +1512,287 @@ class FileStreamSourceSuite extends 
FileStreamSourceTest {
   newSource.getBatch(None, FileStreamSourceOffset(1))
 }
   }
+
+  test("remove completed files when remove option is enabled") {
+def assertFileIsRemoved(files: Array[String], fileName: String): Unit = {
+  assert(!files.exists(_.startsWith(fileName)))
+}
+
+def assertFileIsNotRemoved(files: Array[String], fileName: String): Unit = 
{
+  assert(files.exists(_.startsWith(fileName)))
+}
+
+withTempDirs { case (src, tmp) =>
+  withSQLConf(
+SQLConf.FILE_SOURCE_LOG_COMPACT_INTERVAL.key -> "2",
+// Force deleting the old logs
+SQLConf.FILE_SOURCE_LOG_CLEANUP_DELAY.key -> "1"
+  ) {
+val option = Map("latestFirst" -> "false", "maxFilesPerTrigger" -> "1",
+  "cleanSource" -> "delete")
+
+val fileStream = createFileStream("text", src.getCanonicalPath, 
options = option)
+val filtered = fileStream.filter($"value" contains "keep")
+
+testStream(filtered)(
+  AddTextFileData("keep1", src, tmp, tmpFilePrefix = "keep1"),
+  CheckAnswer("keep1"),
+  AssertOnQuery("input file removed") { _: StreamExecution =>
+// it doesn't rename any file yet
+assertFileIsNotRemoved(src.list(), "keep1")
+true
+  },
+  AddTextFileData("keep2", src, tmp, tmpFilePrefix = "ke ep2 %"),
+  CheckAnswer("keep1", "keep2"),
+  AssertOnQuery("input file removed") { _: StreamExecution =>
+val files = src.list()
+
+// it renames input file for first batch, but not for second batch 
yet
+assertFileIsRemoved(files, "keep1")
+assertFileIsNotRemoved(files, "ke ep2 %")
+
+true
+  },
+  AddTextFileData("keep3", src, tmp, tmpFilePrefix = "keep3"),
+  CheckAnswer("keep1", "keep2", "keep3"),
+  AssertOnQuery("input file renamed") { _: StreamExecution =>
+val files = src.list()
+
+// it renames input file for second batch, but not third batch yet
+assertFileIsRemoved(files, "ke ep2 %")
+assertFileIsNotRemoved(files, "keep3")
+
+true
+  }
+)
+  }
+}
+  }
+
+  test("move completed files to archive directory when archive option is 
enabled") {
+
+withThreeTempDirs { case (src, tmp, archiveDir) =>
+  withSQLConf(
+SQLConf.FILE_SOURCE_LOG_COMPACT_INTERVAL.key -> "2",
+// Force deleting the old logs
+SQLConf.FILE_SOURCE_LOG_CLEANUP_DELAY.key -> "1"
+  ) {
+val option = Map("latestFirst" -> "false", "maxFilesPerTrigger" -> "1",
+  "cleanSource" -> "archive", "sourceArchiveDir" -> 
archiveDir.getAbsolutePath)
+
+val fileStream = createFileStream("text", 
s"${src.getCanonicalPath}/*/*",
+  options = option)
+val filtered = fileStream.filter($"value" contains "keep")
+
+// src/k %1
+// file: src/k1 %1/keep1
+val dirForKeep1 = new File(src, "k %1")
+// src/k1/k 2
+// file: src/k1/k 2/keep2
+val dirForKeep2 = new File(dirForKeep1, "k 2")
+// src/k3
+// file: src/k3/keep3
+val dirForKeep3 = new File(src, "k3")
+
+val expectedMovedDir1 = new File(archiveDir.getAbsolutePath + 
dirForKeep1.toURI.getPath)
+val expectedMovedDir2 = new File(archiveDir.getAbsolutePath + 
dirForKeep2.toURI.getPath)
+val expectedMovedDir3 = new File(archiveDir.getAbsolutePath + 
dirForKeep3.toURI.getPath)
+
+testStream(filtered)(
+  AddTextFileData("keep1", dirForKeep1, tmp, tmpFilePrefix = "keep1"),
+  CheckAnswer("keep1"),
+  AssertOnQuery("input file archived") { _: StreamExecution =>
+// it doesn't rename any file yet
+assertFileIsNotMoved(dirForKeep1, expectedMovedDir1, "keep1")
+true
+  },
+  AddTextFileData("keep2", dirForKeep2, tmp, tmpFilePrefix = "keep2 
%"),
+  CheckAnswer("keep1", "keep2"),
+  AssertOnQuery("input file archived") { _: StreamExecution =>
+// it renames input file for first batch, but not for second batch 
yet
+assertFileIsMoved(dirForKeep1, expectedMovedDir1, "keep1")
+assertFileIsNotMoved(dirForKeep2, expectedMovedDir2, "keep2 %")
+true
+  },
+  AddTextFileData("keep3", dirForKeep3, tmp, tmpFilePrefix = "keep3"),
+  CheckAnswer("keep1", "keep2", "keep3"),
+  AssertOnQuery("input file archived") { _: StreamExecution =>
+// it renam

[GitHub] SparkQA removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

SparkQA removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep 
track of nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446435549
 
 
   **[Test build #10 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/10/testReport)**
 for PR 19045 at commit 
[`abb0609`](https://github.com/apache/spark/commit/abb060949797cec31ec02080c5720ccec9df3f5c).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

SparkQA commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of 
nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446477611
 
 
   **[Test build #10 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/10/testReport)**
 for PR 19045 at commit 
[`abb0609`](https://github.com/apache/spark/commit/abb060949797cec31ec02080c5720ccec9df3f5c).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query

2018-12-11 Thread GitBox

HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] 
Provide option to clean up completed files in streaming query
URL: https://github.com/apache/spark/pull/22952#discussion_r240894533
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala
 ##
 @@ -1494,6 +1512,287 @@ class FileStreamSourceSuite extends 
FileStreamSourceTest {
   newSource.getBatch(None, FileStreamSourceOffset(1))
 }
   }
+
+  test("remove completed files when remove option is enabled") {
+def assertFileIsRemoved(files: Array[String], fileName: String): Unit = {
+  assert(!files.exists(_.startsWith(fileName)))
+}
+
+def assertFileIsNotRemoved(files: Array[String], fileName: String): Unit = 
{
+  assert(files.exists(_.startsWith(fileName)))
+}
+
+withTempDirs { case (src, tmp) =>
+  withSQLConf(
+SQLConf.FILE_SOURCE_LOG_COMPACT_INTERVAL.key -> "2",
+// Force deleting the old logs
+SQLConf.FILE_SOURCE_LOG_CLEANUP_DELAY.key -> "1"
+  ) {
+val option = Map("latestFirst" -> "false", "maxFilesPerTrigger" -> "1",
+  "cleanSource" -> "delete")
+
+val fileStream = createFileStream("text", src.getCanonicalPath, 
options = option)
+val filtered = fileStream.filter($"value" contains "keep")
+
+testStream(filtered)(
+  AddTextFileData("keep1", src, tmp, tmpFilePrefix = "keep1"),
+  CheckAnswer("keep1"),
+  AssertOnQuery("input file removed") { _: StreamExecution =>
+// it doesn't rename any file yet
+assertFileIsNotRemoved(src.list(), "keep1")
+true
+  },
+  AddTextFileData("keep2", src, tmp, tmpFilePrefix = "ke ep2 %"),
+  CheckAnswer("keep1", "keep2"),
+  AssertOnQuery("input file removed") { _: StreamExecution =>
+val files = src.list()
+
+// it renames input file for first batch, but not for second batch 
yet
+assertFileIsRemoved(files, "keep1")
+assertFileIsNotRemoved(files, "ke ep2 %")
+
+true
+  },
+  AddTextFileData("keep3", src, tmp, tmpFilePrefix = "keep3"),
+  CheckAnswer("keep1", "keep2", "keep3"),
+  AssertOnQuery("input file renamed") { _: StreamExecution =>
+val files = src.list()
+
+// it renames input file for second batch, but not third batch yet
+assertFileIsRemoved(files, "ke ep2 %")
+assertFileIsNotRemoved(files, "keep3")
+
+true
+  }
+)
+  }
+}
+  }
+
+  test("move completed files to archive directory when archive option is 
enabled") {
+
+withThreeTempDirs { case (src, tmp, archiveDir) =>
+  withSQLConf(
+SQLConf.FILE_SOURCE_LOG_COMPACT_INTERVAL.key -> "2",
+// Force deleting the old logs
+SQLConf.FILE_SOURCE_LOG_CLEANUP_DELAY.key -> "1"
+  ) {
+val option = Map("latestFirst" -> "false", "maxFilesPerTrigger" -> "1",
+  "cleanSource" -> "archive", "sourceArchiveDir" -> 
archiveDir.getAbsolutePath)
+
+val fileStream = createFileStream("text", 
s"${src.getCanonicalPath}/*/*",
+  options = option)
+val filtered = fileStream.filter($"value" contains "keep")
+
+// src/k %1
+// file: src/k1 %1/keep1
+val dirForKeep1 = new File(src, "k %1")
+// src/k1/k 2
+// file: src/k1/k 2/keep2
+val dirForKeep2 = new File(dirForKeep1, "k 2")
+// src/k3
+// file: src/k3/keep3
+val dirForKeep3 = new File(src, "k3")
+
+val expectedMovedDir1 = new File(archiveDir.getAbsolutePath + 
dirForKeep1.toURI.getPath)
+val expectedMovedDir2 = new File(archiveDir.getAbsolutePath + 
dirForKeep2.toURI.getPath)
+val expectedMovedDir3 = new File(archiveDir.getAbsolutePath + 
dirForKeep3.toURI.getPath)
+
+testStream(filtered)(
+  AddTextFileData("keep1", dirForKeep1, tmp, tmpFilePrefix = "keep1"),
+  CheckAnswer("keep1"),
+  AssertOnQuery("input file archived") { _: StreamExecution =>
+// it doesn't rename any file yet
+assertFileIsNotMoved(dirForKeep1, expectedMovedDir1, "keep1")
+true
+  },
+  AddTextFileData("keep2", dirForKeep2, tmp, tmpFilePrefix = "keep2 
%"),
+  CheckAnswer("keep1", "keep2"),
+  AssertOnQuery("input file archived") { _: StreamExecution =>
+// it renames input file for first batch, but not for second batch 
yet
+assertFileIsMoved(dirForKeep1, expectedMovedDir1, "keep1")
+assertFileIsNotMoved(dirForKeep2, expectedMovedDir2, "keep2 %")
+true
+  },
+  AddTextFileData("keep3", dirForKeep3, tmp, tmpFilePrefix = "keep3"),
+  CheckAnswer("keep1", "keep2", "keep3"),
+  AssertOnQuery("input file archived") { _: StreamExecution =>
+// it renam

[GitHub] HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query

2018-12-11 Thread GitBox

HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] 
Provide option to clean up completed files in streaming query
URL: https://github.com/apache/spark/pull/22952#discussion_r240894261
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala
 ##
 @@ -329,4 +345,124 @@ object FileStreamSource {
 
 def size: Int = map.size()
   }
+
+  class FileStreamSourceCleaner(fileSystem: FileSystem, sourcePath: Path,
+baseArchivePathString: Option[String]) extends 
Logging {
+
+private val sourceGlobFilters: Seq[GlobFilter] = 
buildSourceGlobFilters(sourcePath)
+
+private val baseArchivePath: Option[Path] = baseArchivePathString.map(new 
Path(_))
+
+def archive(entry: FileEntry): Unit = {
+  require(baseArchivePath.isDefined)
+
+  val curPath = new Path(new URI(entry.path))
+  val curPathUri = curPath.toUri
+
+  val newPath = buildArchiveFilePath(curPathUri)
+
+  if (baseArchivePath.get.depth() <= 2) {
 
 Review comment:
   It's just a shortcut of condition: given that source pattern will read files 
which exactly match to the pattern, or their parents match the pattern. Once we 
are adding base archive path as prefix of final path, if the depth of base 
archive path is greater than 2, FileStreamSource will not read the archived 
file as source.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] gatorsmile commented on issue #23276: [SPARK-26321][SQL] Split a SQL in correct way

2018-12-11 Thread GitBox

gatorsmile commented on issue #23276: [SPARK-26321][SQL] Split a SQL in correct 
way
URL: https://github.com/apache/spark/pull/23276#issuecomment-446477152
 
 
   > The semicolon (;) terminates an SQL command. It cannot appear anywhere 
within a command, except within a string constant or quoted identifier.
   
   Thus, you need to consider both double quotes, single quotes. Also 
backstick, which is being used by Spark SQL for quoting identifiers. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensitivity in ML

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23144: [SPARK-26172][ML][WIP] Unify 
String Params' case-insensitivity in ML
URL: https://github.com/apache/spark/pull/23144#issuecomment-446476895
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6001/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensitivity in ML

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23144: [SPARK-26172][ML][WIP] Unify String 
Params' case-insensitivity in ML
URL: https://github.com/apache/spark/pull/23144#issuecomment-446476895
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6001/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensitivity in ML

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23144: [SPARK-26172][ML][WIP] Unify 
String Params' case-insensitivity in ML
URL: https://github.com/apache/spark/pull/23144#issuecomment-446476891
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensitivity in ML

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23144: [SPARK-26172][ML][WIP] Unify String 
Params' case-insensitivity in ML
URL: https://github.com/apache/spark/pull/23144#issuecomment-446476891
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] gatorsmile commented on issue #23276: [SPARK-26321][SQL] Split a SQL in correct way

2018-12-11 Thread GitBox

gatorsmile commented on issue #23276: [SPARK-26321][SQL] Split a SQL in correct 
way
URL: https://github.com/apache/spark/pull/23276#issuecomment-446476683
 
 
   Let us first update the PR description to explain the problem we want to 
resolve. Regarding comments, our SQL parser follows PostgreSQL. You can read 
SqlBase.g4 to confirm it. 
   
   ```
   SIMPLE_COMMENT
   : '--' ~[\r\n]* '\r'? '\n'? -> channel(HIDDEN)
   ;
   
   BRACKETED_EMPTY_COMMENT
   : '/**/' -> channel(HIDDEN)
   ;
   
   BRACKETED_COMMENT
   : '/*' ~[+] .*? '*/' -> channel(HIDDEN)
   ;
   ```
   
   In PostgreSQL doc: 
https://www.postgresql.org/docs/9.4/sql-syntax-lexical.html
   
   > A comment is a sequence of characters beginning with double dashes and 
extending to the end of the line, e.g.:
   ```
   -- This is a standard SQL comment
   ```
   > Alternatively, C-style block comments can be used:
   ```
   /* multiline comment
* with nesting: /* nested block comment */
*/
   ```
   > where the comment begins with /* and extends to the matching occurrence of 
*/. These block comments nest, as specified in the SQL standard but unlike C, 
so that one can comment out larger blocks of code that might contain existing 
block comments.
   
   > A comment is removed from the input stream before further syntax analysis 
and is effectively replaced by whitespace.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query

2018-12-11 Thread GitBox

HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] 
Provide option to clean up completed files in streaming query
URL: https://github.com/apache/spark/pull/22952#discussion_r240893378
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala
 ##
 @@ -329,4 +345,124 @@ object FileStreamSource {
 
 def size: Int = map.size()
   }
+
+  class FileStreamSourceCleaner(fileSystem: FileSystem, sourcePath: Path,
+baseArchivePathString: Option[String]) extends 
Logging {
+
+private val sourceGlobFilters: Seq[GlobFilter] = 
buildSourceGlobFilters(sourcePath)
+
+private val baseArchivePath: Option[Path] = baseArchivePathString.map(new 
Path(_))
+
+def archive(entry: FileEntry): Unit = {
+  require(baseArchivePath.isDefined)
+
+  val curPath = new Path(new URI(entry.path))
+  val curPathUri = curPath.toUri
+
+  val newPath = buildArchiveFilePath(curPathUri)
+
+  if (baseArchivePath.get.depth() <= 2) {
+if (isArchiveFileMatchedAgainstSourcePattern(newPath)) {
+  logWarning(s"Fail to move $curPath to $newPath - destination matches 
" +
+s"to source path/pattern. Skip moving file.")
+} else {
+  doArchive(curPath, newPath)
+}
+  } else {
+// there's no chance for archive file to be matched against source 
pattern
+doArchive(curPath, newPath)
+  }
+}
+
+def remove(entry: FileEntry): Unit = {
+  val curPath = new Path(new URI(entry.path))
+  try {
+logDebug(s"Removing completed file $curPath")
+fileSystem.delete(curPath, false)
 
 Review comment:
   Indeed. Thanks for pointing it out! Will address.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query

2018-12-11 Thread GitBox

HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] 
Provide option to clean up completed files in streaming query
URL: https://github.com/apache/spark/pull/22952#discussion_r240893237
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala
 ##
 @@ -329,4 +345,124 @@ object FileStreamSource {
 
 def size: Int = map.size()
   }
+
+  class FileStreamSourceCleaner(fileSystem: FileSystem, sourcePath: Path,
+baseArchivePathString: Option[String]) extends 
Logging {
 
 Review comment:
   Same here: I think it is allowed at most two lines, but no strong opinions 
regarding this. Will address.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query

2018-12-11 Thread GitBox

HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] 
Provide option to clean up completed files in streaming query
URL: https://github.com/apache/spark/pull/22952#discussion_r240893299
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala
 ##
 @@ -1494,6 +1512,287 @@ class FileStreamSourceSuite extends 
FileStreamSourceTest {
   newSource.getBatch(None, FileStreamSourceOffset(1))
 }
   }
+
+  test("remove completed files when remove option is enabled") {
+def assertFileIsRemoved(files: Array[String], fileName: String): Unit = {
+  assert(!files.exists(_.startsWith(fileName)))
+}
+
+def assertFileIsNotRemoved(files: Array[String], fileName: String): Unit = 
{
+  assert(files.exists(_.startsWith(fileName)))
+}
+
+withTempDirs { case (src, tmp) =>
+  withSQLConf(
+SQLConf.FILE_SOURCE_LOG_COMPACT_INTERVAL.key -> "2",
+// Force deleting the old logs
+SQLConf.FILE_SOURCE_LOG_CLEANUP_DELAY.key -> "1"
+  ) {
+val option = Map("latestFirst" -> "false", "maxFilesPerTrigger" -> "1",
+  "cleanSource" -> "delete")
+
+val fileStream = createFileStream("text", src.getCanonicalPath, 
options = option)
+val filtered = fileStream.filter($"value" contains "keep")
+
+testStream(filtered)(
+  AddTextFileData("keep1", src, tmp, tmpFilePrefix = "keep1"),
+  CheckAnswer("keep1"),
+  AssertOnQuery("input file removed") { _: StreamExecution =>
+// it doesn't rename any file yet
+assertFileIsNotRemoved(src.list(), "keep1")
+true
+  },
+  AddTextFileData("keep2", src, tmp, tmpFilePrefix = "ke ep2 %"),
+  CheckAnswer("keep1", "keep2"),
+  AssertOnQuery("input file removed") { _: StreamExecution =>
+val files = src.list()
+
+// it renames input file for first batch, but not for second batch 
yet
+assertFileIsRemoved(files, "keep1")
+assertFileIsNotRemoved(files, "ke ep2 %")
+
+true
+  },
+  AddTextFileData("keep3", src, tmp, tmpFilePrefix = "keep3"),
+  CheckAnswer("keep1", "keep2", "keep3"),
+  AssertOnQuery("input file renamed") { _: StreamExecution =>
+val files = src.list()
+
+// it renames input file for second batch, but not third batch yet
+assertFileIsRemoved(files, "ke ep2 %")
+assertFileIsNotRemoved(files, "keep3")
+
+true
+  }
+)
+  }
+}
+  }
+
+  test("move completed files to archive directory when archive option is 
enabled") {
+
+withThreeTempDirs { case (src, tmp, archiveDir) =>
+  withSQLConf(
+SQLConf.FILE_SOURCE_LOG_COMPACT_INTERVAL.key -> "2",
+// Force deleting the old logs
+SQLConf.FILE_SOURCE_LOG_CLEANUP_DELAY.key -> "1"
+  ) {
+val option = Map("latestFirst" -> "false", "maxFilesPerTrigger" -> "1",
+  "cleanSource" -> "archive", "sourceArchiveDir" -> 
archiveDir.getAbsolutePath)
+
+val fileStream = createFileStream("text", 
s"${src.getCanonicalPath}/*/*",
+  options = option)
+val filtered = fileStream.filter($"value" contains "keep")
+
+// src/k %1
+// file: src/k1 %1/keep1
+val dirForKeep1 = new File(src, "k %1")
+// src/k1/k 2
+// file: src/k1/k 2/keep2
 
 Review comment:
   Nice catch! Will address.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query

2018-12-11 Thread GitBox

HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] 
Provide option to clean up completed files in streaming query
URL: https://github.com/apache/spark/pull/22952#discussion_r240893237
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala
 ##
 @@ -329,4 +345,124 @@ object FileStreamSource {
 
 def size: Int = map.size()
   }
+
+  class FileStreamSourceCleaner(fileSystem: FileSystem, sourcePath: Path,
+baseArchivePathString: Option[String]) extends 
Logging {
 
 Review comment:
   Same here: I think it is accepted at most two lines, but no strong opinions 
regarding this. Will address.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA commented on issue #23144: [SPARK-26172][ML][WIP] Unify String Params' case-insensitivity in ML

2018-12-11 Thread GitBox

SparkQA commented on issue #23144: [SPARK-26172][ML][WIP] Unify String Params' 
case-insensitivity in ML
URL: https://github.com/apache/spark/pull/23144#issuecomment-446475933
 
 
   **[Test build #17 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17/testReport)**
 for PR 23144 at commit 
[`df5559e`](https://github.com/apache/spark/commit/df5559e1b5248844c98be487f3f051d9df6808b7).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query

2018-12-11 Thread GitBox

HeartSaVioR commented on a change in pull request #22952: [SPARK-20568][SS] 
Provide option to clean up completed files in streaming query
URL: https://github.com/apache/spark/pull/22952#discussion_r240893073
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSource.scala
 ##
 @@ -329,4 +345,124 @@ object FileStreamSource {
 
 def size: Int = map.size()
   }
+
+  class FileStreamSourceCleaner(fileSystem: FileSystem, sourcePath: Path,
 
 Review comment:
   No need to be public. Will address.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] HeartSaVioR edited a comment on issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for query plans

2018-12-11 Thread GitBox

HeartSaVioR edited a comment on issue #23169: [SPARK-26103][SQL] Limit the 
length of debug strings for query plans
URL: https://github.com/apache/spark/pull/23169#issuecomment-446475026
 
 
   @DaveDeCaprio 
   I'm not sure only adding doc in SQLConf is enough for end users to be aware 
when troubleshooting their issues, but let's wait on committers feedback.
   
   cc. to @vanzin I guess you might be interested on reviewing this, given that 
you've left comments in this issue and relevant issues for Apache JIRA.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] HeartSaVioR commented on issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for query plans

2018-12-11 Thread GitBox

HeartSaVioR commented on issue #23169: [SPARK-26103][SQL] Limit the length of 
debug strings for query plans
URL: https://github.com/apache/spark/pull/23169#issuecomment-446475026
 
 
   @DaveDeCaprio 
   I'm not sure only adding doc in SQLConf is enough for end users to be aware 
when troubleshooting their issues, but let's wait on committers feedback.
   
   @vanzin 
   You might be interested on reviewing this, given that you've left comments 
in this issue and relevant issues for Apache JIRA.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23296: [mllib]MergeAggregate serialize and deserialize function use Bytebuffer to optimize

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23296: [mllib]MergeAggregate 
serialize and deserialize function use Bytebuffer to optimize
URL: https://github.com/apache/spark/pull/23296#issuecomment-446472926
 
 
   Can one of the admins verify this patch?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23296: [mllib]MergeAggregate serialize and deserialize function use Bytebuffer to optimize

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23296: [mllib]MergeAggregate serialize and 
deserialize function use Bytebuffer to optimize
URL: https://github.com/apache/spark/pull/23296#issuecomment-446473122
 
 
   Can one of the admins verify this patch?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23296: [mllib]MergeAggregate serialize and deserialize function use Bytebuffer to optimize

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23296: [mllib]MergeAggregate 
serialize and deserialize function use Bytebuffer to optimize
URL: https://github.com/apache/spark/pull/23296#issuecomment-446472860
 
 
   Can one of the admins verify this patch?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23296: [mllib]MergeAggregate serialize and deserialize function use Bytebuffer to optimize

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23296: [mllib]MergeAggregate serialize and 
deserialize function use Bytebuffer to optimize
URL: https://github.com/apache/spark/pull/23296#issuecomment-446472926
 
 
   Can one of the admins verify this patch?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23296: [mllib]MergeAggregate serialize and deserialize function use Bytebuffer to optimize

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23296: [mllib]MergeAggregate serialize and 
deserialize function use Bytebuffer to optimize
URL: https://github.com/apache/spark/pull/23296#issuecomment-446472860
 
 
   Can one of the admins verify this patch?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] lcqzte10192193 opened a new pull request #23296: [mllib]MergeAggregate serialize and deserialize function use Bytebuffer to optimize

2018-12-11 Thread GitBox

lcqzte10192193 opened a new pull request #23296: [mllib]MergeAggregate 
serialize and deserialize function use Bytebuffer to optimize
URL: https://github.com/apache/spark/pull/23296
 
 
   ## What changes were proposed in this pull request?
   MergeAggregate serialize and deserialize function can use ByteBuffer to 
optimize.
   
   
   ## How was this patch tested?
   SummarizerSuite
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] 
Keep track of nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446470421
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/8/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep 
track of nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446470417
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] 
Keep track of nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446470417
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep 
track of nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446470421
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/8/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

SparkQA removed a comment on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep 
track of nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446429391
 
 
   **[Test build #8 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/8/testReport)**
 for PR 19045 at commit 
[`43abf98`](https://github.com/apache/spark/commit/43abf98dc1eff6bcc3d5c9ba8700a74c7922de2c).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spot instances) which are going to be shutdown

2018-12-11 Thread GitBox

SparkQA commented on issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of 
nodes (/ spot instances) which are going to be shutdown
URL: https://github.com/apache/spark/pull/19045#issuecomment-446469994
 
 
   **[Test build #8 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/8/testReport)**
 for PR 19045 at commit 
[`43abf98`](https://github.com/apache/spark/commit/43abf98dc1eff6bcc3d5c9ba8700a74c7922de2c).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA commented on issue #23273: [MINOR][DOC] Fix comments of ConvertToLocalRelation rule

2018-12-11 Thread GitBox

SparkQA commented on issue #23273: [MINOR][DOC] Fix comments of 
ConvertToLocalRelation rule
URL: https://github.com/apache/spark/pull/23273#issuecomment-446468234
 
 
   **[Test build #16 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16/testReport)**
 for PR 23273 at commit 
[`1cca847`](https://github.com/apache/spark/commit/1cca8475e999139ab8f22f5cb583d9edce3cf550).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] benmccann commented on issue #20974: [SPARK-23862][SQL] Spark ExpressionEncoder should support java enum type in scala

2018-12-11 Thread GitBox

benmccann commented on issue #20974: [SPARK-23862][SQL] Spark ExpressionEncoder 
should support java enum type in scala
URL: https://github.com/apache/spark/pull/20974#issuecomment-446467737
 
 
   @gatorsmile @cloud-fan would you be able to give this PR a look or suggest a 
more appropriate reviewer?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23249: [SPARK-26297][SQL] improve the doc of Distribution/Partitioning

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23249: [SPARK-26297][SQL] improve the 
doc of Distribution/Partitioning
URL: https://github.com/apache/spark/pull/23249#issuecomment-446467464
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23249: [SPARK-26297][SQL] improve the doc of Distribution/Partitioning

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23249: [SPARK-26297][SQL] improve the 
doc of Distribution/Partitioning
URL: https://github.com/apache/spark/pull/23249#issuecomment-446467466
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6000/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23249: [SPARK-26297][SQL] improve the doc of Distribution/Partitioning

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23249: [SPARK-26297][SQL] improve the doc of 
Distribution/Partitioning
URL: https://github.com/apache/spark/pull/23249#issuecomment-446467466
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6000/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23276: [SPARK-26321][SQL] Split a SQL in correct way

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23276: [SPARK-26321][SQL] Split a SQL 
in correct way
URL: https://github.com/apache/spark/pull/23276#issuecomment-446467310
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/9/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23249: [SPARK-26297][SQL] improve the doc of Distribution/Partitioning

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23249: [SPARK-26297][SQL] improve the doc of 
Distribution/Partitioning
URL: https://github.com/apache/spark/pull/23249#issuecomment-446467464
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA commented on issue #23249: [SPARK-26297][SQL] improve the doc of Distribution/Partitioning

2018-12-11 Thread GitBox

SparkQA commented on issue #23249: [SPARK-26297][SQL] improve the doc of 
Distribution/Partitioning
URL: https://github.com/apache/spark/pull/23249#issuecomment-446467343
 
 
   **[Test build #15 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15/testReport)**
 for PR 23249 at commit 
[`cb94add`](https://github.com/apache/spark/commit/cb94addf126f9885c73c1dc8ead26fbcfece4441).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23276: [SPARK-26321][SQL] Split a SQL in correct way

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23276: [SPARK-26321][SQL] Split a SQL in 
correct way
URL: https://github.com/apache/spark/pull/23276#issuecomment-446467310
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/9/
   Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23276: [SPARK-26321][SQL] Split a SQL in correct way

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23276: [SPARK-26321][SQL] Split a SQL 
in correct way
URL: https://github.com/apache/spark/pull/23276#issuecomment-446467307
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23276: [SPARK-26321][SQL] Split a SQL in correct way

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23276: [SPARK-26321][SQL] Split a SQL in 
correct way
URL: https://github.com/apache/spark/pull/23276#issuecomment-446467307
 
 
   Merged build finished. Test FAILed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA commented on issue #23276: [SPARK-26321][SQL] Split a SQL in correct way

2018-12-11 Thread GitBox

SparkQA commented on issue #23276: [SPARK-26321][SQL] Split a SQL in correct way
URL: https://github.com/apache/spark/pull/23276#issuecomment-446467127
 
 
   **[Test build #9 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/9/testReport)**
 for PR 23276 at commit 
[`2f77b5d`](https://github.com/apache/spark/commit/2f77b5de4b8576d3b95ff6105f3587b692b25351).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA removed a comment on issue #23276: [SPARK-26321][SQL] Split a SQL in correct way

2018-12-11 Thread GitBox

SparkQA removed a comment on issue #23276: [SPARK-26321][SQL] Split a SQL in 
correct way
URL: https://github.com/apache/spark/pull/23276#issuecomment-446433577
 
 
   **[Test build #9 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/9/testReport)**
 for PR 23276 at commit 
[`2f77b5d`](https://github.com/apache/spark/commit/2f77b5de4b8576d3b95ff6105f3587b692b25351).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] seancxmao commented on a change in pull request #23273: [MINOR][DOC] Fix comments of ConvertToLocalRelation rule

2018-12-11 Thread GitBox

seancxmao commented on a change in pull request #23273: [MINOR][DOC] Fix 
comments of ConvertToLocalRelation rule
URL: https://github.com/apache/spark/pull/23273#discussion_r240886207
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ##
 @@ -1370,10 +1370,10 @@ object DecimalAggregates extends Rule[LogicalPlan] {
 }
 
 /**
- * Converts local operations (i.e. ones that don't require data exchange) on 
LocalRelation to
- * another LocalRelation.
+ * Converts local operations (i.e. ones that don't require data exchange) on 
[[LocalRelation]] to
+ * another [[LocalRelation]].
  *
- * This is relatively simple as it currently handles only 2 single case: 
Project and Limit.
+ * This rule currently handles 3 cases: [[Project]], [[Limit]] and [[Filter]].
 
 Review comment:
   @srowen Sorry, I found that the links just do not work for scaladoc, though 
they works in IDE like Intellij IDEA. I should have generated the docs to see 
if there's any problem.
   
   I have changed `[[...]]` to backticks in a new commit.
   
   Also, I have some findings:
   * Because `org/apache/spark/sql/catalyst` is excluded for doc generation, I 
include catalyst in `SparkBuild.scala` and run `build/sbt unidoc`, there are 
many errors/warnings, should we fix these problems? Seems simple but many 
places.
   * As for `[[...]]`, fully qualified class name should be used if a link 
points to a class in another package. If the link points to a class in the same 
package, package prefix could be omitted.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] cloud-fan commented on issue #23294: [SPARK-26265][Core][Followup] Put freePage into a finally block

2018-12-11 Thread GitBox

cloud-fan commented on issue #23294: [SPARK-26265][Core][Followup] Put freePage 
into a finally block
URL: https://github.com/apache/spark/pull/23294#issuecomment-446467008
 
 
   I think this can be merged to 2.4 without conflict. I'll ping you if it 
doesn't. Thanks!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] rvesse commented on a change in pull request #22904: [SPARK-25887][K8S] Configurable K8S context support

2018-12-11 Thread GitBox

rvesse commented on a change in pull request #22904: [SPARK-25887][K8S] 
Configurable K8S context support
URL: https://github.com/apache/spark/pull/22904#discussion_r240886000
 
 

 ##
 File path: 
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
 ##
 @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory {
 val dispatcher = new Dispatcher(
   ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
 
-// TODO [SPARK-25887] Create builder in a way that respects configurable 
context
-val config = new ConfigBuilder()
+// Allow for specifying a context used to auto-configure from the users 
K8S config file
+val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => 
StringUtils.isNotBlank(c))
+logInfo(s"Auto-configuring K8S client using " +
+  s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current 
context"}" +
+  s" from users K8S config file")
+
+// Start from an auto-configured config with the desired context
+// Fabric 8 uses null to indicate that the users current context should be 
used so if no
+// explicit setting pass null
+val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null)))
 
 Review comment:
   @aditanase One practical use case is interactive Spark code where the Spark 
REPL (and driver) is running at some login shell to which the user has access 
to e.g. on a physical edge node of the cluster and the actual Spark executors 
are running as K8S pods.  This is something we have customers using today.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job page

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23068: [SPARK-26098][WebUI] Show 
associated SQL query in Job page
URL: https://github.com/apache/spark/pull/23068#issuecomment-446463272
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5999/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job page

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23068: [SPARK-26098][WebUI] Show 
associated SQL query in Job page
URL: https://github.com/apache/spark/pull/23068#issuecomment-446463271
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job page

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23068: [SPARK-26098][WebUI] Show associated 
SQL query in Job page
URL: https://github.com/apache/spark/pull/23068#issuecomment-446463271
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job page

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23068: [SPARK-26098][WebUI] Show associated 
SQL query in Job page
URL: https://github.com/apache/spark/pull/23068#issuecomment-446463272
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5999/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA commented on issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job page

2018-12-11 Thread GitBox

SparkQA commented on issue #23068: [SPARK-26098][WebUI] Show associated SQL 
query in Job page
URL: https://github.com/apache/spark/pull/23068#issuecomment-446463245
 
 
   **[Test build #14 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14/testReport)**
 for PR 23068 at commit 
[`0a63604`](https://github.com/apache/spark/commit/0a636049ecc721cdd31cd676fce79aeb6582dd7c).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] gatorsmile commented on issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job page

2018-12-11 Thread GitBox

gatorsmile commented on issue #23068: [SPARK-26098][WebUI] Show associated SQL 
query in Job page
URL: https://github.com/apache/spark/pull/23068#issuecomment-446462519
 
 
   LGTM pending Jenkins. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] gatorsmile commented on issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job page

2018-12-11 Thread GitBox

gatorsmile commented on issue #23068: [SPARK-26098][WebUI] Show associated SQL 
query in Job page
URL: https://github.com/apache/spark/pull/23068#issuecomment-446462497
 
 
   retest this please


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23295: [MINOR][SQL]Change `ThreadLocal.withInitial` to Scala lambda syntax

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23295: [MINOR][SQL]Change  
`ThreadLocal.withInitial`  to Scala lambda syntax
URL: https://github.com/apache/spark/pull/23295#issuecomment-446460866
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5998/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins removed a comment on issue #23295: [MINOR][SQL]Change `ThreadLocal.withInitial` to Scala lambda syntax

2018-12-11 Thread GitBox

AmplabJenkins removed a comment on issue #23295: [MINOR][SQL]Change  
`ThreadLocal.withInitial`  to Scala lambda syntax
URL: https://github.com/apache/spark/pull/23295#issuecomment-446460865
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] SparkQA commented on issue #23295: [MINOR][SQL]Change `ThreadLocal.withInitial` to Scala lambda syntax

2018-12-11 Thread GitBox

SparkQA commented on issue #23295: [MINOR][SQL]Change  
`ThreadLocal.withInitial`  to Scala lambda syntax
URL: https://github.com/apache/spark/pull/23295#issuecomment-446460982
 
 
   **[Test build #13 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13/testReport)**
 for PR 23295 at commit 
[`361ed5c`](https://github.com/apache/spark/commit/361ed5c51f113a0cea6ec2d6d7dcfd9c877a4694).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] AmplabJenkins commented on issue #23295: [MINOR][SQL]Change `ThreadLocal.withInitial` to Scala lambda syntax

2018-12-11 Thread GitBox

AmplabJenkins commented on issue #23295: [MINOR][SQL]Change  
`ThreadLocal.withInitial`  to Scala lambda syntax
URL: https://github.com/apache/spark/pull/23295#issuecomment-446460865
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 948 matches

Mail list logo