date:20190918

[GitHub] [spark] AmplabJenkins commented on issue #24575: [SPARK-27670][SQL]Add HA for HiveThriftServer2 based on HiveServer2.

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #24575: [SPARK-27670][SQL]Add HA for 
HiveThriftServer2 based on HiveServer2.
URL: https://github.com/apache/spark/pull/24575#issuecomment-532727584
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] SparkQA removed a comment on issue #24575: [SPARK-27670][SQL]Add HA for HiveThriftServer2 based on HiveServer2.

2019-09-18 Thread GitBox

SparkQA removed a comment on issue #24575: [SPARK-27670][SQL]Add HA for 
HiveThriftServer2 based on HiveServer2.
URL: https://github.com/apache/spark/pull/24575#issuecomment-532715522
 
 
   **[Test build #110917 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110917/testReport)**
 for PR 24575 at commit 
[`e98c844`](https://github.com/apache/spark/commit/e98c844aa3851d65fac9ea5a5a0581f52bf14077).



[GitHub] [spark] SparkQA commented on issue #24575: [SPARK-27670][SQL]Add HA for HiveThriftServer2 based on HiveServer2.

2019-09-18 Thread GitBox

SparkQA commented on issue #24575: [SPARK-27670][SQL]Add HA for 
HiveThriftServer2 based on HiveServer2.
URL: https://github.com/apache/spark/pull/24575#issuecomment-532727430
 
 
   **[Test build #110917 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110917/testReport)**
 for PR 24575 at commit 
[`e98c844`](https://github.com/apache/spark/commit/e98c844aa3851d65fac9ea5a5a0581f52bf14077).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] cloud-fan commented on a change in pull request #25651: [SPARK-28948][SQL] Support passing all Table metadata in TableProvider

2019-09-18 Thread GitBox

cloud-fan commented on a change in pull request #25651: [SPARK-28948][SQL] 
Support passing all Table metadata in TableProvider
URL: https://github.com/apache/spark/pull/25651#discussion_r325731051
 
 

 ##
 File path: 
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableProvider.java
 ##
 @@ -36,26 +40,21 @@
 public interface TableProvider {
 
   /**
-   * Return a {@link Table} instance to do read/write with user-specified options.
+   * Return a {@link Table} instance to do read/write with the given table metadata. The returned
+   * table must report the same schema and partitioning with the given table metadata.
    *
-   * @param options the user-specified options that can identify a table, e.g. file path, Kafka
-   *                topic name, etc. It's an immutable case-insensitive string-to-string map.
-   */
-  Table getTable(CaseInsensitiveStringMap options);
-
-  /**
-   * Return a {@link Table} instance to do read/write with user-specified schema and options.
-   * 
-   * By default this method throws {@link UnsupportedOperationException}, implementations should
-   * override this method to handle user-specified schema.
-   * 
-   * @param options the user-specified options that can identify a table, e.g. file path, Kafka
-   *                topic name, etc. It's an immutable case-insensitive string-to-string map.
-   * @param schema the user-specified schema.
-   * @throws UnsupportedOperationException
+   * @param schema The schema of the table to load. If it's empty, implementations should infer it.
+   * @param partitions The data partitioning of the table to load. If it's empty, implementations
+   *                   should infer it.
+   * @param properties The properties of the table to load. It should be sufficient to define and
+   *                   access a table. The properties map may be {@link CaseInsensitiveStringMap}.
+   *
+   * @throws IllegalArgumentException if the implementation can't infer schema/partitioning, or
+   *                                  the given schema/partitioning doesn't match the actual data
+   *                                  schema/partitioning.
    */
-  default Table getTable(CaseInsensitiveStringMap options, StructType schema) {
-    throw new UnsupportedOperationException(
-      this.getClass().getSimpleName() + " source does not support user-specified schema");
-  }
+  Table getTable(
+      Optional schema,
+      Optional partitions,
+      Map properties);
 
 Review comment:
   I'd like to discuss what the API should look like. The current use cases include:
   1. users specify only options; the implementation needs to infer schema and partitioning
   2. users specify options and schema; the implementation needs to infer partitioning
   3. users specify everything.
   
   Shall we create three methods, or just a single method like this?
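
To make the trade-off concrete, here is a minimal Java sketch of the single-method shape, where an empty `Optional` means "please infer it". Everything in it is an assumption for illustration: `SketchTable`, `SketchProvider`, the string-based schema and partitioning, and the hard-coded "actual" metadata are hypothetical stand-ins, not Spark's types or the PR's implementation.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a table; schema and partitioning are plain values here.
final class SketchTable {
  final String schema;
  final List<String> partitions;

  SketchTable(String schema, List<String> partitions) {
    this.schema = schema;
    this.partitions = partitions;
  }
}

// Hypothetical provider: one method covers all three use cases.
final class SketchProvider {
  // Assumed metadata that a real source would discover from its data.
  private static final String ACTUAL_SCHEMA = "id INT, val STRING";
  private static final List<String> ACTUAL_PARTITIONS = Collections.singletonList("id");

  static SketchTable getTable(
      Optional<String> schema,
      Optional<List<String>> partitions,
      Map<String, String> properties) {
    // properties would locate the table (path, topic, ...); unused in this sketch.
    // An empty Optional means "infer"; a present value must match the actual data.
    String resolvedSchema = schema.orElse(ACTUAL_SCHEMA);
    List<String> resolvedPartitions = partitions.orElse(ACTUAL_PARTITIONS);
    if (!resolvedSchema.equals(ACTUAL_SCHEMA)
        || !resolvedPartitions.equals(ACTUAL_PARTITIONS)) {
      throw new IllegalArgumentException(
          "given schema/partitioning doesn't match the actual data");
    }
    return new SketchTable(resolvedSchema, resolvedPartitions);
  }
}
```

With this shape, use case 1 passes two empty `Optional`s, use case 2 passes only a schema, and use case 3 passes both. Three overloads would encode the same cases in the method signatures instead of in the arguments.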



[GitHub] [spark] AmplabJenkins removed a comment on issue #25651: [SPARK-28948][SQL] Support passing all Table metadata in TableProvider

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25651: [SPARK-28948][SQL] Support 
passing all Table metadata in TableProvider
URL: https://github.com/apache/spark/pull/25651#issuecomment-532726418
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25651: [SPARK-28948][SQL] Support passing all Table metadata in TableProvider

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25651: [SPARK-28948][SQL] Support 
passing all Table metadata in TableProvider
URL: https://github.com/apache/spark/pull/25651#issuecomment-532726429
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/16034/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25812: [SPARK-22796][PYTHON][ML] Add multiple columns support to PySpark QuantileDiscretizer

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25812: [SPARK-22796][PYTHON][ML] Add 
multiple columns support to PySpark QuantileDiscretizer
URL: https://github.com/apache/spark/pull/25812#issuecomment-532726838
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25812: [SPARK-22796][PYTHON][ML] Add multiple columns support to PySpark QuantileDiscretizer

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25812: [SPARK-22796][PYTHON][ML] Add 
multiple columns support to PySpark QuantileDiscretizer
URL: https://github.com/apache/spark/pull/25812#issuecomment-532726845
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110916/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25812: [SPARK-22796][PYTHON][ML] Add multiple columns support to PySpark QuantileDiscretizer

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25812: [SPARK-22796][PYTHON][ML] Add multiple 
columns support to PySpark QuantileDiscretizer
URL: https://github.com/apache/spark/pull/25812#issuecomment-532726845
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110916/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25812: [SPARK-22796][PYTHON][ML] Add multiple columns support to PySpark QuantileDiscretizer

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25812: [SPARK-22796][PYTHON][ML] Add multiple 
columns support to PySpark QuantileDiscretizer
URL: https://github.com/apache/spark/pull/25812#issuecomment-532726838
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25651: [SPARK-28948][SQL] Support passing all Table metadata in TableProvider

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25651: [SPARK-28948][SQL] Support passing all 
Table metadata in TableProvider
URL: https://github.com/apache/spark/pull/25651#issuecomment-532726429
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/16034/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #25812: [SPARK-22796][PYTHON][ML] Add multiple columns support to PySpark QuantileDiscretizer

2019-09-18 Thread GitBox

SparkQA removed a comment on issue #25812: [SPARK-22796][PYTHON][ML] Add 
multiple columns support to PySpark QuantileDiscretizer
URL: https://github.com/apache/spark/pull/25812#issuecomment-532715432
 
 
   **[Test build #110916 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110916/testReport)**
 for PR 25812 at commit 
[`69ea569`](https://github.com/apache/spark/commit/69ea56900c389a5e0046050b0777e0d20284deb6).



[GitHub] [spark] AmplabJenkins commented on issue #25651: [SPARK-28948][SQL] Support passing all Table metadata in TableProvider

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25651: [SPARK-28948][SQL] Support passing all 
Table metadata in TableProvider
URL: https://github.com/apache/spark/pull/25651#issuecomment-532726418
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA commented on issue #25812: [SPARK-22796][PYTHON][ML] Add multiple columns support to PySpark QuantileDiscretizer

2019-09-18 Thread GitBox

SparkQA commented on issue #25812: [SPARK-22796][PYTHON][ML] Add multiple 
columns support to PySpark QuantileDiscretizer
URL: https://github.com/apache/spark/pull/25812#issuecomment-532726296
 
 
   **[Test build #110916 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110916/testReport)**
 for PR 25812 at commit 
[`69ea569`](https://github.com/apache/spark/commit/69ea56900c389a5e0046050b0777e0d20284deb6).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] phpisciuneri commented on a change in pull request #25818: [SPARK-29121][ML][MLLIB] Support for dot product operation on Vector(s)

2019-09-18 Thread GitBox

phpisciuneri commented on a change in pull request #25818: 
[SPARK-29121][ML][MLLIB] Support for dot product operation on Vector(s)
URL: https://github.com/apache/spark/pull/25818#discussion_r325728399
 
 

 ##
 File path: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala
 ##
 @@ -178,6 +178,13 @@ sealed trait Vector extends Serializable {
*/
   @Since("2.0.0")
   def argmax: Int
+
+  /**
+   * Calculate the dot product of this vector with another.
+   *
+   * If `size` does not match an [IllegalArgumentException] is thrown.
+   */
+  def dot(v: Vector): Double = BLAS.dot(this, v)
 
 Review comment:
   @srowen thanks. I added the annotation to each of the dot functions in ml and mllib.
   
   I'm a Scala coder and not very familiar with the PySpark and SparkR portions of the code base... Let me have a look.
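
For readers following the diff, the behaviour described in the new doc comment (a dot product that throws `IllegalArgumentException` when sizes differ) can be sketched in plain Java as below. This is only an illustration over dense arrays; `DotSketch` is a hypothetical helper and is not Spark's `BLAS.dot`.

```java
// Hypothetical helper, not Spark's BLAS: dot product of two dense vectors.
final class DotSketch {
  static double dot(double[] x, double[] y) {
    // Mirror the documented contract: mismatched sizes are rejected up front.
    if (x.length != y.length) {
      throw new IllegalArgumentException(
          "vector sizes differ: " + x.length + " vs " + y.length);
    }
    double sum = 0.0;
    for (int i = 0; i < x.length; i++) {
      sum += x[i] * y[i];
    }
    return sum;
  }
}
```

For example, `DotSketch.dot(new double[] {1, 2, 3}, new double[] {4, 5, 6})` evaluates 1*4 + 2*5 + 3*6.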



[GitHub] [spark] gaborgsomogyi edited a comment on issue #25760: [SPARK-29054][SS] Invalidate Kafka consumer when new delegation token available

2019-09-18 Thread GitBox

gaborgsomogyi edited a comment on issue #25760: [SPARK-29054][SS] Invalidate 
Kafka consumer when new delegation token available
URL: https://github.com/apache/spark/pull/25760#issuecomment-532674237
 
 
   Thanks guys, valid comments. Let me think it through and update it soon...



[GitHub] [spark] AmplabJenkins removed a comment on issue #25831: [SPARK-29122][SQL] Propagate all the SQL conf to executors in SQLQueryTestSuite

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25831: [SPARK-29122][SQL] Propagate 
all the SQL conf to executors in SQLQueryTestSuite
URL: https://github.com/apache/spark/pull/25831#issuecomment-532719435
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25831: [SPARK-29122][SQL] Propagate all the SQL conf to executors in SQLQueryTestSuite

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25831: [SPARK-29122][SQL] Propagate 
all the SQL conf to executors in SQLQueryTestSuite
URL: https://github.com/apache/spark/pull/25831#issuecomment-532719441
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/16033/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25831: [SPARK-29122][SQL] Propagate all the SQL conf to executors in SQLQueryTestSuite

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25831: [SPARK-29122][SQL] Propagate all the 
SQL conf to executors in SQLQueryTestSuite
URL: https://github.com/apache/spark/pull/25831#issuecomment-532719441
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/16033/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25831: [SPARK-29122][SQL] Propagate all the SQL conf to executors in SQLQueryTestSuite

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25831: [SPARK-29122][SQL] Propagate all the 
SQL conf to executors in SQLQueryTestSuite
URL: https://github.com/apache/spark/pull/25831#issuecomment-532719435
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA commented on issue #25831: [SPARK-29122][SQL] Propagate all the SQL conf to executors in SQLQueryTestSuite

2019-09-18 Thread GitBox

SparkQA commented on issue #25831: [SPARK-29122][SQL] Propagate all the SQL 
conf to executors in SQLQueryTestSuite
URL: https://github.com/apache/spark/pull/25831#issuecomment-532718855
 
 
   **[Test build #110918 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110918/testReport)**
 for PR 25831 at commit 
[`e0d8a5e`](https://github.com/apache/spark/commit/e0d8a5e119b9319cefd5f6d93e51ed7f5e4bd60f).



[GitHub] [spark] maropu opened a new pull request #25831: [SPARK-29122][SQL] Propagate all the SQL conf to executors in SQLQueryTestSuite

2019-09-18 Thread GitBox

maropu opened a new pull request #25831: [SPARK-29122][SQL] Propagate all the 
SQL conf to executors in SQLQueryTestSuite
URL: https://github.com/apache/spark/pull/25831
 
 
   
   
   ### What changes were proposed in this pull request?
   
   This PR propagates all the SQL configurations to executors in `SQLQueryTestSuite`. When propagation is enabled in the tests, the potential bug below becomes apparent:
   ```
   CREATE TABLE num_data (id int, val decimal(38,10)) USING parquet;
   
   select sum(udf(CAST(null AS Decimal(38,0)))) from range(1,4): QueryOutput(select sum(udf(CAST(null AS Decimal(38,0)))) from range(1,4),struct<>,java.lang.IllegalArgumentException
   [info]   requirement failed: MutableProjection cannot use UnsafeRow for output data types: decimal(38,0)) (SQLQueryTestSuite.scala:380)
   ```
   The root cause is that `InterpretedMutableProjection` has incorrect validation in interpreter mode: `validExprs.forall { case (e, _) => UnsafeRow.isFixedLength(e.dataType) }`. This validation should be the same as the condition (`isMutable`) in `HashAggregate.supportsAggregate`: 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala#L1126
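
The gap between the two checks can be illustrated with a simplified, stand-alone Java model. It is a sketch under stated assumptions, not Spark's `UnsafeRow` code: it looks only at decimal precision, and it assumes that decimals of up to 18 digits are stored inline while wider decimals can still be updated in place. `DecimalCheckSketch` and its constant are hypothetical names.

```java
// Simplified stand-alone model of the two checks for decimal types only;
// not Spark's UnsafeRow implementation.
final class DecimalCheckSketch {
  // Assumption: a decimal with at most 18 digits fits in a long.
  static final int MAX_LONG_DIGITS = 18;

  // Stricter check: true only when the decimal is stored inline in a fixed slot.
  static boolean isFixedLength(int precision) {
    return precision <= MAX_LONG_DIGITS;
  }

  // Looser check: assumed true for every decimal precision, because wider
  // decimals are assumed to occupy a region that can be overwritten in place.
  static boolean isMutable(int precision) {
    return true;
  }
}
```

Under these assumptions, `decimal(38,0)` fails the stricter check but passes the looser one, which matches the failure in the log above: validating with the stricter check rejects an output type that the aggregate path accepts.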
   
   
   ### Why are the changes needed?
   
   Bug fixes.
   
   ### Does this PR introduce any user-facing change?
   
   No
   
   ### How was this patch tested?
   
   Added tests in `AggregationQuerySuite`



[GitHub] [spark] AmplabJenkins removed a comment on issue #25812: [SPARK-22796][PYTHON][ML] Add multiple columns support to PySpark QuantileDiscretizer

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25812: [SPARK-22796][PYTHON][ML] Add 
multiple columns support to PySpark QuantileDiscretizer
URL: https://github.com/apache/spark/pull/25812#issuecomment-532716055
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25826: [SPARK-29042][Core][BRANCH-2.4] Sampling-based RDD with unordered input should be INDETERMINATE

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25826: 
[SPARK-29042][Core][BRANCH-2.4] Sampling-based RDD with unordered input should 
be INDETERMINATE
URL: https://github.com/apache/spark/pull/25826#issuecomment-532716076
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25812: [SPARK-22796][PYTHON][ML] Add multiple columns support to PySpark QuantileDiscretizer

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25812: [SPARK-22796][PYTHON][ML] Add 
multiple columns support to PySpark QuantileDiscretizer
URL: https://github.com/apache/spark/pull/25812#issuecomment-532716066
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/16032/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25826: [SPARK-29042][Core][BRANCH-2.4] Sampling-based RDD with unordered input should be INDETERMINATE

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25826: 
[SPARK-29042][Core][BRANCH-2.4] Sampling-based RDD with unordered input should 
be INDETERMINATE
URL: https://github.com/apache/spark/pull/25826#issuecomment-532716083
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/16031/
   Test PASSed.



[GitHub] [spark] srowen closed pull request #25815: [SPARK-29118][ML] Avoid redundant computation in transform of GMM & GLR

2019-09-18 Thread GitBox

srowen closed pull request #25815: [SPARK-29118][ML] Avoid redundant 
computation in transform of GMM & GLR
URL: https://github.com/apache/spark/pull/25815
 
 
   



[GitHub] [spark] srowen commented on issue #25815: [SPARK-29118][ML] Avoid redundant computation in transform of GMM & GLR

2019-09-18 Thread GitBox

srowen commented on issue #25815: [SPARK-29118][ML] Avoid redundant computation 
in transform of GMM & GLR
URL: https://github.com/apache/spark/pull/25815#issuecomment-532716287
 
 
   Merged to master



[GitHub] [spark] AmplabJenkins commented on issue #25826: [SPARK-29042][Core][BRANCH-2.4] Sampling-based RDD with unordered input should be INDETERMINATE

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25826: [SPARK-29042][Core][BRANCH-2.4] 
Sampling-based RDD with unordered input should be INDETERMINATE
URL: https://github.com/apache/spark/pull/25826#issuecomment-532716076
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25812: [SPARK-22796][PYTHON][ML] Add multiple columns support to PySpark QuantileDiscretizer

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25812: [SPARK-22796][PYTHON][ML] Add multiple 
columns support to PySpark QuantileDiscretizer
URL: https://github.com/apache/spark/pull/25812#issuecomment-532716055
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25812: [SPARK-22796][PYTHON][ML] Add multiple columns support to PySpark QuantileDiscretizer

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25812: [SPARK-22796][PYTHON][ML] Add multiple 
columns support to PySpark QuantileDiscretizer
URL: https://github.com/apache/spark/pull/25812#issuecomment-532716066
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/16032/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25826: [SPARK-29042][Core][BRANCH-2.4] Sampling-based RDD with unordered input should be INDETERMINATE

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25826: [SPARK-29042][Core][BRANCH-2.4] 
Sampling-based RDD with unordered input should be INDETERMINATE
URL: https://github.com/apache/spark/pull/25826#issuecomment-532716083
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/16031/
   Test PASSed.



[GitHub] [spark] srowen commented on a change in pull request #25818: [SPARK-29121][ML][MLLIB] Support for dot product operation on Vector(s)

2019-09-18 Thread GitBox

srowen commented on a change in pull request #25818: [SPARK-29121][ML][MLLIB] 
Support for dot product operation on Vector(s)
URL: https://github.com/apache/spark/pull/25818#discussion_r325715899
 
 

 ##
 File path: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala
 ##
 @@ -178,6 +178,13 @@ sealed trait Vector extends Serializable {
    */
   @Since("2.0.0")
   def argmax: Int
+
+  /**
+   * Calculate the dot product of this vector with another.
+   *
+   * If `size` does not match an [IllegalArgumentException] is thrown.
+   */
+  def dot(v: Vector): Double = BLAS.dot(this, v)
 
 Review comment:
   Add `@Since("3.0.0")`. I think this needs to be exposed in PySpark and/or SparkR too?
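
   For illustration, a minimal sketch of the behaviour the doc comment in the diff describes: a dot product that rejects operands of different sizes. This is plain Scala over dense arrays, not the actual `BLAS.dot` implementation; `DotSketch` and its `dot` helper are hypothetical names used only here. It relies on `require`, which throws `IllegalArgumentException` when its predicate is false.

```scala
object DotSketch {
  // Hypothetical stand-in for Vector.dot; handles dense arrays only.
  def dot(x: Array[Double], y: Array[Double]): Double = {
    // require throws IllegalArgumentException when the sizes differ.
    require(x.length == y.length,
      s"Vector sizes must match: x.length=${x.length}, y.length=${y.length}")
    var sum = 0.0
    var i = 0
    while (i < x.length) {
      sum += x(i) * y(i)
      i += 1
    }
    sum
  }
}
```

   Calling `DotSketch.dot` with arrays of different lengths fails fast instead of silently truncating to the shorter operand.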



[GitHub] [spark] SparkQA commented on issue #25812: [SPARK-22796][PYTHON][ML] Add multiple columns support to PySpark QuantileDiscretizer

2019-09-18 Thread GitBox

SparkQA commented on issue #25812: [SPARK-22796][PYTHON][ML] Add multiple 
columns support to PySpark QuantileDiscretizer
URL: https://github.com/apache/spark/pull/25812#issuecomment-532715432
 
 
   **[Test build #110916 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110916/testReport)**
 for PR 25812 at commit 
[`69ea569`](https://github.com/apache/spark/commit/69ea56900c389a5e0046050b0777e0d20284deb6).



[GitHub] [spark] SparkQA commented on issue #24575: [SPARK-27670][SQL]Add HA for HiveThriftServer2 based on HiveServer2.

2019-09-18 Thread GitBox

SparkQA commented on issue #24575: [SPARK-27670][SQL]Add HA for 
HiveThriftServer2 based on HiveServer2.
URL: https://github.com/apache/spark/pull/24575#issuecomment-532715522
 
 
   **[Test build #110917 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110917/testReport)**
 for PR 24575 at commit 
[`e98c844`](https://github.com/apache/spark/commit/e98c844aa3851d65fac9ea5a5a0581f52bf14077).



[GitHub] [spark] SparkQA commented on issue #25826: [SPARK-29042][Core][BRANCH-2.4] Sampling-based RDD with unordered input should be INDETERMINATE

2019-09-18 Thread GitBox

SparkQA commented on issue #25826: [SPARK-29042][Core][BRANCH-2.4] 
Sampling-based RDD with unordered input should be INDETERMINATE
URL: https://github.com/apache/spark/pull/25826#issuecomment-532715437
 
 
   **[Test build #110915 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110915/testReport)**
 for PR 25826 at commit 
[`e17dd66`](https://github.com/apache/spark/commit/e17dd6610b13883b446c0db7840171974bd9aef3).



[GitHub] [spark] srowen commented on issue #25829: [SPARK-29144][ML] Binarizer handle sparse vectors incorrectly with negative threshold

2019-09-18 Thread GitBox

srowen commented on issue #25829: [SPARK-29144][ML] Binarizer handle sparse 
vectors incorrectly with negative threshold
URL: https://github.com/apache/spark/pull/25829#issuecomment-532715181
 
 
   I think the right answer is to return 1 for all of the implicit 0 entries when the threshold is < 0. Yes, that makes the result dense, but it's the right answer.
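
   A rough sketch of that rule, assuming the usual binarization convention that a value maps to 1.0 only when it is strictly greater than the threshold. `BinarizeSketch` and its `(size, indices, values)` representation of a sparse vector are hypothetical and stand in for the real `Binarizer` and `SparseVector` classes.

```scala
object BinarizeSketch {
  // Hypothetical helper: binarize a sparse vector given as (size, indices, values).
  // Assumption: an entry maps to 1.0 when it is strictly greater than the threshold.
  def binarize(size: Int, indices: Array[Int], values: Array[Double],
      threshold: Double): Array[Double] = {
    // Implicit entries are 0.0, so they become 1.0 exactly when threshold < 0.
    val implicitValue = if (0.0 > threshold) 1.0 else 0.0
    val out = Array.fill(size)(implicitValue)
    var k = 0
    while (k < indices.length) {
      // Explicit entries are compared against the threshold individually.
      out(indices(k)) = if (values(k) > threshold) 1.0 else 0.0
      k += 1
    }
    out
  }
}
```

   With a negative threshold the output is dense in general, because every position that was not stored explicitly now holds 1.0.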



[GitHub] [spark] HyukjinKwon commented on issue #25826: [SPARK-29042][Core][BRANCH-2.4] Sampling-based RDD with unordered input should be INDETERMINATE

2019-09-18 Thread GitBox

HyukjinKwon commented on issue #25826: [SPARK-29042][Core][BRANCH-2.4] 
Sampling-based RDD with unordered input should be INDETERMINATE
URL: https://github.com/apache/spark/pull/25826#issuecomment-532713979
 
 
   retest this please



[GitHub] [spark] HyukjinKwon closed pull request #25820: [SPARK-29101][SQL] Fix count API for csv file when DROPMALFORMED mode is selected

2019-09-18 Thread GitBox

HyukjinKwon closed pull request #25820: [SPARK-29101][SQL] Fix count API for 
csv file when DROPMALFORMED mode is selected
URL: https://github.com/apache/spark/pull/25820
 
 
   



[GitHub] [spark] HyukjinKwon commented on issue #25820: [SPARK-29101][SQL] Fix count API for csv file when DROPMALFORMED mode is selected

2019-09-18 Thread GitBox

HyukjinKwon commented on issue #25820: [SPARK-29101][SQL] Fix count API for csv 
file when DROPMALFORMED mode is selected
URL: https://github.com/apache/spark/pull/25820#issuecomment-532712466
 
 
   Merged to master.



[GitHub] [spark] HyukjinKwon closed pull request #25814: [SPARK-19926][PYSPARK] make captured exception from JVM side user friendly

2019-09-18 Thread GitBox

HyukjinKwon closed pull request #25814: [SPARK-19926][PYSPARK] make captured 
exception from JVM side user friendly
URL: https://github.com/apache/spark/pull/25814
 
 
   



[GitHub] [spark] HyukjinKwon commented on issue #25814: [SPARK-19926][PYSPARK] make captured exception from JVM side user friendly

2019-09-18 Thread GitBox

HyukjinKwon commented on issue #25814: [SPARK-19926][PYSPARK] make captured 
exception from JVM side user friendly
URL: https://github.com/apache/spark/pull/25814#issuecomment-532711956
 
 
   Merged to master.



[GitHub] [spark] HyukjinKwon closed pull request #25716: [SPARK-29012][SQL] Support special timestamp values

2019-09-18 Thread GitBox

HyukjinKwon closed pull request #25716: [SPARK-29012][SQL] Support special 
timestamp values
URL: https://github.com/apache/spark/pull/25716
 
 
   



[GitHub] [spark] HyukjinKwon commented on issue #25716: [SPARK-29012][SQL] Support special timestamp values

2019-09-18 Thread GitBox

HyukjinKwon commented on issue #25716: [SPARK-29012][SQL] Support special 
timestamp values
URL: https://github.com/apache/spark/pull/25716#issuecomment-532711283
 
 
   Merged to master.



[GitHub] [spark] srowen commented on issue #25789: [SPARK-28927][ML] Rethrow block mismatch exception in ALS when input data is nondeterministic

2019-09-18 Thread GitBox

srowen commented on issue #25789: [SPARK-28927][ML] Rethrow block mismatch 
exception in ALS when input data is nondeterministic
URL: https://github.com/apache/spark/pull/25789#issuecomment-532707660
 
 
   Merged to master



[GitHub] [spark] srowen closed pull request #25789: [SPARK-28927][ML] Rethrow block mismatch exception in ALS when input data is nondeterministic

2019-09-18 Thread GitBox

srowen closed pull request #25789: [SPARK-28927][ML] Rethrow block mismatch 
exception in ALS when input data is nondeterministic
URL: https://github.com/apache/spark/pull/25789
 
 
   



[GitHub] [spark] srowen commented on issue #25821: [SPARK-29124][CORE] Use MurmurHash3 `bytesHash(data, seed)` instead of `bytesHash(data)`

2019-09-18 Thread GitBox

srowen commented on issue #25821: [SPARK-29124][CORE] Use MurmurHash3 
`bytesHash(data, seed)` instead of `bytesHash(data)`
URL: https://github.com/apache/spark/pull/25821#issuecomment-532706603
 
 
   See comment on other PR - yep I get the idea now, makes sense.



[GitHub] [spark] advancedxy commented on issue #25825: [SPARK-26713][CORE][2.4] Interrupt pipe IO threads in PipedRDD when task is finished

2019-09-18 Thread GitBox

advancedxy commented on issue #25825: [SPARK-26713][CORE][2.4] Interrupt pipe 
IO threads in PipedRDD when task is finished
URL: https://github.com/apache/spark/pull/25825#issuecomment-532706176
 
 
   > Let's also mention the original PR in the description.
   
   Edited the description, and the tests have passed. Shall we merge this, @cloud-fan?



[GitHub] [spark] srowen commented on a change in pull request #25802: [SPARK-29095][ML] add extractInstances

2019-09-18 Thread GitBox

srowen commented on a change in pull request #25802: [SPARK-29095][ML] add 
extractInstances
URL: https://github.com/apache/spark/pull/25802#discussion_r325702663
 
 

 ##
 File path: mllib/src/main/scala/org/apache/spark/ml/Predictor.scala
 ##
 @@ -62,6 +62,40 @@ private[ml] trait PredictorParams extends Params
     }
     SchemaUtils.appendColumn(schema, $(predictionCol), DoubleType)
   }
+
+  /**
+   * Extract [[labelCol]], weightCol(if any) and [[featuresCol]] from the given dataset,
+   * and put it in an RDD with strong types.
+   */
+  protected def extractInstances(dataset: Dataset[_]): RDD[Instance] = {
+    val w = this match {
+      case p: HasWeightCol =>
+        if (isDefined(p.weightCol) && $(p.weightCol).nonEmpty) {
+          col($(p.weightCol)).cast(DoubleType)
+        } else {
+          lit(1.0)
+        }
+      case _ => lit(1.0)
 
 Review comment:
   If it doesn't have a weight column, is there any point in selecting lit(1.0) as a weight column, given that it will be unused? Or do some algorithms lack a weight column but nevertheless have ways of using a weight?



[GitHub] [spark] turboFei commented on issue #25795: [WIP][SPARK-29037][Core] Spark gives duplicate result when an application was killed

2019-09-18 Thread GitBox

turboFei commented on issue #25795: [WIP][SPARK-29037][Core] Spark gives 
duplicate result when an application was killed
URL: https://github.com/apache/spark/pull/25795#issuecomment-532704768
 
 
   I have discussed this with advancedxy offline, and I am now clear about the solution.
   Thanks @advancedxy. I will complete this PR after https://github.com/apache/spark/pull/25739 is merged.
   Thanks again.
   



[GitHub] [spark] srowen commented on a change in pull request #25759: [SPARK-19147][CORE] Gracefully handle error in task after executor is stopped

2019-09-18 Thread GitBox

srowen commented on a change in pull request #25759: [SPARK-19147][CORE] 
Gracefully handle error in task after executor is stopped
URL: https://github.com/apache/spark/pull/25759#discussion_r325700132
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/executor/Executor.scala
 ##
 @@ -624,37 +624,43 @@ private[spark] class Executor(
           execBackend.statusUpdate(taskId, TaskState.KILLED, ser.serialize(reason))
 
         case t: Throwable =>
-          // Attempt to exit cleanly by informing the driver of our failure.
-          // If anything goes wrong (or this was a fatal exception), we will delegate to
-          // the default uncaught exception handler, which will terminate the Executor.
-          logError(s"Exception in $taskName (TID $taskId)", t)
-
-          // SPARK-20904: Do not report failure to driver if if happened during shut down. Because
-          // libraries may set up shutdown hooks that race with running tasks during shutdown,
-          // spurious failures may occur and can result in improper accounting in the driver (e.g.
-          // the task failure would not be ignored if the shutdown happened because of premption,
-          // instead of an app issue).
-          if (!ShutdownHookManager.inShutdown()) {
-            val (accums, accUpdates) = collectAccumulatorsAndResetStatusOnFailure(taskStartTimeNs)
-            val metricPeaks = WrappedArray.make(metricsPoller.getTaskMetricPeaks(taskId))
-
-            val serializedTaskEndReason = {
-              try {
-                val ef = new ExceptionFailure(t, accUpdates).withAccums(accums)
-                  .withMetricPeaks(metricPeaks)
-                ser.serialize(ef)
-              } catch {
-                case _: NotSerializableException =>
-                  // t is not serializable so just send the stacktrace
-                  val ef = new ExceptionFailure(t, accUpdates, false).withAccums(accums)
+          if (env.isStopped) {
 
 Review comment:
   This is looking OK overall, to me. You might be able to avoid most of the 
diff due to indentation by only adding a single case:
   
   ```
   case t: Throwable if env.isStopped =>
     logError(...)
   case t: Throwable =>
     // unchanged
   ```
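
   As a self-contained illustration of why the guarded case keeps the existing branch untouched, here is a small sketch. `GuardSketch`, `handle`, and the `envStopped` flag are hypothetical stand-ins for the executor's catch block and `env.isStopped`; the returned strings only label which branch ran.

```scala
object GuardSketch {
  // Hypothetical stand-in for the executor's catch block: returns a label
  // describing which branch handled the error.
  def handle(t: Throwable, envStopped: Boolean): String = t match {
    // The guarded case is tried first and only matches when the flag is set.
    case e: Throwable if envStopped => s"logged only: ${e.getMessage}"
    // Otherwise the existing handler runs with its body unchanged.
    case e: Throwable => s"reported to driver: ${e.getMessage}"
  }
}
```

   Because cases are tried in order, the second branch keeps its original indentation and body, so the diff is limited to the new case.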



[GitHub] [spark] srowen commented on issue #25689: [SPARK-28972][DOCS] Updating unit description in configurations, to maintain consistency

2019-09-18 Thread GitBox

srowen commented on issue #25689: [SPARK-28972][DOCS] Updating unit description 
in configurations, to maintain consistency
URL: https://github.com/apache/spark/pull/25689#issuecomment-532702739
 
 
   Merged to master



[GitHub] [spark] srowen closed pull request #25689: [SPARK-28972][DOCS] Updating unit description in configurations, to maintain consistency

2019-09-18 Thread GitBox

srowen closed pull request #25689: [SPARK-28972][DOCS] Updating unit 
description in configurations, to maintain consistency
URL: https://github.com/apache/spark/pull/25689
 
 
   



[GitHub] [spark] AmplabJenkins removed a comment on issue #25820: [SPARK-29101][SQL] Fix count API for csv file when DROPMALFORMED mode is selected

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25820: [SPARK-29101][SQL] Fix count 
API for csv file when DROPMALFORMED mode is selected
URL: https://github.com/apache/spark/pull/25820#issuecomment-532700198
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110898/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25820: [SPARK-29101][SQL] Fix count API for csv file when DROPMALFORMED mode is selected

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25820: [SPARK-29101][SQL] Fix count 
API for csv file when DROPMALFORMED mode is selected
URL: https://github.com/apache/spark/pull/25820#issuecomment-532700190
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25820: [SPARK-29101][SQL] Fix count API for csv file when DROPMALFORMED mode is selected

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25820: [SPARK-29101][SQL] Fix count API for 
csv file when DROPMALFORMED mode is selected
URL: https://github.com/apache/spark/pull/25820#issuecomment-532700190
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25825: [SPARK-26713][CORE][2.4] Interrupt pipe IO threads in PipedRDD when task is finished

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25825: [SPARK-26713][CORE][2.4] 
Interrupt pipe IO threads in PipedRDD when task is finished
URL: https://github.com/apache/spark/pull/25825#issuecomment-532699672
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110896/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25825: [SPARK-26713][CORE][2.4] Interrupt pipe IO threads in PipedRDD when task is finished

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25825: [SPARK-26713][CORE][2.4] 
Interrupt pipe IO threads in PipedRDD when task is finished
URL: https://github.com/apache/spark/pull/25825#issuecomment-532699661
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25820: [SPARK-29101][SQL] Fix count API for csv file when DROPMALFORMED mode is selected

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25820: [SPARK-29101][SQL] Fix count API for 
csv file when DROPMALFORMED mode is selected
URL: https://github.com/apache/spark/pull/25820#issuecomment-532700198
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110898/
   Test PASSed.



[GitHub] [spark] maropu commented on issue #25830: [SPARK-29140][SQL] Handle BinaryType of parameter properly in HashAggregateExec

2019-09-18 Thread GitBox

maropu commented on issue #25830: [SPARK-29140][SQL] Handle BinaryType of 
parameter properly in HashAggregateExec
URL: https://github.com/apache/spark/pull/25830#issuecomment-532700223
 
 
   In the PR title, wouldn't "array types" be clearer than "binary types"?



[GitHub] [spark] AmplabJenkins commented on issue #25825: [SPARK-26713][CORE][2.4] Interrupt pipe IO threads in PipedRDD when task is finished

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25825: [SPARK-26713][CORE][2.4] Interrupt 
pipe IO threads in PipedRDD when task is finished
URL: https://github.com/apache/spark/pull/25825#issuecomment-532699661
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25825: [SPARK-26713][CORE][2.4] Interrupt pipe IO threads in PipedRDD when task is finished

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25825: [SPARK-26713][CORE][2.4] Interrupt 
pipe IO threads in PipedRDD when task is finished
URL: https://github.com/apache/spark/pull/25825#issuecomment-532699672
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110896/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local file

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local 
file
URL: https://github.com/apache/spark/pull/25690#issuecomment-532699223
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #25825: [SPARK-26713][CORE][2.4] Interrupt pipe IO threads in PipedRDD when task is finished

2019-09-18 Thread GitBox

SparkQA removed a comment on issue #25825: [SPARK-26713][CORE][2.4] Interrupt 
pipe IO threads in PipedRDD when task is finished
URL: https://github.com/apache/spark/pull/25825#issuecomment-532584868
 
 
   **[Test build #110896 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110896/testReport)**
 for PR 25825 at commit 
[`6ee8d0d`](https://github.com/apache/spark/commit/6ee8d0d6aaddb8185122b9389155b64c102623d0).



[GitHub] [spark] AmplabJenkins removed a comment on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local file

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local 
file
URL: https://github.com/apache/spark/pull/25690#issuecomment-532699238
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/16030/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25728: [SPARK-29020][WIP][SQL] 
Improving array_sort behaviour
URL: https://github.com/apache/spark/pull/25728#issuecomment-532699304
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/16029/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #25820: [SPARK-29101][SQL] Fix count API for csv file when DROPMALFORMED mode is selected

2019-09-18 Thread GitBox

SparkQA removed a comment on issue #25820: [SPARK-29101][SQL] Fix count API for 
csv file when DROPMALFORMED mode is selected
URL: https://github.com/apache/spark/pull/25820#issuecomment-532587821
 
 
   **[Test build #110898 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110898/testReport)**
 for PR 25820 at commit 
[`f2c25f0`](https://github.com/apache/spark/commit/f2c25f068b29e2a71a7e4eacaa075e67001a2652).



[GitHub] [spark] AmplabJenkins removed a comment on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25728: [SPARK-29020][WIP][SQL] 
Improving array_sort behaviour
URL: https://github.com/apache/spark/pull/25728#issuecomment-532699294
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA commented on issue #25820: [SPARK-29101][SQL] Fix count API for csv file when DROPMALFORMED mode is selected

2019-09-18 Thread GitBox

SparkQA commented on issue #25820: [SPARK-29101][SQL] Fix count API for csv 
file when DROPMALFORMED mode is selected
URL: https://github.com/apache/spark/pull/25820#issuecomment-532699395
 
 
   **[Test build #110898 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110898/testReport)**
 for PR 25820 at commit 
[`f2c25f0`](https://github.com/apache/spark/commit/f2c25f068b29e2a71a7e4eacaa075e67001a2652).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] AmplabJenkins commented on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25728: [SPARK-29020][WIP][SQL] Improving 
array_sort behaviour
URL: https://github.com/apache/spark/pull/25728#issuecomment-532699304
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/16029/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25728: [SPARK-29020][WIP][SQL] Improving 
array_sort behaviour
URL: https://github.com/apache/spark/pull/25728#issuecomment-532699294
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local file

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local 
file
URL: https://github.com/apache/spark/pull/25690#issuecomment-532699223
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local file

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local 
file
URL: https://github.com/apache/spark/pull/25690#issuecomment-532699238
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/16030/
   Test PASSed.



[GitHub] [spark] maropu commented on a change in pull request #25830: [SPARK-29140][SQL] Handle BinaryType of parameter properly in HashAggregateExec

2019-09-18 Thread GitBox

maropu commented on a change in pull request #25830: [SPARK-29140][SQL] Handle 
BinaryType of parameter properly in HashAggregateExec
URL: https://github.com/apache/spark/pull/25830#discussion_r325694088
 
 

 ##
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
 ##
 @@ -392,6 +394,14 @@ case class HashAggregateExec(
  """.stripMargin
   }
 
+  private def typeNameForCodegen(clazz: Class[_]): String = {
 
 Review comment:
   It might be better to move this helper function to `CodeGenerator`.



[GitHub] [spark] SparkQA commented on issue #25825: [SPARK-26713][CORE][2.4] Interrupt pipe IO threads in PipedRDD when task is finished

2019-09-18 Thread GitBox

SparkQA commented on issue #25825: [SPARK-26713][CORE][2.4] Interrupt pipe IO 
threads in PipedRDD when task is finished
URL: https://github.com/apache/spark/pull/25825#issuecomment-532699049
 
 
   **[Test build #110896 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110896/testReport)**
 for PR 25825 at commit 
[`6ee8d0d`](https://github.com/apache/spark/commit/6ee8d0d6aaddb8185122b9389155b64c102623d0).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] SparkQA commented on issue #25811: [SPARK-29111][CORE] Support snapshot/restore on KVStore

2019-09-18 Thread GitBox

SparkQA commented on issue #25811: [SPARK-29111][CORE] Support snapshot/restore 
on KVStore
URL: https://github.com/apache/spark/pull/25811#issuecomment-532698564
 
 
   **[Test build #110912 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110912/testReport)**
 for PR 25811 at commit 
[`9b63b05`](https://github.com/apache/spark/commit/9b63b054d7a5d49f635c28c55f4d7da97c8bffba).



[GitHub] [spark] SparkQA commented on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour

2019-09-18 Thread GitBox

SparkQA commented on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort 
behaviour
URL: https://github.com/apache/spark/pull/25728#issuecomment-532698559
 
 
   **[Test build #110913 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110913/testReport)**
 for PR 25728 at commit 
[`7ad574a`](https://github.com/apache/spark/commit/7ad574a4df899be1f66be70ef955af83b8440ac0).



[GitHub] [spark] SparkQA commented on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local file

2019-09-18 Thread GitBox

SparkQA commented on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local 
file
URL: https://github.com/apache/spark/pull/25690#issuecomment-532698517
 
 
   **[Test build #110914 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110914/testReport)**
 for PR 25690 at commit 
[`36c394c`](https://github.com/apache/spark/commit/36c394ce08e6cac1e32176c684eac0c9d1615831).



[GitHub] [spark] maropu commented on a change in pull request #25830: [SPARK-29140][SQL] Handle BinaryType of parameter properly in HashAggregateExec

2019-09-18 Thread GitBox

maropu commented on a change in pull request #25830: [SPARK-29140][SQL] Handle 
BinaryType of parameter properly in HashAggregateExec
URL: https://github.com/apache/spark/pull/25830#discussion_r325693305
 
 

 ##
 File path: sql/core/src/test/scala/org/apache/spark/sql/execution/aggregate/HashAggregateSuite.scala
 ##
 @@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.aggregate
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.functions._
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.test.SharedSparkSession
+import org.apache.spark.sql.types._
+
+class HashAggregateSuite extends SharedSparkSession {
+
+  import testImplicits._
+
+  test("SPARK-29140 HashAggregateExec aggregating binary type doesn't break codegen compilation") {
+    val withDistinct = countDistinct($"c1")
+
+    val schema = new StructType().add("c1", BinaryType, nullable = true)
+    val schemaWithId = StructType(StructField("id", IntegerType, nullable = false) +: schema.fields)
+
+    withSQLConf(
+        SQLConf.CODEGEN_SPLIT_AGGREGATE_FUNC.key -> "true",
+        SQLConf.CODEGEN_METHOD_SPLIT_THRESHOLD.key -> "1") {
+      val emptyRows = spark.sparkContext.parallelize(Seq.empty[Row], 1)
+      val aggDf = spark.createDataFrame(emptyRows, schemaWithId)
+        .groupBy($"id" % 10 as "group")
+        .agg(withDistinct)
+        .orderBy("group")
+      aggDf.collect().toSeq
 
 Review comment:
   Please check the result.



[GitHub] [spark] maropu commented on a change in pull request #25830: [SPARK-29140][SQL] Handle BinaryType of parameter properly in HashAggregateExec

2019-09-18 Thread GitBox

maropu commented on a change in pull request #25830: [SPARK-29140][SQL] Handle 
BinaryType of parameter properly in HashAggregateExec
URL: https://github.com/apache/spark/pull/25830#discussion_r325692826
 
 

 ##
 File path: sql/core/src/test/scala/org/apache/spark/sql/execution/aggregate/HashAggregateSuite.scala
 ##
 @@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.aggregate
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.functions._
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.test.SharedSparkSession
+import org.apache.spark.sql.types._
+
+class HashAggregateSuite extends SharedSparkSession {
+
+  import testImplicits._
+
+  test("SPARK-29140 HashAggregateExec aggregating binary type doesn't break codegen compilation") {
+    val withDistinct = countDistinct($"c1")
 
 Review comment:
   Move to `AggregationQuerySuite`?



[GitHub] [spark] wangyum commented on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local file

2019-09-18 Thread GitBox

wangyum commented on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local 
file
URL: https://github.com/apache/spark/pull/25690#issuecomment-532698008
 
 
   retest this please



[GitHub] [spark] HeartSaVioR commented on issue #25811: [SPARK-29111][CORE] Support snapshot/restore on KVStore

2019-09-18 Thread GitBox

HeartSaVioR commented on issue #25811: [SPARK-29111][CORE] Support 
snapshot/restore on KVStore
URL: https://github.com/apache/spark/pull/25811#issuecomment-532697915
 
 
   retest this, please



[GitHub] [spark] HeartSaVioR commented on issue #25811: [SPARK-29111][CORE] Support snapshot/restore on KVStore

2019-09-18 Thread GitBox

HeartSaVioR commented on issue #25811: [SPARK-29111][CORE] Support 
snapshot/restore on KVStore
URL: https://github.com/apache/spark/pull/25811#issuecomment-532697859
 
 
   Known flaky test: SPARK-23197. Not relevant to this patch.



[GitHub] [spark] AmplabJenkins commented on issue #25814: [SPARK-19926][PYSPARK] make captured exception from JVM side user friendly

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25814: [SPARK-19926][PYSPARK] make captured 
exception from JVM side user friendly
URL: https://github.com/apache/spark/pull/25814#issuecomment-532697186
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25814: [SPARK-19926][PYSPARK] make captured exception from JVM side user friendly

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25814: [SPARK-19926][PYSPARK] make 
captured exception from JVM side user friendly
URL: https://github.com/apache/spark/pull/25814#issuecomment-532697186
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25814: [SPARK-19926][PYSPARK] make captured exception from JVM side user friendly

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25814: [SPARK-19926][PYSPARK] make 
captured exception from JVM side user friendly
URL: https://github.com/apache/spark/pull/25814#issuecomment-532697198
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110910/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25814: [SPARK-19926][PYSPARK] make captured exception from JVM side user friendly

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25814: [SPARK-19926][PYSPARK] make captured 
exception from JVM side user friendly
URL: https://github.com/apache/spark/pull/25814#issuecomment-532697198
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110910/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #25814: [SPARK-19926][PYSPARK] make captured exception from JVM side user friendly

2019-09-18 Thread GitBox

SparkQA removed a comment on issue #25814: [SPARK-19926][PYSPARK] make captured 
exception from JVM side user friendly
URL: https://github.com/apache/spark/pull/25814#issuecomment-532685386
 
 
   **[Test build #110910 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110910/testReport)**
 for PR 25814 at commit 
[`9e4e7e9`](https://github.com/apache/spark/commit/9e4e7e98cf4cca4248afb81c34168eef63fbd5cf).



[GitHub] [spark] SparkQA commented on issue #25814: [SPARK-19926][PYSPARK] make captured exception from JVM side user friendly

2019-09-18 Thread GitBox

SparkQA commented on issue #25814: [SPARK-19926][PYSPARK] make captured 
exception from JVM side user friendly
URL: https://github.com/apache/spark/pull/25814#issuecomment-532696727
 
 
   **[Test build #110910 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110910/testReport)**
 for PR 25814 at commit 
[`9e4e7e9`](https://github.com/apache/spark/commit/9e4e7e98cf4cca4248afb81c34168eef63fbd5cf).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local file

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local 
file
URL: https://github.com/apache/spark/pull/25690#issuecomment-532695137
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110874/
   Test FAILed.



[GitHub] [spark] xuanyuanking commented on a change in pull request #25768: [SPARK-29063][SQL] Modify fillValue approach to support joined dataframe

2019-09-18 Thread GitBox

xuanyuanking commented on a change in pull request #25768: [SPARK-29063][SQL] 
Modify fillValue approach to support joined dataframe
URL: https://github.com/apache/spark/pull/25768#discussion_r325689461
 
 

 ##
 File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameNaFunctions.scala
 ##
 @@ -497,12 +497,10 @@ final class DataFrameNaFunctions private[sql](df: DataFrame) {
           throw new IllegalArgumentException(s"$targetType is not matched at fillValue")
       }
       // Only fill if the column is part of the cols list.
-      if (typeMatches && cols.exists(col => columnEquals(f.name, col))) {
-        fillCol[T](f, value)
-      } else {
-        df.col(f.name)
-      }
+      typeMatches && cols.exists(col => columnEquals(f.name, col))
+    }.map { col =>
+      (col.name, fillCol[T](col, value))
     }
-    df.select(projections : _*)
+    df.withColumns(fillColumnsInfo.map(_._1), fillColumnsInfo.map(_._2))
 
 Review comment:
   Yes, in the new approach we only pass in the columns found in the existing 
fields, and `withColumns` replaces those existing columns while preserving 
their original order.
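
The ordering claim can be illustrated without Spark. The following is a minimal sketch in plain Scala, assuming that replacing a column by name leaves it in its original position. `ReplaceColumnsSketch`, `replaceColumns`, and the sample column names and expressions are hypothetical stand-ins for illustration only; they are not part of Spark's API and do not reproduce the code in this pull request.

```scala
object ReplaceColumnsSketch {
  // Hypothetical helper, not Spark's API: a "schema" is an ordered list of
  // (column name, expression) pairs. Columns named in `replacements` get the
  // new expression; every column stays at its original position.
  def replaceColumns(
      columns: Seq[(String, String)],
      replacements: Map[String, String]): Seq[(String, String)] = {
    columns.map { case (name, expr) =>
      name -> replacements.getOrElse(name, expr)
    }
  }

  def main(args: Array[String]): Unit = {
    val original = Seq("id" -> "id", "a" -> "a", "b" -> "b")
    // Only a column that already exists is passed in, as described above.
    val filled = replaceColumns(original, Map("b" -> "coalesce(b, 0)"))
    // The column names come out in the same order as in `original`.
    println(filled.map(_._1).mkString(", "))
  }
}
```

Because only columns that already exist are passed in, nothing is appended and the projection keeps the input order, which is the property the comment above relies on.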



[GitHub] [spark] AmplabJenkins removed a comment on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local file

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local 
file
URL: https://github.com/apache/spark/pull/25690#issuecomment-532695131
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local file

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local 
file
URL: https://github.com/apache/spark/pull/25690#issuecomment-532695137
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/110874/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local file

2019-09-18 Thread GitBox

AmplabJenkins commented on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local 
file
URL: https://github.com/apache/spark/pull/25690#issuecomment-532695131
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA removed a comment on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local file

2019-09-18 Thread GitBox

SparkQA removed a comment on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local 
file
URL: https://github.com/apache/spark/pull/25690#issuecomment-532552916
 
 
   **[Test build #110874 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110874/testReport)** for PR 25690 at commit [`36c394c`](https://github.com/apache/spark/commit/36c394ce08e6cac1e32176c684eac0c9d1615831).



[GitHub] [spark] colinmjj commented on a change in pull request #25759: [SPARK-19147][CORE] Gracefully handle error in task after executor is stopped

2019-09-18 Thread GitBox

colinmjj commented on a change in pull request #25759: [SPARK-19147][CORE] 
Gracefully handle error in task after executor is stopped
URL: https://github.com/apache/spark/pull/25759#discussion_r325688306
 
 

 ##
 File path: core/src/test/scala/org/apache/spark/executor/ExecutorSuite.scala
 ##
 @@ -246,6 +246,46 @@ class ExecutorSuite extends SparkFunSuite
 heartbeatZeroAccumulatorUpdateTest(false)
   }
 
+  test("SPARK-19147: Gracefully handle error in task after executor is stopped") {
 
 Review comment:
   I removed this test case after clarifying how metrics work. Thanks for the review.



[GitHub] [spark] SparkQA commented on issue #25690: [SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local file

2019-09-18 Thread GitBox

SparkQA commented on issue #25690: 
[SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local 
file
URL: https://github.com/apache/spark/pull/25690#issuecomment-532694695
 
 
   **[Test build #110874 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/110874/testReport)** for PR 25690 at commit [`36c394c`](https://github.com/apache/spark/commit/36c394ce08e6cac1e32176c684eac0c9d1615831).
* This patch **fails from timeout after a configured wait of `400m`**.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] colinmjj commented on a change in pull request #25759: [SPARK-19147][CORE] Gracefully handle error in task after executor is stopped

2019-09-18 Thread GitBox

colinmjj commented on a change in pull request #25759: [SPARK-19147][CORE] 
Gracefully handle error in task after executor is stopped
URL: https://github.com/apache/spark/pull/25759#discussion_r325687758
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/executor/Executor.scala
 ##
 @@ -604,6 +604,21 @@ private[spark] class Executor(
   val serializedTK = ser.serialize(TaskKilled(killReason, accUpdates, accums, metricPeaks))
   execBackend.statusUpdate(taskId, TaskState.KILLED, serializedTK)
 
+// When the task is put in the pool, executor.stop may be called before task.run.
+// The task will then throw an exception because of the unexpected status
+// (see SPARK-19147); an exception raised after executor.stop is handled here
+// as an expected exception.
+case t: Throwable if !isLocal && env.isStopped =>
 
 Review comment:
   @srowen @squito, thanks for the comments. I checked the code again and now understand how metrics and heartbeat work. You're right: reporting metrics is meaningless after executor.close(), because the heartbeat won't work.
   I have updated the PR so that the exception is handled in the "case t: Throwable =>" branch with logging only.



[GitHub] [spark] PavithraRamachandran commented on issue #25689: [SPARK-28972][DOCS] Updating unit description in configurations, to maintain consistency

2019-09-18 Thread GitBox

PavithraRamachandran commented on issue #25689: [SPARK-28972][DOCS] Updating 
unit description in configurations, to maintain consistency
URL: https://github.com/apache/spark/pull/25689#issuecomment-532693849
 
 
   @srowen @kiszk @dongjoon-hyun I have reworked the comments. Could you please review?



[GitHub] [spark] AmplabJenkins removed a comment on issue #25404: [SPARK-28683][BUILD][test-hadoop3.2][test-maven] Upgrade Scala to 2.12.10

2019-09-18 Thread GitBox

AmplabJenkins removed a comment on issue #25404: 
[SPARK-28683][BUILD][test-hadoop3.2][test-maven] Upgrade Scala to 2.12.10
URL: https://github.com/apache/spark/pull/25404#issuecomment-532692535
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/16028/
   Test PASSed.


