[GitHub] spark issue #21866: [SPARK-24768][FollowUp][SQL]Avro migration followup: cha...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21866 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1288/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistics to improve ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16677 **[Test build #93502 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93502/testReport)** for PR 16677 at commit [`d05c144`](https://github.com/apache/spark/commit/d05c144aecdd57f4ee3d179a240ccafa6c02bb66). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21118: SPARK-23325: Use InternalRow when reading with DataSourc...
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/21118 Thanks for reviewing and merging @cloud-fan, @gatorsmile, @felixcheung!
[GitHub] spark issue #21850: [SPARK-24892] [SQL] Simplify `CaseWhen` to `If` when the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21850 Merged build finished. Test PASSed.
[GitHub] spark issue #21854: [SPARK-24896][SQL] Uuid should produce different values ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21854 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1267/ Test PASSed.
[GitHub] spark issue #21854: [SPARK-24896][SQL] Uuid should produce different values ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21854 Merged build finished. Test PASSed.
[GitHub] spark issue #20699: [SPARK-23544][SQL]Remove redundancy ShuffleExchange in t...
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/20699 cc @HyukjinKwon
[GitHub] spark issue #21850: [SPARK-24892] [SQL] Simplify `CaseWhen` to `If` when the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21850 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1268/ Test PASSed.
[GitHub] spark issue #21858: [SPARK-24899][SQL][DOC] Add example of monotonically_inc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21858 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1270/ Test PASSed.
[GitHub] spark issue #21858: [SPARK-24899][SQL][DOC] Add example of monotonically_inc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21858 Merged build finished. Test PASSed.
[GitHub] spark issue #21858: [SPARK-24899][SQL][DOC] Add example of monotonically_inc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21858 **[Test build #93494 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93494/testReport)** for PR 21858 at commit [`29def00`](https://github.com/apache/spark/commit/29def0069d96ca449204ad27e8c66ca2a218ce84).
[GitHub] spark pull request #21859: [SPARK-24900][SQL]speed up sort when the dataset ...
GitHub user sddyljsx opened a pull request: https://github.com/apache/spark/pull/21859 [SPARK-24900][SQL] Speed up sort when the dataset is small ## What changes were proposed in this pull request? When running SQL like 'select * from order where order_status = 4 order by order_id', the file scan and filter will be executed twice, which may take a long time. If the final dataset is small and the sample data covers all the data, there is no need to do so. ## How was this patch tested? (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) (If this patch involves UI changes, please attach a screenshot; otherwise, remove this) Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sddyljsx/spark order-optimization Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21859.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21859 commit dd50783d638ca5804531061c0a8aef2c8fef9dc1 Author: neal Date: 2018-07-24T07:26:58Z speed up sort when the dataset is small
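The "executed twice" the PR describes comes from sorting over range partitioning, which first samples the data to pick partition bounds. A minimal, hypothetical Python sketch (not Spark's actual implementation; the function name and shape are illustrative) of picking range bounds from a sample:

```python
def range_bounds(sample, num_partitions):
    """Illustrative sketch: pick range-partition cut points from a
    sample of the data by sorting the sample and taking evenly
    spaced elements. This sampling pass is the extra scan the PR
    tries to avoid when the dataset is small."""
    s = sorted(sample)
    step = len(s) / num_partitions
    return [s[int(step * i)] for i in range(1, num_partitions)]

print(range_bounds([5, 1, 9, 3, 7, 2, 8, 4], 4))  # [3, 5, 8]
```

If the sample already contains every row, the bounds step has effectively read the whole dataset once before the real scan, which is the redundancy the PR targets.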
[GitHub] spark issue #21859: [SPARK-24900][SQL]speed up sort when the dataset is smal...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21859 Can one of the admins verify this patch?
[GitHub] spark issue #21860: [SPARK-24901][SQL]Merge the codegen of RegularHashMap an...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21860 Can one of the admins verify this patch?
[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistics to improve ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16677 Merged build finished. Test FAILed.
[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistics to improve ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16677 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93486/ Test FAILed.
[GitHub] spark pull request #21857: [SPARK-21274][SQL] Implement EXCEPT ALL clause.
Github user dilipbiswal commented on a diff in the pull request: https://github.com/apache/spark/pull/21857#discussion_r204679789 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala --- @@ -222,6 +222,37 @@ case class Stack(children: Seq[Expression]) extends Generator { } } +/** + * Replicate the row N times. N is specified as the first argument to the function. + * This is a internal function solely used by optimizer to rewrite EXCEPT ALL AND + * INTERSECT ALL queries. + */ +@ExpressionDescription( --- End diff -- @HyukjinKwon OK..
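The `ReplicateRows` generator in the diff above supports a rewrite of EXCEPT ALL into a plan that computes a per-row count and replicates each surviving row that many times. A hedged Python sketch of the multiset semantics the rewrite must preserve (illustrative only, not Spark's plan-level implementation):

```python
from collections import Counter

def except_all(left, right):
    """EXCEPT ALL multiset semantics: each distinct row from the
    left side survives max(count_left - count_right, 0) times.
    The optimizer rewrite computes this count and then replicates
    each surviving row that many times."""
    right_counts = Counter(right)
    result = []
    for row, n in Counter(left).items():
        result.extend([row] * max(n - right_counts[row], 0))
    return result

print(except_all([1, 1, 2, 3], [1, 2, 2]))  # [1, 3]
```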
[GitHub] spark issue #21802: [SPARK-23928][SQL] Add shuffle collection function.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21802 **[Test build #93493 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93493/testReport)** for PR 21802 at commit [`c56ecc5`](https://github.com/apache/spark/commit/c56ecc5b727b03734a5bd7917bae14b07d09ad7d).
[GitHub] spark issue #21853: [SPARK-23957][SQL] Sorts in subqueries are redundant and...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21853 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93485/ Test PASSed.
[GitHub] spark issue #21857: [SPARK-21274][SQL] Implement EXCEPT ALL clause.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21857 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1271/ Test PASSed.
[GitHub] spark issue #21857: [SPARK-21274][SQL] Implement EXCEPT ALL clause.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21857 Merged build finished. Test PASSed.
[GitHub] spark issue #21802: [SPARK-23928][SQL] Add shuffle collection function.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21802 Merged build finished. Test PASSed.
[GitHub] spark issue #21802: [SPARK-23928][SQL] Add shuffle collection function.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21802 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1269/ Test PASSed.
[GitHub] spark pull request #21860: [SPARK-24901][SQL]Merge the codegen of RegularHas...
GitHub user heary-cao opened a pull request: https://github.com/apache/spark/pull/21860 [SPARK-24901][SQL]Merge the codegen of RegularHashMap and fastHashMap to reduce compiler maxCodesize when VectorizedHashMap is false. ## What changes were proposed in this pull request? Currently, the generated code that updates the UnsafeRow in hash aggregation keeps FastHashMap and RegularHashMap as two separate code paths. These two separate paths are only needed when VectorizedHashMap is true; in other cases, we can merge them together to reduce the compiler maxCodesize. thanks. case class DistinctAgg(a: Int, b: Float, c: Double, d: Int, e: String) spark.sparkContext.parallelize( DistinctAgg(8, 2, 3, 4, "a") :: DistinctAgg(9, 3, 4, 5, "b") :: Nil).toDF().createOrReplaceTempView("distinctAgg") val df = sql("select a,b,e, min(d) as mind, min(case when a > 10 then a else null end) as mincasea, min(a) as mina from distinctAgg group by a, b, e") println(org.apache.spark.sql.execution.debug.codegenString(df.queryExecution.executedPlan)) df.show() Generated code like: Before modified: Generated code: /* 001 */ public Object generate(Object[] references) { /* 002 */ return new GeneratedIteratorForCodegenStage1(references); /* 003 */ } /* 004 */ ... /* 354 */ /* 355 */ if (agg_fastAggBuffer_0 != null) { /* 356 */ // common sub-expressions /* 357 */ /* 358 */ // evaluate aggregate function /* 359 */ agg_agg_isNull_31_0 = true; /* 360 */ int agg_value_34 = -1; /* 361 */ /* 362 */ boolean agg_isNull_32 = agg_fastAggBuffer_0.isNullAt(0); /* 363 */ int agg_value_35 = agg_isNull_32 ?
/* 364 */ -1 : (agg_fastAggBuffer_0.getInt(0)); /* 365 */ /* 366 */ if (!agg_isNull_32 && (agg_agg_isNull_31_0 || /* 367 */ agg_value_34 > agg_value_35)) { /* 368 */ agg_agg_isNull_31_0 = false; /* 369 */ agg_value_34 = agg_value_35; /* 370 */ } /* 371 */ /* 372 */ if (!false && (agg_agg_isNull_31_0 || /* 373 */ agg_value_34 > agg_expr_2_0)) { /* 374 */ agg_agg_isNull_31_0 = false; /* 375 */ agg_value_34 = agg_expr_2_0; /* 376 */ } /* 377 */ agg_agg_isNull_34_0 = true; /* 378 */ int agg_value_37 = -1; /* 379 */ /* 380 */ boolean agg_isNull_35 = agg_fastAggBuffer_0.isNullAt(1); /* 381 */ int agg_value_38 = agg_isNull_35 ? /* 382 */ -1 : (agg_fastAggBuffer_0.getInt(1)); /* 383 */ /* 384 */ if (!agg_isNull_35 && (agg_agg_isNull_34_0 || /* 385 */ agg_value_37 > agg_value_38)) { /* 386 */ agg_agg_isNull_34_0 = false; /* 387 */ agg_value_37 = agg_value_38; /* 388 */ } /* 389 */ /* 390 */ byte agg_caseWhenResultState_1 = -1; /* 391 */ do { /* 392 */ boolean agg_value_40 = false; /* 393 */ agg_value_40 = agg_expr_0_0 > 10; /* 394 */ if (!false && agg_value_40) { /* 395 */ agg_caseWhenResultState_1 = (byte)(false ? 1 : 0); /* 396 */ agg_agg_value_39_0 = agg_expr_0_0; /* 397 */ continue; /* 398 */ } /* 399 */ /* 400 */ agg_caseWhenResultState_1 = (byte)(true ? 1 : 0); /* 401 */ agg_agg_value_39_0 = -1; /* 402 */ /* 403 */ } while (false); /* 404 */ // TRUE if any condition is met and the result is null, or no any condition is met. /* 405 */ final boolean agg_isNull_36 = (agg_caseWhenResultState_1 != 0); /* 406 */ /* 407 */ if (!agg_isNull_36 && (agg_agg_isNull_34_0 || /* 408 */ agg_value_37 > agg_agg_value_39_0)) { /* 409 */ agg_agg_isNull_34_0 = false; /* 410 */ agg_value_37 = agg_agg_value_39_0; /* 411 */ } /* 412 */ agg_agg_isNull_42_0 = true; /* 413 */ int agg_value_45 = -1; /* 414 */ /* 415 */ boolean agg_isNull_43 = agg_fastAggBuffer_0.isNullAt(2); /* 416 */ int agg_value_46 = agg_isNull_43 ? 
/* 417 */ -1 : (agg_fastAggBuffer_0.getInt(2)); /* 418 */ /* 419 */ if (!agg_isNull_43 && (agg_agg_isNull_42_0 || /* 420 */ agg_value_45 > agg_value_46)) { /* 421 */ agg_agg_isNull_42_0 = false; /* 422 */ agg_value_45 = agg_value_46; /* 423 */ } /* 424 */ /* 425 */ if (!false && (agg_agg_isNull_42_0 || /* 426 */ agg_value_45 > agg_expr_0_0)) { /* 427 */ agg_agg_isNull_42_0 = false; /* 428 */ agg_value_45 = agg_expr_0_0; /* 429 */ } /* 430 */ // update fast row /* 431 */ agg_fastAggBuffer_0.setInt(0, agg_value_34); /* 432 */ /* 433 */ if (!agg_agg_isNull_34_0) { /* 434 */ agg_fastAggBuffer_0.setInt(1,
[GitHub] spark issue #21854: [SPARK-24896][SQL] Uuid should produce different values ...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21854 retest this please
[GitHub] spark pull request #21857: [SPARK-21274][SQL] Implement EXCEPT ALL clause.
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21857#discussion_r204677899 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala --- @@ -222,6 +222,37 @@ case class Stack(children: Seq[Expression]) extends Generator { } } +/** + * Replicate the row N times. N is specified as the first argument to the function. + * This is a internal function solely used by optimizer to rewrite EXCEPT ALL AND + * INTERSECT ALL queries. + */ +@ExpressionDescription( --- End diff -- If it's for an internal purpose, you can just remove this though.
[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongToUnsafeRowMap in ex...
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/21772 Jenkins, test this please
[GitHub] spark pull request #21858: [SPARK-24899][SQL][DOC] Add example of monotonica...
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/21858 [SPARK-24899][SQL][DOC] Add example of monotonically_increasing_id standard function to scaladoc ## What changes were proposed in this pull request? Example of `monotonically_increasing_id` standard function (with how it works internally) in scaladoc ## How was this patch tested? Local build. Waiting for Jenkins You can merge this pull request into a Git repository by running: $ git pull https://github.com/jaceklaskowski/spark SPARK-24899-monotonically_increasing_id Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21858.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21858 commit 29def0069d96ca449204ad27e8c66ca2a218ce84 Author: Jacek Laskowski Date: 2018-07-24T09:34:49Z [SPARK-24899][SQL][DOC] Add example of monotonically_increasing_id standard function to scaladoc
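For context on the "how it works internally" part of this PR: the IDs produced by `monotonically_increasing_id` put the partition ID in the upper 31 bits and the per-partition record number in the lower 33 bits. A small Python sketch of that documented layout (the function name here is illustrative, not a Spark API):

```python
def monotonically_increasing_id_for(partition_id, row_in_partition):
    """Sketch of the documented ID layout: upper 31 bits hold the
    partition ID, lower 33 bits hold the record number within the
    partition. IDs are guaranteed monotonically increasing and
    unique, but not consecutive across partitions."""
    return (partition_id << 33) | row_in_partition

# The first row of partition 1 starts at 2**33.
print(monotonically_increasing_id_for(1, 0))  # 8589934592
```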
[GitHub] spark pull request #21584: [SPARK-24433][K8S] Initial R Bindings for SparkR ...
Github user ifilonenko commented on a diff in the pull request: https://github.com/apache/spark/pull/21584#discussion_r204713851 --- Diff: resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/R/Dockerfile --- @@ -0,0 +1,29 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +#http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +ARG base_img +FROM $base_img +WORKDIR / +RUN mkdir ${SPARK_HOME}/R +COPY R ${SPARK_HOME}/R + +RUN apk add --no-cache R R-dev + --- End diff -- This is only for Python packaging. R does not have `/root/.cache` when created by alpine.
[GitHub] spark issue #21845: [SPARK-24886][INFRA] Fix the testing script to increase ...
Github user dilipbiswal commented on the issue: https://github.com/apache/spark/pull/21845 @HyukjinKwon I saw the following test run for 11 minutes on Jenkins for one of my PRs. Not sure if it's a transient problem; just thought I should let you know. On the nightly runs, should we have a test that runs for that long? SPARK-22499: Least and greatest should not generate codes beyond 64KB (11 minutes, 38 seconds)
[GitHub] spark issue #21853: [SPARK-23957][SQL] Sorts in subqueries are redundant and...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21853 **[Test build #93485 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93485/testReport)** for PR 21853 at commit [`a86cb9f`](https://github.com/apache/spark/commit/a86cb9f8764ac4962905ee1b8772fec5692d4342). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21853: [SPARK-23957][SQL] Sorts in subqueries are redundant and...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21853 Merged build finished. Test PASSed.
[GitHub] spark issue #21857: [SPARK-21274][SQL] Implement EXCEPT ALL clause.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21857 **[Test build #93495 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93495/testReport)** for PR 21857 at commit [`a6fc341`](https://github.com/apache/spark/commit/a6fc34101261d4627f2c42f5aefc9d377e44e29e).
[GitHub] spark issue #21857: [SPARK-21274][SQL] Implement EXCEPT ALL clause.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21857 **[Test build #93510 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93510/testReport)** for PR 21857 at commit [`c516f78`](https://github.com/apache/spark/commit/c516f788a21f39abc0442e64d7b54b8e76f40043). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21752: [SPARK-24788][SQL] fixed UnresolvedException when toStri...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21752 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93515/ Test PASSed.
[GitHub] spark issue #21752: [SPARK-24788][SQL] fixed UnresolvedException when toStri...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21752 Merged build finished. Test PASSed.
[GitHub] spark issue #21851: [SPARK-24891][SQL] Fix HandleNullInputsForUDF rule
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21851 Merged build finished. Test PASSed.
[GitHub] spark pull request #21850: [SPARK-24892] [SQL] Simplify `CaseWhen` to `If` w...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/21850#discussion_r204933531 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -414,6 +414,9 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper { // these branches can be pruned away val (h, t) = branches.span(_._1 != TrueLiteral) CaseWhen( h :+ t.head, None) + + case CaseWhen((cond, branchValue) :: Nil, elseValue) => +If(cond, branchValue, elseValue.getOrElse(Literal(null, branchValue.dataType))) --- End diff -- Looks like not much difference in terms of performance, but the `If` primitive has more opportunities for further optimization.
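The rule in the diff above rewrites a single-branch `CaseWhen` into `If`, substituting NULL when no ELSE is given. A minimal Python sketch of the equivalence it relies on (illustrative only; real Catalyst expressions evaluate lazily over rows):

```python
def case_when_single(cond, branch_value, else_value=None):
    """CASE WHEN cond THEN branch_value [ELSE else_value] END with a
    single branch is equivalent to IF(cond, branch_value, else_value),
    where a missing ELSE defaults to NULL (modeled as None here)."""
    return branch_value if cond else else_value

print(case_when_single(True, 1, 0))  # 1
print(case_when_single(False, 1))    # None
```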
[GitHub] spark issue #21775: [SPARK-24812][SQL] Last Access Time in the table descrip...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21775 **[Test build #93513 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93513/testReport)** for PR 21775 at commit [`76a34c6`](https://github.com/apache/spark/commit/76a34c6d3c05c3f729be5893210b199ebb6c093c). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21865: [SPARK-24895] Remove spotbugs plugin
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21865 Merged build finished. Test PASSed.
[GitHub] spark pull request #21848: [SPARK-24890] [SQL] Short circuiting the `if` con...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/21848#discussion_r204938763 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -1627,6 +1627,8 @@ case class InitializeJavaBean(beanInstance: Expression, setters: Map[String, Exp case class AssertNotNull(child: Expression, walkedTypePath: Seq[String] = Nil) extends UnaryExpression with NonSQLExpression { + override lazy val deterministic: Boolean = false --- End diff -- Fair. I'll create a followup PR for this.
[GitHub] spark issue #21854: [SPARK-24896][SQL] Uuid should produce different values ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21854 **[Test build #93517 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93517/testReport)** for PR 21854 at commit [`1d629dc`](https://github.com/apache/spark/commit/1d629dc40060578aba16cb56a6ba89f89107e74b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21864: [SPARK-24908][R][style] removing spaces to make lintr ha...
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/21864 LGTM. Merged into master.
[GitHub] spark issue #21865: [SPARK-24895] Remove spotbugs plugin
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/21865 cc @HyukjinKwon @kiszk I will merge this PR once it passes the test.
[GitHub] spark issue #21867: [SPARK-24307][CORE] Add conf to revert to old code.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21867 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1290/ Test PASSed.
[GitHub] spark pull request #21867: [SPARK-24307][CORE] Add conf to revert to old cod...
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/21867 [SPARK-24307][CORE] Add conf to revert to old code. In case there are any issues in converting FileSegmentManagedBuffer to ChunkedByteBuffer, add a conf to go back to old code path. Followup to 7e847646d1f377f46dc3154dea37148d4e557a03 You can merge this pull request into a Git repository by running: $ git pull https://github.com/squito/spark SPARK-24307-p2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21867.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21867 commit bc2ea46b291fe2aea6b9d254dc0fdb4e81f90ebd Author: Imran Rashid Date: 2018-07-24T20:37:26Z [SPARK-24307][CORE] Add conf to revert to old code. In case there are any issues in converting FileSegmentManagedBuffer to ChunkedByteBuffer, add a conf to go back to old code path.
[GitHub] spark issue #21867: [SPARK-24307][CORE] Add conf to revert to old code.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21867 Merged build finished. Test PASSed.
[GitHub] spark pull request #21758: [SPARK-24795][CORE] Implement barrier execution m...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21758#discussion_r204917880 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -359,20 +366,55 @@ private[spark] class TaskSchedulerImpl( // of locality levels so that it gets a chance to launch local tasks on all of them. // NOTE: the preferredLocality order: PROCESS_LOCAL, NODE_LOCAL, NO_PREF, RACK_LOCAL, ANY for (taskSet <- sortedTaskSets) { - var launchedAnyTask = false - var launchedTaskAtCurrentMaxLocality = false - for (currentMaxLocality <- taskSet.myLocalityLevels) { -do { - launchedTaskAtCurrentMaxLocality = resourceOfferSingleTaskSet( -taskSet, currentMaxLocality, shuffledOffers, availableCpus, tasks) - launchedAnyTask |= launchedTaskAtCurrentMaxLocality -} while (launchedTaskAtCurrentMaxLocality) - } - if (!launchedAnyTask) { -taskSet.abortIfCompletelyBlacklisted(hostToExecutors) + // Skip the barrier taskSet if the available slots are less than the number of pending tasks. + if (taskSet.isBarrier && availableSlots < taskSet.numTasks) { --- End diff -- we should probably have a hard failure if DynamicAllocation is enabled until that is properly addressed.
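The check in the diff above enforces barrier execution's all-or-nothing launch: a barrier task set is only offered resources when every one of its tasks can be launched in the same scheduling round. A tiny Python sketch of that gate (illustrative; the scheduler's real logic also handles locality levels and blacklisting):

```python
def can_launch_barrier_stage(available_slots, num_tasks):
    """A barrier task set must launch all of its tasks together, so
    it is skipped for this scheduling round unless there are at
    least as many free slots as pending tasks."""
    return available_slots >= num_tasks

print(can_launch_barrier_stage(4, 5))  # False
print(can_launch_barrier_stage(8, 5))  # True
```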
[GitHub] spark pull request #21758: [SPARK-24795][CORE] Implement barrier execution m...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21758#discussion_r204914384 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDDBarrier.scala --- @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.rdd + +import scala.reflect.ClassTag + +import org.apache.spark.BarrierTaskContext +import org.apache.spark.TaskContext +import org.apache.spark.annotation.{Experimental, Since} + +/** Represents an RDD barrier, which forces Spark to launch tasks of this stage together. */ +class RDDBarrier[T: ClassTag](rdd: RDD[T]) { + + /** + * :: Experimental :: + * Maps partitions together with a provided BarrierTaskContext. + * + * `preservesPartitioning` indicates whether the input function preserves the partitioner, which + * should be `false` unless `rdd` is a pair RDD and the input function doesn't modify the keys. + */ + @Experimental + @Since("2.4.0") + def mapPartitions[S: ClassTag]( --- End diff -- if the only thing you can do on this is `mapPartitions`, is there any particular reason it's divided into two calls `barrier().mapPartitions()`, instead of just `barrierMapPartitions()` or something? Are there more things planned here? 
I can see users expecting to be able to call other functions after `.barrier()`, e.g. `barrier().reduceByKey()` or something. The compiler will help with this, but just wondering if we can make it more obvious. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
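[Editor's note] A minimal sketch of the API shape under discussion (hypothetical usage, not code from this PR; it assumes the form where tasks obtain the context via `BarrierTaskContext.get()`, whereas this PR draft may instead pass the context into the closure):

```scala
import org.apache.spark.{BarrierTaskContext, SparkContext}

// barrier() returns an RDDBarrier, whose only operation is mapPartitions --
// the restriction squito is asking about above.
def barrierStage(sc: SparkContext): Array[Int] = {
  sc.parallelize(1 to 100, numSlices = 4)
    .barrier()                     // RDDBarrier[Int]; no reduceByKey etc. here
    .mapPartitions { iter =>
      val ctx = BarrierTaskContext.get()
      ctx.barrier()                // all 4 tasks rendezvous before proceeding
      iter.map(_ * 2)
    }
    .collect()
}
```

Because `RDDBarrier` exposes only `mapPartitions`, the compiler rejects `barrier().reduceByKey(...)` at the type level, which is the "compiler will help" point above.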
[GitHub] spark pull request #21758: [SPARK-24795][CORE] Implement barrier execution m...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21758#discussion_r204917245 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1647,6 +1647,14 @@ abstract class RDD[T: ClassTag]( } } + /** + * :: Experimental :: + * Indicates that Spark must launch the tasks together for the current stage. + */ + @Experimental + @Since("2.4.0") + def barrier(): RDDBarrier[T] = withScope(new RDDBarrier[T](this)) --- End diff -- barrier scheduling seems to have a very hard requirement that the number of partitions is no greater than the number of available task slots. It seems really hard for users to get this right. E.g., if I just do `sc.textFile(...).barrier().mapPartitions()` the number of partitions is based on the HDFS input splits. I see lots of users getting confused by this -- it'll work sometimes, won't work other times, and they won't know why. Should there be some automatic repartitioning based on cluster resources? Or at least an API which lets users do this? Even `repartition()` isn't great here, because users don't want to think about cluster resources. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
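[Editor's note] To illustrate the workaround users would currently have to write by hand (a hypothetical sketch; the `defaultParallelism`-based slot estimate is only an approximation, not an API proposed in this PR):

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

def barrierFriendly(sc: SparkContext, path: String): RDD[String] = {
  // textFile partitioning follows the HDFS input splits, which the user
  // does not control -- so a downstream barrier() may or may not fit the
  // cluster's task slots on any given run.
  val raw = sc.textFile(path)

  // Manual workaround: cap the partition count at an estimate of the
  // available slots. defaultParallelism is a rough proxy and can be wrong
  // with dynamic allocation or busy executors, which is squito's point.
  val slots = sc.defaultParallelism
  if (raw.getNumPartitions > slots) raw.repartition(slots) else raw
}
```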
[GitHub] spark issue #21752: [SPARK-24788][SQL] fixed UnresolvedException when toStri...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21752 **[Test build #93515 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93515/testReport)** for PR 21752 at commit [`db83c44`](https://github.com/apache/spark/commit/db83c4478cd4077526ced45559a19ba1b84414e0). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class DataFrameAggregateSuite extends QueryTest with SharedSQLContext ` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21851: [SPARK-24891][SQL] Fix HandleNullInputsForUDF rule
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21851 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93514/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21851: [SPARK-24891][SQL] Fix HandleNullInputsForUDF rule
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21851 **[Test build #93514 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93514/testReport)** for PR 21851 at commit [`b499b97`](https://github.com/apache/spark/commit/b499b9727a4cb9cc42149d05a4d54dba2de8bd9e). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class KnowNotNull(child: Expression) extends UnaryExpression ` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21775: [SPARK-24812][SQL] Last Access Time in the table descrip...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21775 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21775: [SPARK-24812][SQL] Last Access Time in the table descrip...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21775 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93513/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongToUnsafeRowMap in ex...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21772 **[Test build #93516 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93516/testReport)** for PR 21772 at commit [`c9ebfd0`](https://github.com/apache/spark/commit/c9ebfd0acdeefa1495b48df84b137ea213b2f7fc). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21865: [SPARK-24895] Remove spotbugs plugin
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21865 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21848 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21848 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1292/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21864: [SPARK-24908][R][style] removing spaces to make l...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21864 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21758: [SPARK-24795][CORE] Implement barrier execution m...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21758#discussion_r204912925 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -359,20 +368,56 @@ private[spark] class TaskSchedulerImpl( // of locality levels so that it gets a chance to launch local tasks on all of them. // NOTE: the preferredLocality order: PROCESS_LOCAL, NODE_LOCAL, NO_PREF, RACK_LOCAL, ANY for (taskSet <- sortedTaskSets) { - var launchedAnyTask = false - var launchedTaskAtCurrentMaxLocality = false - for (currentMaxLocality <- taskSet.myLocalityLevels) { -do { - launchedTaskAtCurrentMaxLocality = resourceOfferSingleTaskSet( -taskSet, currentMaxLocality, shuffledOffers, availableCpus, tasks) - launchedAnyTask |= launchedTaskAtCurrentMaxLocality -} while (launchedTaskAtCurrentMaxLocality) - } - if (!launchedAnyTask) { -taskSet.abortIfCompletelyBlacklisted(hostToExecutors) + // Skip the barrier taskSet if the available slots are less than the number of pending tasks. + if (taskSet.isBarrier && availableSlots < taskSet.numTasks) { +// Skip the launch process. +// TODO SPARK-24819 If the job requires more slots than available (both busy and free +// slots), fail the job on submit. +logInfo(s"Skip current round of resource offers for barrier stage ${taskSet.stageId} " + + s"because the barrier taskSet requires ${taskSet.numTasks} slots, while the total " + + s"number of available slots is ${availableSlots}.") + } else { +var launchedAnyTask = false +var launchedTaskAtCurrentMaxLocality = false +// Record all the executor IDs assigned barrier tasks on. 
+val addresses = ArrayBuffer[String]() +val taskDescs = ArrayBuffer[TaskDescription]() +for (currentMaxLocality <- taskSet.myLocalityLevels) { + do { +launchedTaskAtCurrentMaxLocality = resourceOfferSingleTaskSet(taskSet, + currentMaxLocality, shuffledOffers, availableCpus, tasks, addresses, taskDescs) +launchedAnyTask |= launchedTaskAtCurrentMaxLocality + } while (launchedTaskAtCurrentMaxLocality) +} +if (!launchedAnyTask) { + taskSet.abortIfCompletelyBlacklisted(hostToExecutors) +} +if (launchedAnyTask && taskSet.isBarrier) { + // Check whether the barrier tasks are partially launched. + // TODO SPARK-24818 handle the assert failure case (that can happen when some locality + // requirements are not fulfilled, and we should revert the launched tasks). + require(taskDescs.size == taskSet.numTasks, +s"Skip current round of resource offers for barrier stage ${taskSet.stageId} " + + s"because only ${taskDescs.size} out of a total number of ${taskSet.numTasks} " + + "tasks got resource offers. The resource offers may have been blacklisted or " + + "cannot fulfill task locality requirements.") + + // Update the taskInfos into all the barrier task properties. + val addressesStr = addresses.zip(taskDescs) +// Addresses ordered by partitionId +.sortBy(_._2.partitionId) +.map(_._1) +.mkString(",") + taskDescs.foreach(_.properties.setProperty("addresses", addressesStr)) + + logInfo(s"Successfully scheduled all the ${taskDescs.size} tasks for barrier stage " + +s"${taskSet.stageId}.") +} } } +// TODO SPARK-24823 Cancel a job that contains barrier stage(s) if the barrier tasks don't get +// launched within a configured time. --- End diff -- with concurrently executing jobs, one job could easily cause starvation for the barrier job, right? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21439#discussion_r204915146 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala --- @@ -101,6 +102,17 @@ class JacksonParser( } } + private def makeArrayRootConverter(at: ArrayType): JsonParser => Seq[InternalRow] = { +val elemConverter = makeConverter(at.elementType) +(parser: JsonParser) => parseJsonToken[Seq[InternalRow]](parser, at) { + case START_ARRAY => Seq(InternalRow(convertArray(parser, elemConverter))) + case START_OBJECT if at.elementType.isInstanceOf[StructType] => --- End diff -- Shall we add a comment on top of this `case` to explain it? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
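[Editor's note] For context, a sketch of the behavior the `START_OBJECT` case above enables (hypothetical data and column names, written against the public `from_json` API rather than `JacksonParser` directly):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{ArrayType, IntegerType, StructType}

val spark = SparkSession.builder().master("local[1]").getOrCreate()
import spark.implicits._

val schema = ArrayType(new StructType().add("a", IntegerType))

// START_ARRAY case: a top-level JSON array parsed against an array schema.
Seq("""[{"a": 1}, {"a": 2}]""").toDF("json")
  .select(from_json($"json", schema))

// START_OBJECT case: a single JSON object parsed against an
// array-of-struct schema is wrapped into a one-element array
// instead of being rejected -- the branch viirya asks to document.
Seq("""{"a": 1}""").toDF("json")
  .select(from_json($"json", schema))
```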
[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21439#discussion_r204932903 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala --- @@ -544,34 +544,27 @@ case class JsonToStructs( timeZoneId = None) override def checkInputDataTypes(): TypeCheckResult = nullableSchema match { -case _: StructType | ArrayType(_: StructType, _) | _: MapType => +case _: StructType | _: ArrayType | _: MapType => super.checkInputDataTypes() case _ => TypeCheckResult.TypeCheckFailure( s"Input schema ${nullableSchema.catalogString} must be a struct or an array of structs.") --- End diff -- `or an array of structs.` -> `or an array.` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21865: [SPARK-24895] Remove spotbugs plugin
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21865 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93511/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21865: [SPARK-24895] Remove spotbugs plugin
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21865 **[Test build #93511 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93511/testReport)** for PR 21865 at commit [`af0ecf5`](https://github.com/apache/spark/commit/af0ecf5d39824ed2c0bb0515d9c4ff8651a58f74). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21439#discussion_r204933936 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala --- @@ -544,34 +544,27 @@ case class JsonToStructs( timeZoneId = None) --- End diff -- Please also update the comment of `JsonToStructs`: `Converts an json input string to a [[StructType]] or [[ArrayType]] of [[StructType]]s`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongToUnsafeRowMap in ex...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21772 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93516/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongToUnsafeRowMap in ex...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21772 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21848 **[Test build #93523 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93523/testReport)** for PR 21848 at commit [`b4f1431`](https://github.com/apache/spark/commit/b4f143180adc0196aa16650efc399226b463699f). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21867: [SPARK-24307][CORE] Add conf to revert to old code.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21867 **[Test build #93520 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93520/testReport)** for PR 21867 at commit [`bc2ea46`](https://github.com/apache/spark/commit/bc2ea46b291fe2aea6b9d254dc0fdb4e81f90ebd). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21857: [SPARK-21274][SQL] Implement EXCEPT ALL clause.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21857 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21857: [SPARK-21274][SQL] Implement EXCEPT ALL clause.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21857 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93510/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21863: [SPARK-18874][SQL][FOLLOW-UP] Improvement type mismatche...
Github user dilipbiswal commented on the issue: https://github.com/apache/spark/pull/21863 @gatorsmile Hi Sean, isn't @mgaido91 working in the same area with the IN-subquery PR? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21439#discussion_r204931175 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala --- @@ -136,12 +136,11 @@ class JsonFunctionsSuite extends QueryTest with SharedSQLContext { test("from_json invalid schema") { --- End diff -- Not an invalid schema now. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21850: [SPARK-24892] [SQL] Simplify `CaseWhen` to `If` w...
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/21850#discussion_r204933164 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -414,6 +414,9 @@ object SimplifyConditionals extends Rule[LogicalPlan] with PredicateHelper { // these branches can be pruned away val (h, t) = branches.span(_._1 != TrueLiteral) CaseWhen( h :+ t.head, None) + + case CaseWhen((cond, branchValue) :: Nil, elseValue) => +If(cond, branchValue, elseValue.getOrElse(Literal(null, branchValue.dataType))) --- End diff -- Before: ``` == Parsed Logical Plan == 'Project [CASE WHEN isnull('a) THEN 1 END AS col1#181] +- 'UnresolvedRelation == Optimized Logical Plan == Project [CASE WHEN isnull(a#182) THEN 1 END AS col1#181] +- Relation[a#182] parquet ``` Generated Java code ```java /* 043 */ protected void processNext() throws java.io.IOException { /* 044 */ if (scan_mutableStateArray_1[0] == null) { /* 045 */ scan_nextBatch_0(); /* 046 */ } /* 047 */ while (scan_mutableStateArray_1[0] != null) { /* 048 */ int scan_numRows_0 = scan_mutableStateArray_1[0].numRows(); /* 049 */ int scan_localEnd_0 = scan_numRows_0 - scan_batchIdx_0; /* 050 */ for (int scan_localIdx_0 = 0; scan_localIdx_0 < scan_localEnd_0; scan_localIdx_0++) { /* 051 */ int scan_rowIdx_0 = scan_batchIdx_0 + scan_localIdx_0; /* 052 */ byte project_caseWhenResultState_0 = -1; /* 053 */ do { /* 054 */ boolean scan_isNull_0 = scan_mutableStateArray_2[0].isNullAt(scan_rowIdx_0); /* 055 */ int scan_value_0 = scan_isNull_0 ? -1 : (scan_mutableStateArray_2[0].getInt(scan_rowIdx_0)); /* 056 */ if (!false && scan_isNull_0) { /* 057 */ project_caseWhenResultState_0 = (byte)(false ? 1 : 0); /* 058 */ project_project_value_0_0 = 1; /* 059 */ continue; /* 060 */ } /* 061 */ /* 062 */ } while (false); /* 063 */ // TRUE if any condition is met and the result is null, or no any condition is met. 
/* 064 */ final boolean project_isNull_0 = (project_caseWhenResultState_0 != 0); /* 065 */ scan_mutableStateArray_3[1].reset(); /* 066 */ /* 067 */ scan_mutableStateArray_3[1].zeroOutNullBytes(); /* 068 */ /* 069 */ if (project_isNull_0) { /* 070 */ scan_mutableStateArray_3[1].setNullAt(0); /* 071 */ } else { /* 072 */ scan_mutableStateArray_3[1].write(0, project_project_value_0_0); /* 073 */ } /* 074 */ append((scan_mutableStateArray_3[1].getRow())); /* 075 */ if (shouldStop()) { scan_batchIdx_0 = scan_rowIdx_0 + 1; return; } /* 076 */ } /* 077 */ scan_batchIdx_0 = scan_numRows_0; /* 078 */ scan_mutableStateArray_1[0] = null; /* 079 */ scan_nextBatch_0(); /* 080 */ } /* 081 */ ((org.apache.spark.sql.execution.metric.SQLMetric) references[1] /* scanTime */).add(scan_scanTime_0 / (1000 * 1000)); /* 082 */ scan_scanTime_0 = 0; /* 083 */ } ``` After: ``` == Parsed Logical Plan == 'Project [CASE WHEN isnull('a) THEN 1 END AS b#186] +- 'UnresolvedRelation `td` == Optimized Logical Plan == Project [if (isnull(a#187)) 1 else null AS b#186] +- Relation[a#187,b#188] parquet ``` Generated Java code: ```java /* 042 */ protected void processNext() throws java.io.IOException { /* 043 */ if (scan_mutableStateArray_1[0] == null) { /* 044 */ scan_nextBatch_0(); /* 045 */ } /* 046 */ while (scan_mutableStateArray_1[0] != null) { /* 047 */ int scan_numRows_0 = scan_mutableStateArray_1[0].numRows(); /* 048 */ int scan_localEnd_0 = scan_numRows_0 - scan_batchIdx_0; /* 049 */ for (int scan_localIdx_0 = 0; scan_localIdx_0 < scan_localEnd_0; scan_localIdx_0++) { /* 050 */ int scan_rowIdx_0 = scan_batchIdx_0 + scan_localIdx_0; /* 051 */ boolean scan_isNull_0 = scan_mutableStateArray_2[0].isNullAt(scan_rowIdx_0); /* 052 */ int scan_value_0 = scan_isNull_0 ? 
-1 : (scan_mutableStateArray_2[0].getInt(scan_rowIdx_0)); /* 053 */ boolean project_isNull_0 = false; /* 054 */ int project_value_0 = -1; /* 055 */ if (!false && scan_isNull_0) { /* 056 */ project_isNull_0 = false; /* 057 */ project_value_0 = 1; /* 058 */ } else { /* 059 */ project_isNull_0 = true; /* 060 */ project_value_0 = -1; /* 061 */ } /* 062 */
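[Editor's note] The before/after codegen above can be reproduced with a small end-to-end sketch (hypothetical DataFrame; the exact optimized-plan text varies by version):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, when}

val spark = SparkSession.builder().master("local[1]").getOrCreate()
import spark.implicits._

val df = Seq(Option(1), None).toDF("a")

// A CASE WHEN with a single branch and no ELSE...
val q = df.select(when(col("a").isNull, 1).as("col1"))

// ...which the rule in this PR rewrites to If(isnull(a), 1, null),
// avoiding the caseWhenResultState bookkeeping in the generated code.
q.explain(true)
```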
[GitHub] spark issue #21865: [SPARK-24895] Remove spotbugs plugin
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/21865 lgtm. I am merging this PR to master branch. Then, I will kick off https://amplab.cs.berkeley.edu/jenkins/view/Spark%20Packaging/job/spark-master-maven-snapshots/. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21850: [SPARK-24892] [SQL] Simplify `CaseWhen` to `If` when the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21850 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21803: [SPARK-24849][SPARK-24911][SQL] Converting a value of St...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21803 **[Test build #93522 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93522/testReport)** for PR 21803 at commit [`738e97c`](https://github.com/apache/spark/commit/738e97cdc1801c95b8b9d87ad00c6c8aeaf0f20b). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21850: [SPARK-24892] [SQL] Simplify `CaseWhen` to `If` when the...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21850 **[Test build #93521 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93521/testReport)** for PR 21850 at commit [`59fada7`](https://github.com/apache/spark/commit/59fada75fb59b1c3dabdac0a5d22b35c8f139a44). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21850: [SPARK-24892] [SQL] Simplify `CaseWhen` to `If` when the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21850 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1291/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/21848 @kiszk `trait Stateful extends Nondeterministic`, and this rule will not be invoked when an expression is nondeterministic. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21854: [SPARK-24896][SQL] Uuid should produce different values ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21854 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21854: [SPARK-24896][SQL] Uuid should produce different values ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21854 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93517/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/21848 Here is a followup PR for making `AssertTrue` and `AssertNotNull` `non-deterministic` https://issues.apache.org/jira/browse/SPARK-24913 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21866: [SPARK-24768][FollowUp][SQL]Avro migration followup: cha...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21866 **[Test build #93518 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93518/testReport)** for PR 21866 at commit [`cff6f2a`](https://github.com/apache/spark/commit/cff6f2a0459e8cc4e48f28bde8103ea44ce5a1ab). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21861: [SPARK-24907][WIP] Migrate JDBC DataSource to JDBCDataSo...
Github user tengpeng commented on the issue: https://github.com/apache/spark/pull/21861 @gatorsmile Got you. I will update the implementation after DataSourceV2 API changes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20699: [SPARK-23544][SQL]Remove redundancy ShuffleExchan...
Github user heary-cao closed the pull request at: https://github.com/apache/spark/pull/20699 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21305: [SPARK-24251][SQL] Add AppendData logical plan.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21305 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93519/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21305: [SPARK-24251][SQL] Add AppendData logical plan.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21305 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/21822#discussion_r204957474 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -751,7 +751,8 @@ object TypeCoercion { */ case class ConcatCoercion(conf: SQLConf) extends TypeCoercionRule { -override protected def coerceTypes(plan: LogicalPlan): LogicalPlan = plan transform { case p => +override protected def coerceTypes( + plan: LogicalPlan): LogicalPlan = plan resolveOperatorsDown { case p => --- End diff -- im using a weird wrapping here to minimize the diff. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21867: [SPARK-24307][CORE] Add conf to revert to old code.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21867 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93520/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21866: [SPARK-24768][FollowUp][SQL]Avro migration followup: cha...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21866 **[Test build #93528 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93528/testReport)** for PR 21866 at commit [`cff6f2a`](https://github.com/apache/spark/commit/cff6f2a0459e8cc4e48f28bde8103ea44ce5a1ab). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21867: [SPARK-24307][CORE] Add conf to revert to old cod...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/21867#discussion_r204959300 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -731,7 +731,14 @@ private[spark] class BlockManager( } if (data != null) { -return Some(ChunkedByteBuffer.fromManagedBuffer(data, chunkSize)) +// SPARK-24307 undocumented "escape-hatch" in case there are any issues in converting to +// to ChunkedByteBuffer, to go back to old code-path. Can be removed post Spark 2.4 if +// new path is stable. +if (conf.getBoolean("spark.fetchToNioBuffer", false)) { --- End diff -- can we have a better prefix, rather than just spark. ? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21865: [SPARK-24895] Remove spotbugs plugin
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21865 Thank you all. I couldn't foresee this problem. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21542: [SPARK-24529][Build][test-maven] Add spotbugs into maven...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21542 This was reverted in favour of https://github.com/apache/spark/pull/21865 and SPARK-24895 for now. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21851: [SPARK-24891][SQL] Fix HandleNullInputsForUDF rule
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21851 LGTM Thanks! Merged to master --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20949: [SPARK-19018][SQL] Add support for custom encoding on cs...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20949 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org