[GitHub] spark pull request #21061: [SPARK-23914][SQL] Add array_union function

2018-06-19 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21061#discussion_r196662595
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -2355,3 +2355,297 @@ case class ArrayRemove(left: Expression, right: 
Expression)
 
   override def prettyName: String = "array_remove"
 }
+
+object ArraySetLike {
+  def useGenericArrayData(elementSize: Int, length: Int): Boolean = {
+    // Use the same calculation in UnsafeArrayData.fromPrimitiveArray()
+    val headerInBytes = UnsafeArrayData.calculateHeaderPortionInBytes(length)
+    val valueRegionInBytes = elementSize.toLong * length
+    val totalSizeInLongs = (headerInBytes + valueRegionInBytes + 7) / 8
+    totalSizeInLongs > Integer.MAX_VALUE / 8
+  }
+
+  def throwUnionLengthOverflowException(length: Int): Unit = {
+    throw new RuntimeException(s"Unsuccessful try to union arrays with $length " +
+      s"elements due to exceeding the array size limit " +
+      s"${ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH}.")
+  }
+}
+
+
+abstract class ArraySetLike extends BinaryArrayExpressionWithImplicitCast {
+  override def dataType: DataType = left.dataType
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+    val typeCheckResult = super.checkInputDataTypes()
+    if (typeCheckResult.isSuccess) {
+      TypeUtils.checkForOrderingExpr(dataType.asInstanceOf[ArrayType].elementType,
+        s"function $prettyName")
+    } else {
+      typeCheckResult
+    }
+  }
+
+  protected def cn = left.dataType.asInstanceOf[ArrayType].containsNull ||
+    right.dataType.asInstanceOf[ArrayType].containsNull
+
+  @transient protected lazy val ordering: Ordering[Any] =
+    TypeUtils.getInterpretedOrdering(elementType)
+
+  @transient protected lazy val elementTypeSupportEquals = elementType match {
+    case BinaryType => false
+    case _: AtomicType => true
+    case _ => false
+  }
+}
+
+/**
+ * Returns an array of the elements in the union of x and y, without duplicates.
+ */
+@ExpressionDescription(
+  usage = """
+    _FUNC_(array1, array2) - Returns an array of the elements in the union of array1 and array2,
+      without duplicates.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), array(1, 3, 5));
+       array(1, 2, 3, 5)
+  """,
+  since = "2.4.0")
+case class ArrayUnion(left: Expression, right: Expression) extends ArraySetLike {
+
+  override def nullSafeEval(input1: Any, input2: Any): Any = {
+    val array1 = input1.asInstanceOf[ArrayData]
+    val array2 = input2.asInstanceOf[ArrayData]
+
+    if (elementTypeSupportEquals && !cn) {
--- End diff --

Why do we need to check `!cn`? Can't we avoid boxing if the arrays contain 
null? How about using `foundNullElement` as `array_distinct` is doing at #21050?
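The `foundNullElement` approach suggested above can be sketched as a minimal Python model (hypothetical, not Spark's implementation): nulls are tracked with a separate flag so the hash set only ever holds non-null values, avoiding any null handling inside the set.

```python
def array_union(xs, ys):
    """Union of two arrays, preserving first-seen order.

    None stands in for SQL null. A separate found_null flag -- the
    foundNullElement pattern from the review comment -- keeps nulls
    out of the hash set entirely.
    """
    seen = set()
    found_null = False
    out = []
    for v in list(xs) + list(ys):
        if v is None:
            if not found_null:
                found_null = True
                out.append(None)
        elif v not in seen:
            seen.add(v)
            out.append(v)
    return out
```

With this shape the fast path does not need a `!cn` guard, which is the point of the question above.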


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21548: [SPARK-24518][CORE] Using Hadoop credential provider API...

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21548
  
**[Test build #92120 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92120/testReport)**
 for PR 21548 at commit 
[`c7ef15e`](https://github.com/apache/spark/commit/c7ef15e47e97e675a63444d0b66b8b8808cccf90).


---




[GitHub] spark issue #21548: [SPARK-24518][CORE] Using Hadoop credential provider API...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21548
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4242/
Test PASSed.


---




[GitHub] spark issue #21548: [SPARK-24518][CORE] Using Hadoop credential provider API...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21548
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21548: [SPARK-24518][CORE] Using Hadoop credential provider API...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21548
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/346/
Test PASSed.


---




[GitHub] spark issue #21548: [SPARK-24518][CORE] Using Hadoop credential provider API...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21548
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #21590: [SPARK-24423][SQL] Add a new option for JDBC sour...

2018-06-19 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/21590#discussion_r196658307
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala
 ---
@@ -65,13 +65,38 @@ class JDBCOptions(
   // Required parameters
   // 
   require(parameters.isDefinedAt(JDBC_URL), s"Option '$JDBC_URL' is required.")
-  require(parameters.isDefinedAt(JDBC_TABLE_NAME), s"Option '$JDBC_TABLE_NAME' is required.")
+
   // a JDBC URL
   val url = parameters(JDBC_URL)
-  // name of table
-  val table = parameters(JDBC_TABLE_NAME)
+  val tableName = parameters.get(JDBC_TABLE_NAME)
--- End diff --

Personally I prefer:
```
  val tableExpression = if (parameters.isDefinedAt(JDBC_TABLE_NAME)) {
require(!parameters.isDefinedAt(JDBC_QUERY_STRING), "...")
parameters.get(JDBC_TABLE_NAME).get.trim
  } else {
require(parameters.isDefinedAt(JDBC_QUERY_STRING), "...")
s"(${parameters.get(JDBC_QUERY_STRING)}) ${curId.getAndIncrement()}"
  }
```
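The mutually-exclusive option resolution being discussed can be modeled with a short Python sketch (a model only; the key names `dbtable` and `query` stand in for `JDBC_TABLE_NAME` / `JDBC_QUERY_STRING`, and the alias format mirrors the PR's `spark_gen_` prefix):

```python
import itertools

# Monotonic id for generated subquery aliases, like curId in the PR.
_gen_id = itertools.count()

def resolve_table_expression(parameters):
    """Resolve a table expression from exactly one of 'dbtable'/'query'.

    Sketch of the validation logic discussed in the review thread,
    not Spark's actual JDBCOptions code.
    """
    table = parameters.get("dbtable")
    query = parameters.get("query")
    # Exactly one of the two options must be present.
    if table is not None and query is not None:
        raise ValueError("Both 'dbtable' and 'query' can not be specified.")
    if table is None and query is None:
        raise ValueError("Option 'dbtable' or 'query' is required.")
    if table is not None:
        return table.strip()
    # Wrap the query as a parenthesized subquery with a generated alias.
    return f"({query}) spark_gen_{next(_gen_id)}"
```

Either structure (the `if/else` above or two `require`s) enforces the same invariant; the difference is only where each failure is reported.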


---




[GitHub] spark pull request #21590: [SPARK-24423][SQL] Add a new option for JDBC sour...

2018-06-19 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/21590#discussion_r196657947
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala
 ---
@@ -65,13 +65,38 @@ class JDBCOptions(
   // Required parameters
   // 
   require(parameters.isDefinedAt(JDBC_URL), s"Option '$JDBC_URL' is required.")
-  require(parameters.isDefinedAt(JDBC_TABLE_NAME), s"Option '$JDBC_TABLE_NAME' is required.")
+
   // a JDBC URL
   val url = parameters(JDBC_URL)
-  // name of table
-  val table = parameters(JDBC_TABLE_NAME)
+  val tableName = parameters.get(JDBC_TABLE_NAME)
+  val query = parameters.get(JDBC_QUERY_STRING)
+  // Following two conditions make sure that :
+  // 1. One of the option (dbtable or query) must be specified.
+  // 2. Both of them can not be specified at the same time as they are conflicting in nature.
+  require(
+    tableName.isDefined || query.isDefined,
+    s"Option '$JDBC_TABLE_NAME' or '${JDBC_QUERY_STRING}' is required."
+  )
+
+  require(
+    !(tableName.isDefined && query.isDefined),
+    s"Both '$JDBC_TABLE_NAME' and '$JDBC_QUERY_STRING' can not be specified."
+  )
+
+  // table name or a table expression.
+  val tableExpression = tableName.map(_.trim).getOrElse {
+    // We have ensured in the code above that either dbtable or query is specified.
+    query.get match {
+      case subq if subq.nonEmpty => s"(${subq}) spark_gen_${curId.getAndIncrement()}"
+      case subq => subq
+    }
+  }
+
+  require(tableExpression.nonEmpty,
--- End diff --

The error check and error message here are confusing. It seems to tell the user 
that the two options can both be specified.
Maybe we should just check whichever option is defined and improve the error message.
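The suggestion above can be sketched in Python: check only the option that was supplied, so each failure message names exactly one option (hypothetical helper and key names, not code from the PR):

```python
def validate_table_or_query(table, query):
    """Return a table expression from whichever option is defined.

    Each error message mentions only the option being validated,
    which is the improvement the review asks for. Sketch only.
    """
    if table is not None and query is not None:
        raise ValueError("Both 'dbtable' and 'query' can not be specified.")
    if table is not None:
        if not table.strip():
            raise ValueError("Option 'dbtable' can not be empty.")
        return table.strip()
    if query is not None:
        if not query.strip():
            raise ValueError("Option 'query' can not be empty.")
        return f"({query.strip()})"
    raise ValueError("Option 'dbtable' or 'query' is required.")
```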




---




[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21061
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21061
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92113/
Test PASSed.


---




[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21061
  
**[Test build #92113 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92113/testReport)**
 for PR 21061 at commit 
[`f64e529`](https://github.com/apache/spark/commit/f64e5292ccc9c709ea56614bf70b1fbb83099625).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21050: [SPARK-23912][SQL]add array_distinct

2018-06-19 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21050#discussion_r196654369
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -2355,3 +2356,281 @@ case class ArrayRemove(left: Expression, right: 
Expression)
 
   override def prettyName: String = "array_remove"
 }
+
+/**
+ * Removes duplicate values from the array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(array) - Removes duplicate values from the array.",
+  examples = """
+Examples:
+  > SELECT _FUNC_(array(1, 2, 3, null, 3));
+   [1,2,3,null]
+  """, since = "2.4.0")
+case class ArrayDistinct(child: Expression)
+  extends UnaryExpression with ExpectsInputTypes {
+
+  override def inputTypes: Seq[AbstractDataType] = Seq(ArrayType)
+
+  override def dataType: DataType = child.dataType
+
+  @transient lazy val elementType: DataType = dataType.asInstanceOf[ArrayType].elementType
+
+  @transient private lazy val ordering: Ordering[Any] =
+TypeUtils.getInterpretedOrdering(elementType)
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+super.checkInputDataTypes() match {
+  case f: TypeCheckResult.TypeCheckFailure => f
+  case TypeCheckResult.TypeCheckSuccess =>
+        TypeUtils.checkForOrderingExpr(elementType, s"function $prettyName")
+}
+  }
+
+  @transient private lazy val elementTypeSupportEquals = elementType match {
+case BinaryType => false
+case _: AtomicType => true
+case _ => false
+  }
+
+  override def nullSafeEval(array: Any): Any = {
+val data = array.asInstanceOf[ArrayData].toArray[AnyRef](elementType)
+if (elementTypeSupportEquals) {
+  new GenericArrayData(data.distinct.asInstanceOf[Array[Any]])
+} else {
+  var foundNullElement = false
+  var pos = 0
+  for(i <- 0 until data.length) {
+if (data(i) == null) {
+  if (!foundNullElement) {
+foundNullElement = true
+pos = pos + 1
+  }
+} else {
+  var j = 0
+  var done = false
+  while (j <= i && !done) {
+if (data(j) != null && ordering.equiv(data(j), data(i))) {
+  done = true
+}
+j = j + 1
+  }
+  if (i == j-1) {
+pos = pos + 1
+  }
+}
+  }
+  new GenericArrayData(data.slice(0, pos))
+}
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+nullSafeCodeGen(ctx, ev, (array) => {
+  val i = ctx.freshName("i")
+  val j = ctx.freshName("j")
+  val sizeOfDistinctArray = ctx.freshName("sizeOfDistinctArray")
+  val getValue1 = CodeGenerator.getValue(array, elementType, i)
+  val getValue2 = CodeGenerator.getValue(array, elementType, j)
+  val foundNullElement = ctx.freshName("foundNullElement")
+  val openHashSet = classOf[OpenHashSet[_]].getName
+  val hs = ctx.freshName("hs")
+  val classTag = s"scala.reflect.ClassTag$$.MODULE$$.Object()"
+  if(elementTypeSupportEquals) {
+s"""
+   |int $sizeOfDistinctArray = 0;
+   |boolean $foundNullElement = false;
+   |$openHashSet $hs = new $openHashSet($classTag);
+   |for (int $i = 0; $i < $array.numElements(); $i++) {
+   |  if ($array.isNullAt($i)) {
+   |$foundNullElement = true;
+   |  } else {
+   |$hs.add($getValue1);
+   |  }
+   |}
+   |$sizeOfDistinctArray = $hs.size() + ($foundNullElement ? 1 : 0);
+   |${genCodeForResult(ctx, ev, array, sizeOfDistinctArray)}
+ """.stripMargin
+  } else {
+s"""
+   |int $sizeOfDistinctArray = 0;
+   |boolean $foundNullElement = false;
+   |for (int $i = 0; $i < $array.numElements(); $i ++) {
+   |  if ($array.isNullAt($i)) {
+   | if (!($foundNullElement)) {
+   |   $sizeOfDistinctArray = $sizeOfDistinctArray + 1;
+   |   $foundNullElement = true;
+   | }
+   |  } else {
+   |int $j;
+   |for ($j = 0; $j < $i; $j++) {
+   |  if (!$array.isNullAt($j) && ${ctx.genEqual(elementType, getValue1, getValue2)}) {
+   |break;
+   |  }
+   |}
+   |if ($i == $j) {
+   | $sizeOfDistinctArray = $sizeOfDistinctArray + 1;
+   |}
+   |  }
+   |}
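The interpreted (non-codegen) `nullSafeEval` path quoted above can be modeled in Python — `None` stands in for SQL null, and the linear rescan mirrors the quadratic branch used when hash-based equality is not applicable (a model, not Spark code):

```python
def array_distinct(data):
    """Order-preserving distinct keeping at most one null.

    Mirrors the quadratic nullSafeEval branch in the diff above:
    each non-null element is compared against the elements already
    kept, and nulls are collapsed via a single flag.
    """
    out = []
    found_null = False
    for v in data:
        if v is None:
            if not found_null:
                found_null = True
                out.append(None)
        else:
            # Linear scan over the elements kept so far.
            if not any(u is not None and u == v for u in out):
                out.append(v)
    return out
```

This reproduces the documented behavior of the SQL example, e.g. `array_distinct(array(1, 2, 3, null, 3))` yields `[1, 2, 3, null]`.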
 

[GitHub] spark pull request #21050: [SPARK-23912][SQL]add array_distinct

2018-06-19 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21050#discussion_r196654348
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -2355,3 +2356,281 @@ case class ArrayRemove(left: Expression, right: 
Expression)
 
   override def prettyName: String = "array_remove"
 }
+
+/**
+ * Removes duplicate values from the array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(array) - Removes duplicate values from the array.",
+  examples = """
+Examples:
+  > SELECT _FUNC_(array(1, 2, 3, null, 3));
+   [1,2,3,null]
+  """, since = "2.4.0")
+case class ArrayDistinct(child: Expression)
+  extends UnaryExpression with ExpectsInputTypes {
+
+  override def inputTypes: Seq[AbstractDataType] = Seq(ArrayType)
+
+  override def dataType: DataType = child.dataType
+
+  @transient lazy val elementType: DataType = dataType.asInstanceOf[ArrayType].elementType
+
+  @transient private lazy val ordering: Ordering[Any] =
+TypeUtils.getInterpretedOrdering(elementType)
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+super.checkInputDataTypes() match {
+  case f: TypeCheckResult.TypeCheckFailure => f
+  case TypeCheckResult.TypeCheckSuccess =>
+        TypeUtils.checkForOrderingExpr(elementType, s"function $prettyName")
+}
+  }
+
+  @transient private lazy val elementTypeSupportEquals = elementType match {
+case BinaryType => false
+case _: AtomicType => true
+case _ => false
+  }
+
+  override def nullSafeEval(array: Any): Any = {
+val data = array.asInstanceOf[ArrayData].toArray[AnyRef](elementType)
+if (elementTypeSupportEquals) {
+  new GenericArrayData(data.distinct.asInstanceOf[Array[Any]])
+} else {
+  var foundNullElement = false
+  var pos = 0
+  for(i <- 0 until data.length) {
+if (data(i) == null) {
+  if (!foundNullElement) {
+foundNullElement = true
+pos = pos + 1
+  }
+} else {
+  var j = 0
+  var done = false
+  while (j <= i && !done) {
+if (data(j) != null && ordering.equiv(data(j), data(i))) {
+  done = true
+}
+j = j + 1
+  }
+  if (i == j-1) {
+pos = pos + 1
+  }
+}
+  }
+  new GenericArrayData(data.slice(0, pos))
+}
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+nullSafeCodeGen(ctx, ev, (array) => {
+  val i = ctx.freshName("i")
+  val j = ctx.freshName("j")
+  val sizeOfDistinctArray = ctx.freshName("sizeOfDistinctArray")
+  val getValue1 = CodeGenerator.getValue(array, elementType, i)
+  val getValue2 = CodeGenerator.getValue(array, elementType, j)
+  val foundNullElement = ctx.freshName("foundNullElement")
+  val openHashSet = classOf[OpenHashSet[_]].getName
+  val hs = ctx.freshName("hs")
+  val classTag = s"scala.reflect.ClassTag$$.MODULE$$.Object()"
+  if(elementTypeSupportEquals) {
+s"""
+   |int $sizeOfDistinctArray = 0;
+   |boolean $foundNullElement = false;
+   |$openHashSet $hs = new $openHashSet($classTag);
+   |for (int $i = 0; $i < $array.numElements(); $i++) {
+   |  if ($array.isNullAt($i)) {
+   |$foundNullElement = true;
+   |  } else {
+   |$hs.add($getValue1);
+   |  }
+   |}
+   |$sizeOfDistinctArray = $hs.size() + ($foundNullElement ? 1 : 0);
+   |${genCodeForResult(ctx, ev, array, sizeOfDistinctArray)}
+ """.stripMargin
+  } else {
+s"""
+   |int $sizeOfDistinctArray = 0;
+   |boolean $foundNullElement = false;
+   |for (int $i = 0; $i < $array.numElements(); $i ++) {
+   |  if ($array.isNullAt($i)) {
+   | if (!($foundNullElement)) {
+   |   $sizeOfDistinctArray = $sizeOfDistinctArray + 1;
+   |   $foundNullElement = true;
+   | }
+   |  } else {
+   |int $j;
+   |for ($j = 0; $j < $i; $j++) {
+   |  if (!$array.isNullAt($j) && ${ctx.genEqual(elementType, getValue1, getValue2)}) {
+   |break;
+   |  }
+   |}
+   |if ($i == $j) {
+   | $sizeOfDistinctArray = $sizeOfDistinctArray + 1;
+   |}
+   |  }
+   |}
 

[GitHub] spark pull request #21050: [SPARK-23912][SQL]add array_distinct

2018-06-19 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21050#discussion_r196654935
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -2355,3 +2356,281 @@ case class ArrayRemove(left: Expression, right: 
Expression)
 
   override def prettyName: String = "array_remove"
 }
+
+/**
+ * Removes duplicate values from the array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(array) - Removes duplicate values from the array.",
+  examples = """
+Examples:
+  > SELECT _FUNC_(array(1, 2, 3, null, 3));
+   [1,2,3,null]
+  """, since = "2.4.0")
+case class ArrayDistinct(child: Expression)
+  extends UnaryExpression with ExpectsInputTypes {
+
+  override def inputTypes: Seq[AbstractDataType] = Seq(ArrayType)
+
+  override def dataType: DataType = child.dataType
+
+  @transient lazy val elementType: DataType = dataType.asInstanceOf[ArrayType].elementType
+
+  @transient private lazy val ordering: Ordering[Any] =
+TypeUtils.getInterpretedOrdering(elementType)
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+super.checkInputDataTypes() match {
+  case f: TypeCheckResult.TypeCheckFailure => f
+  case TypeCheckResult.TypeCheckSuccess =>
+        TypeUtils.checkForOrderingExpr(elementType, s"function $prettyName")
+}
+  }
+
+  @transient private lazy val elementTypeSupportEquals = elementType match {
+case BinaryType => false
+case _: AtomicType => true
+case _ => false
+  }
+
+  override def nullSafeEval(array: Any): Any = {
+val data = array.asInstanceOf[ArrayData].toArray[AnyRef](elementType)
+if (elementTypeSupportEquals) {
+  new GenericArrayData(data.distinct.asInstanceOf[Array[Any]])
+} else {
+  var foundNullElement = false
+  var pos = 0
+  for(i <- 0 until data.length) {
--- End diff --

nit: `for (`?


---




[GitHub] spark pull request #21050: [SPARK-23912][SQL]add array_distinct

2018-06-19 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21050#discussion_r196654756
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -2355,3 +2356,281 @@ case class ArrayRemove(left: Expression, right: 
Expression)
 
   override def prettyName: String = "array_remove"
 }
+
+/**
+ * Removes duplicate values from the array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(array) - Removes duplicate values from the array.",
+  examples = """
+Examples:
+  > SELECT _FUNC_(array(1, 2, 3, null, 3));
+   [1,2,3,null]
+  """, since = "2.4.0")
+case class ArrayDistinct(child: Expression)
+  extends UnaryExpression with ExpectsInputTypes {
+
+  override def inputTypes: Seq[AbstractDataType] = Seq(ArrayType)
+
+  override def dataType: DataType = child.dataType
+
+  @transient lazy val elementType: DataType = dataType.asInstanceOf[ArrayType].elementType
+
+  @transient private lazy val ordering: Ordering[Any] =
+TypeUtils.getInterpretedOrdering(elementType)
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+super.checkInputDataTypes() match {
+  case f: TypeCheckResult.TypeCheckFailure => f
+  case TypeCheckResult.TypeCheckSuccess =>
+        TypeUtils.checkForOrderingExpr(elementType, s"function $prettyName")
+}
+  }
+
+  @transient private lazy val elementTypeSupportEquals = elementType match {
+case BinaryType => false
+case _: AtomicType => true
+case _ => false
+  }
+
+  override def nullSafeEval(array: Any): Any = {
+val data = array.asInstanceOf[ArrayData].toArray[AnyRef](elementType)
+if (elementTypeSupportEquals) {
+  new GenericArrayData(data.distinct.asInstanceOf[Array[Any]])
+} else {
+  var foundNullElement = false
+  var pos = 0
+  for(i <- 0 until data.length) {
+if (data(i) == null) {
+  if (!foundNullElement) {
+foundNullElement = true
+pos = pos + 1
+  }
+} else {
+  var j = 0
+  var done = false
+  while (j <= i && !done) {
+if (data(j) != null && ordering.equiv(data(j), data(i))) {
+  done = true
+}
+j = j + 1
+  }
+  if (i == j-1) {
--- End diff --

nit: `(i == j - 1)`?


---




[GitHub] spark pull request #21050: [SPARK-23912][SQL]add array_distinct

2018-06-19 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21050#discussion_r196654969
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -2355,3 +2356,281 @@ case class ArrayRemove(left: Expression, right: 
Expression)
 
   override def prettyName: String = "array_remove"
 }
+
+/**
+ * Removes duplicate values from the array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(array) - Removes duplicate values from the array.",
+  examples = """
+Examples:
+  > SELECT _FUNC_(array(1, 2, 3, null, 3));
+   [1,2,3,null]
+  """, since = "2.4.0")
+case class ArrayDistinct(child: Expression)
+  extends UnaryExpression with ExpectsInputTypes {
+
+  override def inputTypes: Seq[AbstractDataType] = Seq(ArrayType)
+
+  override def dataType: DataType = child.dataType
+
+  @transient lazy val elementType: DataType = dataType.asInstanceOf[ArrayType].elementType
+
+  @transient private lazy val ordering: Ordering[Any] =
+TypeUtils.getInterpretedOrdering(elementType)
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+super.checkInputDataTypes() match {
+  case f: TypeCheckResult.TypeCheckFailure => f
+  case TypeCheckResult.TypeCheckSuccess =>
+        TypeUtils.checkForOrderingExpr(elementType, s"function $prettyName")
+}
+  }
+
+  @transient private lazy val elementTypeSupportEquals = elementType match {
+case BinaryType => false
+case _: AtomicType => true
+case _ => false
+  }
+
+  override def nullSafeEval(array: Any): Any = {
+val data = array.asInstanceOf[ArrayData].toArray[AnyRef](elementType)
+if (elementTypeSupportEquals) {
+  new GenericArrayData(data.distinct.asInstanceOf[Array[Any]])
+} else {
+  var foundNullElement = false
+  var pos = 0
+  for(i <- 0 until data.length) {
+if (data(i) == null) {
+  if (!foundNullElement) {
+foundNullElement = true
+pos = pos + 1
+  }
+} else {
+  var j = 0
+  var done = false
+  while (j <= i && !done) {
+if (data(j) != null && ordering.equiv(data(j), data(i))) {
+  done = true
+}
+j = j + 1
+  }
+  if (i == j-1) {
+pos = pos + 1
+  }
+}
+  }
+  new GenericArrayData(data.slice(0, pos))
+}
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+nullSafeCodeGen(ctx, ev, (array) => {
+  val i = ctx.freshName("i")
+  val j = ctx.freshName("j")
+  val sizeOfDistinctArray = ctx.freshName("sizeOfDistinctArray")
+  val getValue1 = CodeGenerator.getValue(array, elementType, i)
+  val getValue2 = CodeGenerator.getValue(array, elementType, j)
+  val foundNullElement = ctx.freshName("foundNullElement")
+  val openHashSet = classOf[OpenHashSet[_]].getName
+  val hs = ctx.freshName("hs")
+  val classTag = s"scala.reflect.ClassTag$$.MODULE$$.Object()"
+  if(elementTypeSupportEquals) {
--- End diff --

nit: `if (`?


---




[GitHub] spark issue #21595: [MINOR][SQL] Remove invalid comment from SparkStrategies

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21595
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92111/
Test PASSed.


---




[GitHub] spark issue #21595: [MINOR][SQL] Remove invalid comment from SparkStrategies

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21595
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21587: [SPARK-24588][SS] streaming join should require HashClus...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21587
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4241/
Test PASSed.


---




[GitHub] spark issue #21595: [MINOR][SQL] Remove invalid comment from SparkStrategies

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21595
  
**[Test build #92111 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92111/testReport)**
 for PR 21595 at commit 
[`8afb36b`](https://github.com/apache/spark/commit/8afb36b20aab1bbd1f6a5cf902aef7e0c04c8353).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21587: [SPARK-24588][SS] streaming join should require HashClus...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21587
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21587: [SPARK-24588][SS] streaming join should require HashClus...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21587
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21590: [SPARK-24423][SQL] Add a new option for JDBC sources

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21590
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21590: [SPARK-24423][SQL] Add a new option for JDBC sources

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21590
  
**[Test build #92118 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92118/testReport)**
 for PR 21590 at commit 
[`8920793`](https://github.com/apache/spark/commit/8920793d480de76e3cbe9d25f66e624ad6183503).


---




[GitHub] spark issue #21590: [SPARK-24423][SQL] Add a new option for JDBC sources

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21590
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21587: [SPARK-24588][SS] streaming join should require HashClus...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21587
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/345/
Test PASSed.


---




[GitHub] spark issue #21587: [SPARK-24588][SS] streaming join should require HashClus...

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21587
  
**[Test build #92119 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92119/testReport)**
 for PR 21587 at commit 
[`08da2e6`](https://github.com/apache/spark/commit/08da2e6717ce4693fcaf33d021513f478f33d2a4).


---




[GitHub] spark issue #21590: [SPARK-24423][SQL] Add a new option for JDBC sources

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21590
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/344/
Test PASSed.


---




[GitHub] spark issue #21590: [SPARK-24423][SQL] Add a new option for JDBC sources

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21590
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4240/
Test PASSed.


---




[GitHub] spark issue #21587: [SPARK-24588][SS] streaming join should require HashClus...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21587
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/343/
Test PASSed.


---




[GitHub] spark issue #21587: [SPARK-24588][SS] streaming join should require HashClus...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21587
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #21333: [SPARK-23778][CORE] Avoid unneeded shuffle when u...

2018-06-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21333


---




[GitHub] spark issue #21587: [SPARK-24588][SS] streaming join should require HashClus...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21587
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21587: [SPARK-24588][SS] streaming join should require HashClus...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21587
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4239/
Test PASSed.


---




[GitHub] spark pull request #21587: [SPARK-24588][SS] streaming join should require H...

2018-06-19 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21587#discussion_r196652632
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/DistributionSuite.scala
 ---
@@ -41,34 +41,127 @@ class DistributionSuite extends SparkFunSuite {
 }
   }
 
-  test("HashPartitioning (with nullSafe = true) is the output 
partitioning") {
-// Cases which do not need an exchange between two data properties.
+  test("UnspecifiedDistribution and AllTuples") {
--- End diff --

I've reorganized this test suite and added a bunch of new test cases to 
improve the test coverage.


---




[GitHub] spark pull request #21587: [SPARK-24588][SS] streaming join should require H...

2018-06-19 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/21587#discussion_r196652542
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala
 ---
@@ -186,9 +180,8 @@ case class RoundRobinPartitioning(numPartitions: Int) 
extends Partitioning
 case object SinglePartition extends Partitioning {
   val numPartitions = 1
 
-  override def satisfies(required: Distribution): Boolean = required match 
{
+  override def satisfies0(required: Distribution): Boolean = required 
match {
--- End diff --

added in the base class


---




[GitHub] spark issue #21333: [SPARK-23778][CORE] Avoid unneeded shuffle when union ge...

2018-06-19 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/21333
  
thanks, merging to master!


---




[GitHub] spark issue #21542: [SPARK-24529][Build][test-maven] Add spotbugs into maven...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21542
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21587: [SPARK-24588][SS] streaming join should require HashClus...

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21587
  
**[Test build #92117 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92117/testReport)**
 for PR 21587 at commit 
[`0795e40`](https://github.com/apache/spark/commit/0795e405398278e0cf86fd06d7cae675194c197d).


---




[GitHub] spark issue #21542: [SPARK-24529][Build][test-maven] Add spotbugs into maven...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21542
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4238/
Test PASSed.


---




[GitHub] spark issue #21542: [SPARK-24529][Build][test-maven] Add spotbugs into maven...

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21542
  
**[Test build #92116 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92116/testReport)**
 for PR 21542 at commit 
[`75c9339`](https://github.com/apache/spark/commit/75c9339c948239c4b77899169bd5ac0484e523a1).


---




[GitHub] spark issue #21542: [SPARK-24529][Build][test-maven] Add spotbugs into maven...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21542
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/342/
Test PASSed.


---




[GitHub] spark issue #21542: [SPARK-24529][Build][test-maven] Add spotbugs into maven...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21542
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21542: [SPARK-24529][Build][test-maven] Add spotbugs into maven...

2018-06-19 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21542
  
retest this please


---




[GitHub] spark pull request #21581: [SPARK-24574][SQL] array_contains, array_position...

2018-06-19 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/21581#discussion_r196649798
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -3082,7 +3082,10 @@ object functions {
* @since 1.5.0
*/
   def array_contains(column: Column, value: Any): Column = withExpr {
-ArrayContains(column.expr, Literal(value))
+value match {
--- End diff --

+1


---




[GitHub] spark pull request #21588: [WIP][SPARK-24590][BUILD] Make Jenkins tests pass...

2018-06-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21588#discussion_r196646202
  
--- Diff: pom.xml ---
@@ -123,7 +123,7 @@
 1.6.0
 3.4.6
 2.6.0
-org.spark-project.hive
+com.github.hyukjinkwon
--- End diff --

I am going to revert this change and the changes in `dev/run-tests.py` 
back to `org.spark-project.hive` once the tests pass. I redistributed 
`org.spark-project.hive` with a one-liner fix for Hadoop 3 support in order to 
use it in this PR.


---




[GitHub] spark pull request #21588: [WIP][SPARK-24590][BUILD] Make Jenkins tests pass...

2018-06-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21588#discussion_r196646070
  
--- Diff: project/SparkBuild.scala ---
@@ -464,7 +464,20 @@ object DockerIntegrationTests {
  */
 object DependencyOverrides {
   lazy val settings = Seq(
-dependencyOverrides += "com.google.guava" % "guava" % "14.0.1")
+dependencyOverrides += "com.google.guava" % "guava" % "14.0.1",
+dependencyOverrides += "com.fasterxml.jackson.core" % 
"jackson-annotations" % "2.6.7",
+dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-core" % 
"2.6.7",
+dependencyOverrides += "com.fasterxml.jackson.core" % 
"jackson-module-jaxb-annotations" % "2.6.7",
+dependencyOverrides += "com.fasterxml.jackson.core" % 
"jackson-databind" % "2.6.7")
+}
+
+/**
+  * Exclusions to work around sbt's dependency resolution being different 
from Maven's.
+  */
+object ExcludeDependencies {
+  lazy val settings = Seq(
+excludeDependencies += "com.fasterxml.jackson.jaxrs" % 
"jackson-jaxrs-json-provider",
+excludeDependencies += "javax.ws.rs" % "jsr311-api")
--- End diff --

I think this also should have been excluded by Jersey. It seems to be a 
difference between Maven and SBT, if I am not mistaken.


---




[GitHub] spark pull request #21588: [WIP][SPARK-24590][BUILD] Make Jenkins tests pass...

2018-06-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21588#discussion_r196645986
  
--- Diff: project/SparkBuild.scala ---
@@ -464,7 +464,20 @@ object DockerIntegrationTests {
  */
 object DependencyOverrides {
   lazy val settings = Seq(
-dependencyOverrides += "com.google.guava" % "guava" % "14.0.1")
+dependencyOverrides += "com.google.guava" % "guava" % "14.0.1",
+dependencyOverrides += "com.fasterxml.jackson.core" % 
"jackson-annotations" % "2.6.7",
+dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-core" % 
"2.6.7",
+dependencyOverrides += "com.fasterxml.jackson.core" % 
"jackson-module-jaxb-annotations" % "2.6.7",
+dependencyOverrides += "com.fasterxml.jackson.core" % 
"jackson-databind" % "2.6.7")
--- End diff --

These look like they come from `jackson-jaxrs-json-provider`, where the 
resolution appears to differ between Maven and SBT. I had to manually override 
and exclude them.


---




[GitHub] spark pull request #21588: [WIP][SPARK-24590][BUILD] Make Jenkins tests pass...

2018-06-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21588#discussion_r196645845
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/client/HiveVersionSuite.scala 
---
@@ -26,7 +27,6 @@ import org.apache.spark.sql.hive.HiveUtils
 
 private[client] abstract class HiveVersionSuite(version: String) extends 
SparkFunSuite {
   override protected val enableAutoThreadAudit = false
-  protected var client: HiveClient = null
--- End diff --

This was only used in `HiveClientSuite.scala`.


---




[GitHub] spark issue #21588: [WIP][SPARK-24590][BUILD] Make Jenkins tests passed with...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21588
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21588: [WIP][SPARK-24590][BUILD] Make Jenkins tests passed with...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21588
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21588: [WIP][SPARK-24590][BUILD] Make Jenkins tests passed with...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21588
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/341/
Test PASSed.


---




[GitHub] spark issue #21588: [WIP][SPARK-24590][BUILD] Make Jenkins tests passed with...

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21588
  
**[Test build #92115 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92115/testReport)**
 for PR 21588 at commit 
[`aec2e71`](https://github.com/apache/spark/commit/aec2e710e7fbb349e7c59c466452ed7aab2e7ca8).


---




[GitHub] spark issue #21588: [WIP][SPARK-24590][BUILD] Make Jenkins tests passed with...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21588
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4237/
Test PASSed.


---




[GitHub] spark issue #21542: [SPARK-24529][Build][test-maven] Add spotbugs into maven...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21542
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92112/
Test FAILed.


---




[GitHub] spark issue #21542: [SPARK-24529][Build][test-maven] Add spotbugs into maven...

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21542
  
**[Test build #92112 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92112/testReport)**
 for PR 21542 at commit 
[`75c9339`](https://github.com/apache/spark/commit/75c9339c948239c4b77899169bd5ac0484e523a1).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21542: [SPARK-24529][Build][test-maven] Add spotbugs into maven...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21542
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21594: [SPARK-24596][SQL] Non-cascading Cache Invalidation

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21594
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92108/
Test PASSed.


---




[GitHub] spark issue #21594: [SPARK-24596][SQL] Non-cascading Cache Invalidation

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21594
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21594: [SPARK-24596][SQL] Non-cascading Cache Invalidation

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21594
  
**[Test build #92108 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92108/testReport)**
 for PR 21594 at commit 
[`4171062`](https://github.com/apache/spark/commit/417106260f329e4de9b0371084f688be435943ce).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21577: [SPARK-24589][core] Correctly identify tasks in output c...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21577
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21577: [SPARK-24589][core] Correctly identify tasks in output c...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21577
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92109/
Test FAILed.


---




[GitHub] spark issue #21577: [SPARK-24589][core] Correctly identify tasks in output c...

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21577
  
**[Test build #92109 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92109/testReport)**
 for PR 21577 at commit 
[`264c533`](https://github.com/apache/spark/commit/264c533737410786faae24df8cb5b27218f804cd).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21593: [SPARK-24578][Core] Cap sub-region's size of retu...

2018-06-19 Thread squito
Github user squito commented on a diff in the pull request:

https://github.com/apache/spark/pull/21593#discussion_r196637144
  
--- Diff: 
common/network-common/src/main/java/org/apache/spark/network/protocol/MessageWithHeader.java
 ---
@@ -137,30 +137,15 @@ protected void deallocate() {
   }
 
   private int copyByteBuf(ByteBuf buf, WritableByteChannel target) throws 
IOException {
-ByteBuffer buffer = buf.nioBuffer();
-int written = (buffer.remaining() <= NIO_BUFFER_LIMIT) ?
-  target.write(buffer) : writeNioBuffer(target, buffer);
+// SPARK-24578: cap the sub-region's size of returned nio buffer to 
improve the performance
+// for the case that the passed-in buffer has too many components.
+int length = Math.min(buf.readableBytes(), NIO_BUFFER_LIMIT);
+ByteBuffer buffer = buf.nioBuffer(buf.readerIndex(), length);
--- End diff --

I think you can go one step further here, and call `buf.nioBuffers(int, 
int)` (plural) 

https://github.com/netty/netty/blob/4.1/buffer/src/main/java/io/netty/buffer/ByteBuf.java#L2355

that will avoid the copying required to create the merged buffer (though 
it's a bit complicated, as you have to check for incomplete writes from any 
single `target.write()` call).

Also OK to leave this for now as this is a pretty important fix.
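As an illustration of the capping-and-partial-write idea discussed here, a minimal standalone sketch using only `java.nio` (this is not Spark's actual `MessageWithHeader` code and does not use netty; the class, method, and limit value are hypothetical):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

public class CappedCopy {
    // Cap each sub-region's size; tiny here just to exercise the loop.
    static final int NIO_BUFFER_LIMIT = 4;

    // Copy `src` to `target` in slices of at most NIO_BUFFER_LIMIT bytes,
    // advancing by however many bytes the channel actually accepted, so an
    // incomplete write from a single target.write() call is handled.
    static int copyCapped(ByteBuffer src, WritableByteChannel target) throws IOException {
        int total = 0;
        while (src.hasRemaining()) {
            int length = Math.min(src.remaining(), NIO_BUFFER_LIMIT);
            // Expose only a capped view of the source buffer to the channel.
            ByteBuffer slice = src.duplicate();
            slice.limit(slice.position() + length);
            int written = target.write(slice);
            src.position(src.position() + written);
            total += written;
        }
        return total;
    }

    // Convenience wrapper for the demo: returns "<bytesWritten>:<payload>".
    static String demo(String s) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int n = copyCapped(ByteBuffer.wrap(s.getBytes()), Channels.newChannel(out));
            return n + ":" + out.toString();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("hello world"));  // 11:hello world
    }
}
```

The same loop shape would apply when iterating over the buffers returned by `nioBuffers(int, int)`: after each `write()`, advance only by the bytes actually written before moving on.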


---




[GitHub] spark pull request #21581: [SPARK-24574][SQL] array_contains, array_position...

2018-06-19 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21581#discussion_r196636915
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -3082,7 +3082,10 @@ object functions {
* @since 1.5.0
*/
   def array_contains(column: Column, value: Any): Column = withExpr {
-ArrayContains(column.expr, Literal(value))
+value match {
--- End diff --

Yes, I think so.


---




[GitHub] spark pull request #21581: [SPARK-24574][SQL] array_contains, array_position...

2018-06-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21581#discussion_r196636085
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -3082,7 +3082,10 @@ object functions {
* @since 1.5.0
*/
   def array_contains(column: Column, value: Any): Column = withExpr {
-ArrayContains(column.expr, Literal(value))
+value match {
--- End diff --

Yup, that's what I was thinking too from my glance.


---




[GitHub] spark issue #21593: [SPARK-24578][Core] Cap sub-region's size of returned ni...

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21593
  
**[Test build #92114 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92114/testReport)**
 for PR 21593 at commit 
[`a30d4de`](https://github.com/apache/spark/commit/a30d4de019ac4380cf5bfd36ff0cf12ef72d78f7).


---




[GitHub] spark pull request #21575: [SPARK-24566][CORE] spark.storage.blockManagerSla...

2018-06-19 Thread xueyumusic
Github user xueyumusic commented on a diff in the pull request:

https://github.com/apache/spark/pull/21575#discussion_r196635402
  
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -75,16 +76,18 @@ private[spark] class HeartbeatReceiver(sc: 
SparkContext, clock: Clock)
   // "spark.network.timeout" uses "seconds", while 
`spark.storage.blockManagerSlaveTimeoutMs` uses
   // "milliseconds"
   private val slaveTimeoutMs =
-sc.conf.getTimeAsMs("spark.storage.blockManagerSlaveTimeoutMs", "120s")
+sc.conf.getTimeAsMs("spark.storage.blockManagerSlaveTimeoutMs",
--- End diff --

I looked at this carefully, and I think you are right, thanks @jiangxb1987. One 
case that is not relevant to this PR is the following: set 
spark.storage.blockManagerSlaveTimeoutMs=900ms and do not configure 
spark.network.timeout; then `executorTimeoutMs` will be 0, since 
getTimeAsSeconds loses precision for milliseconds. Such a config may not be 
reasonable. If we need a fix, how about ensuring the value is > 0, or making 
executorTimeoutMs's minimum value 1? @jiangxb1987 @zsxwing
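The precision loss described above can be reproduced with a small standalone sketch (the `getTimeAsSeconds` helper below is a simplified stand-in for the suffix-parsing behavior, not Spark's actual `JavaUtils` implementation):

```java
public class TimeoutPrecision {
    // Simplified time-string parser: convert a value with an "ms" or "s"
    // suffix to whole seconds. Integer division truncates sub-second values,
    // which is the precision loss discussed above.
    static long getTimeAsSeconds(String s) {
        long ms;
        if (s.endsWith("ms")) {
            ms = Long.parseLong(s.substring(0, s.length() - 2));
        } else if (s.endsWith("s")) {
            ms = Long.parseLong(s.substring(0, s.length() - 1)) * 1000L;
        } else {
            ms = Long.parseLong(s);  // assume milliseconds when no suffix
        }
        return ms / 1000L;  // "900ms" -> 0: the fractional second is dropped
    }

    public static void main(String[] args) {
        System.out.println(getTimeAsSeconds("900ms"));  // 0
        System.out.println(getTimeAsSeconds("120s"));   // 120
        // One possible guard along the lines suggested above: floor at 1 second.
        long executorTimeoutSec = Math.max(getTimeAsSeconds("900ms"), 1L);
        System.out.println(executorTimeoutSec);         // 1
    }
}
```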


---




[GitHub] spark issue #21593: [SPARK-24578][Core] Cap sub-region's size of returned ni...

2018-06-19 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/21593
  
Jenkins, ok to test


---




[GitHub] spark issue #21577: [SPARK-24589][core] Correctly identify tasks in output c...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21577
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92106/
Test PASSed.


---




[GitHub] spark issue #21577: [SPARK-24589][core] Correctly identify tasks in output c...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21577
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21577: [SPARK-24589][core] Correctly identify tasks in output c...

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21577
  
**[Test build #92106 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92106/testReport)**
 for PR 21577 at commit 
[`5ece2f1`](https://github.com/apache/spark/commit/5ece2f12a820d6438146758f0e944f3b1c70d489).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21590: [SPARK-24423][SQL] Add a new option for JDBC sour...

2018-06-19 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/21590#discussion_r196634511
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala 
---
@@ -1206,4 +1207,92 @@ class JDBCSuite extends SparkFunSuite
 }.getMessage
 assert(errMsg.contains("Statement was canceled or the session timed 
out"))
   }
+
+  test("query JDBC option - negative tests") {
+val query = "SELECT * FROM  test.people WHERE theid = 1"
+// load path
+val e1 = intercept[RuntimeException] {
+  val df = spark.read.format("jdbc")
+.option("Url", urlWithUserAndPass)
+.option("query", query)
--- End diff --

@HyukjinKwon Thanks.. I will update the doc.


---




[GitHub] spark issue #21594: [SPARK-24596][SQL] Non-cascading Cache Invalidation

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21594
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #21581: [SPARK-24574][SQL] array_contains, array_position...

2018-06-19 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21581#discussion_r196633968
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala ---
@@ -571,6 +571,10 @@ class DataFrameFunctionsSuite extends QueryTest with 
SharedSQLContext {
   df.selectExpr("array_contains(a, 1)"),
   Seq(Row(true), Row(false))
 )
+checkAnswer(
+  df.select(array_contains(df("a"), df("c"))),
+  Seq(Row(true), Row(false))
+)
--- End diff --

Can you add another test that uses `selectExpr`, e.g., 
`df.selectExpr("array_contains(a, c)")`?


---




[GitHub] spark issue #21594: [SPARK-24596][SQL] Non-cascading Cache Invalidation

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21594
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92107/
Test PASSed.


---




[GitHub] spark pull request #21581: [SPARK-24574][SQL] array_contains, array_position...

2018-06-19 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21581#discussion_r196633908
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -3082,7 +3082,10 @@ object functions {
* @since 1.5.0
*/
   def array_contains(column: Column, value: Any): Column = withExpr {
-ArrayContains(column.expr, Literal(value))
+value match {
--- End diff --

On second thought, should we use `lit()` instead, e.g., 
`ArrayContains(column.expr, lit(value).expr)`? WDYT? @viirya @maropu 
@HyukjinKwon


---




[GitHub] spark issue #21594: [SPARK-24596][SQL] Non-cascading Cache Invalidation

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21594
  
**[Test build #92107 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92107/testReport)**
 for PR 21594 at commit 
[`b9f1507`](https://github.com/apache/spark/commit/b9f1507c737c1edb87df9b03bee61218ada42307).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21061
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21061
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4236/
Test PASSed.


---




[GitHub] spark issue #21542: [SPARK-24529][Build][test-maven] Add spotbugs into maven...

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21542
  
**[Test build #92112 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92112/testReport)**
 for PR 21542 at commit 
[`75c9339`](https://github.com/apache/spark/commit/75c9339c948239c4b77899169bd5ac0484e523a1).


---




[GitHub] spark issue #21542: [SPARK-24529][Build][test-maven] Add spotbugs into maven...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21542
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21542: [SPARK-24529][Build][test-maven] Add spotbugs into maven...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21542
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4235/
Test PASSed.


---




[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21061
  
**[Test build #92113 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92113/testReport)**
 for PR 21061 at commit 
[`f64e529`](https://github.com/apache/spark/commit/f64e5292ccc9c709ea56614bf70b1fbb83099625).


---




[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21061
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21061
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/340/
Test PASSed.


---




[GitHub] spark issue #21542: [SPARK-24529][Build][test-maven] Add spotbugs into maven...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21542
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21542: [SPARK-24529][Build][test-maven] Add spotbugs into maven...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21542
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/339/
Test PASSed.


---




[GitHub] spark issue #21567: [SPARK-24560][CORE][MESOS] Fix some getTimeAsMs as getTi...

2018-06-19 Thread xueyumusic
Github user xueyumusic commented on the issue:

https://github.com/apache/spark/pull/21567
  
I see. Thanks for your review and guidance, @jiangxb1987 @maropu. I will 
try to add the related config to the docs and close this PR. Thank you.


---




[GitHub] spark issue #21495: [SPARK-24418][Build] Upgrade Scala to 2.11.12 and 2.12.6

2018-06-19 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/21495
  
Is there any work left, or is everything already done? @dbtsai 


---




[GitHub] spark issue #21594: [SPARK-24596][SQL] Non-cascading Cache Invalidation

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21594
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21594: [SPARK-24596][SQL] Non-cascading Cache Invalidation

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21594
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92105/
Test PASSed.


---




[GitHub] spark issue #21555: [SPARK-24547][K8S] Allow for building spark on k8s docke...

2018-06-19 Thread ifilonenko
Github user ifilonenko commented on the issue:

https://github.com/apache/spark/pull/21555
  
@mccheah Good note. Since this is a blocker for other PRs, it is probably best 
to refactor `docker-image-tool.sh` in a separate PR, as that is not the 
focus of this PR. 


---




[GitHub] spark issue #21594: [SPARK-24596][SQL] Non-cascading Cache Invalidation

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21594
  
**[Test build #92105 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92105/testReport)**
 for PR 21594 at commit 
[`71b93ed`](https://github.com/apache/spark/commit/71b93ed598833d760955e972894685c089af297b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21595: [MINOR][SQL] Remove invalid comment from SparkStrategies

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21595
  
**[Test build #92111 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92111/testReport)**
 for PR 21595 at commit 
[`8afb36b`](https://github.com/apache/spark/commit/8afb36b20aab1bbd1f6a5cf902aef7e0c04c8353).


---




[GitHub] spark issue #21555: [SPARK-24547][K8S] Allow for building spark on k8s docke...

2018-06-19 Thread mccheah
Github user mccheah commented on the issue:

https://github.com/apache/spark/pull/21555
  
This change makes using the tool force building and pushing both the 
Python and non-Python images, but what if the user wants to build only one to 
save time? I can imagine that being the case in something like a dev-CI 
workflow. Ideally the tool would let the user be selective and manage either 
image or both.
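A minimal sketch of the kind of selectivity described above (the `images_for_targets` helper and its target names are hypothetical; the real `docker-image-tool.sh` options may differ):

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a comma-separated target list to the image names
# that would be built, so a caller can build only what it needs.
images_for_targets() {
  local targets="$1" images=""
  IFS=',' read -ra parts <<< "$targets"
  for t in "${parts[@]}"; do
    case "$t" in
      base)   images="$images spark" ;;
      python) images="$images spark-py" ;;
      *)      echo "unknown target: $t" >&2; return 1 ;;
    esac
  done
  echo "${images# }"
}

images_for_targets base          # -> spark
images_for_targets base,python   # -> spark spark-py
```

A dev-CI workflow could then pass `base` alone to skip the Python image entirely.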


---




[GitHub] spark issue #21595: [MINOR][SQL] Remove invalid comment from SparkStrategies

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21595
  
Can one of the admins verify this patch?


---







[GitHub] spark issue #21388: [SPARK-24336][SQL] Support 'pass through' transformation...

2018-06-19 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/spark/pull/21388
  
I just provided a new patch to remove the comment, since it looks like it is no 
longer the preferred option.
https://github.com/apache/spark/pull/21595

Closing this one.


---




[GitHub] spark pull request #21388: [SPARK-24336][SQL] Support 'pass through' transfo...

2018-06-19 Thread HeartSaVioR
Github user HeartSaVioR closed the pull request at:

https://github.com/apache/spark/pull/21388


---



