[GitHub] spark pull request #18324: [SPARK-21045][PYSPARK]Fixed executor blocked beca...
Github user dataknocker commented on a diff in the pull request: https://github.com/apache/spark/pull/18324#discussion_r197355680

--- Diff: python/pyspark/worker.py ---
@@ -177,8 +180,11 @@ def process():
             process()
         except Exception:
             try:
+                exc_info = traceback.format_exc()
+                if isinstance(exc_info, unicode):
+                    exc_info = exc_info.encode('utf-8')
--- End diff --

cc @jiangxb1987
[GitHub] spark pull request #21590: [SPARK-24423][SQL] Add a new option for JDBC sour...
Github user dilipbiswal commented on a diff in the pull request: https://github.com/apache/spark/pull/21590#discussion_r197355161

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala ---
@@ -65,13 +65,38 @@ class JDBCOptions(
   // Required parameters
   //
   require(parameters.isDefinedAt(JDBC_URL), s"Option '$JDBC_URL' is required.")
-  require(parameters.isDefinedAt(JDBC_TABLE_NAME), s"Option '$JDBC_TABLE_NAME' is required.")
+
   // a JDBC URL
   val url = parameters(JDBC_URL)
-  // name of table
-  val table = parameters(JDBC_TABLE_NAME)
+  val tableName = parameters.get(JDBC_TABLE_NAME)
+  val query = parameters.get(JDBC_QUERY_STRING)
--- End diff --

@maropu Thank you for taking the time to think about this thoroughly. A couple of questions/comments:

1) It looks like on the read path we give precedence to dbtable over query. I feel it's better to explicitly disallow this with a clear message in case of ambiguity.

2) The usage of lazy here (especially to trigger errors) makes me a little nervous (see the sketch after this comment). For example, if we wanted to introduce a debug statement to print the variables inside the QueryOptions class, things would not work any more, right? That's the reason I had opted to check for the invalid query option in the write path in the write function itself (i.e., when I am sure of the calling context). Perhaps that's how it's used everywhere, in which case it may be okay to follow the same approach here. I am okay with this. Let's get some opinions from @gatorsmile. Once I have the final set of comments, I will make the changes. Thanks again.
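To illustrate the `lazy val` concern in point 2, a minimal Scala sketch (not the PR's code; `QueryOptions` and the option name are stand-ins): validation hidden behind a `lazy val` fires only when the field is first touched, so an innocent debug print changes when, or whether, the error surfaces.

```scala
class QueryOptions(params: Map[String, String]) {
  // Nothing is checked at construction time; the require-style check
  // runs only when `query` is first forced.
  lazy val query: String = params.getOrElse("query",
    throw new IllegalArgumentException("Option 'query' is required."))
}

object LazyValDemo extends App {
  val opts = new QueryOptions(Map.empty)
  println("constructed without error") // the lazy val was never forced
  // println(s"debug: ${opts.query}")  // uncommenting this line throws
}
```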
[GitHub] spark pull request #21537: [SPARK-24505][SQL] Convert strings in codegen to ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21537#discussion_r197352302

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/javaCode.scala ---
@@ -256,6 +283,22 @@ object EmptyBlock extends Block with Serializable {
   override def + (other: Block): Block = other
 }

+/**
+ * A block inlines all types of input arguments into a string without
+ * tracking any reference of `JavaCode` instances.
+ */
+case class InlineBlock(block: String) extends Block {
+  override val code: String = block
+  override val exprValues: Set[ExprValue] = Set.empty
+
+  override def + (other: Block): Block = other match {
+    case c: CodeBlock => Blocks(Seq(this, c))
+    case i: InlineBlock => InlineBlock(block + i.block)
+    case b: Blocks => Blocks(Seq(this) ++ b.blocks)
--- End diff --

Ok. I will do that PR first. Will ping you on the PR when it's ready.
[GitHub] spark pull request #21537: [SPARK-24505][SQL] Convert strings in codegen to ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21537#discussion_r197352254

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
@@ -1004,26 +1012,29 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String]
   private[this] def castToIntervalCode(from: DataType): CastFunction = from match {
     case StringType =>
       (c, evPrim, evNull) =>
-        s"""$evPrim = CalendarInterval.fromString($c.toString());
+        code"""$evPrim = CalendarInterval.fromString($c.toString());
           if(${evPrim} == null) {
             ${evNull} = true;
           }
         """.stripMargin
   }

-  private[this] def decimalToTimestampCode(d: String): String =
-    s"($d.toBigDecimal().bigDecimal().multiply(new java.math.BigDecimal(100L))).longValue()"
-  private[this] def longToTimeStampCode(l: String): String = s"$l * 100L"
-  private[this] def timestampToIntegerCode(ts: String): String =
-    s"java.lang.Math.floor((double) $ts / 100L)"
-  private[this] def timestampToDoubleCode(ts: String): String = s"$ts / 100.0"
+  private[this] def decimalToTimestampCode(d: ExprValue): Block = {
+    val block = code"new java.math.BigDecimal(100L)"
+    code"($d.toBigDecimal().bigDecimal().multiply($block)).longValue()"
+  }
+  private[this] def longToTimeStampCode(l: ExprValue): Block = code"$l * 100L"
+  private[this] def timestampToIntegerCode(ts: ExprValue): Block =
+    code"java.lang.Math.floor((double) $ts / 100L)"
+  private[this] def timestampToDoubleCode(ts: ExprValue): Block =
+    code"$ts / 100.0"

   private[this] def castToBooleanCode(from: DataType): CastFunction = from match {
     case StringType =>
-      val stringUtils = StringUtils.getClass.getName.stripSuffix("$")
+      val stringUtils = inline"${StringUtils.getClass.getName.stripSuffix("$")}"
--- End diff --

inline is just used as a wrapper for a string, since we disallow silent string interpolation. The content of an inline is expanded into the string of a code block.
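For readers unfamiliar with custom interpolators, a simplified Scala 2 sketch of the mechanism (Spark's actual `inline` in javaCode.scala returns its own `Block`/`ExprValue` types; this stand-in just wraps the rendered string):

```scala
object InlineSketch {
  // Marker type for text spliced into generated code verbatim.
  case class Inline(code: String)

  implicit class InlineHelper(val sc: StringContext) extends AnyVal {
    // inline"..." renders the interpolation and wraps it explicitly,
    // instead of relying on silent string interpolation.
    def inline(args: Any*): Inline = Inline(sc.s(args: _*))
  }

  def main(args: Array[String]): Unit = {
    val cls = "org.apache.spark.sql.catalyst.util.StringUtils"
    println(inline"$cls".code) // prints the class name unchanged
  }
}
```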
[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21061 **[Test build #92201 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92201/testReport)** for PR 21061 at commit [`37cee1f`](https://github.com/apache/spark/commit/37cee1f5b81dcce12026f1db0320a1090b87).
[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21061 Merged build finished. Test PASSed.
[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21061 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/401/ Test PASSed.
[GitHub] spark issue #21608: [SPARK-24626] [SQL] Improve Analyze Table command
Github user maropu commented on the issue: https://github.com/apache/spark/pull/21608 cc: @wzhfy @gatorsmile
[GitHub] spark issue #21608: [SPARK-24626] [SQL] Improve Analyze Table command
Github user maropu commented on the issue: https://github.com/apache/spark/pull/21608 OK, can you put the result in the description? Also, can you make the title more precise? e.g., "Parallelize size computation in ANALYZE command".
[GitHub] spark pull request #21590: [SPARK-24423][SQL] Add a new option for JDBC sour...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/21590#discussion_r197349327

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala ---
@@ -65,13 +65,38 @@ class JDBCOptions(
   // Required parameters
   //
   require(parameters.isDefinedAt(JDBC_URL), s"Option '$JDBC_URL' is required.")
-  require(parameters.isDefinedAt(JDBC_TABLE_NAME), s"Option '$JDBC_TABLE_NAME' is required.")
+
   // a JDBC URL
   val url = parameters(JDBC_URL)
-  // name of table
-  val table = parameters(JDBC_TABLE_NAME)
+  val tableName = parameters.get(JDBC_TABLE_NAME)
+  val query = parameters.get(JDBC_QUERY_STRING)
--- End diff --

Since the `tableName` and `query` variables don't need to be exposed to other classes, can we remove them? Btw, I feel that sharing the `tableName` variable in both the write and read paths makes the code somewhat complicated, so how about splitting the variable into two parts: `tableOrQuery` for reading and `outputName` for writing? e.g., https://github.com/apache/spark/commit/d62372ab0e855c359122609f1805ce83661d510e
[GitHub] spark issue #21611: [SPARK-24569][SQL] Aggregator with output type Option sh...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21611 Merged build finished. Test PASSed.
[GitHub] spark issue #21611: [SPARK-24569][SQL] Aggregator with output type Option sh...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21611 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/400/ Test PASSed.
[GitHub] spark pull request #21611: [SPARK-24569][SQL] Aggregator with output type Op...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21611#discussion_r197348976

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala ---
@@ -148,6 +148,79 @@ object VeryComplexResultAgg extends Aggregator[Row, String, ComplexAggData] {
 }

+case class OptionBooleanData(name: String, isGood: Option[Boolean])
+case class OptionBooleanIntData(name: String, isGood: Option[(Boolean, Int)])
+
+case class OptionBooleanAggregator(colName: String)
+    extends Aggregator[Row, Option[Boolean], Option[Boolean]] {
+
+  override def zero: Option[Boolean] = None
+
+  override def reduce(buffer: Option[Boolean], row: Row): Option[Boolean] = {
+    val index = row.fieldIndex(colName)
+    val value = if (row.isNullAt(index)) {
+      Option.empty[Boolean]
+    } else {
+      Some(row.getBoolean(index))
+    }
+    merge(buffer, value)
+  }
+
+  override def merge(b1: Option[Boolean], b2: Option[Boolean]): Option[Boolean] = {
+    if ((b1.isDefined && b1.get) || (b2.isDefined && b2.get)) {
+      Some(true)
+    } else if (b1.isDefined) {
+      b1
+    } else {
+      b2
+    }
+  }
+
+  override def finish(reduction: Option[Boolean]): Option[Boolean] = reduction
+
+  override def bufferEncoder: Encoder[Option[Boolean]] = OptionalBoolEncoder
+  override def outputEncoder: Encoder[Option[Boolean]] = OptionalBoolEncoder
+
+  def OptionalBoolEncoder: Encoder[Option[Boolean]] = ExpressionEncoder()
+}
+
+case class OptionBooleanIntAggregator(colName: String)
+    extends Aggregator[Row, Option[(Boolean, Int)], Option[(Boolean, Int)]] {
+
+  override def zero: Option[(Boolean, Int)] = None
+
+  override def reduce(buffer: Option[(Boolean, Int)], row: Row): Option[(Boolean, Int)] = {
+    val index = row.fieldIndex(colName)
+    val value = if (row.isNullAt(index)) {
+      Option.empty[(Boolean, Int)]
+    } else {
+      val nestedRow = row.getStruct(index)
+      Some((nestedRow.getBoolean(0), nestedRow.getInt(1)))
+    }
+    merge(buffer, value)
+  }
+
+  override def merge(
+      b1: Option[(Boolean, Int)],
+      b2: Option[(Boolean, Int)]): Option[(Boolean, Int)] = {
+    if ((b1.isDefined && b1.get._1) || (b2.isDefined && b2.get._1)) {
+      val newInt = b1.map(_._2).getOrElse(0) + b2.map(_._2).getOrElse(0)
+      Some((true, newInt))
+    } else if (b1.isDefined) {
+      b1
+    } else {
+      b2
+    }
+  }
+
+  override def finish(reduction: Option[(Boolean, Int)]): Option[(Boolean, Int)] = reduction
+
+  override def bufferEncoder: Encoder[Option[(Boolean, Int)]] = OptionalBoolIntEncoder
+  override def outputEncoder: Encoder[Option[(Boolean, Int)]] = OptionalBoolIntEncoder
+
+  def OptionalBoolIntEncoder: Encoder[Option[(Boolean, Int)]] = ExpressionEncoder(topLevel = false)
--- End diff --

We can create a Dataset like:
```scala
scala> Seq((1, Some(1, 2)), (2, Some(3, 4))).toDS.printSchema
root
 |-- _1: integer (nullable = false)
 |-- _2: struct (nullable = true)
 |    |-- _1: integer (nullable = false)
 |    |-- _2: integer (nullable = false)
```
But we can't use that as the buffer/output encoding here, because the encoder here is not for a top-level object.
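For context, a usage sketch of the aggregator above (assuming a local session; this mirrors how the suite applies a `Row`-typed aggregator to a relational `groupBy`, but it is not code from the PR):

```scala
import org.apache.spark.sql.SparkSession

object OptionBooleanAggDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("agg-demo").getOrCreate()
    import spark.implicits._

    val df = Seq(
      OptionBooleanData("bob", Some(true)),
      OptionBooleanData("bob", Some(false)),
      OptionBooleanData("ann", None)).toDF()

    val result = df.groupBy("name")
      .agg(OptionBooleanAggregator("isGood").toColumn.alias("isGood"))

    // With SPARK-24569 fixed, isGood comes back as a flat nullable
    // boolean column rather than a single-field struct.
    result.printSchema()
    result.show()
    spark.stop()
  }
}
```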
[GitHub] spark issue #21608: [SPARK-24626] [SQL] Improve Analyze Table command
Github user Achuth17 commented on the issue: https://github.com/apache/spark/pull/21608 Yes, in the case where the data is stored in S3 I noticed a significant difference. Some rough numbers: for a table in S3 with 1000 partitions, the calculateTotalSize method took about 90 seconds when run serially vs. 30-40 seconds when run in parallel.
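A rough sketch of the approach (not the PR's actual code): the serial cost is one filesystem round trip per partition location, so issuing those calls in parallel, here with Scala parallel collections, is what buys the speedup on high-latency stores like S3.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

object ParallelSizeSketch {
  // Sums the size of every partition location, issuing the per-path
  // getContentSummary calls in parallel instead of one at a time.
  def totalSize(partitionPaths: Seq[Path], hadoopConf: Configuration): Long = {
    partitionPaths.par.map { path =>
      val fs = path.getFileSystem(hadoopConf)
      fs.getContentSummary(path).getLength
    }.sum
  }
}
```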
[GitHub] spark issue #21611: [SPARK-24569][SQL] Aggregator with output type Option sh...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21611 **[Test build #92200 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92200/testReport)** for PR 21611 at commit [`dd4ea61`](https://github.com/apache/spark/commit/dd4ea61ac1c2beaf8ee897b1533e2088c6f8364a).
[GitHub] spark pull request #21611: [SPARK-24569][SQL] Aggregator with output type Op...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/21611

[SPARK-24569][SQL] Aggregator with output type Option should produce consistent schema

## What changes were proposed in this pull request?

SQL `Aggregator` with output type `Option[Boolean]` creates a column of type `StructType`. This is inconsistent with a Dataset of a similar Java class. This changes the way `definedByConstructorParams` checks a given type: for `Option[_]`, it goes on to check its type argument.

## How was this patch tested?

Added test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 SPARK-24569

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21611.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21611

commit dd4ea61ac1c2beaf8ee897b1533e2088c6f8364a
Author: Liang-Chi Hsieh
Date: 2018-06-22T03:44:33Z

    Aggregator with output type Option[Boolean] should produce consistent schema.
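Roughly, the inconsistency being fixed, in spark-shell style (the struct layout in the trailing comment is what the aggregator path used to produce; the exact inner field name may differ):

```scala
scala> case class OptionBooleanData(name: String, isGood: Option[Boolean])

scala> Seq(OptionBooleanData("bob", Some(true))).toDS.printSchema
root
 |-- name: string (nullable = true)
 |-- isGood: boolean (nullable = true)

// An Aggregator whose outputEncoder is for Option[Boolean] previously
// yielded a struct-typed column for the same logical value, e.g.:
// |-- isGood: struct (nullable = true)
// |    |-- value: boolean (nullable = true)
```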
[GitHub] spark pull request #21610: Updates to LICENSE and NOTICE
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21610#discussion_r197348104

--- Diff: NOTICE ---
@@ -1,667 +1,11 @@
 Apache Spark
-Copyright 2014 and onwards The Apache Software Foundation.
+Copyright 2014 - 20018 The Apache Software Foundation.

 This product includes software developed at
 The Apache Software Foundation (http://www.apache.org/).

+Android Code
+Copyright 2005-2008 The Android Open Source Project
-
-Common Development and Distribution License 1.0
-
-The following components are provided under the Common Development and Distribution License 1.0. See project link for details.
-
-     (CDDL 1.0) Glassfish Jasper (org.mortbay.jetty:jsp-2.1:6.1.14 - http://jetty.mortbay.org/project/modules/jsp-2.1)
-     (CDDL 1.0) JAX-RS (https://jax-rs-spec.java.net/)
-     (CDDL 1.0) Servlet Specification 2.5 API (org.mortbay.jetty:servlet-api-2.5:6.1.14 - http://jetty.mortbay.org/project/modules/servlet-api-2.5)
-     (CDDL 1.0) (GPL2 w/ CPE) javax.annotation API (https://glassfish.java.net/nonav/public/CDDL+GPL.html)
-     (COMMON DEVELOPMENT AND DISTRIBUTION LICENSE (CDDL) Version 1.0) (GNU General Public Library) Streaming API for XML (javax.xml.stream:stax-api:1.0-2 - no url defined)
-     (Common Development and Distribution License (CDDL) v1.0) JavaBeans Activation Framework (JAF) (javax.activation:activation:1.1 - http://java.sun.com/products/javabeans/jaf/index.jsp)
-
-Common Development and Distribution License 1.1
-
-The following components are provided under the Common Development and Distribution License 1.1. See project link for details.
-
-     (CDDL 1.1) (GPL2 w/ CPE) org.glassfish.hk2 (https://hk2.java.net)
-     (CDDL 1.1) (GPL2 w/ CPE) JAXB API bundle for GlassFish V3 (javax.xml.bind:jaxb-api:2.2.2 - https://jaxb.dev.java.net/)
-     (CDDL 1.1) (GPL2 w/ CPE) JAXB RI (com.sun.xml.bind:jaxb-impl:2.2.3-1 - http://jaxb.java.net/)
-     (CDDL 1.1) (GPL2 w/ CPE) Jersey 2 (https://jersey.java.net)
-
-Common Public License 1.0
-
-The following components are provided under the Common Public 1.0 License. See project link for details.
-
-     (Common Public License Version 1.0) JUnit (junit:junit-dep:4.10 - http://junit.org)
-     (Common Public License Version 1.0) JUnit (junit:junit:3.8.1 - http://junit.org)
-     (Common Public License Version 1.0) JUnit (junit:junit:4.8.2 - http://junit.org)
-
-Eclipse Public License 1.0
-
-The following components are provided under the Eclipse Public License 1.0. See project link for details.
-
-     (Eclipse Public License v1.0) Eclipse JDT Core (org.eclipse.jdt:core:3.1.1 - http://www.eclipse.org/jdt/)
-
-Mozilla Public License 1.0
-
-The following components are provided under the Mozilla Public License 1.0. See project link for details.
-
-     (GPL) (LGPL) (MPL) JTransforms (com.github.rwl:jtransforms:2.4.0 - http://sourceforge.net/projects/jtransforms/)
-     (Mozilla Public License Version 1.1) jamon-runtime (org.jamon:jamon-runtime:2.3.1 - http://www.jamon.org/jamon-runtime/)
-
-NOTICE files
-
-The following NOTICEs are pertain to software distributed with this project.
-
-// ------------------------------------------------------------------
-// NOTICE file corresponding to the section 4d of The Apache License,
-// Version 2.0, in this case for
-// ------------------------------------------------------------------
-
-Apache Avro
-Copyright 2009-2013 The Apache Software Foundation
-
-This product includes software developed at
-The Apache Software Foundation (http://www.apache.org/).
-
-Apache Commons Codec
-Copyright 2002-2009 The Apache Software Foundation
-
-This product includes software developed by
-The Apache Software Foundation (http://ww
[GitHub] spark issue #21610: Updates to LICENSE and NOTICE
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21610 **[Test build #92199 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92199/testReport)** for PR 21610 at commit [`b9d12d7`](https://github.com/apache/spark/commit/b9d12d700b9cb83402e42f264f21bca090e0d1e3).
[GitHub] spark pull request #21610: Updates to LICENSE and NOTICE
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21610#discussion_r197347713

--- Diff: NOTICE ---
@@ -1,667 +1,11 @@
 Apache Spark
-Copyright 2014 and onwards The Apache Software Foundation.
+Copyright 2014 - 20018 The Apache Software Foundation.
--- End diff --

2018?
[GitHub] spark issue #21610: Updates to LICENSE and NOTICE
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21610 ok to test
[GitHub] spark pull request #21590: [SPARK-24423][SQL] Add a new option for JDBC sour...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/21590#discussion_r197347130

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala ---
@@ -65,13 +65,38 @@ class JDBCOptions(
   // Required parameters
   //
   require(parameters.isDefinedAt(JDBC_URL), s"Option '$JDBC_URL' is required.")
-  require(parameters.isDefinedAt(JDBC_TABLE_NAME), s"Option '$JDBC_TABLE_NAME' is required.")
+
   // a JDBC URL
   val url = parameters(JDBC_URL)
-  // name of table
-  val table = parameters(JDBC_TABLE_NAME)
+  val tableName = parameters.get(JDBC_TABLE_NAME)
+  val query = parameters.get(JDBC_QUERY_STRING)
+  // Following two conditions make sure that :
+  // 1. One of the option (dbtable or query) must be specified.
+  // 2. Both of them can not be specified at the same time as they are conflicting in nature.
+  require(
+    tableName.isDefined || query.isDefined,
+    s"Option '$JDBC_TABLE_NAME' or '${JDBC_QUERY_STRING}' is required."
+  )
+
+  require(
+    !(tableName.isDefined && query.isDefined),
+    s"Both '$JDBC_TABLE_NAME' and '$JDBC_QUERY_STRING' can not be specified."
+  )
+
+  // table name or a table expression.
+  val tableOrQuery = tableName.map(_.trim).getOrElse {
+    // We have ensured in the code above that either dbtable or query is specified.
+    query.get match {
+      case subQuery if subQuery.nonEmpty => s"(${subQuery}) spark_gen_${curId.getAndIncrement()}"
+      case subQuery => subQuery
+    }
+  }
+
+  require(tableOrQuery.nonEmpty,
+    s"Empty string is not allowed in either '$JDBC_TABLE_NAME' or '${JDBC_QUERY_STRING}' options"
+  )
+
-  //
--- End diff --

nit: revert this line
[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21061 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92197/ Test FAILed.
[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21061 Merged build finished. Test FAILed.
[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21061

**[Test build #92197 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92197/testReport)** for PR 21061 at commit [`195f3bd`](https://github.com/apache/spark/commit/195f3bd6b47da19b27cd0c8140bcd9aa6a063843).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21608: [SPARK-24626] [SQL] Improve Analyze Table command
Github user maropu commented on the issue: https://github.com/apache/spark/pull/21608 Does this PR improve actual performance numbers? (My question is: is the calculation a bottleneck?)
[GitHub] spark issue #21606: [SPARK-24552][core][SQL] Use task ID instead of attempt ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21606 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92192/ Test FAILed.
[GitHub] spark issue #21606: [SPARK-24552][core][SQL] Use task ID instead of attempt ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21606 Merged build finished. Test FAILed.
[GitHub] spark issue #21606: [SPARK-24552][core][SQL] Use task ID instead of attempt ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21606

**[Test build #92192 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92192/testReport)** for PR 21606 at commit [`a16d9f9`](https://github.com/apache/spark/commit/a16d9f907b3ce0078da72b7e7bcc56e187cbc8f9).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21482: [SPARK-24393][SQL] SQL builtin: isinf
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21482 I have no more comments except the one above.
[GitHub] spark pull request #21482: [SPARK-24393][SQL] SQL builtin: isinf
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21482#discussion_r197340906

--- Diff: python/pyspark/sql/functions.py ---
@@ -468,6 +468,18 @@ def input_file_name():
     return Column(sc._jvm.functions.input_file_name())

+@since(2.4)
+def isinf(col):
--- End diff --

Yes, please, because I see it's exposed in Column.scala.
[GitHub] spark issue #21594: [SPARK-24596][SQL] Non-cascading Cache Invalidation
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21594 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92194/ Test PASSed.
[GitHub] spark issue #21594: [SPARK-24596][SQL] Non-cascading Cache Invalidation
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21594 Merged build finished. Test PASSed.
[GitHub] spark issue #21594: [SPARK-24596][SQL] Non-cascading Cache Invalidation
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21594

**[Test build #92194 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92194/testReport)** for PR 21594 at commit [`2f00f2f`](https://github.com/apache/spark/commit/2f00f2fe0e1cf9a0d44285aab306ed55bd176d9c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21603: [SPARK-17091][SQL] Add rule to convert IN predicate to e...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21603 **[Test build #92198 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92198/testReport)** for PR 21603 at commit [`b9b3160`](https://github.com/apache/spark/commit/b9b3160061ef1e17ae32599ed9fbcfd44b0565b4).
[GitHub] spark issue #21603: [SPARK-17091][SQL] Add rule to convert IN predicate to e...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21603 Merged build finished. Test PASSed.
[GitHub] spark issue #21603: [SPARK-17091][SQL] Add rule to convert IN predicate to e...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21603 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/399/ Test PASSed.
[GitHub] spark pull request #21603: [SPARK-17091][SQL] Add rule to convert IN predica...
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/21603#discussion_r197338867

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala ---
@@ -270,6 +270,11 @@ private[parquet] class ParquetFilters(pushDownDate: Boolean) {
       case sources.Not(pred) =>
         createFilter(schema, pred).map(FilterApi.not)

+      case sources.In(name, values) if canMakeFilterOn(name) && values.length < 20 =>
--- End diff --

It seems that the push-down performance is better when the threshold is less than `300`:

https://user-images.githubusercontent.com/5399861/41757743-7e411532-7616-11e8-8844-45132c50c535.png

The code:
```scala
withSQLConf(SQLConf.PARQUET_FILTER_PUSHDOWN_ENABLED.key -> "true") {
  import testImplicits._
  withTempPath { path =>
    val total = 1000
    (0 to total).toDF().coalesce(1)
      .write.option("parquet.block.size", 512)
      .parquet(path.getAbsolutePath)
    val df = spark.read.parquet(path.getAbsolutePath)
    // scalastyle:off println
    var lastSize = -1
    var i = 16000
    while (i < total) {
      val filter = Range(0, total).filter(_ % i == 0)
      i += 100
      if (lastSize != filter.size) {
        if (lastSize == -1) println(s"start size: ${filter.size}")
        lastSize = filter.size

        sql("set spark.sql.parquet.pushdown.inFilterThreshold=100")
        val begin1 = System.currentTimeMillis()
        df.where(s"id in(${filter.mkString(",")})").count()
        val end1 = System.currentTimeMillis()
        val time1 = end1 - begin1

        sql("set spark.sql.parquet.pushdown.inFilterThreshold=10")
        val begin2 = System.currentTimeMillis()
        df.where(s"id in(${filter.mkString(",")})").count()
        val end2 = System.currentTimeMillis()
        val time2 = end2 - begin2

        if (time1 <= time2) println(s"Max threshold: $lastSize")
      }
    }
  }
}
```
[GitHub] spark issue #21610: Updates to LICENSE and NOTICE
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21610 Can one of the admins verify this patch?
[GitHub] spark pull request #21610: Updates to LICENSE and NOTICE
GitHub user justinmclean opened a pull request: https://github.com/apache/spark/pull/21610

Updates to LICENSE and NOTICE

## What changes were proposed in this pull request?

LICENSE and NOTICE changes as per ASF policy.

## How was this patch tested?

N/A

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/justinmclean/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21610.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21610

commit b9d12d700b9cb83402e42f264f21bca090e0d1e3
Author: Justin Mclean
Date: 2018-06-22T04:20:59Z

    Updates to LICENSE and NOTICE
[GitHub] spark issue #21609: [SPARK-22897][CORE] Expose stageAttemptId in TaskContext
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21609 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92196/ Test PASSed.
[GitHub] spark issue #21609: [SPARK-22897][CORE] Expose stageAttemptId in TaskContext
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21609 Merged build finished. Test PASSed.
[GitHub] spark issue #21609: [SPARK-22897][CORE] Expose stageAttemptId in TaskContext
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21609

**[Test build #92196 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92196/testReport)** for PR 21609 at commit [`3040763`](https://github.com/apache/spark/commit/3040763e51c8d32309f2dc38ce8b9fcc740ceb3d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #21603: [SPARK-17091][SQL] Add rule to convert IN predica...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21603#discussion_r197336527

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala ---
@@ -270,6 +270,11 @@ private[parquet] class ParquetFilters(pushDownDate: Boolean) {
       case sources.Not(pred) =>
         createFilter(schema, pred).map(FilterApi.not)

+      case sources.In(name, values) if canMakeFilterOn(name) && values.length < 20 =>
--- End diff --

+1
[GitHub] spark issue #21607: branch-2.1: backport SPARK-24589 and SPARK-22897
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21607 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92195/ Test FAILed.
[GitHub] spark issue #21607: branch-2.1: backport SPARK-24589 and SPARK-22897
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21607

**[Test build #92195 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92195/testReport)** for PR 21607 at commit [`9d7e6ea`](https://github.com/apache/spark/commit/9d7e6eafff3daa519f7fda0b1f219f74d499874d).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21607: branch-2.1: backport SPARK-24589 and SPARK-22897
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21607 Merged build finished. Test FAILed.
[GitHub] spark issue #21607: branch-2.1: backport SPARK-24589 and SPARK-22897
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21607 Merged build finished. Test PASSed.
[GitHub] spark issue #21607: branch-2.1: backport SPARK-24589 and SPARK-22897
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21607 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92193/ Test PASSed.
[GitHub] spark issue #21607: branch-2.1: backport SPARK-24589 and SPARK-22897
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21607

**[Test build #92193 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92193/testReport)** for PR 21607 at commit [`0520d60`](https://github.com/apache/spark/commit/0520d60b44987369fa62d7237427cb0cf022ed41).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21606: [SPARK-24552][core][SQL] Use task ID instead of attempt ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21606 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92190/ Test PASSed.
[GitHub] spark issue #21606: [SPARK-24552][core][SQL] Use task ID instead of attempt ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21606 Merged build finished. Test PASSed.
[GitHub] spark issue #21606: [SPARK-24552][core][SQL] Use task ID instead of attempt ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21606

**[Test build #92190 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92190/testReport)** for PR 21606 at commit [`227d513`](https://github.com/apache/spark/commit/227d513ade176fd56f7e6d75a16deb6c654982db).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21606: [SPARK-24552][core][SQL] Use task ID instead of attempt ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21606 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92189/ Test PASSed.
[GitHub] spark issue #21606: [SPARK-24552][core][SQL] Use task ID instead of attempt ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21606 Merged build finished. Test PASSed.
[GitHub] spark issue #21606: [SPARK-24552][core][SQL] Use task ID instead of attempt ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21606

**[Test build #92189 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92189/testReport)** for PR 21606 at commit [`5efaae7`](https://github.com/apache/spark/commit/5efaae74bf340fed4223b5209bed63475cc35516).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92191/ Test PASSed.
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test PASSed.
[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21320

**[Test build #92191 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92191/testReport)** for PR 21320 at commit [`a255bcb`](https://github.com/apache/spark/commit/a255bcb4c480d3c97f7ff0590bca0c20de034a31).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21609: [SPARK-22897][CORE] Expose stageAttemptId in TaskContext
Github user zzcclp commented on the issue: https://github.com/apache/spark/pull/21609 Can this PR be merged ASAP? Currently there is an error on branch-2.2.
[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21061 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/398/ Test PASSed.
[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21061 Merged build finished. Test PASSed.
[GitHub] spark issue #21588: [SPARK-24590][BUILD] Make Jenkins tests passed with hado...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21588 Yup, will fix the hive fork thing and be back.
[GitHub] spark pull request #21570: [SPARK-24564][TEST] Add test suite for RecordBina...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/21570#discussion_r197328626

--- Diff: sql/core/src/test/java/test/org/apache/spark/sql/execution/sort/RecordBinaryComparatorSuite.java ---
@@ -0,0 +1,255 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package test.org.apache.spark.sql.execution.sort;
+
+import org.apache.spark.SparkConf;
+import org.apache.spark.memory.TaskMemoryManager;
--- End diff --

cc @jiangxb1987
[GitHub] spark issue #21061: [SPARK-23914][SQL] Add array_union function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21061 **[Test build #92197 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92197/testReport)** for PR 21061 at commit [`195f3bd`](https://github.com/apache/spark/commit/195f3bd6b47da19b27cd0c8140bcd9aa6a063843).
[GitHub] spark issue #21588: [SPARK-24590][BUILD] Make Jenkins tests passed with hado...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/21588 @HyukjinKwon, I'm in favor of @vanzin's comment: we should fix things first and then come back to this one.
[GitHub] spark pull request #21548: [SPARK-24518][CORE] Using Hadoop credential provi...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/21548#discussion_r197327620

--- Diff: core/src/main/scala/org/apache/spark/SSLOptions.scala ---
@@ -179,9 +185,11 @@ private[spark] object SSLOptions extends Logging {
       .orElse(defaults.flatMap(_.keyStore))

     val keyStorePassword = conf.getWithSubstitution(s"$ns.keyStorePassword")
+      .orElse(Option(hadoopConf.getPassword(s"$ns.keyStorePassword")).map(new String(_)))
--- End diff --

Hi @vanzin, I checked the JDK 8 docs again, and I don't find a String constructor that takes both a char array and a charset as parameters.
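For reference, a small sketch of what the JDK does offer: the charset-taking `String` constructors accept `byte[]`, while a `char[]` is already decoded text, so `new String(char[])` involves no charset at all.

```scala
import java.nio.charset.StandardCharsets

object PasswordStringDemo {
  def main(args: Array[String]): Unit = {
    val password: Array[Char] = Array('s', '3', 'c', 'r', '3', 't')

    // char[] -> String: no charset parameter exists (none is needed).
    val fromChars = new String(password)

    // byte[] -> String is where a charset comes into play.
    val bytes = fromChars.getBytes(StandardCharsets.UTF_8)
    val fromBytes = new String(bytes, StandardCharsets.UTF_8)

    assert(fromChars == fromBytes)
  }
}
```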
[GitHub] spark issue #21606: [SPARK-24552][core][SQL] Use task ID instead of attempt ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21606 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92187/ Test PASSed.
[GitHub] spark issue #21606: [SPARK-24552][core][SQL] Use task ID instead of attempt ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21606 Merged build finished. Test PASSed.
[GitHub] spark issue #21606: [SPARK-24552][core][SQL] Use task ID instead of attempt ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21606

**[Test build #92187 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92187/testReport)** for PR 21606 at commit [`c884f4f`](https://github.com/apache/spark/commit/c884f4f27199b3c91f56ba0042b42d09bc243883).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #21598: [SPARK-24605][SQL] size(null) returns null instea...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21598#discussion_r197326162

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1314,6 +1314,13 @@ object SQLConf {
         "Other column values can be ignored during parsing even if they are malformed.")
       .booleanConf
       .createWithDefault(true)
+
+  val LEGACY_SIZE_OF_NULL = buildConf("spark.sql.legacy.sizeOfNull")
--- End diff --

That's basically the same except that the postfix includes a specific version, which was just a rough idea.
[GitHub] spark issue #21577: [SPARK-24589][core] Correctly identify tasks in output c...
Github user zzcclp commented on the issue: https://github.com/apache/spark/pull/21577 @vanzin @tgravescs, after merging this PR into branch-2.2, there is an error "stageAttemptNumber is not a member of org.apache.spark.TaskContext" in SparkHadoopMapRedUtil. I think PR-20082 needs to be merged first.
[GitHub] spark issue #21598: [SPARK-24605][SQL] size(null) returns null instead of -1
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21598 My assumption was that the PR and JIRA claim that it's the right behaviour, as I said multiple times. If there's no such claim, there is of course no need to argue about the default value, as I said above.
[GitHub] spark issue #21542: [SPARK-24529][Build][test-maven] Add spotbugs into maven...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/21542 Even when we stopped forking SpotBugs, the same error occurred. @HyukjinKwon, do you have any ideas? I would appreciate your thoughts.
[GitHub] spark issue #21607: branch-2.1: backport SPARK-24589 and SPARK-22897
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21607 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92188/ Test PASSed.
[GitHub] spark issue #21607: branch-2.1: backport SPARK-24589 and SPARK-22897
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21607 Merged build finished. Test PASSed.
[GitHub] spark issue #21607: branch-2.1: backport SPARK-24589 and SPARK-22897
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21607

**[Test build #92188 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92188/testReport)** for PR 21607 at commit [`d1f3219`](https://github.com/apache/spark/commit/d1f3219a58f4dc4f1e65a793c6d01572b25a609e).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21588: [SPARK-24590][BUILD] Make Jenkins tests passed with hado...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21588 Will try to fix it then. We can just re-enable these tests later: if we want to support those Hive versions in Hadoop 3, we could simply re-enable them with some fixes at that time. Adding that support sounds like an incremental improvement.
[GitHub] spark pull request #21061: [SPARK-23914][SQL] Add array_union function
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/21061#discussion_r197319579

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -2355,3 +2355,347 @@ case class ArrayRemove(left: Expression, right: Expression)
   override def prettyName: String = "array_remove"
 }
+
+object ArraySetLike {
+  def useGenericArrayData(elementSize: Int, length: Int): Boolean = {
+    // Use the same calculation in UnsafeArrayData.fromPrimitiveArray()
+    val headerInBytes = UnsafeArrayData.calculateHeaderPortionInBytes(length)
+    val valueRegionInBytes = elementSize.toLong * length
+    val totalSizeInLongs = (headerInBytes + valueRegionInBytes + 7) / 8
+    totalSizeInLongs > Integer.MAX_VALUE / 8
+  }
+
+  def throwUnionLengthOverflowException(length: Int): Unit = {
+    throw new RuntimeException(s"Unsuccessful try to union arrays with $length " +
+      s"elements due to exceeding the array size limit " +
+      s"${ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH}.")
+  }
+}
+
+
+abstract class ArraySetLike extends BinaryArrayExpressionWithImplicitCast {
+  override def dataType: DataType = left.dataType
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+    val typeCheckResult = super.checkInputDataTypes()
+    if (typeCheckResult.isSuccess) {
+      TypeUtils.checkForOrderingExpr(dataType.asInstanceOf[ArrayType].elementType,
+        s"function $prettyName")
+    } else {
+      typeCheckResult
+    }
+  }
+
+  @transient protected lazy val ordering: Ordering[Any] =
+    TypeUtils.getInterpretedOrdering(elementType)
+
+  @transient protected lazy val elementTypeSupportEquals = elementType match {
+    case BinaryType => false
+    case _: AtomicType => true
+    case _ => false
+  }
+}
+
+/**
+ * Returns an array of the elements in the union of x and y, without duplicates
+ */
+@ExpressionDescription(
+  usage = """
+    _FUNC_(array1, array2) - Returns an array of the elements in the union of array1 and array2,
+      without duplicates.
+  """,
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 2, 3), array(1, 3, 5));
+       array(1, 2, 3, 5)
+  """,
+  since = "2.4.0")
+case class ArrayUnion(left: Expression, right: Expression) extends ArraySetLike {
+  var hsInt: OpenHashSet[Int] = _
+  var hsLong: OpenHashSet[Long] = _
+
+  def assignInt(array: ArrayData, idx: Int, resultArray: ArrayData, pos: Int): Boolean = {
+    val elem = array.getInt(idx)
+    if (!hsInt.contains(elem)) {
+      resultArray.setInt(pos, elem)
+      hsInt.add(elem)
+      true
+    } else {
+      false
+    }
+  }
+
+  def assignLong(array: ArrayData, idx: Int, resultArray: ArrayData, pos: Int): Boolean = {
+    val elem = array.getLong(idx)
+    if (!hsLong.contains(elem)) {
+      resultArray.setLong(pos, elem)
+      hsLong.add(elem)
+      true
+    } else {
+      false
+    }
+  }
+
+  def evalPrimitiveType(
+      array1: ArrayData,
+      array2: ArrayData,
+      size: Int,
+      resultArray: ArrayData,
+      isLongType: Boolean): ArrayData = {
+    // store elements into resultArray
+    var foundNullElement = false
+    var pos = 0
+    Seq(array1, array2).foreach(array => {
+      var i = 0
+      while (i < array.numElements()) {
+        if (array.isNullAt(i)) {
+          if (!foundNullElement) {
+            resultArray.setNullAt(pos)
+            pos += 1
+            foundNullElement = true
+          }
+        } else {
+          val assigned = if (!isLongType) {
+            assignInt(array, i, resultArray, pos)
+          } else {
+            assignLong(array, i, resultArray, pos)
+          }
+          if (assigned) {
+            pos += 1
+          }
+        }
+        i += 1
+      }
+    })
+    resultArray
+  }
+
+  override def nullSafeEval(input1: Any, input2: Any): Any = {
+    val array1 = input1.asInstanceOf[ArrayData]
+    val array2 = input2.asInstanceOf[ArrayData]
+
+    if (elementTypeSupportEquals) {
+      elementType match {
+        case IntegerType =>
+          // avoid boxing of primitive int array elements
+          // calculate result array size
+          val hsSize = new OpenHashSet[Int]
+          Seq(array1, array2).foreach(array => {
+            var i = 0
+            while (i < array.numElements()) {
+              if (hsSize.size > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+                ArraySetLi
[GitHub] spark issue #21607: branch-2.1: backport SPARK-24589 and SPARK-22897
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21607

Merged build finished. Test PASSed.
[GitHub] spark issue #21609: [SPARK-22897][CORE] Expose stageAttemptId in TaskContext
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/21609

+1 pending tests.
[GitHub] spark issue #21607: branch-2.1: backport SPARK-24589 and SPARK-22897
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21607

Test PASSed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/397/
Test PASSed.
[GitHub] spark issue #21609: [SPARK-22897][CORE] Expose stageAttemptId in TaskContext
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21609

**[Test build #92196 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92196/testReport)** for PR 21609 at commit [`3040763`](https://github.com/apache/spark/commit/3040763e51c8d32309f2dc38ce8b9fcc740ceb3d).
[GitHub] spark issue #21609: [SPARK-22897][CORE] Expose stageAttemptId in TaskContext
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21609

Test PASSed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/396/
Test PASSed.
[GitHub] spark issue #21607: branch-2.1: backport SPARK-24589 and SPARK-22897
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21607

**[Test build #92195 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92195/testReport)** for PR 21607 at commit [`9d7e6ea`](https://github.com/apache/spark/commit/9d7e6eafff3daa519f7fda0b1f219f74d499874d).
[GitHub] spark issue #21609: [SPARK-22897][CORE] Expose stageAttemptId in TaskContext
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21609

Merged build finished. Test PASSed.
[GitHub] spark issue #21588: [SPARK-24590][BUILD] Make Jenkins tests passed with hado...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/21588

> The tests passed in this PR builder

Against your private build of the Hive stuff. Again, fix that and this will become a lot easier to discuss.

I'm also against disabling these tests without a proper discussion of what that means, as I've said multiple times. If we want to support those Hive versions in Hadoop 3, then this is the wrong change.
[GitHub] spark issue #21609: [SPARK-22897][CORE] Expose stageAttemptId in TaskContext
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/21609

Backport to branch-2.2; the only changes were to MimaExcludes and a test file that had one more call to TaskContext. @vanzin
[GitHub] spark pull request #21609: [SPARK-22897][CORE] Expose stageAttemptId in Task...
GitHub user tgravescs opened a pull request:

    https://github.com/apache/spark/pull/21609

[SPARK-22897][CORE] Expose stageAttemptId in TaskContext

stageAttemptId added in TaskContext and corresponding construction modification.

Added a new test in TaskContextSuite; two cases are tested:
1. Normal case without failure
2. Exception case with resubmitted stages

Link to [SPARK-22897](https://issues.apache.org/jira/browse/SPARK-22897)

Author: Xianjin YE

Closes #20082 from advancedxy/SPARK-22897.

Conflicts:
    project/MimaExcludes.scala

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tgravescs/spark SPARK-22897

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21609.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21609

commit 4bc8d2805949b6b9d4d06ff4ad0493d9b33c7063
Author: Xianjin YE
Date: 2018-01-02T15:30:38Z

    [SPARK-22897][CORE] Expose stageAttemptId in TaskContext

    stageAttemptId added in TaskContext and corresponding construction modification.

    Added a new test in TaskContextSuite; two cases are tested:
    1. Normal case without failure
    2. Exception case with resubmitted stages

    Link to [SPARK-22897](https://issues.apache.org/jira/browse/SPARK-22897)

    Author: Xianjin YE

    Closes #20082 from advancedxy/SPARK-22897.

    Conflicts:
        project/MimaExcludes.scala
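For readers following the backport, a minimal sketch of what the new field enables from user code. The accessor name is an assumption: `stageAttemptNumber()` matches the upstream commit being backported, but the exact name on this branch is not confirmed here:

```scala
import org.apache.spark.{SparkConf, SparkContext, TaskContext}

// Minimal sketch of reading the stage attempt from inside a task.
// Assumption: the backport exposes stageAttemptNumber(), as upstream does.
object StageAttemptDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("stage-attempt-demo").setMaster("local[2]"))
    sc.parallelize(1 to 4, 2).foreach { _ =>
      val ctx = TaskContext.get()
      // On a resubmitted stage (e.g. after a fetch failure), the stage
      // attempt increments while stageId stays the same; task-level retries
      // are tracked separately by attemptNumber().
      println(s"stage=${ctx.stageId()} stageAttempt=${ctx.stageAttemptNumber()} " +
        s"partition=${ctx.partitionId()} taskAttempt=${ctx.attemptNumber()}")
    }
    sc.stop()
  }
}
```

The point of the change is that stage re-attempts and task re-attempts become distinguishable from inside a running task, which previously was not possible from TaskContext alone.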
[GitHub] spark issue #21594: [SPARK-24596][SQL] Non-cascading Cache Invalidation
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21594

Merged build finished. Test PASSed.
[GitHub] spark issue #21594: [SPARK-24596][SQL] Non-cascading Cache Invalidation
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21594

**[Test build #92194 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92194/testReport)** for PR 21594 at commit [`2f00f2f`](https://github.com/apache/spark/commit/2f00f2fe0e1cf9a0d44285aab306ed55bd176d9c).
[GitHub] spark issue #21594: [SPARK-24596][SQL] Non-cascading Cache Invalidation
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21594

Test PASSed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/395/
Test PASSed.
[GitHub] spark issue #21606: [SPARK-24552][core][SQL] Use task ID instead of attempt ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21606

Test PASSed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/394/
Test PASSed.
[GitHub] spark issue #21606: [SPARK-24552][core][SQL] Use task ID instead of attempt ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21606

Merged build finished. Test PASSed.
[GitHub] spark issue #21607: branch-2.1: backport SPARK-24589 and SPARK-22897
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21607

Test PASSed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/393/
Test PASSed.
[GitHub] spark issue #21607: branch-2.1: backport SPARK-24589 and SPARK-22897
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21607

**[Test build #92193 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92193/testReport)** for PR 21607 at commit [`0520d60`](https://github.com/apache/spark/commit/0520d60b44987369fa62d7237427cb0cf022ed41).
[GitHub] spark issue #21607: branch-2.1: backport SPARK-24589 and SPARK-22897
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21607

Merged build finished. Test PASSed.
[GitHub] spark issue #21588: [SPARK-24590][BUILD] Make Jenkins tests passed with hado...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21588

The tests passed in this PR builder. The only hack I used is that I landed a one-liner fix to an artifact so it could be used in this PR; that fix is already in Hive, and it is proposed in Hive's fork, where it is blocked for non-technical reasons. I am working on getting this through.

Okay, if you think it should be blocked, let me get that through first. I am not dropping it.

Isn't this what we already cover? I believe this is the most minimised and conservative fix to make Hadoop 3 work within Spark, since we already added it.

FWIW, we haven't documented the Hadoop 3 profile yet, so my impression is that it's still in progress.
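For readers trying to reproduce the discussion locally, a hedged sketch of the build invocation at issue. The profile name `hadoop-3.1` is an assumption about the pom at this point in time, precisely because, as noted above, the profile is not yet documented:

```
# Sketch, not an endorsed recipe: build Spark against the Hadoop 3 profile.
# Assumption: the profile is named hadoop-3.1 in the current pom.
./build/mvn -DskipTests -Phadoop-3.1 -Phive -Phive-thriftserver clean package
```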
[GitHub] spark issue #21606: [SPARK-24552][core][SQL] Use task ID instead of attempt ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21606

**[Test build #92192 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92192/testReport)** for PR 21606 at commit [`a16d9f9`](https://github.com/apache/spark/commit/a16d9f907b3ce0078da72b7e7bcc56e187cbc8f9).