date:20180112

[GitHub] spark pull request #20256: [SPARK-23063][K8S] K8s changes for publishing scr...

2018-01-12 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/20256#discussion_r161366655
  
--- Diff: dev/create-release/releaseutils.py ---
@@ -185,6 +185,7 @@ def get_commits(tag):
 "graphx": "GraphX",
 "input/output": CORE_COMPONENT,
 "java api": "Java API",
+"kubernetes": "Kubernetes",
--- End diff --

yes, I think this looks right


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #20256: [SPARK-23063][K8S] K8s changes for publishing scripts (a...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20256
  
**[Test build #86086 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86086/testReport)**
 for PR 20256 at commit 
[`b0a2ead`](https://github.com/apache/spark/commit/b0a2ead5935408370a5303fc8f7315357314aeca).


---


[GitHub] spark pull request #20256: [SPARK-23063][K8S] K8s changes for publishing scr...

2018-01-12 Thread foxish

Github user foxish commented on a diff in the pull request:

https://github.com/apache/spark/pull/20256#discussion_r161366632
  
--- Diff: dev/create-release/releaseutils.py ---
@@ -185,6 +185,7 @@ def get_commits(tag):
 "graphx": "GraphX",
 "input/output": CORE_COMPONENT,
 "java api": "Java API",
+"kubernetes": "Kubernetes",
--- End diff --

Ah, okay, I misread that previously - updated the mapping, and it looks 
like the script turns things to lower case anyway, so `k8s` and `kubernetes` 
ought to cover everything. Thanks!
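A minimal sketch of the lookup being discussed, assuming the script lower-cases the tag before consulting the mapping. The dictionary below is a hypothetical excerpt (only the `kubernetes` entry appears in the diff) and `lookup_component` is an illustrative name, not a function from `releaseutils.py`.

```python
# Hypothetical excerpt of a tag -> component mapping. Several left-hand keys
# may point at the same right-hand display name.
KNOWN_COMPONENTS = {
    "graphx": "GraphX",
    "java api": "Java API",
    "k8s": "Kubernetes",
    "kubernetes": "Kubernetes",
}


def lookup_component(tag):
    """Return the display name for a PR-title tag, ignoring case."""
    key = tag.strip().lower()
    # In this sketch, unknown tags are returned unchanged.
    return KNOWN_COMPONENTS.get(key, tag)


for title_tag in ("K8S", "k8s", "Kubernetes"):
    print(title_tag, "->", lookup_component(title_tag))
```

Because the key is normalised before the lookup, two lower-case entries are enough to cover `K8S`, `k8s` and `Kubernetes`.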


---


[GitHub] spark pull request #20211: [SPARK-23011][PYTHON][SQL] Prepend missing groupi...

2018-01-12 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/20211#discussion_r161366590
  
--- Diff: python/pyspark/sql/group.py ---
@@ -233,6 +233,27 @@ def apply(self, udf):
 |  2| 1.1094003924504583|
 +---+---+
 
+Notes on grouping column:
--- End diff --

Sounds to me like we could either stick with `func(key, pdf)` or do whatever 
pandas does.

(Yes, for gapply the returned data frame is expected to have the key columns 
prepended; SPARK-16258 proposed eliminating that extra work.)
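A rough, pure-Python sketch of the `func(key, group)` calling convention, with the key prepended by the framework instead of by the user function (the "extra work" mentioned above). The names `grouped_apply` and `mean_value` are hypothetical; this is neither pyspark nor SparkR code.

```python
from collections import defaultdict


def grouped_apply(rows, key_index, func):
    """Group tuples by rows[key_index] and call func(key, group) per group.

    func returns rows without the key; the key is prepended here so the
    user function does not have to do it.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row)
    out = []
    for key in sorted(groups):
        for result in func(key, groups[key]):
            out.append((key,) + tuple(result))
    return out


def mean_value(key, group):
    """Return a single one-column row holding the mean of column 1."""
    values = [row[1] for row in group]
    return [(sum(values) / len(values),)]


rows = [(1, 1.0), (1, 2.0), (2, 3.0)]
result = grouped_apply(rows, 0, mean_value)
print(result)
```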


---


[GitHub] spark issue #19001: [SPARK-19256][SQL] Hive bucketing support

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19001
  
**[Test build #86085 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86085/testReport)**
 for PR 19001 at commit 
[`d37eb8b`](https://github.com/apache/spark/commit/d37eb8b3359981756c923948fe12833a56b61865).


---


[GitHub] spark issue #19001: [SPARK-19256][SQL] Hive bucketing support

2018-01-12 Thread tejasapatil

Github user tejasapatil commented on the issue:

https://github.com/apache/spark/pull/19001
  
Jenkins retest this please


---


[GitHub] spark pull request #20168: [SPARK-22730][ML] Add ImageSchema support for non...

2018-01-12 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20168#discussion_r161366253
  
--- Diff: python/pyspark/ml/image.py ---
@@ -71,9 +88,33 @@ def ocvTypes(self):
 """
 
 if self._ocvTypes is None:
-ctx = SparkContext._active_spark_context
-self._ocvTypes = 
dict(ctx._jvm.org.apache.spark.ml.image.ImageSchema.javaOcvTypes())
-return self._ocvTypes
+ctx = SparkContext.getOrCreate()
+ocvTypeList = 
ctx._jvm.org.apache.spark.ml.image.ImageSchema.javaOcvTypes()
+self._ocvTypes = [self._OcvType(name=x.name(),
+mode=x.mode(),
+nChannels=x.nChannels(),
+dataType=x.dataType(),
+
nptype=self._ocvToNumpyMap[x.dataType()])
+  for x in ocvTypeList]
+return self._ocvTypes[:]
+
+def ocvTypeByName(self, name):
--- End diff --

Let's write a doc and doctest too.
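One possible shape for such a docstring and doctest, written as a self-contained sketch. `OcvType`, `_TYPES` and `ocv_type_by_name` are hypothetical stand-ins for the code in the diff (which needs a running JVM), and the three sample rows are taken from the mapping the diff replaces.

```python
import doctest
from collections import namedtuple

# Hypothetical stand-in for the record type built in the diff.
OcvType = namedtuple("OcvType", ["name", "mode", "nChannels", "dataType"])

_TYPES = [
    OcvType("Undefined", -1, -1, "N/A"),
    OcvType("CV_8UC1", 0, 1, "8U"),
    OcvType("CV_8UC3", 16, 3, "8U"),
    OcvType("CV_8UC4", 24, 4, "8U"),
]


def ocv_type_by_name(name):
    """Return the OpenCV type record with the given name.

    :param name: type name such as ``"CV_8UC3"``.
    :return: the matching ``OcvType``.

    >>> ocv_type_by_name("CV_8UC3").mode
    16
    >>> ocv_type_by_name("CV_8UC3").nChannels
    3
    >>> ocv_type_by_name("bogus")
    Traceback (most recent call last):
        ...
    ValueError: Unknown open cv type bogus
    """
    for t in _TYPES:
        if t.name == name:
            return t
    raise ValueError("Unknown open cv type " + name)


if __name__ == "__main__":
    doctest.testmod()
```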


---


[GitHub] spark pull request #20168: [SPARK-22730][ML] Add ImageSchema support for non...

2018-01-12 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20168#discussion_r161366127
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala 
---
@@ -37,20 +37,54 @@ import org.apache.spark.sql.types._
 @Since("2.3.0")
 object ImageSchema {
 
-  val undefinedImageType = "Undefined"
+  /**
+   * OpenCv type representation
+   * @param mode ordinal for the type
+   * @param dataType open cv data type
+   * @param nChannels number of color channels
+   */
+  case class OpenCvType(mode: Int, dataType: String, nChannels: Int) {
+def name: String = if (mode == -1) { "Undefined" } else { 
s"CV_$dataType" + s"C$nChannels" }
+override def toString: String = s"OpenCvType(mode = $mode, name = 
$name)"
+  }
+
+  def ocvTypeByName(name: String): OpenCvType = {
+ocvTypes.find(x => x.name == name).getOrElse(
+  throw new IllegalArgumentException("Unknown open cv type " + name))
+  }
+
+  def ocvTypeByMode(mode: Int): OpenCvType = {
+ocvTypes.find(x => x.mode == mode).getOrElse(
+  throw new IllegalArgumentException("Unknown open cv mode " + mode))
+  }
+
+  val undefinedImageType = OpenCvType(-1, "N/A", -1)
 
   /**
-   * (Scala-specific) OpenCV type mapping supported
+   * A Mapping of Type to Numbers in OpenCV
+   *
+   *C1 C2  C3  C4
+   * CV_8U   0  8  16  24
+   * CV_8S   1  9  17  25
+   * CV_16U  2 10  18  26
+   * CV_16S  3 11  19  27
+   * CV_32S  4 12  20  28
+   * CV_32F  5 13  21  29
+   * CV_64F  6 14  22  30
*/
-  val ocvTypes: Map[String, Int] = Map(
-undefinedImageType -> -1,
-"CV_8U" -> 0, "CV_8UC1" -> 0, "CV_8UC3" -> 16, "CV_8UC4" -> 24
-  )
+  val ocvTypes = {
+val types =
+  for (nc <- Array(1, 2, 3, 4);
+   dt <- Array("8U", "8S", "16U", "16S", "32S", "32F", "64F"))
+yield (dt, nc)
+val ordinals = for (i <- 0 to 3; j <- 0 to 6) yield ( i * 8 + j)
+undefinedImageType +: (ordinals zip types).map(x => OpenCvType(x._1, 
x._2._1, x._2._2))
+  }
 
   /**
-   * (Java-specific) OpenCV type mapping supported
+   *  (Java Specific) list of OpenCv types
*/
-  val javaOcvTypes: java.util.Map[String, Int] = ocvTypes.asJava
--- End diff --

Let's set the explicit type here .. 


---


[GitHub] spark pull request #20168: [SPARK-22730][ML] Add ImageSchema support for non...

2018-01-12 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20168#discussion_r161366196
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala 
---
@@ -37,20 +37,54 @@ import org.apache.spark.sql.types._
 @Since("2.3.0")
 object ImageSchema {
 
-  val undefinedImageType = "Undefined"
+  /**
+   * OpenCv type representation
+   * @param mode ordinal for the type
+   * @param dataType open cv data type
+   * @param nChannels number of color channels
+   */
+  case class OpenCvType(mode: Int, dataType: String, nChannels: Int) {
+def name: String = if (mode == -1) { "Undefined" } else { 
s"CV_$dataType" + s"C$nChannels" }
+override def toString: String = s"OpenCvType(mode = $mode, name = 
$name)"
+  }
+
+  def ocvTypeByName(name: String): OpenCvType = {
+ocvTypes.find(x => x.name == name).getOrElse(
+  throw new IllegalArgumentException("Unknown open cv type " + name))
+  }
+
+  def ocvTypeByMode(mode: Int): OpenCvType = {
+ocvTypes.find(x => x.mode == mode).getOrElse(
+  throw new IllegalArgumentException("Unknown open cv mode " + mode))
+  }
+
+  val undefinedImageType = OpenCvType(-1, "N/A", -1)
 
   /**
-   * (Scala-specific) OpenCV type mapping supported
+   * A Mapping of Type to Numbers in OpenCV
+   *
+   *C1 C2  C3  C4
+   * CV_8U   0  8  16  24
+   * CV_8S   1  9  17  25
+   * CV_16U  2 10  18  26
+   * CV_16S  3 11  19  27
+   * CV_32S  4 12  20  28
+   * CV_32F  5 13  21  29
+   * CV_64F  6 14  22  30
*/
-  val ocvTypes: Map[String, Int] = Map(
-undefinedImageType -> -1,
-"CV_8U" -> 0, "CV_8UC1" -> 0, "CV_8UC3" -> 16, "CV_8UC4" -> 24
-  )
+  val ocvTypes = {
+val types =
+  for (nc <- Array(1, 2, 3, 4);
+   dt <- Array("8U", "8S", "16U", "16S", "32S", "32F", "64F"))
+yield (dt, nc)
+val ordinals = for (i <- 0 to 3; j <- 0 to 6) yield ( i * 8 + j)
+undefinedImageType +: (ordinals zip types).map(x => OpenCvType(x._1, 
x._2._1, x._2._2))
+  }
 
   /**
-   * (Java-specific) OpenCV type mapping supported
+   *  (Java Specific) list of OpenCv types
--- End diff --

Let's keep as is `(Java-specific)`.


---


[GitHub] spark pull request #20168: [SPARK-22730][ML] Add ImageSchema support for non...

2018-01-12 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20168#discussion_r161366123
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala 
---
@@ -37,20 +37,54 @@ import org.apache.spark.sql.types._
 @Since("2.3.0")
 object ImageSchema {
 
-  val undefinedImageType = "Undefined"
+  /**
+   * OpenCv type representation
+   * @param mode ordinal for the type
+   * @param dataType open cv data type
+   * @param nChannels number of color channels
+   */
+  case class OpenCvType(mode: Int, dataType: String, nChannels: Int) {
+def name: String = if (mode == -1) { "Undefined" } else { 
s"CV_$dataType" + s"C$nChannels" }
+override def toString: String = s"OpenCvType(mode = $mode, name = 
$name)"
+  }
+
+  def ocvTypeByName(name: String): OpenCvType = {
+ocvTypes.find(x => x.name == name).getOrElse(
+  throw new IllegalArgumentException("Unknown open cv type " + name))
+  }
+
+  def ocvTypeByMode(mode: Int): OpenCvType = {
+ocvTypes.find(x => x.mode == mode).getOrElse(
+  throw new IllegalArgumentException("Unknown open cv mode " + mode))
+  }
+
+  val undefinedImageType = OpenCvType(-1, "N/A", -1)
 
   /**
-   * (Scala-specific) OpenCV type mapping supported
+   * A Mapping of Type to Numbers in OpenCV
+   *
+   *C1 C2  C3  C4
+   * CV_8U   0  8  16  24
+   * CV_8S   1  9  17  25
+   * CV_16U  2 10  18  26
+   * CV_16S  3 11  19  27
+   * CV_32S  4 12  20  28
+   * CV_32F  5 13  21  29
+   * CV_64F  6 14  22  30
*/
-  val ocvTypes: Map[String, Int] = Map(
-undefinedImageType -> -1,
-"CV_8U" -> 0, "CV_8UC1" -> 0, "CV_8UC3" -> 16, "CV_8UC4" -> 24
-  )
+  val ocvTypes = {
--- End diff --

Could we set the explicit type?
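For illustration, the same table can be built in Python with the element type spelled out on the resulting value. This is only an analogue of the Scala `ocvTypes` block in the diff, under the assumption that the ordinal is `(nChannels - 1) * 8 + depth index`, as the comment table suggests; the names are hypothetical.

```python
from typing import List, NamedTuple


class OpenCvType(NamedTuple):
    mode: int
    data_type: str
    n_channels: int

    @property
    def name(self) -> str:
        if self.mode == -1:
            return "Undefined"
        return "CV_%sC%d" % (self.data_type, self.n_channels)


DEPTHS = ["8U", "8S", "16U", "16S", "32S", "32F", "64F"]


def build_ocv_types() -> List[OpenCvType]:
    """Return the undefined type followed by all depth/channel combinations."""
    types = [OpenCvType(-1, "N/A", -1)]
    for nc in range(1, 5):
        for j, dt in enumerate(DEPTHS):
            # Each channel count occupies a block of 8 ordinals.
            types.append(OpenCvType((nc - 1) * 8 + j, dt, nc))
    return types


# The explicit annotation documents the element type at the definition site.
OCV_TYPES: List[OpenCvType] = build_ocv_types()
```

With the annotation in place, a reader does not need to trace the comprehension to learn what the value holds.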


---


[GitHub] spark pull request #20168: [SPARK-22730][ML] Add ImageSchema support for non...

2018-01-12 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20168#discussion_r161366218
  
--- Diff: python/pyspark/ml/image.py ---
@@ -71,9 +88,33 @@ def ocvTypes(self):
 """
--- End diff --

Seems we should fix the doc for `:return:`. Seems it's going to be a list 
now.


---


[GitHub] spark pull request #20256: [SPARK-23063][K8S] K8s changes for publishing scr...

2018-01-12 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/20256#discussion_r161366220
  
--- Diff: dev/create-release/releaseutils.py ---
@@ -185,6 +185,7 @@ def get_commits(tag):
 "graphx": "GraphX",
 "input/output": CORE_COMPONENT,
 "java api": "Java API",
+"kubernetes": "Kubernetes",
--- End diff --

So it looks like this mapping is used for both the commit title and the JIRA 
component field, which isn't quite perfect (for example, there is no R here).

In any case, it looks like multiple left-hand values can map to the same 
right-hand value.


---


[GitHub] spark issue #20256: [SPARK-23063][K8S] K8s changes for publishing scripts (a...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20256
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #20256: [SPARK-23063][K8S] K8s changes for publishing scripts (a...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20256
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86077/


---


[GitHub] spark issue #20256: [SPARK-23063][K8S] K8s changes for publishing scripts (a...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20256
  
**[Test build #86077 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86077/testReport)**
 for PR 20256 at commit 
[`df6f49d`](https://github.com/apache/spark/commit/df6f49d8d04da9ff8113929802a3c674c572e9f5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark pull request #20237: [SPARK-22980][PYTHON][SQL] Clarify the length of ...

2018-01-12 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20237


---


[GitHub] spark issue #20237: [SPARK-22980][PYTHON][SQL] Clarify the length of each se...

2018-01-12 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20237
  
Merged to master and branch-2.3.


---


[GitHub] spark issue #20153: [SPARK-22392][SQL] data source v2 columnar batch reader

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20153
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86080/


---


[GitHub] spark issue #20153: [SPARK-22392][SQL] data source v2 columnar batch reader

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20153
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #20153: [SPARK-22392][SQL] data source v2 columnar batch reader

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20153
  
**[Test build #86080 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86080/testReport)**
 for PR 20153 at commit 
[`d666110`](https://github.com/apache/spark/commit/d6661104f314c88ff84057fd4830e7a5fbe964d9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #20247: [SPARK-23021][SQL] AnalysisBarrier should override inner...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20247
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #20247: [SPARK-23021][SQL] AnalysisBarrier should override inner...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20247
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86078/


---


[GitHub] spark issue #20247: [SPARK-23021][SQL] AnalysisBarrier should override inner...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20247
  
**[Test build #86078 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86078/testReport)**
 for PR 20247 at commit 
[`7692099`](https://github.com/apache/spark/commit/7692099c42907682a5ca10fa6a800fcb1a6e745d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #20240: [SPARK-23049][SQL] `spark.sql.files.ignoreCorruptFiles` ...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20240
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #20240: [SPARK-23049][SQL] `spark.sql.files.ignoreCorruptFiles` ...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20240
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86079/


---


[GitHub] spark issue #20240: [SPARK-23049][SQL] `spark.sql.files.ignoreCorruptFiles` ...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20240
  
**[Test build #86079 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86079/testReport)**
 for PR 20240 at commit 
[`33ae3ca`](https://github.com/apache/spark/commit/33ae3ca34aa237c630927c96d9421ea53ed6a775).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #20151: [SPARK-22959][PYTHON] Configuration to select the module...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20151
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86075/


---


[GitHub] spark issue #20151: [SPARK-22959][PYTHON] Configuration to select the module...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20151
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #20151: [SPARK-22959][PYTHON] Configuration to select the module...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20151
  
**[Test build #86075 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86075/testReport)**
 for PR 20151 at commit 
[`fc65803`](https://github.com/apache/spark/commit/fc658034639c1aa56ff5b9a44624cad05377fe51).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark pull request #20256: [SPARK-23063][K8S] K8s changes for publishing scr...

2018-01-12 Thread foxish

Github user foxish commented on a diff in the pull request:

https://github.com/apache/spark/pull/20256#discussion_r161365711
  
--- Diff: dev/create-release/releaseutils.py ---
@@ -185,6 +185,7 @@ def get_commits(tag):
 "graphx": "GraphX",
 "input/output": CORE_COMPONENT,
 "java api": "Java API",
+"kubernetes": "Kubernetes",
--- End diff --

Can we supply a list/tuple there? I've updated it to `K8S`, but sometimes 
folks have written `k8s` or `kubernetes` in the PR titles by the looks of it.


---


[GitHub] spark issue #20256: [SPARK-23063][K8S] K8s changes for publishing scripts (a...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20256
  
**[Test build #86084 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86084/testReport)**
 for PR 20256 at commit 
[`73fb21e`](https://github.com/apache/spark/commit/73fb21e4e6fc12bd9d77b98ade8b2ed011b8d68f).


---


[GitHub] spark pull request #20232: [SPARK-23042][ML] Use OneHotEncoderModel to encod...

2018-01-12 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/20232#discussion_r161365548
  
--- Diff: R/pkg/tests/fulltests/test_mllib_classification.R ---
@@ -382,10 +382,10 @@ test_that("spark.mlp", {
   trainidxs <- base::sample(nrow(data), nrow(data) * 0.7)
   traindf <- as.DataFrame(data[trainidxs, ])
   testdf <- as.DataFrame(rbind(data[-trainidxs, ], c(0, "the other")))
-  model <- spark.mlp(traindf, clicked ~ ., layers = c(1, 3))
+  model <- spark.mlp(traindf, clicked ~ ., layers = c(1, 2))
--- End diff --

ok; I think perhaps we need to release-note this (like 
[this](http://spark.apache.org/docs/latest/sparkr.html#upgrading-to-sparkr-220))


---


[GitHub] spark pull request #20163: [SPARK-22966][PYTHON][SQL] Python UDFs with retur...

2018-01-12 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20163#discussion_r161365496
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvaluatePython.scala
 ---
@@ -144,6 +145,7 @@ object EvaluatePython {
 }
 
 case StringType => (obj: Any) => nullSafeConvert(obj) {
+  case _: Calendar => null
   case _ => UTF8String.fromString(obj.toString)
--- End diff --

@cloud-fan, how about something like this then?

```scala
case StringType => (obj: Any) => nullSafeConvert(obj) {
  // Shortcut for string conversion
  case c: String => UTF8String.fromString(c)

  // Here, we return null for 'array', 'tuple', 'dict', 'list', 'datetime.datetime',
  // 'datetime.date' and 'datetime.time' because those string conversions are
  // not quite consistent with SQL string representation of data.
  case _: java.util.Calendar | _: net.razorvine.pickle.objects.Time |
       _: java.util.List[_] | _: java.util.Map[_, _] =>
    null
  case c if c.getClass.isArray => null

  // Here, we keep the string conversion fall back for compatibility.
  // TODO: We should revisit this and rewrite the type conversion logic in Spark 3.x.
  case other => UTF8String.fromString(other.toString)
}
```

My few tests:

`datetime.time`:

```
from pyspark.sql.functions import udf
from datetime import time

f = udf(lambda x: time(0, 0), "string")
spark.range(1).select(f("id")).show()
```

```
++
|(id)|
++
|Time: 0 hours, 0 ...|
++
```

`array`:

```
from pyspark.sql.functions import udf
import array

f = udf(lambda x: array.array("c", "aaa"), "string")
spark.range(1).select(f("id")).show()
```

```
++
|(id)|
++
| [C@11618d9e|
++
```

`tuple`:

```
from pyspark.sql.functions import udf

f = udf(lambda x: (x,), "string")
spark.range(1).select(f("id")).show()
```

```
++
|(id)|
++
|[Ljava.lang.Objec...|
++
```

`list`:

```
from pyspark.sql.functions import udf
from datetime import datetime

f = udf(lambda x: [datetime(1990, 1, 1)], "string")
spark.range(1).select(f("id")).show()
```

```
++
|(id)|
++
|[java.util.Gregor...|
++
```
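For readers who think in Python types, the rule proposed in the Scala snippet above can be modelled roughly as follows. This is only a Python-side analogue under the assumption that the listed Python types are the ones meant to yield null; in Spark the check runs on the JVM against the unpickled Java objects, and `to_sql_string` is a hypothetical name.

```python
import array
import datetime


def to_sql_string(obj):
    """Model of the proposed StringType conversion for a UDF return value."""
    if obj is None:
        return None
    # Shortcut for values that are already strings.
    if isinstance(obj, str):
        return obj
    # Containers and date/time values have no string form that matches the
    # SQL representation, so they map to None (null) in this model.
    if isinstance(obj, (list, tuple, dict, array.array,
                        datetime.datetime, datetime.date, datetime.time)):
        return None
    # Everything else keeps the string-conversion fallback.
    return str(obj)


for value in ("abc", 1, (1,), [datetime.datetime(1990, 1, 1)], datetime.time(0, 0)):
    print(type(value).__name__, "->", to_sql_string(value))
```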





---


[GitHub] spark pull request #20254: [SPARK-23062][SQL] Improve EXCEPT documentation

2018-01-12 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/20254#discussion_r161365422
  
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1364,7 +1364,9 @@ def subtract(self, other):
 """ Return a new :class:`DataFrame` containing rows in this frame
 but not in another frame.
 
-This is equivalent to `EXCEPT` in SQL.
+This is equivalent to `EXCEPT DISTINCT` in SQL.
+
+(Note: Before Spark 2.0, the behavior was equivalent to `EXCEPT 
ALL` in SQL.)
--- End diff --

nit: `2.0` to `2.0.0`
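To make the distinction in the docstring concrete, here is a small pure-Python sketch of the two SQL semantics on lists of rows. It illustrates standard `EXCEPT ALL` versus `EXCEPT DISTINCT` behaviour only; the function names are hypothetical and this is not how `subtract` is implemented.

```python
from collections import Counter


def except_all(left, right):
    """Multiset difference: each right row cancels one matching left row."""
    budget = Counter(right)
    out = []
    for row in left:
        if budget[row] > 0:
            budget[row] -= 1
        else:
            out.append(row)
    return out


def except_distinct(left, right):
    """Set difference: drop rows present in right, then de-duplicate."""
    excluded = set(right)
    seen = set()
    out = []
    for row in left:
        if row not in excluded and row not in seen:
            seen.add(row)
            out.append(row)
    return out


left = ["a", "a", "a", "b", "c", "c"]
right = ["a", "b"]
print(except_all(left, right))       # ['a', 'a', 'c', 'c']
print(except_distinct(left, right))  # ['c']
```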


---


[GitHub] spark pull request #20254: [SPARK-23062][SQL] Improve EXCEPT documentation

2018-01-12 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/20254#discussion_r161365371
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -2873,6 +2873,7 @@ setMethod("intersect",
 #' @rdname except
 #' @export
 #' @note except since 1.4.0
+#' @note behaviour changed from \code{EXCEPT ALL} to \code{EXCEPT 
DISTINCT} in 2.0.
--- End diff --

I don't mind either way, but note:
- In R docs, order and whitespace are significant. If you use `#' Note:`, you 
must put it after L2856; if you add an extra `#'` (i.e. an empty line), it 
becomes the `Details` section, which might be the right place. See 
http://spark.apache.org/docs/latest/api/R/awaitTermination.html


---


[GitHub] spark pull request #20254: [SPARK-23062][SQL] Improve EXCEPT documentation

2018-01-12 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/20254#discussion_r161365416
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -2873,6 +2873,7 @@ setMethod("intersect",
 #' @rdname except
 #' @export
 #' @note except since 1.4.0
+#' @note behaviour changed from \code{EXCEPT ALL} to \code{EXCEPT 
DISTINCT} in 2.0.
--- End diff --

ie.
```
#' but not in another SparkDataFrame. This is equivalent to \code{EXCEPT DISTINCT} in SQL.
#'
#' Note: Before Spark 2.0.0, the behavior was equivalent to `EXCEPT ALL` in SQL.
#'
#' @param x a SparkDataFrame.
```


---


[GitHub] spark pull request #20256: [SPARK-23063][K8S] K8s changes for publishing scr...

2018-01-12 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/20256#discussion_r161365293
  
--- Diff: dev/create-release/releaseutils.py ---
@@ -185,6 +185,7 @@ def get_commits(tag):
 "graphx": "GraphX",
 "input/output": CORE_COMPONENT,
 "java api": "Java API",
+"kubernetes": "Kubernetes",
--- End diff --

this is for the PR title [foo] - I think [k8s] is more widely used, maybe 
both


---


[GitHub] spark issue #19001: [SPARK-19256][SQL] Hive bucketing support

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19001
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86074/


---


[GitHub] spark issue #19001: [SPARK-19256][SQL] Hive bucketing support

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19001
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #19001: [SPARK-19256][SQL] Hive bucketing support

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19001
  
**[Test build #86074 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86074/testReport)**
 for PR 19001 at commit 
[`3c367a0`](https://github.com/apache/spark/commit/3c367a08fa5290081e82d45ea7bf564277f196b0).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  throw new IOException(\"Cannot find class \" + inputFormatClassName, e);`
  * `  throw new IOException(\"Unable to find the InputFormat class \" + inputFormatClassName, e);`


---


[GitHub] spark issue #20080: [SPARK-22870][CORE] Dynamic allocation should allow 0 id...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20080
  
**[Test build #86083 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86083/testReport)**
 for PR 20080 at commit 
[`b03a496`](https://github.com/apache/spark/commit/b03a4968976781dff03961abc5caedae10ef10aa).


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread jose-torres

Github user jose-torres commented on the issue:

https://github.com/apache/spark/pull/20253
  
Kicked off the 5 runs.

I noticed one of the earlier runs actually failed in PySpark. I don't know 
of a plausible mechanism by which this PR could cause that.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20253
  
**[Test build #4043 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4043/testReport)**
 for PR 20253 at commit 
[`4bb9c3f`](https://github.com/apache/spark/commit/4bb9c3f06f4da1c14ab24ad6a642bf831c90503f).


---


[GitHub] spark issue #20153: [SPARK-22392][SQL] data source v2 columnar batch reader

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20153
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20253
  
**[Test build #4044 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4044/testReport)**
 for PR 20253 at commit 
[`4bb9c3f`](https://github.com/apache/spark/commit/4bb9c3f06f4da1c14ab24ad6a642bf831c90503f).


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20253
  
**[Test build #4045 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4045/testReport)**
 for PR 20253 at commit 
[`4bb9c3f`](https://github.com/apache/spark/commit/4bb9c3f06f4da1c14ab24ad6a642bf831c90503f).


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20253
  
**[Test build #4042 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4042/testReport)**
 for PR 20253 at commit 
[`4bb9c3f`](https://github.com/apache/spark/commit/4bb9c3f06f4da1c14ab24ad6a642bf831c90503f).


---


[GitHub] spark issue #20153: [SPARK-22392][SQL] data source v2 columnar batch reader

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20153
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86072/
Test PASSed.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread jose-torres

Github user jose-torres commented on the issue:

https://github.com/apache/spark/pull/20253
  
In commit 4bb9c3f, no occurrences of the string 
"processAllAvailable" remain in KafkaContinuousSourceSuite.scala, 
KafkaContinuousSinkSuite.scala, KafkaContinuousTest.scala, or StreamTest.scala.

There are four occurrences in KafkaSourceSuite.scala: two within the 
MicroBatch suite and two behind a match so that they do not run under 
ContinuousExecution. (One test with a foreach sink was moved to the MicroBatch 
suite, because it was executing in microbatch mode anyway; foreach has not yet 
been updated for continuous processing.)

I believe this is exhaustive.


---


[GitHub] spark issue #20153: [SPARK-22392][SQL] data source v2 columnar batch reader

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20153
  
**[Test build #86072 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86072/testReport)**
 for PR 20153 at commit 
[`4a6a725`](https://github.com/apache/spark/commit/4a6a725acffdc24f7c00302c1a0081c93f6acdd8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20253
  
**[Test build #86082 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86082/testReport)**
 for PR 20253 at commit 
[`4bb9c3f`](https://github.com/apache/spark/commit/4bb9c3f06f4da1c14ab24ad6a642bf831c90503f).


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20253
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86071/
Test PASSed.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20253
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20253
  
**[Test build #86071 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86071/testReport)**
 for PR 20253 at commit 
[`71bfbcf`](https://github.com/apache/spark/commit/71bfbcfbca3b8bce064c790a92dbab59a9414934).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread jose-torres

Github user jose-torres commented on the issue:

https://github.com/apache/spark/pull/20253
  
The builds at commit 3fe76e3 were on an incomplete version of the PR, so 
their failures are expected.

Of the 6 builds at target commit 0efc8c5, 5 passed and 1 failed. The 
failure was due to a stream.processAllAvailable() call in the Kafka suite; we 
already knew this method is inherently flake-prone for continuous processing, 
and had attempted to remove usages of it. I'm going to do another pass to get 
the rest, and then kick off 5 more attempts.


---


[GitHub] spark pull request #20248: [SPARK-23058][SQL] Show non printable field delim...

2018-01-12 Thread wangyum

Github user wangyum commented on a diff in the pull request:

https://github.com/apache/spark/pull/20248#discussion_r161364269
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ---
@@ -1023,7 +1023,12 @@ case class ShowCreateTableCommand(table: TableIdentifier) extends RunnableComman
 
       val serdeProps = metadata.storage.properties.map {
         case (key, value) =>
-          s"'${escapeSingleQuotedString(key)}' = '${escapeSingleQuotedString(value)}'"
+          val escapedValue = if (value.length == 1 && (value.head < 32 || value.head > 126)) {
--- End diff --

I need to copy an external table to another environment, but I lost the 
original create table statement. I want to recover it with `show create 
table ...`, but the output can't show a non-printable field delimiter.
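
As a rough illustration of the check in the diff above, here is a sketch in Python rather than the Scala of the PR. The function name `escape_field_delim` and the `\uXXXX` output format are assumptions for illustration; only the condition (a one-character value with a code below 32 or above 126) comes from the diff.

```python
def escape_field_delim(value: str) -> str:
    """Render a single non-printable character as a \\uXXXX escape.

    Mirrors the condition in the diff: only one-character values whose
    code is below 32 or above 126 are rewritten.
    """
    if len(value) == 1 and (ord(value) < 32 or ord(value) > 126):
        # Assumed output format: backslash, 'u', four hex digits.
        return "\\u{:04X}".format(ord(value))
    return value


if __name__ == "__main__":
    print(escape_field_delim("\x01"))  # Ctrl-A, Hive's default field delimiter
    print(escape_field_delim(","))     # printable, returned unchanged
```

Multi-character values are left alone, which matches the `value.length == 1` guard in the patch.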


---


[GitHub] spark issue #20246: [SPARK-23054][SQL] Fix incorrect results of casting User...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20246
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #20246: [SPARK-23054][SQL] Fix incorrect results of casting User...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20246
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86070/
Test PASSed.


---


[GitHub] spark issue #20246: [SPARK-23054][SQL] Fix incorrect results of casting User...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20246
  
**[Test build #86070 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86070/testReport)**
 for PR 20246 at commit 
[`137d85f`](https://github.com/apache/spark/commit/137d85f23fa8d0e45144db89666f4c9083d14100).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #20248: [SPARK-23058][SQL] Show non printable field delim as uni...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20248
  
**[Test build #86081 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86081/testReport)**
 for PR 20248 at commit 
[`edf5fa6`](https://github.com/apache/spark/commit/edf5fa6e8ee29bf237a6d61dee1146f297bd570f).


---


[GitHub] spark issue #20254: [SPARK-23062][SQL] Improve EXCEPT documentation

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20254
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86068/
Test PASSed.


---


[GitHub] spark issue #20254: [SPARK-23062][SQL] Improve EXCEPT documentation

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20254
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #20254: [SPARK-23062][SQL] Improve EXCEPT documentation

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20254
  
**[Test build #86068 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86068/testReport)**
 for PR 20254 at commit 
[`5562a16`](https://github.com/apache/spark/commit/5562a1665bebf413d5c4126642a77e2d9d0c4a46).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark pull request #20163: [SPARK-22966][PYTHON][SQL] Python UDFs with retur...

2018-01-12 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20163#discussion_r161363813
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvaluatePython.scala
 ---
@@ -144,6 +145,7 @@ object EvaluatePython {
 }
 
 case StringType => (obj: Any) => nullSafeConvert(obj) {
+  case _: Calendar => null
   case _ => UTF8String.fromString(obj.toString)
--- End diff --

I don't think there is a perfect solution. I think 
https://github.com/apache/spark/pull/20163#discussion_r161363004 sounds good 
enough as a fix for this issue for now.


---


[GitHub] spark pull request #20163: [SPARK-22966][PYTHON][SQL] Python UDFs with retur...

2018-01-12 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20163#discussion_r161363630
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvaluatePython.scala
 ---
@@ -144,6 +145,7 @@ object EvaluatePython {
 }
 
 case StringType => (obj: Any) => nullSafeConvert(obj) {
+  case _: Calendar => null
   case _ => UTF8String.fromString(obj.toString)
--- End diff --

BTW, there seems to be another hole when we actually do the internal 
conversion with unexpected types:

```python
>>> from pyspark.sql.functions import udf
>>> f = udf(lambda x: x, "date")
>>> spark.range(1).select(f("id")).show()
```

```
org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "./python/pyspark/worker.py", line 229, in main
    process()
  File "./python/pyspark/worker.py", line 224, in process
    serializer.dump_stream(func(split_index, iterator), outfile)
  File "./python/pyspark/worker.py", line 149, in <lambda>
    func = lambda _, it: map(mapper, it)
  File "<string>", line 1, in <lambda>
  File "./python/pyspark/worker.py", line 72, in <lambda>
    return lambda *a: toInternal(f(*a))
  File "/.../pyspark/sql/types.py", line 175, in toInternal
    return d.toordinal() - self.EPOCH_ORDINAL
AttributeError: 'int' object has no attribute 'toordinal'
```
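
The same failure can be reproduced without Spark. The sketch below is a standalone stand-in for the date conversion the traceback points at (`toInternal` in `pyspark/sql/types.py`). The names `date_to_internal` and `checked_date_to_internal` are hypothetical, and the guarded variant shows only one possible behavior, not what this PR implements.

```python
import datetime

# Ordinal of the Unix epoch; dates are stored as days since this day.
EPOCH_ORDINAL = datetime.datetime(1970, 1, 1).toordinal()


def date_to_internal(d):
    """Unchecked conversion: assumes d is a datetime.date (or None)."""
    if d is not None:
        return d.toordinal() - EPOCH_ORDINAL
    return None


def checked_date_to_internal(d):
    """Hypothetical guarded variant: a mismatched value becomes None."""
    if isinstance(d, datetime.date):
        return d.toordinal() - EPOCH_ORDINAL
    return None


if __name__ == "__main__":
    print(date_to_internal(datetime.date(1970, 1, 2)))
    print(checked_date_to_internal(0))
    try:
        date_to_internal(0)  # an int where a date was declared
    except AttributeError as e:
        print(e)
```

The unchecked version fails on the executor because an `int` has no `toordinal`; the guarded version trades that error for a silent null, which may or may not be the behavior we want.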



---


[GitHub] spark pull request #20163: [SPARK-22966][PYTHON][SQL] Python UDFs with retur...

2018-01-12 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20163#discussion_r161363023
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvaluatePython.scala
 ---
@@ -144,6 +145,7 @@ object EvaluatePython {
 }
 
 case StringType => (obj: Any) => nullSafeConvert(obj) {
+  case _: Calendar => null
   case _ => UTF8String.fromString(obj.toString)
--- End diff --

Oh, I didn't see the comment above when I wrote my comment.


---


[GitHub] spark pull request #20163: [SPARK-22966][PYTHON][SQL] Python UDFs with retur...

2018-01-12 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20163#discussion_r161363004
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvaluatePython.scala
 ---
@@ -144,6 +145,7 @@ object EvaluatePython {
 }
 
 case StringType => (obj: Any) => nullSafeConvert(obj) {
+  case _: Calendar => null
   case _ => UTF8String.fromString(obj.toString)
--- End diff --

So, for now, I think it's fine as a small fix as is. We are going to 
document that the return type and return value should match anyway.

So, the expected return values will be:

```python
# Mapping Python types to Spark SQL DataType
_type_mappings = {
    type(None): NullType,
    bool: BooleanType,
    int: LongType,
    float: DoubleType,
    str: StringType,
    bytearray: BinaryType,
    decimal.Decimal: DecimalType,
    datetime.date: DateType,
    datetime.datetime: TimestampType,
    datetime.time: TimestampType,
}
```

It seems we can also check whether the string conversion looks reasonable, 
and blacklist `net.razorvine.pickle.objects.Time` if it does not.

How does this sound to you @cloud-fan and @rednaxelafx?
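
To make "the return type and return value should match" concrete, here is a minimal sketch of such a check. It is not part of the PR: `TYPE_NAMES` and `matches_declared` are hypothetical names, and type names are plain strings so that the snippet runs without pyspark.

```python
import datetime
import decimal

# Python type -> Spark SQL type name, following the mapping quoted above.
# Strings are used instead of pyspark classes so the sketch runs without Spark.
TYPE_NAMES = {
    type(None): "NullType",
    bool: "BooleanType",
    int: "LongType",
    float: "DoubleType",
    str: "StringType",
    bytearray: "BinaryType",
    decimal.Decimal: "DecimalType",
    datetime.date: "DateType",
    datetime.datetime: "TimestampType",
    datetime.time: "TimestampType",
}


def matches_declared(value, declared):
    """Return True if value is None or its exact type maps to `declared`."""
    # type(value) is an exact lookup, so True maps to BooleanType and not to
    # LongType even though bool is a subclass of int.
    return value is None or TYPE_NAMES.get(type(value)) == declared


if __name__ == "__main__":
    print(matches_declared("a", "StringType"))
    print(matches_declared(datetime.date(2018, 1, 12), "StringType"))
```

Treating `None` as matching every declared type is an assumption here, made because a UDF can return null for any return type.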


---


[GitHub] spark pull request #20163: [SPARK-22966][PYTHON][SQL] Python UDFs with retur...

2018-01-12 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20163#discussion_r161362994
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvaluatePython.scala
 ---
@@ -144,6 +145,7 @@ object EvaluatePython {
 }
 
 case StringType => (obj: Any) => nullSafeConvert(obj) {
+  case _: Calendar => null
   case _ => UTF8String.fromString(obj.toString)
--- End diff --

> check if the string conversion looks reasonably consistent by 
obj.toString. If not, we add it in the blacklist.

Hmm, this seems weird, as the type mismatch is now defined by the 
`toString` behavior of the Pyrolite object...


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20253
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86064/
Test PASSed.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20253
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20253
  
**[Test build #4037 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4037/testReport)**
 for PR 20253 at commit 
[`3fe76e3`](https://github.com/apache/spark/commit/3fe76e30a01698ce8732044a0c663baa277605cb).
 * This patch **fails from timeout after a configured wait of `250m`**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20253
  
**[Test build #86064 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86064/testReport)**
 for PR 20253 at commit 
[`5f4f7cf`](https://github.com/apache/spark/commit/5f4f7cf6662a389abe42bfcc433d2035c5d1c35e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #20214: [SPARK-23023][SQL] Cast field data to strings in showStr...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20214
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #20214: [SPARK-23023][SQL] Cast field data to strings in showStr...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20214
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86069/
Test FAILed.


---


[GitHub] spark issue #20153: [SPARK-22392][SQL] data source v2 columnar batch reader

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20153
  
**[Test build #86080 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86080/testReport)**
 for PR 20153 at commit 
[`d666110`](https://github.com/apache/spark/commit/d6661104f314c88ff84057fd4830e7a5fbe964d9).


---


[GitHub] spark issue #20214: [SPARK-23023][SQL] Cast field data to strings in showStr...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20214
  
**[Test build #86069 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86069/testReport)**
 for PR 20214 at commit 
[`022ed32`](https://github.com/apache/spark/commit/022ed327bc7e2fd3a5cbd498d21183f0eabf2a26).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark pull request #20163: [SPARK-22966][PYTHON][SQL] Python UDFs with retur...

2018-01-12 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20163#discussion_r161362902
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvaluatePython.scala
 ---
@@ -144,6 +145,7 @@ object EvaluatePython {
 }
 
 case StringType => (obj: Any) => nullSafeConvert(obj) {
+  case _: Calendar => null
   case _ => UTF8String.fromString(obj.toString)
--- End diff --

For completeness, I think we should check all the types listed at 
https://github.com/irmen/Pyrolite:

```
PYTHON                                      ---->  JAVA
------                                             ----
None                                        null
bool                                        boolean
int                                         int
long                                        long or BigInteger  (depending on size)
string                                      String
unicode                                     String
complex                                     net.razorvine.pickle.objects.ComplexNumber
datetime.date                               java.util.Calendar
datetime.datetime                           java.util.Calendar
datetime.time                               net.razorvine.pickle.objects.Time
datetime.timedelta                          net.razorvine.pickle.objects.TimeDelta
float                                       double   (float isn't used)
array.array                                 array of appropriate primitive type (char, int, short, long, float, double)
list                                        java.util.List
tuple                                       Object[]
set                                         java.util.Set
dict                                        java.util.Map
bytes                                       byte[]
bytearray                                   byte[]
decimal                                     BigDecimal
custom class                                Map  (dict with class attributes including its name in "__class__")
Pyro4.core.URI                              net.razorvine.pyro.PyroURI
Pyro4.core.Proxy                            net.razorvine.pyro.PyroProxy
Pyro4.errors.*                              net.razorvine.pyro.PyroException
Pyro4.utils.flame.FlameBuiltin              net.razorvine.pyro.FlameBuiltin
Pyro4.utils.flame.FlameModule               net.razorvine.pyro.FlameModule
Pyro4.utils.flame.RemoteInteractiveConsole  net.razorvine.pyro.FlameRemoteConsole
```

and then check if the string conversion looks reasonably consistent by 
`obj.toString`. If not, we add it in the blacklist.

Another possibility is to whitelist `String`, but I guess that would be 
a rather radical change.
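
The difference between the two options can be sketched as follows. This is an illustration in Python, not the Scala converter from the diff: `STRING_BLACKLIST`, `to_string_blacklist`, and `to_string_whitelist` are hypothetical names, and Python-side types stand in for the Pyrolite/Java classes in the table.

```python
import datetime

# Hypothetical blacklist: types whose string form should not be accepted
# as a StringType value. datetime.datetime is covered as a subclass of date.
STRING_BLACKLIST = (datetime.date, datetime.time)


def to_string_blacklist(obj):
    """Blacklist approach: reject listed types, stringify everything else."""
    if obj is None or isinstance(obj, STRING_BLACKLIST):
        return None
    return str(obj)


def to_string_whitelist(obj):
    """Whitelist approach: only real strings pass through."""
    return obj if isinstance(obj, str) else None


if __name__ == "__main__":
    for value in ("a", 1, datetime.date(2018, 1, 12)):
        print(repr(value), to_string_blacklist(value), to_string_whitelist(value))
```

With the blacklist, a value such as `1` is still converted to `"1"`, as in the `case _ =>` branch of the diff. With the whitelist it would become null, which is why that option would presumably be the more disruptive change for existing UDFs.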


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20253
  
**[Test build #4038 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4038/testReport)**
 for PR 20253 at commit 
[`0efc8c5`](https://github.com/apache/spark/commit/0efc8c5b7e98f3e79361f355a08fc8404d2d7d9b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #20251: [Spark-23051][core] Fix for broken job description in Sp...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20251
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86066/
Test FAILed.


---


[GitHub] spark issue #20251: [Spark-23051][core] Fix for broken job description in Sp...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20251
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #20251: [Spark-23051][core] Fix for broken job description in Sp...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20251
  
**[Test build #86066 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86066/testReport)**
 for PR 20251 at commit 
[`d9cdb07`](https://github.com/apache/spark/commit/d9cdb07263f7a584cf217d30c55313283459ac92).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20253
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86063/
Test FAILed.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20253
  
Build finished. Test FAILed.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20253
  
**[Test build #86063 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86063/testReport)**
 for PR 20253 at commit 
[`3fe76e3`](https://github.com/apache/spark/commit/3fe76e30a01698ce8732044a0c663baa277605cb).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---


[GitHub] spark issue #20254: [SPARK-23062][SQL] Improve EXCEPT documentation

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20254
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86067/
Test PASSed.


---


[GitHub] spark issue #20254: [SPARK-23062][SQL] Improve EXCEPT documentation

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20254
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #20254: [SPARK-23062][SQL] Improve EXCEPT documentation

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20254
  
**[Test build #86067 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86067/testReport)**
 for PR 20254 at commit 
[`9fe5707`](https://github.com/apache/spark/commit/9fe57074b496ad95411c4ce5a43b0c43dd6246af).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #20240: [SPARK-23049][SQL] `spark.sql.files.ignoreCorruptFiles` ...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20240
  
**[Test build #86079 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86079/testReport)**
 for PR 20240 at commit 
[`33ae3ca`](https://github.com/apache/spark/commit/33ae3ca34aa237c630927c96d9421ea53ed6a775).


---


[GitHub] spark issue #20247: [SPARK-23021][SQL] AnalysisBarrier should override inner...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20247
  
**[Test build #86078 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86078/testReport)**
 for PR 20247 at commit 
[`7692099`](https://github.com/apache/spark/commit/7692099c42907682a5ca10fa6a800fcb1a6e745d).


---


[GitHub] spark issue #20240: [SPARK-23049][SQL] `spark.sql.files.ignoreCorruptFiles` ...

2018-01-12 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/20240
  
Retest this please.


---


[GitHub] spark pull request #20153: [SPARK-22392][SQL] data source v2 columnar batch ...

2018-01-12 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20153#discussion_r161362565
  
--- Diff: sql/core/src/test/java/test/org/apache/spark/sql/sources/v2/JavaBatchDataSourceV2.java ---
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package test.org.apache.spark.sql.sources.v2;
+
+import java.io.IOException;
+import java.util.List;
+
+import org.apache.spark.sql.execution.vectorized.OnHeapColumnVector;
+import org.apache.spark.sql.sources.v2.DataSourceV2;
+import org.apache.spark.sql.sources.v2.DataSourceV2Options;
+import org.apache.spark.sql.sources.v2.ReadSupport;
+import org.apache.spark.sql.sources.v2.reader.*;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.StructType;
+import org.apache.spark.sql.vectorized.ColumnVector;
+import org.apache.spark.sql.vectorized.ColumnarBatch;
+
+
+public class JavaBatchDataSourceV2 implements DataSourceV2, ReadSupport {
+
+  class Reader implements DataSourceV2Reader, SupportsScanColumnarBatch {
--- End diff --

This is the convention. If we implement many mix-in interfaces, it's better to write:
```
MyReader extends DataSourceV2Reader, XXX, YYY, ZZZ ...
```


---


[GitHub] spark issue #20247: [SPARK-23021][SQL] AnalysisBarrier should override inner...

2018-01-12 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20247
  
retest this please.


---


[GitHub] spark issue #20247: [SPARK-23021][SQL] AnalysisBarrier should override inner...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20247
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #20247: [SPARK-23021][SQL] AnalysisBarrier should override inner...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20247
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86056/
Test FAILed.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20253
  
**[Test build #4041 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4041/testReport)**
 for PR 20253 at commit 
[`0efc8c5`](https://github.com/apache/spark/commit/0efc8c5b7e98f3e79361f355a08fc8404d2d7d9b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #20247: [SPARK-23021][SQL] AnalysisBarrier should override inner...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20247
  
**[Test build #86056 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86056/testReport)**
 for PR 20247 at commit 
[`7692099`](https://github.com/apache/spark/commit/7692099c42907682a5ca10fa6a800fcb1a6e745d).
 * This patch **fails from timeout after a configured wait of \`300m\`**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20253
  
**[Test build #4040 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4040/testReport)**
 for PR 20253 at commit 
[`0efc8c5`](https://github.com/apache/spark/commit/0efc8c5b7e98f3e79361f355a08fc8404d2d7d9b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20253
  
Build finished. Test FAILed.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20253
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86060/
Test FAILed.


---


[GitHub] spark issue #20253: [SPARK-22908][SS] Roll forward continuous processing Kaf...

2018-01-12 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20253
  
**[Test build #86060 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86060/testReport)**
 for PR 20253 at commit 
[`f575483`](https://github.com/apache/spark/commit/f5754837efbdca10398b769be07eaf53ae36f0f3).
 * This patch **fails from timeout after a configured wait of \`250m\`**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---

