date:20170923

[GitHub] spark issue #19293: [SPARK-22079][SQL] Serializer in HiveOutputWriter miss l...

2017-09-23 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19293
  
@LantaoJin Please add a test case.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #19293: [SPARK-22079][SQL] Serializer in HiveOutputWriter miss l...

2017-09-23 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19293
  
ok to test


---


[GitHub] spark issue #19321: [SPARK-22100] [SQL] Make percentile_approx support numer...

2017-09-23 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19321
  
Could you document the change in the output type of `percentile_approx` in 
the following section?


https://spark.apache.org/docs/latest/sql-programming-guide.html#migration-guide


---


[GitHub] spark pull request #19321: [SPARK-22100] [SQL] Make percentile_approx suppor...

2017-09-23 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19321#discussion_r140647928
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
 ---
@@ -123,7 +124,13 @@ case class ApproximatePercentile(
     val value = child.eval(inputRow)
     // Ignore empty rows, for example: percentile_approx(null)
     if (value != null) {
-      buffer.add(value.asInstanceOf[Double])
+      // Convert the value to a double value
+      val doubleValue = child.dataType match {
+        case DateType => value.asInstanceOf[Int].toDouble
+        case TimestampType => value.asInstanceOf[Long].toDouble
+        case n: NumericType => n.numeric.toDouble(value.asInstanceOf[n.InternalType])
--- End diff --

The same here.


---


[GitHub] spark pull request #19321: [SPARK-22100] [SQL] Make percentile_approx suppor...

2017-09-23 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19321#discussion_r140647683
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
 ---
@@ -134,7 +141,18 @@ case class ApproximatePercentile(
   }
 
   override def eval(buffer: PercentileDigest): Any = {
-    val result = buffer.getPercentiles(percentages)
+    val doubleResult = buffer.getPercentiles(percentages)
+    val result = child.dataType match {
+      case DateType => doubleResult.map(_.toInt)
+      case TimestampType => doubleResult.map(_.toLong)
+      case ByteType => doubleResult.map(_.toByte)
+      case ShortType => doubleResult.map(_.toShort)
+      case IntegerType => doubleResult.map(_.toInt)
+      case LongType => doubleResult.map(_.toLong)
+      case FloatType => doubleResult.map(_.toFloat)
+      case DoubleType => doubleResult
+      case _: DecimalType => doubleResult.map(Decimal(_))
--- End diff --

Add 
```Scala
case other: DataType =>
  throw new UnsupportedOperationException(s"Unexpected data type $other")
```
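
A minimal sketch of why the fallback case helps, under simplifying assumptions: `SketchType` and its cases below are hypothetical stand-ins for Spark's `DataType` hierarchy (they are not Spark classes), so the snippet runs without Spark on the classpath. It shows percentiles computed as doubles being converted back to the input type, with an unexpected type failing with a descriptive error rather than a bare `MatchError`.

```scala
// Stand-alone sketch. `SketchType` and its cases are hypothetical stand-ins
// for Spark's DataType hierarchy; they are not part of Spark.
object PercentileTypeSketch {
  sealed trait SketchType
  case object IntLike extends SketchType    // e.g. an Int-backed type
  case object LongLike extends SketchType   // e.g. a Long-backed type
  case object DoubleLike extends SketchType
  case object StringLike extends SketchType // a type the conversion does not handle

  // Convert percentiles computed as doubles back to the input column's type.
  def fromDouble(dataType: SketchType, percentiles: Seq[Double]): Seq[Any] =
    dataType match {
      case IntLike => percentiles.map(_.toInt)
      case LongLike => percentiles.map(_.toLong)
      case DoubleLike => percentiles
      // Explicit fallback, as suggested in the review comment above.
      case other =>
        throw new UnsupportedOperationException(s"Unexpected data type $other")
    }
}
```

For example, `PercentileTypeSketch.fromDouble(PercentileTypeSketch.IntLike, Seq(1.0, 2.9))` yields `Seq(1, 2)`, since `toInt` truncates toward zero.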


---


[GitHub] spark issue #19317: [SPARK-22098][CORE] Add new method aggregateByKeyLocally...

2017-09-23 Thread ConeyLiu

Github user ConeyLiu commented on the issue:

https://github.com/apache/spark/pull/19317
  
Won't `treeAggregate` introduce another `Shuffle`?


---


[GitHub] spark pull request #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadi...

2017-09-23 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19286


---


[GitHub] spark issue #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTrunc...

2017-09-23 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19286
  
Thanks! Merged to master


---


[GitHub] spark issue #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTrunc...

2017-09-23 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19286
  
LGTM


---


[GitHub] spark issue #19330: Orderable MapType

2017-09-23 Thread jinxing64

Github user jinxing64 commented on the issue:

https://github.com/apache/spark/pull/19330
  
@hvanhovell
Thanks a lot for the comment.
I got your point. I will refine it soon.


---


[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18704
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18704
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82125/
Test PASSed.


---


[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18704
  
**[Test build #82125 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82125/testReport)**
 for PR 18704 at commit 
[`549b10f`](https://github.com/apache/spark/commit/549b10fac2e3b7a8cfd9d289ab4c152e7f764a17).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTrunc...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19286
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTrunc...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19286
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82123/
Test PASSed.


---


[GitHub] spark issue #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTrunc...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19286
  
**[Test build #82123 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82123/testReport)**
 for PR 19286 at commit 
[`7e5a57c`](https://github.com/apache/spark/commit/7e5a57c3e4d9550d2ddd8a971293ace3984b5447).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTrunc...

2017-09-23 Thread dilipbiswal

Github user dilipbiswal commented on the issue:

https://github.com/apache/spark/pull/19286
  
@viirya No problem. The newer version you have looks clean as well.


---


[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18704
  
**[Test build #82125 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82125/testReport)**
 for PR 18704 at commit 
[`549b10f`](https://github.com/apache/spark/commit/549b10fac2e3b7a8cfd9d289ab4c152e7f764a17).


---


[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18704
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82124/
Test FAILed.


---


[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18704
  
**[Test build #82124 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82124/testReport)**
 for PR 18704 at commit 
[`b8d5dec`](https://github.com/apache/spark/commit/b8d5decfa32a8d8c1eba331a976eb2e341c40b53).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18704
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18704
  
**[Test build #82124 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82124/testReport)**
 for PR 18704 at commit 
[`b8d5dec`](https://github.com/apache/spark/commit/b8d5decfa32a8d8c1eba331a976eb2e341c40b53).


---


[GitHub] spark issue #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTrunc...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19286
  
**[Test build #82123 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82123/testReport)**
 for PR 19286 at commit 
[`7e5a57c`](https://github.com/apache/spark/commit/7e5a57c3e4d9550d2ddd8a971293ace3984b5447).


---


[GitHub] spark issue #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTrunc...

2017-09-23 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/19286
  
@dilipbiswal Thanks for the suggestion. However, it looks more complicated, 
IMO.


---


[GitHub] spark pull request #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadi...

2017-09-23 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/19286#discussion_r140644507
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/jdbc/AggregatedDialect.scala ---
@@ -43,6 +43,17 @@ private class AggregatedDialect(dialects: List[JdbcDialect]) extends JdbcDialect
   }
 
   override def isCascadingTruncateTable(): Option[Boolean] = {
-    dialects.flatMap(_.isCascadingTruncateTable()).reduceOption(_ || _)
+    // If any dialect claims cascading truncate, this dialect is also cascading truncate.
+    // Otherwise, if any dialect has unknown cascading truncate, this dialect is also unknown.
+    val cascading = dialects.flatMap(_.isCascadingTruncateTable()).reduceOption(_ || _)
+    if (cascading.getOrElse(false)) {
--- End diff --

Done.


---


[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18704
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18704
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82122/
Test FAILed.


---


[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18704
  
**[Test build #82122 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82122/testReport)**
 for PR 18704 at commit 
[`1607bd1`](https://github.com/apache/spark/commit/1607bd152c64bf7900e489eb2cbef086f44e0861).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18704
  
**[Test build #82122 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82122/testReport)**
 for PR 18704 at commit 
[`1607bd1`](https://github.com/apache/spark/commit/1607bd152c64bf7900e489eb2cbef086f44e0861).


---


[GitHub] spark issue #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest commit sh...

2017-09-23 Thread felixcheung

Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/19290
  
oh, remember to remove WIP and update this line 
https://github.com/apache/spark/pull/19290/files#diff-74ca2b618d236bbd6faa23e13bff1403R30
 before merging in? 


---


[GitHub] spark pull request #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest co...

2017-09-23 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/19290#discussion_r140643692
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -2650,8 +2650,9 @@ setMethod("merge",
 #' @param suffix a suffix for the column name
 #' @return list of columns
 #'
-#' @note generateAliasesForIntersectedCols since 1.6.0
-generateAliasesForIntersectedCols <- function(x, intersectedColNames, suffix) { # nolint
+#' @note genAliasesForIntersectedCols since 1.6.0
--- End diff --

nit: I'd remove this `@note` too


---


[GitHub] spark issue #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadingTrunc...

2017-09-23 Thread dilipbiswal

Github user dilipbiswal commented on the issue:

https://github.com/apache/spark/pull/19286
  
@viirya Hey Simon, thanks for catching this. Would it be a little easier to 
follow if we wrote it like this?
```
override def isCascadingTruncateTable(): Option[Boolean] = {
  def compute(left: Option[Boolean], right: Option[Boolean]): Option[Boolean] = {
    (left, right) match {
      case (_, Some(true)) => Some(true)
      case (Some(true), _) => Some(true)
      case (Some(false), Some(false)) => Some(false)
      case (_, _) => None
    }
  }
  // If any dialect claims cascading truncate, this dialect is also cascading truncate.
  // Otherwise, if any dialect has unknown cascading truncate, this dialect is also unknown.
  dialects.map(_.isCascadingTruncateTable()).reduce(compute(_, _))
}
```
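
The pairwise rule above amounts to a three-valued OR over `Option[Boolean]`. A minimal stand-alone sketch, assuming plain `Option[Boolean]` values in place of each dialect's `isCascadingTruncateTable()` result (the object and method names are hypothetical), compares the reduce-based form with the "any true, else any unknown, else false" reading of the two comments:

```scala
// Stand-alone sketch. Plain Option[Boolean] values stand in for the result of
// each dialect's isCascadingTruncateTable(); all names here are hypothetical.
object CascadingTruncateSketch {
  // Pairwise rule: true wins, then unknown (None), then false.
  def compute(left: Option[Boolean], right: Option[Boolean]): Option[Boolean] =
    (left, right) match {
      case (_, Some(true)) => Some(true)
      case (Some(true), _) => Some(true)
      case (Some(false), Some(false)) => Some(false)
      case _ => None
    }

  // Reduce-based form; `flags` must be non-empty, otherwise reduce throws.
  def viaReduce(flags: List[Option[Boolean]]): Option[Boolean] =
    flags.reduce(compute(_, _))

  // The same rule spelled out: any true, else any unknown, else false.
  def viaExists(flags: List[Option[Boolean]]): Option[Boolean] =
    if (flags.contains(Some(true))) Some(true)
    else if (flags.contains(None)) None
    else Some(false)
}
```

Both forms should agree on any non-empty list; for instance, `List(Some(false), None, Some(true))` gives `Some(true)` and `List(Some(false), None)` gives `None` either way.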


---


[GitHub] spark issue #19294: [SPARK-21549][CORE] Respect OutputFormats with no output...

2017-09-23 Thread szhem

Github user szhem commented on the issue:

https://github.com/apache/spark/pull/19294
  
@mridulm 
> incorporating a test for the sql part will also help in this matter.

What should be the expected behaviour in the case of SQL? 
I'm asking because [the SQL part seems to fail even before `setupJob` is 
called on the 
committer](https://github.com/apache/spark/blob/3f958a99921d149fb9fdf7ba7e78957afdad1405/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala#L118).

FileOutputFormat.setOutputPath(job, new Path(outputSpec.outputPath))


---


[GitHub] spark pull request #19333: [SPARK-22109][SQL][BRANCH-2.2] Resolves type conf...

2017-09-23 Thread HyukjinKwon

Github user HyukjinKwon closed the pull request at:

https://github.com/apache/spark/pull/19333


---


[GitHub] spark issue #18747: [WIP][SPARK-20822][SQL] Generate code to directly get va...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18747
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82121/
Test FAILed.


---


[GitHub] spark issue #18747: [WIP][SPARK-20822][SQL] Generate code to directly get va...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18747
  
Build finished. Test FAILed.


---


[GitHub] spark issue #18747: [WIP][SPARK-20822][SQL] Generate code to directly get va...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18747
  
**[Test build #82121 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82121/testReport)**
 for PR 18747 at commit 
[`c77b129`](https://github.com/apache/spark/commit/c77b12919a928bfef65f653fad6948484ef8ff3d).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---


[GitHub] spark issue #19326: [SPARK-22107] Change as to alias in python quickstart

2017-09-23 Thread jgoleary

Github user jgoleary commented on the issue:

https://github.com/apache/spark/pull/19326
  
@HyukjinKwon Looks like this does the same thing, as `name()` is an alias 
for `alias()`. Happy to close, but I suggest merging one of these to help 
future noobs.


---


[GitHub] spark issue #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest commit sh...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19290
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82120/
Test PASSed.


---


[GitHub] spark issue #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest commit sh...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19290
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest commit sh...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19290
  
**[Test build #82120 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82120/testReport)**
 for PR 19290 at commit 
[`7e6c2c5`](https://github.com/apache/spark/commit/7e6c2c546d86946ee4cd110321ae24226cb50917).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #19321: [SPARK-22100] [SQL] Make percentile_approx support numer...

2017-09-23 Thread felixcheung

Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/19321
  
That's a good point, thanks




---


[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19222
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19222
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82118/
Test PASSed.


---


[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19222
  
**[Test build #82118 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82118/testReport)**
 for PR 19222 at commit 
[`8ec08ba`](https://github.com/apache/spark/commit/8ec08ba822bca62db0d917b2908dfc509e08e96b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #19295: [SPARK-22080][SQL] Adds support for allowing user to add...

2017-09-23 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19295
  
I do not think we should do it. The extra pre-optimizer rules can easily 
break our existing optimizer rules. Adding post-optimizer rules should be 
enough for 99% of cases.


---


[GitHub] spark pull request #19295: [SPARK-22080][SQL] Adds support for allowing user...

2017-09-23 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19295#discussion_r140638641
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/ExperimentalMethods.scala ---
@@ -44,11 +44,14 @@ class ExperimentalMethods private[sql]() {
*/
   @volatile var extraStrategies: Seq[Strategy] = Nil
 
+  @volatile var extraPreOptimizations: Seq[Rule[LogicalPlan]] = Nil
+
   @volatile var extraOptimizations: Seq[Rule[LogicalPlan]] = Nil
--- End diff --

This is an API change. We can't do it.


---


[GitHub] spark issue #18747: [WIP][SPARK-20822][SQL] Generate code to directly get va...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18747
  
**[Test build #82121 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82121/testReport)**
 for PR 18747 at commit 
[`c77b129`](https://github.com/apache/spark/commit/c77b12919a928bfef65f653fad6948484ef8ff3d).


---


[GitHub] spark issue #19333: [SPARK-22109][SQL][BRANCH-2.2] Resolves type conflicts b...

2017-09-23 Thread ueshin

Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/19333
  
@HyukjinKwon Could you close this PR?


---


[GitHub] spark issue #19333: [SPARK-22109][SQL][BRANCH-2.2] Resolves type conflicts b...

2017-09-23 Thread ueshin

Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/19333
  
Thanks! merging to branch-2.2.


---


[GitHub] spark issue #19277: [SPARK-22058][CORE]the BufferedInputStream will not be c...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19277
  
**[Test build #3933 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3933/testReport)**
 for PR 19277 at commit 
[`2e5f21a`](https://github.com/apache/spark/commit/2e5f21a1a89bbf69431f918d3043ff6b58420dd8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #19333: [SPARK-22109][SQL][BRANCH-2.2] Resolves type conflicts b...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19333
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #19333: [SPARK-22109][SQL][BRANCH-2.2] Resolves type conflicts b...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19333
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82119/
Test PASSed.


---


[GitHub] spark issue #19333: [SPARK-22109][SQL][BRANCH-2.2] Resolves type conflicts b...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19333
  
**[Test build #82119 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82119/testReport)**
 for PR 19333 at commit 
[`42fa83c`](https://github.com/apache/spark/commit/42fa83caab5914a089d282e0c5009f7d0db42c46).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #19330: Orderable MapType

2017-09-23 Thread tejasapatil

Github user tejasapatil commented on the issue:

https://github.com/apache/spark/pull/19330
  
@hvanhovell: based on [your comment on the 
JIRA](https://issues.apache.org/jira/browse/SPARK-18134?focusedCommentId=15693519&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15693519),
 it seemed that the approach to be used was yet to be finalised. Are we moving 
ahead with this approach?


---


[GitHub] spark pull request #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadi...

2017-09-23 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19286#discussion_r140638171
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/jdbc/AggregatedDialect.scala ---
@@ -43,6 +43,17 @@ private class AggregatedDialect(dialects: List[JdbcDialect]) extends JdbcDialect
   }
 
   override def isCascadingTruncateTable(): Option[Boolean] = {
-    dialects.flatMap(_.isCascadingTruncateTable()).reduceOption(_ || _)
+    // If any dialect claims cascading truncate, this dialect is also cascading truncate.
+    // Otherwise, if any dialect has unknown cascading truncate, this dialect is also unknown.
+    val cascading = dialects.flatMap(_.isCascadingTruncateTable()).reduceOption(_ || _)
+    if (cascading.getOrElse(false)) {
--- End diff --

Use case-match?


---


[GitHub] spark pull request #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadi...

2017-09-23 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19286#discussion_r140638141
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/jdbc/AggregatedDialect.scala ---
@@ -43,6 +43,17 @@ private class AggregatedDialect(dialects: List[JdbcDialect]) extends JdbcDialect
   }
 
   override def isCascadingTruncateTable(): Option[Boolean] = {
-    dialects.flatMap(_.isCascadingTruncateTable()).reduceOption(_ || _)
+    // If any dialect claims cascading truncate, this dialect is also cascading truncate.
+    // Otherwise, if any dialect has unknown cascading truncate, this dialect is also unknown.
+    val cascading = dialects.flatMap(_.isCascadingTruncateTable()).reduceOption(_ || _)
+    if (cascading.getOrElse(false)) {
+      cascading
+    } else {
+      if (dialects.exists(_.isCascadingTruncateTable().isEmpty)) {
--- End diff --

combine line 51 and 52?


---


[GitHub] spark pull request #19286: [SPARK-21338][SQL][FOLLOW-UP] Implement isCascadi...

2017-09-23 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19286#discussion_r140638128
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala ---
@@ -749,6 +749,34 @@ class JDBCSuite extends SparkFunSuite
     assert(agg.isCascadingTruncateTable() === Some(true))
   }
 
+  test("Aggregated dialects: isCascadingTruncateTable") {
+    def genDialect(cascadingTruncateTable: Option[Boolean]): JdbcDialect = new JdbcDialect {
+      override def canHandle(url: String): Boolean = true
+      override def getCatalystType(
+        sqlType: Int,
+        typeName: String,
+        size: Int,
+        md: MetadataBuilder): Option[DataType] = None
+      override def isCascadingTruncateTable(): Option[Boolean] = cascadingTruncateTable
+    }
+
+    val dialectCombination = Seq(
+      List(genDialect(Some(true)), genDialect(Some(false)), genDialect(None)),
+      List(genDialect(Some(true)), genDialect(Some(true)), genDialect(None)),
+      List(genDialect(Some(false)), genDialect(Some(false)), genDialect(None)),
+      List(genDialect(Some(true)), genDialect(Some(true))),
+      List(genDialect(Some(false)), genDialect(Some(false))),
+      List(genDialect(None), genDialect(None))
+    )
+
+    val expectedCascading = Seq(Some(true), Some(true), None, Some(true), Some(false), None)
+
+    dialectCombination.zip(expectedCascading).foreach { case (dialects, cascading) =>
--- End diff --

Could we combine `dialectCombination` and `expectedCascading` together? Or 
we can create a separate helper function?
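
As a rough sketch of the first option, the inputs and expected results could live side by side in one sequence of pairs. Everything here is illustrative: `CombinedCasesSketch`, `aggregate` and `failures` are invented names, and `aggregate` is a simplified stand-in that works directly on the per-dialect answers instead of on `JdbcDialect` instances. The six cases are the ones listed in the diff above.

```scala
object CombinedCasesSketch {
  // Simplified stand-in for the aggregation under test: it takes the
  // per-dialect answers directly rather than JdbcDialect instances.
  def aggregate(answers: List[Option[Boolean]]): Option[Boolean] =
    if (answers.contains(Some(true))) Some(true)
    else if (answers.contains(None)) None
    else Some(false)

  // Each entry pairs the per-dialect answers with the expected aggregate,
  // so an input and its expectation cannot drift apart.
  val cases: Seq[(List[Option[Boolean]], Option[Boolean])] = Seq(
    (List(Some(true), Some(false), None), Some(true)),
    (List(Some(true), Some(true), None), Some(true)),
    (List(Some(false), Some(false), None), None),
    (List(Some(true), Some(true)), Some(true)),
    (List(Some(false), Some(false)), Some(false)),
    (List(None, None), None)
  )

  // Returns the cases whose actual result differs from the expected one.
  def failures: Seq[(List[Option[Boolean]], Option[Boolean])] =
    cases.filter { case (answers, expected) => aggregate(answers) != expected }
}
```

In a test body this would reduce to a single `cases.foreach { case (answers, expected) => ... }` loop, with no `zip` between two parallel sequences.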


---


[GitHub] spark pull request #19329: [SPARK-22110][SQL][Documentation] Add usage and i...

2017-09-23 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19329


---


[GitHub] spark issue #19329: [SPARK-22110][SQL][Documentation] Add usage and improve ...

2017-09-23 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19329
  
Thanks! Merged to master.


---


[GitHub] spark issue #19329: [SPARK-22110][SQL][Documentation] Add usage and improve ...

2017-09-23 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19329
  
LGTM


---


[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-09-23 Thread akopich

Github user akopich commented on the issue:

https://github.com/apache/spark/pull/18924
  
@WeichenXu123, thanks for creating the JIRA. Yes, sure, I will work on it. 


---


[GitHub] spark issue #19332: [SPARK-22093][TESTS] Fixes `assume` in `UtilsSuite` and ...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19332
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82117/
Test PASSed.


---


[GitHub] spark issue #19332: [SPARK-22093][TESTS] Fixes `assume` in `UtilsSuite` and ...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19332
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest commit sh...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19290
  
**[Test build #82120 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82120/testReport)**
 for PR 19290 at commit 
[`7e6c2c5`](https://github.com/apache/spark/commit/7e6c2c546d86946ee4cd110321ae24226cb50917).


---


[GitHub] spark issue #19332: [SPARK-22093][TESTS] Fixes `assume` in `UtilsSuite` and ...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19332
  
**[Test build #82117 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82117/testReport)**
 for PR 19332 at commit 
[`c32f65b`](https://github.com/apache/spark/commit/c32f65b4a25c53e3fbf3ca5dff274f762da37d5b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest commit sh...

2017-09-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/19290
  
I also tried v1.0.1 against master; however, it does not seem to detect 
many of the instances this PR fixes. It detected only 3 extra instances:

```
R/DataFrame.R:2615:22: style: Variable and function names should not be 
longer than 30 characters.
colsX <- generateAliasesForIntersectedCols(x, by, suffixes[1])
 ^
R/DataFrame.R:2616:22: style: Variable and function names should not be 
longer than 30 characters.
colsY <- generateAliasesForIntersectedCols(y, by, suffixes[2])
 ^
R/DataFrame.R:2666:1: style: Variable and function names should not be 
longer than 30 characters.
generateAliasesForIntersectedCols <- function (x, intersectedColNames, 
suffix) {
^
lintr checks failed.
```

So, I would like to propose using jimhester/lintr@5431140 if possible.

I opened an issue to ask about this slowdown: 
`https://github.com/jimhester/lintr/issues/270`. We could probably wait a 
few days. If nothing can be done for now, I guess we could consider just using 
jimhester/lintr@5431140 as is.



---


[GitHub] spark issue #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest commit sh...

2017-09-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/19290
  
retest this please


---


[GitHub] spark pull request #17743: [SPARK-20448][DOCS] Document how FileInputDStream...

2017-09-23 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17743


---


[GitHub] spark issue #17743: [SPARK-20448][DOCS] Document how FileInputDStream works ...

2017-09-23 Thread srowen

Github user srowen commented on the issue:

https://github.com/apache/spark/pull/17743
  
Merged to master


---


[GitHub] spark issue #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest commit sh...

2017-09-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/19290
  
> 5 min: this is mildly concerning, is it possible this is caused by new 
checks in lintr? perhaps we could exclude them or something?

v1.0.1 is missing the linters below (vs. master):

```
nonportable_path_linter: check that file.path() is used to construct safe and portable paths.
pipe_continuation_linter: Check that each step in a pipeline is on a new line, or the entire pipe fits on one line.
extraction_operator_linter: check that the [[ operator is used when extracting a single element from an object, not [ (subsetting) nor $ (interactive use).
implicit_integer_linter: check that integers are explicitly typed using the form 1L instead of 1.
object_name_linter: check that object names conform to a single naming style, e.g. snake_case or lowerCamelCase.
semicolon_terminator_linter: check that no semicolons terminate statements.
todo_comment_linter: check that the source contains no TODO comments (case-insensitive).
T_and_F_symbol_linter: avoid the symbols T and F (for TRUE and FALSE).
undesirable_function_linter: report the use of undesirable functions, e.g. options or sapply and suggest an alternative.
undesirable_operator_linter: report the use of undesirable operators, e.g. ::: or <<- and suggest an alternative.
unneeded_concatenation_linter: check that the c function is not used without arguments nor with a single constant
```

I manually disabled the linters above, but the elapsed time looks almost the 
same. So, I think some existing linters were slowed down in master. I 
tried running each linter on its own with everything else disabled: 
https://gist.github.com/HyukjinKwon/1f852bcdcc2a13f396dfd2d6a88bea16


---


[GitHub] spark issue #19310: [SPARK-18136] Fix SPARK_JARS_DIR for Python pip install ...

2017-09-23 Thread holdenk

Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/19310
  
It probably is. The feature only works on Linux, but thanks for working 
towards fixing this.


---


[GitHub] spark issue #19333: [SPARK-22109][SQL][BRANCH-2.2] Resolves type conflicts b...

2017-09-23 Thread ueshin

Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/19333
  
LGTM, pending Jenkins.


---


[GitHub] spark issue #19333: [SPARK-22109][SQL][BRANCH-2.2] Resolves type conflicts b...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19333
  
**[Test build #82119 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82119/testReport)**
 for PR 19333 at commit 
[`42fa83c`](https://github.com/apache/spark/commit/42fa83caab5914a089d282e0c5009f7d0db42c46).


---


[GitHub] spark pull request #19333: [SPARK-22109][SQL][BRANCH-2.2] Resolves type conf...

2017-09-23 Thread HyukjinKwon

GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/19333

[SPARK-22109][SQL][BRANCH-2.2] Resolves type conflicts between strings and 
timestamps in partition column

## What changes were proposed in this pull request?

This PR backports 
https://github.com/apache/spark/commit/04975a68b583a6175f93da52374108e5d4754d9a 
into branch-2.2.

## How was this patch tested?

Unit tests in `ParquetPartitionDiscoverySuite`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark SPARK-22109-backport-2.2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19333.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19333


commit 42fa83caab5914a089d282e0c5009f7d0db42c46
Author: hyukjinkwon 
Date:   2017-09-23T15:05:17Z

[SPARK-22109][SQL] Resolves type conflicts between strings and timestamps 
in partition column

This PR proposes to resolve the type conflicts between strings and timestamps in 
partition column values.
It looks like we need to set the timezone, because resolving the conflict needs a 
cast between strings and timestamps.
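
Why such a cast depends on a timezone can be seen outside Spark. The sketch below is an illustration only and does not use Spark's `Cast` expression: it relies on plain `java.time`, and `TimestampToStringSketch` and `render` are invented names. It shows that the same instant yields different strings in different zones, so the conversion is undefined until a zone is supplied.

```scala
import java.time.{Instant, ZoneId}
import java.time.format.DateTimeFormatter

object TimestampToStringSketch {
  // Same layout as the partition values in this example, e.g. "2015-01-01 00:00:00".
  private val pattern = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  // Renders an instant as a string in an explicitly supplied time zone.
  def render(instant: Instant, zoneId: String): String =
    pattern.withZone(ZoneId.of(zoneId)).format(instant)
}
```

For instance, `render(Instant.parse("2015-01-01T00:00:00Z"), "UTC")` and the same call with `"Asia/Seoul"` return different strings for the same instant.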

```scala
val df = Seq((1, "2015-01-01 00:00:00"), (2, "2014-01-01 00:00:00"), (3, 
"blah")).toDF("i", "str")
val path = "/tmp/test.parquet"
df.write.format("parquet").partitionBy("str").save(path)
spark.read.parquet(path).show()
```

**Before**

```
java.util.NoSuchElementException: None.get
  at scala.None$.get(Option.scala:347)
  at scala.None$.get(Option.scala:345)
  at org.apache.spark.sql.catalyst.expressions.TimeZoneAwareExpression$class.timeZone(datetimeExpressions.scala:46)
  at org.apache.spark.sql.catalyst.expressions.Cast.timeZone$lzycompute(Cast.scala:172)
  at org.apache.spark.sql.catalyst.expressions.Cast.timeZone(Cast.scala:172)
  at org.apache.spark.sql.catalyst.expressions.Cast$$anonfun$castToString$3$$anonfun$apply$16.apply(Cast.scala:208)
  at org.apache.spark.sql.catalyst.expressions.Cast$$anonfun$castToString$3$$anonfun$apply$16.apply(Cast.scala:208)
  at org.apache.spark.sql.catalyst.expressions.Cast.org$apache$spark$sql$catalyst$expressions$Cast$$buildCast(Cast.scala:201)
  at org.apache.spark.sql.catalyst.expressions.Cast$$anonfun$castToString$3.apply(Cast.scala:207)
  at org.apache.spark.sql.catalyst.expressions.Cast.nullSafeEval(Cast.scala:533)
  at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:331)
  at org.apache.spark.sql.execution.datasources.PartitioningUtils$$anonfun$org$apache$spark$sql$execution$datasources$PartitioningUtils$$resolveTypeConflicts$1.apply(PartitioningUtils.scala:481)
  at org.apache.spark.sql.execution.datasources.PartitioningUtils$$anonfun$org$apache$spark$sql$execution$datasources$PartitioningUtils$$resolveTypeConflicts$1.apply(PartitioningUtils.scala:480)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
  at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
  at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
```

**After**

```
+---+-------------------+
|  i|                str|
+---+-------------------+
|  2|2014-01-01 00:00:00|
|  1|2015-01-01 00:00:00|
|  3|               blah|
+---+-------------------+
```

Unit tests added in `ParquetPartitionDiscoverySuite` and manual tests.

Author: hyukjinkwon 

Closes #19331 from HyukjinKwon/SPARK-22109.




---


[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19222
  
**[Test build #82118 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82118/testReport)**
 for PR 19222 at commit 
[`8ec08ba`](https://github.com/apache/spark/commit/8ec08ba822bca62db0d917b2908dfc509e08e96b).


---


[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-23 Thread kiszk

Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/19222
  
Jenkins, retest this please


---


[GitHub] spark issue #19331: [SPARK-22109][SQL] Resolves type conflicts between strin...

2017-09-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/19331
  
Definitely.


---


[GitHub] spark issue #19331: [SPARK-22109][SQL] Resolves type conflicts between strin...

2017-09-23 Thread ueshin

Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/19331
  
@HyukjinKwon Could you send a pr to backport this to branch-2.2 please?


---


[GitHub] spark issue #19331: [SPARK-22109][SQL] Resolves type conflicts between strin...

2017-09-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/19331
  
Thank you @ueshin!


---


[GitHub] spark issue #19331: [SPARK-22109][SQL] Resolves type conflicts between strin...

2017-09-23 Thread ueshin

Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/19331
  
Thanks! merging to master.


---


[GitHub] spark pull request #19331: [SPARK-22109][SQL] Resolves type conflicts betwee...

2017-09-23 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19331


---


[GitHub] spark issue #19331: [SPARK-22109][SQL] Resolves type conflicts between strin...

2017-09-23 Thread ueshin

Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/19331
  
LGTM.


---


[GitHub] spark issue #17743: [SPARK-20448][DOCS] Document how FileInputDStream works ...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17743
  
**[Test build #3934 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3934/testReport)**
 for PR 17743 at commit 
[`1e620ce`](https://github.com/apache/spark/commit/1e620ceb7b5eb0df6df83525366ebc1074f8e8ce).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark pull request #19266: [SPARK-22033][CORE] BufferHolder, other size chec...

2017-09-23 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19266


---


[GitHub] spark pull request #19320: [SPARK-22099] The 'job ids' list style needs to b...

2017-09-23 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19320


---


[GitHub] spark issue #17743: [SPARK-20448][DOCS] Document how FileInputDStream works ...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17743
  
**[Test build #3934 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3934/testReport)**
 for PR 17743 at commit 
[`1e620ce`](https://github.com/apache/spark/commit/1e620ceb7b5eb0df6df83525366ebc1074f8e8ce).


---


[GitHub] spark issue #19266: [SPARK-22033][CORE] BufferHolder, other size checks shou...

2017-09-23 Thread srowen

Github user srowen commented on the issue:

https://github.com/apache/spark/pull/19266
  
Merged to master


---


[GitHub] spark issue #19320: [SPARK-22099] The 'job ids' list style needs to be chang...

2017-09-23 Thread srowen

Github user srowen commented on the issue:

https://github.com/apache/spark/pull/19320
  
Merged to master


---


[GitHub] spark issue #19277: [SPARK-22058][CORE]the BufferedInputStream will not be c...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19277
  
**[Test build #3933 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3933/testReport)**
 for PR 19277 at commit 
[`2e5f21a`](https://github.com/apache/spark/commit/2e5f21a1a89bbf69431f918d3043ff6b58420dd8).


---


[GitHub] spark issue #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest commit sh...

2017-09-23 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/19290
  
> anyway I mean if we use the git tag 9951084 it should match v1.0.1 
release exactly.
the one you listed jimhester/lintr@5431140 is in master I think and perhaps 
you are saying there are more changes in master (which is true from the diff)?

Ah, yes. It looks like some changes were intentionally left out of v1.0.1 (vs. 
master), and given the linter results in my local environment, v1.0.1 is missing some 
important changes. Although I am not familiar with the release policy there, I can see the 
difference in the commit logs (master vs. tag v1.0.1).



---


[GitHub] spark issue #19323: [SPARK-22092] Reallocation in OffHeapColumnVector.reserv...

2017-09-23 Thread hvanhovell

Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/19323
  
@ala can you close this one? The merge script does not do that for 
backports.


---


[GitHub] spark issue #19323: [SPARK-22092] Reallocation in OffHeapColumnVector.reserv...

2017-09-23 Thread hvanhovell

Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/19323
  
merging to 2.2/2.1. Thanks!


---


[GitHub] spark issue #19332: [SPARK-22093][TESTS] Fixes `assume` in the tests for `Ut...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19332
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82114/
Test PASSed.


---


[GitHub] spark issue #19332: [SPARK-22093][TESTS] Fixes `assume` in the tests for `Ut...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19332
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #19332: [SPARK-22093][TESTS] Fixes `assume` in the tests for `Ut...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19332
  
**[Test build #82114 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82114/testReport)**
 for PR 19332 at commit 
[`beaedb0`](https://github.com/apache/spark/commit/beaedb09d7b8456f82763b21756f0cd46a9b33fd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19222
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19222
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82116/
Test FAILed.


---


[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-23 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19222
  
**[Test build #82116 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82116/testReport)**
 for PR 19222 at commit 
[`8ec08ba`](https://github.com/apache/spark/commit/8ec08ba822bca62db0d917b2908dfc509e08e96b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #19322: [SPARK-22102][SQL] Set ConfVars.METASTOREWAREHOUSE befor...

2017-09-23 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19322
  
Merged build finished. Test PASSed.


---

