[GitHub] spark issue #22977: [BUILD] Bump previousSparkVersion in MimaBuild.scala to ...

2018-11-10 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/22977
  
right, I mean both this and that should be part of the "post-release" process


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232477365
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -147,6 +147,55 @@ getDefaultSqlSource <- function() {
   l[["spark.sql.sources.default"]]
 }
 
+writeToTempFileInArrow <- function(rdf, numPartitions) {
+  # R API in Arrow is not yet released. CRAN requires to add the package in requireNamespace
+  # at DESCRIPTION. Later, CRAN checks if the package is available or not. Therefore, it works
+  # around by avoiding direct requireNamespace.
+  requireNamespace1 <- requireNamespace
+  if (requireNamespace1("arrow", quietly = TRUE)) {
+    record_batch <- get("record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+    record_batch_stream_writer <- get(
+      "record_batch_stream_writer", envir = asNamespace("arrow"), inherits = FALSE)
+    file_output_stream <- get(
+      "file_output_stream", envir = asNamespace("arrow"), inherits = FALSE)
+    write_record_batch <- get(
+      "write_record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+
+    # Currently arrow requires withr; otherwise, write APIs don't work.
+    # Direct 'require' is not recommended by CRAN. Here's a workaround.
+    require1 <- require
+    if (require1("withr", quietly = TRUE)) {
--- End diff --

that will be good. Circumventing the CRAN check for a method name is... problematic.
(There are other, more hacky ways too, but...)


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232477325
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -189,19 +238,67 @@ createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0,
           x
         }
       }
+  data[] <- lapply(data, cleanCols)
 
-  # drop factors and wrap lists
-  data <- setNames(lapply(data, cleanCols), NULL)
+  args <- list(FUN = list, SIMPLIFY = FALSE, USE.NAMES = FALSE)
+  if (arrowEnabled) {
+    shouldUseArrow <- tryCatch({
+      stopifnot(length(data) > 0)
+      dataHead <- head(data, 1)
+      # Currently Arrow optimization does not support POSIXct and raw for now.
+      # Also, it does not support explicit float type set by users. It leads to
+      # incorrect conversion. We will fall back to the path without Arrow optimization.
+      if (any(sapply(dataHead, function(x) is(x, "POSIXct")))) {
+        stop("Arrow optimization with R DataFrame does not support POSIXct type yet.")
+      }
+      if (any(sapply(dataHead, is.raw))) {
+        stop("Arrow optimization with R DataFrame does not support raw type yet.")
+      }
+      if (inherits(schema, "structType")) {
+        if (any(sapply(schema$fields(), function(x) x$dataType.toString() == "FloatType"))) {
+          stop("Arrow optimization with R DataFrame does not support FloatType type yet.")
--- End diff --

maybe out of bit range? R doubles only carry a 53-bit significand: https://stat.ethz.ch/R-manual/R-patched/library/base/html/double.html
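
For context, that 53-bit significand limit can be checked directly in base R:

```r
# R doubles carry a 53-bit significand, so exact integer arithmetic stops at 2^53:
.Machine$double.digits  # 53
2^53 == 2^53 + 1        # TRUE: adjacent integers above 2^53 collapse
```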



---




[GitHub] spark issue #20788: [SPARK-23647][PYTHON][SQL] Adds more types for hint in p...

2018-11-10 Thread DylanGuedes
Github user DylanGuedes commented on the issue:

https://github.com/apache/spark/pull/20788
  
@HyukjinKwon Sure! I'll be working on this, then. Do you have any specific 
scenarios in mind, so that I could start with them?


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232477271
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -189,19 +238,67 @@ createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0,
           x
         }
       }
+  data[] <- lapply(data, cleanCols)
 
-  # drop factors and wrap lists
-  data <- setNames(lapply(data, cleanCols), NULL)
+  args <- list(FUN = list, SIMPLIFY = FALSE, USE.NAMES = FALSE)
+  if (arrowEnabled) {
+    shouldUseArrow <- tryCatch({
+      stopifnot(length(data) > 0)
+      dataHead <- head(data, 1)
+      # Currently Arrow optimization does not support POSIXct and raw for now.
+      # Also, it does not support explicit float type set by users. It leads to
+      # incorrect conversion. We will fall back to the path without Arrow optimization.
+      if (any(sapply(dataHead, function(x) is(x, "POSIXct")))) {
+        stop("Arrow optimization with R DataFrame does not support POSIXct type yet.")
+      }
+      if (any(sapply(dataHead, is.raw))) {
+        stop("Arrow optimization with R DataFrame does not support raw type yet.")
+      }
+      if (inherits(schema, "structType")) {
+        if (any(sapply(schema$fields(), function(x) x$dataType.toString() == "FloatType"))) {
+          stop("Arrow optimization with R DataFrame does not support FloatType type yet.")
+        }
+      }
+      firstRow <- do.call(mapply, append(args, dataHead))[[1]]
+      fileName <- writeToTempFileInArrow(data, numPartitions)
+      tryCatch(
+        jrddInArrow <- callJStatic("org.apache.spark.sql.api.r.SQLUtils",
+                                   "readArrowStreamFromFile",
+                                   sparkSession,
+                                   fileName),
+        finally = {
+          file.remove(fileName)
--- End diff --

yes, just more consistent. I also don't know for sure why all the other instances are calling `unlink`.
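
For reference, the base-R difference between the two calls, as a quick sketch:

```r
# file.remove() and unlink() overlap for plain files; unlink() additionally
# handles recursive/wildcard deletion and stays silent on missing paths:
f <- tempfile()
file.create(f)
file.remove(f)  # TRUE
unlink(f)       # no warning even though f is already gone
file.remove(f)  # FALSE, with a warning, since the file no longer exists
```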


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232477257
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -189,19 +238,67 @@ createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0,
           x
         }
       }
+  data[] <- lapply(data, cleanCols)
 
-  # drop factors and wrap lists
-  data <- setNames(lapply(data, cleanCols), NULL)
+  args <- list(FUN = list, SIMPLIFY = FALSE, USE.NAMES = FALSE)
+  if (arrowEnabled) {
+    shouldUseArrow <- tryCatch({
+      stopifnot(length(data) > 0)
+      dataHead <- head(data, 1)
+      # Currently Arrow optimization does not support POSIXct and raw for now.
+      # Also, it does not support explicit float type set by users. It leads to
+      # incorrect conversion. We will fall back to the path without Arrow optimization.
+      if (any(sapply(dataHead, function(x) is(x, "POSIXct")))) {
--- End diff --

LG, I tested a few cases too


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232477171
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -172,10 +221,10 @@ getDefaultSqlSource <- function() {
 createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0,
                             numPartitions = NULL) {
   sparkSession <- getSparkSession()
-
+  arrowEnabled <- sparkR.conf("spark.sql.execution.arrow.enabled")[[1]] == "true"
--- End diff --

ok


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232477155
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -147,6 +147,55 @@ getDefaultSqlSource <- function() {
   l[["spark.sql.sources.default"]]
 }
 
+writeToTempFileInArrow <- function(rdf, numPartitions) {
+  # R API in Arrow is not yet released. CRAN requires to add the package in requireNamespace
+  # at DESCRIPTION. Later, CRAN checks if the package is available or not. Therefore, it works
+  # around by avoiding direct requireNamespace.
+  requireNamespace1 <- requireNamespace
+  if (requireNamespace1("arrow", quietly = TRUE)) {
+    record_batch <- get("record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+    record_batch_stream_writer <- get(
+      "record_batch_stream_writer", envir = asNamespace("arrow"), inherits = FALSE)
+    file_output_stream <- get(
+      "file_output_stream", envir = asNamespace("arrow"), inherits = FALSE)
+    write_record_batch <- get(
+      "write_record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+
+    # Currently arrow requires withr; otherwise, write APIs don't work.
+    # Direct 'require' is not recommended by CRAN. Here's a workaround.
+    require1 <- require
+    if (require1("withr", quietly = TRUE)) {
+      numPartitions <- if (!is.null(numPartitions)) {
+        numToInt(numPartitions)
+      } else {
+        1
+      }
+      fileName <- tempfile(pattern = "spark-arrow", fileext = ".tmp")
+      chunk <- as.integer(ceiling(nrow(rdf) / numPartitions))
+      rdf_slices <- split(rdf, rep(1:ceiling(nrow(rdf) / chunk), each = chunk)[1:nrow(rdf)])
--- End diff --

ah, any idea why that was done that way in Python? this seems to be different from sc.parallelize, which is what https://github.com/apache/spark/blob/c3b4a94a91d66c172cf332321d3a78dba29ef8f0/R/pkg/R/context.R#L152 does
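
For reference, the slicing expression from the diff behaves like this standalone sketch (`rdf` and `numPartitions` here are illustrative stand-ins, not SparkR state):

```r
rdf <- data.frame(x = 1:10)
numPartitions <- 3
chunk <- as.integer(ceiling(nrow(rdf) / numPartitions))  # 4 rows per slice
slice_ids <- rep(1:ceiling(nrow(rdf) / chunk), each = chunk)[1:nrow(rdf)]
rdf_slices <- split(rdf, slice_ids)
sapply(rdf_slices, nrow)  # 4 4 2: the last slice gets the remainder
```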


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232477131
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala ---
@@ -225,4 +226,25 @@ private[sql] object SQLUtils extends Logging {
     }
     sparkSession.sessionState.catalog.listTables(db).map(_.table).toArray
   }
+
+  /**
+   * R callable function to read a file in Arrow stream format and create an `RDD`
+   * using each serialized ArrowRecordBatch as a partition.
+   */
+  def readArrowStreamFromFile(
+      sparkSession: SparkSession,
+      filename: String): JavaRDD[Array[Byte]] = {
+    ArrowConverters.readArrowStreamFromFile(sparkSession.sqlContext, filename)
--- End diff --

hmm, I see your point... but there could be hundreds of these wrappers we would need to add if we set this as a practice, I'm guessing.

a few problems with these wrappers I see:
1. they are extra work to add or maintain
2. many are very simple, not much value add
3. many get abandoned over the years - they are not called and not removed

but I kinda see it your way; let's keep this one and review any new ones in the future.


---




[GitHub] spark issue #22997: SPARK-25999: make-distribution.sh failure with --r and -...

2018-11-10 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/22997
  
btw, please see the page https://spark.apache.org/contributing.html, particularly the "Pull Request" section, for the format.


---




[GitHub] spark issue #22997: SPARK-25999: make-distribution.sh failure with --r and -...

2018-11-10 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/22997
  
thx, but I'm not sure about this approach. this step will now cause the hadoop jars to be packaged into the hadoop-provided release tarball, which defeats the point of hadoop-provided.



---




[GitHub] spark issue #23004: [SPARK-26004][SQL] InMemoryTable support StartsWith pred...

2018-11-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23004
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98689/
Test PASSed.


---




[GitHub] spark issue #23004: [SPARK-26004][SQL] InMemoryTable support StartsWith pred...

2018-11-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23004
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23004: [SPARK-26004][SQL] InMemoryTable support StartsWith pred...

2018-11-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23004
  
**[Test build #98689 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98689/testReport)** for PR 23004 at commit [`7bbdb07`](https://github.com/apache/spark/commit/7bbdb0713056f387e49cf3921a226554e9af5557).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232476906
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -147,6 +147,55 @@ getDefaultSqlSource <- function() {
   l[["spark.sql.sources.default"]]
 }
 
+writeToTempFileInArrow <- function(rdf, numPartitions) {
+  # R API in Arrow is not yet released. CRAN requires to add the package in requireNamespace
+  # at DESCRIPTION. Later, CRAN checks if the package is available or not. Therefore, it works
+  # around by avoiding direct requireNamespace.
+  requireNamespace1 <- requireNamespace
+  if (requireNamespace1("arrow", quietly = TRUE)) {
+    record_batch <- get("record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+    record_batch_stream_writer <- get(
+      "record_batch_stream_writer", envir = asNamespace("arrow"), inherits = FALSE)
+    file_output_stream <- get(
+      "file_output_stream", envir = asNamespace("arrow"), inherits = FALSE)
+    write_record_batch <- get(
+      "write_record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+
+    # Currently arrow requires withr; otherwise, write APIs don't work.
+    # Direct 'require' is not recommended by CRAN. Here's a workaround.
+    require1 <- require
+    if (require1("withr", quietly = TRUE)) {
+      numPartitions <- if (!is.null(numPartitions)) {
+        numToInt(numPartitions)
+      } else {
+        1
+      }
+      fileName <- tempfile(pattern = "spark-arrow", fileext = ".tmp")
+      chunk <- as.integer(ceiling(nrow(rdf) / numPartitions))
+      rdf_slices <- split(rdf, rep(1:ceiling(nrow(rdf) / chunk), each = chunk)[1:nrow(rdf)])
--- End diff --

Let me try to reuse the R side slicing logic.


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232475881
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -189,19 +238,67 @@ createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0,
           x
         }
       }
+  data[] <- lapply(data, cleanCols)
 
-  # drop factors and wrap lists
-  data <- setNames(lapply(data, cleanCols), NULL)
+  args <- list(FUN = list, SIMPLIFY = FALSE, USE.NAMES = FALSE)
+  if (arrowEnabled) {
+    shouldUseArrow <- tryCatch({
+      stopifnot(length(data) > 0)
+      dataHead <- head(data, 1)
+      # Currently Arrow optimization does not support POSIXct and raw for now.
+      # Also, it does not support explicit float type set by users. It leads to
+      # incorrect conversion. We will fall back to the path without Arrow optimization.
+      if (any(sapply(dataHead, function(x) is(x, "POSIXct")))) {
--- End diff --

Looks okay in this case specifically:

```r
> any(sapply(head(data.frame(list(list(a=NA))), 1), is.raw))
[1] FALSE
> any(sapply(head(data.frame(list(list(a=NA))), 1), function(x) is(x, "POSIXct")))
[1] FALSE
> any(sapply(head(data.frame(list(list(a=1))), 1), is.raw))
[1] FALSE
> any(sapply(head(data.frame(list(list(a="a"))), 1), function(x) is(x, "POSIXct")))
[1] FALSE
> any(sapply(head(data.frame(list(list(a=raw(1)))), 1), is.raw))
[1] TRUE
> any(sapply(head(data.frame(list(list(a=as.POSIXct("2000-01-01")))), 1), function(x) is(x, "POSIXct")))
[1] TRUE
```


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232475777
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -189,19 +238,67 @@ createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0,
           x
         }
       }
+  data[] <- lapply(data, cleanCols)
 
-  # drop factors and wrap lists
-  data <- setNames(lapply(data, cleanCols), NULL)
+  args <- list(FUN = list, SIMPLIFY = FALSE, USE.NAMES = FALSE)
+  if (arrowEnabled) {
+    shouldUseArrow <- tryCatch({
+      stopifnot(length(data) > 0)
+      dataHead <- head(data, 1)
+      # Currently Arrow optimization does not support POSIXct and raw for now.
+      # Also, it does not support explicit float type set by users. It leads to
+      # incorrect conversion. We will fall back to the path without Arrow optimization.
+      if (any(sapply(dataHead, function(x) is(x, "POSIXct")))) {
+        stop("Arrow optimization with R DataFrame does not support POSIXct type yet.")
+      }
+      if (any(sapply(dataHead, is.raw))) {
+        stop("Arrow optimization with R DataFrame does not support raw type yet.")
+      }
+      if (inherits(schema, "structType")) {
+        if (any(sapply(schema$fields(), function(x) x$dataType.toString() == "FloatType"))) {
+          stop("Arrow optimization with R DataFrame does not support FloatType type yet.")
+        }
+      }
+      firstRow <- do.call(mapply, append(args, dataHead))[[1]]
+      fileName <- writeToTempFileInArrow(data, numPartitions)
+      tryCatch(
+        jrddInArrow <- callJStatic("org.apache.spark.sql.api.r.SQLUtils",
+                                   "readArrowStreamFromFile",
+                                   sparkSession,
+                                   fileName),
+        finally = {
+          file.remove(fileName)
--- End diff --

I believe either way is fine.


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232475752
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -147,6 +147,55 @@ getDefaultSqlSource <- function() {
   l[["spark.sql.sources.default"]]
 }
 
+writeToTempFileInArrow <- function(rdf, numPartitions) {
+  # R API in Arrow is not yet released. CRAN requires to add the package in requireNamespace
+  # at DESCRIPTION. Later, CRAN checks if the package is available or not. Therefore, it works
+  # around by avoiding direct requireNamespace.
+  requireNamespace1 <- requireNamespace
+  if (requireNamespace1("arrow", quietly = TRUE)) {
+    record_batch <- get("record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+    record_batch_stream_writer <- get(
+      "record_batch_stream_writer", envir = asNamespace("arrow"), inherits = FALSE)
+    file_output_stream <- get(
+      "file_output_stream", envir = asNamespace("arrow"), inherits = FALSE)
+    write_record_batch <- get(
+      "write_record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+
+    # Currently arrow requires withr; otherwise, write APIs don't work.
+    # Direct 'require' is not recommended by CRAN. Here's a workaround.
+    require1 <- require
+    if (require1("withr", quietly = TRUE)) {
--- End diff --

Yea, I at least managed to get rid of this hack itself. Will push soon.


---




[GitHub] spark issue #21766: [SPARK-24803][SQL] add support for numeric

2018-11-10 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21766
  
@wangtao605 Do you mind documenting our behavior in our Spark SQL doc?


---




[GitHub] spark issue #17648: [SPARK-19851] Add support for EVERY and ANY (SOME) aggre...

2018-11-10 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/17648
  
@ptkool Thanks for your contribution! This feature will be available in the next release, Spark 3.0.


---




[GitHub] spark issue #21679: [SPARK-24695] [SQL]: To add support to return Calendar i...

2018-11-10 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21679
  
For your information, @cloud-fan and I are discussing how we expose 
calendar interval data types. Will post the proposal later. 


---




[GitHub] spark pull request #21766: [SPARK-24803][SQL] add support for numeric

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21766


---




[GitHub] spark pull request #17169: [SPARK-19714][ML] Bucketizer.handleInvalid docs i...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17169


---




[GitHub] spark pull request #20846: [SPARK-5498][SQL][FOLLOW] add schema to table par...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20846


---




[GitHub] spark pull request #20163: [SPARK-22966][PYTHON][SQL] Python UDFs with retur...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20163


---




[GitHub] spark pull request #22813: [SPARK-25818][CORE] WorkDirCleanup should only re...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22813


---




[GitHub] spark pull request #21514: [SPARK-22860] [SPARK-24621] [Core] [WebUI] - hide...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21514


---




[GitHub] spark pull request #15899: [SPARK-18466] added withFilter method to RDD

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/15899


---




[GitHub] spark pull request #17648: [SPARK-19851] Add support for EVERY and ANY (SOME...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17648


---




[GitHub] spark pull request #21679: [SPARK-24695] [SQL]: To add support to return Cal...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21679


---




[GitHub] spark pull request #21994: [SPARK-24529][Build][test-maven][follow-up] Add s...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21994


---




[GitHub] spark pull request #18697: [SPARK-16683][SQL] Repeated joins to same table c...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/18697


---




[GitHub] spark pull request #23001: [INFRA] Close stale PRs

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23001


---




[GitHub] spark pull request #22463: remove annotation @Experimental

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22463


---




[GitHub] spark pull request #21868: [SPARK-24906][SQL] Adaptively enlarge split / par...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21868


---




[GitHub] spark pull request #21322: [SPARK-24225][CORE] Support closing AutoClosable ...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21322


---




[GitHub] spark pull request #21402: SPARK-24355 Spark external shuffle server improve...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21402


---




[GitHub] spark pull request #21161: [SPARK-21645]left outer join synchronize the cond...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21161


---




[GitHub] spark pull request #19434: [SPARK-21785][SQL]Support create table from a par...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19434


---




[GitHub] spark pull request #18080: [Spark-20771][SQL] Make weekofyear more intuitive

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/18080


---




[GitHub] spark pull request #18636: added support word2vec training with additional d...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/18636


---




[GitHub] spark pull request #21257: [SPARK-24194] [SQL]HadoopFsRelation cannot overwr...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21257


---




[GitHub] spark pull request #22005: [SPARK-16817][CORE][WIP] Use Alluxio to improve s...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22005


---




[GitHub] spark pull request #22539: [SPARK-25517][SQL] Detect/Infer date type in CSV ...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22539


---




[GitHub] spark pull request #19691: [SPARK-14922][SPARK-17732][SQL]ALTER TABLE DROP P...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19691


---




[GitHub] spark pull request #17176: [SPARK-19833][SQL]remove SQLConf.HIVE_VERIFY_PART...

2018-11-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17176


---




[GitHub] spark issue #23001: [INFRA] Close stale PRs

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23001
  
Merged to master.


---




[GitHub] spark issue #22994: [BUILD] refactor dev/lint-python in to something readabl...

2018-11-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22994
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22994: [BUILD] refactor dev/lint-python in to something readabl...

2018-11-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22994
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98688/
Test PASSed.


---




[GitHub] spark issue #22994: [BUILD] refactor dev/lint-python in to something readabl...

2018-11-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22994
  
**[Test build #98688 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98688/testReport)** for PR 22994 at commit [`1f5c2f5`](https://github.com/apache/spark/commit/1f5c2f5815a46546d7ea878f6b732c8d5638943b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22994: [BUILD] refactor dev/lint-python in to something readabl...

2018-11-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22994
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98687/
Test PASSed.


---




[GitHub] spark issue #22994: [BUILD] refactor dev/lint-python in to something readabl...

2018-11-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22994
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22994: [BUILD] refactor dev/lint-python in to something readabl...

2018-11-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22994
  
**[Test build #98687 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98687/testReport)** for PR 22994 at commit [`56329bc`](https://github.com/apache/spark/commit/56329bc9d9d28252032fe6fef8da2ffbb1ed0f9e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #22976: [SPARK-25974][SQL]Optimizes Generates bytecode fo...

2018-11-10 Thread heary-cao
Github user heary-cao commented on a diff in the pull request:

https://github.com/apache/spark/pull/22976#discussion_r232474090
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateOrdering.scala ---
@@ -133,7 +126,6 @@ object GenerateOrdering extends CodeGenerator[Seq[SortOrder], Ordering[InternalR
       returnType = "int",
       makeSplitFunction = { body =>
         s"""
--- End diff --

thanks, I have updated it.


---




[GitHub] spark issue #23004: [SPARK-26004][SQL] InMemoryTable support StartsWith pred...

2018-11-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23004
  
**[Test build #98689 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98689/testReport)** for PR 23004 at commit [`7bbdb07`](https://github.com/apache/spark/commit/7bbdb0713056f387e49cf3921a226554e9af5557).


---




[GitHub] spark issue #23004: [SPARK-26004][SQL] InMemoryTable support StartsWith pred...

2018-11-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23004
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4921/
Test PASSed.


---




[GitHub] spark issue #23004: [SPARK-26004][SQL] InMemoryTable support StartsWith pred...

2018-11-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23004
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232473761
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -189,19 +238,67 @@ createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0,
           x
         }
       }
+  data[] <- lapply(data, cleanCols)
 
-  # drop factors and wrap lists
-  data <- setNames(lapply(data, cleanCols), NULL)
+  args <- list(FUN = list, SIMPLIFY = FALSE, USE.NAMES = FALSE)
+  if (arrowEnabled) {
+    shouldUseArrow <- tryCatch({
+      stopifnot(length(data) > 0)
+      dataHead <- head(data, 1)
+      # Currently Arrow optimization does not support POSIXct and raw for now.
+      # Also, it does not support explicit float type set by users. It leads to
+      # incorrect conversion. We will fall back to the path without Arrow optimization.
+      if (any(sapply(dataHead, function(x) is(x, "POSIXct")))) {
+        stop("Arrow optimization with R DataFrame does not support POSIXct type yet.")
+      }
+      if (any(sapply(dataHead, is.raw))) {
+        stop("Arrow optimization with R DataFrame does not support raw type yet.")
+      }
+      if (inherits(schema, "structType")) {
+        if (any(sapply(schema$fields(), function(x) x$dataType.toString() == "FloatType"))) {
+          stop("Arrow optimization with R DataFrame does not support FloatType type yet.")
--- End diff --

I suspect that it happens when `numeric` (which is like `1.0`) is cast into float type. I think it's related to the casting behaviour. Let me take a look and file a JIRA on the Arrow side, but if you don't mind I will focus on matching exact type cases for now ...
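
A quick base-R illustration of that narrowing; the 4-byte round trip through writeBin/readBin simulates a float32 cast (a sketch, not what SparkR itself does):

```r
as_float32 <- function(x) {
  readBin(writeBin(x, raw(), size = 4), "double", size = 4, n = length(x))
}
print(1.1, digits = 17)              # 1.1000000000000001 as a 64-bit double
print(as_float32(1.1), digits = 17)  # 1.1000000238418579 -- a different value
```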


---




[GitHub] spark pull request #23004: [SPARK-26004][SQL] InMemoryTable support StartsWi...

2018-11-10 Thread wangyum
GitHub user wangyum opened a pull request:

https://github.com/apache/spark/pull/23004

[SPARK-26004][SQL] InMemoryTable support StartsWith predicate push down

## What changes were proposed in this pull request?

[SPARK-24638](https://issues.apache.org/jira/browse/SPARK-24638) adds support for Parquet file `StartsWith` predicate push down. `InMemoryTable` can also support this feature.


## How was this patch tested?

 unit tests and benchmark tests

benchmark test result:
```
================================================================================================
Pushdown benchmark for StringStartsWith
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_191-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
StringStartsWith filter: (value like '10%'):     Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------
InMemoryTable Vectorized                             12068 / 14198          1.3         767.3       1.0X
InMemoryTable Vectorized (Pushdown)                   5457 / 8662           2.9         347.0       2.2X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_191-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
StringStartsWith filter: (value like '1000%'):   Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------
InMemoryTable Vectorized                              5246 / 5355           3.0         333.5       1.0X
InMemoryTable Vectorized (Pushdown)                   2185 / 2346           7.2         138.9       2.4X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_191-b12 on Mac OS X 10.12.6
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
StringStartsWith filter: (value like '786432%'): Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------
InMemoryTable Vectorized                              5112 / 5312           3.1         325.0       1.0X
InMemoryTable Vectorized (Pushdown)                   2292 / 2522           6.9         145.7       2.2X
```
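
A hedged SparkR sketch of the query shape this pushdown targets (table name and data are illustrative; an active SparkR session is assumed):

```r
# Illustrative only: cache a temp view (backed by InMemoryTable) and filter
# with a LIKE 'prefix%' predicate, which can be pushed down as StartsWith.
df <- createDataFrame(data.frame(value = as.character(1:100000)))
createOrReplaceTempView(df, "t")
cacheTable("t")                                      # materialize as an InMemoryTable
head(sql("SELECT * FROM t WHERE value LIKE '10%'"))  # StartsWith pushdown candidate
```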



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/wangyum/spark SPARK-26004

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23004.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23004


commit 7bbdb0713056f387e49cf3921a226554e9af5557
Author: Yuming Wang 
Date:   2018-11-11T03:56:36Z

InMemoryTable support StartsWith predicate push down




---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232473723
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -189,19 +238,67 @@ createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0,
           x
         }
       }
+  data[] <- lapply(data, cleanCols)
 
-  # drop factors and wrap lists
-  data <- setNames(lapply(data, cleanCols), NULL)
+  args <- list(FUN = list, SIMPLIFY = FALSE, USE.NAMES = FALSE)
+  if (arrowEnabled) {
+    shouldUseArrow <- tryCatch({
--- End diff --

Yup, let me try.


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232473716
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -172,10 +221,10 @@ getDefaultSqlSource <- function() {
 createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0,
                             numPartitions = NULL) {
   sparkSession <- getSparkSession()
-
+  arrowEnabled <- sparkR.conf("spark.sql.execution.arrow.enabled")[[1]] == "true"
--- End diff --

Yea, I checked that it always has the default value. I initially left the default value in but took it out after double checking.
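
For reference, `sparkR.conf` also accepts an explicit default, which is the alternative being discussed (a sketch):

```r
# Returns the supplied default instead of failing when the key is unset:
arrowEnabled <- sparkR.conf("spark.sql.execution.arrow.enabled", "false")[[1]] == "true"
```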


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232473705
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -147,6 +147,55 @@ getDefaultSqlSource <- function() {
   l[["spark.sql.sources.default"]]
 }
 
+writeToTempFileInArrow <- function(rdf, numPartitions) {
+  # R API in Arrow is not yet released. CRAN requires to add the package in requireNamespace
+  # at DESCRIPTION. Later, CRAN checks if the package is available or not. Therefore, it works
+  # around by avoiding direct requireNamespace.
+  requireNamespace1 <- requireNamespace
+  if (requireNamespace1("arrow", quietly = TRUE)) {
+    record_batch <- get("record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+    record_batch_stream_writer <- get(
+      "record_batch_stream_writer", envir = asNamespace("arrow"), inherits = FALSE)
+    file_output_stream <- get(
+      "file_output_stream", envir = asNamespace("arrow"), inherits = FALSE)
+    write_record_batch <- get(
+      "write_record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+
+    # Currently arrow requires withr; otherwise, write APIs don't work.
+    # Direct 'require' is not recommended by CRAN. Here's a workaround.
+    require1 <- require
+    if (require1("withr", quietly = TRUE)) {
+      numPartitions <- if (!is.null(numPartitions)) {
+        numToInt(numPartitions)
+      } else {
+        1
+      }
+      fileName <- tempfile(pattern = "spark-arrow", fileext = ".tmp")
+      chunk <- as.integer(ceiling(nrow(rdf) / numPartitions))
+      rdf_slices <- split(rdf, rep(1:ceiling(nrow(rdf) / chunk), each = chunk)[1:nrow(rdf)])
+      stream_writer <- NULL
+      for (rdf_slice in rdf_slices) {
+        batch <- record_batch(rdf_slice)
+        if (is.null(stream_writer)) {
+          # We should avoid private calls like 'close_on_exit' (CRAN disallows) but looks
+          # there's no exposed API for it. Here's a workaround but ideally this should
+          # be removed.
+          close_on_exit <- get("close_on_exit", envir = asNamespace("arrow"), inherits = FALSE)
--- End diff --

Hm, possibly yea. Let me try it. In this way, we could get rid of `require`.


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232473697
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -147,6 +147,55 @@ getDefaultSqlSource <- function() {
   l[["spark.sql.sources.default"]]
 }
 
+writeToTempFileInArrow <- function(rdf, numPartitions) {
+  # R API in Arrow is not yet released. CRAN requires to add the package in requireNamespace
+  # at DESCRIPTION. Later, CRAN checks if the package is available or not. Therefore, it works
+  # around by avoiding direct requireNamespace.
+  requireNamespace1 <- requireNamespace
+  if (requireNamespace1("arrow", quietly = TRUE)) {
+    record_batch <- get("record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+    record_batch_stream_writer <- get(
+      "record_batch_stream_writer", envir = asNamespace("arrow"), inherits = FALSE)
+    file_output_stream <- get(
+      "file_output_stream", envir = asNamespace("arrow"), inherits = FALSE)
+    write_record_batch <- get(
+      "write_record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+
+    # Currently arrow requires withr; otherwise, write APIs don't work.
+    # Direct 'require' is not recommended by CRAN. Here's a workaround.
+    require1 <- require
+    if (require1("withr", quietly = TRUE)) {
+      numPartitions <- if (!is.null(numPartitions)) {
+        numToInt(numPartitions)
+      } else {
+        1
+      }
+      fileName <- tempfile(pattern = "spark-arrow", fileext = ".tmp")
+      chunk <- as.integer(ceiling(nrow(rdf) / numPartitions))
+      rdf_slices <- split(rdf, rep(1:ceiling(nrow(rdf) / chunk), each = chunk)[1:nrow(rdf)])
--- End diff --

This resembles PySpark side logic:


https://github.com/apache/spark/blob/d367bdcf521f564d2d7066257200be26b27ea926/python/pyspark/sql/session.py#L554-L556

Let me check the difference between them


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232473669
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -147,6 +147,55 @@ getDefaultSqlSource <- function() {
   l[["spark.sql.sources.default"]]
 }
 
+writeToTempFileInArrow <- function(rdf, numPartitions) {
+  # R API in Arrow is not yet released. CRAN requires to add the package in requireNamespace
+  # at DESCRIPTION. Later, CRAN checks if the package is available or not. Therefore, it works
+  # around by avoiding direct requireNamespace.
+  requireNamespace1 <- requireNamespace
+  if (requireNamespace1("arrow", quietly = TRUE)) {
+    record_batch <- get("record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+    record_batch_stream_writer <- get(
+      "record_batch_stream_writer", envir = asNamespace("arrow"), inherits = FALSE)
+    file_output_stream <- get(
+      "file_output_stream", envir = asNamespace("arrow"), inherits = FALSE)
+    write_record_batch <- get(
+      "write_record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+
+    # Currently arrow requires withr; otherwise, write APIs don't work.
+    # Direct 'require' is not recommended by CRAN. Here's a workaround.
+    require1 <- require
+    if (require1("withr", quietly = TRUE)) {
+      numPartitions <- if (!is.null(numPartitions)) {
+        numToInt(numPartitions)
+      } else {
+        1
--- End diff --

We should; however, it follows the original code path's behaviour. I kept it the same so that we can compare the performance under the same conditions. If you don't mind, I will fix both in a separate PR.


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232473643
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala ---
@@ -225,4 +226,25 @@ private[sql] object SQLUtils extends Logging {
     }
     sparkSession.sessionState.catalog.listTables(db).map(_.table).toArray
   }
+
+  /**
+   * R callable function to read a file in Arrow stream format and create an `RDD`
+   * using each serialized ArrowRecordBatch as a partition.
+   */
+  def readArrowStreamFromFile(
+      sparkSession: SparkSession,
+      filename: String): JavaRDD[Array[Byte]] = {
+    ArrowConverters.readArrowStreamFromFile(sparkSession.sqlContext, filename)
--- End diff --

Hmhmhm .. yea. What I was trying to do is to add SQL-related code called from R into the JVM here when it is not an official API, in order to avoid the case where we change internal APIs within Scala and that causes R test failures. I was trying to do similar things on the PySpark side.


---




[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22954#discussion_r232473611
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -147,6 +147,55 @@ getDefaultSqlSource <- function() {
   l[["spark.sql.sources.default"]]
 }
 
+writeToTempFileInArrow <- function(rdf, numPartitions) {
+  # R API in Arrow is not yet released. CRAN requires to add the package in requireNamespace
+  # at DESCRIPTION. Later, CRAN checks if the package is available or not. Therefore, it works
+  # around by avoiding direct requireNamespace.
+  requireNamespace1 <- requireNamespace
+  if (requireNamespace1("arrow", quietly = TRUE)) {
+    record_batch <- get("record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+    record_batch_stream_writer <- get(
+      "record_batch_stream_writer", envir = asNamespace("arrow"), inherits = FALSE)
+    file_output_stream <- get(
+      "file_output_stream", envir = asNamespace("arrow"), inherits = FALSE)
+    write_record_batch <- get(
+      "write_record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+
+    # Currently arrow requires withr; otherwise, write APIs don't work.
+    # Direct 'require' is not recommended by CRAN. Here's a workaround.
+    require1 <- require
+    if (require1("withr", quietly = TRUE)) {
--- End diff --

Yup .. it is .. from a cursory look at the R API of Arrow, it looks like we shouldn't have this error. Maybe this can go away when I use the official R Arrow release version. Let me check it later.


---




[GitHub] spark issue #23001: [INFRA] Close stale PRs

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23001
  
I took a quick pass. Mind adding those please?:

#22539
#22539
#21868
#21514
#21402
#21322
#21257
#20163
#19691
#18697
#18636
#17176


---




[GitHub] spark issue #18697: [SPARK-16683][SQL] Repeated joins to same table can leak...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18697
  
Let's close this then.


---




[GitHub] spark issue #20163: [SPARK-22966][PYTHON][SQL] Python UDFs with returnType=S...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20163
  
Let's leave this closed.


---




[GitHub] spark issue #20168: [SPARK-22730][ML] Add ImageSchema support for all OpenCv...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20168
  
@tomasatdatabricks, mind updating this? Lately I happened to take a look at this a few times. I will try to review.


---




[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20503
  
ping @ashashwat to update


---




[GitHub] spark pull request #22060: [DO NOT MERGE][TEST ONLY] Add once-policy rule ch...

2018-11-10 Thread maryannxue
Github user maryannxue closed the pull request at:

https://github.com/apache/spark/pull/22060


---




[GitHub] spark issue #22060: [DO NOT MERGE][TEST ONLY] Add once-policy rule check

2018-11-10 Thread maryannxue
Github user maryannxue commented on the issue:

https://github.com/apache/spark/pull/22060
  
Thank you for reminding me, @HyukjinKwon! And thanks to @mgaido91's 
contribution, this has been fixed already.


---




[GitHub] spark issue #20788: [SPARK-23647][PYTHON][SQL] Adds more types for hint in p...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20788
  
@DylanGuedes let's add tests.


---




[GitHub] spark issue #21257: [SPARK-24194] [SQL]HadoopFsRelation cannot overwrite a p...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21257
  
ping @zheh12 to address the comments. I am going to suggest closing this for now while I am identifying PRs to close.


---




[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21322
  
Shall we leave this PR closed and start from a design doc? Let me suggest closing this for now while I am looking through old PRs.

@JeetKunDoug, please feel free to create a clone of this PR if there's any 
reason to keep this open that I missed. No objection.

Thanks.


---




[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21402
  
@redsanket you should close it by yourself.


---




[GitHub] spark issue #21514: [SPARK-22860] [SPARK-24621] [Core] [WebUI] - hide key pa...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21514
  
I'm going to suggest closing this. The review comments have not been addressed 
for more than a few months, and there's little point in keeping inactive PRs open.

Feel free to take this over if any of you here is interested in it. Or 
@tooptoop4, please recreate a PR after addressing the review comments here.

Thanks.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21514: [SPARK-22860] [SPARK-24621] [Core] [WebUI] - hide key pa...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21514
  
ping @tooptoop4


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21868: [SPARK-24906][SQL] Adaptively enlarge split / partition ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21868
  
I don't think we should fix this. Basically the dynamic estimation logic is too 
flaky, and I don't think we need this for the current status. Let's not add it 
for now.

While I am revisiting old PRs, I am trying to suggest closing PRs that 
look unlikely to be merged. Let me suggest closing this for now, but please 
feel free to recreate a PR if you strongly feel this is needed in Spark. No 
objection.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22060: [DO NOT MERGE][TEST ONLY] Add once-policy rule check

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22060
  
hey @maryannxue, where are we here? Let's close this if it's going to stay 
inactive for a couple of weeks.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22144: [SPARK-24935][SQL] : Problem with Executing Hive UDF's f...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22144
  
So, 2.4 is out. Where are we? Rereading the comments above, it looks like we should:

1. Find the root cause
2. Officially drop it if the workaround is not easy
3. Fix it if the workaround is simple
4. Add a test

If none of these happens, I would just document that we dropped this.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22184: [SPARK-25132][SQL][DOC] Add migration doc for case-insen...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22184
  
@seancxmao, so this behaviour change description is only valid when 
upgrading Spark 2.3 to 2.4? If so, we can add it under `Upgrading From Spark SQL 2.3 
to 2.4`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21363: [SPARK-19228][SQL] Migrate on Java 8 time from FastDateF...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21363
  
I am going to suggest closing this since it has been inactive for more than a few 
weeks. It would still be good to fix. Let me leave some cc's who might be interested 
in this, just FYI. Feel free to take this over when you find some time or 
are interested in it.

Adding @xuanyuanking, @mgaido91, @viirya, @MaxGekk, @softmanu, who I could 
think of for now. Feel free to ignore my cc if you are busy or have more 
important fixes you are working on.
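
For anyone picking this up, here is a minimal, hypothetical sketch of what the 
migration direction looks like, assuming a simple pattern-based timestamp parse; 
the pattern and values are illustrative only, not the PR's actual code:

    import java.time.{LocalDateTime, ZoneOffset}
    import java.time.format.DateTimeFormatter
    import org.apache.commons.lang3.time.FastDateFormat

    // Before: commons-lang3 FastDateFormat, built on java.util.Date.
    // Note it parses in the JVM default time zone.
    val oldFmt = FastDateFormat.getInstance("yyyy-MM-dd HH:mm:ss")
    val oldMillis: Long = oldFmt.parse("2018-11-10 12:34:56").getTime

    // After: java.time.format.DateTimeFormatter, immutable and
    // thread-safe by design; the zone is made explicit here (UTC).
    val newFmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
    val newMillis: Long = LocalDateTime.parse("2018-11-10 12:34:56", newFmt)
      .toInstant(ZoneOffset.UTC)
      .toEpochMilli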


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22993: [SPARK-24421][BUILD][CORE] Accessing sun.misc.Cleaner in...

2018-11-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22993
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98686/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22993: [SPARK-24421][BUILD][CORE] Accessing sun.misc.Cleaner in...

2018-11-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22993
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22993: [SPARK-24421][BUILD][CORE] Accessing sun.misc.Cleaner in...

2018-11-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22993
  
**[Test build #98686 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98686/testReport)**
 for PR 22993 at commit 
[`2db719c`](https://github.com/apache/spark/commit/2db719c2896c96793dfc96791bbadb2bc8a3f4ab).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22994: [BUILD] refactor dev/lint-python in to something readabl...

2018-11-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22994
  
**[Test build #98688 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98688/testReport)**
 for PR 22994 at commit 
[`1f5c2f5`](https://github.com/apache/spark/commit/1f5c2f5815a46546d7ea878f6b732c8d5638943b).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22994: [BUILD] refactor dev/lint-python in to something readabl...

2018-11-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22994
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4920/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22994: [BUILD] refactor dev/lint-python in to something readabl...

2018-11-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22994
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22994: [BUILD] refactor dev/lint-python in to something readabl...

2018-11-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22994
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22994: [BUILD] refactor dev/lint-python in to something readabl...

2018-11-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22994
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4919/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22994: [BUILD] refactor dev/lint-python in to something readabl...

2018-11-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22994
  
**[Test build #98687 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98687/testReport)**
 for PR 22994 at commit 
[`56329bc`](https://github.com/apache/spark/commit/56329bc9d9d28252032fe6fef8da2ffbb1ed0f9e).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22994: [BUILD] refactor dev/lint-python in to something readabl...

2018-11-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22994
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4918/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22994: [BUILD] refactor dev/lint-python in to something readabl...

2018-11-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22994
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22994: [BUILD] refactor dev/lint-python in to something readabl...

2018-11-10 Thread shaneknapp
Github user shaneknapp commented on the issue:

https://github.com/apache/spark/pull/22994
  
test this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22994: [BUILD] refactor dev/lint-python in to something readabl...

2018-11-10 Thread shaneknapp
Github user shaneknapp commented on the issue:

https://github.com/apache/spark/pull/22994
  
test this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23001: [INFRA] Close stale PRs

2018-11-10 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/23001
  
Thank you for pinging me. Please add
#15899 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22993: [SPARK-24421][BUILD][CORE] Accessing sun.misc.Cleaner in...

2018-11-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22993
  
**[Test build #98686 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98686/testReport)**
 for PR 22993 at commit 
[`2db719c`](https://github.com/apache/spark/commit/2db719c2896c96793dfc96791bbadb2bc8a3f4ab).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22993: [SPARK-24421][BUILD][CORE] Accessing sun.misc.Cleaner in...

2018-11-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22993
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4917/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org


