date:20181104

[GitHub] spark pull request #22942: [SPARK-25884][SQL][FOLLOW-UP] Add sample.json bac...

2018-11-04 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22942


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22429: [SPARK-25440][SQL] Dumping query execution info to a fil...

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22429
  
**[Test build #98461 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98461/testReport)**
 for PR 22429 at commit 
[`76f4248`](https://github.com/apache/spark/commit/76f424830418129c12a2a08d81f19377490c95eb).


---


[GitHub] spark issue #22942: [SPARK-25884][SQL][FOLLOW-UP] Add sample.json back.

2018-11-04 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22942
  
thanks, merging to master!


---


[GitHub] spark issue #22429: [SPARK-25440][SQL] Dumping query execution info to a fil...

2018-11-04 Thread MaxGekk

Github user MaxGekk commented on the issue:

https://github.com/apache/spark/pull/22429
  
jenkins, retest this, please


---


[GitHub] spark pull request #22919: [SPARK-25906][SHELL] Documents '-I' option (from ...

2018-11-04 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22919#discussion_r230655513
  
--- Diff: bin/spark-shell ---
@@ -32,7 +32,10 @@ if [ -z "${SPARK_HOME}" ]; then
   source "$(dirname "$0")"/find-spark-home
 fi
 
-export _SPARK_CMD_USAGE="Usage: ./bin/spark-shell [options]"
+export _SPARK_CMD_USAGE="Usage: ./bin/spark-shell [options]
+
+Scala REPL options:
+  -Ipreload , enforcing line-by-line 
interpretation"
--- End diff --

where do we define other options?


---


[GitHub] spark issue #22942: [SPARK-25884][SQL][FOLLOW-UP] Add sample.json back.

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22942
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98458/
Test PASSed.


---


[GitHub] spark issue #22942: [SPARK-25884][SQL][FOLLOW-UP] Add sample.json back.

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22942
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22942: [SPARK-25884][SQL][FOLLOW-UP] Add sample.json back.

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22942
  
**[Test build #98458 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98458/testReport)**
 for PR 22942 at commit 
[`18ccff1`](https://github.com/apache/spark/commit/18ccff15a771d3e0221b49114ff300b0ef41a25b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark pull request #22939: [SPARK-25446][R] Add schema_of_json() and schema_...

2018-11-04 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/22939#discussion_r230649513
  
--- Diff: R/pkg/R/functions.R ---
@@ -205,11 +205,18 @@ NULL
 #'  also supported for the schema.
 #'  \item \code{from_csv}: a DDL-formatted string
 #'  }
-#' @param ... additional argument(s). In \code{to_json}, \code{to_csv} and 
\code{from_json},
-#'this contains additional named properties to control how it 
is converted, accepts
-#'the same options as the JSON/CSV data source. Additionally 
\code{to_json} supports
-#'the "pretty" option which enables pretty JSON generation. In 
\code{arrays_zip},
-#'this contains additional Columns of arrays to be merged.
+#' @param ... additional argument(s).
+#'  \itemize{
+#'  \item \code{to_json}, \code{from_json} and 
\code{schema_of_json}: this contains
+#'  additional named properties to control how it is converted 
and accepts the
+#'  same options as the JSON data source.
+#'  \item \code{to_json}: it supports the "pretty" option which 
enables pretty
--- End diff --

actually, how does `pretty` work? is it `pretty = TRUE`?


---


[GitHub] spark pull request #22939: [SPARK-25446][R] Add schema_of_json() and schema_...

2018-11-04 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/22939#discussion_r230650176
  
--- Diff: R/pkg/R/functions.R ---
@@ -2230,6 +2237,32 @@ setMethod("from_json", signature(x = "Column", 
schema = "characterOrstructType")
 column(jc)
   })
 
+#' @details
+#' \code{schema_of_json}: Parses a JSON string and infers its schema in 
DDL format.
+#'
+#' @rdname column_collection_functions
+#' @aliases schema_of_json schema_of_json,characterOrColumn-method
+#' @examples
+#'
+#' \dontrun{
+#' json <- '{"name":"Bob"}'
+#' df <- sql("SELECT * FROM range(1)")
+#' head(select(df, schema_of_json(json)))}
+#' @note schema_of_json since 3.0.0
+setMethod("schema_of_json", signature(x = "characterOrColumn"),
+  function(x, ...) {
+if (class(x) == "character") {
+  col <- callJStatic("org.apache.spark.sql.functions", "lit", 
x)
+} else {
+  col <- x@jc
--- End diff --

What's the use when x is a Column?
`schema_of_csv(lit("Amsterdam,2018"))` seems a bit odd to me...


---


[GitHub] spark pull request #22939: [SPARK-25446][R] Add schema_of_json() and schema_...

2018-11-04 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/22939#discussion_r230649120
  
--- Diff: R/pkg/R/functions.R ---
@@ -2230,6 +2237,32 @@ setMethod("from_json", signature(x = "Column", 
schema = "characterOrstructType")
 column(jc)
   })
 
+#' @details
+#' \code{schema_of_json}: Parses a JSON string and infers its schema in 
DDL format.
+#'
+#' @rdname column_collection_functions
+#' @aliases schema_of_json schema_of_json,characterOrColumn-method
+#' @examples
+#'
+#' \dontrun{
+#' json <- '{"name":"Bob"}'
--- End diff --

I think we should avoid mixing `'` and `"` in doc


---


[GitHub] spark pull request #22939: [SPARK-25446][R] Add schema_of_json() and schema_...

2018-11-04 Thread felixcheung

Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/22939#discussion_r230649693
  
--- Diff: R/pkg/R/functions.R ---
@@ -2260,6 +2293,32 @@ setMethod("from_csv", signature(x = "Column", schema 
= "characterOrColumn"),
 column(jc)
   })
 
+#' @details
+#' \code{schema_of_csv}: Parses a CSV string and infers its schema in DDL 
format.
+#'
+#' @rdname column_collection_functions
+#' @aliases schema_of_csv schema_of_csv,characterOrColumn-method
+#' @examples
+#'
+#' \dontrun{
+#' csv <- "'Amsterdam,2018'"
--- End diff --

I'm a bit confused about `"'Amsterdam,2018'"` vs `"Amsterdam,2018"`.
Does the latter work?


---


[GitHub] spark pull request #22693: [SPARK-25701][SQL] Supports calculation of table ...

2018-11-04 Thread wangyum

Github user wangyum commented on a diff in the pull request:

https://github.com/apache/spark/pull/22693#discussion_r230639634
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
@@ -115,26 +116,45 @@ class ResolveHiveSerdeTable(session: SparkSession) 
extends Rule[LogicalPlan] {
 
 class DetermineTableStats(session: SparkSession) extends Rule[LogicalPlan] 
{
   override def apply(plan: LogicalPlan): LogicalPlan = plan 
resolveOperators {
+case filterPlan @ Filter(_, SubqueryAlias(_, relation: 
HiveTableRelation)) =>
+  val predicates = 
PhysicalOperation.unapply(filterPlan).map(_._2).getOrElse(Nil)
+  computeTableStats(relation, predicates)
 case relation: HiveTableRelation
 if DDLUtils.isHiveTable(relation.tableMeta) && 
relation.tableMeta.stats.isEmpty =>
-  val table = relation.tableMeta
-  val sizeInBytes = if 
(session.sessionState.conf.fallBackToHdfsForStatsEnabled) {
-try {
-  val hadoopConf = session.sessionState.newHadoopConf()
-  val tablePath = new Path(table.location)
-  val fs: FileSystem = tablePath.getFileSystem(hadoopConf)
-  fs.getContentSummary(tablePath).getLength
-} catch {
-  case e: IOException =>
-logWarning("Failed to get table size from hdfs.", e)
-session.sessionState.conf.defaultSizeInBytes
-}
-  } else {
-session.sessionState.conf.defaultSizeInBytes
+  computeTableStats(relation)
+  }
+
+  private def computeTableStats(
+  relation: HiveTableRelation,
+  predicates: Seq[Expression] = Nil): LogicalPlan = {
+val table = relation.tableMeta
+val sizeInBytes = if 
(session.sessionState.conf.fallBackToHdfsForStatsEnabled) {
+  try {
+val hadoopConf = session.sessionState.newHadoopConf()
+val tablePath = new Path(table.location)
+val fs: FileSystem = tablePath.getFileSystem(hadoopConf)
+BigInt(fs.getContentSummary(tablePath).getLength)
+  } catch {
+case e: IOException =>
+  logWarning("Failed to get table size from hdfs.", e)
+  getSizeInBytesFromTablePartitions(table.identifier, predicates)
   }
+} else {
+  getSizeInBytesFromTablePartitions(table.identifier, predicates)
+}
+val withStats = table.copy(stats = Some(CatalogStatistics(sizeInBytes 
= sizeInBytes)))
+relation.copy(tableMeta = withStats)
+  }
 
-  val withStats = table.copy(stats = 
Some(CatalogStatistics(sizeInBytes = BigInt(sizeInBytes))))
-  relation.copy(tableMeta = withStats)
+  private def getSizeInBytesFromTablePartitions(
+  tableIdentifier: TableIdentifier,
+  predicates: Seq[Expression] = Nil): BigInt = {
+session.sessionState.catalog.listPartitionsByFilter(tableIdentifier, 
predicates) match {
--- End diff --

After [this refactor](https://github.com/apache/spark/pull/22743), we can 
avoid computing stats if the `LogicalRelation` is already cached, because the 
computed stats will not take effect.


---


[GitHub] spark issue #22943: [SPARK-25098][SQL] Trim the string when cast stringToTim...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22943
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22943: [SPARK-25098][SQL] Trim the string when cast stringToTim...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22943
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4760/
Test PASSed.


---


[GitHub] spark issue #22943: [SPARK-25098][SQL] Trim the string when cast stringToTim...

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22943
  
**[Test build #98460 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98460/testReport)**
 for PR 22943 at commit 
[`d297817`](https://github.com/apache/spark/commit/d297817b7457fef40eb78b803542aed213afb7fc).


---


[GitHub] spark pull request #22943: [SPARK-25098][SQL] Trim the string when cast stri...

2018-11-04 Thread wangyum

GitHub user wangyum opened a pull request:

https://github.com/apache/spark/pull/22943

[SPARK-25098][SQL] Trim the string when cast stringToTimestamp and 
stringToDate

## What changes were proposed in this pull request?

**Hive** and **Oracle** trim the string when casting it to a timestamp or 
date (`stringToTimestamp` and `stringToDate`). This PR adds support for the 
same behavior:

![image](https://user-images.githubusercontent.com/5399861/47979721-793b1e80-e0ff-11e8-97c8-24b10950ee9e.png)

![image](https://user-images.githubusercontent.com/5399861/47979725-7dffd280-e0ff-11e8-87d4-5767a00ed46e.png)
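
The effect of trimming before a cast can be illustrated outside Spark. The following is a minimal sketch, assuming `java.time.LocalDate.parse` as a stand-in for the string-to-date conversion; `TrimBeforeCast` and `parseDate` are hypothetical names, and this is not the implementation in this PR.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Optional;

public final class TrimBeforeCast {
    // Hypothetical helper: parses an ISO date string, optionally trimming
    // surrounding whitespace first. An empty Optional stands in for a failed cast.
    static Optional<LocalDate> parseDate(String input, boolean trim) {
        String candidate = trim ? input.trim() : input;
        try {
            return Optional.of(LocalDate.parse(candidate));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String padded = " 2018-11-04 ";
        // Without trimming, the strict ISO parser rejects the padded input.
        System.out.println(parseDate(padded, false).isPresent());
        // With trimming, the same input parses successfully.
        System.out.println(parseDate(padded, true).isPresent());
    }
}
```

The `trim` flag exists only to contrast the two behaviors side by side.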


## How was this patch tested?

unit tests

Closes https://github.com/apache/spark/pull/22089


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/wangyum/spark SPARK-25098

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22943.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22943


commit d297817b7457fef40eb78b803542aed213afb7fc
Author: Yuming Wang 
Date:   2018-11-05T05:31:22Z

trim() the string when cast stringToTimestamp and stringToDate




---


[GitHub] spark issue #22928: [SPARK-25926][CORE] Move config entries in core module t...

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22928
  
**[Test build #98459 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98459/testReport)**
 for PR 22928 at commit 
[`6144e01`](https://github.com/apache/spark/commit/6144e01fc6eb612e07a532cc10e3fafb8ccd71ee).


---


[GitHub] spark issue #22928: [SPARK-25926][CORE] Move config entries in core module t...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22928
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22928: [SPARK-25926][CORE] Move config entries in core module t...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22928
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4759/
Test PASSed.


---


[GitHub] spark pull request #22913: [SPARK-25902][SQL] Add support for dates with mil...

2018-11-04 Thread ueshin

Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22913#discussion_r230635196
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java 
---
@@ -414,6 +416,21 @@ final int getInt(int rowId) {
 }
   }
 
+  private static class DateMilliAccessor extends ArrowVectorAccessor {
+
+private final DateMilliVector accessor;
+
+DateMilliAccessor(DateMilliVector vector) {
+  super(vector);
+  this.accessor = vector;
+}
+
+@Override
+final long getLong(int rowId) {
--- End diff --

We should use `getInt()` to represent the number of days from 1970-01-01 if 
we map the type to date type.
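
The conversion this comment implies can be sketched as follows, assuming the Arrow value is milliseconds since the Unix epoch. `DateMilliToDays` and `millisToDays` are hypothetical names used for illustration only, not code from this PR.

```java
public final class DateMilliToDays {
    private static final long MILLIS_PER_DAY = 86_400_000L;

    // Hypothetical helper: converts epoch milliseconds to whole days since 1970-01-01.
    // Floor division (rather than truncation) keeps pre-1970 instants on the correct day.
    static int millisToDays(long epochMillis) {
        return Math.toIntExact(Math.floorDiv(epochMillis, MILLIS_PER_DAY));
    }

    public static void main(String[] args) {
        System.out.println(millisToDays(0L));           // start of 1970-01-01
        System.out.println(millisToDays(86_400_000L));  // exactly one day later
        System.out.println(millisToDays(-1L));          // one millisecond before the epoch
    }
}
```

Plain `/` would round `-1L` toward zero and place it on day 0, which is why the sketch uses `Math.floorDiv`.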


---


[GitHub] spark issue #22942: [SPARK-25884][SQL][FOLLOW-UP] Add sample.json back.

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22942
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22942: [SPARK-25884][SQL][FOLLOW-UP] Add sample.json back.

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22942
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4758/
Test PASSed.


---


[GitHub] spark issue #22942: [SPARK-25884][SQL][FOLLOW-UP] Add sample.json back.

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22942
  
**[Test build #98458 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98458/testReport)**
 for PR 22942 at commit 
[`18ccff1`](https://github.com/apache/spark/commit/18ccff15a771d3e0221b49114ff300b0ef41a25b).


---


[GitHub] spark pull request #22942: [SPARK-25884][SQL][FOLLOW-UP] Add sample.json bac...

2018-11-04 Thread ueshin

GitHub user ueshin opened a pull request:

https://github.com/apache/spark/pull/22942

[SPARK-25884][SQL][FOLLOW-UP] Add sample.json back.

## What changes were proposed in this pull request?

This is a follow-up PR to #22892, which moved `sample.json` from the hive 
module to the sql module, but we still need the file in the hive module.

## How was this patch tested?

Existing tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ueshin/apache-spark 
issues/SPARK-25884/sample.json

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22942.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22942


commit 18ccff15a771d3e0221b49114ff300b0ef41a25b
Author: Takuya UESHIN 
Date:   2018-11-05T04:54:42Z

Add sample.json back.




---


[GitHub] spark issue #22942: [SPARK-25884][SQL][FOLLOW-UP] Add sample.json back.

2018-11-04 Thread ueshin

Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/22942
  
cc @srowen @cloud-fan 


---


[GitHub] spark issue #22892: [SPARK-25884][SQL] Add TBLPROPERTIES and COMMENT, and us...

2018-11-04 Thread ueshin

Github user ueshin commented on the issue:

https://github.com/apache/spark/pull/22892
  
Seems like we still need `sample.json` in hive module. I'll submit a 
follow-up pr.


---


[GitHub] spark issue #22939: [SPARK-25446][R] Add schema_of_json() and schema_of_csv(...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22939
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98457/
Test PASSed.


---


[GitHub] spark issue #22939: [SPARK-25446][R] Add schema_of_json() and schema_of_csv(...

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22939
  
**[Test build #98457 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98457/testReport)**
 for PR 22939 at commit 
[`c0a9384`](https://github.com/apache/spark/commit/c0a9384d292cdeff3a8799b20e166522f64ff50d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22939: [SPARK-25446][R] Add schema_of_json() and schema_of_csv(...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22939
  
Merged build finished. Test PASSed.


---


[GitHub] spark pull request #22913: [SPARK-25902][SQL] Add support for dates with mil...

2018-11-04 Thread ueshin

Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22913#discussion_r230628333
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowUtils.scala 
---
@@ -71,6 +71,7 @@ object ArrowUtils {
 case d: ArrowType.Decimal => DecimalType(d.getPrecision, d.getScale)
 case date: ArrowType.Date if date.getUnit == DateUnit.DAY => DateType
 case ts: ArrowType.Timestamp if ts.getUnit == TimeUnit.MICROSECOND => 
TimestampType
+case date: ArrowType.Date if date.getUnit == DateUnit.MILLISECOND => 
TimestampType
--- End diff --

Note that Spark doesn't have a date type with millisecond precision, so if 
we map to the date type, the hours, minutes, and so on will be lost. Otherwise 
we have to map to the timestamp type.
Which is the proper behavior? cc @BryanCutler 
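
To make the trade-off concrete, here is a minimal sketch of the two candidate mappings, assuming the date type stores days since the epoch and the timestamp type stores microseconds since the epoch. `DateMilliMapping`, `toDateDays`, and `toTimestampMicros` are hypothetical names, not Spark or Arrow APIs.

```java
public final class DateMilliMapping {
    private static final long MILLIS_PER_DAY = 86_400_000L;
    private static final long MICROS_PER_MILLI = 1_000L;

    // Option 1 (assumed date representation): days since epoch.
    // The time-of-day part of the value is discarded.
    static int toDateDays(long epochMillis) {
        return Math.toIntExact(Math.floorDiv(epochMillis, MILLIS_PER_DAY));
    }

    // Option 2 (assumed timestamp representation): microseconds since epoch.
    // Nothing is lost, but the column is no longer a date.
    static long toTimestampMicros(long epochMillis) {
        return Math.multiplyExact(epochMillis, MICROS_PER_MILLI);
    }

    public static void main(String[] args) {
        long oneDayAndOneHour = 90_000_000L; // 1970-01-02T01:00:00Z in milliseconds
        System.out.println(toDateDays(oneDayAndOneHour));        // the extra hour is dropped
        System.out.println(toTimestampMicros(oneDayAndOneHour)); // the extra hour is preserved
    }
}
```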


---


[GitHub] spark pull request #20675: [SPARK-23033][SS][Follow Up] Task level retry for...

2018-11-04 Thread xuanyuanking

Github user xuanyuanking closed the pull request at:

https://github.com/apache/spark/pull/20675


---


[GitHub] spark issue #20675: [SPARK-23033][SS][Follow Up] Task level retry for contin...

2018-11-04 Thread xuanyuanking

Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/20675
  
@HeartSaVioR Thanks for your reply, and sorry for only just seeing your 
comment. Yep, I will keep tracking this feature after we support shuffled 
stateful operators.


---


[GitHub] spark issue #22932: [SPARK-25102][SQL] Write Spark version to ORC/Parquet fi...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22932
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22932: [SPARK-25102][SQL] Write Spark version to ORC/Parquet fi...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22932
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98456/
Test PASSed.


---


[GitHub] spark issue #22932: [SPARK-25102][SQL] Write Spark version to ORC/Parquet fi...

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22932
  
**[Test build #98456 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98456/testReport)**
 for PR 22932 at commit 
[`ef49a27`](https://github.com/apache/spark/commit/ef49a277d3fd39c6fd91b3fcda65f660b833ec95).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22939: [SPARK-25446][R] Add schema_of_json() and schema_of_csv(...

2018-11-04 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22939
  
Will make another PR after this gets merged to allow the cases below:


```r
df <- sql("SELECT named_struct('name', 'Bob') as people")
df <- mutate(df, people_json = to_json(df$people))
head(select(df, from_json(df$people_json, 
schema_of_json(head(df)$people_json))))
```

```
  from_json(people_json)
1Bob
```

```r
df <- sql("SELECT named_struct('name', 'Bob') as people")
df <- mutate(df, people_json = to_csv(df$people))
head(select(df, from_csv(df$people_json, 
schema_of_csv(head(df)$people_json))))
```

```
  from_csv(people_json)
1   Bob
```


---


[GitHub] spark issue #22939: [SPARK-25446][R] Add schema_of_json() and schema_of_csv(...

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22939
  
**[Test build #98457 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98457/testReport)**
 for PR 22939 at commit 
[`c0a9384`](https://github.com/apache/spark/commit/c0a9384d292cdeff3a8799b20e166522f64ff50d).


---


[GitHub] spark issue #22939: [SPARK-25446][R] Add schema_of_json() and schema_of_csv(...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22939
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4757/
Test PASSed.


---


[GitHub] spark issue #22939: [SPARK-25446][R] Add schema_of_json() and schema_of_csv(...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22939
  
Merged build finished. Test PASSed.


---


[GitHub] spark pull request #22941: [SPARK-25936][SQL] Fix InsertIntoDataSourceComman...

2018-11-04 Thread wangyum

Github user wangyum commented on a diff in the pull request:

https://github.com/apache/spark/pull/22941#discussion_r230622708
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala ---
@@ -589,4 +590,33 @@ class InsertSuite extends DataSourceTest with 
SharedSQLContext {
   sql("INSERT INTO TABLE test_table SELECT 2, null")
 }
   }
+
+  test("SPARK-25936 InsertIntoDataSourceCommand does not use Cached Data") 
{
--- End diff --

It works. Do we need to fix this plan issue?


---


[GitHub] spark issue #22918: [SPARK-25892][SQL]Change AttributeReference.withMetadata...

2018-11-04 Thread xuanyuanking

Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/22918
  
The `as the spark-25902 mentioned.` in the PR description may be a typo? 
Should it be SPARK-25892?


---


[GitHub] spark issue #22903: [SPARK-24196][SQL] Implement Spark's own GetSchemasOpera...

2018-11-04 Thread wangyum

Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/22903
  
cc @gatorsmile


---


[GitHub] spark pull request #22925: [SPARK-25913][SQL] Extend UnaryExecNode by unary ...

2018-11-04 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22925


---


[GitHub] spark issue #22928: [SPARK-25926][CORE] Move config entries in core module t...

2018-11-04 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22928
  
Keeping them in separate source files is also fine to me. I think we should 
put them in the same package. 


---


[GitHub] spark issue #22913: [SPARK-25902][SQL] Add support for dates with millisecon...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22913
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22913: [SPARK-25902][SQL] Add support for dates with millisecon...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22913
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98455/
Test PASSed.


---


[GitHub] spark issue #22913: [SPARK-25902][SQL] Add support for dates with millisecon...

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22913
  
**[Test build #98455 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98455/testReport)**
 for PR 22913 at commit 
[`3afb870`](https://github.com/apache/spark/commit/3afb8708c0394368a9435a7911201de31143f41e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark pull request #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type ...

2018-11-04 Thread HyukjinKwon

Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22905#discussion_r230616072
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala 
---
@@ -306,7 +306,15 @@ case class FileSourceScanExec(
   withOptPartitionCount
 }
 
-withSelectedBucketsCount
+val withOptColumnCount = relation.fileFormat match {
+  case columnar: ColumnarFileFormat =>
+val sqlConf = relation.sparkSession.sessionState.conf
+val columnCount = columnar.columnCountForSchema(sqlConf, 
requiredSchema)
+withSelectedBucketsCount + ("ColumnCount" -> columnCount.toString)
--- End diff --

I was wondering how important it is to know whether the columns are pruned 
or not. By that logic, other information should be put in the metadata as 
well. For instance, we're not even showing the actual filters on the Spark 
side (not the Catalyst ones; I mean the actual pushed filters that are 
applied at each source implementation level, such as the filters from 
`ParquetFilters.createFilter`).


---


[GitHub] spark issue #22913: [SPARK-25902][SQL] Add support for dates with millisecon...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22913
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98454/
Test PASSed.


---


[GitHub] spark issue #22913: [SPARK-25902][SQL] Add support for dates with millisecon...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22913
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22913: [SPARK-25902][SQL] Add support for dates with millisecon...

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22913
  
**[Test build #98454 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98454/testReport)**
 for PR 22913 at commit 
[`2c14694`](https://github.com/apache/spark/commit/2c146941adb294ec9c5acc93cf55108e88075ad2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22089: [SPARK-25098][SQL]‘Cast’ will return NULL when input...

2018-11-04 Thread wangyum

Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/22089
  
Sure, @gatorsmile .


---


[GitHub] spark issue #18581: [SPARK-21289][SQL][ML] Supports custom line separator fo...

2018-11-04 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/18581
  
What you see is what you get. It's not yet finished. See also 
https://github.com/apache/spark/pull/20877#issuecomment-429182740


---


[GitHub] spark pull request #22889: [SPARK-25882][SQL] Added a function to join two d...

2018-11-04 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22889#discussion_r230614726
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -883,6 +883,31 @@ class Dataset[T] private[sql](
 join(right, Seq(usingColumn))
   }
 
+  /**
+* Equi-join with another `DataFrame` using the given column.
+*
+* Different from other join functions, the join column will only 
appear once in the output,
+* i.e. similar to SQL's `JOIN USING` syntax.
+*
+* {{{
+*   // Left join of df1 and df2 using the column "user_id"
+*   df1.join(df2, "user_id", "left")
+* }}}
+*
+* @param right Right side of the join operation.
+* @param usingColumn Name of the column to join on. This column must 
exist on both sides.
+* @param joinType Type of join to perform. Default `inner`. Must be 
one of:
+* `inner`, `cross`, `outer`, `full`, `full_outer`, 
`left`, `left_outer`,
+* `right`, `right_outer`, `left_semi`, `left_anti`.
+* @note If you perform a self-join using this function without 
aliasing the input
+* `DataFrame`s, you will NOT be able to reference any columns after 
the join, since
+* there is no way to disambiguate which side of the join you would 
like to reference.
+* @group untypedrel
+*/
+  def join(right: Dataset[_], usingColumn: String, joinType: String): 
DataFrame = {
--- End diff --

Normally, we do not add such an API. 


---


[GitHub] spark pull request #21856: [SPARK-24738] [HistoryServer] FsHistoryProvider c...

2018-11-04 Thread LiShuMing

Github user LiShuMing closed the pull request at:

https://github.com/apache/spark/pull/21856


---


[GitHub] spark issue #22923: [SPARK-25910][CORE] accumulator updates from previous st...

2018-11-04 Thread zsxwing

Github user zsxwing commented on the issue:

https://github.com/apache/spark/pull/22923
  
We always need to update user accumulators. Right now such task metrics 
just cause some annoying error logs, so this does not seem worth fixing.


---


[GitHub] spark issue #22932: [SPARK-25102][SQL] Write Spark version to ORC/Parquet fi...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22932
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4756/
Test PASSed.


---


[GitHub] spark issue #22932: [SPARK-25102][SQL] Write Spark version to ORC/Parquet fi...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22932
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22932: [SPARK-25102][SQL] Write Spark version to ORC/Parquet fi...

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22932
  
**[Test build #98456 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98456/testReport)**
 for PR 22932 at commit 
[`ef49a27`](https://github.com/apache/spark/commit/ef49a277d3fd39c6fd91b3fcda65f660b833ec95).


---


[GitHub] spark pull request #22932: [SPARK-25102][SQL] Write Spark version to ORC/Par...

2018-11-04 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22932#discussion_r230610261
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/package.scala ---
@@ -44,4 +44,13 @@ package object sql {
   type Strategy = SparkStrategy
 
   type DataFrame = Dataset[Row]
+
+  /**
+   * Metadata key which is used to write Spark version in the followings:
+   * - Parquet file metadata
+   * - ORC file metadata
+   *
+   * Note that Hive table property `spark.sql.create.version` also has 
Spark version.
+   */
+  private[sql] val CREATE_VERSION = "org.apache.spark.sql.create.version"
--- End diff --

Thank you for the review, @hvanhovell . Yes, we can use 
`org.apache.spark.version` since this is a new key.

Although the Hive table property `spark.sql.create.version` has a `.create.` 
part, it seems that we don't need to follow that convention here.
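
As a rough illustration of the idea under discussion, the sketch below stores a writer version under the shorter key in a plain key/value map standing in for file-level metadata. The object and method names are hypothetical and are not taken from the patch; only the key string comes from the review comments.

```scala
// Hypothetical sketch: the object and method names are illustrative only.
object VersionMetadataSketch {
  // Key suggested in the review, without the Hive-style `.create.` part.
  val SparkVersionMetadataKey: String = "org.apache.spark.version"

  /** Returns a copy of the key/value metadata with the writer version recorded. */
  def withSparkVersion(metadata: Map[String, String], version: String): Map[String, String] =
    metadata + (SparkVersionMetadataKey -> version)

  /** Reads the writer version back, if one was recorded. */
  def sparkVersionOf(metadata: Map[String, String]): Option[String] =
    metadata.get(SparkVersionMetadataKey)
}
```

Readers of files written before such a key existed would see `None`, so any consumer has to treat the version as optional.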


---


[GitHub] spark issue #22913: [SPARK-25902][SQL] Add support for dates with millisecon...

2018-11-04 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22913
  
Also cc @ueshin 


---


[GitHub] spark pull request #22693: [SPARK-25701][SQL] Supports calculation of table ...

2018-11-04 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22693#discussion_r230609824
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
@@ -115,26 +116,45 @@ class ResolveHiveSerdeTable(session: SparkSession) 
extends Rule[LogicalPlan] {
 
 class DetermineTableStats(session: SparkSession) extends Rule[LogicalPlan] 
{
   override def apply(plan: LogicalPlan): LogicalPlan = plan 
resolveOperators {
+case filterPlan @ Filter(_, SubqueryAlias(_, relation: 
HiveTableRelation)) =>
+  val predicates = 
PhysicalOperation.unapply(filterPlan).map(_._2).getOrElse(Nil)
+  computeTableStats(relation, predicates)
 case relation: HiveTableRelation
 if DDLUtils.isHiveTable(relation.tableMeta) && 
relation.tableMeta.stats.isEmpty =>
-  val table = relation.tableMeta
-  val sizeInBytes = if 
(session.sessionState.conf.fallBackToHdfsForStatsEnabled) {
-try {
-  val hadoopConf = session.sessionState.newHadoopConf()
-  val tablePath = new Path(table.location)
-  val fs: FileSystem = tablePath.getFileSystem(hadoopConf)
-  fs.getContentSummary(tablePath).getLength
-} catch {
-  case e: IOException =>
-logWarning("Failed to get table size from hdfs.", e)
-session.sessionState.conf.defaultSizeInBytes
-}
-  } else {
-session.sessionState.conf.defaultSizeInBytes
+  computeTableStats(relation)
+  }
+
+  private def computeTableStats(
+  relation: HiveTableRelation,
+  predicates: Seq[Expression] = Nil): LogicalPlan = {
+val table = relation.tableMeta
+val sizeInBytes = if 
(session.sessionState.conf.fallBackToHdfsForStatsEnabled) {
+  try {
+val hadoopConf = session.sessionState.newHadoopConf()
+val tablePath = new Path(table.location)
+val fs: FileSystem = tablePath.getFileSystem(hadoopConf)
+BigInt(fs.getContentSummary(tablePath).getLength)
+  } catch {
+case e: IOException =>
+  logWarning("Failed to get table size from hdfs.", e)
+  getSizeInBytesFromTablePartitions(table.identifier, predicates)
   }
+} else {
+  getSizeInBytesFromTablePartitions(table.identifier, predicates)
+}
+val withStats = table.copy(stats = Some(CatalogStatistics(sizeInBytes 
= sizeInBytes)))
+relation.copy(tableMeta = withStats)
+  }
 
-  val withStats = table.copy(stats = 
Some(CatalogStatistics(sizeInBytes = BigInt(sizeInBytes))))
-  relation.copy(tableMeta = withStats)
+  private def getSizeInBytesFromTablePartitions(
+  tableIdentifier: TableIdentifier,
+  predicates: Seq[Expression] = Nil): BigInt = {
+session.sessionState.catalog.listPartitionsByFilter(tableIdentifier, 
predicates) match {
--- End diff --

The perf will be pretty bad when the number of partitions is huge. Thus, I 
think we can close this PR. 
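
For reference, the fallback that the removed lines in this diff implement (use the file-system size when enabled, otherwise a configured default) can be modelled without Spark. This is a minimal sketch under stated assumptions: `estimateSizeInBytes` and its parameters are hypothetical stand-ins for the session configuration and the `getContentSummary` call, not the actual rule.

```scala
import java.io.IOException
import scala.util.{Failure, Success, Try}

// Hypothetical model of the size fallback; not the DetermineTableStats rule itself.
object TableSizeFallbackSketch {
  def estimateSizeInBytes(
      fallBackToFileSystem: Boolean, // stands in for fallBackToHdfsForStatsEnabled
      fileSystemSize: () => Long,    // stands in for the file-system size lookup
      defaultSizeInBytes: BigInt): BigInt = {
    if (fallBackToFileSystem) {
      Try(fileSystemSize()) match {
        case Success(length) => BigInt(length)
        // An I/O failure falls back to the default instead of failing planning.
        case Failure(_: IOException) => defaultSizeInBytes
        case Failure(other) => throw other
      }
    } else {
      defaultSizeInBytes
    }
  }
}
```

Both branches here cost at most one file-system call. That is the contrast with listing every matching partition, which grows with the partition count.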


---


[GitHub] spark pull request #19796: [SPARK-22581][SQL] Catalog api does not allow to ...

2018-11-04 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19796#discussion_r230609716
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/catalog/Catalog.scala ---
@@ -411,7 +410,29 @@ abstract class Catalog {
   tableName: String,
   source: String,
   schema: StructType,
-  options: Map[String, String]): DataFrame
+  options: Map[String, String]): DataFrame = {
+createTable(tableName, source, schema, options, Nil)
+  }
+
+  /**
+* :: Experimental ::
+* (Scala-specific)
+* Create a table based on the dataset in a data source, a schema, a 
set of options and a set of partition columns.
+* Then, returns the corresponding DataFrame.
+*
+* @param tableName is either a qualified or unqualified name that 
designates a table.
+*  If no database identifier is provided, it refers to 
a table in
+*  the current database.
+* @since ???
+*/
+  @Experimental
+  @InterfaceStability.Evolving
+  def createTable(
+tableName: String,
+source: String,
+schema: StructType,
+options: Map[String, String],
+partitionColumnNames : Seq[String]): DataFrame
--- End diff --

I think we will not introduce a new API for partitioning columns in the 
current stage. Let us use SQL DDL for creating the table.
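
To illustrate the DDL route, the sketch below builds a `CREATE TABLE ... USING ... PARTITIONED BY (...)` statement as a string. The helper `createTableDdl` is hypothetical, performs no identifier quoting or escaping, and its output is assumed to be passed to `spark.sql(...)` by the caller.

```scala
// Hypothetical helper: builds a Spark SQL DDL string for a partitioned table.
// No identifier quoting or escaping is done; names are assumed to be simple identifiers.
object PartitionedTableDdlSketch {
  def createTableDdl(
      tableName: String,
      columns: Seq[(String, String)], // (column name, SQL type)
      source: String,
      partitionColumns: Seq[String]): String = {
    val columnList = columns.map { case (name, sqlType) => s"$name $sqlType" }.mkString(", ")
    val partitionClause =
      if (partitionColumns.isEmpty) ""
      else s" PARTITIONED BY (${partitionColumns.mkString(", ")})"
    s"CREATE TABLE $tableName ($columnList) USING $source$partitionClause"
  }
}
```

For example, `createTableDdl("events", Seq("id" -> "INT", "dt" -> "STRING"), "parquet", Seq("dt"))` yields a statement that partitions the table by `dt`.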


---


[GitHub] spark issue #22089: [SPARK-25098][SQL]‘Cast’ will return NULL when input...

2018-11-04 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22089
  
@wangyum Could you please take it over?


---


[GitHub] spark pull request #22089: [SPARK-25098][SQL]‘Cast’ will return NULL whe...

2018-11-04 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22089#discussion_r230609486
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala
 ---
@@ -98,6 +98,7 @@ class CastSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 c.set(Calendar.MILLISECOND, 0)
 checkEvaluation(Cast(Literal("2015-03-18"), DateType), new 
Date(c.getTimeInMillis))
 checkEvaluation(Cast(Literal("2015-03-18 "), DateType), new 
Date(c.getTimeInMillis))
+checkEvaluation(Cast(Literal(" 2015-03-18"), DateType), new 
Date(c.getTimeInMillis))
--- End diff --

> SELECT CAST(' 22-OCT-1997' AS TIMESTAMP) FROM dual;

Oracle also trims the leading space. 
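
The behaviour being tested can be sketched outside Spark: trim the input before parsing, and return no value when parsing fails. This is only an illustration using `java.time`. `castToDate` is a hypothetical name, and `Cast` in Spark is implemented differently.

```scala
import java.time.LocalDate
import scala.util.Try

// Hypothetical sketch of a whitespace-tolerant string-to-date cast.
object TrimmedDateCastSketch {
  /** Parses an ISO date after trimming; None plays the role of SQL NULL. */
  def castToDate(input: String): Option[LocalDate] =
    Try(LocalDate.parse(input.trim)).toOption
}
```

With this sketch, `" 2015-03-18"` and `"2015-03-18 "` both parse to the same date, which is the symmetry the added test line checks.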


---


[GitHub] spark issue #22941: [SPARK-25936][SQL] Fix InsertIntoDataSourceCommand does ...

2018-11-04 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22941
  
I think this is not a bug, although the plan is confusing.


---


[GitHub] spark pull request #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand'...

2018-11-04 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22930#discussion_r230609078
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/SaveIntoDataSourceCommand.scala
 ---
@@ -37,13 +37,12 @@ case class SaveIntoDataSourceCommand(
 query: LogicalPlan,
 dataSource: CreatableRelationProvider,
 options: Map[String, String],
-mode: SaveMode) extends RunnableCommand {
+mode: SaveMode,
+outputColumnNames: Seq[String]) extends DataWritingCommand {
 
-  override protected def innerChildren: Seq[QueryPlan[_]] = Seq(query)
-
-  override def run(sparkSession: SparkSession): Seq[Row] = {
-dataSource.createRelation(
-  sparkSession.sqlContext, mode, options, Dataset.ofRows(sparkSession, 
query))
--- End diff --

See what I commented in https://github.com/apache/spark/pull/22941


---


[GitHub] spark pull request #22941: [SPARK-25936][SQL] Fix InsertIntoDataSourceComman...

2018-11-04 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22941#discussion_r230609046
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala ---
@@ -589,4 +590,33 @@ class InsertSuite extends DataSourceTest with 
SharedSQLContext {
   sql("INSERT INTO TABLE test_table SELECT 2, null")
 }
   }
+
+  test("SPARK-25936 InsertIntoDataSourceCommand does not use Cached Data") 
{
--- End diff --

You can move this test suite to CachedTableSuite.scala and use the helper 
functions to verify whether the cache is used. 

See the example. 
```
spark.range(2).createTempView("test_view")
spark.catalog.cacheTable("test_view")
val rddId = rddIdOf("test_view")
assert(!isMaterialized(rddId))
sql("INSERT INTO TABLE test_table SELECT * FROM test_view")
assert(isMaterialized(rddId))
```




---


[GitHub] spark issue #18581: [SPARK-21289][SQL][ML] Supports custom line separator fo...

2018-11-04 Thread don4of4

Github user don4of4 commented on the issue:

https://github.com/apache/spark/pull/18581
  
Was this finished and merged in?  I see 
https://issues.apache.org/jira/browse/SPARK-21289 is still open.


---


[GitHub] spark pull request #22941: [SPARK-25936][SQL] Fix InsertIntoDataSourceComman...

2018-11-04 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22941#discussion_r230608937
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoDataSourceCommand.scala
 ---
@@ -30,14 +30,13 @@ import org.apache.spark.sql.sources.InsertableRelation
 case class InsertIntoDataSourceCommand(
 logicalRelation: LogicalRelation,
 query: LogicalPlan,
-overwrite: Boolean)
-  extends RunnableCommand {
+overwrite: Boolean,
+outputColumnNames: Seq[String])
+  extends DataWritingCommand {
 
-  override protected def innerChildren: Seq[QueryPlan[_]] = Seq(query)
-
-  override def run(sparkSession: SparkSession): Seq[Row] = {
+  override def run(sparkSession: SparkSession, child: SparkPlan): Seq[Row] 
= {
 val relation = 
logicalRelation.relation.asInstanceOf[InsertableRelation]
-val data = Dataset.ofRows(sparkSession, query)
--- End diff --

This will use the cached data, although the plan does not show the cached 
data is used. 


---


[GitHub] spark issue #22913: [SPARK-25902][SQL] Add support for dates with millisecon...

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22913
  
**[Test build #98455 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98455/testReport)**
 for PR 22913 at commit 
[`3afb870`](https://github.com/apache/spark/commit/3afb8708c0394368a9435a7911201de31143f41e).


---


[GitHub] spark pull request #22913: [SPARK-25902][SQL] Add support for dates with mil...

2018-11-04 Thread javierluraschi

Github user javierluraschi commented on a diff in the pull request:

https://github.com/apache/spark/pull/22913#discussion_r230607581
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowUtils.scala 
---
@@ -71,6 +71,7 @@ object ArrowUtils {
 case d: ArrowType.Decimal => DecimalType(d.getPrecision, d.getScale)
 case date: ArrowType.Date if date.getUnit == DateUnit.DAY => DateType
 case ts: ArrowType.Timestamp if ts.getUnit == TimeUnit.MICROSECOND => 
TimestampType
+case date: ArrowType.Date if date.getUnit == DateUnit.MILLISECOND => 
TimestampType
--- End diff --

Good catch, thanks. Yes, this should be mapped to `Date` in the Arrow 
schema, not `TimeStamp`.

To give more background, Arrow dates can have a unit of `DateUnit.DAY` or 
`DateUnit.MILLISECOND` (see 
[arrow/vector/types/DateUnit.java#L21-L22](https://github.com/apache/arrow/blob/73d379f4631cd3013371f60876a52615171e6c3b/java/vector/src/main/java/org/apache/arrow/vector/types/DateUnit.java#L21-L22)).
Currently, if a date with milliseconds is passed, the conversion simply 
fails. This change therefore does not affect other type conversions, and it 
is fine to map all Arrow dates to Spark dates, since all cases are now 
handled properly.
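
A minimal sketch of what mapping both units to a date implies, assuming the target is a day count since the Unix epoch (which is how Spark's `DateType` stores values). The `DateUnit` trait below is a local stand-in for Arrow's enum, and `toEpochDays` is a hypothetical helper, not part of `ArrowUtils`.

```scala
// Hypothetical model: a local stand-in for Arrow's DateUnit, not the Arrow class.
object ArrowDateMappingSketch {
  sealed trait DateUnit
  case object Day extends DateUnit
  case object Millisecond extends DateUnit

  private val MillisPerDay: Long = 24L * 60 * 60 * 1000

  /** Converts a raw date value in the given unit to days since the Unix epoch. */
  def toEpochDays(value: Long, unit: DateUnit): Int = unit match {
    case Day => Math.toIntExact(value)
    // floorDiv keeps pre-epoch values on the correct (earlier) day.
    case Millisecond => Math.toIntExact(Math.floorDiv(value, MillisPerDay))
  }
}
```

Because the match is exhaustive over the two units, neither case falls through to an unsupported-type error.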


---


[GitHub] spark issue #22913: [SPARK-25902][SQL] Add support for dates with millisecon...

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22913
  
**[Test build #98454 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98454/testReport)**
 for PR 22913 at commit 
[`2c14694`](https://github.com/apache/spark/commit/2c146941adb294ec9c5acc93cf55108e88075ad2).


---


[GitHub] spark issue #22936: [SPARK-19799] Support WITH clause (CTE) in subqueries

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22936
  
Merged build finished. Test FAILed.


---


[GitHub] spark issue #22936: [SPARK-19799] Support WITH clause (CTE) in subqueries

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22936
  
**[Test build #98453 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98453/testReport)**
 for PR 22936 at commit 
[`66cd537`](https://github.com/apache/spark/commit/66cd5379a17e05707ae162bb20e9c64812737d78).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22936: [SPARK-19799] Support WITH clause (CTE) in subqueries

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22936
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98453/
Test FAILed.


---


[GitHub] spark issue #22936: [SPARK-19799] Support WITH clause (CTE) in subqueries

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22936
  
**[Test build #98453 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98453/testReport)**
 for PR 22936 at commit 
[`66cd537`](https://github.com/apache/spark/commit/66cd5379a17e05707ae162bb20e9c64812737d78).


---


[GitHub] spark issue #22936: [SPARK-19799] Support WITH clause (CTE) in subqueries

2018-11-04 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22936
  
ok to test


---


[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22931
  
**[Test build #4412 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4412/testReport)**
 for PR 22931 at commit 
[`bf85974`](https://github.com/apache/spark/commit/bf85974e769b86056a83be6f051cb15ff3279022).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark pull request #22932: [SPARK-25102][SQL] Write Spark version to ORC/Par...

2018-11-04 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/22932#discussion_r230604337
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/package.scala ---
@@ -44,4 +44,13 @@ package object sql {
   type Strategy = SparkStrategy
 
   type DataFrame = Dataset[Row]
+
+  /**
+   * Metadata key which is used to write Spark version in the followings:
+   * - Parquet file metadata
+   * - ORC file metadata
+   *
+   * Note that Hive table property `spark.sql.create.version` also has 
Spark version.
+   */
+  private[sql] val CREATE_VERSION = "org.apache.spark.sql.create.version"
--- End diff --

Is this a pre-existing key? Seems that `org.apache.spark.version` should be 
enough.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #22938: [SPARK-25935][SQL] Prevent null rows from JSON parser

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22938
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98452/
Test PASSed.


---


[GitHub] spark issue #22938: [SPARK-25935][SQL] Prevent null rows from JSON parser

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22938
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22938: [SPARK-25935][SQL] Prevent null rows from JSON parser

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22938
  
**[Test build #98452 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98452/testReport)**
 for PR 22938 at commit 
[`c4d6a80`](https://github.com/apache/spark/commit/c4d6a8066031c4f1b0f9323f9998f0f0b10b74c7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22818: [SPARK-25904][CORE] Allocate arrays smaller than Int.Max...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22818
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22818: [SPARK-25904][CORE] Allocate arrays smaller than Int.Max...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22818
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98451/
Test PASSed.


---


[GitHub] spark issue #22818: [SPARK-25904][CORE] Allocate arrays smaller than Int.Max...

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22818
  
**[Test build #98451 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98451/testReport)**
 for PR 22818 at commit 
[`ca3efd8`](https://github.com/apache/spark/commit/ca3efd8f636706abf8c994cb75c14432f4e4939a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22914
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22914
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98450/
Test PASSed.


---


[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22914
  
**[Test build #98450 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98450/testReport)**
 for PR 22914 at commit 
[`fc1e542`](https://github.com/apache/spark/commit/fc1e5423547fb86156e2b76bd3857c5a75139300).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---


[GitHub] spark issue #22932: [SPARK-25102][SQL] Write Spark version to ORC/Parquet fi...

2018-11-04 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22932
  
I see. Thanks, @gatorsmile .


---


[GitHub] spark issue #22932: [SPARK-25102][SQL] Write Spark version to ORC/Parquet fi...

2018-11-04 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22932
  
Will take a look this week. Thanks for your work!


---


[GitHub] spark issue #22932: [SPARK-25102][SQL] Write Spark version to ORC/Parquet fi...

2018-11-04 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22932
  
Could you review this please, @gatorsmile ?


---


[GitHub] spark issue #22754: [SPARK-25776][CORE]The disk write buffer size must be gr...

2018-11-04 Thread kiszk

Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/22754
  
Thanks! merging to master


---


[GitHub] spark pull request #22754: [SPARK-25776][CORE]The disk write buffer size mus...

2018-11-04 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22754


---


[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22931
  
**[Test build #4412 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4412/testReport)** for PR 22931 at commit [`bf85974`](https://github.com/apache/spark/commit/bf85974e769b86056a83be6f051cb15ff3279022).


---


[GitHub] spark issue #22938: [SPARK-25935][SQL] Prevent null rows from JSON parser

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22938
  
**[Test build #98452 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98452/testReport)** for PR 22938 at commit [`c4d6a80`](https://github.com/apache/spark/commit/c4d6a8066031c4f1b0f9323f9998f0f0b10b74c7).


---


[GitHub] spark issue #22939: [SPARK-25446][R] Add schema_of_json() and schema_of_csv(...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22939
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98449/
Test PASSed.


---


[GitHub] spark issue #22939: [SPARK-25446][R] Add schema_of_json() and schema_of_csv(...

2018-11-04 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22939
  
Merged build finished. Test PASSed.


---


[GitHub] spark issue #22939: [SPARK-25446][R] Add schema_of_json() and schema_of_csv(...

2018-11-04 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22939
  
**[Test build #98449 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98449/testReport)** for PR 22939 at commit [`5f0a3b6`](https://github.com/apache/spark/commit/5f0a3b658b1512cceccb6a2e90bc39942851d815).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

