[GitHub] spark pull request #22932: [SPARK-25102][SQL] Write Spark version to ORC/Par...

2018-11-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22932#discussion_r230547020
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala
 ---
@@ -314,6 +316,21 @@ abstract class OrcSuite extends OrcTest with 
BeforeAndAfterAll {
   checkAnswer(spark.read.orc(path.getCanonicalPath), Row(ts))
 }
   }
+
--- End diff --

Please note that the following test case is executed twice: once in 
`OrcSourceSuite` and once in `HiveOrcSourceSuite`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22932: [SPARK-25102][SQL] Write Spark version to ORC/Parquet fi...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22932
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4733/
Test PASSed.


---




[GitHub] spark issue #22932: [SPARK-25102][SQL] Write Spark version to ORC/Parquet fi...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22932
  
**[Test build #98420 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98420/testReport)**
 for PR 22932 at commit 
[`601ccbb`](https://github.com/apache/spark/commit/601ccbb4e20a068469839bc71870230cfb6fd7a1).


---




[GitHub] spark issue #22932: [SPARK-25102][SQL] Write Spark version to ORC/Parquet fi...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22932
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #22932: [SPARK-25102][SQL] Write Spark version to ORC/Par...

2018-11-02 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request:

https://github.com/apache/spark/pull/22932

[SPARK-25102][SQL] Write Spark version to ORC/Parquet file metadata

## What changes were proposed in this pull request?

Currently, Spark writes its version number into Hive table properties 
under the key `spark.sql.create.version`.
```
parameters:{
  spark.sql.sources.schema.part.0={
"type":"struct",
"fields":[{"name":"a","type":"integer","nullable":true,"metadata":{}}]
  },
  transient_lastDdlTime=1541142761, 
  spark.sql.sources.schema.numParts=1,
  spark.sql.create.version=2.4.0
}
```

This PR aims to write the Spark version to ORC/Parquet file metadata under 
the key `org.apache.spark.sql.create.version`. This differs from the Hive table 
property key `spark.sql.create.version`, but it seems that we cannot change the 
latter, for backward compatibility.

**ORC (`native` and `hive` implementations)**
```
File Version: 0.12 with ORC_135
...
User Metadata:
  org.apache.spark.sql.create.version=3.0.0-SNAPSHOT
```

**PARQUET**
```
creator: parquet-mr version 1.10.0 (build 
031a6654009e3b82020012a18434c582bd74c73a)
extra:   org.apache.spark.sql.create.version = 3.0.0-SNAPSHOT
extra:   org.apache.spark.sql.parquet.row.metadata = 
{"type":"struct","fields":[{"name":"id","type":"long","nullable":false,"metadata":{}}]}
```
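
As a rough illustration of how a reader might use the new key, the sketch below models a file's key/value metadata as a plain Scala `Map` rather than calling the ORC or Parquet reader APIs. The object name `CreateVersionSketch` and the helper `writerVersion` are hypothetical; only the key string comes from the description above.

```scala
object CreateVersionSketch {
  // Key proposed in this PR for ORC/Parquet file metadata.
  val FileMetadataKey = "org.apache.spark.sql.create.version"

  // Hypothetical helper: look up the writer's Spark version in footer metadata.
  // Files written without the key (for example, by older writers) yield None.
  def writerVersion(footerMetadata: Map[String, String]): Option[String] =
    footerMetadata.get(FileMetadataKey)

  def main(args: Array[String]): Unit = {
    val withKey = Map(FileMetadataKey -> "3.0.0-SNAPSHOT")
    val withoutKey = Map("org.apache.spark.sql.parquet.row.metadata" -> "{}")
    println(writerVersion(withKey))
    println(writerVersion(withoutKey))
  }
}
```

Returning an `Option` keeps files that lack the key distinguishable from files that carry it, which is the case a reader would have to handle for data written before this change.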

## How was this patch tested?

Pass Jenkins with the newly added test cases.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dongjoon-hyun/spark SPARK-25102

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22932.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22932


commit 601ccbb4e20a068469839bc71870230cfb6fd7a1
Author: Dongjoon Hyun 
Date:   2018-11-03T06:43:48Z

[SPARK-25102][SQL] Write Spark version to ORC/Parquet file metadata




---




[GitHub] spark issue #22255: [SPARK-25102][Spark Core] Write Spark version informatio...

2018-11-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22255
  
Choosing either `org.apache.spark.sql.create.version` or 
`spark.sql.create.version` as the key seems to cause some inconsistency.

1) If we choose `spark.sql.create.version` as a key, in Parquet, it will 
look like the following.
```
extra:   spark.sql.create.version = 3.0.0-SNAPSHOT
extra:   org.apache.spark.sql.parquet.row.metadata = 
{"type":"struct","fields":[{"name":"id","type":"long","nullable":false,"metadata":{}}]}
```

2) If we choose `org.apache.spark.sql.create.version`, it differs from the 
Hive table property key.

I'll accept the inconsistency in (2) for backward compatibility.


---




[GitHub] spark pull request #22912: [SPARK-25901][CORE] Use only one thread in Barrie...

2018-11-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22912


---




[GitHub] spark pull request #22920: [SPARK-24959][SQL][FOLLOWUP] Creating Jackson par...

2018-11-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22920#discussion_r230546330
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmark.scala
 ---
@@ -158,26 +166,78 @@ object JSONBenchmark extends SqlBasedBenchmark {
 
   val ds = spark.read.schema(schema).json(path.getAbsolutePath)
 
-  benchmark.addCase(s"Select $colsNum columns + count()", 3) { _ =>
+  benchmark.addCase(s"Select $colsNum columns + count()", numIters) { 
_ =>
 ds.select("*").filter((_: Row) => true).count()
   }
-  benchmark.addCase(s"Select 1 column + count()", 3) { _ =>
+  benchmark.addCase(s"Select 1 column + count()", numIters) { _ =>
 ds.select($"col1").filter((_: Row) => true).count()
   }
-  benchmark.addCase(s"count()", 3) { _ =>
+  benchmark.addCase(s"count()", numIters) { _ =>
 ds.count()
   }
 
   benchmark.run()
 }
   }
 
+  def jsonParserCreation(rowsNum: Int, numIters: Int): Unit = {
+val benchmark = new Benchmark("creation of JSON parser per line", 
rowsNum, output = output)
+
+withTempPath { path =>
+  prepareDataInfo(benchmark)
+
+  val shortColumnPath = path.getAbsolutePath + "/short"
+  val shortSchema = writeShortColumn(shortColumnPath, rowsNum)
+
+  val wideColumnPath = path.getAbsolutePath + "/wide"
+  val wideSchema = writeWideColumn(wideColumnPath, rowsNum)
+
+  benchmark.addCase("Short column without encoding", numIters) { _ =>
+spark.read
+  .schema(shortSchema)
+  .json(shortColumnPath)
+  .filter((_: Row) => true)
+  .count()
+  }
+
+  benchmark.addCase("Short column with UTF-8", numIters) { _ =>
+spark.read
+  .option("encoding", "UTF-8")
+  .schema(shortSchema)
+  .json(shortColumnPath)
+  .filter((_: Row) => true)
+  .count()
+  }
+
+  benchmark.addCase("Wide column without encoding", numIters) { _ =>
+spark.read
+  .schema(wideSchema)
+  .json(wideColumnPath)
+  .filter((_: Row) => true)
+  .count()
+  }
+
+  benchmark.addCase("Wide column with UTF-8", numIters) { _ =>
+spark.read
+  .option("encoding", "UTF-8")
+  .schema(wideSchema)
+  .json(wideColumnPath)
+  .filter((_: Row) => true)
+  .count()
+  }
+
+  benchmark.run()
+}
+  }
+
   override def runBenchmarkSuite(mainArgs: Array[String]): Unit = {
+val numIters = 2
--- End diff --

Thank you for updating, @MaxGekk.
Is there a reason to decrease this value from 3 to 2 in this PR?
If it is only to reduce the run time, let's keep the original value. 
This benchmark is not executed frequently.


---




[GitHub] spark issue #22893: [SPARK-25868][MLlib] One part of Spark MLlib Kmean Logic...

2018-11-02 Thread KyleLi1985
Github user KyleLi1985 commented on the issue:

https://github.com/apache/spark/pull/22893
  
> So the pull request right now doesn't reflect what you tested, but you 
tested the version pasted above. You're saying that the optimization just never 
helps the dense-dense case, and sqdist is faster than a dot product. This 
doesn't make sense mathematically as it should be more math, but stranger 
things have happened.
> 
> Still, I don't follow your test code here. You parallelize one vector, 
map it, collect it: why use Spark? and it's the same vector over and over, and 
it's not a big vector. Your sparse vectors aren't very sparse.
> 
> How about more representative input -- larger vectors (100s of elements, 
probably), more sparse sparse vectors, and a large set of different inputs. I 
also don't see where the precision bound is changed here?
> 
> This may be a good change but I'm just not yet convinced by the test 
methodology, and the result still doesn't make much intuitive sense.

1) Why use Spark? No special reason; it is simply the tool I normally use.

2) About the vectors: I ran a test with more representative input, and the 
results are shown below.

3) About the precision: it is a trick. You can steer the calculation into 
whichever branch you want by changing the precision manually. As I said in my 
last comment, taking LOGIC2 as an example, you can manually change precision 
to -1 in (precisionbound1 < precision) and to 1 in 
(precisionbound2 > precision), so that the calculation goes into the LOGIC2 
branch. It is similar to a code-coverage exercise. In any case, the goal is to 
show that, for the same calculation logic, performance does not change before 
and after the enhancement for the sparse-sparse and sparse-dense cases.
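
For readers following the branch discussion in point 3, here is a simplified sketch of the idea on plain arrays; it is not the actual MLlib code. It uses the identity ‖a − b‖² = ‖a‖² + ‖b‖² − 2 a·b as a fast path and falls back to the direct sum of squared differences when an error bound is not below the requested precision. The names `FastSqDistSketch` and `fastSquaredDistance`, the epsilon constant, and the exact form of the bound are assumptions for illustration.

```scala
object FastSqDistSketch {
  // Assumed machine epsilon for Double (about 2.2e-16).
  val Epsilon: Double = math.ulp(1.0)

  def dot(a: Array[Double], b: Array[Double]): Double =
    a.indices.map(i => a(i) * b(i)).sum

  // Direct computation: sum of squared element-wise differences.
  def sqdist(a: Array[Double], b: Array[Double]): Double =
    a.indices.map { i => val d = a(i) - b(i); d * d }.sum

  def fastSquaredDistance(
      a: Array[Double],
      b: Array[Double],
      precision: Double = 1e-6): Double = {
    require(a.length == b.length, "vectors must have the same length")
    val norm1 = math.sqrt(dot(a, a))
    val norm2 = math.sqrt(dot(b, b))
    val sumSquaredNorm = norm1 * norm1 + norm2 * norm2
    val normDiff = norm1 - norm2
    // Assumed error bound for the norm-based formula.
    val bound = 2.0 * Epsilon * sumSquaredNorm / (normDiff * normDiff + Epsilon)
    if (bound < precision) {
      // Fast path: reuse the norms and compute a single dot product.
      math.max(sumSquaredNorm - 2.0 * dot(a, b), 0.0)
    } else {
      // Fallback: compute the distance directly.
      sqdist(a, b)
    }
  }
}
```

Passing a negative `precision` makes the first comparison false and forces the direct branch, which mirrors the trick described above for steering the calculation into a chosen branch.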

Here is my test file:

[SparkMLlibTest.txt](https://github.com/apache/spark/files/2544667/SparkMLlibTest.txt)

Here is my test data. I used the dataset at

http://archive.ics.uci.edu/ml/datasets/Condition+monitoring+of+hydraulic+systems
and extracted the files PS1, PS2, PS3, PS4, PS5, and PS6 to form the test data.

Total instances: 13230
Attributes per line: 6000

**Time cost for the sparse-sparse case (milliseconds)**
Before enhancement: 7670, 7704, 7652
After enhancement: 7634, 7729, 7645



---




[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...

2018-11-02 Thread jiangxb1987
Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/22912
  
Thanks, merging to master!


---




[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22930
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22930
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98415/
Test PASSed.


---




[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22930
  
**[Test build #98415 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98415/testReport)**
 for PR 22930 at commit 
[`eca075a`](https://github.com/apache/spark/commit/eca075a83bfba189d93e04376577cbddaeb2d897).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22930
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4732/
Test PASSed.


---




[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22930
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22930
  
**[Test build #98419 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98419/testReport)**
 for PR 22930 at commit 
[`a799e3f`](https://github.com/apache/spark/commit/a799e3f0f39459b2eb14978fa622af4a3e0b3294).


---




[GitHub] spark issue #22921: [SPARK-25908][CORE][SQL] Remove old deprecated items in ...

2018-11-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22921
  
Looks okay to me too, but I'd also leave this open for a few more days.


---




[GitHub] spark pull request #22908: [MINOR][SQL] Replace all TreeNode's node name in ...

2018-11-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22908#discussion_r230544923
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
 ---
@@ -56,7 +56,7 @@ case class DataSourceV2Relation(
 
   override def pushedFilters: Seq[Expression] = Seq.empty
 
-  override def simpleString: String = "RelationV2 " + metadataString
+  override def simpleString: String = s"$nodeName " + metadataString
--- End diff --

I'd follow this comment, actually.


---




[GitHub] spark pull request #22913: [SPARK-25902][SQL] Add support for dates with mil...

2018-11-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22913#discussion_r230544838
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowUtils.scala 
---
@@ -71,6 +71,7 @@ object ArrowUtils {
 case d: ArrowType.Decimal => DecimalType(d.getPrecision, d.getScale)
 case date: ArrowType.Date if date.getUnit == DateUnit.DAY => DateType
 case ts: ArrowType.Timestamp if ts.getUnit == TimeUnit.MICROSECOND => 
TimestampType
+case date: ArrowType.Date if date.getUnit == DateUnit.MILLISECOND => 
TimestampType
--- End diff --

Wait, is it correct to map it to `TimestampType`? It looks like this is why 
`Date` with `MILLISECOND` was not added.
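
To make the concern concrete, the sketch below contrasts the two target representations using plain `Long` arithmetic, with no Arrow or Spark classes. It assumes that an Arrow `Date` with `MILLISECOND` unit holds milliseconds since the Unix epoch, that Spark's `DateType` holds days since the epoch, and that `TimestampType` holds microseconds since the epoch; the object and method names are hypothetical.

```scala
object DateMillisSketch {
  val MillisPerDay: Long = 24L * 60 * 60 * 1000

  // Interpreting the value as a date: whole days since 1970-01-01.
  // floorDiv rounds toward negative infinity, so pre-epoch values stay correct.
  def millisToDays(millis: Long): Int =
    Math.floorDiv(millis, MillisPerDay).toInt

  // Interpreting the value as a timestamp: microseconds since the epoch.
  // multiplyExact throws on overflow instead of wrapping silently.
  def millisToMicros(millis: Long): Long =
    Math.multiplyExact(millis, 1000L)
}
```

The same input value produces results with different units and different meanings under the two interpretations, so the choice of Spark type is a semantic decision, not just a matter of widening the pattern match.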


---




[GitHub] spark issue #22913: [SPARK-25902][SQL] Add support for dates with millisecon...

2018-11-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22913
  
Can we add a test in `ArrowUtilsSuite.scala`?


---




[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

2018-11-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22626#discussion_r230544775
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/csv-functions.sql ---
@@ -15,3 +15,10 @@ CREATE TEMPORARY VIEW csvTable(csvField, a) AS SELECT * 
FROM VALUES ('1,abc', 'a
 SELECT schema_of_csv(csvField) FROM csvTable;
 -- Clean up
 DROP VIEW IF EXISTS csvTable;
+-- to_csv
+select to_csv(named_struct('a', 1, 'b', 2));
+select to_csv(named_struct('time', to_timestamp('2015-08-26', 
'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+-- Check if errors handled
+select to_csv(named_struct('a', 1, 'b', 2), named_struct('mode', 
'PERMISSIVE'));
+select to_csv(named_struct('a', 1, 'b', 2), map('mode', 1));
--- End diff --

This one too, since the exception comes from `convertToMapData`. We only 
need one test: either this one or the one right above. One of them can be removed.


---




[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

2018-11-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22626#discussion_r230544760
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/csv-functions.sql ---
@@ -15,3 +15,10 @@ CREATE TEMPORARY VIEW csvTable(csvField, a) AS SELECT * 
FROM VALUES ('1,abc', 'a
 SELECT schema_of_csv(csvField) FROM csvTable;
 -- Clean up
 DROP VIEW IF EXISTS csvTable;
+-- to_csv
+select to_csv(named_struct('a', 1, 'b', 2));
+select to_csv(named_struct('time', to_timestamp('2015-08-26', 
'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+-- Check if errors handled
+select to_csv(named_struct('a', 1, 'b', 2), named_struct('mode', 
'PERMISSIVE'));
+select to_csv(named_struct('a', 1, 'b', 2), map('mode', 1));
+select to_csv();
--- End diff --

I think we don't have to test this since it's not specific to this 
expression.


---




[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

2018-11-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22626#discussion_r230544717
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala
 ---
@@ -174,3 +176,66 @@ case class SchemaOfCsv(
 
   override def prettyName: String = "schema_of_csv"
 }
+
+/**
+ * Converts a [[StructType]] to a CSV output string.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(expr[, options]) - Returns a CSV string with a given 
struct value",
+  examples = """
+Examples:
+  > SELECT _FUNC_(named_struct('a', 1, 'b', 2));
+   1,2
+  > SELECT _FUNC_(named_struct('time', to_timestamp('2015-08-26', 
'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+   "26/08/2015"
+  """,
+  since = "3.0.0")
+// scalastyle:on line.size.limit
+case class StructsToCsv(
+ options: Map[String, String],
+ child: Expression,
+ timeZoneId: Option[String] = None)
+  extends UnaryExpression with TimeZoneAwareExpression with 
CodegenFallback with ExpectsInputTypes {
+  override def nullable: Boolean = true
+
+  def this(options: Map[String, String], child: Expression) = 
this(options, child, None)
+
+  // Used in `FunctionRegistry`
+  def this(child: Expression) = this(Map.empty, child, None)
+
+  def this(child: Expression, options: Expression) =
+this(
+  options = ExprUtils.convertToMapData(options),
+  child = child,
+  timeZoneId = None)
+
+  @transient
+  lazy val writer = new CharArrayWriter()
+
+  @transient
+  lazy val inputSchema: StructType = child.dataType match {
+case st: StructType => st
+case other =>
+  throw new IllegalArgumentException(s"Unsupported input type 
${other.catalogString}")
+  }
+
+  @transient
+  lazy val gen = new UnivocityGenerator(
+inputSchema, writer, new CSVOptions(options, columnPruning = true, 
timeZoneId.get))
+
+  // This converts rows to the CSV output according to the given schema.
+  @transient
+  lazy val converter: Any => UTF8String = {
+(row: Any) => 
UTF8String.fromString(gen.writeToString(row.asInstanceOf[InternalRow]))
+  }
+
+  override def dataType: DataType = StringType
+
+  override def withTimeZone(timeZoneId: String): TimeZoneAwareExpression =
+copy(timeZoneId = Option(timeZoneId))
+
+  override def nullSafeEval(value: Any): Any = converter(value)
+
+  override def inputTypes: Seq[AbstractDataType] = 
TypeCollection(StructType) :: Nil
--- End diff --

I think we can use `StructType :: Nil` here.


---




[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22914
  
**[Test build #98417 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98417/testReport)**
 for PR 22914 at commit 
[`2e39c4a`](https://github.com/apache/spark/commit/2e39c4a2cbf1db82b37795b2b568985fda2ff903).


---




[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

2018-11-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22626#discussion_r230544667
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala
 ---
@@ -174,3 +176,66 @@ case class SchemaOfCsv(
 
   override def prettyName: String = "schema_of_csv"
 }
+
+/**
+ * Converts a [[StructType]] to a CSV output string.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(expr[, options]) - Returns a CSV string with a given 
struct value",
+  examples = """
+Examples:
+  > SELECT _FUNC_(named_struct('a', 1, 'b', 2));
+   1,2
+  > SELECT _FUNC_(named_struct('time', to_timestamp('2015-08-26', 
'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
+   "26/08/2015"
+  """,
+  since = "3.0.0")
+// scalastyle:on line.size.limit
+case class StructsToCsv(
+ options: Map[String, String],
+ child: Expression,
+ timeZoneId: Option[String] = None)
+  extends UnaryExpression with TimeZoneAwareExpression with 
CodegenFallback with ExpectsInputTypes {
+  override def nullable: Boolean = true
+
+  def this(options: Map[String, String], child: Expression) = 
this(options, child, None)
+
+  // Used in `FunctionRegistry`
+  def this(child: Expression) = this(Map.empty, child, None)
+
+  def this(child: Expression, options: Expression) =
+this(
+  options = ExprUtils.convertToMapData(options),
+  child = child,
+  timeZoneId = None)
+
+  @transient
+  lazy val writer = new CharArrayWriter()
+
+  @transient
+  lazy val inputSchema: StructType = child.dataType match {
+case st: StructType => st
+case other =>
+  throw new IllegalArgumentException(s"Unsupported input type 
${other.catalogString}")
+  }
+
+  @transient
+  lazy val gen = new UnivocityGenerator(
+inputSchema, writer, new CSVOptions(options, columnPruning = true, 
timeZoneId.get))
+
+  // This converts rows to the CSV output according to the given schema.
+  @transient
+  lazy val converter: Any => UTF8String = {
+(row: Any) => 
UTF8String.fromString(gen.writeToString(row.asInstanceOf[InternalRow]))
--- End diff --

@MaxGekk, can we read the data from `writer` via `writer.toString` and then 
call `writer.reset()`, as `to_json` does? It looks like the goal is to avoid 
writing the header (which is fine). If we explicitly set `header` to `false` in 
this expression, it looks like we don't need to add `writeToString` to 
`UnivocityGenerator`.
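
The buffer-reuse pattern suggested here can be sketched with `java.io.CharArrayWriter` alone. The `RowWriterSketch` class and its `writeRow` method are hypothetical stand-ins for a generator that appends one CSV line to a shared writer; they are not Spark's `UnivocityGenerator`.

```scala
import java.io.CharArrayWriter

// Hypothetical stand-in for a generator that appends rows to a shared writer.
class RowWriterSketch(writer: CharArrayWriter) {
  def writeRow(fields: Seq[String]): Unit = writer.write(fields.mkString(","))
}

object WriterReuseSketch {
  private val writer = new CharArrayWriter()
  private val gen = new RowWriterSketch(writer)

  // Write one row, take the buffered text, then clear the buffer so the
  // next call starts from an empty writer.
  def convert(fields: Seq[String]): String = {
    gen.writeRow(fields)
    val out = writer.toString
    writer.reset()
    out
  }
}
```

Because the buffer is cleared after every row, the generator itself needs no string-returning method; the caller owns the writer and extracts the text from it.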


---




[GitHub] spark issue #22913: [SPARK-25902][SQL] Add support for dates with millisecon...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22913
  
**[Test build #98418 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98418/testReport)**
 for PR 22913 at commit 
[`f809942`](https://github.com/apache/spark/commit/f809942de6c241cc7b499c19d0250185ebe26122).


---




[GitHub] spark issue #22913: [SPARK-25902][SQL] Add support for dates with millisecon...

2018-11-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22913
  
ok to test


---




[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...

2018-11-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22914
  
retest this please


---




[GitHub] spark issue #22913: [SPARK-25902][SQL] Add support for dates with millisecon...

2018-11-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22913
  
cc @BryanCutler 


---




[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

2018-11-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22626#discussion_r230544556
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/CsvFunctionsSuite.scala ---
@@ -45,7 +45,6 @@ class CsvFunctionsSuite extends QueryTest with 
SharedSQLContext {
   Row(Row(java.sql.Timestamp.valueOf("2015-08-26 18:00:00.0"
   }
 
-
--- End diff --

Ah, I prefer not to include unrelated changes, but it's okay.


---




[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()

2018-11-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22626#discussion_r230544492
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityGenerator.scala
 ---
@@ -15,18 +15,17 @@
  * limitations under the License.
  */
 
-package org.apache.spark.sql.execution.datasources.csv
+package org.apache.spark.sql.catalyst.csv
 
 import java.io.Writer
 
 import com.univocity.parsers.csv.CsvWriter
 
 import org.apache.spark.sql.catalyst.InternalRow
-import org.apache.spark.sql.catalyst.csv.CSVOptions
 import org.apache.spark.sql.catalyst.util.DateTimeUtils
 import org.apache.spark.sql.types._
 
-private[csv] class UnivocityGenerator(
+private[sql] class UnivocityGenerator(
--- End diff --

Let's remove `private[sql]`. We are already in the internal package 
`catalyst`.


---




[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...

2018-11-02 Thread shahidki31
Github user shahidki31 commented on the issue:

https://github.com/apache/spark/pull/22914
  
Jenkins, retest this please


---




[GitHub] spark issue #22864: [SPARK-25861][Minor][WEBUI] Remove unused refreshInterva...

2018-11-02 Thread shahidki31
Github user shahidki31 commented on the issue:

https://github.com/apache/spark/pull/22864
  
Thank you @srowen 


---




[GitHub] spark issue #22923: [SPARK-25910][CORE] accumulator updates from previous st...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22923
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22923: [SPARK-25910][CORE] accumulator updates from previous st...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22923
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4731/
Test PASSed.


---




[GitHub] spark issue #22923: [SPARK-25910][CORE] accumulator updates from previous st...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22923
  
**[Test build #98416 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98416/testReport)**
 for PR 22923 at commit 
[`4d9cbe0`](https://github.com/apache/spark/commit/4d9cbe043604e76b6367e4ecb42d0d36437d1792).


---




[GitHub] spark issue #22923: [SPARK-25910][CORE] accumulator updates from previous st...

2018-11-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22923
  
retest this please


---




[GitHub] spark issue #22921: [SPARK-25908][CORE][SQL] Remove old deprecated items in ...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22921
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98414/
Test PASSed.


---




[GitHub] spark issue #22921: [SPARK-25908][CORE][SQL] Remove old deprecated items in ...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22921
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22921: [SPARK-25908][CORE][SQL] Remove old deprecated items in ...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22921
  
**[Test build #98414 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98414/testReport)**
 for PR 22921 at commit 
[`57ef4e8`](https://github.com/apache/spark/commit/57ef4e81d0bad3a2631088488cc41ce230457406).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22930
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4730/
Test PASSed.


---




[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22930
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22504
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98413/
Test PASSed.


---




[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22504
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22930
  
**[Test build #98415 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98415/testReport)**
 for PR 22930 at commit 
[`eca075a`](https://github.com/apache/spark/commit/eca075a83bfba189d93e04376577cbddaeb2d897).


---




[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22504
  
**[Test build #98413 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98413/testReport)**
 for PR 22504 at commit 
[`70a227f`](https://github.com/apache/spark/commit/70a227fd83524f35264fac9717d07024f440d179).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22920: [SPARK-24959][SQL][FOLLOWUP] Creating Jackson parser in ...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22920
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98411/
Test PASSed.


---




[GitHub] spark issue #22920: [SPARK-24959][SQL][FOLLOWUP] Creating Jackson parser in ...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22920
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22920: [SPARK-24959][SQL][FOLLOWUP] Creating Jackson parser in ...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22920
  
**[Test build #98411 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98411/testReport)**
 for PR 22920 at commit 
[`9e32447`](https://github.com/apache/spark/commit/9e3244755e2a3a647ed84cd1d56fbe0a4033caaa).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #22547: [SPARK-25528][SQL] data source V2 read side API r...

2018-11-02 Thread rdblue
Github user rdblue commented on a diff in the pull request:

https://github.com/apache/spark/pull/22547#discussion_r230528510
  
--- Diff: 
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaContinuousInputStream.scala
 ---
@@ -46,17 +45,22 @@ import org.apache.spark.sql.types.StructType
  *   scenarios, where some offsets after the specified initial ones can't be
  *   properly read.
  */
-class KafkaContinuousReadSupport(
+class KafkaContinuousInputStream(
--- End diff --

I'd prefer that the commits themselves compile, but since this is 
separating the modes I think it could be done incrementally.


---




[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22504
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98407/
Test PASSed.


---




[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22504
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22504
  
**[Test build #98407 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98407/testReport)**
 for PR 22504 at commit 
[`4df08bd`](https://github.com/apache/spark/commit/4df08bd56b4cd51c4072aa026bf7f46bc574421d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #15496: [SPARK-17950] [Python] Match SparseVector behavior with ...

2018-11-02 Thread itg-abby
Github user itg-abby commented on the issue:

https://github.com/apache/spark/pull/15496
  
Hi, can you give a ref to the new style guides? I'm not sure if anything 
major needs changing.

In the meantime I resolved the one conflict at the bottom of ml/tests.py 
and extended the docstring to include your comment.


---




[GitHub] spark issue #19573: [SPARK-22350][SQL] select grouping__id from subquery

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19573
  
Build finished. Test PASSed.


---




[GitHub] spark issue #19573: [SPARK-22350][SQL] select grouping__id from subquery

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19573
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4729/
Test PASSed.


---




[GitHub] spark pull request #22864: [SPARK-25861][Minor][WEBUI] Remove unused refresh...

2018-11-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22864


---




[GitHub] spark issue #22921: [SPARK-25908][CORE][SQL] Remove old deprecated items in ...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22921
  
**[Test build #98414 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98414/testReport)**
 for PR 22921 at commit 
[`57ef4e8`](https://github.com/apache/spark/commit/57ef4e81d0bad3a2631088488cc41ce230457406).


---




[GitHub] spark issue #22864: [SPARK-25861][Minor][WEBUI] Remove unused refreshInterva...

2018-11-02 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/22864
  
Merged to master


---




[GitHub] spark issue #22921: [SPARK-25908][CORE][SQL] Remove old deprecated items in ...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22921
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4728/
Test FAILed.


---




[GitHub] spark issue #22921: [SPARK-25908][CORE][SQL] Remove old deprecated items in ...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22921
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22504
  
**[Test build #98413 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98413/testReport)**
 for PR 22504 at commit 
[`70a227f`](https://github.com/apache/spark/commit/70a227fd83524f35264fac9717d07024f440d179).


---




[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22626
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98405/
Test PASSed.


---




[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22626
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22626
  
**[Test build #98405 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98405/testReport)**
 for PR 22626 at commit 
[`230f789`](https://github.com/apache/spark/commit/230f7890d75ab9ed6041eb7d9be1aa01c9f82968).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-02 Thread skonto
Github user skonto commented on the issue:

https://github.com/apache/spark/pull/22931
  
@srowen I guess yes.


---




[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-02 Thread shaneknapp
Github user shaneknapp commented on the issue:

https://github.com/apache/spark/pull/22931
  
sorry, ignore amplab's report.  the build passed, but my hacking on the 
integration test reports was what caused the failure.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22931
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4727/
Test FAILed.


---




[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22931
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22931
  
Kubernetes integration test status success
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/4727/



---




[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-02 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/22931
  
OK. Does this need to go in branch 2.4 too?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22928: [SPARK-25926][CORE] Move config entries in core module t...

2018-11-02 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/22928
  
I don't like stashing everything in `package.scala`; I'm ok-ish with moving 
them under the `internal.config` package, but it would be better to keep them 
in separate source files.


---




[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22931
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22931
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98412/
Test PASSed.


---




[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22931
  
**[Test build #98412 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98412/testReport)**
 for PR 22931 at commit 
[`bf85974`](https://github.com/apache/spark/commit/bf85974e769b86056a83be6f051cb15ff3279022).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22931
  
Kubernetes integration test starting
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/4727/



---




[GitHub] spark issue #22429: [SPARK-25440][SQL] Dumping query execution info to a fil...

2018-11-02 Thread boy-uber
Github user boy-uber commented on the issue:

https://github.com/apache/spark/pull/22429
  
> > @MaxGekk I sent email to spark dev list about structured plan logging, but did not get any response.
> 
> @boy-uber I guess it is better to speak about the feature to @bogdanrdc @hvanhovell @larturus

Thanks @MaxGekk for the contact list! I will ping them to gather more 
thoughts.



---




[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22931
  
**[Test build #98412 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98412/testReport)**
 for PR 22931 at commit 
[`bf85974`](https://github.com/apache/spark/commit/bf85974e769b86056a83be6f051cb15ff3279022).


---




[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22930
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22930
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98408/
Test FAILed.


---




[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22930
  
**[Test build #98408 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98408/testReport)**
 for PR 22930 at commit 
[`9125b31`](https://github.com/apache/spark/commit/9125b31fd2b1b6df0a5aaeba743f9d287cf2e897).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-02 Thread skonto
Github user skonto commented on the issue:

https://github.com/apache/spark/pull/22931
  
@vanzin modified it. Much better.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22925: [SPARK-25913][SQL] Extend UnaryExecNode by unary SparkPl...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22925
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22504
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22925: [SPARK-25913][SQL] Extend UnaryExecNode by unary SparkPl...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22925
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98402/
Test PASSed.


---




[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22504
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98401/
Test FAILed.


---




[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22504
  
**[Test build #98401 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98401/testReport)**
 for PR 22504 at commit 
[`4df08bd`](https://github.com/apache/spark/commit/4df08bd56b4cd51c4072aa026bf7f46bc574421d).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22925: [SPARK-25913][SQL] Extend UnaryExecNode by unary SparkPl...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22925
  
**[Test build #98402 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98402/testReport)**
 for PR 22925 at commit 
[`0172030`](https://github.com/apache/spark/commit/01720302645d13e6e94d66b12b83568aff321a91).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #22897: [SPARK-25875][k8s] Merge code to set up driver co...

2018-11-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22897


---




[GitHub] spark issue #22920: [SPARK-24959][SQL][FOLLOWUP] Creating Jackson parser in ...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22920
  
**[Test build #98411 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98411/testReport)**
 for PR 22920 at commit 
[`9e32447`](https://github.com/apache/spark/commit/9e3244755e2a3a647ed84cd1d56fbe0a4033caaa).


---




[GitHub] spark issue #22530: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22530
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98406/
Test FAILed.


---




[GitHub] spark issue #22530: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22530
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22530: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22530
  
**[Test build #98406 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98406/testReport)**
 for PR 22530 at commit 
[`9f90fa0`](https://github.com/apache/spark/commit/9f90fa0bfd7337375c97660b475b65f4ce25c160).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22897: [SPARK-25875][k8s] Merge code to set up driver command i...

2018-11-02 Thread mccheah
Github user mccheah commented on the issue:

https://github.com/apache/spark/pull/22897
  
Ok, I am merging into master. Thanks!


---




[GitHub] spark pull request #22547: [SPARK-25528][SQL] data source V2 read side API r...

2018-11-02 Thread mccheah
Github user mccheah commented on a diff in the pull request:

https://github.com/apache/spark/pull/22547#discussion_r230505785
  
--- Diff: 
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaContinuousInputStream.scala
 ---
@@ -46,17 +45,22 @@ import org.apache.spark.sql.types.StructType
  *   scenarios, where some offsets after the specified initial ones can't be
  *   properly read.
  */
-class KafkaContinuousReadSupport(
+class KafkaContinuousInputStream(
--- End diff --

+1 for this. A lot of the changes right now are for moving code around, the 
streaming code especially, which makes it harder to isolate just the proposed 
API for review.

An alternative is to split this PR into separate commits. The individual 
commits may not compile on their own because of mismatching signatures, but 
all the commits taken together would compile, and each commit can be reviewed 
individually to assess the API and then the implementation.

For example, I'd propose 3 PRs:

* Batch reading, with a commit for the interface changes and a separate 
commit for the implementation changes
* Micro Batch Streaming read, with a commit for the interface changes and a 
separate commit for the implementation changes
* Continuous streaming read, similar to above

Thoughts?


---




[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-02 Thread shaneknapp
Github user shaneknapp commented on the issue:

https://github.com/apache/spark/pull/22931
  
btw next k8s test that runs will actually have logs!

here's the integration test log from this run, which wasn't archived...

[integration-tests.log](https://github.com/apache/spark/files/2544075/integration-tests.log)



---




[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-02 Thread skonto
Github user skonto commented on the issue:

https://github.com/apache/spark/pull/22931
  
@vanzin ok let me try that. 



---




[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22931
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22931
  
Kubernetes integration test status success
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/4726/



---




[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...

2018-11-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22931
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4726/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org


