[GitHub] spark pull request #22932: [SPARK-25102][SQL] Write Spark version to ORC/Par...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22932#discussion_r230547020 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala --- @@ -314,6 +316,21 @@ abstract class OrcSuite extends OrcTest with BeforeAndAfterAll { checkAnswer(spark.read.orc(path.getCanonicalPath), Row(ts)) } } + --- End diff -- Please note that the following test case is executed twice; `OrcSourceSuite` and `HiveOrcSourceSuite`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22932: [SPARK-25102][SQL] Write Spark version to ORC/Parquet fi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22932 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4733/ Test PASSed.
[GitHub] spark issue #22932: [SPARK-25102][SQL] Write Spark version to ORC/Parquet fi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22932 **[Test build #98420 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98420/testReport)** for PR 22932 at commit [`601ccbb`](https://github.com/apache/spark/commit/601ccbb4e20a068469839bc71870230cfb6fd7a1).
[GitHub] spark issue #22932: [SPARK-25102][SQL] Write Spark version to ORC/Parquet fi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22932 Merged build finished. Test PASSed.
[GitHub] spark pull request #22932: [SPARK-25102][SQL] Write Spark version to ORC/Par...
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/22932 [SPARK-25102][SQL] Write Spark version to ORC/Parquet file metadata ## What changes were proposed in this pull request? Currently, Spark writes the Spark version number into Hive Table properties with `spark.sql.create.version`. ``` parameters:{ spark.sql.sources.schema.part.0={ "type":"struct", "fields":[{"name":"a","type":"integer","nullable":true,"metadata":{}}] }, transient_lastDdlTime=1541142761, spark.sql.sources.schema.numParts=1, spark.sql.create.version=2.4.0 } ``` This PR aims to write the Spark version to ORC/Parquet file metadata with `org.apache.spark.sql.create.version`. This differs from the Hive Table property key `spark.sql.create.version`, but it seems that we cannot change that key for backward compatibility. **ORC (`native` and `hive` implementation)** ``` File Version: 0.12 with ORC_135 ... User Metadata: org.apache.spark.sql.create.version=3.0.0-SNAPSHOT ``` **PARQUET** ``` creator: parquet-mr version 1.10.0 (build 031a6654009e3b82020012a18434c582bd74c73a) extra: org.apache.spark.sql.create.version = 3.0.0-SNAPSHOT extra: org.apache.spark.sql.parquet.row.metadata = {"type":"struct","fields":[{"name":"id","type":"long","nullable":false,"metadata":{}}]} ``` ## How was this patch tested? Pass the Jenkins with newly added test cases. 
You can merge this pull request into a Git repository by running: $ git pull https://github.com/dongjoon-hyun/spark SPARK-25102 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22932.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22932 commit 601ccbb4e20a068469839bc71870230cfb6fd7a1 Author: Dongjoon Hyun Date: 2018-11-03T06:43:48Z [SPARK-25102][SQL] Write Spark version to ORC/Parquet file metadata
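A reader-side sketch of how the key from the PR description could be consumed. It assumes the file's user metadata has already been loaded into a plain `Map[String, String]` (how that is done depends on the ORC or Parquet reader API and is not shown here). `WriterVersion` and `sparkVersionOf` are hypothetical names for illustration and are not part of the PR.

```scala
// Hypothetical helper, not part of the PR: looks up the Spark version
// recorded in a file's key/value metadata.
object WriterVersion {
  // Key named in the PR description for ORC/Parquet file metadata.
  val FileKey: String = "org.apache.spark.sql.create.version"

  // Returns the recorded version, or None when the key is absent or blank
  // (for example, files written before this change or by other tools).
  def sparkVersionOf(userMetadata: Map[String, String]): Option[String] =
    userMetadata.get(FileKey).map(_.trim).filter(_.nonEmpty)
}
```

Note that a map holding only the Hive table property key `spark.sql.create.version` yields `None` here, which mirrors the key difference the description points out.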
[GitHub] spark issue #22255: [SPARK-25102][Spark Core] Write Spark version informatio...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22255 Either choice of key, `org.apache.spark.sql.create.version` or `spark.sql.create.version`, seems to cause some inconsistency. 1) If we choose `spark.sql.create.version` as the key, the Parquet metadata will look like the following, where the new key does not share the prefix of the existing `org.apache.spark.sql.parquet.row.metadata` key. ``` extra: spark.sql.create.version = 3.0.0-SNAPSHOT extra: org.apache.spark.sql.parquet.row.metadata = {"type":"struct","fields":[{"name":"id","type":"long","nullable":false,"metadata":{}}]} ``` 2) If we choose `org.apache.spark.sql.create.version`, it differs from the Hive table property key. I'll accept the inconsistency in (2) for backward compatibility.
[GitHub] spark pull request #22912: [SPARK-25901][CORE] Use only one thread in Barrie...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22912
[GitHub] spark pull request #22920: [SPARK-24959][SQL][FOLLOWUP] Creating Jackson par...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/22920#discussion_r230546330 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmark.scala --- @@ -158,26 +166,78 @@ object JSONBenchmark extends SqlBasedBenchmark { val ds = spark.read.schema(schema).json(path.getAbsolutePath) - benchmark.addCase(s"Select $colsNum columns + count()", 3) { _ => + benchmark.addCase(s"Select $colsNum columns + count()", numIters) { _ => ds.select("*").filter((_: Row) => true).count() } - benchmark.addCase(s"Select 1 column + count()", 3) { _ => + benchmark.addCase(s"Select 1 column + count()", numIters) { _ => ds.select($"col1").filter((_: Row) => true).count() } - benchmark.addCase(s"count()", 3) { _ => + benchmark.addCase(s"count()", numIters) { _ => ds.count() } benchmark.run() } } + def jsonParserCreation(rowsNum: Int, numIters: Int): Unit = { +val benchmark = new Benchmark("creation of JSON parser per line", rowsNum, output = output) + +withTempPath { path => + prepareDataInfo(benchmark) + + val shortColumnPath = path.getAbsolutePath + "/short" + val shortSchema = writeShortColumn(shortColumnPath, rowsNum) + + val wideColumnPath = path.getAbsolutePath + "/wide" + val wideSchema = writeWideColumn(wideColumnPath, rowsNum) + + benchmark.addCase("Short column without encoding", numIters) { _ => +spark.read + .schema(shortSchema) + .json(shortColumnPath) + .filter((_: Row) => true) + .count() + } + + benchmark.addCase("Short column with UTF-8", numIters) { _ => +spark.read + .option("encoding", "UTF-8") + .schema(shortSchema) + .json(shortColumnPath) + .filter((_: Row) => true) + .count() + } + + benchmark.addCase("Wide column without encoding", numIters) { _ => +spark.read + .schema(wideSchema) + .json(wideColumnPath) + .filter((_: Row) => true) + .count() + } + + benchmark.addCase("Wide column with UTF-8", numIters) { _ => +spark.read + .option("encoding", "UTF-8") + .schema(wideSchema) + 
.json(wideColumnPath) + .filter((_: Row) => true) + .count() + } + + benchmark.run() +} + } + override def runBenchmarkSuite(mainArgs: Array[String]): Unit = { +val numIters = 2 --- End diff -- Thank you for updating, @MaxGekk. Do we have a reason to decrease this value from 3 to 2 in this PR? If this is to reduce the time, let's keep the original value. This benchmark is not executed frequently.
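For context on why the iteration count matters: a benchmark case is run `numIters` times and the timings are summarized over those runs, so fewer iterations give noisier numbers. The following is a self-contained sketch of that idea only. `MiniBenchmark`, `Result`, and `measure` are hypothetical names and do not reproduce Spark's `Benchmark` class.

```scala
// Hypothetical stand-in for a benchmark harness; not Spark's Benchmark class.
object MiniBenchmark {
  final case class Result(bestNanos: Long, avgNanos: Double, runs: Int)

  // Runs `body` numIters times and summarizes the wall-clock timings.
  def measure(numIters: Int)(body: => Unit): Result = {
    require(numIters > 0, "numIters must be positive")
    val timings = (1 to numIters).map { _ =>
      val start = System.nanoTime()
      body
      System.nanoTime() - start
    }
    Result(timings.min, timings.sum.toDouble / numIters, numIters)
  }
}
```

With `numIters = 2` a single slow run is half of the average; with 3 it is a third, which is one reason to keep the larger value when run time is not a concern.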
[GitHub] spark issue #22893: [SPARK-25868][MLlib] One part of Spark MLlib Kmean Logic...
Github user KyleLi1985 commented on the issue: https://github.com/apache/spark/pull/22893 > So the pull request right now doesn't reflect what you tested, but you tested the version pasted above. You're saying that the optimization just never helps the dense-dense case, and sqdist is faster than a dot product. This doesn't make sense mathematically as it should be more math, but stranger things have happened. > > Still, I don't follow your test code here. You parallelize one vector, map it, collect it: why use Spark? and it's the same vector over and over, and it's not a big vector. Your sparse vectors aren't very sparse. > > How about more representative input -- larger vectors (100s of elements, probably), more sparse sparse vectors, and a large set of different inputs. I also don't see where the precision bound is changed here? > > This may be a good change but I'm just not yet convinced by the test methodology, and the result still doesn't make much intuitive sense. 1) Why use Spark? No special reason; it is simply the tool I normally use. 2) About the vectors: I ran a test with more representative input and show the result below. 3) About the precision: it is a trick. You can force the calculation into a given branch by changing the value manually. As I said in my last comment, taking LOGIC2 as an example, you can manually change precision to -1 in (precisionbound1 < precision) and to 1 in (precisionbound2 > precision), so that the calculation goes into the LOGIC2 branch. It is similar to exercising code coverage. In any case, our goal is to show that, for the same calculation logic, performance does not change before and after the enhancement was added for the sparse-sparse and sparse-dense cases. 
Here is my test file: [SparkMLlibTest.txt](https://github.com/apache/spark/files/2544667/SparkMLlibTest.txt) Here is my test data setup: I used the data from http://archive.ics.uci.edu/ml/datasets/Condition+monitoring+of+hydraulic+systems and extracted the files (PS1, PS2, PS3, PS4, PS5, PS6) to form the test data. Total instances: 13230. Attributes per line: 6000. **Time cost for the sparse-sparse case (milliseconds)** Before enhancement: 7670, 7704, 7652 After enhancement: 7634, 7729, 7645
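The two code paths under discussion can be sketched as follows: a direct squared-distance loop, and the norm-based identity `|a - b|^2 = |a|^2 + |b|^2 - 2 (a . b)`, which is used only when an estimated rounding error stays under a precision bound. This is a dense-array illustration written for this thread. `SqDist`, `direct`, and `fast` are hypothetical names, and the bound formula is an assumption modeled on the precision check described above, not a copy of the MLlib code.

```scala
// Hypothetical dense-only sketch; not the MLlib implementation.
object SqDist {
  // Machine epsilon for Double; used only to estimate rounding error.
  private val Epsilon: Double = math.ulp(1.0)

  def dot(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same length")
    var s = 0.0
    var i = 0
    while (i < a.length) { s += a(i) * b(i); i += 1 }
    s
  }

  // Direct path: sum of squared component differences.
  def direct(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same length")
    var s = 0.0
    var i = 0
    while (i < a.length) { val d = a(i) - b(i); s += d * d; i += 1 }
    s
  }

  // Norm-based path when the estimated error is below `precision`;
  // otherwise fall back to the direct loop.
  def fast(a: Array[Double], b: Array[Double], precision: Double = 1e-6): Double = {
    val na2 = dot(a, a)
    val nb2 = dot(b, b)
    val sumSquaredNorm = na2 + nb2
    val normDiff = math.sqrt(na2) - math.sqrt(nb2)
    val bound = 2.0 * Epsilon * sumSquaredNorm / (normDiff * normDiff + Epsilon)
    if (bound < precision) math.max(sumSquaredNorm - 2.0 * dot(a, b), 0.0)
    else direct(a, b)
  }
}
```

Forcing `precision` to a negative or very large value, as described in the comment, is a way to push every call down one branch so that each path can be timed in isolation.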
[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22912 Thanks, merging to master!
[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22930 Merged build finished. Test PASSed.
[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22930 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98415/ Test PASSed.
[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22930 **[Test build #98415 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98415/testReport)** for PR 22930 at commit [`eca075a`](https://github.com/apache/spark/commit/eca075a83bfba189d93e04376577cbddaeb2d897). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22930 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4732/ Test PASSed.
[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22930 Merged build finished. Test PASSed.
[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22930 **[Test build #98419 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98419/testReport)** for PR 22930 at commit [`a799e3f`](https://github.com/apache/spark/commit/a799e3f0f39459b2eb14978fa622af4a3e0b3294).
[GitHub] spark issue #22921: [SPARK-25908][CORE][SQL] Remove old deprecated items in ...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22921 Looks okay to me too, but I'd also leave this open for a few more days.
[GitHub] spark pull request #22908: [MINOR][SQL] Replace all TreeNode's node name in ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22908#discussion_r230544923 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala --- @@ -56,7 +56,7 @@ case class DataSourceV2Relation( override def pushedFilters: Seq[Expression] = Seq.empty - override def simpleString: String = "RelationV2 " + metadataString + override def simpleString: String = s"$nodeName " + metadataString --- End diff -- I'd follow this comment, actually.
[GitHub] spark pull request #22913: [SPARK-25902][SQL] Add support for dates with mil...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22913#discussion_r230544838 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowUtils.scala --- @@ -71,6 +71,7 @@ object ArrowUtils { case d: ArrowType.Decimal => DecimalType(d.getPrecision, d.getScale) case date: ArrowType.Date if date.getUnit == DateUnit.DAY => DateType case ts: ArrowType.Timestamp if ts.getUnit == TimeUnit.MICROSECOND => TimestampType +case date: ArrowType.Date if date.getUnit == DateUnit.MILLISECOND => TimestampType --- End diff -- Wait, is it correct to map it to `TimestampType`? It looks like this is why `Date` with `MILLISECOND` was not added.
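To make the concern concrete: an Arrow `Date` with `MILLISECOND` unit stores milliseconds since the Unix epoch, while Spark's `DateType` is day-based, so treating the value as a date would need a millis-to-days conversion rather than a reinterpretation as a timestamp. A minimal sketch of that conversion, assuming UTC and using a hypothetical helper (`ArrowDateSketch.millisToDays`) that is not in the PR:

```scala
// Hypothetical helper for illustration; not part of the PR or of ArrowUtils.
object ArrowDateSketch {
  private val MillisPerDay: Long = 24L * 60 * 60 * 1000

  // floorDiv keeps pre-1970 values on the correct calendar day
  // (plain `/` would round toward zero instead).
  def millisToDays(epochMillis: Long): Int =
    Math.toIntExact(Math.floorDiv(epochMillis, MillisPerDay))
}
```

Whether the PR should convert like this or map to a different Spark type is exactly the open question in the review comment; the sketch only shows what the day-based reading would involve.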
[GitHub] spark issue #22913: [SPARK-25902][SQL] Add support for dates with millisecon...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22913 Can we add a test in `ArrowUtilsSuite.scala`?
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22626#discussion_r230544775 --- Diff: sql/core/src/test/resources/sql-tests/inputs/csv-functions.sql --- @@ -15,3 +15,10 @@ CREATE TEMPORARY VIEW csvTable(csvField, a) AS SELECT * FROM VALUES ('1,abc', 'a SELECT schema_of_csv(csvField) FROM csvTable; -- Clean up DROP VIEW IF EXISTS csvTable; +-- to_csv +select to_csv(named_struct('a', 1, 'b', 2)); +select to_csv(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy')); +-- Check if errors handled +select to_csv(named_struct('a', 1, 'b', 2), named_struct('mode', 'PERMISSIVE')); +select to_csv(named_struct('a', 1, 'b', 2), map('mode', 1)); --- End diff -- This one too, since the exception is from `convertToMapData`. We only need one test: this one or the one right above. One of them can be removed.
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22626#discussion_r230544760 --- Diff: sql/core/src/test/resources/sql-tests/inputs/csv-functions.sql --- @@ -15,3 +15,10 @@ CREATE TEMPORARY VIEW csvTable(csvField, a) AS SELECT * FROM VALUES ('1,abc', 'a SELECT schema_of_csv(csvField) FROM csvTable; -- Clean up DROP VIEW IF EXISTS csvTable; +-- to_csv +select to_csv(named_struct('a', 1, 'b', 2)); +select to_csv(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy')); +-- Check if errors handled +select to_csv(named_struct('a', 1, 'b', 2), named_struct('mode', 'PERMISSIVE')); +select to_csv(named_struct('a', 1, 'b', 2), map('mode', 1)); +select to_csv(); --- End diff -- I think we don't have to test this since it's not specific to this expression.
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22626#discussion_r230544717 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala --- @@ -174,3 +176,66 @@ case class SchemaOfCsv( override def prettyName: String = "schema_of_csv" } + +/** + * Converts a [[StructType]] to a CSV output string. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(expr[, options]) - Returns a CSV string with a given struct value", + examples = """ +Examples: + > SELECT _FUNC_(named_struct('a', 1, 'b', 2)); + 1,2 + > SELECT _FUNC_(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy')); + "26/08/2015" + """, + since = "3.0.0") +// scalastyle:on line.size.limit +case class StructsToCsv( + options: Map[String, String], + child: Expression, + timeZoneId: Option[String] = None) + extends UnaryExpression with TimeZoneAwareExpression with CodegenFallback with ExpectsInputTypes { + override def nullable: Boolean = true + + def this(options: Map[String, String], child: Expression) = this(options, child, None) + + // Used in `FunctionRegistry` + def this(child: Expression) = this(Map.empty, child, None) + + def this(child: Expression, options: Expression) = +this( + options = ExprUtils.convertToMapData(options), + child = child, + timeZoneId = None) + + @transient + lazy val writer = new CharArrayWriter() + + @transient + lazy val inputSchema: StructType = child.dataType match { +case st: StructType => st +case other => + throw new IllegalArgumentException(s"Unsupported input type ${other.catalogString}") + } + + @transient + lazy val gen = new UnivocityGenerator( +inputSchema, writer, new CSVOptions(options, columnPruning = true, timeZoneId.get)) + + // This converts rows to the CSV output according to the given schema. 
+ @transient + lazy val converter: Any => UTF8String = { +(row: Any) => UTF8String.fromString(gen.writeToString(row.asInstanceOf[InternalRow])) + } + + override def dataType: DataType = StringType + + override def withTimeZone(timeZoneId: String): TimeZoneAwareExpression = +copy(timeZoneId = Option(timeZoneId)) + + override def nullSafeEval(value: Any): Any = converter(value) + + override def inputTypes: Seq[AbstractDataType] = TypeCollection(StructType) :: Nil --- End diff -- I think we can use `StructType :: Nil` here.
[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22914 **[Test build #98417 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98417/testReport)** for PR 22914 at commit [`2e39c4a`](https://github.com/apache/spark/commit/2e39c4a2cbf1db82b37795b2b568985fda2ff903).
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22626#discussion_r230544667 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/csvExpressions.scala --- @@ -174,3 +176,66 @@ case class SchemaOfCsv( override def prettyName: String = "schema_of_csv" } + +/** + * Converts a [[StructType]] to a CSV output string. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = "_FUNC_(expr[, options]) - Returns a CSV string with a given struct value", + examples = """ +Examples: + > SELECT _FUNC_(named_struct('a', 1, 'b', 2)); + 1,2 + > SELECT _FUNC_(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy')); + "26/08/2015" + """, + since = "3.0.0") +// scalastyle:on line.size.limit +case class StructsToCsv( + options: Map[String, String], + child: Expression, + timeZoneId: Option[String] = None) + extends UnaryExpression with TimeZoneAwareExpression with CodegenFallback with ExpectsInputTypes { + override def nullable: Boolean = true + + def this(options: Map[String, String], child: Expression) = this(options, child, None) + + // Used in `FunctionRegistry` + def this(child: Expression) = this(Map.empty, child, None) + + def this(child: Expression, options: Expression) = +this( + options = ExprUtils.convertToMapData(options), + child = child, + timeZoneId = None) + + @transient + lazy val writer = new CharArrayWriter() + + @transient + lazy val inputSchema: StructType = child.dataType match { +case st: StructType => st +case other => + throw new IllegalArgumentException(s"Unsupported input type ${other.catalogString}") + } + + @transient + lazy val gen = new UnivocityGenerator( +inputSchema, writer, new CSVOptions(options, columnPruning = true, timeZoneId.get)) + + // This converts rows to the CSV output according to the given schema. 
+ @transient + lazy val converter: Any => UTF8String = { +(row: Any) => UTF8String.fromString(gen.writeToString(row.asInstanceOf[InternalRow])) --- End diff -- @MaxGekk, can we take the data from `writer` via `writer.toString` and `writer.reset()`, as `to_json` does? It looks like we are trying to avoid the header (which is fine). If we explicitly set `header` to `false` in this expression, it looks like we don't need to add `writeToString` to `UnivocityGenerator`.
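The writer-reuse pattern suggested here can be sketched without Spark: write one record into a shared `CharArrayWriter`, take `toString`, then call `reset()` so the next record starts from an empty buffer. `LineBuffer` and its naive comma join are hypothetical stand-ins for the generator, and no quoting or escaping is attempted.

```scala
import java.io.CharArrayWriter

// Hypothetical stand-in for a generator that writes into a reusable buffer.
class LineBuffer {
  private val writer = new CharArrayWriter()

  // Writes one record, returns it as a string, and clears the buffer.
  def render(fields: Seq[String]): String = {
    writer.write(fields.mkString(","))
    val line = writer.toString
    writer.reset() // without this, the next call would append to old data
    line
  }
}
```

The appeal of this shape is that the generator keeps a single write path, and the expression reads the result back from the writer it already owns instead of needing an extra string-returning method.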
[GitHub] spark issue #22913: [SPARK-25902][SQL] Add support for dates with millisecon...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22913 **[Test build #98418 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98418/testReport)** for PR 22913 at commit [`f809942`](https://github.com/apache/spark/commit/f809942de6c241cc7b499c19d0250185ebe26122).
[GitHub] spark issue #22913: [SPARK-25902][SQL] Add support for dates with millisecon...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22913 ok to test
[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22914 retest this please
[GitHub] spark issue #22913: [SPARK-25902][SQL] Add support for dates with millisecon...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22913 cc @BryanCutler
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22626#discussion_r230544556 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/CsvFunctionsSuite.scala --- @@ -45,7 +45,6 @@ class CsvFunctionsSuite extends QueryTest with SharedSQLContext { Row(Row(java.sql.Timestamp.valueOf("2015-08-26 18:00:00.0" } - --- End diff -- Ah, I prefer not to include unrelated changes, but it's okay.
[GitHub] spark pull request #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22626#discussion_r230544492 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityGenerator.scala --- @@ -15,18 +15,17 @@ * limitations under the License. */ -package org.apache.spark.sql.execution.datasources.csv +package org.apache.spark.sql.catalyst.csv import java.io.Writer import com.univocity.parsers.csv.CsvWriter import org.apache.spark.sql.catalyst.InternalRow -import org.apache.spark.sql.catalyst.csv.CSVOptions import org.apache.spark.sql.catalyst.util.DateTimeUtils import org.apache.spark.sql.types._ -private[csv] class UnivocityGenerator( +private[sql] class UnivocityGenerator( --- End diff -- Let's remove `private[sql]`. We are already in an internal package `catalyst`.
[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...
Github user shahidki31 commented on the issue: https://github.com/apache/spark/pull/22914 Jenkins, retest this please
[GitHub] spark issue #22864: [SPARK-25861][Minor][WEBUI] Remove unused refreshInterva...
Github user shahidki31 commented on the issue: https://github.com/apache/spark/pull/22864 Thank you @srowen
[GitHub] spark issue #22923: [SPARK-25910][CORE] accumulator updates from previous st...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22923 Merged build finished. Test PASSed.
[GitHub] spark issue #22923: [SPARK-25910][CORE] accumulator updates from previous st...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22923 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4731/ Test PASSed.
[GitHub] spark issue #22923: [SPARK-25910][CORE] accumulator updates from previous st...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22923 **[Test build #98416 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98416/testReport)** for PR 22923 at commit [`4d9cbe0`](https://github.com/apache/spark/commit/4d9cbe043604e76b6367e4ecb42d0d36437d1792).
[GitHub] spark issue #22923: [SPARK-25910][CORE] accumulator updates from previous st...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22923 retest this please
[GitHub] spark issue #22921: [SPARK-25908][CORE][SQL] Remove old deprecated items in ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22921 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98414/ Test PASSed.
[GitHub] spark issue #22921: [SPARK-25908][CORE][SQL] Remove old deprecated items in ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22921 Merged build finished. Test PASSed.
[GitHub] spark issue #22921: [SPARK-25908][CORE][SQL] Remove old deprecated items in ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22921 **[Test build #98414 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98414/testReport)** for PR 22921 at commit [`57ef4e8`](https://github.com/apache/spark/commit/57ef4e81d0bad3a2631088488cc41ce230457406). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22930 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4730/ Test PASSed.
[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22930 Merged build finished. Test PASSed.
[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22504 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98413/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22504 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22930 **[Test build #98415 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98415/testReport)** for PR 22930 at commit [`eca075a`](https://github.com/apache/spark/commit/eca075a83bfba189d93e04376577cbddaeb2d897). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22504 **[Test build #98413 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98413/testReport)** for PR 22504 at commit [`70a227f`](https://github.com/apache/spark/commit/70a227fd83524f35264fac9717d07024f440d179). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22920: [SPARK-24959][SQL][FOLLOWUP] Creating Jackson parser in ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22920 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98411/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22920: [SPARK-24959][SQL][FOLLOWUP] Creating Jackson parser in ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22920 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22920: [SPARK-24959][SQL][FOLLOWUP] Creating Jackson parser in ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22920 **[Test build #98411 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98411/testReport)** for PR 22920 at commit [`9e32447`](https://github.com/apache/spark/commit/9e3244755e2a3a647ed84cd1d56fbe0a4033caaa). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22547: [SPARK-25528][SQL] data source V2 read side API r...
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/22547#discussion_r230528510
--- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaContinuousInputStream.scala ---
@@ -46,17 +45,22 @@ import org.apache.spark.sql.types.StructType
  * scenarios, where some offsets after the specified initial ones can't be
  * properly read.
  */
-class KafkaContinuousReadSupport(
+class KafkaContinuousInputStream(
--- End diff --
I'd prefer that the commits themselves compile, but since this is separating the modes I think it could be done incrementally.
--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22504 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98407/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22504 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22504 **[Test build #98407 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98407/testReport)** for PR 22504 at commit [`4df08bd`](https://github.com/apache/spark/commit/4df08bd56b4cd51c4072aa026bf7f46bc574421d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15496: [SPARK-17950] [Python] Match SparseVector behavior with ...
Github user itg-abby commented on the issue: https://github.com/apache/spark/pull/15496 Hi, can you give a ref to the new style guides? I'm not sure if anything major needs changing. In the meantime I resolved the one conflict at the bottom of ml/tests.py and extended the docstring to include your comment. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19573: [SPARK-22350][SQL] select grouping__id from subquery
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19573 Build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19573: [SPARK-22350][SQL] select grouping__id from subquery
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19573 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4729/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22864: [SPARK-25861][Minor][WEBUI] Remove unused refresh...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22864 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22921: [SPARK-25908][CORE][SQL] Remove old deprecated items in ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22921 **[Test build #98414 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98414/testReport)** for PR 22921 at commit [`57ef4e8`](https://github.com/apache/spark/commit/57ef4e81d0bad3a2631088488cc41ce230457406). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22864: [SPARK-25861][Minor][WEBUI] Remove unused refreshInterva...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/22864 Merged to master --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22921: [SPARK-25908][CORE][SQL] Remove old deprecated items in ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22921 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4728/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22921: [SPARK-25908][CORE][SQL] Remove old deprecated items in ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22921 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22504 **[Test build #98413 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98413/testReport)** for PR 22504 at commit [`70a227f`](https://github.com/apache/spark/commit/70a227fd83524f35264fac9717d07024f440d179). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22626 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98405/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22626 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22626 **[Test build #98405 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98405/testReport)** for PR 22626 at commit [`230f789`](https://github.com/apache/spark/commit/230f7890d75ab9ed6041eb7d9be1aa01c9f82968). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...
Github user skonto commented on the issue: https://github.com/apache/spark/pull/22931 @srowen I guess yes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/22931 sorry, ignore amplab's report. the build passed, but my hacking on the integration test reports was what caused the failure. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22931 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4727/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22931 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22931 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/4727/ --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/22931 OK. Does this need to go in branch 2.4 too? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22928: [SPARK-25926][CORE] Move config entries in core module t...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/22928 I don't like stashing everything in `package.scala`; I'm ok-ish with moving them under the `internal.config` package, but it would be better to keep them in separate source files. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22931 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22931 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98412/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22931 **[Test build #98412 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98412/testReport)** for PR 22931 at commit [`bf85974`](https://github.com/apache/spark/commit/bf85974e769b86056a83be6f051cb15ff3279022). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22931 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/4727/ --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22429: [SPARK-25440][SQL] Dumping query execution info to a fil...
Github user boy-uber commented on the issue: https://github.com/apache/spark/pull/22429
> > @MaxGekk I sent email to spark dev list about structured plan logging, but did not get any response.
> > @boy-uber I guess It is better to speak about the feature to @bogdanrdc @hvanhovell @larturus
Thanks @MaxGekk for the contact list! I will ping them to gather more thoughts.
--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22931 **[Test build #98412 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98412/testReport)** for PR 22931 at commit [`bf85974`](https://github.com/apache/spark/commit/bf85974e769b86056a83be6f051cb15ff3279022). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22930 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22930 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98408/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22930: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22930 **[Test build #98408 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98408/testReport)** for PR 22930 at commit [`9125b31`](https://github.com/apache/spark/commit/9125b31fd2b1b6df0a5aaeba743f9d287cf2e897). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...
Github user skonto commented on the issue: https://github.com/apache/spark/pull/22931 @vanzin modified it. Much better. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22925: [SPARK-25913][SQL] Extend UnaryExecNode by unary SparkPl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22925 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22504 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22925: [SPARK-25913][SQL] Extend UnaryExecNode by unary SparkPl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22925 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98402/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22504 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98401/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22504: [SPARK-25118][Submit] Persist Driver Logs in Client mode...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22504 **[Test build #98401 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98401/testReport)** for PR 22504 at commit [`4df08bd`](https://github.com/apache/spark/commit/4df08bd56b4cd51c4072aa026bf7f46bc574421d). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22925: [SPARK-25913][SQL] Extend UnaryExecNode by unary SparkPl...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22925 **[Test build #98402 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98402/testReport)** for PR 22925 at commit [`0172030`](https://github.com/apache/spark/commit/01720302645d13e6e94d66b12b83568aff321a91). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22897: [SPARK-25875][k8s] Merge code to set up driver co...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22897 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22920: [SPARK-24959][SQL][FOLLOWUP] Creating Jackson parser in ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22920 **[Test build #98411 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98411/testReport)** for PR 22920 at commit [`9e32447`](https://github.com/apache/spark/commit/9e3244755e2a3a647ed84cd1d56fbe0a4033caaa). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22530: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22530 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98406/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22530: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22530 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22530: [SPARK-24869][SQL] Fix SaveIntoDataSourceCommand's input...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22530 **[Test build #98406 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98406/testReport)** for PR 22530 at commit [`9f90fa0`](https://github.com/apache/spark/commit/9f90fa0bfd7337375c97660b475b65f4ce25c160). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22897: [SPARK-25875][k8s] Merge code to set up driver command i...
Github user mccheah commented on the issue: https://github.com/apache/spark/pull/22897 Ok, I am merging into master. Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22547: [SPARK-25528][SQL] data source V2 read side API r...
Github user mccheah commented on a diff in the pull request: https://github.com/apache/spark/pull/22547#discussion_r230505785
--- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaContinuousInputStream.scala ---
@@ -46,17 +45,22 @@ import org.apache.spark.sql.types.StructType
  * scenarios, where some offsets after the specified initial ones can't be
  * properly read.
  */
-class KafkaContinuousReadSupport(
+class KafkaContinuousInputStream(
--- End diff --
+1 for this. Many of the changes right now just move code around, the streaming code especially, which makes it harder to isolate just the proposed API for review. An alternative is to split this PR into separate commits: the individual commits may not compile because of mismatching signatures, but all the commits taken together would compile, and each commit can be reviewed individually to assess the API and then the implementation. For example, I'd propose 3 PRs:
* Batch reading, with a commit for the interface changes and a separate commit for the implementation changes
* Micro-batch streaming read, with a commit for the interface changes and a separate commit for the implementation changes
* Continuous streaming read, similar to above
Thoughts?
--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/22931 btw next k8s test that runs will actually have logs! here's the integration test log from this run, which wasn't archived... [integration-tests.log](https://github.com/apache/spark/files/2544075/integration-tests.log) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...
Github user skonto commented on the issue: https://github.com/apache/spark/pull/22931 @vanzin ok let me try that. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22931 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22931 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/4726/ --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22931: [SPARK-25930][K8s] Fix scala string detection in k8s tes...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22931 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4726/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org