[GitHub] spark issue #21004: [SPARK-23896][SQL]Improve PartitioningAwareFileIndex
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21004 **[Test build #89049 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89049/testReport)** for PR 21004 at commit [`10536a6`](https://github.com/apache/spark/commit/10536a6dbf2ab37d7066915223a64e914cf53b5f). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21001 **[Test build #89050 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89050/testReport)** for PR 21001 at commit [`3fe648f`](https://github.com/apache/spark/commit/3fe648fa03e81b8a2f5ec23182cae3b977164646).
[GitHub] spark issue #21004: [SPARK-23896][SQL]Improve PartitioningAwareFileIndex
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21004 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2089/ Test PASSed.
[GitHub] spark issue #21007: [SPARK-23942][PYTHON][SQL] Makes collect in PySpark as a...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21007 **[Test build #89053 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89053/testReport)** for PR 21007 at commit [`db1987f`](https://github.com/apache/spark/commit/db1987f63370c6c2f9434aea76da7d326565be5a).
[GitHub] spark issue #21005: [SPARK-23898][SQL] Simplify add & subtract code generati...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21005 **[Test build #89052 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89052/testReport)** for PR 21005 at commit [`433`](https://github.com/apache/spark/commit/43314b1d443fac5ca27ecef80677dbe70ab7).
[GitHub] spark issue #21004: [SPARK-23896][SQL]Improve PartitioningAwareFileIndex
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21004 Merged build finished. Test PASSed.
[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20981 Merged build finished. Test PASSed.
[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20981 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89048/ Test PASSed.
[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20981 **[Test build #89048 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89048/testReport)** for PR 20981 at commit [`2eb2bf1`](https://github.com/apache/spark/commit/2eb2bf1853a0ba4de8f4a3adfe8407d04a075b22). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20925: [SPARK-22941][core] Do not exit JVM when submit fails wi...
Github user attilapiros commented on the issue: https://github.com/apache/spark/pull/20925 I have finished my review and have not found any additional issues. LGTM
[GitHub] spark issue #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Support cus...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20937

> I don't know about you but I used to think if something doesn't work it means it doesn't work in ALL cases.

I agree that there is always room for improvement. Trust me, I don't usually push back this hard. I would like to avoid documenting that auto-detection is supported, particularly in this case. It is incomplete, we found many holes, and we also found these are pretty tricky to fix. For example, there was a case where `DROPMALFORMED` was required too, IIRC. I think you really know which cases don't work, @MaxGekk, given our discussion so far. If you really don't know, I will try to test, look back, and list the cases that don't work. I really want to avoid complaints about why auto-detection doesn't work, and just want to clarify _the current status as is_, because this PR targets adding the explicit encoding.
[GitHub] spark pull request #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Supp...
Github user MaxGekk commented on a diff in the pull request: https://github.com/apache/spark/pull/20937#discussion_r180050283 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonDataSource.scala --- @@ -92,26 +93,30 @@ object TextInputJsonDataSource extends JsonDataSource { sparkSession: SparkSession, inputPaths: Seq[FileStatus], parsedOptions: JSONOptions): StructType = { -val json: Dataset[String] = createBaseDataset( - sparkSession, inputPaths, parsedOptions.lineSeparator) +val json: Dataset[String] = createBaseDataset(sparkSession, inputPaths, parsedOptions) + inferFromDataset(json, parsedOptions) } def inferFromDataset(json: Dataset[String], parsedOptions: JSONOptions): StructType = { val sampled: Dataset[String] = JsonUtils.sample(json, parsedOptions) -val rdd: RDD[UTF8String] = sampled.queryExecution.toRdd.map(_.getUTF8String(0)) -JsonInferSchema.infer(rdd, parsedOptions, CreateJacksonParser.utf8String) +val rdd: RDD[InternalRow] = sampled.queryExecution.toRdd +val rowParser = parsedOptions.encoding.map { enc => + CreateJacksonParser.internalRow(enc, _: JsonFactory, _: InternalRow, 0) --- End diff -- I tried that originally but rejected the solution because the overhead of wrapping the array in a `ByteArrayInputStream` for each row is very high. It increases execution time by up to 20% in some cases.
[GitHub] spark pull request #21007: [SPARK-23942][PYTHON][SQL] Makes collect in PySpa...
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/21007 [SPARK-23942][PYTHON][SQL] Makes collect in PySpark as action for a query executor listener

## What changes were proposed in this pull request?

This PR proposes to add `collect` to a query executor as an action. It seems `collect`, and `collect` with Arrow, are not recognised via `QueryExecutionListener` as an action. For example, if we have a custom listener as below:

```scala
package org.apache.spark.sql

import org.apache.spark.internal.Logging
import org.apache.spark.sql.execution.QueryExecution
import org.apache.spark.sql.util.QueryExecutionListener

class TestQueryExecutionListener extends QueryExecutionListener with Logging {
  override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit = {
    logError("Look at me! I'm 'onSuccess'")
  }

  override def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit = { }
}
```

**Before**

```python
>>> sql("SELECT * FROM range(1)").collect()
```
```
[Row(id=0)]
```
```python
>>> spark.conf.set("spark.sql.execution.arrow.enabled", "true")
>>> sql("SELECT * FROM range(1)").toPandas()
```
```
   id
0   0
```

**After**

```python
>>> sql("SELECT * FROM range(1)").collect()
```
```
18/04/09 16:57:58 ERROR TestQueryExecutionListener: Look at me! I'm 'onSuccess'
[Row(id=0)]
```
```python
>>> spark.conf.set("spark.sql.execution.arrow.enabled", "true")
>>> sql("SELECT * FROM range(1)").toPandas()
```
```
18/04/09 17:53:26 ERROR TestQueryExecutionListener: Look at me! I'm 'onSuccess'
   id
0   0
```

Other operations in PySpark or on the Scala side seem fine:

```python
>>> sql("SELECT * FROM range(1)").show()
```
```
18/04/09 17:02:04 ERROR TestQueryExecutionListener: Look at me! I'm 'onSuccess'
+---+
| id|
+---+
|  0|
+---+
```
```scala
scala> sql("SELECT * FROM range(1)").collect()
```
```
18/04/09 16:58:41 ERROR TestQueryExecutionListener: Look at me! I'm 'onSuccess'
res1: Array[org.apache.spark.sql.Row] = Array([0])
```

## How was this patch tested?

I have manually tested as described above. It's possible to add a test, but I would need to make a mock `QueryExecutionListener`, a static object with a variable updated by the mock `QueryExecutionListener`, and check the variable via Py4J. This would also need a manual skip condition on the PySpark side. I can add this test, but I usually try to avoid tests with JVM access; let me know if anyone feels it is required.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/HyukjinKwon/spark SPARK-23942 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21007.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21007 commit db1987f63370c6c2f9434aea76da7d326565be5a Author: hyukjinkwon Date: 2018-04-09T09:54:44Z Makes collect in PySpark as action for a query executor listener
[GitHub] spark issue #21005: [SPARK-23898][SQL] Simplify add & subtract code generati...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/21005 LGTM
[GitHub] spark issue #20944: [SPARK-23831][SQL] Add org.apache.derby to IsolatedClien...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20944 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89046/ Test PASSed.
[GitHub] spark issue #20944: [SPARK-23831][SQL] Add org.apache.derby to IsolatedClien...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20944 Merged build finished. Test PASSed.
[GitHub] spark issue #20944: [SPARK-23831][SQL] Add org.apache.derby to IsolatedClien...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20944 **[Test build #89046 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89046/testReport)** for PR 20944 at commit [`1c801f1`](https://github.com/apache/spark/commit/1c801f1e673b3d6f9e94eeade08d5b309a105061). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21005: [SPARK-23898][SQL] Simplify add & subtract code generati...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/21005 retest this please
[GitHub] spark issue #20904: [SPARK-23751][ML][PySpark] Kolmogorov-Smirnoff test Pyth...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/20904 Jenkins, test this please.
[GitHub] spark issue #21005: [SPARK-23898][SQL] Simplify add & subtract code generati...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21005 Merged build finished. Test FAILed.
[GitHub] spark issue #21005: [SPARK-23898][SQL] Simplify add & subtract code generati...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21005 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89047/ Test FAILed.
[GitHub] spark issue #21005: [SPARK-23898][SQL] Simplify add & subtract code generati...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21005 **[Test build #89047 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89047/testReport)** for PR 21005 at commit [`433`](https://github.com/apache/spark/commit/43314b1d443fac5ca27ecef80677dbe70ab7). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21004: [SPARK-23896][SQL]Improve PartitioningAwareFileIndex
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/21004 retest this please.
[GitHub] spark issue #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Support cus...
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/20937 @HyukjinKwon Let's sync.

> Automatic encoding detection doesn't work for newlines and schema inference when multiLine is disabled

I don't know about you, but I used to think that if something "doesn't work", it means it doesn't work in ALL cases. You write statements that are partially correct or incorrect. About this statement, here are counterexamples: 1. File in UTF-8, multiline disabled - are the newline and schema inferred correctly? Yes. 2. File in ISO 8859-1, multiline disabled. Does it work? Yes. 3. Encoding is CP1251 - the same. All those examples show that your statement is wrong in the mathematical sense.

> I thought this PR targets to add the **explicit encoding** support mainly

EXACTLY. I don't know why you push me to do something with auto-detection. The PR doesn't change behavior in the case where `encoding` is not specified. The PR is not about supporting any encoding in all cases. It is about the cases when the `encoding` is specified by a user explicitly.
[GitHub] spark pull request #20235: [Spark-22887][ML][TESTS][WIP] ML test for Structu...
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/20235#discussion_r180027926 --- Diff: mllib/src/test/scala/org/apache/spark/ml/fpm/FPGrowthSuite.scala --- @@ -34,86 +35,122 @@ class FPGrowthSuite extends SparkFunSuite with MLlibTestSparkContext with Defaul } test("FPGrowth fit and transform with different data types") { -Array(IntegerType, StringType, ShortType, LongType, ByteType).foreach { dt => - val data = dataset.withColumn("items", col("items").cast(ArrayType(dt))) - val model = new FPGrowth().setMinSupport(0.5).fit(data) - val generatedRules = model.setMinConfidence(0.5).associationRules - val expectedRules = spark.createDataFrame(Seq( -(Array("2"), Array("1"), 1.0), -(Array("1"), Array("2"), 0.75) - )).toDF("antecedent", "consequent", "confidence") -.withColumn("antecedent", col("antecedent").cast(ArrayType(dt))) -.withColumn("consequent", col("consequent").cast(ArrayType(dt))) - assert(expectedRules.sort("antecedent").rdd.collect().sameElements( -generatedRules.sort("antecedent").rdd.collect())) - - val transformed = model.transform(data) - val expectedTransformed = spark.createDataFrame(Seq( -(0, Array("1", "2"), Array.emptyIntArray), -(0, Array("1", "2"), Array.emptyIntArray), -(0, Array("1", "2"), Array.emptyIntArray), -(0, Array("1", "3"), Array(2)) - )).toDF("id", "items", "prediction") -.withColumn("items", col("items").cast(ArrayType(dt))) -.withColumn("prediction", col("prediction").cast(ArrayType(dt))) - assert(expectedTransformed.collect().toSet.equals( -transformed.collect().toSet)) + class DataTypeWithEncoder[A](val a: DataType) + (implicit val encoder: Encoder[(Int, Array[A], Array[A])]) + + Array( +new DataTypeWithEncoder[Int](IntegerType), +new DataTypeWithEncoder[String](StringType), +new DataTypeWithEncoder[Short](ShortType), +new DataTypeWithEncoder[Long](LongType) +// , new DataTypeWithEncoder[Byte](ByteType) +// TODO: using ByteType produces error, as Array[Byte] is handled as Binary +// cannot resolve 'CAST(`items` AS BINARY)' due to data type mismatch: +// cannot cast array to binary; + ).foreach { dt => { +val data = dataset.withColumn("items", col("items").cast(ArrayType(dt.a))) +val model = new FPGrowth().setMinSupport(0.5).fit(data) +val generatedRules = model.setMinConfidence(0.5).associationRules +val expectedRules = Seq( + (Array("2"), Array("1"), 1.0), + (Array("1"), Array("2"), 0.75) +).toDF("antecedent", "consequent", "confidence") + .withColumn("antecedent", col("antecedent").cast(ArrayType(dt.a))) + .withColumn("consequent", col("consequent").cast(ArrayType(dt.a))) +assert(expectedRules.sort("antecedent").rdd.collect().sameElements( + generatedRules.sort("antecedent").rdd.collect())) + +val expectedTransformed = Seq( + (0, Array("1", "2"), Array.emptyIntArray), --- End diff -- I think the "id" column should have the values "0, 1, 2, 3". The id column is useless here; we can remove it.
[GitHub] spark issue #21006: [SPARK-22256][MESOS] - Introduce spark.mesos.driver.memo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21006 Can one of the admins verify this patch?
[GitHub] spark pull request #21006: [SPARK-22256][MESOS] - Introduce spark.mesos.driv...
GitHub user pmackles opened a pull request: https://github.com/apache/spark/pull/21006 [SPARK-22256][MESOS] - Introduce spark.mesos.driver.memoryOverhead When running the Spark driver in a container, such as when using the Mesos dispatcher service, we need to apply the same rules as for executors in order to avoid the JVM going over the allotted limit and then being killed. Tested manually on the spark 2.3 branch. You can merge this pull request into a Git repository by running: $ git pull https://github.com/pmackles/spark paul-SPARK-22256 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21006.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21006 commit 1197c0b2e4ae72c1353ab4cd132285da4cfed61e Author: Paul Mackles Date: 2018-04-06T17:44:38Z [SPARK-22256] - Introduce spark.mesos.driver.memoryOverhead
[GitHub] spark issue #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Support cus...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20937 Wait .. @MaxGekk, I think we should be synced first. Automatic encoding detection doesn't work for newlines and schema inference when `multiLine` is disabled, and I want to clarify this in the documentation and error messages. I thought this PR mainly targets adding the explicit encoding support, as I discussed with @cloud-fan and you, if I haven't missed something. Did I maybe misread the discussion?
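The newline concern above can be made concrete: when records are split on raw newline bytes, multi-byte encodings break. A minimal illustration (Python is used here as a stand-in for the JVM-side byte-oriented line splitting; the file contents are hypothetical):

```python
# A UTF-16LE newline is the two bytes b"\n\x00", but a byte-oriented
# line reader splits on the single byte b"\n", cutting each subsequent
# record mid-character.
data = "{}\n{}".encode("utf-16-le")
naive_records = data.split(b"\n")
print(naive_records)  # second chunk starts with a stray NUL byte
```

The first chunk still decodes as `{}`, but every later record carries half of the previous character, which is one reason auto-detection cannot be relied on without `multiLine`.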
[GitHub] spark issue #21005: [SPARK-23898][SQL] Simplify add & subtract code generati...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/21005 Yea, I think so, and I just suggested we'd better file a new JIRA for that. Thanks!
[GitHub] spark pull request #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Supp...
Github user MaxGekk commented on a diff in the pull request: https://github.com/apache/spark/pull/20937#discussion_r180016246 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala --- @@ -361,6 +361,15 @@ class JacksonParser( // For such records, all fields other than the field configured by // `columnNameOfCorruptRecord` are set to `null`. throw BadRecordException(() => recordLiteral(record), () => None, e) + case e: CharConversionException if options.encoding.isEmpty => +val msg = + """Failed to parse a character. Encoding was detected automatically. --- End diff --

> I don't think `Encoding was detected automatically` is quite correct.

It is absolutely correct. If `encoding` is not set, it is detected automatically by Jackson. Look at the condition `if options.encoding.isEmpty =>`.

> It might not help user solve the issue but it gives less correct information.

It gives absolutely correct information.

> They could think it detects the encoding correctly regardless of the multiline option.

The message DOESN'T say that the `encoding` was detected correctly.

> Think about this scenario: users somehow get this exception and read `Failed to parse a character. Encoding was detected automatically.`. What would they think?

They will look at the proposed solution, `You might want to set it explicitly via the encoding option like`, and will set `encoding`.

> I would think somehow the file is somehow failed to read

It could be true even if `encoding` is set correctly.

> but it looks detecting the encoding in the file correctly automatically

I don't know why you decided that. I see nothing about the correctness of `encoding` in the message.

> It's annoying to debug encoding related stuff in my experience. It would be nicer if we give the correct information as much as we can.

What is your suggestion for the error message?

> I am saying let's document the automatic encoding detection feature only for multiLine officially, which is true.

I agree, let's document that, though it is not related to this PR. This PR doesn't change the behavior of encoding auto-detection. And it must not change the behavior, from my point of view. If you want to restrict the encoding auto-detection mechanism somehow, please create a separate PR. We will discuss separately what kind of customer apps it would break.
[GitHub] spark pull request #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Supp...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20937#discussion_r180014636 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JSONOptions.scala --- @@ -86,14 +85,34 @@ private[sql] class JSONOptions( val multiLine = parameters.get("multiLine").map(_.toBoolean).getOrElse(false) - val lineSeparator: Option[String] = parameters.get("lineSep").map { sep => -require(sep.nonEmpty, "'lineSep' cannot be an empty string.") -sep + /** + * A string between two consecutive JSON records. + */ + val lineSeparator: Option[String] = parameters.get("lineSep") + + /** + * Standard encoding (charset) name. For example UTF-8, UTF-16LE and UTF-32BE. + * If the encoding is not specified (None), it will be detected automatically. + */ + val encoding: Option[String] = parameters.get("encoding") +.orElse(parameters.get("charset")).map { enc => + val blacklist = List("UTF16", "UTF32") --- End diff -- Not important, but it's more usual, and I was thinking of doing it unless there is a specific reason to make an exception from the norm.
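One motivation for blacklisting the endianness-ambiguous names (`UTF16`, `UTF32`): encoders for plain "UTF-16"/"UTF-32" prepend a byte order mark, so every independently encoded record would carry its own BOM, while the endianness-explicit variants are BOM-free. A small illustration (Python's codecs shown here for brevity; the JVM charset encoders behave analogously):

```python
# "utf-16" writes a 2-byte BOM before the payload; the
# endianness-explicit variants do not, which is why only the
# latter are safe for per-record encoding.
print(len("a".encode("utf-16")))     # BOM (2 bytes) + 'a' (2 bytes) = 4
print(len("a".encode("utf-16-le")))  # just 'a' = 2
print(len("a".encode("utf-16-be")))  # just 'a' = 2
```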
[GitHub] spark pull request #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Supp...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20937#discussion_r180014167 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala --- @@ -366,6 +366,9 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging { * `java.text.SimpleDateFormat`. This applies to timestamp type. * `multiLine` (default `false`): parse one record, which may span multiple lines, * per file + * `encoding` (by default it is not set): allows to forcibly set one of standard basic + * or extended charsets for input jsons. For example UTF-8, UTF-16BE, UTF-32. If the encoding + * is not specified (by default), it will be detected automatically. --- End diff --

> If encoding is not set, it will be detected by Jackson independently from multiline.

Jackson detects it, but Spark doesn't handle it correctly when `multiLine` is disabled, even with this PR, as we talked about. We found many holes. Why did you bring this up again?
[GitHub] spark pull request #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Supp...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20937#discussion_r180013348 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonDataSource.scala --- @@ -92,26 +93,30 @@ object TextInputJsonDataSource extends JsonDataSource { sparkSession: SparkSession, inputPaths: Seq[FileStatus], parsedOptions: JSONOptions): StructType = { -val json: Dataset[String] = createBaseDataset( - sparkSession, inputPaths, parsedOptions.lineSeparator) +val json: Dataset[String] = createBaseDataset(sparkSession, inputPaths, parsedOptions) + inferFromDataset(json, parsedOptions) } def inferFromDataset(json: Dataset[String], parsedOptions: JSONOptions): StructType = { val sampled: Dataset[String] = JsonUtils.sample(json, parsedOptions) -val rdd: RDD[UTF8String] = sampled.queryExecution.toRdd.map(_.getUTF8String(0)) -JsonInferSchema.infer(rdd, parsedOptions, CreateJacksonParser.utf8String) +val rdd: RDD[InternalRow] = sampled.queryExecution.toRdd +val rowParser = parsedOptions.encoding.map { enc => + CreateJacksonParser.internalRow(enc, _: JsonFactory, _: InternalRow, 0) --- End diff -- Can we do something like

```scala
(factory: JsonFactory, row: InternalRow) => {
  val bais = new ByteArrayInputStream(row.getBinary(0))
  CreateJacksonParser.inputStream(enc, factory, bais)
}
```

? It looks like `internalRow` doesn't actually deduplicate code.
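The pattern under discussion - choose the parsing closure once based on whether an explicit `encoding` was given, then wrap each record's raw bytes in a stream decoded with that charset - can be sketched as follows. This is a simplified Python sketch of the idea, not the Spark code itself: `make_record_parser` is a hypothetical stand-in for what Jackson's `JsonFactory` plus `CreateJacksonParser` do on the JVM side.

```python
import io
import json

def make_record_parser(encoding=None):
    """Return a closure that parses one record's raw bytes.

    With an explicit encoding, wrap the bytes in a stream decoded
    with it (the per-record wrapping whose overhead was measured in
    this thread); otherwise fall back to UTF-8 as a stand-in for
    auto-detection.
    """
    if encoding is not None:
        def parse(raw: bytes):
            # Per-record stream wrapping, analogous to ByteArrayInputStream.
            reader = io.TextIOWrapper(io.BytesIO(raw), encoding=encoding)
            return json.load(reader)
        return parse
    return lambda raw: json.loads(raw.decode("utf-8"))

parser = make_record_parser("utf-16-le")
record = '{"id": 1}'.encode("utf-16-le")
print(parser(record))  # {'id': 1}
```

Allocating a fresh stream wrapper per record is what makes this simple approach costly, which is the trade-off MaxGekk describes above.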
[GitHub] spark issue #21005: [SPARK-23898][SQL] Simplify add & subtract code generati...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/21005 @maropu it seems a bit of overkill to add a separate trait for this; it also kinda nullifies the effect of this PR. As for `CalendarInterval`'s support for `divide` and `multiply`: these operations have not been implemented yet, and - correct me if I am wrong - involve a `CalendarInterval` on the left side and an `Integral` on the right side; this violates the contract of `BinaryArithmetic`. Anyway, I am not opposed to this, but I think we should do this as part of a separate JIRA/PR.
[GitHub] spark issue #21005: [SPARK-23898][SQL] Simplify add & subtract code generati...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21005 Merged build finished. Test PASSed.
[GitHub] spark pull request #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Supp...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20937#discussion_r180009422 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala --- @@ -361,6 +361,15 @@ class JacksonParser( // For such records, all fields other than the field configured by // `columnNameOfCorruptRecord` are set to `null`. throw BadRecordException(() => recordLiteral(record), () => None, e) + case e: CharConversionException if options.encoding.isEmpty => +val msg = + """Failed to parse a character. Encoding was detected automatically. --- End diff -- I am saying let's document the automatic encoding detection feature only for `multiLine` officially, which is true.
[GitHub] spark issue #21005: [SPARK-23898][SQL] Simplify add & subtract code generati...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21005 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2088/ Test PASSed.
[GitHub] spark pull request #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Supp...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20937#discussion_r180009312 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala --- @@ -361,6 +361,15 @@ class JacksonParser( // For such records, all fields other than the field configured by // `columnNameOfCorruptRecord` are set to `null`. throw BadRecordException(() => recordLiteral(record), () => None, e) + case e: CharConversionException if options.encoding.isEmpty => +val msg = + """Failed to parse a character. Encoding was detected automatically. --- End diff -- I don't think `Encoding was detected automatically` is quite correct. It might not help the user solve the issue, and it gives less correct information. They could think it detects the encoding correctly regardless of the `multiline` option. Think about this scenario: users somehow get this exception and read `Failed to parse a character. Encoding was detected automatically.`. What would they think? I would think the file somehow failed to read, but that the encoding in the file was detected correctly and automatically, regardless of other options. It's annoying to debug encoding-related stuff, in my experience. It would be nicer if we gave the correct information as much as we can.
[GitHub] spark pull request #20981: [SPARK-23873][SQL] Use accessors in interpreted L...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20981#discussion_r180008583 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/InternalRow.scala ---
@@ -119,4 +119,25 @@ object InternalRow {
     case v: MapData => v.copy()
     case _ => value
   }
+
+  /**
+   * Returns an accessor for an InternalRow with given data type and ordinal.
+   */
+  def getAccessor(dataType: DataType, ordinal: Int): (InternalRow) => Any = dataType match {
+    case BooleanType => (input) => input.getBoolean(ordinal)
+    case ByteType => (input) => input.getByte(ordinal)
+    case ShortType => (input) => input.getShort(ordinal)
+    case IntegerType | DateType => (input) => input.getInt(ordinal)
+    case LongType | TimestampType => (input) => input.getLong(ordinal)
+    case FloatType => (input) => input.getFloat(ordinal)
+    case DoubleType => (input) => input.getDouble(ordinal)
+    case StringType => (input) => input.getUTF8String(ordinal)
+    case BinaryType => (input) => input.getBinary(ordinal)
+    case CalendarIntervalType => (input) => input.getInterval(ordinal)
+    case t: DecimalType => (input) => input.getDecimal(ordinal, t.precision, t.scale)
+    case t: StructType => (input) => input.getStruct(ordinal, t.size)
+    case _: ArrayType => (input) => input.getArray(ordinal)
+    case _: MapType => (input) => input.getMap(ordinal)
+    case _ => (input) => input.get(ordinal, dataType)
--- End diff -- Handle `UDT`? --- 
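The diff above resolves the type dispatch once, so the returned closure does no pattern matching per row. A minimal self-contained sketch of the same technique, with toy `Row` and `DataType` types rather than Spark's classes; the `UDT` case the reviewer asks about could plausibly delegate to the UDT's underlying SQL type (the `UDT(sqlType)` shape here is an illustrative assumption mirroring Spark's `UserDefinedType#sqlType`):

```scala
// Toy model of per-ordinal accessor resolution (not Spark's API).
sealed trait DataType
case object IntType extends DataType
case object StrType extends DataType
// Hypothetical user-defined type wrapping an underlying SQL type.
case class UDT(sqlType: DataType) extends DataType

final case class Row(values: Array[Any]) {
  def getInt(i: Int): Int = values(i).asInstanceOf[Int]
  def getString(i: Int): String = values(i).asInstanceOf[String]
  def get(i: Int): Any = values(i)
}

object Accessors {
  // Dispatch on the type once; the returned closure is match-free per row.
  def getAccessor(dt: DataType, ordinal: Int): Row => Any = dt match {
    case IntType          => row => row.getInt(ordinal)
    case StrType          => row => row.getString(ordinal)
    case UDT(underlying)  => getAccessor(underlying, ordinal) // delegate to the SQL type
    case _                => row => row.get(ordinal)
  }
}
```

Usage: `Accessors.getAccessor(UDT(IntType), 0)` yields the same accessor as `Accessors.getAccessor(IntType, 0)`, which is the gist of handling a UDT by recursing into its storage type.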
[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20981 Merged build finished. Test PASSed.
[GitHub] spark issue #21005: [SPARK-23898][SQL] Simplify add & subtract code generati...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21005 **[Test build #89047 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89047/testReport)** for PR 21005 at commit [`433`](https://github.com/apache/spark/commit/43314b1d443fac5ca27ecef80677dbe70ab7).
[GitHub] spark pull request #20981: [SPARK-23873][SQL] Use accessors in interpreted L...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/20981#discussion_r180008527 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/BoundAttribute.scala ---
@@ -33,28 +33,14 @@ case class BoundReference(ordinal: Int, dataType: DataType, nullable: Boolean)
   override def toString: String = s"input[$ordinal, ${dataType.simpleString}, $nullable]"
+  private lazy val accessor: InternalRow => Any = InternalRow.getAccessor(dataType, ordinal)
--- End diff -- Do we need to be lazy? --- 
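For context on the question above: a `lazy val` defers its initializer until first access and caches the result (at a small synchronization cost per read), while a plain `val` runs the initializer at construction time. A toy illustration of the difference (not Spark code; names are illustrative):

```scala
// Demonstrates lazy initialization: the initializer runs once, on first use.
object LazyDemo {
  var inits = 0 // counts how many times the lazy initializer has run

  class BoundRef {
    // Resolved on first access, then cached; a plain `val` would run
    // the right-hand side as soon as the instance is constructed.
    lazy val accessor: Int => Int = { inits += 1; x => x * 2 }
  }
}
```

Whether laziness is worth it here comes down to whether the accessor is always used after construction; if it is, an eager `val` avoids the per-access volatile check.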
[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20981 **[Test build #89048 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89048/testReport)** for PR 20981 at commit [`2eb2bf1`](https://github.com/apache/spark/commit/2eb2bf1853a0ba4de8f4a3adfe8407d04a075b22).
[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20981 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2087/ Test PASSed.
[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20981 retest this please.
[GitHub] spark issue #20944: [SPARK-23831][SQL] Add org.apache.derby to IsolatedClien...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20944 **[Test build #89046 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89046/testReport)** for PR 20944 at commit [`1c801f1`](https://github.com/apache/spark/commit/1c801f1e673b3d6f9e94eeade08d5b309a105061).
[GitHub] spark issue #20944: [SPARK-23831][SQL] Add org.apache.derby to IsolatedClien...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20944 Merged build finished. Test PASSed.
[GitHub] spark issue #20944: [SPARK-23831][SQL] Add org.apache.derby to IsolatedClien...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20944 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2086/ Test PASSed.
[GitHub] spark issue #20944: [SPARK-23831][SQL] Add org.apache.derby to IsolatedClien...
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/20944 retest this please.
[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20981 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89042/ Test FAILed.
[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20981 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89040/ Test FAILed.
[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20981 Merged build finished. Test FAILed.
[GitHub] spark issue #20904: [SPARK-23751][ML][PySpark] Kolmogorov-Smirnoff test Pyth...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20904 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89039/ Test FAILed.
[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20981 Merged build finished. Test FAILed.
[GitHub] spark issue #20904: [SPARK-23751][ML][PySpark] Kolmogorov-Smirnoff test Pyth...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20904 Build finished. Test FAILed.
[GitHub] spark issue #20944: [SPARK-23831][SQL] Add org.apache.derby to IsolatedClien...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20944 Merged build finished. Test FAILed.
[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20981 **[Test build #89040 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89040/testReport)** for PR 20981 at commit [`a8cdbe8`](https://github.com/apache/spark/commit/a8cdbe8baf2d508fb2583862042f1213cf0eae7b).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #20904: [SPARK-23751][ML][PySpark] Kolmogorov-Smirnoff test Pyth...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20904 **[Test build #89039 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89039/testReport)** for PR 20904 at commit [`49a7ddb`](https://github.com/apache/spark/commit/49a7ddb45cb9a0035e3faed5906ecd37890333e1).
 * This patch **fails due to an unknown error code, -9**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.
[GitHub] spark issue #21004: [SPARK-23896][SQL]Improve PartitioningAwareFileIndex
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21004 Merged build finished. Test FAILed.
[GitHub] spark issue #21004: [SPARK-23896][SQL]Improve PartitioningAwareFileIndex
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21004 **[Test build #89044 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89044/testReport)** for PR 21004 at commit [`10536a6`](https://github.com/apache/spark/commit/10536a6dbf2ab37d7066915223a64e914cf53b5f).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Support cus...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20937 Merged build finished. Test FAILed.
[GitHub] spark issue #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Support cus...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20937 **[Test build #89045 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89045/testReport)** for PR 20937 at commit [`b817184`](https://github.com/apache/spark/commit/b817184d35d0e2589682f1dcd88b9f29b2063f5b).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #21004: [SPARK-23896][SQL]Improve PartitioningAwareFileIndex
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21004 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89044/ Test FAILed.
[GitHub] spark issue #20944: [SPARK-23831][SQL] Add org.apache.derby to IsolatedClien...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20944 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89043/ Test FAILed.
[GitHub] spark issue #20937: [SPARK-23094][SPARK-23723][SPARK-23724][SQL] Support cus...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20937 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89045/ Test FAILed.
[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20981 **[Test build #89042 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89042/testReport)** for PR 20981 at commit [`2eb2bf1`](https://github.com/apache/spark/commit/2eb2bf1853a0ba4de8f4a3adfe8407d04a075b22).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #20944: [SPARK-23831][SQL] Add org.apache.derby to IsolatedClien...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20944 **[Test build #89043 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89043/testReport)** for PR 20944 at commit [`1c801f1`](https://github.com/apache/spark/commit/1c801f1e673b3d6f9e94eeade08d5b309a105061).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.