[GitHub] spark issue #19147: [WIP][SPARK-21190][SQL][PYTHON] Vectorized UDFs in Pytho...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19147 **[Test build #81538 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81538/testReport)** for PR 19147 at commit [`2f929d8`](https://github.com/apache/spark/commit/2f929d8e0ec01ca7070fc0969e5091dad4ce8350). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19147: [WIP][SPARK-21190][SQL][PYTHON] Vectorized UDFs in Pytho...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19147 Jenkins, retest this please.
[GitHub] spark pull request #19158: [SPARK-21950][SQL][PYTHON][TEST] pyspark.sql.test...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19158
[GitHub] spark issue #19158: [SPARK-21950][SQL][PYTHON][TEST] pyspark.sql.tests.SQLTe...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19158 Thanks for reviewing! Merging to master/2.2/2.1/2.0.
[GitHub] spark issue #18875: [SPARK-21513][SQL] Allow UDF to_json support converting ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18875 **[Test build #81537 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81537/testReport)** for PR 18875 at commit [`36ce961`](https://github.com/apache/spark/commit/36ce9614c078c9c0aca62a672948d8581b43e2ea).
[GitHub] spark pull request #18875: [SPARK-21513][SQL] Allow UDF to_json support conv...
Github user goldmedal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18875#discussion_r137710147

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonGenerator.scala ---
    @@ -26,20 +26,50 @@ import org.apache.spark.sql.catalyst.expressions.SpecializedGetters
     import org.apache.spark.sql.catalyst.util.{ArrayData, DateTimeUtils, MapData}
     import org.apache.spark.sql.types._

    +// `JackGenerator` can only be initialized with a `StructType` or a `MapType`.
    +// Once it is initialized with `StructType`, it can be used to write out a struct or an array of
    +// struct. Once it is initialized with `MapType`, it can be used to write out a map. An exception
    +// will be thrown if trying to write out a struct if it is initialized with a `MapType`,
    +// and vice verse.
    --- End diff --

    ok. I'll modify it.
[GitHub] spark issue #19158: [SPARK-21950][SQL][PYTHON][TEST] pyspark.sql.tests.SQLTe...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19158 LGTM
[GitHub] spark pull request #19147: [WIP][SPARK-21190][SQL][PYTHON] Vectorized UDFs i...
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/19147#discussion_r137707828 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/VectorizedPythonRunner.scala --- @@ -0,0 +1,329 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.sql.execution.python + +import java.io.{BufferedInputStream, BufferedOutputStream, DataInputStream, DataOutputStream} +import java.net.Socket +import java.nio.charset.StandardCharsets + +import scala.collection.JavaConverters._ + +import org.apache.arrow.vector.VectorSchemaRoot +import org.apache.arrow.vector.stream.{ArrowStreamReader, ArrowStreamWriter} + +import org.apache.spark.{SparkEnv, SparkFiles, TaskContext} +import org.apache.spark.api.python.{ChainedPythonFunctions, PythonEvalType, PythonException, PythonRDD, SpecialLengths} +import org.apache.spark.internal.Logging +import org.apache.spark.sql.catalyst.InternalRow +import org.apache.spark.sql.execution.arrow.{ArrowUtils, ArrowWriter} +import org.apache.spark.sql.execution.vectorized.{ArrowColumnVector, ColumnarBatch, ColumnVector} +import org.apache.spark.sql.types._ +import org.apache.spark.util.Utils + +/** + * Similar to `PythonRunner`, but exchange data with Python worker via columnar format. + */ +class VectorizedPythonRunner( +funcs: Seq[ChainedPythonFunctions], +batchSize: Int, +bufferSize: Int, +reuse_worker: Boolean, +argOffsets: Array[Array[Int]]) extends Logging { + + require(funcs.length == argOffsets.length, "argOffsets should have the same length as funcs") + + // All the Python functions should have the same exec, version and envvars. + private val envVars = funcs.head.funcs.head.envVars + private val pythonExec = funcs.head.funcs.head.pythonExec + private val pythonVer = funcs.head.funcs.head.pythonVer + + // TODO: support accumulator in multiple UDF + private val accumulator = funcs.head.funcs.head.accumulator + + // todo: return column batch? + def compute( --- End diff -- Yes, it is a lot of duplicated code from `PythonRunner` that could be refactored. I'm guessing you did not use the existing code because of the Arrow stream format? 
While I would love to start using that in Spark, I think it would be better to do this at a later time, when the required code could be refactored and the Arrow stream format could replace the file format where we currently use it. Also, the good part about using the iterator-based file format is that each iteration allows Python to communicate back an error code and exit gracefully. In my own tests with the streaming format, if an error occurred after the stream had started, Spark could lock up in a waiting state. These are the reasons I did not use the streaming format in my implementation. Would this `VectorizedPythonRunner` be able to handle these types of errors?
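To make the error-handling concern above concrete, here is a minimal sketch of a length-prefixed batch protocol in which the writer can replace any batch with an error marker, so the reader stops cleanly instead of waiting for more data. Everything in it is an assumption for illustration: the object name `FramedBatchSketch`, the helper names, and the sentinel values `EndOfData` and `ErrorThrown` are hypothetical and are not taken from `PythonRunner` or from the Arrow stream or file formats.

```scala
import java.io.{DataInputStream, DataOutputStream}
import java.nio.charset.StandardCharsets

import scala.annotation.tailrec

object FramedBatchSketch {
  // Hypothetical sentinel values for this sketch only (not Spark's constants).
  final val EndOfData = -1
  final val ErrorThrown = -2

  /** Writer side: each batch is preceded by its length in bytes. */
  def writeBatch(out: DataOutputStream, payload: Array[Byte]): Unit = {
    out.writeInt(payload.length)
    out.write(payload)
  }

  /** Writer side: an error marker followed by a UTF-8 message takes the place of a batch. */
  def writeError(out: DataOutputStream, message: String): Unit = {
    val bytes = message.getBytes(StandardCharsets.UTF_8)
    out.writeInt(ErrorThrown)
    out.writeInt(bytes.length)
    out.write(bytes)
  }

  /** Writer side: signals that no more batches follow. */
  def writeEnd(out: DataOutputStream): Unit = out.writeInt(EndOfData)

  /** Reader side: returns every batch read, or the error message if the writer reported one. */
  def readAll(in: DataInputStream): Either[String, Vector[Array[Byte]]] = {
    @tailrec
    def loop(acc: Vector[Array[Byte]]): Either[String, Vector[Array[Byte]]] =
      in.readInt() match {
        case EndOfData => Right(acc)
        case ErrorThrown =>
          // The writer failed part-way through: read its message and stop.
          val msg = new Array[Byte](in.readInt())
          in.readFully(msg)
          Left(new String(msg, StandardCharsets.UTF_8))
        case length =>
          val batch = new Array[Byte](length)
          in.readFully(batch)
          loop(acc :+ batch)
      }
    loop(Vector.empty)
  }
}
```

Because the reader checks a marker before every batch, a failure on the writing side after the exchange has started still reaches the reader as a value rather than as a stalled read. Whether a continuous stream can offer an equivalent checkpoint is the open question raised in the comment.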
[GitHub] spark issue #18875: [SPARK-21513][SQL] Allow UDF to_json support converting ...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18875 We should add a test suite for `JacksonGenerator`.
[GitHub] spark pull request #18875: [SPARK-21513][SQL] Allow UDF to_json support conv...
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18875#discussion_r137706345

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonGenerator.scala ---
    @@ -26,20 +26,50 @@ import org.apache.spark.sql.catalyst.expressions.SpecializedGetters
     import org.apache.spark.sql.catalyst.util.{ArrayData, DateTimeUtils, MapData}
     import org.apache.spark.sql.types._

    +// `JackGenerator` can only be initialized with a `StructType` or a `MapType`.
    +// Once it is initialized with `StructType`, it can be used to write out a struct or an array of
    +// struct. Once it is initialized with `MapType`, it can be used to write out a map. An exception
    +// will be thrown if trying to write out a struct if it is initialized with a `MapType`,
    +// and vice verse.
    --- End diff --

    For this kind of comment, we use a style like:

        /**
         * Code comments...
         *
         */
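As an illustration of the comment style requested above, the sketch below attaches a Scaladoc-style block comment to a small stand-in class that enforces the rule the comment describes. It is not the real `JacksonGenerator`: the names `ScaladocStyleSketch`, `SketchGenerator`, `Kind`, `StructKind`, `MapKind` and `OtherKind` are hypothetical, chosen only so that the example compiles without Spark on the classpath.

```scala
object ScaladocStyleSketch {
  // Hypothetical stand-ins for the schema types a generator could be initialized with.
  sealed trait Kind
  case object StructKind extends Kind
  case object MapKind extends Kind
  case object OtherKind extends Kind

  /**
   * `SketchGenerator` can only be initialized with a `StructKind` or a `MapKind`.
   *
   * Once it is initialized with `StructKind`, it can be used to write out a struct.
   * Once it is initialized with `MapKind`, it can be used to write out a map.
   * An exception is thrown when writing a struct after initialization with `MapKind`,
   * and vice versa.
   */
  class SketchGenerator(kind: Kind) {
    // Reject any other kind at construction time.
    require(kind == StructKind || kind == MapKind,
      s"expected StructKind or MapKind, got $kind")

    def writeStruct(): String = kind match {
      case StructKind => "struct"
      case other =>
        throw new UnsupportedOperationException(
          s"cannot write a struct when initialized with $other")
    }

    def writeMap(): String = kind match {
      case MapKind => "map"
      case other =>
        throw new UnsupportedOperationException(
          s"cannot write a map when initialized with $other")
    }
  }
}
```

A block comment placed directly above the class is picked up by Scaladoc, whereas a run of `//` line comments above the imports is not attached to any definition.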
[GitHub] spark pull request #19148: [SPARK-21936][SQL] backward compatibility test fr...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19148#discussion_r137706271 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala --- @@ -0,0 +1,194 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import java.io.File +import java.nio.file.Files + +import org.apache.spark.TestUtils +import org.apache.spark.sql.{QueryTest, Row, SparkSession} +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTableType +import org.apache.spark.sql.test.SQLTestUtils +import org.apache.spark.util.Utils + +/** + * Test HiveExternalCatalog backward compatibility. + * + * Note that, this test suite will automatically download spark binary packages of different + * versions to a local directory `/tmp/spark-test`. If there is already a spark folder with + * expected version under this local directory, e.g. `/tmp/spark-test/spark-2.0.3`, we will skip the + * downloading for this spark version. 
+ */ +class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils { + private val wareHousePath = Utils.createTempDir(namePrefix = "warehouse") + private val tmpDataDir = Utils.createTempDir(namePrefix = "test-data") + private val sparkTestingDir = "/tmp/spark-test" + private val unusedJar = TestUtils.createJarWithClasses(Seq.empty) + + override def afterAll(): Unit = { +Utils.deleteRecursively(wareHousePath) +Utils.deleteRecursively(tmpDataDir) +super.afterAll() + } + + private def downloadSpark(version: String): Unit = { +import scala.sys.process._ + +val url = s"https://d3kbcqa49mib13.cloudfront.net/spark-$version-bin-hadoop2.7.tgz; + +Seq("wget", url, "-q", "-P", sparkTestingDir).! + +val downloaded = new File(sparkTestingDir, s"spark-$version-bin-hadoop2.7.tgz").getCanonicalPath +val targetDir = new File(sparkTestingDir, s"spark-$version").getCanonicalPath + +Seq("mkdir", targetDir).! + +Seq("tar", "-xzf", downloaded, "-C", targetDir, "--strip-components=1").! + +Seq("rm", downloaded).! 
+ } + + private def genDataDir(name: String): String = { +new File(tmpDataDir, name).getCanonicalPath + } + + override def beforeAll(): Unit = { +super.beforeAll() + +val tempPyFile = File.createTempFile("test", ".py") +Files.write(tempPyFile.toPath, + s""" +|from pyspark.sql import SparkSession +| +|spark = SparkSession.builder.enableHiveSupport().getOrCreate() +|version_index = spark.conf.get("spark.sql.test.version.index", None) +| +|spark.sql("create table data_source_tbl_{} using json as select 1 i".format(version_index)) +| +|spark.sql("create table hive_compatible_data_source_tbl_" + version_index + \\ +| " using parquet as select 1 i") +| +|json_file = "${genDataDir("json_")}" + str(version_index) +|spark.range(1, 2).selectExpr("cast(id as int) as i").write.json(json_file) +|spark.sql("create table external_data_source_tbl_" + version_index + \\ +| "(i int) using json options (path '{}')".format(json_file)) +| +|parquet_file = "${genDataDir("parquet_")}" + str(version_index) +|spark.range(1, 2).selectExpr("cast(id as int) as i").write.parquet(parquet_file) +|spark.sql("create table hive_compatible_external_data_source_tbl_" + version_index + \\ +| "(i int) using parquet options (path '{}')".format(parquet_file)) +| +|json_file2 = "${genDataDir("json2_")}" + str(version_index) +|spark.range(1, 2).selectExpr("cast(id as int) as i").write.json(json_file2) +|spark.sql("create table
[GitHub] spark issue #18266: [SPARK-20427][SQL] Read JDBC table use custom schema
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18266 **[Test build #81536 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81536/testReport)** for PR 18266 at commit [`b38a1a8`](https://github.com/apache/spark/commit/b38a1a8b2d9ffee250b9e8637dc579f2a8f9182d).
[GitHub] spark pull request #19148: [SPARK-21936][SQL] backward compatibility test fr...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19148#discussion_r137704899 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala --- @@ -0,0 +1,194 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import java.io.File +import java.nio.file.Files + +import org.apache.spark.TestUtils +import org.apache.spark.sql.{QueryTest, Row, SparkSession} +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTableType +import org.apache.spark.sql.test.SQLTestUtils +import org.apache.spark.util.Utils + +/** + * Test HiveExternalCatalog backward compatibility. + * + * Note that, this test suite will automatically download spark binary packages of different + * versions to a local directory `/tmp/spark-test`. If there is already a spark folder with + * expected version under this local directory, e.g. `/tmp/spark-test/spark-2.0.3`, we will skip the + * downloading for this spark version. 
+ */ +class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils { + private val wareHousePath = Utils.createTempDir(namePrefix = "warehouse") + private val tmpDataDir = Utils.createTempDir(namePrefix = "test-data") + private val sparkTestingDir = "/tmp/spark-test" + private val unusedJar = TestUtils.createJarWithClasses(Seq.empty) + + override def afterAll(): Unit = { +Utils.deleteRecursively(wareHousePath) +Utils.deleteRecursively(tmpDataDir) +super.afterAll() + } + + private def downloadSpark(version: String): Unit = { +import scala.sys.process._ + +val url = s"https://d3kbcqa49mib13.cloudfront.net/spark-$version-bin-hadoop2.7.tgz; + +Seq("wget", url, "-q", "-P", sparkTestingDir).! + +val downloaded = new File(sparkTestingDir, s"spark-$version-bin-hadoop2.7.tgz").getCanonicalPath +val targetDir = new File(sparkTestingDir, s"spark-$version").getCanonicalPath + +Seq("mkdir", targetDir).! + +Seq("tar", "-xzf", downloaded, "-C", targetDir, "--strip-components=1").! + +Seq("rm", downloaded).! 
+ } + + private def genDataDir(name: String): String = { +new File(tmpDataDir, name).getCanonicalPath + } + + override def beforeAll(): Unit = { +super.beforeAll() + +val tempPyFile = File.createTempFile("test", ".py") +Files.write(tempPyFile.toPath, + s""" +|from pyspark.sql import SparkSession +| +|spark = SparkSession.builder.enableHiveSupport().getOrCreate() +|version_index = spark.conf.get("spark.sql.test.version.index", None) +| +|spark.sql("create table data_source_tbl_{} using json as select 1 i".format(version_index)) +| +|spark.sql("create table hive_compatible_data_source_tbl_" + version_index + \\ +| " using parquet as select 1 i") +| +|json_file = "${genDataDir("json_")}" + str(version_index) +|spark.range(1, 2).selectExpr("cast(id as int) as i").write.json(json_file) +|spark.sql("create table external_data_source_tbl_" + version_index + \\ +| "(i int) using json options (path '{}')".format(json_file)) +| +|parquet_file = "${genDataDir("parquet_")}" + str(version_index) +|spark.range(1, 2).selectExpr("cast(id as int) as i").write.parquet(parquet_file) +|spark.sql("create table hive_compatible_external_data_source_tbl_" + version_index + \\ +| "(i int) using parquet options (path '{}')".format(parquet_file)) +| +|json_file2 = "${genDataDir("json2_")}" + str(version_index) +|spark.range(1, 2).selectExpr("cast(id as int) as i").write.json(json_file2) +|spark.sql("create table
[GitHub] spark pull request #19148: [SPARK-21936][SQL] backward compatibility test fr...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19148#discussion_r137704429 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala --- @@ -0,0 +1,194 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import java.io.File +import java.nio.file.Files + +import org.apache.spark.TestUtils +import org.apache.spark.sql.{QueryTest, Row, SparkSession} +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTableType +import org.apache.spark.sql.test.SQLTestUtils +import org.apache.spark.util.Utils + +/** + * Test HiveExternalCatalog backward compatibility. + * + * Note that, this test suite will automatically download spark binary packages of different + * versions to a local directory `/tmp/spark-test`. If there is already a spark folder with + * expected version under this local directory, e.g. `/tmp/spark-test/spark-2.0.3`, we will skip the + * downloading for this spark version. 
+ */ +class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils { + private val wareHousePath = Utils.createTempDir(namePrefix = "warehouse") + private val tmpDataDir = Utils.createTempDir(namePrefix = "test-data") + private val sparkTestingDir = "/tmp/spark-test" + private val unusedJar = TestUtils.createJarWithClasses(Seq.empty) + + override def afterAll(): Unit = { +Utils.deleteRecursively(wareHousePath) +Utils.deleteRecursively(tmpDataDir) +super.afterAll() + } + + private def downloadSpark(version: String): Unit = { +import scala.sys.process._ + +val url = s"https://d3kbcqa49mib13.cloudfront.net/spark-$version-bin-hadoop2.7.tgz; + +Seq("wget", url, "-q", "-P", sparkTestingDir).! + +val downloaded = new File(sparkTestingDir, s"spark-$version-bin-hadoop2.7.tgz").getCanonicalPath +val targetDir = new File(sparkTestingDir, s"spark-$version").getCanonicalPath + +Seq("mkdir", targetDir).! + +Seq("tar", "-xzf", downloaded, "-C", targetDir, "--strip-components=1").! + +Seq("rm", downloaded).! + } + + private def genDataDir(name: String): String = { +new File(tmpDataDir, name).getCanonicalPath + } + + override def beforeAll(): Unit = { +super.beforeAll() + +val tempPyFile = File.createTempFile("test", ".py") +Files.write(tempPyFile.toPath, + s""" +|from pyspark.sql import SparkSession +| +|spark = SparkSession.builder.enableHiveSupport().getOrCreate() +|version_index = spark.conf.get("spark.sql.test.version.index", None) +| +|spark.sql("create table data_source_tbl_{} using json as select 1 i".format(version_index)) --- End diff -- Instead of only using lowercase column name, should we use mix-case Hive schema for those tables? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19155: [SPARK-21949][TEST] Tables created in unit tests should ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19155 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81533/ Test PASSed.
[GitHub] spark issue #19155: [SPARK-21949][TEST] Tables created in unit tests should ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19155 Merged build finished. Test PASSed.
[GitHub] spark issue #19155: [SPARK-21949][TEST] Tables created in unit tests should ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19155 **[Test build #81533 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81533/testReport)** for PR 19155 at commit [`1d38337`](https://github.com/apache/spark/commit/1d38337b22ea8926aeb1db0591285fbb34f902cc). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19148: [SPARK-21936][SQL] backward compatibility test framework...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19148 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81532/ Test PASSed.
[GitHub] spark issue #19148: [SPARK-21936][SQL] backward compatibility test framework...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19148 Merged build finished. Test PASSed.
[GitHub] spark issue #19148: [SPARK-21936][SQL] backward compatibility test framework...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19148 **[Test build #81532 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81532/testReport)** for PR 19148 at commit [`00cdd0a`](https://github.com/apache/spark/commit/00cdd0a63bdd4f531eb06de8d9651e934f2bb448). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #19148: [SPARK-21936][SQL] backward compatibility test fr...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19148#discussion_r137703092 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala --- @@ -0,0 +1,194 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import java.io.File +import java.nio.file.Files + +import org.apache.spark.TestUtils +import org.apache.spark.sql.{QueryTest, Row, SparkSession} +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTableType +import org.apache.spark.sql.test.SQLTestUtils +import org.apache.spark.util.Utils + +/** + * Test HiveExternalCatalog backward compatibility. + * + * Note that, this test suite will automatically download spark binary packages of different + * versions to a local directory `/tmp/spark-test`. If there is already a spark folder with + * expected version under this local directory, e.g. `/tmp/spark-test/spark-2.0.3`, we will skip the + * downloading for this spark version. + */ +class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils { --- End diff -- Ok. After a build clean it works now. 
[GitHub] spark issue #19158: [SPARK-21950][SQL][PYTHON][TEST] pyspark.sql.tests.SQLTe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19158 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81534/ Test PASSed.
[GitHub] spark issue #19158: [SPARK-21950][SQL][PYTHON][TEST] pyspark.sql.tests.SQLTe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19158 Merged build finished. Test PASSed.
[GitHub] spark issue #19158: [SPARK-21950][SQL][PYTHON][TEST] pyspark.sql.tests.SQLTe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19158 **[Test build #81534 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81534/testReport)** for PR 19158 at commit [`134bc26`](https://github.com/apache/spark/commit/134bc267a5ef01d9dea3d08cc255facdd8dfc0c8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18956: [SPARK-21726][SQL] Check for structural integrity of the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18956 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81531/ Test PASSed.
[GitHub] spark issue #18956: [SPARK-21726][SQL] Check for structural integrity of the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18956 Merged build finished. Test PASSed.
[GitHub] spark issue #18956: [SPARK-21726][SQL] Check for structural integrity of the...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18956 **[Test build #81531 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81531/testReport)** for PR 18956 at commit [`ecdfb7d`](https://github.com/apache/spark/commit/ecdfb7db34d0d01e357bff0d32b62137ef0ae735). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #19148: [SPARK-21936][SQL] backward compatibility test fr...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19148#discussion_r137700913 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala --- @@ -0,0 +1,194 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import java.io.File +import java.nio.file.Files + +import org.apache.spark.TestUtils +import org.apache.spark.sql.{QueryTest, Row, SparkSession} +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTableType +import org.apache.spark.sql.test.SQLTestUtils +import org.apache.spark.util.Utils + +/** + * Test HiveExternalCatalog backward compatibility. + * + * Note that, this test suite will automatically download spark binary packages of different + * versions to a local directory `/tmp/spark-test`. If there is already a spark folder with + * expected version under this local directory, e.g. `/tmp/spark-test/spark-2.0.3`, we will skip the + * downloading for this spark version. + */ +class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils { --- End diff -- Let me do build clean and try again. 
[GitHub] spark issue #19148: [SPARK-21936][SQL] backward compatibility test framework...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19148 **[Test build #81535 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81535/testReport)** for PR 19148 at commit [`62369e3`](https://github.com/apache/spark/commit/62369e3a07bc23d68068e809edf1c43de448740a). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19148: [SPARK-21936][SQL] backward compatibility test fr...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19148#discussion_r137700499 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala --- @@ -0,0 +1,194 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import java.io.File +import java.nio.file.Files + +import org.apache.spark.TestUtils +import org.apache.spark.sql.{QueryTest, Row, SparkSession} +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTableType +import org.apache.spark.sql.test.SQLTestUtils +import org.apache.spark.util.Utils + +/** + * Test HiveExternalCatalog backward compatibility. + * + * Note that, this test suite will automatically download spark binary packages of different + * versions to a local directory `/tmp/spark-test`. If there is already a spark folder with + * expected version under this local directory, e.g. `/tmp/spark-test/spark-2.0.3`, we will skip the + * downloading for this spark version. + */ +class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils { --- End diff -- Did you try a clean clone? 
I added the derby dependency to make the test work on jenkins... --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19107: [SPARK-21799][ML] Fix `KMeans` performance regression ca...
Github user smurching commented on the issue: https://github.com/apache/spark/pull/19107 Sorry for the delay, this looks good to me -- thanks @WeichenXu123! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19148: [SPARK-21936][SQL] backward compatibility test fr...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19148#discussion_r137699853 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala --- @@ -0,0 +1,194 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import java.io.File +import java.nio.file.Files + +import org.apache.spark.TestUtils +import org.apache.spark.sql.{QueryTest, Row, SparkSession} +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTableType +import org.apache.spark.sql.test.SQLTestUtils +import org.apache.spark.util.Utils + +/** + * Test HiveExternalCatalog backward compatibility. + * + * Note that, this test suite will automatically download spark binary packages of different + * versions to a local directory `/tmp/spark-test`. If there is already a spark folder with + * expected version under this local directory, e.g. `/tmp/spark-test/spark-2.0.3`, we will skip the + * downloading for this spark version. + */ +class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils { --- End diff -- After removing the added derby dependency, this test can work. 
[GitHub] spark pull request #19148: [SPARK-21936][SQL] backward compatibility test fr...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19148#discussion_r137699802 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/SparkSubmitTestUtils.scala --- @@ -0,0 +1,101 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import java.io.File +import java.sql.Timestamp +import java.util.Date + +import scala.collection.mutable.ArrayBuffer + +import org.scalatest.concurrent.Timeouts +import org.scalatest.exceptions.TestFailedDueToTimeoutException +import org.scalatest.time.SpanSugar._ + +import org.apache.spark.SparkFunSuite +import org.apache.spark.sql.test.ProcessTestUtils.ProcessOutputCapturer +import org.apache.spark.util.Utils + +trait SparkSubmitTestUtils extends SparkFunSuite with Timeouts { --- End diff -- nit. Let's use `TimeLimits` instead of `Timeouts`. `Timeouts` is deprecated now. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19158: [SPARK-21950][SQL][PYTHON][TEST] pyspark.sql.tests.SQLTe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19158 **[Test build #81534 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81534/testReport)** for PR 19158 at commit [`134bc26`](https://github.com/apache/spark/commit/134bc267a5ef01d9dea3d08cc255facdd8dfc0c8). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19148: [SPARK-21936][SQL] backward compatibility test fr...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19148#discussion_r137699720 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala --- @@ -0,0 +1,194 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import java.io.File +import java.nio.file.Files + +import org.apache.spark.TestUtils +import org.apache.spark.sql.{QueryTest, Row, SparkSession} +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTableType +import org.apache.spark.sql.test.SQLTestUtils +import org.apache.spark.util.Utils + +/** + * Test HiveExternalCatalog backward compatibility. + * + * Note that, this test suite will automatically download spark binary packages of different + * versions to a local directory `/tmp/spark-test`. If there is already a spark folder with + * expected version under this local directory, e.g. `/tmp/spark-test/spark-2.0.3`, we will skip the + * downloading for this spark version. 
+ */ +class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils { --- End diff -- Can you print `org.apache.derby.tools.sysinfo.getVersionString` in `IsolatedClientLoader.createClient` to see what your actual derby version is? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19147: [WIP][SPARK-21190][SQL][PYTHON] Vectorized UDFs in Pytho...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19147 The test failure above should be fixed by #19158. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19148: [SPARK-21936][SQL] backward compatibility test fr...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19148#discussion_r137699367 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala --- @@ -0,0 +1,194 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import java.io.File +import java.nio.file.Files + +import org.apache.spark.TestUtils +import org.apache.spark.sql.{QueryTest, Row, SparkSession} +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTableType +import org.apache.spark.sql.test.SQLTestUtils +import org.apache.spark.util.Utils + +/** + * Test HiveExternalCatalog backward compatibility. + * + * Note that, this test suite will automatically download spark binary packages of different + * versions to a local directory `/tmp/spark-test`. If there is already a spark folder with + * expected version under this local directory, e.g. `/tmp/spark-test/spark-2.0.3`, we will skip the + * downloading for this spark version. 
+ */ +class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils { --- End diff -- I ran this test locally and encountered a failure like this: 2017-09-07 19:28:07.595 - stderr> Caused by: java.sql.SQLException: Database at /root/repos/spark-1/target/tmp/warehouse-66dad501-c743-4ac3-83cc-51451c6d697a/metastore_db has an incompatible format with the current version of the software. The database was created by or upgraded by version 10.12. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19158: [SPARK-21950][SQL][PYTHON][TEST] pyspark.sql.test...
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/19158 [SPARK-21950][SQL][PYTHON][TEST] pyspark.sql.tests.SQLTests2 should stop SparkContext. ## What changes were proposed in this pull request? `pyspark.sql.tests.SQLTests2` doesn't stop newly created spark context in the test and it might affect the following tests. This pr makes `pyspark.sql.tests.SQLTests2` stop `SparkContext`. ## How was this patch tested? Existing tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ueshin/apache-spark issues/SPARK-21950 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/19158.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #19158 commit 134bc267a5ef01d9dea3d08cc255facdd8dfc0c8 Author: Takuya UESHIN Date: 2017-09-08T02:34:41Z Make pyspark.sql.tests.SQLTests2 stop SparkContext. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
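The cleanup pattern described in this pull request can be sketched without Spark. What follows is only an illustration under stated assumptions: `FakeContext` is a hypothetical stand-in for `SparkContext`, the test names are invented, and the actual patch may structure its cleanup differently. It shows a test case stopping the context it created so that the following tests are not affected.

```python
import io
import unittest


class FakeContext:
    """Hypothetical stand-in for SparkContext; allows one active instance."""

    active = None

    def __init__(self):
        if FakeContext.active is not None:
            raise RuntimeError("another context is still running")
        FakeContext.active = self
        self.stopped = False

    def stop(self):
        self.stopped = True
        FakeContext.active = None


class ContextOwningTest(unittest.TestCase):
    def setUp(self):
        # The test creates its own context ...
        self.sc = FakeContext()

    def tearDown(self):
        # ... so it also stops it; otherwise the next test inherits a live one.
        self.sc.stop()

    def test_first(self):
        self.assertFalse(self.sc.stopped)

    def test_second(self):
        # setUp would have raised if test_first had leaked its context.
        self.assertIs(FakeContext.active, self.sc)


def run_suite():
    """Run both tests quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ContextOwningTest)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)
```

If `tearDown` were removed, the second test would fail in `setUp`, which is the kind of cross-test interference the description warns about.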
[GitHub] spark pull request #19136: [DO NOT MERGE][SPARK-15689][SQL] data source v2
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19136#discussion_r137699153 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/upward/StatisticsSupport.java --- @@ -0,0 +1,26 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.sources.v2.reader.upward; + +/** + * A mix in interface for `DataSourceV2Reader`. Users can implement this interface to report + * statistics to Spark. + */ +public interface StatisticsSupport { --- End diff -- I'd like to put column stats in a separated interface, because we already separate basic stats and column stats in `ANALYZE TABLE`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
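The separation suggested in this comment can be pictured as two independent mix-ins. The sketch below is written in Python rather than Java and rests on assumptions: only the name `StatisticsSupport` comes from the diff, while `ColumnStatisticsSupport`, the method names, and the reader classes are hypothetical and not part of the proposed API.

```python
from abc import ABC, abstractmethod


class StatisticsSupport(ABC):
    """Basic, table-level statistics (name taken from the diff)."""

    @abstractmethod
    def size_in_bytes(self) -> int: ...


class ColumnStatisticsSupport(ABC):
    """Hypothetical separate mix-in for per-column statistics."""

    @abstractmethod
    def distinct_count(self, column: str) -> int: ...


class BasicReader(StatisticsSupport):
    """A reader that only reports basic statistics."""

    def size_in_bytes(self) -> int:
        return 1024


class RichReader(StatisticsSupport, ColumnStatisticsSupport):
    """A reader that opts into both kinds of statistics."""

    def size_in_bytes(self) -> int:
        return 2048

    def distinct_count(self, column: str) -> int:
        return {"i": 10, "j": 1}[column]


def supported_stats(reader) -> list:
    """Report which kinds of statistics a reader opts into."""
    kinds = []
    if isinstance(reader, StatisticsSupport):
        kinds.append("basic")
    if isinstance(reader, ColumnStatisticsSupport):
        kinds.append("column")
    return kinds
```

With two mix-ins, a data source that can only report a total size is not forced to stub out column-level methods.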
[GitHub] spark pull request #19136: [DO NOT MERGE][SPARK-15689][SQL] data source v2
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19136#discussion_r137698996 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala --- @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.sql.execution.datasources.v2 + +import org.apache.spark.sql.catalyst.expressions.AttributeReference +import org.apache.spark.sql.catalyst.plans.logical.{LeafNode, Statistics} +import org.apache.spark.sql.sources.v2.reader.DataSourceV2Reader +import org.apache.spark.sql.sources.v2.reader.upward.StatisticsSupport + +case class DataSourceV2Relation( +output: Seq[AttributeReference], +reader: DataSourceV2Reader) extends LeafNode { + + override def computeStats(): Statistics = reader match { +case r: StatisticsSupport => Statistics(sizeInBytes = r.getStatistics.sizeInBytes()) +case _ => Statistics(sizeInBytes = conf.defaultSizeInBytes) + } +} + +object DataSourceV2Relation { + def apply(reader: DataSourceV2Reader): DataSourceV2Relation = { +new DataSourceV2Relation(reader.readSchema().toAttributes, reader) --- End diff -- In data source V2, we will delegate partition pruning to the data source, although we need to do some refactoring to make it happen. > I was just looking into how the data source should provide partition data, or at least fields that are the same for all rows in a `ReadTask`. It would be nice to have a way to pass those up instead of materializing them in each `UnsafeRow`. This can be achieved by the columnar reader. Think about a data source having a data column `i` and a partition column `j`, the returned columnar batch has 2 column vectors for `i` and `j`. Column vector `i` is a normal one that contains all the values of column `i` within this batch, column vector `j` is a constant vector that only contains a single value. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
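The constant-vector idea in this reply can be sketched in a few lines. This is a language-neutral illustration in Python, not Spark's columnar classes: `ArrayColumnVector`, `ConstantColumnVector`, and `rows` are hypothetical names, and the partition value is made up.

```python
class ArrayColumnVector:
    """Holds one value per row of the batch."""

    def __init__(self, values):
        self._values = list(values)

    def get(self, row_id):
        return self._values[row_id]


class ConstantColumnVector:
    """Holds a single value that is returned for every row."""

    def __init__(self, value):
        self._value = value

    def get(self, row_id):
        # row_id is ignored: the value is the same across the whole batch.
        return self._value


def rows(batch, num_rows):
    """Build row tuples on demand; batch maps column name -> vector."""
    names = list(batch)
    return [tuple(batch[n].get(r) for n in names) for r in range(num_rows)]


batch = {
    "i": ArrayColumnVector([10, 20, 30]),      # data column, one value per row
    "j": ConstantColumnVector("2017-09-08"),   # partition column, stored once
}
```

Calling `rows(batch, 3)` yields three tuples that share the same `j` value, while the batch itself stores that value only once instead of copying it into every row.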
[GitHub] spark issue #19107: [SPARK-21799][ML] Fix `KMeans` performance regression ca...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19107 cc @smurching Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19132: [SPARK-21922] Fix duration always updating when task fai...
Github user caneGuy commented on the issue: https://github.com/apache/spark/pull/19132 @vanzin @zsxwing could you help review this? Thanks. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #18956: [SPARK-21726][SQL] Check for structural integrity of the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18956 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #18956: [SPARK-21726][SQL] Check for structural integrity of the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18956 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81529/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19155: [SPARK-21949][TEST] Tables created in unit tests should ...
Github user 10110346 commented on the issue: https://github.com/apache/spark/pull/19155 @dongjoon-hyun thanks, I have created a JIRA issue. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #18956: [SPARK-21726][SQL] Check for structural integrity of the...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18956 **[Test build #81529 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81529/testReport)** for PR 18956 at commit [`d1db7cf`](https://github.com/apache/spark/commit/d1db7cf815d447b195c907fb159ed0a6770c537b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19155: [MINOR][TEST] Tables created in unit tests should be dro...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19155 **[Test build #81533 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81533/testReport)** for PR 19155 at commit [`1d38337`](https://github.com/apache/spark/commit/1d38337b22ea8926aeb1db0591285fbb34f902cc). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19157: [SPARK-20589][Core][Scheduler] Allow limiting task concu...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19157 @dhruve, FYI, AppVeyor CI runs SparkR tests on Windows only when there are changes in R-related code: https://github.com/apache/spark/blob/75a6d05853fea13f88e3c941b1959b24e4640824/appveyor.yml#L29-L34 The thing is, when `git merge` is performed (not `rebase`), as in https://github.com/apache/spark/commit/8b3830004d69bd5f109fd9846f59583c23a910c7, the merge commit usually includes some changes in R and so the CI is triggered, which is actually quite moderate. So I think we should generally rebase when there are conflicts. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19144: [UI][Streaming]Modify the title, 'Records' instead of 'I...
Github user guoxiaolongzte commented on the issue: https://github.com/apache/spark/pull/19144 @zsxwing Could you help review the code? Thanks. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19150: [SPARK-21939][TEST] Use TimeLimits instead of Tim...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19150 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19150 Thank you for reviewing and merging, @jerryshao! Also, thank you for reviewing and approving, @HyukjinKwon and @srowen. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19150: [SPARK-21939][TEST] Use TimeLimits instead of Timeouts
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19150 Merging to master, thanks @dongjoon-hyun . --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19149: [SPARK-21652][SQL][FOLLOW-UP] Fix rule conflict between ...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19149 Apart from that, the isolation of `InferFiltersFromConstraints` looks good to me. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19149: [SPARK-21652][SQL][FOLLOW-UP] Fix rule conflict between ...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19149 Hi, @gatorsmile. According to the PR description, this is about `PruneFilters`. Do we need a test case, since SPARK-21652 is about `ConstantPropagation`, not `PruneFilters`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #18029: [SPARK-20168] [DStream] Add changes to use kinesis fetch...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18029 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81530/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #18029: [SPARK-20168] [DStream] Add changes to use kinesis fetch...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18029 **[Test build #81530 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81530/testReport)** for PR 18029 at commit [`cef5cde`](https://github.com/apache/spark/commit/cef5cdece2bd2a7c95e19493c511d602c1b46461). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `public class KinesisInitialPosition ` * `sealed trait InitialPosition ` * `case class AtTimestamp(timestamp: Date) extends InitialPosition ` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #18029: [SPARK-20168] [DStream] Add changes to use kinesis fetch...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18029 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19148: [SPARK-21936][SQL] backward compatibility test framework...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19148 **[Test build #81532 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81532/testReport)** for PR 19148 at commit [`00cdd0a`](https://github.com/apache/spark/commit/00cdd0a63bdd4f531eb06de8d9651e934f2bb448). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19148: [SPARK-21936][SQL] backward compatibility test fr...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19148#discussion_r137686311 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala --- @@ -0,0 +1,193 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import java.io.File +import java.nio.file.Files + +import org.apache.spark.TestUtils +import org.apache.spark.sql.{QueryTest, Row, SparkSession} +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTableType +import org.apache.spark.sql.test.SQLTestUtils +import org.apache.spark.util.Utils + +/** + * Test HiveExternalCatalog backward compatibility. + * + * Note that, this test suite will automatically download spark binary packages of different + * versions to a local directory `/tmp/spark-test`. If there is already a spark folder with + * expected version under this local directory, e.g. `/tmp/spark-test/spark-2.0.3`, we will skip the + * downloading for this spark version. 
+ */ +class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils { + private val wareHousePath = Utils.createTempDir(namePrefix = "warehouse") + private val tmpDataDir = Utils.createTempDir(namePrefix = "test-data") + private val sparkTestingDir = "/tmp/spark-test" + private val unusedJar = TestUtils.createJarWithClasses(Seq.empty) + + override def afterAll(): Unit = { +Utils.deleteRecursively(wareHousePath) --- End diff -- I wanna keep the `sparkTestingDir`, so we don't need to download spark again if this jenkins machine has already run this suite before. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
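The reasoning for keeping `sparkTestingDir` amounts to a download-if-missing check. The sketch below shows that check in Python under assumptions: `ensure_spark_dist` and `make_recording_download` are hypothetical helpers, the download is faked by creating a folder, and the suite's real logic in Scala may differ. The version string `2.0.3` simply mirrors the example path in the suite's doc comment.

```python
import os
import tempfile


def ensure_spark_dist(testing_dir, version, download):
    """Return the folder for `version`, calling `download` only if it is missing."""
    target = os.path.join(testing_dir, "spark-" + version)
    if not os.path.isdir(target):
        os.makedirs(testing_dir, exist_ok=True)
        download(version, target)
    return target


def make_recording_download():
    """Build a fake downloader plus the list of versions it was asked to fetch."""
    calls = []

    def download(version, target):
        # Stand-in for fetching and unpacking a release: just create the folder.
        os.makedirs(target)
        calls.append(version)

    return download, calls


# A temp dir plays the role of the long-lived testing directory.
testing_dir = tempfile.mkdtemp(prefix="spark-test-")
download, calls = make_recording_download()
first = ensure_spark_dist(testing_dir, "2.0.3", download)
second = ensure_spark_dist(testing_dir, "2.0.3", download)  # folder exists, skipped
```

Because the directory survives `afterAll`, a later run on the same machine takes the second path and skips the download.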
[GitHub] spark issue #18956: [SPARK-21726][SQL] Check for structural integrity of the...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18956 **[Test build #81531 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81531/testReport)** for PR 18956 at commit [`ecdfb7d`](https://github.com/apache/spark/commit/ecdfb7db34d0d01e357bff0d32b62137ef0ae735). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17435#discussion_r137685731 --- Diff: python/pyspark/sql/types.py --- @@ -438,6 +438,11 @@ def toInternal(self, obj): def fromInternal(self, obj): return self.dataType.fromInternal(obj) +def typeName(self): +raise TypeError( +"StructField does not have typename. \ --- End diff -- Little nit: looks like a typo, typename -> typeName. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17435#discussion_r137685629 --- Diff: python/pyspark/sql/types.py --- @@ -438,6 +438,11 @@ def toInternal(self, obj): def fromInternal(self, obj): return self.dataType.fromInternal(obj) +def typeName(self): +raise TypeError( +"StructField does not have typename. \ +You can use self.dataType.simpleString() instead.") --- End diff -- I'd remove `self` here and just say something like ` use typeName() on its type explicitly ...`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
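The two review comments on this diff (spell it `typeName`, and point the user at the field's type rather than at `self`) can be illustrated with a small self-contained sketch. The classes below are simplified stand-ins, not the real `pyspark.sql.types` classes, and the error wording is only one possible phrasing of the suggestion.

```python
class DataType:
    """Simplified stand-in for a PySpark data type; not the real class."""

    @classmethod
    def typeName(cls):
        # Convention assumed for this sketch: class name without the
        # "Type" suffix, lowercased (StringType -> "string").
        return cls.__name__[:-4].lower()


class StringType(DataType):
    pass


class StructField(DataType):
    """Stand-in field: a name plus the data type it wraps."""

    def __init__(self, name, dataType):
        self.name = name
        self.dataType = dataType

    def typeName(self):
        # A field is not itself a type, so direct callers to the wrapped type.
        raise TypeError(
            "StructField does not have typeName. "
            "Use typeName on its type explicitly instead.")


field = StructField("name", StringType())
```

Here `field.dataType.typeName()` returns the wrapped type's name, while `field.typeName()` raises a `TypeError` whose message tells the caller where to look.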
[GitHub] spark issue #18029: [SPARK-20168] [DStream] Add changes to use kinesis fetch...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18029 **[Test build #81530 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81530/testReport)** for PR 18029 at commit [`cef5cde`](https://github.com/apache/spark/commit/cef5cdece2bd2a7c95e19493c511d602c1b46461). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #18029: [SPARK-20168] [DStream] Add changes to use kinesi...
Github user yssharma commented on a diff in the pull request: https://github.com/apache/spark/pull/18029#discussion_r137684968 --- Diff: external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/InitialPosition.scala --- @@ -0,0 +1,104 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.streaming.kinesis + +import java.util.Date + +import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream + +/** + * Trait for Kinesis's InitialPositionInStream. + * This will be overridden by more specific types. + */ +sealed trait InitialPosition { + val initialPositionInStream: InitialPositionInStream +} + +/** + * Case object for Kinesis's InitialPositionInStream.LATEST. + */ +case object Latest extends InitialPosition { + val instance: InitialPosition = this + override val initialPositionInStream: InitialPositionInStream += InitialPositionInStream.LATEST +} + +/** + * Case object for Kinesis's InitialPositionInStream.TRIM_HORIZON. 
+ */ +case object TrimHorizon extends InitialPosition { + val instance: InitialPosition = this + override val initialPositionInStream: InitialPositionInStream += InitialPositionInStream.TRIM_HORIZON +} + +/** + * Case object for Kinesis's InitialPositionInStream.AT_TIMESTAMP. + */ +case class AtTimestamp(timestamp: Date) extends InitialPosition { + val instance: InitialPosition = this + override val initialPositionInStream: InitialPositionInStream += InitialPositionInStream.AT_TIMESTAMP +} + +/** + * Companion object for InitialPosition that returns + * appropriate version of InitialPositionInStream. + */ +object InitialPosition { --- End diff -- I've implemented the functions with this Capital naming, but still feel a bit salty about this :)
[GitHub] spark pull request #17435: [SPARK-20098][PYSPARK] dataType's typeName fix
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17435#discussion_r137684263 --- Diff: python/pyspark/sql/types.py --- @@ -438,6 +438,11 @@ def toInternal(self, obj): def fromInternal(self, obj): return self.dataType.fromInternal(obj) +def typeName(self): +raise TypeError( --- End diff -- Could we do something like ... ```python raise TypeError( "..." "...") ``` if it doesn't bother you much?
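The three review comments on `typeName` (spell it `typeName`, drop `self` from the message, and use adjacent string literals instead of a backslash continuation) can be sketched together as below. This is only an illustration under assumptions: `DataType`, `StringType` and `StructField` here are reduced, hypothetical stand-ins, not the classes in `python/pyspark/sql/types.py`, and the message wording is one possible reading of the suggestion.

```python
class DataType:
    """Hypothetical stand-in for a data type base class."""

    @classmethod
    def typeName(cls):
        # Derive the name from the class name, e.g. StringType -> "string".
        return cls.__name__[:-4].lower()


class StringType(DataType):
    pass


class StructField:
    """Hypothetical, reduced field wrapper: a name plus a data type."""

    def __init__(self, name, dataType):
        self.name = name
        self.dataType = dataType

    def typeName(self):
        # Adjacent string literals inside the parentheses are joined into
        # one string, so no backslash continuation is needed and no stray
        # indentation ends up inside the message.
        raise TypeError(
            "StructField does not have typeName. "
            "Use typeName() on its type explicitly instead.")


field = StructField("name", StringType())
try:
    field.typeName()
except TypeError as e:
    message = str(e)
```

With a backslash continuation inside the string literal, any leading whitespace on the continued line becomes part of the message; the adjacent-literal form avoids that.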
[GitHub] spark issue #19148: [SPARK-21936][SQL] backward compatibility test framework...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19148 LGTM except two minor comments.
[GitHub] spark pull request #19148: [SPARK-21936][SQL] backward compatibility test fr...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19148#discussion_r137683665 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala --- @@ -0,0 +1,193 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import java.io.File +import java.nio.file.Files + +import org.apache.spark.TestUtils +import org.apache.spark.sql.{QueryTest, Row, SparkSession} +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTableType +import org.apache.spark.sql.test.SQLTestUtils +import org.apache.spark.util.Utils + +/** + * Test HiveExternalCatalog backward compatibility. + * + * Note that, this test suite will automatically download spark binary packages of different + * versions to a local directory `/tmp/spark-test`. If there is already a spark folder with + * expected version under this local directory, e.g. `/tmp/spark-test/spark-2.0.3`, we will skip the + * downloading for this spark version. 
+ */ +class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils { + private val wareHousePath = Utils.createTempDir(namePrefix = "warehouse") + private val tmpDataDir = Utils.createTempDir(namePrefix = "test-data") + private val sparkTestingDir = "/tmp/spark-test" + private val unusedJar = TestUtils.createJarWithClasses(Seq.empty) + + override def afterAll(): Unit = { +Utils.deleteRecursively(wareHousePath) +super.afterAll() + } + + private def downloadSpark(version: String): Unit = { +import scala.sys.process._ + +val url = s"https://d3kbcqa49mib13.cloudfront.net/spark-$version-bin-hadoop2.7.tgz" + +Seq("wget", url, "-q", "-P", sparkTestingDir).! + +val downloaded = new File(sparkTestingDir, s"spark-$version-bin-hadoop2.7.tgz").getCanonicalPath +val targetDir = new File(sparkTestingDir, s"spark-$version").getCanonicalPath + +Seq("mkdir", targetDir).! + +Seq("tar", "-xzf", downloaded, "-C", targetDir, "--strip-components=1").! + +Seq("rm", downloaded).! + } + + private def genDataDir(name: String): String = { +new File(tmpDataDir, name).getCanonicalPath + } + + override def beforeAll(): Unit = { +super.beforeAll() + +val tempPyFile = File.createTempFile("test", ".py") +Files.write(tempPyFile.toPath, + s""" +|from pyspark.sql import SparkSession +| +|spark = SparkSession.builder.enableHiveSupport().getOrCreate() +|version_index = spark.conf.get("spark.sql.test.version.index", None) +| +|spark.sql("create table data_source_tbl_{} using json as select 1 i".format(version_index)) +| +|spark.sql("create table hive_compatible_data_source_tbl_" + version_index + \\ +| " using parquet as select 1 i") +| +|json_file = "${genDataDir("json_")}" + str(version_index) +|spark.range(1, 2).selectExpr("cast(id as int) as i").write.json(json_file) +|spark.sql("create table external_data_source_tbl_" + version_index + \\ +| "(i int) using json options (path '{}')".format(json_file)) +| +|parquet_file = "${genDataDir("parquet_")}" + str(version_index) +|spark.range(1, 
2).selectExpr("cast(id as int) as i").write.parquet(parquet_file) +|spark.sql("create table hive_compatible_external_data_source_tbl_" + version_index + \\ +| "(i int) using parquet options (path '{}')".format(parquet_file)) +| +|json_file2 = "${genDataDir("json2_")}" + str(version_index) +|spark.range(1, 2).selectExpr("cast(id as int) as i").write.json(json_file2) +|spark.sql("create table external_table_without_schema_" + version_index + \\
[GitHub] spark issue #19129: [SPARK-13656][SQL] Delete spark.sql.parquet.cacheMetadat...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19129 Thank you for the review, @gatorsmile, @HyukjinKwon, @maropu. In this issue, I've learned how to track the unused stuff correctly. Thank you again.
[GitHub] spark pull request #19148: [SPARK-21936][SQL] backward compatibility test fr...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19148#discussion_r137682740 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala --- @@ -0,0 +1,193 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import java.io.File +import java.nio.file.Files + +import org.apache.spark.TestUtils +import org.apache.spark.sql.{QueryTest, Row, SparkSession} +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTableType +import org.apache.spark.sql.test.SQLTestUtils +import org.apache.spark.util.Utils + +/** + * Test HiveExternalCatalog backward compatibility. + * + * Note that, this test suite will automatically download spark binary packages of different + * versions to a local directory `/tmp/spark-test`. If there is already a spark folder with + * expected version under this local directory, e.g. `/tmp/spark-test/spark-2.0.3`, we will skip the + * downloading for this spark version. 
+ */ +class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils { + private val wareHousePath = Utils.createTempDir(namePrefix = "warehouse") + private val tmpDataDir = Utils.createTempDir(namePrefix = "test-data") + private val sparkTestingDir = "/tmp/spark-test" + private val unusedJar = TestUtils.createJarWithClasses(Seq.empty) + + override def afterAll(): Unit = { +Utils.deleteRecursively(wareHousePath) +super.afterAll() + } + + private def downloadSpark(version: String): Unit = { +import scala.sys.process._ + +val url = s"https://d3kbcqa49mib13.cloudfront.net/spark-$version-bin-hadoop2.7.tgz" + +Seq("wget", url, "-q", "-P", sparkTestingDir).! + +val downloaded = new File(sparkTestingDir, s"spark-$version-bin-hadoop2.7.tgz").getCanonicalPath +val targetDir = new File(sparkTestingDir, s"spark-$version").getCanonicalPath + +Seq("mkdir", targetDir).! + +Seq("tar", "-xzf", downloaded, "-C", targetDir, "--strip-components=1").! + +Seq("rm", downloaded).! + } + + private def genDataDir(name: String): String = { +new File(tmpDataDir, name).getCanonicalPath + } + + override def beforeAll(): Unit = { +super.beforeAll() + +val tempPyFile = File.createTempFile("test", ".py") +Files.write(tempPyFile.toPath, + s""" +|from pyspark.sql import SparkSession +| +|spark = SparkSession.builder.enableHiveSupport().getOrCreate() +|version_index = spark.conf.get("spark.sql.test.version.index", None) +| +|spark.sql("create table data_source_tbl_{} using json as select 1 i".format(version_index)) +| +|spark.sql("create table hive_compatible_data_source_tbl_" + version_index + \\ +| " using parquet as select 1 i") +| +|json_file = "${genDataDir("json_")}" + str(version_index) +|spark.range(1, 2).selectExpr("cast(id as int) as i").write.json(json_file) +|spark.sql("create table external_data_source_tbl_" + version_index + \\ +| "(i int) using json options (path '{}')".format(json_file)) +| +|parquet_file = "${genDataDir("parquet_")}" + str(version_index) +|spark.range(1, 
2).selectExpr("cast(id as int) as i").write.parquet(parquet_file) +|spark.sql("create table hive_compatible_external_data_source_tbl_" + version_index + \\ +| "(i int) using parquet options (path '{}')".format(parquet_file)) +| +|json_file2 = "${genDataDir("json2_")}" + str(version_index) +|spark.range(1, 2).selectExpr("cast(id as int) as i").write.json(json_file2) +|spark.sql("create table external_table_without_schema_" + version_index + \\
[GitHub] spark pull request #19148: [SPARK-21936][SQL] backward compatibility test fr...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19148#discussion_r137682115 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala --- @@ -0,0 +1,193 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import java.io.File +import java.nio.file.Files + +import org.apache.spark.TestUtils +import org.apache.spark.sql.{QueryTest, Row, SparkSession} +import org.apache.spark.sql.catalyst.TableIdentifier +import org.apache.spark.sql.catalyst.catalog.CatalogTableType +import org.apache.spark.sql.test.SQLTestUtils +import org.apache.spark.util.Utils + +/** + * Test HiveExternalCatalog backward compatibility. + * + * Note that, this test suite will automatically download spark binary packages of different + * versions to a local directory `/tmp/spark-test`. If there is already a spark folder with + * expected version under this local directory, e.g. `/tmp/spark-test/spark-2.0.3`, we will skip the + * downloading for this spark version. 
+ */ +class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils { + private val wareHousePath = Utils.createTempDir(namePrefix = "warehouse") + private val tmpDataDir = Utils.createTempDir(namePrefix = "test-data") + private val sparkTestingDir = "/tmp/spark-test" + private val unusedJar = TestUtils.createJarWithClasses(Seq.empty) + + override def afterAll(): Unit = { +Utils.deleteRecursively(wareHousePath) --- End diff -- Also delete `tmpDataDir` and `sparkTestingDir`?
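The cleanup question above can be illustrated with a small sketch. It assumes a hypothetical `SuiteDirs` helper rather than the Scala suite itself, and it removes only the per-run temporary directories; whether the shared download directory should also be removed is left open here, since the suite's doc comment describes reusing already-downloaded Spark versions from it.

```python
import os
import shutil
import tempfile


class SuiteDirs:
    """Hypothetical holder for the directories a test suite creates."""

    def __init__(self):
        self.warehouse_path = tempfile.mkdtemp(prefix="warehouse")
        self.tmp_data_dir = tempfile.mkdtemp(prefix="test-data")

    def after_all(self):
        # Remove every per-run directory, not only the warehouse.
        for path in (self.warehouse_path, self.tmp_data_dir):
            shutil.rmtree(path, ignore_errors=True)


dirs = SuiteDirs()
dirs.after_all()

# Collect anything that survived cleanup; an empty list means nothing leaked.
leftover = [p for p in (dirs.warehouse_path, dirs.tmp_data_dir)
            if os.path.exists(p)]
```

Looping over all directories in one place makes it harder to forget one when a new temporary directory is added later.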
[GitHub] spark issue #19148: [SPARK-21936][SQL] backward compatibility test framework...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19148 Less than 2 mins to finish the suite. It looks pretty good!
[GitHub] spark issue #18956: [SPARK-21726][SQL] Check for structural integrity of the...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18956 LGTM except two minor comments
[GitHub] spark pull request #18956: [SPARK-21726][SQL] Check for structural integrity...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18956#discussion_r137681045 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala --- @@ -64,6 +64,14 @@ abstract class RuleExecutor[TreeType <: TreeNode[_]] extends Logging { protected def batches: Seq[Batch] /** + * Defines a check function which checks for structural integrity of the plan after the execution --- End diff -- `which` -> `that`
[GitHub] spark pull request #18956: [SPARK-21726][SQL] Check for structural integrity...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18956#discussion_r137680999 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala --- @@ -64,6 +64,14 @@ abstract class RuleExecutor[TreeType <: TreeNode[_]] extends Logging { protected def batches: Seq[Batch] /** + * Defines a check function which checks for structural integrity of the plan after the execution + * of each rule. For example, we can check whether a plan is still resolved after each rule in + * `Optimizer`, so we can catch rules that return invalid plans. The check function will returns --- End diff -- `will returns` -> `returns`
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r137680545 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala --- @@ -534,4 +534,176 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef } } } + + test("insert overwrite to dir from hive metastore table") { +withTempDir { dir => + val path = dir.toURI.getPath + + sql(s"INSERT OVERWRITE LOCAL DIRECTORY '${path}' SELECT * FROM src where key < 10") + + sql( +s""" + |INSERT OVERWRITE LOCAL DIRECTORY '${path}' + |STORED AS orc + |SELECT * FROM src where key < 10 + """.stripMargin) + + // use orc data source to check the data of path is right. + withTempView("orc_source") { +sql( + s""" + |CREATE TEMPORARY VIEW orc_source + |USING org.apache.spark.sql.hive.orc + |OPTIONS ( + | PATH '${dir.getCanonicalPath}' + |) + """.stripMargin) + +checkAnswer( + sql("select * from orc_source"), + sql("select * from src where key < 10")) + } +} + } + + test("insert overwrite to local dir from temp table") { +withTempView("test_insert_table") { + spark.range(10).selectExpr("id", "id AS str").createOrReplaceTempView("test_insert_table") + + withTempDir { dir => +val path = dir.toURI.getPath + +sql( + s""" + |INSERT OVERWRITE LOCAL DIRECTORY '${path}' + |ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' + |SELECT * FROM test_insert_table + """.stripMargin) + +sql( + s""" + |INSERT OVERWRITE LOCAL DIRECTORY '${path}' + |STORED AS orc + |SELECT * FROM test_insert_table + """.stripMargin) + +// use orc data source to check the data of path is right. 
+checkAnswer( + spark.read.orc(dir.getCanonicalPath), + sql("select * from test_insert_table")) + } +} + } + + test("insert overwrite to dir from temp table") { +withTempView("test_insert_table") { + spark.range(10).selectExpr("id", "id AS str").createOrReplaceTempView("test_insert_table") + + withTempDir { dir => +val pathUri = dir.toURI + +sql( + s""" + |INSERT OVERWRITE DIRECTORY '${pathUri}' + |ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' + |SELECT * FROM test_insert_table + """.stripMargin) + +sql( + s""" + |INSERT OVERWRITE DIRECTORY '${pathUri}' + |STORED AS orc + |SELECT * FROM test_insert_table + """.stripMargin) + +// use orc data source to check the data of path is right. +checkAnswer( + spark.read.orc(dir.getCanonicalPath), + sql("select * from test_insert_table")) + } +} + } + + test("multi insert overwrite to dir") { +withTempView("test_insert_table") { + spark.range(10).selectExpr("id", "id AS str").createOrReplaceTempView("test_insert_table") + + withTempDir { dir => +val pathUri = dir.toURI + +sql( + s""" + |FROM test_insert_table + |INSERT OVERWRITE DIRECTORY '${pathUri}' + |STORED AS orc + |SELECT id + |INSERT OVERWRITE DIRECTORY '${pathUri}' --- End diff -- To test multi-insert, we need to use different paths and then verify that both inserts succeed.
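The point about separate paths can be shown without Spark. The sketch below is an assumption-laden illustration: `insert_overwrite_directory` and `read_directory` are hypothetical helpers that only mimic "replace the directory's output", not Spark SQL. When both inserts target the same directory, the second overwrites the first and only one result can be checked; with two directories, each projection is verified on its own.

```python
import json
import os
import tempfile


def insert_overwrite_directory(directory, rows):
    # Hypothetical stand-in for INSERT OVERWRITE DIRECTORY: replaces
    # whatever output the directory held before.
    with open(os.path.join(directory, "part-0.json"), "w") as f:
        json.dump(rows, f)


def read_directory(directory):
    # Read back the single output file written by the helper above.
    with open(os.path.join(directory, "part-0.json")) as f:
        return json.load(f)


source = [{"id": i, "str": str(i)} for i in range(10)]

with tempfile.TemporaryDirectory() as dir_a, \
        tempfile.TemporaryDirectory() as dir_b:
    # One source, two projections, two distinct target directories.
    insert_overwrite_directory(dir_a, [r["id"] for r in source])
    insert_overwrite_directory(dir_b, [r["str"] for r in source])
    ids = read_directory(dir_a)
    strs = read_directory(dir_b)
```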
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r137680153 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala --- @@ -360,6 +360,31 @@ case class InsertIntoTable( } /** + * Insert query result into a directory. + * + * @param isLocal Indicates whether the specified directory is local directory + * @param storage Info about output file, row and what serialization format + * @param provider Specifies what data source to use; only used for data source file. + * @param child The query to be executed + * @param overwrite If true, the existing directory will be overwritten + * + * Note that this plan is unresolved and has to be replaced by the concrete implementations + * sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scaladuring analysis. --- End diff -- Could you fix it?
[GitHub] spark pull request #19129: [SPARK-13656][SQL] Delete spark.sql.parquet.cache...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19129
[GitHub] spark issue #19129: [SPARK-13656][SQL] Delete spark.sql.parquet.cacheMetadat...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19129 Thanks! Merged to master.
[GitHub] spark pull request #18956: [SPARK-21726][SQL] Check for structural integrity...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18956#discussion_r137678987 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizerSICheckerSuite.scala --- @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalyst.optimizer + +import org.apache.spark.sql.catalyst.analysis.{EmptyFunctionRegistry, UnresolvedAttribute} +import org.apache.spark.sql.catalyst.catalog.{InMemoryCatalog, SessionCatalog} +import org.apache.spark.sql.catalyst.dsl.plans._ +import org.apache.spark.sql.catalyst.errors.TreeNodeException +import org.apache.spark.sql.catalyst.expressions.{Alias, Literal} +import org.apache.spark.sql.catalyst.plans.PlanTest +import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, OneRowRelation, Project} +import org.apache.spark.sql.catalyst.rules._ +import org.apache.spark.sql.internal.SQLConf + + +class OptimizerSICheckerkSuite extends PlanTest { --- End diff -- Ok.
[GitHub] spark pull request #18956: [SPARK-21726][SQL] Check for structural integrity...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18956#discussion_r137679007 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala --- @@ -64,6 +64,14 @@ abstract class RuleExecutor[TreeType <: TreeNode[_]] extends Logging { protected def batches: Seq[Batch] /** + * Defines a check function which checks for structural integrity of the plan after the execution + * of each rule. For example, we can check whether a plan is still resolved after each rule in + * `Optimizer`, so we can catch rules that return invalid plans. The check function will returns + * `false` if the given plan doesn't pass the structural integrity check. + */ + protected def planChecker(plan: TreeType): Boolean = true --- End diff -- Looks good.
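The doc comment in this diff describes a check that runs after each rule and returns `false` for a plan that fails the structural integrity check. A minimal sketch of that idea follows, under stated assumptions: plans are plain Python lists, a `None` entry stands for an unresolved node, and the class and rule names are hypothetical. It is not Spark's `RuleExecutor`.

```python
class RuleExecutor:
    """Hypothetical, reduced rule executor; plans are plain lists here."""

    def __init__(self, rules):
        self.rules = rules

    def plan_checker(self, plan):
        # Default: every plan passes. Subclasses override this.
        return True

    def execute(self, plan):
        for rule in self.rules:
            plan = rule(plan)
            # Check right after each rule so the offending rule is known.
            if not self.plan_checker(plan):
                raise ValueError(
                    "plan failed the structural integrity check after rule "
                    + rule.__name__)
        return plan


class ResolvedOnlyExecutor(RuleExecutor):
    def plan_checker(self, plan):
        # Treat None as an unresolved node.
        return all(node is not None for node in plan)


def double_all(plan):
    return [n * 2 for n in plan]


def drop_first(plan):
    # A deliberately broken rule: it leaves an unresolved node behind.
    return [None] + plan[1:]


good = ResolvedOnlyExecutor([double_all]).execute([1, 2, 3])
try:
    ResolvedOnlyExecutor([double_all, drop_first]).execute([1, 2, 3])
    failure = None
except ValueError as e:
    failure = str(e)
```

Because the default checker always returns true, executors that do not override it behave exactly as before; only subclasses that opt in pay for the check.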
[GitHub] spark issue #18956: [SPARK-21726][SQL] Check for structural integrity of the...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18956 **[Test build #81529 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81529/testReport)** for PR 18956 at commit [`d1db7cf`](https://github.com/apache/spark/commit/d1db7cf815d447b195c907fb159ed0a6770c537b).
[GitHub] spark pull request #19046: [SPARK-18769][yarn] Limit resource requests based...
Github user vanzin closed the pull request at: https://github.com/apache/spark/pull/19046
[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/19046 I'm going to close this; when I find some free time I might take a closer look at the issue described in Wilfred's message.
[GitHub] spark pull request #19148: [SPARK-21936][SQL] backward compatibility test fr...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19148#discussion_r137663570 --- Diff: sql/hive/pom.xml --- @@ -177,6 +177,10 @@ libfb303 + org.apache.derby + derby --- End diff -- I see. Thank you!
[GitHub] spark issue #19148: [SPARK-21936][SQL] backward compatibility test framework...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19148 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81524/ Test PASSed.
[GitHub] spark issue #19148: [SPARK-21936][SQL] backward compatibility test framework...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19148 Merged build finished. Test PASSed.
[GitHub] spark issue #19148: [SPARK-21936][SQL] backward compatibility test framework...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19148 **[Test build #81524 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81524/testReport)** for PR 19148 at commit [`08dcf22`](https://github.com/apache/spark/commit/08dcf2291a0b1ae4b0e8f29c7628ff04b1924029). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils ` * `trait SparkSubmitTestUtils extends SparkFunSuite with Timeouts `
[GitHub] spark issue #17589: [SPARK-16544][SQL] Support for conversion from numeric c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17589 Merged build finished. Test PASSed.
[GitHub] spark issue #19129: [SPARK-13656][SQL] Delete spark.sql.parquet.cacheMetadat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19129 Merged build finished. Test PASSed.
[GitHub] spark issue #19129: [SPARK-13656][SQL] Delete spark.sql.parquet.cacheMetadat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19129 **[Test build #81525 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81525/testReport)** for PR 19129 at commit [`8e3d8fe`](https://github.com/apache/spark/commit/8e3d8fe26c6bbf15e17a4b80ff8357fe870f2d46). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17589: [SPARK-16544][SQL] Support for conversion from numeric c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17589 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81527/ Test PASSed.
[GitHub] spark issue #19129: [SPARK-13656][SQL] Delete spark.sql.parquet.cacheMetadat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19129 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81525/ Test PASSed.
[GitHub] spark issue #17589: [SPARK-16544][SQL] Support for conversion from numeric c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17589 **[Test build #81527 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81527/testReport)** for PR 17589 at commit [`cbf8a22`](https://github.com/apache/spark/commit/cbf8a224e9cb5744fd340a4f835bdf07cfdf5543). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #19103: [SPARK-21890] Credentials not being passed to add...
Github user redsanket closed the pull request at: https://github.com/apache/spark/pull/19103
[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18975 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81528/ Test FAILed.
[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18975 Merged build finished. Test FAILed.
[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18975 **[Test build #81528 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81528/testReport)** for PR 18975 at commit [`4a5ff29`](https://github.com/apache/spark/commit/4a5ff2912b15a00e7568893be0fa0b61618146c2). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19157: [SPARK-20589][Core][Scheduler] Allow limiting task concu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19157 Merged build finished. Test FAILed.
[GitHub] spark issue #19157: [SPARK-20589][Core][Scheduler] Allow limiting task concu...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19157 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81526/ Test FAILed.
[GitHub] spark issue #19157: [SPARK-20589][Core][Scheduler] Allow limiting task concu...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19157 **[Test build #81526 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81526/testReport)** for PR 19157 at commit [`8b38300`](https://github.com/apache/spark/commit/8b3830004d69bd5f109fd9846f59583c23a910c7). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.