[GitHub] spark pull request #20944: [SPARK-23831][SQL] Add org.apache.derby to Isolat...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20944#discussion_r179935823

    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala ---
    @@ -188,6 +188,9 @@ private[hive] class IsolatedClientLoader(
         (name.startsWith("com.google") && !name.startsWith("com.google.cloud")) ||
         name.startsWith("java.lang.") ||
         name.startsWith("java.net") ||
    +    name.startsWith("com.sun.") ||
    +    name.startsWith("sun.reflect.") ||
    --- End diff --

    Do not add them unless we have to.

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
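The concern in this review comment can be illustrated with a minimal sketch, assuming a heavily simplified prefix check. The object `SharedClassCheck`, the method `isShared`, and the `extraPrefixes` parameter are hypothetical names for illustration only and are not Spark's actual code; only the prefixes visible in the diff are used.

```scala
object SharedClassCheck {
  // Hypothetical, simplified prefix list taken from the context lines of the diff.
  private val sharedPrefixes: Seq[String] = Seq("java.lang.", "java.net")

  // Returns true when a class name would be treated as shared.
  // extraPrefixes models the additions proposed in the pull request.
  def isShared(name: String, extraPrefixes: Seq[String] = Nil): Boolean = {
    val googleButNotCloud =
      name.startsWith("com.google") && !name.startsWith("com.google.cloud")
    googleButNotCloud ||
      (sharedPrefixes ++ extraPrefixes).exists(prefix => name.startsWith(prefix))
  }
}
```

Each added prefix widens the set of shared classes for every name under that package, which is probably why the reviewer asks to add prefixes only when strictly necessary.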
[GitHub] spark issue #20944: [SPARK-23831][SQL] Add org.apache.derby to IsolatedClien...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20944 @wangyum What is the root cause?
[GitHub] spark issue #20260: [SPARK-23039][SQL] Finish TODO work in alter table set l...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20260 cc @gengliangwang
[GitHub] spark issue #20987: [SPARK-23816][CORE] Killed tasks should ignore FetchFail...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20987 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89019/ Test PASSed.
[GitHub] spark pull request #20963: [SPARK-23849][SQL] Tests for the samplingRatio op...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20963
[GitHub] spark issue #20987: [SPARK-23816][CORE] Killed tasks should ignore FetchFail...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20987 Merged build finished. Test PASSed.
[GitHub] spark issue #20987: [SPARK-23816][CORE] Killed tasks should ignore FetchFail...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20987 **[Test build #89019 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89019/testReport)** for PR 20987 at commit [`b1997a7`](https://github.com/apache/spark/commit/b1997a7e9df56d48c28f825cfb30c02fe61de21d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20963: [SPARK-23849][SQL] Tests for the samplingRatio option of...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20963 LGTM. Thanks! Merged to master.
[GitHub] spark pull request #20963: [SPARK-23849][SQL] Tests for the samplingRatio op...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20963#discussion_r179934861

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala ---
    @@ -2127,4 +2127,39 @@ class JsonSuite extends QueryTest with SharedSQLContext with TestJsonData {
           assert(df.schema === expectedSchema)
         }
       }
    +
    +  test("SPARK-23849: schema inferring touches less data if samplingRation < 1.0") {
    +    val predefinedSample = Set[Int](2, 8, 15, 27, 30, 34, 35, 37, 44, 46,
    +      57, 62, 68, 72)
    --- End diff --

    No need to have so many elements in this set. Please combine these tests with the ones in your CSV PR. Instead of calling `json()`, we can use `format("json")`. Then you can combine the test cases for both CSV and JSON.
[GitHub] spark pull request #20963: [SPARK-23849][SQL] Tests for the samplingRatio op...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20963#discussion_r179934815

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala ---
    @@ -2127,4 +2127,39 @@ class JsonSuite extends QueryTest with SharedSQLContext with TestJsonData {
           assert(df.schema === expectedSchema)
         }
       }
    +
    +  test("SPARK-23849: schema inferring touches less data if samplingRation < 1.0") {
    +    val predefinedSample = Set[Int](2, 8, 15, 27, 30, 34, 35, 37, 44, 46,
    +      57, 62, 68, 72)
    +    withTempPath { path =>
    +      val writer = Files.newBufferedWriter(Paths.get(path.getAbsolutePath),
    +        StandardCharsets.UTF_8, StandardOpenOption.CREATE_NEW)
    +      for (i <- 0 until 100) {
    +        if (predefinedSample.contains(i)) {
    +          writer.write(s"""{"f1":${i.toString}}""" + "\n")
    +        } else {
    +          writer.write(s"""{"f1":${(i.toDouble + 0.1).toString}}""" + "\n")
    +        }
    +      }
    +      writer.close()
    +
    +      val ds = spark.read.option("samplingRatio", 0.1).json(path.getCanonicalPath)
    --- End diff --

    Yes. The seed is also given.
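The idea behind the quoted test can be sketched without Spark, assuming (as the thread suggests, but does not show) that the rows at the predefined indices are exactly the ones a seeded sampler inspects. The object `SamplingInferenceSketch`, the `InferredType` hierarchy, and `inferType` are hypothetical stand-ins for illustration and do not reproduce Spark's schema inference.

```scala
import scala.util.Try

object SamplingInferenceSketch {
  sealed trait InferredType
  case object IntegerLike extends InferredType
  case object DoubleLike extends InferredType

  // Infer the narrowest numeric type that fits every inspected value.
  def inferType(values: Seq[String]): InferredType =
    if (values.forall(v => Try(v.toInt).isSuccess)) IntegerLike else DoubleLike

  // Indices copied from the test in the diff.
  val predefinedSample: Set[Int] =
    Set(2, 8, 15, 27, 30, 34, 35, 37, 44, 46, 57, 62, 68, 72)

  // Same layout as the test data: integers at sampled indices, doubles elsewhere.
  val rows: Seq[String] = (0 until 100).map { i =>
    if (predefinedSample.contains(i)) i.toString else (i.toDouble + 0.1).toString
  }

  // Assumption: the sampler looks only at the predefined indices.
  val sampledRows: Seq[String] = rows.zipWithIndex.collect {
    case (row, i) if predefinedSample.contains(i) => row
  }
}
```

Inferring over `rows` sees the doubles, while inferring over `sampledRows` sees only integers. That difference is what lets a test of this shape show that fewer rows were touched.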
[GitHub] spark issue #20959: [SPARK-23846][SQL] The samplingRatio option for CSV data...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20959 @MaxGekk Thanks for working on this!
[GitHub] spark pull request #20959: [SPARK-23846][SQL] The samplingRatio option for C...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20959#discussion_r179934772

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
    @@ -528,6 +529,7 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
        * `header` (default `false`): uses the first line as names of columns.
        * `inferSchema` (default `false`): infers the input schema automatically from data. It
        * requires one extra pass over the data.
    +   * `samplingRatio` (default 1.0): the sample ratio of rows used for schema inferring.
    --- End diff --

    The PySpark API also needs to be updated.
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21002 Merged build finished. Test PASSed.
[GitHub] spark pull request #20959: [SPARK-23846][SQL] The samplingRatio option for C...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20959#discussion_r179934767

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -1279,4 +1279,45 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
           Row("0,2013-111-11 12:13:14") :: Row(null) :: Nil
         )
       }
    +
    +  test("SPARK-23846: schema inferring touches less data if samplingRation < 1.0") {
    +    val predefinedSample = Set[Int](2, 8, 15, 27, 30, 34, 35, 37, 44, 46,
    +      57, 62, 68, 72)
    +    withTempPath { path =>
    +      val writer = Files.newBufferedWriter(Paths.get(path.getAbsolutePath),
    +        StandardCharsets.UTF_8, StandardOpenOption.CREATE_NEW)
    +      for (i <- 0 until 100) {
    +        if (predefinedSample.contains(i)) {
    +          writer.write(i.toString + "\n")
    +        } else {
    +          writer.write((i.toDouble + 0.1).toString + "\n")
    +        }
    +      }
    +      writer.close()
    +
    +      val ds = spark.read
    +        .option("inferSchema", true)
    +        .option("samplingRatio", 0.1)
    +        .csv(path.getCanonicalPath)
    +      assert(ds.schema == new StructType().add("_c0", IntegerType))
    +    }
    +  }
    +
    +  test("SPARK-23846: usage of samplingRation while parsing of dataset of strings") {
    +    val dstr = spark.sparkContext.parallelize(0 until 100, 1).map { i =>
    +      val predefinedSample = Set[Int](2, 8, 15, 27, 30, 34, 35, 37, 44, 46,
    +        57, 62, 68, 72)
    +      if (predefinedSample.contains(i)) {
    +        i.toString + "\n"
    +      } else {
    +        (i.toDouble + 0.1) + "\n"
    +      }
    +    }.toDS()
    +    val ds = spark.read
    +      .option("inferSchema", true)
    +      .option("samplingRatio", 0.1)
    --- End diff --

    Add some negative cases.
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21002 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2071/ Test PASSed.
[GitHub] spark pull request #20959: [SPARK-23846][SQL] The samplingRatio option for C...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20959#discussion_r179934764

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
    @@ -1279,4 +1279,45 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils {
           Row("0,2013-111-11 12:13:14") :: Row(null) :: Nil
         )
       }
    +
    +  test("SPARK-23846: schema inferring touches less data if samplingRation < 1.0") {
    +    val predefinedSample = Set[Int](2, 8, 15, 27, 30, 34, 35, 37, 44, 46,
    +      57, 62, 68, 72)
    --- End diff --

    `val predefinedSample = Set[Int](2, 8, 15, 27, 30, 34, 35, 37, 44, 46)` is enough.
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21002 **[Test build #89024 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89024/testReport)** for PR 21002 at commit [`ac24549`](https://github.com/apache/spark/commit/ac24549d190a7c203d0a5a2e8f589b0ba797b0ba).
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/21002 retest this please
[GitHub] spark issue #20971: [SPARK-23809][SQL][backport] Active SparkSession should ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20971 Thanks! Merged to 2.3
[GitHub] spark pull request #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, as...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20962
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21002 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89020/ Test FAILed.
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21002 Merged build finished. Test FAILed.
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21002 **[Test build #89020 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89020/testReport)** for PR 21002 at commit [`ac24549`](https://github.com/apache/spark/commit/ac24549d190a7c203d0a5a2e8f589b0ba797b0ba).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20962: [SPARK-23847][PYTHON][SQL]Add asc_nulls_first, asc_nulls...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20962 Merged to master.
[GitHub] spark issue #19691: [SPARK-14922][SPARK-17732][SQL]ALTER TABLE DROP PARTITIO...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19691 @dongjoon-hyun @maropu @mgaido91 Could you review this PR? I think this command is pretty useful to end users.
[GitHub] spark issue #19691: [SPARK-14922][SPARK-17732][SQL]ALTER TABLE DROP PARTITIO...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19691 **[Test build #89023 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89023/testReport)** for PR 19691 at commit [`9832ec5`](https://github.com/apache/spark/commit/9832ec55191deb995fe975d01d7899cb049207e5).
[GitHub] spark pull request #19691: [SPARK-14922][SPARK-17732][SQL]ALTER TABLE DROP P...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19691#discussion_r179934313

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
    @@ -282,6 +282,27 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
         parts.toMap
       }
    +  /**
    +   * Create a partition filter specification.
    +   */
    +  def visitPartitionFilterSpec(ctx: PartitionSpecContext): Expression = withOrigin(ctx) {
    +    val parts = ctx.expression.asScala.map { pVal =>
    +      expression(pVal) match {
    +        case EqualNullSafe(_, _) =>
    +          throw new ParseException("'<=>' operator is not allowed in partition specification.", ctx)
    +        case cmp @ BinaryComparison(UnresolvedAttribute(name :: Nil), constant: Literal) =>
    --- End diff --

    Still the same question here. Does the constant have to be on the right side?
[GitHub] spark issue #20999: [WIP][SPARK-23866][SQL] Support partition filters in ALT...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20999 Let us start reviewing that PR.
[GitHub] spark issue #19691: [SPARK-14922][SPARK-17732][SQL]ALTER TABLE DROP PARTITIO...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19691 ok to test
[GitHub] spark pull request #20999: [WIP][SPARK-23866][SQL] Support partition filters...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20999#discussion_r179934268

    --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ---
    @@ -261,6 +261,14 @@ partitionVal
         : identifier (EQ constant)?
         ;
    +dropPartitionSpec
    +    : PARTITION '(' dropPartitionVal (',' dropPartitionVal)* ')'
    +    ;
    +
    +dropPartitionVal
    +    : identifier (comparisonOperator constant)?
    --- End diff --

    Does it have to be in this format, `partCol1 > 2`? How about `2 > partCol1`?
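One way to accept both orderings raised in this question is to normalize a comparison so that the column always ends up on the left. The following is a minimal sketch under that assumption; `PartitionFilterSketch`, `Operand`, `Column`, `Constant`, `Comparison`, and `normalize` are hypothetical names, not Catalyst classes, and the thread does not say which approach the pull request eventually takes.

```scala
object PartitionFilterSketch {
  sealed trait Operand
  final case class Column(name: String) extends Operand
  final case class Constant(value: Int) extends Operand

  final case class Comparison(left: Operand, op: String, right: Operand)

  // Operator to use after swapping the two operands.
  private val flipped: Map[String, String] = Map(
    "<" -> ">", ">" -> "<", "<=" -> ">=", ">=" -> "<=", "=" -> "=", "!=" -> "!=")

  // Rewrite `constant op column` as `column flippedOp constant`.
  // Returns None for shapes this sketch does not handle.
  def normalize(c: Comparison): Option[Comparison] = c match {
    case Comparison(_: Column, _, _: Constant) => Some(c)
    case Comparison(k: Constant, op, col: Column) =>
      flipped.get(op).map(f => Comparison(col, f, k))
    case _ => None
  }
}
```

With this shape, `2 > partCol1` and `partCol1 < 2` reduce to the same value, so later matching code would only need the column-on-the-left case.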
[GitHub] spark issue #20816: [SPARK-21479][SQL] Outer join filter pushdown in null su...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20816 Merged build finished. Test PASSed.
[GitHub] spark issue #20816: [SPARK-21479][SQL] Outer join filter pushdown in null su...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20816 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2070/ Test PASSed.
[GitHub] spark issue #20816: [SPARK-21479][SQL] Outer join filter pushdown in null su...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20816 **[Test build #89022 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89022/testReport)** for PR 20816 at commit [`7fe9329`](https://github.com/apache/spark/commit/7fe93295df5627f2fc4e712b71aa9ce75383d410).
[GitHub] spark issue #20816: [SPARK-21479][SQL] Outer join filter pushdown in null su...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20816 retest this please
[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20992 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2069/ Test PASSed.
[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20992 Merged build finished. Test PASSed.
[GitHub] spark issue #20945: [SPARK-23790][Mesos] fix metastore connection issue
Github user skonto commented on the issue: https://github.com/apache/spark/pull/20945 @susanxhuynh Unfortunately I cannot unify the APIs even for DC/OS: 1.10.x is different from 1.11.x (https://docs.mesosphere.com/services/spark/2.3.0-2.2.1-2/security/), and the code depends on this (I played a bit with the DC/OS secret store API), not to mention other APIs out there. This would require a generic secrets API at the pure Mesos level (like in k8s), so I don't see a viable solution for now, unless I manage to restrict access to the TGT in client mode and essentially make it safe.
[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20992 **[Test build #89021 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89021/testReport)** for PR 20992 at commit [`0e1e0a0`](https://github.com/apache/spark/commit/0e1e0a0234d07ae9b0af2da31c58f5367911e54c).
[GitHub] spark issue #20858: [SPARK-23736][SQL] Extending the concat function to supp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20858 Merged build finished. Test PASSed.
[GitHub] spark issue #20858: [SPARK-23736][SQL] Extending the concat function to supp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20858 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89018/ Test PASSed.
[GitHub] spark issue #20858: [SPARK-23736][SQL] Extending the concat function to supp...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20858 **[Test build #89018 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89018/testReport)** for PR 20858 at commit [`367ee22`](https://github.com/apache/spark/commit/367ee2241901225e7451d7280611cecf23be82f1).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20944: [SPARK-23831][SQL] Add org.apache.derby to IsolatedClien...
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/20944 cc @jerryshao
[GitHub] spark issue #20987: [SPARK-23816][CORE] Killed tasks should ignore FetchFail...
Github user squito commented on the issue: https://github.com/apache/spark/pull/20987 I filed https://issues.apache.org/jira/browse/SPARK-23894 for the test failure -- it appears to be a flaky test.
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21002 Merged build finished. Test PASSed.
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21002 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2068/ Test PASSed.
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21002 **[Test build #89020 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89020/testReport)** for PR 21002 at commit [`ac24549`](https://github.com/apache/spark/commit/ac24549d190a7c203d0a5a2e8f589b0ba797b0ba).
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/21002 retest this please
[GitHub] spark pull request #20987: [SPARK-23816][CORE] Killed tasks should ignore Fe...
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20987#discussion_r179931481

    --- Diff: core/src/test/scala/org/apache/spark/executor/ExecutorSuite.scala ---
    @@ -257,19 +281,32 @@ class ExecutorSuite extends SparkFunSuite with LocalSparkContext with MockitoSug
       }

       private def runTaskAndGetFailReason(taskDescription: TaskDescription): TaskFailedReason = {
    -    runTaskGetFailReasonAndExceptionHandler(taskDescription)._1
    +    runTaskGetFailReasonAndExceptionHandler(taskDescription, false)._1
       }

       private def runTaskGetFailReasonAndExceptionHandler(
    -      taskDescription: TaskDescription): (TaskFailedReason, UncaughtExceptionHandler) = {
    +      taskDescription: TaskDescription,
    +      killTask: Boolean): (TaskFailedReason, UncaughtExceptionHandler) = {
         val mockBackend = mock[ExecutorBackend]
         val mockUncaughtExceptionHandler = mock[UncaughtExceptionHandler]
         var executor: Executor = null
    +    var killingThread: Thread = null
    --- End diff --

    Yeah, good point -- I was originally thinking of that, but I don't think it is needed. However, I did get rid of the indefinite awaits.
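Replacing an indefinite await with a bounded one can look like the following minimal sketch, assuming a `java.util.concurrent.CountDownLatch` is the synchronization point. The object `BoundedAwaitSketch` and the helper `runAndAwait` are hypothetical; the actual change in `ExecutorSuite` is not shown in this thread.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}

object BoundedAwaitSketch {
  // Runs `body` on a daemon thread and waits at most `timeoutMs` for it to finish.
  // Returns true if the body completed in time, false if the wait timed out.
  def runAndAwait(timeoutMs: Long)(body: => Unit): Boolean = {
    val done = new CountDownLatch(1)
    val worker = new Thread(new Runnable {
      override def run(): Unit = try body finally done.countDown()
    })
    worker.setDaemon(true) // a stuck body must not keep the JVM alive
    worker.start()
    // Bounded wait: a hung task fails the test instead of hanging the suite.
    done.await(timeoutMs, TimeUnit.MILLISECONDS)
  }
}
```

A test can then assert on the returned Boolean, so a task that never finishes shows up as a failed assertion and not as a stalled build.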
[GitHub] spark issue #20987: [SPARK-23816][CORE] Killed tasks should ignore FetchFail...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20987 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2067/ Test PASSed.
[GitHub] spark issue #20987: [SPARK-23816][CORE] Killed tasks should ignore FetchFail...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20987 **[Test build #89019 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89019/testReport)** for PR 20987 at commit [`b1997a7`](https://github.com/apache/spark/commit/b1997a7e9df56d48c28f825cfb30c02fe61de21d).
[GitHub] spark issue #20987: [SPARK-23816][CORE] Killed tasks should ignore FetchFail...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20987 Merged build finished. Test PASSed.
[GitHub] spark issue #20858: [SPARK-23736][SQL] Extending the concat function to supp...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20858 **[Test build #89018 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89018/testReport)** for PR 20858 at commit [`367ee22`](https://github.com/apache/spark/commit/367ee2241901225e7451d7280611cecf23be82f1).
[GitHub] spark issue #20874: [SPARK-23763][SQL] OffHeapColumnVector uses MemoryBlock
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20874 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89014/ Test PASSed.
[GitHub] spark issue #20874: [SPARK-23763][SQL] OffHeapColumnVector uses MemoryBlock
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20874 Merged build finished. Test PASSed.
[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20992 **[Test build #89016 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89016/testReport)** for PR 20992 at commit [`3d25617`](https://github.com/apache/spark/commit/3d256179fbb833f2b49f3b8578d9de68e66429f0).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20992 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89016/ Test PASSed.
[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20992 Merged build finished. Test PASSed.
[GitHub] spark issue #20874: [SPARK-23763][SQL] OffHeapColumnVector uses MemoryBlock
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20874 **[Test build #89014 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89014/testReport)** for PR 20874 at commit [`9ef19df`](https://github.com/apache/spark/commit/9ef19dfcde9dc84f494bff5f03a56db840741496).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20858: [SPARK-23736][SQL] Extending the concat function to supp...
Github user mn-mikke commented on the issue: https://github.com/apache/spark/pull/20858

@maropu I've modified the solution according to your comments:
- Removed UnresolvedConcat and merged string and array concatenation into one expression class.
- Implemented type coercion for concatenation of arrays and added tests for it.
- Added codegen examples into the description.

Please take a look...
[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21001 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89017/ Test FAILed.
[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21001 Merged build finished. Test FAILed.
[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21001 **[Test build #89017 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89017/testReport)** for PR 21001 at commit [`3d1c909`](https://github.com/apache/spark/commit/3d1c90960f88042c51012f1b4df8eaffb73994c8). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21002 Merged build finished. Test FAILed.
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21002 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89012/ Test FAILed.
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21002 **[Test build #89012 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89012/testReport)** for PR 21002 at commit [`ac24549`](https://github.com/apache/spark/commit/ac24549d190a7c203d0a5a2e8f589b0ba797b0ba). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21001 Merged build finished. Test PASSed.
[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21001 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2066/ Test PASSed.
[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21003 **[Test build #89015 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89015/testReport)** for PR 21003 at commit [`63959c9`](https://github.com/apache/spark/commit/63959c90e712f4d8ff8ae660b22cf61dc91e3874). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21003 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89015/ Test PASSed.
[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21003 Merged build finished. Test PASSed.
[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21001 **[Test build #89017 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89017/testReport)** for PR 21001 at commit [`3d1c909`](https://github.com/apache/spark/commit/3d1c90960f88042c51012f1b4df8eaffb73994c8).
[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/21001 retest this please.
[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21003 Merged build finished. Test PASSed.
[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21003 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2065/ Test PASSed.
[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20992 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2064/ Test PASSed.
[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20992 Merged build finished. Test PASSed.
[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21003 **[Test build #89015 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89015/testReport)** for PR 21003 at commit [`63959c9`](https://github.com/apache/spark/commit/63959c90e712f4d8ff8ae660b22cf61dc91e3874).
[GitHub] spark issue #20992: [SPARK-23779][SQL] TaskMemoryManager and UnsafeSorter re...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20992 **[Test build #89016 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89016/testReport)** for PR 20992 at commit [`3d25617`](https://github.com/apache/spark/commit/3d256179fbb833f2b49f3b8578d9de68e66429f0).
[GitHub] spark issue #20874: [SPARK-23763][SQL] OffHeapColumnVector uses MemoryBlock
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20874 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2063/ Test PASSed.
[GitHub] spark issue #20874: [SPARK-23763][SQL] OffHeapColumnVector uses MemoryBlock
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20874 Merged build finished. Test PASSed.
[GitHub] spark issue #20858: [SPARK-23736][SQL] Extending the concat function to supp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20858 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89010/ Test PASSed.
[GitHub] spark issue #20858: [SPARK-23736][SQL] Extending the concat function to supp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20858 Merged build finished. Test PASSed.
[GitHub] spark issue #20858: [SPARK-23736][SQL] Extending the concat function to supp...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20858 **[Test build #89010 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89010/testReport)** for PR 20858 at commit [`8abd1a8`](https://github.com/apache/spark/commit/8abd1a8b92eee5b83c13a1969dcbfca7e6cb6a06). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20874: [SPARK-23763][SQL] OffHeapColumnVector uses MemoryBlock
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20874 **[Test build #89014 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89014/testReport)** for PR 20874 at commit [`9ef19df`](https://github.com/apache/spark/commit/9ef19dfcde9dc84f494bff5f03a56db840741496).
[GitHub] spark pull request #20984: [SPARK-23875][SQL] Add IndexedSeq wrapper for Arr...
Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20984#discussion_r179920210

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayData.scala ---
@@ -164,3 +167,32 @@ abstract class ArrayData extends SpecializedGetters with Serializable {
     }
   }
 }
+
+class ArrayDataIndexedSeq[T](arrayData: ArrayData, dataType: DataType) extends IndexedSeq[T] {
+
+  private lazy val accessor: (Int) => Any = dataType match {
+    case BooleanType => (idx: Int) => arrayData.getBoolean(idx)
+    case ByteType => (idx: Int) => arrayData.getByte(idx)
+    case ShortType => (idx: Int) => arrayData.getShort(idx)
+    case IntegerType => (idx: Int) => arrayData.getInt(idx)
+    case LongType => (idx: Int) => arrayData.getLong(idx)
+    case FloatType => (idx: Int) => arrayData.getFloat(idx)
+    case DoubleType => (idx: Int) => arrayData.getDouble(idx)
+    case d: DecimalType => (idx: Int) => arrayData.getDecimal(idx, d.precision, d.scale)
+    case CalendarIntervalType => (idx: Int) => arrayData.getInterval(idx)
+    case StringType => (idx: Int) => arrayData.getUTF8String(idx)
+    case BinaryType => (idx: Int) => arrayData.getBinary(idx)
+    case s: StructType => (idx: Int) => arrayData.getStruct(idx, s.length)
+    case _: ArrayType => (idx: Int) => arrayData.getArray(idx)
+    case _: MapType => (idx: Int) => arrayData.getMap(idx)
+    case _ => (idx: Int) => arrayData.get(idx, dataType)
+  }
+
+  override def apply(idx: Int): T = if (idx < arrayData.numElements()) {
--- End diff --

Do we need a check for `0 <= idx`, too? If so, it would be good to update the message in the exception as well.
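For illustration only, the two-sided check asked about above could look like the following Java sketch. It is not taken from the PR: the class `IndexCheckSketch`, the helper `checkedIndex`, and the exception message are all hypothetical, and the element count is passed in directly instead of coming from `arrayData.numElements()`.

```java
public class IndexCheckSketch {
    // Hypothetical helper, not Spark code: rejects negative indices as well as
    // indices at or beyond the element count, and reports both bounds.
    static int checkedIndex(int idx, int numElements) {
        if (idx < 0 || idx >= numElements) {
            throw new IndexOutOfBoundsException(
                "Index " + idx + " must be between 0 (inclusive) and "
                    + numElements + " (exclusive)");
        }
        return idx;
    }

    public static void main(String[] args) {
        System.out.println(checkedIndex(2, 3)); // valid index, returned unchanged
        try {
            checkedIndex(-1, 3); // a check of only idx < numElements would let this through
        } catch (IndexOutOfBoundsException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With only the upper-bound test, a negative index would reach the underlying accessor; checking both bounds in one place also lets the exception message describe the full valid range.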
[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21003 **[Test build #89013 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89013/testReport)** for PR 21003 at commit [`2b588ef`](https://github.com/apache/spark/commit/2b588ef02131521653dd48433d2d7296eacaf30d). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class VectorAssembler(JavaTransformer, HasInputCols, HasOutputCol, HasHandleInvalid, JavaMLReadable,`
[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21003 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89013/ Test FAILed.
[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21003 Merged build finished. Test FAILed.
[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21003 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2062/ Test PASSed.
[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21003 Merged build finished. Test PASSed.
[GitHub] spark issue #21003: [SPARK-23871][ML][PYTHON]add python api for VectorAssemb...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21003 **[Test build #89013 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89013/testReport)** for PR 21003 at commit [`2b588ef`](https://github.com/apache/spark/commit/2b588ef02131521653dd48433d2d7296eacaf30d).
[GitHub] spark pull request #21003: [SPARK-23871][ML][PYTHON]add python api for Vecto...
GitHub user huaxingao opened a pull request:

    https://github.com/apache/spark/pull/21003

    [SPARK-23871][ML][PYTHON]add python api for VectorAssembler handleInvalid

    ## What changes were proposed in this pull request?

    Add a Python API for VectorAssembler handleInvalid.

    ## How was this patch tested?

    Added a doctest.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/huaxingao/spark spark-23871

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21003.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21003

commit 2b588ef02131521653dd48433d2d7296eacaf30d
Author: Huaxin Gao
Date:   2018-04-07T15:24:16Z

    [SPARK-23871][ML][PYTHON]add python api for VectorAssembler handleInvalid
[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21001 Merged build finished. Test FAILed.
[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21001 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89011/ Test FAILed.
[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21001 **[Test build #89011 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89011/testReport)** for PR 21001 at commit [`3d1c909`](https://github.com/apache/spark/commit/3d1c90960f88042c51012f1b4df8eaffb73994c8). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21002 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2061/ Test PASSed.
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21002 Merged build finished. Test PASSed.
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21002 **[Test build #89012 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89012/testReport)** for PR 21002 at commit [`ac24549`](https://github.com/apache/spark/commit/ac24549d190a7c203d0a5a2e8f589b0ba797b0ba).
[GitHub] spark issue #21002: [SPARK-23893][Core][SQL] Avoid possible integer overflow...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/21002 cc @gatorsmile @hvanhovell
[GitHub] spark pull request #21002: initial commit
GitHub user kiszk opened a pull request:

    https://github.com/apache/spark/pull/21002

    initial commit

    ## What changes were proposed in this pull request?

    This PR avoids a possible overflow in operations of the form `long = (long)(int * int)`. Multiplying two large positive int values can overflow and set the most significant bit (MSB) to one. The result is then negative once widened to long, although a positive value was expected (e.g. `0111___ * ___0010`). This PR casts to long before the multiplication to avoid this situation.

    ## How was this patch tested?

    Existing UTs

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kiszk/spark SPARK-23893

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21002.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21002

commit ac24549d190a7c203d0a5a2e8f589b0ba797b0ba
Author: Kazuaki Ishizaki
Date:   2018-04-07T14:16:18Z

    initial commit
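The overflow described in this pull request can be reproduced in a few lines of plain Java. The sketch below is an illustration only and is not code from the patch; the class `IntOverflowSketch` and both method names are made up for the example.

```java
public class IntOverflowSketch {
    // Multiplies in 32-bit int arithmetic first and widens afterwards:
    // the product can wrap around before the cast ever happens.
    static long widenAfterMultiply(int a, int b) {
        return (long) (a * b);
    }

    // Widens one operand first, so the multiplication itself is done
    // in 64-bit long arithmetic and cannot wrap for int inputs.
    static long widenBeforeMultiply(int a, int b) {
        return (long) a * b;
    }

    public static void main(String[] args) {
        int a = 1 << 30; // 1073741824, a large positive int
        int b = 2;
        System.out.println(widenAfterMultiply(a, b));  // prints -2147483648
        System.out.println(widenBeforeMultiply(a, b)); // prints 2147483648
    }
}
```

In the first method the int product 2^31 does not fit in 32 bits, so its sign bit is set and the cast merely sign-extends an already negative value. Casting one operand first promotes the other operand to long as well, which is the ordering the pull request describes.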