[GitHub] spark pull request #14559: [SPARK-16968]Add additional options in jdbc when ...
Github user GraceH commented on a diff in the pull request: https://github.com/apache/spark/pull/14559#discussion_r74027475

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -447,7 +447,16 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
     // Create the table if the table didn't exist.
     if (!tableExists) {
       val schema = JdbcUtils.schemaString(df, url)
-      val sql = s"CREATE TABLE $table ($schema)"
+      // To allow certain options to append when create a new table, which can be
+      // table_options or partition_options.
+      // E.g., "CREATE TABLE t (name string) ENGINE=InnoDB DEFAULT CHARSET=utf8"
+      val createtblOptions = {
+        extraOptions.get("jdbc.create.table.options") match {
--- End diff --

Thanks Sean. Actually, I have a little hesitation here. For example, "mergeSchema" is an existing option name that is not prefixed with "spark", unlike the others:
```
val mergedDF = spark.read.option("mergeSchema", "true").parquet("data/test_table")
```
How about using a short name such as "createTableOptions"?

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14561: [SPARK-16972][CORE] Move DriverEndpoint out of CoarseGra...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14561 **[Test build #63435 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63435/consoleFull)** for PR 14561 at commit [`def6954`](https://github.com/apache/spark/commit/def695421948db1efd0418625243ed645d0958fa).
[GitHub] spark issue #14561: [SPARK-16972][CORE] Move DriverEndpoint out of CoarseGra...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14561 Jenkins test this please
[GitHub] spark pull request #14559: [SPARK-16968]Add additional options in jdbc when ...
Github user GraceH commented on a diff in the pull request: https://github.com/apache/spark/pull/14559#discussion_r74026903

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -447,7 +447,16 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
     // Create the table if the table didn't exist.
     if (!tableExists) {
       val schema = JdbcUtils.schemaString(df, url)
-      val sql = s"CREATE TABLE $table ($schema)"
+      // To allow certain options to append when create a new table, which can be
+      // table_options or partition_options.
+      // E.g., "CREATE TABLE t (name string) ENGINE=InnoDB DEFAULT CHARSET=utf8"
+      val createtblOptions = {
+        extraOptions.get("jdbc.create.table.options") match {
+          case Some(value) => " " + value
+          case None => ""
+        }
+      }
+      val sql = s"CREATE TABLE $table ($schema)" + createtblOptions
--- End diff --

Yes, you're right. I will fix that; it reads better as a whole.
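The pattern under review — appending an optional suffix to the generated CREATE TABLE statement — can be sketched as a small standalone function. The option key is the one under discussion in the diff; the object and method names here are illustrative, not Spark's final API:

```scala
// Sketch of the reviewed change. The Some/None match from the diff folds
// into a single map(...).getOrElse(...) call; the option key is the one
// proposed in the PR, the helper name is hypothetical.
object CreateTableSql {
  def apply(table: String, schema: String, extraOptions: Map[String, String]): String = {
    val createTableOptions =
      extraOptions.get("jdbc.create.table.options").map(" " + _).getOrElse("")
    s"CREATE TABLE $table ($schema)" + createTableOptions
  }
}
```

With `Map("jdbc.create.table.options" -> "ENGINE=InnoDB DEFAULT CHARSET=utf8")` this produces `CREATE TABLE t (name string) ENGINE=InnoDB DEFAULT CHARSET=utf8`; with no option set, the suffix is empty.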
[GitHub] spark issue #14557: [SPARK-16709][CORE] Kill the running task if stage faile...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14557 **[Test build #63434 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63434/consoleFull)** for PR 14557 at commit [`9ea08e8`](https://github.com/apache/spark/commit/9ea08e8680a544fc051574efcecff00270f2d2d6).
[GitHub] spark issue #14561: [SPARK-16972][CORE] Move DriverEndpoint out of CoarseGra...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14561 Can one of the admins verify this patch?
[GitHub] spark pull request #14561: SPARK-16972: Move DriverEndpoint out of CoarseGra...
GitHub user lshmouse opened a pull request: https://github.com/apache/spark/pull/14561

SPARK-16972: Move DriverEndpoint out of CoarseGrainedSchedulerBackend

## What changes were proposed in this pull request?

Move DriverEndpoint out of CoarseGrainedSchedulerBackend and keep the two classes clean.

## How was this patch tested?

Passes the unit tests locally.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lshmouse/spark DriverEndpoint

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14561.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #14561

commit def695421948db1efd0418625243ed645d0958fa
Author: Liu Shaohui
Date: 2016-08-09T09:25:41Z

    SPARK-16972: Move DriverEndpoint out of CoarseGrainedSchedulerBackend
[GitHub] spark issue #14552: [SPARK-16952] don't lookup spark home directory when exe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14552 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63426/ Test FAILed.
[GitHub] spark issue #14552: [SPARK-16952] don't lookup spark home directory when exe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14552 Merged build finished. Test FAILed.
[GitHub] spark issue #14552: [SPARK-16952] don't lookup spark home directory when exe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14552 **[Test build #63426 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63426/consoleFull)** for PR 14552 at commit [`a19cec7`](https://github.com/apache/spark/commit/a19cec746c3314fa12844adcd04eeb9fb900cd46).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #14557: [SPARK-16709][CORE] Kill the running task if stage faile...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14557 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63424/ Test PASSed.
[GitHub] spark issue #14557: [SPARK-16709][CORE] Kill the running task if stage faile...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14557 Merged build finished. Test PASSed.
[GitHub] spark issue #14557: [SPARK-16709][CORE] Kill the running task if stage faile...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14557 **[Test build #63424 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63424/consoleFull)** for PR 14557 at commit [`9263678`](https://github.com/apache/spark/commit/926367815a262c89a24f86fd735348f493e64881).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #14558: [SPARK-16508][SparkR] Fix warnings on undocumented/dupli...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14558 Merged build finished. Test PASSed.
[GitHub] spark issue #14558: [SPARK-16508][SparkR] Fix warnings on undocumented/dupli...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14558 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63423/ Test PASSed.
[GitHub] spark issue #14558: [SPARK-16508][SparkR] Fix warnings on undocumented/dupli...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14558 **[Test build #63423 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63423/consoleFull)** for PR 14558 at commit [`82e2f09`](https://github.com/apache/spark/commit/82e2f09517e9f3d726af0046d251748f892f59c8).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #14534: [SPARK-16941]Use concurrentHashMap instead of sca...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14534#discussion_r74024410

--- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/server/SparkSQLOperationManager.scala ---
@@ -39,15 +38,19 @@ private[thriftserver] class SparkSQLOperationManager()
   val handleToOperation = ReflectionUtils
     .getSuperField[JMap[OperationHandle, Operation]](this, "handleToOperation")
-  val sessionToActivePool = Map[SessionHandle, String]()
-  val sessionToContexts = Map[SessionHandle, SQLContext]()
+  val sessionToActivePool = new ConcurrentHashMap[SessionHandle, String]()
--- End diff --

It's only `private[thriftserver]`. It's minor, and a whole lot of stuff in Spark that should be `private` isn't, but I wondered if it was worth it here because you're concerned with synchronizing access to this object and therefore possibly concerned with what is accessing it. The usages you changed look like they're sufficiently protected, but are there others, BTW?
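The change being reviewed swaps a mutable `scala.collection.Map` for `java.util.concurrent.ConcurrentHashMap`; a minimal sketch of the practical difference, with `String` standing in for `SessionHandle`:

```scala
import java.util.concurrent.ConcurrentHashMap

// ConcurrentHashMap makes individual get/put operations thread-safe without
// external locking. String stands in for SessionHandle in this sketch.
val sessionToActivePool = new ConcurrentHashMap[String, String]()
sessionToActivePool.put("session-1", "poolA")

// Unlike scala Map's Option-returning lookup, CHM.get returns null on a miss,
// so callers typically wrap it in Option(...).
val pool = Option(sessionToActivePool.get("session-1"))
val missing = Option(sessionToActivePool.get("session-2"))
```

This is why such a swap touches call sites too: every lookup that previously pattern-matched on `Option` must now account for `null` (or wrap in `Option`), which is part of what the "are there others?" question above is probing.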
[GitHub] spark issue #13988: [SPARK-16101][SQL] Refactoring CSV data source to be con...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13988 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63425/ Test PASSed.
[GitHub] spark issue #13988: [SPARK-16101][SQL] Refactoring CSV data source to be con...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13988 Merged build finished. Test PASSed.
[GitHub] spark issue #13988: [SPARK-16101][SQL] Refactoring CSV data source to be con...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13988 **[Test build #63425 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63425/consoleFull)** for PR 13988 at commit [`a634435`](https://github.com/apache/spark/commit/a63443505483287fa9bb20312a24b38e75f90588).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #14534: [SPARK-16941]Use concurrentHashMap instead of scala Map ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14534 **[Test build #63433 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63433/consoleFull)** for PR 14534 at commit [`0a436a0`](https://github.com/apache/spark/commit/0a436a0a911151e0cc823a81974473f89e8bb966).
[GitHub] spark issue #14560: [SPARK-16971][SQL] Strip trailing zeros for decimal's st...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14560 **[Test build #63432 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63432/consoleFull)** for PR 14560 at commit [`d11ce1f`](https://github.com/apache/spark/commit/d11ce1f8554f5028e7a64b3b6abe5e1b6a290529).
[GitHub] spark pull request #14534: [SPARK-16941]Use concurrentHashMap instead of sca...
Github user SaintBacchus commented on a diff in the pull request: https://github.com/apache/spark/pull/14534#discussion_r74023262

--- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/server/SparkSQLOperationManager.scala ---
@@ -39,15 +38,19 @@ private[thriftserver] class SparkSQLOperationManager()
   val handleToOperation = ReflectionUtils
     .getSuperField[JMap[OperationHandle, Operation]](this, "handleToOperation")
-  val sessionToActivePool = Map[SessionHandle, String]()
-  val sessionToContexts = Map[SessionHandle, SQLContext]()
+  val sessionToActivePool = new ConcurrentHashMap[SessionHandle, String]()
--- End diff --

The whole class is private; is it necessary to make the field private as well?
[GitHub] spark issue #14560: [SPARK-16971][SQL] Strip trailing zeros for decimals whe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14560 **[Test build #63431 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63431/consoleFull)** for PR 14560 at commit [`b8a5267`](https://github.com/apache/spark/commit/b8a5267d495f4a9bf882c82b730c660858e1eebf).
[GitHub] spark pull request #14560: [SPARK-16971][SQL] Strip trailing zeros for decim...
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/14560

[SPARK-16971][SQL] Strip trailing zeros for decimals when using show() API in Dataset

## What changes were proposed in this pull request?

Currently, `Dataset.show()` prints all the trailing zeros for decimals. For example,

```
spark.range(11).toDF("a").select('a.cast(DecimalType(30, 20))).show()
```

prints below:

```bash
++
|       a|
++
|   0E-20|
|1.000...|
|2.000...|
|3.000...|
|4.000...|
|5.000...|
|6.000...|
|7.000...|
|8.000...|
|9.000...|
|10.00...|
++
```

It might be confusing, in particular, for `0E-20`. Also, I think we can strip the trailing zeros. This PR fixes this as below:

```bash
+---+
|  a|
+---+
|  0|
|  1|
|  2|
|  3|
|  4|
|  5|
|  6|
|  7|
|  8|
|  9|
| 10|
+---+
```

## How was this patch tested?

Unit test in `DataFrameSuite`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-16971

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14560.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #14560

commit b8a5267d495f4a9bf882c82b730c660858e1eebf
Author: hyukjinkwon
Date: 2016-08-09T08:58:13Z

    Strip trailing zeros for decimals when using show() API in Dataset
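The stripping itself can be done with `java.math.BigDecimal`; a sketch of the idea (not necessarily the PR's exact implementation):

```scala
import java.math.BigDecimal

// Strip trailing zeros and avoid scientific notation such as "0E-20".
// The explicit zero check is needed because stripTrailingZeros on a zero
// with large negative scale would otherwise print a long run of zeros
// via toPlainString.
def formatDecimal(d: BigDecimal): String = {
  if (d.compareTo(BigDecimal.ZERO) == 0) "0"
  else d.stripTrailingZeros().toPlainString
}
```

For instance, `formatDecimal(new BigDecimal("0E-20"))` yields `"0"` and `formatDecimal(new BigDecimal("1.00000000000000000000"))` yields `"1"`, matching the before/after tables above; `toPlainString` (rather than `toString`) keeps values like `10` from rendering as `1E+1` after stripping.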
[GitHub] spark pull request #14555: [SPARK-16965][MLLIB][PYSPARK] Fix bound checking ...
Github user zjffdu commented on a diff in the pull request: https://github.com/apache/spark/pull/14555#discussion_r74022257

--- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala ---
@@ -560,11 +554,25 @@ class SparseVector @Since("2.0.0") (
   @Since("2.0.0") val indices: Array[Int],
   @Since("2.0.0") val values: Array[Double]) extends Vector {
-  require(indices.length == values.length, "Sparse vectors require that the dimension of the" +
-    s" indices match the dimension of the values. You provided ${indices.length} indices and " +
-    s" ${values.length} values.")
-  require(indices.length <= size, s"You provided ${indices.length} indices and values, " +
-    s"which exceeds the specified vector size ${size}.")
+  validate()
+
+  private def validate(): Unit = {
+    require(size >= 0, "The size of the requested sparse vector must be greater than 0.")
--- End diff --

Yes, there is a test for a 0-length vector: https://github.com/apache/spark/blob/master/mllib-local/src/test/scala/org/apache/spark/ml/linalg/VectorsSuite.scala#L81
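The checks being gathered into `validate()` amount to three invariants. A standalone sketch, simplified to plain arguments rather than the class's fields (the function name is illustrative):

```scala
// Invariants for a sparse vector, mirroring the requires in the diff:
// a non-negative size, matching index/value array lengths, and no more
// stored entries than the declared size. A size of 0 is allowed, as the
// review discussion notes.
def validateSparse(size: Int, indices: Array[Int], values: Array[Double]): Unit = {
  require(size >= 0, s"Vector size must be non-negative: $size")
  require(indices.length == values.length,
    s"Got ${indices.length} indices but ${values.length} values.")
  require(indices.length <= size,
    s"${indices.length} entries exceed the vector size $size.")
}
```

Note that, as written in the diff, the message "must be greater than 0" does not match the `size >= 0` check, which permits zero; the sketch above words the message to match the check.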
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74022313

--- Diff: R/pkg/R/functions.R ---
@@ -1273,12 +1267,14 @@ setMethod("round",
 #' bround
 #'
 #' Returns the value of the column `e` rounded to `scale` decimal places using HALF_EVEN rounding
-#' mode if `scale` >= 0 or at integral part when `scale` < 0.
+#' mode if `scale` >= 0 or at integer part when `scale` < 0.
 #' Also known as Gaussian rounding or bankers' rounding that rounds to the nearest even number.
 #' bround(2.5, 0) = 2, bround(3.5, 0) = 4.
 #'
 #' @param x Column to compute on.
-#'
+#' @param scale round to `scale` digits to the right of the decimal point when `scale` > 0,
--- End diff --

It seems this duplicates L1270, which might be confusing since they seem to describe different behavior?
[GitHub] spark issue #13868: [SPARK-15899] [SQL] Fix the construction of the file pat...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/13868 OK will merge soonish if there are no further comments. Thanks.
[GitHub] spark issue #13868: [SPARK-15899] [SQL] Fix the construction of the file pat...
Github user avulanov commented on the issue: https://github.com/apache/spark/pull/13868 @srowen Sure. I've addressed @vanzin's comments.
[GitHub] spark issue #13868: [SPARK-15899] [SQL] Fix the construction of the file pat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13868 **[Test build #63430 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63430/consoleFull)** for PR 13868 at commit [`ea24b59`](https://github.com/apache/spark/commit/ea24b59fe83c37dbab27579141b5c63cccee138d).
[GitHub] spark pull request #14555: [SPARK-16965][MLLIB][PYSPARK] Fix bound checking ...
Github user zjffdu commented on a diff in the pull request: https://github.com/apache/spark/pull/14555#discussion_r74021856

--- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala ---
@@ -560,11 +554,25 @@ class SparseVector @Since("2.0.0") (
   @Since("2.0.0") val indices: Array[Int],
   @Since("2.0.0") val values: Array[Double]) extends Vector {
-  require(indices.length == values.length, "Sparse vectors require that the dimension of the" +
-    s" indices match the dimension of the values. You provided ${indices.length} indices and " +
-    s" ${values.length} values.")
-  require(indices.length <= size, s"You provided ${indices.length} indices and values, " +
-    s"which exceeds the specified vector size ${size}.")
+  validate()
--- End diff --

I also thought about `{...}`, but I feel putting it into one method is better. Anyway, I can do it that way if this doesn't fit the Spark code style.
[GitHub] spark pull request #14557: [SPARK-16709][CORE] Kill the running task if stag...
Github user shenh062326 commented on a diff in the pull request: https://github.com/apache/spark/pull/14557#discussion_r74021599

--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1564,6 +1564,14 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
     }
   }
+
+  def killTasks(tasks: HashSet[Long], taskInfo: HashMap[Long, TaskInfo]): Boolean = {
--- End diff --

jerryshao, thanks for your prompt reply. I will move the method to TaskSetManager.
[GitHub] spark pull request #14555: [SPARK-16965][MLLIB][PYSPARK] Fix bound checking ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14555#discussion_r74021332 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala --- @@ -560,11 +554,25 @@ class SparseVector @Since("2.0.0") ( @Since("2.0.0") val indices: Array[Int], @Since("2.0.0") val values: Array[Double]) extends Vector { - require(indices.length == values.length, "Sparse vectors require that the dimension of the" + -s" indices match the dimension of the values. You provided ${indices.length} indices and " + -s" ${values.length} values.") - require(indices.length <= size, s"You provided ${indices.length} indices and values, " + -s"which exceeds the specified vector size ${size}.") + validate() + + private def validate(): Unit = { +require(size >= 0, "The size of the requested sparse vector must be greater than 0.") --- End diff -- This allows a size 0 vector now. I guess that's good, because `DenseVector` allows this (a 0 length array).
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74021040 --- Diff: R/pkg/R/functions.R --- @@ -1560,7 +1556,8 @@ setMethod("stddev_samp", #' #' Creates a new struct column that composes multiple input columns. #' -#' @param x Column to compute on. +#' @param x a column to compute on. +#' @param ... additional column(s) to be included. --- End diff -- optional?
[GitHub] spark pull request #14555: [SPARK-16965][MLLIB][PYSPARK] Fix bound checking ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14555#discussion_r74021092 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala --- @@ -560,11 +554,25 @@ class SparseVector @Since("2.0.0") ( @Since("2.0.0") val indices: Array[Int], @Since("2.0.0") val values: Array[Double]) extends Vector { - require(indices.length == values.length, "Sparse vectors require that the dimension of the" + -s" indices match the dimension of the values. You provided ${indices.length} indices and " + -s" ${values.length} values.") - require(indices.length <= size, s"You provided ${indices.length} indices and values, " + -s"which exceeds the specified vector size ${size}.") + validate() --- End diff -- They wouldn't become fields unless used outside the constructor. You can also use a simple scope `{...}` to guard against this. I understand the argument and don't feel strongly either way, but we don't do this in other code in general.
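The scoped-block alternative mentioned above can be sketched as follows (a hypothetical, condensed class, not the actual Vectors.scala code): a plain `{...}` block in the constructor body keeps any intermediate vals local, so nothing accidentally becomes a field.

```scala
class SparseVectorSketch(val size: Int, val indices: Array[Int], val values: Array[Double]) {
  // A bare block scope in the constructor body: anything defined inside
  // stays local to construction and is never promoted to a field.
  {
    require(indices.length == values.length,
      s"Sparse vectors require that the dimension of the indices match the dimension " +
      s"of the values. You provided ${indices.length} indices and ${values.length} values.")
    require(indices.length <= size,
      s"You provided ${indices.length} indices and values, " +
      s"which exceeds the specified vector size $size.")
  }
}
```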
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74020775 --- Diff: R/pkg/R/functions.R --- @@ -2654,6 +2647,9 @@ setMethod("expr", signature(x = "character"), #' #' Formats the arguments in printf-style and returns the result as a string column. #' +#' @param format a character object of format strings. +#' @param x a Column object. +#' @param ... additional columns. --- End diff -- Let's keep type in capital case? `Column` or `Columns`
[GitHub] spark pull request #14534: [SPARK-16941]Use concurrentHashMap instead of sca...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14534#discussion_r74020655 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/server/SparkSQLOperationManager.scala --- @@ -39,15 +38,19 @@ private[thriftserver] class SparkSQLOperationManager() val handleToOperation = ReflectionUtils .getSuperField[JMap[OperationHandle, Operation]](this, "handleToOperation") - val sessionToActivePool = Map[SessionHandle, String]() - val sessionToContexts = Map[SessionHandle, SQLContext]() + val sessionToActivePool = new ConcurrentHashMap[SessionHandle, String]() + val sessionToContexts = new ConcurrentHashMap[SessionHandle, SQLContext]() override def newExecuteStatementOperation( parentSession: HiveSession, statement: String, confOverlay: JMap[String, String], async: Boolean): ExecuteStatementOperation = synchronized { -val sqlContext = sessionToContexts(parentSession.getSessionHandle) +val sqlContext = sessionToContexts.get(parentSession.getSessionHandle) +if (null == sqlContext) { --- End diff -- Does this have to be `HiveSQLException`? I'd just use `require` to generate an `IllegalArgumentException`
[GitHub] spark pull request #14534: [SPARK-16941]Use concurrentHashMap instead of sca...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14534#discussion_r74020589 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/server/SparkSQLOperationManager.scala --- @@ -39,15 +38,19 @@ private[thriftserver] class SparkSQLOperationManager() val handleToOperation = ReflectionUtils .getSuperField[JMap[OperationHandle, Operation]](this, "handleToOperation") - val sessionToActivePool = Map[SessionHandle, String]() - val sessionToContexts = Map[SessionHandle, SQLContext]() + val sessionToActivePool = new ConcurrentHashMap[SessionHandle, String]() --- End diff -- While we're here, make them `private` for a bit more future-proofing of access to these
[GitHub] spark issue #14525: [SPARK-16324] [SQL] regexp_extract should doc that it re...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14525 **[Test build #63429 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63429/consoleFull)** for PR 14525 at commit [`a48be1f`](https://github.com/apache/spark/commit/a48be1f2955c6bd73ebdf3b03fcdadd8eb347278).
[GitHub] spark pull request #14528: [SPARK-16940][SQL] `checkAnswer` should raise `Te...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14528
[GitHub] spark issue #14528: [SPARK-16940][SQL] `checkAnswer` should raise `TestFaile...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14528 Merged to master
[GitHub] spark pull request #14534: [SPARK-16941]Use concurrentHashMap instead of sca...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14534#discussion_r74020482 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala --- @@ -206,15 +206,16 @@ private[hive] class SparkExecuteStatementOperation( statementId, parentSession.getUsername) sqlContext.sparkContext.setJobGroup(statementId, statement) -sessionToActivePool.get(parentSession.getSessionHandle).foreach { pool => +val pool = sessionToActivePool.get(parentSession.getSessionHandle) +if(null != pool) { --- End diff -- `if (pool != null)`
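Taken together, the review suggestions on this PR (`private` ConcurrentHashMaps, `require` instead of throwing `HiveSQLException` on a null lookup, and the `pool != null` ordering) might look like this hypothetical condensed sketch — not the actual SparkSQLOperationManager code:

```scala
import java.util.concurrent.ConcurrentHashMap

class OperationManagerSketch {
  // private: callers go through methods, easing future refactoring.
  private val sessionToContexts = new ConcurrentHashMap[String, AnyRef]()

  def contextFor(sessionHandle: String): AnyRef = {
    // ConcurrentHashMap.get returns null when the key is absent,
    // unlike the scala.collection.mutable.Map apply() it replaces.
    val ctx = sessionToContexts.get(sessionHandle)
    // require raises IllegalArgumentException with this message when ctx is null.
    require(ctx != null, s"Session handle $sessionHandle not found")
    ctx
  }
}
```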
[GitHub] spark issue #14540: [SPARK-16950] [PySpark] fromOffsets parameter support in...
Github user szczeles commented on the issue: https://github.com/apache/spark/pull/14540 @holdenk I've checked the setSeed methods in MLlib and it seems py4j handles them well. If a function gets simple arguments (strings, numeric, bool), py4j applies conversion between types (see https://github.com/bartdag/py4j/blob/master/py4j-java/src/main/java/py4j/reflection/MethodInvoker.java#L99). For setSeed(Long), if the argument is mapped to Integer, it goes through toString and Long.parseLong (see https://github.com/bartdag/py4j/blob/master/py4j-java/src/main/java/py4j/reflection/TypeConverter.java#L88). Apparently, this conversion does not work for complex types like the fromOffsets map.
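The widening path described above can be sketched in a couple of lines (a hypothetical illustration of the behavior, not py4j's actual `TypeConverter` code):

```scala
// py4j-style widening for a Long parameter: a boxed Integer that arrived
// from a Python int is round-tripped through its string representation.
def widenToLong(i: java.lang.Integer): Long = java.lang.Long.parseLong(i.toString)

// No analogous path exists for complex types: a Python dict destined for a
// JVM map (such as fromOffsets) must be converted explicitly on the Python
// side before it crosses the py4j bridge.
```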
[GitHub] spark issue #14175: [SPARK-16522][MESOS] Spark application throws exception ...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14175 Merged to master, but it doesn't pick cleanly into 2.0, and the conflict in the tests wasn't entirely trivial. You can open another PR if it's important.
[GitHub] spark pull request #14175: [SPARK-16522][MESOS] Spark application throws exc...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14175
[GitHub] spark pull request #14533: [SPARK-16606] [CORE] Misleading warning for Spark...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14533
[GitHub] spark issue #14533: [SPARK-16606] [CORE] Misleading warning for SparkContext...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14533 Merged to master
[GitHub] spark pull request #14539: [SPARK-16947][SQL] Improve type coercion for inli...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/14539#discussion_r74018815 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala --- @@ -1192,8 +1192,8 @@ class DataFrameSuite extends QueryTest with SharedSQLContext { } test("SPARK-10740: handle nondeterministic expressions correctly for set operations") { -val df1 = (1 to 20).map(Tuple1.apply).toDF("i") -val df2 = (1 to 10).map(Tuple1.apply).toDF("i") +val df1 = spark.range(1, 20).select('id.cast("int").as("i")) +val df2 = spark.range(1, 10).select('id.cast("int").as("i")) --- End diff -- The syntax `(1 to 20).map(Tuple1.apply).toDF("i")` produces a `LocalRelation`. These `LocalRelation`s are subsequently `Union`'ed. The new optimizer rule reduces this Union into a single LocalRelation, which fails this test, because the new approach results in an already evaluated `LocalRelation` (using a different seed for the RNG), instead of a `Union` with two separate partitions.
[GitHub] spark issue #13868: [SPARK-15899] [SQL] Fix the construction of the file pat...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/13868 @avulanov can you have one more look at Marcelo's last small comments?
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74017865 --- Diff: R/pkg/R/functions.R --- @@ -3033,6 +3033,9 @@ setMethod("when", signature(condition = "Column", value = "ANY"), #' Evaluates a list of conditions and returns \code{yes} if the conditions are satisfied. #' Otherwise \code{no} is returned for unmatched conditions. #' +#' @param test a Column expression that describes the condition. +#' @param yes return values for true elements of test. --- End diff -- true -> TRUE false -> FALSE?
[GitHub] spark issue #14491: [SPARK-16886] [EXAMPLES][SQL] structured streaming netwo...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14491 @ganeshchand could you address his last comment?
[GitHub] spark pull request #14557: [SPARK-16709][CORE] Kill the running task if stag...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/14557#discussion_r74017373 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -1564,6 +1564,14 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli } } + def killTasks(tasks: HashSet[Long], taskInfo: HashMap[Long, TaskInfo]): Boolean = { --- End diff -- It is not suitable to add a public method here in `SparkContext`; `SparkContext` is a public entry point, and any method added here should be considered carefully. In your case it looks like only Spark internals will use this method, so why not directly change the `TaskSetManager`?
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74017287 --- Diff: R/pkg/R/generics.R --- @@ -1022,6 +1059,7 @@ setGeneric("month", function(x) { standardGeneric("month") }) #' @export setGeneric("months_between", function(y, x) { standardGeneric("months_between") }) +#' @param x a SparkDataFrame or a Column object. #' @rdname nrow --- End diff -- Here, for the "n" column function: it shouldn't be under rdname nrow, which is for count for DataFrame. I'd change this to a new rdname, `count`, and put "count" and "n" under that.
[GitHub] spark issue #14556: [SPARK-16966][Core] Make App Name to the valid name inst...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14556 Just so it's not missed, I have a slightly different proposal in the JIRA.
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74016459 --- Diff: R/pkg/R/generics.R --- @@ -1091,8 +1129,8 @@ setGeneric("reverse", function(x) { standardGeneric("reverse") }) #' @export setGeneric("rint", function(x, ...) { standardGeneric("rint") }) -#' @rdname row_number -#' @export +# @rdname row_number +# @export --- End diff -- `#'` changed to `#`?
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74016421 --- Diff: R/pkg/R/generics.R --- @@ -1046,8 +1084,8 @@ setGeneric("ntile", function(x) { standardGeneric("ntile") }) #' @export setGeneric("n_distinct", function(x, ...) { standardGeneric("n_distinct") }) -#' @rdname percent_rank -#' @export +# @rdname percent_rank --- End diff -- `#'` changed to `#`?
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74016338 --- Diff: R/pkg/R/mllib.R --- @@ -57,6 +57,9 @@ setClass("KMeansModel", representation(jobj = "jobj")) #' #' Saves the MLlib model to the input path. For more information, see the specific #' MLlib model below. +#' @param object a fitted ML model object. +#' @param path the directory where the model is saved. +#' @param ... additional argument(s) passed to the method. --- End diff -- does it complain about this? This rd does not have a function signature so it shouldn't ask to document parameters?
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74016268 --- Diff: R/pkg/R/mllib.R --- @@ -69,6 +72,8 @@ NULL #' #' Makes predictions from a MLlib model. For more information, see the specific #' MLlib model below. +#' @param object a fitted ML model object. +#' @param ... additional argument(s) passed to the method. --- End diff -- does it complain about this? This rd does not have a function signature so it shouldn't ask to document parameters?
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74016085 --- Diff: R/pkg/R/mllib.R --- @@ -82,15 +87,16 @@ NULL #' Users can call \code{summary} to print a summary of the fitted model, \code{predict} to make #' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models. #' -#' @param data SparkDataFrame for training. -#' @param formula A symbolic description of the model to be fitted. Currently only a few formula +#' @param data a SparkDataFrame for training. +#' @param formula a symbolic description of the model to be fitted. Currently only a few formula #'operators are supported, including '~', '.', ':', '+', and '-'. -#' @param family A description of the error distribution and link function to be used in the model. +#' @param family a description of the error distribution and link function to be used in the model. #' This can be a character string naming a family function, a family function or #' the result of a call to a family function. Refer R family at #' \url{https://stat.ethz.ch/R-manual/R-devel/library/stats/html/family.html}. -#' @param tol Positive convergence tolerance of iterations. -#' @param maxIter Integer giving the maximal number of IRLS iterations. +#' @param tol positive convergence tolerance of iterations. +#' @param maxIter integer giving the maximal number of IRLS iterations. +#' @param ... additional arguments passed to the method. --- End diff -- there is no `...` here in the signature?
[GitHub] spark issue #14180: Wheelhouse and VirtualEnv support
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14180 Yes I am back from vacation! Can work on it now :)
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74015938 --- Diff: R/pkg/R/mllib.R --- @@ -298,14 +304,15 @@ setMethod("summary", signature(object = "NaiveBayesModel"), #' Users can call \code{summary} to print a summary of the fitted model, \code{predict} to make #' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models. #' -#' @param data SparkDataFrame for training -#' @param formula A symbolic description of the model to be fitted. Currently only a few formula +#' @param data a SparkDataFrame for training. +#' @param formula a symbolic description of the model to be fitted. Currently only a few formula #'operators are supported, including '~', '.', ':', '+', and '-'. #'Note that the response variable of formula is empty in spark.kmeans. -#' @param k Number of centers -#' @param maxIter Maximum iteration number -#' @param initMode The initialization algorithm choosen to fit the model -#' @return \code{spark.kmeans} returns a fitted k-means model +#' @param ... additional argument(s) passed to the method. --- End diff -- there is no `...` here?
[GitHub] spark issue #14556: [SPARK-16966][Core] Make App Name to the valid name inst...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/14556 Would you please add a unit test to verify the changes?
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74015800 --- Diff: R/pkg/R/mllib.R --- @@ -346,8 +353,11 @@ setMethod("spark.kmeans", signature(data = "SparkDataFrame", formula = "formula" #' Get fitted result from a k-means model, similarly to R's fitted(). #' Note: A saved-loaded model does not support this method. #' -#' @param object A fitted k-means model -#' @return \code{fitted} returns a SparkDataFrame containing fitted values +#' @param object a fitted k-means model. +#' @param method type of fitted results, `"centers"` for cluster centers --- End diff -- I wouldn't put it in both ` and " - roxygen2 doesn't really handle `.
[GitHub] spark pull request #14559: [SPARK-16968]Add additional options in jdbc when ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14559#discussion_r74015718

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -447,7 +447,16 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
     // Create the table if the table didn't exist.
     if (!tableExists) {
       val schema = JdbcUtils.schemaString(df, url)
-      val sql = s"CREATE TABLE $table ($schema)"
+      // To allow certain options to append when create a new table, which can be
+      // table_options or partition_options.
+      // E.g., "CREATE TABLE t (name string) ENGINE=InnoDB DEFAULT CHARSET=utf8"
+      val createtblOptions = {
+        extraOptions.get("jdbc.create.table.options") match {
+          case Some(value) => " " + value
+          case None => ""
+        }
+      }
+      val sql = s"CREATE TABLE $table ($schema)" + createtblOptions
--- End diff --

Why not also use interpolation for the new var?
[GitHub] spark pull request #14559: [SPARK-16968]Add additional options in jdbc when ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14559#discussion_r74015696

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -447,7 +447,16 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
     // Create the table if the table didn't exist.
     if (!tableExists) {
       val schema = JdbcUtils.schemaString(df, url)
-      val sql = s"CREATE TABLE $table ($schema)"
+      // To allow certain options to append when create a new table, which can be
+      // table_options or partition_options.
+      // E.g., "CREATE TABLE t (name string) ENGINE=InnoDB DEFAULT CHARSET=utf8"
+      val createtblOptions = {
+        extraOptions.get("jdbc.create.table.options") match {
--- End diff --

Probably need a different prop name starting with spark. See other option naming conventions. The outer scope isn't necessary.
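Taken together, the two review comments suggest dropping the extra scope around the `match`, using string interpolation for the new value, and picking a different option name. A minimal standalone sketch of that shape (the option name "createTableOptions" is only a candidate floated in the thread, not the final API, and `extraOptions` is stubbed as a plain `Map` here):

```scala
// Stand-in for DataFrameWriter's extraOptions; the real field is a mutable map
// populated via .option(...). "createTableOptions" is a proposed name, not final.
val extraOptions = Map("createTableOptions" -> "ENGINE=InnoDB DEFAULT CHARSET=utf8")
val table = "t"
val schema = "name TEXT"

// No outer { ... } block needed: resolve the option with map/getOrElse,
// then splice it into the interpolated string directly.
val createTableOptions = extraOptions.get("createTableOptions")
  .map(" " + _)
  .getOrElse("")
val sql = s"CREATE TABLE $table ($schema)$createTableOptions"

println(sql)  // CREATE TABLE t (name TEXT) ENGINE=InnoDB DEFAULT CHARSET=utf8
```

When the option is absent, `getOrElse("")` leaves the statement unchanged, so existing callers are unaffected.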
[GitHub] spark pull request #14539: [SPARK-16947][SQL] Improve type coercion for inli...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/14539#discussion_r74015279

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala ---
@@ -756,16 +756,20 @@ case class Repartition(numPartitions: Int, shuffle: Boolean, child: LogicalPlan)
 /**
  * A relation with one row. This is used in "SELECT ..." without a from clause.
  */
-case object OneRowRelation extends LeafNode {
+abstract class AbstractOneRowRelation extends LeafNode {
--- End diff --

Yeah that is fair
[GitHub] spark issue #14547: [SPARK-16718][MLlib] gbm-style treeboost [WIP]
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14547

**[Test build #63428 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63428/consoleFull)** for PR 14547 at commit [`b4e5e6c`](https://github.com/apache/spark/commit/b4e5e6cc6a48ba5160c9aa8a0e03800f193b561e).
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74014809

--- Diff: R/pkg/R/mllib.R ---
@@ -563,11 +574,12 @@ read.ml <- function(path) {
 #' \code{predict} to make predictions on new data, and \code{write.ml}/\code{read.ml} to
 #' save/load fitted models.
 #'
-#' @param data A SparkDataFrame for training
-#' @param formula A symbolic description of the model to be fitted. Currently only a few formula
+#' @param data a SparkDataFrame for training.
+#' @param formula a symbolic description of the model to be fitted. Currently only a few formula
 #'                operators are supported, including '~', ':', '+', and '-'.
 #'                Note that operator '.' is not supported currently
-#' @return \code{spark.survreg} returns a fitted AFT survival regression model
+#' @param ... additional argument(s) passed to the method.
--- End diff --

or document as `Currently not used.` like http://ugrad.stat.ubc.ca/R/library/e1071/html/predict.naiveBayes.html
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74014676

--- Diff: R/pkg/R/mllib.R ---
@@ -414,11 +425,12 @@ setMethod("predict", signature(object = "KMeansModel"),
 #' predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
 #' Only categorical data is supported.
 #'
-#' @param data A \code{SparkDataFrame} of observations and labels for model fitting
-#' @param formula A symbolic description of the model to be fitted. Currently only a few formula
+#' @param data a \code{SparkDataFrame} of observations and labels for model fitting.
+#' @param formula a symbolic description of the model to be fitted. Currently only a few formula
 #'                operators are supported, including '~', '.', ':', '+', and '-'.
-#' @param smoothing Smoothing parameter
-#' @return \code{spark.naiveBayes} returns a fitted naive Bayes model
+#' @param smoothing smoothing parameter.
+#' @param ... additional parameter(s) passed to the method.
--- End diff --

same here - `...` are unused
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74014544

--- Diff: R/pkg/R/mllib.R ---
@@ -563,11 +574,12 @@ read.ml <- function(path) {
 #' \code{predict} to make predictions on new data, and \code{write.ml}/\code{read.ml} to
 #' save/load fitted models.
 #'
-#' @param data A SparkDataFrame for training
-#' @param formula A symbolic description of the model to be fitted. Currently only a few formula
+#' @param data a SparkDataFrame for training.
+#' @param formula a symbolic description of the model to be fitted. Currently only a few formula
 #'                operators are supported, including '~', ':', '+', and '-'.
 #'                Note that operator '.' is not supported currently
-#' @return \code{spark.survreg} returns a fitted AFT survival regression model
+#' @param ... additional argument(s) passed to the method.
--- End diff --

there are a few cases where they are not clear why `...` should be in the function signature. I think we should remove them since they are not used
[GitHub] spark issue #14559: [SPARK-16968]Add additional options in jdbc when creatin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14559

**[Test build #63427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63427/consoleFull)** for PR 14559 at commit [`b302b1c`](https://github.com/apache/spark/commit/b302b1c7ec75ae1e78d132f7ecdb9bb7f33816d4).
[GitHub] spark issue #13146: [SPARK-13081][PYSPARK][SPARK_SUBMIT]. Allow set pythonEx...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13146

Merged build finished. Test PASSed.
[GitHub] spark issue #13146: [SPARK-13081][PYSPARK][SPARK_SUBMIT]. Allow set pythonEx...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13146

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63420/
[GitHub] spark issue #13146: [SPARK-13081][PYSPARK][SPARK_SUBMIT]. Allow set pythonEx...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13146

**[Test build #63420 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63420/consoleFull)** for PR 13146 at commit [`8119f6d`](https://github.com/apache/spark/commit/8119f6ded867a8b2e0b212f3247f52278b9e8c28).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74013650

--- Diff: R/pkg/R/sparkR.R ---
@@ -328,6 +328,7 @@ sparkRHive.init <- function(jsc = NULL) {
 #' @param sparkPackages Character vector of packages from spark-packages.org
 #' @param enableHiveSupport Enable support for Hive, fallback if not built with Hive support; once
 #'        set, this cannot be turned off on an existing session
+#' @param ... additional parameters passed to the method
--- End diff --

I'd clarify as in L 317, for example, "named Spark properties passed to the method"
[GitHub] spark pull request #14559: [SPARK-16968]Add additional options in jdbc when ...
GitHub user GraceH opened a pull request: https://github.com/apache/spark/pull/14559

[SPARK-16968]Add additional options in jdbc when creating a new table

## What changes were proposed in this pull request?
This PR allows the user to add additional options when creating a new table in the JDBC writer. The options can be table_options or partition_options. E.g., "CREATE TABLE t (name string) ENGINE=InnoDB DEFAULT CHARSET=utf8"

## How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) Will supply test results soon.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/GraceH/spark jdbc_options

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14559.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14559

commit b302b1c7ec75ae1e78d132f7ecdb9bb7f33816d4
Author: GraceH <93113...@qq.com>
Date: 2016-08-09T06:47:51Z
Add additional options in jdbc when creating a new table
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74013042

--- Diff: R/pkg/R/generics.R ---
@@ -465,10 +477,14 @@ setGeneric("dapply", function(x, func, schema) { standardGeneric("dapply") })
 #' @export
 setGeneric("dapplyCollect", function(x, func) { standardGeneric("dapplyCollect") })

+#' @param x a SparkDataFrame or GroupedData.
+#' @param ... additional argument(s) passed to the method.
 #' @rdname gapply
 #' @export
 setGeneric("gapply", function(x, ...) { standardGeneric("gapply") })

+#' @param x a SparkDataFrame or GroupedData.
--- End diff --

same here for gapplyCollect
[GitHub] spark issue #14552: [SPARK-16952] don't lookup spark home directory when exe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14552

**[Test build #63426 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63426/consoleFull)** for PR 14552 at commit [`a19cec7`](https://github.com/apache/spark/commit/a19cec746c3314fa12844adcd04eeb9fb900cd46).
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74013005

--- Diff: R/pkg/R/generics.R ---
@@ -465,10 +477,14 @@ setGeneric("dapply", function(x, func, schema) { standardGeneric("dapply") })
 #' @export
 setGeneric("dapplyCollect", function(x, func) { standardGeneric("dapplyCollect") })

+#' @param x a SparkDataFrame or GroupedData.
--- End diff --

gapply is only for GroupedData?
[GitHub] spark pull request #14558: [SPARK-16508][SparkR] Fix warnings on undocumente...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/14558#discussion_r74012847

--- Diff: R/pkg/R/generics.R ---
@@ -395,6 +396,9 @@ setGeneric("value", function(bcast) { standardGeneric("value") })

 SparkDataFrame Methods

+#' @param x a SparkDataFrame or GroupedData.
--- End diff --

Hmm.. I see why this would be a place for it. I think it would be easier to maintain if the documentation is next to the function body instead of the generics, but I haven't completely figured out the best way to do it yet. The approach we have so far is to keep most of the tag/doc on one of the definitions - do you think it would work here?
[GitHub] spark issue #14552: [SPARK-16952] don't lookup spark home directory when exe...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14552

Seems OK to me.
[GitHub] spark issue #14552: [SPARK-16952] don't lookup spark home directory when exe...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14552

Jenkins retest this please
[GitHub] spark issue #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and GROUP ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14546

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63419/
[GitHub] spark issue #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and GROUP ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14546

Merged build finished. Test PASSed.
[GitHub] spark issue #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and GROUP ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14546

**[Test build #63419 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63419/consoleFull)** for PR 14546 at commit [`1dc193a`](https://github.com/apache/spark/commit/1dc193a15fa02359bf3e767662c7ef633464caac).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #13146: [SPARK-13081][PYSPARK][SPARK_SUBMIT]. Allow set pythonEx...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13146

Merged build finished. Test PASSed.
[GitHub] spark issue #13146: [SPARK-13081][PYSPARK][SPARK_SUBMIT]. Allow set pythonEx...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13146

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63417/
[GitHub] spark issue #13146: [SPARK-13081][PYSPARK][SPARK_SUBMIT]. Allow set pythonEx...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13146

**[Test build #63417 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63417/consoleFull)** for PR 13146 at commit [`3826f33`](https://github.com/apache/spark/commit/3826f3340785a4f3e1c0ad92bd0bfff32a3525c0).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and GROUP ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14546

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63418/
[GitHub] spark pull request #14519: [SPARK-16933] [ML] Fix AFTAggregator in AFTSurviv...
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/14519#discussion_r74011434

--- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala ---
@@ -478,21 +482,23 @@ object AFTSurvivalRegressionModel extends MLReadable[AFTSurvivalRegressionModel]
  *    $$
  *
  *
- * @param parameters including three part: The log of scale parameter, the intercept and
- *                   regression coefficients corresponding to the features.
+ * @param bcParameters The broadcasted value includes three part: The log of scale parameter,
+ *                     the intercept and regression coefficients corresponding to the features.
  * @param fitIntercept Whether to fit an intercept term.
- * @param featuresStd The standard deviation values of the features.
+ * @param bcFeaturesStd The broadcast standard deviation values of the features.
  */
 private class AFTAggregator(
-    parameters: BDV[Double],
+    bcParameters: Broadcast[BDV[Double]],
     fitIntercept: Boolean,
-    featuresStd: Array[Double]) extends Serializable {
+    bcFeaturesStd: Broadcast[Array[Double]]) extends Serializable {
+  // make transient so we do not serialize between aggregation stages
+  @transient private lazy val parameters = bcParameters.value
   // the regression coefficients to the covariates
-  private val coefficients = parameters.slice(2, parameters.length)
-  private val intercept = parameters(1)
+  @transient private lazy val coefficients = parameters.slice(2, parameters.length)
+  @transient private lazy val intercept = parameters(1)
   // sigma is the scale parameter of the AFT model
-  private val sigma = math.exp(parameters(0))
+  @transient private lazy val sigma = math.exp(parameters(0))
--- End diff --

If we use `@transient val xxx = ...` as a class member, the compiler generates the assignment code in the class constructor. When deserializing, the deserializer does not initialize this val (deserialization does not call the constructor), so it will surely be null. `@transient lazy val xxx = ...` uses a different mechanism: the value is generated and assigned when the val is first used, so it does not have the problem above.
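The difference described above can be demonstrated with plain Java serialization; this is a minimal standalone sketch with invented class and field names, not code from the PR:

```scala
import java.io._

// Invented example class: one @transient val (assigned only in the constructor,
// so it is lost on deserialization) and one @transient lazy val (recomputed from
// the serialized "words" field on first access after deserialization).
class Holder(val words: Array[String]) extends Serializable {
  @transient val eagerFirst: String = words(0)
  @transient lazy val lazyFirst: String = words(0)
}

// Serialize to a byte array and read the object back, as Spark's closure
// serializer would between aggregation stages.
def roundTrip[T](obj: T): T = {
  val buf = new ByteArrayOutputStream()
  val out = new ObjectOutputStream(buf)
  out.writeObject(obj)
  out.close()
  new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
    .readObject().asInstanceOf[T]
}

val restored = roundTrip(new Holder(Array("spark", "ml")))
println(restored.eagerFirst) // null  -- constructor is not re-run on deserialize
println(restored.lazyFirst)  // spark -- lazy initializer runs on first access
```

This is why the patch switches the broadcast-derived members to `@transient lazy val`: they stay out of the serialized task closure yet are still usable on the executors.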
[GitHub] spark issue #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and GROUP ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14546

Merged build finished. Test PASSed.
[GitHub] spark issue #14546: [SPARK-16955][SQL] Using ordinals in ORDER BY and GROUP ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14546

**[Test build #63418 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63418/consoleFull)** for PR 14546 at commit [`1ca8d59`](https://github.com/apache/spark/commit/1ca8d59dc3f94dd491740ae89f4d8c8223b11944).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #14558: [SPARK-16508][SparkR] Fix warnings on undocumented/dupli...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/14558

Ah, I'm actually about half way through this as well, but let's review yours.
[GitHub] spark issue #14517: [SPARK-16931][PYTHON] PySpark APIS for bucketBy and sort...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14517

Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63422/
[GitHub] spark issue #14517: [SPARK-16931][PYTHON] PySpark APIS for bucketBy and sort...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14517

Merged build finished. Test FAILed.
[GitHub] spark issue #14517: [SPARK-16931][PYTHON] PySpark APIS for bucketBy and sort...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14517

**[Test build #63422 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63422/consoleFull)** for PR 14517 at commit [`31c43e6`](https://github.com/apache/spark/commit/31c43e6f3d9544478142990b4968fb105d8a03d4).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #13988: [SPARK-16101][SQL] Refactoring CSV data source to be con...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13988

**[Test build #63425 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63425/consoleFull)** for PR 13988 at commit [`a634435`](https://github.com/apache/spark/commit/a63443505483287fa9bb20312a24b38e75f90588).
[GitHub] spark pull request #14384: [Spark-16443][SparkR] Alternating Least Squares (...
Github user junyangq commented on a diff in the pull request: https://github.com/apache/spark/pull/14384#discussion_r74009550

--- Diff: R/pkg/R/mllib.R ---
@@ -632,3 +642,147 @@ setMethod("predict", signature(object = "AFTSurvivalRegressionModel"),
           function(object, newData) {
             return(dataFrame(callJMethod(object@jobj, "transform", newData@sdf)))
           })
+
+
+#' Alternating Least Squares (ALS) for Collaborative Filtering
+#'
+#' \code{spark.als} learns latent factors in collaborative filtering via alternating least
+#' squares. Users can call \code{summary} to obtain fitted latent factors, \code{predict}
+#' to make predictions on new data, and \code{write.ml}/\code{read.ml} to save/load fitted models.
+#'
+#' For more details, see
+#' \href{http://spark.apache.org/docs/latest/ml-collaborative-filtering.html}{MLlib:
+#' Collaborative Filtering}.
+#' Additional arguments can be passed to the methods.
+#' \describe{
+#'   \item{nonnegative}{logical value indicating whether to apply nonnegativity constraints.
+#'                      Default: FALSE}
+#'   \item{implicitPrefs}{logical value indicating whether to use implicit preference.
+#'                        Default: FALSE}
+#'   \item{alpha}{alpha parameter in the implicit preference formulation (>= 0). Default: 1.0}
+#'   \item{seed}{integer seed for random number generation. Default: 0}
+#'   \item{numUserBlocks}{number of user blocks used to parallelize computation (> 0).
+#'                        Default: 10}
+#'   \item{numItemBlocks}{number of item blocks used to parallelize computation (> 0).
+#'                        Default: 10}
+#'   \item{checkpointInterval}{number of checkpoint intervals (>= 1) or disable checkpoint (-1).
+#'                             Default: 10}
+#' }
+#'
+#' @param data A SparkDataFrame for training
+#' @param ratingCol column name for ratings
+#' @param userCol column name for user ids. Ids must be (or can be coerced into) integers
+#' @param itemCol column name for item ids. Ids must be (or can be coerced into) integers
+#' @param rank rank of the matrix factorization (> 0)
+#' @param reg regularization parameter (>= 0)
+#' @param maxIter maximum number of iterations (>= 0)
+
+#' @return \code{spark.als} returns a fitted ALS model
+#' @rdname spark.als
+#' @aliases spark.als,SparkDataFrame
+#' @name spark.als
+#' @export
+#' @examples
+#' \dontrun{
+#' df <- createDataFrame(ratings)
--- End diff --

Good point. Thanks!
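The roxygen block above documents the knobs of `spark.als` (`rank`, `reg`, `maxIter`) without showing what alternating least squares actually does. A minimal NumPy sketch of the alternating updates may help; this is an illustrative dense toy, not SparkR's API or MLlib's blocked implementation, and the names `als`, `R`, `U`, `V` are hypothetical:

```python
import numpy as np

def als(R, rank=2, reg=0.1, max_iter=10, seed=0):
    """Factor a dense ratings matrix R ~= U @ V.T by alternating
    regularized least squares (toy version: every cell is treated
    as an observed rating)."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(size=(n_users, rank))
    V = rng.normal(size=(n_items, rank))
    I = reg * np.eye(rank)
    for _ in range(max_iter):
        # Fix V, solve the normal equations for all user factors at once:
        # U_u = (V^T V + reg*I)^{-1} V^T R[u, :]
        U = np.linalg.solve(V.T @ V + I, V.T @ R.T).T
        # Fix U, solve symmetrically for the item factors.
        V = np.linalg.solve(U.T @ U + I, U.T @ R).T
    return U, V
```

Each half-step is a closed-form ridge regression, which is why the objective decreases monotonically; the real implementation distributes the user/item factor solves across `numUserBlocks`/`numItemBlocks` partitions.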
[GitHub] spark pull request #14555: [SPARK-16965][MLLIB][PYSPARK] Fix bound checking ...
Github user zjffdu commented on a diff in the pull request: https://github.com/apache/spark/pull/14555#discussion_r74009150

--- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala ---
@@ -560,11 +554,25 @@ class SparseVector @Since("2.0.0") (
     @Since("2.0.0") val indices: Array[Int],
     @Since("2.0.0") val values: Array[Double]) extends Vector {

-  require(indices.length == values.length, "Sparse vectors require that the dimension of the" +
-    s" indices match the dimension of the values. You provided ${indices.length} indices and " +
-    s" ${values.length} values.")
-  require(indices.length <= size, s"You provided ${indices.length} indices and values, " +
-    s"which exceeds the specified vector size ${size}.")
+  validate()
--- End diff --

Two reasons:
* It groups the validation code together.
* I may define some temporary variables for the validation; without a separate method, they would become members of SparseVector.
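The `require()` calls being folded into `validate()` encode two invariants on a sparse vector's constructor arguments. A rough Python sketch of those same checks (the helper name `validate_sparse` is hypothetical; the messages mirror the Scala ones shown in the diff):

```python
def validate_sparse(size, indices, values):
    """Check the invariants a sparse vector's constructor must enforce."""
    # Invariant 1: one value per index.
    if len(indices) != len(values):
        raise ValueError(
            "Sparse vectors require that the dimension of the indices "
            f"match the dimension of the values. You provided {len(indices)} "
            f"indices and {len(values)} values.")
    # Invariant 2: no more non-zeros than the declared vector size.
    if len(indices) > size:
        raise ValueError(
            f"You provided {len(indices)} indices and values, "
            f"which exceeds the specified vector size {size}.")
```

The PR title suggests the patch adds further bound checking (e.g. on the index values themselves), which is exactly the kind of growth that makes a single `validate()` method easier to maintain than inline `require()` calls.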
[GitHub] spark issue #14557: [SPARK-16709][CORE] Kill the running task if stage faile...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14557

**[Test build #63424 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63424/consoleFull)** for PR 14557 at commit [`9263678`](https://github.com/apache/spark/commit/926367815a262c89a24f86fd735348f493e64881).