[GitHub] spark issue #21889: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21889 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22071: [SPARK-25088][CORE][MESOS][DOCS] Update Rest Server docs...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/22071 In this case it's maybe OK. Perhaps just release-note this iff there's another 2.2.x or 2.1.x release?
[GitHub] spark issue #21889: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21889 **[Test build #94790 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94790/testReport)** for PR 21889 at commit [`1c0c4bf`](https://github.com/apache/spark/commit/1c0c4bf14172dd2257fe1d00fc0aeed78aa1cb84).
[GitHub] spark issue #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL support...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22107 **[Test build #94789 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94789/testReport)** for PR 22107 at commit [`1d93304`](https://github.com/apache/spark/commit/1d93304290909617c1ddb794f3599907d09cad3d).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20226: [SPARK-23034][SQL] Override `nodeName` for all *ScanExec...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20226 **[Test build #94786 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94786/testReport)** for PR 20226 at commit [`1facc05`](https://github.com/apache/spark/commit/1facc0554aae0829a19bbb7607b25ff7eda4ef8d).
* This patch **fails due to an unknown error code, -9**.
* This patch **does not merge cleanly**.
* This patch adds no public classes.
[GitHub] spark issue #22013: [SPARK-23939][SQL] Add transform_keys function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22013 **[Test build #94788 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94788/testReport)** for PR 22013 at commit [`e5d9b05`](https://github.com/apache/spark/commit/e5d9b051b027cf86fbcd82701f54e50f1aeac7f6).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20611 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94787/ Test FAILed.
[GitHub] spark issue #20226: [SPARK-23034][SQL] Override `nodeName` for all *ScanExec...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20226 Build finished. Test FAILed.
[GitHub] spark issue #22013: [SPARK-23939][SQL] Add transform_keys function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22013 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94788/ Test FAILed.
[GitHub] spark issue #22013: [SPARK-23939][SQL] Add transform_keys function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22013 Merged build finished. Test FAILed.
[GitHub] spark issue #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL support...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22107 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94789/ Test FAILed.
[GitHub] spark issue #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL support...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22107 Merged build finished. Test FAILed.
[GitHub] spark issue #21661: [SPARK-24685][build] Restore support for building old Ha...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21661 **[Test build #94784 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94784/testReport)** for PR 21661 at commit [`34ca4ef`](https://github.com/apache/spark/commit/34ca4ef2b0757b2832ac2dbcc364b42eb4f34e48).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20226: [SPARK-23034][SQL] Override `nodeName` for all *ScanExec...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20226 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94786/ Test FAILed.
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20611 Merged build finished. Test FAILed.
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20611 **[Test build #94787 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94787/testReport)** for PR 20611 at commit [`5ad2d58`](https://github.com/apache/spark/commit/5ad2d58216cba3eb2b89621086696352b33d456f).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21661: [SPARK-24685][build] Restore support for building old Ha...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21661 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94784/ Test FAILed.
[GitHub] spark issue #21661: [SPARK-24685][build] Restore support for building old Ha...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21661 Merged build finished. Test FAILed.
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20611 retest this please
[GitHub] spark issue #21661: [SPARK-24685][build] Restore support for building old Ha...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21661 retest this please
[GitHub] spark issue #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL support...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22107 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2204/ Test PASSed.
[GitHub] spark issue #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL support...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22107 retest this please
[GitHub] spark issue #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL support...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22107 Merged build finished. Test PASSed.
[GitHub] spark issue #21661: [SPARK-24685][build] Restore support for building old Ha...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21661 **[Test build #94792 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94792/testReport)** for PR 21661 at commit [`34ca4ef`](https://github.com/apache/spark/commit/34ca4ef2b0757b2832ac2dbcc364b42eb4f34e48).
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20611 **[Test build #94793 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94793/testReport)** for PR 20611 at commit [`5ad2d58`](https://github.com/apache/spark/commit/5ad2d58216cba3eb2b89621086696352b33d456f).
[GitHub] spark issue #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL support...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22107 **[Test build #94791 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94791/testReport)** for PR 22107 at commit [`1d93304`](https://github.com/apache/spark/commit/1d93304290909617c1ddb794f3599907d09cad3d).
[GitHub] spark issue #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL support...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22107 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2205/ Test PASSed.
[GitHub] spark issue #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL support...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22107 Merged build finished. Test PASSed.
[GitHub] spark issue #21661: [SPARK-24685][build] Restore support for building old Ha...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21661 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2206/ Test PASSed.
[GitHub] spark issue #21661: [SPARK-24685][build] Restore support for building old Ha...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21661 Merged build finished. Test PASSed.
[GitHub] spark issue #22095: [SPARK-23984][K8S] Changed Python Version config to be c...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/22095 @mccheah @foxish
[GitHub] spark issue #21889: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21889 **[Test build #94790 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94790/testReport)** for PR 21889 at commit [`1c0c4bf`](https://github.com/apache/spark/commit/1c0c4bf14172dd2257fe1d00fc0aeed78aa1cb84).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #21889: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21889 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94790/ Test FAILed.
[GitHub] spark issue #21889: [SPARK-4502][SQL] Parquet nested column pruning - founda...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21889 Merged build finished. Test FAILed.
[GitHub] spark issue #21718: [SPARK-24744][STRUCTRURED STREAMING] Set the SparkSessio...
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/21718 @bjkonglu @bethunebtj @wguangliang Update: I thought about decoupling the number of execution tasks from the number of data partitions (`spark.sql.shuffle.partitions`), and it turned out this can be achieved by calling `coalesce`. With `coalesce` you can reduce the number of execution tasks while the number of data partitions stays the same. Please note that we still can't change `spark.sql.shuffle.partitions`, since repartitioning the state is not trivial, depending on the size of the state. One thing to note is that the number of execution tasks is also reduced for downstream operators (unless there's a new stage), so you need to call `repartition` to adjust the number of execution tasks downstream.
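The distinction between execution tasks and data partitions in the comment above can be pictured with a toy model. This is a minimal sketch in plain Scala, not Spark code: `CoalesceModel`, `dataPartition`, and the contiguous grouping rule are made up for illustration, and the value 8 merely stands in for `spark.sql.shuffle.partitions`.

```scala
object CoalesceModel {
  // Assumed stand-in for spark.sql.shuffle.partitions (hypothetical value).
  val numDataPartitions: Int = 8

  // A key's data partition depends only on numDataPartitions,
  // so it does not change when the number of tasks changes.
  def dataPartition(key: String): Int =
    Math.floorMod(key.hashCode, numDataPartitions)

  // Coalescing groups whole data partitions into fewer tasks.
  // The contiguous grouping used here is an assumption of this toy model.
  def coalesce(numTasks: Int): Map[Int, Seq[Int]] =
    (0 until numDataPartitions).groupBy(p => p * numTasks / numDataPartitions)
}
```

In this model `CoalesceModel.coalesce(2)` yields two tasks that together still cover all eight data partitions, while `dataPartition` never consults the task count.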
[GitHub] spark issue #21909: [SPARK-24959][SQL] Speed up count() for JSON and CSV
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/21909 @HyukjinKwon @maropu Please, have a look at the PR.
[GitHub] spark pull request #22013: [SPARK-23939][SQL] Add transform_keys function
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/22013#discussion_r210193484

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -497,6 +497,65 @@ case class ArrayAggregate(
   override def prettyName: String = "aggregate"
 }

+/**
+ * Transform Keys for every entry of the map by applying the transform_keys function.
+ * Returns map with transformed key entries
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(expr, func) - Transforms elements in a map using the function.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + 1);
+       map(array(2, 3, 4), array(1, 2, 3))
+      > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + v);
+       map(array(2, 4, 6), array(1, 2, 3))
+  """,
+  since = "2.4.0")
+case class TransformKeys(
+    argument: Expression,
+    function: Expression)
+  extends MapBasedSimpleHigherOrderFunction with CodegenFallback {
+
+  override def nullable: Boolean = argument.nullable
+
+  override def dataType: DataType = {
+    val map = argument.dataType.asInstanceOf[MapType]
+    MapType(function.dataType, map.valueType, map.valueContainsNull)
+  }
+
+  @transient val MapType(keyType, valueType, valueContainsNull) = argument.dataType
+
+  override def bind(f: (Expression, Seq[(DataType, Boolean)]) => LambdaFunction): TransformKeys = {
+    copy(function = f(function, (keyType, false) :: (valueType, valueContainsNull) :: Nil))
+  }
+
+  @transient lazy val (keyVar, valueVar) = {
+    val LambdaFunction(
+      _, (keyVar: NamedLambdaVariable) :: (valueVar: NamedLambdaVariable) :: Nil, _) = function
+    (keyVar, valueVar)
+  }
--- End diff --

Sorry, I meant we don't need to surround it with:
```scala
@transient lazy val (keyVar, valueVar) = {
  ...
  (keyVar, valueVar)
}
```
just
```scala
@transient lazy val LambdaFunction(_, (keyVar: NamedLambdaVariable) :: (valueVar: NamedLambdaVariable) :: Nil, _) = function
```
should work.
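The review suggestion above relies on Scala allowing a pattern on the left-hand side of a `lazy val`. The following is a minimal sketch of that equivalence, written for Scala 2.12/2.13; `Variable` and `Lambda` are hypothetical stand-ins for `NamedLambdaVariable` and `LambdaFunction`, not Spark's classes.

```scala
object PatternValSketch {
  // Hypothetical stand-ins for NamedLambdaVariable and LambdaFunction.
  final case class Variable(name: String)
  final case class Lambda(body: String, arguments: List[Variable], hidden: Boolean)

  val function: Lambda =
    Lambda("k + 1", List(Variable("k"), Variable("v")), hidden = false)

  // Wrapped form: destructure inside a block, then return a tuple.
  lazy val (keyVarWrapped, valueVarWrapped) = {
    val Lambda(_, (k: Variable) :: (v: Variable) :: Nil, _) = function
    (k, v)
  }

  // Direct form: the pattern itself binds both names, no tuple needed.
  lazy val Lambda(_, (keyVar: Variable) :: (valueVar: Variable) :: Nil, _) = function
}
```

Both forms bind the same two values; the direct form simply drops the intermediate block and tuple. Either way the pattern is refutable, so a lambda with a different number of arguments would fail with a `MatchError` when the value is first accessed.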
[GitHub] spark pull request #22013: [SPARK-23939][SQL] Add transform_keys function
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/22013#discussion_r210193591

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala ---
@@ -497,6 +497,65 @@ case class ArrayAggregate(
   override def prettyName: String = "aggregate"
 }

+/**
+ * Transform Keys for every entry of the map by applying the transform_keys function.
+ * Returns map with transformed key entries
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(expr, func) - Transforms elements in a map using the function.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + 1);
+       map(array(2, 3, 4), array(1, 2, 3))
+      > SELECT _FUNC_(map(array(1, 2, 3), array(1, 2, 3)), (k, v) -> k + v);
+       map(array(2, 4, 6), array(1, 2, 3))
+  """,
+  since = "2.4.0")
+case class TransformKeys(
+    argument: Expression,
+    function: Expression)
+  extends MapBasedSimpleHigherOrderFunction with CodegenFallback {
+
+  override def nullable: Boolean = argument.nullable
+
+  override def dataType: DataType = {
+    val map = argument.dataType.asInstanceOf[MapType]
+    MapType(function.dataType, map.valueType, map.valueContainsNull)
--- End diff --

What about this?
[GitHub] spark issue #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL support...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22107 **[Test build #94791 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94791/testReport)** for PR 22107 at commit [`1d93304`](https://github.com/apache/spark/commit/1d93304290909617c1ddb794f3599907d09cad3d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL support...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22107 Merged build finished. Test PASSed.
[GitHub] spark issue #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL support...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22107 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94791/ Test PASSed.
[GitHub] spark issue #21860: [SPARK-24901][SQL]Merge the codegen of RegularHashMap an...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21860 **[Test build #94794 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94794/testReport)** for PR 21860 at commit [`e9b088d`](https://github.com/apache/spark/commit/e9b088df421129605059eae417c9abafda34e76a).
[GitHub] spark pull request #22110: [SPARK-25122][SQL] Deduplication of supports equa...
GitHub user mn-mikke opened a pull request: https://github.com/apache/spark/pull/22110

[SPARK-25122][SQL] Deduplication of supports equals code

## What changes were proposed in this pull request?

The method ```*supportEquals```, which determines whether elements of a data type can be used as items in a hash set or as keys in a hash map, is duplicated across multiple collection and higher-order functions. This PR proposes deduplicating the method.

## How was this patch tested?

Run tests in:
- DataFrameFunctionsSuite
- CollectionExpressionsSuite
- HigherOrderExpressionsSuite

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mn-mikke/spark SPARK-25122

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22110.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22110

commit dd292e8cf3ed1788793e626da3a136e9acb9d81c
Author: Marek Novotny
Date: 2018-08-15T08:18:05Z

    [SPARK-25122][SQL] Deduplication of supports equals code
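For readers wondering why a check like this is needed at all, here is a small plain-Scala illustration of values that do and do not behave well as hash-set items. It is only a sketch of general JVM behavior; the PR description does not say which data types fail the check, so treating byte arrays as the motivating case is an assumption. `EqualsDemo` is a hypothetical name.

```scala
import scala.collection.mutable

object EqualsDemo {
  // Strings compare by value, so the duplicate collapses.
  val strings: mutable.HashSet[String] = mutable.HashSet("a", "a")

  // JVM arrays compare by reference, so two arrays holding the
  // same bytes are kept as two distinct set items.
  val byteArrays: mutable.HashSet[Array[Byte]] =
    mutable.HashSet(Array[Byte](1, 2), Array[Byte](1, 2))

  // Wrapping the arrays in a Seq restores value-based equality.
  val wrapped: mutable.HashSet[Seq[Byte]] =
    mutable.HashSet(Array[Byte](1, 2).toSeq, Array[Byte](1, 2).toSeq)
}
```

A helper that answers "can this element type go into a hash set or hash map directly?" lets each function choose between a hash-based path and a slower comparison-based fallback.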
[GitHub] spark issue #22110: [SPARK-25122][SQL] Deduplication of supports equals code
Github user mn-mikke commented on the issue: https://github.com/apache/spark/pull/22110 cc @ueshin @cloud-fan
[GitHub] spark issue #22110: [SPARK-25122][SQL] Deduplication of supports equals code
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22110 Can one of the admins verify this patch?
[GitHub] spark issue #22110: [SPARK-25122][SQL] Deduplication of supports equals code
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22110 **[Test build #94795 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94795/testReport)** for PR 22110 at commit [`dd292e8`](https://github.com/apache/spark/commit/dd292e8cf3ed1788793e626da3a136e9acb9d81c).
[GitHub] spark pull request #22102: [SPARK-25051][SQL] FixNullability should not stop...
Github user mgaido91 closed the pull request at: https://github.com/apache/spark/pull/22102
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20611 Can one of the admins verify this patch?
[GitHub] spark pull request #21561: [SPARK-24555][ML] logNumExamples in KMeans/BiKM/G...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21561#discussion_r210205064

--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeans.scala ---
@@ -151,13 +152,10 @@ class BisectingKMeans private (
     this
   }

-  /**
-   * Runs the bisecting k-means algorithm.
-   * @param input RDD of vectors
-   * @return model for the bisecting kmeans
-   */
-  @Since("1.6.0")
-  def run(input: RDD[Vector]): BisectingKMeansModel = {
+
--- End diff --

nit: extra newline
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20611 ok to test
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20611 **[Test build #94796 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94796/testReport)** for PR 20611 at commit [`5ad2d58`](https://github.com/apache/spark/commit/5ad2d58216cba3eb2b89621086696352b33d456f).
[GitHub] spark pull request #22110: [SPARK-25122][SQL] Deduplication of supports equa...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22110#discussion_r210207729

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/AbstractDataType.scala ---
@@ -115,6 +115,8 @@ protected[sql] abstract class AtomicType extends DataType {
   private[sql] type InternalType
   private[sql] val tag: TypeTag[InternalType]
   private[sql] val ordering: Ordering[InternalType]
+
+  private[spark] override def supportsEquals: Boolean = true
--- End diff --

I don't think this should be a property of the data type. It's specific to the `OpenHashSet`. How about we add this method to `object OpenHashSet`?
[GitHub] spark issue #17400: [SPARK-19981][SQL] Update output partitioning info. when...
Github user eyalfa commented on the issue: https://github.com/apache/spark/pull/17400 Thanks @maropu, I appreciate this. I must say I'm pretty surprised a bug like that lived so long...
[GitHub] spark pull request #22111: [SPARK-25123][SQL] Use Block to track code in Sim...
GitHub user mgaido91 opened a pull request: https://github.com/apache/spark/pull/22111

[SPARK-25123][SQL] Use Block to track code in SimpleExprValue

## What changes were proposed in this pull request?

`SimpleExprValue` carries some Java code which is an rvalue. Currently the code is represented by a String. If the code references a variable, this means that we are losing track of its usage. This is particularly important as the `value` of an `ExprCode` is (correctly) an `ExprValue`. Thus, if the code references a variable, we lose track of the referenced variable. So the PR proposes to represent the code in `SimpleExprValue` using a `Block`, in order not to lose any references present in the generated code.

## How was this patch tested?

Added UT / existing UTs

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mgaido91/spark SPARK-25123

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22111.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22111

commit 7e91f93bfcd03537f02fdb9ae05b4807cb09dd87
Author: Marco Gaido
Date: 2018-08-10T14:50:57Z

    [SPARK-25123][SQL] Use Block to track code in SimpleExprValue
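The difference between string-based and block-based code can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not Spark's codegen API: `Value`, `Variable`, `CodeBlock`, `StringValue`, and `BlockValue` are hypothetical names invented here to show the idea of keeping references inspectable.

```scala
object BlockTrackingSketch {
  // Toy stand-ins; these are NOT Spark's codegen classes.
  sealed trait Value { def code: String }
  final case class Variable(name: String) extends Value {
    def code: String = name
  }

  // A block keeps literal code fragments and referenced values side by side.
  final case class CodeBlock(parts: List[Either[String, Value]]) {
    def code: String = parts.map {
      case Left(text)   => text
      case Right(value) => value.code
    }.mkString
    def references: Set[Value] = parts.collect { case Right(value) => value }.toSet
  }

  // String-based rvalue: the variable name is buried in plain text.
  final case class StringValue(code: String) extends Value

  // Block-based rvalue: same rendered code, references stay visible.
  final case class BlockValue(block: CodeBlock) extends Value {
    def code: String = block.code
    def references: Set[Value] = block.references
  }

  val input: Variable = Variable("value_1")
  val asString: StringValue = StringValue(s"(${input.code} + 1)")
  val asBlock: BlockValue =
    BlockValue(CodeBlock(List(Left("("), Right(input), Left(" + 1)"))))
}
```

Both values render the same Java snippet, but only the block-based one can still report that it depends on `value_1`.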
[GitHub] spark issue #22111: [SPARK-25123][SQL] Use Block to track code in SimpleExpr...
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22111 cc @cloud-fan @kiszk @viirya
[GitHub] spark issue #22111: [SPARK-25123][SQL] Use Block to track code in SimpleExpr...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22111 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2207/ Test PASSed.
[GitHub] spark issue #22111: [SPARK-25123][SQL] Use Block to track code in SimpleExpr...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22111 Merged build finished. Test PASSed.
[GitHub] spark pull request #22110: [SPARK-25122][SQL] Deduplication of supports equa...
Github user mn-mikke commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22110#discussion_r210212586

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/AbstractDataType.scala ---
    @@ -115,6 +115,8 @@ protected[sql] abstract class AtomicType extends DataType {
       private[sql] type InternalType
       private[sql] val tag: TypeTag[InternalType]
       private[sql] val ordering: Ordering[InternalType]
    +
    +  private[spark] override def supportsEquals: Boolean = true
    --- End diff --

    Not all of the expressions utilize `OpenHashSet` or `OpenHashMap`. What about `TypeUtils`, which contains methods like `getInterpretedOrdering`?
[GitHub] spark issue #22111: [SPARK-25123][SQL] Use Block to track code in SimpleExpr...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22111 **[Test build #94797 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94797/testReport)** for PR 22111 at commit [`7e91f93`](https://github.com/apache/spark/commit/7e91f93bfcd03537f02fdb9ae05b4807cb09dd87).
[GitHub] spark pull request #22110: [SPARK-25122][SQL] Deduplication of supports equa...
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22110#discussion_r210213125

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/AbstractDataType.scala ---
    @@ -115,6 +115,8 @@ protected[sql] abstract class AtomicType extends DataType {
       private[sql] type InternalType
       private[sql] val tag: TypeTag[InternalType]
       private[sql] val ordering: Ordering[InternalType]
    +
    +  private[spark] override def supportsEquals: Boolean = true
    --- End diff --

    SGTM
[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22009 Merged build finished. Test PASSed.
[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22009 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2208/ Test PASSed.
[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22009 **[Test build #94798 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94798/testReport)** for PR 22009 at commit [`e6e599a`](https://github.com/apache/spark/commit/e6e599a9630801078c046d3aec398cf4f046945c).
[GitHub] spark pull request #22110: [SPARK-25122][SQL] Deduplication of supports equa...
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22110#discussion_r210215352

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/AbstractDataType.scala ---
    @@ -115,6 +115,8 @@ protected[sql] abstract class AtomicType extends DataType {
       private[sql] type InternalType
       private[sql] val tag: TypeTag[InternalType]
       private[sql] val ordering: Ordering[InternalType]
    +
    +  private[spark] override def supportsEquals: Boolean = true
    --- End diff --

    +1 too
[GitHub] spark issue #21123: [SPARK-24045][SQL]Create base class for file data source...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21123 BTW, do we consider Datasource V1 fallback too?
[GitHub] spark issue #22111: [SPARK-25123][SQL] Use Block to track code in SimpleExpr...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22111 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94797/ Test FAILed.
[GitHub] spark issue #22111: [SPARK-25123][SQL] Use Block to track code in SimpleExpr...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22111 **[Test build #94797 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94797/testReport)** for PR 22111 at commit [`7e91f93`](https://github.com/apache/spark/commit/7e91f93bfcd03537f02fdb9ae05b4807cb09dd87).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `case class SimpleExprValue(expr: Block, javaType: Class[_]) extends ExprValue`
[GitHub] spark issue #22111: [SPARK-25123][SQL] Use Block to track code in SimpleExpr...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22111 Merged build finished. Test FAILed.
[GitHub] spark issue #21860: [SPARK-24901][SQL]Merge the codegen of RegularHashMap an...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21860 **[Test build #94794 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94794/testReport)** for PR 21860 at commit [`e9b088d`](https://github.com/apache/spark/commit/e9b088df421129605059eae417c9abafda34e76a).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #21860: [SPARK-24901][SQL]Merge the codegen of RegularHashMap an...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21860 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94794/ Test FAILed.
[GitHub] spark issue #21860: [SPARK-24901][SQL]Merge the codegen of RegularHashMap an...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21860 Merged build finished. Test FAILed.
[GitHub] spark issue #22111: [SPARK-25123][SQL] Use Block to track code in SimpleExpr...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22111 Merged build finished. Test PASSed.
[GitHub] spark issue #22111: [SPARK-25123][SQL] Use Block to track code in SimpleExpr...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22111 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2209/ Test PASSed.
[GitHub] spark issue #22111: [SPARK-25123][SQL] Use Block to track code in SimpleExpr...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22111 **[Test build #94799 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94799/testReport)** for PR 22111 at commit [`79057bb`](https://github.com/apache/spark/commit/79057bb9cfc4bfd27824dd17d9c92b01a619a68a).
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20611 **[Test build #94793 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94793/testReport)** for PR 20611 at commit [`5ad2d58`](https://github.com/apache/spark/commit/5ad2d58216cba3eb2b89621086696352b33d456f).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20611 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94793/ Test FAILed.
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20611 Merged build finished. Test FAILed.
[GitHub] spark issue #21661: [SPARK-24685][build] Restore support for building old Ha...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21661 **[Test build #94792 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94792/testReport)** for PR 21661 at commit [`34ca4ef`](https://github.com/apache/spark/commit/34ca4ef2b0757b2832ac2dbcc364b42eb4f34e48).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #21661: [SPARK-24685][build] Restore support for building old Ha...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21661 Merged build finished. Test PASSed.
[GitHub] spark issue #21661: [SPARK-24685][build] Restore support for building old Ha...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21661 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94792/ Test PASSed.
[GitHub] spark issue #22110: [SPARK-25122][SQL] Deduplication of supports equals code
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22110 **[Test build #94795 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94795/testReport)** for PR 22110 at commit [`dd292e8`](https://github.com/apache/spark/commit/dd292e8cf3ed1788793e626da3a136e9acb9d81c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #22110: [SPARK-25122][SQL] Deduplication of supports equals code
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22110 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94795/ Test PASSed.
[GitHub] spark issue #22110: [SPARK-25122][SQL] Deduplication of supports equals code
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22110 Merged build finished. Test PASSed.
[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22009 **[Test build #94798 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94798/testReport)** for PR 22009 at commit [`e6e599a`](https://github.com/apache/spark/commit/e6e599a9630801078c046d3aec398cf4f046945c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22009 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94798/ Test PASSed.
[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22009 Merged build finished. Test PASSed.
[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/22104 Thanks @HyukjinKwon and @cloud-fan! I will take a look.
[GitHub] spark issue #22110: [SPARK-25122][SQL] Deduplication of supports equals code
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22110 **[Test build #94800 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94800/testReport)** for PR 22110 at commit [`9ed65cf`](https://github.com/apache/spark/commit/9ed65cf5aa3f440e49db9613379593c23272737e).
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20611 **[Test build #94796 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94796/testReport)** for PR 20611 at commit [`5ad2d58`](https://github.com/apache/spark/commit/5ad2d58216cba3eb2b89621086696352b33d456f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20611 Merged build finished. Test PASSed.
[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20611 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94796/ Test PASSed.
[GitHub] spark pull request #21977: SPARK-25004: Add spark.executor.pyspark.memory li...
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21977#discussion_r210271735

    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -91,6 +91,13 @@ private[spark] class Client(
       private val executorMemoryOverhead = sparkConf.get(EXECUTOR_MEMORY_OVERHEAD).getOrElse(
         math.max((MEMORY_OVERHEAD_FACTOR * executorMemory).toLong, MEMORY_OVERHEAD_MIN)).toInt
    +  private val isPython = sparkConf.get(IS_PYTHON_APP)
    --- End diff --

    It's true, creating mixed-language pipelines is difficult and not documented. But I do it, and some others do as well. I believe some cloud providers (Databricks is the most notable example) provide mixed-language pipelines in their notebook solutions, so I think this also reaches a larger audience than the people who do it manually.
[GitHub] spark issue #22001: [SPARK-24819][CORE] Fail fast when no enough slots to la...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22001 Merged build finished. Test PASSed.
[GitHub] spark issue #22001: [SPARK-24819][CORE] Fail fast when no enough slots to la...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22001 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2210/ Test PASSed.
[GitHub] spark issue #22001: [SPARK-24819][CORE] Fail fast when no enough slots to la...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22001 **[Test build #94801 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94801/testReport)** for PR 22001 at commit [`c9036aa`](https://github.com/apache/spark/commit/c9036aab22cfd6b7a4939f4b23741612706ba2a6).
[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22009 retest this please
[GitHub] spark issue #21123: [SPARK-24045][SQL]Create base class for file data source...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21123 v1 fallback should be used as a temporary workaround, not a long-term solution to fix the gap between v1 and v2.
[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22009 Merged build finished. Test PASSed.
[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22009 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2211/ Test PASSed.