[GitHub] spark issue #22864: [SPARK-25861][Minor][WEBUI] Remove unused refreshInterva...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22864 **[Test build #98346 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98346/testReport)** for PR 22864 at commit [`72cf70a`](https://github.com/apache/spark/commit/72cf70a47bef979e3e625edc8fb8610632f886d3). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22864: [SPARK-25861][Minor][WEBUI] Remove unused refreshInterva...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/22864 Jenkins, retest this please
[GitHub] spark issue #22847: [SPARK-25850][SQL] Make the split threshold for the code...
Github user rednaxelafx commented on the issue: https://github.com/apache/spark/pull/22847

Just in case people wonder, the following is the hack patch that I used for stress testing code splitting before this PR:

```diff
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
@@ -647,11 +647,13 @@ class CodegenContext(val useStreamlining: Boolean) {
    * Returns a term name that is unique within this instance of a `CodegenContext`.
    */
   def freshName(name: String): String = synchronized {
-    val fullName = if (freshNamePrefix == "") {
+    // hack: intentionally add a very long prefix (length=300 characters) to
+    // trigger code splitting more frequently
+    val fullName = ("averylongprefix" * 20) + (if (freshNamePrefix == "") {
       name
     } else {
       s"${freshNamePrefix}_$name"
-    }
+    })
     if (freshNameIds.contains(fullName)) {
       val id = freshNameIds(fullName)
       freshNameIds(fullName) = id + 1
```

Of course, now with this PR, we can simply set the split threshold to a very low value (e.g. `1`) to force split.
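Note: the effect of the hack patch can be illustrated with a minimal, self-contained sketch. `ToyNameGenerator` and `FreshNameDemo` are hypothetical names that only model the idea of a fresh-name generator; they are not Spark's `CodegenContext`, and the naming scheme is an assumption. The point is that every generated variable name grows by 300 characters, so the source length of each generated statement grows too, and a length-based split threshold is reached much sooner.

```scala
import scala.collection.mutable

// Toy stand-in for a fresh-name generator (hypothetical; not Spark's CodegenContext).
final class ToyNameGenerator(stressPrefix: String) {
  private val nextIds = mutable.HashMap.empty[String, Int]

  // Returns a name that is unique within this generator instance.
  def freshName(name: String): String = synchronized {
    val fullName = stressPrefix + name
    val id = nextIds.getOrElse(fullName, 0)
    nextIds(fullName) = id + 1
    s"${fullName}_$id"
  }
}

object FreshNameDemo {
  // Same construction as in the hack patch: 15 characters repeated 20 times.
  val longPrefix: String = "averylongprefix" * 20

  // Source text of one generated statement that uses a fresh variable name.
  def statement(gen: ToyNameGenerator): String = {
    val v = gen.freshName("value")
    s"long $v = i.getLong(0);"
  }

  def main(args: Array[String]): Unit = {
    val plain = statement(new ToyNameGenerator(""))
    val stressed = statement(new ToyNameGenerator(longPrefix))
    println(s"plain: ${plain.length} chars, stressed: ${stressed.length} chars")
  }
}
```

With the prefix enabled, each statement is exactly 300 characters longer than its plain counterpart, which is why the patch was a cheap way to stress code splitting before a configurable threshold existed.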
[GitHub] spark pull request #22847: [SPARK-25850][SQL] Make the split threshold for t...
Github user rednaxelafx commented on a diff in the pull request: https://github.com/apache/spark/pull/22847#discussion_r229943260

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -812,6 +812,17 @@ object SQLConf {
     .intConf
     .createWithDefault(65535)
+  val CODEGEN_METHOD_SPLIT_THRESHOLD = buildConf("spark.sql.codegen.methodSplitThreshold")
+    .internal()
+    .doc("The threshold of source code length without comment of a single Java function by " +
+      "codegen to be split. When the generated Java function source code exceeds this threshold" +
+      ", it will be split into multiple small functions. We can't know how many bytecode will " +
+      "be generated, so use the code length as metric. A function's bytecode should not go " +
+      "beyond 8KB, otherwise it will not be JITted; it also should not be too small, otherwise " +
+      "there will be many function calls.")
+    .intConf
--- End diff --

Oh I see, you're using the column name... that's not the right place to put the "prefix". Column names are almost never carried over to the generated code in the current framework (the only exception is the lambda variable name).
[GitHub] spark issue #19927: [SPARK-22737][ML][WIP] OVR transform optimization
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/19927 @srowen What do you think about this? The current OVR model's transform is too slow. Thanks.
[GitHub] spark pull request #22847: [SPARK-25850][SQL] Make the split threshold for t...
Github user rednaxelafx commented on a diff in the pull request: https://github.com/apache/spark/pull/22847#discussion_r229942325

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -812,6 +812,17 @@ object SQLConf {
     .intConf
     .createWithDefault(65535)
+  val CODEGEN_METHOD_SPLIT_THRESHOLD = buildConf("spark.sql.codegen.methodSplitThreshold")
+    .internal()
+    .doc("The threshold of source code length without comment of a single Java function by " +
+      "codegen to be split. When the generated Java function source code exceeds this threshold" +
+      ", it will be split into multiple small functions. We can't know how many bytecode will " +
+      "be generated, so use the code length as metric. A function's bytecode should not go " +
+      "beyond 8KB, otherwise it will not be JITted; it also should not be too small, otherwise " +
+      "there will be many function calls.")
+    .intConf
--- End diff --

The "freshNamePrefix" prefix is only applied in whole-stage codegen:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala#L87
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala#L169

It doesn't have any effect in non-whole-stage codegen. If you intend to stress test expression codegen but don't see the prefix being prepended, you're probably not adding it in the right place. Where did you add it?
[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/22087 @imatiach-msft Updated according to your comments! Thanks for reviewing!
[GitHub] spark pull request #22898: [SPARK-25746][SQL][followup] do not add unnecessa...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22898
[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22898 thanks, merging to master!
[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22087 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98345/ Test PASSed.
[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22087 Merged build finished. Test PASSed.
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22626 This needs to be rebased.
[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22087 **[Test build #98345 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98345/testReport)** for PR 22087 at commit [`2d6594e`](https://github.com/apache/spark/commit/2d6594e37ab6968fc13add89cdec2fd42f2b799b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22914
> Maybe we can highlight above the table, that "Invalid page number, falling back to first page"

Yes, that's what I mean. No big deal, but falling back to the first page seems unnecessary; we can return the same page again.
[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22087 Merged build finished. Test PASSed.
[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22087 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98344/ Test PASSed.
[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22087 **[Test build #98344 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98344/testReport)** for PR 22087 at commit [`f428a5d`](https://github.com/apache/spark/commit/f428a5ddef242bbcccb189ae62259a9ced6e80de). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22908: [MONOR][SQL] Replace all TreeNode's node name in the sim...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22908 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98343/ Test PASSed.
[GitHub] spark issue #22908: [MONOR][SQL] Replace all TreeNode's node name in the sim...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22908 Merged build finished. Test PASSed.
[GitHub] spark issue #22908: [MONOR][SQL] Replace all TreeNode's node name in the sim...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22908 **[Test build #98343 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98343/testReport)** for PR 22908 at commit [`743b821`](https://github.com/apache/spark/commit/743b821ab7aab0d88cf52c13b98fb690f7b80836). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...
Github user shahidki31 commented on the issue: https://github.com/apache/spark/pull/22914 The current behavior is: if we enter a value greater than the maximum page number, the page navigation bar shows the user on the first page and an exception is thrown. So, if we really want to throw an exception, the page navigation bar also needs to change accordingly. Still, I believe falling back to the first page is better. Maybe we can highlight above the table, that "Invalid page number, falling back to first page"
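Note: the two behaviors debated in this thread (fall back to the first page with a notice, or stay on the current page and flag the invalid input) can be compared in a small sketch. Everything here is hypothetical: `PageSketch`, `Show`, and both function names are illustrative only and do not correspond to the Spark web UI code.

```scala
object PageSketch {
  // Page to render plus an optional notice to display above the table.
  final case class Show(page: Int, warning: Option[String])

  private def isValid(requested: Int, totalPages: Int): Boolean =
    requested >= 1 && requested <= totalPages

  // Option A: fall back to the first page and attach a notice.
  def fallBackToFirst(requested: Int, totalPages: Int): Show =
    if (isValid(requested, totalPages)) Show(requested, None)
    else Show(1, Some("Invalid page number, falling back to first page"))

  // Option B: stay on the page the user was already on and flag the input.
  def stayOnCurrent(requested: Int, current: Int, totalPages: Int): Show =
    if (isValid(requested, totalPages)) Show(requested, None)
    else Show(current, Some(s"Invalid page number: $requested"))
}
```

Neither variant throws, so in both cases the navigation bar and the rendered table can be driven from the same `page` value.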
[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22912 cc @jiangxb1987 and @mengxr
[GitHub] spark pull request #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22905#discussion_r229933689

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ColumnarFileFormat.scala ---
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types.StructType
+
+/**
+ * An optional mix-in for columnar [[FileFormat]]s. This trait provides some helpful metadata when
+ * debugging a physical query plan.
+ */
+private[sql] trait ColumnarFileFormat {
--- End diff --

If it's supposed to be exposed as an interface to external datasources, then I wouldn't even add this one. It looks like a rough guess that it can be generalised.
[GitHub] spark pull request #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22905#discussion_r229933544

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala ---
@@ -306,7 +306,15 @@ case class FileSourceScanExec(
       withOptPartitionCount
     }
-    withSelectedBucketsCount
+    val withOptColumnCount = relation.fileFormat match {
+      case columnar: ColumnarFileFormat =>
+        val sqlConf = relation.sparkSession.sessionState.conf
+        val columnCount = columnar.columnCountForSchema(sqlConf, requiredSchema)
+        withSelectedBucketsCount + ("ColumnCount" -> columnCount.toString)
--- End diff --

Is this something we really should include in the metadata? If the purpose of this is to check whether column pruning works, logging should be good enough. Adding a trait for it sounds like overkill for the current status. Let's not add an abstraction just on a rough guess that it can be generalised.
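Note: for readers unfamiliar with the pattern under review, the following is a minimal, self-contained sketch of an optional mix-in trait that contributes an extra metadata entry through a pattern match. All types here are hypothetical stand-ins: the traits in the quoted diff take a `SQLConf` and a `StructType`, while this sketch uses a plain list of column names so that it runs without Spark.

```scala
object MixInSketch {
  // Hypothetical stand-ins for the traits discussed in the review.
  trait FileFormat { def name: String }
  trait ColumnarFileFormat {
    def columnCountForSchema(requiredColumns: Seq[String]): Int
  }

  // A format that opts in to the mix-in.
  object ParquetLike extends FileFormat with ColumnarFileFormat {
    val name = "parquet-like"
    def columnCountForSchema(requiredColumns: Seq[String]): Int = requiredColumns.size
  }

  // A format that does not.
  object TextLike extends FileFormat { val name = "text-like" }

  // Adds "ColumnCount" only when the format implements the mix-in.
  def scanMetadata(format: FileFormat, requiredColumns: Seq[String]): Map[String, String] = {
    val base = Map("Format" -> format.name)
    format match {
      case columnar: ColumnarFileFormat =>
        base + ("ColumnCount" -> columnar.columnCountForSchema(requiredColumns).toString)
      case _ => base
    }
  }
}
```

The review question is whether this extra entry justifies a new trait at all; the sketch only shows the mechanism, not an answer to that design question.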
[GitHub] spark pull request #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22905#discussion_r229932838

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ColumnarFileFormat.scala ---
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types.StructType
+
+/**
+ * An optional mix-in for columnar [[FileFormat]]s. This trait provides some helpful metadata when
+ * debugging a physical query plan.
+ */
+private[sql] trait ColumnarFileFormat {
--- End diff --

Don't we allow other non-built-in datasources to use this trait?
[GitHub] spark pull request #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22905#discussion_r229932338

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ColumnarFileFormat.scala ---
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types.StructType
+
+/**
+ * An optional mix-in for columnar [[FileFormat]]s. This trait provides some helpful metadata when
+ * debugging a physical query plan.
+ */
+private[sql] trait ColumnarFileFormat {
--- End diff --

and I would actually make it `private[datasources]` since it's currently only used in ParquetFileFormat.
[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22914 I prefer to just highlight the invalid output. E.g. ![image](https://user-images.githubusercontent.com/1097932/47831557-0e6ea800-ddcc-11e8-9fd1-c4d29f944c9d.png) ![image](https://user-images.githubusercontent.com/1097932/47831561-11699880-ddcc-11e8-9d83-20e49a0c517d.png)
[GitHub] spark pull request #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type ...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22905#discussion_r229932234

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ColumnarFileFormat.scala ---
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types.StructType
+
+/**
+ * An optional mix-in for columnar [[FileFormat]]s. This trait provides some helpful metadata when
+ * debugging a physical query plan.
+ */
+private[sql] trait ColumnarFileFormat {
--- End diff --

It's already in a private package. `private[sql]` can be removed.
[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22087 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4691/ Test PASSed.
[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22087 Merged build finished. Test PASSed.
[GitHub] spark issue #20899: Bug fix in sendMessage() of pregel implementation in Pag...
Github user WenqianZhao commented on the issue: https://github.com/apache/spark/pull/20899
> @WenqianZhao I think the point of sending deltas instead of absolute ranks was that, as parts of the graph converge, their deltas would go to zero. GraphX would then be able to compress those zero messages more efficiently.

Got it! Thank you!
[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22087 **[Test build #98345 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98345/testReport)** for PR 22087 at commit [`2d6594e`](https://github.com/apache/spark/commit/2d6594e37ab6968fc13add89cdec2fd42f2b799b).
[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...
Github user shahidki31 commented on the issue: https://github.com/apache/spark/pull/22914 @gengliangwang IMHO, we should try to avoid exceptions in the web UI. The user will know which page they are on from the page navigation bar.
[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22087 **[Test build #98344 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98344/testReport)** for PR 22087 at commit [`f428a5d`](https://github.com/apache/spark/commit/f428a5ddef242bbcccb189ae62259a9ced6e80de).
[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22087 Merged build finished. Test PASSed.
[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22087 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4690/ Test PASSed.
[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19045 Merged build finished. Test PASSed.
[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19045 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98342/ Test PASSed.
[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19045 **[Test build #98342 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98342/testReport)** for PR 19045 at commit [`ca448d1`](https://github.com/apache/spark/commit/ca448d13c523e4658720ed3bf9b7cfa9f03ec260). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22892: [SPARK-25884][SQL] Add TBLPROPERTIES and COMMENT,...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22892
[GitHub] spark pull request #22847: [SPARK-25850][SQL] Make the split threshold for t...
Github user yucai commented on a diff in the pull request: https://github.com/apache/spark/pull/22847#discussion_r229919857

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -812,6 +812,17 @@ object SQLConf {
     .intConf
     .createWithDefault(65535)
+  val CODEGEN_METHOD_SPLIT_THRESHOLD = buildConf("spark.sql.codegen.methodSplitThreshold")
+    .internal()
+    .doc("The threshold of source code length without comment of a single Java function by " +
+      "codegen to be split. When the generated Java function source code exceeds this threshold" +
+      ", it will be split into multiple small functions. We can't know how many bytecode will " +
+      "be generated, so use the code length as metric. A function's bytecode should not go " +
+      "beyond 8KB, otherwise it will not be JITted; it also should not be too small, otherwise " +
+      "there will be many function calls.")
+    .intConf
--- End diff --

Seems like long alias names have no influence.

```
[info] Java HotSpot(TM) 64-Bit Server VM 1.8.0_162-b12 on Mac OS X 10.13.6
[info] Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
[info] projection on wide table:   Best/Avg Time(ms)   Rate(M/s)   Per Row(ns)   Relative
[info] ----------------------------------------------------------------------------------
[info] split threshold 10               6512 / 6736         0.2        6210.4       1.0X
[info] split threshold 100              5730 / 6329         0.2        5464.9       1.1X
[info] split threshold 1024             3119 / 3184         0.3        2974.6       2.1X
[info] split threshold 2048             2981 / 3100         0.4        2842.9       2.2X
[info] split threshold 4096             3289 / 3379         0.3        3136.6       2.0X
[info] split threshold 8196             4307 / 4338         0.2        4108.0       1.5X
[info] split threshold 65536          29147 / 30212         0.0       27797.0       0.2X
```

No `averylongprefixrepeatedmultipletimes` in the **expression code gen**:

```
private void createExternalRow_0_8(InternalRow i, Object[] values_0) {

  // input[80, bigint, false]
  long value_81 = i.getLong(80);
  if (false) {
    values_0[80] = null;
  } else {
    values_0[80] = value_81;
  }

  // input[81, bigint, false]
  long value_82 = i.getLong(81);
  if (false) {
    values_0[81] = null;
  } else {
    values_0[81] = value_82;
  }

  // input[82, bigint, false]
  long value_83 = i.getLong(82);
  if (false) {
    values_0[82] = null;
  } else {
    values_0[82] = value_83;
  }
  ...
```

My benchmark:

```
object WideTableBenchmark extends SqlBasedBenchmark {

  override def runBenchmarkSuite(mainArgs: Array[String]): Unit = {
    runBenchmark("projection on wide table") {
      val N = 1 << 20
      val df = spark.range(N)
      val columns = (0 until 400).map{ i => s"id as averylongprefixrepeatedmultipletimes_id$i"}
      val benchmark = new Benchmark("projection on wide table", N, output = output)
      Seq("10", "100", "1024", "2048", "4096", "8196", "65536").foreach { n =>
        benchmark.addCase(s"split threshold $n", numIters = 5) { iter =>
          withSQLConf("spark.testing.codegen.splitThreshold" -> n) {
            df.selectExpr(columns: _*).foreach(identity(_))
          }
        }
      }
      benchmark.run()
    }
  }
}
```

Will keep benchmarking for the complex expression.
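Note: to make the threshold numbers in this thread easier to interpret, here is a minimal, self-contained sketch of length-based method splitting. It is a simplified model, not Spark's implementation: `SplitSketch` and `buildBlocks` are hypothetical names, and unlike the config description quoted in the diff, this sketch does not strip comments before measuring length. Each returned block stands for the body of one generated method.

```scala
import scala.collection.mutable.ArrayBuffer

object SplitSketch {
  // Groups code snippets into blocks. A new block is started once the
  // accumulated source length of the current block exceeds `threshold`.
  def buildBlocks(snippets: Seq[String], threshold: Int): Seq[String] = {
    val blocks = ArrayBuffer.empty[String]
    val current = new StringBuilder
    for (code <- snippets) {
      if (current.length > threshold) {
        blocks += current.toString
        current.clear()
      }
      current.append(code)
    }
    // Flush whatever is left as the final block.
    if (current.nonEmpty) blocks += current.toString
    blocks.toSeq
  }
}
```

Under this model a very small threshold yields one method per snippet (many calls), while a very large threshold yields a single huge method, which matches the trade-off described in the config doc string: too-large methods risk not being JIT-compiled, too-small methods add call overhead.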
[GitHub] spark issue #22892: [SPARK-25884][SQL] Add TBLPROPERTIES and COMMENT, and us...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22892 thanks, merging to master!
[GitHub] spark issue #22847: [SPARK-25850][SQL] Make the split threshold for the code...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22847 @rednaxelafx ah good point! It's hardcoded as 1024 too, and it's also doing method splitting. Let's apply the config there too.
[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22912 Merged build finished. Test PASSed.
[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22912 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98341/ Test PASSed.
[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22912 **[Test build #98341 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98341/testReport)** for PR 22912 at commit [`2f2de0b`](https://github.com/apache/spark/commit/2f2de0bb381c6b6bed65f5371aa001dd84aff3fe). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22912 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22912 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98340/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22912 **[Test build #98340 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98340/testReport)** for PR 22912 at commit [`e7ad8ab`](https://github.com/apache/spark/commit/e7ad8abc2c3505b7fab81cfa79f009a8808bb2b8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22918: [SPARK-25902][SQL]Change AttributeReference.withMetadata...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22918 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21860: [SPARK-24901][SQL]Merge the codegen of RegularHashMap an...
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/21860 thanks, @cloud-fan, @maropu, @kiszk --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22666 Argh, sorry, it was my mistake. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22918: [SPARK-25902][SQL]Change AttributeReference.withM...
GitHub user kevinyu98 opened a pull request: https://github.com/apache/spark/pull/22918 [SPARK-25902][SQL]Change AttributeReference.withMetadata's return type to AttributeReference ## What changes were proposed in this pull request? Currently the `AttributeReference.withMetadata` method has return type `Attribute`, while the other `with*` methods in `AttributeReference` return `AttributeReference`, as mentioned in [spark-25902](https://issues.apache.org/jira/browse/SPARK-25892?jql=project%20%3D%20SPARK%20AND%20component%20in%20(ML%2C%20PySpark%2C%20SQL)). ## How was this patch tested? Ran all of `sql/test`, `catalyst/test` and `org.apache.spark.sql.execution.streaming.*`. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kevinyu98/spark spark-25892 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22918.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22918 commit 7aa2e9f9113ace0106ed1de31bad5997d600f03b Author: Kevin Yu Date: 2018-10-31T22:40:04Z return AttributeReference type for AttributeReference.withMetadata --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
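To illustrate why the narrower return type matters, here is a minimal Java sketch of a covariant return type. The classes below are hypothetical, heavily simplified stand-ins for Catalyst's `Attribute` and `AttributeReference` (the real ones are Scala and carry many more fields); only the shape of `withMetadata` is the point.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in for Catalyst's Attribute: the base type only promises an Attribute.
abstract class Attribute {
    abstract String name();
    abstract Map<String, String> metadata();
    abstract Attribute withMetadata(Map<String, String> newMetadata);
}

// Hypothetical stand-in for AttributeReference.
final class AttributeReference extends Attribute {
    private final String name;
    private final Map<String, String> metadata;

    AttributeReference(String name, Map<String, String> metadata) {
        this.name = name;
        this.metadata = metadata;
    }

    @Override String name() { return name; }
    @Override Map<String, String> metadata() { return metadata; }

    // Covariant return: the override declares the narrower type,
    // so callers holding an AttributeReference keep that static type.
    @Override
    AttributeReference withMetadata(Map<String, String> newMetadata) {
        return new AttributeReference(name, newMetadata);
    }

    AttributeReference withName(String newName) {
        return new AttributeReference(newName, metadata);
    }
}

class CovariantReturnDemo {
    static String demo() {
        AttributeReference a = new AttributeReference("id", Collections.emptyMap());
        // Chaining compiles without a cast because withMetadata returns AttributeReference.
        AttributeReference b =
            a.withMetadata(Collections.singletonMap("comment", "key")).withName("id2");
        return b.name() + ":" + b.metadata().get("comment");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If `withMetadata` in the subclass were declared to return `Attribute`, the chained `withName` call would not compile without a cast, which is the inconvenience the pull request describes.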
[GitHub] spark pull request #22666: [SPARK-25672][SQL] schema_of_csv() - schema infer...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22666 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22666 Ah no, I am sorry @MaxGekk. I mistakenly set myself as the primary author. The prompt showed my email first.

```
=== Pull Request #22666 ===
title [SPARK-25672][SQL] schema_of_csv() - schema inference from an example
source MaxGekk/schema_of_csv-function
target master
url https://api.github.com/repos/apache/spark/pulls/22666

Proceed with merging pull request #22666? (y/n): y
git fetch apache-github pull/22666/head:PR_TOOL_MERGE_PR_22666
From https://github.com/apache/spark
 * [new ref] refs/pull/22666/head -> PR_TOOL_MERGE_PR_22666
git fetch apache master:PR_TOOL_MERGE_PR_22666_MASTER
remote: Counting objects: 303, done.
remote: Compressing objects: 100% (153/153), done.
remote: Total 209 (delta 91), reused 0 (delta 0)
Receiving objects: 100% (209/209), 91.89 KiB | 445.00 KiB/s, done.
Resolving deltas: 100% (91/91), completed with 65 local objects.
From https://git-wip-us.apache.org/repos/asf/spark
 * [new branch] master -> PR_TOOL_MERGE_PR_22666_MASTER
   57eddc7182e..c5ef477d2f6 master -> apache/master
git checkout PR_TOOL_MERGE_PR_22666_MASTER
Switched to branch 'PR_TOOL_MERGE_PR_22666_MASTER'
['git', 'merge', 'PR_TOOL_MERGE_PR_22666', '--squash']
Automatic merge went well; stopped before committing as requested
['git', 'log', 'HEAD..PR_TOOL_MERGE_PR_22666', '--pretty=format:%an <%ae>']
Enter primary author in the format of "name " [hyukjinkwon ]: hyukjinkwon
['git', 'log', 'HEAD..PR_TOOL_MERGE_PR_22666', '--pretty=format:%h [%an] %s']
```

It looks like the commit order affects the name suggested at the `Enter primary author in the format of "name "` prompt.
--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22914 **[Test build #4400 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4400/testReport)** for PR 22914 at commit [`5a31350`](https://github.com/apache/spark/commit/5a313506b63b870d8de37348f97cf5d67ed52ff6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22666 Merged to master. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22845: [SPARK-25848][SQL][TEST] Refactor CSVBenchmarks to use m...
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/22845 Thanks, @dongjoon-hyun --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22908: [MONOR][SQL] Replace all TreeNode's node name in the sim...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22908 **[Test build #98343 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98343/testReport)** for PR 22908 at commit [`743b821`](https://github.com/apache/spark/commit/743b821ab7aab0d88cf52c13b98fb690f7b80836). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22908: [MONOR][SQL] Replace all TreeNode's node name in the sim...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22908 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22908: [MONOR][SQL] Replace all TreeNode's node name in the sim...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22908 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4689/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22626 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22626 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98336/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22626 **[Test build #98336 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98336/testReport)** for PR 22626 at commit [`39f6899`](https://github.com/apache/spark/commit/39f689932ee2df194420fc63c7c5d9e351b09b86). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22860: Branch 2.4
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22860 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22897: [SPARK-25875][k8s] Merge code to set up driver command i...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/22897 @mccheah @liyinan926 (I'm kinda assuming you guys monitor github / jira instead of relying on pings.) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22917: [SPARK-25827][CORE] Encrypted blocks can be over 2GB.
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/22917 The title seems to describe the problem, can you describe the solution instead? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22901: [SPARK-25891][PYTHON] Upgrade to Py4J 0.10.8.1
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22901 Late LGTM! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22883: [SPARK-25837] [Core] Fix potential slowdown in AppStatus...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22883 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98331/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22883: [SPARK-25837] [Core] Fix potential slowdown in AppStatus...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22883 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22883: [SPARK-25837] [Core] Fix potential slowdown in AppStatus...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22883 **[Test build #98331 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98331/testReport)** for PR 22883 at commit [`82fef86`](https://github.com/apache/spark/commit/82fef8686f4b71d94b3bfe44ea809b4d88f5fe70). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22910: [SPARK-25899][TESTS]Fix flaky CoarseGrainedSchedu...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22910 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19045 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4688/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19045 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19045 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/4688/ --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19045 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/4688/ --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22910: [SPARK-25899][TESTS]Fix flaky CoarseGrainedSchedulerBack...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22910 Thanks! Merged to master/2.4 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19045 **[Test build #98342 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98342/testReport)** for PR 19045 at commit [`ca448d1`](https://github.com/apache/spark/commit/ca448d13c523e4658720ed3bf9b7cfa9f03ec260). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22883: [SPARK-25837] [Core] Fix potential slowdown in AppStatus...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22883 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98329/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22883: [SPARK-25837] [Core] Fix potential slowdown in AppStatus...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22883 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22883: [SPARK-25837] [Core] Fix potential slowdown in AppStatus...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22883 **[Test build #98329 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98329/testReport)** for PR 22883 at commit [`178f7c3`](https://github.com/apache/spark/commit/178f7c3bf82f93177fce086037ece6ebf09bb350). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22910: [SPARK-25899][TESTS]Fix flaky CoarseGrainedSchedulerBack...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22910 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98328/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22910: [SPARK-25899][TESTS]Fix flaky CoarseGrainedSchedulerBack...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22910 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22910: [SPARK-25899][TESTS]Fix flaky CoarseGrainedSchedulerBack...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22910 **[Test build #98328 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98328/testReport)** for PR 22910 at commit [`0a3095f`](https://github.com/apache/spark/commit/0a3095fd5610810004ef6a0d1e02581b78bfdea4). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22847: [SPARK-25850][SQL] Make the split threshold for t...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22847#discussion_r229879855

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
    @@ -812,6 +812,17 @@ object SQLConf {
         .intConf
         .createWithDefault(65535)

    +  val CODEGEN_METHOD_SPLIT_THRESHOLD = buildConf("spark.sql.codegen.methodSplitThreshold")
    +    .internal()
    +    .doc("The threshold of source code length without comment of a single Java function by " +
    +      "codegen to be split. When the generated Java function source code exceeds this threshold" +
    +      ", it will be split into multiple small functions. We can't know how many bytecode will " +
    +      "be generated, so use the code length as metric. A function's bytecode should not go " +
    +      "beyond 8KB, otherwise it will not be JITted; it also should not be too small, otherwise " +
    +      "there will be many function calls.")
    +    .intConf
    --- End diff --

Could you try some very long alias names or complex expressions? You will get a different number, right?
--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
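As a rough illustration of the question above, the following Java sketch groups generated statements into blocks by accumulated source length. It is not Spark's implementation: `SplitByLengthSketch`, `split`, and `assignments` are hypothetical names, and the flushing rule (start a new block once the current one already exceeds the threshold) is an assumption chosen to keep the example small.

```java
import java.util.ArrayList;
import java.util.List;

class SplitByLengthSketch {
    // Greedily concatenates snippets; once the current block's length already
    // exceeds the threshold, it is closed and a new block is started.
    static List<String> split(List<String> snippets, int threshold) {
        List<String> blocks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String snippet : snippets) {
            if (current.length() > threshold) {
                blocks.add(current.toString());
                current.setLength(0);
            }
            current.append(snippet);
        }
        if (current.length() > 0) {
            blocks.add(current.toString());
        }
        return blocks;
    }

    // Builds n assignment statements whose variable names share the given prefix
    // (illustrative only; loosely modeled on the generated code quoted in this thread).
    static List<String> assignments(String prefix, int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            out.add("long " + prefix + i + " = i.getLong(" + i + ");\n");
        }
        return out;
    }

    public static void main(String[] args) {
        int threshold = 1024;
        int shortBlocks = split(assignments("v_", 400), threshold).size();
        int longBlocks =
            split(assignments("averylongprefixrepeatedmultipletimes_", 400), threshold).size();
        System.out.println("short names: " + shortBlocks + " blocks");
        System.out.println("long names:  " + longBlocks + " blocks");
    }
}
```

Because the metric is source length rather than bytecode size, the same statements with longer identifiers fill a block sooner, so a threshold tuned on one naming pattern may behave differently on another.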
[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...
Github user yogeshg commented on the issue: https://github.com/apache/spark/pull/22912 In an offline discussion with @MrBago, we noted that there are at most as many (non-cancelled) `timerTasks` on the `timer` as there are slots. So one thread for managing logging is probably fine; if anything, we should also think about how we can just use the main thread. This also means that the `timer.purge()` call in the finally block is `O(n + c log n) \in O(constant)`, where `n` is the total number of tasks in the queue and `c` is the number of cancelled tasks. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
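The pattern discussed above can be sketched in plain Java with `java.util.Timer`: one timer (and therefore one background thread) serves every slot, each slot cancels its own `TimerTask`, and `purge()` runs in the `finally` block. This is only a sketch under assumptions: `SharedTimerSketch`, the slot count, and the 60-second logging period are hypothetical and not taken from the patch.

```java
import java.util.Timer;
import java.util.TimerTask;

class SharedTimerSketch {
    // Returns {number of tasks cancelled, number of entries removed by purge()}.
    static int[] scheduleCancelPurge(int slots) {
        // A single daemon Timer: all logging tasks share one thread.
        Timer timer = new Timer("barrier-logging-sketch", true);
        int cancelled = 0;
        int purged = 0;
        try {
            TimerTask[] tasks = new TimerTask[slots];
            for (int i = 0; i < slots; i++) {
                final int slot = i;
                tasks[i] = new TimerTask() {
                    @Override public void run() {
                        System.out.println("slot " + slot + " is still waiting");
                    }
                };
                // Periodic task; the first run is far enough away that it never fires here.
                timer.schedule(tasks[i], 60_000L, 60_000L);
            }
            for (TimerTask task : tasks) {
                if (task.cancel()) {
                    cancelled++;
                }
            }
        } finally {
            // Drops cancelled entries from the queue. The timer thread may already have
            // discarded some of them, so the returned count can be lower than `slots`.
            purged = timer.purge();
            timer.cancel();
        }
        return new int[] {cancelled, purged};
    }

    public static void main(String[] args) {
        int[] result = scheduleCancelPurge(4);
        System.out.println("cancelled=" + result[0] + ", purged=" + result[1]);
    }
}
```

Since the queue never holds more live entries than there are slots, the work done by `purge()` stays bounded by the slot count regardless of how many barrier calls have happened.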
[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22912 **[Test build #98341 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98341/testReport)** for PR 22912 at commit [`2f2de0b`](https://github.com/apache/spark/commit/2f2de0bb381c6b6bed65f5371aa001dd84aff3fe). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22912 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4687/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22912 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22917: [SPARK-25827][CORE] Encrypted blocks can be over 2GB.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22917 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22917: [SPARK-25827][CORE] Encrypted blocks can be over 2GB.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22917 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4686/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22917: [SPARK-25827][CORE] Encrypted blocks can be over 2GB.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22917 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22911: [SPARK-25815][k8s] Support kerberos in client mode, keyt...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/22911 It is actually not needed for client mode because only the driver needs the keytab. But whether to store it in secrets is not a question. You either store it in a secret or you don't support the keytab/principal feature in Spark at all, and we can delete a bunch of code here. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22917: [SPARK-25827][CORE] Encrypted blocks can be over 2GB.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22917 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98339/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22917: [SPARK-25827][CORE] Encrypted blocks can be over 2GB.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22917 **[Test build #98339 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98339/testReport)** for PR 22917 at commit [`5b432e7`](https://github.com/apache/spark/commit/5b432e7184130c0264d330f3068a80f1b3a4bb61). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22429: [SPARK-25440][SQL] Dumping query execution info to a fil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22429 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98324/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22909: [SPARK-25897][k8s] Hook up k8s integration tests to sbt ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22909 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98327/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22429: [SPARK-25440][SQL] Dumping query execution info to a fil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22429 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org