[GitHub] dongjoon-hyun commented on a change in pull request #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path
dongjoon-hyun commented on a change in pull request #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path
URL: https://github.com/apache/spark/pull/23383#discussion_r247337185
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -1419,6 +1419,13 @@ object SQLConf {
       .timeConf(TimeUnit.MILLISECONDS)
       .createWithDefault(100)

+  val DISABLED_V2_FILE_READS = buildConf("spark.sql.files.disabledV2Reads")
+    .internal()
Review comment: Given the overall DSv2 risk, shall we make this `public` and add migration doc officially, @gatorsmile ?
This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
With regards, Apache Git Services
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] dongjoon-hyun commented on a change in pull request #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path
dongjoon-hyun commented on a change in pull request #23383: [SPARK-23817][SQL] Create file source V2 framework and migrate ORC read path
URL: https://github.com/apache/spark/pull/23383#discussion_r247337038
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcDataSourceV2.scala
@@ -0,0 +1,46 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.execution.datasources.v2.orc
+
+import org.apache.spark.sql.execution.datasources._
+import org.apache.spark.sql.execution.datasources.orc._
Review comment:
```
-import org.apache.spark.sql.execution.datasources.orc._
+import org.apache.spark.sql.execution.datasources.orc.OrcFileFormat
```
[GitHub] AmplabJenkins removed a comment on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
AmplabJenkins removed a comment on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#issuecomment-453808971 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query
AmplabJenkins removed a comment on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query URL: https://github.com/apache/spark/pull/22952#issuecomment-453808962 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
AmplabJenkins removed a comment on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#issuecomment-453808972 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101135/ Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query
AmplabJenkins removed a comment on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query URL: https://github.com/apache/spark/pull/22952#issuecomment-453808963 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6999/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query
AmplabJenkins commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query URL: https://github.com/apache/spark/pull/22952#issuecomment-453808962 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
AmplabJenkins commented on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#issuecomment-453808971 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query
AmplabJenkins commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query URL: https://github.com/apache/spark/pull/22952#issuecomment-453808963 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6999/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
AmplabJenkins commented on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#issuecomment-453808972 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101135/ Test PASSed.
[GitHub] SparkQA removed a comment on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
SparkQA removed a comment on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#issuecomment-453798520 **[Test build #101135 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101135/testReport)** for PR 23530 at commit [`e81a501`](https://github.com/apache/spark/commit/e81a501be64324aeb929a05bbd7825d2bfdcbddc).
[GitHub] SparkQA commented on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
SparkQA commented on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#issuecomment-45380 **[Test build #101135 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101135/testReport)** for PR 23530 at commit [`e81a501`](https://github.com/apache/spark/commit/e81a501be64324aeb929a05bbd7825d2bfdcbddc). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] SparkQA commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query
SparkQA commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query URL: https://github.com/apache/spark/pull/22952#issuecomment-453808680 **[Test build #101138 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101138/testReport)** for PR 22952 at commit [`1f7645f`](https://github.com/apache/spark/commit/1f7645f6f6596b36c903016a13dc9efaedd849a7).
[GitHub] dongjoon-hyun commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query
dongjoon-hyun commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query URL: https://github.com/apache/spark/pull/22952#issuecomment-453808655 Retest this please
[GitHub] dongjoon-hyun commented on a change in pull request #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories.
dongjoon-hyun commented on a change in pull request #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories.
URL: https://github.com/apache/spark/pull/23416#discussion_r247336530
## File path: core/src/main/scala/org/apache/spark/internal/config/Network.scala
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.internal.config
+
+import java.util.concurrent.TimeUnit
+
+private[spark] object Network {
+
+  private[spark] val NETWORK_CRYPTO_SASL_FALLBACK =
+    ConfigBuilder("spark.network.crypto.saslFallback")
+      .booleanConf
+      .createWithDefault(true)
+
+  private[spark] val NETWORK_ENCRYPTION_ENABLED =
Review comment: Although this is a moved one, maybe `NETWORK_ENCRYPTION_ENABLED` -> `NETWORK_CRYPTO_ENABLED`?
[GitHub] dongjoon-hyun commented on a change in pull request #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category
dongjoon-hyun commented on a change in pull request #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category
URL: https://github.com/apache/spark/pull/23412#discussion_r247336496
## File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
@@ -732,6 +732,21 @@ package object config {
       .checkValue(v => v > 0, "The max failures should be a positive value.")
       .createWithDefault(40)

+  private[spark] val UNSAFE_EXCEPTION_ON_MEMORY_LEAK =
+    ConfigBuilder("spark.unsafe.exceptionOnMemoryLeak")
+      .booleanConf
+      .createWithDefault(false)
+
+  private[spark] val UNSAFE_SORTER_SPILL_READ_AHEAD_ENABLED =
+    ConfigBuilder("spark.unsafe.sorter.spill.read.ahead.enabled")
+      .booleanConf
+      .createWithDefault(true)
+
+  private[spark] val UNSAFE_SORTER_SPILL_READER_BUFFER_SIZE =
+    ConfigBuilder("spark.unsafe.sorter.spill.reader.buffer.size")
+      .bytesConf(ByteUnit.BYTE)
Review comment: Shall we add `checkValue`? Previously, values outside the bounds were replaced with the default value, with a warning, in `UnsafeSorterSpillReader`. It would be great to check here since we have a configuration now.
```
if (bufferSizeBytes > MAX_BUFFER_SIZE_BYTES || bufferSizeBytes < DEFAULT_BUFFER_SIZE_BYTES) {
```
Hi, @gatorsmile . If we use `checkValue` here, do we need to add a behavior change doc? IMO, these three configurations can be `.internal()` configurations, and we can skip the behavior change doc for them.
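To make the behavior change under discussion concrete, here is a minimal sketch contrasting the two policies: silently falling back to the default with a warning (the reader-side behavior the comment describes) versus rejecting an out-of-range value up front, which is what a `checkValue`-style validation would do. `SpillReaderBufferSize` and its members are hypothetical stand-ins, not Spark's `ConfigBuilder` (which is `private[spark]`). The lower bound reuses the default from the diff; the 16 MiB upper bound is an assumption for illustration, so check `MAX_BUFFER_SIZE_BYTES` in `UnsafeSorterSpillReader` for the real value.

```scala
// Minimal sketch; a hypothetical stand-in, not Spark's private[spark] ConfigBuilder.
object SpillReaderBufferSize {
  // Default taken from the diff: createWithDefault(1024 * 1024).
  val DefaultBytes: Long = 1024L * 1024L
  // Assumed upper bound for illustration only (16 MiB).
  val MaxBytes: Long = 16L * 1024L * 1024L

  // Same bounds as the quoted condition, expressed as "is valid".
  private def inBounds(bytes: Long): Boolean =
    bytes >= DefaultBytes && bytes <= MaxBytes

  // Fallback policy: warn and use the default when the value is out of bounds.
  def fallbackToDefault(requested: Long): Long =
    if (inBounds(requested)) requested
    else {
      Console.err.println(
        s"Buffer size $requested is out of bounds; using default $DefaultBytes")
      DefaultBytes
    }

  // checkValue-style policy: fail with IllegalArgumentException instead.
  def checked(requested: Long): Long = {
    require(inBounds(requested),
      s"The buffer size must be between $DefaultBytes and $MaxBytes bytes, got $requested.")
    requested
  }
}
```

With the fallback policy a too-small value such as 512 quietly becomes the default, while the checked policy throws. That difference is why the question of a behavior change doc comes up.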
[GitHub] AmplabJenkins removed a comment on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category
AmplabJenkins removed a comment on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category URL: https://github.com/apache/spark/pull/23412#issuecomment-453807560 Merged build finished. Test PASSed.
[GitHub] SparkQA commented on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category
SparkQA commented on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category URL: https://github.com/apache/spark/pull/23412#issuecomment-453807596 **[Test build #101137 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101137/testReport)** for PR 23412 at commit [`b566cdd`](https://github.com/apache/spark/commit/b566cddc4e83d4b2b62138dd49ca117145d399d9).
[GitHub] AmplabJenkins removed a comment on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category
AmplabJenkins removed a comment on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category URL: https://github.com/apache/spark/pull/23412#issuecomment-453807563 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6998/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category
AmplabJenkins commented on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category URL: https://github.com/apache/spark/pull/23412#issuecomment-453807560 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category
AmplabJenkins commented on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category URL: https://github.com/apache/spark/pull/23412#issuecomment-453807563 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6998/ Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories.
AmplabJenkins removed a comment on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories. URL: https://github.com/apache/spark/pull/23416#issuecomment-453807362 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101134/ Test FAILed.
[GitHub] dongjoon-hyun commented on a change in pull request #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category
dongjoon-hyun commented on a change in pull request #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category
URL: https://github.com/apache/spark/pull/23412#discussion_r247336260
## File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
@@ -732,6 +732,21 @@ package object config {
       .checkValue(v => v > 0, "The max failures should be a positive value.")
       .createWithDefault(40)

+  private[spark] val UNSAFE_EXCEPTION_ON_MEMORY_LEAK =
+    ConfigBuilder("spark.unsafe.exceptionOnMemoryLeak")
+      .booleanConf
+      .createWithDefault(false)
+
+  private[spark] val UNSAFE_SORTER_SPILL_READ_AHEAD_ENABLED =
+    ConfigBuilder("spark.unsafe.sorter.spill.read.ahead.enabled")
+      .booleanConf
+      .createWithDefault(true)
+
+  private[spark] val UNSAFE_SORTER_SPILL_READER_BUFFER_SIZE =
+    ConfigBuilder("spark.unsafe.sorter.spill.reader.buffer.size")
+      .bytesConf(ByteUnit.BYTE)
+      .createWithDefault(1024 * 1024)
Review comment: If possible, can we have a brief doc for the three new configurations above?
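As a rough illustration of what a brief doc per entry could look like, the sketch below attaches a one-line description to each of the three keys from the diff. `DocumentedConf` and `UnsafeConfDocs` are hypothetical helpers, not Spark's `ConfigBuilder` API, and the doc strings are illustrative wording based on an assumed reading of what each setting does; they are not text from the pull request. Keys and defaults are copied from the diff.

```scala
// Hypothetical, simplified config entry used only for this sketch.
final case class DocumentedConf(key: String, doc: String, default: Any)

object UnsafeConfDocs {
  // Doc strings are illustrative assumptions, not the wording adopted in the PR.
  val entries: Seq[DocumentedConf] = Seq(
    DocumentedConf(
      "spark.unsafe.exceptionOnMemoryLeak",
      "Whether to throw an exception when a memory leak is detected.",
      false),
    DocumentedConf(
      "spark.unsafe.sorter.spill.read.ahead.enabled",
      "Whether to read ahead when reading sorter spill files.",
      true),
    DocumentedConf(
      "spark.unsafe.sorter.spill.reader.buffer.size",
      "Buffer size, in bytes, used when reading sorter spill files.",
      1024 * 1024)
  )

  // One line per entry: key, default, and the brief doc.
  def render: String =
    entries.map(e => s"${e.key} (default: ${e.default}): ${e.doc}").mkString("\n")
}
```

Calling `UnsafeConfDocs.render` yields one line per configuration, which is roughly the level of detail a brief doc would need.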
[GitHub] AmplabJenkins removed a comment on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories.
AmplabJenkins removed a comment on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories. URL: https://github.com/apache/spark/pull/23416#issuecomment-453807359 Merged build finished. Test FAILed.
[GitHub] SparkQA removed a comment on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories.
SparkQA removed a comment on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories. URL: https://github.com/apache/spark/pull/23416#issuecomment-453797564 **[Test build #101134 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101134/testReport)** for PR 23416 at commit [`21f95a9`](https://github.com/apache/spark/commit/21f95a978997e76e10935a3833370462cf9f924a).
[GitHub] AmplabJenkins commented on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories.
AmplabJenkins commented on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories. URL: https://github.com/apache/spark/pull/23416#issuecomment-453807362 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101134/ Test FAILed.
[GitHub] AmplabJenkins commented on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories.
AmplabJenkins commented on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories. URL: https://github.com/apache/spark/pull/23416#issuecomment-453807359 Merged build finished. Test FAILed.
[GitHub] SparkQA commented on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories.
SparkQA commented on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories. URL: https://github.com/apache/spark/pull/23416#issuecomment-453807293 **[Test build #101134 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101134/testReport)** for PR 23416 at commit [`21f95a9`](https://github.com/apache/spark/commit/21f95a978997e76e10935a3833370462cf9f924a). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] HeartSaVioR commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query
HeartSaVioR commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query URL: https://github.com/apache/spark/pull/22952#issuecomment-453807134 Kind reminder.
[GitHub] HeartSaVioR commented on issue #23142: [SPARK-26170][SS] Add missing metrics in FlatMapGroupsWithState
HeartSaVioR commented on issue #23142: [SPARK-26170][SS] Add missing metrics in FlatMapGroupsWithState URL: https://github.com/apache/spark/pull/23142#issuecomment-453807143 Kind reminder.
[GitHub] HeartSaVioR commented on issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaDataConsumer
HeartSaVioR commented on issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaDataConsumer URL: https://github.com/apache/spark/pull/22138#issuecomment-453807137 Kind reminder.
[GitHub] dongjoon-hyun commented on a change in pull request #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category
dongjoon-hyun commented on a change in pull request #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category URL: https://github.com/apache/spark/pull/23412#discussion_r247336035 ## File path: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillReader.java ## @@ -59,21 +60,22 @@ public UnsafeSorterSpillReader( File file, BlockId blockId) throws IOException { assert (file.length() > 0); +final ConfigEntry bufferSizeConfigEntry = +package$.MODULE$.UNSAFE_SORTER_SPILL_READER_BUFFER_SIZE(); +final long DEFAULT_BUFFER_SIZE_BYTES = (long)bufferSizeConfigEntry.defaultValue().get(); long bufferSizeBytes = SparkEnv.get() == null ? -DEFAULT_BUFFER_SIZE_BYTES: - SparkEnv.get().conf().getSizeAsBytes("spark.unsafe.sorter.spill.reader.buffer.size", - DEFAULT_BUFFER_SIZE_BYTES); + DEFAULT_BUFFER_SIZE_BYTES:(long)SparkEnv.get().conf().get(bufferSizeConfigEntry); Review comment: Spaces around colon? ```java - DEFAULT_BUFFER_SIZE_BYTES:(long)SparkEnv.get().conf().get(bufferSizeConfigEntry); + DEFAULT_BUFFER_SIZE_BYTES : (long)SparkEnv.get().conf().get(bufferSizeConfigEntry); ```
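The diff above swaps a hard-coded key string plus a separately declared default for a single config entry that carries both. A minimal Scala sketch of that idea follows. `SizeEntry`, `SpillReaderConfig`, and the 1 MiB default are hypothetical stand-ins chosen for illustration; this is not Spark's actual `ConfigEntry` API.

```scala
// Hypothetical, simplified stand-in for a typed config entry (not Spark's ConfigEntry).
final case class SizeEntry(key: String, defaultBytes: Long) {
  // Read the value from a settings map, falling back to the entry's own default.
  def readFrom(settings: Map[String, String]): Long =
    settings.get(key).map(_.toLong).getOrElse(defaultBytes)
}

object SpillReaderConfig {
  // The key is taken from the diff; the 1 MiB default is an assumed value.
  val BufferSize: SizeEntry =
    SizeEntry("spark.unsafe.sorter.spill.reader.buffer.size", 1024L * 1024L)

  // `env` is None when no environment exists, mirroring the `SparkEnv.get() == null` check.
  def bufferSizeBytes(env: Option[Map[String, String]]): Long =
    env match {
      case None           => BufferSize.defaultBytes
      case Some(settings) => BufferSize.readFrom(settings)
    }
}
```

Because the key and the default live in one value, the call site cannot pair the key with a mismatched default, which is presumably part of the motivation for the migration.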
[GitHub] HeartSaVioR commented on a change in pull request #23260: [SPARK-26311][YARN] New feature: custom log URL for stdout/stderr
HeartSaVioR commented on a change in pull request #23260: [SPARK-26311][YARN] New feature: custom log URL for stdout/stderr URL: https://github.com/apache/spark/pull/23260#discussion_r247335594 ## File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnable.scala ## @@ -246,13 +253,57 @@ private[yarn] class ExecutorRunnable( sys.env.get("SPARK_USER").foreach { user => val containerId = ConverterUtils.toString(c.getId) val address = c.getNodeHttpAddress -val baseUrl = s"$httpScheme$address/node/containerlogs/$containerId/$user" -env("SPARK_LOG_URL_STDERR") = s"$baseUrl/stderr?start=-4096" -env("SPARK_LOG_URL_STDOUT") = s"$baseUrl/stdout?start=-4096" +val customLogUrl = sparkConf.get(config.CUSTOM_LOG_URL) + +val envNameToFileNameMap = Map("SPARK_LOG_URL_STDERR" -> "stderr", + "SPARK_LOG_URL_STDOUT" -> "stdout") +val logUrls = ExecutorRunnable.buildLogUrls(customLogUrl, httpScheme, address, + clusterId, containerId, user, envNameToFileNameMap) +logUrls.foreach { case (envName, url) => + env(envName) = url +} } } env } } + +private[yarn] object ExecutorRunnable { + def buildLogUrls( + logUrlPattern: String, + httpScheme: String, + nodeHttpAddress: String, + clusterId: Option[String], + containerId: String, + user: String, + envNameToFileNameMap: Map[String, String]): Map[String, String] = { +val optionalPathVariable: Map[String, Option[String]] = Map("{{ClusterId}}" -> clusterId) +val pathVariables: Map[String, String] = Map("{{HttpScheme}}" -> httpScheme, Review comment: My intention was to separate parameters into `mandatory` and `optional`. Mandatory parameters should be available in all cases, whereas optional parameters are available only when Hadoop/YARN is configured to provide them.
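The mandatory/optional split described in the comment above can be sketched as a simple placeholder substitution. This is only an illustration under stated assumptions: `LogUrlSketch.render` is a hypothetical helper, not the PR's `buildLogUrls`, and leaving an unavailable optional placeholder untouched is an assumed behavior, since the quoted diff is cut off before that point. Only `{{HttpScheme}}` and `{{ClusterId}}` appear in the diff; any other placeholder name is made up.

```scala
object LogUrlSketch {
  // Substitute every mandatory placeholder, plus each optional placeholder that has a value.
  // Optional placeholders whose value is None are left as-is in the result (an assumption).
  def render(
      pattern: String,
      mandatory: Map[String, String],
      optional: Map[String, Option[String]]): String = {
    val available = mandatory ++ optional.collect { case (name, Some(value)) => name -> value }
    available.foldLeft(pattern) { case (url, (name, value)) => url.replace(name, value) }
  }
}
```

Typing the optional map as `Map[String, Option[String]]` makes the "may be absent" case explicit at compile time, so a caller has to decide what happens when, for example, no cluster ID is configured.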
[GitHub] HeartSaVioR commented on issue #23260: [SPARK-26311][YARN] New feature: custom log URL for stdout/stderr
HeartSaVioR commented on issue #23260: [SPARK-26311][YARN] New feature: custom log URL for stdout/stderr URL: https://github.com/apache/spark/pull/23260#issuecomment-453805982 Sorry for revisiting this so late: I had to tackle some other tasks as well, so I spent time on simpler things first. @vanzin I think I got confused by your previous comment: > The parameters don't need to be pre-defined, each RM can have its own set, and it's up to the admin to set things up so that the URLs in the SHS make sense for their deployment. I mean especially the part `The parameters don't need to be pre-defined`. In this patch, I defined some parameters which I think can uniquely identify a specific container. I interpreted `no pre-defined` as flexibility, so I tried to find a way to let end users (or admins) define their own set of parameters, and hence thought about an interface and plug-in, for example an interface whose inputs would be YarnConfiguration, Container, etc. (for YARN; other resource managers would need a different interface). If we are happy with pre-defining the parameters ourselves and letting end users just be flexible in constructing the log URL, things would be much simpler. And I guess that is what you really meant, judging by this part of your comment: > All that's missing is writing those to the event log somehow, and using them on the SHS side, instead of doing it all on the application side, which has the drawbacks that have already been discussed. Could you please confirm that I understand correctly?
[GitHub] dongjoon-hyun commented on a change in pull request #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
dongjoon-hyun commented on a change in pull request #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#discussion_r247335540 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala ## @@ -68,9 +68,6 @@ object PhysicalOperation extends PredicateHelper { val substitutedCondition = substitute(aliases)(condition) (fields, filters ++ splitConjunctivePredicates(substitutedCondition), other, aliases) - case h: ResolvedHint => -collectProjectsAndFilters(h.child) - Review comment: In that case, let's not change this in this PR, `[SPARK-26576][SQL] Broadcast hint not applied to partitioned table`. You can clean up the dead code later.
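To make the removed `case h: ResolvedHint` branch easier to picture, here is a toy Scala sketch of a collector that either stops at a hint node or looks through it. The `Plan`, `Relation`, `Filter`, and `Hint` types and the `seeThroughHints` flag are hypothetical and exist only for illustration; they are not Catalyst's classes, and the sketch makes no claim about how `PhysicalOperation` actually behaves.

```scala
// Toy plan nodes for illustration only; these are not Catalyst's classes.
sealed trait Plan
final case class Relation(name: String) extends Plan
final case class Filter(condition: String, child: Plan) extends Plan
final case class Hint(name: String, child: Plan) extends Plan

object CollectFilters {
  // Walk down through Filter nodes, gathering conditions until another node type is reached.
  // With seeThroughHints = true, a Hint node is skipped and traversal continues into its child.
  def collect(plan: Plan, seeThroughHints: Boolean): (List[String], Plan) = plan match {
    case Filter(condition, child) =>
      val (conditions, bottom) = collect(child, seeThroughHints)
      (condition :: conditions, bottom)
    case Hint(_, child) if seeThroughHints =>
      collect(child, seeThroughHints)
    case other =>
      (Nil, other)
  }
}
```

In this toy, looking through the hint means the hint node no longer appears in the returned bottom plan, while stopping at it keeps the hint (and everything beneath it) intact.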
[GitHub] HeartSaVioR commented on issue #23369: [SPARK-26429][SS]add jdbc sink for Structured Streaming.
HeartSaVioR commented on issue #23369: [SPARK-26429][SS]add jdbc sink for Structured Streaming. URL: https://github.com/apache/spark/pull/23369#issuecomment-453804885 Seems like: 1) this needs to wait for and reflect the newer DSv2. 2) According to https://github.com/apache/spark/pull/17190#issuecomment-363132833 , maybe this would be better placed in Bahir or somewhere else outside Spark?
[GitHub] HeartSaVioR edited a comment on issue #23527: [SPARK-26482][K8S][TEST][FOLLOWUP] Fix compile failure
HeartSaVioR edited a comment on issue #23527: [SPARK-26482][K8S][TEST][FOLLOWUP] Fix compile failure URL: https://github.com/apache/spark/pull/23527#issuecomment-453804394 Thanks @dongjoon-hyun for fixing the spot I missed! Looks like I confused SparkConf with Spark**App**Conf. I couldn't run all tests locally (since it takes hours) but wanted to ensure compilation passes, so I ran the command below: ``` mvn -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pkinesis-asl -Pyarn -Pspark-ganglia-lgpl -Phive -Pmesos clean compile test-compile ``` Maybe I needed to add `-Pkubernetes-integration-tests` as well.
[GitHub] HeartSaVioR commented on issue #23527: [SPARK-26482][K8S][TEST][FOLLOWUP] Fix compile failure
HeartSaVioR commented on issue #23527: [SPARK-26482][K8S][TEST][FOLLOWUP] Fix compile failure URL: https://github.com/apache/spark/pull/23527#issuecomment-453804394 Thanks @dongjoon-hyun for fixing the spot I missed! Looks like I confused SparkConf with Spark*App*Conf. I couldn't run all tests locally (since it takes hours) but wanted to ensure compilation passes, so I ran the command below: ``` mvn -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pkinesis-asl -Pyarn -Pspark-ganglia-lgpl -Phive -Pmesos clean compile test-compile ``` Maybe I needed to add `-Pkubernetes-integration-tests` as well.
[GitHub] HyukjinKwon commented on issue #21135: [SPARK-24060][TEST] StreamingSymmetricHashJoinHelperSuite should initialize after SparkSession creation
HyukjinKwon commented on issue #21135: [SPARK-24060][TEST] StreamingSymmetricHashJoinHelperSuite should initialize after SparkSession creation URL: https://github.com/apache/spark/pull/21135#issuecomment-453803977 Looks right, but if it doesn't cause an actual problem now, let's fix it later. If reproducible steps can be explained, we can merge. If this goes inactive, let's leave it closed. I manually tested this in an IDE and it seems fine.
[GitHub] deshanxiao commented on a change in pull request #23486: [SPARK-26457] Show hadoop configurations in HistoryServer environment tab
deshanxiao commented on a change in pull request #23486: [SPARK-26457] Show hadoop configurations in HistoryServer environment tab URL: https://github.com/apache/spark/pull/23486#discussion_r247334277 ## File path: core/src/main/scala/org/apache/spark/status/api/v1/api.scala ## @@ -352,6 +352,7 @@ class VersionInfo private[spark]( class ApplicationEnvironmentInfo private[spark] ( val runtime: RuntimeInfo, val sparkProperties: Seq[(String, String)], +val hadoopProperties: Seq[(String, String)], Review comment: Yes, you are right. Thank you! I have fixed it.
[GitHub] AmplabJenkins removed a comment on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category
AmplabJenkins removed a comment on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category URL: https://github.com/apache/spark/pull/23412#issuecomment-453802096 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category
AmplabJenkins removed a comment on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category URL: https://github.com/apache/spark/pull/23412#issuecomment-453802097 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101133/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category
AmplabJenkins commented on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category URL: https://github.com/apache/spark/pull/23412#issuecomment-453802097 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101133/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category
AmplabJenkins commented on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category URL: https://github.com/apache/spark/pull/23412#issuecomment-453802096 Merged build finished. Test PASSed.
[GitHub] SparkQA removed a comment on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category
SparkQA removed a comment on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category URL: https://github.com/apache/spark/pull/23412#issuecomment-453790515 **[Test build #101133 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101133/testReport)** for PR 23412 at commit [`01162cb`](https://github.com/apache/spark/commit/01162cbc4b31135a64e105b559117f1bcb19ba5e).
[GitHub] SparkQA commented on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category
SparkQA commented on issue #23412: [SPARK-26477][CORE] Use ConfigEntry for hardcoded configs for unsafe category URL: https://github.com/apache/spark/pull/23412#issuecomment-453802011 **[Test build #101133 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101133/testReport)** for PR 23412 at commit [`01162cb`](https://github.com/apache/spark/commit/01162cbc4b31135a64e105b559117f1bcb19ba5e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] AmplabJenkins removed a comment on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+)
AmplabJenkins removed a comment on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+) URL: https://github.com/apache/spark/pull/20691#issuecomment-453801461 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101136/ Test FAILed.
[GitHub] AmplabJenkins removed a comment on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+)
AmplabJenkins removed a comment on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+) URL: https://github.com/apache/spark/pull/20691#issuecomment-453801460 Merged build finished. Test FAILed.
[GitHub] AmplabJenkins commented on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+)
AmplabJenkins commented on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+) URL: https://github.com/apache/spark/pull/20691#issuecomment-453801460 Merged build finished. Test FAILed.
[GitHub] SparkQA commented on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+)
SparkQA commented on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+) URL: https://github.com/apache/spark/pull/20691#issuecomment-453801425 **[Test build #101136 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101136/testReport)** for PR 20691 at commit [`e15eb63`](https://github.com/apache/spark/commit/e15eb636ef421656df817612b1900d2e953e545a). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] SparkQA removed a comment on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+)
SparkQA removed a comment on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+) URL: https://github.com/apache/spark/pull/20691#issuecomment-453800521 **[Test build #101136 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101136/testReport)** for PR 20691 at commit [`e15eb63`](https://github.com/apache/spark/commit/e15eb636ef421656df817612b1900d2e953e545a).
[GitHub] AmplabJenkins commented on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+)
AmplabJenkins commented on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+) URL: https://github.com/apache/spark/pull/20691#issuecomment-453801461 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101136/ Test FAILed.
[GitHub] jzhuge commented on a change in pull request #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
jzhuge commented on a change in pull request #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#discussion_r247333601 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala ## @@ -68,9 +68,6 @@ object PhysicalOperation extends PredicateHelper { val substitutedCondition = substitute(aliases)(condition) (fields, filters ++ splitConjunctivePredicates(substitutedCondition), other, aliases) - case h: ResolvedHint => -collectProjectsAndFilters(h.child) - Review comment: Yeah, it is safer to only port the unit test. However, I believe this is dead code; otherwise we might have to revisit the fix for 2.4 and 2.3.
[GitHub] felixcheung commented on issue #23347: allow building spark gpu docker images
felixcheung commented on issue #23347: allow building spark gpu docker images URL: https://github.com/apache/spark/pull/23347#issuecomment-453800996 IMO we haven’t concluded what we are trying to achieve by having a Dockerfile in the release. We had some discussion last year. In any case, with Alpine, we could publish official Docker images. But I’d agree that if we do, we should test and sign off on them, and I also agree that more variations and complexity would mean more work for the community. Perhaps this should be discussed on dev@.
[GitHub] HyukjinKwon commented on issue #23391: [SPARK-26456][SQL] Cast date/timestamp to string by Date/TimestampFormatter
HyukjinKwon commented on issue #23391: [SPARK-26456][SQL] Cast date/timestamp to string by Date/TimestampFormatter URL: https://github.com/apache/spark/pull/23391#issuecomment-453800952 Let's also update the PR description. `spark.sql.legacy.timeParser.enabled` is now removed.
[GitHub] jzhuge commented on a change in pull request #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
jzhuge commented on a change in pull request #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#discussion_r247333601 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala ## @@ -68,9 +68,6 @@ object PhysicalOperation extends PredicateHelper { val substitutedCondition = substitute(aliases)(condition) (fields, filters ++ splitConjunctivePredicates(substitutedCondition), other, aliases) - case h: ResolvedHint => -collectProjectsAndFilters(h.child) - Review comment: Yeah, it is safer to only port the unit test. However, we believe this is dead code; otherwise we might have to revisit the fix for 2.4 and 2.3.
[GitHub] AmplabJenkins removed a comment on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+)
AmplabJenkins removed a comment on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+) URL: https://github.com/apache/spark/pull/20691#issuecomment-453800570 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+)
AmplabJenkins removed a comment on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+) URL: https://github.com/apache/spark/pull/20691#issuecomment-453800572 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6997/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+)
AmplabJenkins commented on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+) URL: https://github.com/apache/spark/pull/20691#issuecomment-453800570 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+)
AmplabJenkins commented on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+) URL: https://github.com/apache/spark/pull/20691#issuecomment-453800572 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6997/ Test PASSed.
[GitHub] SparkQA commented on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+)
SparkQA commented on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+) URL: https://github.com/apache/spark/pull/20691#issuecomment-453800521 **[Test build #101136 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101136/testReport)** for PR 20691 at commit [`e15eb63`](https://github.com/apache/spark/commit/e15eb636ef421656df817612b1900d2e953e545a).
[GitHub] HyukjinKwon commented on a change in pull request #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+)
HyukjinKwon commented on a change in pull request #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+) URL: https://github.com/apache/spark/pull/20691#discussion_r247333476 ## File path: python/pyspark/broadcast.py ## @@ -80,7 +81,7 @@ def __init__(self, sc=None, value=None, pickle_registry=None, path=None): def dump(self, value, f): try: -pickle.dump(value, f, 2) +pickle.dump(value, f, protocol) Review comment: Can we update our cloudpickle copy to a newer version to match this accordingly?
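For context on the change under review, here is a minimal sketch of choosing a pickle protocol by interpreter version instead of hard-coding 2. It is not the PR's code: the selection rule and the standalone `dump` helper are assumptions made for illustration. Protocol 4, available since Python 3.4, is the one that adds support for very large objects.

```python
import io
import pickle
import sys

# Assumption for illustration: use protocol 4 on Python 3.4+, else fall back to 2.
protocol = 4 if sys.version_info >= (3, 4) else 2


def dump(value, f, protocol=protocol):
    """Hypothetical helper: write `value` to the binary file-like `f`."""
    pickle.dump(value, f, protocol)


# Round-trip a small object through an in-memory buffer.
buf = io.BytesIO()
dump({"rows": list(range(5))}, buf)
header = buf.getvalue()[:2]  # PROTO opcode followed by the protocol number
buf.seek(0)
restored = pickle.load(buf)
```

The reader side needs no change, because `pickle.load` detects the protocol from the stream header.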
[GitHub] HyukjinKwon commented on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+)
HyukjinKwon commented on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+) URL: https://github.com/apache/spark/pull/20691#issuecomment-453800408 Let's give this a shot in 3.0.0. Cloudpickle also changed its protocol from 2 to the highest one long ago, and it doesn't seem to have had any notable regression so far.
[GitHub] AmplabJenkins removed a comment on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+)
AmplabJenkins removed a comment on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+) URL: https://github.com/apache/spark/pull/20691#issuecomment-453656615 Can one of the admins verify this patch?
[GitHub] HyukjinKwon commented on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+)
HyukjinKwon commented on issue #20691: [SPARK-18161] [Python] Allow pickle to serialize >4 GB objects when possible (Python 3.4+) URL: https://github.com/apache/spark/pull/20691#issuecomment-453800378 ok to test
[GitHub] HyukjinKwon commented on a change in pull request #23518: [SPARK-26600] Update spark-submit usage message
HyukjinKwon commented on a change in pull request #23518: [SPARK-26600] Update spark-submit usage message URL: https://github.com/apache/spark/pull/23518#discussion_r247333262 ## File path: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ## @@ -576,27 +576,26 @@ private[deploy] class SparkSubmitArguments(args: Seq[String], env: Map[String, S | --kill SUBMISSION_ID If given, kills the driver specified. | --status SUBMISSION_ID If given, requests the status of the driver specified. | -| Spark standalone and Mesos only: +| Spark standalone, Mesos and K8S only: Review comment: Shall we use the fully qualified name, since it's a user-facing doc?
[GitHub] HyukjinKwon commented on issue #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable
HyukjinKwon commented on issue #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable URL: https://github.com/apache/spark/pull/23519#issuecomment-453799619 Also, how was this patch tested? It looks like the UT doesn't cover the current change.
[GitHub] HyukjinKwon commented on a change in pull request #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable
HyukjinKwon commented on a change in pull request #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable URL: https://github.com/apache/spark/pull/23519#discussion_r247333148 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/StaticSQLConf.scala ## @@ -126,4 +126,10 @@ object StaticSQLConf { .intConf .createWithDefault(1000) + val MAX_BROADCAST_EXCHANGE_THREADNUMBER = +buildStaticConf("spark.sql.broadcastExchange.maxThreadNumber") + .doc("MAX number of threads can hold by BroadcastExchangeExec.") Review comment: Can you elaborate in the doc on what this number controls? For instance, it controls the parallelism of fetching and broadcasting the table.
[GitHub] HyukjinKwon commented on a change in pull request #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable
HyukjinKwon commented on a change in pull request #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable URL: https://github.com/apache/spark/pull/23519#discussion_r247333106 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala ## @@ -157,5 +157,6 @@ case class BroadcastExchangeExec( object BroadcastExchangeExec { private[execution] val executionContext = ExecutionContext.fromExecutorService( -ThreadUtils.newDaemonCachedThreadPool("broadcast-exchange", 128)) +ThreadUtils.newDaemonCachedThreadPool("broadcast-exchange", + new SparkConf().get(StaticSQLConf.MAX_BROADCAST_EXCHANGE_THREADNUMBER))) Review comment: `SQLConf.get`
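The review comments on this PR concern a pool size that is read from configuration instead of being hard-coded to 128. The sketch below is a rough Python analogy, not Spark's Scala implementation. The `conf` dictionary and the `max_threads` helper are hypothetical, the fallback of 128 is an assumption that mirrors the previously hard-coded value, and `ThreadPoolExecutor` is a bounded pool rather than a cached one.

```python
from concurrent.futures import ThreadPoolExecutor

KEY = "spark.sql.broadcastExchange.maxThreadNumber"  # key name taken from the diff
DEFAULT_MAX_THREADS = 128  # assumption: reuse the previously hard-coded value


def max_threads(conf):
    """Hypothetical helper: return the configured bound, or the default."""
    return int(conf.get(KEY, DEFAULT_MAX_THREADS))


# Hypothetical settings; in Spark this would come from the session configuration.
conf = {KEY: "8"}

# The bound caps how many tasks (for example, table broadcasts) run at once.
pool = ThreadPoolExecutor(
    max_workers=max_threads(conf), thread_name_prefix="broadcast-exchange"
)
results = list(pool.map(lambda n: n * n, range(4)))
pool.shutdown()
```

Reading the value through a single accessor keeps the default in one place, which is also what makes the setting straightforward to document.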
[GitHub] asfgit closed pull request #23529: [SPARK-26503][CORE][DOC][FOLLOWUP] Get rid of spark.sql.legacy.timeParser.enabled
asfgit closed pull request #23529: [SPARK-26503][CORE][DOC][FOLLOWUP] Get rid of spark.sql.legacy.timeParser.enabled URL: https://github.com/apache/spark/pull/23529 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/docs/sql-migration-guide-upgrade.md b/docs/sql-migration-guide-upgrade.md index a2d782e782ae0..fce0b9a5f86a0 100644 --- a/docs/sql-migration-guide-upgrade.md +++ b/docs/sql-migration-guide-upgrade.md @@ -33,13 +33,13 @@ displayTitle: Spark SQL Upgrading Guide - In Spark version 2.4 and earlier, the `SET` command works without any warnings even if the specified key is for `SparkConf` entries and it has no effect because the command does not update `SparkConf`, but the behavior might confuse users. Since 3.0, the command fails if a `SparkConf` key is used. You can disable such a check by setting `spark.sql.legacy.setCommandRejectsSparkCoreConfs` to `false`. - - Since Spark 3.0, CSV/JSON datasources use java.time API for parsing and generating CSV/JSON content. In Spark version 2.4 and earlier, java.text.SimpleDateFormat is used for the same purpose with fallbacks to the parsing mechanisms of Spark 2.0 and 1.x. For example, `2018-12-08 10:39:21.123` with the pattern `yyyy-MM-dd'T'HH:mm:ss.SSS` cannot be parsed since Spark 3.0 because the timestamp does not match to the pattern but it can be parsed by earlier Spark versions due to a fallback to `Timestamp.valueOf`. To parse the same timestamp since Spark 3.0, the pattern should be `yyyy-MM-dd HH:mm:ss.SSS`. To switch back to the implementation used in Spark 2.4 and earlier, set `spark.sql.legacy.timeParser.enabled` to `true`. + - Since Spark 3.0, CSV/JSON datasources use java.time API for parsing and generating CSV/JSON content. 
In Spark version 2.4 and earlier, java.text.SimpleDateFormat is used for the same purpose with fallbacks to the parsing mechanisms of Spark 2.0 and 1.x. For example, `2018-12-08 10:39:21.123` with the pattern `yyyy-MM-dd'T'HH:mm:ss.SSS` cannot be parsed since Spark 3.0 because the timestamp does not match to the pattern but it can be parsed by earlier Spark versions due to a fallback to `Timestamp.valueOf`. To parse the same timestamp since Spark 3.0, the pattern should be `yyyy-MM-dd HH:mm:ss.SSS`. - In Spark version 2.4 and earlier, CSV datasource converts a malformed CSV string to a row with all `null`s in the PERMISSIVE mode. Since Spark 3.0, the returned row can contain non-`null` fields if some of CSV column values were parsed and converted to desired types successfully. - In Spark version 2.4 and earlier, JSON datasource and JSON functions like `from_json` convert a bad JSON record to a row with all `null`s in the PERMISSIVE mode when specified schema is `StructType`. Since Spark 3.0, the returned row can contain non-`null` fields if some of JSON column values were parsed and converted to desired types successfully. - - Since Spark 3.0, the `unix_timestamp`, `date_format`, `to_unix_timestamp`, `from_unixtime`, `to_date`, `to_timestamp` functions use java.time API for parsing and formatting dates/timestamps from/to strings by using ISO chronology (https://docs.oracle.com/javase/8/docs/api/java/time/chrono/IsoChronology.html) based on Proleptic Gregorian calendar. In Spark version 2.4 and earlier, java.text.SimpleDateFormat and java.util.GregorianCalendar (hybrid calendar that supports both the Julian and Gregorian calendar systems, see https://docs.oracle.com/javase/7/docs/api/java/util/GregorianCalendar.html) is used for the same purpose. New implementation supports pattern formats as described here https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html and performs strict checking of its input. 
For example, the `2015-07-22 10:00:00` timestamp cannot be parse if pattern is `yyyy-MM-dd` because the parser does not consume whole input. Another example is the `31/01/2015 00:00` input cannot be parsed by the `dd/MM/yyyy hh:mm` pattern because `hh` supposes hours in the range `1-12`. To switch back to the implementation used in Spark 2.4 and earlier, set `spark.sql.legacy.timeParser.enabled` to `true`. + - Since Spark 3.0, the `unix_timestamp`, `date_format`, `to_unix_timestamp`, `from_unixtime`, `to_date`, `to_timestamp` functions use java.time API for parsing and formatting dates/timestamps from/to strings by using ISO chronology (https://docs.oracle.com/javase/8/docs/api/java/time/chrono/IsoChronology.html) based on Proleptic Gregorian calendar. In Spark version 2.4 and earlier, java.text.SimpleDateFormat and java.util.GregorianCalendar (hybrid calendar that supports both the Julian and Gregorian calendar systems, see
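The strict-parsing behavior described in the migration-guide diff above (which is cut off at this point in the archive) can be illustrated by analogy. The sketch below uses Python's `datetime.strptime`, which is not the java.time API and uses different pattern letters, but it happens to reject the same two inputs for the same two reasons. The `try_parse` helper is hypothetical.

```python
from datetime import datetime


def try_parse(text, pattern):
    """Hypothetical helper: return the parsed datetime, or None on rejection."""
    try:
        return datetime.strptime(text, pattern)
    except ValueError:
        return None


# Rejected: a date-only pattern does not consume the trailing " 10:00:00".
leftover = try_parse("2015-07-22 10:00:00", "%Y-%m-%d")

# Rejected: %I is a 12-hour field (01-12), so hour "00" does not match.
twelve_hour = try_parse("31/01/2015 00:00", "%d/%m/%Y %I:%M")

# Accepted: %H is a 24-hour field (00-23).
twenty_four_hour = try_parse("31/01/2015 00:00", "%d/%m/%Y %H:%M")
```

In both rejected cases the fix is to change the pattern so that it describes the whole input, rather than relying on a lenient fallback.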
[GitHub] AmplabJenkins removed a comment on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
AmplabJenkins removed a comment on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#issuecomment-453798609 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6996/ Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
AmplabJenkins removed a comment on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#issuecomment-453798608 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
AmplabJenkins commented on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#issuecomment-453798609 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6996/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
AmplabJenkins commented on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#issuecomment-453798608 Merged build finished. Test PASSed.
[GitHub] HyukjinKwon commented on issue #23529: [SPARK-26503][CORE][DOC][FOLLOWUP] Get rid of spark.sql.legacy.timeParser.enabled
HyukjinKwon commented on issue #23529: [SPARK-26503][CORE][DOC][FOLLOWUP] Get rid of spark.sql.legacy.timeParser.enabled URL: https://github.com/apache/spark/pull/23529#issuecomment-453798579 Merged to master.
[GitHub] SparkQA commented on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
SparkQA commented on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#issuecomment-453798520 **[Test build #101135 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101135/testReport)** for PR 23530 at commit [`e81a501`](https://github.com/apache/spark/commit/e81a501be64324aeb929a05bbd7825d2bfdcbddc).
[GitHub] dongjoon-hyun commented on a change in pull request #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
dongjoon-hyun commented on a change in pull request #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#discussion_r247332753 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala ## @@ -68,9 +68,6 @@ object PhysicalOperation extends PredicateHelper { val substitutedCondition = substitute(aliases)(condition) (fields, filters ++ splitConjunctivePredicates(substitutedCondition), other, aliases) - case h: ResolvedHint => -collectProjectsAndFilters(h.child) - Review comment: Hi, @jzhuge . Do we need to remove this? According to the previous PR, `master` has no issue. So, I expected a test only PR. cc @cloud-fan and @gatorsmile
[GitHub] dongjoon-hyun commented on a change in pull request #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
dongjoon-hyun commented on a change in pull request #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#discussion_r247332753 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala ## @@ -68,9 +68,6 @@ object PhysicalOperation extends PredicateHelper { val substitutedCondition = substitute(aliases)(condition) (fields, filters ++ splitConjunctivePredicates(substitutedCondition), other, aliases) - case h: ResolvedHint => -collectProjectsAndFilters(h.child) - Review comment: Hi, @jzhuge . Do we need this? According to the previous PR, `master` has no issue. So, I expected a test only PR. cc @cloud-fan and @gatorsmile
[GitHub] AmplabJenkins removed a comment on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
AmplabJenkins removed a comment on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#issuecomment-453788541 Can one of the admins verify this patch?
[GitHub] dongjoon-hyun commented on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table
dongjoon-hyun commented on issue #23530: [SPARK-26576][SQL] Broadcast hint not applied to partitioned table URL: https://github.com/apache/spark/pull/23530#issuecomment-453798325 ok to test
[GitHub] AmplabJenkins removed a comment on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories.
AmplabJenkins removed a comment on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories. URL: https://github.com/apache/spark/pull/23416#issuecomment-453797627 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6995/ Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories.
AmplabJenkins removed a comment on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories. URL: https://github.com/apache/spark/pull/23416#issuecomment-453797626 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories.
AmplabJenkins commented on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories. URL: https://github.com/apache/spark/pull/23416#issuecomment-453797626 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories.
AmplabJenkins commented on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories. URL: https://github.com/apache/spark/pull/23416#issuecomment-453797627 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/6995/ Test PASSed.
[GitHub] SparkQA commented on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories.
SparkQA commented on issue #23416: [SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories. URL: https://github.com/apache/spark/pull/23416#issuecomment-453797564 **[Test build #101134 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101134/testReport)** for PR 23416 at commit [`21f95a9`](https://github.com/apache/spark/commit/21f95a978997e76e10935a3833370462cf9f924a).
[GitHub] AmplabJenkins removed a comment on issue #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType
AmplabJenkins removed a comment on issue #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType URL: https://github.com/apache/spark/pull/23512#issuecomment-453797160 Merged build finished. Test PASSed.
[GitHub] AmplabJenkins commented on issue #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType
AmplabJenkins commented on issue #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType URL: https://github.com/apache/spark/pull/23512#issuecomment-453797163 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101132/ Test PASSed.
[GitHub] AmplabJenkins removed a comment on issue #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType
AmplabJenkins removed a comment on issue #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType URL: https://github.com/apache/spark/pull/23512#issuecomment-453797163 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/101132/ Test PASSed.
[GitHub] AmplabJenkins commented on issue #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType
AmplabJenkins commented on issue #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType URL: https://github.com/apache/spark/pull/23512#issuecomment-453797160 Merged build finished. Test PASSed.
[GitHub] SparkQA commented on issue #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType
SparkQA commented on issue #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType URL: https://github.com/apache/spark/pull/23512#issuecomment-453797083 **[Test build #101132 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101132/testReport)** for PR 23512 at commit [`0f7ddd2`](https://github.com/apache/spark/commit/0f7ddd2f7736270eb3db51e65c2bef8dafa884c5).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] SparkQA removed a comment on issue #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType
SparkQA removed a comment on issue #23512: [SPARK-26593][SQL] Use Proleptic Gregorian calendar in casting UTF8String to Date/TimestampType URL: https://github.com/apache/spark/pull/23512#issuecomment-453786298 **[Test build #101132 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/101132/testReport)** for PR 23512 at commit [`0f7ddd2`](https://github.com/apache/spark/commit/0f7ddd2f7736270eb3db51e65c2bef8dafa884c5).
[GitHub] SparkQA removed a comment on issue #23529: [SPARK-26503][CORE][DOC][FOLLOWUP] Get rid of spark.sql.legacy.timeParser.enabled
SparkQA removed a comment on issue #23529: [SPARK-26503][CORE][DOC][FOLLOWUP] Get rid of spark.sql.legacy.timeParser.enabled URL: https://github.com/apache/spark/pull/23529#issuecomment-453784427 **[Test build #4507 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4507/testReport)** for PR 23529 at commit [`496e061`](https://github.com/apache/spark/commit/496e06175490590e05945f282f04ba3547eb9190).
[GitHub] SparkQA commented on issue #23529: [SPARK-26503][CORE][DOC][FOLLOWUP] Get rid of spark.sql.legacy.timeParser.enabled
SparkQA commented on issue #23529: [SPARK-26503][CORE][DOC][FOLLOWUP] Get rid of spark.sql.legacy.timeParser.enabled URL: https://github.com/apache/spark/pull/23529#issuecomment-453795960 **[Test build #4507 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4507/testReport)** for PR 23529 at commit [`496e061`](https://github.com/apache/spark/commit/496e06175490590e05945f282f04ba3547eb9190).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] SparkQA commented on issue #20512: [SPARK-23182][CORE] Allow enabling TCP keep alive on the RPC connections
SparkQA commented on issue #20512: [SPARK-23182][CORE] Allow enabling TCP keep alive on the RPC connections URL: https://github.com/apache/spark/pull/20512#issuecomment-453794591 **[Test build #4506 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4506/testReport)** for PR 20512 at commit [`c5e2d98`](https://github.com/apache/spark/commit/c5e2d98b9e98fd3416a36ab91262260146bf4ac5).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] SparkQA removed a comment on issue #20512: [SPARK-23182][CORE] Allow enabling TCP keep alive on the RPC connections
SparkQA removed a comment on issue #20512: [SPARK-23182][CORE] Allow enabling TCP keep alive on the RPC connections URL: https://github.com/apache/spark/pull/20512#issuecomment-453782753 **[Test build #4506 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4506/testReport)** for PR 20512 at commit [`c5e2d98`](https://github.com/apache/spark/commit/c5e2d98b9e98fd3416a36ab91262260146bf4ac5).
[GitHub] pgandhi999 commented on a change in pull request #23437: [SPARK-26524] If the application directory fails to be created on the SPARK_WORKER_…
pgandhi999 commented on a change in pull request #23437: [SPARK-26524] If the application directory fails to be created on the SPARK_WORKER_…
URL: https://github.com/apache/spark/pull/23437#discussion_r247331157

## File path: core/src/main/scala/org/apache/spark/deploy/master/WorkerInfo.scala

@@ -101,5 +106,27 @@ private[spark] class WorkerInfo(
     this.state = state
   }

+  def setIsBlack(): Unit = {
+    this.isBlack = true
+    this.lastBlackTime = System.currentTimeMillis()
+  }
+
+  def unsetBlack(): Unit = {
+    this.isBlack = false
+    this.lastBlackTime = 0
+  }
+
+  def increaseFailedAppCount(appId: String) {
+    appIdToRetryCount(appId) = appIdToRetryCount.get(appId).map(_ + 1).getOrElse(1)
+  }
+
+  def getFailedFailedCount(appId: String): Int = {
+    appIdToRetryCount.getOrElse(appId, 1)

Review comment: Shouldn't this default to zero in the event no worker has failed?
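The reviewer's point can be illustrated with a minimal sketch. It is not the `WorkerInfo` class from the pull request: `FailedAppCounter` and `getFailedAppCount` are hypothetical names, and the backing map is assumed to be a `mutable.HashMap[String, Int]`, which the diff does not show. The idea is simply that an application with no recorded failure reports a count of 0, and the increment starts from that same default.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the per-application failure counter in the diff;
// the map type is an assumption, since the diff does not show its declaration.
class FailedAppCounter {
  private val appIdToRetryCount = mutable.HashMap.empty[String, Int]

  // Record one more failure for the given application, starting from 0.
  def increaseFailedAppCount(appId: String): Unit = {
    appIdToRetryCount(appId) = appIdToRetryCount.getOrElse(appId, 0) + 1
  }

  // An application with no recorded failure reports 0 rather than 1.
  def getFailedAppCount(appId: String): Int =
    appIdToRetryCount.getOrElse(appId, 0)
}
```

With a default of 1, a caller comparing the count against a retry limit would treat an application that has never failed as if it had already failed once.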
[GitHub] felixcheung commented on issue #23527: [SPARK-26482][K8S][TEST][FOLLOWUP] Fix compile failure
felixcheung commented on issue #23527: [SPARK-26482][K8S][TEST][FOLLOWUP] Fix compile failure URL: https://github.com/apache/spark/pull/23527#issuecomment-453794268 I think we need it added to the module list and then only "execute the test" when the profile is set.
[GitHub] felixcheung commented on issue #23527: [SPARK-26482][K8S][TEST][FOLLOWUP] Fix compile failure
felixcheung commented on issue #23527: [SPARK-26482][K8S][TEST][FOLLOWUP] Fix compile failure URL: https://github.com/apache/spark/pull/23527#issuecomment-453794227 Looks like the integration test is not in the module list https://github.com/apache/spark/blob/master/dev/sparktestsupport/modules.py#L542 so it doesn't get built unless the profile `kubernetes-integration-tests` is enabled explicitly https://github.com/apache/spark/blob/master/pom.xml#L2726