[GitHub] [spark] HeartSaVioR commented on pull request #30770: [SPARK-33783][SS] Unload State Store Provider after configured keep alive time

2020-12-15 Thread GitBox
HeartSaVioR commented on pull request #30770: URL: https://github.com/apache/spark/pull/30770#issuecomment-745815471 cc: @tdas @zsxwing @jose-torres @gaborgsomogyi @xuanyuanking

2020-12-16 Thread GitBox
HeartSaVioR commented on pull request #30770: URL: https://github.com/apache/spark/pull/30770#issuecomment-747138002 The problem is, this is completely relying on luck: this doesn't give any help on the physical plan. Again, the problem exists even without the PR, but then shouldn't we fix the

2020-12-16 Thread GitBox
HeartSaVioR commented on pull request #30770: URL: https://github.com/apache/spark/pull/30770#issuecomment-747154451 Let's be clear: it is relying on luck "as it is"; it requires a non-trivial change to not rely on luck. E.g., you can make the state store coordinator track executors being in

2020-12-16 Thread GitBox
HeartSaVioR commented on pull request #30770: URL: https://github.com/apache/spark/pull/30770#issuecomment-747169325 > If you think we should not make it a configurable item, I can remove the configuration and only check that the keep-alive time is more than the maintenance interval. It also
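The check proposed in the quote above (keep-alive time must exceed the maintenance interval, since unloading is only evaluated once per maintenance run) can be sketched as a simple validation. This is a minimal Python sketch of the idea under discussion; the function and parameter names are hypothetical, not Spark's actual configuration keys or API:

```python
# Sketch of the validation discussed in the comment above: a keep-alive
# (TTL) only makes sense if it is longer than the maintenance interval,
# because the unload decision is checked at most once per maintenance run.
# All names here are illustrative, not Spark's actual config keys.

def validate_keep_alive(keep_alive_sec: float, maintenance_interval_sec: float) -> None:
    """Raise if the keep-alive would be shorter than one maintenance cycle."""
    if keep_alive_sec <= maintenance_interval_sec:
        raise ValueError(
            f"keep-alive ({keep_alive_sec}s) must be greater than the "
            f"maintenance interval ({maintenance_interval_sec}s); otherwise "
            "the provider could be unloaded on the very next maintenance run"
        )

# A 5-minute keep-alive with a 1-minute maintenance interval passes.
validate_keep_alive(keep_alive_sec=300, maintenance_interval_sec=60)
```

With such a check in place, a separate configuration knob becomes optional: the keep-alive either satisfies the invariant or the query fails fast at startup.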

2020-12-16 Thread GitBox
HeartSaVioR commented on pull request #30770: URL: https://github.com/apache/spark/pull/30770#issuecomment-747178340 The problem explanation sounds to me as if we should unload ASAP whenever possible instead of delaying, right? Providing a TTL would delay the unload longer than the current behavior, even g

2020-12-16 Thread GitBox
HeartSaVioR commented on pull request #30770: URL: https://github.com/apache/spark/pull/30770#issuecomment-747187019 > why don't we do unloading ASAP Because the conversation is one-way: the executor registers with the driver and the executor queries the driver, but the driver doesn't notify executors.
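The one-way conversation described above implies that executors must poll: during each maintenance run an executor asks the driver whether it is still the active host for a provider, and unloads if not. A minimal Python sketch of that polling pattern, assuming a driver-side registry (the `Coordinator` and `Executor` classes here are illustrative stand-ins, not Spark's actual coordinator RPC):

```python
class Coordinator:
    """Driver-side registry; illustrative stand-in for a state store
    coordinator. It answers queries but never pushes notifications."""

    def __init__(self):
        self.active_executor: dict[str, str] = {}  # provider_id -> executor_id

    def report_active(self, provider_id: str, executor_id: str) -> None:
        # Executor -> driver: "I now host this provider."
        self.active_executor[provider_id] = executor_id

    def verify_active(self, provider_id: str, executor_id: str) -> bool:
        # Executor -> driver: "Am I still the active host?"
        return self.active_executor.get(provider_id) == executor_id

class Executor:
    def __init__(self, executor_id: str, coordinator: Coordinator):
        self.executor_id = executor_id
        self.coordinator = coordinator
        self.loaded: dict[str, dict] = {}  # provider_id -> provider object

    def load(self, provider_id: str) -> None:
        self.loaded[provider_id] = {"id": provider_id}
        self.coordinator.report_active(provider_id, self.executor_id)

    def maintenance(self) -> None:
        # The driver never notifies, so the executor polls during each
        # maintenance run and unloads providers it no longer hosts.
        for provider_id in list(self.loaded):
            if not self.coordinator.verify_active(provider_id, self.executor_id):
                del self.loaded[provider_id]

coord = Coordinator()
e1, e2 = Executor("exec-1", coord), Executor("exec-2", coord)
e1.load("store-0")   # exec-1 hosts store-0
e2.load("store-0")   # store-0 moves to exec-2
e1.maintenance()     # exec-1 discovers it is stale and unloads store-0
```

The consequence for the discussion: with pull-only communication, "unload ASAP" can only mean "at the next maintenance poll", not "the instant the driver reassigns the provider".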

2020-12-16 Thread GitBox
HeartSaVioR commented on pull request #30770: URL: https://github.com/apache/spark/pull/30770#issuecomment-747189389 Let's hear third-party voices here before making progress.

2020-12-17 Thread GitBox
HeartSaVioR commented on pull request #30770: URL: https://github.com/apache/spark/pull/30770#issuecomment-747289740 My preference is unloading state as soon as possible if it's not being used. So if this is achievable, I don't think we need to investigate alternatives. The difference