[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-11-12 Thread JeetKunDoug
Github user JeetKunDoug commented on the issue: https://github.com/apache/spark/pull/21322 @HyukjinKwon makes sense - I'll try to get a design doc put together as soon as I can, but my "day job" is preventing me from working on it right now. ---

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21322 Shall we leave this PR closed and start it from a design doc? Let me suggest to close this for now while I am looking through old PRs. @JeetKunDoug, please feel free to create a clone

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-10-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21322 I'd like to recommend http://apache-spark-developers-list.1001551.n3.nabble.com/SPIP-SPARK-25728-Structured-Intermediate-Representation-Tungsten-IR-for-generating-Java-code-td25370.html ---

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-10-16 Thread JeetKunDoug
Github user JeetKunDoug commented on the issue: https://github.com/apache/spark/pull/21322 Sounds good - it may take me a while to get the time to work this up, but I'll get something put together and attached to the underlying issue as soon as I can. If possible, can you point me at

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21322 I think a design doc is better, to make sure we are on the same page before the actual coding, which saves time. --- - To

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-10-16 Thread JeetKunDoug
Github user JeetKunDoug commented on the issue: https://github.com/apache/spark/pull/21322 @cloud-fan yes, essentially... the only thing I'd add is that it's bound to "whatever scope the driver sees fit to bind it to", if that makes sense. In my particular case, it happens to be a

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21322 Ah i see. It's kind of a broadcast but has a much small scope and its lifecycle is bound to a stage. I'm looking forward to a full design of it, thanks! ---

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-10-16 Thread JeetKunDoug
Github user JeetKunDoug commented on the issue: https://github.com/apache/spark/pull/21322 @cloud-fan The use-case we have in mind (and are currently using Broadcast + finalizers for) is the case where you have, for example, a connection pool that is, for one reason or another,

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-10-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21322 Introducing a new concept which is similar to broadcast seems like an overkill. We can just update broadcast, to allow it to be memory-only. However, there might be simpler solutions to

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-10-12 Thread JeetKunDoug
Github user JeetKunDoug commented on the issue: https://github.com/apache/spark/pull/21322 @cloud-fan (Also left on the JIRA ticket) - Sorry this has dropped off my radar for so long - work + life took me away from it for a while. So looking at the PR review comments and better

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-09-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21322 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21322 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-21 Thread JeetKunDoug
Github user JeetKunDoug commented on the issue: https://github.com/apache/spark/pull/21322 @cloud-fan So I see what's been messing with me - it's the whole Broadcast variable cache/week reference thing. I originally wrote this internally for Spark 1.6.3 (still supporting some older

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-19 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21322 IIUC it's impossible to clean up broadcasted object in `MemoryStore`. The life cycle of the broadcasted object is: 1. The first task that tries to get the broadcasted object will read bytes

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21322 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90824/ Test PASSed. ---

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21322 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21322 **[Test build #90824 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90824/testReport)** for PR 21322 at commit

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21322 **[Test build #90824 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90824/testReport)** for PR 21322 at commit

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-18 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/21322 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21322 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90796/ Test FAILed. ---

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21322 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21322 **[Test build #90796 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90796/testReport)** for PR 21322 at commit

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21322 **[Test build #90796 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90796/testReport)** for PR 21322 at commit

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-18 Thread JeetKunDoug
Github user JeetKunDoug commented on the issue: https://github.com/apache/spark/pull/21322 @cloud-fan as to eviction, it ends up calling `BlockManager.dropFromMemory`, which will in fact call `MemoryStore.remove` and do the cleanup appropriately, so that should already be handled

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21322 according to `TorrentBroadcast.writeBlocks` and `TorrentBroadcast.readBroadcastBlock`, broadcasted data are always written into the block manager with `MEMORY_AND_DISK_SER` mode. When a task

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21322 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21322 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90652/ Test PASSed. ---

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21322 **[Test build #90652 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90652/testReport)** for PR 21322 at commit

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21322 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90651/ Test PASSed. ---

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21322 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21322 **[Test build #90651 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90651/testReport)** for PR 21322 at commit

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-15 Thread JeetKunDoug
Github user JeetKunDoug commented on the issue: https://github.com/apache/spark/pull/21322 @cloud-fan I added the above-mentioned check for `isBroadcast` and only release resources if it's a broadcast ID. This will affect the MapOutputTracker as well, but I think it's a pretty

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21322 **[Test build #90652 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90652/testReport)** for PR 21322 at commit

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21322 **[Test build #90651 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90651/testReport)** for PR 21322 at commit

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-15 Thread JeetKunDoug
Github user JeetKunDoug commented on the issue: https://github.com/apache/spark/pull/21322 @cloud-fan So your suggestion makes sense - it seems like the best path forward is to check the `isBroadcast` flag on the BlockId passed in to `MemoryStore.remove` and release resources only if

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-15 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21322 @JeetKunDoug The same issue we discuss above. I think if there's a deserialized version of the variable, it can be not in `MemoryStore` but only serialized bytes in disk store. The reason

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-15 Thread JeetKunDoug
Github user JeetKunDoug commented on the issue: https://github.com/apache/spark/pull/21322 @viirya it seems from my admittedly cursory look at where we use the `cachedValues` reference map that we should be OK in this case - if there's a deserialized version of the variable (the only

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-14 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21322 One concern I have is, now we keep broadcasted variables in `BroadcastManager.cachedValues` by using weak reference. So if a broadcasted variable with `AutoCloseable` is released before we call

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-14 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21322 I think users are responsible to call `Broadcast#destroy`, which unpersist broadcast blocks from block manager and run user-defined driver side cleanup. It is a valid use case to allow

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21322 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90599/ Test PASSed. ---

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21322 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-05-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21322 **[Test build #90599 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90599/testReport)** for PR 21322 at commit