[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] unused broadcast variables do d...

2016-07-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14333
  
**[Test build #62779 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62779/consoleFull)** for PR 14333 at commit [`ecd15b2`](https://github.com/apache/spark/commit/ecd15b2b7bac21a9e37f998747a752903530).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] unused broadcast variables do d...

2016-07-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14333
  
**[Test build #62778 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62778/consoleFull)** for PR 14333 at commit [`f129a2b`](https://github.com/apache/spark/commit/f129a2b575ac9523672e321e64151bbff64e71c5).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] unused broadcast variables do d...

2016-07-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14333
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62778/
Test FAILed.





[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] unused broadcast variables do d...

2016-07-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14333
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] unused broadcast variables do d...

2016-07-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14333
  
**[Test build #62778 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62778/consoleFull)** for PR 14333 at commit [`f129a2b`](https://github.com/apache/spark/commit/f129a2b575ac9523672e321e64151bbff64e71c5).





[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] unused broadcast variables do d...

2016-07-24 Thread WeichenXu123
Github user WeichenXu123 commented on the issue:

https://github.com/apache/spark/pull/14333
  
@srowen 
There is a problem with how `bcNewCenters` is handled in `KMeans`.
Looking at the logic in detail, each loop iteration should destroy the 
broadcast variable `bcNewCenters` created in the previous iteration, not the 
one created in the current iteration. This mirrors what is done for 
`costs: RDD`, where a `preCosts` variable holds the RDD generated in the 
previous iteration.
I have updated the code.
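
The ordering issue can be illustrated with a small toy model in Python. This is only a sketch under stated assumptions: `ToyBroadcast` and `run_iterations` are hypothetical names, not Spark's API or the actual `KMeans` code, and the lazily evaluated costs are modelled as a closure that is first read in the following iteration.

```python
class ToyBroadcast:
    """Hypothetical stand-in for a broadcast variable; not Spark's API."""

    def __init__(self, value):
        self._value = value
        self._destroyed = False

    @property
    def value(self):
        if self._destroyed:
            raise RuntimeError("broadcast read after destroy")
        return self._value

    def destroy(self):
        self._value = None
        self._destroyed = True


def run_iterations(points, steps, destroy_current):
    """Greedily pick centres; each step's costs are evaluated one step later."""
    centers = [points[0]]
    bc_prev = None      # broadcast created in the previous iteration
    lazy_costs = None   # deferred computation, like an unevaluated RDD
    for _ in range(steps):
        if lazy_costs is not None:
            # Evaluating the deferred costs reads the previous iteration's broadcast.
            costs = lazy_costs()
            farthest = max(range(len(points)), key=lambda i: costs[i])
            centers = centers + [points[farthest]]
        bc_new = ToyBroadcast(list(centers))
        # The default argument binds this iteration's broadcast to the closure.
        lazy_costs = lambda bc=bc_new: [
            min(abs(p - c) for c in bc.value) for p in points
        ]
        if destroy_current:
            bc_new.destroy()        # too early: the closure has not run yet
        else:
            if bc_prev is not None:
                bc_prev.destroy()   # safe: its costs were evaluated above
            bc_prev = bc_new
    if bc_prev is not None:
        bc_prev.destroy()           # final cleanup once the loop is done
    return centers
```

With `destroy_current=False` the loop completes. With `destroy_current=True` the second iteration raises, because the deferred costs read a broadcast that was already destroyed.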

The second question is what `broadcast.unpersist` is for. I think there is 
another scenario to consider. Suppose there is an RDD lineage. In the normal 
case the job runs successfully, and the code can unpersist a broadcast 
variable as soon as it is no longer useful. But if an exception occurs that 
Spark can recover from, Spark has to recompute the lost RDD from its lineage, 
and in that case it may reuse a broadcast variable we have already 
unpersisted. If we had simply destroyed it, the broadcast variable could not 
be recovered, so the recovery would fail.

So I think the safe place to call `broadcast.destroy` is after an action on 
the RDD has completed successfully and the whole RDD lineage is no longer 
needed.
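
The difference between the two calls during recovery can be sketched with a toy model. The class and function names below (`ToyBroadcast`, `compute_partition`, `recompute_after`) are hypothetical, and the model only assumes the commonly documented behaviour: `unpersist` drops the executor-side copies while the driver keeps the value, whereas `destroy` releases everything so the broadcast cannot be read again.

```python
class ToyBroadcast:
    """Hypothetical model of a broadcast with a driver copy and executor caches."""

    def __init__(self, value):
        self._driver_value = value
        self._executor_cache = {}
        self._destroyed = False

    def value_on(self, executor):
        if self._destroyed:
            raise RuntimeError("broadcast read after destroy")
        if executor not in self._executor_cache:
            # Cache miss: fetch the value again from the driver copy.
            self._executor_cache[executor] = self._driver_value
        return self._executor_cache[executor]

    def unpersist(self):
        # Drop executor copies only; the driver copy survives.
        self._executor_cache.clear()

    def destroy(self):
        # Drop everything; the broadcast can never be read again.
        self._executor_cache.clear()
        self._driver_value = None
        self._destroyed = True


def compute_partition(partition, bc, executor):
    """One task: scale a partition by the broadcast factor."""
    factor = bc.value_on(executor)
    return [x * factor for x in partition]


def recompute_after(release):
    """Run a task, release the broadcast, then recompute as if the result was lost."""
    bc = ToyBroadcast(3)
    first = compute_partition([1, 2], bc, executor="exec-1")
    release(bc)
    # Simulated recovery: the lineage is replayed on another executor.
    return first, compute_partition([1, 2], bc, executor="exec-2")
```

Calling `recompute_after(ToyBroadcast.unpersist)` succeeds because the value is fetched again, while `recompute_after(ToyBroadcast.destroy)` raises during the replay.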





[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] unused broadcast variables do d...

2016-07-24 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14333
  
How about the same for `bcNewCenters` in `KMeans`?

Yeah, it seems like it's pretty rare to want to call `unpersist` here. The 
only places where it seems valid are the two left in `Word2Vec`, such as 
`bcSyn0Global`, where the driver's state needs to be rebroadcast on each loop. 
But even then you could make a new broadcast for the same variable and destroy 
it in the loop.

I suppose it saves the overhead of new bookkeeping for a `Broadcast`. But 
unless I'm missing something, it's not really worth the separate API. I 
wouldn't go so far as deprecating `unpersist`, but it doesn't look like 
something that would be added today if it weren't there.
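
The alternative of creating a new broadcast on each loop and destroying it once that loop's work is done can be sketched as follows. This is a toy model, not the `Word2Vec` code: `ToyBroadcast`, `train`, and the update rule are hypothetical, chosen only to show the create, use, destroy order.

```python
class ToyBroadcast:
    """Hypothetical stand-in for a broadcast variable; not Spark's API."""

    def __init__(self, value):
        self._value = value
        self.destroyed = False

    @property
    def value(self):
        if self.destroyed:
            raise RuntimeError("broadcast read after destroy")
        return self._value

    def destroy(self):
        self._value = None
        self.destroyed = True


def train(weights, batches):
    """Broadcast the driver's current state afresh on every iteration."""
    created = []
    for batch in batches:
        bc = ToyBroadcast(list(weights))   # fresh broadcast of the current state
        created.append(bc)
        # Stand-in for an action: tasks read the broadcast and return an update.
        update = [sum(batch) * w for w in bc.value]
        bc.destroy()   # the action has finished, so nothing reads bc again
        weights = [w + u for w, u in zip(weights, update)]
    return weights, created
```

Each iteration pays for a new broadcast object, which is the bookkeeping overhead mentioned above, but no broadcast outlives the iteration that created it.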





[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] unused broadcast variables do d...

2016-07-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14333
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62769/
Test PASSed.





[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] unused broadcast variables do d...

2016-07-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14333
  
**[Test build #62769 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62769/consoleFull)** for PR 14333 at commit [`c40f7f8`](https://github.com/apache/spark/commit/c40f7f829ca03d6c0ad55cc75964e7be5c59d748).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] unused broadcast variables do d...

2016-07-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14333
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] unused broadcast variables do d...

2016-07-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14333
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62768/
Test PASSed.





[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] unused broadcast variables do d...

2016-07-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14333
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] unused broadcast variables do d...

2016-07-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14333
  
**[Test build #62768 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62768/consoleFull)** for PR 14333 at commit [`52afc03`](https://github.com/apache/spark/commit/52afc038c79ab8176bf760d65793e8d5f94d4d4a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] unused broadcast variables do d...

2016-07-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14333
  
**[Test build #62769 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62769/consoleFull)** for PR 14333 at commit [`c40f7f8`](https://github.com/apache/spark/commit/c40f7f829ca03d6c0ad55cc75964e7be5c59d748).





[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] unused broadcast variables do d...

2016-07-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14333
  
**[Test build #62768 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62768/consoleFull)** for PR 14333 at commit [`52afc03`](https://github.com/apache/spark/commit/52afc038c79ab8176bf760d65793e8d5f94d4d4a).

