[jira] [Updated] (SPARK-3376) Memory-based shuffle strategy to reduce overhead of disk I/O

2014-12-12 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] uncleGen updated SPARK-3376: Priority: Major (was: Minor) Memory-based shuffle strategy to reduce overhead of disk I/O

[jira] [Updated] (SPARK-3376) Memory-based shuffle strategy to reduce overhead of disk I/O

2014-12-10 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] uncleGen updated SPARK-3376: Description: I think a memory-based shuffle can reduce some overhead of disk I/O. I just want to know

[jira] [Updated] (SPARK-3376) Memory-based shuffle strategy to reduce overhead of disk I/O

2014-12-10 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] uncleGen updated SPARK-3376: Component/s: Shuffle Memory-based shuffle strategy to reduce overhead of disk I/O

[jira] [Updated] (SPARK-3376) Memory-based shuffle strategy to reduce overhead of disk I/O

2014-12-10 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] uncleGen updated SPARK-3376: Issue Type: New Feature (was: Planned Work) Memory-based shuffle strategy to reduce overhead of disk I/O

[jira] [Updated] (SPARK-3376) Memory-based shuffle strategy to reduce overhead of disk I/O

2014-12-10 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] uncleGen updated SPARK-3376: Affects Version/s: 1.1.0 Memory-based shuffle strategy to reduce overhead of disk I/O

[jira] [Updated] (SPARK-3376) Memory-based shuffle strategy to reduce overhead of disk I/O

2014-12-10 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] uncleGen updated SPARK-3376: Target Version/s: 1.3.0 Memory-based shuffle strategy to reduce overhead of disk I/O

[jira] [Updated] (SPARK-3376) Memory-based shuffle strategy to reduce overhead of disk I/O

2014-12-10 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] uncleGen updated SPARK-3376: Labels: performance (was: ) Memory-based shuffle strategy to reduce overhead of disk I/O

[jira] [Updated] (SPARK-3376) Memory-based shuffle strategy to reduce overhead of disk I/O

2014-12-10 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] uncleGen updated SPARK-3376: Description: I think a memory-based shuffle can reduce some overhead of disk I/O. I just want to know

[jira] [Commented] (SPARK-3376) Memory-based shuffle strategy to reduce overhead of disk I/O

2014-12-10 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240985#comment-14240985 ] uncleGen commented on SPARK-3376: - [~rxin] Yeah, I agree with you. We can improve the I/O

[jira] [Updated] (SPARK-3376) Memory-based shuffle strategy to reduce overhead of disk I/O

2014-12-10 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] uncleGen updated SPARK-3376: Priority: Minor (was: Trivial) Memory-based shuffle strategy to reduce overhead of disk I/O

[GitHub] spark pull request: [SPARK-4488][PySpark] Add control over map-sid...

2014-11-20 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/3366#issuecomment-63831692 @davies Could you help review this patch? Thank you! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well

[jira] [Created] (SPARK-4488) Add control over map-side aggregation

2014-11-19 Thread uncleGen (JIRA)
uncleGen created SPARK-4488: --- Summary: Add control over map-side aggregation Key: SPARK-4488 URL: https://issues.apache.org/jira/browse/SPARK-4488 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-3373) Filtering operations should optionally rebuild routing tables

2014-11-19 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] uncleGen updated SPARK-3373: Target Version/s: 1.1.1, 1.2.0 (was: 1.1.0, 1.0.3) Filtering operations should optionally rebuild routing

[GitHub] spark pull request: [SPARK-4488][PySpark] Add control over map-sid...

2014-11-19 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/3365 [SPARK-4488][PySpark] Add control over map-side aggregation You can merge this pull request into a Git repository by running: $ git pull https://github.com/uncleGen/spark master-clean-141119

[GitHub] spark pull request: [SPARK-4488][PySpark] Add control over map-sid...

2014-11-19 Thread uncleGen
Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/3365

[GitHub] spark pull request: [SPARK-4488][PySpark] Add control over map-sid...

2014-11-19 Thread uncleGen
GitHub user uncleGen reopened a pull request: https://github.com/apache/spark/pull/3365 [SPARK-4488][PySpark] Add control over map-side aggregation You can merge this pull request into a Git repository by running: $ git pull https://github.com/uncleGen/spark master-clean

[GitHub] spark pull request: [SPARK-4488][PySpark] Add control over map-sid...

2014-11-19 Thread uncleGen
Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/3365

[GitHub] spark pull request: [SPARK-4488][PySpark] Add control over map-sid...

2014-11-19 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/3366 [SPARK-4488][PySpark] Add control over map-side aggregation You can merge this pull request into a Git repository by running: $ git pull https://github.com/uncleGen/spark master-pyspark

[GitHub] spark pull request: [SPARK-3719][CORE][UI]:complete/failed stages...

2014-11-19 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/2574#issuecomment-63651117 @JoshRosen [[SPARK-4168][WebUI] ](https://github.com/apache/spark/commit/97a466eca0a629f17e9662ca2b59eeca99142c54) The patch solved the same problem, and I will close

[GitHub] spark pull request: [SPARK-3719][CORE][UI]:complete/failed stages...

2014-11-19 Thread uncleGen
Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/2574

[GitHub] spark pull request: [SPARK-3373][GraphX]: Filtering operations sho...

2014-11-19 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/2249#issuecomment-63654065 @ankurdave Hi, can you review it again? Thank you!

[GitHub] spark pull request: [SPARK-3818] Graph coarsening

2014-10-14 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/2679#issuecomment-59037906 @ankurdave I see. And I think it is worthwhile to provide a memory-based shuffle manager in some cases, such as sufficient memory resources or stringent performance requirements

[GitHub] spark pull request: [SPARK-3818] Graph coarsening

2014-10-13 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/2679#issuecomment-58885039 @ankurdave I have some questions, though not about this patch. In the [GraphX OSDI paper](http://ankurdave.com/dl/graphx-osdi14.pdf), I see that you have implemented a memory

[GitHub] spark pull request: [SPARK-3719][CORE]:complete/failed stages is...

2014-10-10 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/2574#issuecomment-58735669 @JoshRosen Sorry for my misunderstanding, I will correct it as soon as possible.

[GitHub] spark pull request: [SPARK-3719][CORE]:complete/failed stages is...

2014-10-08 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/2574#discussion_r18622733 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressPage.scala --- @@ -70,11 +72,11 @@ private[ui] class JobProgressPage(parent

[GitHub] spark pull request: [SPARK-3719][CORE]:complete/failed stages is...

2014-10-07 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/2574#discussion_r18561760 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressPage.scala --- @@ -70,11 +72,11 @@ private[ui] class JobProgressPage(parent

[jira] [Created] (SPARK-3719) Spark UI: complete/failed stages is better to show the total number of stages

2014-09-29 Thread uncleGen (JIRA)
uncleGen created SPARK-3719: --- Summary: Spark UI: complete/failed stages is better to show the total number of stages Key: SPARK-3719 URL: https://issues.apache.org/jira/browse/SPARK-3719 Project: Spark

[GitHub] spark pull request: [SPARK-3719][CORE]:complete/failed stages is...

2014-09-29 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/2574 [SPARK-3719][CORE]:complete/failed stages is better to show the total number of stages You can merge this pull request into a Git repository by running: $ git pull https://github.com

[jira] [Created] (SPARK-3712) add a new UpdateDStream to update a rdd dynamically

2014-09-28 Thread uncleGen (JIRA)
uncleGen created SPARK-3712: --- Summary: add a new UpdateDStream to update a rdd dynamically Key: SPARK-3712 URL: https://issues.apache.org/jira/browse/SPARK-3712 Project: Spark Issue Type

[GitHub] spark pull request: [SPARK-3712][STREAMING]: add a new UpdateDStre...

2014-09-28 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/2562 [SPARK-3712][STREAMING]: add a new UpdateDStream to update a rdd dynamically Maybe we can achieve this by using the foreachRDD function. But it feels awkward that way, because I need to pass

[GitHub] spark pull request: [SPARK-3712][STREAMING]: add a new UpdateDStre...

2014-09-28 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/2562#issuecomment-57083416 Test failure appears to be unrelated to my patch.

[GitHub] spark pull request: [SPARK-3712][STREAMING]: add a new UpdateDStre...

2014-09-28 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/2562#issuecomment-57087576 @jerryshao Thanks for your comments! I want to abstract an independent DStream to achieve this. It feels odd to update an RDD by passing a closure. Maybe

[GitHub] spark pull request: [SPARK-3712][STREAMING]: add a new UpdateDStre...

2014-09-28 Thread uncleGen
Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/2562

[GitHub] spark pull request: [SPARK-3636][CORE]:It is not friendly to inter...

2014-09-23 Thread uncleGen
Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/2488

[GitHub] spark pull request: [SPARK-3636][CORE]:It is not friendly to inter...

2014-09-23 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/2488#issuecomment-56505578 @pwendell Ah, that makes sense. I will close this PR. Thank you!

[GitHub] spark pull request: [SPARK-1774] Respect SparkSubmit --jars on YAR...

2014-09-23 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/710#discussion_r17951094 --- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala --- @@ -192,15 +236,17 @@ class SparkSubmitSuite extends FunSuite

[jira] [Created] (SPARK-3636) It is not friendly to interrupt a Job when user passes different storageLevels to a RDD

2014-09-22 Thread uncleGen (JIRA)
uncleGen created SPARK-3636: --- Summary: It is not friendly to interrupt a Job when user passes different storageLevels to a RDD Key: SPARK-3636 URL: https://issues.apache.org/jira/browse/SPARK-3636 Project

[GitHub] spark pull request: [SPARK-3636][CORE]:It is not friendly to inter...

2014-09-22 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/2488 [SPARK-3636][CORE]:It is not friendly to interrupt a Job when user passes different storageLevels to a RDD You can merge this pull request into a Git repository by running: $ git pull

[jira] [Created] (SPARK-3373) trim some useless informations of VertexRDD in some cases

2014-09-03 Thread uncleGen (JIRA)
uncleGen created SPARK-3373: --- Summary: trim some useless informations of VertexRDD in some cases Key: SPARK-3373 URL: https://issues.apache.org/jira/browse/SPARK-3373 Project: Spark Issue Type

[jira] [Created] (SPARK-3376) Memory-based shuffle strategy to reduce overhead of disk I/O

2014-09-03 Thread uncleGen (JIRA)
uncleGen created SPARK-3376: --- Summary: Memory-based shuffle strategy to reduce overhead of disk I/O Key: SPARK-3376 URL: https://issues.apache.org/jira/browse/SPARK-3376 Project: Spark Issue Type

[jira] [Updated] (SPARK-3376) Memory-based shuffle strategy to reduce overhead of disk I/O

2014-09-03 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] uncleGen updated SPARK-3376: Description: I think a memory-based shuffle can reduce some overhead of disk I/O. I just want to know

[GitHub] spark pull request: [GraphX]: trim some useless informations of Ve...

2014-09-03 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/2249 [GraphX]: trim some useless informations of VertexRDD in some cases You can merge this pull request into a Git repository by running: $ git pull https://github.com/uncleGen/spark

[GitHub] spark pull request: [SPARK-3373][GraphX]: trim some useless inform...

2014-09-03 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/2249#issuecomment-54399143 @ankurdave Thanks for your comments. I will update it as soon as possible.

[GitHub] spark pull request: [SPARK-3373][GraphX]: trim some useless inform...

2014-09-03 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/2249#discussion_r17095835 --- Diff: graphx/src/main/scala/org/apache/spark/graphx/Graph.scala --- @@ -262,13 +262,61 @@ abstract class Graph[VD: ClassTag, ED: ClassTag] protected

[GitHub] spark pull request: [SPARK-2506]: In yarn-cluster mode, Applicatio...

2014-09-02 Thread uncleGen
Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/1429

[GitHub] spark pull request: [SPARK-2773]use growth rate to predict if need...

2014-09-01 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/1696#issuecomment-54102947 @pwendell OK!

[GitHub] spark pull request: [SPARK-2675]LiveListenerBus Queue Overflow

2014-08-30 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/1356#issuecomment-53977557 okay!

[GitHub] spark pull request: [SPARK-3170][CORE][BUG]:RDD info loss in Stor...

2014-08-27 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/2131#discussion_r16761636 --- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala --- @@ -68,7 +68,9 @@ private[spark] class CacheManager(blockManager: BlockManager

[GitHub] spark pull request: [SPARK-3170][CORE][BUG]:RDD info loss in Stor...

2014-08-27 Thread uncleGen
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/2131#discussion_r16762096 --- Diff: core/src/test/scala/org/apache/spark/CacheManagerSuite.scala --- @@ -87,4 +99,12 @@ class CacheManagerSuite extends FunSuite with BeforeAndAfter

[GitHub] spark pull request: [SPARK-3170][CORE][BUG]:RDD info loss in Stor...

2014-08-27 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/2131#issuecomment-53549872 @andrewor14 Sorry for my poor coding. The unit tests passed locally; please test it again.

[GitHub] spark pull request: [SPARK-3170][CORE][BUG]:RDD info loss in Stor...

2014-08-26 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/2131 [SPARK-3170][CORE][BUG]:RDD info loss in StorageTab and ExecutorTab A completed stage only needs to remove its own partitions that are no longer cached. However, StorageTab may lose some RDDs which

[GitHub] spark pull request: [SPARK-3170][CORE]: Bug Fix in Storage UI

2014-08-26 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/2076#issuecomment-53383270 @andrewor14 @pwendell @srowen Since my branch is not up to date, I have decided to close this and submit a new PR. Please review it: https://github.com/apache/spark

[GitHub] spark pull request: [SPARK-3170][CORE]: Bug Fix in Storage UI

2014-08-26 Thread uncleGen
Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/2076

[GitHub] spark pull request: [SPARK-3170][CORE][BUG]:RDD info loss in Stor...

2014-08-26 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/2131#issuecomment-53521662 Hi @andrewor14, please test it again.

[GitHub] spark pull request: [SPARK-3170][CORE]: Bug Fix in Storage UI

2014-08-23 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/2076#issuecomment-53144395 @pwendell Okay! I will add them as soon as possible and pay more attention.

[GitHub] spark pull request: [SPARK-3170][CORE]: Bug Fix in Storage UI

2014-08-21 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/2076#issuecomment-52908740 @srowen Yes! Not only StorageTab: ExecutorTab may also lose some RDD infos that have been overwritten by a following RDD in the same task. StorageTab: when

[jira] [Created] (SPARK-3170) Bug Fix in Storage UI

2014-08-20 Thread uncleGen (JIRA)
uncleGen created SPARK-3170: --- Summary: Bug Fix in Storage UI Key: SPARK-3170 URL: https://issues.apache.org/jira/browse/SPARK-3170 Project: Spark Issue Type: Bug Components: Spark Core

[jira] [Updated] (SPARK-3170) Bug Fix in Storage UI

2014-08-20 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] uncleGen updated SPARK-3170: Description: The current completed stage only needs to remove its own partitions that are no longer cached

[GitHub] spark pull request: [SPARK-3170][CORE]: Bug Fix in Storage UI

2014-08-20 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/2076 [SPARK-3170][CORE]: Bug Fix in Storage UI The current completed stage only needs to remove its own partitions that are no longer cached. Currently, Storage in the Spark UI may lose some RDDs which

[jira] [Created] (SPARK-3123) override the setName function to set EdgeRDD's name manually just as VertexRDD does.

2014-08-19 Thread uncleGen (JIRA)
uncleGen created SPARK-3123: --- Summary: override the setName function to set EdgeRDD's name manually just as VertexRDD does. Key: SPARK-3123 URL: https://issues.apache.org/jira/browse/SPARK-3123 Project

[GitHub] spark pull request: [GraphX]: override the setName function to s...

2014-08-19 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/2033 [GraphX]: override the setName function to set EdgeRDD's name manually just as VertexRDD does. You can merge this pull request into a Git repository by running: $ git pull https

[GitHub] spark pull request: [SPARK-2773]use growth rate to predict if need...

2014-08-02 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/1696#issuecomment-50958278 @rxin Thanks for your attention, I have updated my jira. https://issues.apache.org/jira/browse/SPARK-2773

[jira] [Created] (SPARK-2773) Shuffle:use growth rate to predict if need to spill

2014-07-31 Thread uncleGen (JIRA)
uncleGen created SPARK-2773: --- Summary: Shuffle:use growth rate to predict if need to spill Key: SPARK-2773 URL: https://issues.apache.org/jira/browse/SPARK-2773 Project: Spark Issue Type

[jira] [Commented] (SPARK-2773) Shuffle:use growth rate to predict if need to spill

2014-07-31 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081161#comment-14081161 ] uncleGen commented on SPARK-2773: - here is my improvement: https://github.com/apache/spark

[GitHub] spark pull request: Spark Shuffle:use growth rate to predict if ...

2014-07-31 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/1696 Spark Shuffle:use growth rate to predict if need to spill You can merge this pull request into a Git repository by running: $ git pull https://github.com/uncleGen/spark master

[GitHub] spark pull request: [SPARK-2506]: In yarn-cluster mode, Applicatio...

2014-07-22 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/1429#issuecomment-49751553 @tgravescs ok, and any suggestions?

[jira] [Created] (SPARK-2506) In yarn-cluster mode, ApplicationMaster does not clean up correctly at the end of the job if users call sc.stop manually

2014-07-15 Thread uncleGen (JIRA)
uncleGen created SPARK-2506: --- Summary: In yarn-cluster mode, ApplicationMaster does not clean up correctly at the end of the job if users call sc.stop manually Key: SPARK-2506 URL: https://issues.apache.org/jira/browse

[jira] [Updated] (SPARK-2506) In yarn-cluster mode, ApplicationMaster does not clean up correctly at the end of the job if users call sc.stop manually

2014-07-15 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] uncleGen updated SPARK-2506: Description: When I call sc.stop manually, some strange errors appear: 1. In the driver log: INFO

[jira] [Commented] (SPARK-2506) In yarn-cluster mode, ApplicationMaster does not clean up correctly at the end of the job if users call sc.stop manually

2014-07-15 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063171#comment-14063171 ] uncleGen commented on SPARK-2506: - Here is a simple PR that fixes this problem: https

[GitHub] spark pull request: Bug Fix: LiveListenerBus Queue Overflow

2014-07-10 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/1356 Bug Fix: LiveListenerBus Queue Overflow As we know, the size of eventQueue is fixed. When events arrive faster than the listener can consume them, overflow events will be thrown away with throwing
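
The overflow behaviour described in this pull request can be illustrated with a small sketch. It assumes only what the snippet states: a fixed-size event queue that discards new events once it is full. The class name BoundedEventBus and its methods are hypothetical and are not Spark's LiveListenerBus API; Python is used here purely for brevity.

```python
import queue


class BoundedEventBus:
    """Hypothetical stand-in for a listener bus backed by a fixed-size queue."""

    def __init__(self, capacity):
        # maxsize makes the queue bounded; a full queue rejects new items.
        self.events = queue.Queue(maxsize=capacity)
        self.dropped = 0

    def post(self, event):
        """Enqueue without blocking; count the event as dropped if the queue is full."""
        try:
            self.events.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self):
        """Return everything currently queued, as a slow listener eventually would."""
        consumed = []
        while True:
            try:
                consumed.append(self.events.get_nowait())
            except queue.Empty:
                return consumed


# Post five events into a queue that holds three while nothing is consuming.
bus = BoundedEventBus(capacity=3)
results = [bus.post(i) for i in range(5)]
```

With no consumer running, the first three posts succeed and the last two are counted in bus.dropped, which is the situation the pull request is concerned with.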

[GitHub] spark pull request: Bug Fix: LiveListenerBus Queue Overflow

2014-07-10 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/1356#issuecomment-48691344 @pwendell Yeah, this is not an elegant way to resolve the bug. My fix is a compromise. Actually, there are no frequent get/put operations in blockManager when

[GitHub] spark pull request: Bug Fix: LiveListenerBus Queue Overflow

2014-07-10 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/1356#issuecomment-48691872 @pwendell Yeah, it is not an elegant way to resolve the bug. My fix is a compromise. Actually, it will not cause frequent put/get operations in blockManager when
