[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1535 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-52275155 Hey this looks good. Merging it now into master. Sorry about the delay.
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user nkronenfeld commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-50836302 OK, @pwendell, I think it's set now. Let me know if there are merge problems; I can resubmit on a clean branch if necessary.
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-50832292
QA results for PR 1535:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test output: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17612/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-50828214 QA tests have started for PR 1535. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17612/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-50660639 yeah to keep it simple let's just always have it show memory. I'd rather not add a new public API for this `showMemory` thing at the moment.
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user nkronenfeld commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-50621396 Thanks, @pwendell. I can revert it if you want - is that preferable to the way it is now, with the option to include the memory info or not? I'll start with taking out the DeveloperAPI and adjusting the docs; I'll leave off taking out the optional memory parameter until I hear from you again.
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-50579659 Hey @nkronenfeld - I traced through the exact function call more closely and I actually think it's fine. The issue I pointed out in the JIRA is orthogonal. So I'm fine to just revert this back to always showing the status. However, we should not mark this as a developer API. This is a stable API we are happy to support forever. Still, this will cause a significant amount of object allocation due to the way other internal function calls happen (it is basically O(all blocks) for an application). It might be nice to add a note to the docs that the operation might be expensive and should not be called inside of a critical code path. Though we could likely optimize those things down the road.
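A minimal sketch of one way to keep an expensive debug description out of a critical code path, assuming a logging helper with a by-name parameter. Everything here is hypothetical and independent of Spark: `LazyDebug`, `debugEnabled`, and `expensiveDescription` are illustrative names, with `expensiveDescription` standing in for a costly call such as `toDebugString`.

```scala
object LazyDebug {
  // Hypothetical switch standing in for a logger's "is debug enabled" check.
  @volatile var debugEnabled: Boolean = false

  // `msg` is a by-name parameter: the expression passed in is evaluated
  // only when debugEnabled is true.
  def debug(msg: => String): Unit =
    if (debugEnabled) println(msg)

  // Stand-in for an expensive description; counts how often it is computed.
  var computations: Int = 0
  def expensiveDescription(): String = {
    computations += 1
    "expensive description"
  }

  def main(args: Array[String]): Unit = {
    // With debug disabled, the loop never pays for the description.
    for (_ <- 1 to 1000) debug(expensiveDescription())
    println(s"computed $computations times with debug disabled")

    // With debug enabled, the description is computed on demand.
    debugEnabled = true
    debug(expensiveDescription())
    println(s"computed $computations times after enabling debug")
  }
}
```

The by-name parameter means a debug statement left inside a loop costs only a boolean check when debugging is off, which addresses the concern about debug output altering application performance.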
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-50507822
QA results for PR 1535:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test output: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17363/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-50501244 QA tests have started for PR 1535. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17363/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user nkronenfeld commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-50487526 If I'm reading that correctly, that test failure is from an MLlib change that has nothing to do with what I've done? Perhaps I'll just try it again, maybe it's a bad sync with master:
Jenkins, please test this
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-50419441
QA results for PR 1535:
- This patch FAILED unit tests.
For more information see test output: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17310/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-50410853 QA tests have started for PR 1535. This patch DID NOT merge cleanly! View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17310/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-50407668
QA results for PR 1535:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test output: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17304/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-50401197 QA tests have started for PR 1535. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17304/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-50388630
QA results for PR 1535:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test output: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17302/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-50388544 QA tests have started for PR 1535. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17302/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user nkronenfeld commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-50388139
I just parameterized the memory so one can display it or not as desired (with not displaying it the default) - is that sufficient? I forgot to put the note about the JIRA into the code; I'll definitely add that too. Or I can back out the optional nature and just leave in the code comment about the JIRA.
Let me know which you want, please.
Thanks,
-Nathan
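The parameterization described above can be sketched with a Scala default argument. This is only an illustration under stated assumptions, not the code from the pull request: `NodeInfo`, `describe`, and the output format are hypothetical, and only the name `showMemory` is taken from the thread.

```scala
object OptionalMemory {
  // Hypothetical stand-in for the state a debug description would report.
  final case class NodeInfo(name: String, storageLevel: String, memoryBytes: Long)

  // showMemory defaults to false, so callers that omit it get only the
  // cheap description; memory details are opt-in.
  def describe(node: NodeInfo, showMemory: Boolean = false): Seq[String] = {
    val base = Seq(s"${node.name} [${node.storageLevel}]")
    if (showMemory) base :+ s"    memory: ${node.memoryBytes} B" else base
  }

  def main(args: Array[String]): Unit = {
    val node = NodeInfo("example node", "example level", 1024L)
    describe(node).foreach(println)                    // default: no memory line
    describe(node, showMemory = true).foreach(println) // opt-in: adds memory line
  }
}
```

The trade-off raised in the thread is that such a parameter becomes part of the public API surface, which is why reviewers may prefer a single fixed behavior.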
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1535#discussion_r15324282
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
    @@ -1269,6 +1269,19 @@ abstract class RDD[T: ClassTag](
       /** A description of this RDD and its recursive dependencies for debugging. */
       def toDebugString: String = {
    +    // Get a debug description of an rdd without its children
    +    def debugSelf (rdd: RDD[_]): Seq[String] = {
    +      import Utils.bytesToString
    +
    +      val persistence = storageLevel.description
    +      val storageInfo = rdd.context.getRDDStorageInfo.filter(_.id == rdd.id).map(info =>
--- End diff --
BTW - we can create a JIRA to add this back once SPARK-2316 is fixed.
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1535#discussion_r15324267
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
    @@ -1269,6 +1269,19 @@ abstract class RDD[T: ClassTag](
       /** A description of this RDD and its recursive dependencies for debugging. */
       def toDebugString: String = {
    +    // Get a debug description of an rdd without its children
    +    def debugSelf (rdd: RDD[_]): Seq[String] = {
    +      import Utils.bytesToString
    +
    +      val persistence = storageLevel.description
    +      val storageInfo = rdd.context.getRDDStorageInfo.filter(_.id == rdd.id).map(info =>
--- End diff --
Ah sorry, yeah, I meant this is very costly. I'd rather not do this in a debug function, because people will do things like print debug statements inside of loops. In that case the debugging will significantly alter the performance of their application. There is a separate JIRA to make this function faster (it's a function also used in the UI), but until that's fixed I'd rather not call it here: https://issues.apache.org/jira/browse/SPARK-2316
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user GregOwen commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-49926584 Looks good to me.
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user nkronenfeld commented on a diff in the pull request: https://github.com/apache/spark/pull/1535#discussion_r15308845
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
    @@ -1269,6 +1269,19 @@ abstract class RDD[T: ClassTag](
       /** A description of this RDD and its recursive dependencies for debugging. */
       def toDebugString: String = {
    +    // Get a debug description of an rdd without its children
    +    def debugSelf (rdd: RDD[_]): Seq[String] = {
    +      import Utils.bytesToString
    +
    +      val persistence = storageLevel.description
    +      val storageInfo = rdd.context.getRDDStorageInfo.filter(_.id == rdd.id).map(info =>
--- End diff --
I'm not sure what you mean - do you mean "an extremely costly operation"? Assuming that to be the case, two comments:
* I thought about attaching flags to the function so one could specify the type of debug information desired; I think that makes the function too complex, but I'm hardly firm in that idea.
* This whole function is specifically to help a developer with debugging. I don't _think_ having it be costly is all that bad.
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1535#discussion_r15307173
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
    @@ -1269,6 +1269,19 @@ abstract class RDD[T: ClassTag](
       /** A description of this RDD and its recursive dependencies for debugging. */
       def toDebugString: String = {
    +    // Get a debug description of an rdd without its children
    +    def debugSelf (rdd: RDD[_]): Seq[String] = {
    +      import Utils.bytesToString
    +
    +      val persistence = storageLevel.description
    +      val storageInfo = rdd.context.getRDDStorageInfo.filter(_.id == rdd.id).map(info =>
--- End diff --
Hey on this one, this is actually an extremely operation... I wonder if maybe for now it's better to not put this in there and only put the storage level.
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-49890189
QA results for PR 1535:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test output: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17035/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-49875467 QA tests have started for PR 1535. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17035/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-49874874 Jenkins, test this please
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user nkronenfeld commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-49870919 I'm not sure what to do about this test failure; all I've changed is toDebugString, and this is in a spark streaming test which never calls that, so I'm pretty sure it's nothing to do with me.
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-49831636
QA results for PR 1535:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test output: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17008/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-49827241 QA tests have started for PR 1535. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17008/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-49825506
QA results for PR 1535:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test output: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17005/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-49825462 QA tests have started for PR 1535. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17005/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user nkronenfeld commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-49825329 thanks mark, I had no idea that existed.
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1535#discussion_r15259034
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
    @@ -1294,7 +1307,11 @@ abstract class RDD[T: ClassTag](
           val partitionStr = "(" + rdd.partitions.size + ")"
           val leftOffset = (partitionStr.length - 1) / 2
           val nextPrefix = (" " * leftOffset) + "|" + (" " * (partitionStr.length - leftOffset))
    -      Seq(partitionStr + " " + rdd) ++ debugChildren(rdd, nextPrefix)
    +
    +      debugSelf(rdd).zipWithIndex.map{
    +        case (desc: String, 0) => partitionStr+" "+desc
    +        case (desc: String, _) => nextPrefix+" "+desc
--- End diff --
And elsewhere in this PR, avoid string concatenation with `+` when string interpolation would be equally clear or clearer.
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1535#discussion_r15258957
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
    @@ -1294,7 +1307,11 @@ abstract class RDD[T: ClassTag](
           val partitionStr = "(" + rdd.partitions.size + ")"
           val leftOffset = (partitionStr.length - 1) / 2
           val nextPrefix = (" " * leftOffset) + "|" + (" " * (partitionStr.length - leftOffset))
    -      Seq(partitionStr + " " + rdd) ++ debugChildren(rdd, nextPrefix)
    +
    +      debugSelf(rdd).zipWithIndex.map{
    +        case (desc: String, 0) => partitionStr+" "+desc
    +        case (desc: String, _) => nextPrefix+" "+desc
--- End diff --
    s"$partitionStr $desc"
    s"$nextPrefix $desc"
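A small self-contained sketch of the suggested interpolation, applied to the `zipWithIndex` pattern from the diff. The wrapper `DebugLines.prefixLines` and the sample description strings are hypothetical; only `partitionStr`, `leftOffset`, `nextPrefix`, and `desc` come from the quoted hunk.

```scala
object DebugLines {
  // Prefix the first description line with the partition marker and each
  // following line with the continuation prefix, using string interpolation
  // instead of `+` concatenation.
  def prefixLines(partitionStr: String, nextPrefix: String, descs: Seq[String]): Seq[String] =
    descs.zipWithIndex.map {
      case (desc, 0) => s"$partitionStr $desc"
      case (desc, _) => s"$nextPrefix $desc"
    }

  def main(args: Array[String]): Unit = {
    // Same prefix arithmetic as in the diff, with a made-up partition count.
    val partitionStr = "(4)"
    val leftOffset = (partitionStr.length - 1) / 2
    val nextPrefix = (" " * leftOffset) + "|" + (" " * (partitionStr.length - leftOffset))

    // Placeholder descriptions; real ones would come from debugSelf(rdd).
    prefixLines(partitionStr, nextPrefix, Seq("first line", "second line")).foreach(println)
  }
}
```

Matching on the index keeps the first line aligned with the partition count while continuation lines hang under the `|` marker.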
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-49799385 @gowen mind taking a look?
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-49798677
QA results for PR 1535:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test output: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16987/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-49798572 QA tests have started for PR 1535. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16987/consoleFull
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user nkronenfeld commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-49797427 Sorry, forgot to move one small formatting issue over from the old branch, I'll check that in as soon as I test it.
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user nkronenfeld commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-49783708 Done, and I also left a comment on Greg Owen's PR from yesterday asking him for formatting comments
[GitHub] spark pull request: Add caching information to rdd.toDebugString
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1535#issuecomment-49782942 Hey, do you mind putting an example of what the output looks like in the PR description?
[GitHub] spark pull request: Add caching information to rdd.toDebugString
GitHub user nkronenfeld opened a pull request: https://github.com/apache/spark/pull/1535

Add caching information to rdd.toDebugString

I find it useful to see where in an RDD's DAG data is cached, so I figured others might too. I've added both the caching level, and the actual memory state of the RDD. Some of this is redundant with the web UI (notably the actual memory state), but (a) that is temporary, and (b) putting it in the DAG tree shows some context that can help a lot.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nkronenfeld/spark-1 feature/debug-caching2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1535.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1535

commit 8fbecb6eb47505e7e56949c00107b917c6c5e945
Author: Nathan Kronenfeld
Date: 2014-07-22T18:44:58Z

    Add caching information to rdd.toDebugString
[GitHub] spark pull request: Add caching information to RDD.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1532#issuecomment-49749243
QA results for PR 1532:
- This patch FAILED unit tests.
For more information see test output: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16966/consoleFull
[GitHub] spark pull request: Add caching information to RDD.toDebugString
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1532#issuecomment-49749136 QA tests have started for PR 1532. This patch DID NOT merge cleanly! View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16966/consoleFull
[GitHub] spark pull request: Add caching information to RDD.toDebugString
Github user nkronenfeld closed the pull request at: https://github.com/apache/spark/pull/1532
[GitHub] spark pull request: Add caching information to RDD.toDebugString
GitHub user nkronenfeld opened a pull request: https://github.com/apache/spark/pull/1532

Add caching information to RDD.toDebugString

I find it useful to see where in an RDD's DAG data is cached, so I figured others might too. I've added both the caching level, and the actual memory state of the RDD. Some of this is redundant with the web UI (notably the actual memory state), but (a) that is temporary, and (b) putting it in the DAG tree shows some context that can help a lot.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nkronenfeld/spark-1 feature/debug-caching

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1532.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1532

commit 06c76ab961afc42e8305d6a3f186361c1e20e04d
Author: Nathan Kronenfeld
Date: 2014-07-22T14:39:41Z

    Add caching information to RDD.toDebugString