[ https://issues.apache.org/jira/browse/SPARK-5112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-5112:
-----------------------------------

    Assignee: Apache Spark  (was: Sandy Ryza)

> Expose SizeEstimator as a developer API
> ---------------------------------------
>
>                 Key: SPARK-5112
>                 URL: https://issues.apache.org/jira/browse/SPARK-5112
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>            Reporter: Sandy Ryza
>            Assignee: Apache Spark
>
> "The best way to size the amount of memory consumption your dataset will
> require is to create an RDD, put it into cache, and look at the SparkContext
> logs on your driver program. The logs will tell you how much memory each
> partition is consuming, which you can aggregate to get the total size of the
> RDD."
> -the Tuning Spark page
>
> This is a pain. It would be much nicer to simply expose functionality for
> understanding the memory footprint of a Java object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
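As a rough illustration of what "exposing this functionality" could look like, here is a minimal Scala sketch. It assumes the estimator is surfaced as `org.apache.spark.util.SizeEstimator.estimate` (the shape the developer API eventually took), and it requires `spark-core` on the classpath; the object and variable names are illustrative, not from this issue.

```scala
// Hedged sketch: assumes SizeEstimator is publicly exposed as
// org.apache.spark.util.SizeEstimator (requires spark-core on the classpath).
import org.apache.spark.util.SizeEstimator

object SizeEstimatorExample {
  def main(args: Array[String]): Unit = {
    // Any JVM object whose memory footprint we want to gauge,
    // instead of caching an RDD and reading driver logs.
    val data: Array[Int] = Array.fill(1000)(scala.util.Random.nextInt())

    // estimate returns an approximate size in bytes of the object
    // on the JVM heap, including objects reachable from it.
    val bytes: Long = SizeEstimator.estimate(data)
    println(s"Estimated size: $bytes bytes")
  }
}
```

Compared with the workflow quoted from the Tuning Spark page, this sizes a single object directly, with no SparkContext, caching, or log inspection required.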