[ 
https://issues.apache.org/jira/browse/SPARK-20391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975029#comment-15975029
 ] 

Thomas Graves commented on SPARK-20391:
---------------------------------------

I agree that if it's been released we can't change it; the on/off heap fields 
we need to change ASAP, before a release.  If we want to change the names of the 
other 2, we could simply add 2 extra fields with more appropriate names and 
leave the other 2 in place. I'm not sure that is necessary at this point, though.
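
A minimal sketch of the additive approach, assuming "the other 2" means the already-released {{memoryUsed}} and {{maxMemory}}. The class name and the two new field names ({{storageMemoryUsed}}, {{maxStorageMemory}}) are hypothetical, chosen only for illustration; nothing here is the actual Spark change.

```scala
// Hypothetical sketch: keep the released fields and add clearer-named ones.
// `ExecutorSummarySketch`, `storageMemoryUsed`, and `maxStorageMemory` are
// illustrative names, not part of Spark.
final case class ExecutorSummarySketch(
    id: String,
    memoryUsed: Long,        // released name, kept for compatibility
    maxMemory: Long,         // released name, kept for compatibility
    storageMemoryUsed: Long, // hypothetical clearer name, same value as memoryUsed
    maxStorageMemory: Long)  // hypothetical clearer name, same value as maxMemory

object ExecutorSummarySketch {
  // Build from the existing values so the old and new fields always agree.
  def fromLegacy(id: String, memoryUsed: Long, maxMemory: Long): ExecutorSummarySketch =
    ExecutorSummarySketch(id, memoryUsed, maxMemory, memoryUsed, maxMemory)
}
```

Existing REST clients would keep reading the old names, while new clients could switch to the clearer ones.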

I think we should document the REST API better. I think that page would be fine, 
or we could link to another page, but that might be a separate JIRA if this one 
is still about changing names.  It's an API and we should have had documentation 
from the beginning. 
Example of the YARN REST API docs: 
https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html
I'm sure there are better examples too.

I think making it a separate ExecutorMemoryMetrics makes sense so we can more 
easily extend it in the future.  I assume managed memory here is the 
spark.memory.fraction share of the heap + spark.memory.offHeap.size?
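
As a rough sketch of both points: a separate metrics class grouping the four on/off-heap fields, and the managed-memory sum as I assume it above. The field names are taken from the current ExecutorSummary; the class layout, the {{ManagedMemorySketch}} helper, and the {{usableHeapBytes}} input (standing for the heap that remains after whatever Spark reserves) are hypothetical.

```scala
// Hypothetical grouping of the four on/off-heap fields from ExecutorSummary.
final case class ExecutorMemoryMetrics(
    onHeapMemoryUsed: Long,
    offHeapMemoryUsed: Long,
    maxOnHeapMemory: Long,
    maxOffHeapMemory: Long)

object ManagedMemorySketch {
  // Assumption from the comment above, not confirmed behavior:
  // managed memory = (usable heap * spark.memory.fraction) + spark.memory.offHeap.size
  def managedBytes(
      usableHeapBytes: Long,  // hypothetical input: heap available to Spark
      memoryFraction: Double, // value of spark.memory.fraction
      offHeapSizeBytes: Long  // value of spark.memory.offHeap.size, in bytes
  ): Long =
    (usableHeapBytes * memoryFraction).toLong + offHeapSizeBytes
}
```

Keeping the metrics in their own class means new memory fields could be added later without growing the ExecutorSummary constructor again.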



> Properly rename the memory related fields in ExecutorSummary REST API
> ---------------------------------------------------------------------
>
>                 Key: SPARK-20391
>                 URL: https://issues.apache.org/jira/browse/SPARK-20391
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Saisai Shao
>            Priority: Minor
>
> Currently in Spark we can get the executor summary through the REST API 
> {{/api/v1/applications/<app-id>/executors}}. The format of the executor 
> summary is:
> {code}
> class ExecutorSummary private[spark](
>     val id: String,
>     val hostPort: String,
>     val isActive: Boolean,
>     val rddBlocks: Int,
>     val memoryUsed: Long,
>     val diskUsed: Long,
>     val totalCores: Int,
>     val maxTasks: Int,
>     val activeTasks: Int,
>     val failedTasks: Int,
>     val completedTasks: Int,
>     val totalTasks: Int,
>     val totalDuration: Long,
>     val totalGCTime: Long,
>     val totalInputBytes: Long,
>     val totalShuffleRead: Long,
>     val totalShuffleWrite: Long,
>     val isBlacklisted: Boolean,
>     val maxMemory: Long,
>     val executorLogs: Map[String, String],
>     val onHeapMemoryUsed: Option[Long],
>     val offHeapMemoryUsed: Option[Long],
>     val maxOnHeapMemory: Option[Long],
>     val maxOffHeapMemory: Option[Long])
> {code}
> Here are 6 memory-related fields: {{memoryUsed}}, {{maxMemory}}, 
> {{onHeapMemoryUsed}}, {{offHeapMemoryUsed}}, {{maxOnHeapMemory}}, 
> {{maxOffHeapMemory}}.
> All 6 of these fields reflect the *storage* memory usage in Spark, but from 
> the names of these 6 fields, a user can't tell whether they refer to *storage* 
> memory or the total memory (storage memory + execution memory). This is 
> misleading.
> So I think we should properly rename these fields to reflect their real 
> meanings, or we should document them.


