[ 
https://issues.apache.org/jira/browse/SPARK-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Diana Carroll updated SPARK-2527:
---------------------------------

    Description: 
If I persist an RDD at one level, unpersist it, then repersist it at another 
level, the UI will continue to show the RDD at the first level...but correctly 
show individual partitions at the second level.

{code}
import org.apache.spark.api.java.StorageLevels
import org.apache.spark.api.java.StorageLevels._
val test1 = sc.parallelize(Array(1,2,3))
test1.persist(StorageLevels.DISK_ONLY)
test1.count()
test1.unpersist()
test1.persist(StorageLevels.MEMORY_ONLY)
test1.count()
{code}

After the first call to persist and count, the Spark App web UI shows:

RDD Storage Info for 14 Storage Level: Disk Serialized 1x Replicated 
rdd_14_0        Disk Serialized 1x Replicated

After the second call, it shows:

RDD Storage Info for 14 Storage Level: Disk Serialized 1x Replicated 
rdd_14_0        Memory Deserialized 1x Replicated 

  was:
If I persist an RDD at one level, unpersist it, then repersist it at another 
level, the UI will continue to show the RDD at the first level...but correctly 
show individual partitions at the second level.

{code}
import org.apache.spark.api.java.StorageLevels._
val test1 = sc.parallelize(Array(1,2,3))
test1.persist(StorageLevels.DISK_ONLY)
test1.count()
test1.unpersist()
test1.persist(StorageLevels.MEMORY_ONLY)
test1.count()
{code}

After the first call to persist and count, the Spark App web UI shows:

RDD Storage Info for 14 Storage Level: Disk Serialized 1x Replicated 
rdd_14_0        Disk Serialized 1x Replicated

After the second call, it shows:

RDD Storage Info for 14 Storage Level: Disk Serialized 1x Replicated 
rdd_14_0        Memory Deserialized 1x Replicated 


> incorrect persistence level shown in Spark UI after repersisting
> ----------------------------------------------------------------
>
>                 Key: SPARK-2527
>                 URL: https://issues.apache.org/jira/browse/SPARK-2527
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI
>    Affects Versions: 1.0.0
>            Reporter: Diana Carroll
>         Attachments: persistbug1.png, persistbug2.png
>
>
> If I persist an RDD at one level, unpersist it, then repersist it at another 
> level, the UI will continue to show the RDD at the first level...but 
> correctly show individual partitions at the second level.
> {code}
> import org.apache.spark.api.java.StorageLevels
> import org.apache.spark.api.java.StorageLevels._
> val test1 = sc.parallelize(Array(1,2,3))
> test1.persist(StorageLevels.DISK_ONLY)
> test1.count()
> test1.unpersist()
> test1.persist(StorageLevels.MEMORY_ONLY)
> test1.count()
> {code}
> After the first call to persist and count, the Spark App web UI shows:
> RDD Storage Info for 14 Storage Level: Disk Serialized 1x Replicated 
> rdd_14_0      Disk Serialized 1x Replicated
> After the second call, it shows:
> RDD Storage Info for 14 Storage Level: Disk Serialized 1x Replicated 
> rdd_14_0      Memory Deserialized 1x Replicated 
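
The mismatch described above can also be checked from the shell rather than from the web UI. The following is only a sketch, assuming a spark-shell session where sc is already defined. showLevels is a hypothetical helper introduced here for illustration. It prints the level recorded on the RDD object next to the level listed for the same RDD id by sc.getRDDStorageInfo (a developer API). The report does not say whether that summary shares the stale value seen in the UI, so no expected output is given.

{code}
import org.apache.spark.api.java.StorageLevels

val test1 = sc.parallelize(Array(1, 2, 3))

// Hypothetical helper: print the level recorded on the RDD itself and the
// level listed for the same RDD id in the SparkContext storage summary.
def showLevels(label: String): Unit = {
  println(label + " RDD level: " + test1.getStorageLevel.description)
  sc.getRDDStorageInfo.filter(_.id == test1.id).foreach { info =>
    println(label + " summary level: " + info.storageLevel.description)
  }
}

test1.persist(StorageLevels.DISK_ONLY)
test1.count()
showLevels("first persist:")

test1.unpersist()
test1.persist(StorageLevels.MEMORY_ONLY)
test1.count()
showLevels("second persist:")
{code}

If the two printed levels differ after the second persist, that would suggest the stale value comes from the storage bookkeeping rather than from the UI rendering alone.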



--
This message was sent by Atlassian JIRA
(v6.2#6252)
