Actually, if you don't call a method like persist or cache, Spark does not store the RDD at all, not even on disk. Every time you use that RDD, it is recomputed from the original one.
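To make the recomputation point concrete, here is a small toy model in plain Python. It is only a sketch of the idea, not Spark's API: the `LazyDataset` class, its `compute_calls` counter, and the `scale_features` step are all hypothetical names made up for illustration.

```python
class LazyDataset:
    """Toy stand-in for an RDD: it holds a recipe, not the data (hypothetical class)."""

    def __init__(self, compute):
        self._compute = compute      # function that rebuilds the data from its source
        self._cached = None          # filled only after cache() plus a first action
        self._use_cache = False
        self.compute_calls = 0       # how many times the recipe actually ran

    def cache(self):
        # Only marks the dataset; nothing is computed until an action runs.
        self._use_cache = True
        return self

    def collect(self):
        # An "action": returns the data, recomputing unless a cached copy exists.
        if self._use_cache and self._cached is not None:
            return self._cached
        self.compute_calls += 1
        data = list(self._compute())
        if self._use_cache:
            self._cached = data
        return data


def scale_features():
    # Hypothetical preprocessing step standing in for the "changed input".
    return (x * 0.5 for x in range(5))


uncached = LazyDataset(scale_features)
for _ in range(10):                  # ten passes, as in an iterative algorithm
    uncached.collect()

cached = LazyDataset(scale_features).cache()
for _ in range(10):
    cached.collect()

print(uncached.compute_calls, cached.compute_calls)  # prints "10 1"
```

In Spark itself the corresponding calls are `rdd.cache()` or `rdd.persist(...)`, and a cached RDD only shows up on the Storage tab after an action has actually materialized it.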
In the logistic regression from MLlib, the transformed input is not persisted, so I can't see that RDD in the web UI. After changing the code to persist it, I gained a 10x speedup.
-- binbinbin915