Yes, you lose the data.

You can add machines, but doing so will require you to restart the cluster. Also, adding nodes is a manual step: you add them yourself.

Regards,
Mayur

On Wednesday, July 23, 2014, durga <[email protected]> wrote:

> Hi All,
> I have a question.
>
> For my company, we are planning to use the spark-ec2 scripts to create a
> cluster for us.
>
> I understand that persistent HDFS will keep the HDFS data available across
> cluster restarts.
>
> My questions are:
>
> 1) What happens if I destroy and re-create the cluster? Do I lose the data?
>    a) If I lose the data, is the only way to copy it to S3 and recopy it
>       after launching the cluster (this seems like a costly data transfer
>       from and to S3)?
> 2) How would I add/remove some machines in the cluster? I mean, I am asking
>    about cluster management.
>    Is there any place where Amazon lets me see the machines and perform the
>    add and remove operations?
>
> Thanks,
> D.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/persistent-HDFS-instance-for-cluster-restarts-destroys-tp10551.html
