Do you have enough disk space for the spill? It looks like the job has plenty of memory reserved, but not enough disk for the spill. Each host will need a disk large enough to hold its entire data partition. Compressing the spilled data saves roughly 50% in most, if not all, cases.
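For reference, these are the settings that control spill compression (both compression flags already default to true, so mainly check that nobody has turned them off; the lz4 codec below is just an illustration, snappy is the default):

    # in conf/spark-defaults.conf
    spark.shuffle.spill.compress   true
    spark.shuffle.compress         true
    spark.io.compression.codec     lz4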
Given the size of the data set, I would consider a 1TB SATA flash drive per host, formatted as ext4 or XFS, with Spark given exclusive access to it as spark.local.dir (see the P.S. at the bottom for a concrete example). It will slow things down, but it won't stop. There are alternatives if you want to discuss offline.

> On Apr 5, 2016, at 6:37 PM, lllll <lishu...@gmail.com> wrote:
>
> I have a task to remap the index to the actual uuid in ALS prediction results,
> but it consistently fails due to lost executors. I noticed there's a large
> shuffle spill (memory) but I don't know how to improve it.
>
> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n26683/24.png>
>
> I've tried reducing the number of executors while giving each one more memory.
>
> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n26683/31.png>
>
> But it still doesn't seem big enough, and I don't know what to do.
>
> Below is my code:
>
> user = load_user()
> product = load_product()
> user.cache()
> product.cache()
> model = load_model(model_path)
> all_pairs = user.map(lambda x: x[1]).cartesian(product.map(lambda x: x[1]))
> all_prediction = model.predictAll(all_pairs)
> user_reverse = user.map(lambda r: (r[1], r[0]))
> product_reverse = product.map(lambda r: (r[1], r[0]))
> user_reversed = all_prediction.map(lambda u: (u[0], (u[1], u[2]))).join(user_reverse).map(lambda r: (r[1][0][0], (r[1][1], r[1][0][1])))
> both_reversed = user_reversed.join(product_reverse).map(lambda r: (r[1][0][0], r[1][1], r[1][0][1]))
> both_reversed.map(lambda x: '{}|{}|{}'.format(x[0], x[1], x[2])).saveAsTextFile(recommendation_path)
>
> Both user and product are (uuid, index) tuples.
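P.S. To make the spark.local.dir suggestion concrete, assuming the dedicated drive is mounted at /mnt/spark-spill (the path is illustrative):

    # in conf/spark-defaults.conf on each worker
    spark.local.dir    /mnt/spark-spill

or equivalently at submit time:

    spark-submit --conf spark.local.dir=/mnt/spark-spill ...

Note that under YARN this setting is ignored in favor of the NodeManager's local directories (LOCAL_DIRS), so there you would point yarn.nodemanager.local-dirs at the drive instead.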