Hi Folks, I'm running a five-step path-following algorithm on a movie graph with 120K vertices and 400K edges. The graph has vertices for actors, directors, movies, users, and user ratings, and my Scala code walks the path "rating > movie > rating > user > rating". There are 75K rating nodes, each with ~100 edges. The program iterates over the path items, calling aggregateMessages() and then joinVertices() for each one, and feeding that result into the next iteration. It never finishes the second 'rating' step, which makes sense: if I understand correctly, my back-of-the-napkin estimate puts the intermediate result at ~4B active vertices.
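For concreteness, here is a stripped-down sketch of the per-step pattern I described (the attribute layout and the name stepTo are illustrative, not from my actual code; I'm assuming a vertex attribute of (vertexType, pathCount)):

```scala
import org.apache.spark.graphx._

// One step of the walk: push each active vertex's path count to
// neighbors of the target type, then fold the sums back into the graph.
// This is a simplified sketch, not my real program.
def stepTo(g: Graph[(String, Long), Int], target: String): Graph[(String, Long), Int] = {
  val counts: VertexRDD[Long] = g.aggregateMessages[Long](
    ctx =>
      // Only forward counts from vertices active in the previous step,
      // and only to vertices of the next type on the path.
      if (ctx.srcAttr._2 > 0 && ctx.dstAttr._1 == target)
        ctx.sendToDst(ctx.srcAttr._2),
    _ + _ // merge messages by summing path counts
  )
  // joinVertices keeps the old attribute for vertices that received no
  // message, so non-target vertices would need resetting separately.
  g.joinVertices(counts) { case (_, (vType, _), sum) => (vType, sum) }
}
```

Each step multiplies the number of active paths by the fan-out of the current frontier, which is where my ~4B estimate for the second 'rating' step comes from.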
Spark is version 1.2.0, running in standalone mode on a small cluster of five hosts: four compute nodes and a head node. The compute nodes have 4 cores and 32GB RAM each; the head node has 32 cores and 128GB RAM. After restarting Spark just now, the Master web UI shows 15 workers (5 dead), two per node. Cores and memory are listed as "32 (0 Used)" and "125.0 GB (0.0 B Used)" for the two head-node workers, and "4 (0 Used)" and "30.5 GB (0.0 B Used)" for the 8 workers running on the compute nodes. (Note: I don't understand why it's configured to run two workers per node.) The small Spark example programs run to completion. I've listed the console output at http://pastebin.com/DPECKgQ9 (I'm running in spark-shell).

I hope you can provide some advice on things to try next (e.g., configuration vars). My guess is that the cluster is running out of memory, though I think it has adequate aggregate RAM to handle this app.

Thanks very much -- matt

----
Matthew Cornell, Research Fellow, Computer Science Department, UMass Amherst
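On the two-workers-per-node question: in standalone mode that behavior normally comes from spark-env.sh on the workers. I haven't verified our deployment, but a setup matching what the Master UI shows might look like this (values are my guess, not confirmed):

```sh
# spark-env.sh -- hypothetical; I have not confirmed this matches our cluster
# Two worker JVMs per host:
SPARK_WORKER_INSTANCES=2
# Per-worker resources as reported on the compute nodes in the Master UI:
SPARK_WORKER_CORES=4
SPARK_WORKER_MEMORY=30g
```

If anyone can confirm whether multiple worker instances per node is the right setup here, that would also help.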