Re: Spark - GraphX pregel like with global variables (accumulator / broadcast)

2014-08-27 Thread BertrandR
Thank you for your answers, and sorry for my lack of understanding. I tried what you suggested, with and without unpersisting, and with .cache() (I also tried persist(StorageLevel.MEMORY_AND_DISK), but this is not allowed for msg because apparently you can't change the storage level) for msg, g and newVerts,

Re: Spark - GraphX pregel like with global variables (accumulator / broadcast)

2014-08-26 Thread BertrandR
I actually tried without unpersisting, but given the performance I tried adding these calls in order to free memory. After your answer I removed them again, but the execution time did not change... Looking at the web interface, I can see that the mapPartitions at GraphImpl.scala:184

Spark - GraphX pregel like with global variables (accumulator / broadcast)

2014-08-25 Thread BertrandR
Hi, I'm working on big graph analytics, and I am currently implementing a mean field inference algorithm in GraphX/Spark. I start with an arbitrary graph and keep a (sparse) probability distribution at each node, implemented as a Map[Long,Double]. At each iteration, from the current estimates of the