RE: cannot run Giraph trunk with Hadoop 2.0.0-alpha

2012-08-20 Thread David Garcia
You can clear this error by recursively removing the _bsp folder from the ZooKeeper file system...and then running the job again. You should probably remove the folder from HDFS too. From: Johnny Zhang [xiao...@cloudera.com] Sent: Monday, August 20, 2012 6:59 PM To
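The cleanup described above can be done with the stock ZooKeeper and Hadoop CLIs. A hedged sketch (the `/_bsp` path, the ZooKeeper host, and the HDFS location are assumptions; they depend on your Giraph version and job configuration):

```shell
# Remove Giraph's BSP coordination state from ZooKeeper.
# Older zkCli.sh releases use `rmr`; ZooKeeper 3.5+ renamed it `deleteall`.
zkCli.sh -server zkhost:2181 rmr /_bsp

# Remove any leftover _bsp working directory from HDFS as well.
# On newer Hadoop, `-rmr` is deprecated in favor of `-rm -r`.
hadoop fs -rmr /_bsp
```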

Re: Giraph : newbie questions

2012-07-16 Thread David Garcia
Giraph partitions the vertices using a hashing function that's basically the equivalent of (hash(vertexID) mod #ofComputeNodes). You can mitigate memory issues by starting the job with a minimal set of vertices in your file and then adding them dynamically as your job progresses (assuming that your job do
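The hash-partitioning scheme described above can be sketched in a few lines. This is a standalone illustration, not Giraph's actual partitioner class (the class and method names here are hypothetical; Giraph's own implementation lives in its HashPartitionerFactory machinery):

```java
// Minimal sketch of hash-based vertex partitioning:
// a vertex is assigned to partition abs(hash(id)) mod numPartitions,
// so the same id always lands on the same compute node.
public class HashPartitionSketch {

    static int partition(long vertexId, int numPartitions) {
        // Math.abs guards against negative hash codes; note that
        // Integer.MIN_VALUE would need extra care in production code.
        return Math.abs(Long.hashCode(vertexId) % numPartitions);
    }

    public static void main(String[] args) {
        // The same vertex id deterministically maps to one partition.
        System.out.println(partition(42L, 4));
    }
}
```

One consequence of this scheme, relevant to the memory discussion below, is that partition assignment ignores graph structure entirely: neighboring vertices usually end up on different workers.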

RE: Resources or advice on minimising memory usage in Giraph/Hadoop code ?

2012-06-07 Thread David Garcia
...? Won't this just postpone the pain? On Thursday, June 7, 2012, David Garcia wrote: Based upon what you have mentioned, I think you are getting heap errors because every vertex in your graph will be loaded into memory prior to superstep one. So if you have a large graph, with lots of

Re: Resources or advice on minimising memory usage in Giraph/Hadoop code ?

2012-06-06 Thread David Garcia
Based upon what you have mentioned, I think you are getting heap errors because every vertex in your graph will be loaded into memory prior to superstep one. So if you have a large graph, with lots of state, you probably have memory issues from the very beginning. A simple way to mitigate the
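The mitigation hinted at above, starting with a minimal vertex set and creating the rest on demand, can be illustrated without any Giraph dependency. Inside a real Giraph computation this is typically done with addVertexRequest from compute(); the class and method names below are hypothetical, a standalone sketch of the lazy-creation idea only:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of lazy vertex materialization: instead of loading every
// vertex into memory before superstep one, create a vertex's state
// the first time something (e.g. an incoming message) touches it.
public class LazyVertexStore {
    private final Map<Long, Double> vertexValues = new HashMap<>();

    // Return the vertex value, creating it with a default on first touch.
    double getOrCreate(long vertexId, double defaultValue) {
        return vertexValues.computeIfAbsent(vertexId, id -> defaultValue);
    }

    int size() {
        return vertexValues.size();
    }

    public static void main(String[] args) {
        LazyVertexStore store = new LazyVertexStore();
        // Nothing is resident until a vertex is actually needed.
        store.getOrCreate(1L, 0.5);
        System.out.println(store.size());
    }
}
```

The trade-off raised in the reply above still applies: if the algorithm eventually touches every vertex, lazy creation only delays the peak memory footprint rather than reducing it.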