It's very strange... it is definitely failing on some partitions. Currently the disk usage of an offloading worker corresponds roughly to the size of its part of the graph, but the worker attempts to create additional partitions, and this fails.
On Sep 12, 2013 2:07 PM, "Alexander Asplund" <alexaspl...@gmail.com> wrote:
> Actually, I take that back. It seems it did succeed in creating
> partitions - it just struggles with it sometimes. Should I be worried
> about these errors if partition directories seem to be filling up?
> On Sep 11, 2013 6:38 PM, "Claudio Martella" <claudio.marte...@gmail.com>
> wrote:
>
>> Giraph does not offload partitions or messages to HDFS in the
>> out-of-core module. It uses local disk on the computing nodes. By
>> default, it uses the tasktracker local directory where, for example,
>> the distributed cache is stored.
>>
>> Could you provide the stack trace Giraph is spitting out when failing?
>>
>>
>> On Thu, Sep 12, 2013 at 12:54 AM, Alexander Asplund <
>> alexaspl...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I'm still trying to get Giraph to work on a graph that requires more
>>> memory than is available. The problem is that when the workers try to
>>> offload partitions, the offloading fails. The DiskBackedPartitionStore
>>> fails to create the directory
>>> _bsp/_partitions/job-xxxx/part-vertices-xxx (roughly from recall).
>>>
>>> The input or computation will then continue for a while, which I
>>> believe is because it is still managing to hold everything in memory -
>>> but at some point it reaches the limit where there simply is no more
>>> heap space, and it crashes with OOM.
>>>
>>> Has anybody had this problem with Giraph failing to make HDFS
>>> directories?
>>>
>>
>>
>> --
>> Claudio Martella
>> claudio.marte...@gmail.com
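[Editor's sketch] The out-of-core behaviour discussed in this thread is driven by a handful of Giraph custom arguments. The invocation below is only a sketch: the example class, input/output paths, and the exact option names are assumptions based on Giraph of roughly this era, so verify them against GiraphConstants in your build.

```shell
# Hypothetical job submission enabling the out-of-core graph module.
# Option names (giraph.useOutOfCoreGraph, giraph.maxPartitionsInMemory,
# giraph.partitionsDirectory) are assumptions -- check GiraphConstants
# in your Giraph version before relying on them.
hadoop jar giraph-with-dependencies.jar org.apache.giraph.GiraphRunner \
    org.apache.giraph.examples.SimpleShortestPathsComputation \
    -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
    -vip /input/graph \
    -op /output/shortest-paths \
    -w 4 \
    -ca giraph.useOutOfCoreGraph=true \
    -ca giraph.maxPartitionsInMemory=8 \
    -ca giraph.partitionsDirectory=/local/scratch/giraph_partitions
```

As Claudio notes above, the partitions directory lives on each worker's local filesystem, not on HDFS, so a "failed to create directory" error from DiskBackedPartitionStore usually points at local permissions or free-space problems on the tasktracker's local directories rather than at HDFS.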