Hi all,
I'm seeing the same problem. I'm pasting the part of the logs that
looks most relevant, in case it helps. This appears in the log of every
Hadoop slave node.

2013-09-23 12:34:29,908 INFO
org.apache.giraph.comm.SendPartitionCache: SendPartitionCache:
maxVerticesPerTransfer = 10000
2013-09-23 12:34:29,908 INFO
org.apache.giraph.comm.SendPartitionCache: SendPartitionCache:
maxEdgesPerTransfer = 80000
2013-09-23 12:34:29,917 INFO
org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit:
Reserved input split path
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/20, overall
roughly 0.0% input splits reserved
2013-09-23 12:34:29,919 INFO
org.apache.giraph.worker.InputSplitsCallable: getInputSplit: Reserved
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/20 from
ZooKeeper and got input split 'hdfs://hadoop-master:54310/<some
path>/part-r-00002:402653184+14270392'
2013-09-23 12:34:29,935 WARN
org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library
is available
2013-09-23 12:34:29,935 INFO
org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library
loaded
2013-09-23 12:34:31,491 INFO
org.apache.giraph.worker.InputSplitsCallable: loadFromInputSplit:
Finished loading
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/20 (v=9209,
e=782750)
2013-09-23 12:34:31,496 INFO
org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit:
Reserved input split path
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/131, overall
roughly 0.71428573% input splits reserved
2013-09-23 12:34:31,497 INFO
org.apache.giraph.worker.InputSplitsCallable: getInputSplit: Reserved
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/131 from
ZooKeeper and got input split 'hdfs://hadoop-master:54310/home/<some
path>/part-r-00018:335544320+67108864'
2013-09-23 12:34:35,374 INFO
org.apache.giraph.worker.InputSplitsCallable: loadFromInputSplit:
Finished loading
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/131 (v=44211,
e=3680393)
2013-09-23 12:34:35,378 INFO
org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit:
Reserved input split path
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/113, overall
roughly 1.4285715% input splits reserved
2013-09-23 12:34:35,378 INFO
org.apache.giraph.worker.InputSplitsCallable: getInputSplit: Reserved
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/113 from
ZooKeeper and got input split 'hdfs://hadoop-master:54310/home/<some
path>/part-r-00016:67108864+67108864'
2013-09-23 12:34:38,161 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-365_vertices
2013-09-23 12:34:38,171 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-245_vertices
2013-09-23 12:34:38,181 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-25_vertices
2013-09-23 12:34:38,190 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-85_vertices
2013-09-23 12:34:38,205 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-345_vertices
2013-09-23 12:34:38,216 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-285_vertices
2013-09-23 12:34:38,228 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-205_vertices
2013-09-23 12:34:38,240 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-265_vertices
2013-09-23 12:34:38,255 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-5_vertices
2013-09-23 12:34:38,834 INFO
org.apache.giraph.worker.InputSplitsCallable: loadFromInputSplit:
Finished loading
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/113 (v=43776,
e=3684432)
2013-09-23 12:34:38,838 INFO
org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit:
Reserved input split path
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/1, overall
roughly 2.857143% input splits reserved
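
One possible explanation for why these errors show up even though the
partition directories do fill up. This is only a guess, since I have not
checked how DiskBackedPartitionStore creates its directories:
java.io.File.mkdirs() returns false not only when creation fails but also
when the directory already exists, for example because another thread
created it first. A minimal sketch of that behaviour follows; the class,
method names and path are made up for illustration and are not Giraph
code.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical demo class, not part of Giraph.
class MkdirsCheck {
    // Naive check: treats any false result from mkdirs() as a failure.
    static boolean naiveCreate(File dir) {
        return dir.mkdirs();
    }

    // Safer check: a false result only counts as a failure if the
    // directory is still missing afterwards.
    static boolean ensureDirectory(File dir) {
        return dir.mkdirs() || dir.isDirectory();
    }

    public static void main(String[] args) throws IOException {
        // Work in a fresh temporary directory so the demo is repeatable.
        File base = Files.createTempDirectory("partitions-demo").toFile();
        File dir = new File(base, "job_x/partition-5_vertices");

        System.out.println(naiveCreate(dir));     // true: directory was created
        System.out.println(naiveCreate(dir));     // false: it already exists
        System.out.println(ensureDirectory(dir)); // true: it exists, so no error
    }
}
```

If something like the naive check is behind the message, the ERROR lines
would be harmless noise whenever the directory is in fact there.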

On Fri, Sep 13, 2013 at 11:27 AM, Claudio Martella
<claudio.marte...@gmail.com> wrote:
> I have no idea without the logs, especially when it happens rarely.
>
>
> On Fri, Sep 13, 2013 at 12:33 AM, Alexander Asplund <alexaspl...@gmail.com>
> wrote:
>>
>> Actually, why is it saying it fails to create directory in the first
>> place, when it is trying to write files?
>>
>> On Sep 12, 2013 3:04 PM, "Alexander Asplund" <alexaspl...@gmail.com>
>> wrote:
>>>
>>> I can also add that there is no such issue with DiskBackedMessageStore.
>>> It successfully creates a large number of store files, and never starts
>>> failing.
>>>
>>> On Sep 12, 2013 2:11 PM, "Alexander Asplund" <alexaspl...@gmail.com>
>>> wrote:
>>>>
>>>> It's very strange. It is definitely failing on some partitions.
>>>> Currently the disk size of an offloading worker corresponds roughly
>>>> to the size of its part of the graph, but the worker attempts to
>>>> create additional partitions, and this fails.
>>>>
>>>> On Sep 12, 2013 2:07 PM, "Alexander Asplund" <alexaspl...@gmail.com>
>>>> wrote:
>>>>>
>>>>> Actually, I take that back. It seems it does succeed in creating
>>>>> partitions - it just struggles with it sometimes. Should I be
>>>>> worried about these errors if partition directories seem to be
>>>>> filling up?
>>>>>
>>>>> On Sep 11, 2013 6:38 PM, "Claudio Martella"
>>>>> <claudio.marte...@gmail.com> wrote:
>>>>>>
>>>>>> Giraph does not offload partitions or messages to HDFS in the
>>>>>> out-of-core module. It uses local disk on the computing nodes. By
>>>>>> default, it uses the tasktracker local directory where, for example,
>>>>>> the distributed cache is stored.
>>>>>>
>>>>>> Could you provide the stack trace Giraph is spitting out when it fails?
>>>>>>
>>>>>>
>>>>>> On Thu, Sep 12, 2013 at 12:54 AM, Alexander Asplund
>>>>>> <alexaspl...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm still trying to get Giraph to work on a graph that requires more
>>>>>>> memory than is available. The problem is that when the Workers try to
>>>>>>> offload partitions, the offloading fails. The DiskBackedPartitionStore
>>>>>>> fails to create the directory
>>>>>>> _bsp/_partitions/job-xxxx/part-vertices-xxx (roughly, from memory).
>>>>>>>
>>>>>>> The input or computation will then continue for a while, which I
>>>>>>> believe is because it is still managing to hold everything in memory -
>>>>>>> but at some point it reaches the limit where there simply is no more
>>>>>>> heap space, and it crashes with OOM.
>>>>>>>
>>>>>>> Has anybody had this problem with Giraph failing to make HDFS
>>>>>>> directories?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>    Claudio Martella
>>>>>>    claudio.marte...@gmail.com
>
>
>
>
> --
>    Claudio Martella
>    claudio.marte...@gmail.com
