Re: Giraph offloadPartition fails creation directory

2013-09-23 Thread Dionysis Logothetis
Hi all,
I'm seeing the same problem. I'm pasting here the part of the logs that
looks most relevant, in case it helps. This appears in the log of every
Hadoop slave node.

2013-09-23 12:34:29,908 INFO
org.apache.giraph.comm.SendPartitionCache: SendPartitionCache:
maxVerticesPerTransfer = 1
2013-09-23 12:34:29,908 INFO
org.apache.giraph.comm.SendPartitionCache: SendPartitionCache:
maxEdgesPerTransfer = 8
2013-09-23 12:34:29,917 INFO
org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit:
Reserved input split path
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/20, overall
roughly 0.0% input splits reserved
2013-09-23 12:34:29,919 INFO
org.apache.giraph.worker.InputSplitsCallable: getInputSplit: Reserved
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/20 from
ZooKeeper and got input split 'hdfs://hadoop-master:54310/some
path/part-r-2:402653184+14270392'
2013-09-23 12:34:29,935 WARN
org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library
is available
2013-09-23 12:34:29,935 INFO
org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library
loaded
2013-09-23 12:34:31,491 INFO
org.apache.giraph.worker.InputSplitsCallable: loadFromInputSplit:
Finished loading
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/20 (v=9209,
e=782750)
2013-09-23 12:34:31,496 INFO
org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit:
Reserved input split path
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/131, overall
roughly 0.71428573% input splits reserved
2013-09-23 12:34:31,497 INFO
org.apache.giraph.worker.InputSplitsCallable: getInputSplit: Reserved
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/131 from
ZooKeeper and got input split 'hdfs://hadoop-master:54310/home/some
path/part-r-00018:335544320+67108864'
2013-09-23 12:34:35,374 INFO
org.apache.giraph.worker.InputSplitsCallable: loadFromInputSplit:
Finished loading
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/131 (v=44211,
e=3680393)
2013-09-23 12:34:35,378 INFO
org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit:
Reserved input split path
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/113, overall
roughly 1.4285715% input splits reserved
2013-09-23 12:34:35,378 INFO
org.apache.giraph.worker.InputSplitsCallable: getInputSplit: Reserved
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/113 from
ZooKeeper and got input split 'hdfs://hadoop-master:54310/home/some
path/part-r-00016:67108864+67108864'
2013-09-23 12:34:38,161 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-365_vertices
2013-09-23 12:34:38,171 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-245_vertices
2013-09-23 12:34:38,181 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-25_vertices
2013-09-23 12:34:38,190 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-85_vertices
2013-09-23 12:34:38,205 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-345_vertices
2013-09-23 12:34:38,216 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-285_vertices
2013-09-23 12:34:38,228 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-205_vertices
2013-09-23 12:34:38,240 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-265_vertices
2013-09-23 12:34:38,255 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-5_vertices
2013-09-23 12:34:38,834 INFO
org.apache.giraph.worker.InputSplitsCallable: loadFromInputSplit:
Finished loading
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/113 (v=43776,
e=3684432)
2013-09-23 12:34:38,838 INFO
org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit:
Reserved input split path
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/1, overall
roughly 2.857143% input splits reserved

On Fri, Sep 13, 2013 at 11:27 AM, Claudio Martella
claudio.marte...@gmail.com wrote:
 I have no idea without the logs, especially when it happens rarely.



Re: Giraph offloadPartition fails creation directory

2013-09-23 Thread Claudio Martella
Weird.

This is the code:

if (!parent.exists()) {
  if (!parent.mkdirs()) {
    LOG.error("offloadPartition: Failed to create directory " +
        parent.getAbsolutePath());
  }
}


The question is why parent.mkdirs() is returning false. It could be a
permissions problem. Could you try passing a different directory for
writing, e.g. /tmp/foobar?
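
One thing worth ruling out alongside permissions: File.mkdirs() returns false not only when creation is denied but also when the directory already exists, for example because another thread created it between the exists() check and the mkdirs() call. Whether that is what happens here is only a guess. A minimal sketch of a check that tells these cases apart (the class and method names are hypothetical and not part of Giraph):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

/** Hypothetical helper, not Giraph code: explains why mkdirs() returned false. */
final class MkdirsDiagnosis {

  /** Tries to create dir and returns a short description of the outcome. */
  static String ensureDirectory(File dir) {
    if (dir.mkdirs()) {
      return "created";
    }
    // mkdirs() also returns false when the directory is already there,
    // for example because another thread created it a moment earlier.
    if (dir.isDirectory()) {
      return "already exists";
    }
    if (dir.exists()) {
      return "path exists but is not a directory";
    }
    // Walk up to the nearest existing ancestor and check whether it is writable.
    File ancestor = dir.getAbsoluteFile().getParentFile();
    while (ancestor != null && !ancestor.exists()) {
      ancestor = ancestor.getParentFile();
    }
    if (ancestor != null && !ancestor.canWrite()) {
      return "no write permission on " + ancestor;
    }
    return "failed for another reason (disk full, file in the way, ...)";
  }

  public static void main(String[] args) throws IOException {
    File base = Files.createTempDirectory("offload-test").toFile();
    File target = new File(base, "partitions/partition-1_vertices");
    System.out.println(ensureDirectory(target)); // created
    System.out.println(ensureDirectory(target)); // already exists
  }
}
```

Logging the returned reason instead of a bare "Failed to create directory" would show whether the error is a real failure or just a directory that was already in place.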


On Mon, Sep 23, 2013 at 1:28 PM, Dionysis Logothetis
dlogothe...@gmail.com wrote:

 offloadPartition: Failed to create directory





-- 
   Claudio Martella
   claudio.marte...@gmail.com


Re: Giraph offloadPartition fails creation directory

2013-09-12 Thread Alexander Asplund
Unfortunately, some restrictions mean I don't have the stack traces handy,
but pointing me towards the local disk helped me partially resolve it.
There are permission issues with this directory, but I was able to get
around them by manually creating a separate giraph directory in the
MapReduce local storage (mapred/local/giraph) and setting the Giraph
options to point to local storage/giraph/partitions and /messages.
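
For anyone trying the same workaround, here is a rough sketch of the custom arguments involved. The option names are as I recall them from GiraphConstants (giraph.useOutOfCoreGraph, giraph.partitionsDirectory, giraph.messagesDirectory), so please verify them against your Giraph version. The helper class and the base path are hypothetical placeholders; the program only prints the "-ca key=value" flags to pass to the job.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not Giraph code: prints -ca flags for out-of-core directories. */
final class OutOfCoreOptions {

  /** Builds option name to value pairs for a given local base directory. */
  static Map<String, String> forLocalDir(String localDir) {
    Map<String, String> options = new LinkedHashMap<>();
    // Option names recalled from GiraphConstants; check them for your version.
    options.put("giraph.useOutOfCoreGraph", "true");
    options.put("giraph.partitionsDirectory", localDir + "/partitions");
    options.put("giraph.messagesDirectory", localDir + "/messages");
    return options;
  }

  /** Renders the options as "-ca key=value" command-line arguments. */
  static String toCommandLine(Map<String, String> options) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : options.entrySet()) {
      if (sb.length() > 0) {
        sb.append(' ');
      }
      sb.append("-ca ").append(e.getKey()).append('=').append(e.getValue());
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // Placeholder path: substitute a directory that exists and is writable
    // by the task user on every worker node.
    System.out.println(toCommandLine(forLocalDir("/path/to/mapred/local/giraph")));
  }
}
```

The directories have to be on local disk on each worker, since the out-of-core module does not write to HDFS.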

Then something strange happens. The job successfully creates 30,
exactly 30 directories, and then starts failing again. This happened
both times I ran the job. 30 directories are created in the partitions
directory, and then subsequently it prints to the task log something
like

DiskBackedPartitionStore: offloadPartition: Failed to create directory ...

...and then no further directories are created after the 30. It will
attempt to create more partition directories, but it keeps failing
after the initial 30. It is quite strange.

On 9/12/13, Claudio Martella claudio.marte...@gmail.com wrote:
 Giraph does not offload partitions or messages to HDFS in the out-of-core
 module. It uses local disk on the computing nodes. By default, it uses the
 tasktracker local directory where for example the distributed cache is
 stored.

 Could you provide the stacktrace Giraph is spitting when failing?





-- 
Alexander Asplund


Re: Giraph offloadPartition fails creation directory

2013-09-12 Thread Alexander Asplund
Actually, I take that back. It seems it does succeed in creating
partitions - it just struggles with it sometimes. Should I be worried about
these errors if partition directories seem to be filling up?
On Sep 11, 2013 6:38 PM, Claudio Martella claudio.marte...@gmail.com
wrote:

 Giraph does not offload partitions or messages to HDFS in the out-of-core
 module. It uses local disk on the computing nodes. By default, it uses the
 tasktracker local directory where for example the distributed cache is
 stored.

 Could you provide the stacktrace Giraph is spitting when failing?




Re: Giraph offloadPartition fails creation directory

2013-09-12 Thread Alexander Asplund
Actually, why is it saying it fails to create directory in the first place,
when it is trying to write files?
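
A likely answer, sketched outside of Giraph: a file can only be written once its parent directory exists, so code that offloads to disk typically creates the directory first and logs a directory error if that step fails. The class below is a hypothetical illustration, not the DiskBackedPartitionStore code. It uses Files.createDirectories, which does not fail when the directory already exists and throws an IOException describing the path when creation really fails.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch, not Giraph code: create the directory, then write the file. */
final class OffloadSketch {

  /** Writes data to file, creating any missing parent directories first. */
  static void writeWithParents(Path file, byte[] data) throws IOException {
    Path parent = file.toAbsolutePath().getParent();
    if (parent != null) {
      // No error if the directory already exists; throws an IOException
      // (for example AccessDeniedException) if it cannot be created.
      Files.createDirectories(parent);
    }
    // Creates the file, or truncates and overwrites it if it already exists.
    Files.write(file, data);
  }

  public static void main(String[] args) throws IOException {
    Path base = Files.createTempDirectory("offload-sketch");
    Path file = base.resolve("job_x/partition-5_vertices");
    writeWithParents(file, "vertex data".getBytes(StandardCharsets.UTF_8));
    System.out.println(Files.exists(file)); // true
  }
}
```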
On Sep 12, 2013 3:04 PM, Alexander Asplund alexaspl...@gmail.com wrote:

 I can also add that there is no such issue with DiskBackedMessageStore. It
 successfully creates a large number of store files, and never starts
 failing.
 On Sep 12, 2013 2:11 PM, Alexander Asplund alexaspl...@gmail.com
 wrote:

 It's very strange. It is definitely failing on some partitions.
 Currently the disk usage of an offloading worker corresponds roughly to
 the size of its part of the graph, but the worker attempts to create
 additional partitions, and this fails.




Giraph offloadPartition fails creation directory

2013-09-11 Thread Alexander Asplund
Hi,

I'm still trying to get Giraph to work on a graph that requires more
memory than is available. The problem is that when the Workers try to
offload partitions, the offloading fails. The DiskBackedPartitionStore
fails to create the directory
_bsp/_partitions/job-/part-vertices-xxx (roughly from recall).

The input or computation will then continue for a while, which I
believe is because it is still managing to hold everything in memory -
but at some point it reaches the limit where there simply is no more
heap space, and it crashes with OOM.

Has anybody had this problem with giraph failing to make HDFS directories?


Re: Giraph offloadPartition fails creation directory

2013-09-11 Thread Claudio Martella
Giraph does not offload partitions or messages to HDFS in the out-of-core
module. It uses local disk on the computing nodes. By default, it uses the
tasktracker local directory where for example the distributed cache is
stored.

Could you provide the stacktrace Giraph is spitting when failing?


On Thu, Sep 12, 2013 at 12:54 AM, Alexander Asplund
alexaspl...@gmail.com wrote:

 Hi,

 I'm still trying to get Giraph to work on a graph that requires more
 memory than is available. The problem is that when the Workers try to
 offload partitions, the offloading fails. The DiskBackedPartitionStore
 fails to create the directory
 _bsp/_partitions/job-/part-vertices-xxx (roughly from recall).

 The input or computation will then continue for a while, which I
 believe is because it is still managing to hold everything in memory -
 but at some point it reaches the limit where there simply is no more
 heap space, and it crashes with OOM.

 Has anybody had this problem with giraph failing to make HDFS directories?




-- 
   Claudio Martella
   claudio.marte...@gmail.com