Library of ML and Graph Mining algorithms

2014-02-28 Thread Dionysis Logothetis
Hi all,

We were going to send an e-mail to the Giraph mailing lists soon, but
it's good to see you already discovered the site. Grafos.ml is a
general project on tools for ML and graph mining that we started at
Telefonica Research and for the most part revolves around Giraph. In
this context, we've been developing a library of algorithms for Giraph
that we call Okapi.

We started developing and using Okapi internally for research
purposes and we thought it'd be good to open source it. We've been
working on this for some time, and after some final testing, we're
ready to release a first version of it (a couple of the algorithm
implementations are still under testing).

So far we've developed some state-of-the-art machine
learning/recommendation algorithms and graph mining algorithms. In
general, we'd like to build more applications in these areas and
potentially new categories of algorithms that fit well on top of
Giraph. If you have any suggestions, let us know!

As you'll see we're releasing Okapi under the Apache license. We hope
to get contributions from the Giraph community and make this a rich
and useful library for everybody. Contributions could take the form of
new algorithms, fixes, etc. In the long term we'd like Okapi to become
an Apache project if this makes sense, something similar to Mahout. If
you have any feedback about this or would like to get more involved,
please do get in touch with us.

Thanks!

On Fri, Feb 28, 2014 at 9:26 AM, Pavan Kumar A pava...@outlook.com wrote:
 +1 :)

 Date: Fri, 28 Feb 2014 00:00:55 -0800
 From: ach...@apache.org
 To: d...@giraph.apache.org
 Subject: Re: Fyi: Graphos

 Thanks for the link, this looks pretty neat.

 On 2/27/14, 11:29 PM, Sebastian Schelter wrote:
  Hi,
 
  It seems a team from Telefonica built a machine learning library on
  top of Giraph:
 
  http://grafos.ml/
 
  Looks pretty interesting to me :)
 
  Best,
  Sebastian




Re: Giraph offloadPartition fails creation directory

2013-09-23 Thread Dionysis Logothetis
Hi all,
I'm seeing the same problem. I'm pasting here the part of the logs that
looks most relevant, in case it helps. This appears in the log of every
Hadoop slave node.

2013-09-23 12:34:29,908 INFO
org.apache.giraph.comm.SendPartitionCache: SendPartitionCache:
maxVerticesPerTransfer = 1
2013-09-23 12:34:29,908 INFO
org.apache.giraph.comm.SendPartitionCache: SendPartitionCache:
maxEdgesPerTransfer = 8
2013-09-23 12:34:29,917 INFO
org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit:
Reserved input split path
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/20, overall
roughly 0.0% input splits reserved
2013-09-23 12:34:29,919 INFO
org.apache.giraph.worker.InputSplitsCallable: getInputSplit: Reserved
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/20 from
ZooKeeper and got input split 'hdfs://hadoop-master:54310/some
path/part-r-2:402653184+14270392'
2013-09-23 12:34:29,935 WARN
org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library
is available
2013-09-23 12:34:29,935 INFO
org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library
loaded
2013-09-23 12:34:31,491 INFO
org.apache.giraph.worker.InputSplitsCallable: loadFromInputSplit:
Finished loading
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/20 (v=9209,
e=782750)
2013-09-23 12:34:31,496 INFO
org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit:
Reserved input split path
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/131, overall
roughly 0.71428573% input splits reserved
2013-09-23 12:34:31,497 INFO
org.apache.giraph.worker.InputSplitsCallable: getInputSplit: Reserved
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/131 from
ZooKeeper and got input split 'hdfs://hadoop-master:54310/home/some
path/part-r-00018:335544320+67108864'
2013-09-23 12:34:35,374 INFO
org.apache.giraph.worker.InputSplitsCallable: loadFromInputSplit:
Finished loading
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/131 (v=44211,
e=3680393)
2013-09-23 12:34:35,378 INFO
org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit:
Reserved input split path
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/113, overall
roughly 1.4285715% input splits reserved
2013-09-23 12:34:35,378 INFO
org.apache.giraph.worker.InputSplitsCallable: getInputSplit: Reserved
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/113 from
ZooKeeper and got input split 'hdfs://hadoop-master:54310/home/some
path/part-r-00016:67108864+67108864'
2013-09-23 12:34:38,161 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-365_vertices
2013-09-23 12:34:38,171 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-245_vertices
2013-09-23 12:34:38,181 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-25_vertices
2013-09-23 12:34:38,190 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-85_vertices
2013-09-23 12:34:38,205 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-345_vertices
2013-09-23 12:34:38,216 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-285_vertices
2013-09-23 12:34:38,228 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-205_vertices
2013-09-23 12:34:38,240 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-265_vertices
2013-09-23 12:34:38,255 ERROR
org.apache.giraph.partition.DiskBackedPartitionStore:
offloadPartition: Failed to create directory
_bsp/_partitions/job_201307021917_1469/partition-5_vertices
2013-09-23 12:34:38,834 INFO
org.apache.giraph.worker.InputSplitsCallable: loadFromInputSplit:
Finished loading
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/113 (v=43776,
e=3684432)
2013-09-23 12:34:38,838 INFO
org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit:
Reserved input split path
/_hadoopBsp/job_201307021917_1469/_vertexInputSplitDir/1, overall
roughly 2.857143% input splits reserved
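
One possible reading of the ERROR lines above, offered only as a guess and
not confirmed anywhere in this thread: in Java, File.mkdirs() returns false
not only on a real failure but also when the directory already exists. Code
that logs an error whenever mkdirs() returns false would therefore report
"Failed to create directory" even though the directory is there. The sketch
below shows only that JDK behaviour. The class name, method names and paths
are hypothetical, and it does not reproduce Giraph's actual code.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class MkdirsDemo {
    /** Naive check: treats any false result from mkdirs() as a failure. */
    static boolean naiveCreate(File dir) {
        return dir.mkdirs();
    }

    /** Tolerant check: succeeds if the directory exists afterwards, whoever created it. */
    static boolean tolerantCreate(File dir) {
        return dir.mkdirs() || dir.isDirectory();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical scratch location; not the path Giraph uses.
        File base = Files.createTempDirectory("partitions-demo").toFile();
        File dir = new File(base, "job_example/partitions");

        // The first call creates the directory; the second finds it already there.
        System.out.println("first naive call:  " + naiveCreate(dir));
        System.out.println("second naive call: " + naiveCreate(dir));
        System.out.println("tolerant call:     " + tolerantCreate(dir));
    }
}
```

If something like this is what is happening, the errors would be harmless
noise. A relative path that is not writable from the task's working
directory would be a different, real failure, and the logs above do not
tell the two apart.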

On Fri, Sep 13, 2013 at 11:27 AM, Claudio Martella
claudio.marte...@gmail.com wrote:
 I have no idea without the logs, especially when it happens rarely.


 On Fri, Sep 13, 2013 at 12:33 AM, Alexander Asplund alexaspl...@gmail.com
 wrote:

 Actually, why is it saying it fails to create directory in the first
 place, when it is trying to write files?

 On Sep 12, 2013 3:04