Hi Bu,
no, we currently do not have a DBInputFormat. There is an open issue, with a
Google Summer of Code student working on a GoraInputFormat, which will also
support reading from RDBMSs through Gora. However, if and when we get it, it
will not provide semantics as rich as DBInputFormat's, e.g. you'll be
Hi Mirko,
this is in general the kind of approach I was suggesting, but looked at from
a broader perspective. I'd tend to avoid calling other tools such as Hive
or Pig often to compute injections, as Giraph is still a batch-processing
system and this could really introduce latency and reduce throughput. I
Hi Claudio and Marco,
thanks for your comments!
I also see the latency problem in this case, and I would like to have a
generic method which is then implemented on maybe two levels: a loosely
coupled one with workflows (maybe Oozie) which just reload the graph (no
injection), and another one
Hi Bu:
Until the interface with Gora is available, you could use Apache Sqoop to
import your MySQL table into HDFS and then run your Giraph job.
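
A minimal sketch of the import step, written as a small Python helper that
only assembles the `sqoop import` command line. The host, database, user,
table name and target directory are made-up placeholders, and a tab-separated
output is just an assumption about what your Giraph input format expects:

```python
import shlex


def build_sqoop_import(jdbc_url, user, table, target_dir, mappers=1):
    """Return the argv list for a `sqoop import` of one table into HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,            # JDBC URL of the MySQL database
        "--username", user,
        "-P",                             # prompt for the password on the console
        "--table", table,                 # table to import
        "--target-dir", target_dir,       # HDFS directory the files are written to
        "--fields-terminated-by", "\t",   # assumed: tab-separated text output
        "--num-mappers", str(mappers),    # number of parallel import tasks
    ]


if __name__ == "__main__":
    # All values below are hypothetical; replace them with your own.
    args = build_sqoop_import(
        "jdbc:mysql://dbhost/graphdb", "giraph", "edges", "/user/bu/giraph/input"
    )
    # Print the command for inspection; to run it, pass `args` to
    # subprocess.run(args, check=True) on a machine with Sqoop installed.
    print(" ".join(shlex.quote(a) for a in args))
```

The target directory is then what you would give to your Giraph job as its
input path.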
Cheers
Gustavo
On 06/09/2013 04:43, Claudio Martella claudio.marte...@gmail.com
wrote:
Hi Bu,
no, currently we do not have a DBInputFormat. We
Thanks, Claudio and Gustavo, for your answers. I have another question. I run
my algorithm on a cluster that has 20 nodes. When I set the number of
workers to 10 (or more), the algorithm works well and produces the
expected output. But if the number of workers is less than 10, I get the
Hi,
I've increased the counter limit in mapred-site.xml, but I still get the
error: Exceeded counter limits - Counters=121 Limit=120. Groups=6 Limit=50.
This is my config:
cat conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific