Re: Giraph and HBase
Nice! Thanks for sharing! Roman. On Tue, Dec 13, 2016 at 12:38 PM, Robert Yokota wrote: > Hi, > > In case anyone is interested in analyzing graphs in HBase with Apache > Giraph, this might be helpful: > > https://yokota.blog/2016/12/13/graph-analytics-on-hbase-with-hgraphdb-and-giraph/
Re: Apache Giraph Visualisation
On Sun, Aug 7, 2016 at 7:50 PM, agc studio wrote: > Hi Team, > > I have been looking for a graph drawing/visualisation algorithm implemented > for Giraph. Are there any such implementations? Can you be more specific/give examples? Thanks, Roman.
Re: Anybody's considering to present at ApacheCON EU?
Awesome! Meantime, I'll take a look at the HBase one. Thanks, Roman. On Mon, Jul 18, 2016 at 5:05 PM, Sergey Edunov <edu...@gmail.com> wrote: > Hi Roman, > > I'll take care of out of core test case. > > Regards, > Sergey Edunov > > On Mon, Jul 18, 2016 at 2:52 PM, Roman Shaposhnik <ro...@shaposhnik.org> > wrote: >> Hi! >> >> I was wondering if anybody is considering doing a talk at >> ApacheCON EU. The CFP closes on Sep 9th: >>http://www.apachecon.com/ >> >> Given upcoming 1.2.0 release it could be a good place >> for us to rekindle some of that Giraph love ;-) >> >> Thanks, >> Roman. >> >> P.S. Speaking of 1.2.0 can someone please take a look at the OutOfCore test >> failure in hadoop_1 profile? >> >> https://builds.apache.org/job/Giraph-1.2/MVN_PROFILE=hadoop_1,jdk=JDK%201.7%20(latest),label=ubuntu/8/testReport/
Re: Giraph out-of-core feature is not helping!
I remember seeing a discussion that discouraged users from utilizing OOC. I am vague on the details but you can try searching the archives. Thanks, Roman. On Fri, Dec 4, 2015 at 4:34 PM, Khaled Ammar wrote: > Hi, > > I am using Giraph-1.1.0 to do large graph processing. I was trying to do a > hashMin (WCC) algorithm on a large graph but it failed with an out of memory > error. I thought the out-of-core option may help, but it did not. > > Is there any advice about how to enable out-of-core processing? > > I followed this URL: http://giraph.apache.org/ooc.html > > -- > Thanks, > -Khaled
Re: How can I build giraph 1.1.0-hadoop2 distribution from the source?
On Thu, Sep 10, 2015 at 4:09 AM, Anton Petersson wrote: > Dear Roman, > > Thanks for your reply. > My application code runs on hadoop 2.4.1, by including the following > dependency. > > <dependency> > <groupId>org.apache.giraph</groupId> > <artifactId>giraph-core</artifactId> > <version>1.1.0-hadoop2</version> > </dependency> > > I tried to change the dependency to my custom build (such as mvn clean > install -Pyarn -Dhadoop.version=2.4.1 -DskipTests), > but my application code is NOT running on my custom build. What exactly fails, and how? I need details to be able to help you. > Therefore I wish > to know how to build the giraph-core jar (1.1.0-hadoop2) as listed in the > central maven repo. Building your local copy of Giraph has nothing to do with Maven central. Thanks, Roman.
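To illustrate the last point: `mvn install` copies the built artifacts into the local Maven repository, and a project that declares the same groupId, artifactId and version resolves them from there without going to Maven Central. The sketch below only computes where such an artifact would be stored, assuming the default local repository location (`~/.m2/repository`, which settings.xml can override). `m2_artifact_dir` is a hypothetical helper, and the version a custom build actually produces depends on its pom, so it is passed in rather than assumed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Maven or Giraph): print the directory in the
# local Maven repository where an artifact with the given coordinates is stored.
# Usage: m2_artifact_dir GROUP_ID ARTIFACT_ID VERSION [REPO_ROOT]
m2_artifact_dir() {
  group_path=$(printf '%s' "$1" | tr '.' '/')   # org.apache.giraph -> org/apache/giraph
  repo_root=${4:-$HOME/.m2/repository}          # assumption: default local repo location
  printf '%s/%s/%s/%s\n' "$repo_root" "$group_path" "$2" "$3"
}

# Example: where a giraph-core jar with the version from the thread would live.
m2_artifact_dir org.apache.giraph giraph-core 1.1.0-hadoop2
```

If that directory is empty after a custom build, the build most likely installed the artifact under a different version string than the one the application depends on.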
Re: Please welcome our newest committer, Igor Kabiljo!
Great work, Igor! Thanks, Roman. On Tue, Feb 10, 2015 at 8:42 PM, Maja Kabiljo majakabi...@fb.com wrote: I am pleased to announce that Igor Kabiljo has been invited to become a committer by the Project Management Committee (PMC) of Apache Giraph, and he accepted. Igor's most important contributions are implementing reduce/broadcast that generalizes aggregators and working on primitive message/edge storages that make applications more efficient, as well as around using specific partitioners that utilize good partitioning. He is also coming up with issues for beginners and guiding them along the way. Igor, we are looking forward to your future work and deeper involvement in the project. Thanks, Maja List of Igor’s contributions: GIRAPH-785: Improve GraphPartitionerFactory usage GIRAPH-786: XSparseVector create a lot of objects in add/write GIRAPH-848: Allowing plain computation with types being configurable GIRAPH-934: Allow having state in aggregators GIRAPH-935: Loosen modifiers when needed GIRAPH-938: Allow fast working with primitives generically GIRAPH-939: Reduce/broadcast API GIRAPH-954: Allow configurable Aggregators/Reducers again GIRAPH-955: Allow vertex/edge/message value to be configurable GIRAPH-961: Internals of MasterLoggingAggregator have been incorrectly removed GIRAPH-965: Improving and adding reducers GIRAPH-986: Add more stuff to TypeOps GIRAPH-987: Improve naming for ReduceOperation Beginner issues he guided: GIRAPH-891: Make MessageStoreFactory configurable GIRAPH-895: Trim the edges in Giraph GIRAPH-921: Create ByteValueVertex to store vertex values as bytes without object instance GIRAPH-988: Allow object to be specified as next Computation in Giraph
Re: YARN vs. MR1: is YARN a good idea?
Perfect summary! Thanks for writing it. Thanks, Roman. On Fri, Dec 19, 2014 at 12:09 PM, Eli Reisman apache.mail...@gmail.com wrote: Giraph on YARN thus far doesn't break any compatibility with the MapReduce version. When I was working on it more actively, it had a slightly faster job startup but otherwise behaved similarly to the MapReduce version. There are a number of things design-wise that could make the YARN profile substantially better (in theory) but would require a fork or bigger design changes/agreements about the MapReduce profiles. This would include things like spawning the Master Giraph task in the Application Master itself, and many other things along those lines. There are also a number of smaller things that would probably make a difference like exposing YARN's per-task resource configuration features in a more flexible way. I haven't had much time to hack on Giraph this past year, and at some point last summer some folks like Mohammad Islam from LinkedIn did some great work to update the YARN profile to run on Hadoop 2.2.0 or newer versions but since then it hasn't gotten much love. I noticed there is still a note in the master POM from the original Giraph on YARN implementation that says it's compatible only with Hadoop 2.0.3-alpha. I thought that was removed with Mohammad's Hadoop 2.2.0 patches but apparently it wasn't. We should remove that; it's no longer accurate and seems to be misleading people trying to build the YARN profile. On Fri, Oct 10, 2014 at 11:15 AM, Tripti Singh tri...@yahoo-inc.com wrote: Hi Matthew, I would have been thrilled to give you numbers on this one but for me the Application is not scaling without the out-of-core option (which isn't working the way it was in the previous version). I'm still figuring it out and can get back once it's resolved. I have patched a few things and will share them for people who might face a similar issue.
If you have a fix for scalability, do let me know. Thanks, Tripti Sent from my iPhone On 06-Oct-2014, at 9:22 pm, Matthew Cornell m...@matthewcornell.org wrote: Hi Folks. I don't think I paid enough attention to YARN vs. MR1 when I built Giraph 1.0.0 for our system. How much better is Giraph on YARN? Thank you. -- Matthew Cornell | m...@matthewcornell.org
Re: ClassNotFoundException GiraphYarn Task with Giraph 1.1.0 for Hadoop 2.5.1
On Sun, Dec 14, 2014 at 11:21 PM, Philipp Nolte p...@daslaboratorium.de wrote: Maybe it's just a configuration thing. How did you build Giraph in the first place? Also, what are your mapred-site.xml and yarn-site.xml in HADOOP_CONF_DIR? Finally, what version of Hadoop are you using? And from what vendor? I’ve tried running in giraph.SplitMasterWorker mode and it seems like Hadoop is missing the worker nodes: Here is my command: $ hadoop jar giraphs-and-balloons-computation-0.0.1-for-hadoop-2.5.1-and-giraph-1.1.0-RC1-jar-with-dependencies.jar How did you build this JAR? -ca mapred.job.tracker=master:5431\ If your Giraph installation has access to a correctly configured Hadoop client you really don't need this line. Thanks, Roman.
Re: ClassNotFoundException GiraphYarn Task with Giraph 1.1.0 for Hadoop 2.5.1
Fixing the YARN backend would be nice, but in the meantime, what prevents you from using an MR backend? That is known to work. Thanks, Roman. On Sun, Dec 14, 2014 at 2:41 PM, Philipp Nolte p...@daslaboratorium.de wrote: Hello everyone, I have been developing a Giraph application and now wanted to try it on my four-machine cluster running Hadoop 2.5.1. My Giraph application is packaged with Giraph 1.1.0-RC1 and all its dependencies. I’ve built Giraph and installed it into my local repository using the hadoop_2 profile running and built my application as a jar with all dependencies (with giraph-core 1.1.0-RC1). My application runs nicely with GiraphRunner and one worker on my local machine and on the cluster’s master. But as I have a small Hadoop 2.5.1 cluster, I need to use GiraphYarnTask to utilize all its workers. Running my application results in a ClassNotFoundException for GiraphYarnTask. Any idea? Philipp
Re: ClassNotFoundException GiraphYarn Task with Giraph 1.1.0 for Hadoop 2.5.1
On Sun, Dec 14, 2014 at 4:09 PM, Philipp Nolte p...@daslaboratorium.de wrote: I’ve had a look at the assembled giraph-core jar file and it does not contain any GiraphYarnTask. How so? Running my application using the GiraphRunner works fine as long as I only have one worker (local mode). To use the other workers, I need to start MR TaskTrackers on the machines - which aren’t available on hadoop 2.5.1. That's why I need the GiraphYarnTask. Looks like we're talking past each other. What I am saying is that running a pure MR-based Giraph job on a fully distributed YARNized cluster is perfectly valid and works fine. You don't *have* to use YARN, even though it is available on your cluster. Thanks, Roman.
Re: Can't complete Shortest Path Example: error code 127
Is there any reason you actually *need* Giraph to run on YARN? If you don't get any benefit out of it -- just go with the default MR model. Thanks, Roman. On Mon, Dec 8, 2014 at 11:58 PM, Alessio Arleo ingar...@icloud.com wrote: Thanks Roman for your reply :) In the first place, I am pretty new to Giraph and Hadoop. From what I could learn, YARN is the result of a complete overhaul of MR, so I thought it would be wise to focus on the new technology: should I roll back to the hadoop_2 profile instead of pure_yarn? Or am I just missing the whole picture? Thanks :) Sent from my iPad On 08 Dec 2014, at 23:59, Roman Shaposhnik ro...@shaposhnik.org wrote: On Mon, Dec 8, 2014 at 9:23 AM, Dr. Alessio Arleo ingar...@icloud.com wrote: Hello everybody I managed to compile Giraph 1.1 for Hadoop 2.6.0 with pure_yarn maven profile. I am running on a VM environment so I took the necessary actions as listed in the Giraph Quick Start Guide. I think it would be fair to say that with Giraph on YARN you're running a configuration that hasn't been as much validated as Giraph on MR. Not to say you're on your own. More like: you're among a few users and collectively you guys may be on your own ;-) Patches, of course, are always welcome. Thanks, Roman.
Re: Can't complete Shortest Path Example: error code 127
On Mon, Dec 8, 2014 at 9:23 AM, Dr. Alessio Arleo ingar...@icloud.com wrote: Hello everybody I managed to compile Giraph 1.1 for Hadoop 2.6.0 with pure_yarn maven profile. I am running on a VM environment so I took the necessary actions as listed in the Giraph Quick Start Guide. I think it would be fair to say that with Giraph on YARN you're running a configuration that hasn't been as much validated as Giraph on MR. Not to say you're on your own. More like: you're among a few users and collectively you guys may be on your own ;-) Patches, of course, are always welcome. Thanks, Roman.
Re: Please welcome our newest committer, Sergey Edunov!
Congrats! Great to have you onboard! On Wed, Dec 3, 2014 at 10:34 AM, Maja Kabiljo majakabi...@fb.com wrote: I am happy to announce that the Project Management Committee (PMC) for Apache Giraph has elected Sergey Edunov to become a committer, and he accepted. Sergey has been an active member of Giraph community, finding issues, submitting patches and reviewing code. We’re looking forward to Sergey’s larger involvement and future work. List of his contributions: GIRAPH-895: Trim the edges in Giraph GIRAPH-896: Memory leak in SuperstepMetricsRegistry GIRAPH-897: Add an option to dump only live objects to JMap GIRAPH-898: Remove giraph-accumulo from Facebook profile GIRAPH-903: Detect crashes on Netty threads GIRAPH-924: Fix checkpointing GIRAPH-925: Unit tests should pass even if zookeeper port not available GIRAPH-927: Decouple netty server threads from message processing GIRAPH-933: Checkpointing improvements GIRAPH-936: Decouple netty server threads from message processing GIRAPH-940: Cleanup the list of supported hadoop versions GIRAPH-950: Auto-restart from checkpoint doesn't pick up latest checkpoint GIRAPH-963: Aggregators may fail with IllegalArgumentException upon deserialization Best, Maja
[ANNOUNCE] Giraph 1.1.0 is now available
The Apache Giraph team is pleased to announce the immediate release of Giraph 1.1.0. Download it from your favorite Apache mirror at: http://www.apache.org/dyn/closer.cgi/giraph/giraph-1.1.0 This release has also been pushed to Apache's maven repository as two separate versions: * 1.1.0 is the version that has been compiled against hadoop 1.2+ * 1.1.0-hadoop2 is the version that has been compiled against hadoop 2.5+ make sure to specify the correct version in your Maven dependencies. See also the full release notes: http://s.apache.org/a8X Thanks to everybody who contributed to this release! Yours, The Apache Giraph team
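As a rough aid for picking between the two version IDs named in the announcement, the sketch below maps a Hadoop version string to the matching Giraph 1.1.0 Maven version. `giraph_version_for_hadoop` is a hypothetical helper, and keying on the major version alone is a simplifying assumption: the announcement only states that 1.1.0 was compiled against hadoop 1.2+ and 1.1.0-hadoop2 against hadoop 2.5+, so older minor releases may still not work.

```shell
#!/bin/sh
# Hypothetical helper (not part of Giraph): choose the Giraph 1.1.0 Maven
# version ID from a Hadoop version string, keyed on the major version only.
giraph_version_for_hadoop() {
  case "$1" in
    1.*) echo "1.1.0" ;;           # compiled against hadoop 1.2+
    2.*) echo "1.1.0-hadoop2" ;;   # compiled against hadoop 2.5+
    *)   echo "unsupported Hadoop version: $1" >&2
         return 1 ;;
  esac
}

# Example: the version to put in the giraph-core dependency for Hadoop 2.5.1.
giraph_version_for_hadoop 2.5.1
```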
[RESULT] [VOTE] Apache Giraph 1.1.0 RC2
Hi! With 3 binding +1s, one non-binding +1, and no 0s or -1s, the vote to publish Apache Giraph 1.1.0 RC2 as the 1.1.0 release of Apache Giraph passes. Thanks to everybody who spent time on validating the bits! The vote tally is +1s: Claudio Martella (binding) Maja Kabiljo (binding) Eli Reisman (binding) Roman Shaposhnik (non-binding) I'll do the publishing tonight and will send an announcement! Thanks, Roman (AKA 1.1.0 RM) On Thu, Nov 13, 2014 at 5:28 AM, Roman Shaposhnik ro...@shaposhnik.org wrote: This vote is for Apache Giraph, version 1.1.0 release It fixes the following issues: http://s.apache.org/a8X *** Please download, test and vote by Mon 11/17 noon PST Note that we are voting upon the source (tag): release-1.1.0-RC2 Source and binary files are available at: http://people.apache.org/~rvs/giraph-1.1.0-RC2/ Staged website is available at: http://people.apache.org/~rvs/giraph-1.1.0-RC2/site/ Maven staging repo is available at: https://repository.apache.org/content/repositories/orgapachegiraph-1003 Please notice, that as per earlier agreement two sets of artifacts are published differentiated by the version ID: * version ID 1.1.0 corresponds to the artifacts built for the hadoop_1 profile * version ID 1.1.0-hadoop2 corresponds to the artifacts built for hadoop_2 profile. The tag to be voted upon (release-1.1.0-RC2): https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=log;h=refs/tags/release-1.1.0-RC2 The KEYS file containing PGP keys we use to sign the release: http://svn.apache.org/repos/asf/bigtop/dist/KEYS Thanks, Roman.
Re: [VOTE] Apache Giraph 1.1.0 RC2
On Thu, Nov 13, 2014 at 5:28 AM, Roman Shaposhnik ro...@shaposhnik.org wrote: This vote is for Apache Giraph, version 1.1.0 release +1 (non-binding) Thanks, Roman.
[VOTE] Apache Giraph 1.1.0 RC2
This vote is for Apache Giraph, version 1.1.0 release It fixes the following issues: http://s.apache.org/a8X *** Please download, test and vote by Mon 11/17 noon PST Note that we are voting upon the source (tag): release-1.1.0-RC2 Source and binary files are available at: http://people.apache.org/~rvs/giraph-1.1.0-RC2/ Staged website is available at: http://people.apache.org/~rvs/giraph-1.1.0-RC2/site/ Maven staging repo is available at: https://repository.apache.org/content/repositories/orgapachegiraph-1003 Please notice, that as per earlier agreement two sets of artifacts are published differentiated by the version ID: * version ID 1.1.0 corresponds to the artifacts built for the hadoop_1 profile * version ID 1.1.0-hadoop2 corresponds to the artifacts built for hadoop_2 profile. The tag to be voted upon (release-1.1.0-RC2): https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=log;h=refs/tags/release-1.1.0-RC2 The KEYS file containing PGP keys we use to sign the release: http://svn.apache.org/repos/asf/bigtop/dist/KEYS Thanks, Roman.
Re: [VOTE] Apache Giraph 1.1.0 RC1
Hi! As per the prior suggestion, I've recut RC2 with only one extra fix included: https://issues.apache.org/jira/browse/GIRAPH-961 Given the scope of the fix, I firmly believe that the testing we've conducted so far should be 100% applicable to the new RC. Still, I started a new vote thread to keep things clean. Claudio, can you please recast your vote in that other new thread? Maja, Eugene, we really need you guys to vote. Please do so. We're sooo close to finally pushing 1.1.0 out. Thanks, Roman. On Mon, Nov 10, 2014 at 4:01 AM, Claudio Martella claudio.marte...@gmail.com wrote: Yes, I did re-run the build this weekend, and it built successfully for the default profile and the hadoop_2 one. I ran a couple of examples on the cluster, and they ran successfully. I'm +1. On Tue, Nov 4, 2014 at 8:10 PM, Roman Shaposhnik ro...@shaposhnik.org wrote: On Tue, Nov 4, 2014 at 5:47 AM, Claudio Martella claudio.marte...@gmail.com wrote: I am indeed having some problems. mvn install will fail because the test is opening too many files: [snip] I have to investigate why this happens. I'm not using a different ulimit than what I have on my Mac OS X by default. Where are you building yours? This is really weird.
I have no issues whatsoever on Mac OS X with the following setup: $ uname -a Darwin usxxshaporm1.corp.emc.com 12.4.1 Darwin Kernel Version 12.4.1: Tue May 21 17:04:50 PDT 2013; root:xnu-2050.40.51~1/RELEASE_X86_64 x86_64 $ ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited file size (blocks, -f) unlimited max locked memory (kbytes, -l) unlimited max memory size (kbytes, -m) unlimited open files (-n) 2560 pipe size(512 bytes, -p) 1 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 709 virtual memory (kbytes, -v) unlimited $ mvn --version Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 2014-08-11T13:58:10-07:00) Maven home: /Users/shapor/dist/apache-maven-3.2.3 Java version: 1.7.0_51, vendor: Oracle Corporation Java home: /Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home/jre Default locale: en_US, platform encoding: UTF-8 OS name: mac os x, version: 10.8.4, arch: x86_64, family: mac Thanks, Roman. -- Claudio Martella
Re: [VOTE] Apache Giraph 1.1.0 RC2
On Thu, Nov 13, 2014 at 5:32 AM, Panagiotis Eustratiadis ep.pan@gmail.com wrote: Was the issue with the aggregators on release-1.1 fixed? The only extra fix in this latest RC is: https://issues.apache.org/jira/browse/GIRAPH-961 Thanks, Roman.
Re: Differences in building Giraph
At this point you either need Hadoop 2.5+ or you need to manually hack the two offending files. Thanks, Roman. On Thu, Nov 6, 2014 at 9:44 PM, Sundara Raghavan Sankaran sun...@crayondata.com wrote: Hi Giraph Experts, I'm trying to build Giraph (trunk) for Hadoop 2.4.0. I know that there are 2 profiles with which I can proceed to build: the hadoop_yarn and hadoop_2 profiles. When I try with hadoop_yarn, the build is fine. When I try with hadoop_2, the build fails. The commands I used are mvn -Phadoop_yarn -Dhadoop.version=2.4.0 -DskipTests clean install mvn -Phadoop_2 -Dhadoop.version=2.4.0 -DskipTests clean install I get the following error while building with hadoop_2 profile [INFO] - [INFO] [INFO] Reactor Summary: [INFO] [INFO] Apache Giraph Parent ... SUCCESS [ 5.758 s] [INFO] Apache Giraph Core . FAILURE [ 9.259 s] [INFO] Apache Giraph Examples . SKIPPED [INFO] Apache Giraph Accumulo I/O . SKIPPED [INFO] Apache Giraph HBase I/O SKIPPED [INFO] Apache Giraph HCatalog I/O . SKIPPED [INFO] Apache Giraph Hive I/O . SKIPPED [INFO] Apache Giraph Gora I/O . SKIPPED [INFO] Apache Giraph Rexster I/O .. SKIPPED [INFO] Apache Giraph Rexster Kibble ... SKIPPED [INFO] Apache Giraph Rexster I/O Formats .. SKIPPED [INFO] Apache Giraph Distribution .
SKIPPED [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 15.722 s [INFO] Finished at: 2014-11-07T11:06:36+05:30 [INFO] Final Memory: 48M/558M [INFO] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.0:compile (default-compile) on project giraph-core: Compilation failure: Compilation failure: [ERROR] /home/sundar/giraph/giraph-core/src/main/java/org/apache/giraph/comm/netty/SaslNettyClient.java:[93,32] getDefaultProperties() has protected access in org.apache.hadoop.security.SaslPropertiesResolver [ERROR] /home/sundar/giraph/giraph-core/src/main/java/org/apache/giraph/comm/netty/SaslNettyServer.java:[113,32] getDefaultProperties() has protected access in org.apache.hadoop.security.SaslPropertiesResolver [ERROR] - [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn goals -rf :giraph-core Can someone help me understand why this is happening with the profile change? I would understand it if the build failed when the Hadoop version changed. -- Thanks, Sundar
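The advice at the top of this message (the hadoop_2 profile needs Hadoop 2.5+) can be turned into a quick pre-build check. The following is only a sketch: `hadoop_at_least` and `check_hadoop_2_profile` are hypothetical helpers, and the mvn command line is the one quoted in the thread, echoed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: succeed when version $1 (X.Y[.Z][-suffix]) is >= $2.$3.
hadoop_at_least() {
  major=${1%%.*}      # text before the first dot
  rest=${1#*.}        # text after the first dot
  minor=${rest%%.*}   # second numeric component
  [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

# Hypothetical helper: print the build command for the hadoop_2 profile, or a
# warning when the requested Hadoop version predates 2.5.
check_hadoop_2_profile() {
  if hadoop_at_least "$1" 2 5; then
    echo "mvn -Phadoop_2 -Dhadoop.version=$1 -DskipTests clean install"
  else
    echo "Hadoop $1 is older than 2.5: expect SaslNettyClient/SaslNettyServer compile errors" >&2
    return 1
  fi
}

check_hadoop_2_profile 2.5.1
```

A vendor version such as 2.3.0-cdh5.1.2 is compared on its leading 2.3 only, so it is reported as too old by this check.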
Re: Compiling Giraph 1.1
What's the exact compilation incantation you use? Thanks, Roman. On Tue, Nov 4, 2014 at 9:56 AM, Ryan freelanceflashga...@gmail.com wrote: I'm attempting to build, compile and install Giraph 1.1 on a server running CDH5.1.2. A few weeks ago I successfully compiled it by changing the hadoop_2 profile version to be 2.3.0-cdh5.1.2. I recently did a fresh install and was unable to build, compile and install (perhaps due to the latest code updates). The error seems to be related to the SaslNettyClient and SaslNettyServer. Any idea on fixes? Here's part of the error log: [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.0:compile (default-compile) on project giraph-core: Compilation failure: Compilation failure: [ERROR] /[myPath]/giraph/giraph-core/src/main/java/org/apache/giraph/comm/netty/SaslNettyClient.java:[28,34] cannot find symbol [ERROR] symbol: class SaslPropertiesResolver [ERROR] location: package org.apache.hadoop.security ... [ERROR] /[myPath]/giraph/giraph-core/src/main/java/org/apache/giraph/comm/netty/SaslNettyServer.java:[108,11] cannot find symbol [ERROR] symbol: variable SaslPropertiesResolver [ERROR] location: class org.apache.giraph.comm.netty.SaslNettyServer
Re: [VOTE] Apache Giraph 1.1.0 RC1
On Mon, Nov 3, 2014 at 4:51 PM, Maja Kabiljo majakabi...@fb.com wrote: We've been running code which is the same as release candidate plus fix on GIRAPH-961 in production for 5 days now, no problems. This is hadoop_facebook profile, using only hive-io from all io modules. Great! This tells me that once I cut RC2 with GIRAPH-961 you guys will be ready to vote! Thanks, Roman.
Re: [VOTE] Apache Giraph 1.1.0 RC1
On Tue, Nov 4, 2014 at 5:47 AM, Claudio Martella claudio.marte...@gmail.com wrote: I am indeed having some problems. mvn install will fail because the test is opening too many files: [snip] I have to investigate why this happens. I'm not using a different ulimit than what I have on my Mac OS X by default. Where are you building yours? This is really weird. I have no issues whatsoever on Mac OS X with the following setup: $ uname -a Darwin usxxshaporm1.corp.emc.com 12.4.1 Darwin Kernel Version 12.4.1: Tue May 21 17:04:50 PDT 2013; root:xnu-2050.40.51~1/RELEASE_X86_64 x86_64 $ ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited file size (blocks, -f) unlimited max locked memory (kbytes, -l) unlimited max memory size (kbytes, -m) unlimited open files (-n) 2560 pipe size(512 bytes, -p) 1 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 709 virtual memory (kbytes, -v) unlimited $ mvn --version Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4; 2014-08-11T13:58:10-07:00) Maven home: /Users/shapor/dist/apache-maven-3.2.3 Java version: 1.7.0_51, vendor: Oracle Corporation Java home: /Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home/jre Default locale: en_US, platform encoding: UTF-8 OS name: mac os x, version: 10.8.4, arch: x86_64, family: mac Thanks, Roman.
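For anyone comparing their environment against the one above, a small check of the open-files limit before running the tests might look like the sketch below. `has_enough_open_files` is a hypothetical helper, and 2560 is simply the `open files (-n)` value from the setup where the build passed, not a documented Giraph requirement.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the limit in $1 is "unlimited" or >= $2.
has_enough_open_files() {
  [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

limit=$(ulimit -n)   # current soft limit on open file descriptors
if has_enough_open_files "$limit" 2560; then
  echo "open files limit ($limit) matches or exceeds the known-good setup"
else
  echo "open files limit ($limit) is below 2560; try raising it with ulimit -n"
fi
```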
Re: [VOTE] Apache Giraph 1.1.0 RC1
Ping! Any progress on testing the current RC? Thanks, Roman. On Fri, Oct 31, 2014 at 9:00 AM, Claudio Martella claudio.marte...@gmail.com wrote: Oh, thanks for the info! On Fri, Oct 31, 2014 at 3:06 PM, Roman Shaposhnik ro...@shaposhnik.org wrote: On Fri, Oct 31, 2014 at 3:26 AM, Claudio Martella claudio.marte...@gmail.com wrote: Hi Roman, thanks again for this. I have had a look at the staging site so far (our cluster has been down whole week... universities...), and I was wondering if you have an insight why some of the docs are missing, e.g. gora and rexster documentation. None of them are missing. The links moved to a User Docs - Modules though: http://people.apache.org/~rvs/giraph-1.1.0-RC1/site/gora.html http://people.apache.org/~rvs/giraph-1.1.0-RC1/site/rexster.html and so forth. Thanks, Roman. -- Claudio Martella
Re: Issue with Giraph on multinode cluster
Please create a JIRA and attach your patch to it. Thanks, Roman. On Mon, Oct 20, 2014 at 2:23 AM, Bojan Babic gba...@gmail.com wrote: I've made a patch that worked for me. Not sure if I should file a JIRA issue. The hack is attached. On Fri, Oct 17, 2014 at 5:52 PM, Bojan Babic gba...@gmail.com wrote: I'm using giraph 1.1.0-SNAPSHOT for hadoop 1.2.1 On Fri, Oct 17, 2014 at 4:01 PM, Bojan Babic gba...@gmail.com wrote: Hi guys, I risk posting an issue that has already been raised, but I'll take the risk of being ridiculed :) I have a small Hadoop cluster on Digital Ocean (1 master 4 nodes). I was able to set up the cluster and run the word count example as well as the single node sample from Quick start. As I introduce more nodes into play, I get an issue where the TaskTracker spawns Child processes hduser@hdnode-2:~# jps 13839 TaskTracker 13697 DataNode 14067 Jps 13962 Child 13961 Child that listen on the loopback interface Proto Recv-Q Send-Q Local Address Foreign Address State User Inode PID/Program name tcp0 0 127.0.0.1:1337 0.0.0.0:* LISTEN root 2154492529912/python tcp0 0 0.0.0.0:50010 0.0.0.0:* LISTEN hduser 2169155213697/java tcp0 0 127.0.0.1:30011 0.0.0.0:* LISTEN hduser 2169357813962/java tcp0 0 0.0.0.0:50075 0.0.0.0:* LISTEN hduser 2169155413697/java tcp0 0 0.0.0.0:50020 0.0.0.0:* LISTEN hduser 2169155713697/java tcp0 0 127.0.0.1:50118 0.0.0.0:* LISTEN hduser 2169187013839/java tcp0 0 0.0.0.0:41640 0.0.0.0:* LISTEN hduser 2169129613697/java tcp0 0 127.0.0.1:31337 0.0.0.0:* LISTEN root 204326601514/python tcp0 0 0.0.0.0:50060 0.0.0.0:* LISTEN hduser 2169214413839/java tcp0 0 0.0.0.0:http-alt0.0.0.0:* LISTEN root 204318971421/python tcp0 0 127.0.0.1:30001 0.0.0.0:* LISTEN hduser 213700047856/ssh tcp0 0 127.0.0.1:30003 0.0.0.0:* LISTEN hduser 2169356213961/java tcp0 0 127.0.0.1:58741 0.0.0.0:* LISTEN hduser 21377856/ssh tcp0 0 127.0.0.1:58742 0.0.0.0:* LISTEN hduser 213699827845/autossh tcp0 0 0.0.0.0:ssh 0.0.0.0:* LISTEN root 9130834/sshd tcp6 0 0 ::1:30001 :::* LISTEN hduser
213700037856/ssh tcp6 0 0 ::1:58741 :::* LISTEN hduser 21367856/ssh tcp6 0 0 :::ssh :::* LISTEN root 9165834/sshd instead of all interfaces (0.0.0.0) This results in node being unreachable from other nodes. ie hdnode02: 2014-10-17 14:10:31,146 WARN org.apache.giraph.comm.netty.NettyClient: 2014-10-17 14:10:31,159 WARN org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Future failed to connect with hdnode-2/XXX.XXX.XXX.XXX:30003 with 1 failures because of java.net.ConnectException: Connection refused: hdnode-2/XXX.XXX.XXX.XXX:30003 2014-10-17 14:10:31,159 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Successfully added 1 connections, (1 total connected) 2 failed, 2 failures total. If I stop all processes and start nc on 30003, I can telnet to hdnode2. Question here is if there is any setup that will configure Child process to listen on 0.0.0.0 instead of loopback interface? Thanks in advance -- Bojan Babic, M.Sc.E.E Software developer twitter: @bojanbabic mobile: +1312 8602944 -- Bojan Babic, M.Sc.E.E Software developer twitter: @bojanbabic mobile: +1312 8602944
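The diagnosis in this thread (worker ports bound to 127.0.0.1 instead of 0.0.0.0) can be checked mechanically. The sketch below filters `netstat -tln`-style rows for listeners bound to a loopback address; `loopback_listeners` is a hypothetical helper, it assumes the usual column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State), and the sample rows are cleaned-up illustrations modeled on the listing above rather than exact copies.

```shell
#!/bin/sh
# Hypothetical helper: read netstat -tln style rows on stdin and print the
# local address of every TCP socket that is LISTENing on a loopback address.
loopback_listeners() {
  awk '$1 ~ /^tcp/ && $6 == "LISTEN" && ($4 ~ /^127\./ || $4 ~ /^::1:/) { print $4 }'
}

# Illustrative input: one loopback-only Giraph port, one DataNode port bound
# to all interfaces, and one IPv6 loopback listener.
printf '%s\n' \
  'tcp 0 0 127.0.0.1:30003 0.0.0.0:* LISTEN' \
  'tcp 0 0 0.0.0.0:50010 0.0.0.0:* LISTEN' \
  'tcp6 0 0 ::1:30001 :::* LISTEN' | loopback_listeners
```

On a live node the same function can be fed from `netstat -tln`; any Giraph worker port that shows up in its output is unreachable from the other nodes.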
Re: [VOTE] Apache Giraph 1.1.0 RC1
On Fri, Oct 31, 2014 at 3:26 AM, Claudio Martella claudio.marte...@gmail.com wrote: Hi Roman, thanks again for this. I have had a look at the staging site so far (our cluster has been down whole week... universities...), and I was wondering if you have an insight why some of the docs are missing, e.g. gora and rexster documentation. None of them are missing. The links moved to a User Docs - Modules though: http://people.apache.org/~rvs/giraph-1.1.0-RC1/site/gora.html http://people.apache.org/~rvs/giraph-1.1.0-RC1/site/rexster.html and so forth. Thanks, Roman.
[VOTE] Apache Giraph 1.1.0 RC1
This vote is for Apache Giraph, version 1.1.0 release It fixes the following issues: http://s.apache.org/a8X *** Please download, test and vote by Mon 11/3 noon PST Note that we are voting upon the source (tag): release-1.1.0-RC1 Source and binary files are available at: http://people.apache.org/~rvs/giraph-1.1.0-RC1/ Staged website is available at: http://people.apache.org/~rvs/giraph-1.1.0-RC1/site/ Maven staging repo is available at: https://repository.apache.org/content/repositories/orgapachegiraph-1002 Please notice, that as per earlier agreement two sets of artifacts are published differentiated by the version ID: * version ID 1.1.0 corresponds to the artifacts built for the hadoop_1 profile * version ID 1.1.0-hadoop2 corresponds to the artifacts built for hadoop_2 profile. The tag to be voted upon (release-1.1.0-RC1): https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=commit;h=1f0fc23c26ce3addb746e3e57cc155f82afbab87 The KEYS file containing PGP keys we use to sign the release: http://svn.apache.org/repos/asf/bigtop/dist/KEYS Thanks, Roman.
Re: Empty Output when running Shortest Path Algorithm !!
I think this should give you some clue: 14/09/21 13:34:03 WARN job.GiraphConfigurationValidator: Output format vertex index type is not known 14/09/21 13:34:03 WARN job.GiraphConfigurationValidator: Output format vertex value type is not known 14/09/21 13:34:03 WARN job.GiraphConfigurationValidator: Output format edge value type is not known At this point, what I would recommend though is to try the same experiment with our Giraph 1.1.0 RC0 available from the following URL: http://people.apache.org/~rvs/giraph-1.1.0-RC0/giraph-dist-1.1.0-bin.tar.gz Thanks, Roman.
Re: Are the unit tests supposed to fail?
On Mon, Jul 7, 2014 at 4:50 PM, Toshio ITO toshio9@toshiba.co.jp wrote: Hi Roman, I previously reported some cases where Giraph unit tests failed. http://mail-archives.apache.org/mod_mbox/giraph-user/201406.mbox/%3c87a990dkni.wl%25toshio9@toshiba.co.jp%3E This thread talks about the hadoop_0.20.203 profile. I am not sure this version of Hadoop gets a lot of attention in the Giraph community. Personally, I'd definitely not treat any failures in that profile as release blockers. http://mail-archives.apache.org/mod_mbox/giraph-user/201407.mbox/%3C8761jgqjec.wl%25toshio9.ito%40toshiba.co.jp%3E This one is more interesting. I can't reproduce your failures in Rexster I/O Format in my environment. As for -Dprop.mapred.job.tracker=localhost:54311 -- I've never seen anybody running Giraph unit tests that way. You're right that in theory it should work and it would be useful for us to understand why it fails. I may be able to look at it, but since it happens to be a pretty unorthodox way of running unit tests, I don't think it'll be a release blocker all by itself. Because it seems I'm almost the only one in the user mailing list who cares about the unit tests I think all of us do, but the thing is -- we run them as pure unit tests -- you run them as a combination of unit/system tests. If you can make them work both ways that would be appreciated regardless of whether your fixes end up in 1.1.0 or not. I just wonder whether it is normal (or expected) for the unit tests to fail at this stage of development (release-1.1.0-RC0). Pure unit tests are definitely expected to pass. They do pass on our Jenkins, hence my suspicion that something is wrong with your env. Thanks, Roman.
Re: Giraph 1.1.0 and Jetty 7
On Fri, Jul 4, 2014 at 9:11 AM, Carlo Sartiani sarti...@gmail.com wrote:
> We examined the code of Giraph 1.1.0, but we actually did not find any place where a Collector object is created and/or Jetty is really used. Do you have any idea on how to solve this issue? Please, observe that we are forced to use this Hadoop distribution and cannot switch to a plain Hadoop 2.2.0 distribution with Jetty 6.

Here's what I'd suggest: take the 1.1.0-RC0 source code and manually change the version of Jetty in the top-level pom.xml. Rebuild everything with -Phadoop_2 and also specify the exact version of your Hadoop with -Dhadoop.version=X.Y.Z for good measure.

Thanks, Roman.
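A minimal sketch of the rebuild step suggested above. The helper name giraph_build_cmd is hypothetical, and it only prints the Maven invocation so it can be inspected before running. The profile and the -Dhadoop.version property are the ones named in the reply; -DskipTests and the clean package goals are borrowed from build commands quoted elsewhere on this list. The Jetty version still has to be edited by hand in the top-level pom.xml first; the property to change is not named in this thread, so it is not shown here.

```shell
# Hypothetical helper: print the Maven command for rebuilding Giraph
# against a specific Hadoop version. It does not run Maven itself.
giraph_build_cmd() {
  profile="$1"          # Maven profile, e.g. hadoop_2
  hadoop_version="$2"   # exact version of the target Hadoop distribution
  echo "mvn -P${profile} -Dhadoop.version=${hadoop_version} -DskipTests clean package"
}

# Example: print the command for the hadoop_2 profile. Replace X.Y.Z with
# the real version of your distribution before running the result from the
# Giraph source root.
giraph_build_cmd hadoop_2 X.Y.Z
```

Printing the command first makes it easy to double-check the version string, which has to match the Hadoop artifacts of the distribution exactly.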
Re: Giraph (1.1.0-SNAPSHOT and 1.0.0-RC3) unit tests fail
Yes, the failures around Accumulo in hadoop_2 profile are expected and nothing to worry about. I should've probably mentioned it in my RC announcement email. Sorry about that. Any failures in hadoop_1 profile would be a reason to reconsider RC0. Thanks, Roman. P.S. This is one of the reasons we're still running with hadoop_1 as a default profile. On Mon, Jun 30, 2014 at 3:09 AM, Akila Wajirasena akila.wajiras...@gmail.com wrote: Hi Roman, I got the same error when running hadoop_2 profile. According to this [1] the Accumulo version we use in giraph (1.4) is not compatible with Hadoop 2. I think this is the issue. [1] http://apache-accumulo.1065345.n5.nabble.com/Accumulo-Hadoop-version-compatibility-matrix-tp3893p3894.html Thanks Akila On Mon, Jun 30, 2014 at 2:21 PM, Toshio ITO toshio9@toshiba.co.jp wrote: Hi Roman. I checked out release-1.1.0-RC0 and succeeded to build it. $ git checkout release-1.1.0-RC0 $ mvn clean $ mvn package -Phadoop_2 -DskipTests ## SUCCESS However, when I ran the tests with LocalJobRunner, it failed. $ mvn clean $ mvn package -Phadoop_2 It passed tests from Core and Examples, but it failed at Accumulo I/O. testAccumuloInputOutput(org.apache.giraph.io.accumulo.TestAccumuloVertexFormat) The error log contained the following exception java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected Next I wanted to run the tests with a running Hadoop2 instance, but I'm having trouble to set it up (I'm quite new to Hadoop). Could you show me some example configuration (etc/hadoop/* files) of Hadoop 2.2.0 single-node cluster? That would be very helpful. On Sun, Jun 29, 2014 at 5:06 PM, Toshio ITO toshio9@toshiba.co.jp wrote: Hi Roman. Thanks for the reply. OK, I'll try hadoop_1 and hadoop_2 with the latest release-1.1.0-RC0 and report the result. That would be extremely helpful! And speaking of which -- I'd like to remind folks that taking RC0 for a spin would really help at this point. 
If we ever want to have 1.1.0 out we need the required PMC votes. Thanks, Roman. Toshio Ito
Re: Giraph (1.1.0-SNAPSHOT and 1.0.0-RC3) unit tests fail
Please try profiles hadoop_1 and hadoop_2. These are the two profiles that we're targeting for the upcoming 1.1.0 release. The rest of profiles are optional and may or may NOT work. Thanks, Roman. On Thu, Jun 26, 2014 at 3:34 AM, Toshio ITO toshio9@toshiba.co.jp wrote: Hi all, I recently tried Giraph. It compiled fine, but when I did the unit test (i.e. `mvn test`) it failed. I did the test several times with different settings. See below for details. My question is: - Is it normal for Giraph? Is the unit test supposed to fail for now? - If it's not normal, what do you think is wrong? My environment: - Ubuntu Server 14.04 64-bit - Dynamic address by DHCP - Followed the Quick Start guide https://giraph.apache.org/quick_start.html except I used localhost for the node hostname. -- Installed openjdk-7-jdk, git, maven by apt-get -- Installed hadoop-0.20.203.0rc1 as instructed Test settings: (1) 1.1.0-SNAPSHOT (Commit ID: b218d72cedc52467e691c6002e596e482d8583e4) with LocalJobRunner (2) 1.1.0-SNAPSHOT with the running Hadoop instance (3) Tag release-1.0.0-RC3 with LocalJobRunner (4) Tag release-1.0.0-RC3 with the running Hadoop instance Cases (1) and (3) are run by the following commands. $ mvn clean $ mvn package -Phadoop_0.20.203 -DskipTests $ mvn test -Phadoop_0.20.203 Cases (2) and (4) are run by the following commands. $ mvn clean $ mvn package -Phadoop_0.20.203 -DskipTests $ mvn test -Phadoop_0.20.203 -Dprop.mapred.job.tracker=localhost:54311 Test results: (1) Failed at Rexster I/O Format. At some point, the test endlessly tried to connect to ZooKeeper, repeating the following log messages. I had to terminate the test by Ctrl-C. 14/06/26 18:10:27 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:22182. 
Will not attempt to authenticate using SASL (unknown error) 14/06/26 18:10:27 WARN zookeeper.ClientCnxn: Session 0x146d76e16750002 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739) at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068) (2) Failed at Examples. Hadoop was stuck at testBspFail job. Logs like the following line were printed every 5 seconds for 10 minutes. After that, the test endlessly tried to connect to ZooKeeper, just like the case (1). I killed the test process and the Hadoop job. 14/06/26 18:40:30 INFO job.JobProgressTracker: Data from 2 workers - Compute superstep 20: 10 out of 10 vertices computed; 6 out of 6 partitions computed; min free memory on worker 1 - 137.42MB, average 149.12MB (3) Failed at HBase I/O in testHBaseInputOutput (org.apache.giraph.io.hbase.TestHBaseRootMarkerVertextFormat). Before it reported failure, it blocked for 10 minutes. (4) Failed at Core in testContinue (org.apache.giraph.io.TestJsonBase64Format) . Hadoop was stuck at the second testContinue job. After 10 minutes, the test went on and reported the failure. The testContinue map task was aborted with the following error. java.io.FileNotFoundException: /tmp/_giraphTests/testContinue/_logs (Is a directory) Thanks in advance. Toshio Ito
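Following the reply at the top of this thread, a minimal sketch of what trying the two targeted profiles might look like. The function name giraph_test_cmds is hypothetical; it only echoes the Maven commands (the same goals as in the original report, with the profile swapped), so nothing is run until you choose to.

```shell
# Print the unit-test command for each profile targeted by the 1.1.0
# release. This only echoes the commands; it does not invoke Maven.
giraph_test_cmds() {
  for profile in hadoop_1 hadoop_2; do
    echo "mvn -P${profile} clean test"
  done
}

giraph_test_cmds
```

Note that these are plain LocalJobRunner runs, without -Dprop.mapred.job.tracker, so they exercise the tests as pure unit tests rather than against a running Hadoop instance.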
Re: Is it possible to know the mapper task a particular vertex is assigned to?
On Wed, Mar 5, 2014 at 9:53 PM, Pankaj Malhotra pankajiit...@gmail.com wrote: Hi, How can I find the mapper task a particular vertex is assigned to? I can do this by doing a sysout and then looking at the logs. But there must be a smarter way to do this. Please suggest. That mapping is not static and can change. In theory you can rely on the info in ZK, but that would be relying on what is, essentially, an implementation detail of Giraph. What's the reason for you to need this info? Thanks, Roman.
Giraph talks at Hadoop Summit
Hi! Not sure if anybody from the Giraph community submitted any talks to Hadoop Summit, but here's the one I submitted: https://hadoopsummit.uservoice.com/forums/242790-committer-track/suggestions/5568061-apache-giraph-start-analyzing-graph-relationships Feel free to upvote if you feel like Giraph deserves to be well represented at Hadoop Summit. Thanks, Roman.
Re: Issue in running giraph example in hadoop2.2
You need to configure Giraph not to split workers and master via giraph.SplitMasterWorker=false

You can either set it in giraph-site.xml or pass it via the command line option -ca giraph.SplitMasterWorker=false

Thanks, Roman.

On Sat, Feb 22, 2014 at 10:19 PM, Arun Kumar toga...@gmail.com wrote:
> While running Giraph over Hadoop 2.2 I am getting the below exception:
>
> 14/02/20 04:52:44 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
> Exception in thread "main" java.lang.IllegalArgumentException: checkLocalJobRunnerConfiguration: When using LocalJobRunner, you cannot run in split master / worker mode since there is only 1 task at a time!
>     at org.apache.giraph.job.GiraphJob.checkLocalJobRunnerConfiguration(GiraphJob.java:157)
>     at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:225)
>     at org.apache.giraph.GiraphRunner.run(GiraphRunner.java:94)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>     at org.apache.giraph.GiraphRunner.main(GiraphRunner.java:124)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>
> One of the solutions found in the archive was to modify /directory-to-hadoop/conf/mapred-site.xml with:
>
> <property>
>   <name>mapred.tasktracker.map.tasks.maximum</name>
>   <value>4</value>
> </property>
> <property>
>   <name>mapred.map.tasks</name>
>   <value>4</value>
> </property>
>
> But in my case it did not work. Can somebody help?
>
> Regards
> -Arun
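A minimal sketch of the giraph-site.xml route mentioned in the reply above. The property name and value come from the reply; the file layout is the standard Hadoop configuration XML format. Writing the file into a scratch directory is an assumption made so the example is harmless to run; where the file must live so that Giraph picks it up depends on your installation and is not stated in this thread.

```shell
# Write a minimal giraph-site.xml into a scratch directory. Copying it to
# a location on the job's configuration path is left to you (assumption:
# that location is installation-specific).
conf_dir="$(mktemp -d)"
cat > "${conf_dir}/giraph-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>giraph.SplitMasterWorker</name>
    <value>false</value>
  </property>
</configuration>
EOF

# Command-line alternative from the reply: append
#   -ca giraph.SplitMasterWorker=false
# to the GiraphRunner arguments instead of editing any file.
echo "wrote ${conf_dir}/giraph-site.xml"
```

The command-line form is handy for a one-off run, while the file form keeps the setting out of every invocation.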
Re: Problem running SimpleShortestPathsComputation on CDH5 with Hadoop 2.2.0
On Wed, Feb 19, 2014 at 1:12 PM, Stefan Beskow stefan.bes...@sas.com wrote:
> Hi. I checked out and built Giraph for Cloudera CDH5 with Hadoop 2.2.0 using the following:
>
> git clone git://git.apache.org/giraph.git snapshot_from_git
> cd snapshot_from_git
> mvn -Phadoop_yarn -Dhadoop.version=2.2.0 clean package -DskipTests

It may help to build against CDH5 directly by:
* manually adding repository.cloudera.com to the set of repos
* specifying -Dhadoop.version=2.2.0-cdh5.0.0-beta-2

> When I run the sample application org.apache.giraph.examples.SimpleShortestPathsComputation I get the following exception:

You need to provide way more logs from the YARN side for us to make sense of it.

Thanks, Roman.
Re: Giraph 1.1.0 Apache Giraph Distribution ........................ FAILURE and Unsupported major.minor version 51.0 error.
Try applying this patch (you may need to apply it with a fuzz): https://issues.apache.org/jira/browse/GIRAPH-794 and let us know if it helps (it should). Thanks, Roman. On Thu, Feb 6, 2014 at 2:23 PM, Rocky Grey rockyg...@gmail.com wrote: Hi I pulled the latest repository from Git - http://git-wip-us.apache.org/repos/asf/giraph.git and tried to package the application using mvn -Phadoop_yarn -Dhadoop.version=2.2.0 -DskipTests clean package The process succeeded partially. Below is the snapshot of the package log. Did anybody else face the same problem ? ... ... [INFO] [INFO] Building Apache Giraph Distribution 1.1.0-SNAPSHOT [INFO] [WARNING] The POM for org.apache.giraph:giraph-hbase:jar:1.1.0-SNAPSHOT is missing, no dependency information available [WARNING] The POM for org.apache.giraph:giraph-accumulo:jar:1.1.0-SNAPSHOT is missing, no dependency information available [WARNING] The POM for org.apache.giraph:giraph-hcatalog:jar:1.1.0-SNAPSHOT is missing, no dependency information available [WARNING] The POM for org.apache.giraph:giraph-hive:jar:1.1.0-SNAPSHOT is missing, no dependency information available [INFO] [INFO] Reactor Summary: [INFO] [INFO] Apache Giraph Parent .. 
SUCCESS [4.283s] [INFO] Apache Giraph Core SUCCESS [35.053s] [INFO] Apache Giraph Examples SUCCESS [12.905s] [INFO] Apache Giraph Distribution FAILURE [0.110s] [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 52.782s [INFO] Finished at: Thu Feb 06 13:17:41 PST 2014 [INFO] Final Memory: 41M/286M [INFO] [ERROR] Failed to execute goal on project giraph-dist: Could not resolve dependencies for project org.apache.giraph:giraph-dist:pom:1.1.0-SNAPSHOT: The following artifacts could not be resolved: org.apache.giraph:giraph-hbase:jar:1.1.0-SNAPSHOT, org.apache.giraph:giraph-accumulo:jar:1.1.0-SNAPSHOT, org.apache.giraph:giraph-hcatalog:jar:1.1.0-SNAPSHOT, org.apache.giraph:giraph-hive:jar:1.1.0-SNAPSHOT: Failure to find org.apache.giraph:giraph-hbase:jar:1.1.0-SNAPSHOT in http://repo1.maven.org/maven2 was cached in the local repository, resolution will not be reattempted until the update interval of central has elapsed or updates are forced - [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn goals -rf :giraph-dist I tried to rerun the process multiple times but got the same error. On a second note when I tried to test the partially successful installation like mentioned below # hadoop jar $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.2.0-jar-with-dependencies.jar org.apache.giraph.GiraphRunner -h It failed with the following error. 
Exception in thread main java.lang.UnsupportedClassVersionError: org/apache/giraph/GiraphRunner : Unsupported major.minor version 51.0 at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631) at java.lang.ClassLoader.defineClass(ClassLoader.java:615) at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141) at java.net.URLClassLoader.defineClass(URLClassLoader.java:283) at java.net.URLClassLoader.access$000(URLClassLoader.java:58) at java.net.URLClassLoader$1.run(URLClassLoader.java:197) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.util.RunJar.main(RunJar.java:205) Any help would be appreciated. Thanks.
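As background for the second error in this thread: "Unsupported major.minor version 51.0" is what a JVM reports when asked to load a class compiled for a newer Java release than itself, and class-file major version 51 corresponds to Java 7. The small sketch below shows the mapping; the function name class_major_to_java is hypothetical, and the arithmetic is valid for Java 5 and later.

```shell
# Map a class-file major version to the Java release that targets it.
# Valid for Java 5 and later: 49 -> Java 5, 50 -> Java 6, 51 -> Java 7.
class_major_to_java() {
  echo $(( $1 - 44 ))
}

# The version from the error message above.
class_major_to_java 51
```

Comparing that number with the output of java -version on the machine running the hadoop jar command shows whether the runtime is older than the JDK used for the build. This note does not say whether the GIRAPH-794 patch is related to that error; the thread does not spell it out.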
Re: Release date for 1.1.0
On Thu, Jan 30, 2014 at 5:04 PM, Avery Ching ach...@apache.org wrote: I've upgraded HBase in Giraph-833 ( HBase 0.90.5- 0.94.16) https://issues.apache.org/jira/browse/GIRAPH-833 Any reason it is not marked as for 1.1.0? Thanks, Roman.
Re: Release date for 1.1.0
It is the usual community-driven ASF process. Somebody familiar with the project has to step forward as a Release Manager and drive the release. I did a few months back, but since then I went through a career change that made it very difficult for me to find free cycles to drive this release. I fully intend to pick up the slack at the beginning of Feb. Given that, I think the beginning of March should be a realistic deadline, but it all depends on the availability of the Giraph PMC members to cast votes on the release candidate.

That said, if there's anybody else who would want to speed up this release I'd be more than happy to yield. By and large though, ASF projects typically don't give any schedule for future releases. The way to speed it up is to join the community, start contributing and volunteering as RM.

Thanks, Roman.

On Wed, Jan 15, 2014 at 5:02 PM, Zhu, Xia xia@intel.com wrote: Is it possible to release 1.1.0 before March 2014? Thanks, Xia

-Original Message- From: Zhu, Xia [mailto:xia@intel.com] Sent: Wednesday, January 15, 2014 4:36 PM To: user@giraph.apache.org Subject: RE: Release date for 1.1.0
May I know what are the Giraph release process? Thanks, Ivy

-Original Message- From: shaposh...@gmail.com [mailto:shaposh...@gmail.com] On Behalf Of Roman Shaposhnik Sent: Monday, January 06, 2014 9:22 PM To: user@giraph.apache.org Subject: Re: Release date for 1.1.0
On Mon, Jan 6, 2014 at 6:13 AM, Ahmet Emre Aladağ aladage...@gmail.com wrote: Hi, Are there any advances so far on the 1.1.0 release schedule? Unfortunately, with my recent job change driving 1.1.0 release dropped from my list. I'll try to pick it up back this month. Still very much would like to help make it happen. Thanks, Roman.
Re: Release date for 1.1.0
Forgot to add, one way you can help is to look through the list of JIRAs currently blocking the release: https://issues.apache.org/jira/browse/GIRAPH-819?jql=project%3Dgiraph%20and%20fixversion%3D%221.1.0%22%20and%20status%20!%3D%20closed%20and%20status%20!%3D%20resolved and helping to triage it. Thanks, Roman. On Wed, Jan 15, 2014 at 5:02 PM, Zhu, Xia xia@intel.com wrote: Is it possible to release 1.1.0 before March 2014? Thanks, Xia -Original Message- From: Zhu, Xia [mailto:xia@intel.com] Sent: Wednesday, January 15, 2014 4:36 PM To: user@giraph.apache.org Subject: RE: Release date for 1.1.0 May I know what are the Giraph release process? Thanks, Ivy -Original Message- From: shaposh...@gmail.com [mailto:shaposh...@gmail.com] On Behalf Of Roman Shaposhnik Sent: Monday, January 06, 2014 9:22 PM To: user@giraph.apache.org Subject: Re: Release date for 1.1.0 On Mon, Jan 6, 2014 at 6:13 AM, Ahmet Emre Aladağ aladage...@gmail.com wrote: Hi, Are there any advances so far on the 1.1.0 release schedule? Unfortunately, with my recent job change driving 1.1.0 release dropped from my list. I'll try to pick it up back this month. Still very much would like to help make it happen. Thanks, Roman.
Re: Preconfigured BigTop running Giraph
Hi Martin, sorry for the belated reply. I am wondering how you configured your cluster. Did you use Bigtop's Puppet recipes or did you do it by hand? It seems that Giraph is working fine on the toy cluster that I'm deploying with the Bigtop bits. But I'm using Puppet and the topology is really simple. Basically, without seeing your Hadoop and Giraph config files it's pretty tough to answer your question in any greater detail.

Thanks, Roman.

On Tue, Jan 7, 2014 at 5:44 AM, Martin Neumann mneum...@spotify.com wrote:
> Hej, I installed Giraph from the apache BigTop project and want to try to run some Giraph jobs locally on my machine. If I understood the website correctly it should be preconfigured to do so. But when I run a Giraph Job I get the following exception:
>
> Exception in thread "main" java.lang.IllegalStateException: Giraph's estimated cluster heap 2048MB ask is greater than the current available cluster heap of 0MB. Aborting Job.
>
> To me it sounds like a configuration problem, I'm not even sure if it comes from Giraph or from Yarn. If it is a configuration issue is there help page that tells you to configure the environment correctly? Thanks for the help.
>
> here the log of the whole execution:
>
> spotify@spotify-ThinkPad-T430s:~$ giraph /home/spotify/workspace/GiraphExe/SpotifyConComp.jar spotifyConnectedComponent.ConComp -eif spotifyConnectedComponent.ConCompInput -of spotifyConnectedComponent.ConCompOutput -c spotifyConnectedComponent.MinTextCombiner -eip /home/spotify/workspace/GiraphExe/in/sample -op /home/spotify/workspace/GiraphExe/out/spotifyConnectedComponent.ConComp -w 1
> HADOOP_CONF_DIR=/etc/hadoop/conf
> SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/lib/giraph/lib/slf4j-log4j12-1.7.2.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 14/01/07 14:40:35 INFO utils.ConfigurationUtils: No vertex input format specified. Ensure your InputFormat does not require one. 14/01/07 14:40:35 INFO yarn.GiraphYarnClient: Final output path is: hdfs://localhost:8020/home/spotify/workspace/GiraphExe/out/spotifyConnectedComponent.ConComp 14/01/07 14:40:35 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited. 14/01/07 14:40:35 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is started. 14/01/07 14:40:35 INFO yarn.GiraphYarnClient: Defaulting per-task heap size to 1024MB. Exception in thread main java.lang.IllegalStateException: Giraph's estimated cluster heap 2048MB ask is greater than the current available cluster heap of 0MB. Aborting Job. at org.apache.giraph.yarn.GiraphYarnClient.checkPerNodeResourcesAvailable(GiraphYarnClient.java:204) at org.apache.giraph.yarn.GiraphYarnClient.run(GiraphYarnClient.java:114) at org.apache.giraph.GiraphRunner.run(GiraphRunner.java:96) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.giraph.GiraphRunner.main(GiraphRunner.java:126) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
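One way to read the numbers in the log above, offered as a guess rather than a statement about Giraph's internals: the job was started with -w 1, the client defaulted to 1024MB per task, and the ask was 2048MB. That is consistent with one task per worker plus one extra task (presumably the master). The sketch below just reproduces that arithmetic; the function name estimated_heap_mb and the "workers + 1" formula are assumptions inferred from this log, not taken from the Giraph source.

```shell
# Assumed formula, inferred from the log: (workers + 1) tasks, each
# asking for the per-task heap size.
estimated_heap_mb() {
  workers="$1"   # value passed to -w
  heap_mb="$2"   # per-task heap in MB (1024 by default, per the log)
  echo $(( (workers + 1) * heap_mb ))
}

# -w 1 with the default 1024MB per task gives the 2048MB from the log.
estimated_heap_mb 1 1024
```

The "available cluster heap of 0MB" side of the comparison is the part that looks unusual; it may mean YARN reported no usable node memory at submission time, which would point at the cluster setup rather than the job, but that is only a guess without the configuration files.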
Re: giraph-hive having problem with cdh-4.4 (hadoop2.0)
I could bet this is because the default version of Hive you're pulling is compiled against Hadoop 1, not Hadoop from CDH. If you want to run against a CDH cluster -- you have to make sure you change the versions of all dependencies to be CDH ones (take a look at the properties section of Giraph's root pom file). Thanks, Roman. On Mon, Nov 18, 2013 at 6:11 PM, Ping Jin jinpi...@gmail.com wrote: hi, I'm a new user of giraph. I'm trying to setup giraph-hive to work with the cdh4.4 Hive and Hadoop. I successfully built and run the SimpleShortestPath job on cdh-4.4 Hadoop cluster. However when I try to setup Giraph-Hive and run a job through GiraphHiveRunner, I got following exceptions: Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected at com.facebook.hiveio.output.HiveApiOutputFormat.checkOutputSpecs(HiveApiOutputFormat.java:247) at org.apache.giraph.hive.output.HiveVertexOutputFormat.checkOutputSpecs(HiveVertexOutputFormat.java:108) at org.apache.giraph.io.internal.WrappedVertexOutputFormat.checkOutputSpecs(WrappedVertexOutputFormat.java:104) at org.apache.giraph.bsp.BspOutputFormat.checkOutputSpecs(BspOutputFormat.java:52) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:984) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945) at org.apache.hadoop.mapreduce.Job.submit(Job.java:566) at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:246) at org.apache.giraph.hive.HiveGiraphRunner.run(HiveGiraphRunner.java:275) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at
org.apache.giraph.hive.HiveGiraphRunner.main(HiveGiraphRunner.java:246) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:208) Any one can help me figure out what's going wrong? Thanks, -Ping
Re: Issue Running Giraph Job
Do you have a dedicated Zookeeper ensemble running on your cluster? It feels like something is conflicting on ports with an embedded one. Thanks, Roman. On Wed, Oct 30, 2013 at 12:48 PM, Artie Pesh-Imam artie.pesh-i...@tapad.com wrote: Hi all, I'm able to run this job locally but when I try to run against our cluster, the job seems to fail. Here's the log entries. It looks like it's not proceeding beyond superstep 0. Is there anyway to get a more detailed picture of what's going on? 2013-10-30 19:38:04,467 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead 2013-10-30 19:38:05,157 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id 2013-10-30 19:38:05,158 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId= 2013-10-30 19:38:05,634 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0 2013-10-30 19:38:05,639 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@371e88fb 2013-10-30 19:38:06,029 INFO org.apache.hadoop.mapred.MapTask: Processing split: 'org.apache.giraph.bsp.BspInputSplit, index=-1, num=-1 2013-10-30 19:38:06,047 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info 2013-10-30 19:38:06,117 INFO org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar. 
2013-10-30 19:38:06,117 INFO org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /d02/dfs/dn/tasktracker/taskTracker/artie.pesh-imam/jobcache/job_201310301559_0024/jars/job.jar for job jobs.ConnectedComponents 2013-10-30 19:38:06,155 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201310301559_0024 2013-10-30 19:38:06,166 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201310301559_0024/_task/datanode05.prd.nj1.tapad.com 0 2013-10-30 19:38:06,216 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Got [datanode05.prd.nj1.tapad.com] 1 hosts from 1 candidates when 1 required (polling period is 3000) on attempt 0 2013-10-30 19:38:06,217 INFO org.apache.giraph.zk.ZooKeeperManager: createZooKeeperServerList: Creating the final ZooKeeper file '_bsp/_defaultZkManagerDir/job_201310301559_0024/zkServerList_datanode05.prd.nj1.tapad.com 0 ' 2013-10-30 19:38:06,235 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 0, got file 'zkServerList_datanode05.prd.nj1.tapad.com 0 ' (polling period is 3000) 2013-10-30 19:38:06,235 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [datanode05.prd.nj1.tapad.com, 0] 2 hosts in filename 'zkServerList_datanode05.prd.nj1.tapad.com 0 ' 2013-10-30 19:38:06,236 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Trying to delete old directory /d06/dfs/dn/tasktracker/taskTracker/artie.pesh-imam/jobcache/job_201310301559_0024/work/_bspZooKeeper 2013-10-30 19:38:06,243 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Creating file /d06/dfs/dn/tasktracker/taskTracker/artie.pesh-imam/jobcache/job_201310301559_0024/work/_bspZooKeeper/zoo.cfg in /d06/dfs/dn/tasktracker/taskTracker/artie.pesh-imam/jobcache/job_201310301559_0024/work/_bspZooKeeper with base port 22181 2013-10-30 19:38:06,243 INFO 
org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Make directory of _bspZooKeeper = true 2013-10-30 19:38:06,243 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Delete of zoo.cfg = false 2013-10-30 19:38:06,246 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Attempting to start ZooKeeper server with command [/usr/java/jdk1.7.0_40/jre/bin/java, -Xmx512m, -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPauseMillis=100, -cp, /d02/dfs/dn/tasktracker/taskTracker/artie.pesh-imam/jobcache/job_201310301559_0024/jars/job.jar, org.apache.zookeeper.server.quorum.QuorumPeerMain, /d06/dfs/dn/tasktracker/taskTracker/artie.pesh-imam/jobcache/job_201310301559_0024/work/_bspZooKeeper/zoo.cfg] in directory /d06/dfs/dn/tasktracker/taskTracker/artie.pesh-imam/jobcache/job_201310301559_0024/work/_bspZooKeeper 2013-10-30 19:38:06,249 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Shutdown hook added. 2013-10-30 19:38:06,249 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to datanode05.prd.nj1.tapad.com:22181 with poll msecs = 3000 2013-10-30 19:38:06,253 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException java.net.ConnectException: Connection refused at
Re: Release date for 1.1.0
+dev@

On Tue, Oct 29, 2013 at 9:47 AM, Avery Ching ach...@apache.org wrote:
> I would like that as well. Does someone else want to coordinate this one? =)

If all of the PMC members are fine with a non-committer driving a release -- I'd be more than happy to volunteer as an RM for 1.1.0. That said, I would still need karma for switching branches and administering JIRA. I really do want to help -- please let me know how to proceed.

Thanks, Roman.
Re: External Documentation about Giraph
On Wed, May 29, 2013 at 2:25 PM, Maria Stylianou mars...@gmail.com wrote:
> Hello guys, This semester I'm doing my master thesis using Giraph on a daily basis. In my blog (marsty5.wordpress.com) I wrote some posts about Giraph, some of the new users may find them useful! And maybe some of the experienced ones can give me feedback and correct any mistakes :D So far, I described:
> 1. How to set up Giraph
> 2. What to do next - after setting up Giraph
> 3. How to run ShortestPaths
> 4. How to run PageRank

Good stuff! As a shameless plug, one more way to install Giraph is via Apache Bigtop. All it takes is hooking one of these files:

http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-Repository/label=fedora18/lastSuccessfulBuild/artifact/repo/bigtop.repo
http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-Repository/label=opensuse12/lastSuccessfulBuild/artifact/repo/bigtop.repo

to your yum/apt system and typing:

$ sudo yum install hadoop-conf-pseudo giraph

In fact we're about to release Bigtop 0.6.0 with Hadoop 2.0.4.1 and Giraph 1.0 -- so if anybody's interested in helping us test this stuff, that would be really appreciated.

Thanks, Roman.

P.S. There are quite a few other platforms available as well: http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-Repository/
Re: [VOTE] Release Giraph 1.0 (rc0)
On Fri, Apr 12, 2013 at 3:56 PM, Avery Ching ach...@apache.org wrote:
> Fellow Giraphers,
> We have our first release candidate since graduating from incubation. This is a source release, primarily due to the different versions of Hadoop we support with munge (similar to the 0.1 release). Since 0.1, we've made A TON of progress on overall performance, optimizing memory use, split vertex/edge inputs, easy interoperability with Apache Hive, and a bunch of other areas. In many ways, this is an almost totally different codebase. Thanks everyone for your hard work!

Indeed this is a VERY impressive amount of new functionality! Kudos! Here's my feedback so far (before I pull the bits into Bigtop for more integration testing). I hope to convince you guys that we may need to spin an additional RC (#1-#3 -- with #4 being the subject of a special plea):

1. tarball contains the .git repo
2. tarball was generated in such a way that makes Linux Ubuntu tar spew out tons of warnings
3. YARN profile is broken (GIRAPH-627 -- patch attached).
4. YARN profile is broken when compiled against hadoop-2.0.4 (GIRAPH-629 -- working on a patch)

And here we come to me pleading with the Giraph community (on behalf of the Bigtop and Hadoop ones ;-)). I know that what I'm about to ask is typically considered a sort of 'bad taste' in the ASF but here I go: given the incompatibility between 2.0.3-alpha and 2.0.4-alpha, is there any chance we can delay Giraph 1.0 to be fully compatible with 2.0.4? The 2.0.4 release is supposed to come out at the end of next week and I can volunteer to make Giraph compatible with it. Hadoop 2.0.4-alpha is kind of a big deal because if everything goes according to plan 2.0.4 will be a stepping stone towards the first Hadoop 2.X beta (and eventually GA). It is way more important to be compatible with it in my opinion.

I guess, if you guys really want to save a couple of days an alternative could be to agree on Giraph 1.0.1 within a couple of weeks.
That, of course, will require cycles from whoever will be the 1.0.1 RM.

Finally, if we do spin a new RC, could we please follow an established ASF model where the tarball itself gets the name of the final artifact (in our case giraph-1.0.tar.gz) but the subdirectory name reflects the name of the RC. Here's an example of the Hadoop 2.0.4 RC that the Hadoop community is voting on right now: http://people.apache.org/~acmurthy/hadoop-2.0.4-alpha-rc2/ As you can see, the name of the artifact looks exactly like the final product of the release.

Thanks, Roman.