Re: Giraph and HBase

2016-12-14 Thread Roman Shaposhnik
Nice! Thanks for sharing!

Roman.

On Tue, Dec 13, 2016 at 12:38 PM, Robert Yokota  wrote:
> Hi,
>
> In case anyone is interested in analyzing graphs in HBase with Apache
> Giraph, this might be helpful:
>
> https://yokota.blog/2016/12/13/graph-analytics-on-hbase-with-hgraphdb-and-giraph/


Re: Apache Giraph Visualisation

2016-08-08 Thread Roman Shaposhnik
On Sun, Aug 7, 2016 at 7:50 PM, agc studio  wrote:
> Hi Team,
>
> I have been looking for a graph drawing/visualisation algorithm implemented
> for Giraph. Are there any such implementations?

Can you be more specific/give examples?

Thanks,
Roman.


Re: Anybody's considering to present at ApacheCON EU?

2016-07-18 Thread Roman Shaposhnik
Awesome! Meantime, I'll take a look at the HBase one.

Thanks,
Roman.

On Mon, Jul 18, 2016 at 5:05 PM, Sergey Edunov <edu...@gmail.com> wrote:
> Hi Roman,
>
> I'll take care of the out-of-core test case.
>
> Regards,
> Sergey Edunov
>
> On Mon, Jul 18, 2016 at 2:52 PM, Roman Shaposhnik <ro...@shaposhnik.org> 
> wrote:
>> Hi!
>>
>> I was wondering if anybody is considering doing a talk at
>> ApacheCON EU. The CFP closes on Sep 9th:
>>http://www.apachecon.com/
>>
>> Given the upcoming 1.2.0 release, it could be a good place
>> for us to rekindle some of that Giraph love ;-)
>>
>> Thanks,
>> Roman.
>>
>> P.S. Speaking of 1.2.0, can someone please take a look at the OutOfCore test
>> failure in hadoop_1 profile?
>>
>> https://builds.apache.org/job/Giraph-1.2/MVN_PROFILE=hadoop_1,jdk=JDK%201.7%20(latest),label=ubuntu/8/testReport/


Re: Giraph out-of-core feature is not helping!

2015-12-04 Thread Roman Shaposhnik
I remember seeing a discussion that discouraged users from
utilizing OOC. I am vague on the details, but you can try
searching the archives.

Thanks,
Roman.

On Fri, Dec 4, 2015 at 4:34 PM, Khaled Ammar  wrote:
> Hi,
>
> I am using Giraph 1.1.0 to do large graph processing. I was trying to run a
> hashMin (WCC) algorithm on a large graph, but it failed with an out-of-memory
> error. I thought the out-of-core option might help, but it did not.
>
> Is there any advice about how to enable out-of-core processing?
>
> I followed this URL: http://giraph.apache.org/ooc.html
>
> --
> Thanks,
> -Khaled
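For readers who land on this thread: in Giraph of this vintage, out-of-core behavior was switched on via `-ca` custom arguments. The sketch below is hedged -- the option names are as I recall them from the out-of-core documentation linked above, and the jar, class name, and values are placeholders to verify against your own Giraph version:

```
# Sketch, not a verified invocation:
hadoop jar giraph-examples-with-dependencies.jar \
  org.apache.giraph.GiraphRunner <YourComputationClass> \
  -ca giraph.useOutOfCoreGraph=true \
  -ca giraph.maxPartitionsInMemory=8 \
  -ca giraph.useOutOfCoreMessages=true \
  -ca giraph.maxMessagesInMemory=1000000 \
  ... (input/output format options)
```

Note that out-of-core trades memory for disk I/O, so it only helps when the job is genuinely memory-bound; it does not fix skewed partitioning.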


Re: How can I build giraph 1.1.0-hadoop2 distribution from the source?

2015-09-10 Thread Roman Shaposhnik
On Thu, Sep 10, 2015 at 4:09 AM, Anton Petersson
 wrote:
> Dear Roman,
>
> Thanks for your reply.
> My application code runs on hadoop 2.4.1, by including the following
> dependency:
>
> <dependency>
>   <groupId>org.apache.giraph</groupId>
>   <artifactId>giraph-core</artifactId>
>   <version>1.1.0-hadoop2</version>
> </dependency>
>
>
> I tried to change the dependency to my custom build (built with mvn clean
> install -Pyarn -Dhadoop.version=2.4.1 -DskipTests),
> but my application code does NOT run with my custom build.

What exactly fails, and how? I need details to be able to help you.

> Therefore I wish
> to know how to build the giraph-core jar (1.1.0-hadoop2) as listed in the
> central maven repo.

Building your local copy of Giraph has nothing to do with Maven central.

Thanks,
Roman.
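A quick sanity check for threads like this one: confirm which version string `mvn install` actually put in the local repository, since a source build installs under whatever version the checked-out POM declares (often a `-SNAPSHOT`), and that string must then match the `<version>` in the application's dependency. A hedged shell sketch, assuming the standard local-repository layout for the `org.apache.giraph` groupId:

```shell
# List locally installed giraph-core versions, if any.
ls ~/.m2/repository/org/apache/giraph/giraph-core/ 2>/dev/null \
  || echo "no locally installed giraph-core found"
```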


Re: Please welcome our newest committer, Igor Kabiljo!

2015-02-10 Thread Roman Shaposhnik
Great work, Igor!

Thanks,
Roman.

On Tue, Feb 10, 2015 at 8:42 PM, Maja Kabiljo majakabi...@fb.com wrote:
 I am pleased to announce that Igor Kabiljo has been invited to become a
 committer by the Project Management Committee (PMC) of Apache Giraph, and he
 accepted.

 Igor's most important contributions are implementing reduce/broadcast, which
 generalizes aggregators, and working on primitive message/edge storages that
 make applications more efficient, as well as work on specific
 partitioners that take advantage of good partitioning. He also comes up with
 issues for beginners and guides them along the way. Igor, we are looking
 forward to your future work and deeper involvement in the project.

 Thanks,
 Maja

 List of Igor’s contributions:
 GIRAPH-785: Improve GraphPartitionerFactory usage
 GIRAPH-786: XSparseVector create a lot of objects in add/write
 GIRAPH-848: Allowing plain computation with types being configurable
 GIRAPH-934: Allow having state in aggregators
 GIRAPH-935: Loosen modifiers when needed
 GIRAPH-938: Allow fast working with primitives generically
 GIRAPH-939: Reduce/broadcast API
 GIRAPH-954: Allow configurable Aggregators/Reducers again
 GIRAPH-955: Allow vertex/edge/message value to be configurable
 GIRAPH-961: Internals of MasterLoggingAggregator have been incorrectly
 removed
 GIRAPH-965: Improving and adding reducers
 GIRAPH-986: Add more stuff to TypeOps
 GIRAPH-987: Improve naming for ReduceOperation
 Beginner issues he guided:
 GIRAPH-891: Make MessageStoreFactory configurable
 GIRAPH-895: Trim the edges in Giraph
 GIRAPH-921: Create ByteValueVertex to store vertex values as bytes without
 object instance
 GIRAPH-988: Allow object to be specified as next Computation in Giraph


Re: YARN vs. MR1: is YARN a good idea?

2014-12-19 Thread Roman Shaposhnik
Perfect summary! Thanks for writing it.

Thanks,
Roman.

On Fri, Dec 19, 2014 at 12:09 PM, Eli Reisman apache.mail...@gmail.com wrote:
 Giraph on YARN thus far doesn't break any compatibility with the MapReduce
 version. When I was working on it more actively, it had a slightly faster
 job startup but otherwise behaved similarly to the MapReduce version.

 There are a number of design-wise things that could make the YARN profile
 substantially better (in theory), but they would require a fork or bigger design
 changes/agreements about the MapReduce profiles. This would include things
 like spawning the Master Giraph task in the Application Master itself, and
 many other things along those lines.

 There are also a number of smaller things that would probably make a
 difference like exposing YARN's per-task resource configuration features in
 a more flexible way.

 I haven't had much time to hack on Giraph this past year, and at some point
 last summer some folks like Muhammad Islam from LinkedIn did some great work
 to update the YARN profile to run on Hadoop 2.2.0 or newer versions but
 since then it hasn't gotten much love.

 I noticed there is still a note in the master POM from the original Giraph
 on YARN implementation that says it's compatible only with Hadoop
 2.0.3-alpha. I thought that was removed with Muhammad's Hadoop 2.2.0 patches,
 but apparently it wasn't. We should remove it; it's no longer accurate and
 seems to be misleading people trying to build the YARN profile.



 On Fri, Oct 10, 2014 at 11:15 AM, Tripti Singh tri...@yahoo-inc.com wrote:

 Hi Matthew,
 I would have been thrilled to give you numbers on this one, but for me the
 application is not scaling without the out-of-core option (which isn't
 working the way it did in the previous version).
 I'm still figuring it out and can get back once it's resolved. I have
 patched a few things and will share them for people who might face a similar
 issue. If you have a fix for scalability, do let me know.

 Thanks,
 Tripti

 Sent from my iPhone

  On 06-Oct-2014, at 9:22 pm, Matthew Cornell m...@matthewcornell.org
  wrote:
 
  Hi Folks. I don't think I paid enough attention to YARN vs. MR1 when I
  built Giraph 1.0.0 for our system. How much better is Giraph on YARN?
  Thank you.
 
  --
  Matthew Cornell | m...@matthewcornell.org


Re: ClassNotFoundException GiraphYarn Task with Giraph 1.1.0 for Hadoop 2.5.1

2014-12-15 Thread Roman Shaposhnik
On Sun, Dec 14, 2014 at 11:21 PM, Philipp Nolte p...@daslaboratorium.de wrote:
 Maybe it's just a configuration thing.

How did you build Giraph in the first place? Also, what's
your mapred-site.xml and yarn-site.xml in HADOOP_CONF_DIR?

Finally, what version of Hadoop are you using? And from
what vendor?

 I’ve tried running in giraph.SplitMasterWorker mode and it seems like Hadoop
 is missing the worker nodes:

 Here is my command:
 $ hadoop jar 
 giraphs-and-balloons-computation-0.0.1-for-hadoop-2.5.1-and-giraph-1.1.0-RC1-jar-with-dependencies.jar

How did you build this JAR?

  -ca mapred.job.tracker=master:5431\

If your Giraph installation has access to a correctly
configured Hadoop client you really don't need this
line.

Thanks,
Roman.


Re: ClassNotFoundException GiraphYarn Task with Giraph 1.1.0 for Hadoop 2.5.1

2014-12-14 Thread Roman Shaposhnik
Fixing YARN backend would be nice, but in the meantime, what
prevents you from using a MR backend? That is known to work.

Thanks,
Roman.

On Sun, Dec 14, 2014 at 2:41 PM, Philipp Nolte p...@daslaboratorium.de wrote:
 Hello everyone,

 I have been developing a Giraph application and now wanted to try it on my 
 four-machine cluster running Hadoop 2.5.1.

 My Giraph application is packaged with Giraph 1.1.0-RC1 and all its 
 dependencies.

 I’ve built Giraph and installed it into my local repository using the
 hadoop_2 profile, and built my application as a jar with all
 dependencies (with giraph-core 1.1.0-RC1).

 My application runs nicely with GiraphRunner and one worker on my local 
 machine and on the cluster’s master. But as I have a small Hadoop 2.5.1
 cluster, I need to use GiraphYarnTask to utilize all its workers.

 Running my application results in a ClassNotFoundException for GiraphYarnTask.

 Any idea?

 Philipp


Re: ClassNotFoundException GiraphYarn Task with Giraph 1.1.0 for Hadoop 2.5.1

2014-12-14 Thread Roman Shaposhnik
On Sun, Dec 14, 2014 at 4:09 PM, Philipp Nolte p...@daslaboratorium.de wrote:
 I’ve had a look at the assembled giraph-core jar file and it does not contain 
 any GiraphYarnTask. How so?

 Running my application using the GiraphRunner works fine as long as I only
 have one worker (local mode). To use the other workers, I need to start MR
 TaskTrackers on the machines - which aren’t available on hadoop 2.5.1.

 That's why I need the GiraphYarnTask.

Looks like we're talking past each other. What I am saying is that running
pure MR-based Giraph job on a fully distributed YARNized cluster is perfectly
valid and works fine. You don't *have* to use YARN, even though it is
available on your cluster.

Thanks,
Roman.


Re: Can't complete Shortest Path Example: error code 127

2014-12-09 Thread Roman Shaposhnik
Is there any reason you actually *need* Giraph to run on YARN?
If you don't get any benefit out of it -- just go with the default MR model.

Thanks,
Roman.

On Mon, Dec 8, 2014 at 11:58 PM, Alessio Arleo ingar...@icloud.com wrote:
 Thanks Roman for your reply :) First of all, I am pretty new to Giraph and
 Hadoop. From what I could learn, YARN is the result of a complete overhaul
 of MR, so I thought it would be wise to focus on the new technology: should I
 roll back to the hadoop_2 profile instead of pure_yarn? Or am I just missing
 the whole picture?

 Thanks :)

 Sent from my iPad

 On 08 Dec 2014, at 23:59, Roman Shaposhnik ro...@shaposhnik.org wrote:

 On Mon, Dec 8, 2014 at 9:23 AM, Dr. Alessio Arleo ingar...@icloud.com 
 wrote:
 Hello everybody

 I managed to compile Giraph 1.1 for Hadoop 2.6.0 with pure_yarn maven
 profile. I am running on a VM environment so I took the necessary actions as
 listed in the Giraph Quick Start Guide.

 I think it would be fair to say that with Giraph on YARN you're running
 a configuration that hasn't been validated as thoroughly as Giraph on MR.
 Not to say you're on your own. More like: you're among a few users
 and collectively you guys may be on your own ;-)

 Patches, of course, are always welcome.

 Thanks,
 Roman.


Re: Can't complete Shortest Path Example: error code 127

2014-12-08 Thread Roman Shaposhnik
On Mon, Dec 8, 2014 at 9:23 AM, Dr. Alessio Arleo ingar...@icloud.com wrote:
 Hello everybody

 I managed to compile Giraph 1.1 for Hadoop 2.6.0 with pure_yarn maven
 profile. I am running on a VM environment so I took the necessary actions as
 listed in the Giraph Quick Start Guide.

I think it would be fair to say that with Giraph on YARN you're running
a configuration that hasn't been validated as thoroughly as Giraph on MR.
Not to say you're on your own. More like: you're among a few users
and collectively you guys may be on your own ;-)

Patches, of course, are always welcome.

Thanks,
Roman.


Re: Please welcome our newest committer, Sergey Edunov!

2014-12-03 Thread Roman Shaposhnik
Congrats! Great to have you onboard!

On Wed, Dec 3, 2014 at 10:34 AM, Maja Kabiljo majakabi...@fb.com wrote:
 I am happy to announce that the Project Management Committee (PMC) for
 Apache Giraph has elected Sergey Edunov to become a committer, and he
 accepted.

 Sergey has been an active member of the Giraph community, finding issues,
 submitting patches and reviewing code. We’re looking forward to Sergey’s
 larger involvement and future work.

 List of his contributions:
 GIRAPH-895: Trim the edges in Giraph
 GIRAPH-896: Memory leak in SuperstepMetricsRegistry
 GIRAPH-897: Add an option to dump only live objects to JMap
 GIRAPH-898: Remove giraph-accumulo from Facebook profile
 GIRAPH-903: Detect crashes on Netty threads
 GIRAPH-924: Fix checkpointing
 GIRAPH-925: Unit tests should pass even if zookeeper port not available
 GIRAPH-927: Decouple netty server threads from message processing
 GIRAPH-933: Checkpointing improvements
 GIRAPH-936: Decouple netty server threads from message processing
 GIRAPH-940: Cleanup the list of supported hadoop versions
 GIRAPH-950: Auto-restart from checkpoint doesn't pick up latest checkpoint
 GIRAPH-963: Aggregators may fail with IllegalArgumentException upon
 deserialization

 Best,
 Maja


[ANNOUNCE] Giraph 1.1.0 is now available

2014-11-25 Thread Roman Shaposhnik
The Apache Giraph team is pleased to announce the
immediate release of Giraph 1.1.0.

Download it from your favorite Apache mirror at:
http://www.apache.org/dyn/closer.cgi/giraph/giraph-1.1.0

This release has also been pushed to Apache's maven
repository as two separate versions:
   * 1.1.0 is the version that has been compiled against
     hadoop 1.2+
   * 1.1.0-hadoop2 is the version that has been compiled against
     hadoop 2.5+
Make sure to specify the correct version in your Maven
dependencies.
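A POM fragment matching the coordinates in this announcement (a sketch; pick the version matching your Hadoop line):

```xml
<dependency>
  <groupId>org.apache.giraph</groupId>
  <artifactId>giraph-core</artifactId>
  <!-- 1.1.0 for hadoop 1.2+, 1.1.0-hadoop2 for hadoop 2.5+ -->
  <version>1.1.0-hadoop2</version>
</dependency>
```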

See also the full release notes:
   http://s.apache.org/a8X

Thanks to everybody who contributed to this release!

Yours,
The Apache Giraph team


[RESULT] [VOTE] Apache Giraph 1.1.0 RC2

2014-11-18 Thread Roman Shaposhnik
Hi!

With 3 binding +1s, one non-binding +1,
and no 0s or -1s, the vote to publish
Apache Giraph 1.1.0 RC2 as the 1.1.0 release of
Apache Giraph passes. Thanks to everybody who
spent time validating the bits!

The vote tally is
  +1s:
  Claudio Martella (binding)
  Maja Kabiljo (binding)
  Eli Reisman (binding)
  Roman Shaposhnik (non-binding)

I'll do the publishing tonight and will send an announcement!

Thanks,
Roman (AKA 1.1.0 RM)

On Thu, Nov 13, 2014 at 5:28 AM, Roman Shaposhnik ro...@shaposhnik.org wrote:
 This vote is for Apache Giraph, version 1.1.0 release

 It fixes the following issues:
   http://s.apache.org/a8X

 *** Please download, test and vote by Mon 11/17 noon PST

 Note that we are voting upon the source (tag):
release-1.1.0-RC2

 Source and binary files are available at:
http://people.apache.org/~rvs/giraph-1.1.0-RC2/

 Staged website is available at:
http://people.apache.org/~rvs/giraph-1.1.0-RC2/site/

 Maven staging repo is available at:
https://repository.apache.org/content/repositories/orgapachegiraph-1003

 Please note that, as per the earlier agreement, two sets
 of artifacts are published, differentiated by the version ID:
   * version ID 1.1.0 corresponds to the artifacts built for
  the hadoop_1 profile
   * version ID 1.1.0-hadoop2 corresponds to the artifacts
  built for hadoop_2 profile.

 The tag to be voted upon (release-1.1.0-RC2):
   
 https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=log;h=refs/tags/release-1.1.0-RC2

 The KEYS file containing PGP keys we use to sign the release:
http://svn.apache.org/repos/asf/bigtop/dist/KEYS

 Thanks,
 Roman.


Re: [VOTE] Apache Giraph 1.1.0 RC2

2014-11-18 Thread Roman Shaposhnik
On Thu, Nov 13, 2014 at 5:28 AM, Roman Shaposhnik ro...@shaposhnik.org wrote:
 This vote is for Apache Giraph, version 1.1.0 release

+1 (non-binding)

Thanks,
Roman.


[VOTE] Apache Giraph 1.1.0 RC2

2014-11-13 Thread Roman Shaposhnik
This vote is for Apache Giraph, version 1.1.0 release

It fixes the following issues:
  http://s.apache.org/a8X

*** Please download, test and vote by Mon 11/17 noon PST

Note that we are voting upon the source (tag):
   release-1.1.0-RC2

Source and binary files are available at:
   http://people.apache.org/~rvs/giraph-1.1.0-RC2/

Staged website is available at:
   http://people.apache.org/~rvs/giraph-1.1.0-RC2/site/

Maven staging repo is available at:
   https://repository.apache.org/content/repositories/orgapachegiraph-1003

Please note that, as per the earlier agreement, two sets
of artifacts are published, differentiated by the version ID:
  * version ID 1.1.0 corresponds to the artifacts built for
 the hadoop_1 profile
  * version ID 1.1.0-hadoop2 corresponds to the artifacts
 built for hadoop_2 profile.

The tag to be voted upon (release-1.1.0-RC2):
  
https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=log;h=refs/tags/release-1.1.0-RC2

The KEYS file containing PGP keys we use to sign the release:
   http://svn.apache.org/repos/asf/bigtop/dist/KEYS

Thanks,
Roman.


Re: [VOTE] Apache Giraph 1.1.0 RC1

2014-11-13 Thread Roman Shaposhnik
Hi!

As per the prior suggestion, I've recut RC2 with only one
extra fix included:
   https://issues.apache.org/jira/browse/GIRAPH-961

Given the scope of the fix, I firmly believe that the
testing we've conducted so far should be 100%
applicable to the new RC.

Still, I started a new vote thread to keep things clean.

Claudio, can you please recast your vote in that
other new thread?

Maja, Eugene, we really need you guys to vote. Please
do so. We're sooo close to finally push 1.1.0 out.

Thanks,
Roman.

On Mon, Nov 10, 2014 at 4:01 AM, Claudio Martella
claudio.marte...@gmail.com wrote:
 Yes,

 I did re-run the build this weekend, and it built successfully for the
 default profile and the hadoop_2 one.
 I ran a couple of examples on the cluster, and they ran successfully.

 I'm +1.

 On Tue, Nov 4, 2014 at 8:10 PM, Roman Shaposhnik ro...@shaposhnik.org
 wrote:

 On Tue, Nov 4, 2014 at 5:47 AM, Claudio Martella
 claudio.marte...@gmail.com wrote:
 I am indeed having some problems. mvn install will fail because the test
 is opening too many files:

 [snip]

  I have to investigate why this happens. I'm not using a different ulimit
  than what I have on my Mac OS X by default. Where are you building
  yours?

 This is really weird. I have no issues whatsoever on Mac OS X with
 the following setup:
$ uname -a
Darwin usxxshaporm1.corp.emc.com 12.4.1 Darwin Kernel Version
 12.4.1: Tue May 21 17:04:50 PDT 2013;
 root:xnu-2050.40.51~1/RELEASE_X86_64 x86_64
$ ulimit -a
core file size  (blocks, -c) 0
data seg size   (kbytes, -d) unlimited
file size   (blocks, -f) unlimited
max locked memory   (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files  (-n) 2560
 pipe size (512 bytes, -p) 1
stack size  (kbytes, -s) 8192
cpu time   (seconds, -t) unlimited
max user processes  (-u) 709
virtual memory  (kbytes, -v) unlimited
$ mvn --version
Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4;
 2014-08-11T13:58:10-07:00)
Maven home: /Users/shapor/dist/apache-maven-3.2.3
Java version: 1.7.0_51, vendor: Oracle Corporation
Java home:
 /Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: mac os x, version: 10.8.4, arch: x86_64, family: mac


 Thanks,
 Roman.




 --
Claudio Martella



Re: [VOTE] Apache Giraph 1.1.0 RC2

2014-11-13 Thread Roman Shaposhnik
On Thu, Nov 13, 2014 at 5:32 AM, Panagiotis Eustratiadis
ep.pan@gmail.com wrote:
 Was the issue with the aggregators on release-1.1 fixed?

The only extra fix in this latest RC is:
https://issues.apache.org/jira/browse/GIRAPH-961

Thanks,
Roman.


Re: Differences in building Giraph

2014-11-12 Thread Roman Shaposhnik
At this point you either need Hadoop 2.5+ or you need
to manually hack the two offending files.

Thanks,
Roman.

On Thu, Nov 6, 2014 at 9:44 PM, Sundara Raghavan Sankaran
sun...@crayondata.com wrote:
 Hi Giraph Experts,

 I'm trying to build Giraph (trunk) for Hadoop 2.4.0. I know that there are 2
 profiles with which I can proceed to build. hadoop_yarn and hadoop_2
 profiles.

 When I try with hadoop_yarn, build is fine. When I try with hadoop_2, build
 is failing.

 The commands I used is
 mvn -Phadoop_yarn -Dhadoop.version=2.4.0 -DskipTests clean install
 mvn -Phadoop_2 -Dhadoop.version=2.4.0 -DskipTests clean install

 I get the following error while building with hadoop_2 profile

 [INFO] ------------------------------------------------------------------------
 [INFO] Reactor Summary:
 [INFO]
 [INFO] Apache Giraph Parent ............................... SUCCESS [  5.758 s]
 [INFO] Apache Giraph Core ................................. FAILURE [  9.259 s]
 [INFO] Apache Giraph Examples ............................. SKIPPED
 [INFO] Apache Giraph Accumulo I/O ......................... SKIPPED
 [INFO] Apache Giraph HBase I/O ............................ SKIPPED
 [INFO] Apache Giraph HCatalog I/O ......................... SKIPPED
 [INFO] Apache Giraph Hive I/O ............................. SKIPPED
 [INFO] Apache Giraph Gora I/O ............................. SKIPPED
 [INFO] Apache Giraph Rexster I/O .......................... SKIPPED
 [INFO] Apache Giraph Rexster Kibble ....................... SKIPPED
 [INFO] Apache Giraph Rexster I/O Formats .................. SKIPPED
 [INFO] Apache Giraph Distribution ......................... SKIPPED
 [INFO] ------------------------------------------------------------------------
 [INFO] BUILD FAILURE
 [INFO] ------------------------------------------------------------------------
 [INFO] Total time: 15.722 s
 [INFO] Finished at: 2014-11-07T11:06:36+05:30
 [INFO] Final Memory: 48M/558M
 [INFO] ------------------------------------------------------------------------
 [ERROR] Failed to execute goal
 org.apache.maven.plugins:maven-compiler-plugin:3.0:compile (default-compile)
 on project giraph-core: Compilation failure: Compilation failure:
 [ERROR]
 /home/sundar/giraph/giraph-core/src/main/java/org/apache/giraph/comm/netty/SaslNettyClient.java:[93,32]
 getDefaultProperties() has protected access in
 org.apache.hadoop.security.SaslPropertiesResolver
 [ERROR]
 /home/sundar/giraph/giraph-core/src/main/java/org/apache/giraph/comm/netty/SaslNettyServer.java:[113,32]
 getDefaultProperties() has protected access in
 org.apache.hadoop.security.SaslPropertiesResolver
 [ERROR] -> [Help 1]
 [ERROR]
 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e
 switch.
 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
 [ERROR]
 [ERROR] For more information about the errors and possible solutions, please
 read the following articles:
 [ERROR] [Help 1]
 http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
 [ERROR]
 [ERROR] After correcting the problems, you can resume the build with the
 command
 [ERROR]   mvn <goals> -rf :giraph-core

 Can someone help me understand why this is happening with the profile
 change? I would understand if the build failed when the hadoop version changed.

 --
 Thanks,
 Sundar
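The "manual hack" Roman mentions comes down to avoiding the direct call to the protected `getDefaultProperties()` on `SaslPropertiesResolver`. The sketch below demonstrates the reflection technique against a stand-in class rather than the real Hadoop one: only the method name comes from the build error above, while the nested `Resolver` class and its return value are purely illustrative.

```java
import java.lang.reflect.Method;
import java.util.Collections;
import java.util.Map;

public class ProtectedAccessDemo {
    // Stand-in for org.apache.hadoop.security.SaslPropertiesResolver, whose
    // getDefaultProperties() is protected before Hadoop 2.5 -- the cause of
    // the compile error quoted above.
    static class Resolver {
        protected Map<String, String> getDefaultProperties() {
            return Collections.singletonMap("javax.security.sasl.qop", "auth");
        }
    }

    // Reflection side-steps the protected modifier; this is roughly the kind
    // of "manual hack" the two offending Giraph files would need when
    // building against a pre-2.5 Hadoop.
    static Map<String, String> readDefaults() {
        try {
            Method m = Resolver.class.getDeclaredMethod("getDefaultProperties");
            m.setAccessible(true);
            @SuppressWarnings("unchecked")
            Map<String, String> props = (Map<String, String>) m.invoke(new Resolver());
            return props;
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readDefaults());
    }
}
```

Reflection keeps a single source tree compiling against both Hadoop lines, at the cost of a runtime dependency on the method name staying stable.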


Re: Compiling Giraph 1.1

2014-11-04 Thread Roman Shaposhnik
What's the exact compilation incantation you use?

Thanks,
Roman.

On Tue, Nov 4, 2014 at 9:56 AM, Ryan freelanceflashga...@gmail.com wrote:
 I'm attempting to build, compile and install Giraph 1.1 on a server running
 CDH5.1.2. A few weeks ago I successfully compiled it by changing the
 hadoop_2 profile version to be 2.3.0-cdh5.1.2. I recently did a fresh
 install and was unable to build, compile and install (perhaps due to the
 latest code updates).

 The error seems to be related to the SaslNettyClient and SaslNettyServer.
 Any idea on fixes?

 Here's part of the error log:

 [ERROR] Failed to execute goal
 org.apache.maven.plugins:maven-compiler-plugin:3.0:compile (default-compile)
 on project giraph-core: Compilation failure: Compilation failure:
 [ERROR]
 /[myPath]/giraph/giraph-core/src/main/java/org/apache/giraph/comm/netty/SaslNettyClient.java:[28,34]
 cannot find symbol
 [ERROR] symbol:   class SaslPropertiesResolver
 [ERROR] location: package org.apache.hadoop.security
 ...
 [ERROR]
 /[myPath]/giraph/giraph-core/src/main/java/org/apache/giraph/comm/netty/SaslNettyServer.java:[108,11]
 cannot find symbol
 [ERROR] symbol:   variable SaslPropertiesResolver
 [ERROR] location: class org.apache.giraph.comm.netty.SaslNettyServer



Re: [VOTE] Apache Giraph 1.1.0 RC1

2014-11-04 Thread Roman Shaposhnik
On Mon, Nov 3, 2014 at 4:51 PM, Maja Kabiljo majakabi...@fb.com wrote:
 We've been running code which is the same as the release candidate plus the fix
 for GIRAPH-961 in production for 5 days now, with no problems. This is the
 hadoop_facebook profile, using only hive-io of all the io modules.

Great! This tells me that once I cut RC2 with GIRAPH-961 you guys will be
ready to vote!

Thanks,
Roman.


Re: [VOTE] Apache Giraph 1.1.0 RC1

2014-11-04 Thread Roman Shaposhnik
On Tue, Nov 4, 2014 at 5:47 AM, Claudio Martella
claudio.marte...@gmail.com wrote:
 I am indeed having some problems. mvn install will fail because the test is
 opening too many files:

[snip]

 I have to investigate why this happens. I'm not using a different ulimit
 than what I have on my Mac OS X by default. Where are you building yours?

This is really weird. I have no issues whatsoever on Mac OS X with
the following setup:
   $ uname -a
   Darwin usxxshaporm1.corp.emc.com 12.4.1 Darwin Kernel Version
12.4.1: Tue May 21 17:04:50 PDT 2013;
root:xnu-2050.40.51~1/RELEASE_X86_64 x86_64
   $ ulimit -a
   core file size  (blocks, -c) 0
   data seg size   (kbytes, -d) unlimited
   file size   (blocks, -f) unlimited
   max locked memory   (kbytes, -l) unlimited
   max memory size (kbytes, -m) unlimited
   open files  (-n) 2560
   pipe size (512 bytes, -p) 1
   stack size  (kbytes, -s) 8192
   cpu time   (seconds, -t) unlimited
   max user processes  (-u) 709
   virtual memory  (kbytes, -v) unlimited
   $ mvn --version
   Apache Maven 3.2.3 (33f8c3e1027c3ddde99d3cdebad2656a31e8fdf4;
2014-08-11T13:58:10-07:00)
   Maven home: /Users/shapor/dist/apache-maven-3.2.3
   Java version: 1.7.0_51, vendor: Oracle Corporation
   Java home: 
/Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home/jre
   Default locale: en_US, platform encoding: UTF-8
   OS name: mac os x, version: 10.8.4, arch: x86_64, family: mac


Thanks,
Roman.
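For anyone else hitting the "too many open files" failure reported in this thread: one common cause on Mac OS X (not verified for this exact case) is the per-shell file-descriptor limit, which can be inspected and raised before running the build:

```shell
# Show the current per-shell open-file limit, then try to raise it before
# running mvn install. 4096 is an arbitrary illustrative value; raising above
# the hard limit may require admin rights.
echo "before: $(ulimit -n)"
ulimit -n 4096 2>/dev/null || echo "could not raise limit in this shell"
echo "after: $(ulimit -n)"
```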


Re: [VOTE] Apache Giraph 1.1.0 RC1

2014-11-01 Thread Roman Shaposhnik
Ping! Any progress on testing the current RC?

Thanks,
Roman.

On Fri, Oct 31, 2014 at 9:00 AM, Claudio Martella
claudio.marte...@gmail.com wrote:
 Oh, thanks for the info!

 On Fri, Oct 31, 2014 at 3:06 PM, Roman Shaposhnik ro...@shaposhnik.org
 wrote:

 On Fri, Oct 31, 2014 at 3:26 AM, Claudio Martella
 claudio.marte...@gmail.com wrote:
  Hi Roman,
 
  thanks again for this. I have had a look at the staging site so far (our
  cluster has been down whole week... universities...), and I was
  wondering if
  you have an insight why some of the docs are missing, e.g. gora and
  rexster
  documentation.

 None of them are missing. The links moved to a User Docs -> Modules
 section, though:
http://people.apache.org/~rvs/giraph-1.1.0-RC1/site/gora.html
http://people.apache.org/~rvs/giraph-1.1.0-RC1/site/rexster.html
 and so forth.

 Thanks,
 Roman.




 --
Claudio Martella



Re: Issue with Giraph on multinode cluster

2014-11-01 Thread Roman Shaposhnik
Please create a JIRA and attach your patch to it.

Thanks,
Roman.

On Mon, Oct 20, 2014 at 2:23 AM, Bojan Babic gba...@gmail.com wrote:
 I've made a patch that worked for me. Not sure if I should post a JIRA issue.
 You can find the hack attached.



 On Fri, Oct 17, 2014 at 5:52 PM, Bojan Babic gba...@gmail.com wrote:

 I'm using giraph 1.1.0-SNAPSHOT for hadoop 1.2.1

 On Fri, Oct 17, 2014 at 4:01 PM, Bojan Babic gba...@gmail.com wrote:

 Hi guys,

 I'm risking posting an issue that has already been raised, but I'll take the
 risk of being ridiculed :)

 I have a small Hadoop cluster on Digital Ocean (1 master and 4 nodes). I was
 able to set up the cluster and run the word count example as well as the
 single-node sample from the Quick Start guide.

 As I introduce more nodes into play, I hit an issue where the Task Tracker
 spawns Child processes

 hduser@hdnode-2:~# jps
 13839 TaskTracker
 13697 DataNode
 14067 Jps
 13962 Child

 13961 Child


 that listen on the loopback interface

 Proto Recv-Q Send-Q Local Address      Foreign Address  State   User    Inode     PID/Program name
 tcp        0      0 127.0.0.1:1337     0.0.0.0:*        LISTEN  root    21544925  29912/python
 tcp        0      0 0.0.0.0:50010      0.0.0.0:*        LISTEN  hduser  21691552  13697/java
 tcp        0      0 127.0.0.1:30011    0.0.0.0:*        LISTEN  hduser  21693578  13962/java
 tcp        0      0 0.0.0.0:50075      0.0.0.0:*        LISTEN  hduser  21691554  13697/java
 tcp        0      0 0.0.0.0:50020      0.0.0.0:*        LISTEN  hduser  21691557  13697/java
 tcp        0      0 127.0.0.1:50118    0.0.0.0:*        LISTEN  hduser  21691870  13839/java
 tcp        0      0 0.0.0.0:41640      0.0.0.0:*        LISTEN  hduser  21691296  13697/java
 tcp        0      0 127.0.0.1:31337    0.0.0.0:*        LISTEN  root    20432660  1514/python
 tcp        0      0 0.0.0.0:50060      0.0.0.0:*        LISTEN  hduser  21692144  13839/java
 tcp        0      0 0.0.0.0:http-alt   0.0.0.0:*        LISTEN  root    20431897  1421/python
 tcp        0      0 127.0.0.1:30001    0.0.0.0:*        LISTEN  hduser  21370004  7856/ssh
 tcp        0      0 127.0.0.1:30003    0.0.0.0:*        LISTEN  hduser  21693562  13961/java
 tcp        0      0 127.0.0.1:58741    0.0.0.0:*        LISTEN  hduser  2137      7856/ssh
 tcp        0      0 127.0.0.1:58742    0.0.0.0:*        LISTEN  hduser  21369982  7845/autossh
 tcp        0      0 0.0.0.0:ssh        0.0.0.0:*        LISTEN  root    9130      834/sshd
 tcp6       0      0 ::1:30001          :::*             LISTEN  hduser  21370003  7856/ssh
 tcp6       0      0 ::1:58741          :::*             LISTEN  hduser  2136      7856/ssh
 tcp6       0      0 :::ssh             :::*             LISTEN  root    9165      834/sshd


 instead of all interfaces (0.0.0.0)

 This results in the node being unreachable from other nodes, e.g. from hdnode-2:


 2014-10-17 14:10:31,146 WARN org.apache.giraph.comm.netty.NettyClient:
 2014-10-17 14:10:31,159 WARN org.apache.giraph.comm.netty.NettyClient:
 connectAllAddresses: Future failed to connect with
 hdnode-2/XXX.XXX.XXX.XXX:30003 with 1 failures because of
 java.net.ConnectException: Connection refused:
 hdnode-2/XXX.XXX.XXX.XXX:30003
 2014-10-17 14:10:31,159 INFO org.apache.giraph.comm.netty.NettyClient:
 connectAllAddresses: Successfully added 1 connections, (1 total connected) 
 2
 failed, 2 failures total.


 If I stop all processes and start nc on 30003, I can telnet to hdnode2.

 The question here is: is there any setting that will configure the Child
 process to listen on 0.0.0.0 instead of the loopback interface?

 Thanks in advance




 --
 
 Bojan Babic, M.Sc.E.E
 Software developer
 twitter: @bojanbabic
 mobile: +1312 8602944




 --
 
 Bojan Babic, M.Sc.E.E
 Software developer
 twitter: @bojanbabic
 mobile: +1312 8602944



Re: [VOTE] Apache Giraph 1.1.0 RC1

2014-10-31 Thread Roman Shaposhnik
On Fri, Oct 31, 2014 at 3:26 AM, Claudio Martella
claudio.marte...@gmail.com wrote:
 Hi Roman,

 thanks again for this. I have had a look at the staging site so far (our
 cluster has been down whole week... universities...), and I was wondering if
 you have an insight why some of the docs are missing, e.g. gora and rexster
 documentation.

None of them are missing. The links moved to a User Docs -> Modules
section, though:
   http://people.apache.org/~rvs/giraph-1.1.0-RC1/site/gora.html
   http://people.apache.org/~rvs/giraph-1.1.0-RC1/site/rexster.html
and so forth.

Thanks,
Roman.


[VOTE] Apache Giraph 1.1.0 RC1

2014-10-26 Thread Roman Shaposhnik
This vote is for Apache Giraph, version 1.1.0 release

It fixes the following issues:
  http://s.apache.org/a8X

*** Please download, test and vote by Mon 11/3 noon PST

Note that we are voting upon the source (tag):
   release-1.1.0-RC1

Source and binary files are available at:
   http://people.apache.org/~rvs/giraph-1.1.0-RC1/

Staged website is available at:
   http://people.apache.org/~rvs/giraph-1.1.0-RC1/site/

Maven staging repo is available at:
   https://repository.apache.org/content/repositories/orgapachegiraph-1002

Please notice, that as per earlier agreement two sets
of artifacts are published differentiated by the version ID:
  * version ID 1.1.0 corresponds to the artifacts built for
 the hadoop_1 profile
  * version ID 1.1.0-hadoop2 corresponds to the artifacts
 built for hadoop_2 profile.

The tag to be voted upon (release-1.1.0-RC1):
   
https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=commit;h=1f0fc23c26ce3addb746e3e57cc155f82afbab87

The KEYS file containing PGP keys we use to sign the release:
   http://svn.apache.org/repos/asf/bigtop/dist/KEYS

Thanks,
Roman.


Re: Empty Output when running Shortest Path Algorithm !!

2014-09-21 Thread Roman Shaposhnik
I think this should give you some clue:

 14/09/21 13:34:03 WARN job.GiraphConfigurationValidator: Output format
 vertex index type is not known
 14/09/21 13:34:03 WARN job.GiraphConfigurationValidator: Output format
 vertex value type is not known
 14/09/21 13:34:03 WARN job.GiraphConfigurationValidator: Output format edge
 value type is not known

At this point, what I would recommend though is to try
the same experiment with our Giraph 1.1.0 RC0 available
from the following URL:
   http://people.apache.org/~rvs/giraph-1.1.0-RC0/giraph-dist-1.1.0-bin.tar.gz

Thanks,
Roman.


Re: Are the unit tests supposed to fail?

2014-07-08 Thread Roman Shaposhnik
On Mon, Jul 7, 2014 at 4:50 PM, Toshio ITO toshio9@toshiba.co.jp wrote:
 Hi Roman,

 I previously reported some cases where Giraph unit tests failed.

 http://mail-archives.apache.org/mod_mbox/giraph-user/201406.mbox/%3c87a990dkni.wl%25toshio9@toshiba.co.jp%3E

This thread talks about hadoop_0.20.203 profile. I am not sure
this version of Hadoop gets a lot of attention in Giraph community.
Personally, I'd definitely not treat any failures in that profile
as release blockers.

 http://mail-archives.apache.org/mod_mbox/giraph-user/201407.mbox/%3C8761jgqjec.wl%25toshio9.ito%40toshiba.co.jp%3E

This one is more interesting. I can't reproduce your
failures in Rexster I/O Format in my environment.

As for -Dprop.mapred.job.tracker=localhost:54311 -- I've
never seen anybody running Giraph unit tests that way.

You're right that in theory it should work and it would be
useful for us to understand why it fails. I may be able to
look at it, but since it happens to be a pretty unorthodox
way of running unit tests, I don't think it'll be a release
blocker all by itself.

 Because it seems I'm almost the only one in the user mailing list
 who cares about the unit tests

I think all of us do, but the thing is -- we run them as pure
unit tests -- you run them as a combination of unit/system
tests. If you can make them work both ways that would
be appreciated regardless of whether your fixes end
up in 1.1.0 or not.

 I just wonder whether it is normal (or expected) for the unit tests
 to fail at this stage of development (release-1.1.0-RC0).

Pure unit tests are definitely expected to pass. They do pass
on our Jenkins, hence my suspicion that something is
wrong with your env.

Thanks,
Roman.


Re: Giraph 1.1.0 and Jetty 7

2014-07-06 Thread Roman Shaposhnik
On Fri, Jul 4, 2014 at 9:11 AM, Carlo Sartiani sarti...@gmail.com wrote:
 We examined the code of Giraph 1.1.0, but we actually did not find any place
 where a Collector object is created and/or Jetty is really used.

 Do you have  any idea on how to solve this issue? Please, observe that we
 are forced to use this Hadoop distribution and cannot switch to a plain
 Hadoop 2.2.0 distribution with Jetty 6.

Here's what I'd suggest: take 1.1.0-RC0 source code and manually change
the version of Jetty in the top level pom.xml. Rebuild everything with
-Phadoop_2
and also specify the exact version of your Hadoop with -Dhadoop.version=X.Y.Z
for good measure.
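
The whole sequence might look like this (a sketch; `X.Y.Z` stands for your distribution's Hadoop version, and the exact Jetty property to edit should be checked in the top-level pom.xml rather than taken from here):

```shell
git checkout release-1.1.0-RC0
# edit the Jetty version in the top-level pom.xml, then rebuild:
mvn clean package -Phadoop_2 -Dhadoop.version=X.Y.Z -DskipTests
```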

Thanks,
Roman.


Re: Giraph (1.1.0-SNAPSHOT and 1.0.0-RC3) unit tests fail

2014-07-01 Thread Roman Shaposhnik
Yes, the failures around Accumulo in hadoop_2 profile are expected and nothing
to worry about. I should've probably mentioned it in my RC announcement email.
Sorry about that.

Any failures in hadoop_1 profile would be a reason to reconsider RC0.

Thanks,
Roman.

P.S. This is one of the reasons we're still running with hadoop_1 as a default
profile.

On Mon, Jun 30, 2014 at 3:09 AM, Akila Wajirasena
akila.wajiras...@gmail.com wrote:
 Hi Roman,

 I got the same error when running hadoop_2 profile.
 According to this [1] the Accumulo version we use in giraph (1.4) is not
 compatible with Hadoop 2.
 I think this is the issue.

 [1]
 http://apache-accumulo.1065345.n5.nabble.com/Accumulo-Hadoop-version-compatibility-matrix-tp3893p3894.html

 Thanks

 Akila


 On Mon, Jun 30, 2014 at 2:21 PM, Toshio ITO toshio9@toshiba.co.jp
 wrote:

 Hi Roman.

 I checked out release-1.1.0-RC0 and succeeded to build it.

 $ git checkout release-1.1.0-RC0
 $ mvn clean
 $ mvn package -Phadoop_2 -DskipTests
 ## SUCCESS

 However, when I ran the tests with LocalJobRunner, it failed.

 $ mvn clean
 $ mvn package -Phadoop_2

 It passed tests from Core and Examples, but it failed at
 Accumulo I/O.


 testAccumuloInputOutput(org.apache.giraph.io.accumulo.TestAccumuloVertexFormat)

 The error log contained the following exception

 java.lang.IncompatibleClassChangeError: Found interface
 org.apache.hadoop.mapreduce.JobContext, but class was expected


 Next I wanted to run the tests with a running Hadoop2 instance, but
 I'm having trouble to set it up (I'm quite new to Hadoop).

 Could you show me some example configuration (etc/hadoop/* files) of
 Hadoop 2.2.0 single-node cluster? That would be very helpful.




 
  On Sun, Jun 29, 2014 at 5:06 PM, Toshio ITO toshio9@toshiba.co.jp
  wrote:
   Hi Roman.
  
   Thanks for the reply.
  
   OK, I'll try hadoop_1 and hadoop_2 with the latest
   release-1.1.0-RC0 and report the result.
 
  That would be extremely helpful!
 
  And speaking of which -- I'd like to remind folks
  that taking RC0 for a spin would really help
  at this point. If we ever want to have 1.1.0 out
  we need the required PMC votes.
 
  Thanks,
  Roman.
 
 Toshio Ito









Re: Giraph (1.1.0-SNAPSHOT and 1.0.0-RC3) unit tests fail

2014-06-26 Thread Roman Shaposhnik
Please try profiles hadoop_1 and hadoop_2. These
are the two profiles that we're targeting for the upcoming
1.1.0 release. The rest of profiles are optional and
may or may NOT work.

Thanks,
Roman.

On Thu, Jun 26, 2014 at 3:34 AM, Toshio ITO toshio9@toshiba.co.jp wrote:
 Hi all,

 I recently tried Giraph.

 It compiled fine, but when I did the unit test (i.e. `mvn test`) it
 failed.  I did the test several times with different settings. See
 below for details.

 My question is:

 - Is it normal for Giraph? Is the unit test supposed to fail for now?
 - If it's not normal, what do you think is wrong?


 My environment:

 - Ubuntu Server 14.04 64-bit
 - Dynamic address by DHCP
 - Followed the Quick Start guide
   https://giraph.apache.org/quick_start.html
   except I used localhost for the node hostname.
 -- Installed openjdk-7-jdk, git, maven by apt-get
 -- Installed hadoop-0.20.203.0rc1 as instructed


 Test settings:

 (1) 1.1.0-SNAPSHOT (Commit ID: b218d72cedc52467e691c6002e596e482d8583e4)
 with LocalJobRunner
 (2) 1.1.0-SNAPSHOT with the running Hadoop instance
 (3) Tag release-1.0.0-RC3 with LocalJobRunner
 (4) Tag release-1.0.0-RC3 with the running Hadoop instance

 Cases (1) and (3) are run by the following commands.

 $ mvn clean
 $ mvn package -Phadoop_0.20.203 -DskipTests
 $ mvn test -Phadoop_0.20.203

 Cases (2) and (4) are run by the following commands.

 $ mvn clean
 $ mvn package -Phadoop_0.20.203 -DskipTests
 $ mvn test -Phadoop_0.20.203 -Dprop.mapred.job.tracker=localhost:54311


 Test results:

 (1) Failed at Rexster I/O Format. At some point, the test
 endlessly tried to connect to ZooKeeper, repeating the following log
 messages. I had to terminate the test by Ctrl-C.

 14/06/26 18:10:27 INFO zookeeper.ClientCnxn: Opening socket connection to 
 server localhost/127.0.0.1:22182. Will not attempt to authenticate using SASL 
 (unknown error)
 14/06/26 18:10:27 WARN zookeeper.ClientCnxn: Session 0x146d76e16750002 
 for server null, unexpected error, closing socket connection and attempting 
 reconnect
 java.net.ConnectException: Connection refused
 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
 at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
 at 
 org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
 at 
 org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)


 (2) Failed at Examples. Hadoop was stuck at testBspFail
 job. Logs like the following line were printed every 5 seconds for
 10 minutes.  After that, the test endlessly tried to connect to
 ZooKeeper, just like the case (1). I killed the test process and the
 Hadoop job.

 14/06/26 18:40:30 INFO job.JobProgressTracker: Data from 2 workers - 
 Compute superstep 20: 10 out of 10 vertices computed; 6 out of 6 partitions 
 computed; min free memory on worker 1 - 137.42MB, average 149.12MB


 (3) Failed at HBase I/O in testHBaseInputOutput
 (org.apache.giraph.io.hbase.TestHBaseRootMarkerVertextFormat). Before
 it reported failure, it blocked for 10 minutes.


 (4) Failed at Core in testContinue
 (org.apache.giraph.io.TestJsonBase64Format) . Hadoop was stuck at
 the second testContinue job. After 10 minutes, the test went on
 and reported the failure. The testContinue map task was aborted
 with the following error.

 java.io.FileNotFoundException: /tmp/_giraphTests/testContinue/_logs (Is a 
 directory)



 Thanks in advance.

 
 Toshio Ito


Re: Is it possible to know the mapper task a particular vertex is assigned to?

2014-03-05 Thread Roman Shaposhnik
On Wed, Mar 5, 2014 at 9:53 PM, Pankaj Malhotra pankajiit...@gmail.com wrote:
 Hi,

 How can I find the mapper task a particular vertex is assigned to?
 I can do this by doing a sysout and then looking at the logs. But there must
 be a smarter way to do this. Please suggest.

That mapping is not static and can change. In theory you can rely on
the info in ZK, but that would be relying on what is, essentially, an
implementation detail of Giraph.

What's the reason for you to need this info?

Thanks,
Roman.


Giraph talks at Hadoop Summit

2014-02-27 Thread Roman Shaposhnik
Hi!

not sure if anybody from the Giraph community
submitted any talks to Hadoop Summit, but
here's the one I submitted:

https://hadoopsummit.uservoice.com/forums/242790-committer-track/suggestions/5568061-apache-giraph-start-analyzing-graph-relationships

Feel free to upvote if you feel like Giraph deserves
to be well represented at Hadoop Summit.

Thanks,
Roman.


Re: Issue in running giraph example in hadoop2.2

2014-02-23 Thread Roman Shaposhnik
You need to configure Giraph not to split workers and master via
giraph.SplitMasterWorker=false

You can either set it in giraph-site.xml or pass via command
line option -ca giraph.SplitMasterWorker=false
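
The equivalent giraph-site.xml fragment would look roughly like this (a sketch; the file must be on Giraph's configuration path):

```xml
<!-- Disables split master/worker mode, which LocalJobRunner cannot support -->
<configuration>
  <property>
    <name>giraph.SplitMasterWorker</name>
    <value>false</value>
  </property>
</configuration>
```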

Thanks,
Roman.

On Sat, Feb 22, 2014 at 10:19 PM, Arun Kumar toga...@gmail.com wrote:
 While running Giraph over Hadoop 2.2 I am getting the below exception





 14/02/20 04:52:44 INFO Configuration.deprecation: mapred.job.tracker is
 deprecated. Instead, use mapreduce.jobtracker.address

 Exception in thread main java.lang.IllegalArgumentException:
 checkLocalJobRunnerConfiguration: When using LocalJobRunner, you cannot run
 in split master / worker mode since there is only 1 task at a time!

 at
 org.apache.giraph.job.GiraphJob.checkLocalJobRunnerConfiguration(GiraphJob.java:157)

 at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:225)

 at org.apache.giraph.GiraphRunner.run(GiraphRunner.java:94)

 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)

 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)

 at
 org.apache.giraph.GiraphRunner.main(GiraphRunner.java:124)

 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
 Method)

 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

 at java.lang.reflect.Method.invoke(Method.java:597)

 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)





 One of the solutions found in the archive was to modify
 /directory-to-hadoop/conf/mapred-site.xml with:



 <property>
   <name>mapred.tasktracker.map.tasks.maximum</name>
   <value>4</value>
 </property>

 <property>
   <name>mapred.map.tasks</name>
   <value>4</value>
 </property>



 But in my case it did not work..


 Can somebody help?


 Regards

 -Arun


Re: Problem running SimpleShortestPathsComputation on CDH5 with Hadoop 2.2.0

2014-02-19 Thread Roman Shaposhnik
On Wed, Feb 19, 2014 at 1:12 PM, Stefan Beskow stefan.bes...@sas.com wrote:
 Hi.

 I checked out and built Giraph for Cloudera CDH5 with Hadoop 2.2.0 using the
 following:

 git clone git://git.apache.org/giraph.git snapshot_from_git
 cd snapshot_from_git
 mvn -Phadoop_yarn -Dhadoop.version=2.2.0 clean package -DskipTests

It may help to build against CDH5 directly by:
   * manually adding repository.cloudera.com to the set of repos
   * specifying -Dhadoop.version=2.2.0-cdh5.0.0-beta-2
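
The first step roughly amounts to adding a repository entry to the top-level pom.xml (a sketch; the repository URL should be verified against Cloudera's current documentation before use):

```xml
<repository>
  <id>cloudera</id>
  <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
```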

 When I run the sample application
 org.apache.giraph.examples.SimpleShortestPathsComputation I get the
 following exception:

You need to provide way more logs from the YARN side for us to
make sense of it.

Thanks,
Roman.


Re: Giraph 1.1.0 Apache Giraph Distribution ........................ FAILURE and Unsupported major.minor version 51.0 error.

2014-02-06 Thread Roman Shaposhnik
Try applying this patch (you may need to apply it with a fuzz):
https://issues.apache.org/jira/browse/GIRAPH-794

and let us know if it helps (it should).

Thanks,
Roman.

On Thu, Feb 6, 2014 at 2:23 PM, Rocky Grey rockyg...@gmail.com wrote:
 Hi

 I pulled the latest repository from Git -
 http://git-wip-us.apache.org/repos/asf/giraph.git

 and tried to package the application using

 mvn -Phadoop_yarn -Dhadoop.version=2.2.0 -DskipTests clean package

 The process succeeded partially. Below is the snapshot of the package log.
 Did anybody else face the same problem ?

 ...
 
 ...
 [INFO]
 
 [INFO] Building Apache Giraph Distribution 1.1.0-SNAPSHOT
 [INFO]
 
 [WARNING] The POM for org.apache.giraph:giraph-hbase:jar:1.1.0-SNAPSHOT is
 missing, no dependency information available
 [WARNING] The POM for org.apache.giraph:giraph-accumulo:jar:1.1.0-SNAPSHOT
 is missing, no dependency information available
 [WARNING] The POM for org.apache.giraph:giraph-hcatalog:jar:1.1.0-SNAPSHOT
 is missing, no dependency information available
 [WARNING] The POM for org.apache.giraph:giraph-hive:jar:1.1.0-SNAPSHOT is
 missing, no dependency information available
 [INFO]
 
 [INFO] Reactor Summary:
 [INFO]
 [INFO] Apache Giraph Parent .. SUCCESS [4.283s]
 [INFO] Apache Giraph Core  SUCCESS [35.053s]
 [INFO] Apache Giraph Examples  SUCCESS [12.905s]
 [INFO] Apache Giraph Distribution  FAILURE [0.110s]
 [INFO]
 
 [INFO] BUILD FAILURE
 [INFO]
 
 [INFO] Total time: 52.782s
 [INFO] Finished at: Thu Feb 06 13:17:41 PST 2014
 [INFO] Final Memory: 41M/286M
 [INFO]
 
 [ERROR] Failed to execute goal on project giraph-dist: Could not resolve
 dependencies for project org.apache.giraph:giraph-dist:pom:1.1.0-SNAPSHOT:
 The following artifacts could not be resolved:
 org.apache.giraph:giraph-hbase:jar:1.1.0-SNAPSHOT,
 org.apache.giraph:giraph-accumulo:jar:1.1.0-SNAPSHOT,
 org.apache.giraph:giraph-hcatalog:jar:1.1.0-SNAPSHOT,
 org.apache.giraph:giraph-hive:jar:1.1.0-SNAPSHOT: Failure to find
 org.apache.giraph:giraph-hbase:jar:1.1.0-SNAPSHOT in
 http://repo1.maven.org/maven2 was cached in the local repository, resolution
 will not be reattempted until the update interval of central has elapsed or
 updates are forced - [Help 1]
 [ERROR]
 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e
 switch.
 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
 [ERROR]
 [ERROR] For more information about the errors and possible solutions, please
 read the following articles:
 [ERROR] [Help 1]
 http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
 [ERROR]
 [ERROR] After correcting the problems, you can resume the build with the
 command
 [ERROR]   mvn goals -rf :giraph-dist

 I tried to rerun the process multiple times but got the same error.


 On a second note when I tried to test the partially successful installation
 like mentioned below

 # hadoop jar
 $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.2.0-jar-with-dependencies.jar
 org.apache.giraph.GiraphRunner -h

 It failed with the following error.

 Exception in thread main java.lang.UnsupportedClassVersionError:
 org/apache/giraph/GiraphRunner : Unsupported major.minor version 51.0
 at java.lang.ClassLoader.defineClass1(Native Method)
 at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
 at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
 at
 java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
 at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
 at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:247)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:205)

 Any help would be appreciated.

 Thanks.


Re: Release date for 1.1.0

2014-01-30 Thread Roman Shaposhnik
On Thu, Jan 30, 2014 at 5:04 PM, Avery Ching ach...@apache.org wrote:
 I've upgraded HBase in GIRAPH-833 (HBase 0.90.5 -> 0.94.16)

 https://issues.apache.org/jira/browse/GIRAPH-833

Any reason it is not marked for 1.1.0?

Thanks,
Roman.


Re: Release date for 1.1.0

2014-01-15 Thread Roman Shaposhnik
It is the usual community-driven ASF process. Somebody
familiar with the project has to step forward as a Release Manager
and drive the release.

I did a few months back, but since then I went through a career
change that made it very difficult for me to find free cycles to
drive this release. I fully intend to pick up the slack beginning
of Feb. Given that, I think the beginning of March should be a
realistic deadline, but it all depends on the availability of
the Giraph PMC members to cast votes on the release
candidate.

That said, if there's anybody else who would want to
speed up this release I'd be more than happy to yield.

By and large though, ASF project typically don't give any
schedule for future releases. The way to speed it up is
to join the community, start contributing and volunteering
as RM.

Thanks,
Roman.

On Wed, Jan 15, 2014 at 5:02 PM, Zhu, Xia xia@intel.com wrote:
 Is it possible to release 1.1.0 before March 2014?


 Thanks,
 Xia
 -Original Message-
 From: Zhu, Xia [mailto:xia@intel.com]
 Sent: Wednesday, January 15, 2014 4:36 PM
 To: user@giraph.apache.org
 Subject: RE: Release date for 1.1.0

 May I know what are the Giraph release process?


 Thanks,
 Ivy
 -Original Message-
 From: shaposh...@gmail.com [mailto:shaposh...@gmail.com] On Behalf Of Roman 
 Shaposhnik
 Sent: Monday, January 06, 2014 9:22 PM
 To: user@giraph.apache.org
 Subject: Re: Release date for 1.1.0

 On Mon, Jan 6, 2014 at 6:13 AM, Ahmet Emre Aladağ aladage...@gmail.com 
 wrote:
 Hi,

 Are there any advances so far on the 1.1.0 release schedule?

 Unfortunately, with my recent job change driving 1.1.0 release dropped from 
 my list. I'll try to pick it up back this month. Still very much would like 
 to help make it happen.

 Thanks,
 Roman.


Re: Release date for 1.1.0

2014-01-15 Thread Roman Shaposhnik
Forgot to add, one way you can help is to look through
the list of JIRAs currently blocking the release:
   
https://issues.apache.org/jira/browse/GIRAPH-819?jql=project%3Dgiraph%20and%20fixversion%3D%221.1.0%22%20and%20status%20!%3D%20closed%20and%20status%20!%3D%20resolved

and helping to triage it.

Thanks,
Roman.

On Wed, Jan 15, 2014 at 5:02 PM, Zhu, Xia xia@intel.com wrote:
 Is it possible to release 1.1.0 before March 2014?


 Thanks,
 Xia
 -Original Message-
 From: Zhu, Xia [mailto:xia@intel.com]
 Sent: Wednesday, January 15, 2014 4:36 PM
 To: user@giraph.apache.org
 Subject: RE: Release date for 1.1.0

 May I know what are the Giraph release process?


 Thanks,
 Ivy
 -Original Message-
 From: shaposh...@gmail.com [mailto:shaposh...@gmail.com] On Behalf Of Roman 
 Shaposhnik
 Sent: Monday, January 06, 2014 9:22 PM
 To: user@giraph.apache.org
 Subject: Re: Release date for 1.1.0

 On Mon, Jan 6, 2014 at 6:13 AM, Ahmet Emre Aladağ aladage...@gmail.com 
 wrote:
 Hi,

 Are there any advances so far on the 1.1.0 release schedule?

 Unfortunately, with my recent job change driving 1.1.0 release dropped from 
 my list. I'll try to pick it up back this month. Still very much would like 
 to help make it happen.

 Thanks,
 Roman.


Re: Preconfigured BigTop running Giraph

2014-01-14 Thread Roman Shaposhnik
Hi Martin,

sorry for the belated reply. I am wondering how
did you configure your cluster? Did you use
Bigtop's puppet recipes or did you do it by hand?

It seems that Giraph is working fine on the toy
cluster that I'm deploying with the Bigtop bits.
But I'm using Puppet and the topology is really
simple.

Basically without seeing your hadoop and giraph
config files its pretty tough to answer your
question in any greater details.

Thanks,
Roman.

On Tue, Jan 7, 2014 at 5:44 AM, Martin Neumann mneum...@spotify.com wrote:
 Hej,

 I installed Giraph from the apache BigTop project and want to try to run
 some Giraph jobs locally on my machine. If I understood the website
 correctly it should be preconfigured to do so.
 But when I run a Giraph Job I get the following exception:

 Exception in thread main java.lang.IllegalStateException: Giraph's
 estimated cluster heap 2048MB ask is greater than the current available
 cluster heap of 0MB. Aborting Job.


 To me it sounds like a configuration problem, I'm not even sure if it comes
 from Giraph or from Yarn. If it is a configuration issue, is there a help page
 that tells you how to configure the environment correctly?
 Thanks for the help.





 here the log of the whole execution:

 spotify@spotify-ThinkPad-T430s:~$ giraph
 /home/spotify/workspace/GiraphExe/SpotifyConComp.jar
 spotifyConnectedComponent.ConComp -eif
 spotifyConnectedComponent.ConCompInput -of
 spotifyConnectedComponent.ConCompOutput -c
 spotifyConnectedComponent.MinTextCombiner -eip
 /home/spotify/workspace/GiraphExe/in/sample -op
 /home/spotify/workspace/GiraphExe/out/spotifyConnectedComponent.ConComp -w 1
 HADOOP_CONF_DIR=/etc/hadoop/conf
 SLF4J: Class path contains multiple SLF4J bindings.
 SLF4J: Found binding in
 [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: Found binding in
 [jar:file:/usr/lib/giraph/lib/slf4j-log4j12-1.7.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
 explanation.
 14/01/07 14:40:35 INFO utils.ConfigurationUtils: No vertex input format
 specified. Ensure your InputFormat does not require one.
 14/01/07 14:40:35 INFO yarn.GiraphYarnClient: Final output path is:
 hdfs://localhost:8020/home/spotify/workspace/GiraphExe/out/spotifyConnectedComponent.ConComp
 14/01/07 14:40:35 INFO service.AbstractService:
 Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
 14/01/07 14:40:35 INFO service.AbstractService:
 Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
 14/01/07 14:40:35 INFO yarn.GiraphYarnClient: Defaulting per-task heap size
 to 1024MB.
 Exception in thread main java.lang.IllegalStateException: Giraph's
 estimated cluster heap 2048MB ask is greater than the current available
 cluster heap of 0MB. Aborting Job.
 at
 org.apache.giraph.yarn.GiraphYarnClient.checkPerNodeResourcesAvailable(GiraphYarnClient.java:204)
 at
 org.apache.giraph.yarn.GiraphYarnClient.run(GiraphYarnClient.java:114)
 at org.apache.giraph.GiraphRunner.run(GiraphRunner.java:96)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
 at org.apache.giraph.GiraphRunner.main(GiraphRunner.java:126)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)


Re: giraph-hive having problem with cdh-4.4 (hadoop2.0)

2013-11-18 Thread Roman Shaposhnik
I could bet this is because the default version of Hive you're pulling
is compiled
against Hadoop 1, not Hadoop from CDH.

If you want to run against a CDH cluster -- you have to make sure you change
versions of all dependencies to be CDH ones (take a look at the properties
section of Giraph's root pom file).
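
In practice that means overriding the version properties when building. A sketch (the property names and the CDH version string are illustrative -- check the actual `<properties>` section of the root pom.xml for the names in use):

```xml
<properties>
  <hadoop.version>2.0.0-cdh4.4.0</hadoop.version>
  <hive.version>0.10.0-cdh4.4.0</hive.version>
</properties>
```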

Thanks,
Roman.

On Mon, Nov 18, 2013 at 6:11 PM, Ping Jin jinpi...@gmail.com wrote:
 hi,
 I'm a new user of giraph. I'm trying to setup giraph-hive to work with the
 cdh4.4 Hive and Hadoop.
 I successfully built and run the SimpleShortestPath job on cdh-4.4 Hadoop
 cluster.
 However when I try to setup Giraph-Hive and run a job through
 GiraphHiveRunner, I got following exceptions:

 Exception in thread main java.lang.IncompatibleClassChangeError: Found
 interface org.apache.hadoop.mapreduce.JobContext, but class was expected

 at
 com.facebook.hiveio.output.HiveApiOutputFormat.checkOutputSpecs(HiveApiOutputFormat.java:247)

 at
 org.apache.giraph.hive.output.HiveVertexOutputFormat.checkOutputSpecs(HiveVertexOutputFormat.java:108)

 at
 org.apache.giraph.io.internal.WrappedVertexOutputFormat.checkOutputSpecs(WrappedVertexOutputFormat.java:104)

 at
 org.apache.giraph.bsp.BspOutputFormat.checkOutputSpecs(BspOutputFormat.java:52)

 at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:984)

 at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945)

 at java.security.AccessController.doPrivileged(Native Method)

 at javax.security.auth.Subject.doAs(Subject.java:394)

 at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)

 at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945)

 at org.apache.hadoop.mapreduce.Job.submit(Job.java:566)

 at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:246)

 at org.apache.giraph.hive.HiveGiraphRunner.run(HiveGiraphRunner.java:275)

 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)

 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)

 at org.apache.giraph.hive.HiveGiraphRunner.main(HiveGiraphRunner.java:246)

 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

 at java.lang.reflect.Method.invoke(Method.java:597)

 at org.apache.hadoop.util.RunJar.main(RunJar.java:208)


 Any one can help me figure out what's going wrong?


 Thanks,

 -Ping


Re: Issue Running Giraph Job

2013-10-30 Thread Roman Shaposhnik
Do you have a dedicated Zookeeper ensemble running on your cluster?
It feels like something is conflicting on ports with an embedded one.
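
If a dedicated ensemble is available, Giraph can be pointed at it instead of spawning the embedded server -- a sketch using the giraph.zkList option (the host names are illustrative):

```shell
-ca giraph.zkList=zk1.example.com:2181,zk2.example.com:2181
```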

Thanks,
Roman.

On Wed, Oct 30, 2013 at 12:48 PM, Artie Pesh-Imam
artie.pesh-i...@tapad.com wrote:
 Hi all,

 I'm able to run this job locally but when I try to run against our cluster,
 the job seems to fail. Here's the log entries.

 It looks like it's not proceeding beyond superstep 0. Is there anyway to get
 a more detailed picture of what's going on?

 2013-10-30 19:38:04,467 WARN mapreduce.Counters: Group
 org.apache.hadoop.mapred.Task$Counter is deprecated. Use
 org.apache.hadoop.mapreduce.TaskCounter instead 2013-10-30 19:38:05,157 WARN
 org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use
 dfs.metrics.session-id 2013-10-30 19:38:05,158 INFO
 org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with
 processName=MAP, sessionId= 2013-10-30 19:38:05,634 INFO
 org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
 2013-10-30 19:38:05,639 INFO org.apache.hadoop.mapred.Task: Using
 ResourceCalculatorPlugin :
 org.apache.hadoop.util.LinuxResourceCalculatorPlugin@371e88fb 2013-10-30
 19:38:06,029 INFO org.apache.hadoop.mapred.MapTask: Processing split:
 'org.apache.giraph.bsp.BspInputSplit, index=-1, num=-1 2013-10-30
 19:38:06,047 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level
 remains at info 2013-10-30 19:38:06,117 INFO
 org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty.
 Assuming fatjar. 2013-10-30 19:38:06,117 INFO
 org.apache.giraph.graph.GraphTaskManager: setup: classpath @
 /d02/dfs/dn/tasktracker/taskTracker/artie.pesh-imam/jobcache/job_201310301559_0024/jars/job.jar
 for job jobs.ConnectedComponents 2013-10-30 19:38:06,155 INFO
 org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the
 directory _bsp/_defaultZkManagerDir/job_201310301559_0024 2013-10-30
 19:38:06,166 INFO org.apache.giraph.zk.ZooKeeperManager:
 createCandidateStamp: Creating my filestamp
 _bsp/_defaultZkManagerDir/job_201310301559_0024/_task/datanode05.prd.nj1.tapad.com
 0 2013-10-30 19:38:06,216 INFO org.apache.giraph.zk.ZooKeeperManager:
 getZooKeeperServerList: Got [datanode05.prd.nj1.tapad.com] 1 hosts from 1
 candidates when 1 required (polling period is 3000) on attempt 0 2013-10-30
 19:38:06,217 INFO org.apache.giraph.zk.ZooKeeperManager:
 createZooKeeperServerList: Creating the final ZooKeeper file
 '_bsp/_defaultZkManagerDir/job_201310301559_0024/zkServerList_datanode05.prd.nj1.tapad.com
 0 ' 2013-10-30 19:38:06,235 INFO org.apache.giraph.zk.ZooKeeperManager:
 getZooKeeperServerList: For task 0, got file
 'zkServerList_datanode05.prd.nj1.tapad.com 0 ' (polling period is 3000)
 2013-10-30 19:38:06,235 INFO org.apache.giraph.zk.ZooKeeperManager:
 getZooKeeperServerList: Found [datanode05.prd.nj1.tapad.com, 0] 2 hosts in
filename 'zkServerList_datanode05.prd.nj1.tapad.com 0 '
2013-10-30 19:38:06,236 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Trying to delete old directory /d06/dfs/dn/tasktracker/taskTracker/artie.pesh-imam/jobcache/job_201310301559_0024/work/_bspZooKeeper
2013-10-30 19:38:06,243 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Creating file /d06/dfs/dn/tasktracker/taskTracker/artie.pesh-imam/jobcache/job_201310301559_0024/work/_bspZooKeeper/zoo.cfg in /d06/dfs/dn/tasktracker/taskTracker/artie.pesh-imam/jobcache/job_201310301559_0024/work/_bspZooKeeper with base port 22181
2013-10-30 19:38:06,243 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Make directory of _bspZooKeeper = true
2013-10-30 19:38:06,243 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Delete of zoo.cfg = false
2013-10-30 19:38:06,246 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Attempting to start ZooKeeper server with command [/usr/java/jdk1.7.0_40/jre/bin/java, -Xmx512m, -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPauseMillis=100, -cp, /d02/dfs/dn/tasktracker/taskTracker/artie.pesh-imam/jobcache/job_201310301559_0024/jars/job.jar, org.apache.zookeeper.server.quorum.QuorumPeerMain, /d06/dfs/dn/tasktracker/taskTracker/artie.pesh-imam/jobcache/job_201310301559_0024/work/_bspZooKeeper/zoo.cfg] in directory /d06/dfs/dn/tasktracker/taskTracker/artie.pesh-imam/jobcache/job_201310301559_0024/work/_bspZooKeeper
2013-10-30 19:38:06,249 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Shutdown hook added.
2013-10-30 19:38:06,249 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to datanode05.prd.nj1.tapad.com:22181 with poll msecs = 3000
2013-10-30 19:38:06,253 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException java.net.ConnectException: Connection refused at

Re: Release date for 1.1.0

2013-10-29 Thread Roman Shaposhnik
+dev@

On Tue, Oct 29, 2013 at 9:47 AM, Avery Ching ach...@apache.org wrote:
> I would like that as well.  Does someone else want to coordinate this one?
> =)

If all of the PMC members are fine with a non-committer
driving a release -- I'd be more than happy to volunteer
as the RM for 1.1.0. That said, I would still need karma
for switching branches and for administering JIRA.

I really do want to help -- please let me know how to proceed.

Thanks,
Roman.


Re: External Documentation about Giraph

2013-05-30 Thread Roman Shaposhnik
On Wed, May 29, 2013 at 2:25 PM, Maria Stylianou mars...@gmail.com wrote:
> Hello guys,
>
> This semester I'm doing my master's thesis using Giraph on a daily basis.
> In my blog (marsty5.wordpress.com) I wrote some posts about Giraph; some of
> the new users may find them useful!
> And maybe some of the experienced ones can give me feedback and correct any
> mistakes :D
> So far, I described:
> 1. How to set up Giraph
> 2. What to do next - after setting up Giraph
> 3. How to run ShortestPaths
> 4. How to run PageRank

Good stuff! As a shameless plug, one more way
to install Giraph is via Apache Bigtop. All it takes is
hooking one of these files:

http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-Repository/label=fedora18/lastSuccessfulBuild/artifact/repo/bigtop.repo

http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-Repository/label=opensuse12/lastSuccessfulBuild/artifact/repo/bigtop.repo
to your yum/apt system and typing:
   $ sudo yum install hadoop-conf-pseudo giraph

In fact, we're about to release Bigtop 0.6.0 with Hadoop 2.0.4.1
and Giraph 1.0 -- so if anybody's interested in helping us
test this stuff, that would be really appreciated.

Thanks,
Roman.

P.S. There's quite a few other platforms available as well:

http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-Repository/


Re: [VOTE] Release Giraph 1.0 (rc0)

2013-04-12 Thread Roman Shaposhnik
On Fri, Apr 12, 2013 at 3:56 PM, Avery Ching ach...@apache.org wrote:
> Fellow Giraphers,
>
> We have our first release candidate since graduating from incubation.
> This is a source release, primarily due to the different versions of Hadoop
> we support with munge (similar to the 0.1 release).  Since 0.1, we've made A
> TON of progress on overall performance, optimizing memory use, split
> vertex/edge inputs, easy interoperability with Apache Hive, and a bunch of
> other areas.  In many ways, this is an almost totally different codebase.
> Thanks everyone for your hard work!

Indeed this is a VERY impressive amount of new functionality! Kudos!

Here's my feedback so far (before I pull the bits into Bigtop for more
integration testing). I hope to convince you guys that we may need
to spin an additional RC (#1-#3 -- with #4 being the subject of a special plea):
1. tarball contains the .git repo
2. tarball was generated in such a way that it makes tar on Ubuntu Linux
spew out tons of warnings
3. YARN profile is broken (GIRAPH-627 -- patch attached).
4. YARN profile is broken when compiled against hadoop-2.0.4
(GIRAPH-629 -- working on a patch)

And here we come to me pleading with the Giraph community (on
behalf of the Bigtop and Hadoop ones ;-)). I know that what I'm about
to ask is typically considered bad taste in the ASF, but
here I go: given the incompatibility between 2.0.3-alpha and
2.0.4-alpha, is there any chance we can delay Giraph 1.0 so that it is
fully compatible with 2.0.4? The 2.0.4 release is supposed to come
out at the end of next week, and I can volunteer to make Giraph
compatible with it.

Hadoop 2.0.4-alpha is kind of a big deal because, if everything
goes according to plan, 2.0.4 will be a stepping stone towards
the first Hadoop 2.X beta (and eventually GA). In my opinion, it is
way more important to be compatible with it.

I guess, if you guys really want to save a couple of days, an
alternative could be to agree on a Giraph 1.0.1 within a couple of weeks.
That, of course, will require cycles from whoever ends up being the 1.0.1 RM.

Finally, if we do spin a new RC, could we please follow the established
ASF model where the tarball itself gets the name of the final artifact
(in our case giraph-1.0.tar.gz) but the subdirectory name reflects the
name of the RC? Here's an example of the Hadoop 2.0.4 RC that the
Hadoop community is voting on right now:
http://people.apache.org/~acmurthy/hadoop-2.0.4-alpha-rc2/
As you can see, the name of the artifact looks exactly like the
final product of the release.
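To make the convention concrete, here is a minimal shell sketch with hypothetical version and RC numbers (a real release would of course tar the actual source tree, e.g. via `git archive --prefix=giraph-1.0/`, which would also keep the .git directory out of the artifact -- issue 1 above):

```shell
# Sketch of the naming convention: the staging directory carries the RC
# name, while the artifact (and the directory it unpacks into) carries
# the final release name. Contents here are placeholders.
set -e
work=$(mktemp -d)
mkdir -p "$work/src/giraph-1.0"                   # release contents
echo "placeholder" > "$work/src/giraph-1.0/README"
mkdir -p "$work/giraph-1.0-rc1"                   # RC-named staging dir
tar -C "$work/src" -czf "$work/giraph-1.0-rc1/giraph-1.0.tar.gz" giraph-1.0
ls "$work/giraph-1.0-rc1"                         # -> giraph-1.0.tar.gz
tar -tzf "$work/giraph-1.0-rc1/giraph-1.0.tar.gz" | head -n 1   # -> giraph-1.0/
```

Renaming the artifact after the vote passes then becomes unnecessary: only the staging path changes, not the file people download.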

Thanks,
Roman.