Error in dispatcher thread java.lang.NoSuchFieldError: HADOOP_CLASSPATH

2016-09-30 Thread Schweiger, Tom
Friends,

We recently had a Hadoop cluster upgrade, after which my Giraph applications 
failed with the following fatal exception in the yarn log:

2016-09-30 14:34:40,229 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
java.lang.NoSuchFieldError: HADOOP_CLASSPATH
    at org.apache.hadoop.mapreduce.v2.util.MRApps.setClasspath(MRApps.java:248)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.getInitialClasspath(TaskAttemptImpl.java:621)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:757)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createContainerLaunchContext(TaskAttemptImpl.java:821)
    ...

After several days of sleuthing, I found the fix.  If I set 
“mapreduce.job.classloader” to  “true”, the kidney stone passes and life 
returns to normal.

I submit this to the list as a humanitarian act, in case others encounter this 
problem.
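
For readers who keep job settings in a Hadoop configuration file, the setting described above might look like the fragment below. This is only a sketch: the property name and value come from this post, but the file it belongs in (for example the mapred-site.xml read by the submitting client) is an assumption that depends on how your jobs are configured. A NoSuchFieldError generally means that a class was compiled against a different version of another class than the one loaded at run time, and, as far as I understand the Hadoop documentation, this property requests a separate classloader for user classes, which would be consistent with that kind of mismatch after an upgrade.

```xml
<!-- Assumed location: the job's configuration, e.g. mapred-site.xml on the client -->
<property>
  <name>mapreduce.job.classloader</name>
  <value>true</value>
</property>
```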



Tom Schweiger
Senior Software Architect
Identity Services and Shared Data (ISSD)
ebay Seattle
411 108th Avenue NE
Bellevue, WA 98004
Office: (425) 586-2669
email: thschwei...@ebay.com



Giraph stops at superstep 0

2015-01-06 Thread Schweiger, Tom
I'm on a plane now. I'll look into your problem ASAP.

Sent from my phone.

-Original Message-
From: Mckie, Duncan [dmc...@ebay.com]
Received: Tuesday, 06 Jan 2015, 10:04AM
To: user@giraph.apache.org [user@giraph.apache.org]
Subject: Re: Need help on simple text based giraph input format

Hi Arghya,

I’m not sure if you received a reply or not, but here is a Text-based Vertex 
Adjacency List that I created:

https://github.com/duncanmckie/giraph/blob/release-1.1/giraph-examples/src/main/java/org/apache/giraph/examples/io/formats/TextTextNullTextInputFormat.java

This takes a tab-separated, vertex-oriented adjacency list as an input, and (as 
per your example) does not expect any edge weights.

You can easily create your own input format by customising the examples 
provided, changing the Writable classes and doing any preprocessing as required.
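
To make the input shape concrete, one line of the adjacency list described above could be parsed as in the plain-Java sketch below. It is not the linked TextTextNullTextInputFormat class and uses no Giraph API; the names AdjacencyLineParser and ParsedVertex are hypothetical. Splitting on any run of whitespace, rather than strictly on tabs, is an assumption made so that space-separated sample data also parses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: parses one line of a text adjacency list. */
final class AdjacencyLineParser {

  /** A vertex id plus the ids of its out-neighbours (no edge weights). */
  static final class ParsedVertex {
    final String id;
    final List<String> targets;

    ParsedVertex(String id, List<String> targets) {
      this.id = id;
      this.targets = targets;
    }
  }

  /** First token is the vertex id; every later token is a target vertex id. */
  static ParsedVertex parse(String line) {
    String[] tokens = line.trim().split("\\s+");
    if (tokens[0].isEmpty()) {
      throw new IllegalArgumentException("empty line");
    }
    List<String> targets =
        new ArrayList<>(Arrays.asList(tokens).subList(1, tokens.length));
    return new ParsedVertex(tokens[0], targets);
  }

  public static void main(String[] args) {
    ParsedVertex v = parse("ACT\tCTG\tCTT\tCTC");
    System.out.println(v.id + " -> " + v.targets); // prints ACT -> [CTG, CTT, CTC]
  }
}
```

In a real input format the same split would live in the reader, with each token wrapped in a Text instead of a String.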

Cheers,

Duncan


From: Arghya Kusum Das arghyakusumdas2...@gmail.com
Subject: Need help on simple text based giraph input format
Date: Fri, 05 Dec 2014 01:36:14 GMT

Hi,
This is the first time I have tried to work with a Giraph input format class,
and I need some help.

I want an input format class where the vertex id, vertex value and edge weight
are all simple text.

E.g., I have a simple graph like the following:

AAA  AAT AAG
AAC  ACG ACT
AAG AGT AGA
AAT ATT
ACG CGA
ACT CTG CTT CTC

The first column is the vertex id and the following columns are the edges
(adjacency list). As you can see, the edge weight or vertex id can also be null.

Is there a predefined input format in Giraph for this?
If so, which one is it? And if not, can anybody provide a simple class for that?

-- It is not possible to come up with different integer ids for different
nodes, because a vertex id can be even more than 50 characters long (this is
just a simple excerpt).  So I need a very simple text-based vertex input
format where each line represents a vertex and its outgoing edges
(adjacency list).

--
Thanks and regards,
Arghya Kusum Das
(225-270-6163)




RE: How do I output only a subset of a graph?

2014-08-25 Thread Schweiger, Tom

I think you answered your own question when you asked, "Or am I supposed to 
write a VertexOutputFormat implementation that generates no output for the 
vertices that have no data?" The answer is YES!

But don't be put off; it is actually a very simple class to override.  Here is 
an example for something like you describe:


package com.ebay.foo.bar.giraph.io.formats;

import org.apache.giraph.graph.Vertex;
import org.apache.giraph.io.formats.TextVertexOutputFormat;
import org.apache.hadoop.io.BooleanWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

import java.io.IOException;

public class ExampleOutputFormat extends
    TextVertexOutputFormat<Text, Text, BooleanWritable> {

  public class ExampleWriter extends TextVertexWriter {

    @Override
    public void writeVertex(
        Vertex<Text, Text, BooleanWritable> vertex)
        throws IOException, InterruptedException {
      // Write only the vertices that carry data; skip the rest.
      if (!vertex.getValue().toString().isEmpty()) {
        getRecordWriter().write(vertex.getId(), vertex.getValue());
      }
    }

  }

  @Override
  public TextVertexWriter createVertexWriter(TaskAttemptContext context)
      throws IOException, InterruptedException {
    return new ExampleWriter();
  }

}



Thomas A J Schweiger
Sr. Software Architect
GDI-Inc Data Services-Seattle

Office: (425) 586-2669
email: thschwei...@ebay.com

From: matthewcorn...@gmail.com [matthewcorn...@gmail.com] on behalf of Matthew 
Cornell [m...@matthewcornell.org]
Sent: Monday, August 25, 2014 11:38 AM
To: user
Subject: How do I output only a subset of a graph?

Hi Folks. I have a graph computation that starts with a subset of vertices of a 
certain type and propagates information through the graph to a set of target 
vertices, which are also subset of the graph. I want to output only information 
from those particular vertices, but I don't see a way to do this in the various 
VertexOutputFormat subclasses, which all seem oriented to outputting something 
for every vertex in the graph. How do I do this? E.g., are there hooks for the 
output phase where I can filter output? Or am I supposed to write a 
VertexOutputFormat implementation that generates no output for the vertices 
that have no data? Thanks in advance.

--
Matthew Cornell | m...@matthewcornell.org | 
413-626-3621 | 34 Dickinson Street, Amherst MA 01002 | 
http://matthewcornell.org


RE: concept of vertex in giraph

2014-07-25 Thread Schweiger, Tom
Edges combine differently than vertexes.

By default, each edge you read is added to the adjacency set of the source 
vertex (all edges are directed in Giraph, if you had not realized that yet).  
So if you read multiple edges for the same source-target pair, they will all be 
represented in the source vertex's edges.

If you actually need to combine edges, there are two ways to go about it.

1) (easy but inelegant) deal with the duplicates in your compute

2) (more involved but efficient) write your own OutEdges class, unless one 
already exists that does what you need.
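
Option 1 above can be sketched in plain Java as follows. This is an illustration, not Giraph code: the Edge and ParallelEdgeMerger classes are hypothetical stand-ins, and summing the weights of duplicate edges is only one possible combining rule, chosen here as an assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: collapses parallel edges inside a compute step. */
final class ParallelEdgeMerger {

  /** One directed out-edge as read from the input: target id and weight. */
  static final class Edge {
    final String target;
    final double weight;

    Edge(String target, double weight) {
      this.target = target;
      this.weight = weight;
    }
  }

  /** Merges edges that share a target by summing their weights. */
  static Map<String, Double> mergeByTarget(List<Edge> edges) {
    Map<String, Double> merged = new LinkedHashMap<>();
    for (Edge e : edges) {
      merged.merge(e.target, e.weight, Double::sum);
    }
    return merged;
  }

  public static void main(String[] args) {
    List<Edge> edges = Arrays.asList(
        new Edge("B", 1.0), new Edge("C", 2.0), new Edge("B", 0.5));
    System.out.println(mergeByTarget(edges)); // prints {B=1.5, C=2.0}
  }
}
```

Doing this on every superstep repeats the work each time, which is why the OutEdges route is described as the more efficient one.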



From: Carmen Manzulli [carmenmanzu...@gmail.com]
Sent: Friday, July 25, 2014 1:56 AM
To: user@giraph.apache.org
Subject: Re: concept of vertex in giraph

Ah OK, thanks a lot! So is it the same for edge values and target vertex ids? I 
need to use combiners; can you show me where I can read more about them?


2014-07-25 10:52 GMT+02:00 Lukas Nalezenec 
lukas.naleze...@firma.seznam.cz:
Hi,
Afaik vertex ids must be unique but you can combine vertexes with same ID to 
one using VertexValueCombiner.

Lukas


On 25.7.2014 10:33, Carmen Manzulli wrote:
 Hi experts,
I would like to ask whether, if a vertexId is repeated in the graph 
representation, Giraph would treat it as a single vertex.

for example:

Carmen (vertexId) 24 (vertex value) .
Carmen (vertexId) 1,60 m (vertex value)...

does it become

Carmen --24
--1,60

from a conceptual point of view?




RE: Setting variable value in Compute class and using it in the next superstep

2014-07-21 Thread Schweiger, Tom
And in answer to:

"This post also suggests (along with what I described above) to have a field in 
the vertex value itself. For that I need to change the vertex input format and 
also create my own custom vertex class. Is it really necessary?"

No, you don't need a custom vertex class or vertex input format. You can 
create/initialize the value at the beginning of the first superstep.
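
The idea of initializing the value in the first superstep and keeping the flag in the vertex value can be sketched in plain Java as below. This is a simulation, not Giraph code: VertexValue, compute and run are hypothetical stand-ins for the vertex value type, the compute method and the framework's superstep loop.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java stand-in for a job; none of these names are Giraph API. */
final class FlagInVertexValueDemo {

  /** Per-vertex state that must survive from one superstep to the next. */
  static final class VertexValue {
    boolean blockAlreadyRan;  // the "run this block only once" flag
    int timesBlockRan;        // counts how often the guarded block executed
  }

  /** Stand-in for compute(): called once per vertex per superstep. */
  static void compute(long superstep, VertexValue value) {
    if (superstep == 0) {
      // Initialize the state here rather than in the input format.
      value.blockAlreadyRan = false;
      value.timesBlockRan = 0;
    }
    if (!value.blockAlreadyRan) {
      value.timesBlockRan++;         // the block that must run only once
      value.blockAlreadyRan = true;  // remembered because it lives in the value
    }
  }

  /** Runs the given number of supersteps over the given vertex ids. */
  static Map<String, VertexValue> run(int supersteps, String... vertexIds) {
    Map<String, VertexValue> values = new LinkedHashMap<>();
    for (String id : vertexIds) {
      values.put(id, new VertexValue());
    }
    for (long s = 0; s < supersteps; s++) {
      for (VertexValue v : values.values()) {
        compute(s, v);
      }
    }
    return values;
  }

  public static void main(String[] args) {
    System.out.println(run(3, "a", "b").get("a").timesBlockRan); // prints 1
  }
}
```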


From: Sardeshmukh, Vivek [vivek-sardeshm...@uiowa.edu]
Sent: Monday, July 21, 2014 2:05 PM
To: user@giraph.apache.org
Subject: Setting variable value in Compute class and using it in the next 
superstep


Hi, all--


In my algorithm, I need to set a flag if certain conditions hold (locally at a 
vertex v). If this flag is set, then execute some other block of code *only 
once*, and do nothing until some other condition holds.


My question is, can I declare a flag variable in the class where I override the 
compute function? I defined the flag as a public variable and set it once 
the conditions are met, but it seems the value is not carried over to the next 
superstep.

I dug a little in this mailing list and found this

https://www.mail-archive.com/user@giraph.apache.org/msg01266.html


This post also suggests (along with what I described above) to have a field in 
the vertex value itself. For that I need to change the vertex input format and 
also create my own custom vertex class. Is it really necessary?


By the way, I am using Giraph 1.1.0 compiled against Hadoop 1.0.3. I was able 
to run SimpleShortestPathComputation successfully.


Here are more technical details of my algorithm: I am trying to implement the 
Delta-stepping shortest path algorithm ( 
http://dl.acm.org/citation.cfm?id=740136 or 
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.46.2200 ). This was 
mentioned in the Pregel paper. A vertex relaxes light edges if it belongs to the 
minimum bucket index (of course, aggregators!). Once a vertex is done 
relaxing light edges, it relaxes heavy edges once (here is where I need a flag). 
A vertex may be re-inserted into a newer bucket and may have to execute all the 
steps that I described here again.


Thanks.


Sincerely,

Vivek
A beginner in Giraph (and Java too!)



RE: Setting variable value in Compute class and using it in the next superstep

2014-07-21 Thread Schweiger, Tom

For more than one flag, a custom class is necessary (unless you're able to, 
say, toggle the sign bit to get double usage out of a value).

I've started a private thread with Vivek to get a better understanding of what 
he was trying to solve.

And you are also correct that there isn't much to writing a custom vertex 
class.  The key is making sure you read and write in the same order.  Likewise, 
extending a vertex reader can be quite simple.
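
The point about reading and writing in the same order can be sketched as below. The class is hypothetical and deliberately free of Hadoop imports so that it runs on its own; in a real job it would implement org.apache.hadoop.io.Writable, whose write and readFields methods have the signatures used here. The field names and the roundTrip helper are assumptions added for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical vertex value holding two flags and a distance. */
final class TwoFlagVertexValue {
  boolean lightDone;
  boolean heavyDone;
  double distance;

  /** Serializes the fields; the order here defines the wire format. */
  public void write(DataOutput out) throws IOException {
    out.writeBoolean(lightDone);
    out.writeBoolean(heavyDone);
    out.writeDouble(distance);
  }

  /** Reads the fields back in exactly the order write() emitted them. */
  public void readFields(DataInput in) throws IOException {
    lightDone = in.readBoolean();
    heavyDone = in.readBoolean();
    distance = in.readDouble();
  }

  /** Illustration helper: serializes a value and reads it into a new object. */
  static TwoFlagVertexValue roundTrip(TwoFlagVertexValue value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      value.write(new DataOutputStream(bytes));
      TwoFlagVertexValue copy = new TwoFlagVertexValue();
      copy.readFields(
          new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    TwoFlagVertexValue v = new TwoFlagVertexValue();
    v.lightDone = true;
    v.distance = 2.5;
    TwoFlagVertexValue copy = roundTrip(v);
    System.out.println(copy.lightDone + " " + copy.heavyDone + " " + copy.distance);
  }
}
```

If readFields read the double before the booleans, the bytes would be misinterpreted without any error being raised, which is why the matching order matters.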


From: Matthew Saltz [sal...@gmail.com]
Sent: Monday, July 21, 2014 3:09 PM
To: user@giraph.apache.org
Subject: Re: Setting variable value in Compute class and using it in the next 
superstep

Tom,

If it's necessary to store more than one flag though, for example, won't a 
custom class be necessary? I'm a beginner too, so I apologize if I'm incorrect 
about that. Just to clarify, to keep persistent data for a vertex from one 
superstep to the next, it is necessary to encapsulate it in the type used for 
the 'V', right? In other words, if Vivek tries to use a normal member variable 
for the Computation class, it won't work, will it?

Also, just to point out, there actually isn't too much involved with writing 
your own custom vertex class. Here's a quick example 
(https://gist.github.com/saltzm/692fba1d3aade035ce9c) to get you 
started. Within your compute() method you can access the data in this class by 
doing

SampleVertexData d = vertex.getValue();

and then using d.setFlag(true) or boolean currentFlag = d.getFlag(), for 
example.  And your computation class is now something like

public class MyComputation extends BasicComputation<IdType, SampleVertexData, 
    EdgeType, MessageType> {
  @Override
  public void compute(Vertex<IdType, SampleVertexData, EdgeType> vertex, 
      Iterable<MessageType> messages) { ... }

...

}

As a warning, for this class I'm using Hadoop 0.20.203 and I'm also a beginner, 
so take everything I say with a grain of salt, and Tom please correct me if I'm 
wrong.

Best of luck,
Matthew






Setting vertex data in an EdgeInputFormat

2014-06-17 Thread Schweiger, Tom

My graph uses vertex values which I would like to be able to set when I create 
an edge.  I know how to set the vertex values when reading from a 
VertexInputFormat, but for reasons best left out of this discussion I need to 
be able to set them when creating edges using a custom EdgeInputFormat and 
EdgeReader.   I have a VertexValueCombiner that ensures the proper behavior.

Is it possible to assign vertex values when I create an edge in an EdgeReader?



RE: finding vertex at certain distance

2014-04-20 Thread Schweiger, Tom
That sounds like a breadth-first search.

Start with every vertex at a distance of MAX_INTEGER.

At the starting vertex, set its distance to zero and send a distance of one to 
all its neighbors. As vertexes get messages, if the message distance is less 
than the current distance, set the current distance to the message distance and 
send current distance + 1 to all of that vertex's neighbors.

That will give you the shortest-path distance from the source to every vertex. 
You can stop early if your computation knows the target distance.
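
The steps above can be sketched in plain Java as a superstep-style simulation. This is not Giraph code: the class and method names are hypothetical, the inbox and outbox maps stand in for message delivery, and keeping only the smallest message per vertex is an assumption that plays the role a minimum combiner would.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical simulation of message-passing breadth-first search. */
final class BfsSuperstepSketch {

  static List<String> neighbours(Map<String, List<String>> graph, String v) {
    return graph.getOrDefault(v, Collections.emptyList());
  }

  /** Hop distance from source; vertices not reached stay at Integer.MAX_VALUE. */
  static Map<String, Integer> distances(
      Map<String, List<String>> graph, String source, int maxDistance) {
    // Every vertex starts at "infinity".
    Map<String, Integer> dist = new HashMap<>();
    for (Map.Entry<String, List<String>> e : graph.entrySet()) {
      dist.putIfAbsent(e.getKey(), Integer.MAX_VALUE);
      for (String target : e.getValue()) {
        dist.putIfAbsent(target, Integer.MAX_VALUE);
      }
    }
    // First superstep: the source takes distance 0 and sends 1 to its neighbours.
    dist.put(source, 0);
    Map<String, Integer> inbox = new HashMap<>();
    if (maxDistance > 0) {
      for (String n : neighbours(graph, source)) {
        inbox.merge(n, 1, Integer::min);
      }
    }
    // Later supersteps: a vertex improves its distance and forwards distance + 1.
    while (!inbox.isEmpty()) {
      Map<String, Integer> outbox = new HashMap<>();
      for (Map.Entry<String, Integer> message : inbox.entrySet()) {
        String vertex = message.getKey();
        int candidate = message.getValue();
        if (candidate < dist.get(vertex)) {
          dist.put(vertex, candidate);
          if (candidate < maxDistance) {  // early stop at the target distance
            for (String n : neighbours(graph, vertex)) {
              outbox.merge(n, candidate + 1, Integer::min);
            }
          }
        }
      }
      inbox = outbox;
    }
    return dist;
  }

  /** Vertices exactly k hops from the source, in sorted order. */
  static TreeSet<String> atDistance(
      Map<String, List<String>> graph, String source, int k) {
    TreeSet<String> result = new TreeSet<>();
    for (Map.Entry<String, Integer> e : distances(graph, source, k).entrySet()) {
      if (e.getValue() == k) {
        result.add(e.getKey());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, List<String>> graph = new HashMap<>();
    graph.put("a", Arrays.asList("b", "c"));
    graph.put("b", Arrays.asList("d"));
    graph.put("c", Arrays.asList("d"));
    graph.put("d", Arrays.asList("e"));
    System.out.println(atDistance(graph, "a", 2)); // prints [d]
  }
}
```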

Sent from my phone using large fingers and a small keypad.

-Original Message-
From: yeshwanth kumar [yeshwant...@gmail.com]
Received: Sunday, 20 Apr 2014, 10:57AM
To: user@giraph.apache.org [user@giraph.apache.org]
Subject: finding vertex at certain distance

Hi, I am trying to find the vertices that are at an equal distance from a given 
vertex, using Giraph.
Can someone suggest a good way to do it?

Thanks.


Starting a second computation

2014-04-18 Thread Schweiger, Tom

Hello Giraph list,

I have a problem that has two steps.  Step 2 needs to start after step 1 
completes.  Step 1 is completed when all the vertices have voted to halt and 
there are no more messages. 

I know I can switch my computes using a MasterCompute, but it is unclear how I 
re-awaken all the vertices.

Has anyone else solved a problem like this?  If so, how did you do it?  Is 
there an easier way to do this?

Basically I'm thinking this:

class TwoStep {

  class TwoStepMaster extends DefaultMasterCompute {

    @Override
    public final void compute() {
      //
      // switch from StepOne to StepTwo if StepOne is done
      //
      if (this.isHalted()
          && this.getComputation().equals(StepOne.class)) {
        setComputation(StepTwo.class);
        // send a message to all vertices???
        // unhalt somehow??
        // suggestions anyone??
      }
    }
  }

  class StepOne extends BasicComputation {
    public void compute(...) {
      // do step one stuff
      vertex.voteToHalt();
    }
  }

  class StepTwo extends BasicComputation {
    public void compute(...) {
      // do step two stuff
      vertex.voteToHalt();
    }
  }

}