Re: where is the log files

2015-05-12 Thread Young Han
Oops, the job names have year, month, day and also hour, minute. So something like job_201505121130_0001. Young On Tue, May 12, 2015 at 12:24 PM, Young Han wrote: > Suppose you've created HDFS in ~/. Then the log files are in > ~/hadoop_data/hadoop_local-YOURUSER/userlogs/job_yy
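
For readers trying to locate the right userlogs directory, here is a minimal sketch of how a name of the form job_201505121130_0001 can be put together. The timestamp layout (year, month, day, hour, minute) comes from the message above; the four-digit zero-padded job number is only inferred from that one example, and the class and method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class JobDirNameSketch {
    // Year, month, day, hour, minute, as described in the message above.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMddHHmm");

    // Builds a directory name shaped like job_201505121130_0001.
    // Assumption: the job number is zero-padded to four digits.
    static String jobDirName(LocalDateTime hadoopStarted, int jobNumber) {
        return String.format("job_%s_%04d", hadoopStarted.format(STAMP), jobNumber);
    }

    public static void main(String[] args) {
        LocalDateTime started = LocalDateTime.of(2015, 5, 12, 11, 30);
        System.out.println(jobDirName(started, 1)); // prints job_201505121130_0001
    }
}
```

Listing the userlogs directory and matching on the date prefix is usually enough in practice; the sketch only makes the naming scheme explicit.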

Re: where is the log files

2015-05-12 Thread Young Han
Suppose you've created HDFS in ~/. Then the log files are in ~/hadoop_data/hadoop_local-YOURUSER/userlogs/job_mmdd_/attempt_mmdd__*/log-file where: - mmdd is the date you started Hadoop - is the job number - attempt_* will be different for different workers - log-file can

Re: Input format problems running Giraph 1.1.0 on Twitter dataset

2015-05-04 Thread Young Han
the previous superstep 2 >> >> Current step is 31 - 40383589 existed in the previous superstep 30 >> >> It seems that a subset of vertices still only become active after the >> first superstep, >> despite all vertices being initialized in superstep 0. I can't thin

Re: Input format problems running Giraph 1.1.0 on Twitter dataset

2015-04-29 Thread Young Han
For the initialization issue, you can define a (nested) class that extends DefaultVertexValueFactory (from org.apache.giraph.factories) and add "-Dgiraph.vertexValueFactoryClass=org.apache.giraph.examples.AlgClass\$AlgVertexValueFactory" after "org.apache.giraph.GiraphRunner" in your hadoop jar com

Re: SccCompitationTestInMemory - LimitExceededException

2015-03-11 Thread Young Han
This seems like the known problem with MapReduce counters. Try adding the following properties to your hadoop-*/conf/mapred-site.xml: mapreduce.job.counters.max = 100 and mapreduce.job.counters.limit = 100. This does the trick for me on Hadoop 1.0.4, and should work for 0.20 as we

Re: Undirected Vertex Definition and Reflexivity

2015-03-09 Thread Young Han
The input is assumed to be the vertex followed by a set of *directed* edges. So, in your example, leaving out E2 means that the final graph will not have the directed edge from V2 to V1. To get an undirected edge, you need a pair of directed edges. Internally, Giraph stores the out-edges of each v
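
The point about directed versus undirected edges can be illustrated with a small stand-alone sketch. It does not use Giraph; it only models "each vertex stores its out-edges" with a plain map, and the class and method names are hypothetical.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class UndirectedEdgeSketch {
    // Vertex id -> ids of the vertices its out-edges point to.
    private final Map<Integer, Set<Integer>> outEdges = new TreeMap<>();

    // Adds a single directed edge from -> to.
    void addDirectedEdge(int from, int to) {
        outEdges.computeIfAbsent(from, k -> new TreeSet<>()).add(to);
        outEdges.computeIfAbsent(to, k -> new TreeSet<>()); // make sure the target vertex exists
    }

    // An undirected edge is modelled as a pair of directed edges.
    void addUndirectedEdge(int a, int b) {
        addDirectedEdge(a, b);
        addDirectedEdge(b, a);
    }

    boolean hasEdge(int from, int to) {
        return outEdges.getOrDefault(from, Collections.emptySet()).contains(to);
    }

    public static void main(String[] args) {
        UndirectedEdgeSketch onlyOneDirection = new UndirectedEdgeSketch();
        onlyOneDirection.addDirectedEdge(1, 2);
        System.out.println(onlyOneDirection.hasEdge(2, 1)); // prints false

        UndirectedEdgeSketch bothDirections = new UndirectedEdgeSketch();
        bothDirections.addUndirectedEdge(1, 2);
        System.out.println(bothDirections.hasEdge(2, 1)); // prints true
    }
}
```

In input-file terms, this corresponds to listing V2 among V1's edges and V1 among V2's edges.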

Re: giraph.metrics.enable

2014-10-10 Thread Young Han
don't know where to locate the output? > > thanks > > On Fri, Oct 10, 2014 at 2:16 PM, Young Han wrote: > >> Use them as -Dgiraph.metrics.enable=true, after GiraphRunner but before >> you specify the algorithm of interest. In other words, >> >> ha

Re: giraph.metrics.enable

2014-10-10 Thread Young Han
Use them as -Dgiraph.metrics.enable=true, after GiraphRunner but before you specify the algorithm of interest. In other words, hadoop jar org.apache.giraph.GiraphRunner \ -Dgiraph.metrics.enable=true \ -Dgiraph.metrics.directory=dir \ org.apache.giraph.examples.SomeAlgorithm \ -ca

Re: Zookeeper server null error when running giraph

2014-07-24 Thread Young Han
Hi, Try making your hostname all lower case (sudo hostname , and change /etc/hostname). I think this may be an issue caused by/related to the GIRAPH-904 patch. Young On Thu, Jul 24, 2014 at 10:25 AM, Jing Fan wrote: > Does anyone know the reason of this error and the solution? > > Thank y

Re: ConnectedComponents example

2014-03-31 Thread Young Han
Pattern.compile("[\t ]"); >>> public static void main( String[] args ) >>> { >>> String line = "1 0 2"; >>> String[] tokens = SEPARATOR.split(line.toString()); >>> >>> System.out.println(SEPARATOR); >>> Syst

Re: ConnectedComponents example

2014-03-31 Thread Young Han
ntln(tokens.length); > > for(String token : tokens){ > > System.out.println(token); > } > } > } > > and the pattern worked as I thought it should by tab spaces. > > I'll try your test as well to double check > > > On Mon, Mar 31, 2014 at 9:34 PM, Youn
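
As a complement to the test quoted above, the following sketch shows how the separator pattern `[\t ]` from this thread behaves on a few inputs. The pattern is taken from the quoted code; the sample lines and the class name are made up for illustration, and this is not a claim about what caused the failure in the original job.

```java
import java.util.regex.Pattern;

class SeparatorSketch {
    // Same pattern as in the thread: exactly one tab or one space per split.
    private static final Pattern SEPARATOR = Pattern.compile("[\t ]");

    static String[] tokens(String line) {
        return SEPARATOR.split(line);
    }

    public static void main(String[] args) {
        // Single separators give one token per number.
        System.out.println(tokens("1 0 2").length);   // prints 3
        System.out.println(tokens("1\t0\t2").length); // prints 3

        // Two separators in a row produce an empty token in between.
        System.out.println(tokens("1  0 2").length);  // prints 4

        // Trailing empty tokens are dropped by split().
        System.out.println(tokens("1 0 2 ").length);  // prints 3
    }
}
```

An empty token would fail when parsed as a number, so one defensive option is a pattern such as `[\t ]+` together with trimming the line first; whether that matches what the input formats expect is an assumption to check against the Giraph source.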

Re: ConnectedComponents example

2014-03-31 Thread Young Han
, Mar 31, 2014 at 4:19 PM, ghufran malik wrote: > Yep, you're right, it is a bug with all the InputFormats I believe. I just > checked it with the Giraph 1.1.0 jar using the IntIntNullVertexInputFormat > and the example ConnectedComponents class and it worked like a charm with > just the n

Re: ConnectedComponents example

2014-03-31 Thread Young Han
> I removed the spaces and it worked! I don't understand though. I'm sure > the separator pattern means that it splits it by tab spaces?. > > Thanks for all your help though some what relieved now! > > Kind regards, > > Ghufran > > > On Mon, Mar 31, 2014 at 8

Re: ConnectedComponents example

2014-03-31 Thread Young Han
Hi, That looks like an error with the algorithm... What do the Hadoop userlogs say? And just to rule out weirdness, what happens if you use spaces instead of tabs (for your input graph)? Young On Mon, Mar 31, 2014 at 2:04 PM, ghufran malik wrote: > Hey, > > No even after I added the .txt it g

Re: ConnectedComponents example

2014-03-31 Thread Young Han
ing slots (ms)=0 > 14/03/31 17:54:59 INFO mapred.JobClient: Launched map tasks=2 > 14/03/31 17:54:59 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0 > 14/03/31 17:54:59 INFO mapred.JobClient: Failed map tasks=1 > > Any ideas to why this happened? Do you think I need

Re: ConnectedComponents example

2014-03-31 Thread Young Han
utFormat so that I know code wise > everything should be correct. > > I will be trying to implement the InputFormat class and > ConnectedComponents in the meantime and if I get it working before you > respond I'll update this post. > > Thanks > > Ghufran. > > >

Re: ConnectedComponents example

2014-03-30 Thread Young Han
Hey, As a sanity check, is the graph really loaded on HDFS? Do you see the correct results if you do "hadoop dfs -cat /user/ghufran/in/my_graph.txt"? (Where hadoop is your hadoop binary). Also, I noticed that your Giraph has been compiled for Hadoop 1.x, while the logs show Hadoop 0.20.203.0. May

Re: Java Process Memory Leak

2014-03-17 Thread Young Han
ull*) { > executionGroup.shutdownGracefully(); > ProgressableUtils.*awaitTerminationFuture*(executionGroup, > context); > } > > Notice that the first await termination call seems to be waiting on the > executionGroup instead of the workerGroup... > >

Re: Java Process Memory Leak

2014-03-17 Thread Young Han
Hi Young, > > Our Hadoop instance (Corona) kills processes after they finish executing > so we don't see this. You might want to do a jstack to see where it's hung > up on and figure out the issue. > > Thanks > > Avery > > > On 3/17/14, 7:56 AM, Young Han

Java Process Memory Leak

2014-03-17 Thread Young Han
Hi all, With Giraph 1.0.0, I've noticed an issue where the Java process corresponding to the job loiters around indefinitely even after the job completes (successfully). The process consumes memory but not CPU time. This happens on both a single machine and clusters of machines (in which case ever

Re: compilation error of giraph

2014-01-21 Thread Young Han
p, which > is built on top of the apache hadoop? > Again, thanks for extending the help. Highly appreciated. > Regards > Rob > > > > On Tue, Jan 21, 2014 at 2:22 PM, Young Han wrote: > > Hi, > > > > Could it be that you're missing the path to Hadoop? >

Re: compilation error of giraph

2014-01-21 Thread Young Han
Hi, Could it be that you're missing the path to Hadoop? Young On 2014-01-21 2:25 PM, "Rob Paul" wrote: > Hi, > Does anyone know what's missing? Why can't I compile Giraph? I am > new to Giraph and sorry if I am missing something very trivial. > Regards > Rob > ===Below is the

Re: Re: About aggregator's input data type and output data type

2014-01-12 Thread Young Han
; Is there another way to share a global variable between vertices besides > getAggregatedValue()? > Thanks a lot! > > Luo > > > > > At 2014-01-11 01:17:21, "Young Han" wrote: > > Hi, > > One way, though not a very clean way, would be to create an o

Re: About aggregator's input data type and output data type

2014-01-10 Thread Young Han
Hi, One way, though not a very clean way, would be to create an object that encapsulates what you want to store in A and B. So, say you want A to be a DoubleWritable and B to be a Writable object with two integers. Then you could just create a Writable object having three fields: double, int, int.
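
A minimal sketch of such a combined value follows, assuming one double for A and two ints for B. To stay self-contained it does not implement org.apache.hadoop.io.Writable, but its write and readFields methods have the same signatures as that interface, so adding "implements Writable" should be the only change needed inside a Hadoop or Giraph project. The class name, field names, and the toBytes/fromBytes helpers are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

class CombinedValueSketch {
    double a;   // the value meant for A
    int b1;     // first integer meant for B
    int b2;     // second integer meant for B

    CombinedValueSketch() { }   // no-arg constructor, needed for deserialization

    CombinedValueSketch(double a, int b1, int b2) {
        this.a = a;
        this.b1 = b1;
        this.b2 = b2;
    }

    // Serializes the three fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeDouble(a);
        out.writeInt(b1);
        out.writeInt(b2);
    }

    // Reads the fields back in the same order they were written.
    public void readFields(DataInput in) throws IOException {
        a = in.readDouble();
        b1 = in.readInt();
        b2 = in.readInt();
    }

    // Helper for the demo: serialize to a byte array.
    static byte[] toBytes(CombinedValueSketch value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        value.write(new DataOutputStream(buffer));
        return buffer.toByteArray();
    }

    // Helper for the demo: deserialize from a byte array.
    static CombinedValueSketch fromBytes(byte[] bytes) throws IOException {
        CombinedValueSketch value = new CombinedValueSketch();
        value.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        return value;
    }

    public static void main(String[] args) throws IOException {
        CombinedValueSketch copy = fromBytes(toBytes(new CombinedValueSketch(1.5, 2, 3)));
        System.out.println(copy.a + " " + copy.b1 + " " + copy.b2); // prints 1.5 2 3
    }
}
```

How the aggregator combines two such objects (for example, summing a and taking the maximum of b1 and b2) depends on the algorithm and is left open here.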

Tracking Bytes Sent

2014-01-02 Thread Young Han
Hi, I'd like to know how many bytes are being sent (per worker per superstep), rather than the number of messages sent. This is for an algorithm (DMST) that sends variable size messages. In the Giraph userlogs, there are "waitAllRequests" lines which contain "MBytesSent", with statistics obtained

Connect Components Example No Messages Sent

2013-11-30 Thread Young Han
Hi, I'm trying to run the connected components example in Giraph with the following command: hadoop jar $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.0.0-for-hadoop-1.0.2-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.ConnectedComponentsVertex -vif org

Re: Giraph EC2 Map task fails

2013-11-24 Thread Young Han
orres < gsala...@ime.usp.br> wrote: > I guess from your stacktrace that you didn't start the zookeeper cluster. > > Cheers > Gustavo > > > On Sunday, November 24, 2013, Young Han wrote: > > Hi, > > > > We are attempting to get Giraph running on EC2

Giraph EC2 Map task fails

2013-11-23 Thread Young Han
Hi, We are attempting to get Giraph running on EC2, using Hadoop 1.0.4. We are using page rank with the following command: hadoop jar $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.0.0-for-hadoop-1.0.2-jar-with-dependencies.jar org.apache.giraph.GiraphRunner org.apache.giraph.examples.Simp