I am trying to create a MapReduce example that adds the values of identical keys. E.g.
the input
A 1
A 2
B 4
get the output
A 3
B 4
The problem is that I cannot make the program read 2 inputs. How do I do that?
Here is my example:
package org.apache.hadoop.examples;
import java.io.IOException;
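The archive cuts the example off after the imports. As a dependency-free sketch of the intended logic (plain Java, not the actual Hadoop job; a real driver would call `FileInputFormat.addInputPath` once per input file, and a sum reducer would receive the grouped values after the shuffle), summing values per key across two inputs might look like:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SumByKey {
    // Merge "key value" records from any number of inputs, summing per key,
    // the same way a sum reducer would after the shuffle groups the keys.
    static Map<String, Integer> sum(String[]... inputs) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (String[] input : inputs) {
            for (String line : input) {
                String[] parts = line.split("\\s+");
                totals.merge(parts[0], Integer.parseInt(parts[1]), Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        String[] file1 = {"A 3", "B 4"};
        String[] file2 = {"A 3", "B 4"};
        System.out.println(sum(file1, file2)); // {A=6, B=8}
    }
}
```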
I made a mistake in my example.
Given 2 files with the same content:
file 1 | file 2
A 3 | A 3
B 4 | B 4
gives the output
A 6
B 8
On 5 June 2013 21:08, Pedro Sá da Costa psdc1...@gmail.com wrote:
I am trying to create a mapreduce example that add values of same keys.
E.g.
the
Hi Han,
HDFS metadata cannot be fully reconstructed from datanode reports.
If you have deployed a checkpoint node/secondary namenode, you can copy the
metadata to namenode and restart. This could recover most of the metadata.
On Wed, Jun 5, 2013 at 5:30 PM, Han JU ju.han.fe...@gmail.com wrote:
If you're asking in terms of discovering where to communicate at, then
basically just the RM scheduler address and port
(yarn.resourcemanager.scheduler.address).
The NodeManager addresses and ports are carried back from the RM to
the requesting AM as part of container requests and needn't be in
What service addresses and ports does a YARN ApplicationMaster need to know
about?
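Per Harsh's answer above, the AM only needs the scheduler address from configuration; everything else is carried back in container responses. A hedged sketch of the relevant `yarn-site.xml` entry (the host and port here are placeholders, not values from the thread):

```xml
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <!-- Placeholder host:port; point this at your ResourceManager -->
  <value>rm.example.com:8030</value>
</property>
```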
Thanks,
John
Well, I've failed and given up on building Hadoop in Eclipse. Too many things
go wrong with Maven plugins and m2e.
But Hadoop builds just fine using the command-line, and it runs using Sandy's
development-node instructions.
My strategy now is
1) Tell Eclipse about all of the Hadoop JARs
Wow, thanks. Is this documented anywhere other than the code? I hate to waste
y'alls time on things that can be RTFMed.
John
-----Original Message-----
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Wednesday, June 05, 2013 9:35 AM
To: user@hadoop.apache.org
Subject: Re: yarn-site.xml and
Rams:
For hadoop related log directories, you can use ps command to see the
command line of namenode.
You would see the log dir in the command line, e.g.:
-Dhadoop.log.dir=/homes/zy/deploy/hadoop-common-2.0.5-SNAPSHOT/logs
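To illustrate the suggestion, here is a small dependency-free sketch that pulls the log dir out of a command line such as `ps` would print (the command-line string below is reconstructed from the example above, not captured from a real process):

```java
public class FindLogDir {
    // Pick the value of -Dhadoop.log.dir out of a full command line,
    // mimicking what you would do by eye on ps output.
    static String logDir(String cmdline) {
        for (String token : cmdline.split("\\s+")) {
            if (token.startsWith("-Dhadoop.log.dir=")) {
                return token.substring("-Dhadoop.log.dir=".length());
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String cmd = "java -Dhadoop.log.dir=/homes/zy/deploy/hadoop-common-2.0.5-SNAPSHOT/logs "
                   + "org.apache.hadoop.hdfs.server.namenode.NameNode";
        System.out.println(logDir(cmd)); // /homes/zy/deploy/hadoop-common-2.0.5-SNAPSHOT/logs
    }
}
```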
Cheers
On Wed, Jun 5, 2013 at 8:38 AM, Jean-Marc Spaggiari
If your goal is to simply build an application, then you can use a
Maven project. Why do you require the whole of Hadoop's projects
itself on Eclipse when you can simply have the dependencies via a
maven pom.xml?
The following is what you can use in a simple maven app, to include
all necessary
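The reply is cut off before the snippet itself; a likely shape for it, assuming the usual `hadoop-client` aggregator artifact from Maven Central (the version shown matches the 2.0.4-alpha mentioned elsewhere in this digest, so adjust to yours):

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>2.0.4-alpha</version>
</dependency>
```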
Hi,
I am trying to create a more efficient namenode, and for that I need to benchmark the
standard distribution and then compare it to my version.
Which benchmark should I run? I am doing nnbench, but it is not telling me
anything about performance, only about potential failures.
Thank you.
Sincerely,
What do you mean by "it is not telling me anything about performance"? Also
I do not understand the part "only about potential failures". Can you add
more details?
nnbench is the best microbenchmark for nn performance test.
On Wed, Jun 5, 2013 at 3:17 PM, Mark Kerzner
Hi Mark,
NNBench is a namenode load test. Output of the test is the set of performance
numbers, like transactions per second, average latency of operations, etc.
What do you mean by trying to create a more efficient namenode? What dimension
are you trying to optimize? Depending on this, people
Is it possible to use Hadoop streaming or Hadoop pipes for multiple inputs
and outputs? Consider for example an equality join that accepts two inputs
(left, right), and produces three outputs (left unmatched, right unmatched,
joined). That's not actually what I'm trying to implement, but
I’ve taken your advice and made a wrapper class which implements
WritableComparable. Thank you very much for your help. I believe everything
is working fine on that front. I used google’s gson for the comparison.
public int compareTo(Object o) {
JsonElement o1 =
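The posted `compareTo` is truncated after the first `JsonElement`. A simplified stand-in for such a wrapper is sketched below; the real class would implement Hadoop's `WritableComparable` (with `write`/`readFields`) and parse with gson's `JsonParser`, but plain `String` comparison keeps this sketch dependency-free, at the cost of only working when the JSON is canonically formatted:

```java
// Simplified, gson-free stand-in for a JSON key wrapper: compares records
// by their raw serialized form. A Hadoop WritableComparable would also
// implement write(DataOutput) and readFields(DataInput); omitted here.
public class JsonKey implements Comparable<JsonKey> {
    private final String json;

    public JsonKey(String json) {
        this.json = json;
    }

    @Override
    public int compareTo(JsonKey other) {
        return this.json.compareTo(other.json);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof JsonKey && json.equals(((JsonKey) o).json);
    }

    @Override
    public int hashCode() {
        return json.hashCode();
    }
}
```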
Hi John,
On Thu, Jun 6, 2013 at 1:21 AM, John Lilley john.lil...@redpoint.net wrote:
-- From where will it fetch the Hadoop JARs?
From the Maven Central repository (we publish our jars and
dependencies are also available there), or a custom defined repository
if you lack internet access.
--
Do not use HADOOP_HOME anymore. Try removing the below line (and any
other references in your env to HADOOP_HOME):
export HADOOP_HOME=/hadoop-2.0.4-alpha
On Thu, Jun 6, 2013 at 1:18 AM, Boyu Zhang boyuzhan...@gmail.com wrote:
Dear All,
I just moved from version 0.20.2 to 2.0.4, there are a
Thanks Harsh,
I got it working. The problem for me was the Java home: I reinstalled Java and
pointed the Java home to the new installation, and then it worked.
Thanks,
Boyu
On Wed, Jun 5, 2013 at 7:22 PM, Harsh J ha...@cloudera.com wrote:
Do not use HADOOP_HOME anymore. Try removing the below line (and any
This does not seem like a Hadoop question.
Maybe the GridGain list?
Sent from my iPhone
On Jun 5, 2013, at 8:27 PM, Job Thomas j...@suntecgroup.com wrote:
Hi all,
When I am starting my JobTracker in a combined GridGain and Hadoop project, I am
getting the following error:
Exception in
Hi,
It has not been long since I started learning about Hadoop/HDFS/MapReduce
and implementing with them. Now I want to dive into the source
code and see whether I can be useful in providing patches. I have a good
foundation in programming and algorithms, owing to my computer science