You can get the stats for a job using rumen.
http://ksssblogs.blogspot.in/2013/06/getting-job-statistics-using-rumen.html
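For reference, the trace itself comes from Rumen's TraceBuilder; per the Rumen documentation the usage is roughly:
java org.apache.hadoop.tools.rumen.TraceBuilder [options] <jobtrace-output> <topology-output> <inputs>
where <inputs> points at the job history files.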
Regards,
Som Shekhar Sharma
+91-8197243810
On Tue, Aug 27, 2013 at 10:54 AM, Gopi Krishna M mgo...@gmail.com wrote:
Harsh: thanks for the quick response.
we often see an
Harsh-
Yes, I intend to use HA. That's what I'm trying to configure right now.
Unfortunately I cannot share my complete configuration files. They're on a
disconnected network. Are there any configuration items that you'd like me to
post my settings for?
The deployment is CDH 4.3 on a brand
Yes, I think so. The TaskTracker launches the mapper and reducer in a child
JVM, which in turn invokes the streaming process, and that child can (and
does) communicate with the JobTracker.
Regards,
Shahab
On Tue, Aug 27, 2013 at 8:34 AM, Manoj Babu manoj...@gmail.com wrote:
Team,
Does streaming
There are a number of Hadoop tutorials and textbooks available, but they
always seem to target older versions of Hadoop. Does anyone know of good
tutorials that work with modern Hadoop versions (v1.x.y)?
Hello,
I know that the client caches write requests before they are sent to a
datanode, and that the client uses read-ahead, but where exactly is this
implemented, in Hadoop itself or in HDFS? Or, a better question: are write
caching and read-ahead also available when Hadoop uses a filesystem other
than HDFS?
In https://hadoop.apache.org/docs/stable/mapred_tutorial.html#Source+Code,
line 16 declares:
private Text word = new Text();
...
But only lines 22 and 23 use this, and only to pass the value along to
output:
word.set(tokenizer.nextToken());
output.collect(word, one);
Wouldn't this be better
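For context, the surrounding map method in that tutorial reads roughly like this (old org.apache.hadoop.mapred API):

public void map(LongWritable key, Text value,
                OutputCollector<Text, IntWritable> output,
                Reporter reporter) throws IOException {
  String line = value.toString();
  StringTokenizer tokenizer = new StringTokenizer(line);
  while (tokenizer.hasMoreTokens()) {
    // The single Text instance declared on line 16 is reused for every token.
    word.set(tokenizer.nextToken());
    output.collect(word, one);
  }
}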
Harsh-
Here are all of the other values that I have configured.
hdfs-site.xml
-
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn.domain,snn.domain</value>
</property>
it should be:
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
On Aug 27, 2013 11:22 PM, Smith, Joshua D. joshua.sm...@gd-ais.com
wrote:
Harsh-
Here are all of the other values that I have configured.
hdfs-site.xml
-
dfs.webhdfs.enabled
There seems to be an abundance of boilerplate patterns in MapReduce:
* Write a class extending Map (1), implementing Mapper (2), with a map
method (3)
* Write a class extending Reduce (4), implementing Reducer (5), with a
reduce method (6)
Could we achieve the same behavior with a single Job
nn.domain is a placeholder for the actual fully qualified hostname of my
NameNode.
snn.domain is a placeholder for the actual fully qualified hostname of my
StandbyNameNode.
Of course, both the NameNode and the StandbyNameNode are running exactly the
same software with the same configuration.
Hi,
For one of my MapReduce jobs I want to use a different version of the slf4j
jar (1.6.4), but I guess Hadoop has a different version of the jar on the
Hadoop classpath: lib/slf4j-log4j12-1.4.3.jar.
When I try to run my code, I am getting this error:
Exception in thread main
For starters (experts might have more complex reasons): what if your
respective map and reduce logic becomes complex enough to demand separate
classes? Why tie clients to implementing both by folding them into one Job
interface? In the current design you can always implement both (map and
reduce)
One idea: you can use the exclusion feature of Maven (provided you are using
it to build your application) when declaring the Hadoop dependencies, exclude
the slf4j that comes bundled with Hadoop, and then include your own slf4j as
a separate dependency. Something like this:
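A minimal sketch of that shape, assuming you depend on the hadoop-core artifact (artifact names and versions below are illustrative; adjust them to your distribution):

<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>1.2.1</version>
  <exclusions>
    <exclusion>
      <!-- keep Hadoop's bundled slf4j binding off the classpath -->
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
    </exclusion>
  </exclusions>
</dependency>
<dependency>
  <!-- then pull in the version you actually want -->
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-log4j12</artifactId>
  <version>1.6.4</version>
</dependency>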
Hi,
Please follow the HA configuration steps available at the link below.
http://hadoop.apache.org/docs/r2.1.0-beta/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html
dfs.ha.namenodes.[nameservice ID] - unique identifiers for each NameNode in
the nameservice. Configure it with a list of comma-separated NameNode IDs.
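Applied to your setup, a minimal sketch of those naming properties (hostnames and ports are placeholders):

<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>nn.domain:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>snn.domain:8020</value>
</property>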
As far as I understand, StringTokenizer.nextToken returns a Java String,
which does not implement the Writable and Comparable interfaces needed for
Hadoop MapReduce serialization and transport. The Text class does implement
them, which is why it is being used to wrap the token.
not yet.
please correct it.
On Aug 27, 2013 11:39 PM, Smith, Joshua D. joshua.sm...@gd-ais.com
wrote:
nn.domain is a placeholder for the actual fully qualified hostname of
my NameNode.
snn.domain is a placeholder for the actual fully qualified hostname of my
StandbyNameNode.
I am right now using the libjars option.
How do I do what you suggested via that route?
On Tue, Aug 27, 2013 at 8:51 AM, Shahab Yunus shahab.yu...@gmail.com wrote:
One idea is, you can use the exclusion property of maven (provided you are
using that to build your application) while including
That fixed it. I was assuming that nn1 and nn2 were hostnames and not IDs.
Once I replaced the value with nn1,nn2, everything started to make sense.
Thank you Azurry and Jitendra. Much appreciated!
Josh
From: Jitendra Yadav [mailto:jeetuyadav200...@gmail.com]
Sent: Tuesday, August 27, 2013
I have a bunch of jars which I want to pass. I am using the libjars option to
do so. But to do that I have to implement Tool?
So I changed my code to the following, but I am still getting this warning:
13/08/27 11:32:37 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.
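For reference, a minimal sketch of a driver that implements Tool, so that ToolRunner/GenericOptionsParser can strip out -libjars before your code runs (the class name and job setup here are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyDriver extends Configured implements Tool {
  @Override
  public int run(String[] args) throws Exception {
    // getConf() already reflects generic options such as -libjars.
    Configuration conf = getConf();
    // ... configure and submit the job here ...
    return 0;
  }

  public static void main(String[] args) throws Exception {
    // ToolRunner invokes GenericOptionsParser before calling run().
    System.exit(ToolRunner.run(new Configuration(), new MyDriver(), args));
  }
}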
Hi All,
I am trying to run a simple streaming command as mentioned below.
bin/hadoop jar
/windows/Hadoop/hadoop-1.2.1/contrib/streaming/hadoop-streaming-1.2.1.jar
-input /usr/pradeep/input/'Good words' -output /usr/pradeep/output -mapper
/bin/cat -reducer '/bin/wc -w'
I am getting the below message:
Hi All,
I'm new to Hadoop administration, can someone please help me?
Hadoop version: 2.0.5-alpha, using QJM.
I'm getting the below error messages while starting HDFS using 'start-dfs.sh':
2013-01-23 03:25:43,208 INFO
org.apache.hadoop.hdfs.server.namenode.FSImage: Image file of size 121
I agree with @Shahab - it's simple enough to declare both interfaces in one
class if that's what you want to do. But given the distributed behavior of
Hadoop, it's likely that your mappers will be running on different nodes
than your reducers anyway - why ship around duplicate code?
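For what it's worth, the old org.apache.hadoop.mapred API does let one class play both roles if you want it to; a minimal sketch (word-count-style types, purely illustrative):

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// A single class acting as both mapper and reducer.
public class MapAndReduce extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, IntWritable>,
               Reducer<Text, IntWritable, Text, IntWritable> {

  private static final IntWritable ONE = new IntWritable(1);
  private final Text word = new Text();

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, IntWritable> output,
                  Reporter reporter) throws IOException {
    word.set(value.toString());
    output.collect(word, ONE);
  }

  public void reduce(Text key, Iterator<IntWritable> values,
                     OutputCollector<Text, IntWritable> output,
                     Reporter reporter) throws IOException {
    int sum = 0;
    while (values.hasNext()) sum += values.next().get();
    output.collect(key, new IntWritable(sum));
  }
}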
On Tue, Aug
I am not sure the original suggestion will work for your case.
My understanding is that you want to use some API that only exists in slf4j
version 1.6.4, but a different version of this library already exists in your
Hadoop environment, which is quite possible.
To change the maven build of the
What OS are you starting this on?
Are you able to run the command df -k /tmp/hadoop-hadoop/dfs/name/
as user hadoop?
On Wed, Aug 28, 2013 at 12:53 AM, orahad bigdata oracle...@gmail.com wrote:
Hi All,
I'm new to Hadoop administration, can someone please help me?
Hadoop-version :- 2.0.5
Yes ganglia
Please see
http://wiki.apache.org/hadoop/GangliaMetrics
From: Viswanathan J [mailto:jayamviswanat...@gmail.com]
Sent: Tuesday, August 27, 2013 7:45 PM
To: user@hadoop.apache.org
Subject: Apache Hadoop cluster monitoring
Hi,
What are the best monitoring tools for hadoop other than
I would add Hannibal for an HBase bird's-eye view.
On 28/08/2013 12:45 PM, Viswanathan J jayamviswanat...@gmail.com wrote:
Hi,
What are the best monitoring tools for Hadoop other than the JT and NN
default UIs?
Please share docs for configuring them in a production cluster.
Any tools that can be
I wrote a blog post on this a while ago, where I was writing to multiple
tables from my mapper class. You can look at it here:
http://bigdatabuzz.wordpress.com/2012/04/24/how-to-write-to-multiple-hbase-tables-in-a-mapreduce-job/
Key things are:
a) job.setOutputFormatClass
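A minimal sketch of the idea (the table, family, and qualifier names below are made up; see the post for the real details). In the driver you set job.setOutputFormatClass(MultiTableOutputFormat.class), and in the mapper the output key names the destination table:

import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MultiTableMapper
    extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    Put put = new Put(Bytes.toBytes(value.toString()));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("val"));
    // The key selects which HBase table receives this Put.
    context.write(new ImmutableBytesWritable(Bytes.toBytes("tableA")), put);
  }
}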
Also to add, the default serialization libraries supported are specified in
core-default.xml as:
<property>
  <name>io.serializations</name>
  <value>org.apache.hadoop.io.serializer.WritableSerialization</value>
</property>