Hi, when I start my HBase cluster, it sometimes lists 'ghost
regionservers' without any regions:
kongs2.medisin.ntnu.no:60020 1345115409411
requests=0, regions=0, usedHeap=0, maxHeap=0
netstat does not list any service on port 60020.
If I start a region server locally, I get one real
Hi,
I am currently trying to tune a CDH 4.0.1 cluster running HDFS, YARN,
and HBase managed by Cloudera Manager 4.0.3 (Free Edition).
In CM, there are a number of options for setting mapreduce.*
configuration properties on the YARN client page.
Some of the explanations in the GUI still
Hi guys: I want to start automating the output of counter stats, cluster
size, etc. at the end of the main MapReduce jobs which we run. Is there
a simple way to do this?
Here is my current thought :
1) Run all jobs from a driver class (we already do this).
2) At the end of each job,
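One sketch of that second step, if you'd rather not wire it into the driver code itself: pull the counters from the CLI once each job finishes. The job ID and counter names below are placeholders, not from your cluster:

```shell
# Dump the full status of a finished job, including all of its counters.
# job_201208170001_0042 is a hypothetical job ID; in practice you would
# capture the real ID from the driver's output.
hadoop job -status job_201208170001_0042

# Or fetch a single counter, e.g. HDFS bytes read:
hadoop job -counter job_201208170001_0042 \
    FileSystemCounters HDFS_BYTES_READ
```

Appending that output to a stats file from the driver script (or a wrapper around it) would give you a simple running log without touching the job code.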
Also could you tell us more about your task statuses?
You might also have failed tasks...
Bertrand
On Thu, Aug 16, 2012 at 11:01 PM, Bertrand Dechoux decho...@gmail.com wrote:
Well, there is speculative executions too.
http://developer.yahoo.com/hadoop/tutorial/module4.html
*Speculative
You probably have speculative execution on. Extra maps and reduce tasks are run
in case some of them fail
Raj
Sent from my iPad
Please excuse the typos.
On Aug 16, 2012, at 11:36 AM, in.abdul in.ab...@gmail.com wrote:
Hi Gaurav,
The number of maps does not depend upon the number of blocks. It is really
It would be helpful to see some statistics out of both the jobs like bytes
read, written number of errors etc.
On Thu, Aug 16, 2012 at 8:02 PM, Raj Vishwanathan rajv...@yahoo.com wrote:
You probably have speculative execution on. Extra maps and reduce tasks
are run in case some of them fail
Hi all,
We have a 48 GB NN machine for our cluster. It isn't very heavily used;
projected memory use is less than 20%. We now plan to have 3 ZooKeeper servers.
Is it advisable to run a ZK process alongside the name node process in production?
What factors do I need to look into to decide if this an
Hi Venkat,
If I understood it properly, you are trying to run 3 ZooKeeper servers and the NN
service on one machine, right? If yes, then one of the major problems is
that if this machine unfortunately crashes, you will lose the
ZooKeepers and as a result your cluster will go down. Usually, it's
I am a bit confused about the different options for namenode high
availability (or something along those lines) in CDH4 (hadoop-2.0.0).
I understand that the secondary namenode is deprecated, and that there
are two options to replace it: checkpoint or backup namenodes. Both are
well explained
Hi Jan,
Don't confuse this with the backupnode/checkpoint nodes here.
The new HA architecture is mainly targeted at building HA around two Namenode states:
1) Active Namenode
2) Standby Namenode
When you start the NNs, both will start in standby mode by default.
Then you can switch one NN to the active state by
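For a manual-failover setup, the switch looks roughly like this (`nn1`/`nn2` are placeholders for whatever service IDs you configured under `dfs.ha.namenodes.<nameservice>`):

```shell
# Promote one NameNode to the active state:
hdfs haadmin -transitionToActive nn1

# Check which state each NameNode is currently in:
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
```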
Dear list,
I'm rather new to HDFS and I am trying to figure out how to use the
HarFileSystem class. I have created a little sample HAR archive for testing
purposes that looks like this:
==
$ bin/hadoop fs -ls har:///WPD.har/1
Found 8
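In case it helps others experimenting with this: the archive contents can also be reached through the ordinary fs commands via the `har://` scheme (the inner path below is hypothetical, following the `WPD.har/1` example above):

```shell
# List the top level of the archive:
bin/hadoop fs -ls har:///WPD.har

# Copy a file out of the archive to the local filesystem:
bin/hadoop fs -copyToLocal har:///WPD.har/1/part-0 ./part-0
```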
Hi,
I am currently trying to tune a CDH 4.0.1 (i.e. ~hadoop 2.0.0-alpha)
cluster running HDFS, YARN, and HBase managed by Cloudera Manager 4.0.3
(Free Edition).
In CM, there are a number of options for setting mapreduce.*
configuration properties on the YARN client page.
Some of the
Hi,
AFAIK, these properties are being ignored by YARN:
- mapreduce.tasktracker.map.tasks.maximum,
- mapreduce.tasktracker.reduce.tasks.maximum
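There is no per-node slot count in YARN; resources are handed out by memory instead. A rough sketch of the replacements (property names per Hadoop 2.0; the values are examples to adjust for your hardware, not recommendations):

```xml
<!-- yarn-site.xml: total memory the NodeManager may allocate per node -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>8192</value>
</property>

<!-- mapred-site.xml: per-task container sizes -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>2048</value>
</property>
```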
Thanks,
Anil Gupta
On Thu, Aug 16, 2012 at 9:28 AM, mg userformailingli...@gmail.com wrote:
Hi,
I am currently trying to tune a CDH 4.0.1 (i~
Very helpful info. I hadn't considered the bandwidth aspect of it.
Thanks much, Harsh!
DR
On 08/16/2012 12:58 AM, Harsh J wrote:
I'd not do this if the fsimage size is greater than, say, 5-6 GB. The
SNN pulls and then pushes this back from the NameNode and the transfer
can get heavy. If you
Hi,
thanks.
But there are many more mapreduce.* properties. ...
Does anyone have more information on those with respect to whether they
are relevant to YARN, and if so, what they exactly mean in the YARN context?
I am trying to optimize settings in order to use as much RAM and CPU as
Sorry - this seems pretty basic, but I could not find a reference
online or in my books. Is there a graceful way to stop a single datanode
(for example, to move the system to a new rack where it will be put back
online), or do you just whack the process ID and let HDFS clean up the
mess?
Thanks
Perhaps what you're looking for is the Decommission feature of HDFS,
which lets you safely remove a DN without incurring replica loss? It
is detailed in Hadoop: The Definitive Guide (2nd Edition), page 315 |
Chapter 10: Administering Hadoop / Maintenance section - Title
Decommissioning old nodes,
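The usual flow is roughly as follows (a sketch; the exclude-file path and hostname are placeholders for whatever your `dfs.hosts.exclude` setting points at):

```shell
# 1. Add the DN's hostname to the exclude file referenced by
#    dfs.hosts.exclude in hdfs-site.xml:
echo "datanode1.example.com" >> /etc/hadoop/conf/dfs.exclude

# 2. Tell the NameNode to re-read its include/exclude lists:
hadoop dfsadmin -refreshNodes

# 3. Watch the NameNode web UI until the node shows "Decommissioned";
#    only then is it safe to stop the datanode process.
```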
Thanks guys. I will need the decommission in a few weeks, but for now
just a simple system move. I found out the hard way not to have a
masters and slaves file in the conf directory of a slave: when I tried
bin/stop-all.sh, it stopped processes everywhere.
Gave me an idea to list its own name as
Well, there is speculative executions too.
http://developer.yahoo.com/hadoop/tutorial/module4.html
*Speculative execution:* One problem with the Hadoop system is that by
dividing the tasks across many nodes, it is possible for a few slow nodes
to rate-limit the rest of the program. For example
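If you want to rule speculative execution out while comparing job stats, it can be switched off per job from the command line (MRv1 property names; the jar and class names are placeholders):

```shell
# Disable speculative map and reduce attempts for a single run:
hadoop jar my-job.jar MyDriver \
    -D mapred.map.tasks.speculative.execution=false \
    -D mapred.reduce.tasks.speculative.execution=false \
    input output
```

Note the `-D` options only take effect if the driver parses generic options (e.g. via ToolRunner); otherwise set the same properties in the job configuration.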
Hello Terry,
You can ssh to the node where you want to stop the DN and run the command there.
Something like this:
cluster@ubuntu:~/hadoop-1.0.3$ bin/hadoop-daemon.sh --config \
    /home/cluster/hadoop-1.0.3/conf/ stop datanode
Regards,
Mohammad Tariq
On Fri, Aug 17, 2012 at 2:26 AM, Terry Healy
For now, there are no utilities for the jobtracker log.
--
peter
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
On Friday, August 17, 2012 at 12:29 PM, Hank Cohen wrote:
Are there any utilities available to help parse jobtracker log files?
Hank Cohen
hank.co...@altior.com
Take a look at Pig's HadoopJobHistoryLoader
http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/piggybank/storage/HadoopJobHistoryLoader.html
On Thu, Aug 16, 2012 at 9:34 PM, peter zhangju...@gmail.com wrote:
For now, there are no utilities for the jobtracker log.
--
peter
Sent with Sparrow
Hi,
I would also look into logstash if you are looking for analyzing logs
across your cluster:
http://logstash.net/docs/1.1.1/
I cannot recommend it from experience, but it seems to be a tool built for
this type of problem.
Best,
Ariel
On Aug 17, 2012 12:52 AM, Harsh J ha...@cloudera.com