Hi,
I am a new user of Hadoop. I have installed Hadoop 0.19.1 on a single Windows
machine.
The http://localhost:50030/jobtracker.jsp and
http://localhost:50070/dfshealth.jsp pages are working fine, but when I
execute bin/hadoop jar hadoop-0.19.1-examples.jar pi 5 100
it shows the following:
$ bi
Are there any errors in the task output file?
On jobtracker.jsp, click the Jobid link -> Tasks link -> Taskid link ->
Task Logs link.
2012/5/11 Mohit Kundra
> Hi,
>
> I am a new user of Hadoop. I have installed Hadoop 0.19.1 on a single Windows
> machine.
> The http://localhost:50030/jobtracker.jsp and
You might be running out of disk space. Check for that on your cluster
nodes.
-Prashant
On Fri, May 11, 2012 at 12:21 AM, JunYong Li wrote:
> Are there any errors in the task output file?
> On jobtracker.jsp, click the Jobid link -> Tasks link -> Taskid link ->
> Task Logs link.
>
> 2012/5/11 Mohit
Mohit,
Why are you using Hadoop-0.19, a version released many years ago?
Please download the latest stable available at
http://hadoop.apache.org/common/releases.html#Download instead.
On Fri, May 11, 2012 at 12:26 PM, Mohit Kundra wrote:
> Hi,
>
> I am a new user of Hadoop. I have installed Hado
Zabbix does monitoring, archiving, graphing, and alerting.
It has a JMX bean monitoring system. If Hadoop exposes these beans, or you can
add them easily, you have a great monitor. Also, check out 'Starfish'.
It's a little old, but I got it running and it was really cool.
On Thu, May 10, 2012 at 11:24 PM, Ma
I've helped link Hadoop to Munin using JMX querying in the past;
there's a writeup at:
http://www.cs.huji.ac.il/wikis/MediaWiki/lawa/index.php/Munin_for_Hadoop
Stu
On Fri, May 11, 2012 at 02:15:16AM -0700, Lance Norskog wrote:
> Zabbix does monitoring, archiving, graphing, and alerting.
On Thu, May 10, 2012 at 5:58 PM, Raj Vishwanathan wrote:
> Darrell
>
> Are the new dn, nn, and mapred directories on the same physical disk?
> Nothing on NFS, correct?
>
Yes, that's correct
>
> Could you be having some hardware issue? Any clue in /var/log/messages or
> dmesg?
>
Hardware is goo
On Fri, May 11, 2012 at 2:29 AM, Darrell Taylor
wrote:
>
> What I saw on the machine was thousands of recursive processes in ps of the
> form 'bash /usr/bin/hbase classpath...'. Stopping everything didn't clean
> the processes up, so I had to kill them manually with some grep/xargs foo.
> Once this
Doesn't look like the $HBASE_HOME/bin/hbase script runs
"$HADOOP_HOME/bin/hadoop classpath" directly. Its classpath builder
seems to add $HADOOP_HOME items manually via listing, etc. Perhaps if
hbase-env.sh has a HBASE_CLASSPATH that imports `hadoop classpath`,
and hadoop-env.sh has a `hbase cl
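To make that concrete, the kind of loop being described would look roughly
like this (purely illustrative; these are not Darrell's actual env files):

# hadoop-env.sh (hypothetical)
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$(hbase classpath)"

# hbase-env.sh (hypothetical)
export HBASE_CLASSPATH="$HBASE_CLASSPATH:$(hadoop classpath)"

Each `hadoop` invocation sources hadoop-env.sh, which runs `hbase classpath`,
which sources hbase-env.sh, which runs `hadoop classpath` again, and so on
until the process table fills with exactly the 'bash /usr/bin/hbase
classpath...' entries described above.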
I do not know about the per-host slot control (that is most likely not
supported, or not yet anyway, and perhaps feels wrong to do), but the
rest of your needs are doable if you use schedulers and
queues/pools.
If you use FairScheduler (FS), ensure that this job always goes to a
special pool an
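As a rough sketch of what such a special pool could look like on 1.0.x (the
pool name and caps here are invented for illustration), the FairScheduler
allocation file lets you cap a pool's slots:

<?xml version="1.0"?>
<!-- fair-scheduler.xml, hypothetical pool -->
<allocations>
  <pool name="background">
    <maxMaps>4</maxMaps>
    <maxReduces>2</maxReduces>
    <weight>0.5</weight>
  </pool>
</allocations>

Jobs can then be pointed at it with -Dmapred.fairscheduler.pool=background.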
Just a quick note...
If your task is currently occupying a slot, the only way to release the slot
is to kill the specific task.
If you are using FS, you can move the task to another queue and/or you can
lower the job's priority, which will cause new tasks to spawn more slowly than
other jobs', so you
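For reference, killing a single attempt from the command line looks like this
(the attempt id below is a made-up example):

hadoop job -kill-task attempt_201205110001_0042_m_000003_0

Unlike -fail-task, a kill does not count against the task's allowed attempts.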
Thanks. I think I will investigate the Capacity Scheduler.
On Fri, May 11, 2012 at 7:26 AM, Michael Segel wrote:
> Just a quick note...
>
> If your task is currently occupying a slot, the only way to release the
> slot is to kill the specific task.
> If you are using FS, you can move the task to a
Hi,
When we store data in HDFS, it gets broken into small pieces and distributed
across the cluster based on the file's block size.
While processing the data with an MR program, I want a particular record as a
whole, without it being split across nodes, but the data has already been split
and st
Shreya,
This has been asked several times before, and the way it is handled by
TextInputFormats (for one example) is explained at
http://wiki.apache.org/hadoop/HadoopMapReduce in the Map section. If
you are writing a custom reader, feel free to follow the same steps -
you basically need to seek ov
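To spell out the idea, here is a hedged sketch (class and method names are
invented) of the usual convention: every split except the first discards the
partial record at its start, and every reader keeps reading past its end
offset until it finishes the last record it started.

import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.util.LineReader;

class SplitBoundaryHelper {
  // Returns the offset of the first record fully owned by this split.
  static long skipPartialFirstRecord(FSDataInputStream in, long splitStart)
      throws IOException {
    if (splitStart == 0) {
      return splitStart;               // the first split owns its first record
    }
    in.seek(splitStart);
    LineReader reader = new LineReader(in);
    return splitStart + reader.readLine(new Text());  // skip to the next newline
  }
}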
Hi Mohit,
1) Hadoop is more portable with Linux, Ubuntu, or any non-DOS file
system.
Since you are running Hadoop on Windows, that could be the problem, because
Hadoop generates some partial output files for temporary use.
2) Another thing is that you are running Hadoop version 0.19. I think
i
Hi,
I have a question for the Hadoop experts:
I have two HDFS clusters, in different subnets.
HDFS1: 192.168.*.*
HDFS2: 10.10.*.*
The namenode of HDFS2 has two NICs, one connected to 192.168.*.* and another
to 10.10.*.*.
So, is it possible to transfer data from HDFS1 to HDFS2 and vice versa?
Regards,
A
If you can cross-access HDFS from both namenodes, then it should be
transferable using the distcp command.
Shi
On 5/11/2012 8:45 AM, Arindam Choudhury wrote:
Hi,
I have a question to the hadoop experts:
I have two HDFS, in different subnet.
HDFS1 : 192.168.*.*
HDFS2: 10.10.*.*
the nam
I cannot cross-access HDFS. Though HDFS2 has two NICs, HDFS is running
on the other subnet.
On Fri, May 11, 2012 at 3:57 PM, Shi Yu wrote:
> If you can cross-access HDFS from both namenodes, then it should be
> transferable using the distcp command.
>
> Shi
>
> On 5/11/2012 8:45 AM, Ar
Is there any risk in suppressing a job for too long in FS? I guess there are
some parameters to control the waiting time of a job (such as a timeout,
etc.). For example, if a job is kept idle for more than 24 hours, is there a
configuration deciding whether to kill or keep that job?
Shi
On 5/11/2012 6:52 AM, Ri
Here is some quick code for you (based on Tom's book). You can
override the TextInputFormat isSplitable method to avoid splitting,
which is pretty important and useful when processing sequence data.
//Old API (org.apache.hadoop.mapred; imports of FileSystem and Path assumed)
public class NonSplittableTextInputFormat extends TextInputFormat {
  @Override
  protected boolean isSplitable(FileSystem fs, Path file) { return false; }
}
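In case you are on the new API, the equivalent would be roughly (hedged
sketch):

// New API (org.apache.hadoop.mapreduce)
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class NonSplittableTextInputFormat extends TextInputFormat {
  @Override
  protected boolean isSplitable(JobContext context, Path file) {
    return false;  // the whole file goes to a single mapper
  }
}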
It seems that in your case HDFS2 can access HDFS1, so you should be able to
transfer HDFS1 data to HDFS2.
If you want to cross-transfer, you don't need to run distcp on cluster
nodes; if any client node (it doesn't have to be a namenode, datanode,
secondary namenode, etc.) can access both HDFS clusters, then r
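For example (hostnames and ports below are placeholders), from a machine that
can reach both namenodes:

hadoop distcp hdfs://namenode1:8020/data/input hdfs://namenode2:8020/data/input

Keep in mind distcp runs as a MapReduce job on the cluster you submit it from,
so that cluster's nodes also need to be able to reach the other namenode and
its datanodes.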
Looks like both are private subnets, so you have to route via a public
default gateway. Try adding a route using the route command if you're on
Linux (for Windows I have no idea). Just a thought; I haven't tried it, though.
Thanks,
Rajesh
Typed from mobile, please bear with typos.
On May 11, 2012 10:03 AM, "Arind
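To illustrate the route suggestion (the gateway address below is a placeholder
for a router on the 192.168 side that can reach 10.10.0.0/16):

sudo route add -net 10.10.0.0 netmask 255.255.0.0 gw 192.168.1.1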
So,
hadoop dfs -cp hdfs:// hdfs://...
this will work.
On Fri, May 11, 2012 at 4:14 PM, Rajesh Sai T wrote:
> Looks like both are private subnets, so you got to route via a public
> default gateway. Try adding route using route command if your in
> linux(windows i have no idea). Just a thou
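A filled-in version of that command would look like the line below (hostnames
and ports are placeholders). Note that dfs -cp pulls the data through the
single client running the command, whereas distcp copies in parallel as a
MapReduce job.

hadoop dfs -cp hdfs://namenode1:8020/data/file1 hdfs://namenode2:8020/data/file1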
I haven't seen any.
Haven't really had to test that...
On May 11, 2012, at 9:03 AM, Shi Yu wrote:
> Is there any risk to suppress a job too long in FS?I guess there are some
> parameters to control the waiting time of a job (such as timeout ,etc.),
> for example, if a job is kept idle fo
I am not aware of a job-level timeout or idle monitor.
On Fri, May 11, 2012 at 7:33 PM, Shi Yu wrote:
> Is there any risk to suppress a job too long in FS? I guess there are
> some parameters to control the waiting time of a job (such as timeout
> ,etc.), for example, if a job is kept idle for
There is an idle timeout for map/reduce tasks. If a task makes no progress for
10 minutes (the default), the AM will kill it on 2.0 and the JT will kill it on
1.0. But I don't know of anything associated with a job, other than in 0.23, if
the AM does not heartbeat back in for too long, I believe that t
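For reference, that per-task timeout is configurable; something like this in
mapred-site.xml (the value shown is the 10-minute default):

<!-- the property is named mapreduce.task.timeout on 2.0 -->
<property>
  <name>mapred.task.timeout</name>
  <value>600000</value>  <!-- milliseconds; 0 disables the timeout -->
</property>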
Hi,
I am a newbie on Hadoop and have a quick question on optimal compute vs.
storage resources for MapReduce.
If I have a multiprocessor node with 4 processors, will Hadoop schedule a
higher number of Map or Reduce tasks on that system than on a uniprocessor
system? In other words, does Hadoop dete
I have set dfs.datanode.max.xcievers=4096 and have swapping turned off.
RegionServer heap = 24 GB
Datanode heap = 1 GB
On Fri, May 11, 2012 at 9:55 AM, sulabh choudhury wrote:
> I have spent a lot of time trying to find a solution to this issue, but
> have had no luck. I think this is because of
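For anyone following along, that xcievers setting goes in hdfs-site.xml on the
datanodes (the property name really is spelled "xcievers"):

<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>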
Nope, you must tune the config on that specific super node to have more M/R
slots (this is for 1.0.x).
This does not mean the JobTracker will be eager to stuff that super node with
all the M/R jobs at hand.
It still goes through the scheduler; the Capacity Scheduler is most likely what
you have. (
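Concretely, on 1.0.x that means raising the per-tasktracker slot counts in
mapred-site.xml on the bigger node only (the numbers here are just examples):

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
</property>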
Thanks, Leo. What is the config of a typical data node in a Hadoop cluster
- cores, storage capacity, and connectivity (SATA?)? How many tasktracker
slots are scheduled per core, in general?
Is there a best practices guide somewhere?
Thanks,
Satheesh
On Fri, May 11, 2012 at 10:48 AM, Leo Leung wrote:
>
This may be dated material.
Cloudera and HDP folks, please correct with updates :)
http://www.cloudera.com/blog/2010/03/clouderas-support-team-shares-some-basic-hardware-recommendations/
http://www.cloudera.com/blog/2010/08/hadoophbase-capacity-planning/
http://hortonworks.com/blog/best-practice
Record reader implementations are typically written to honor record
boundaries. This means that while reading a split's data, they will continue
reading if the end of the split has been reached BUT the end of the record is
yet to be encountered.
-@nkur
On 5/11/12 5:15 AM, "shreya@cognizant.com"
wrote:
>Hi
>
>W
I see mapred.tasktracker.reduce.tasks.maximum and
mapred.tasktracker.map.tasks.maximum, but I'm wondering if there isn't another
tuning parameter I need to look at.
I can tune the task tracker so that when I have many jobs running, with many
simultaneous maps and reduces, I utilize 95% of CPU a
Hello,
We have a large number of
custom-generated files (not just web logs) that we need to move from our JBoss
servers to HDFS. Our first implementation ran a cron job every 5 minutes to
move our files from the "output" directory to HDFS.
Is this recommended? We are being told by our IT tea
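For context, a hypothetical version of the cron approach described above
(paths and names are placeholders) would be roughly:

# crontab entry: run every 5 minutes
*/5 * * * * /usr/local/bin/ship-to-hdfs.sh

# contents of /usr/local/bin/ship-to-hdfs.sh:
hadoop fs -mkdir /incoming/"$(hostname)" 2>/dev/null
for f in /opt/jboss/output/*; do
  [ -e "$f" ] || continue          # skip when the directory is empty
  hadoop fs -put "$f" /incoming/"$(hostname)"/ && rm "$f"
done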