Thanks for both inputs. My question actually focuses more on what Vivek has
mentioned.
I would like to work on the JobClient to see how it submits jobs to
different file systems and slaves in the same Hadoop cluster.
Not sure if there is a complete document that explains the scheduler underneath
Hadoop
Yes, please file a bug.
There are file systems out there with different block sizes, on Linux or Solaris.
Thanks,
--Konstantin
Martin Traverso wrote:
I think I found the issue. The class org.apache.hadoop.fs.DU assumes
1024-byte blocks when reporting usage information:
this.used = Long.parseLong(tokens[0])*1024;
david wrote:
> I mean I used the IBM MapReduce Tools plugin for Eclipse to connect to the Hadoop server.
> Originally Eclipse showed a connection failure; then I found the problem was that my
> Hadoop server's ssh port is not the default 22. Eclipse connects to Hadoop successfully
> if I change it back to port 22.
>
> So how do I use Eclipse
I read Andy's question a little differently. For a given job, the JobTracker
decides which tasks go to which TaskTracker (the TTs ask for a task to run
and the JT decides which task is the most appropriate). Currently, the JT
favors a task whose input data is on the same host as the TT (if there ar
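As a rough illustration of that assignment loop (hypothetical class and method
names, not the actual JobTracker scheduler code): when a tracker asks for work,
first look for a pending map task whose input split has a replica on that host,
and only fall back to a non-local task otherwise.

import java.util.List;

// Illustrative sketch only -- made-up types, not Hadoop's real scheduler classes.
class LocalityAwareAssigner {
    /** Pick a map task for the TaskTracker host that just asked for work. */
    PendingTask assign(String trackerHost, List<PendingTask> pending) {
        // 1. Prefer a task whose input split has a replica on this host.
        for (PendingTask t : pending) {
            if (t.splitHosts().contains(trackerHost)) {
                return t;
            }
        }
        // 2. Otherwise hand out any remaining task (its split is read remotely).
        return pending.isEmpty() ? null : pending.get(0);
    }
}

interface PendingTask {
    List<String> splitHosts();  // hosts holding replicas of the task's input split
}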
I think I found the issue. The class org.apache.hadoop.fs.DU assumes
1024-byte blocks when reporting usage information:
this.used = Long.parseLong(tokens[0])*1024;
This works fine on Linux, but on Solaris and Mac OS the reported number of
blocks is based on 512-byte blocks.
The solution is si
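For context, one way to sidestep the platform difference (a sketch, not the
actual patch; the class name here is made up) is to ask du for kilobyte units
explicitly with the POSIX -k flag, so multiplying by 1024 is correct on Linux,
Solaris, and Mac OS alike:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class DiskUsageSketch {
    // Run "du -sk <dir>"; -k forces 1024-byte units on POSIX systems,
    // so the *1024 multiplication below is then always correct.
    public static long usedBytes(String dir) throws IOException, InterruptedException {
        Process p = Runtime.getRuntime().exec(new String[] {"du", "-sk", dir});
        BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()));
        try {
            String[] tokens = r.readLine().trim().split("\\s+");
            p.waitFor();
            return Long.parseLong(tokens[0]) * 1024L;  // tokens[0] is KB used
        } finally {
            r.close();
        }
    }
}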
>
> What are the data directories
> specified in your configuration? Have you specified two data directories
> per
> volume?
>
No, just one directory per volume. This is the value of dfs.data.dir in my
hadoop-site.xml:
dfs.data.dir
/local/data/hadoop/d0/dfs/data,/local/data/ha
Core-user is the right place for this question.
Your description is mostly correct. Jobs don't necessarily go to all of
your boxes in the cluster, but they may.
Non-uniform machine specs are a bit of a problem that is being (has been?)
addressed by allowing each machine to have a slightly diffe
The datanode runs du on its data directories hourly. In between two "du"s, used space
is updated when a block is added or deleted. What are the data directories
specified in your configuration? Have you specified two data directories per
volume?
Hairong
On 2/15/08 1:05 PM, "Martin Traverso" <[EMAIL PROTEC
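A tiny sketch of that bookkeeping (a hypothetical class, not the real datanode
code): cache the expensive du result, refresh it periodically, and adjust the
cached figure as blocks are added or deleted in between.

// Illustrative sketch of "du hourly, deltas in between" -- not Hadoop source.
public class CachedDiskUsage {
    private static final long REFRESH_INTERVAL_MS = 60L * 60L * 1000L; // hourly

    private long usedBytes;      // last full "du" result plus deltas since then
    private long lastRefreshMs;

    public synchronized long getUsed() {
        long now = System.currentTimeMillis();
        if (now - lastRefreshMs > REFRESH_INTERVAL_MS) {
            usedBytes = runDu();  // expensive full scan of the data directories
            lastRefreshMs = now;
        }
        return usedBytes;
    }

    public synchronized void blockAdded(long blockBytes)   { usedBytes += blockBytes; }
    public synchronized void blockDeleted(long blockBytes) { usedBytes -= blockBytes; }

    private long runDu() {
        return 0L; // placeholder: shell out to "du" or walk the directories here
    }
}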
Hello,
My first time posting in this news group. My question sounds more like a
MapReduce question than a Hadoop HDFS question.
To my understanding, the JobClient will submit all Mapper and Reducer classes
in a uniform way to the cluster? Can I assume this is more like a uniform
sched
Hi,
Are there any known issues on how dfsadmin reports disk usage? I'm getting
some weird values:
Name: 10.15.104.46:50010
State : In Service
Total raw bytes: 1433244008448 (1.3 TB)
Remaining raw bytes: 383128089432 (356.82 GB)
Used raw bytes: 1042296986024 (970.71 GB)
% used: 72.72%
Ho
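If the DU block-size issue discussed elsewhere in this thread is the cause (an
assumption, not a confirmed diagnosis), "Used raw bytes" would be roughly double
the real usage, since a 512-byte block count gets multiplied by 1024. A
back-of-the-envelope check:

public class UsageCheck {
    public static void main(String[] args) {
        long reportedUsed = 1042296986024L;    // "Used raw bytes" from dfsadmin
        long likelyActual = reportedUsed / 2;  // if du counted 512 B blocks, not 1024 B
        double gib = 1024.0 * 1024.0 * 1024.0;
        System.out.printf("reported %.2f GB, plausible actual %.2f GB%n",
                reportedUsed / gib, likelyActual / gib);  // ~970.71 GB vs ~485.35 GB
    }
}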
On Feb 14, 2008, at 2:09 PM, Jason Venner wrote:
We write a separate file in many of our mappers and/or reducers.
We are somewhat concerned about speculative execution and what
happens to the output files of killed jobs, but it seems to work fine.
We build the output files by passing in a *
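For what it's worth, one common way to keep speculative execution safe with side
files is to create them under the task's own work directory, so only the attempt
that commits gets its files promoted to the job output. A rough sketch, assuming
the old org.apache.hadoop.mapred API and that FileOutputFormat.getWorkOutputPath
is available in your version:

import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;

public class SideFiles {
    // Open a side file next to the task's regular output; if this attempt is
    // killed (e.g. a losing speculative duplicate), its work directory is
    // discarded rather than promoted to the final output directory.
    public static FSDataOutputStream openSideFile(JobConf conf, String name)
            throws IOException {
        Path workDir = FileOutputFormat.getWorkOutputPath(conf); // task-private dir
        Path side = new Path(workDir, name);
        return side.getFileSystem(conf).create(side);
    }
}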
On 2/15/08 9:19 AM, "Nathan Wang" <[EMAIL PROTECTED]> wrote:
> Right, you can't add that line globally. That will affect all processes.
>
> What you can do is to modify this file: HADOOP_HOME/bin/hadoop.
> For each process, give a different port number.
See also https://issues.apache.org
Right, you can't add that line globally. That will affect all processes.
What you can do is to modify this file: HADOOP_HOME/bin/hadoop.
For each process, give a different port number.
For example, for tasktracker, assign port 12345:
...
elif [ "$COMMAND" = "tasktracker" ] ; then
CLASS=org.ap
If I use the following parameters in mapred.child.java.opts, then the
Reduce tasks will immediately fail with exit code 1.
-Dcom.sun.management.jmxremote.port=7575
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
The problem is the fact that there are 2
We have several clusters, and on two of them dfshealth.jsp does not run. To the
best of our knowledge the clusters are identical except for the slaves, and the
dfs and tasktracker.
I don't seem to find anything in the log files for the webapps. The
jobtracker.jsp runs without problem. What
Many thanks jdcryans :)
Regards, Sathish
-Original Message-
From: Jean-Daniel Cryans [mailto:[EMAIL PROTECTED]
Sent: Friday, February 15, 2008 7:13 PM
To: core-user@hadoop.apache.org
Subject: Re: specifying Hadoop disk space
Hi,
Have you read : http://wiki.apache.org/hadoop/QuickStart
Hi,
Have you read : http://wiki.apache.org/hadoop/QuickStart
Stage 3, second dot?
Regards,
jdcryans
2008/2/15, Chandran, Sathish <[EMAIL PROTECTED]>:
>
> Hi all,
>
> Can you help me out with the following?
>
> Normally Hadoop takes the free disk space available on the machine.
> But I