With this info it is difficult to find out where the problem is coming from. Can
you check the job tracker and task tracker logs related to these jobs?
Devaraj K
From: Sudharsan Sampath [mailto:sudha...@gmail.com]
Sent: Wednesday, June 22, 2011 11:51 AM
To: mapreduce-user@hadoop.apache.org
Hi,
I am starting a job from the map of another job. Following is a quick mock of
the code snippets that I use. But the 2nd job hangs indefinitely after the
1st task attempt fails. There is not even a 2nd attempt. This runs fine on a
one-node cluster but fails on a two-node cluster.
Can someo
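A minimal sketch of the pattern being described here (launching a second job from inside a mapper); the class names, paths, and job names are illustrative stand-ins, not the poster's actual snippets:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Outer job's mapper: each map() call submits an inner job and waits for it.
public class ChainingMapper extends Mapper<LongWritable, Text, Text, Text> {
  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    Configuration conf = new Configuration();
    Job inner = new Job(conf, "inner-job");          // illustrative name
    inner.setJarByClass(ChainingMapper.class);
    inner.setMapperClass(Mapper.class);              // identity mapper, for brevity
    inner.setNumReduceTasks(0);
    inner.setOutputKeyClass(LongWritable.class);
    inner.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(inner, new Path("/tmp/inner/in"));    // illustrative paths
    FileOutputFormat.setOutputPath(inner, new Path("/tmp/inner/out"));
    try {
      // The outer map task blocks here until the inner job finishes or fails.
      inner.waitForCompletion(true);
    } catch (ClassNotFoundException e) {
      throw new IOException(e);
    }
    context.write(new Text(key.toString()), value);
  }
}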
Hi Virajith,
This exception will be thrown when the host name is present in the
file that is the value of the property "mapred.hosts.exclude". If you
don't mention anything for the "mapred.hosts" and "mapred.hosts.exclude"
properties, or the mentioned files don't contain any hosts, the job trac
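For context, a minimal sketch of what those two properties point at; in practice they are set in the JobTracker's mapred-site.xml, and the file paths here are hypothetical:

import org.apache.hadoop.conf.Configuration;

public class HostFilesExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Each file is a plain-text list of host names, one per line.
    conf.set("mapred.hosts", "/etc/hadoop/conf/hosts.include");          // hosts allowed to connect
    conf.set("mapred.hosts.exclude", "/etc/hadoop/conf/hosts.exclude");  // hosts being excluded/decommissioned
  }
}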
Fortunately, DistributedCache solved my problem! I put a jar file on
HDFS, which contains the necessary classes for the job, and I used this:
*DistributedCache.addFileToClassPath(new Path("/myjar/myjar.jar"), conf);*
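A slightly fuller sketch of that call in a job driver, assuming the jar has already been copied to /myjar/myjar.jar on HDFS (the job name is illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class CachedJarDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Adds the HDFS jar to the classpath of every task launched for this job.
    DistributedCache.addFileToClassPath(new Path("/myjar/myjar.jar"), conf);
    Job job = new Job(conf, "uses-classes-from-myjar");   // illustrative name
    // ... set mapper/reducer, formats, and paths, then job.waitForCompletion(true) ...
  }
}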
Thanks for fast answer!
And sorry for my mistake (about the wrong list), that was
Allen,
On Wed, Jun 22, 2011 at 2:28 AM, Allen Wittenauer wrote:
>
> On Jun 21, 2011, at 1:31 PM, Harsh J wrote:
>
>> Gabor,
>>
>> If your jar does not contain code changes that need to get transmitted
>> every time, you can consider placing them on the JT/TT classpaths
>
> ... which means
Hi,
On Tue, Jun 21, 2011 at 16:14, Mapred Learn wrote:
> The problem is that when one text file goes onto HDFS as a 60 GB file, one mapper
> takes more than an hour to convert it to a sequence file and finally fails.
>
> I was thinking how to split it from the client box before uploading to HDFS.
Have a look
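One approach, sketched here under the assumption that the 60 GB file is uncompressed (and therefore splittable), is to let the cluster do the conversion with a map-only job, so each HDFS block becomes one map task writing SequenceFile output. The class below is an illustration, not a pointer to an existing tool:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class TextToSequenceFile {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "text-to-seqfile");        // illustrative name
    job.setJarByClass(TextToSequenceFile.class);
    job.setMapperClass(Mapper.class);                  // identity mapper
    job.setNumReduceTasks(0);                          // map-only: one output file per split
    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(Text.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(SequenceFileOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    SequenceFileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}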
Off-the-wall thought, but it might be possible to do this by rolling your
own load manager using the fair scheduler. I know this is how people have set up
custom job distributions based on current cluster utilization.
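A hypothetical sketch of the kind of policy such a load manager would encode; the class and method names are illustrative and not the fair scheduler's actual LoadManager API:

// Hypothetical policy: cap cluster-wide map utilization rather than relying
// only on per-tracker slot counts. Names and signatures are illustrative.
public class UtilizationCappedPolicy {
  private final double maxClusterMapLoad;   // e.g. 0.8 = stop assigning at 80% of map slots

  public UtilizationCappedPolicy(double maxClusterMapLoad) {
    this.maxClusterMapLoad = maxClusterMapLoad;
  }

  // Should this tracker be handed another map task right now?
  public boolean canAssignMap(int runningMapsOnTracker, int mapSlotsOnTracker,
                              int totalRunningMaps, int totalMapSlots) {
    double clusterLoad =
        totalMapSlots == 0 ? 1.0 : (double) totalRunningMaps / totalMapSlots;
    return runningMapsOnTracker < mapSlotsOnTracker && clusterLoad < maxClusterMapLoad;
  }
}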
Matt
From: Jonathan Zukerman [mailto:zukermanjonat...@gmail.com]
Sent: Tue
Hi,
In Tom White's Hadoop book, White discusses Writable collections and
ends the section with the following:
For lists of a single type of Writable, ArrayWritable is
adequate, but to store different types of Writable in a single list, you can use
GenericWritable to wrap the elements in an ArrayW
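A minimal sketch of that pattern, with IntWritable and Text as the two element types (chosen purely for illustration):

import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.GenericWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

public class MixedWritableExample {

  // GenericWritable subclass that can wrap either an IntWritable or a Text.
  public static class MyGenericWritable extends GenericWritable {
    @SuppressWarnings("unchecked")
    private static final Class<? extends Writable>[] TYPES =
        (Class<? extends Writable>[]) new Class[] { IntWritable.class, Text.class };

    public MyGenericWritable() {}                     // needed for deserialization
    public MyGenericWritable(Writable w) { set(w); }  // convenience wrapper

    @Override
    protected Class<? extends Writable>[] getTypes() {
      return TYPES;
    }
  }

  public static void main(String[] args) {
    // Two differently typed Writables stored in a single ArrayWritable.
    Writable[] values = {
        new MyGenericWritable(new IntWritable(42)),
        new MyGenericWritable(new Text("hello"))
    };
    ArrayWritable mixed = new ArrayWritable(MyGenericWritable.class, values);
    System.out.println(mixed.get().length + " elements");
  }
}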
That is a bit problematic because I have other jobs running at the same
time, and most of them don't care about the number of map tasks per tasktracker.
Is there a way to implement this in my job project? What is the best way
to do it?
On Tue, Jun 21, 2011 at 8:08 PM, Joey Echeverria wrote:
> The
The only way to do that is to drop the setting down to one and bounce
the TaskTrackers.
-Joey
On Tue, Jun 21, 2011 at 12:52 PM, Jonathan Zukerman
wrote:
> Hi,
> Is there a way to set the maximum map tasks for all tasktrackers in my
> cluster for a certain job?
> Most of my tasktrackers are confi
Hi,
Is there a way to set the maximum map tasks for all tasktrackers in my
cluster for a certain job?
Most of my tasktrackers are configured to handle 4 maps concurrently, and
most of my jobs don't care where the map function runs. But a small part
of my jobs requires that no two map functions w
*keep.failed.task.files* is also set by the client (also, HDFS block size,
replication level, *io.sort.{mb,factor}*, etc.)
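A minimal sketch of setting those client-side, per job, before submission (the values are illustrative, not recommendations):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ClientSideConf {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setBoolean("keep.failed.task.files", true);      // keep failed tasks' data for debugging
    conf.setLong("dfs.block.size", 128L * 1024 * 1024);   // HDFS block size for files this job writes
    conf.setInt("dfs.replication", 2);                    // replication level for output files
    conf.setInt("io.sort.mb", 200);                       // map-side sort buffer
    conf.setInt("io.sort.factor", 25);                    // streams merged at once during sorts
    Job job = new Job(conf, "client-side-conf-example");  // illustrative name
    // ... set mapper/reducer, input/output paths, then submit ...
  }
}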
On Tue, Jun 21, 2011 at 7:15 AM, John Armstrong wrote:
> On Tue, 21 Jun 2011 06:37:50 -0700, Alex Kozlov
> wrote:
> > However, the job's tasks are executed in a separate JVM
Hi,
I am trying to set up a Hadoop cluster with 7 nodes, with the master node also
functioning as a slave node (i.e., it runs a datanode and a tasktracker along
with the namenode and jobtracker daemons). I am able to get HDFS working.
However, when I try starting the tasktrackers (bin/start-mapred.sh), I
On Tue, 21 Jun 2011 06:37:50 -0700, Alex Kozlov
wrote:
> However, the job's tasks are executed in a separate JVM and some
> of the parameters, like max heap from *mapred.child.java.opts*, are set
> during the job execution. In this case the parameter is coming from the
> client side where the who
Hi John, You are right: the *-site.xml files are read by daemons on
startup. However, the job's tasks are executed in a separate JVM and some
of the parameters, like max heap from *mapred.child.java.opts*, are set
during the job execution. In this case the parameter is coming from the
client side
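A minimal sketch of that client-side behaviour: the heap for task JVMs comes from the configuration built on the submitting machine (its *-site.xml files plus anything set in code like this), not from the TaskTracker's own files. The value is illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ChildOptsExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();        // loads the client's *-site.xml files
    conf.set("mapred.child.java.opts", "-Xmx512m");  // max heap for each task JVM (illustrative)
    Job job = new Job(conf, "child-opts-example");   // illustrative name
    // ... configure mapper/reducer and paths, then submit ...
  }
}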
One of my colleagues and I have a little confusion between us as to
exactly when mapred-site.xml is read. The pages on hadoop.apache.org don't
seem to specify it very clearly.
One position is that mapred-site.xml is read by the daemon processes at
startup, and so changing a parameter in mapred-si