I couldn't tell much from the logs about why this happens.
For reporting status, you can write to stderr from your script. The
format should be reporter:status:<message>. If the program emits such lines
on stderr, the framework will treat them as status reports. Hope this
clarifies.
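As a minimal sketch of the above (the pass-through behavior and the 10,000-record reporting interval are assumptions for illustration, not anything prescribed by the framework):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// A streaming task that copies stdin to stdout and periodically emits
// a status line on stderr in the reporter:status:<message> format.
public class StatusReporter {

    // Build a status line in the form the streaming framework recognizes.
    static String statusLine(String message) {
        return "reporter:status:" + message;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        long count = 0;
        while ((line = in.readLine()) != null) {
            System.out.println(line);          // pass each record through unchanged
            if (++count % 10000 == 0) {        // report progress every 10,000 records
                System.err.println(statusLine("processed " + count + " records"));
            }
        }
    }
}
```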
Thanks
Amareshwari
Rick Hangartne
You can enable speculative execution for your jobs.
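For example, with the property names used in the 0.19 line (earlier releases used a single mapred.speculative.execution flag instead):

```xml
<!-- hadoop-site.xml (or JobConf) fragment enabling speculative execution -->
<property>
  <name>mapred.map.tasks.speculative.execution</name>
  <value>true</value>
</property>
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>true</value>
</property>
```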
On 12/24/08 10:25 AM, "Jeremy Chow" wrote:
> Hi list,
> I've come up against a scenario like this: to finish the same task, one of my
> Hadoop clusters needs only 5 seconds, while another needs more than 2
> minutes.
> It's a common phenomenon
Hi list,
I've come up against a scenario like this: to finish the same task, one of my
Hadoop clusters needs only 5 seconds, while another needs more than 2
minutes.
It's a common phenomenon that will decrease the parallelism of our system,
since the faster one will wait for the slower one. How to coor
Hi Amareshwari,
This may or may not be helpful.
Here's an example of three runs in rapid succession. The first and
last completed without any problems. The middle one completed in this
case, but the log has three exceptions of the type we reported (We
think a fourth exception would have
Hello,
MultipleOutputFormat is a very good idea. Thanks. I have a question
about this line from the web page:
"The reducer wants to write data to different files depending on the
actual keys" .. and values.
Examining TestMultipleTextOutputFormat,
class KeyBasedMultipleTextOutputFormat extends
MultipleTextOutpu
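The key-based naming that TestMultipleTextOutputFormat exercises boils down to a filename-from-key rule: a real subclass overrides generateFileNameForKeyValue(key, value, name) on MultipleTextOutputFormat and returns the path to write to. Here that rule is sketched as a plain static method so it stands alone; the class name, fallback behavior, and one-directory-per-key layout are illustrative assumptions:

```java
// Standalone sketch of the filename rule a key-based
// MultipleTextOutputFormat subclass would return from
// generateFileNameForKeyValue().
public class KeyBasedNaming {

    // Derive an output file name from the record's key, falling back to
    // the default part-file name when there is no usable key.
    static String fileNameForKey(String key, String defaultName) {
        if (key == null || key.isEmpty()) {
            return defaultName;
        }
        return key + "/" + defaultName;   // one subdirectory per distinct key
    }
}
```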
Hi Amareshwari,
The "stream.non.zero.exit.status.is.failure" setting is at its default (which
the docs indicate is 'true').
We don't think the problem is the reducer script per se: in one
circumstance we are investigating further, it arises even when the reducer
script does nothing but copy stdin to st
In 0.19 there is a chaining facility; I haven't looked at it yet, but it
may provide an alternative to the rather standard pattern of looping.
You may also want to check what Mahout is doing, as it is a common
problem in that space.
Delip Rao wrote:
Thanks Chris! I ended up doing something simi
The copy rate for the reduces is throttled by the availability of the
data from the maps.
If the map data is not available yet, the effective copy rate goes toward 0.
patek tek wrote:
Hello,
I have been running experiments with Hadoop and noticed that
the copy rate of reducers decreases over
Hi all,
I am new to this list so please forgive me if this is not the right way or
format to ask for help.
I installed Hadoop 0.18.1 with a single node and a single disk, and
everything was working fine.
When I added one more disk I started getting the errors copied below.
I tried bala
Yes, this will work. You will need to configure the classpath to include
that directory.
The TaskTrackers really only have the classpath as set up by
conf/hadoop-env.sh, and the TaskTracker$Child processes have that classpath
plus the unpacked distributed cache directory.
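Concretely, the first part can be a one-line addition to conf/hadoop-env.sh on each TaskTracker node (the directory path below is a placeholder):

```shell
# conf/hadoop-env.sh: prepend a directory of extra classes/jars to the
# classpath that the TaskTracker (and hence its child tasks) starts with.
export HADOOP_CLASSPATH=/path/to/extra:$HADOOP_CLASSPATH
```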
Saptarshi Guha wrote:
Hello,
Wh
Hi,
I ran into some issues with Hadoop 0.18.2 on my Linux box:
The jobs execute without any complaints and are listed in the
succeeded list, but there is no output data besides the "_logs" directory.
The same code works with 0.17.2.1.
Here are some sections of the logs:
[logfile]
had...@
Hello,
I'm testing a cluster with Hadoop 0.18.1 / HBase 0.18.0.
Over the last few days a problem has arisen with my HDFS:
My topology is 4 nodes: 3 nodes run a DataNode and a RegionServer, and one runs
the HBase master, NameNode, and Secondary NameNode. The cluster works for some
hours, then one of the DataNodes fr
There are several ways you can pass static information to tasks in
Hadoop. The first is to store it in the conf via DefaultStringifier, which
needs the object to be serializable through either the Writable or
Serializable interface. A second way would be to save/serialize the data
to a file and send it vi
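The first approach amounts to flattening the object into a single configuration string. Below is a self-contained illustration of that idea using plain Java serialization and Base64; DefaultStringifier does the equivalent using Hadoop's own serialization framework, and the class name here is made up:

```java
import java.io.*;
import java.util.Base64;

// Flatten a Serializable object into a string that can be carried in a
// plain conf property, and restore it on the other side.
public class ConfStringifier {

    // Serialize the object and Base64-encode the bytes into a string.
    static String store(Serializable obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            return Base64.getEncoder().encodeToString(bytes.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decode the string back into bytes and deserialize the object.
    static Object load(String encoded) {
        try {
            byte[] raw = Base64.getDecoder().decode(encoded);
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
                return in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

On the client side you would set the encoded string into the conf, e.g. conf.set("my.app.data", ConfStringifier.store(obj)), and in the task call load(conf.get("my.app.data")) — the property name "my.app.data" being hypothetical.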
You can report status from a streaming job by emitting
reporter:status:<message> on stderr.
See documentation @
http://hadoop.apache.org/core/docs/r0.18.2/streaming.html#How+do+I+update+status+in+streaming+applications%3F
But from the exception trace, it doesn't look like a lack of status
reports (a timeout). The tr
It seems that JobConf doesn't help. Do I have to write the object into DFS?