Can you explain your use case in some more detail?
Thanks
Devaraj
From: 静行 [xiaoyong.den...@taobao.com]
Sent: Thursday, July 05, 2012 9:53 AM
To: mapreduce-user@hadoop.apache.org
Subject: Re: How To Distribute One Map Data To All Reduce Tasks?
Thanks!
But wh
In some cases, you have to debug it in a real cluster (for example, a
production problem).
In that case you need to specify
-Xdebug -Xrunjdwp:transport=dt_socket,address=12345,server=n,suspend=n
for mapred.java.child.opts for the job (sorry, forgot the exact param name,
maybe not exactly mapred.j
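In Hadoop 0.20/1.x the property being recalled above is `mapred.child.java.opts`; a minimal, hedged driver sketch passing the JDWP agent flags through it (the class and job names are illustrative):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class DebuggableJobDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Make every child task JVM listen for a remote debugger on port
        // 12345 (server=y); suspend=n lets tasks run without waiting.
        // Note: only one task per node can bind the port at a time.
        conf.set("mapred.child.java.opts",
                 "-Xdebug -Xrunjdwp:transport=dt_socket,address=12345,server=y,suspend=n");
        Job job = new Job(conf, "debuggable-job");
        // ... configure mapper/reducer/input/output as usual, then:
        // job.waitForCompletion(true);
    }
}
```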
Thanks!
But what I really want to know is how I can distribute one map's data to every
reduce task, not to just one of the reduce tasks.
Do you have some ideas?
From: Devaraj k [mailto:devara...@huawei.com]
Sent: July 5, 2012 12:12
To: mapreduce-user@hadoop.apache.org
Subject: RE: How To Distribute One Map Data To All
You can distribute the map data to the reduce tasks using a Partitioner. By
default the Job uses the HashPartitioner. You can write a custom Partitioner
according to your need.
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Partitioner.html
Thanks
Devaraj
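A Partitioner by itself routes each record to exactly one reduce task, so reaching every reducer also needs the mapper to emit one tagged copy per partition. A minimal sketch of that pattern, under the assumption that the key carries the target partition number (class names `BroadcastMapper` and `BroadcastPartitioner` are illustrative; 0.20 `org.apache.hadoop.mapreduce` API):

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Partitioner;

// Emit each input record once per reduce task, keyed by the target
// partition, so every reducer receives a copy.
public class BroadcastMapper extends Mapper<Object, Text, IntWritable, Text> {
    @Override
    protected void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        for (int i = 0; i < context.getNumReduceTasks(); i++) {
            context.write(new IntWritable(i), value);
        }
    }
}

// Route each tagged record to the partition named in its key.
class BroadcastPartitioner extends Partitioner<IntWritable, Text> {
    @Override
    public int getPartition(IntWritable key, Text value, int numPartitions) {
        return key.get() % numPartitions;
    }
}
```

Wiring it in with `job.setPartitionerClass(BroadcastPartitioner.class)` replaces the default HashPartitioner for the job.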
all right, thanks~
On Thursday, July 5, 2012, Marcos Ortiz wrote:
> Jason,
> Ramon is right.
> The best way to debug a MapReduce job is to run a local cluster and,
> once you have tested your code enough, deploy it in a real distributed
> cluster.
> On 07/04/2012 10:00 PM, Jason Yang wrote:
Hi all:
How can I distribute one map's data to all reduce tasks?
Jason,
Ramon is right.
The best way to debug a MapReduce job is to run a local cluster and,
once you have tested your code enough, deploy it in a real distributed
cluster.
On 07/04/2012 10:00 PM, Jason Yang wrote:
> ramon,
>
> Thanks very much for your reply.
>
> However, I was
ramon,
Thanks very much for your reply.
However, I was still wondering whether I could debug an MR application in this
way.
I have read some posts talking about using NAT to redirect all the packets
to the network card that connects to the local LAN, but it does not work when
I tried to redirect by usi
Ok thanks, I'll post there.
I realized that the issue has to do with the extra jars that I added to the
hadoop installation.
My job wasn't getting submitted because my TaskTrackers don't seem to start if I
have avro-1.7.0.jar and avro-tools-1.7.0.jar in my hadoop/lib directory.
But I need these
Thanks for the response Anand,
The NullPointerException was by design; I wanted to illustrate that it was not
null. Your suspicion about having multiple threads was correct. I traced the
error down to the thread pool terminating prior to the last mapper completing
its operations.
Sincerely,
Matt
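The fix Matt describes, waiting for in-flight work before the pool shuts down, can be sketched in plain JDK code (names are illustrative):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class GracefulShutdown {
    // Submit n tasks and return how many completed before the pool closed.
    public static int runTasks(int n) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < n; i++) {
            pool.submit(completed::incrementAndGet);
        }
        // Stop accepting new work, then block until queued tasks finish,
        // instead of letting the pool terminate under running tasks.
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return completed.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runTasks(100)); // prints 100
    }
}
```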
It's hard for folks here to help you with CDH - please ask on their own user lists.
Arun
On Jul 4, 2012, at 8:49 AM, Alan Miller wrote:
> Hi,
>
> I’m trying to move from CDH3U3 to CDH4.
> My existing MR program works fine on CDH3U3 but I can't get it to run on CDH4.
>
> Basically my Driver class
Hi,
I'm trying to move from CDH3U3 to CDH4.
My existing MR program works fine on CDH3U3 but I can't get it to run on CDH4.
Basically my Driver class
1. queries a PG DB and writes some HashMaps to files in the Distributed
Cache,
2. then writes some Avro files (avro 1.7.0) to HDFS,
Hi all,
How do you kill a job or application when using MapReduce v2 (YARN)?
I tried the following :
> mapred job -kill job_1341398677537_0020
Could not find job job_1341398677537_0020
I tried the application id, but it is invalid.
I'm using CDH4.
++
benoit
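The thread does not resolve this, but on YARN the application is normally killed through the YARN CLI, where the id uses an `application_` prefix rather than `job_`; a sketch using the id from the message above:

```shell
# List running applications to find the application id:
yarn application -list

# Kill it (note the application_ prefix, not job_):
yarn application -kill application_1341398677537_0020
```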
Jason,
the easiest way to debug a MapReduce program with Eclipse is working in Hadoop
local mode: http://hadoop.apache.org/common/docs/r0.20.2/quickstart.html#Local
In this mode all the components run locally on the same VM and can be easily
debugged using Eclipse.
Hope this will be useful.
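Concretely, the linked quickstart's local mode can be selected from the driver itself; a sketch using the 0.20-era property names (the helper class name is illustrative):

```java
import org.apache.hadoop.conf.Configuration;

public class LocalModeConf {
    // Run the whole job in-process on the local filesystem, so Eclipse
    // breakpoints in mappers and reducers are hit directly.
    public static Configuration localConf() {
        Configuration conf = new Configuration();
        conf.set("mapred.job.tracker", "local"); // local job runner
        conf.set("fs.default.name", "file:///"); // local filesystem
        return conf;
    }
}
```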
Hi, all
I have a Hadoop cluster with 3 nodes; the network topology is like this:
1. For each DataNode, its IP address is like 192.168.0.XXX;
2. For the NameNode, it has two network cards: one connects with the
DataNodes as a local LAN with IP address 192.168.0.110, while the other one
is connec
I am using Hadoop 0.20.2 and I find that many of my tasks are getting timed out.
The task timeout is set to 10 minutes (the default):
Caused by: java.lang.RuntimeException: tasktracker: Task
attempt_201206280317_0052_m_00_0 failed to report status for 950 seconds.
Killing!
The tasks are waiting at
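A common fix for this timeout is to heartbeat from inside any long-running step; a hedged sketch with the 0.20 `mapreduce` API (`expensiveWork` is a placeholder):

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class LongRunningMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String part : value.toString().split(",")) {
            expensiveWork(part);  // placeholder for the slow step
            // Report liveness so the TaskTracker does not kill the attempt
            // after mapred.task.timeout (default 600000 ms = 10 minutes).
            context.progress();
        }
    }

    private void expensiveWork(String s) {
        // application-specific processing
    }
}
```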