I am using streaming with Perl, and I want to get jobconf variable values.
Many tutorials say they are in the environment, but I cannot get them.
For example, in the reducer:
while (<STDIN>) {
    my $part = $ENV{"mapred.task.partition"};
    print "$part\n";
}
It turns out that $ENV{"mapred.task.partition"} is always empty.
Does anybody have a clue? Thanks a lot.
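A likely explanation: streaming exports jobconf values into the task
environment with non-alphanumeric characters such as dots replaced by
underscores, so the key above would need underscores instead. A minimal
sketch of the reducer under that assumption:
while (<STDIN>) {
    # jobconf names show up with '.' turned into '_'
    my $part = $ENV{"mapred_task_partition"};
    print "$part\n";
}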
--- On Thu, 4/9/09, Steve Gao wrote:
From: Steve Gao
Subject: [Interesting] One reducer randomly hangs on getting 0 mapper output
To: core-user@hadoop.apache.org
Date: Thursday, April 9, 2009, 6:04 PM
I have hadoop jobs where the last reducer randomly hangs on getting 0 mapper
output.
I am using 0.17.0.
I think the problem is basically that the reducer falls into an infinite loop
trying to fetch mapper output when the mapper is somehow unavailable or dead.
Doesn't Hadoop have a solution for this?
--- On Thu, 4/9/09, Steve Gao wrote:
From: Steve Gao
Subject: [Interesting] One reducer randomly hangs on getting 0 mapper output
I have hadoop jobs where the last reducer randomly hangs on getting 0 mapper
output. By randomly I mean the job sometimes works correctly; sometimes the
last reducer keeps reading map output but always gets 0 data. It can hang for
up to 100 hours getting 0 data until I kill it. After I k
Where this ends right now is that the patch that was
committed broke a lot of other things, so it's been disabled. As such, there
is no working append in HDFS, and certainly not in hadoop-17.x.
-Bryan
On Mar 17, 2009, at 4:50 PM, Steve Gao wrote:
> Thanks, but I was told there is an append command,
..@yahoo.com"
Date: Tuesday, March 17, 2009, 7:52 PM
Hello Steve.
Assuming you are using *nix.
To apply a patch:
patch -p0 -E < HADOOP-X.patch
To remove a patch:
patch -p0 --reverse -E < HADOOP-X.patch
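For example, to apply the HADOOP-1700 patch mentioned in this thread, the
commands would run from the top of the source tree the patch was generated
against. A sketch (the checkout path and the ant rebuild step are assumptions
for a 0.17-era source tree):
cd hadoop-0.17.0
patch -p0 -E < HADOOP-1700.patch
ant    # rebuild so the change takes effect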
Hope this helps.
Regards,
Ravi
On 3/17/09 4:48 PM, "Steve Gao" wrote:
To: core-user@hadoop.apache.org
Date: Tuesday, March 17, 2009, 7:42 PM
what about an identity mapper taking A and B as inputs? this will
likely mix rows of A and B together though...
On Tue, Mar 17, 2009 at 7:35 PM, Steve Gao wrote:
> BTW, I am using hadoop 0.17.0 and jdk 1.6
>
> --- On Tue, 3/17/09, Steve Gao wrote:
I want to apply this patch https://issues.apache.org/jira/browse/HADOOP-1700
to my hadoop 0.17.0.
Would anybody tell me how to do it? Thanks!
BTW, I am using hadoop 0.17.0 and jdk 1.6
--- On Tue, 3/17/09, Steve Gao wrote:
From: Steve Gao
Subject: Does HDFS provide a way to append A file to B ?
To: core-user@hadoop.apache.org
Date: Tuesday, March 17, 2009, 7:22 PM
I need to append file A to file B in HDFS without downloading/uploading them to
local disk. Is there a way?
Personally I haven't worked with streaming, but I guess your jobconf's
map.input.file param should do it for you.
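If so, in a Perl mapper that value would surface as an environment variable
with the dots turned into underscores. A minimal sketch, assuming 0.17-era
streaming behavior:
# tag each input record with the file it came from
while (<STDIN>) {
    chomp;
    my $file = $ENV{"map_input_file"};   # jobconf map.input.file, '.' -> '_'
    print "$file\t$_\n";
}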
-----Original Message-----
From: Steve Gao [mailto:[EMAIL PROTECTED]]
Sent: Thursday, October 23, 2008 7:26 AM
To: core-user@hadoop.apache.org
Cc: [EMAIL PROTECTED]
Subject:
Sorry for the email. Thanks for any help or hint.
I am using Hadoop Streaming. The inputs are multiple files.
Is there a way to get the current filename in the mapper?
For example:
$HADOOP_HOME/bin/hadoop \
jar $HADOOP_HOME/hadoop-streaming.jar \
-input file1 \
-input file2 \
-output myOutputDir \
-mapper mapper \
-reducer reducer
Would anybody help me?
Can I use
-jobconf mapred.map.tasks=50 in the streaming command to change the job's
number of mappers?
I don't have a Hadoop cluster at hand and cannot verify it. Thanks for your
help.
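For what it's worth, the option should be accepted, but mapred.map.tasks is
only a hint: the actual number of map tasks is ultimately determined by the
input splits. An untested sketch, reusing the command shape from the example
above:
$HADOOP_HOME/bin/hadoop \
jar $HADOOP_HOME/hadoop-streaming.jar \
-jobconf mapred.map.tasks=50 \
-input file1 \
-input file2 \
-output myOutputDir \
-mapper mapper \
-reducer reducer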
--- On Wed, 10/15/08, Steve Gao <[EMAIL PROTECTED]> wrote:
From: Steve Gao <[EMAIL PROTECTED]>
Is there a way to change the number of mappers in the Hadoop streaming command
line?
I know I can change hadoop-default.xml:
<property>
  <name>mapred.map.tasks</name>
  <value>10</value>
  <description>The default number of map tasks per job. Typically set
  to a prime several times greater than number of available hosts.
  Ignored when mapred.job.tracker is "local".</description>
</property>
I am excited to see the slides. Would you send me a copy? Thanks.
--- On Wed, 10/15/08, Nishant Khurana <[EMAIL PROTECTED]> wrote:
From: Nishant Khurana <[EMAIL PROTECTED]>
Subject: Re: Hadoop User Group (Bay Area) Oct 15th
To: core-user@hadoop.apache.org
Date: Wednesday, October 15, 2008, 9:45 AM
Does anybody know if there are books about Hadoop or Pig? The wiki and manual
are kind of ad hoc and hard to comprehend. For example, "I want to know how to
apply patches to my Hadoop, but can't find how to do it", that kind of thing.
Would anybody help? Thanks.
Does anybody know? Thanks a lot.
--- On Thu, 10/2/08, Steve Gao <[EMAIL PROTECTED]> wrote:
From: Steve Gao <[EMAIL PROTECTED]>
Subject: How to concatenate hadoop files to a single hadoop file
To: core-user@hadoop.apache.org
Cc: [EMAIL PROTECTED]
Date: Thursday, October 2, 2008, 3:17 PM
Suppose I have 3 files in Hadoop that I want to "cat" into a single file. I
know it can be done by "hadoop dfs -cat" to a local file and uploading it back
to Hadoop, but that's very expensive for large files. Is there an internal way
to do this in Hadoop itself? Thanks
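One workaround, in the spirit of the identity-mapper suggestion earlier in
this digest, is a streaming job over all three files with a single reducer, so
the job writes exactly one part file in the output directory. A sketch (the
file and directory names are placeholders, and note that the shuffle re-sorts
lines, so the original line order is not preserved):
$HADOOP_HOME/bin/hadoop \
jar $HADOOP_HOME/hadoop-streaming.jar \
-input file1 \
-input file2 \
-input file3 \
-output mergedDir \
-mapper cat \
-reducer cat \
-jobconf mapred.reduce.tasks=1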
I have 5 running jobs, each with 2 reducers. Because I set the max number of
reducers to 10, any incoming job will be held until some of the 5 jobs finish
and release reducer quota.
Now the problem is that an incoming job has a higher priority, so I want to
pause some of the 5 jobs and let the new job run first.
Unfortunately this does not work. Hadoop complains:
08/08/21 18:04:46 ERROR streaming.StreamJob: Unexpected arg1 while processing
-input|-output|-mapper|-combiner|-reducer|-file|-dfs|-jt|-additionalconfspec|-inputformat|-outputformat|-partitioner|-numReduceTasks|-inputreader|-mapdebug|-reducedebug
That's interesting. Suppose your mapper script is a Perl script: how do you
assign the value of "my.mapper.arg1" to a variable $x?
$x = $my.mapper.arg1
I just tried that, and my Perl script does not recognize $my.mapper.arg1.
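A likely fix: values set with -jobconf become environment variables in the
task, with dots replaced by underscores; they are not Perl variables. A
minimal sketch under that assumption:
# job submitted with: -jobconf my.mapper.arg1=somevalue
my $x = $ENV{"my_mapper_arg1"};   # '.' becomes '_' in the environment
print "$x\n";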
--- On Thu, 8/21/08, Rong-en Fan <[EMAIL PROTECTED]> wrote:
From: Rong-en Fan <[EMAIL PROTECTED]>
Subject: Re: [Streaming] What is the difference between streaming options
-file and -cacheFile?
To: core-user@hadoop.apache.org, "Steve Gao" <[EMAIL PROTECTED]>
Date: Friday, July 18, 2008, 8:27 PM
On Jul 18, 2008, at 4:53 PM, Steve Gao wrote:
> Hi All,
> I am u
Hi All,
I am using Hadoop Streaming. I am confused by the streaming options -file and
-cacheFile. It seems that they mean the same thing, right?
Another pair of confusing options is -numReduceTasks and -jobconf
mapred.reduce.tasks. Both are used to control (or give a hint about) the
number of reducers.
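For the record, the two are not the same: -file ships a file from the local
submitting machine to the cluster along with the job, while -cacheFile points
at a file already in HDFS and distributes it through the DistributedCache. And
-numReduceTasks is simply the streaming shorthand for -jobconf
mapred.reduce.tasks. Illustrative shapes (the paths and names here are made-up
examples):
-file ./mapper.pl
-cacheFile hdfs://namenode:9000/share/dict.txt#dict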