Hi
I have a query. Is it possible to have no mappers but
reducers alone? AFAIK, if we need to avoid triggering the reducers we can
set numReduceTasks to zero, but no such setting exists for mappers. So how
can it be achieved, if possible?
Thank You
Regards
Bejoy.K.S
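(For reference, the opposite case Bejoy mentions, skipping the reduce phase, is normally set up as below; a minimal sketch using the new-API Job class, with the identity Mapper used only to keep the example self-contained.)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class MapOnlyDriver {
      public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "map-only example");
        job.setJarByClass(MapOnlyDriver.class);
        job.setMapperClass(Mapper.class);   // identity mapper, purely for illustration
        job.setNumReduceTasks(0);           // zero reducers: map output is written straight to HDFS
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }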
I don't think that is possible. Can you explain in what scenario you want to
have no mappers, only reducers?
Best Regards,
Sonal
Crux: Reporting for HBase https://github.com/sonalgoyal/crux
Nube Technologies http://www.nubetech.co
http://in.linkedin.com/in/sonalgoyal
On Wed, Sep 7, 2011 at
Hi
Is it possible to have my mapper in Perl and my reducer in Java? In my
existing legacy system some of the larger processing is handled by Perl, and its
business logic is really complex. It would be a herculean task to
convert all the Perl to Java. But the reducer business logic, which is
Hi Bejoy,
It is possible to execute a job with no mappers, only reducers.
You can try this by giving an empty directory as input for the job.
Devaraj K
From: Bejoy KS [mailto:bejoy.had...@gmail.com]
Sent: Wednesday, September 07, 2011 1:30 PM
To:
Oh boy are you in for a surprise. Reducers _can_ run with 0 mappers in a job ;-)
/me puts his troll-mask on.
➜ ~HADOOP_HOME hadoop fs -mkdir abc
➜ ~HADOOP_HOME hadoop jar hadoop-examples-0.20.2-cdh3u1.jar wordcount abc out
11/09/07 14:24:14 INFO input.FileInputFormat: Total input paths to
Sahana,
Yes. But, isn't that how it is normally? What makes you question this
capability?
On Wed, Sep 7, 2011 at 2:37 PM, Sahana Bhat sana.b...@gmail.com wrote:
Hi,
Is it possible to have multiple mappers where each mapper is
operating on a different input file and whose result
You can look at Hadoop streaming
http://hadoop.apache.org/common/docs/r0.20.0/streaming.html
Thanks
Amareshwari
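A rough sketch of what such a streaming invocation could look like; the input/output paths and the mapper script name are made up, and the exact jar location varies by distribution:

    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
        -input  /user/bejoy/input \
        -output /user/bejoy/output \
        -mapper my_mapper.pl \
        -file   my_mapper.pl \
        -reducer org.apache.hadoop.mapred.lib.IdentityReducer

The -reducer option also accepts a Java class name, so a custom Java reducer can be used in place of IdentityReducer once it is on the job's classpath.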
On 9/7/11 1:38 PM, Bejoy KS bejoy.had...@gmail.com wrote:
Hi
Is it possible to have my mapper in Perl and my reducer in Java? In my
existing legacy system some of the larger processing is
Hi,
I understand that given a file, the file is split across 'n' mapper
instances, which is the normal case.
The scenario I have is:
1. Two files which are not totally identical in terms of number of columns
(but have data that is similar in a few columns) need to be processed and
after
This is true, and it took us by surprise in the recent past. It also had
quite an impact on our job cycles, where the size of the input is totally
random and could also be zero at times.
In one of our cycles, we run a lot of jobs. Say we configure X as the number of
reducers for a job which does not
Hi,
It's possible by setting the number of reduce tasks to 1. Based on your
example, it looks like you need to group your records based on Date, counter1
and counter2. So that should go into the logic of building the key for your
map output.
Thanks
Sudhan S
On Wed, Sep 7, 2011 at 3:02 PM, Sahana Bhat
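A minimal sketch of the composite-key idea described above, assuming tab-separated input with Date, counter1 and counter2 as the first three columns (the layout and delimiter are hypothetical):

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class CompositeKeyMapper extends Mapper<LongWritable, Text, Text, Text> {
      @Override
      protected void map(LongWritable offset, Text line, Context context)
          throws IOException, InterruptedException {
        String[] cols = line.toString().split("\t");
        // Key = Date|counter1|counter2, so records sharing these fields meet in one reduce group.
        String key = cols[0] + "|" + cols[1] + "|" + cols[2];
        context.write(new Text(key), line);
      }
    }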
Sahana,
Yes this is possible as well. Please take a look at the MultipleInputs
API @
http://hadoop.apache.org/common/docs/r0.20.1/api/org/apache/hadoop/mapred/lib/MultipleInputs.html
It allows you to add multiple paths, each with its own mapper
implementation, and you can then have a common reducer
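A rough sketch of the old-API (JobConf) wiring the linked MultipleInputs class supports; the mapper, reducer, driver and path names below are hypothetical:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.TextInputFormat;
    import org.apache.hadoop.mapred.lib.MultipleInputs;

    JobConf conf = new JobConf(MyDriver.class);        // hypothetical driver class
    MultipleInputs.addInputPath(conf, new Path("/data/fileA"),
        TextInputFormat.class, FileAMapper.class);     // hypothetical mapper for file A
    MultipleInputs.addInputPath(conf, new Path("/data/fileB"),
        TextInputFormat.class, FileBMapper.class);     // hypothetical mapper for file B
    conf.setReducerClass(JoinReducer.class);           // hypothetical common reducer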
Harsh, can you please tell us how we can use MultipleInputs with the Job object on
Hadoop 0.20.2? As you can see, MultipleInputs uses a JobConf object, but
I want to use the Job object as in the new Hadoop 0.21 API.
I remember you talked about pulling things out of the new API and adding them into
our
Praveenesh,
The JIRA https://issues.apache.org/jira/browse/MAPREDUCE-369
introduced it and carries a patch that I think would apply without
much trouble on your cluster's sources. You can mail me directly if
you need help applying a patch.
Alternatively, you can do something like downloading
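Assuming the patch gives you the same interface as the 0.21 org.apache.hadoop.mapreduce.lib.input.MultipleInputs, the Job-based calls would look roughly like this (mapper, reducer and path names are again hypothetical):

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

    Job job = new Job(conf, "multiple inputs, new API");
    MultipleInputs.addInputPath(job, new Path("/data/fileA"),
        TextInputFormat.class, FileAMapper.class);     // hypothetical mapper for file A
    MultipleInputs.addInputPath(job, new Path("/data/fileB"),
        TextInputFormat.class, FileBMapper.class);     // hypothetical mapper for file B
    job.setReducerClass(JoinReducer.class);            // hypothetical common reducer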
Thank you all. I too noticed this strange behavior some time back.
Now my initial concern still remains. If I provide an empty directory as my input,
yes, the map tasks won't be executed. But my reducer needs input
to do the processing/aggregation. In such a scenario, is there an option to
Hi
What is the right way to pass a parameter for all mappers and reducers to
see?
Thanks
Nope. A reducer's input is from the map outputs alone (fetched in by
the shuffling code), which would not exist here.
What are you looking to do? Why won't a map task suffice for doing that?
On Wed, Sep 7, 2011 at 4:51 PM, Bejoy KS bejoy.had...@gmail.com wrote:
Thank you all. I too noticed
There is no right way. I think the best thing to do is to ask in the
forums. I thought maybe via the Configuration object, but this is by no means
a formal solution.
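For what it's worth, the Configuration route usually looks roughly like this; the property name myjob.threshold and the class name are made up for illustration:

    // Driver side: put the parameter into the job's Configuration.
    Configuration conf = new Configuration();
    conf.set("myjob.threshold", "42");              // "myjob.threshold" is a made-up name
    Job job = new Job(conf, "parameter example");

    // Task side: read it back in setup(); the same works in a Reducer.
    public class ThresholdMapper extends Mapper<LongWritable, Text, Text, Text> {
      private int threshold;

      @Override
      protected void setup(Context context) {
        threshold = context.getConfiguration().getInt("myjob.threshold", 0);
      }
    }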
On Wed, Sep 7, 2011 at 2:39 PM, ilyal levin nipponil...@gmail.com wrote:
Hi
What is the right way to pass a parameter for all
Another method is to store it in a shared store which can be accessed from
each node, such as ZooKeeper, HDFS, HBase, a DB, etc.
On Wed, Sep 7, 2011 at 20:11, Yaron Gonen yaron.go...@gmail.com wrote:
There is no right way. I think the best thing to do is to ask in the
forums. I thought maybe via the Configuration
I think my job is running out of memory before it calls reduce() in the
reducer. It's running with large blocks of binary data emitted from the
Mapper. Each record emitted from the mappers should be small enough to
fit in memory. However, if it tried to somehow keep a bunch of records
for one
You could just have a mapper which sends out the exact values it takes in (i.e.,
output k1,v1 as k2,v2). I think that's the best you'll be able to do here.
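A pass-through mapper of the kind suggested above is only a few lines with the new API (the base Mapper class already behaves this way by default):

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Emits every input record unchanged, so the reducers receive exactly what was read in.
    public class PassThroughMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        context.write(key, value);   // k1,v1 emitted as k2,v2, as suggested above
      }
    }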
On Sep 7, 2011, at 4:21 AM, Bejoy KS bejoy.had...@gmail.com wrote:
Thank you all. I too noticed this strange behavior some time back.