Re: Prune out data to a specific reduce task

2015-03-16 Thread Azuryy Yu
>> avoid further processing downstream and hence fewer resources would be consumed, as unwanted records are pruned at the source itself. Is there any obstacle to doing this in your map method? Regards, Naga

Re: Prune out data to a specific reduce task

2015-03-15 Thread Drake민영근
> int getPartition(K key, V value, int numReduceTasks) must always return a partition. I can’t return -1. Thus, I don’t know how to tell MapReduce to not process data from a partition. Any suggestion? Forwarded Message Subject: Re: Prune out data to a specif

Re: Prune out data to a specific reduce task

2015-03-13 Thread xeonmailinglist-gmail
, I don’t know how to tell MapReduce to not process data from a partition. Any suggestion? Forwarded Message ———— Subject: Re: Prune out data to a specific reduce task Date: Thu, 12 Mar 2015 12:40:04 -0400 From: Fei Hu hufe...@gmail.com Reply-To: u

RE: Re: Prune out data to a specific reduce task

2015-03-12 Thread Naganarasimha G R (Naga)
tself. Is there any obstacle to doing this in your map method? Regards, Naga From: xeonmailinglist-gmail [xeonmailingl...@gmail.com] Sent: Thursday, March 12, 2015 22:17 To: user@hadoop.apache.org Subject: Fwd: Re: Prune out data to a specific reduce task

Re: Prune out data to a specific reduce task

2015-03-12 Thread Fei Hu
> tasks. The method public int getPartition(K key, V value, int numReduceTasks) must always return a partition. I can’t return -1. Thus, I don’t know how to tell MapReduce to not process data from a partition. Any suggestion? Forwarded Message ————
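The constraint quoted above can be illustrated with a small plain-Java sketch. It does not use any Hadoop classes: `PartitionContract`, `hashPartition`, and `isValid` are hypothetical names chosen for illustration. The formula in `hashPartition` mirrors the one used by Hadoop's default `HashPartitioner`; the range check reflects the contract that a partition number must lie in `[0, numReduceTasks)`, which is why returning -1 is not an option.

```java
// Plain-Java illustration only; no Hadoop classes are involved.
// All names here are hypothetical and chosen for this sketch.
class PartitionContract {

    // Same formula as Hadoop's default HashPartitioner:
    // masking with Integer.MAX_VALUE clears the sign bit, so the
    // result is never negative, even for negative hash codes.
    static int hashPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // A partition number is only acceptable inside [0, numReduceTasks).
    static boolean isValid(int partition, int numReduceTasks) {
        return partition >= 0 && partition < numReduceTasks;
    }
}
```

Because a partitioner can only redirect a record to another reducer and never drop it, pruning has to happen elsewhere, for example in the map method as suggested later in this thread.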

Fwd: Re: Prune out data to a specific reduce task

2015-03-12 Thread xeonmailinglist-gmail
data from a partition. Any suggestion? Forwarded Message Subject: Re: Prune out data to a specific reduce task Date: Thu, 12 Mar 2015 12:40:04 -0400 From: Fei Hu hufe...@gmail.com Reply-To: user@hadoop.apache.org To: user@hadoop.apache.org May

Re: Prune out data to a specific reduce task

2015-03-12 Thread Fei Hu
Maybe you could use Partitioner.class to solve your problem.
> On Mar 11, 2015, at 6:28 AM, xeonmailinglist-gmail wrote:
> Hi, I have this job that has 3 map tasks and 2 reduce tasks. But I want to exclude the data that will go to reduce task 2. Th

Re: Prune out data to a specific reduce task

2015-03-11 Thread Drake민영근
In the map method, records can be ignored by simply not calling output.collect() or context.write() for them. Or you can just delete the output file from reducer 2 at the end of the job. Reducer 2's result file is "part-r-2". Drake 민영근 Ph.D kt NexR
On Wed, Mar 11, 2015 at 9:43 PM, Fabio C. wrote:
> As far as I know th
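A minimal sketch of the map-side approach described above, written as plain Java so it runs without Hadoop. `MapSideFilter`, `partitionFor`, and `emit` are hypothetical names; in a real mapper the "skip" branch would correspond to returning without calling `context.write()`. The sketch assumes the job uses the default hash-based partitioning and that the mapper computes the target partition with the same formula as the partitioner; with Hadoop `Text` keys the hash value differs from `String.hashCode()`, so the two must be kept consistent.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation of pruning records in the map phase.
// All names are hypothetical; no Hadoop classes are used.
class MapSideFilter {

    // Same formula as Hadoop's default HashPartitioner.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Returns the keys that would be emitted. Keys bound for the
    // excluded partition are skipped, which in a real mapper means
    // simply not calling context.write() for them.
    static List<String> emit(List<String> keys, int numReduceTasks,
                             int excludedPartition) {
        List<String> emitted = new ArrayList<>();
        for (String key : keys) {
            if (partitionFor(key, numReduceTasks) == excludedPartition) {
                continue; // pruned at the source, never shuffled
            }
            emitted.add(key);
        }
        return emitted;
    }
}
```

If the alternative of deleting the reducer's output file is used instead, note that Hadoop normally names reduce outputs with a zero-based, zero-padded partition number (for example `part-r-00001`), so it is worth checking the actual file name in the output directory first.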

Re: Prune out data to a specific reduce task

2015-03-11 Thread Fabio C.
As far as I know, the code running in each reducer is the same code you specify in your reduce function, so if you know in advance the features of the data you want to ignore, you can just instruct the reducers to do so. If you are able to tell whether or not to keep an entry at the beginning, you can filter
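The reducer-side variant mentioned above could look roughly like the following plain-Java sketch. `ReduceSideFilter`, `reduceAll`, and the `ignoreKey` predicate are hypothetical names, and summing the values merely stands in for whatever the real reduce function does. Unlike filtering in the map method, the ignored records are still shuffled to the reducer before being discarded.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

// Plain-Java simulation of ignoring unwanted keys inside reduce.
// All names are hypothetical; no Hadoop classes are used.
class ReduceSideFilter {

    // Sums the values of each key, skipping keys matched by ignoreKey.
    static Map<String, Integer> reduceAll(
            Map<String, List<Integer>> grouped,
            Predicate<String> ignoreKey) {
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            if (ignoreKey.test(entry.getKey())) {
                continue; // reducer produces no output for this key
            }
            int sum = 0;
            for (int value : entry.getValue()) {
                sum += value;
            }
            result.put(entry.getKey(), sum);
        }
        return result;
    }
}
```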

Re: Prune out data to a specific reduce task

2015-03-11 Thread xeonmailinglist-gmail
Maybe the correct question is: how can I filter data in MapReduce in Java? On 11-03-2015 10:36, xeonmailinglist-gmail wrote: To exclude data destined for a specific reducer, should I build a partitioner that does this? Should I have a map function that checks to which reduce task the output goes? Can an

Re: Prune out data to a specific reduce task

2015-03-11 Thread xeonmailinglist-gmail
To exclude data destined for a specific reducer, should I build a partitioner that does this? Should I have a map function that checks to which reduce task the output goes? Can anyone give me some suggestions? And by the way, I really want to exclude the data going to a reduce task. So, I will run more than 1 reduc