On Apr 10, 2009, at 11:12 AM, Sagar Naik wrote:
Hi,
I would like to implement a Multi-threaded reducer.
As per my understanding, the system does not have one because we expect
the output to be sorted.
However, in my case I don't need the output sorted.
You'd probably want to make a blocking close() that waits for the queued work to drain.
On Fri, Apr 10, 2009 at 12:31 PM, jason hadoop wrote:
> Hi Sagar!
>
> There is no reason for the body of your reduce method to do more than copy
> and queue the key value set into an execution pool.
>
Agreed. You probably want to use either a bounded queue on your execution
pool, or even a SynchronousQueue, so that submissions block until a worker is free.
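A minimal, Hadoop-free sketch of that back-pressure idea, using only java.util.concurrent (the class name and pool size here are illustrative, not from the thread): a SynchronousQueue holds no tasks, so with all 100 workers busy, CallerRunsPolicy makes the submitting thread — i.e. reduce() — do the work itself, which throttles submission naturally.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedPoolSketch {
    // Run n small tasks through a 100-thread pool. The SynchronousQueue
    // queues nothing: execute() hands work directly to an idle thread,
    // and once all threads are busy, CallerRunsPolicy runs the task in
    // the calling thread -- simple back-pressure for a reduce() body.
    static int runAll(int n) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                100, 100, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>(),
                new ThreadPoolExecutor.CallerRunsPolicy());
        final AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < n; i++) {
            pool.execute(new Runnable() {
                public void run() { done.incrementAndGet(); }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runAll(1000)); // prints 1000
    }
}
```

A bounded LinkedBlockingQueue plus CallerRunsPolicy would work the same way, just with some slack between the reduce() thread and the workers.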
Hi Sagar!
There is no reason for the body of your reduce method to do more than copy
and queue the key value set into an execution pool.
The close method will need to wait until all of the items finish
execution and potentially keep the heartbeat up with the task tracker by
periodically reporting progress.
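The pattern jason describes can be sketched with plain java.util.concurrent. The class and method signatures below are stand-ins for the old org.apache.hadoop.mapred Reducer API (not the real interface), and the Reporter heartbeat is shown only as a comment since it needs Hadoop on the classpath:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a Reducer: reduce() only copies the values
// and queues the work; close() drains the pool before the task finishes.
public class ParallelReduceSketch {
    private final ExecutorService pool = Executors.newFixedThreadPool(100);
    public final AtomicLong processed = new AtomicLong();

    public void reduce(final String key, Iterator<String> values) {
        // Copy first: Hadoop reuses the key/value objects it passes in,
        // so the worker thread must not hold references to them.
        final List<String> copy = new ArrayList<String>();
        while (values.hasNext()) copy.add(values.next());
        pool.execute(new Runnable() {
            public void run() {
                // ... the real IO-bound work goes here ...
                processed.addAndGet(copy.size());
            }
        });
    }

    public void close() throws InterruptedException {
        pool.shutdown();
        // In a real reducer, loop with a short timeout and call
        // reporter.progress() each pass so the TaskTracker does not
        // kill the task for inactivity while the pool drains.
        while (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            // reporter.progress();  // keep the heartbeat alive
        }
    }
}
```

Note this sketch uses an unbounded internal queue; combining it with the bounded-queue/SynchronousQueue suggestion above keeps reduce() from racing far ahead of the workers.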
At that level of parallelism, you're right that the process overhead would
be too high.
- Aaron
On Fri, Apr 10, 2009 at 11:36 AM, Sagar Naik wrote:
>
> Two things
> - multi-threaded is preferred over multi-processes. The process I'm
> planning is IO bound, so I can really take advantage of multi-threads
Two things
- multi-threaded is preferred over multi-processes. The process I'm
planning is IO bound, so I can really take advantage of multi-threads
(100 threads)
- Correct me if I'm wrong. The next MR_JOB in the pipeline will have an
increased number of splits to process as the number of reduce tasks grows.
Rather than implementing a multi-threaded reducer, why not simply increase
the number of reducer tasks per machine via
mapred.tasktracker.reduce.tasks.maximum, and increase the total number of
reduce tasks per job via mapred.reduce.tasks to ensure that they're all
filled? This will effectively utilize the available parallelism.
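For reference, the two knobs mentioned above look like this in mapred-site.xml (the values are only examples; the per-node slot maximum is a TaskTracker-side setting that takes effect after a TaskTracker restart, not a per-job one):

```xml
<!-- on each TaskTracker: concurrent reduce slots per node (example value) -->
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>8</value>
</property>

<!-- per job: total reduce tasks, e.g. roughly nodes * slots -->
<property>
  <name>mapred.reduce.tasks</name>
  <value>64</value>
</property>
```

The per-job value can also be set in code via JobConf.setNumReduceTasks().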
Hi,
I would like to implement a Multi-threaded reducer.
As per my understanding, the system does not have one because we expect the
output to be sorted.
However, in my case I don't need the output sorted.
Can you please point me to any other issues, or would it be safe to do so?
-Sagar