Re: File Chunk to Map Thread Association

2009-08-20 Thread roman kolcun
On Thu, Aug 20, 2009 at 6:49 AM, Harish Mallipeddi harish.mallipe...@gmail.com wrote: On Thu, Aug 20, 2009 at 7:25 AM, roman kolcun roman.w...@gmail.com wrote: Hello everyone, could anyone please tell me in which class and which method does Hadoop download the file chunk from HDFS

Re: File Chunk to Map Thread Association

2009-08-20 Thread roman kolcun
On Thu, Aug 20, 2009 at 10:30 AM, Harish Mallipeddi harish.mallipe...@gmail.com wrote: On Thu, Aug 20, 2009 at 2:39 PM, roman kolcun roman.w...@gmail.com wrote: Hello Harish, I know that TaskTracker creates separate threads (up to mapred.tasktracker.map.tasks.maximum) which execute

Re: File Chunk to Map Thread Association

2009-08-20 Thread roman kolcun
...@gmail.com wrote: On Thu, Aug 20, 2009 at 10:30 AM, Harish Mallipeddi harish.mallipe...@gmail.com wrote: On Thu, Aug 20, 2009 at 2:39 PM, roman kolcun roman.w...@gmail.com wrote: Hello Harish, I know that TaskTracker creates separate threads (up

Re: File Chunk to Map Thread Association

2009-08-20 Thread roman kolcun
chunks takes 84 seconds, which is 21 seconds longer than processing it with 256MB chunks. Therefore I would like to merge several smaller local chunks into a larger one and process it with one mapper. Roman Kolcun On Thu, Aug 20, 2009 at 6:36 PM, Ted Dunning ted.dunn...@gmail.com wrote: Uhh

File Chunk to Map Thread Association

2009-08-19 Thread roman kolcun
Hello everyone, could anyone please tell me in which class and which method does Hadoop download the file chunk from HDFS and associate it with the thread that executes the Map function on the given chunk and processes it? I would like to extend Hadoop so that one Task may have more chunks associated and

Re: Hadoop execution improvement in a heterogeneous environment

2009-07-16 Thread roman kolcun
Thank you for your reply. See my responses inline. On Thu, Jul 16, 2009 at 4:23 AM, Jothi Padmanabhan joth...@yahoo-inc.com wrote: See some responses inline. My idea is that on each node there will be a special DataAssign thread which will take care of assigning data to each map thread.

Re: Hadoop execution improvement in a heterogeneous environment

2009-07-15 Thread roman kolcun
Hello everyone, I am not sure whether this fits into the common user mailing list, but I haven't received any reply on the development mailing list, so I am trying it here. I've got an idea of how to improve the execution time of the map phase in a heterogeneous environment (when other processes may run on