get SequenceFile reader

2011-12-16 Thread Markus Jelsma
Hi, I'm migrating Apache jobs to the new MapReduce API. I came across quite a few issues, but there's one I can't seem to figure out: SequenceFile.Reader[] readers = SequenceFileOutputFormat.getReaders(tmpFolder, conf); I have looked through the API docs many times now but I cannot find

Map Reduce Phase questions:

2011-12-16 Thread Ann Pal
Hi, I had some questions specifically on the Map-Reduce phase: [1] For the reduce phase, the TaskTrackers corresponding to the reduce nodes poll the JobTracker to learn about maps that have completed, and if the JobTracker informs them about maps that are complete, they then pull the data from the

Re: Map Reduce Phase questions:

2011-12-16 Thread real great..
[1]. I think the reducers are allocated a space before the execution begins, and it depends on the number of reducers. If I am not mistaken, a hash logic is used to implement this. [2]. I do not think we can determine the 'number' of reduce nodes. It's determined by the load conditions, I assume, and
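The "hash logic" mentioned above can be sketched in plain Java. This is an illustrative sketch of the formula Hadoop's default HashPartitioner applies to route each map output key to a reduce partition; the class and method names here are assumptions for the example, not Hadoop's own classes.

```java
public class HashPartitionSketch {
    // Map a key to one of numReduceTasks partitions. Masking with
    // Integer.MAX_VALUE clears the sign bit so that keys with a negative
    // hashCode still yield a non-negative partition index.
    static int getPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // Every key deterministically lands in [0, numReduceTasks),
        // so all values for the same key reach the same reducer.
        for (String key : new String[] {"alpha", "beta", "gamma"}) {
            System.out.println(key + " -> partition "
                    + getPartition(key, 4));
        }
    }
}
```

Because the partition depends only on the key's hash and the reducer count, the assignment is fixed before execution begins, which matches the observation in the reply above.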

Re: Map Reduce Phase questions:

2011-12-16 Thread Harsh J
On Fri, Dec 16, 2011 at 7:03 PM, Ann Pal ann_r_...@yahoo.com wrote: Hi, I had some questions specifically on the Map-Reduce phase: [1] For the reduce phase, the TaskTrackers corresponding to the reduce nodes poll the JobTracker to learn about maps that have completed, and if the JobTracker

Re: Generating job and topology traces from history folder of multinode cluster using Rumen

2011-12-16 Thread arun k
Ravi, Thanks for the info. Arun On Fri, Dec 16, 2011 at 12:27 PM, Ravi Gummadi gr...@yahoo-inc.com wrote: Amar is working on this issue MAPREDUCE-3349. The patch is not committed to trunk yet. Feel free to try it out while it gets reviewed and committed. -Ravi

Fwd: how read a file in HDFS?

2011-12-16 Thread Pedro Costa
Hi, I want to read a 100 MB file that is in HDFS. How should I do it? Is it with IOUtils.readFully? Can anyone give me an example? -- Thanks,

RE: how read a file in HDFS?

2011-12-16 Thread Uma Maheswara Rao G
Yes, you can use utility methods from IOUtils. Ex: FileOutputStream fo = new FileOutputStream(file); IOUtils.copyBytes(fs.open(fileName), fo, 1024, true); here fs is the distributed FileSystem instance. The other option is, you can make use of the FileSystem APIs. Ex: FileSystem fs = new DistributedFileSystem();
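Under the hood, IOUtils.copyBytes is a buffered copy loop from an input stream to an output stream. The following is a plain-Java sketch of that loop, assuming nothing Hadoop-specific; the class name, the 4096-byte buffer, and the use of in-memory streams are illustrative choices for the example, and in the reply above the input stream would come from fs.open(fileName).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class CopyBytesSketch {
    // Copy everything from `in` to `out` through a fixed-size buffer,
    // returning the number of bytes transferred. This mirrors the kind of
    // loop IOUtils.copyBytes runs when streaming an HDFS file locally.
    static long copyBytes(InputStream in, OutputStream out, int bufferSize)
            throws IOException {
        byte[] buf = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buf)) > 0) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello hdfs".getBytes("UTF-8");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copyBytes(new ByteArrayInputStream(data), out, 4096);
        System.out.println("copied " + copied + " bytes");
    }
}
```

For a 100 MB file this streaming approach is preferable to IOUtils.readFully, since it never holds more than one buffer of data in memory at a time.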

Re: Map Reduce Phase questions:

2011-12-16 Thread Ann Pal
Thanks a lot for your answers! For [1]: With the pull model, the chances of seeing a TCP-incast problem, where multiple map nodes send data to the same reduce node at the same time, are minimal (since the reducer is responsible for retrieving only as much data as it can handle). Is this a valid assumption? For [3]

Re: Map Task Capacity Not Changing

2011-12-16 Thread Joey Krabacher
pid files are there, I checked for running processes with the same IDs and they all checked out. --Joey On Fri, Dec 16, 2011 at 5:40 PM, Rahul Jain rja...@gmail.com wrote: You might be suffering from HADOOP-7822; I'd suggest you verify your pid files and fix the problem by hand if it is the same