Re: Write and Read file through map reduce

2015-01-07 Thread Raj K Singh
You can configure your third mapreduce job using MultipleFileInput and read those files into your job. If the files are small, you can consider the DistributedCache, which will give you optimal performance if you are joining the datasets of file1 and file2. I will also recommend you to use
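
The join idea behind caching a small file can be sketched in plain Java, without any Hadoop classes: the small dataset is loaded into an in-memory map once, and each record of the larger dataset is then looked up against it. This is only a rough illustration under assumptions not stated in the thread: both files are taken to hold "key,value" lines, and the class and method names (ReplicatedJoinSketch, loadSmallSide, join) are hypothetical. In a real job the load step would typically run once per task (for example in a mapper's setup method) and the lookup would run per input record.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; no Hadoop classes are used here.
// Assumption: both datasets consist of "key,value" lines.
class ReplicatedJoinSketch {

    // Builds the in-memory lookup table from the small dataset.
    // Lines without a comma are ignored.
    static Map<String, String> loadSmallSide(List<String> smallLines) {
        Map<String, String> lookup = new HashMap<>();
        for (String line : smallLines) {
            String[] parts = line.split(",", 2);
            if (parts.length == 2) {
                lookup.put(parts[0].trim(), parts[1].trim());
            }
        }
        return lookup;
    }

    // Inner join: each record of the larger dataset is matched by key
    // against the lookup table; records with no match are dropped.
    static List<String> join(List<String> largeLines, Map<String, String> lookup) {
        List<String> joined = new ArrayList<>();
        for (String line : largeLines) {
            String[] parts = line.split(",", 2);
            if (parts.length != 2) {
                continue; // skip malformed lines
            }
            String key = parts[0].trim();
            String match = lookup.get(key);
            if (match != null) {
                joined.add(key + "," + parts[1].trim() + "," + match);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // Hypothetical sample contents standing in for file1 and file2.
        List<String> file1 = Arrays.asList("a,1", "b,2", "c,3");
        List<String> file2 = Arrays.asList("a,x", "c,z");
        System.out.println(join(file1, loadSmallSide(file2)));
    }
}
```

This approach only makes sense when the smaller dataset fits comfortably in the memory of every task; otherwise both files would normally be read as regular job inputs and joined on the reduce side.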

Re: Write and Read file through map reduce

2015-01-06 Thread Shahab Yunus
Distributed Cache has been deprecated for a while. You can use the new mechanism, which is functionally the same thing, discussed here in this thread: http://stackoverflow.com/questions/21239722/hadoop-distributedcache-is-deprecated-what-is-the-preferred-api Regards, Shahab On Mon, Jan 5, 2015

Re: Write and Read file through map reduce

2015-01-05 Thread Corey Nolet
Hitarth, I don't know how much direction you are looking for with regards to the formats of the times but you can certainly read both files into the third mapreduce job using the FileInputFormat by comma-separating the paths to the files. The blocks for both files will essentially be unioned

Re: Write and Read file through map reduce

2015-01-05 Thread Ted Yu
Hitarth: You can also consider MultiFileInputFormat (and its concrete implementations). Cheers On Mon, Jan 5, 2015 at 6:14 PM, Corey Nolet cjno...@gmail.com wrote: Hitarth, I don't know how much direction you are looking for with regards to the formats of the times but you can certainly

Re: Write and Read file through map reduce

2015-01-05 Thread unmesha sreeveni
Hi hitarth, if your file1 and file2 are small, you can go with the Distributed Cache, as mentioned here: http://unmeshasreeveni.blogspot.in/2014/10/how-to-load-file-in-distributedcache-in.html . Or you can go with MultipleInputFormat, mentioned here

Write and Read file through map reduce

2015-01-05 Thread hitarth trivedi
Hi, I have a 6-node cluster, and the scenario is as follows: I have one map reduce job which will write file1 to HDFS. I have another map reduce job which will write file2 to HDFS. In the third map reduce job I need to use file1 and file2 to do some computation and output the value. What is