Hi team,
I have a python script which normally runs like this locally,
Python mapper.py file1 file2 2 .
How can I achieve this by using streaming API, and using the script as mapper.
It actually joins the three files on a column which is passed as parameter (
numeric ) .
Also how ca
Hi Siddharth
Joins are better implemented in hive and pig. Try checking out those and
see whether it fits your requirements.
If you are still looking for implementing joins using mapreduce, you can
take a look at this example which uses MultipleInputs
http://kickstarthadoop.blogspot.in/2011/09/jo