Since the input files are very small, the default input formats in Hadoop all 
generate just a single InputSplit, so only a single map task is executed, and 
we wont have much parallelism. 

I was thinking of writing an InputFormat that would read the whole file as an 
InputSplit and replicate this input split n times (where n would be the number 
of ants in a single stage) so that we'll have n mappers.
Also I want the input format to return the value as the adjacency matrix of the 
graph (calculating it from the coordinates in the input file). 

But I can't find a way to do this. Is it possible? Or is it better to just have 
the input as Text and create the adjacency matrix in the mappers?

Date: Sat, 5 May 2012 08:20:34 -0700
Subject: Re: Ant Colony Optimization for Travelling Salesman Problem in Hadoop
From: lordjoe2...@gmail.com
To: mapreduce-user@hadoop.apache.org

yes - if you know how you can put it in distributed cache or if it is small put 
in as a String in the config or have all InputFormats read it from somewhere

On Sat, May 5, 2012 at 8:08 AM, sharat attupurath <shara...@hotmail.com> wrote:





I looked at both the files. in AbstractNShotInputFormat it is mentioned that 
this input format does not read from files. My input is in a text file. I want 
the whole file as a single record. So is it enough if i just copy the contents 
of the file and return it as a string from getValueFromIndex() ? 


Date: Fri, 4 May 2012 13:15:46 -0700
Subject: Re: Ant Colony Optimization for Travelling Salesman Problem in Hadoop
From: lordjoe2...@gmail.com

To: mapreduce-user@hadoop.apache.org

Look at NShotInputFormat call setNumberKeys to set the number of ants 
thenchange the method getValueFromIndex to return the text representing the 
original problem to each mapper -
what happens in the second and third jobs is an exercise but the saved test 
needs to have pheremones and the original problem 


On Fri, May 4, 2012 at 9:54 AM, sharat attupurath <shara...@hotmail.com> wrote:






Hi,

Thanks a lot for the quick reply sir! We are new to Apache Hadoop and haven't 
yet understood it properly yet. Can you please elaborate on how we can have 
multiple stages of mapreduce jobs for combining the trails as you have 
mentioned? 



We have been trying to find out how to write a custom splitter and almost all 
online resources tell to subclass the FileInputFormat and write only a custom 
RecordReader? Will it be possible to generate our splits in that way?



Regards

Sharat

Date: Fri, 4 May 2012 09:22:51 -0700
Subject: Re: Ant Colony Optimization for Travelling Salesman Problem in Hadoop
From: lordjoe2...@gmail.com


To: mapreduce-user@hadoop.apache.org



On Fri, May 4, 2012 at 9:06 AM, sharat attupurath <shara...@hotmail.com> wrote:







Hi,

We are trying to parallelize the ant colony optimization 
algorithm for TSP over hadoop and are facing some issues. We are using 
TSPLIB as input files. The input is a text file containing eucledian 
coordinates of the cities - first column is city number and the next two
 columns contain the x and y coordinates respectively. 

What we 
intend to do is to take input from this single file, send copies of it 
to multiple mappers (each mapper acts like the ant in the algorithm), 
each mapper works on its input to find its own TSP solution that it 
outputs and finally the reducer outputs the smallest tour found by the 
mappers. Hope we are in the right track. Here are the issues:

1) 
Since the input file is small, we need to force hadoop to fire up 
multiple map tasks by replicating the input. How can we make an 
InputSplit of the whole file and replicate it so that the input can be 
sent to multiple mappers?
Write a custom splitter sending the same data to all mappers - the only 
critical criteria is the number of "Ants" 



2) the algorithm uses a shared 
pheromone array and each mapper needs to read and write data from this. 
How can we share the pheromone data across the mappers.
You cannot share data across mappers and should not attempt to do so - Better 
to use the reducer(s) to combine trails  combine first pass trails and


then pass the combined trails to another mapreduce job - this with the original 
problem plus the current pheremone trails

Hope the questions are clear enough. Any help would be greatly appreciated. 

Thank you

Regards

Sharat                                    



-- 
Steven M. Lewis PhD4221 105th Ave NEKirkland, WA 98033206-384-1340 (cell)

Skype lordjoe_com


                                          


-- 
Steven M. Lewis PhD4221 105th Ave NEKirkland, WA 98033206-384-1340 (cell)

Skype lordjoe_com



                                          


-- 
Steven M. Lewis PhD4221 105th Ave NEKirkland, WA 98033206-384-1340 (cell)
Skype lordjoe_com


                                          

Reply via email to