@^ : A crawler is essentially a BFS over the link graph, so it is
better to use a queue to implement it.
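
To make the queue idea concrete, here is a minimal sketch in Python. The names crawl_order, get_links and toy_web are made up for illustration, and the "web" is just an in-memory dictionary standing in for real page fetches. The max_pages cap is also my own assumption, added so the loop always stops.

```python
from collections import deque

def crawl_order(start, get_links, max_pages=100):
    """Return URLs in the order a breadth-first crawl would visit them.

    get_links is a caller-supplied function (hypothetical here) that
    returns the outgoing links of a URL.
    """
    frontier = deque([start])   # FIFO queue of URLs waiting to be visited
    seen = {start}              # every URL ever queued, to avoid repeats
    order = []
    while frontier and len(order) < max_pages:
        url = frontier.popleft()        # oldest URL first gives BFS order
        order.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return order

# Hypothetical in-memory "web": page name -> links found on that page.
toy_web = {
    "a": ["b", "c"],
    "b": ["d", "a"],
    "c": ["d"],
    "d": [],
}

print(crawl_order("a", lambda url: toy_web.get(url, [])))
```

Swapping popleft() for pop() would turn the same loop into a depth-first crawl, which is the main reason the queue matters here.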
As for storage, it depends on what we need to keep: if we only have to
store the URLs, we can use a pretty simple DS like an array, or say a
linked list (if the collection is very large).
But if we need to store the entire HTML of each page, then we have to use
a DOM structure to store it, i.e. a tree of tags as in XML.
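
If only the links are needed out of the HTML, a full DOM tree may be more than is required. Below is a rough sketch using Python's standard html.parser and urllib.parse modules. LinkExtractor, extract_links and the sample page are hypothetical names and data of my own, not anything from this thread.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # turn relative links such as "/about" into absolute URLs
                self.links.append(urljoin(self.base_url, value))

def extract_links(base_url, html):
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links

# Hypothetical page content, used instead of a real download.
sample = ('<html><body><a href="/about">About</a> '
          '<a href="http://example.org/x">X</a></body></html>')

print(extract_links("http://example.com/index.html", sample))
```

This keeps only a flat list of URLs per page; keeping the whole tag tree would need a proper DOM or XML library instead.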

On 2/7/12, Durgesh Kumar <durgesh1...@gmail.com> wrote:
> You can use a dictionary or a linked list.
>
> Better if you choose a language like Python or Java.
>
> Python has modules named "urllib2" and "httplib2" which implement the
> functions for getting, posting and browsing data.
>
>
> INFORMAL ALGORITHM:
>
> 1. Start with any arbitrary link. LINK=[new link]
> 2. a> Get the HTML content of the link.
>    b> Parse the required content and store it.
>    c> Add each new link on the page to LINK if it is not already present there.
> 3. Repeat step 2 for as long as you want to crawl.
>
> On 2/5/12, Ravi Ranjan <ravi.cool2...@gmail.com> wrote:
>> What would be the algorithm and the appropriate data structure to implement a
>> web crawler?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Algorithm Geeks" group.
>> To post to this group, send email to algogeeks@googlegroups.com.
>> To unsubscribe from this group, send email to
>> algogeeks+unsubscr...@googlegroups.com.
>> For more options, visit this group at
>> http://groups.google.com/group/algogeeks?hl=en.
>>
>>
>
>
> --
> *Durgesh Kumar*
> Final Year, B.tech
> Information Technology
> HALDIA INSTITUTE OF TECHNOLOGY
> HALDIA
>
