Well, it's a good question. Instead of Googling,
I would like to give a naive approach for this, one that pays in both
time & space.

1. First, count the number of words in the single large file.
 We can process the file like this:

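    bool inWord = false;
    while (in.get(ch))  // read character by character from the file
    {
        bool isSpace = ( ch == ' ' || ch == '\n' || ch == '\t' );
        if ( !isSpace && !inWord )
            numWords++;   // count a word when its first character is seen
        inWord = !isSpace;  // so repeated spaces are not counted as words
    }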

   Then, for every word, we have to count how many times that particular
word occurs, using a temporary count array (see counting sort for this).
   After that, sort the words on the basis of their frequency. We will get
the top 10, 20, or as many words as wanted from the file.

   It's tough, but it is a naive approach.

2. The best approach is to put all the words from the file into a hash
table, where the word acts as the key & its count acts as the value.
       Each time a word is read from the file, we have to check whether
it already exists in the hash table. If yes, increment its counter; else
just put this new word into the hash table & initialize its count to 1.

       The important part of this algorithm is how we know that a word
is already stored in the hash table. A single lookup is cheap on average,
but for a big file it is repeated for every word, so it adds up to a lot
of processing. On top of that, a lot of computation goes into first
identifying that something is a word & then putting that word into the
hash table.

      Well, this kind of algorithm is used by web crawlers: when a web
crawler looks for new URLs on the web, it uses the same approach.


       I think this approach will work efficiently for large data.
Further, it depends on how the data is organized. If we talk about a
database, then again we have to think of all possible ways to solve it.


     Correct me if the concepts seem to be wrong.


Thanks & Regards
Shashank Mani "The best way to escape from a problem is to solve it."
