I have to admit that I was wrong in my previous post. I stated that if we have all words in the enumerated we can operate with them "better (faster)" but it is true. Enumeraing of the words makes no sence..
> Similar objections to using a hash table to assign integers to words. > If collisions are allowed, not 0(1). Might just as well have hashed in > the keywords first and accepted worse than O(1). As Vishal suggested a > trie looks more realistic. Building the trie can be done O(m), with m > - total characters in keywords. Identifying whether a docut ment > character is part of a keyword candidate is then O(1). What is 'trie' ? What are you trying to do? Find keyword in input sequence? They are separated by spaces and punctuations, aren't they? > > You asked for O(n) on the length of the sequence. I think this can be > taken as n, the number of characters in the document. Fair because we > have to iterate through all the characters to find the position of > words and keywords. Anyway, this is Big O so we can argue that > O(n)<=>O(doc_words) i.e. n = K(doc_words) with K average wordlength in > document. > If n is size of input sequence, d - number of words in input sequence and n ~ K d then O(n) ~ O(d). >... Andrey's listing > has a nested loop to do trimming that rechecks words already checked > in the main loop. Also the algorithm has a tendency to find longer and > longer sequences with the same start word that clearly cannot provide > a new minimum. Completely wrong! the algoritm is based on the following invariant: - It always keeps the shortest possible subsequence that ends at the current possition. Therefore it will eventually find the shortest one in entire sequence. > I'm no expert but does this give O(n+m)? Yes, and I professed that in listing. But O(n + m) is still O(n) Where n - size of input sequence m - number of keywords > I could perhaps do a small application (in Pascal) to test this point. > Meanwhile what do you think? How is your program? Can we have a glimpse of that? --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Algorithm Geeks" group. To post to this group, send email to algogeeks@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/algogeeks -~----------~----~----~----~------~----~------~--~---