On 1 окт, 06:20, Sticker <[EMAIL PROTECTED]> wrote:
> En, it is the idea. You assume each keyword is a single character and
> you use a map to hash key words and their counts. Each time you extend
> the range to right hand side, you may increase the counts of some
> found key words and each time you shrink the range from the left hand
> side, you decrease the counts of some out-of-range key words.
>
Exactly! pretty simple algorithm.
> For the case that keywords are not single characters, you need a
> complex hash. Even if keywords are single characters, you need
> lg(number of keywords) to locate the position of each keyword in the
> map,which is implemented in the STL map container.
Look at
> Example: a sequence "abaccdefgacel" and a set of key words "a", "c",
> "d" and "e". The shortest substring contianing all of the key words is
> "accde".
each word is a single character! more over It makes no difference we
can numerate all the words in the world :) and rewrite the program
using numbers instead of characters.
> May I draw the final conclusion that the final time complexity "must"
> have to involve the total length of keywords and the length of the
> input text, rather than linear to the input text's length? (Of course
> you can say that in usual case the length of keywords << length of
> input text. But this assumption should not remove the length of
> keywords in the time complexity analysis thus should not sufficient to
> make the algorithm works linear to the length of text.) Am I right?
>
Let's write new algorithm based on my previous with integers instead
words.
Ruby like code :)
@words = {} # Hash {word => sequence number }
1.
for word in inputSequence
words[word] = words.size unless words[word] # If word not in
hash put it there
# and assign sequence number.
end
2.
for word in keyWords
words[word] = words.size unless words[word]
end
# This procedure takes N log(N) + M * log(N + M)
# N - size of input sequence
# M - number of keywords
3. Run previous algorithm on integers!
The total time will be
N log(N) + M * log (N + M) + N + M
So if we don't have all word already enumerated you are right.
> On Oct 1, 4:10 am, Andrey <[EMAIL PROTECTED]> wrote:
>
> > Check this program:
>
> > #include <map>
> > #include <string>
> > #include <iostream>
> > #include <algorithm>
> > #include <iterator>
>
> > using namespace std;
>
> > class Range : public pair<const char*, const char*> {
> > public:
> > size_t size() const { return second - first; }
> > Range(const char* f=0, const char * s=0)
> > : pair<const char*, const char*>(f ,s) {}
>
> > };
>
> > ostream & operator << (ostream & out, const Range& r) {
> > out<<"range size: "<<r.size()<<" [";
> > copy(r.first, r.second, ostream_iterator<char>(out, ""));
> > out<<"]";
> > return out;
>
> > }
>
> > typedef map<char, int> SetCnt;
>
> > // array based implementation will have complexity O(m)
> > bool checkSet(const SetCnt & setCnt) {
> > SetCnt::const_iterator sIt = setCnt.begin(), sEnd = setCnt.end();
> > for (SetCnt::const_iterator sIt = setCnt.begin(); sIt != sEnd; +
> > +sIt ) {
> > if (sIt->second == 0) {
> > return false;
> > }
> > }
> > return true;
>
> > }
>
> > // array based implementation will have complexity O(1)
> > bool checkAndShrink(SetCnt & setCnt, char c) {
> > SetCnt::iterator sIt = setCnt.find(c);
> > if (sIt == setCnt.end()) {
> > return true;
> > }
> > if (sIt->second > 1) {
> > sIt->second--;
> > return true;
> > }
> > return false;
>
> > }
>
> > // aproximate complexity is O(n*m + m)
> > // where n - is size of input sequence and m - is number of key
> > words.
> > Range solve(const char * strBegin, const char * strEnd,
> > const char * setBegin, const char * setEnd) {
> > SetCnt setCnt;
>
> > // complexity O(m)
> > for (const char * cIt = setBegin; cIt != setEnd; cIt++ ) {
> > setCnt[*cIt] = 0;
> > }
>
> > bool solved = false;
> > Range range(strBegin, strEnd), result(range);
> > for (const char * sIt = strBegin; sIt != strEnd; sIt++ ) {
> > SetCnt::iterator it = setCnt.find(*sIt);
> > if (it != setCnt.end()) {
> > setCnt[*sIt] += 1;
> > range.second = sIt + 1;
> > if (checkSet(setCnt)) {
> > solved = true;
> > while (checkAndShrink(setCnt,
> > *range.first)) {
> > range.first++;
> > }
> > if (result.size() > range.size()) {
> > result = range;
> > }
> > // Debug
> > cout<<range<<endl;
> > }
> > }
> > }
> > return solved ? result : Range(0, 0);
>
> > }
>
> > int main() {
> > const char * string = "abrebbqbcdfagbasdfbbaggqqebbcg--324c";
> > const char * set = "abc";
>
> > Range range = solve(string, string + strlen(string), set, set +
> > strlen(set));
> > if (range.size()) {
> > cout<<"The shortes subsequence is: "<<range<<endl;
> > } else {
> > cout<<"No solution..."<<endl;
> > }
> > return 0;
>
> > }
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"Algorithm Geeks" group.
To post to this group, send email to algogeeks@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/algogeeks
-~----------~----~----~----~------~----~------~--~---