> So, what if one has to map a string of 15 character nucleotide to a > jumbo string of characters of length in millions.
Hi Srinivas, If you're doing research into this, I strongly recommend you take a look at Dan Gusfield's excellent textbook "Algorithms on Strings, Trees, and Sequences": http://www.cambridge.org/uk/catalogue/catalogue.asp?isbn=0521585198 Reinvention is fun, but there's already been a lot of work poured into this field; the textbook above provides more background, so you can stand on previous efforts. [code cut] > Do you think is there any flaw in this. It works, and I don't see anything immediately wrong with this. The only issue is that it'll be only exact matching on the oligos that have been stored in that hashtable. > Is there any better method that is possible. It depends on what you're looking for. Note that the approach you're using will only match properly if your oligo width matches the query string width. This may or may not be a problem for you. Gusfield's textbook above surveys several other methods of doing string matching that are particularly suitable for biologists. If you're planning to do a lot of small queries on a large string like that, then a data structure such as a "suffix tree" is worth knowing. http://en.wikipedia.org/wiki/Suffix_tree You may also want to look into specialized applications; arabidopsis.org hosts a program called 'patmatch' that may be helpful for you: http://nar.oxfordjournals.org/cgi/content/abstract/33/suppl_2/W262 (As a disclaimer: I did some work on patmatch.) Good luck. _______________________________________________ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor