Hi Danny, thanks for your email. In the example I showed, the elements have no oddities other than differences in character case.

In the real case I have a list of 100 human gene names. Human gene names are conventionally written in upper case (e.g. DDX3X). However, in NCBI's gene_info dataset the gene names are reported in lower case (e.g. ddx3x). I want to extract the rest of the information for DDX3X from NCBI's file (the dataset is in tab-delimited format). My approach was: if I can establish that DDX3X is identical to ddx3x, then I can print that line from the other list (NCBI's gene_info dataset).

I guess I misunderstood your suggestion. In that case, why do I have to drop something from list b (which is over 150 K lines)? If I could instead create a sublist of elements from b (a small list of 100), that would be easier. This is my opinion.

-srini

--- Danny Yoo <[EMAIL PROTECTED]> wrote:

> On Sat, 3 Dec 2005, Srinivas Iyyer wrote:
>
> > >>> a
> > ['apple', 'boy', 'boy', 'apple']
> >
> > >>> b
> > ['Apple', 'BOY', 'APPLE-231']
> >
> > >>> for i in a:
> >         pat = re.compile(i,re.IGNORECASE)
> >         for m in b:
> >             if pat.match(m):
> >                 print m
>
> Hi Srinivas,
>
> We may want to change the problem so that it's less focused on "print"ing
> results directly.  We can rephrase the question as a list "filtering"
> operation: we want to keep the elements of b that satisfy a certain
> criterion.
>
> Let's give a name to that criterion now:
>
> ######
> def doesNameMatchSomePrefix(word, prefixes):
>     """Returns True if the input word is matched by some prefix in
>     the input list of prefixes.  Otherwise, returns False."""
>     # ... fill me in
>
> ######
>
> Can you write doesNameMatchSomePrefix()?  In fact, you might not even need
> regexes to write an initial version of it.
>
> If you can write that function, then what you're asking:
>
> > I do not want python to print both elements from lists a and b.  I just
> > want only the elements in the list B.
>
> should not be so difficult: it'll be a straightforward loop across b,
> using that helper function.
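
A minimal sketch of what that helper and the loop across b could look like, written for current Python 3 and assuming that "matched by some prefix" means a plain case-insensitive prefix test. It swaps the regex from the original loop for str.startswith(), so characters that are special in a regex are compared literally. The name keep_matching() is hypothetical and only used for illustration.

```python
def doesNameMatchSomePrefix(word, prefixes):
    """Return True if word starts with any of the prefixes, ignoring case."""
    lowered = word.lower()
    return any(lowered.startswith(prefix.lower()) for prefix in prefixes)


def keep_matching(names, prefixes):
    """Hypothetical helper: keep the names matched by at least one prefix.

    Each name is tested once, so duplicates in prefixes do not cause
    the same name to be reported twice.
    """
    return [name for name in names if doesNameMatchSomePrefix(name, prefixes)]


a = ['apple', 'boy', 'boy', 'apple']
b = ['Apple', 'BOY', 'APPLE-231']

print(keep_matching(b, a))  # ['Apple', 'BOY', 'APPLE-231']
```

Because the loop runs over b and only asks a yes/no question about each element, the result contains elements of b only, each at most once, even though 'apple' and 'boy' appear twice in a.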
> (Optimization can be done to make doesNameMatchSomePrefix() fast, but you
> probably should concentrate on correctness first.  If you're interested in
> doing something like this for a large number of prefixes, you might be
> interested in:
>
>     http://hkn.eecs.berkeley.edu/~dyoo/python/ahocorasick/
>
> which has more details and references to specialized modules that attack
> the problem you've shown us so far.)
>
>
> Good luck!

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
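
For the gene_info lookup described at the top of this message, here is a rough sketch of one way to pull the matching lines in a single pass, written for current Python 3. It assumes the comparison should be exact apart from case (DDX3X equals ddx3x), rather than a prefix test. The function name select_gene_rows(), the symbol_column parameter, and the sample rows are all made up for illustration; which column holds the gene name has to be checked against the real file.

```python
import io


def select_gene_rows(lines, gene_names, symbol_column):
    """Return the tab-split rows whose gene-name field matches a wanted name.

    lines         -- any iterable of tab-delimited text lines (e.g. an open file)
    gene_names    -- the short list of names to look up
    symbol_column -- zero-based index of the gene-name field (an assumption
                     the caller must supply after inspecting the file)
    """
    # Normalise the short list once; set membership is then a cheap test.
    wanted = {name.upper() for name in gene_names}
    selected = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip short or malformed lines instead of raising IndexError.
        if len(fields) > symbol_column and fields[symbol_column].upper() in wanted:
            selected.append(fields)
    return selected


# Made-up sample data standing in for the real tab-delimited file.
sample = io.StringIO(
    "x\t1\tddx3x\tnote one\n"
    "x\t2\tabc1\tnote two\n"
    "x\t3\tddx3xy\tnote three\n"
)

rows = select_gene_rows(sample, ["DDX3X"], symbol_column=2)
for row in rows:
    print("\t".join(row))
```

The large file is read once and never held in memory as a list; only the 100 names are stored, so nothing needs to be dropped from the 150 K lines. Exact comparison also keeps a short name from pulling in longer names that merely begin with it, which a prefix test would do.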