Sorry, I hit send too soon on my last message.

On Fri, Apr 15, 2011 at 8:54 AM, Joel Goldstick <joel.goldst...@gmail.com> wrote:
>
> On Fri, Apr 15, 2011 at 8:41 AM, Spyros Charonis <s.charo...@gmail.com> wrote:
>
>> Hello,
>>
>> I'm doing a biomedical degree and am taking a course on bioinformatics. We
>> were given a raw version of a public database in a file (the file is in
>> simple ASCII) and need to extract only certain lines containing important
>> information. I've made a script that does not work and I am having trouble
>> understanding why.
>>
>> When I run it in the Python shell, it prompts for a protein name but then
>> reports that there is no such entry. The first while loop nested inside a
>> for loop is intended to pick up all lines beginning with "gc;", chop off the
>> "gc;" part and keep only the text after that (which is a protein name).
>> Then it scans the file and collects all lines, chops the "gc;" and stores
>> them in a tuple. This tuple is not built correctly, because as I posted
>> when the program is run it reports that it cannot find my query in the tuple
>> I created and it is certainly in the database. Can you detect what the
>> mistake is? Thank you in advance!
>>
>> Spyros
>>
>> _______________________________________________
>> Tutor maillist - Tutor@python.org
>> To unsubscribe or change subscription options:
>> http://mail.python.org/mailman/listinfo/tutor
>>
>>
> import os, string
>
> printsdb = open('/users/spyros/folder1/python/PRINTSmotifs/prints41_1.kdat', 'r')
> lines = printsdb.readlines()
>
> # find PRINTS name entries
> # you need to have a list to collect your strings:
> protnames = []
> for line in lines:  # this gets you each line
>     #while line.startswith('gc;'):  this is wrong
>     if line.startswith('gc;'):  # do this instead
>         # lstrip('gc;') removes a set of characters, not a prefix, and
>         # leaves the newline, so slice off the prefix and strip instead
>         protnames.append(line[len('gc;'):].strip())  # this adds your
>                                                      # stripped string to
>                                                      # the protnames list
>

# try doing something like:
print(protnames)  # this should give you a list of all your lines that started with 'gc;'

# this block I don't understand
>     if not protnames:
>         print('error in creating tuple')  # check if tuple is true or false
>         #print(protnames)
>     break
>
>

Now you have protnames with all of your protein names.
See if the above helps. Then you have the part below to figure out:

> query = input("search a protein: ")
> query = query.upper()
> if query in protnames:
>     print("\nDisplaying Motifs")
> else:
>     print("\nentry not in database")
>
> # Parse motifs
> def extract_motifs(query):
>     motif_id = ()
>     motif = ()
>     while query in lines:  #### for query, get motif_ids and motifs
>         while line.startswith('ft;'):
>             motif_id = line.lstrip('ft;')
>             motif_ids = (motif_id)
>             #print(motif_id)
>         while line.startswith('fd;'):
>             motif = line.lstrip('fd;')
>             motifs = (motif)
>             #print(motif)
>     return motif_id, motif
>
> if __name__ == '__main__':
>     final_motifs = extract_motifs('query')
>
>
>
> --
> Joel Goldstick
>

--
Joel Goldstick
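
P.S. For the motif-parsing part, here is a minimal sketch rather than a drop-in fix. It rests on assumptions I have not checked against the real file: that a record runs from one 'gc;' line to the next, and that its 'ft;' and 'fd;' lines sit in between. The helper names and the sample lines are made up for illustration. It uses `if` tests inside a single `for` loop in place of the nested `while` loops, and it slices off the prefix instead of calling `lstrip`.

```python
def strip_prefix(line, prefix):
    """Return the text after `prefix`, trimmed, or None if the line lacks it."""
    if line.startswith(prefix):
        # Slicing removes exactly the prefix; strip() removes the
        # surrounding whitespace, including the trailing newline.
        return line[len(prefix):].strip()
    return None


def protein_names(lines):
    """Collect the names found on all 'gc;' lines."""
    names = []
    for line in lines:
        name = strip_prefix(line, 'gc;')
        if name is not None:
            names.append(name)
    return names


def extract_motifs(lines, query):
    """Return (motif_ids, motifs) for the record named `query`.

    Assumption: a record starts at its 'gc;' line and ends at the next one.
    """
    motif_ids, motifs = [], []
    in_record = False
    for line in lines:
        name = strip_prefix(line, 'gc;')
        if name is not None:
            in_record = (name == query)  # entering a new record
            continue
        if not in_record:
            continue
        motif_id = strip_prefix(line, 'ft;')
        if motif_id is not None:
            motif_ids.append(motif_id)
            continue
        motif = strip_prefix(line, 'fd;')
        if motif is not None:
            motifs.append(motif)
    return motif_ids, motifs


# Made-up sample lines, only to exercise the functions above.
sample = [
    'gc; GLOBIN\n',
    'ft; MOTIF1\n',
    'fd; AAAA\n',
    'gc; CYCLIN\n',
    'ft; MOTIF2\n',
    'fd; CCCC\n',
]

if __name__ == '__main__':
    print(protein_names(sample))
    print(extract_motifs(sample, 'CYCLIN'))
```

The reason to avoid `lstrip` here is that its argument is a set of characters, not a prefix: `'gc;cyclin'.lstrip('gc;')` gives `'yclin'`, because the leading `c` of the name is stripped too. With the real file you would pass `lines` from `readlines()` in place of `sample`.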