Re: [Tutor] how to unique the string
On Mon, Oct 24, 2011 at 3:24 PM, Peter Otten <__pete...@web.de> wrote:
> lina wrote:
>
>> But I am getting confused later:
>>
>> def translate_process(dictionary,tobetranslatedfile):
>>     results=[]
>>     unique={}
>>     for line in open(tobetranslatedfile,"r"):
>>         tobetranslatedparts=line.strip().split()
>>         results.append(dictionary[tobetranslatedparts[2]])
>>         unique=Counter(results)
>>         with open(base+OUTPUTFILEEXT,"w") as f:
>>             for residue, numbers in unique.items():
>>                 print(residue,numbers,file=f)
>
> As Dave says, the above four lines should run only once, outside the
> for-loop.
>
> Here's a way to avoid the intermediate results list. As a bonus I'm removing
> access to the `base` global variable:
>
> def change_ext(name, new_ext):
>     """
>     >>> change_ext("/deep/blue.eyes", ".c")
>     '/deep/blue.c'
>     """
>     return os.path.splitext(name)[0] + new_ext
>
> def translate_process(dictionary, tobetranslatedfile):
>     with open(tobetranslatedfile, "r") as f:
>         results = (dictionary[line.split()[2]] for line in f)
>         unique = Counter(results)
>
>     with open(change_ext(tobetranslatedfile, OUTPUTFILEEXT), "w") as out:
>         for residue, numbers in unique.items():
>             print(residue, numbers, file=out)

Now it works as expected, and it is more concise than before. Thanks,

>> it still the same in the OUTPUTFILE as before,
>>
>> $ more atom-pair_6.txt
>> {'26SER': 2, '16LYS': 1, '83ILE': 2, '70LYS': 6, '55HIS': 5}
>
> Unlikely. Verify that you are running the correct script and looking into
> the right output file.

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] how to unique the string
>> def translate_process(dictionary,tobetranslatedfile):
>>     results=[]
>>     unique={}
>>     for line in open(tobetranslatedfile,"r"):
>>         tobetranslatedparts=line.strip().split()
>>         results.append(dictionary[tobetranslatedparts[2]])
>>         unique=Counter(results)
>>         with open(base+OUTPUTFILEEXT,"w") as f:
>
> Every time you do this, you're truncating the file. It'd be better to open
> the file outside the for-line loop, and just use the file object in the
> loop.

I did not understand this well until I read Peter's following post.

Thanks,

> --
> DaveA
Re: [Tutor] how to unique the string
On Mon, Oct 24, 2011 at 3:24 PM, Peter Otten <__pete...@web.de> wrote:
> lina wrote:
>
>> But I am getting confused later:
>>
>> def translate_process(dictionary,tobetranslatedfile):
>>     results=[]
>>     unique={}
>>     for line in open(tobetranslatedfile,"r"):
>>         tobetranslatedparts=line.strip().split()
>>         results.append(dictionary[tobetranslatedparts[2]])
>>         unique=Counter(results)
>>         with open(base+OUTPUTFILEEXT,"w") as f:
>>             for residue, numbers in unique.items():
>>                 print(residue,numbers,file=f)
>
> As Dave says, the above four lines should run only once, outside the
> for-loop.
>
> Here's a way to avoid the intermediate results list. As a bonus I'm removing
> access to the `base` global variable:
>
> def change_ext(name, new_ext):
>     """
>     >>> change_ext("/deep/blue.eyes", ".c")
>     '/deep/blue.c'
>     """
>     return os.path.splitext(name)[0] + new_ext
>
> def translate_process(dictionary, tobetranslatedfile):
>     with open(tobetranslatedfile, "r") as f:
>         results = (dictionary[line.split()[2]] for line in f)
>         unique = Counter(results)
>
>     with open(change_ext(tobetranslatedfile, OUTPUTFILEEXT), "w") as out:
>         for residue, numbers in unique.items():
>             print(residue, numbers, file=out)
>
>> it still the same in the OUTPUTFILE as before,
>>
>> $ more atom-pair_6.txt
>> {'26SER': 2, '16LYS': 1, '83ILE': 2, '70LYS': 6, '55HIS': 5}
>
> Unlikely. Verify that you are running the correct script and looking into
> the right output file.

Thanks,

print(residue,numbers,"\n",file=f)

achieves this:

62PHE 10

34LEU 37

43ASP 6

but it seems the \n added one more line.
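A note on the extra blank line: print() already ends every call with a
newline (its `end` argument defaults to "\n"), so passing "\n" as one more
argument writes a second one. A minimal sketch, assuming an in-memory
io.StringIO buffer as a stand-in for the real output file; the buffer names
and the sample counts are only for illustration:

```python
import io

unique = {"62PHE": 10, "34LEU": 37, "43ASP": 6}

# With an explicit "\n" argument: print() joins its arguments with spaces and
# then appends its own newline, so every record is followed by a blank line.
doubled = io.StringIO()
for residue, numbers in unique.items():
    print(residue, numbers, "\n", file=doubled)

# Without it: one record per line and no blank lines.
single = io.StringIO()
for residue, numbers in unique.items():
    print(residue, numbers, file=single)

print(repr(doubled.getvalue()))
print(repr(single.getvalue()))
```

Dropping the "\n" argument, as in Peter's version, should be enough to get
one record per line.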
Re: [Tutor] how to unique the string
lina wrote:

> But I am getting confused later:
>
> def translate_process(dictionary,tobetranslatedfile):
>     results=[]
>     unique={}
>     for line in open(tobetranslatedfile,"r"):
>         tobetranslatedparts=line.strip().split()
>         results.append(dictionary[tobetranslatedparts[2]])
>         unique=Counter(results)
>         with open(base+OUTPUTFILEEXT,"w") as f:
>             for residue, numbers in unique.items():
>                 print(residue,numbers,file=f)

As Dave says, the above four lines should run only once, outside the
for-loop.

Here's a way to avoid the intermediate results list. As a bonus I'm removing
access to the `base` global variable:

def change_ext(name, new_ext):
    """
    >>> change_ext("/deep/blue.eyes", ".c")
    '/deep/blue.c'
    """
    return os.path.splitext(name)[0] + new_ext

def translate_process(dictionary, tobetranslatedfile):
    with open(tobetranslatedfile, "r") as f:
        results = (dictionary[line.split()[2]] for line in f)
        unique = Counter(results)

    with open(change_ext(tobetranslatedfile, OUTPUTFILEEXT), "w") as out:
        for residue, numbers in unique.items():
            print(residue, numbers, file=out)

> it still the same in the OUTPUTFILE as before,
>
> $ more atom-pair_6.txt
> {'26SER': 2, '16LYS': 1, '83ILE': 2, '70LYS': 6, '55HIS': 5}

Unlikely. Verify that you are running the correct script and looking into
the right output file.
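One detail of the generator version that is easy to miss: a generator
expression is lazy, so Counter() has to consume it while the input file is
still open, which is why `unique = Counter(results)` sits inside the `with`
block. A small sketch of what can go wrong otherwise; the temporary file, its
contents and the variable names are made up for the demonstration:

```python
import os
import tempfile
from collections import Counter

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.out")
    with open(path, "w") as f:
        f.write("x y a1\nx y a2\nx y a1\n")

    # Consumed inside the with block: the file is still open, so this works.
    with open(path) as f:
        inside = Counter(line.split()[2] for line in f)

    # Only created inside the with block: nothing has been read yet when the
    # file is closed, so consuming the generator afterwards fails.
    with open(path) as f:
        lazy = (line.split()[2] for line in f)
    try:
        Counter(lazy)
        error = None
    except ValueError as exc:
        error = str(exc)

print(inside)
print(error)
```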
Re: [Tutor] how to unique the string
On 10/23/2011 08:01 AM, lina wrote:
> On Sun, Oct 23, 2011 at 6:06 PM, Peter Otten<__pete...@web.de> wrote:
>> lina wrote:
>>
>>> tobetranslatedparts=line.strip().split()
>>
>> strip() is superfluous here, split() will take care of the stripping:
>>
>> >>> " alpha \tbeta\n".split()
>> ['alpha', 'beta']
>>
>>> for residue in results:
>>>     if residue not in unique:
>>>         unique[residue]=1
>>>     else:
>>>         unique[residue]+=1
>>
>> There is a dedicated class to help you with that, collections.Counter:
>>
>> >>> from collections import Counter
>> >>> results = ["alpha", "beta", "gamma", "alpha"]
>> >>> unique = Counter(results)
>> >>> unique
>> Counter({'alpha': 2, 'beta': 1, 'gamma': 1})
>>
>> Counter is a subclass of dict, so the stuff you are doing with `unique`
>> elsewhere should continue to work.
>>
>>> This part I just wish the output in file like:
>>>
>>> {'26SER': 2, '16LYS': 1, '83ILE': 2, '70LYS': 6}
>>>
>>> as
>>>
>>> 26SER 2
>>> 16LYS 1
>>> 83ILE 2
>>> 70LYS 6
>>
>> You can redirect the output of print() to a file using the `file` keyword
>> argument:
>>
>> >>> unique = {'26SER': 2, '16LYS': 1, '83ILE': 2, '70LYS': 6}
>> >>> with open("tmp.txt", "w") as f:
>> ...     for k, v in unique.items():
>> ...         print(k, v, file=f)
>
> I tested it in idle3, it has no problem achieving this.
>
> But I am getting confused later:
>
> def translate_process(dictionary,tobetranslatedfile):
>     results=[]
>     unique={}
>     for line in open(tobetranslatedfile,"r"):
>         tobetranslatedparts=line.strip().split()
>         results.append(dictionary[tobetranslatedparts[2]])
>         unique=Counter(results)
>         with open(base+OUTPUTFILEEXT,"w") as f:

Every time you do this, you're truncating the file. It'd be better to open
the file outside the for-line loop, and just use the file object in the
loop.

>             for residue, numbers in unique.items():
>                 print(residue,numbers,file=f)
>
> it still the same in the OUTPUTFILE as before,
>
> $ more atom-pair_6.txt
> {'26SER': 2, '16LYS': 1, '83ILE': 2, '70LYS': 6, '55HIS': 5}
>
> Thanks,
>
>> ...
>> $ cat tmp.txt
>> 26SER 2
>> 83ILE 2
>> 70LYS 6
>> 16LYS 1
>> $

--
DaveA
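To illustrate the truncation point: opening a file in "w" mode empties it
first, so reopening it on every pass through a loop throws away whatever the
previous pass wrote. A minimal sketch, assuming a temporary file and three
made-up records (none of these names come from the original script):

```python
import os
import tempfile

records = ["26SER 2", "16LYS 1", "83ILE 2"]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "out.txt")

    # Reopening with "w" for every record: each open() truncates the file,
    # so only the last record survives.
    for record in records:
        with open(path, "w") as f:
            print(record, file=f)
    with open(path) as f:
        reopened = f.read()

    # Opening once, outside the loop: all records are kept.
    with open(path, "w") as f:
        for record in records:
            print(record, file=f)
    with open(path) as f:
        opened_once = f.read()

print(repr(reopened))
print(repr(opened_once))
```

In the script quoted above the whole Counter is rewritten on each pass, so
the final file can still end up complete; the reopening is then mostly wasted
work, but the pattern is worth avoiding for the reason shown here.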
Re: [Tutor] how to unique the string
On Sun, Oct 23, 2011 at 6:06 PM, Peter Otten <__pete...@web.de> wrote:
> lina wrote:
>
>> tobetranslatedparts=line.strip().split()
>
> strip() is superfluous here, split() will take care of the stripping:
>
> >>> " alpha \tbeta\n".split()
> ['alpha', 'beta']
>
>> for residue in results:
>>     if residue not in unique:
>>         unique[residue]=1
>>     else:
>>         unique[residue]+=1
>
> There is a dedicated class to help you with that, collections.Counter:
>
> >>> from collections import Counter
> >>> results = ["alpha", "beta", "gamma", "alpha"]
> >>> unique = Counter(results)
> >>> unique
> Counter({'alpha': 2, 'beta': 1, 'gamma': 1})
>
> Counter is a subclass of dict, so the stuff you are doing with `unique`
> elsewhere should continue to work.
>
>> This part I just wish the output in file like:
>>
>> {'26SER': 2, '16LYS': 1, '83ILE': 2, '70LYS': 6}
>>
>> as
>>
>> 26SER 2
>> 16LYS 1
>> 83ILE 2
>> 70LYS 6
>
> You can redirect the output of print() to a file using the `file` keyword
> argument:
>
> >>> unique = {'26SER': 2, '16LYS': 1, '83ILE': 2, '70LYS': 6}
> >>> with open("tmp.txt", "w") as f:
> ...     for k, v in unique.items():
> ...         print(k, v, file=f)

I tested it in idle3, and it has no problem achieving this.

But I am getting confused later:

def translate_process(dictionary,tobetranslatedfile):
    results=[]
    unique={}
    for line in open(tobetranslatedfile,"r"):
        tobetranslatedparts=line.strip().split()
        results.append(dictionary[tobetranslatedparts[2]])
        unique=Counter(results)
        with open(base+OUTPUTFILEEXT,"w") as f:
            for residue, numbers in unique.items():
                print(residue,numbers,file=f)

It is still the same in the OUTPUTFILE as before:

$ more atom-pair_6.txt
{'26SER': 2, '16LYS': 1, '83ILE': 2, '70LYS': 6, '55HIS': 5}

Thanks,

> ...
> $ cat tmp.txt
> 26SER 2
> 83ILE 2
> 70LYS 6
> 16LYS 1
> $
Re: [Tutor] how to unique the string
lina wrote:

>>> tobetranslatedparts=line.strip().split()

strip() is superfluous here, split() will take care of the stripping:

>>> " alpha \tbeta\n".split()
['alpha', 'beta']

>>> for residue in results:
>>>     if residue not in unique:
>>>         unique[residue]=1
>>>     else:
>>>         unique[residue]+=1

There is a dedicated class to help you with that, collections.Counter:

>>> from collections import Counter
>>> results = ["alpha", "beta", "gamma", "alpha"]
>>> unique = Counter(results)
>>> unique
Counter({'alpha': 2, 'beta': 1, 'gamma': 1})

Counter is a subclass of dict, so the stuff you are doing with `unique`
elsewhere should continue to work.

> This part I just wish the output in file like:
>
> {'26SER': 2, '16LYS': 1, '83ILE': 2, '70LYS': 6}
>
> as
>
> 26SER 2
> 16LYS 1
> 83ILE 2
> 70LYS 6

You can redirect the output of print() to a file using the `file` keyword
argument:

>>> unique = {'26SER': 2, '16LYS': 1, '83ILE': 2, '70LYS': 6}
>>> with open("tmp.txt", "w") as f:
...     for k, v in unique.items():
...         print(k, v, file=f)
...
>>>
$ cat tmp.txt
26SER 2
83ILE 2
70LYS 6
16LYS 1
$
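As a quick check of the two claims above, the hand-written counting loop and
Counter should produce the same counts, and the Counter can be used wherever
a dict is expected. A minimal sketch; the names `manual` and `counted` are
only for this illustration:

```python
from collections import Counter

results = ["alpha", "beta", "gamma", "alpha"]

# The original if/else counting loop.
manual = {}
for residue in results:
    if residue not in manual:
        manual[residue] = 1
    else:
        manual[residue] += 1

# The same counts in one call.
counted = Counter(results)

print(dict(counted) == manual)    # same keys and counts
print(isinstance(counted, dict))  # Counter is a dict subclass
print(counted["alpha"])
```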
Re: [Tutor] how to unique the string
The updated one -- following Alan's advice.

#!/usr/bin/python3

import os.path

mapping={}

DICTIONARYFILE="dictionary.pdb"
TOBETRANSLATEDFILEEXT=".out"
OUTPUTFILEEXT=".txt"

def generate_dict(dictionarysourcefile):
    for line in open(dictionarysourcefile,"r"):
        parts=line.strip().split()
        mapping[parts[2]]=parts[0]

def translate_process(dictionary,tobetranslatedfile):
    results=[]
    unique={}
    for line in open(tobetranslatedfile,"r"):
        tobetranslatedparts=line.strip().split()
        results.append(dictionary[tobetranslatedparts[2]])
    for residue in results:
        unique[residue]=unique.get(residue,0)+1
    for residue, numbers in unique.items():
        print(residue,numbers)
    with open(base+OUTPUTFILEEXT,"w") as f:
        f.write(str(unique))

if __name__=="__main__":
    generate_dict(DICTIONARYFILE)
    for infilename in os.listdir("."):
        base, ext = os.path.splitext(infilename)
        if ext == TOBETRANSLATEDFILEEXT:
            translate_process(mapping, infilename)
Re: [Tutor] how to unique the string
On Sun, Oct 23, 2011 at 5:08 PM, Alan Gauld wrote:
> On 23/10/11 09:33, lina wrote:
>
>> I have a further question:
>>
>> Welcome anyone help me transform the code to another form.
>
> What form would you like it transformed to?
> A flow chart? Another programming language? A different style of Python
> (Functional programming or OOP maybe?)

Just an updated version, like the comments you gave.
BTW, thanks for the comments.

> I'm not sure what you want here?
> In the meantime I'll offer some general comments:
>
>> #!/usr/bin/python3
>> import os.path
>> mapping={}
>>
>> DICTIONARYFILE="dictionary.pdb"
>> TOBETRANSLATEDFILEEXT=".out"
>> OUTPUTFILEEXT=".txt"
>>
>> def generate_dict(dictionarysourcefile):
>>     for line in open(dictionarysourcefile,"r").readlines():
>
> You don't need the readlines(). Just use
>
> for line in open(dictionarysourcefile,"r"):
>
> That will work just as well.
>
>>         parts=line.strip().split()
>>         mapping[parts[2]]=parts[0]
>>
>> def translate_process(dictionary,tobetranslatedfile):
>>     results=[]
>>     unique={}
>>     for line in open(tobetranslatedfile,"r").readlines():
>>         tobetranslatedparts=line.strip().split()
>>         results.append(dictionary[tobetranslatedparts[2]])
>>     for residue in results:
>>         if residue not in unique:
>>             unique[residue]=1
>>         else:
>>             unique[residue]+=1
>
> You can replace the if/else with the get() method of a dictionary:
>
> unique[residue] = unique.get(residue,0) + 1
>
> get returns the current value and if the value is not there it returns the
> second parameter (zero here)
>
>>     for residue, numbers in unique.items():
>>         print(residue,numbers)
>>     with open(base+OUTPUTFILEEXT,"w") as f:
>>         f.write(str(unique))
>>         ### How can I output the results the same as the print one. Thanks.
>
> create a string before you write it:
>
> mystr = str(residue) + str(numbers)

For this part, I just wish the output in the file, instead of

{'26SER': 2, '16LYS': 1, '83ILE': 2, '70LYS': 6}

to look like

26SER 2
16LYS 1
83ILE 2
70LYS 6

I still have a problem writing the dict.

Thanks again for your time,

Best regards,

> is the simplest way. However you may prefer to format the string in another
> way first. But that's your choice...
>
>> if __name__=="__main__":
>>     generate_dict(DICTIONARYFILE)
>>     for infilename in os.listdir("."):
>>         base, ext = os.path.splitext(infilename)
>>         if ext == TOBETRANSLATEDFILEEXT:
>>             translate_process(mapping, infilename)
>
> --
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
Re: [Tutor] how to unique the string
On 23/10/11 09:33, lina wrote:

> I have a further question:
>
> Welcome anyone help me transform the code to another form.

What form would you like it transformed to?
A flow chart? Another programming language? A different style of Python
(Functional programming or OOP maybe?)

I'm not sure what you want here?
In the meantime I'll offer some general comments:

> #!/usr/bin/python3
> import os.path
> mapping={}
>
> DICTIONARYFILE="dictionary.pdb"
> TOBETRANSLATEDFILEEXT=".out"
> OUTPUTFILEEXT=".txt"
>
> def generate_dict(dictionarysourcefile):
>     for line in open(dictionarysourcefile,"r").readlines():

You don't need the readlines(). Just use

for line in open(dictionarysourcefile,"r"):

That will work just as well.

>         parts=line.strip().split()
>         mapping[parts[2]]=parts[0]
>
> def translate_process(dictionary,tobetranslatedfile):
>     results=[]
>     unique={}
>     for line in open(tobetranslatedfile,"r").readlines():
>         tobetranslatedparts=line.strip().split()
>         results.append(dictionary[tobetranslatedparts[2]])
>     for residue in results:
>         if residue not in unique:
>             unique[residue]=1
>         else:
>             unique[residue]+=1

You can replace the if/else with the get() method of a dictionary:

unique[residue] = unique.get(residue,0) + 1

get returns the current value and if the value is not there it returns the
second parameter (zero here)

>     for residue, numbers in unique.items():
>         print(residue,numbers)
>     with open(base+OUTPUTFILEEXT,"w") as f:
>         f.write(str(unique))
>         ### How can I output the results the same as the print one. Thanks.

create a string before you write it:

mystr = str(residue) + str(numbers)

is the simplest way. However you may prefer to format the string in another
way first. But that's your choice...

> if __name__=="__main__":
>     generate_dict(DICTIONARYFILE)
>     for infilename in os.listdir("."):
>         base, ext = os.path.splitext(infilename)
>         if ext == TOBETRANSLATEDFILEEXT:
>             translate_process(mapping, infilename)

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
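A note on building the string by hand: unlike print(), write() adds neither
a space between the pieces nor a newline at the end, so both have to be part
of the string to get one "residue count" pair per line. A minimal sketch,
assuming an io.StringIO buffer as a stand-in for the open output file (the
buffer name `out` is only for this illustration):

```python
import io

unique = {"26SER": 2, "16LYS": 1, "83ILE": 2, "70LYS": 6}

out = io.StringIO()  # stand-in for the file opened with open(..., "w")
for residue, numbers in unique.items():
    # write() adds nothing by itself, so the separator and the newline
    # have to be part of the string.
    mystr = str(residue) + " " + str(numbers) + "\n"
    out.write(mystr)

print(out.getvalue(), end="")
```

The write has to happen inside the loop over unique.items(), once per pair;
writing str(unique) once after the loop is what produces the single line in
dict notation.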
Re: [Tutor] how to unique the string
On Sun, Oct 23, 2011 at 12:50 AM, Steven D'Aprano wrote:
> lina wrote:
>>
>> Hi,
>>
>> I googled for a while, but failed to find the perfect answer,
>>
>> for a string
>>
>> ['85CUR', '85CUR']
>
> That's not a string, it is a list.
>
>> how can I unique it as:
>>
>> ['85CUR']
>
> Your question is unclear. If you have this:
>
> ['85CUR', '99bcd', '85CUR', '85CUR']
>
> what do you expect to get?
>
> # keep only the very first item
> ['85CUR']
> # keep the first copy of each string, in order
> ['85CUR', '99bcd']
> # keep the last copy of each string, in order
> ['99bcd', '85CUR']
> # ignore duplicates only when next to each other
> ['85CUR', '99bcd', '85CUR']
>
> Does the order of the result matter?
>
> If order matters, and you want to keep the first copy of each string:
>
> unique = []
> for item in items:
>     if item not in unique:
>         unique.append(item)
>
> If order doesn't matter, then use this:
>
> unique = list(set(items))

Thanks, this one, unique=list(set(item)), works well.

I have a further question:

#!/usr/bin/python3

import os.path

mapping={}

DICTIONARYFILE="dictionary.pdb"
TOBETRANSLATEDFILEEXT=".out"
OUTPUTFILEEXT=".txt"

def generate_dict(dictionarysourcefile):
    for line in open(dictionarysourcefile,"r").readlines():
        parts=line.strip().split()
        mapping[parts[2]]=parts[0]

def translate_process(dictionary,tobetranslatedfile):
    results=[]
    unique={}
    for line in open(tobetranslatedfile,"r").readlines():
        tobetranslatedparts=line.strip().split()
        results.append(dictionary[tobetranslatedparts[2]])
    for residue in results:
        if residue not in unique:
            unique[residue]=1
        else:
            unique[residue]+=1
    for residue, numbers in unique.items():
        print(residue,numbers)
    with open(base+OUTPUTFILEEXT,"w") as f:
        f.write(str(unique))
        ### How can I output the results the same as the print one. Thanks.

if __name__=="__main__":
    generate_dict(DICTIONARYFILE)
    for infilename in os.listdir("."):
        base, ext = os.path.splitext(infilename)
        if ext == TOBETRANSLATEDFILEEXT:
            translate_process(mapping, infilename)

https://docs.google.com/leaf?id=0B93SVRfpVVg3ZTBiYjU1MzYtNTNkMS00ZjQ1LWI4MzEtNDEyZWUwYTFmNjU4&hl=en_GB
https://docs.google.com/leaf?id=0B93SVRfpVVg3ODU4MDlkMDQtOTJmMy00MDJiLTkwM2EtY2EyNTUyZmNhNTNm&hl=en_GB

I would welcome anyone's help transforming the code into another form.

> --
> Steven
Re: [Tutor] how to unique the string
On 23 Oct, 2011, at 0:22, Devin Jeanpierre wrote:

> You should be able to do this yourself, with the help of the following link:
>
> http://docs.python.org/library/stdtypes.html#set

Thanks.

> Is this a homework question?

No.

> Devin
>
> On Sat, Oct 22, 2011 at 12:09 PM, lina wrote:
>> Hi,
>>
>> I googled for a while, but failed to find the perfect answer,
>>
>> for a string
>>
>> ['85CUR', '85CUR']
>>
>> how can I unique it as:
>>
>> ['85CUR']
>>
>> Thanks,
Re: [Tutor] how to unique the string
lina wrote:

> Hi,
>
> I googled for a while, but failed to find the perfect answer,
>
> for a string
>
> ['85CUR', '85CUR']

That's not a string, it is a list.

> how can I unique it as:
>
> ['85CUR']

Your question is unclear. If you have this:

['85CUR', '99bcd', '85CUR', '85CUR']

what do you expect to get?

# keep only the very first item
['85CUR']
# keep the first copy of each string, in order
['85CUR', '99bcd']
# keep the last copy of each string, in order
['99bcd', '85CUR']
# ignore duplicates only when next to each other
['85CUR', '99bcd', '85CUR']

Does the order of the result matter?

If order matters, and you want to keep the first copy of each string:

unique = []
for item in items:
    if item not in unique:
        unique.append(item)

If order doesn't matter, then use this:

unique = list(set(items))

--
Steven
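The four readings of "unique" listed above can each be written in a line or
two. This is only a sketch of one possible way, not the approach from the
message itself: it swaps in dict.fromkeys() (which keeps insertion order on
Python 3.7 and later) for the explicit loop, and itertools.groupby() for the
adjacent-duplicates case. The variable names are made up for illustration.

```python
from itertools import groupby

items = ["85CUR", "99bcd", "85CUR", "85CUR"]

# keep only the very first item
first_only = items[:1]

# keep the first copy of each string, in order
first_copies = list(dict.fromkeys(items))

# keep the last copy of each string, in order:
# deduplicate the reversed list, then reverse the result back
last_copies = list(reversed(list(dict.fromkeys(reversed(items)))))

# ignore duplicates only when next to each other
collapsed = [key for key, _group in groupby(items)]

print(first_only)
print(first_copies)
print(last_copies)
print(collapsed)
```

Unlike list(set(items)), all four variants return the items in a predictable
order.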
Re: [Tutor] how to unique the string
On 10/22/2011 12:09 PM, lina wrote:
> Hi,
>
> I googled for a while, but failed to find the perfect answer,
>
> for a string
>
> ['85CUR', '85CUR']
>
> how can I unique it as:
>
> ['85CUR']

Try set(['85CUR', '85CUR'])

--
Bob Gailer
919-636-4239
Chapel Hill NC
Re: [Tutor] how to unique the string
You should be able to do this yourself, with the help of the following link:

http://docs.python.org/library/stdtypes.html#set

Is this a homework question?

Devin

On Sat, Oct 22, 2011 at 12:09 PM, lina wrote:
> Hi,
>
> I googled for a while, but failed to find the perfect answer,
>
> for a string
>
> ['85CUR', '85CUR']
>
> how can I unique it as:
>
> ['85CUR']
>
> Thanks,