Thanks Kent, I now have:

target = sys.argv[1]

seqDictionary = {}  # python 2.5: import defaultdict from collections.
structureArray = [0, 0, 0]

# THIS TAKES THE PLACE OF THE STANDARD PERL $DIR,$file
# shorter to do the os.path.join once to a variable.
if (os.path.isfile(os.path.join(structDir, target)) and os.path.isfile(os.path.join(seqDir, target))):
    structureHandle = open(os.path.join(structDir, target))
    structureString = structureHandle.readline()
    sequenceHandle = open(os.path.join(seqDir, target))
    sequenceString = sequenceHandle.readline()

    if len(structureString) == len(sequenceString):
        for strChar, seqChar in zip(structureString, sequenceString):
            # SET DEFAULT VALUE AS ZERO ELSE INCREMENT
            seqDictionary[seqChar] = seqDictionary.get(seqChar, 0) + 1
            if (strChar.count('-')):
                structureArray[0] += 1
            elif (strChar.count('H')):
                structureArray[1] += 1
            elif (strChar.count('E')):
                structureArray[2] += 1
            else:
                print strChar, " is not valid"
                break
else:
    print "Some data is missing!\n"

The reason I want to create a dictionary of lists is that, for each of the
keys in the dictionary, I want to keep tabs on the associated structure.
For example:

dictionary[A] = [0,0,0]

list x element = A
list y element = '-'

then dictionary[A][0] = 1

print dictionary[A] : [1, 0, 0]

I thought that a dictionary would be the best way (it is the same way as I
have done it in Perl and Java). I am using Google but having limited
success.

* Do you folks bottom post or top post? The users of the Perl list are
sensitive about this stuff!

I am only running Python 2.4 and the system admin doesn't like me, so I
won't ask him to upgrade it.

Kent Johnson wrote:
> Daniel Klose wrote:
>> Hi all,
>>
>> All I would like to do is take a file and count the number of times a
>> letter occurs in it. It so happens that there letters are amino acids.
>> There are also some other checks in the script but these are not a
>> concern just yet.
>>
>> What I would like to do is create a dictionary of arrays.
>> In perl (my current scripting language of choice) I would simply put:
>> ${$dictionary{$key}}[$element] += 1
>> I have no idea how to create this kind of structure in python.
>
> I don't speak perl much but it looks like you have a dict whose values
> are lists. Not quite the same as what you have below, which is a dict
> whose values are integers.
>>
>> Also I have a while loop. If this were perl, rather than using the i =
>> 0 while(i < len(x)):
>> I would do : for (my $i = 0; $i < @array; $i++) {}. I have found the
>> range function but I am not sure how to use it properly.
>
> You could use
>     for i in range(len(strArray)):
> but this is not good usage; better to iterate over strArray directly.
>
>> What I would like to do is create an index that allows me to access the
>> same element in two arrays (lists) of identical size.
>
> You can use the zip() function to process two lists in parallel:
>     for x, y in zip(xlist, ylist):
>         # x is an element from xlist
>         # y is the corresponding element from ylist
>>
>> I have pasted in my current code below, I would be very grateful if you
>> could help me trim up this code.
>> #!/usr/bin/python
>>
>> import sys, os
>>
>> structDir = '/home/danny/dataset/structure/'
>> seqDir = '/home/danny/dataset/sequence/'
>>
>> target = sys.argv[1]
>>
>> seqFile = seqDir + target
>> strFile = structDir + target
>
> os.path.join() would be more idiomatic here though what you have works.
>>
>> seqDictionary = {}
>>
>> if (os.path.isfile(seqFile) and os.path.isfile(strFile)):
>>     structureHandle = open(strFile)
>>     structureString = structureHandle.readline()
>>     sequenceHandle = open(seqFile)
>>     sequenceString = sequenceHandle.readline()
>>     strArray = list(structureString)
>>     seqArray = list(sequenceString)
>
> You don't have to convert to lists; strings are already sequences.
>>
>>     if len(strArray) == len(seqArray):
>>         print "Length match\n"
>>         i=0
>>         while(i < len(strArray)):
>>             if seqDictionary.has_key(seqArray[i]):
>>                 seqDictionary[seqArray[i]] += 1
>>             else:
>>                 seqDictionary[seqArray[i]] = 1
>>             i += 1
>
> The idiomatic way to iterate over sequenceString is just
>     for c in sequenceString:
>
> You don't seem to be using strArray except to get the length. Maybe
> this is where you need zip()? For example you could say
>     for structChr, seqChr in zip(structureString, sequenceString):
>
> An alternative to your conditional with has_key() is to use dict.get()
> with a default value:
>     seqDictionary[c] = seqDictionary.get(c, 0) + 1
>
> so the whole loop becomes just
>     for c in sequenceString:
>         seqDictionary[c] = seqDictionary.get(c, 0) + 1
>
> In Python 2.5 you can use defaultdict to create a dict with a default
> value of 0:
>     from collections import defaultdict
>     seqDictionary = defaultdict(int)
>
> then in the loop you can say
>     seqDictionary[c] += 1
>
> Kent
>
>
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
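
For reference, the dictionary-of-lists structure asked about at the top of
this message (the Perl ${$dictionary{$key}}[$element] += 1 idiom) might be
sketched in Python roughly as follows. This is only a sketch under stated
assumptions: the function name count_structure_by_residue, the
STRUCTURE_INDEX table and the sample strings are all made up for
illustration and are not part of the original script. It relies on
dict.setdefault(), which is available in Python 2.4, so defaultdict is not
needed. It also strips the trailing newline that readline() leaves on each
string, since otherwise the newline would be treated as an invalid
structure character.

```python
# Hypothetical lookup table (not in the original script): maps each
# structure character to its slot in the per-residue list.
STRUCTURE_INDEX = {'-': 0, 'H': 1, 'E': 2}


def count_structure_by_residue(structure_string, sequence_string):
    """Return a dict mapping each sequence character to [n_dash, n_H, n_E]."""
    # readline() keeps the trailing newline, so drop it before comparing.
    structure_string = structure_string.rstrip('\n')
    sequence_string = sequence_string.rstrip('\n')
    if len(structure_string) != len(sequence_string):
        raise ValueError("structure and sequence lengths differ")

    counts = {}
    for str_char, seq_char in zip(structure_string, sequence_string):
        if str_char not in STRUCTURE_INDEX:
            raise ValueError("%r is not a valid structure character" % str_char)
        # setdefault() stores a fresh [0, 0, 0] the first time a key is
        # seen and returns the stored list on every later call.
        per_residue = counts.setdefault(seq_char, [0, 0, 0])
        per_residue[STRUCTURE_INDEX[str_char]] += 1
    return counts


# Made-up sample data standing in for one line from each file.
counts = count_structure_by_residue("-HHE\n", "AAGA\n")
```

With the sample strings above, counts['A'] is the list of tallies for
residue A against '-', 'H' and 'E' respectively, which matches the
dictionary[A] = [0,0,0] layout described earlier. Raising ValueError on bad
input is a design choice of this sketch; the original script prints a
message and breaks out of the loop instead.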