Daniel Klose wrote:
> Hi all,
>
> All I would like to do is take a file and count the number of times a
> letter occurs in it. It so happens that these letters are amino acids.
> There are also some other checks in the script but these are not a
> concern just yet.
>
> What I would like to do is create a dictionary of arrays.
> In Perl (my current scripting language of choice) I would simply put:
> ${$dictionary{$key}}[$element] += 1
> I have no idea how to create this kind of structure in Python.
I don't speak Perl much but it looks like you have a dict whose values
are lists. Not quite the same as what you have below, which is a dict
whose values are integers.

> Also I have a while loop. If this were perl, rather than using the i =
> 0 while(i < len(x)):
> I would do : for (my $i = 0; $i < @array; $i++) {}. I have found the
> range function but I am not sure how to use it properly.

You could use
  for i in range(len(strArray)):
but this is not good usage; better to iterate over strArray directly.

> What I would like to do is create an index that allows me to access the
> same element in two arrays (lists) of identical size.

You can use the zip() function to process two lists in parallel:
  for x, y in zip(xlist, ylist):
    # x is an element from xlist
    # y is the corresponding element from ylist

> I have pasted in my current code below, I would be very grateful if you
> could help me trim up this code.
> #!/usr/bin/python
>
> import sys, os
>
> structDir = '/home/danny/dataset/structure/'
> seqDir = '/home/danny/dataset/sequence/'
>
> target = sys.argv[1]
>
> seqFile = seqDir + target
> strFile = structDir + target

os.path.join() would be more idiomatic here though what you have works.

> seqDictionary = {}
>
> if (os.path.isfile(seqFile) and os.path.isfile(strFile)):
>
>     structureHandle = open(strFile)
>     structureString = structureHandle.readline()
>
>     sequenceHandle = open(seqFile)
>     sequenceString = sequenceHandle.readline()
>
>     strArray = list(structureString)
>     seqArray = list(sequenceString)

You don't have to convert to lists; strings are already sequences.

>     if len(strArray) == len(seqArray):
>         print "Length match\n"
>
>         i=0
>         while(i < len(strArray)):
>             if seqDictionary.has_key(seqArray[i]):
>                 seqDictionary[seqArray[i]] += 1
>             else:
>                 seqDictionary[seqArray[i]] = 1
>
>             i += 1

The idiomatic way to iterate over sequenceString is just
  for c in sequenceString:

You don't seem to be using strArray except to get the length.
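To make the two points so far concrete, here is a minimal sketch of os.path.join() and of iterating over a string directly. The file name 'example.txt' and the string 'ACDA' are made up for illustration; your script would take the target from sys.argv[1] and read the string from the file.

```python
import os.path

# Hypothetical values for illustration only; the real script reads
# the target name from sys.argv[1].
seqDir = '/home/danny/dataset/sequence/'
target = 'example.txt'

# os.path.join() adds a separator only where one is needed, so it
# works whether or not seqDir ends with a slash.
seqFile = os.path.join(seqDir, target)

# A string is already a sequence, so it can be looped over directly,
# with no list() conversion and no index variable.
sequenceString = 'ACDA'  # made-up stand-in for the line read from the file
seen = []
for c in sequenceString:
    seen.append(c)
```
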
Maybe this is where you need zip()? For example you could say
  for structChr, seqChr in zip(structureString, sequenceString):

An alternative to your conditional with has_key() is to use dict.get()
with a default value:
  seqDictionary[c] = seqDictionary.get(c, 0) + 1

so the whole loop becomes just
  for c in sequenceString:
    seqDictionary[c] = seqDictionary.get(c, 0) + 1

In Python 2.5 and later you can use defaultdict to create a dict with a
default value of 0:
  from collections import defaultdict
  seqDictionary = defaultdict(int)

then in the loop you can say
  seqDictionary[c] += 1

Kent
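P.S. Putting the suggestions above together, a minimal sketch might look like the following. The two strings are made up and stand in for the lines read from your files, and the names seqCounts and pairs are mine, not from your script. Treat it as one possible arrangement rather than a finished rewrite.

```python
from collections import defaultdict

# Made-up strings standing in for the first line of each file.
structureString = 'HHEEC'
sequenceString = 'AKAGA'

# Version 1: a plain dict, using get() with a default of 0 in place
# of the has_key() test.
seqDictionary = {}
for c in sequenceString:
    seqDictionary[c] = seqDictionary.get(c, 0) + 1

# Version 2: defaultdict(int) supplies 0 for a missing key, so the
# loop body is a plain increment (needs Python 2.5 or later).
seqCounts = defaultdict(int)
for c in sequenceString:
    seqCounts[c] += 1

# zip() pairs up corresponding characters of the two strings, which
# replaces the shared index i. The pairs are only collected here;
# what to do with them depends on your other checks.
pairs = []
if len(structureString) == len(sequenceString):
    for structChr, seqChr in zip(structureString, sequenceString):
        pairs.append((structChr, seqChr))
```

Both versions end up with the same counts; the defaultdict one just saves you from spelling out the default each time.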