Yes, Vincent's way is the better way to go. To elaborate on the problem: building a string by repeated appending is O(N^2) overall, because each append can copy everything accumulated so far, while appending to a list and then joining once is O(N). Why CPython is faster than PyPy at the less efficient approach is something I'm not fully sure about, but I believe it might have to do with their differing memory management strategies.
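To make the difference concrete, here is a minimal sketch of the two patterns side by side. The function names and the sample data are made up for illustration; this is not the original script, just the general shape of each approach.

```python
# Hypothetical helper names, for illustration only.

def concat_lines(lines):
    """Build the result by repeated string concatenation."""
    result = ''
    for line in lines:
        # Each += may copy everything accumulated so far,
        # so the total work can grow quadratically.
        result += line.rstrip()
    return result


def join_lines(lines):
    """Collect the pieces in a list and join them once at the end."""
    parts = []
    for line in lines:
        parts.append(line.rstrip())
    # A single join copies each character once: linear in total length.
    return ''.join(parts)


# Made-up sample input; both approaches give the same string.
sample = ['ACGT\n', 'TTGA\n', 'CC\n']
assert concat_lines(sample) == join_lines(sample)
```

As far as I know, CPython can sometimes extend a string in place when nothing else references it, which may be part of why the slow version hurts less there. PEP 8 recommends not relying on that, since implementations without reference counting don't have it.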
On Thu, Aug 18, 2011 at 4:24 PM, Vincent Legoll <vincent.leg...@gmail.com> wrote:
> Hello,
>
> Try this:
>
> import sys
>
> fasta_file = sys.argv[1] # should be *.fa
> print 'loading dna from', fasta_file
> chroms = {}
> dna = []
> for l in open(fasta_file):
>     if l.startswith('>'): # new chromosome
>         if len(dna) > 0:
>             chroms[chrom] = ''.join(dna)
>         chrom = l.strip().replace('>', '')
>         dna = []
>     else:
>         dna.append(l.rstrip())
> if len(dna) > 0:
>     chroms[chrom] = ''.join(dna)
>
> --
> Vincent Legoll
> _______________________________________________
> pypy-dev mailing list
> pypy-dev@python.org
> http://mail.python.org/mailman/listinfo/pypy-dev