On Tue, Mar 25, 2014 at 8:36 AM, Sydney Shall <s.sh...@virginmedia.com> wrote:
> I did not know about biopython, but then I am a debutant.
> I tried to import biopython and I get the message that the name is unknown.

No problem. It is an external library; I hope that you were able to find it!

I just want to make sure no one else tries to write yet another FASTA parser badly. It's all too easy to code something quick-and-dirty that almost solves the issue. The devil's in the details.

It might be instructive to look at the source code. You can look at:

    https://github.com/biopython/biopython/blob/master/Bio/SeqIO/FastaIO.py

and see all the implementation details the Biopython community has had to consider in the real world. These include things like skipping crazy garbage at the beginning of files:

    https://github.com/biopython/biopython/blob/master/Bio/SeqIO/FastaIO.py#L40-L45

and providing a stream-like interface by using generators (using the "yield" keyword):

    https://github.com/biopython/biopython/blob/master/Bio/SeqIO/FastaIO.py#L65

But also consider data validation facilities. At least, the Biopython folks have. They provide a way to declare the genomic alphabet to be used:

    https://github.com/biopython/biopython/blob/master/Bio/SeqIO/FastaIO.py#L73
    https://github.com/biopython/biopython/blob/master/Bio/Alphabet/

so that if the input data doesn't match the allowed alphabet, you'll get a good warning about it ahead of time. This is checked in places like:

    https://github.com/biopython/biopython/blob/master/Bio/Alphabet/__init__.py#L375
    https://github.com/biopython/biopython/blob/master/Bio/Seq.py#L336

In short, the developers have thought about the issues that come up with potentially messy data and have programmed for those situations. As the commit history demonstrates:

    https://github.com/biopython/biopython/commits/master

they started work in the last century (since at least 1999-12-07), and continue to work on it even now. So taking advantage of their generous, hard work is a good idea.

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
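
To make the three details mentioned in the message concrete (skipping garbage before the first record, streaming with "yield", and checking an alphabet), here is a rough standard-library sketch. It is not Biopython's code and not a substitute for it: the names read_fasta and DNA_ALPHABET are made up for illustration, the five-letter alphabet is an assumption, and it ignores many cases the real parser handles.

```python
import io

# Assumed alphabet for this sketch only; real data may need more letters.
DNA_ALPHABET = frozenset("ACGTN")


def read_fasta(handle, alphabet=DNA_ALPHABET):
    """Yield (title, sequence) pairs from an open text handle.

    Hypothetical helper for illustration, not part of any library.
    """
    line = handle.readline()
    # Skip anything before the first header line (blank lines, stray text).
    while line and not line.startswith(">"):
        line = handle.readline()

    while line:
        title = line[1:].rstrip()
        chunks = []
        line = handle.readline()
        # Collect sequence lines until the next header or end of input.
        while line and not line.startswith(">"):
            chunks.append(line.strip())
            line = handle.readline()
        sequence = "".join(chunks)
        # Validate against the declared alphabet, ignoring case.
        bad = set(sequence.upper()) - alphabet
        if bad:
            raise ValueError(
                "record %r has letters outside the alphabet: %s"
                % (title, "".join(sorted(bad)))
            )
        # Hand back one record at a time instead of loading the whole file.
        yield title, sequence


sample = io.StringIO(
    "junk before the first record\n\n>seq1 first\nACGT\nacgt\n>seq2\nNNNN\n"
)
records = list(read_fasta(sample))
for title, sequence in records:
    print(title, len(sequence))
```

Even this small version has to make decisions (what counts as garbage, whether case matters, when to complain), which is the point of the message above. For real work, once Biopython is installed the package is imported as Bio, not biopython, and SeqIO.parse(handle, "fasta") from Bio gives you the same kind of record-by-record iteration.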