* Ketil Malde <[email protected]> [20.10.2011 09:54]:
>
> One important piece of functionality that could be ripped from biolib and
> made into a separate library, is the BLAST output parser¹. This could
> also do with some cleanup, and would make a nice, standalone project.
> It's also fairly open-ended. If you're more interested in algorithms,
> there's some stuff for sequence alignments that I was never quite
> satisfied with.
>
> ¹ Christian, didn't you do something on this?

Yeah, I completely forgot about that. The bits and pieces I have, once I
find them again ;-), are iteratee code, however. Not the best thing to
start Haskell with. On the other hand, once you understand that stuff you
know a lot of high-level Haskell, in addition to how to make Haskell
fast...

As a student project, the second idea on sequence alignments seems to be
more fun, though. And it would be useful on its own.

The sequence alignment stuff can be done in a month, as the algorithms are
not that complicated and you mostly just need to know Haskell arrays.
These are possible tasks:

- global alignment
- (backtracking)
- local alignment
- high-performance code
  - unboxed arrays
  - vector-based fusion operations

I put backtracking in brackets as there are two interesting ways to do
alignments: have a forward pass calculate the scores and then find out via
backtracking which alignments produce the best score. Or use something
like "algebraic dynamic programming" (Giegerich et al.) to do it all in
one pass.

The order of the tasks above should allow you to stop at any point and
have something to show that would be useful later on. Basically, if you
write all of that and stop after the vector-fusion operations, you come
close to C code in terms of performance.

Anyway, start with global alignment: you can basically read a book chapter
on that on day 1 and code it using arrays in 1-2 days thereafter
(depending on what you know about Haskell).

Gruss,
Christian
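
P.S. To make the first two tasks (global alignment with a forward pass,
then backtracking) a bit more concrete, here is a minimal sketch using
plain boxed Data.Array. It is only an illustration, not code from biolib:
the scoring values (match +1, mismatch -1, linear gap -2) and all function
names (scoreTable, globalScore, globalAlign) are assumptions made up for
this example.

```haskell
import Data.Array

-- Illustrative scoring scheme (an assumption, not taken from biolib).
matchScore, mismatchScore, gapPenalty :: Int
matchScore    = 1
mismatchScore = -1
gapPenalty    = -2

-- Score for aligning two characters with each other.
subst :: Char -> Char -> Int
subst a b
  | a == b    = matchScore
  | otherwise = mismatchScore

-- Forward pass: fill the (n+1) x (m+1) score table.
-- The boxed array is defined lazily in terms of itself, so every cell
-- is evaluated on demand and at most once.
scoreTable :: String -> String -> Array (Int, Int) Int
scoreTable xs ys = table
  where
    n  = length xs
    m  = length ys
    xa = listArray (1, n) xs :: Array Int Char
    ya = listArray (1, m) ys :: Array Int Char
    table = array ((0, 0), (n, m))
      [ ((i, j), cell i j) | i <- [0 .. n], j <- [0 .. m] ]
    cell 0 j = j * gapPenalty          -- first row: only gaps
    cell i 0 = i * gapPenalty          -- first column: only gaps
    cell i j = maximum
      [ table ! (i - 1, j - 1) + subst (xa ! i) (ya ! j)  -- (mis)match
      , table ! (i - 1, j)     + gapPenalty               -- gap in ys
      , table ! (i, j - 1)     + gapPenalty               -- gap in xs
      ]

-- Optimal global alignment score: the bottom-right cell.
globalScore :: String -> String -> Int
globalScore xs ys = scoreTable xs ys ! (length xs, length ys)

-- Backtracking: walk from (n, m) back to (0, 0) and rebuild one
-- optimal alignment, with '-' marking gaps.
globalAlign :: String -> String -> (String, String)
globalAlign xs ys = go n m [] []
  where
    n  = length xs
    m  = length ys
    xa = listArray (1, n) xs :: Array Int Char
    ya = listArray (1, m) ys :: Array Int Char
    t  = scoreTable xs ys
    go :: Int -> Int -> String -> String -> (String, String)
    go 0 0 accX accY = (accX, accY)
    go i j accX accY
      | i > 0 && j > 0
        && t ! (i, j) == t ! (i - 1, j - 1) + subst (xa ! i) (ya ! j)
          = go (i - 1) (j - 1) (xa ! i : accX) (ya ! j : accY)
      | i > 0 && t ! (i, j) == t ! (i - 1, j) + gapPenalty
          = go (i - 1) j (xa ! i : accX) ('-' : accY)
      | otherwise
          = go i (j - 1) ('-' : accX) (ya ! j : accY)
```

Load it in GHCi and try, for example, globalScore "GATTACA" "GATACA" and
globalAlign "GATTACA" "GATACA". The later tasks would then replace the
boxed array with unboxed arrays or vectors, which needs an explicit fill
order instead of the lazy self-reference used here.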
_______________________________________________
Biohaskell mailing list
[email protected]
http://malde.org/cgi-bin/mailman/listinfo/biohaskell
