* Ketil Malde <[email protected]> [20.10.2011 09:54]:
> 
> One important piece of functionality that could be ripped from biolib and
> made into a separate library, is the BLAST output parser¹.  This could
> also do with some cleanup, and would make a nice, standalone project.
> It's also fairly open-ended.  If you're more interested in algorithms,
> there's some stuff for sequence alignments that I was never quite
> satisfied with.
> 
> ¹ Christian, didn't you do something on this?

Yeah, I completely forgot about that. The bits and pieces I have, once I
find them again ;-), are iteratee-code, however. Not the best thing to
start Haskell with. On the other hand, once you understand that stuff
you know a lot of high-level Haskell in addition to how to make Haskell
fast...

==

As a student project, the second idea on sequence alignments seems to be
more fun, though. And it would be useful on its own. The sequence
alignment stuff can be done in a month, as the algorithms are not that
complicated and you mostly just need to know Haskell arrays.
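
To give an idea of how little code the array-based version needs, here is
a minimal sketch of a global alignment score using a lazy boxed array from
Data.Array. The scoring values (match +1, mismatch -1, gap -1) and the
name globalScore are assumptions made up for this illustration, they are
not taken from biolib.

```haskell
import Data.Array

-- Hypothetical scoring scheme, chosen only for illustration.
matchScore, mismatchScore, gapScore :: Int
matchScore    = 1
mismatchScore = -1
gapScore      = -1

-- Global (Needleman-Wunsch style) alignment score of two sequences.
globalScore :: String -> String -> Int
globalScore xs ys = table ! (n, m)
  where
    n  = length xs
    m  = length ys
    xa = listArray (1, n) xs
    ya = listArray (1, m) ys
    -- Lazy boxed array: each cell is a thunk that refers to its
    -- neighbours, so the fill order does not have to be spelled out.
    table :: Array (Int, Int) Int
    table = array ((0, 0), (n, m))
      [ ((i, j), cell i j) | i <- [0 .. n], j <- [0 .. m] ]
    -- First row and column: a prefix aligned against gaps only.
    cell 0 j = j * gapScore
    cell i 0 = i * gapScore
    cell i j = maximum
      [ table ! (i - 1, j - 1) + sub (xa ! i) (ya ! j)  -- (mis)match
      , table ! (i - 1, j) + gapScore                   -- gap in ys
      , table ! (i, j - 1) + gapScore                   -- gap in xs
      ]
    sub a b
      | a == b    = matchScore
      | otherwise = mismatchScore
```

Loading this in GHCi and calling globalScore "ACGT" "AGT" scores the best
alignment under the assumed scheme. The boxed, lazy table is convenient
but not fast.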

These are possible tasks:

- global alignment
- (backtracking)
- local alignment
- high-performance code
  - unboxed arrays
  - vector-based fusion operations

I put backtracking in brackets as there are two interesting ways to do
alignments: have a forward pass that calculates the scores and then find
out via backtracking which alignments produce this score. Or use
something like "algebraic dynamic programming" (Giegerich et al) to do
it all in one pass.
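
The first of these two ways can be sketched as follows: a forward pass
fills the score table, then a separate backtracking step recovers one
alignment that produces the final score. This is only a sketch under
assumptions: the scoring values and the names forward, backtrack and
align are hypothetical, and only a single optimal alignment is returned.
The algebraic dynamic programming variant is not shown.

```haskell
import Data.Array

-- Hypothetical scoring scheme, chosen only for illustration.
gap :: Int
gap = -1

subst :: Char -> Char -> Int
subst a b = if a == b then 1 else -1

-- Forward pass: fill the score table for global alignment.
forward :: Array Int Char -> Array Int Char -> Array (Int, Int) Int
forward xa ya = t
  where
    (_, n) = bounds xa
    (_, m) = bounds ya
    t = array ((0, 0), (n, m))
          [ ((i, j), cell i j) | i <- [0 .. n], j <- [0 .. m] ]
    cell 0 j = j * gap
    cell i 0 = i * gap
    cell i j = maximum [ t ! (i - 1, j - 1) + subst (xa ! i) (ya ! j)
                       , t ! (i - 1, j) + gap
                       , t ! (i, j - 1) + gap ]

-- Backtracking: walk from the bottom-right cell to the origin and, at
-- each step, pick one predecessor that explains the stored score.
backtrack :: Array Int Char -> Array Int Char -> Array (Int, Int) Int
          -> (String, String)
backtrack xa ya t = go n m [] []
  where
    (_, (n, m)) = bounds t
    go 0 0 ax ay = (ax, ay)
    go i j ax ay
      | i > 0 && j > 0
        && t ! (i, j) == t ! (i - 1, j - 1) + subst (xa ! i) (ya ! j)
          = go (i - 1) (j - 1) (xa ! i : ax) (ya ! j : ay)
      | i > 0 && t ! (i, j) == t ! (i - 1, j) + gap
          = go (i - 1) j (xa ! i : ax) ('-' : ay)
      | otherwise
          = go i (j - 1) ('-' : ax) (ya ! j : ay)

-- Score plus one optimal alignment, with '-' marking gaps.
align :: String -> String -> (Int, (String, String))
align xs ys = (t ! (n, m), backtrack xa ya t)
  where
    n  = length xs
    m  = length ys
    xa = listArray (1, n) xs
    ya = listArray (1, m) ys
    t  = forward xa ya
```

Ties are broken in favour of the diagonal step, then a gap in the second
sequence. A different tie-breaking order gives a different, equally
scoring alignment.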

The order of tasks above should allow you to stop at any point and have
something to show that would be useful later on. Basically, if you
write all of that and stop after the vector-fusion operations, you come
close to C code in terms of performance.
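
As a rough illustration of the unboxed-array step, here is a sketch of a
local alignment score (Smith-Waterman style) that fills an unboxed table
in the ST monad, using only the array package. The scoring values
(match +2, mismatch -1, gap -1) and the names localTable and localScore
are assumptions for this example. Vector-based fusion is not shown, and
no performance measurements are claimed for this code.

```haskell
import Control.Monad (forM_)
import Data.Array.ST (newArray, readArray, writeArray, runSTUArray)
import Data.Array.Unboxed (UArray, elems, listArray, (!))

-- Score table for local alignment, stored unboxed.
-- Hypothetical scoring: match +2, mismatch -1, gap -1.
localTable :: String -> String -> UArray (Int, Int) Int
localTable xs ys = runSTUArray $ do
    -- Row 0 and column 0 stay at 0: a local alignment may start anywhere.
    t <- newArray ((0, 0), (n, m)) 0
    forM_ [1 .. n] $ \i ->
      forM_ [1 .. m] $ \j -> do
        d <- readArray t (i - 1, j - 1)
        u <- readArray t (i - 1, j)
        l <- readArray t (i, j - 1)
        let sc = if xa ! i == ya ! j then 2 else -1
        -- The 0 lets an alignment restart instead of going negative.
        writeArray t (i, j) (maximum [0, d + sc, u - 1, l - 1])
    return t
  where
    n = length xs
    m = length ys
    xa, ya :: UArray Int Char
    xa = listArray (1, n) xs
    ya = listArray (1, m) ys

-- The best local alignment score is the largest cell in the table.
localScore :: String -> String -> Int
localScore xs ys = maximum (elems (localTable xs ys))
```

Compared with a lazy boxed table, the cells here are plain machine
integers written in an explicit row-by-row order, which is the kind of
code the "unboxed arrays" task refers to.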

Anyway, start with global alignment: you can basically read a book
chapter on that on day 1, then code it using arrays in the 1-2 days
thereafter (depending on what you know about Haskell).

Gruss,
Christian


_______________________________________________
Biohaskell mailing list
[email protected]
http://malde.org/cgi-bin/mailman/listinfo/biohaskell
