Re: [Biohaskell] Hoping to contribute to BioHaskell for a student project

Kenneth Lui Thu, 20 Oct 2011 03:08:16 -0700

Global alignment does sound like an interesting subject!

I have a few questions:
1) Does biohaskell already have any global alignment code? If so, in which
package?
2) From the biohaskell wiki, I found the following information about
alignment: "Supported alignment formats: ACE, BlastXML, PSL (Blat), Bowtie,
Soap, GFF3, BED." and "The various alignment formats (Blast, PSL, ACE)
should be standardized and better integrated." How do these relate to the
algorithm that I should implement? E.g., what input/output formats should
it support?
3) Wikipedia mentions the Needleman–Wunsch algorithm. Is that what I am
supposed to implement?
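
For reference, here is a minimal sketch of what Needleman–Wunsch global
alignment could look like in Haskell, using a lazy Data.Array table for the
forward pass and a separate backtracking step. The scoring values (+1 match,
-1 mismatch, -1 gap) and the names nwTable, nwScore and nwAlign are
assumptions made for illustration; none of this is existing biohaskell code.

```haskell
import Data.Array

-- Hypothetical scoring scheme (an assumption, not taken from biohaskell).
matchScore, mismatchScore, gapScore :: Int
matchScore    = 1
mismatchScore = -1
gapScore      = -1

-- Score for aligning two residues against each other.
subst :: Char -> Char -> Int
subst a b = if a == b then matchScore else mismatchScore

-- 1-based array view of a sequence.
toArr :: String -> Array Int Char
toArr s = listArray (1, length s) s

-- Forward pass: build the (n+1) x (m+1) score table. The boxed array is
-- lazy, so each cell may refer to its neighbours and is computed on demand.
nwTable :: String -> String -> Array (Int, Int) Int
nwTable xs ys = table
  where
    n  = length xs
    m  = length ys
    xa = toArr xs
    ya = toArr ys
    table = array ((0, 0), (n, m))
      [ ((i, j), cell i j) | i <- [0 .. n], j <- [0 .. m] ]
    cell 0 j = j * gapScore
    cell i 0 = i * gapScore
    cell i j = maximum
      [ table ! (i - 1, j - 1) + subst (xa ! i) (ya ! j)  -- match/mismatch
      , table ! (i - 1, j) + gapScore                     -- gap in ys
      , table ! (i, j - 1) + gapScore                     -- gap in xs
      ]

-- Optimal global alignment score: the bottom-right cell of the table.
nwScore :: String -> String -> Int
nwScore xs ys = nwTable xs ys ! (length xs, length ys)

-- Backtracking: walk from the bottom-right cell back to the origin,
-- checking which move reproduces each cell's score. Returns one optimal
-- alignment, with '-' marking gaps.
nwAlign :: String -> String -> (String, String)
nwAlign xs ys = go (length xs) (length ys) [] []
  where
    table = nwTable xs ys
    xa = toArr xs
    ya = toArr ys
    go 0 0 ax ay = (ax, ay)
    go i j ax ay
      | i > 0 && j > 0
        && here == table ! (i - 1, j - 1) + subst (xa ! i) (ya ! j)
          = go (i - 1) (j - 1) (xa ! i : ax) (ya ! j : ay)
      | i > 0 && here == table ! (i - 1, j) + gapScore
          = go (i - 1) j (xa ! i : ax) ('-' : ay)
      | otherwise
          = go i (j - 1) ('-' : ax) (ya ! j : ay)
      where here = table ! (i, j)
```

With this scoring, nwAlign "AC" "AGC" yields ("A-C", "AGC") with score 1.
The boxed lazy array keeps the recurrence short; an unboxed array or
vector-based version would need an explicit fill order instead.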


As for algebraic dynamic programming, a professor of mine has mentioned it
as well; hopefully I can dive further into it.

Thanks Christian for all the info!

Cheers,
Kenneth

On Thu, Oct 20, 2011 at 02:32, Christian Höner zu Siederdissen <
[email protected]> wrote:

> * Ketil Malde <[email protected]> [20.10.2011 09:54]:
> >
> > One important piece of functionality that could be ripped from biolib and
> > made into a separate library, is the BLAST output parser¹.  This could
> > also do with some cleanup, and would make a nice, standalone project.
> > It's also fairly open-ended.  If you're more interested in algorithms,
> > there's some stuff for sequence alignments that I was never quite
> > satisfied with.
> >
> > ¹ Christian, didn't you do something on this?
>
> Yeah, I completely forgot about that. The bits and pieces I have, once I
> find them again ;-), are iteratee-code, however. Not the best thing to
> start Haskell with. On the other hand, once you understand that stuff
> you know a lot of high-level Haskell in addition to how to make Haskell
> fast...
>
> ==
>
> As a student project, the second idea on sequence alignments seems to be
> more fun, though. And it would be useful on its own. The sequence
> alignment stuff can be done in a month as the algorithms are not that
> complicated and you mostly just need to know Haskell arrays.
>
> These are possible tasks:
>
> - global alignment
> - (backtracking)
> - local alignment
> - high-performance code
>  - unboxed arrays
>  - vector-based fusion operations
>
> I put backtracking in brackets as there are two interesting ways to do
> alignments: have a forward pass calculating scores and then find out via
> backtracking which alignments produce this score, or use something like
> "algebraic dynamic programming" (Giegerich et al) to do it all in
> one pass.
>
> The order of tasks above should allow you to stop at any point and have
> something to show that would be useful later on. Basically, if you
> write that and stop after vector-fusion operations, you come close to
> C-code in terms of performance.
>
> Anyway, start with global alignment: you can basically read a book
> chapter on that on day 1 and code it using arrays in 1-2 days thereafter
> (depending on what you know about Haskell).
>
> Gruss,
> Christian
>



-- 
*Kenneth Lui*

_______________________________________________
Biohaskell mailing list
[email protected]
http://malde.org/cgi-bin/mailman/listinfo/biohaskell
