Hi Sébastien
Do you have any general advice on using Illumina mate-pair (MP) reads with Ray
assemblies? In our experience, MP data has a number of issues beyond those we
see in paired-end (PE) data that need attention, for example:
1. The MP junction can fall within one of the two mate ends, making that read
   chimeric (the frequency is linked to read length and fragment size).
2. MP reads can have high over-read rates and correspondingly limited
   diversity.
3. False mates can make up around 20% of a library. These usually turn out to
   be PE in orientation and almost end to end in genomic origin.
4. Some mates seem to be formed by the synthesis of several loops, meaning the
   two ends come from quite different genomic locations (hopefully this is
   rare).
I think the first and third problems can mostly be addressed by pre-filtering
reads against a contig assembly. But how proactive do you think we have to be
in addressing these (and, I guess, other) technical problems through read
pre-filtering? Our work with SOAP suggests it is relatively sensitive to the
percentage of poorly constructed mates. Do you think we can rely on Ray to
compensate for these types of read errors, or is it simply GIGO?
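
To make the pre-filtering idea concrete, here is a minimal sketch in Python of
how pairs might be sorted once both ends have been aligned to contigs. It only
looks at orientation and span: inward-facing pairs with a short span are
treated as PE-like false mates, outward-facing pairs as MP-like. The `Hit`
record, the `classify_pair` function, the label names, and the 1000 bp
`max_pe_span` cutoff are all assumptions for illustration; none of them come
from Ray or SOAP, and the cutoff would need tuning per library.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical record for one aligned read end."""
    contig: str
    pos: int      # leftmost aligned base on the contig, 0-based
    strand: str   # '+' or '-'
    length: int   # aligned read length in bases


def classify_pair(a: Hit, b: Hit, max_pe_span: int = 1000) -> str:
    """Label a read pair by relative orientation and outer span.

    max_pe_span is an assumed cutoff, not a recommended value.
    """
    if a.contig != b.contig:
        return "different_contigs"

    # Order the two ends by position on the contig.
    left, right = sorted((a, b), key=lambda h: h.pos)
    span = right.pos + right.length - left.pos

    if left.strand == "+" and right.strand == "-":
        # Inward-facing: typical of PE contamination when the span is short.
        return "pe_like" if span <= max_pe_span else "inward_long"
    if left.strand == "-" and right.strand == "+":
        # Outward-facing: the orientation expected from true mate pairs.
        return "mp_like"
    return "same_strand"


if __name__ == "__main__":
    print(classify_pair(Hit("c1", 100, "+", 100), Hit("c1", 350, "-", 100)))
    print(classify_pair(Hit("c1", 100, "-", 100), Hit("c1", 3100, "+", 100)))
```

Pairs labelled `pe_like` could then be dropped or set aside as a short-insert
library, though this only works for pairs whose ends both land on one contig.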
It would be good to hear about general approaches to coping with MP read error
characteristics (biochemical/bioinformatic) if anyone is willing to share.
Adrian
Adrian Platts
McGill
_______________________________________________
Denovoassembler-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users