Hi Sébastien

Do you have any general advice on using mate pair (MP) reads (Illumina) with 
Ray assemblies?  From our experience, MP data has a number of specific issues, 
beyond those we find in paired-end (PE) data, that need attention, 
for example:

        The MP junction can fall within one of the two mate ends, making 
that read chimeric (the frequency is linked to read length/fragment size)

        MP reads can have high over-read rates and, with them, limited 
diversity

        False mates can make up around 20% of a library; these usually turn 
out to be PE in orientation and to lie almost end to end in the genome

        Some mates seem to be formed by the synthesis of several loops, meaning 
the two ends come from quite different genomic locations (hopefully this is 
rare)
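
As a rough illustration of the over-read point: assuming "over-read" means 
the same fragment being sequenced repeatedly, one crude estimate is the 
fraction of read pairs that are exact duplicates of an earlier pair.  This is 
only a sketch; the function name is made up for illustration, and exact string 
matching ignores sequencing errors, so it will tend to underestimate the rate.

```python
from collections import Counter


def duplication_rate(pairs):
    """Fraction of read pairs that exactly repeat an earlier pair.

    `pairs` is an iterable of (read1_sequence, read2_sequence) tuples.
    Hypothetical helper for illustration, not part of Ray or any other tool.
    """
    counts = Counter(pairs)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Unique pairs / total pairs is a simple proxy for library diversity.
    return 1.0 - len(counts) / total


example = [
    ("ACGT", "TTGA"),
    ("ACGT", "TTGA"),  # exact duplicate of the first pair
    ("GGCC", "AATT"),
    ("ACGT", "CCCC"),
]
rate = duplication_rate(example)
```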

The first and third problems can, I think, mostly be addressed by 
pre-filtering reads against a contig assembly.  But how proactive do you think 
we have to be in addressing these (and, I guess, other) technical problems 
through read pre-filtering?  Our work with SOAP suggests it is relatively 
sensitive to the percentage of poorly constructed mates.  Do you think we can 
rely on Ray to compensate for these types of read errors, or is it simply GIGO? 
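
To make the pre-filtering idea concrete, here is a minimal sketch, assuming 
both reads of each pair have already been aligned to contigs by some mapper. 
It assumes genuine mates align outward-facing (RF) at roughly the library's 
expected span, while PE-type false mates align inward-facing (FR) and close 
together.  The Aln record, classify_pair, and all thresholds are hypothetical 
and would need tuning per library; chimeric reads are not handled here.

```python
from dataclasses import dataclass


@dataclass
class Aln:
    """Minimal alignment record (hypothetical, for illustration only)."""
    contig: str
    pos: int        # 0-based leftmost aligned position on the contig
    length: int     # aligned length on the contig
    reverse: bool   # True if the read aligned to the reverse strand


def classify_pair(a, b, pe_max=600, mp_min=1500, mp_max=6000):
    """Label a read pair by orientation and span on a single contig.

    The thresholds are assumed values, not recommendations.
    """
    if a.contig != b.contig:
        return "different_contigs"
    left, right = (a, b) if a.pos <= b.pos else (b, a)
    # Distance from the start of the left read to the end of the right read.
    span = right.pos + right.length - left.pos
    if left.reverse == right.reverse:
        return "same_strand"
    # Left read forward and right read reverse means the reads face inward.
    orientation = "FR" if not left.reverse else "RF"
    if orientation == "FR" and span <= pe_max:
        return "pe_contaminant"
    if orientation == "RF" and mp_min <= span <= mp_max:
        return "mate_pair"
    return "unexpected"


false_mate = classify_pair(Aln("c1", 1000, 100, False), Aln("c1", 1250, 100, True))
true_mate = classify_pair(Aln("c1", 1000, 100, True), Aln("c1", 4000, 100, False))
```

Pairs labelled pe_contaminant could be dropped before assembly; what to do 
with the different_contigs and unexpected classes is a separate decision.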

It would be good to hear about general approaches (biochemical or 
bioinformatic) to coping with MP read error characteristics, if anyone is 
willing to share.

Adrian

Adrian Platts
McGill
        
_______________________________________________
Denovoassembler-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
