Hi,

For TopHat, I need to set the "-r" parameter, i.e. the mate inner distance 
(rather than the insert size), which is insert size - 2 * read length.
So I mapped my paired-end reads (read length = 100) to the reference 
genome with bowtie2, then used Picard's CollectInsertSizeMetrics.jar 
tool to estimate the insert size, in order to set the mate inner 
distance parameter in TopHat (-r). This is the output of the Picard 
tool:

MEDIAN_INSERT_SIZE          163
MEDIAN_ABSOLUTE_DEVIATION   38
MIN_INSERT_SIZE             82
MAX_INSERT_SIZE             269356442
MEAN_INSERT_SIZE            180.122413
STANDARD_DEVIATION          69.468016
READ_PAIRS                  66668352
PAIR_ORIENTATION            FR
WIDTH_OF_10_PERCENT         15
WIDTH_OF_20_PERCENT         31
WIDTH_OF_30_PERCENT         45
WIDTH_OF_40_PERCENT         61
WIDTH_OF_50_PERCENT         77
WIDTH_OF_60_PERCENT         95
WIDTH_OF_70_PERCENT         113
WIDTH_OF_80_PERCENT         135
WIDTH_OF_90_PERCENT         247
WIDTH_OF_99_PERCENT         3777
SAMPLE, LIBRARY, READ_GROUP (empty)
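
As a quick check of the formula against the Picard numbers above, here is a 
small shell sketch. The helper name inner_distance is made up for 
illustration and is not part of TopHat or Picard; it only applies 
insert size - 2 * read length, with the read length of 100 stated above.

```shell
# Hypothetical helper: inner distance = insert size - 2 * read length.
inner_distance() {
    # $1 = insert size (may be fractional), $2 = read length
    awk -v ins="$1" -v len="$2" 'BEGIN { printf "%.0f\n", ins - 2 * len }'
}

inner_distance 163 100          # median insert size from Picard, prints -37
inner_distance 180.122413 100   # mean insert size from Picard, prints -20
```

Both values come out negative, which would mean that the two mates overlap 
on average. The result is rounded because, as far as I know, -r expects an 
integer.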

Now, if I use another method to compute the mean insert size and standard 
deviation directly from the BAM file:

samtools view -F 0x4 File.mapped.bam |
  awk '$9 > 0 { sum += $9; sumsq += $9 * $9; N += 1 }
       END { print "mean = " sum/N " SD=" sqrt(sumsq/N - (sum/N)^2) }'

I obtain: mean = 6486.58 SD=806658

With the second method, the mean insert size is vastly different from the 
one obtained with the Picard tool.
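
One thing that may explain part of the gap: the MAX_INSERT_SIZE reported 
above is 269356442, so a few extreme template lengths can dominate a plain 
mean and SD. My understanding, which should be checked against the Picard 
documentation, is that CollectInsertSizeMetrics trims the distribution 
around the median (its DEVIATIONS option) before computing the mean and SD. 
The sketch below does a similar trimming on the TLEN values. The function 
name trimmed_insert_stats and the cutoff of 10 median absolute deviations 
are assumptions for illustration, not Picard's exact procedure.

```shell
# Hypothetical helper: reads one positive TLEN value per line on stdin and
# prints median, MAD, and the mean/SD of values <= median + devs * MAD.
trimmed_insert_stats() {
    devs="${1:-10}"                 # cutoff in MADs (assumed default: 10)
    tmp=$(mktemp) || return 1
    sort -n > "$tmp"                # sorted copy of the input values

    # awk program that prints the median of already sorted input
    med_prog='{ v[NR] = $1 }
              END { if (NR % 2) print v[(NR + 1) / 2];
                    else print (v[NR / 2] + v[NR / 2 + 1]) / 2 }'

    median=$(awk "$med_prog" "$tmp")
    # MAD = median of the absolute deviations from the median
    mad=$(awk -v m="$median" '{ d = $1 - m; if (d < 0) d = -d; print d }' "$tmp" |
          sort -n | awk "$med_prog")

    # Mean and population SD over the values that survive the cutoff
    awk -v m="$median" -v mad="$mad" -v k="$devs" '
        $1 <= m + k * mad { s += $1; ss += $1 * $1; n++ }
        END {
            if (n == 0) exit 1
            mean = s / n
            var = ss / n - mean * mean
            if (var < 0) var = 0    # guard against rounding error
            printf "median=%s mad=%s n=%d mean=%.2f sd=%.2f\n",
                   m, mad, n, mean, sqrt(var)
        }' "$tmp"
    status=$?
    rm -f "$tmp"
    return $status
}

# Possible usage on the BAM file from this post (not run here):
# samtools view -F 0x4 File.mapped.bam | awk '$9 > 0 { print $9 }' |
#     trimmed_insert_stats 10
```

If the trimmed mean lands close to Picard's 180, the difference would come 
from the outliers rather than from a mistake in the awk command itself.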
Do you think I might be doing something wrong here? Is there another 
tool to compute the correct mate inner distance?

Thank you.

_______________________________________________
Samtools-help mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/samtools-help
