Hi!
Peter Huang wrote:
> Hi Ray group,
>
> I am using Ray to assemble our sequencing data. As some of the reads were
> mis-assembled onto our final scaffolds and we have many low-coverage
> contigs hanging around, I am curious whether there is a flag to eliminate
> contigs with low coverage, such as 5 or 10. (I know Ray has a flag to set
> the limit length of a contig.)
I just added this option:
-use-minimum-seed-coverage minimumSeedCoverageDepth
Sets the minimum seed coverage depth.
Any path with a coverage depth lower than this will be
discarded. The default is 0.
Example: -use-minimum-seed-coverage 40
You will need to install Ray (and RayPlatform) from the git repository.
The changes:
MANUAL_PAGE.txt | 4 ++++
code/application_core/Parameters.cpp | 5 ++++-
code/plugin_SeedingData/SeedingData.cpp | 13 ++++++++++++-
code/plugin_SeedingData/SeedingData.h | 6 ++++++
4 files changed, 26 insertions(+), 2 deletions(-)
This option will also be available in Ray v2.1.0, which should ship around
mid-September 2012.
Also, Ray creates a file containing metadata for each contig. You can filter
on the column 'Mode k-mer coverage depth':
[boiseb01@ls30 RayKmerSearchDevel]$ head TestX/BiologicalAbundances/_DeNovoAssembly/Contigs.tsv
#Contig name    K-mer length  Length in k-mers  Colored k-mers  Proportion  Mode k-mer coverage depth  K-mer observations  Total     Proportion
contig-0        21            9859              0               0           30                         295770              60497522  0.00488896
contig-15       21            6874              0               0           28                         192472              60497522  0.00318149
contig-16       21            3353              0               0           31                         103943              60497522  0.00171814
contig-14       21            8809              0               0           32                         281888              60497522  0.0046595
contig-1000000  21            139               0               0           88                         12232               60497522  0.00020219
contig-1000015  21            558               0               0           58                         32364               60497522  0.000534964
contig-3        21            9297              0               0           29                         269613              60497522  0.0044566
contig-7        21            9644              0               0           30                         289320              60497522  0.00478234
contig-27       21            12701             0               0           30                         381030              60497522  0.00629827
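If you prefer to filter after the assembly, something like the following Python sketch could select contigs by that column. It is only an illustration: the function name contigs_above is made up, and it assumes that Contigs.tsv is tab-separated with a single header line starting with '#', which is what the output above suggests. The sample rows are copied from that output.

```python
import csv
import io

# Column name taken from the Contigs.tsv header shown above.
DEPTH_COLUMN = "Mode k-mer coverage depth"


def contigs_above(tsv_text, minimum_depth):
    """Return names of contigs whose mode k-mer coverage depth is >= minimum_depth.

    Assumption: the text is tab-separated and its first line is the header,
    prefixed with '#'.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    header[0] = header[0].lstrip("#")
    depth_index = header.index(DEPTH_COLUMN)
    kept = []
    for row in reader:
        if not row:  # skip blank lines
            continue
        if int(row[depth_index]) >= minimum_depth:
            kept.append(row[0])
    return kept


# Three rows copied from the output above, re-joined with tabs.
SAMPLE = (
    "#Contig name\tK-mer length\tLength in k-mers\tColored k-mers\tProportion\t"
    "Mode k-mer coverage depth\tK-mer observations\tTotal\tProportion\n"
    "contig-0\t21\t9859\t0\t0\t30\t295770\t60497522\t0.00488896\n"
    "contig-15\t21\t6874\t0\t0\t28\t192472\t60497522\t0.00318149\n"
    "contig-1000000\t21\t139\t0\t0\t88\t12232\t60497522\t0.00020219\n"
)

if __name__ == "__main__":
    # Keep contigs with a mode k-mer coverage depth of at least 30.
    print(contigs_above(SAMPLE, 30))
```

For a real run, read the contents of Contigs.tsv from your own output directory and pass them in place of SAMPLE.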
>In addition, is there a cut off for kmer as well, so that low kmer coverage
>will be eliminated at early stage of assembly?
>
This:
-use-minimum-seed-coverage minimumSeedCoverageDepth
Sets the minimum seed coverage depth.
Any path with a coverage depth lower than this will be
discarded. The default is 0.
(not available in v2.0.0, see above).
There is also the following option, which discards seeds whose coverage depth
is too high:
-use-maximum-seed-coverage
Ignores any seed with a coverage depth above this threshold.
The default is 4294967295.
If the problem is with memory usage caused by erroneous k-mers, you can
increase the number of
bits in the Bloom filter:
-bloom-filter-bits bits
Sets the number of bits for the Bloom filter
Default is 268435456 bits, 0 bits disables the Bloom filter.
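To get a feel for what a given value means in memory, here is a small Python sketch that converts a bit count to mebibytes. The helper name bits_to_mebibytes is made up for illustration, and the calculation only covers the raw bit array; whether that amount applies per process or in total is not stated here.

```python
# Default quoted in the option description above (equal to 2**28).
DEFAULT_BLOOM_FILTER_BITS = 268435456


def bits_to_mebibytes(bits):
    """Convert a number of bits to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return bits / 8 / (1024 * 1024)


if __name__ == "__main__":
    # Memory taken by the bit array at the default size, and at twice that size.
    print(bits_to_mebibytes(DEFAULT_BLOOM_FILTER_BITS))
    print(bits_to_mebibytes(2 * DEFAULT_BLOOM_FILTER_BITS))
```

So the default corresponds to a 32 MiB bit array, and doubling the value passed to -bloom-filter-bits doubles that.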
This option was added recently, so you will need to install from the git
repository.
There are other useful new options for tuning the distributed in-memory
storage engine; see MANUAL_PAGE.txt.
Sébastien Boisvert
> Thanks.
>
> Best,
>
> Peter
_______________________________________________
Denovoassembler-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users