I'm now a bit stuck on my colour-space Kmers/Reads modifications,
sorry. I don't think I can progress further until I understand more
about the run sequence, so that I can work out where it's going wrong.
For now I'm just trying to get the program to assemble and output in
base-space (although the hash ID is calculated from the colour-space
sequence), but even that isn't working.
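
For context, here is a minimal Python sketch of the standard two-base
(SOLiD-style) colour-space encoding, where each colour is the XOR of the
2-bit codes of adjacent bases (A=0, C=1, G=2, T=3). It is illustrative
only and is not the code in Ray or in my branch; the function names are
hypothetical.

```python
# Illustrative sketch only: standard two-base colour-space encoding.
# Function names are hypothetical and do not come from the Ray source.
BASES = "ACGT"
CODE = {base: index for index, base in enumerate(BASES)}


def encode_colour_space(seq):
    """Return (first_base, colours) for a base-space sequence."""
    codes = [CODE[base] for base in seq]
    # Each colour is the XOR of the codes of two adjacent bases.
    colours = "".join(str(a ^ b) for a, b in zip(codes, codes[1:]))
    return seq[0], colours


def decode_colour_space(first_base, colours):
    """Rebuild the base-space sequence from a first base and its colours."""
    current = CODE[first_base]
    out = [first_base]
    for colour in colours:
        # XOR-ing the previous base code with the colour gives the next base.
        current ^= int(colour)
        out.append(BASES[current])
    return "".join(out)
```

Decoding needs a known first base: the colours alone are consistent
with four different base-space sequences, so output in base-space
depends on that anchor being carried along with the k-mer.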
Here's what I think are the important bits of the Ray output:
$ ../code/Ray --debug-seeds -s phix_1.fasta | grep -e complete -e workers
Rank 0 has 1000 sequence reads (completed)
Rank 0 is counting k-mers in sequence reads [1000/1000] (completed)
Rank 0 has 10434 k-mers (completed)
Rank 0 is computing vertices & edges [1000/1000] (completed)
Rank 0 has 10200 vertices (completed)
Rank 0 is purging edges [10200/10200] (completed)
Rank 0: peak number of workers: 500, maximum: 30000
Rank 0 is selecting optimal read markers [1000/1000] (completed)
Rank 0: peak number of workers: 500, maximum: 30000
Rank 0 is creating seeds [10200/10200] (completed)
Rank 0: peak number of workers: 500, maximum: 30000
Rank 0 is calculating library lengths [0/0] (completed)
Rank 0: peak number of workers: 0, maximum: 30000
Rank 0 is extending seeds [0/0] (completed)
Rank 0 is computing fusions [0/0] (completed)
Rank 0 is distributing fusions [0/0] (completed)
Rank 0 is finishing fusions [0/0] (completed)
Rank 0 is distributing fusions [0/0] (completed)
Rank 0 is computing fusions [0/0] (completed)
Rank 0 is distributing fusions [0/0] (completed)
Rank 0 is finishing fusions [0/0] (completed)
Rank 0 is distributing fusions [0/0] (completed)
Output file list:
0 Jul 12 16:40 RayOutput.ContigLengths.txt
0 Jul 12 16:40 RayOutput.Contigs.fasta
305 Jul 12 16:40 RayOutput.CoverageDistributionAnalysis.txt
100 Jul 12 16:40 RayOutput.CoverageDistribution.txt
578 Jul 12 16:40 RayOutput.degreeDistribution.txt
88 Jul 12 16:40 RayOutput.LibraryStatistics.txt
5.5K Jul 12 16:40 RayOutput.MessagePassingInterface.txt
177 Jul 12 16:40 RayOutput.NetworkTest.txt
237 Jul 12 16:40 RayOutput.OutputNumbers.txt
67 Jul 12 16:40 RayOutput.RayCommand.txt
25 Jul 12 16:40 RayOutput.RayVersion.txt
0 Jul 12 16:40 RayOutput.ScaffoldComponents.txt
0 Jul 12 16:40 RayOutput.ScaffoldLengths.txt
0 Jul 12 16:40 RayOutput.ScaffoldLinks.txt
0 Jul 12 16:40 RayOutput.Scaffolds.fasta
0 Jul 12 16:40 RayOutput.SeedLengthDistribution.txt
The coverage distribution file (and the coverage analysis file) now
contains reasonable-looking numbers, after I changed the smoothing
function to be a bit more like a smoothing function
(https://github.com/gringer/ray/commit/f048203e571d4c7a267dace27fec063cfe2059dd,
ignore the 'push_back(0)' bits).
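
By "smoothing function" I mean something along the lines of a moving
average over the coverage counts. The Python sketch below shows that
general idea only; it is an assumption for illustration and not the
code in the commit above, and the function name and window size are
hypothetical.

```python
# Illustrative sketch only: centred moving average over coverage counts.
# Not the function from the linked commit; name and window are hypothetical.
def smooth(counts, window=3):
    """Return the centred moving average of counts, shrinking at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(counts)):
        lo = max(0, i - half)
        hi = min(len(counts), i + half + 1)
        # Average only over the values that actually exist near the edges.
        smoothed.append(sum(counts[lo:hi]) / (hi - lo))
    return smoothed
```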
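However, the code I have doesn't seem to have any success in finding
seeds: the seed-creation step completes, but seed extension reports
[0/0], and the contig, scaffold and seed-length output files are all
empty.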
Oh, and FWIW, all the unit tests are passing [except for my new Ray
assembly run system test].
Any ideas on where I should look?
Thanks,
David Eccles (gringer)
_______________________________________________
Denovoassembler-users mailing list
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users