I'm now a bit stuck on my colour-space Kmers/Reads modifications, 
sorry. I don't think I can make further progress until I understand 
more about the run sequence, so that I can work out where it's going 
wrong. For now I'm only trying to get the program to assemble and output 
in base-space (although the hash ID is calculated from the colour-space 
sequence), but even that isn't working.
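
For reference, the colour-space / base-space relationship can be sketched as below. This is a minimal illustration in Python rather than Ray's C++, it is not taken from the Ray source, and the function names are hypothetical. It assumes the standard SOLiD two-base encoding, where each colour is the XOR of the two adjacent base codes.

```python
# Assumption: standard SOLiD two-base encoding with A=0, C=1, G=2, T=3.
# The colour between two adjacent bases is the XOR of their codes.
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def encode_colours(bases):
    """Base-space sequence -> list of colours (one fewer than bases)."""
    codes = [BASE_TO_CODE[b] for b in bases]
    return [a ^ b for a, b in zip(codes, codes[1:])]


def decode_colours(first_base, colours):
    """Known first base plus colours -> base-space sequence."""
    code = BASE_TO_CODE[first_base]
    out = [first_base]
    for colour in colours:
        code ^= colour  # XOR with the colour gives the next base code
        out.append(CODE_TO_BASE[code])
    return "".join(out)


def reverse_complement(bases):
    """Ordinary base-space reverse complement."""
    return "".join(COMPLEMENT[b] for b in reversed(bases))
```

One property that follows from this encoding: the reverse complement of a sequence has the same colours in reverse order, with no complementing, so `encode_colours(reverse_complement(s))` equals `encode_colours(s)[::-1]`.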

Here's what I think are the important bits of the Ray output:

$ ../code/Ray --debug-seeds -s phix_1.fasta | grep -e complete -e workers
Rank 0 has 1000 sequence reads (completed)
Rank 0 is counting k-mers in sequence reads [1000/1000] (completed)
Rank 0 has 10434 k-mers (completed)
Rank 0 is computing vertices & edges [1000/1000] (completed)
Rank 0 has 10200 vertices (completed)
Rank 0 is purging edges [10200/10200] (completed)
Rank 0: peak number of workers: 500, maximum: 30000
Rank 0 is selecting optimal read markers [1000/1000] (completed)
Rank 0: peak number of workers: 500, maximum: 30000
Rank 0 is creating seeds [10200/10200] (completed)
Rank 0: peak number of workers: 500, maximum: 30000
Rank 0 is calculating library lengths [0/0] (completed)
Rank 0: peak number of workers: 0, maximum: 30000
Rank 0 is extending seeds [0/0] (completed)
Rank 0 is computing fusions [0/0] (completed)
Rank 0 is distributing fusions [0/0] (completed)
Rank 0 is finishing fusions [0/0] (completed)
Rank 0 is distributing fusions [0/0] (completed)
Rank 0 is computing fusions [0/0] (completed)
Rank 0 is distributing fusions [0/0] (completed)
Rank 0 is finishing fusions [0/0] (completed)
Rank 0 is distributing fusions [0/0] (completed)
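
To narrow down where the run goes empty, the progress lines above can be scanned for stages that had nothing to process. The following is a minimal sketch in Python; the regular expression is written against the line format shown in this output only, and the function name is hypothetical.

```python
import re

# Matches lines such as:
#   Rank 0 is extending seeds [0/0] (completed)
PROGRESS = re.compile(r"^Rank (\d+) is (.+?) \[(\d+)/(\d+)\] \(completed\)$")


def empty_stages(log_lines):
    """Return stage names whose total item count was 0, in first-seen order."""
    seen = []
    for line in log_lines:
        match = PROGRESS.match(line.strip())
        if match and int(match.group(4)) == 0 and match.group(2) not in seen:
            seen.append(match.group(2))
    return seen
```

Applied to the output above, "creating seeds" is not reported (it ran over all 10200 vertices), while "extending seeds" is, which fits with no seeds having been produced.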

Output file list:
     0 Jul 12 16:40 RayOutput.ContigLengths.txt
     0 Jul 12 16:40 RayOutput.Contigs.fasta
   305 Jul 12 16:40 RayOutput.CoverageDistributionAnalysis.txt
   100 Jul 12 16:40 RayOutput.CoverageDistribution.txt
   578 Jul 12 16:40 RayOutput.degreeDistribution.txt
    88 Jul 12 16:40 RayOutput.LibraryStatistics.txt
  5.5K Jul 12 16:40 RayOutput.MessagePassingInterface.txt
   177 Jul 12 16:40 RayOutput.NetworkTest.txt
   237 Jul 12 16:40 RayOutput.OutputNumbers.txt
    67 Jul 12 16:40 RayOutput.RayCommand.txt
    25 Jul 12 16:40 RayOutput.RayVersion.txt
     0 Jul 12 16:40 RayOutput.ScaffoldComponents.txt
     0 Jul 12 16:40 RayOutput.ScaffoldLengths.txt
     0 Jul 12 16:40 RayOutput.ScaffoldLinks.txt
     0 Jul 12 16:40 RayOutput.Scaffolds.fasta
     0 Jul 12 16:40 RayOutput.SeedLengthDistribution.txt

It's produced reasonable looking numbers in the Coverage distribution 
file (and coverage analysis file) after I changed the smoothing function 
to be a bit more like a smoothing function 
(https://github.com/gringer/ray/commit/f048203e571d4c7a267dace27fec063cfe2059dd, 
ignore the 'push_back(0)' bits).
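
As an illustration of what "more like a smoothing function" could mean, here is a minimal centred moving average over a list of counts. It is a sketch in Python under that assumption only; it is not the code in the commit above, and the function name and `window` parameter are hypothetical.

```python
def smooth(counts, window=3):
    """Centred moving average; the window is truncated at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(counts)):
        lo = max(0, i - half)
        hi = min(len(counts), i + half + 1)
        # Average only over the positions that actually exist.
        smoothed.append(sum(counts[lo:hi]) / (hi - lo))
    return smoothed
```

Truncating the window at the ends avoids padding the distribution with artificial zeros, which would otherwise drag the first and last values down.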

However, my code doesn't find any seeds: the seed length distribution 
file is empty, and the seed extension step reports [0/0].

Oh, and FWIW, all the unit tests are passing [except for my new Ray 
assembly run system test].

Any ideas on where I should look?

Thanks,

David Eccles (gringer)

_______________________________________________
Denovoassembler-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
