On Thursday, 7 February 2013 at 14:42:57 UTC, monarch_dodra wrote:
On Thursday, 7 February 2013 at 14:30:11 UTC, bioinfornatics
wrote:
A little feedback:
I named f's script f.d and monarch's script monarch.d.
~ $ gdmd -O -w -release f.d
~ $ time ./f bigFastq.fastq
['T':999786820, 'A':1007129068, 'N':39413, 'C':1350576504,
'G':1353023772]
real 2m14.966s
user 0m47.168s
sys 0m15.379s
~ $ gdmd -O -w -release monarch.d
monarch.d:117: no identifier for declarator Lines
monarch.d:117: alias cannot have initializer
monarch.d:130: identifier or integer expected, not assert
I haven't taken the time to look further,
but in any case it seems memory-mapped files are really slow,
even though they are said to be the fastest way to read a file.
Building an index that needs 12 min to read the file is useless
when reading and computing takes only 2 min.
You must be using dmd 2.060. I'm using some 2.061 features,
namely the "new style alias".
Just change line 117:
alias Lines = typeof(File.init.byLine());
to
alias typeof(File.init.byLine()) Lines;
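For anyone who wants to check the change in isolation, here is a minimal sketch, assuming only that std.stdio is available. The old-style form is the one compiled; the new-style form is left in a comment because dmd 2.060 cannot parse it.

```d
import std.stdio : File;

// Old-style alias: accepted by dmd 2.060 as well as later compilers.
alias typeof(File.init.byLine()) Lines;

// New-style alias, equivalent but only parsed by dmd 2.061 and later:
// alias Lines = typeof(File.init.byLine());

void main()
{
    // Compile-time check that the alias names the range type returned by byLine().
    static assert(is(Lines == typeof(File.init.byLine())));
}
```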
As for 130, it's a "version(assert)", i.e. code that does not get
executed in release. Just remove the "version(assert)"; if the code
gets executed anyway, it is not a big deal.
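A small sketch of what "version(assert)" does, on a compiler that recognises it. The function name assertsEnabled is made up for this illustration and is not part of monarch.d.

```d
import std.stdio : writeln;

/// Hypothetical helper: true when asserts are compiled in (no -release).
bool assertsEnabled()
{
    version (assert)
        return true;   // taken in a normal (non-release) build
    else
        return false;  // taken when built with -release
}

void main()
{
    writeln("asserts enabled: ", assertsEnabled());
}
```

Built without -release this should report true, and with -release false. Dropping the "version(assert)" guard simply means the guarded code always runs.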
In any case, I think the code is mostly "proof", I wouldn't use
it as is.
------------
BTW, I've started working on my library. How would users expect
the "quality" format served? As an array of characters, or as
an array of integrals (ubytes)?
ubyte, since it is a number, is maybe easier to understand and to
cut off at some value.
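To make the ubyte option concrete, a rough sketch, assuming the common Phred+33 encoding (quality character minus 33 gives the score). The names toPhred and passesCutoff are made up here and are not from the library being discussed.

```d
import std.algorithm : all;
import std.stdio : writeln;

/// Hypothetical helper: convert a FASTQ quality line to numeric scores.
/// Assumes Phred+33; other encodings would need a different offset.
ubyte[] toPhred(const(char)[] quality, ubyte offset = 33)
{
    auto scores = new ubyte[](quality.length);
    foreach (i, c; quality)
        scores[i] = cast(ubyte)(c - offset);
    return scores;
}

/// Hypothetical helper: true if every score is at or above the cutoff.
bool passesCutoff(const(ubyte)[] scores, ubyte cutoff)
{
    return scores.all!(s => s >= cutoff);
}

void main()
{
    auto scores = toPhred("II5I");
    writeln(scores);
    writeln(passesCutoff(scores, 20));
}
```

With numeric scores the cutoff is a plain integer comparison; with characters the user would have to remember the offset every time.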