Re: How to read fastly files ( I/O operation)

bioinfornatics Tue, 12 Feb 2013 08:30:41 -0800

On Tuesday, 12 February 2013 at 12:45:26 UTC, monarch_dodra wrote:

On Tuesday, 12 February 2013 at 12:02:59 UTC, bioinfornatics wrote:
Instead of using memcpy, I tried slicing (around line 136):
_hardBuffer[0 .. moveSize] = _hardBuffer[_bufPosition .. moveSize + _bufPosition];
I get the same performance.
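
A side note on the slice copy quoted above: as far as I know, the D language specification treats an array copy between overlapping slices as an error, and non-release builds check for it at run time. Whether the two ranges overlap here depends on whether moveSize is larger than _bufPosition. The sketch below shows one overlap-tolerant way to move the unread tail of a buffer to the front using memmove. The helper name compactFront and the sample data are hypothetical and are not taken from the code under discussion.

```d
import core.stdc.string : memmove;

/// Moves the `moveSize` bytes that start at `pos` to the front of `buf`.
/// memmove is used because source and destination are allowed to overlap.
void compactFront(ubyte[] buf, size_t pos, size_t moveSize)
{
    assert(pos + moveSize <= buf.length);
    memmove(buf.ptr, buf.ptr + pos, moveSize);
}

void main()
{
    ubyte[] buf = [0, 1, 2, 3, 4, 5, 6, 7];
    // Source range 3 .. 8 overlaps destination range 0 .. 5.
    compactFront(buf, 3, 5);
    ubyte[] expected = [3, 4, 5, 6, 7];
    assert(buf[0 .. 5] == expected);
}
```

When the ranges never overlap, the plain slice assignment is fine and arguably clearer.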
I think I figured out why I'm getting different results than you guys are, on my Windows machine.
AFAIK, file reads in Windows are done natively asynchronously.
I wrote a multi-threaded version of the parser, with a thread dedicated to reading the file, while the main thread parses the read buffers.
I'm getting EXACTLY 0% performance improvement. Not better, not worse, just 0%.
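
For readers who want to try the same experiment, here is a minimal sketch of a "one thread reads, the main thread parses" layout built on std.concurrency. It is not the parser discussed in this thread: the names readerThread, countLines and chunkSize are made up for the example, and the "parsing" is just counting newline bytes.

```d
import std.concurrency : ownerTid, receive, send, spawn;
import std.stdio : File;

enum chunkSize = 64 * 1024; // hypothetical buffer size, tune as needed

/// Reader thread: reads the file chunk by chunk and hands each chunk to the owner.
void readerThread(string path)
{
    auto input = File(path, "rb");
    for (;;)
    {
        auto buf = new ubyte[](chunkSize);
        auto got = input.rawRead(buf);
        if (got.length == 0)
            break;
        // The chunk is never touched again in this thread,
        // so handing it over as immutable is safe.
        send(ownerTid, cast(immutable(ubyte)[]) got);
    }
    send(ownerTid, true); // end-of-file marker
}

/// Counts newline bytes in the main thread while the reader keeps the disk busy.
size_t countLines(string path)
{
    spawn(&readerThread, path);
    size_t lines;
    bool done;
    while (!done)
    {
        receive(
            (immutable(ubyte)[] chunk) {
                foreach (b; chunk)
                    if (b == '\n')
                        ++lines;
            },
            (bool eof) { done = true; }
        );
    }
    return lines;
}

void main(string[] args)
{
    import std.stdio : writeln;
    if (args.length > 1)
        writeln(countLines(args[1]), " lines");
}
```

Two caveats for this sketch: the mailbox is unbounded, so a reader that outruns the parser will pile chunks up in memory (std.concurrency has setMaxMailboxSize for that), and a failure in the reader thread is not reported to the owner. If the disk is the bottleneck, as in the measurements quoted here, a layout like this should not be expected to help.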
I'd have to try again on my SSD. Right now, I'm parsing the 6 GB file in 60 seconds, which is the limit of my HDD. As a matter of fact, just *reading* the file takes the EXACT same amount of time as parsing it...
This takes 60 seconds.
//----
    auto input = File(args[1], "rb");
    ubyte[] buffer = new ubyte[](BufferSize);
    do{
        buffer = input.rawRead(buffer);
    }while(buffer.length);
//----

This takes 60 seconds too.
//----
    Parser parser = new Parser(args[1]);
    foreach(q; parser)
        foreach(char c; q.sequence)
            globalNucleic.collect(c);
//----
So at this point, I'd need to test on my Linux box, or publish the code so you can tell me how I'm doing.
I'm still tweaking the code to publish something readable, as there is a lot of sketchy code right now.
I'm also implementing proper exception handling, so that if there is an erroneous entry, an exception is thrown. However, all the erroneous data is parsed out of the file, and placed inside the exception. This means that:
a) You can inspect the erroneous data
b) You can skip the erroneous data, and parse the rest of the file.
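
To make points a) and b) concrete, here is a small sketch of the idea, assuming a toy "name:sequence" line format rather than the real file format. EntryException, Entries and parseAll are hypothetical names invented for this example; the actual parser's types may look quite different.

```d
import std.string : indexOf;

/// Hypothetical exception type that carries the raw text of the bad entry.
class EntryException : Exception
{
    string badData;

    this(string msg, string badData,
         string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
        this.badData = badData;
    }
}

/// Toy parser: an entry is "name:sequence"; anything else is erroneous.
struct Entries
{
    string[] lines;

    bool empty() const { return lines.length == 0; }

    /// Returns the sequence of the next entry. The line is consumed
    /// before validation, so parsing can resume after an exception.
    string next()
    {
        auto line = lines[0];
        lines = lines[1 .. $];
        auto sep = line.indexOf(':');
        if (sep < 0)
            throw new EntryException("malformed entry", line);
        return line[sep + 1 .. $];
    }
}

/// Collects good sequences; bad entries are kept aside instead of aborting.
string[] parseAll(string[] lines, ref string[] rejected)
{
    string[] good;
    auto entries = Entries(lines);
    while (!entries.empty)
    {
        try
        {
            good ~= entries.next();
        }
        catch (EntryException e)
        {
            rejected ~= e.badData; // a) inspect it, b) carry on with the rest
        }
    }
    return good;
}

void main()
{
    string[] rejected;
    auto good = parseAll(["id1:ACGT", "garbage", "id2:TTGA"], rejected);
    assert(good == ["ACGT", "TTGA"]);
    assert(rejected == ["garbage"]);
}
```

The key design point is that the parser advances past the bad entry before throwing, which is what lets the caller keep going.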
Once I deliver the code with the multi-threaded code activated, you should get some better performance on Linux.
When "1.0" is ready, I'll create a github project for it, so work can be done in parallel on it.

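About the threaded version: it is possible to use a get-file-size function to split the file across several threads. Use fseek to jump to the end of each section, read from there to detect where the split should really end, and return that position as the boundary to use.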
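
A rough sketch of that idea, under the simplifying assumption that a chunk may end at any newline: the function chunkOffsets below is hypothetical, and a real FASTQ splitter would additionally have to find a true record start, which a line boundary alone does not guarantee.

```d
import std.stdio : File;

/// Returns nChunks + 1 offsets; chunk i covers [offsets[i], offsets[i + 1]).
/// Each interior offset is pushed forward to just past the next '\n',
/// so that no line is cut in half between two threads.
ulong[] chunkOffsets(string path, size_t nChunks)
{
    assert(nChunks > 0);
    auto f = File(path, "rb");
    immutable ulong size = f.size;
    auto offsets = new ulong[](nChunks + 1); // offsets[0] stays 0
    offsets[nChunks] = size;
    ubyte[1] one;
    foreach (i; 1 .. nChunks)
    {
        ulong pos = size / nChunks * i;  // naive, evenly spaced guess
        if (pos < offsets[i - 1])
            pos = offsets[i - 1];        // keep offsets monotonic
        f.seek(pos);
        // Scan forward until the end of the current line (or end of file).
        while (pos < size)
        {
            auto got = f.rawRead(one[]);
            if (got.length == 0)
                break;
            ++pos;
            if (got[0] == '\n')
                break;
        }
        offsets[i] = pos;
    }
    return offsets;
}

void main()
{
    static import std.file;
    import std.path : buildPath;

    auto path = buildPath(std.file.tempDir(), "chunk_demo.txt");
    std.file.write(path, "aaa\nbbbb\ncc\n");
    scope (exit) std.file.remove(path);

    ulong[] expected = [0, 9, 12];
    assert(chunkOffsets(path, 2) == expected);
}
```

Each worker thread would then open the file itself, seek to its own start offset and stop at the next one. On a single spinning disk, several threads seeking in the same file may well be slower than one sequential read, so this is worth measuring before adopting.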
