On Tuesday, 12 May 2015 at 19:10:13 UTC, Laeeth Isharc wrote:
On Tuesday, 12 May 2015 at 18:14:56 UTC, Gerald Jansen wrote:
On Tuesday, 12 May 2015 at 16:35:23 UTC, Rikki Cattermole wrote:
On 13/05/2015 4:20 a.m., Gerald Jansen wrote:
At the risk of great embarrassment ... here's my program:
http://dekoppel.eu/tmp/pedupg.d

Would it be possible to give us some example data?
I might give it a go to try rewriting it tomorrow.

http://dekoppel.eu/tmp/pedupgLarge.tar.gz (89 Mb)

Contains two largish datasets in a directory structure expected by the program.

I haven't had time to read the code closely. But if you disable the logging, does that change things? If so, how about doing the logging asynchronously in another thread?
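Something along these lines is what I mean by asynchronous logging. It is only a rough sketch using std.concurrency, not taken from pedupg.d: the names loggerThread, logAsync, Stop and Done and the file name pedupg_demo.log are made up for illustration. The worker only sends strings, and a separate thread does the file I/O.

```d
import std.concurrency : spawn, send, receive, receiveOnly, ownerTid;
import std.file : readText;
import std.stdio : File;
import std.string : splitLines;

struct Stop {}   // hypothetical message: ask the logger to finish
struct Done {}   // hypothetical message: logger has closed the file

// Runs in its own thread and writes every string it receives to `path`.
void loggerThread(string path)
{
    auto f = File(path, "w");
    bool running = true;
    while (running)
    {
        receive(
            (string msg) { f.writeln(msg); },
            (Stop _) { running = false; }
        );
    }
    f.close();
    ownerTid.send(Done());
}

// Sends each line to the logger thread, then waits until it has finished.
void logAsync(string path, string[] lines)
{
    auto logger = spawn(&loggerThread, path);
    foreach (line; lines)
        logger.send(line);      // returns immediately; no file I/O here
    logger.send(Stop());
    receiveOnly!Done();         // block until the log file is closed
}

void main()
{
    logAsync("pedupg_demo.log", ["read pedigree", "updated 3 animals"]);
    assert(readText("pedupg_demo.log").splitLines ==
           ["read pedigree", "updated 3 animals"]);
}
```

Whether this helps depends on how much time the program really spends in logging, so it is worth timing a run with logging disabled first.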

And are you compiling with optimization enabled in gdc?

Also try byLineFast, e.g.
http://forum.dlang.org/thread/umkcjntsxchskljyg...@forum.dlang.org#post-20130516144627.000050da:40unknown

I don't know if std.csv's csvReader would be faster than parsing the lines yourself, but it's worth trying.
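A minimal sketch of what the csvReader route might look like. I am assuming space-separated records of three integers purely for illustration; the real column layout of the pedupg data may well differ, and parseRecords is a made-up helper name.

```d
import std.array : array;
import std.csv : csvReader;
import std.stdio : writeln;
import std.typecons : Tuple;

// Assumed record layout (hypothetical): three integer columns per line.
alias Record = Tuple!(int, int, int);

// Parses the whole text into an array of records, using ' ' as the delimiter.
Record[] parseRecords(string text)
{
    return csvReader!Record(text, ' ').array;
}

void main()
{
    auto recs = parseRecords("1 0 0\n2 0 0\n3 1 2");
    foreach (r; recs)
        writeln(r[0], " ", r[1], " ", r[2]);
    assert(recs.length == 3);
    assert(recs[2][1] == 1 && recs[2][2] == 2);
}
```

The only way to know whether it beats a hand-written split is to time both on the large dataset.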

Some tricks here, also:
http://tech.adroll.com/blog/data/2014/11/17/d-is-for-data-science.html
