A small benchmark of reading the lines of a flat file into memory that I've just found on Reddit:
http://www.reddit.com/r/programming/comments/pub98/a_benchmark_for_reading_flat_files_into_memory/
http://steve.80cols.com/reading_flat_files_into_memory_benchmark.html

The Ruby code that (slowly) generates the test data:
https://raw.github.com/lorca/flat_file_benchmark/master/gen_data.rb

But for my timings I have used only about 40% of that file, the first 1_965_800 lines, because I have less memory.

My Python-Psyco version runs in 2.46 seconds, the D version in 4.65 seconds (the D version runs in 13.20 seconds if I don't disable the GC). From many other benchmarks I've seen that reading a file line-by-line is slow in D.

-------------------------

My D code:


import core.memory, std.stdio, std.string, std.array;

void main(in string[] args) {
    GC.disable(); // 4.65 s with this line, 13.20 s without it
    auto rows = appender!(string[][])();
    foreach (line; File(args[1]).byLine())
        rows.put(line.idup.split("\t")); // byLine reuses its buffer, so idup is needed
    writeln(rows.data[1].join(","));
}

-------------------------

My Python 2.6 code:


from sys import argv
from collections import deque
import gc
import psyco

def main():
    gc.disable()
    rows = deque()
    for line in open(argv[1]):
        rows.append(line[:-1].split("\t"))
    print ",".join(rows[1])

psyco.full()
main()

-------------------------

The test data generator in Ruby:


for user_id in (1..10000)
  payments = (rand * 1000).to_i
  for user_payment_id in (1..payments)
    payment_id = user_id.to_s + user_payment_id.to_s
    payment_amount = "%.2f" % (rand * 30)
    is_card_present = "N"
    created_at = (rand * 10000000).to_i
    if payment_id.to_i % 3 == 0
      is_card_present = "Y"
    end
    puts [user_id, payment_id, payment_amount, is_card_present, created_at].join("\t")
  end
end

-------------------------

Bye,
bearophile
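P.S. For anyone who wants to check the parsing side without generating the full data file, here is a small self-contained Python sketch (the function name and its parameters are mine, not from the benchmark) that produces a few rows shaped like the Ruby generator's output and reads them back with the same split("\t") loop used in the timed programs:

```python
# A minimal sketch, not the benchmark itself: generate a handful of
# TSV rows with the same five columns as the Ruby script, then parse
# them line-by-line by splitting on tabs.
import random

def gen_rows(n_users=3, max_payments=4):
    rows = []
    for user_id in range(1, n_users + 1):
        for user_payment_id in range(1, random.randint(1, max_payments) + 1):
            payment_id = "%d%d" % (user_id, user_payment_id)
            payment_amount = "%.2f" % (random.random() * 30)
            is_card_present = "Y" if int(payment_id) % 3 == 0 else "N"
            created_at = int(random.random() * 10000000)
            rows.append("\t".join([str(user_id), payment_id, payment_amount,
                                   is_card_present, str(created_at)]))
    return rows

lines = gen_rows()
# the read side: one list of fields per line, as in the D and Python versions
parsed = [line.split("\t") for line in lines]
print(len(parsed[0]))  # prints 5: each row has five tab-separated fields
```

The real benchmark does the same split in a loop over ~2 million lines, so per-line allocation cost (the idup in D, the list object in Python) dominates the timing.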