Folks, I just went through a performance comparison exercise and I thought a summary of the results might be of interest here. A colleague is converting some C++ code to C# to see if it's possible to maintain the legacy high performance while enjoying the benefits of the managed world. The core code reads from 1 to 15 text files line by line and parses the contents of the lines, which may look like these samples:

83;61;58;18;42;96;24;15;42;39
a1b1*0.333333333333333a2b1*0.333333333333333a3b1
a3b1*826;2*93;3*101a19b1*526;2*557;3*518

The input files often contain up to 1 million lines. Each parsed number is used to update a cell in a large matrix that is typically hundreds of cells wide or high, but might be tens of thousands wide. So you can see that this is mainly a CPU- and memory-intensive task. We know that most of the time is spent in the tight loop that parses millions of numbers out of the input lines.

I wrote a test harness that simulated the processing in C# and discovered the following:

- Release or Debug build made little difference.
- Using compiled Regex slows things by a factor of 5.
- Using string Split slows things by a factor of about 3.
- Using Parallel.ForEach slows things slightly.
- Using an unmanaged buffer with unsafe unchecked pointers slows things slightly.
- The fastest way to parse the lines is with an index loop over the chars in the line string.

In a normal business app you would of course use Regex or string methods for parsing because they're clear and maintainable, but in this case, where every millisecond counts, I found that any FCL usage would blow out the time and only a for-loop was viable. Parallelism is probably useless in this case because the processing on each worker thread is over in a blink, so the threading overhead outweighs the work it carries.

So it turns out that an old-fashioned C-style for-loop to manually parse the lines is the fastest by a long shot. It's fragile of course, but my colleague has translated the old well-tested C++ code directly over to C# (it's rather ugly).

This whole scenario is rather unusual and not very applicable to LOB apps, but I thought it was worth posting anyway.

Cheers,
*Greg Keogh*

[image: image.png; chart series: Regex.Match(es), Regex.Match(es) with Parallel Processing (PPL), String Split, String Split with PPL, For-loop, For-loop with PPL, Plain file reads with no parsing (lowest baseline)]
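
P.S. For anyone wondering what the index-loop style of parsing looks like, here is a minimal sketch. It is not the translated production code. It only handles the simplest sample shape above (semicolon-separated non-negative integers such as 83;61;58), and the names LineParser and ParseInts are made up for illustration. It treats every non-digit char as a separator and does no validation or overflow checking, which is the kind of fragility mentioned above.

```csharp
using System;

static class LineParser
{
    // Hypothetical helper (not from the original code base).
    // Walks the chars of 'line' by index, accumulates runs of decimal
    // digits into integers and stores them in 'buffer'.
    // Returns the number of integers found.
    // Assumptions: non-negative values that fit in an int, and a buffer
    // large enough for the line (otherwise IndexOutOfRangeException).
    public static int ParseInts(string line, int[] buffer)
    {
        int count = 0;          // how many numbers have been stored
        int value = 0;          // number currently being accumulated
        bool inNumber = false;  // true while inside a run of digits

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (c >= '0' && c <= '9')
            {
                value = value * 10 + (c - '0');
                inNumber = true;
            }
            else if (inNumber)
            {
                // Any non-digit ends the current number.
                buffer[count++] = value;
                value = 0;
                inNumber = false;
            }
        }

        // Flush a number that runs to the end of the line.
        if (inNumber)
        {
            buffer[count++] = value;
        }
        return count;
    }
}
```

Calling LineParser.ParseInts("83;61;58;18;42;96;24;15;42;39", new int[16]) fills the first ten slots of the array and returns 10. Reusing one buffer across lines avoids the per-line allocations that Split and Regex make, which is presumably part of why the loop wins in a test like this.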