hi,
i'm attempting to write a fast scanner/parser in C#. basically, i receive a
byte buffer from a stream, then i copy it to a char array while manipulating
the bytes in the process.
in order to make this fast, i'm processing the buffers in an 'unsafe' block
and 'fixing' the byte and char arrays in order to do pointer arithmetic
directly on the buffers.
using VS.NET on windows xp, i wrote a simple console program (called
Scratch.exe) to test out the basics of this approach and i came across a
performance issue when running the same program using mono 2.2 - the speed
at which strings and char buffers are created.
the main code snippet that i used for the two test runs is below:
    int length = 1024;
    byte[] bytes = new byte[length];
    bytes[length - 1] = 0;
    char[] chars = new char[length + 1];
    int iterations = 500000;
    for (int i = 0; i < iterations; i++)
    {
        // NOTE: uncomment this line for the second test...
        //chars = new char[length + 1];
        unsafe
        {
            fixed (byte* pFixedBytes = bytes)
            fixed (char* pFixedChars = chars)
            {
                byte* pByte = pFixedBytes;
                char* pChar = pFixedChars;
                // copy bytes to chars, up to and including the terminating 0
                while ((*pChar++ = (char)(*pByte++)) != 0)
                {
                    // NOTE: further processing will go here...
                }
                *pChar = (char)0;
                // build the string from the start of the buffer (pChar now
                // points past the terminator)
                string result = new string(pFixedChars);
                // NOTE: the result string will be used elsewhere in the
                // future; ignored for tests...
            }
        }
    }
the performance difference (caused by the string and buffer allocation) when
running this simple program on windows xp using microsoft vs. mono 2.2 is
pretty big...and i was hoping there's something i can do to reduce or
eliminate the difference.
here are the performance numbers for test 1 (allocating the char array once,
upfront):
microsoft/windows xp: duration: 0.047sec; rate: 10638298/sec
mono 2.2/windows xp: duration: 0.234sec; rate: 2136752/sec
for test 1, the mono 2.2 default profiler results show:
Time(ms)   Count    P/call(ms)  Method name
[snip]
1488.000   500000   0.003       System.String::.ctor(char*)
    Callers (with count) that contribute at least for 1%:
        500000  100 %  Scratch.Program::Main(string[])
991.000    500000   0.002       System.String::CreateString(char*)
    Callers (with count) that contribute at least for 1%:
        500000  100 %  System.String::.ctor(char*)
454.000    500830   0.001       System.String::InternalAllocateStr(int)
    Callers (with count) that contribute at least for 1%:
        500000  99 %   System.String::CreateString(char*)
[snip]
here are the performance numbers for test 2 (allocating a new char array in
each pass of the outer loop):
microsoft/windows xp: duration: 0.531sec; rate: 941620/sec
mono 2.2/windows xp: duration: 6.131sec; rate: 81553/sec
for test 2, the mono 2.2 default profiler results show:
Time(ms)   Count    P/call(ms)  Method name
[snip]
6294.000   500244   0.013       System.Object::__icall_wrapper_mono_array_new_specific(intptr,int)
    Callers (with count) that contribute at least for 1%:
        500002  99 %   Scratch.Program::Main(string[])
1510.000   500000   0.003       System.String::.ctor(char*)
    Callers (with count) that contribute at least for 1%:
        500000  100 %  Scratch.Program::Main(string[])
1041.000   500000   0.002       System.String::CreateString(char*)
    Callers (with count) that contribute at least for 1%:
        500000  100 %  System.String::.ctor(char*)
482.000    500830   0.001       System.String::InternalAllocateStr(int)
    Callers (with count) that contribute at least for 1%:
        500000  99 %   System.String::CreateString(char*)
[snip]
any advice on how to eliminate these differences? i could pre-allocate a
'buffer pool' to reduce or eliminate the allocation of the char buffer (in
test 2) - but i don't really want to resort to this if i don't have to.
also, is there a way to make string allocation (test 1 and test 2) faster? i
can't seem to find a work-around for this issue. or should this code run
faster under linux using mono 2.2 (i.e., is mono 2.2 tuned for linux more
than windows)? (i'm going to run this test on a couple of different
systems.)
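for reference, a minimal sketch of what such a 'buffer pool' could look like. the CharBufferPool class and its Rent/Return/FreeCount members are hypothetical names made up for illustration (not an existing framework API), and the sketch assumes every buffer has the same fixed length and that the pool is only touched from one thread:

```csharp
using System;
using System.Collections.Generic;

// hypothetical helper: a tiny pool of fixed-size char buffers.
// NOT thread-safe; each scanner thread would need its own pool.
public sealed class CharBufferPool
{
    private readonly Stack<char[]> free = new Stack<char[]>();
    private readonly int bufferLength;

    public CharBufferPool(int bufferLength, int preallocate)
    {
        this.bufferLength = bufferLength;
        for (int i = 0; i < preallocate; i++)
            free.Push(new char[bufferLength]);
    }

    // number of buffers currently sitting unused in the pool
    public int FreeCount
    {
        get { return free.Count; }
    }

    // hands out a pooled buffer; allocates only when the pool is empty
    public char[] Rent()
    {
        return free.Count > 0 ? free.Pop() : new char[bufferLength];
    }

    // gives a buffer back to the pool; its contents are NOT cleared
    public void Return(char[] buffer)
    {
        if (buffer == null || buffer.Length != bufferLength)
            throw new ArgumentException("buffer does not belong to this pool");
        free.Push(buffer);
    }
}
```

in test 2, the commented-out allocation would become something like `chars = pool.Rent();` at the top of the loop body and `pool.Return(chars);` at the bottom, with `pool` created once as `new CharBufferPool(length + 1, 1)`. this only removes the array allocation; the string allocation is untouched.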
one final question: i also tried a third test where i have this code snippet
running on its own thread - with one thread per core (i have a dual-core
processor on my laptop). with mono 2.2, the multi-threaded run was barely
faster than the single-threaded one...but with microsoft, throughput roughly
doubled. (also, mono 2.2 wasn't using
the full 100% of the CPU, while microsoft was able to, on this simple test.)
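a minimal sketch of how a one-thread-per-core run like this could be set up and timed. ThreadedRun and RunOnThreads are hypothetical names for illustration, and this is not necessarily how the third test was actually written; it only uses System.Threading.Thread and System.Diagnostics.Stopwatch:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// hypothetical helper for timing the same work on several threads
public static class ThreadedRun
{
    // runs 'work' once on each of 'threadCount' threads and returns
    // the wall-clock time from the first Start to the last Join
    public static TimeSpan RunOnThreads(int threadCount, ThreadStart work)
    {
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
            threads[i] = new Thread(work);

        Stopwatch watch = Stopwatch.StartNew();
        foreach (Thread t in threads)
            t.Start();
        foreach (Thread t in threads)
            t.Join();
        watch.Stop();

        return watch.Elapsed;
    }
}
```

a call would look like `ThreadedRun.RunOnThreads(Environment.ProcessorCount, RunScanLoop)`, where RunScanLoop stands in for a method wrapping the loop from the snippet above, with its own byte and char buffers per thread.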
i'll gather some more info about this issue, but is there a