hi, i'm attempting to write a fast scanner/parser in C#. basically, i receive a byte buffer from a stream, then i copy it to a char array while manipulating the bytes in the process.
in order to make this fast, i'm processing the buffers in an 'unsafe' block and 'fixing' the byte and char arrays in order to do pointer arithmetic directly on the buffers. using VS.NET on windows xp, i wrote a simple console program (called Scratch.exe) to test out the basics of this approach and i came across a performance issue when running the same program using mono 2.2 - the speed in which strings and char buffers are created. the main code snippet that i used for the two test runs is below: int length = 1024; byte[] bytes = new byte[length]; bytes[length - 1] = 0; char[] chars = new char[length + 1]; int iterations = 500000; for (int i = 0; i < iterations; i++) { // NOTE: uncomment this line for the second test... //chars = new char[length + 1]; unsafe { fixed (byte* pFixedBytes = bytes) fixed (char* pFixedChars = chars) { byte* pByte = pFixedBytes; char* pChar = pFixedChars; while ((*pChar++ = (char)(*pByte++)) != 0) { // NOTE: further processing will go here... } *pChar = (char)0; string result = new string(pChar); // NOTE: the result string will be used elsewhere in the future; ignored for tests... } } } the performance difference (caused by the string and buffer allocation) when running this simple program on windows xp using microsoft vs. mono 2.2 is pretty big...and i was hoping there's something i can do to reduce or eliminate the difference. here are the performance numbers for test 1 (allocating the char array once, upfront): microsoft/windows xp: duration: 0.047sec; rate: 10638298/sec mono 2.2/windows xp: duration: 0.234sec; rate: 2136752/sec for test 1, the mono 2.2 default profiler results show: Time(ms) Count P/call(ms) Method name [snip] ######################## 1488.000 500000 0.003 System.String::.ctor(char*) Callers (with count) that contribute at least for 1%: 500000 100 % Scratch.Program::Main(string[]) ######################## 991.000 500000 0.002 System.String::CreateString(char*) Callers (with count) that contribute at least for 1%: 500000 100 % System.String::.ctor(char*) ######################## 454.000 500830 0.001 System.String::InternalAllocateStr(int) Callers (with count) that contribute at least for 1%: 500000 99 % System.String::CreateString(char*) [snip] here are the performance numbers for test 2 (allocating a new char array in each pass of the outer loop): microsoft/windows xp: duration: 0.531sec; rate: 941620/sec mono 2.2/windows xp: duration: 6.131sec; rate: 81553/sec for test 2, the mono 2.2 default profiler results show: Time(ms) Count P/call(ms) Method name [snip] ######################## 6294.000 500244 0.013 System.Object::__icall_wrapper_mono_array_new_specific(intptr,int) Callers (with count) that contribute at least for 1%: 500002 99 % Scratch.Program::Main(string[]) ######################## 1510.000 500000 0.003 System.String::.ctor(char*) Callers (with count) that contribute at least for 1%: 500000 100 % Scratch.Program::Main(string[]) ######################## 1041.000 500000 0.002 System.String::CreateString(char*) Callers (with count) that contribute at least for 1%: 500000 100 % System.String::.ctor(char*) ######################## 482.000 500830 0.001 System.String::InternalAllocateStr(int) Callers (with count) that contribute at least for 1%: 500000 99 % System.String::CreateString(char*) [snip] any advice on how to eliminate these differences? i could pre-allocate a 'buffer pool' to reduce or eliminate the allocation of the char buffer (in test 2) - but i don't really want to resort to this if i don't have to. also, is there a way to make string allocation (test 1 and test 2) faster? i can't seem to find a work-around for this issue. or should this code run faster under linux using mono 2.2 (i.e., is mono 2.2 tuned for linux more than windows)? (i'm going to run this test on a couple of different systems.) one final question: i also tried a third test where i have this code snippet running on its own thread - with one thread per core (i have a dual-core processor on my laptop). the performance difference using mono 2.2 between a single-threaded and a multi-threaded programs was minimal...but using microsoft, it was about double the performance. (also, mono 2.2 wasn't using the full 100% of the CPU, while microsoft was able to, on this simple test.) i'll gather some more info about this issue, but is there a reason why string/buffer allocation in a multi-threaded program would not scale linearly up to the number of cores in a processor? i'd be happy to dig into code or donate my time to help fix this issue (if it is an issue)...so, if someone could help to direct me to the right place, i'd appreciate it! thanks for the help! tom -- View this message in context: http://www.nabble.com/string-buffer-allocation-speed-issue-tp21626581p21626581.html Sent from the Mono - General mailing list archive at Nabble.com. _______________________________________________ Mono-list maillist - Mono-list@lists.ximian.com http://lists.ximian.com/mailman/listinfo/mono-list