Re: [Mono-list] string/buffer allocation speed issue

2009-02-27 Thread tomjohnson3


Robert Jordan wrote:
> Since your sample does nothing, MS.NET has probably optimized out
> parts of the code.

for clarity, this code is meant to illustrate the issue. i started with a
full scanner/parser and trimmed down the guts to isolate the issue. when
running with and without the core bits (which perform similarly under .NET
and mono), the magnitude of the performance difference is similar.

memory allocation is the bottleneck in this case.


Robert Jordan wrote:
> If you're benching mono, do it under Linux and with real world code.

are you saying that mono/linux is optimized so that memory allocations
are faster than on mono/windows? (this is helpful, if true.)
-- 
View this message in context: 
http://www.nabble.com/string-buffer-allocation-speed-issue-tp21626581p22247873.html
Sent from the Mono - General mailing list archive at Nabble.com.

___
Mono-list maillist  -  Mono-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-list


Re: [Mono-list] string/buffer allocation speed issue

2009-02-27 Thread tomjohnson3



Alan McGovern-2 wrote:
> All this test does is allocate memory non-stop, which is a GC stress test.
> It is known that Mono's GC is currently slower than the MS GC.

this simple example was meant to illustrate the core bottleneck in a larger
scanner/parser that i'm writing.

a version of my scanner/parser that uses a pre-allocated buffer pool
performs similarly on .NET and mono. ...but it'd be nice to remove the
buffer pool.

are there plans to improve memory allocation speed in a future GC? does
SGen-GC improve on this?

-- 
View this message in context: 
http://www.nabble.com/string-buffer-allocation-speed-issue-tp21626581p22248129.html


[Mono-list] string/buffer allocation speed issue

2009-02-22 Thread tomjohnson3

hi,

i'm attempting to write a fast scanner/parser in C#. basically, i receive a
byte buffer from a stream, then i copy it to a char array while manipulating
the bytes in the process.

in order to make this fast, i'm processing the buffers in an 'unsafe' block
and 'fixing' the byte and char arrays in order to do pointer arithmetic
directly on the buffers.

using VS.NET on windows xp, i wrote a simple console program (called
Scratch.exe) to test out the basics of this approach and i came across a
performance issue when running the same program using mono 2.2: the speed
at which strings and char buffers are created.

the main code snippet that i used for the two test runs is below:

int length = 1024;
byte[] bytes = new byte[length];
bytes[length - 1] = 0;
char[] chars = new char[length + 1];

int iterations = 500000;

for (int i = 0; i < iterations; i++)
{
    // NOTE: uncomment this line for the second test...
    //chars = new char[length + 1];

    unsafe
    {
        fixed (byte* pFixedBytes = bytes)
        fixed (char* pFixedChars = chars)
        {
            byte* pByte = pFixedBytes;
            char* pChar = pFixedChars;

            while ((*pChar++ = (char)(*pByte++)) != 0)
            {
                // NOTE: further processing will go here...
            }
            *pChar = (char)0;

            // build the string from the start of the buffer; pChar now points past the terminator
            string result = new string(pFixedChars);
            // NOTE: the result string will be used elsewhere in the future; ignored for tests...
        }
    }
}
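
the timing harness itself is not shown in the post. as a minimal sketch, assuming the duration and rate figures reported below were obtained by timing the outer loop and dividing the iteration count by the elapsed seconds, something like the following would produce them. the names Bench, Time and Rate are hypothetical and not part of the original program.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of the original Scratch.exe.
static class Bench
{
    // Runs `body` `iterations` times and returns the elapsed wall-clock seconds.
    public static double Time(int iterations, Action body)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            body();
        }
        watch.Stop();
        return watch.Elapsed.TotalSeconds;
    }

    // Iterations per second for a measured duration.
    public static double Rate(int iterations, double seconds)
    {
        return iterations / seconds;
    }
}
```

for reference, the reported rates are consistent with this definition at roughly 500,000 iterations per run: 0.047 sec x 10638298/sec and 6.131 sec x 81553/sec both come to about 500,000.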

the performance difference (caused by the string and buffer allocation) when
running this simple program on windows xp using microsoft vs. mono 2.2 is
pretty big...and i was hoping there's something i can do to reduce or
eliminate the difference.

here are the performance numbers for test 1 (allocating the char array once,
upfront):

microsoft/windows xp: duration: 0.047sec; rate: 10638298/sec
mono 2.2/windows xp: duration: 0.234sec; rate: 2136752/sec

for test 1, the mono 2.2 default profiler results show:

Time(ms)   Count    P/call(ms)  Method name
[snip]

1488.000   500000   0.003       System.String::.ctor(char*)
  Callers (with count) that contribute at least for 1%:
  500000  100 % Scratch.Program::Main(string[])

 991.000   500000   0.002       System.String::CreateString(char*)
  Callers (with count) that contribute at least for 1%:
  500000  100 % System.String::.ctor(char*)

 454.000   500830   0.001       System.String::InternalAllocateStr(int)
  Callers (with count) that contribute at least for 1%:
  500000  99 % System.String::CreateString(char*)
[snip]

here are the performance numbers for test 2 (allocating a new char array in
each pass of the outer loop):

microsoft/windows xp: duration: 0.531sec; rate: 941620/sec
mono 2.2/windows xp: duration: 6.131sec; rate: 81553/sec

for test 2, the mono 2.2 default profiler results show:

Time(ms)   Count    P/call(ms)  Method name
[snip]

6294.000   500244   0.013       System.Object::__icall_wrapper_mono_array_new_specific(intptr,int)
  Callers (with count) that contribute at least for 1%:
  500002  99 % Scratch.Program::Main(string[])

1510.000   500000   0.003       System.String::.ctor(char*)
  Callers (with count) that contribute at least for 1%:
  500000  100 % Scratch.Program::Main(string[])

1041.000   500000   0.002       System.String::CreateString(char*)
  Callers (with count) that contribute at least for 1%:
  500000  100 % System.String::.ctor(char*)

 482.000   500830   0.001       System.String::InternalAllocateStr(int)
  Callers (with count) that contribute at least for 1%:
  500000  99 % System.String::CreateString(char*)
[snip]

any advice on how to eliminate these differences? i could pre-allocate a
'buffer pool' to reduce or eliminate the allocation of the char buffer (in
test 2) - but i don't really want to resort to this if i don't have to.
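
for illustration, the 'buffer pool' idea mentioned above could look roughly like the sketch below. this is an assumption about the general technique, not the pool used in the actual scanner/parser: CharBufferPool, Rent and Return are hypothetical names, and the class is deliberately not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pool of fixed-size char buffers; not thread-safe.
sealed class CharBufferPool
{
    private readonly Stack<char[]> free = new Stack<char[]>();
    private readonly int bufferLength;

    // Allocates `preallocate` buffers up front so the hot loop does not have to.
    public CharBufferPool(int bufferLength, int preallocate)
    {
        this.bufferLength = bufferLength;
        for (int i = 0; i < preallocate; i++)
        {
            free.Push(new char[bufferLength]);
        }
    }

    // Number of buffers currently available without allocating.
    public int Available
    {
        get { return free.Count; }
    }

    // Hands out a pooled buffer, falling back to a fresh allocation when empty.
    public char[] Rent()
    {
        return free.Count > 0 ? free.Pop() : new char[bufferLength];
    }

    // Gives a buffer back to the pool for reuse.
    public void Return(char[] buffer)
    {
        if (buffer == null || buffer.Length != bufferLength)
        {
            throw new ArgumentException("buffer does not belong to this pool");
        }
        free.Push(buffer);
    }
}
```

in test 2, the per-iteration `new char[length + 1]` would become a Rent() at the top of the loop and a Return() once the buffer is no longer needed. returned buffers are not cleared, so the scanner must not rely on zeroed contents.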

also, is there a way to make string allocation (test 1 and test 2) faster? i
can't seem to find a work-around for this issue. or should this code run
faster under linux using mono 2.2 (i.e., is mono 2.2 tuned for linux more
than windows)? (i'm going to run this test on a couple of different
systems.)
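
one variation that may be worth measuring: the String(char*) constructor has to scan for the terminating zero to find the length, while the scanner already knows how many chars it copied. the sketch below uses the safe String(char[], int, int) constructor with a known count (an equivalent String(char*, int, int) overload exists for the unsafe version). Scan, CopyUntilZero and ToStringKnownLength are hypothetical names, and whether this is measurably faster on mono 2.2 is untested here. it still allocates one string per call, so it does not remove the allocation cost itself.

```csharp
using System;

// Hypothetical helpers illustrating string creation from a known length.
static class Scan
{
    // Copies bytes into chars until a zero byte is found or either buffer ends.
    // Returns the number of chars copied (the terminator is not counted).
    public static int CopyUntilZero(byte[] bytes, char[] chars)
    {
        int n = 0;
        while (n < bytes.Length && n < chars.Length && bytes[n] != 0)
        {
            chars[n] = (char)bytes[n];
            n++;
        }
        return n;
    }

    // Builds the string from an explicit count instead of scanning for a terminator.
    public static string ToStringKnownLength(char[] chars, int count)
    {
        return new string(chars, 0, count);
    }
}
```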

one final question: i also tried a third test where i have this code snippet
running on its own thread, with one thread per core (i have a dual-core
processor on my laptop). using mono 2.2, the performance difference between
the single-threaded and multi-threaded programs was minimal...but using
microsoft, the multi-threaded version had about double the performance. (also,
mono 2.2 wasn't using the full 100% of the CPU, while microsoft was able to, on
this simple test.)
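
the threaded variant is not shown in the post. a minimal sketch of a 'one thread per core' run, assuming each thread simply executes the same loop independently, might look like this. ParallelRun and Run are hypothetical names.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of the original Scratch.exe.
static class ParallelRun
{
    // Starts `threadCount` threads that each execute `work`,
    // then blocks until all of them have finished.
    public static void Run(int threadCount, ThreadStart work)
    {
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            threads[t] = new Thread(work);
            threads[t].Start();
        }
        foreach (Thread thread in threads)
        {
            thread.Join();
        }
    }
}
```

a call such as ParallelRun.Run(Environment.ProcessorCount, work) gives one thread per logical processor. if every thread allocates in a tight loop, throughput depends on how well the runtime's allocator and GC cope with concurrent allocation, which may be one reason the scaling differs between runtimes. that is a guess, not something established in this thread.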

i'll gather some more info about this issue, but is there a