Re: [Mono-dev] large performance drop between boehm and sgen for a parallel app

2013-12-07 Thread Jonathan Shore
Probably the best thing I can do here is scale this problem down so that it 
does not depend on external data, and post it to Bugzilla.  Then someone can take 
a look at the pathological SGen GC profile.


On Dec 7, 2013, at 1:15 PM, Rodrigo Kumpera  wrote:

> A perf run with sampling profiling and actual traces would be of some use.
> 
> If you have a very large heap, you're probably hitting the limitation of 
> the default serial major collector mono uses.
> You could try to run with the parallel major enabled, but that's experimental 
> and has not received enough tuning/testing.
> 
> 
> On Sat, Dec 7, 2013 at 12:38 PM, Jonathan Shore  
> wrote:
> Here are the results for a trimmed down version of the problem:
> 
> Boehm (default settings)
> Performance counter stats for '/opt/mono-3.0/bin/mono-boehm --llvm 
> /home/jonathan/Dev/hf/lib/Debug/FeatureGeneratorCSVFile.exe -info -config 
> etc/samples/orderbook-2013-CX-V11.xml -out features-2013-CX.csv':
> 
>     48579862.522506 task-clock               #    9.034 CPUs utilized
>         188,866,824 context-switches         #    0.004 M/sec
>              46,500 CPU-migrations           #    0.000 M/sec
>           1,475,427 page-faults              #    0.000 M/sec
> 140,468,865,368,193 cycles                   #    2.892 GHz
>                     stalled-cycles-frontend
>                     stalled-cycles-backend
>  80,012,982,451,027 instructions             #    0.57  insns per cycle
>  16,967,686,291,478 branches                 #  349.274 M/sec
>      95,315,728,420 branch-misses            #    0.56% of all branches
>
>      5377.495775794 seconds time elapsed
> 
> SGen (default settings)
> Performance counter stats for '/opt/mono-3.0/bin/mono-sgen --llvm 
> /home/jonathan/Dev/hf/lib/Debug/FeatureGeneratorCSVFile.exe -info -config 
> etc/samples/orderbook-2013-CX-V11.xml -out features-2013-CX.csv':
> 
>    108414200.651113 task-clock               #    2.049 CPUs utilized
>          65,792,604 context-switches         #    0.001 M/sec
>              30,536 CPU-migrations           #    0.000 M/sec
>         309,928,477 page-faults              #    0.003 M/sec
> 263,506,866,481,917 cycles                   #    2.431 GHz
>                     stalled-cycles-frontend
>                     stalled-cycles-backend
> 130,560,004,191,686 instructions             #    0.50  insns per cycle
>  27,570,367,199,486 branches                 #  254.306 M/sec
>     382,673,241,515 branch-misses            #    1.39% of all branches
>
>     52912.358974732 seconds time elapsed
> 
> There is a nearly 10x difference in performance between these.  Both were run 
> on 10 cores.   The boehm version achieved a 9 cpu average and the sgen 
> achieved a 2 cpu average + more overhead.
> 
> 
> On Dec 5, 2013, at 12:48 PM, Rodrigo Kumpera  wrote:
> 
>> Are you running boehm in parallel mode? Can you run perf on your application 
>> and email us the translated results?
>> 
>> 
>> On Thu, Dec 5, 2013 at 11:11 AM, Jonathan Shore  
>> wrote:
>> Hi,
>> 
>> I have a complex parallel application which, when run on 10 threads, gets 
>> very close to 1000% cpu with mono-boehm (linux) consistently (running for 
>> hours).   With mono-sgen it only achieves 200 - 250% cpu.   This is on a 12 / 
>> 24 core machine.   I need to run sgen eventually because I run into the 32-bit 
>> limit with boehm from time to time.
>> 
>> Note that this is with a fairly recent version of mono compiled from git 
>> sources with llvm enabled.
>> 
>> It is not an application I can easily box up for analysis on bugzilla due to 
>> the size of the data context, though I am happy to provide an environment to 
>> the mono team if useful.  Is there some GC debugging I can turn on that would 
>> be useful to the mono team?
>> 
>> Thanks
>> Jonathan
>> 
>> ___
>> Mono-devel-list mailing list
>> Mono-devel-list@lists.ximian.com
>> http://lists.ximian.com/mailman/listinfo/mono-devel-list
>> 
> 
> 

___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list


Re: [Mono-dev] large performance drop between boehm and sgen for a parallel app

2013-12-07 Thread Rodrigo Kumpera
A perf run with sampling profiling and actual traces would be of some use.

If you have a very large heap, you're probably hitting the limitation of
the default serial major collector mono uses.
You could try to run with the parallel major enabled, but that's
experimental and has not received enough tuning/testing.


On Sat, Dec 7, 2013 at 12:38 PM, Jonathan Shore wrote:

> Here are the results for a trimmed down version of the problem:
>
> *Boehm (default settings)*
> Performance counter stats for '/opt/mono-3.0/bin/mono-boehm --llvm
> /home/jonathan/Dev/hf/lib/Debug/FeatureGeneratorCSVFile.exe -info -config
> etc/samples/orderbook-2013-CX-V11.xml -out features-2013-CX.csv':
>
>     48579862.522506 task-clock               #    9.034 CPUs utilized
>         188,866,824 context-switches         #    0.004 M/sec
>              46,500 CPU-migrations           #    0.000 M/sec
>           1,475,427 page-faults              #    0.000 M/sec
> 140,468,865,368,193 cycles                   #    2.892 GHz
>                     stalled-cycles-frontend
>                     stalled-cycles-backend
>  80,012,982,451,027 instructions             #    0.57  insns per cycle
>  16,967,686,291,478 branches                 #  349.274 M/sec
>      95,315,728,420 branch-misses            #    0.56% of all branches
>
>      5377.495775794 seconds time elapsed
>
> *SGen (default settings)*
> Performance counter stats for '/opt/mono-3.0/bin/mono-sgen --llvm
> /home/jonathan/Dev/hf/lib/Debug/FeatureGeneratorCSVFile.exe -info -config
> etc/samples/orderbook-2013-CX-V11.xml -out features-2013-CX.csv':
>
>    108414200.651113 task-clock               #    2.049 CPUs utilized
>          65,792,604 context-switches         #    0.001 M/sec
>              30,536 CPU-migrations           #    0.000 M/sec
>         309,928,477 page-faults              #    0.003 M/sec
> 263,506,866,481,917 cycles                   #    2.431 GHz
>                     stalled-cycles-frontend
>                     stalled-cycles-backend
> 130,560,004,191,686 instructions             #    0.50  insns per cycle
>  27,570,367,199,486 branches                 #  254.306 M/sec
>     382,673,241,515 branch-misses            #    1.39% of all branches
>
>     52912.358974732 seconds time elapsed
>
> There is a nearly 10x difference in performance between these.  Both were
> run on 10 cores.   The boehm version achieved a 9 cpu average and the sgen
> achieved a 2 cpu average + more overhead.
>
>
> On Dec 5, 2013, at 12:48 PM, Rodrigo Kumpera  wrote:
>
> Are you running boehm in parallel mode? Can you run perf on your
> application and email us the translated results?
>
>
> On Thu, Dec 5, 2013 at 11:11 AM, Jonathan Shore 
> wrote:
>
>> Hi,
>>
>> I have a complex parallel application which, when run on 10 threads, gets
>> very close to 1000% cpu with mono-boehm (linux) consistently (running for
>> hours).   With mono-sgen it only achieves 200 - 250% cpu.   This is on a 12 /
>> 24 core machine.   I need to run sgen eventually because I run into the
>> 32-bit limit with boehm from time to time.
>>
>> Note that this is with a fairly recent version of mono compiled from git
>> sources with llvm enabled.
>>
>> It is not an application I can easily box up for analysis on bugzilla due
>> to the size of the data context, though I am happy to provide an environment
>> to the mono team if useful.  Is there some GC debugging I can turn on
>> that would be useful to the mono team?
>>
>> Thanks
>> Jonathan
>>
>> ___
>> Mono-devel-list mailing list
>> Mono-devel-list@lists.ximian.com
>> http://lists.ximian.com/mailman/listinfo/mono-devel-list
>>
>
>
>


Re: [Mono-dev] large performance drop between boehm and sgen for a parallel app

2013-12-07 Thread Jonathan Shore
Here are the results for a trimmed down version of the problem:

Boehm (default settings)
Performance counter stats for '/opt/mono-3.0/bin/mono-boehm --llvm 
/home/jonathan/Dev/hf/lib/Debug/FeatureGeneratorCSVFile.exe -info -config 
etc/samples/orderbook-2013-CX-V11.xml -out features-2013-CX.csv':

    48579862.522506 task-clock               #    9.034 CPUs utilized
        188,866,824 context-switches         #    0.004 M/sec
             46,500 CPU-migrations           #    0.000 M/sec
          1,475,427 page-faults              #    0.000 M/sec
140,468,865,368,193 cycles                   #    2.892 GHz
                    stalled-cycles-frontend
                    stalled-cycles-backend
 80,012,982,451,027 instructions             #    0.57  insns per cycle
 16,967,686,291,478 branches                 #  349.274 M/sec
     95,315,728,420 branch-misses            #    0.56% of all branches

     5377.495775794 seconds time elapsed

SGen (default settings)
Performance counter stats for '/opt/mono-3.0/bin/mono-sgen --llvm 
/home/jonathan/Dev/hf/lib/Debug/FeatureGeneratorCSVFile.exe -info -config 
etc/samples/orderbook-2013-CX-V11.xml -out features-2013-CX.csv':

   108414200.651113 task-clock               #    2.049 CPUs utilized
         65,792,604 context-switches         #    0.001 M/sec
             30,536 CPU-migrations           #    0.000 M/sec
        309,928,477 page-faults              #    0.003 M/sec
263,506,866,481,917 cycles                   #    2.431 GHz
                    stalled-cycles-frontend
                    stalled-cycles-backend
130,560,004,191,686 instructions             #    0.50  insns per cycle
 27,570,367,199,486 branches                 #  254.306 M/sec
    382,673,241,515 branch-misses            #    1.39% of all branches

    52912.358974732 seconds time elapsed

There is a nearly 10x difference in performance between these.  Both were run 
on 10 cores.   The boehm version achieved a 9 cpu average and the sgen achieved 
a 2 cpu average + more overhead.
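
As a rough check on the "nearly 10x" figure, the counters above can be split into two factors: how much more CPU time the SGen run burned, and how much less of it ran in parallel. The following is only a sketch in C; the struct and function names are made up for illustration, and the numbers are copied from the two perf runs above (task-clock is taken to be in milliseconds, which is consistent with task-clock / CPUs utilized matching the elapsed time).

```c
/* One perf stat run, reduced to the three figures used below. */
struct perf_run {
    double task_clock_ms;  /* CPU time summed over all threads, in ms */
    double cpus_utilized;  /* average number of busy CPUs */
    double elapsed_s;      /* wall-clock seconds */
};

/* Values copied from the Boehm and SGen runs quoted above. */
const struct perf_run boehm_run = { 48579862.522506, 9.034, 5377.495775794 };
const struct perf_run sgen_run  = { 108414200.651113, 2.049, 52912.358974732 };

/* How many times longer the slow run took in wall-clock time. */
double elapsed_ratio(const struct perf_run *slow, const struct perf_run *fast)
{
    return slow->elapsed_s / fast->elapsed_s;
}

/* How many times more CPU time the slow run consumed in total. */
double cpu_time_ratio(const struct perf_run *slow, const struct perf_run *fast)
{
    return slow->task_clock_ms / fast->task_clock_ms;
}

/* How many times more CPUs the fast run kept busy on average. */
double parallelism_ratio(const struct perf_run *slow, const struct perf_run *fast)
{
    return fast->cpus_utilized / slow->cpus_utilized;
}
```

Called as elapsed_ratio(&sgen_run, &boehm_run) and so on, this gives a wall-clock slowdown of roughly 9.8x, which breaks down into roughly 2.2x more CPU time multiplied by roughly 4.4x less parallelism. So most of the gap is lost parallelism rather than extra work.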


On Dec 5, 2013, at 12:48 PM, Rodrigo Kumpera  wrote:

> Are you running boehm in parallel mode? Can you run perf on your application 
> and email us the translated results?
> 
> 
> On Thu, Dec 5, 2013 at 11:11 AM, Jonathan Shore  
> wrote:
> Hi,
> 
> I have a complex parallel application which, when run on 10 threads, gets very 
> close to 1000% cpu with mono-boehm (linux) consistently (running for hours).  
> With mono-sgen it only achieves 200 - 250% cpu.   This is on a 12 / 24 core 
> machine.   I need to run sgen eventually because I run into the 32-bit limit 
> with boehm from time to time.
> 
> Note that this is with a fairly recent version of mono compiled from git 
> sources with llvm enabled.
> 
> It is not an application I can easily box up for analysis on bugzilla due to 
> the size of the data context, though I am happy to provide an environment to 
> the mono team if useful.  Is there some GC debugging I can turn on that would 
> be useful to the mono team?
> 
> Thanks
> Jonathan
> 
> ___
> Mono-devel-list mailing list
> Mono-devel-list@lists.ximian.com
> http://lists.ximian.com/mailman/listinfo/mono-devel-list
> 



mono-devel-list@lists.ximian.com

2013-12-07 Thread BearishSun
I'd have to learn how to deal with git and Mono contributions in general, and
I don't have much time to spare at the moment, with a million other things
on my mind. If I get enough free time I will, but for now I only had time to
write it down in a post. Hopefully it will be enough for someone.



--
View this message in context: 
http://mono.1490590.n4.nabble.com/Compiling-Mono-3-2-3-libraries-Windows-Visual-Studio-32-64bit-tp4661474p4661476.html
Sent from the Mono - Dev mailing list archive at Nabble.com.


mono-devel-list@lists.ximian.com

2013-12-07 Thread Andrés G. Aragoneses
On 07/12/13 10:25, BearishSun wrote:
> Hello,
> 
> I have managed to compile Mono 3.2.3 on Windows with Visual Studio 2012
> using both 32bit and 64bit configurations, but had some issues along the way.
> As I have found little to no up-to-date information on this topic, I
> thought I'd share the fixes in case anyone finds them useful. I have
> only compiled the native libraries (mono-2.0.dll, mono.exe). Here goes:
> 
> 

These changes would have been much easier to grok if each of them was in
.diff format. Bonus points if you can create a pull request (with 12
commits) in github.

Thanks!



mono-devel-list@lists.ximian.com

2013-12-07 Thread BearishSun
Hello,

I have managed to compile Mono 3.2.3 on Windows with Visual Studio 2012
using both 32bit and 64bit configurations, but had some issues along the way.
As I have found little to no up-to-date information on this topic, I
thought I'd share the fixes in case anyone finds them useful. I have
only compiled the native libraries (mono-2.0.dll, mono.exe). Here goes:

1. The Visual Studio solution is located inside the msvc folder of the Mono
source distribution, but it needs some fixes to get it working.
2. Version 3.2.3 is missing the "mono.props" file, and without it Visual Studio
will fail to open some project files. To fix this, download the .props file from
https://raw.github.com/mono/mono/master/msvc/mono.props and put it in the
mono-3.2.3\msvc folder before opening any projects or the solution.
3. Even though the solution is for VS2010, it will only compile with VS2012
unless you change the Platform Toolset to v100 for all projects. (I haven't
actually tested with VS2010.)
4. If the compiler complains it cannot find pthreads.h, make sure to define
"HAS_64BITS_ATOMIC" in the libmonoutils project (for all configurations).
5. In dlmalloc.c, change the angle-bracket #include of the dlmalloc header to
#include "dlmalloc.h" if the compiler complains it cannot find that file.
6. In "mono-proclib.c", add this bit of code somewhere near the start of the
file:

#ifdef HOST_WIN32
#define strtoll _strtoi64
#define strtoull _strtoui64
#endif

7. In "threads.c" replace a line in
ves_icall_System_Threading_Interlocked_CompareExchange_Long method, from:

return InterlockedCompareExchange64 (location, value, comparand);

to

#ifdef HOST_WIN32
return _InterlockedCompareExchange64 (location, value, comparand);
#else
return InterlockedCompareExchange64 (location, value, comparand); 
#endif

InterlockedCompareExchange64 is just an alias for
_InterlockedCompareExchange64 on Windows, and for some reason the compiler
doesn't realize it (aliases to intrinsics don't work?). Anyway, we just
reference the intrinsic directly.

8. In "threads.c" replace a line in
ves_icall_System_Threading_Thread_VolatileRead8 method, from:

return InterlockedCompareExchange64 (ptr, 0, 0);
 
to

#ifdef HOST_WIN32
return _InterlockedCompareExchange64 (ptr, 0, 0);
#else
return InterlockedCompareExchange64 (ptr, 0, 0);
#endif

Same problem as previous.

9. For all projects and configurations, open the property pages under
"C/C++->Optimization" and set "Enable Intrinsic Functions" to "Yes".
(You might be able to skip this step.)
10. In "exceptions-amd64.c" replace line 121:

if (win32_chained_exception_needs_run) { 

with

if (mono_win_chained_exception_needs_run) {

Delete lines 167 and 168:

if (old_win32_toplevel_exception_filter)
SetUnhandledExceptionFilter(mono_old_win_toplevel_exception_filter);

Then move line 166 (its line number after the previous changes) to the start of the function:

guint32 ret = 0;

11. In threads.c in mono_thread_get_stack_bounds method replace the bit of
code under "#if defined(HOST_WIN32)" from:

void* tib = (void*)__readfsdword(0x18);
guint8 *stackTop = (guint8*)*(int*)((char*)tib + 4);
guint8 *stackBottom = (guint8*)*(int*)((char*)tib + 8);

to:

NT_TIB* tib = (NT_TIB*)NtCurrentTeb();
guint8 *stackTop = (guint8*)tib->StackBase;
guint8 *stackBottom = (guint8*)tib->StackLimit;

__readfsdword doesn't exist when building 64bit. You may use
__readgsqword instead, but then you need to double all your offsets.
NtCurrentTeb works equally well for both 32 and 64 bit builds.

12. You are done. Hopefully this helps someone!
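
For anyone applying steps 7 and 8 by hand, the idea behind both changes can be sketched as one small wrapper. This is an illustration only, not Mono's code: the names cas64 and volatile_read64 are hypothetical, _MSC_VER stands in for Mono's HOST_WIN32 check, and the non-Windows branch assumes GCC or Clang and uses their __sync builtin in place of whatever Mono actually uses there.

```c
#include <stdint.h>

#ifdef _MSC_VER
#include <intrin.h>   /* declares the _InterlockedCompareExchange64 intrinsic */
#endif

/* Hypothetical helper: atomically store `value` into *location if it
 * currently equals `comparand`; always returns the previous contents. */
int64_t cas64(volatile int64_t *location, int64_t value, int64_t comparand)
{
#ifdef _MSC_VER
    /* Call the intrinsic by its underscore-prefixed name, as in step 7. */
    return _InterlockedCompareExchange64((volatile __int64 *)location,
                                         value, comparand);
#else
    /* Assumption: GCC or Clang. Note the swapped argument order:
     * the builtin takes (pointer, expected old value, new value). */
    return __sync_val_compare_and_swap(location, comparand, value);
#endif
}

/* Step 8's trick: a compare-exchange of 0 with 0 never changes the
 * stored value, so it doubles as an atomic 64-bit read. */
int64_t volatile_read64(volatile int64_t *ptr)
{
    return cas64(ptr, 0, 0);
}
```

Keeping the platform check inside one helper would avoid repeating the #ifdef at every call site, though the steps above patch each call site directly, which is the smaller change to the 3.2.3 sources.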



--
View this message in context: 
http://mono.1490590.n4.nabble.com/Compiling-Mono-3-2-3-libraries-Windows-Visual-Studio-32-64bit-tp4661474.html
Sent from the Mono - Dev mailing list archive at Nabble.com.