Rob Giltrap wrote:
> Calling all performance experts! Here's your chance to become part of 
> mathematics history by contributing to the discovery of one of the rarest 
> things known to man... Mersenne Primes (there are currently only 44 
> known to exist). Refer: http://en.wikipedia.org/wiki/Mersenne_prime
> 
> How does this relate to OpenSolaris? Well, OpenSolaris is involved in 
> this year's Google Summer of Code (GSoC)...
> 
> Refer: http://www.opensolaris.org/os/project/summerofcode/students/
> 
> One of the projects is the FFT project, which is basically about 
> leveraging OpenSolaris for its incredible scalability, cross-platform 
> support and, most importantly, its observability and performance tools as 
> a showcase for its usefulness in highly threaded / HPC computing.
> 
> To date we have taken what was a single threaded 'C' application 
> 'Mlucas', added coarse grained parallel code (using OpenMP) to the 
> majority of the program (there are a few minor components still being 
> worked on) and compiled it using the latest Sun Studio 12 C compiler. This 
> has resulted in significant scalability as shown below...
> 
> Sparc64 VI (2.13 GHz x 8 CPUs, 16 cores)
> - 1 thread  =  106 secs
> - 2 threads =   55 secs (x 1.93)
> - 4 threads =   31 secs (x 3.42)
> - 8 threads =   18 secs (x 5.89)
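
For reference, the speedup figures in the table follow directly from the timings, and the same numbers give a rough parallel efficiency and a Karp-Flatt estimate of the serial fraction. A minimal sketch in Python; the function names are made up for this illustration, and only the timings come from the post:

```python
# Timings quoted in the post: thread count -> wall-clock seconds.
timings = {1: 106, 2: 55, 4: 31, 8: 18}

def speedup(t1, tn):
    """Speedup relative to the single-threaded run."""
    return t1 / tn

def efficiency(t1, tn, n):
    """Fraction of ideal linear scaling achieved with n threads."""
    return speedup(t1, tn) / n

def karp_flatt(t1, tn, n):
    """Experimentally determined serial fraction (Karp-Flatt metric)."""
    s = speedup(t1, tn)
    return (1 / s - 1 / n) / (1 - 1 / n)

def report(timings):
    """Return (threads, speedup, efficiency, serial fraction) per run."""
    t1 = timings[1]
    rows = []
    for n in sorted(timings):
        if n == 1:
            continue  # the baseline has no speedup to report
        tn = timings[n]
        rows.append((n, speedup(t1, tn), efficiency(t1, tn, n),
                     karp_flatt(t1, tn, n)))
    return rows

if __name__ == "__main__":
    for n, s, e, f in report(timings):
        print(f"{n} threads: x{s:.2f}, efficiency {e:.0%}, "
              f"serial fraction ~{f:.3f}")
```

If the serial-fraction estimate grows as threads are added, that usually points to parallel overhead (synchronisation, memory contention) rather than inherently serial code.
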
> 
> We are still a couple of days away from getting the application to be 
> able to usefully employ 16 threads; however, assuming even modest 
> improvement beyond 8 threads, it will result in a new world record for 
> the verification of Mersenne Primes. Currently at 8 threads on the 
> Sparc64 VI it would take over 7.5 days to do the calculations for just 
> one exponent.
> 
> There are only 5 weeks left of the GSoC program and we now need to look 
> at further tuning the software particularly for OpenSolaris on the Sparc 
> & x64 architectures.
>  
> What I require is for those on this list with performance experience to 
> provide a little help in removing any system bottlenecks that might be 
> there. This is actually not a large task as the application is only 
> bound by CPU and real memory. There is no network activity, a tiny 
> amount of disk I/O occurs once every 30+ secs, the application uses less 
> than 100MB of real memory and is pretty much all integer calculations.
> 
> So I only need help with CPU/memory (particularly cache) performance. 
> Basic checks have been made for CPU utilization, trapstat & context 
> switching issues etc., but we really need to bring a bit more knowledge 
> into the team. Ideally, what I would like is for someone to take 1-2 
> hours and give it the once-over: are there any obvious wins 
> that we have missed? That may include pointing us to the right DTrace 
> scripts to use for ongoing analysis.
> 
> The Mlucas application is actually VERY simple in terms of processing 
> and should be easy to analyze. There are four steps:
>  - transform
>  - square
>  - inverse transform
>  - carry
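
The four steps listed above amount to squaring one very large number per iteration, as in a Lucas-Lehmer test. The following is a toy sketch in Python, not Mlucas code: it uses a plain recursive FFT over base-10 digits, and it does the modular reduction with Python's `%` operator, which is an assumption of this example only. All function names are hypothetical.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def square_via_fft(x, base=10):
    """Square a non-negative integer using the four steps."""
    digits = []
    while x:
        x, d = divmod(x, base)
        digits.append(d)          # little-endian digits
    if not digits:
        digits = [0]
    n = 1
    while n < 2 * len(digits):    # room for the full product
        n *= 2
    padded = digits + [0] * (n - len(digits))

    spectrum = fft(padded)                      # 1. transform
    squared = [v * v for v in spectrum]         # 2. square
    conv = fft(squared, invert=True)            # 3. inverse transform
    coeffs = [round(v.real / n) for v in conv]  # scale and round to integers

    result, carry = 0, 0                        # 4. carry
    for i, c in enumerate(coeffs):
        carry, d = divmod(c + carry, base)
        result += d * base ** i
    return result + carry * base ** n

def lucas_lehmer(p):
    """Lucas-Lehmer test for 2**p - 1, with p an odd prime."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (square_via_fft(s) - 2) % m
    return s == 0

if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        print(p, lucas_lehmer(p))
```

The loop runs `p - 2` times, so for an exponent in the tens of millions the per-iteration cost of these four steps dominates the total run time.
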
> 
> These steps currently (at 8 threads) collectively take 0.018 secs to 
> complete and then they run again and again (about 36 million times). Thus 
> ANY performance improvement makes a big difference as anything * 36 
> million equals something significant.
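
The arithmetic behind that claim can be checked with the two figures from the post (0.018 secs per iteration, about 36 million iterations). A small Python sketch; the helper name and the 1 ms saving are illustrative assumptions, not measurements:

```python
SECONDS_PER_DAY = 86_400

def total_days(seconds_per_iteration, iterations):
    """Wall-clock days for a run of identical iterations."""
    return seconds_per_iteration * iterations / SECONDS_PER_DAY

ITERATIONS = 36_000_000      # "about 36 million times", from the post
PER_ITERATION = 0.018        # seconds at 8 threads, from the post

if __name__ == "__main__":
    run = total_days(PER_ITERATION, ITERATIONS)
    # Hypothetical saving of 1 ms per iteration, for illustration only.
    saved = total_days(0.001, ITERATIONS)
    print(f"full run: {run:.2f} days")
    print(f"saving 1 ms per iteration: {saved * 24:.1f} hours")
```

With these inputs the full run comes to about 7.5 days, matching the figure quoted earlier in the post, and each millisecond shaved off an iteration is worth roughly ten hours.
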
> 
> So my request is: if you have the skills we are after, please 
> contact me and I can provide more information (an initial inquiry will 
> not be treated as a guarantee of further commitment!). We can provide 
> you with whatever you need and if you are in the Bay Area all the better 
> as we could arrange for a physical meet with the main developer. Our two 
> main developers are VERY smart cookies, we are just a little short on 
> OpenSolaris observability/performance expertise.
> 
> One benefit to OpenSolaris as a whole is that this tuning will be 
> included in the case study I'm writing up showcasing OpenSolaris as a 
> great environment in which to do highly threaded / HPC type development. 
> It should also result in OpenSolaris being the infrastructure that is 
> used to verify the EFF prizewinning 10+million digit 45th Mersenne Prime 
> (when it is found) resulting in additional media attention for 
> OpenSolaris especially within the math and HPC sciences arenas.
> 
> Many thanks,
> 
> Rob Giltrap
> OpenSolaris - Google Summer of Code Mentor.

Hi Rob -

Have you taken a pass at this w/ DProfile (in the latest Sun Studio)?
You should look for cache/TLB hot spots; this may show false sharing, 
excessive conflicts, etc.

- Bart


-- 
Bart Smaalders                  Solaris Kernel Performance
[EMAIL PROTECTED]               http://blogs.sun.com/barts
_______________________________________________
perf-discuss mailing list