Re: [Haskell-cafe] Strange parallel behaviour with Ubuntu Karmic / GHC 6.10.4

2009-11-16 Thread Neil Brown

Michael Lesniak wrote:

> Hello,
>
>> getTime?  I wonder if that number might be causing the problem; can you
>> replicate it with lower sys times?
>
> That was it! Thanks Neil!
>
> When I'm using some number crunching without getTime it works (with
> more or less the expected speedup and usage of two cores) on my Ubuntu
> 9.10, too.
>
> Out of curiosity, the question is still open: Why does the old example
> (using getTime) work so much better on an older version of
> Ubuntu/RedHat and not on the new ones?

Your kernels were:

Setup:
Machine A: Quadcore, Ubuntu 9.04, Kernel  2.6.28-13 SMP
Machine B: AMD Opteron 875, 8 cores,  2.6.18-164 SMP- (some redhat)
Machine C: Dual-Core, Ubuntu 9.10, Kernel 2.6.31-14 SMP

Looking at the implementation of getTime ThreadCPUTime in the clock 
package, it calls clock_gettime(CLOCK_THREAD_CPUTIME_ID,..).  According 
to this page 
(http://www.h-online.com/open/news/item/Kernel-Log-What-s-new-in-2-6-29-Part-8-Faster-start-up-and-other-behind-the-scenes-changes-740591.html), 
the changes in 2.6.29 (changes which only your Ubuntu 9.10 machine has) 
included a patch 
(http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c742b31c03f37c5c499178f09f57381aa6c70131) 
which altered the implementation of that function.  Perhaps on some 
multi-processor machines the new implementation effectively serialises 
the code?  I know there used to be issues of whether some of the timers 
were synchronised across processors/cores (to stop them appearing to go 
backwards), so maybe something with the timers and their 
synchronisations effectively stops your program running in parallel.  If 
it helps, my machine is: "Intel(R) Core(TM)2 Duo CPU E8400  @ 
3.00GHz" according to /proc/cpuinfo.
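
One way to probe this guess would be to time the call itself on each kernel. The following is only a minimal sketch: it assumes a recent release of the clock package, which exposes the module System.Clock (the example in this thread imports the older System.Posix.Clock), and nsPerCall is a made-up helper name, not part of any library.

```haskell
module Main where

import Control.Monad (replicateM_)
import System.Clock (Clock (Monotonic, ThreadCPUTime), diffTimeSpec, getTime, toNanoSecs)

-- | Average wall-clock cost, in nanoseconds, of one call to
-- getTime ThreadCPUTime, measured over n calls.
-- (nsPerCall is a hypothetical helper, not part of the clock package.)
nsPerCall :: Int -> IO Double
nsPerCall n = do
    start <- getTime Monotonic
    replicateM_ n (getTime ThreadCPUTime)
    end <- getTime Monotonic
    let elapsed = toNanoSecs (diffTimeSpec end start)
    return (fromIntegral elapsed / fromIntegral n)

main :: IO ()
main = do
    cost <- nsPerCall 1000000
    putStrLn ("getTime ThreadCPUTime: " ++ show cost ++ " ns per call")
```

A large difference in the per-call cost between the 2.6.28 and 2.6.31 machines would support the idea that the kernel change matters. It would not by itself show that the calls are serialised across cores; for that, the same loop would have to be run from two threads at once.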


Thanks,

Neil.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Strange parallel behaviour with Ubuntu Karmic / GHC 6.10.4

2009-11-16 Thread Michael Lesniak
Hello,

> getTime?  I wonder if that number might be causing the problem; can you
> replicate it with lower sys times?
That was it! Thanks Neil!

When I'm using some number crunching without getTime it works (with
more or less the expected speedup and usage of two cores) on my Ubuntu
9.10, too.
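
A minimal sketch of what such a getTime-free test could look like is given below. It is not the program that was actually run: crunch and runTasks are hypothetical names, the work per task is a fixed number of iterations rather than a fixed duration, and it starts one thread per task instead of using the task queue. It also assumes a GHC recent enough to provide getNumCapabilities in Control.Concurrent.

```haskell
module Main where

import Control.Concurrent (forkIO, getNumCapabilities)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)
import Control.Monad (forM)
import Data.List (foldl')

-- | A fixed amount of pure number crunching: the sum of sqrt i for i in [1..n].
-- No clock is consulted, so the worker makes no system calls while working.
crunch :: Int -> Double
crunch n = foldl' (\acc i -> acc + sqrt (fromIntegral i)) 0 [1 .. n]

-- | Run the given number of independent tasks, each of the given size,
-- one Haskell thread per task, and collect the results in task order.
runTasks :: Int -> Int -> IO [Double]
runTasks tasks size = do
    vars <- forM [1 .. tasks] $ \_ -> do
        done <- newEmptyMVar
        -- evaluate forces the result inside the worker thread
        _ <- forkIO (evaluate (crunch size) >>= putMVar done)
        return done
    mapM takeMVar vars

main :: IO ()
main = do
    caps <- getNumCapabilities
    results <- runTasks 16 5000000
    putStrLn ("capabilities: " ++ show caps)
    putStrLn ("sum of results: " ++ show (sum results))
```

Compiled with -threaded and run once with +RTS -N1 and once with +RTS -N2, the wall times reported by time can be compared in the same way as for the original example.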

Out of curiosity, the question is still open: Why does the old example
(using getTime) work so much better on an older version of
Ubuntu/RedHat and not on the new ones?


Kind regards,
Michael

-- 
Dipl.-Inf. Michael C. Lesniak
University of Kassel
Programming Languages / Methodologies Research Group
Department of Computer Science and Electrical Engineering

Wilhelmshöher Allee 73
34121 Kassel

Phone: +49-(0)561-804-6269


Re: [Haskell-cafe] Strange parallel behaviour with Ubuntu Karmic / GHC 6.10.4

2009-11-16 Thread Neil Brown

Michael Lesniak wrote:

> Hello,
>
> I've written a smaller example which reproduces the unusual behaviour.
> Should I open a GHC-Ticket, too?

Hi,

I get these results:

$ time ./Temp +RTS -N1 -RTS 16

real    0m16.010s
user    0m10.869s
sys     0m5.144s

$ time ./Temp +RTS -N2 -RTS 16

real    0m12.794s
user    0m13.341s
sys     0m7.136s

Looking at top, the second version used ~160% CPU time (i.e. it was 
using both cores fairly well).  So I don't think I get the same bad 
behaviour as you.  Those sys times look high by the way -- I guess it's 
all the calls to getTime?  I wonder if that number might be causing the 
problem; can you replicate it with lower sys times?
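
The figures above can be cross-checked with a little arithmetic: the average number of busy cores is (user + sys) / real, and the speedup is the ratio of the two wall times. A small sketch, in which the Timing type and the function names are invented for illustration:

```haskell
module Main where

-- | One run as reported by the shell's time command, in seconds.
data Timing = Timing { real :: Double, user :: Double, sys :: Double }

-- | Wall-clock speedup of a parallel run relative to a baseline run.
speedup :: Timing -> Timing -> Double
speedup base par = real base / real par

-- | Average number of cores kept busy: (user + sys) / real.
cpuUsage :: Timing -> Double
cpuUsage t = (user t + sys t) / real t

main :: IO ()
main = do
    let n1 = Timing 16.010 10.869 5.144  -- the +RTS -N1 run above
        n2 = Timing 12.794 13.341 7.136  -- the +RTS -N2 run above
    putStrLn ("speedup:   " ++ show (speedup n1 n2))
    putStrLn ("cpu usage: " ++ show (cpuUsage n2))
    putStrLn ("sys share: " ++ show (sys n2 / (user n2 + sys n2)))
```

For the -N2 run this gives a CPU usage of about 1.6, which matches the ~160% seen in top, and a speedup of only about 1.25. Roughly a third of the CPU time is sys time, that is, time spent in the kernel.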


Thanks,

Neil.


Re: [Haskell-cafe] Strange parallel behaviour with Ubuntu Karmic / GHC 6.10.4

2009-11-15 Thread Michael Lesniak
Hello,

I've written a smaller example which reproduces the unusual behaviour.
Should I open a GHC-Ticket, too?


-- A small working example which describes the problems (I have) with GHC
-- 6.10.4, Ubuntu Karmic 9.10, explicit threading and core usage.
--
-- See http://www.haskell.org/pipermail/haskell-cafe/2009-November/069144.html
-- for the general description of the problem.
--
-- For comparison:
-- Compilation on both machines with
-- 
-- ghc --make -O2 -threaded Example.hs -o e -Wall
--
-- 
-- 1. Machine A: (Quadcore, Ubuntu 9.04)
-- a. With 1 thread:
-- time e +RTS -N1 -RTS 16
-- e +RTS -N1 -RTS 16  11,00s user 5,00s system 100% cpu 16,004 total
--
-- b. With 2 threads:
-- time e +RTS -N2 -RTS 16
-- e +RTS -N2 -RTS 16  11,44s user 4,58s system 197% cpu 8,102 total
--
--
-- 2. Machine C: (Dualcore, Ubuntu 9.10)
-- a. With 1 thread:
-- time e +RTS -N1 -RTS  16
--
-- real 0m16.414s
-- user 0m11.360s
-- sys  0m4.650s
--
-- b. With 2 threads:
-- time e +RTS -N2 -RTS  16
--
-- real 0m18.484s
-- user 0m14.320s
-- sys  0m5.940s
--
---
module Main where

import GHC.Conc
import Control.Concurrent
import Control.Monad
import System.Posix.Clock
import System.Environment



---
main :: IO ()
main = do
    -- Configuration
    args <- getArgs
    let threads = numCapabilities     -- number of threads determined by -N<...>
        taskDur = 1.0                 -- seconds each task takes
        taskNum = (read . head) args  -- Number of tasks is 1st parameter

    -- Generate a channel for the tasks to do and fill it with uniform and
    -- independent tasks. The other channel receives a message for each task
    -- which is finished.
    queue    <- newChan
    finished <- newChan
    writeList2Chan queue (replicate taskNum taskDur)

    -- Fork threads
    replicateM_ threads (forkIO (thread queue finished))

    -- Wait until the queue is empty
    replicateM_ taskNum (readChan finished)


---
thread :: Chan Double -> Chan Int -> IO ()
thread queue finished =
    forever $ do
        task <- readChan queue
        workFor task
        writeChan finished 1



---
-- | Generates work for @s@ seconds.
workFor :: Double -> IO ()
workFor s = do
    now <- getTime ThreadCPUTime
    loop (time2Double now + s)
  where
    -- Busy-loop until this thread's CPU time reaches the deadline @fs@.
    -- (Named loop rather than repeat to avoid shadowing Prelude's repeat.)
    loop fs = do
        now <- nSqrt 1 `pseq` getTime ThreadCPUTime
        let f = time2Double now
        unless (f >= fs) $ loop fs
    -- Converts a TimeSpec to seconds; nsec counts nanoseconds.
    time2Double t =
        fromIntegral (sec t) + (fromIntegral (nsec t) / 1000000000)
    -- Takes the square root of 2^1000 repeatedly (50 times). The parameter
    -- n is to ensure that GHC does not optimize it away.
    -- (In fact, I'm not sure this is needed...)
    nSqrt n =
        let sqs = map (\_ -> iterate sqrt (2^1000) !! 50) [1..n]
        in foldr seq 1 sqs


Re: [Haskell-cafe] Strange parallel behaviour with Ubuntu Karmic / GHC 6.10.4

2009-11-15 Thread Neil Brown

Michael Lesniak wrote:

> Hello,
>
> I'm currently developing some applications with explicit threading
> using forkIO and have strange behaviour on my freshly installed Ubuntu
> Karmic 9.10 (Kernel 2.6.31-14 SMP).
>
> Setup:
> Machine A: Quadcore, Ubuntu 9.04, Kernel  2.6.28-13 SMP
> Machine B: AMD Opteron 875, 8 cores,  2.6.18-164 SMP- (some redhat)
> Machine C: Dual-Core, Ubuntu 9.10, Kernel 2.6.31-14 SMP
> Compiler on all machines: ghc 6.10.4 (downloaded from GHC's official website)

Hi,

I have a dual-core Ubuntu 9.10 machine (running whatever GHC comes with 
the distro -- 6.10.x), so if you put your test code somewhere that I can 
get at, I can run it and see if I get the same effect.


Thanks,

Neil.



[Haskell-cafe] Strange parallel behaviour with Ubuntu Karmic / GHC 6.10.4

2009-11-13 Thread Michael Lesniak
Hello,

I'm currently developing some applications with explicit threading
using forkIO and have strange behaviour on my freshly installed Ubuntu
Karmic 9.10 (Kernel 2.6.31-14 SMP).

Setup:
Machine A: Quadcore, Ubuntu 9.04, Kernel  2.6.28-13 SMP
Machine B: AMD Opteron 875, 8 cores,  2.6.18-164 SMP- (some redhat)
Machine C: Dual-Core, Ubuntu 9.10, Kernel 2.6.31-14 SMP
Compiler on all machines: ghc 6.10.4 (downloaded from GHC's official website)

Program, Compilation, Execution
A simple taskqueue with independent tasks and explicit parallelization
(hence should deliver more or less perfect speedup).
For one core, wall times around 16 seconds are OK; for two cores, a bit more than 8 seconds.
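
The expected numbers follow from simple arithmetic: with independent tasks of equal length, no worker handles more than ceil(tasks / workers) of them. A short sketch of that estimate, assuming 16 tasks of one second each; idealWallTime is a name made up for this illustration:

```haskell
module Main where

-- | Ideal wall time for a number of independent tasks of dur seconds each
-- on a number of workers: the busiest worker runs ceil(tasks / workers) tasks.
idealWallTime :: Int -> Int -> Double -> Double
idealWallTime tasks workers dur =
    fromIntegral ((tasks + workers - 1) `div` workers) * dur

main :: IO ()
main = mapM_ report [1, 2, 4]
  where
    report :: Int -> IO ()
    report w =
        putStrLn (show w ++ " worker(s): " ++ show (idealWallTime 16 w 1.0) ++ " s")
```

Scheduling and channel overhead come on top of this, which is why a bit more than 8 seconds is the realistic target for two cores.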

Since I used the same sources and Makefiles on all machines all files
were compiled with -threaded and started with +RTS -N2 -RTS.

Testing:
Machine A: Ok (meaning works and delivers the expected speedup)
Machine B: Ok
Machine C: Not ok (with -N2 wall times around 14-15 seconds)

Looking at the core usage, for example with htop, I see that the
second core is not really used on C. Executing OpenMP programs shows
the expected speedup and usage of both cores, hence I do not think it's
a general Linux configuration problem.

So, after all the testing I think it's either the Linux kernel or some
other component of Ubuntu 9.10. But: Ubuntu is widely used and I did
not find any information regarding this problem. The simple solution
of installing the old version of Ubuntu would probably help but should
not be the way to go, should it?


I'd be glad for any hints or comments,
Michael

-- 
Dipl.-Inf. Michael C. Lesniak
University of Kassel
Programming Languages / Methodologies Research Group
Department of Computer Science and Electrical Engineering

Wilhelmshöher Allee 73
34121 Kassel

Phone: +49-(0)561-804-6269