Re: Re[2]: FFI: number of worker threads?

2006-06-21 Thread Li, Peng

On 6/21/06, Simon Peyton-Jones [EMAIL PROTECTED] wrote:

New worker threads are spawned as needed.  You'll need as many of
them as you have simultaneously-blocked foreign calls. If you have 2000
simultaneously-blocked foreign calls, you'll need 2000 OS threads to
support them, which probably won't work.


2000 OS threads definitely sounds scary, but it can work.
Linux NPTL threads scale well up to 10K threads, and the stack
address space would be sufficient on 64-bit systems.

I am thinking about some p2p applications where each peer maintains
a huge number of TCP connections to other peers, but most of these
connections are idle. Unfortunately, the default GHC RTS multiplexes
I/O using select, which is O(n) and seems to be limited by
FD_SETSIZE (1024).

That makes me wonder if the current design of the GHC RTS is optimal
in the long run. As software and hardware evolve, we will have
efficient OS threads (like NPTL)  and huge (64-bit) address spaces.
My guess is

(1) It is always a good idea to multiplex GHC user-level threads on OS
threads, because it improves performance.
(2) It may not be optimal to multiplex nonblocking I/O inside the GHC
RTS, because it is unrealistic to have an event-driven I/O interface
that is both efficient (like AIO/epoll) and portable (like
select/poll). What is worse, nonblocking I/O still blocks on disk
accesses. On the other hand, POSIX threads are portable and can be
implemented efficiently on many systems. At least on Linux, NPTL
easily beats select!

My wish is to have a future GHC implementation that (a) uses blocking
I/O directly provided by the OS, and (b) provides more control over OS
threads and the internal worker thread pool.  Using blocking I/O will
simplify the current design and allow the programmer to take advantage
of high-performance OS threads. If non-blocking I/O is really needed,
the programmer can use customized, Claessen-style threads wrapped in
modular libraries---some of my preliminary tests show that
Claessen-style threads do a much better job of multiplexing
asynchronous I/O.



If you think you have only a handful of simultaneously-blocked foreign
calls, but you still get runaway worker threads, please do make a
reproducible test case and file a bug report.


Yes, I will try to make a reproducible test case soon.


Once you get answers, can I ask either or both of you to type in what
you learned to the GHC user-documentation Wiki?  That way things
improve!   The place to start is here
http://haskell.org/haskellwiki/GHC
under Collaborative documentation.  There's already a page for
Concurrency and for FFI, so you can add to those.  Thanks


Certainly!
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Re[2]: FFI: number of worker threads?

2006-06-21 Thread Li, Peng

On 6/21/06, Duncan Coutts [EMAIL PROTECTED] wrote:

On linux, epoll scales very well with minimal overhead. Using multiple
OS threads to do blocking IO would not scale in the case of lots of idle
socket connections, you'd need one OS thread per socket.


On Linux, OS threads can also scale very well. I have done an
experiment using pipes and NPTL where most connections are idle---
performance scales linearly up to 32K file descriptors and 16K
threads.



The IO is actually no longer done inside the RTS, it's done by a Haskell
worker thread. So it should be easier now to use platform-specific
select() replacements. It's already different between unix/win32.

So I'd suggest the best approach is to keep the existing multiplexing
non-blocking IO system and start to take advantage of more scalable IO
APIs on the platforms we really care about (either select/poll
replacements or AIO).



It is easy to take advantage of epoll---it shouldn't be that hard to
bake it in. The question is about flexibility: do we want it to be
edge-triggered or level-triggered?  Even with epoll built-in, the disk
performance cannot keep up with NPTL unless AIO is also built-in.  But
for AIO, it is more complicated.  It bypasses the OS caching; the
Linux AIO even requires the use of certain kinds of file systems.

My idea is that not everybody needs high-performance, asynchronous or
nonblocking I/O.  For those who really need it, it is worthwhile (or
even necessary) to write their own event loops, and event-driven
programming in Haskell is not that difficult using CPS monads.
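
As a rough illustration of the CPS-monad idea mentioned above, here is a minimal sketch in the style of Claessen's concurrency monad. All names (Action, C, atom, fork, schedule, demo) are made up for this example, and the scheduler is a plain round-robin queue; a real event loop would park threads on epoll or AIO events instead of always rescheduling them.

```haskell
import Data.IORef

-- A thread is a chain of atomic IO steps (hypothetical names throughout).
data Action
  = Atom (IO Action)    -- run one IO step, then continue
  | Fork Action Action  -- start a second thread
  | Stop                -- thread finished

-- The CPS monad: a computation that needs a continuation to become an Action.
newtype C a = C { runC :: (a -> Action) -> Action }

instance Functor C where
  fmap f (C m) = C (\k -> m (k . f))

instance Applicative C where
  pure x = C (\k -> k x)
  C f <*> C m = C (\k -> f (\g -> m (k . g)))

instance Monad C where
  C m >>= f = C (\k -> m (\x -> runC (f x) k))

-- Lift one IO step into a thread.
atom :: IO a -> C a
atom io = C (\k -> Atom (fmap k io))

-- Spawn a child thread; the parent continues immediately.
fork :: C () -> C ()
fork t = C (\k -> Fork (runC t (const Stop)) (k ()))

-- Round-robin scheduler over a queue of runnable threads.
schedule :: [Action] -> IO ()
schedule [] = return ()
schedule (a : as) = case a of
  Atom io  -> io >>= \a' -> schedule (as ++ [a'])
  Fork x y -> schedule (as ++ [x, y])
  Stop     -> schedule as

run :: C () -> IO ()
run t = schedule [runC t (const Stop)]

-- Two threads that each record two steps in a shared log.
demo :: IO [String]
demo = do
  logRef <- newIORef []
  let say s = atom (modifyIORef logRef (s :))
      worker name = mapM_ (\i -> say (name ++ show i)) [1 .. 2 :: Int]
  run (fork (worker "a") >> worker "b")
  reverse <$> readIORef logRef

main :: IO ()
main = demo >>= mapM_ putStrLn
```

Because every step is an explicit Atom, the scheduler decides when each thread runs, which is the hook where an application-specific event loop could be plugged in.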


FFI: number of worker threads?

2006-06-20 Thread Li, Peng

Hello,

The paper "Extending the Haskell FFI with Concurrency" mentioned the
following in Section 6.3:

"GHC's run-time system employs one OS thread for every bound thread;
additionally, there is a variable number of so-called worker OS
threads that are used to execute the unbound (lightweight) threads."

How does the runtime system determine the number of worker threads?
Is the number hardcoded in the RTS or dynamically adjustable?  Can a
programmer specify it as an RTS option or change it using an API?

I would like to use a large number (say, 2000) of unbound threads,
each calling a blocking, safe foreign function via FFI import.  What
is supposed to happen if all the worker threads are used up?  I tried
this in the recent GHC 6.5 and got some kind of "runaway worker
threads?" RTS failure message when more than 32 threads are used. Is
this a current limitation of the RTS, or should I file a bug report for
it?
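
A minimal sketch of the scenario described above, for anyone who wants to reproduce it. It is not the original test case: the foreign function (usleep), the helper name blockMany and the thread count are assumptions chosen for illustration. It should be built with -threaded so that each blocked safe call can be served by its own worker OS thread.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Concurrent
  (forkIO, newEmptyMVar, putMVar, takeMVar, rtsSupportsBoundThreads)
import Control.Monad (forM)
import Foreign.C.Types (CInt (..), CUInt (..))

-- usleep(3) blocks the calling OS thread. Importing it as "safe" allows
-- other Haskell threads to keep running while the call is blocked.
foreign import ccall safe "unistd.h usleep"
  c_usleep :: CUInt -> IO CInt

-- Fork n lightweight threads that each block in a foreign call for
-- 100 ms, wait for all of them, and return how many completed.
blockMany :: Int -> IO Int
blockMany n = do
  dones <- forM [1 .. n] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO (c_usleep 100000 >> putMVar done ())
    return done
  mapM_ takeMVar dones
  return (length dones)

main :: IO ()
main = do
  putStrLn ("threaded RTS: " ++ show rtsSupportsBoundThreads)
  n <- blockMany 64  -- raise towards 2000 to stress the worker pool
  putStrLn (show n ++ " blocking foreign calls completed")
```

Without -threaded the calls are serialised, so the run takes roughly n times 100 ms instead of about 100 ms in total.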

Thanks,
Peng


readChan and unGetChan?

2006-04-03 Thread Li, Peng
Suppose the following happens:

(1) Thread A calls readChan on an empty channel and waits
(2) Thread B puts something to the read-end of the channel using unGetChan

When a GHC program does this, both threads are blocked! Is this the
behaviour we really want for unGetChan, or should we fix the
implementation of Control.Concurrent.Chan?
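
The blocking can be reproduced with a simplified model of an MVar-based channel. This is a sketch, not the library source: Chan', readChan', unGetChan' and unGetBlocks are hypothetical names, and the model assumes that the reader holds the read-end MVar while it waits for an item, which is what leaves unGetChan with nothing to take.

```haskell
import Control.Concurrent
import System.Timeout (timeout)

-- Simplified channel: a linked list of MVars with a read end and a write end.
type Stream a = MVar (Item a)
data Item a = Item a (Stream a)
data Chan' a = Chan' (MVar (Stream a)) (MVar (Stream a))

newChan' :: IO (Chan' a)
newChan' = do
  hole <- newEmptyMVar
  r <- newMVar hole
  w <- newMVar hole
  return (Chan' r w)

readChan' :: Chan' a -> IO a
readChan' (Chan' r _) = do
  readEnd <- takeMVar r            -- the read end is now held by this thread
  Item x next <- takeMVar readEnd  -- blocks here while the channel is empty
  putMVar r next
  return x

unGetChan' :: Chan' a -> a -> IO ()
unGetChan' (Chan' r _) x = do
  newEnd <- newEmptyMVar
  readEnd <- takeMVar r            -- blocks while a reader holds the read end
  putMVar newEnd (Item x readEnd)
  putMVar r newEnd

-- True if unGetChan' is still blocked after 200 ms while a reader waits.
unGetBlocks :: IO Bool
unGetBlocks = do
  ch <- newChan'
  tid <- forkIO (readChan' ch >>= \c -> putStrLn [c])
  threadDelay 100000               -- give the reader time to block
  r <- timeout 200000 (unGetChan' ch 'x')
  killThread tid
  return (r == Nothing)

main :: IO ()
main = unGetBlocks >>= \b -> putStrLn ("unGetChan' blocked: " ++ show b)
```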

Thanks,
Peng


Allocating aligned memory?

2006-03-03 Thread Li, Peng
In GHC, how can I allocate a chunk of memory aligned to some block
size (say, 512 or 1024 bytes)? I tried to specify it in the
alignment method in the Storable typeclass, but that does not seem
to work. Is Storable.alignment really used in GHC? If so, is there a
code example that allocates aligned memory in this way?

For the moment I am using the C function memalign() like this:

foreign import ccall "static stdlib.h memalign"
  memalign :: CInt -> CInt -> IO (Ptr CChar)

do ptr <- memalign alignment size
   fptr <- newForeignPtr finalizerFree ptr

Is it safe to do so?
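
Whether free() may release memory obtained from memalign() depends on the C library, so the finalizerFree approach is only as portable as that guarantee. As an alternative sketch, assuming a reasonably recent base library (these functions may not exist in older GHC releases), GHC.ForeignPtr offers mallocForeignPtrAlignedBytes, which allocates aligned, garbage-collected memory without calling into C. The helper names allocAligned and isAligned are made up for this example.

```haskell
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, withForeignPtr)
import Foreign.Ptr (Ptr, minusPtr, nullPtr)
import GHC.ForeignPtr (mallocForeignPtrAlignedBytes)

-- Allocate `size` bytes aligned to `align` bytes. The memory is managed
-- by the GHC heap, so no explicit finalizer is needed.
allocAligned :: Int -> Int -> IO (ForeignPtr Word8)
allocAligned align size = mallocForeignPtrAlignedBytes size align

-- Check that an address is a multiple of the requested alignment.
isAligned :: Int -> Ptr a -> Bool
isAligned align p = (p `minusPtr` nullPtr) `mod` align == 0

main :: IO ()
main = do
  fptr <- allocAligned 512 4096
  ok <- withForeignPtr fptr (return . isAligned 512)
  putStrLn ("aligned to 512 bytes: " ++ show ok)
```

For short-lived buffers, Foreign.Marshal.Alloc.allocaBytesAligned takes the same size and alignment arguments and frees the memory when the callback returns.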

Thanks,
Peng


Using GHC with SMP and FFI?

2006-03-03 Thread Li, Peng
[1] Extending the Haskell Foreign Function Interface with Concurrency
[2] Haskell on a Shared-Memory Multiprocessor

I read the above two papers [1,2] and I have been trying to write an
application that uses both FFI and SMP. The first paper [1] shows how
FFI is implemented on uniprocessor concurrent Haskell; the second
paper [2] shows how SMP Concurrent Haskell is implemented.  However, I
found little documentation on using FFI with the latest SMP extension.
In addition to [1], what has changed, and what should a programmer
know in order to use FFI in a multithreaded program running on SMP
machines?

Best,
Peng


Read integer from prompt

1999-09-23 Thread Li Peng

hi
In the older Hugs, I did this to read an integer from standard input:

readNum :: IO Integer
readNum  = do {
  line <- getLine
; readIO line
  }

However, in Hugs 98 it fails with this error message:
User error: PreludeIO.readIO: no parse

Why? And can anybody tell me how to read an integer in Hugs 98?
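
The cause of the "no parse" failure is not stated here, so the following is only a diagnostic sketch. It parses the line with reads from the Prelude and, on failure, reports the exact input string, which makes stray characters visible. The names parseInteger and readNum' are hypothetical.

```haskell
import Data.Char (isSpace)

-- Parse an Integer, tolerating surrounding whitespace. On failure the
-- offending input is returned via `show`, so odd characters show up.
parseInteger :: String -> Either String Integer
parseInteger s = case reads s of
  [(n, rest)] | all isSpace rest -> Right n
  _                              -> Left ("no parse: " ++ show s)

-- Keep prompting until a line parses as an Integer.
readNum' :: IO Integer
readNum' = do
  line <- getLine
  case parseInteger line of
    Right n  -> return n
    Left err -> putStrLn err >> readNum'

main :: IO ()
main = readNum' >>= print
```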

lipeng