Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-17 Thread Tomasz Zielonka
On Fri, Dec 16, 2005 at 04:41:05PM +0300, Bulat Ziganshin wrote:
> Hello Joel,
> 
> Friday, December 16, 2005, 3:22:46 AM, you wrote:
> 
> >> TZ> You don't have to check "every few seconds". You can determine
> >> TZ> exactly how much you have to sleep - just check the timeout/event
> >> TZ> with the lowest ClockTime.
> 
> JR> The scenario above does account for the situation that you are  
> JR> describing.
> 
> to be exact - Tomasz's variant doesn't work properly in this situation,
> but your code (which does not use this technique) is ok

Well, what I said was just a sketch. Of course you have to somehow
handle timeout requests coming during the sleep.
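
One way to do that (a sketch only; the wakeup MVar and the
microsToNextDeadline/fireDueTimers helpers are assumptions, not code from
this thread, and System.Timeout is used for the bounded wait):

import Control.Concurrent.MVar (MVar, takeMVar, tryPutMVar)
import System.Timeout (timeout)

-- The checker sleeps until the earliest known deadline, but can be woken
-- early when a new (possibly earlier) timer is registered mid-sleep.
checkerLoop :: MVar () -> IO Int -> IO () -> IO ()
checkerLoop wakeup microsToNextDeadline fireDueTimers = do
    us <- microsToNextDeadline                  -- time until earliest deadline
    _  <- timeout (max 0 us) (takeMVar wakeup)  -- sleep out, or be woken early
    fireDueTimers                               -- run whatever is due by now
    checkerLoop wakeup microsToNextDeadline fireDueTimers

-- startTimer, after inserting its entry into the map, just pokes the
-- checker awake:
--   _ <- tryPutMVar wakeup ()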

Best regards
Tomasz

-- 
I am searching for a programmer who is good at least in some of
[Haskell, ML, C++, Linux, FreeBSD, math] for work in Warsaw, Poland


RE: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-16 Thread Simon Marlow
On 16 December 2005 15:19, Lennart Augustsson wrote:

> John Meacham wrote:
>> On Thu, Dec 15, 2005 at 02:02:02PM -, Simon Marlow wrote:
>> 
>>> With 2k connections the overhead of select() is going to start to
>>> be a problem.  You would notice the system time going up. 
>>> -threaded may help with this, because it calls select() less often.
>> 
>> 
>> we should be using /dev/poll on systems that support it.
> 
> And kqueue for systems that support that.  Much, much more efficient
> than select.

Yeah, yeah.  We know.  We just haven't got around to doing anything
about it :-(  It's actually quite fiddly to hook this up to Handles -
see Einar's implementation in Network.Alt for instance.

Cheers,
Simon (who wished he hadn't mentioned select() again)


Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-16 Thread Lennart Augustsson

John Meacham wrote:

On Thu, Dec 15, 2005 at 02:02:02PM -, Simon Marlow wrote:


With 2k connections the overhead of select() is going to start to be a
problem.  You would notice the system time going up.  -threaded may help
with this, because it calls select() less often.



we should be using /dev/poll on systems that support it.


And kqueue for systems that support that.  Much, much more efficient
than select.

-- Lennart



Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-16 Thread Einar Karttunen
On 16.12 07:03, Tomasz Zielonka wrote:
> On 12/16/05, Einar Karttunen  wrote:
> > To make matters nontrivial, all the *nix variants use a different,
> > more efficient replacement for poll.
> 
> So we should find a library that offers a unified
> interface for all of them, or implement one ourselves.
> 
> I am pretty sure such a library exists. It should fall back to select()
> or poll() on platforms that don't have better alternatives.

network-alt has select(2), epoll, blocking and very experimental kqueue
(the last one is not yet committed, but I can supply patches
if someone is interested).

- Einar


Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-15 Thread Andrew Pimlott
On Fri, Dec 16, 2005 at 07:03:46AM +0100, Tomasz Zielonka wrote:
> On 12/16/05, Einar Karttunen  wrote:
> > To make matters nontrivial, all the *nix variants use a different,
> > more efficient replacement for poll.
> 
> So we should find a library that offers a unified
> interface for all of them, or implement one ourselves.

http://monkey.org/~provos/libevent/

See also

http://www.kegel.com/c10k.html

Andrew


Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-15 Thread Tomasz Zielonka
On 12/16/05, Einar Karttunen  wrote:
> To make matters nontrivial, all the *nix variants use a different,
> more efficient replacement for poll.

So we should find a library that offers a unified
interface for all of them, or implement one ourselves.

I am pretty sure such a library exists. It should fall back to select()
or poll() on platforms that don't have better alternatives.
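
Such a library would presumably hide each platform's mechanism behind one
small interface - a rough sketch only, with made-up names (no existing
package is implied):

import System.Posix.Types (Fd)

-- The kind of readiness we want to be notified about.
data Interest = WantRead | WantWrite
    deriving (Eq, Show)

-- A hypothetical unified interface; select()/poll(), epoll, kqueue and
-- /dev/poll backends would each provide an instance, and the best one
-- available would be picked per platform.
class EventQueue q where
    newQueue   :: IO q
    register   :: q -> Fd -> Interest -> IO ()
    unregister :: q -> Fd -> IO ()
    -- Block for at most the given number of milliseconds and return
    -- the descriptors that became ready.
    waitReady  :: q -> Int -> IO [(Fd, Interest)]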

Best regards
Tomasz


Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-15 Thread Ketil Malde
Einar Karttunen  writes:

> To make matters nontrivial, all the *nix variants use a different,
> more efficient replacement for poll.

> Solaris has /dev/poll
> *BSD (and OS X) has kqueue
> Linux has epoll

Since this is 'cafe, here's a page with some performance testing of
epoll:

   http://lse.sourceforge.net/epoll/

> Thus e.g. not all linux machines will have epoll.

It is present in 2.6, but not 2.4?

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants



Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-15 Thread Einar Karttunen
On 15.12 17:14, John Meacham wrote:
> On Thu, Dec 15, 2005 at 02:02:02PM -, Simon Marlow wrote:
> > With 2k connections the overhead of select() is going to start to be a
> > problem.  You would notice the system time going up.  -threaded may help
> > with this, because it calls select() less often.
> 
> we should be using /dev/poll on systems that support it. it cuts down on
> the overhead a whole lot. 'poll(2)' is also mostly portable and usually
> better than select since there is no arbitrary file descriptor limit and
> it doesn't have to traverse the whole bitset. a few #ifdefs should let
> us choose the optimum one available on any given system.

To make matters nontrivial, all the *nix variants use a different,
more efficient replacement for poll.

Solaris has /dev/poll
*BSD (and OS X) has kqueue
Linux has epoll

Also, on Linux, NPTL + blocking calls can actually be very fast
in a suitable scenario. An additional problem is that
these mechanisms depend on the version of the kernel
running on the machine... Thus e.g. not all Linux machines
will have epoll.

- Einar Karttunen


Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-15 Thread John Meacham
On Thu, Dec 15, 2005 at 02:02:02PM -, Simon Marlow wrote:
> With 2k connections the overhead of select() is going to start to be a
> problem.  You would notice the system time going up.  -threaded may help
> with this, because it calls select() less often.

we should be using /dev/poll on systems that support it. it cuts down on
the overhead a whole lot. 'poll(2)' is also mostly portable and usually
better than select since there is no arbitrary file descriptor limit and
it doesn't have to traverse the whole bitset. a few #ifdefs should let
us choose the optimum one available on any given system.
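
A sketch of that #ifdef idea (illustrative only - the HAVE_* macros and
the Backend.* modules are invented for the example, not an existing API):

{-# OPTIONS_GHC -cpp #-}
-- Pick the best event-notification backend available at build time.
module EventBackend (waitForEvents) where

#if defined(HAVE_EPOLL)
import qualified Backend.EPoll   as B   -- Linux 2.6+
#elif defined(HAVE_KQUEUE)
import qualified Backend.KQueue  as B   -- *BSD, OS X
#elif defined(HAVE_DEV_POLL)
import qualified Backend.DevPoll as B   -- Solaris
#else
import qualified Backend.Poll    as B   -- poll(2)/select(2) fallback
#endif

waitForEvents = B.waitForEvents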

John

-- 
John Meacham - ⑆repetae.net⑆john⑈ 


Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-15 Thread Joel Reymont


On Dec 15, 2005, at 2:02 PM, Simon Marlow wrote:

Hmm, your machine is spending 50% of its time doing nothing, and the
network traffic is very low.  I wouldn't expect 2k connections to pose
any problem at all, so further investigation is definitely required.

With 2k connections the overhead of select() is going to start to be a
problem.  You would notice the system time going up.  -threaded may help
with this, because it calls select() less often.


I ran two more tests today after making a few changes. The end result  
is that increasing the thread stack space makes the program run  
significantly faster as it was able to launch 1,000 more bots within  
the same hour.


Looking at the end of the 2nd test, 267Mb of physical memory and  
423Mb of VM are something that I will need to really look into. 80%  
CPU utilization by the app is probably a combination of select on 4k  
sockets


The 89 failures are all connections reset by peer; the probable cause is
my wireless LAN.


I'm now using the threaded runtime. Worker threads write to the  
socket. There's one thread monitoring all the timers. Started about  
12:30pm with no thread stack increase and full (very verbose) logging.


It's running 5 OS threads pretty consistently.

Total:  399, Lobby:  398, Failed: 0, 26/81, 10-20%,
Total:  819, Lobby:  810, Failed: 0, 52/119, 20-30%
Total: 1051, Lobby: 1048, Failed: 0, 63/136, 30-50%
Total: 1229, Lobby: 1219, Failed: 0, 74/153, 30-50%
Total: 1318, Lobby: 1299, Failed: 0, 76/157, 30-50%
Total: 1448, Lobby: 1433, Failed: 0, 82/167, 40-60%, 13:06
Total: 1544, Lobby: 1526, Failed: 0, 86/174, 50-60%, 13:13
Total: 1672, Lobby: 1648, Failed: 0, 90/182, 50-60%, 13:23
Total: 1754, Lobby: 1727, Failed: 0, 91/186, 40-60%, 13:31
Total: 1824, Lobby: 1796, Failed: 0, 93/189, 50-60%, 13:40

With reduced logging and +RTS -k3k. Started at 13:42, 4 OS threads.

Total:  367, Lobby:  363, Failed: 0,  24/76, 10%, 13:49
Total:  516, Lobby:  510, Failed: 14, 34/91, 10-20%, 13:52
Total:  841, Lobby:  836, Failed: 17, 49/116, 20% , 13:56
Total: 1450, Lobby: 1434, Failed: 34, 97/181, 20-50-80%, 14:08
Total: 2008, Lobby: 1999, Failed: 35, 133/234, 70-80%, 14:20
Total: 2318, Lobby: 2308, Failed: 35, 154/263, 70-85%, 14:29
Total: 2623, Lobby: 2613, Failed: 35, 174/293, 70-80%, 14:39
Total: 2862, Lobby: 2854, Failed: 35, 191/316, 70-80%, 14:47
Total: 3151, Lobby: 3142, Failed: 40, 214/347, 60-80%, 14:56
Total: 3364, Lobby: 3355, Failed: 40, 219/359, 60-80%, 15:03
Total: 3808, Lobby: 3744, Failed: 89, 247/398, 70-85%, 15:19
Total: 4000, Lobby: 3938, Failed: 89, 267/423, 80%, 15:27

The system has 120+Mb of free physical memory around 3pm but is not  
swapping heavily as the number of page outs is not increasing.  
There's a total of 1Gb of physical memory. 4 OS threads became 5 at  
some point.


--
http://wagerlabs.com/







Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-15 Thread Joel Reymont


On Dec 15, 2005, at 2:02 PM, Simon Marlow wrote:

The statistics are phys/VM, CPU usage in %, and #packets/transfer speed


Total:   1345, Lobby:   1326, Failed:  0, 102/184, 50%, 90/8kb
Total:   1395, Lobby:   1367, Failed:  2
Total:   1421, Lobby:   1394, Failed:  4
Total:   1490, Lobby:   1463, Failed:  4, 108/194, 50%, 110/11Kb
Total:   1574, Lobby:   1546, Failed:  4, 113/202, 50%, 116/11kb


Hmm, your machine is spending 50% of its time doing nothing, and the
network traffic is very low.  I wouldn't expect 2k connections to pose
any problem at all, so further investigation is definitely required.


That's CPU utilization by the program. My laptop is actually running  
a lot of other stuff as well, although the other stuff is not  
consuming much CPU.



With 2k connections the overhead of select() is going to start to be a
problem.  You would notice the system time going up.  -threaded may help
with this, because it calls select() less often.


I'm testing 4k connections now but I think the app is spending most
of the time collecting garbage :-). Well, it's also running handlers on
those keep-alive packets to update internal state.


I think I would need to profile next. I would love to see a report of  
data in drag/void state but it's impossible since I'm using STM.  
Unless I can hack support for STM into profiling myself (unlikely?  
any pointers?) I think I'll have to move away from STM just to  
profile the program.


Joel


--
http://wagerlabs.com/







RE: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-15 Thread Simon Marlow
On 15 December 2005 10:21, Joel Reymont wrote:

> Here are statistics that I gathered. I'm almost done modifying the
> program to use 1 timer thread instead of 1 per bot as well as writing
> to the socket from the writer thread. This should reduce the number
> of threads from 6k (2k x 3) to 2k plus change.
> 
> It appears that +RTS -k3k does make a difference. As per Simon, 2-4k
> avoids the thread being garbage collected because each thread gets
> its own block in the storage manager. Simon, did I get that right?
> 
> BTW, how does garbage-collecting a thread work in this scenario? My
> threads are very long-running.
> 
> The total is the number of bots launched, lobby is how many bots
> connected to the lobby. Failed is mostly due to connection reset by
> peer errors. The Windows C++ server uses IOCP and running a firewall
> was apparently interfering with that somehow. I hate Windows :-(.
> 
> --- Test#1 +RTS -k3k as per Simon. Keep-alive timeout of 9 minutes.
> 
> Total:   1961, Lobby:   1961, Failed:  0
> Total:   2000, Lobby:   2000, Failed:  1
> 
> This test went smoothly and got to 2k connections very quickly. Maybe
> within 30 minutes or so. I did not gather CPU usage, etc. statistics.
> 
> --- Test #2, No thread stack increase, 1 minute keep-alive timeout,
> more network traffic
> 
> With a 1 minute timeout things run veeery slow. 86Mb physical and 158Mb
> of VM with 1k bots, CPU 50-60%. Data sent/received is 60-70 packets
> and 6-7kb/sec. Killed after a while.
> 
> The statistics are phys/VM, CPU usage in % and #packets/transfer speed
> 
> Total:   1345, Lobby:   1326, Failed:  0, 102/184, 50%, 90/8kb
> Total:   1395, Lobby:   1367, Failed:  2
> Total:   1421, Lobby:   1394, Failed:  4
> Total:   1490, Lobby:   1463, Failed:  4, 108/194, 50%, 110/11Kb
> Total:   1574, Lobby:   1546, Failed:  4, 113/202, 50%, 116/11kb

Hmm, your machine is spending 50% of its time doing nothing, and the
network traffic is very low.  I wouldn't expect 2k connections to pose
any problem at all, so further investigation is definitely required.

With 2k connections the overhead of select() is going to start to be a
problem.  You would notice the system time going up.  -threaded may help
with this, because it calls select() less often.

If that's not the cause, we should find out what your app is doing while
it's idle.  If there are runnable threads (eg. the launcher), then the
app should not be spending any of its time idle.

Cheers,
Simon


RE: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-15 Thread Simon Marlow
On 15 December 2005 10:21, Joel Reymont wrote:

> Here are statistics that I gathered. I'm almost done modifying the
> program to use 1 timer thread instead of 1 per bot as well as writing
> to the socket from the writer thread. This should reduce the number
> of threads from 6k (2k x 3) to 2k plus change.
> 
> It appears that +RTS -k3k does make a difference. As per Simon, 2-4k
> avoids the thread being garbage collected because each thread gets
> its own block in the storage manager. Simon, did I get that right?

The 3k threads are still GC'd, but they are not actually *copied* during
GC.

It'll increase the memory overhead per thread from 2k (1k * 2 for
copying) to 4k (4k block, no overhead for copying).
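
As a rough illustration (assuming the ~6k threads mentioned earlier in the
thread, i.e. 2k bots x 3): 6,000 x 2k is about 12Mb of stack overhead with
the defaults, versus 6,000 x 4k, about 24Mb, with +RTS -k3k.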

Cheers,
Simon



Re: Timers (was Re: [Haskell-cafe] Optimizing a high-traffic network architecture)

2005-12-15 Thread Tomasz Zielonka
On Thu, Dec 15, 2005 at 10:46:55AM +, Joel Reymont wrote:
> One idea would be to index the timer on ThreadId and name and stick  
> Nothing into the timer action once the timer has been fired/stopped.  
> Since timers are restarted with the same name quite often this would  
> just keep one relatively big map in memory. The additional ThreadId  
> would help distinguish the timers and avoid clashes.

I don't know how you use your timers, but perhaps startTimer could
return a cancel action? Its type would be

startTimer :: Int -> (IO ()) -> IO (IO ())

and you would use it like this

cancel <- startTimer delay action

...

cancel

How cancelling was implemented would be entirely startTimer's business.
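
A minimal sketch of that on top of a shared timer map like Joel's (the
Data.Unique key is an assumption, added so the cancel action can delete
exactly its own entry in O(log n)):

import qualified Data.Map as M
import Control.Concurrent.MVar
import Data.Unique (Unique, newUnique)
import System.Time

type Timers = M.Map (ClockTime, Unique) (IO ())

-- Register the action and hand back an IO action that cancels
-- this particular timer.
startTimer :: MVar Timers -> Int -> IO () -> IO (IO ())
startTimer timers delay action = do
    now <- getClockTime
    let deadline = addToClockTime (TimeDiff 0 0 0 0 0 delay 0) now
    u <- newUnique
    modifyMVar_ timers (return . M.insert (deadline, u) action)
    return (modifyMVar_ timers (return . M.delete (deadline, u)))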

Best regards
Tomasz

-- 
I am searching for a programmer who is good at least in some of
[Haskell, ML, C++, Linux, FreeBSD, math] for work in Warsaw, Poland


Re: Timers (was Re: [Haskell-cafe] Optimizing a high-traffic network architecture)

2005-12-15 Thread Joel Reymont
One idea would be to index the timer on ThreadId and name and stick  
Nothing into the timer action once the timer has been fired/stopped.  
Since timers are restarted with the same name quite often this would  
just keep one relatively big map in memory. The additional ThreadId  
would help distinguish the timers and avoid clashes.


On Dec 15, 2005, at 10:41 AM, Joel Reymont wrote:

After a chat with Einar on #haskell I realized that I would have,  
say, 4k expiring timers and maybe 12k timers that are started and  
then killed. That would make a 16k element map on which 3/4 of the  
operations are O(n=16k) (Einar).


I need a better abstraction I guess. I also need to be able to find  
timers by id instead of by name like now since each bot will use  
the same timer name for the same operation. I should have startTimer
return X and then kill the timer using the same X.


--
http://wagerlabs.com/







Timers (was Re: [Haskell-cafe] Optimizing a high-traffic network architecture)

2005-12-15 Thread Joel Reymont
After a chat with Einar on #haskell I realized that I would have,  
say, 4k expiring timers and maybe 12k timers that are started and  
then killed. That would make a 16k element map on which 3/4 of the  
operations are O(n=16k) (Einar).


I need a better abstraction I guess. I also need to be able to find  
timers by id instead of by name like now since each bot will use the  
same timer name for the same operation. I should have startTimer
return X and then kill the timer using the same X.


I'm looking for suggestions. Here's the improved code:

---
{-# OPTIONS_GHC -fglasgow-exts -fno-cse #-}
module Timer
    (
      startTimer,
      stopTimer
    )
    where

import qualified Data.Map as M
import System.Time
import System.IO.Unsafe
import Control.Exception
import Control.Concurrent

--- Map timer name and kick-off time to action
type Timers = M.Map (ClockTime, String) (IO ())

timeout :: Int
timeout = 500 -- 1 second

{-# NOINLINE timers #-}
timers :: MVar Timers
timers = unsafePerformIO $ do mv <- newMVar M.empty
                              forkIO $ checkTimers
                              return mv

--- Not sure if this is the most efficient way to do it
startTimer :: String -> Int -> (IO ()) -> IO ()
startTimer name delay io =
    do stopTimer name
       now <- getClockTime
       let plus   = TimeDiff 0 0 0 0 0 delay 0
           future = addToClockTime plus now
       block $ do t <- takeMVar timers
                  putMVar timers $ M.insert (future, name) io t

--- The filter expression is kind of long...
stopTimer :: String -> IO ()
stopTimer name =
    block $ do t <- takeMVar timers
               putMVar timers $
                   M.filterWithKey (\(_, k) _ -> k /= name) t

--- Now runs unblocked
checkTimers :: IO ()
checkTimers =
    do t <- readMVar timers -- takes it and puts it back
       case M.size t of
         -- no timers
         0 -> threadDelay timeout
         -- some timers
         _ -> do let (key@(time, _), io) = M.findMin t
                 now <- getClockTime
                 if (time <= now)
                    then do modifyMVar_ timers $ \a ->
                                return $! M.delete key a
                            try $ io -- don't think we care
                            return ()
                    else threadDelay timeout
       checkTimers



--
http://wagerlabs.com/







Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-15 Thread Joel Reymont
Here are statistics that I gathered. I'm almost done modifying the  
program to use 1 timer thread instead of 1 per bot as well as writing  
to the socket from the writer thread. This should reduce the number  
of threads from 6k (2k x 3) to 2k plus change.


It appears that +RTS -k3k does make a difference. As per Simon, 2-4k  
avoids the thread being garbage collected because each thread gets  
its own block in the storage manager. Simon, did I get that right?


BTW, how does garbage-collecting a thread work in this scenario? My
threads are very long-running.


The total is the number of bots launched, lobby is how many bots  
connected to the lobby. Failed is mostly due to connection reset by  
peer errors. The Windows C++ server uses IOCP and running a firewall  
was apparently interfering with that somehow. I hate Windows :-(.


--- Test#1 +RTS -k3k as per Simon. Keep-alive timeout of 9 minutes.

Total:   1961, Lobby:   1961, Failed:  0
Total:   2000, Lobby:   2000, Failed:  1

This test went smoothly and got to 2k connections very quickly. Maybe  
within 30 minutes or so. I did not gather CPU usage, etc. statistics.


--- Test #2, No thread stack increase, 1 minute keep-alive timeout,  
more network traffic


With a 1 minute timeout things run veeery slow. 86Mb physical and 158Mb
of VM with 1k bots, CPU 50-60%. Data sent/received is 60-70 packets  
and 6-7kb/sec. Killed after a while.


The statistics are phys/VM, CPU usage in % and #packets/transfer speed

Total:   1345, Lobby:   1326, Failed:  0, 102/184, 50%, 90/8kb
Total:   1395, Lobby:   1367, Failed:  2
Total:   1421, Lobby:   1394, Failed:  4
Total:   1490, Lobby:   1463, Failed:  4, 108/194, 50%, 110/11Kb
Total:   1574, Lobby:   1546, Failed:  4, 113/202, 50%, 116/11kb

--- Test #3, Rebuilding app with basic logging only (level 10). Still
veeery slow. Started ~6pm


Total:   121, Lobby:   118, Failed:  1
Total:   521, Lobby:   509, Failed:  13, 46/104, 20-30%, 35/3kb
Total:   1055, Lobby:   1044, Failed:  13, 94/168, 50%
Total:   1325, Lobby:   1313, Failed:  13
Total:   1566, Lobby:   1553, Failed:  13, 126/215, 70-80%,
Total:   1692, Lobby:   1680, Failed:  13, 136/228, 80%
Total:   1728, Lobby:   1715, Failed:  13, 140/234, 85%
Total:   1746, Lobby:   1733, Failed:  13, 140/235, 50-85%, 6:39pm
Total:   1818, Lobby:   1805, Failed:  13, 145/240, 60-85%,
Total:   1896, Lobby:   1883, Failed:  13, 153/250, 60-85%, 7:01pm
Total:   1933, Lobby:   1919, Failed:  13, 155/255, 70-85%, 7:12pm

System has 216Mb of spare physical memory at this point but the app  
seems to spend most of the time collecting garbage.


Total:   1999, Lobby:   1986, Failed:  13, 162/262, 65-86%, 7:41pm

--
http://wagerlabs.com/







Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-15 Thread Joel Reymont
Something like this. If someone inserts a timer while we are doing  
our checking we can always catch it on the next iteration of the loop.


--- Now runs unblocked
checkTimers :: IO ()
checkTimers =
    do t <- readMVar timers -- takes it and puts it back
       case M.size t of
         -- no timers
         0 -> threadDelay timeout
         -- some timers
         n -> do let (key@(time, name), io) = M.findMin t
                 now <- getClockTime
                 if (time <= now)
                    then do modifyMVar_ timers $ \a ->
                                return $! M.delete key a
                            try $ io -- don't think we care
                            return ()
                    else threadDelay timeout
       checkTimers

On Dec 15, 2005, at 9:39 AM, Tomasz Zielonka wrote:


Perhaps you could use modifyMVar:

http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurrent-MVar.html#v%3AmodifyMVar


  modifyMVar_ :: MVar a -> (a -> IO a) -> IO ()

  A safe wrapper for modifying the contents of an MVar. Like withMVar,
  modifyMVar will replace the original contents of the MVar if an
  exception is raised during the operation.


--
http://wagerlabs.com/







Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-15 Thread Joel Reymont


On Dec 15, 2005, at 12:08 AM, Einar Karttunen wrote:


timeout = 500 -- 1 second


Is that correct?


I think so. threadDelay takes microseconds.


Here is a nice trick for you:


Thanks!


--- The filter expression is kind of long...
stopTimer :: String -> IO ()
stopTimer name =
    block $ do t <- takeMVar timers
               putMVar timers $
                   M.filterWithKey (\(_, k) _ -> k /= name) t


And slow. This is O(size_of_map)


Any way to optimize it? I need timer ids so that I can remove a timer
before it expires. And I need ClockTime as the key so that I don't
have to wake up every second, etc.
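
One way to keep both properties (a sketch, not something from the thread):
keep the deadline-ordered map as it is, and maintain a second index from
timer name to kick-off time, so stopping a timer becomes two O(log n)
operations instead of a filter over the whole map. This assumes at most one
live timer per name, which matches how startTimer is used above.

import qualified Data.Map as M
import System.Time (ClockTime)

-- The ordered map the checker thread scans, plus a name index for
-- O(log n) stops.
data TimerState = TimerState
    { byDeadline :: M.Map (ClockTime, String) (IO ())
    , byName     :: M.Map String ClockTime
    }

stopTimer' :: String -> TimerState -> TimerState
stopTimer' name st =
    case M.lookup name (byName st) of
      Nothing -> st
      Just t  -> TimerState
          { byDeadline = M.delete (t, name) (byDeadline st)
          , byName     = M.delete name (byName st)
          }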


Joel

--
http://wagerlabs.com/







Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-15 Thread Tomasz Zielonka
On Thu, Dec 15, 2005 at 09:32:38AM +, Joel Reymont wrote:
> Well, my understanding is that once I do a takeMVar I must do a  
> putMVar under any circumstances. This is why I was blocking checkTimers.

Perhaps you could use modifyMVar:

http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurrent-MVar.html#v%3AmodifyMVar

  modifyMVar_ :: MVar a -> (a -> IO a) -> IO ()

  A safe wrapper for modifying the contents of an MVar. Like withMVar,
  modifyMVar will replace the original contents of the MVar if an
  exception is raised during the operation.

  modifyMVar :: MVar a -> (a -> IO (a, b)) -> IO b

  A slight variation on modifyMVar_ that allows a value to be returned (b)
  in addition to the modified value of the MVar.

Best regards
Tomasz

-- 
I am searching for a programmer who is good at least in some of
[Haskell, ML, C++, Linux, FreeBSD, math] for work in Warsaw, Poland


Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-15 Thread Joel Reymont
Well, my understanding is that once I do a takeMVar I must do a  
putMVar under any circumstances. This is why I was blocking checkTimers.


On Dec 15, 2005, at 12:08 AM, Einar Karttunen wrote:


Is there a reason you need block for checkTimers?
What you certainly want to do is ignore exceptions
from the timer actions.


--
http://wagerlabs.com/







Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-14 Thread Bulat Ziganshin
Hello Joel,

Wednesday, December 14, 2005, 7:55:36 PM, you wrote:

JR> With a 1 minute keep-alive timeout the system is starting to get stressed
JR> almost right away. There's verbose logging going on and almost every  
JR> event/packet sent and received is traced. The extra logging of the  
JR> timeout events probably adds to the stress and so, I assume, do the  
JR> extra packets.

oh, yes, I forgot to say that you can speed up logging by using a large
buffer on the logger handle, say:

hSetBuffering logger (BlockBuffering (Just 4096))

and of course avoid logging to the screen


-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]





Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-14 Thread Einar Karttunen
On 14.12 23:07, Joel Reymont wrote:
> Something like this? Comments are welcome!

> timeout :: Int
> timeout = 500 -- 1 second

Is that correct?

> {-# NOINLINE timers #-}
> timers :: MVar Timers
> timers = unsafePerformIO $ newMVar M.empty
> 
> --- Call this first
> initTimers :: IO ()
> initTimers =
> do forkIO $ block checkTimers
>    return ()

Here is a nice trick for you:

{-# NOINLINE timers #-}
timers :: MVar Timers
timers = unsafePerformIO $ do mv <- newMVar M.empty
                              forkIO $ block checkTimers
                              return mv


initTimers thus goes away.

> --- Not sure if this is the most efficient way to do it
> startTimer :: String -> Int -> (IO ()) -> IO ()
> startTimer name delay io =
>     do stopTimer name
>        now <- getClockTime
>        let plus   = TimeDiff 0 0 0 0 0 delay 0
>            future = addToClockTime plus now
>        block $ do t <- takeMVar timers
>                   putMVar timers $ M.insert (future, name) io t

I had code which used a global IORef containing
the current time. It was updated once a second
by a dedicated thread, but reading it was practically
free. It depends on how common the getClockTime calls are.
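
A sketch of that trick (the names are made up for illustration):

import Data.IORef
import Control.Concurrent (forkIO, threadDelay)
import System.Time (ClockTime, getClockTime)

-- One background thread refreshes the cached time once a second;
-- everybody else just reads the IORef, which is practically free
-- compared to a getClockTime call.
newCachedClock :: IO (IO ClockTime)
newCachedClock = do
    ref <- newIORef =<< getClockTime
    let loop = do threadDelay 1000000
                  writeIORef ref =<< getClockTime
                  loop
    _ <- forkIO loop
    return (readIORef ref)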

> --- The filter expression is kind of long...
> stopTimer :: String -> IO ()
> stopTimer name =
>     block $ do t <- takeMVar timers
>                putMVar timers $
>                    M.filterWithKey (\(_, k) _ -> k /= name) t

And slow. This is O(size_of_map)

> --- Tried to take care of exceptions here
> --- but the code looks kind of ugly

Is there a reason you need block for checkTimers?
What you certainly want to do is ignore exceptions
from the timer actions.


- Einar Karttunen


Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-14 Thread Joel Reymont


On Dec 14, 2005, at 7:48 PM, Tomasz Zielonka wrote:


You don't have to check "every few seconds". You can determine
exactly how much you have to sleep - just check the timeout/event with
the lowest ClockTime.


Something like this? Comments are welcome!

It would be cool to not have to export and call initTimers somehow.

---
{-# OPTIONS_GHC -fglasgow-exts -fno-cse #-}
module Timer
    (
      initTimers,
      startTimer,
      stopTimer
    )
    where

import qualified Data.Map as M
import System.Time
import System.IO.Unsafe
import Control.Exception
import Control.Concurrent

--- Map timer name and kick-off time to action
type Timers = M.Map (ClockTime, String) (IO ())

timeout :: Int
timeout = 500 -- 1 second

{-# NOINLINE timers #-}
timers :: MVar Timers
timers = unsafePerformIO $ newMVar M.empty

--- Call this first
initTimers :: IO ()
initTimers =
    do forkIO $ block checkTimers
       return ()

--- Not sure if this is the most efficient way to do it
startTimer :: String -> Int -> (IO ()) -> IO ()
startTimer name delay io =
    do stopTimer name
       now <- getClockTime
       let plus   = TimeDiff 0 0 0 0 0 delay 0
           future = addToClockTime plus now
       block $ do t <- takeMVar timers
                  putMVar timers $ M.insert (future, name) io t

--- The filter expression is kind of long...
stopTimer :: String -> IO ()
stopTimer name =
    block $ do t <- takeMVar timers
               putMVar timers $
                   M.filterWithKey (\(_, k) _ -> k /= name) t

--- Tried to take care of exceptions here
--- but the code looks kind of ugly
checkTimers :: IO ()
checkTimers =
    do t <- takeMVar timers
       case M.size t of
         -- no timers
         0 -> do putMVar timers t
                 unblock $ threadDelay timeout
         -- some timers
         n -> do let (key@(time, name), io) = M.findMin t
                 now <- getClockTime
                 if (time <= now)
                    then do putMVar timers $ M.delete key t
                            unblock io
                    else do putMVar timers t
                            unblock $ threadDelay timeout
       checkTimers

--
http://wagerlabs.com/







Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-14 Thread Joel Reymont


On Dec 14, 2005, at 7:48 PM, Tomasz Zielonka wrote:


You don't have to check "every few seconds". You can determine
exactly how much you have to sleep - just check the timeout/event with
the lowest ClockTime.


Right, thanks for the tip! I would need to wait a predefined amount of
time when the map is empty, though.


--
http://wagerlabs.com/







Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-14 Thread Tomasz Zielonka
On Wed, Dec 14, 2005 at 07:11:15PM +, Joel Reymont wrote:
> I figure I can have a single timer thread and a timer map keyed on  
> ClockTime. I would try to get the min. key from the map every few  
> seconds, compare it to the clock time, fire off the event as needed,
> remove the timer and repeat.

You don't have to check "every few seconds". You can determine
exactly how much you have to sleep - just check the timeout/event with
the lowest ClockTime.
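
A sketch of that calculation against the (ClockTime, name) map discussed
above (one-second granularity, matching the rest of the thread; error
handling omitted):

import qualified Data.Map as M
import Control.Concurrent (threadDelay)
import Control.Monad (when)
import System.Time

type Timers = M.Map (ClockTime, String) (IO ())

-- Sleep exactly until the earliest deadline instead of polling on a
-- fixed interval.  With an empty map we still need some default wait.
sleepUntilNext :: Timers -> IO ()
sleepUntilNext t
    | M.null t  = threadDelay 1000000   -- nothing pending: wait 1 second
    | otherwise = do
        let ((deadline, _), _) = M.findMin t
        now <- getClockTime
        let secs = tdSec (diffClockTimes deadline now)
        when (secs > 0) $ threadDelay (secs * 1000000)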

Best regards
Tomasz

-- 
I am searching for a programmer who is good at least in some of
[Haskell, ML, C++, Linux, FreeBSD, math] for work in Warsaw, Poland


Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-14 Thread Joel Reymont


On Dec 14, 2005, at 6:06 PM, Bulat Ziganshin wrote:


as I already said, you can write to the socket directly in your worker
thread


True. 1 less thread to deal with... multiplied by 4,000.


you can use just one timeouts thread for all your bots. if this
timeout is constant across program run, then this thread will be very
simple - just:


Well, the bots may take a couple of hours to get on board. I don't  
think using one thread with a constant timeout is appropriate. This  
is also a keep-alive timeout, meaning that the bot sends a ping to the
server whenever the timer is fired.


I figure I can have a single timer thread and a timer map keyed on  
ClockTime. I would try to get the min. key from the map every few  
seconds, compare it to the clock time, fire off the event as needed,
remove the timer and repeat. This way I will have a single timer  
thread but as many timers as I need.


Thanks, Joel

--
http://wagerlabs.com/







Re: [Haskell-cafe] Optimizing a high-traffic network architecture

2005-12-14 Thread Bulat Ziganshin
Hello Joel,

Wednesday, December 14, 2005, 7:55:36 PM, you wrote:

JR> In my current architecture I launch two threads per socket where
JR> the socket reader places results in a TMVar and the socket writer  
JR> takes input from a TChan.

as I already said, you can write to the socket directly in your worker
thread

JR> I also have the worker thread that does the
JR> bulk of packet processing and a timer thread. The timer thread sleeps
JR> for a few minutes and exits after posting a timeout event if it
JR> hasn't been killed before.

you can use just one timeouts thread for all your bots. If this
timeout is constant across the program run, then this thread will be very
simple - just:

1) read from the Chan (yes, this is a case where using a Chan will be
appropriate! ;)
2) wait until 9 or so minutes from the time when this message was sent
3) send a kill signal to the thread mentioned in the message

so, you will have only 2 threads. You can then try to play with
combining socket reading and TMVar reading in one thread (btw, try
to replace the TMVar with an MVar - maybe it will be better?). Or you can
try to create one socket-reading thread which will service all the sockets.
Maybe this can be done with the help of the select() system call?
It is a more "right way", but I don't know how this can be
accomplished.
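
A minimal sketch of that single-timeouts-thread recipe (assuming the
keep-alive period really is constant for the whole run; the Chan carries
each bot's start time and the ThreadId to kill):

import Control.Concurrent
import Control.Monad (when)
import System.Time

keepAlive :: Int            -- seconds; constant across the program run
keepAlive = 9 * 60

reaper :: Chan (ClockTime, ThreadId) -> IO ()
reaper chan = do
    (started, victim) <- readChan chan
    now <- getClockTime
    let left = keepAlive - tdSec (diffClockTimes now started)
    when (left > 0) $ threadDelay (left * 1000000)
    killThread victim
    reaper chan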

-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]


