[Haskell-cafe] trick to easily generate Eq/Ord instances

2005-12-13 Thread Bulat Ziganshin
Hello

sometimes Eq/Ord instances can't be derived automatically because we
need to compare only part of the fields. in such situations i use the
following trick to simplify writing the instances:

data ArchiveBlock = ArchiveBlock {
    blArchive    :: Archive
  , blType       :: BlockType
  , blCompressor :: Compressor
  , blPos        :: Integer
  , blOrigSize   :: Integer
  , blCompSize   :: Integer
  , blCRC        :: CRC
  , blFiles      :: Int
  }

instance Eq ArchiveBlock where
  (==)    =  map2eq  $ map3 (blArchive,blPos,blCRC)

instance Ord ArchiveBlock where
  compare =  map2cmp $ map2 (blArchive,blPos)

{-
instance Ord ArchiveBlock where
  compare =  map2cmp blPos  -- for comparison on just one field
-}



-- Utility functions
map2    (f,g) a  =  (f a, g a)
map3  (f,g,h) a  =  (f a, g a, h a)
keyval  f x      =  (f x, x)              -- |Return pair containing computed key and original value
map2cmp f x y    =  (f x) `compare` (f y) -- |Converts key_func to compare_func
map2eq  f x y    =  (f x) == (f y)        -- |Converts key_func to eq_func
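
for illustration, here is the same trick on a small self-contained record,
using the utility functions above (Person and its fields are made up, not
from my real code):

data Person = Person { name :: String, age :: Int, notes :: String }

-- equality and ordering look only at name and age, ignoring notes
instance Eq Person where
  (==)    = map2eq  $ map2 (name, age)

instance Ord Person where
  compare = map2cmp $ map2 (name, age)

main = print (Person "ann" 30 "x" == Person "ann" 30 "y")   -- prints True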
  



-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Bringing Erlang to Haskell

2005-12-13 Thread Bulat Ziganshin
Hello Joel,

Monday, December 12, 2005, 7:00:46 PM, you wrote:

JR 1) Processes, aka threads with single-slot in/out mailboxes

have you read the description of my own Process library on the haskell
mailing list?

JR One particular thing that bugs me is that I cannot really use TChan
JR for thread mailboxes.

i do. but i limit the number of messages in this channel by additional
means. you can easily do the same. but first ask yourself - what will you
gain by this? imho, it will only help to smooth out temporary speed
changes. if you just want to test whether this can speed up your
program - implement such a bounded channel and test whether it works
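
for example, a bounded channel can be sketched in a few lines of STM
(BoundedChan and this counter scheme are only an illustration here, not my
actual tools):

import Control.Concurrent.STM

data BoundedChan a = BoundedChan (TVar Int) (TChan a)

newBoundedChan :: Int -> STM (BoundedChan a)
newBoundedChan limit = do
  free <- newTVar limit
  chan <- newTChan
  return (BoundedChan free chan)

-- writers retry (block) once 'limit' messages are already unprocessed
writeBoundedChan :: BoundedChan a -> a -> STM ()
writeBoundedChan (BoundedChan free chan) x = do
  n <- readTVar free
  if n <= 0 then retry else writeTVar free (n-1)
  writeTChan chan x

readBoundedChan :: BoundedChan a -> STM a
readBoundedChan (BoundedChan free chan) = do
  x <- readTChan chan
  n <- readTVar free
  writeTVar free (n+1)
  return x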

btw, i suggested you try not using a logging thread at all, making
all logging actions synchronous

JR I found single-slot mailboxes (TMVar) to work much better as they
JR pace the overall message flow. Using them means that asynchronous  
JR messages cannot be implemented, though.

not exactly. they can hold at most one message
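
for illustration, such a single-slot mailbox is literally just a TMVar
(Mailbox/send/receive are made-up names for this sketch):

import Control.Concurrent.STM

type Mailbox a = TMVar a

-- send blocks only while the previous message has not been taken yet,
-- so it is asynchronous up to one pending message
send :: Mailbox a -> a -> IO ()
send mbox msg = atomically (putTMVar mbox msg)

receive :: Mailbox a -> IO a
receive mbox = atomically (takeTMVar mbox)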

i think that your aspiration to make things asynchronous is just a sort
of fashion. what do you really want to get?



-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[2]: [Haskell-cafe] Tricky exception handling

2005-12-13 Thread Bulat Ziganshin
Hello Joel,

Monday, December 12, 2005, 7:26:23 PM, you wrote:

JR Unless I'm mistaken, the code above will run forever and will not
JR exit on exception.

yes, you are mistaken! :)  this code will loop forever until an
exception arrives. at that point it will run the exception handler and
then exit the whole function. you must reread the docs. hmm, actually this
is the way exception handling works in ANY language

 handle (...)
   repeat_forever
     do cmd <- read h ssl
        post $! Cmd $! cmd
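
for reference, a runnable sketch of the same pattern (readCmd and post
stand in for your read/post, which are not shown here; the handler type is
written out in today's Control.Exception terms):

import Control.Exception (SomeException, handle)
import Control.Monad (forever)

readerLoop :: IO String -> (String -> IO ()) -> IO ()
readerLoop readCmd post =
  handle (\e -> putStrLn ("stopped: " ++ show (e :: SomeException))) $
    forever $ do
      cmd <- readCmd
      post cmd
-- the loop runs until readCmd or post throws; then the handler runs
-- and readerLoop returns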

JR --
JR http://wagerlabs.com/







-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[2]: [Haskell-cafe] Opening the same file multiple times

2005-12-13 Thread Bulat Ziganshin
Hello Einar,

Monday, December 12, 2005, 5:01:20 PM, you wrote:

EK 3) Using System.Posix.IO

EK Using the fd{Read,Close,Write} functions from System.Posix.IO
EK could solve the problem - except that there is no way to
EK write binary buffers (Ptr Word8) with the API. Thus no
EK solution.

you can easily import these functions via FFI:

foreign import ccall unsafe "HsBase.h read"
   c_read :: CInt -> Ptr CChar -> CSize -> IO CSsize

moreover, they are already imported by System.Posix.Internals. and
even better - they work both under Windows and Unix
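
for example, a binary-buffer write needs only a tiny wrapper over the
c_write that System.Posix.Internals already exports (writeBuf is an
illustrative name, not an existing function):

import Data.Word (Word8)
import Foreign.C.Error (throwErrnoIfMinus1Retry)
import Foreign.Ptr (Ptr, castPtr)
import System.Posix.Internals (c_write)
import System.Posix.Types (Fd(..))

-- write a binary buffer to an already-open file descriptor
writeBuf :: Fd -> Ptr Word8 -> Int -> IO Int
writeBuf (Fd fd) buf len =
  fmap fromIntegral $
    throwErrnoIfMinus1Retry "writeBuf" $
      c_write fd (castPtr buf) (fromIntegral len)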


below is a part of file api i proposed for inclusion in ghc. i think
it is exactly what you need:


{-# OPTIONS_GHC -fvia-C -fglasgow-exts -fno-monomorphism-restriction #-}
module FD where

import Control.Monad
import Data.Bits
import Data.Int
import Data.Word
import Foreign.C.Types
import Foreign.C.Error
import Foreign.C.String
import Foreign.Marshal.Alloc
import Foreign.Ptr
import System.IO
import System.IO.Error
import System.Posix.Internals
import System.Posix.Types
import System.Win32

type FD = CInt                -- handle of open file
type CWFilePath   = CString   -- filename in C land
type CWFileOffset = COff      -- filesize or filepos in C land
type FileSize = Integer       -- filesize or filepos in Haskell land
withCWFilePath = withCString  -- FilePath->CWFilePath conversion
peekCWFilePath = peekCString  -- CWFilePath->FilePath conversion

fdOpen :: String -> CInt -> CMode -> IO FD
fdOpen name access mode =
  modifyIOError (`ioeSetFileName` name) $
    withCWFilePath name $ \p_name ->
      throwErrnoIfMinus1Retry "fdOpen" $
        c_open p_name access mode

fdClose :: FD -> IO ()
fdClose fd =
  throwErrnoIfMinus1Retry_ "fdClose" $
    c_close fd

fdGetBuf :: FD -> Ptr a -> Int -> IO Int
fdGetBuf fd buf size =
  fromIntegral `liftM`
    (throwErrnoIfMinus1Retry "fdGetBuf" $
      c_read fd (castPtr buf) (fromIntegral size))

fdPutBuf :: FD -> Ptr a -> Int -> IO ()
fdPutBuf fd buf size =
  throwErrnoIfMinus1Retry_ "fdPutBuf" $
    c_write fd (castPtr buf) (fromIntegral size)   -- to do: check that result==size?

fdTell :: FD -> IO FileSize
fdTell fd =
  fromIntegral `liftM`
    throwErrnoIfMinus1Retry "fdTell"
      (c_tell fd)

fdSeek :: FD -> SeekMode -> FileSize -> IO ()
fdSeek fd mode offset =
  throwErrnoIfMinus1Retry_ "fdSeek" $
    c_lseek fd (fromIntegral offset) whence
  where whence = case mode of
                   AbsoluteSeek -> sEEK_SET
                   RelativeSeek -> sEEK_CUR
                   SeekFromEnd  -> sEEK_END

fdFileSize :: FD -> IO FileSize
fdFileSize fd =
  fromIntegral `liftM`
    throwErrnoIfMinus1Retry "fdFileSize"
      (c_filelength fd)

{- open/close/truncate/dup

  new_fd <- throwErrnoIfMinus1 "dupHandle" $
              c_dup (fromIntegral (haFD h_))
  new_fd <- throwErrnoIfMinus1 "dupHandleTo" $
              c_dup2 (fromIntegral (haFD h_)) (fromIntegral (haFD hto_))
-}

foreign import ccall unsafe "HsBase.h tell"
   c_tell :: CInt -> IO COff

foreign import ccall unsafe "HsBase.h filelength"
   c_filelength :: CInt -> IO COff

foreign import ccall unsafe "__hscore_bufsiz"   dEFAULT_BUFFER_SIZE :: Int
foreign import ccall unsafe "__hscore_seek_cur" sEEK_CUR :: CInt
foreign import ccall unsafe "__hscore_seek_set" sEEK_SET :: CInt
foreign import ccall unsafe "__hscore_seek_end" sEEK_END :: CInt

i = fromIntegral
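
and a small usage sketch of the api above (the demo function, file name and
flag combination are only an example; o_WRONLY and friends come from
System.Posix.Internals, which the module already imports):

demo :: IO ()
demo = do
  fd <- fdOpen "test.bin" (o_WRONLY .|. o_CREAT .|. o_BINARY) 0o644
  -- write a small binary buffer, then report the resulting file size
  withCStringLen "hello" $ \(buf, len) ->
    fdPutBuf fd (castPtr buf) len
  size <- fdFileSize fd
  print size
  fdClose fd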





-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Bringing Erlang to Haskell

2005-12-13 Thread Joel Reymont


On Dec 13, 2005, at 1:13 AM, Bulat Ziganshin wrote:


have you read the description of my own Process library on the haskell
mailing list?


No. Can you give me a pointer?


btw, i suggested you try not using a logging thread at all, making
all logging actions synchronous


I cannot. Only one thread can use stdout, otherwise the output is  
garbled. Plus, combing through a few thousand individual files  
produced by the threads would be a pain.



i think that your aspiration to make things asynchronous is just a sort
of fashion. what do you really want to get?


I have no such aspirations. I was just making a point.

Joel

--
http://wagerlabs.com/





___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Re[2]: [Haskell-cafe] Tricky exception handling

2005-12-13 Thread Joel Reymont

Yes, you are right. I will make the change.

On Dec 13, 2005, at 9:28 AM, Bulat Ziganshin wrote:


yes, you are mistaken! :)  this code will loop forever until an
exception arrives. at that point it will run the exception handler and
then exit the whole function. you must reread the docs. hmm, actually this
is the way exception handling works in ANY language


handle (...)
  repeat_forever
    do cmd <- read h ssl
       post $! Cmd $! cmd


--
http://wagerlabs.com/





___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Bringing Erlang to Haskell

2005-12-13 Thread Joel Reymont


On Dec 13, 2005, at 9:49 AM, Tomasz Zielonka wrote:


On Mon, Dec 12, 2005 at 04:00:46PM +, Joel Reymont wrote:

One particular thing that bugs me is that I cannot really use TChan
for thread mailboxes. I don't think I experienced this problem with
Erlang but using a TChan with a logger thread quickly overwhelms the
logger and fills the TChan and a lot (hundreds? thousands) of other
threads are logging to it.


I wonder what Erlang does to solve this problem? Perhaps we should
track the number of unprocessed messages in TChans, and the bigger it
is, the more we favor consumers over producers.


It sounds to me that introducing thread priorities would be key.

Here's a reply from Ulf Wiger (an Erlang expert):

Erlang has four process priorities:
- 'max' and 'high' are strict priorities (you stay at that level
  while there are processes ready to run)
- normal and low are scheduled fairly, I believe with
  8000 reductions (roughly function calls) at normal
  priority and one low-priority job (if such a job exists)

The scheduler works on reduction count.
A context switch happens if the current process
would block (e.g. if it's in a receive statement
and there is no matching message in the queue),
or when it's executed 1000 reductions.

Since not all operations are equal in cost, there
is an internal function called erlang:bump_reductions(N).
File operations are usually followed by a call to
erlang:bump_reductions(100) (see prim_file:get_drv_response/1)
which means that processes writing to disk run out of
their time slice in fewer function calls. This is
of course to keep them from getting an unfair amount
of CPU time.

A logging process would probably therefore do well to
fetch all messages in the message queue and write them
using disk_log:alog_terms/2 (logging multiple messages
each time).

One could also possibly run the logger process at high
priority. This means that normal priority process will
have a hard time starving it. If the disk is slow, the
logger process will yield while waiting for the disk
(which won't block the runtime system as long as you
have the thread pool enabled).

In general, I think that processes that mainly dispatch
messages, and don't generate any work on their own,
should usually run on high priority. Otherwise, they
tend to just contribute to delays in the system.

/Uffe

--
http://wagerlabs.com/





___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[2]: [Haskell-cafe] Bringing Erlang to Haskell

2005-12-13 Thread Bulat Ziganshin
Hello Tomasz,

Tuesday, December 13, 2005, 12:49:04 PM, you wrote:

TZ On Mon, Dec 12, 2005 at 04:00:46PM +, Joel Reymont wrote:
 One particular thing that bugs me is that I cannot really use TChan  
 for thread mailboxes. I don't think I experienced this problem with  
 Erlang but using a TChan with a logger thread quickly overwhelms the  
 logger and fills the TChan and a lot (hundreds? thousands) of other  
 threads are logging to it. 

TZ I wonder what Erlang does to solve this problem? Perhaps we should track
TZ the number of unprocessed messages in TChans, and the bigger it is,
TZ the more we favor consumers over producers.

even better - always prefer the consumer to the producer :)  maybe have two
lists - one of threads waiting to consume, and one of threads waiting to
produce?



-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Substring replacements

2005-12-13 Thread Daniel Fischer
On Monday, 12 December 2005 16:28, you wrote:
 From: Daniel Fischer [EMAIL PROTECTED]

 To: Branimir Maksimovic [EMAIL PROTECTED]
 CC: Haskell-Cafe@haskell.org
 Subject: Re: [Haskell-cafe] Substring replacements
 Date: Mon, 12 Dec 2005 16:15:46 +0100
 
 Earlier today:
   Sorry, but
   Prelude SearchRep> searchReplace "abaaba" "##" "abababaaba"
   "abababaaba"
  
   I haven't analyzed the algorithm, so I don't know why exactly this
 
 fails.
 
   I'll take a look sometime soon.
 
  I found the problem (one at least).
  Say the pattern to be replaced begins with 'a' and we have a sufficiently
  long match with the pattern starting at the first 'a' in the String. Upon
  encountering the second 'a', while the first pattern still matches, you
  start pushing onto the rollback-stack. But that isn't inspected anymore,
  so if the actual occurrence of the pattern starts at the third (or fourth,
  n-th) occurrence of 'a' and that is already pushed onto the rollback, you
  miss it.

  I've corrected this by adjusting the rollback position. If rollBack is null
  then the search for a rollback starts at the second character; if not, it
  starts at the same character as was searched, because I skip what was
  searched. That's all.
  Though I'm not so sure now when I read this.

Still not working:

*New> searchReplace "abababc" "#" "ababababababc"
"ababababababc"
*New> searchReplace1 "abababc" "#" "ababababababc"
"ababababababc"


 So the question is, can we find a cheap test to decide whether to use KMP
 or
 Bulat's version?

 In real world situation your KMP will always be fastest on average.
 I like that we are not using C arrays as then we have advantage
 of lazyness and save on memory usage. C++ program will be faster
 on shorter strings but on this large strings will loose due memory
 latency. and with your test, both programs are very fast.

 Greetings, Bane.


On my 256MB RAM AMD Duron 1200 MHz, Bulat's version is consistently about 20% 
faster than my KMP on your test -- btw, I unboxed the pat array, which gave a 
bit of extra speed, but not much. 
And apologies to Sebastian Sylvan, I also included an unboxed version of bord, 
built from the boxed version, and that sped things further up -- not much, 
again, but there it is.
I wonder about this difference, -10% on one system and +20% on another
system, is that normal?

Cheers, Daniel
--
Up-To-Date version of KMP:

import Data.Array.Unboxed (UArray, listArray, (!))
import qualified Data.Array as A (array, (!), elems)

searchReplace :: String -> String -> String -> String
searchReplace ""  _  str = str
searchReplace src@(c:cs) dst str = process 0 str ""
    where
      len = {-# SCC "len" #-} length src
      pat :: UArray Int Char
      pat = {-# SCC "pat" #-} listArray (0,len-1) src
      bord = {-# SCC "bord" #-} A.array (0,len) $ (0,-1):(1,0):
             [(i+1, getBord (pat!i) i + 1) | i <- [1 .. len-1]]
      getBord s n
         | m < 0      = m
         | s == pat!m = m
         | otherwise  = getBord s m
           where
             m = bord A.! n
      bor :: UArray Int Int
      bor = listArray (0,len) $ A.elems bord
      getBor s n
         | m < 0 || s == pat!m = m
         | otherwise           = getBor s m
           where
             m = bor!n
      process n str _ | n >= len = {-# SCC "process" #-} dst ++ process 0 str ""

      process _ "" mat = {-# SCC "process" #-} reverse mat
      process 0 (s:st) _
         | s == c    = {-# SCC "process" #-} process 1 st [s]
         | otherwise = {-# SCC "process" #-} s : process 0 st ""
      process n str@(s:st) mat
         | s == pat!n = {-# SCC "process" #-} process (n+1) st (s:mat)
         | otherwise  = {-# SCC "process" #-}
            let j = getBor s n
                (ret,skip) = splitAt j mat
            in if j < 0 then reverse mat ++ process 0 str ""
               else reverse skip ++ process (j+1) st (s:ret)

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] The Price of Performance

2005-12-13 Thread Joel Reymont

I thought this would be of interest to the Haskell community:

http://www.acmqueue.com/modules.php?name=Content&pa=showpage&pid=330

The ever-growing popularity of small multiprocessors is exposing  
more programmers to parallel hardware. More tools to spot correctness  
and performance problems are becoming available (e.g., thread  
checkers8 and performance debuggers9). Also, a few expert programmers  
can write efficient threaded code that is in turn leveraged by many  
others. Fast-locking and thread-efficient memory allocation libraries  
are good examples of programming work that is highly leveraged.


Viva le STM!

--
http://wagerlabs.com/





___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[2]: [Haskell-cafe] Bringing Erlang to Haskell

2005-12-13 Thread Bulat Ziganshin
Hello Joel,

Tuesday, December 13, 2005, 1:05:10 PM, you wrote:
 have you read the description of my own Process library on the haskell
 mailing list?

JR No. Can you give me a pointer?

i will forward it to you. it is worth being subscribed there, just
to see interesting announcements

 btw, i suggested you try not using a logging thread at all, making
 all logging actions synchronous

JR I cannot. Only one thread can use stdout, otherwise the output is  
JR garbled. Plus, combing through a few thousand individual files  
JR produced by the threads would be a pain.

:)))

import Control.Concurrent
import Control.Monad
import System.IO
import System.IO.Unsafe

main = do h <- openBinaryFile "test" WriteMode
          for [1..100] $ \n ->
            forkIO $
              for [1..] $ \i ->
                logger h ("thread "++show n++" msg "++show i)
          getLine
          hClose h

lock = unsafePerformIO $ newMVar ()

logger h msg = withMVar lock $ const $ do
                 hPutStrLn h msg
                 putStrLn msg

for = flip mapM_

-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Fwd: [Haskell] ANNOUNCE: Process library (for dataflow-oriented programming?)

2005-12-13 Thread Bulat Ziganshin
This is a forwarded message
From: Bulat Ziganshin [EMAIL PROTECTED]
To: haskell@haskell.org
Date: Thursday, December 08, 2005, 1:36:05 AM
Subject: [Haskell] ANNOUNCE: Process library (for dataflow-oriented 
programming?)

===8==Original message text===
Hello haskell,

Joel's program (discussed in the cafe), which now uses MVars instead of
Channels to send data between threads, may be a good example of a
dataflow-driven program: it consists of many hundreds of threads, and
when one thread sends data to another through an MVar, this thread in
most cases goes to sleep until the receiving thread has processed the
previous value of this MVar. so, threads are woken up and put to sleep
according to the values passed between them, and the whole program
executes in an order defined by these data dependencies, as opposed to
the order of program statements

one year ago i developed a small library which can be helpful if you
want to use such a style of programming. its ideas are modelled after
Unix pipes, which are widely used to assemble complex data-processing
engines from simple parts. really this library is a very thin
layer over direct use of forkOS, channels and MVars; nevertheless,
it is very convenient and beautiful

you can download the library at http://freearc.narod.ru/Process.tar.gz
this page also contains the sources of my program, where you can find
examples of using the library in a real toy :)  below is a guide to
library usage


to create a pipe which contains 3 processes - producer, transformer
and consumer:

runP ( producer | transformer | consumer )

each process in the pipe is run in a separate Haskell thread. a process is
represented by an ordinary Haskell function which gets an additional
parameter - a handle, which can be used to receive data from the previous
process in the pipe (using receiveP) and send data to the next process
(using sendP). for example, the abovementioned processes can be implemented
as:

producer handle = mapM_ (sendP handle) [1..10]

transformer handle = replicateM_ 10 $
    do x <- receiveP handle
       sendP handle (x*2)

consumer handle = replicateM_ 10 $
    do x <- receiveP handle
       print x


if the first process in the pipe tries to use receiveP, or the last process
in the pipe tries to use sendP, then a run-time exception is generated. the
number of processes in a pipe can be arbitrary. because each process is just
an ordinary Haskell function, you can add additional parameters to
processes when constructing pipes:

runP ( producer | multiple 2 | multiple 3 | consumer )

multiple n handle = replicateM_ 10 $
    do x <- receiveP handle
       sendP handle (x*n)


moreover, you can construct a pipe or part of it as an ordinary data value,
which can then be run by runP:

let pipe = case multipliers of
     [x]     -> multiple x
     [x,y]   -> multiple x | multiple y
     [x,y,z] -> multiple x | multiple y | multiple z
     _       -> \handle -> fail "Zero or too many multipliers"
runP ( producer | pipe | consumer )


there is also a back channel, which can be used to return data to the
previous process in the pipe; its operations are send_backP and
receive_backP. it can be used to return acknowledgments, synchronize
processes or return resources. a brief example of its usage:

producer: sendP pipe (buf,len)  consumer: ;
  ;   (buf,len) - receiveP pipe
  ;   hPutBuf file buf len
  ;   send_backP pipe ()
  receive_backP pipe  ;
  --now we know that buf is free  ;

(i organized lines to show execution order)


if processes are joined in a pipe with | then the channel between them
uses an MVar, so at any moment it may contain no more than 1 element. if
the channel between two processes is created with | then a Chan is used,
which can contain an arbitrary number of data items. be careful with such
channels, because they can grow to unlimited size. both kinds of channel
can be arbitrarily combined in one pipe:

runP ( producer | multiple 2 | multiple 3 | consumer )

the back channel (used by send_backP and receive_backP) is always
multi-element (it uses a Chan)


runP returns when all processes in the pipe are finished. if any process
in the pipe generates an uncaught exception, then all processes in the pipe
are killed and this exception is re-raised in the thread that called runP

a pipe or a single process can also be run in the background using
runAsyncP:

handle <- runAsyncP (multiple 2)

the handle returned here can be used to interact with the first and last
processes in the pipe, in contrast to runP:

handle <- runAsyncP (multiple 2)
sendP handle 1
res <- receiveP handle

of course, a pipe run asynchronously is not required to perform
input, output, or both:

handle <- runAsyncP ( producer | transformer )
handle <- runAsyncP ( transformer | consumer )
handle <- runAsyncP ( producer | transformer | consumer )

currently channels to 

Re: Re[2]: [Haskell-cafe] Bringing Erlang to Haskell

2005-12-13 Thread Joel Reymont
Thank you Bulat, makes total sense. This list is a treasure trove of  
a resource.


I guess this is what happens when you go from Erlang to Haskell :-).  
I'm conditioned to think of everything as a process and to use
processes for everything.


On Dec 13, 2005, at 11:17 AM, Bulat Ziganshin wrote:


import Control.Concurrent
import Control.Monad
import System.IO
import System.IO.Unsafe

main = do h <- openBinaryFile "test" WriteMode
          for [1..100] $ \n ->
            forkIO $
              for [1..] $ \i ->
                logger h ("thread "++show n++" msg "++show i)
          getLine
          hClose h

lock = unsafePerformIO $ newMVar ()

logger h msg = withMVar lock $ const $ do
                 hPutStrLn h msg
                 putStrLn msg

for = flip mapM_


--
http://wagerlabs.com/





___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: [Haskell] ANNOUNCE: Process library (for dataflow-oriented programming?)

2005-12-13 Thread Joel Reymont

Bulat,

How is your library licensed?

How can a process maintain internal state?

How would I use your library to code a socket reader/writer that  
writes received events to the socket and propagates back anything  
that is received?


The producer/consumer in front of this network client would be  
another process that analyzes the events sent back to it and produces  
events based on the analysis.


How would I use it to launch a few network clients that sit there
and process events until they decide to quit? The whole program
needs to stay up until the last network client has exited.


The pipeline to me looks like this:

   - Bot - Socket client ... Server
 /
Bot launcher ---   - Bot - Socket client ... Server
 \
   - Bot - Socket client ... Server

Where bot launcher starts a predefined # of bots and collects results  
sent back by each one.


I think your library looks a bit like Yampa in that your processes  
are somewhat like signal functions.


Thanks, Joel

--
http://wagerlabs.com/





___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Substring replacements

2005-12-13 Thread Branimir Maksimovic





From: Daniel Fischer [EMAIL PROTECTED]
To: Branimir Maksimovic [EMAIL PROTECTED]
CC: Haskell-Cafe@haskell.org
Subject: Re: [Haskell-cafe] Substring replacements
Date: Tue, 13 Dec 2005 11:23:29 +0100

On Monday, 12 December 2005 16:28, you wrote:
 From: Daniel Fischer [EMAIL PROTECTED]

 To: Branimir Maksimovic [EMAIL PROTECTED]
 CC: Haskell-Cafe@haskell.org
 Subject: Re: [Haskell-cafe] Substring replacements
 Date: Mon, 12 Dec 2005 16:15:46 +0100
 
 Earlier today:
   Sorry, but
   Prelude SearchRep searchReplace abaaba ## abababaaba
   abababaaba
  
   I haven't analyzed the algorithm, so I don't know why exactly this
 
 fails.
 
   I'll take a look sometime soon.
 
 I found the problem (one at least).
 Say the pattern to be replaced begins with 'a' and we have a 
sufficiently

 long
 match with the pattern starting at the first 'a' in the String. Upon
 encountering the second 'a', while the first pattern still matches, you
 start
 pushing onto the rollback-stack. But that isn't inspected anymore, so 
if

 the
 actual occurence of the pattern starts at the third (or fourth, n-th)
 occurence of 'a' and that is already pushed onto the rollback, you miss
  it.

 I've corrected this with adjusting rollback position. if rollBack is 
null

 then
 search for rollback starts at second character if not starts at same as
 searhed
 character because I skip what was searched. That's all.
 Though I'm not so sure now when I read this.

Still not working:

*New> searchReplace "abababc" "#" "ababababababc"
"ababababababc"
*New> searchReplace1 "abababc" "#" "ababababababc"
"ababababababc"



Yes, perhaps you've missed another post of mine. I noticed
that problem when the pattern repeats more than 2 times and gave up,
because now whatever I do, your version is always fastest.



 So the question is, can we find a cheap test to decide whether to use 
KMP

 or
 Bulat's version?


Just interleave a string with search hits with one with no search hits
(that means no partial ones either), and your version will gain in speed.
With more partial matches and full search matches, Bulat's version gains
in speed.
With longer search strings, your version will have gains.



 In real world situation your KMP will always be fastest on average.
 I like that we are not using C arrays as then we have advantage
 of lazyness and save on memory usage. C++ program will be faster
 on shorter strings but on this large strings will loose due memory
 latency. and with your test, both programs are very fast.

 Greetings, Bane.


On my 256MB RAM AMD Duron 1200 MHz, Bulat's version is consistently about 
20%
faster than my KMP on your test -- btw, I unboxed the pat array, which gave 
a

bit of extra speed, but not much.


I think that's because on your machine Bulat's version has better
performance with the CPU cache.
I don't know, but now your version is 25% faster with my test on a P4
with hyperthreading.

your new version:
$ time srchrep.exe
Working:seasearch replace  able seaseasearch baker seasearch charlie
True
Done


real0m8.734s
user0m0.015s
sys 0m0.000s

Bulat's version:

[EMAIL PROTECTED] ~/tutorial
$ time replace1.exe
Working:seasearch replace  able seaseasearch baker seasearch charlie
True
Done


real0m11.734s
user0m0.015s
sys 0m0.015s

3 secs difference now.

And apologies to Sebastian Sylvan, I also included an unboxed version of 
bord,

built from the boxed version, and that sped things further up -- not much,
again, but there it is.


On my machine you got another 10-15% of boost with unboxed arrays.

I wonder about this difference, -10% on one system and +20% on another 
system,

ist that normal?


Different caching schemes on CPU's perhaps? different memory latencies?
hyperthreading helps your version? more code and data, perhaps because
of that it pays the price on your machine?

Greetings, Bane.

_
Don't just search. Find. Check out the new MSN Search! 
http://search.msn.click-url.com/go/onm00200636ave/direct/01/


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] trick to easily generate Eq/Ord instances

2005-12-13 Thread Henning Thielemann


On Mon, 12 Dec 2005, Bulat Ziganshin wrote:


Hello

sometimes, Eq/Ord classes can't be derived automatically because we
need to comare only part of fields. in such situations i use the
following trick to easify generation of class instances:

data ArchiveBlock = ArchiveBlock {
   blArchive :: Archive
 , blType:: BlockType
 , blCompressor  :: Compressor
 , blPos :: Integer
 , blOrigSize:: Integer
 , blCompSize:: Integer
 , blCRC :: CRC
 , blFiles   :: Int
   }

instance Eq ArchiveBlock where
 (==)=  map2eq  $ map3 (blArchive,blPos,blCRC)

instance Ord ArchiveBlock where
 compare =  map2cmp $ map2 (blArchive,blPos)

{-
instance Ord ArchiveBlock where
 compare =  map2cmp blPos  -- for comparision on just one field
-}




I solved that problem with two generic functions:

Compare the same item of two records.


compareField :: Ord b => (a -> b) -> a -> a -> Ordering
compareField f x y = compare (f x) (f y)


Lexicographically compare a list of attributes of two records.


compareRecord :: [a -> a -> Ordering] -> a -> a -> Ordering
compareRecord cs x y =
   head (dropWhile (EQ==) (map (\c -> c x y) cs) ++ [EQ])


Use it this way:


instance Ord ArchiveBlock where
   compare =
  compareRecord
 [compareField blArchive,
  compareField blPos,
  compareField blCRC]
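
The same scheme works for Eq; a minimal sketch (equalField and
equalRecord are names I am inventing here by analogy):

equalField :: Eq b => (a -> b) -> a -> a -> Bool
equalField f x y = f x == f y

equalRecord :: [a -> a -> Bool] -> a -> a -> Bool
equalRecord es x y = all (\e -> e x y) es

instance Eq ArchiveBlock where
   (==) = equalRecord
             [equalField blArchive,
              equalField blPos,
              equalField blCRC]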

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Bringing Erlang to Haskell

2005-12-13 Thread Robert Dockins

BTW, there has already been some work in this area.

http://www-i2.informatik.rwth-aachen.de/~stolz/dhs/
http://www.informatik.uni-kiel.de/~fhu/PUBLICATIONS/1999/ifl.html


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re[2]: [Haskell] ANNOUNCE: Process library (for dataflow-oriented programming?)

2005-12-13 Thread Bulat Ziganshin
Hello Joel,

Tuesday, December 13, 2005, 3:04:11 PM, you wrote:
JR How is your library licensed?

it costs many megabucks because it's a very complex proprietary design
where some functions reach a whole 12 lines! :)

of course, you can do what you want with this library. maybe the
better way is to write your own, stealing one or two ideas from mine

you can find other interesting work at:

http://www-i2.informatik.rwth-aachen.de/~stolz/Haskell/CA.hs
http://www-i2.informatik.rwth-aachen.de/Research/distributedHaskell/pbdhs-2001-09-20.tar.gz
(this one seems to be especially interesting for you, providing
ports - i think, in Erlang style)
http://www-i2.informatik.rwth-aachen.de/Research/distributedHaskell/network.tar.gz
http://quux.org/devel/missingh/missingh_0.12.0.tar.gz
(see Logging, Network, Threads directories)

JR How can a process maintain internal state?

a process in my lib is just an ordinary Haskell function, and therefore
this is done as in any other Haskell function :)

my examples are simplified - to avoid bothering with EOF i just organized
in each process a loop which sends and/or receives just 10 messages. in
my real program the data sent between processes are defined by structures
like this:

data Message = FileStarted String
             | FileData String
             | DataEnd

so a typical communication scenario is:

sendP h (FileStart "1")
; sendP h (FileData "abc")
; sendP h (FileData "def")
; sendP h (FileData "ghi")
sendP h (FileStart "2")
; sendP h (FileData "qwer")
sendP h (FileStart "3")
; sendP h (FileData "123")
sendP h DataEnd

and each function realizing a process finishes only when that process is
done. the sender process runs a cycle which reads files and sends their
data to the channel. the receiver process runs a cycle until `DataEnd`
is received
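
such a receiver cycle could look like this (a sketch; recv stands in for
receiveP applied to the pipe handle):

receiver :: IO Message -> IO ()
receiver recv = do
  msg <- recv
  case msg of
    DataEnd        -> return ()                                -- done
    FileStarted nm -> do putStrLn ("file "  ++ nm);  receiver recv
    FileData dat   -> do putStrLn ("  data " ++ dat); receiver recv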

in one phrase, it's just the same organization as in your own program
:)))


JR How would I use your library to code a socket reader/writer that  
JR writes received events to the socket and propagates back anything  
JR that is received?

JR The producer/consumer in front of this network client would be
JR another process that analyzes the events sent back to it and produces  
JR events based on the analysis.

i don't understand your questions

JR How would I use it to launch a few network clients that seat there  
JR and process events until they decided to quit? The whole program  
JR needs to stay up until the last network client has exited.

JR The pipeline to me looks like this:

JR - Bot - Socket client ... Server
JR   /
JR Bot launcher ---   - Bot - Socket client ... Server
JR   \
JR - Bot - Socket client ... Server

JR Where bot launcher starts a predefined # of bots and collects results  
JR sent back by each one.

my lib is not appropriate for your task, because it is oriented toward
simplifying the creation of processes which have only one input. but your
main thread must receive data from all bots, and each bot must receive data
from two sources. the decision depends on the strategy of mixing these
inputs - will it be a fair FIFO or a more advanced scheme?

if it's a FIFO then something like this (i've skipped only exception
processing and creating a socket-reader process inside each bot -
writing to the socket must be performed by the bot itself):

{-# OPTIONS_GHC -cpp #-}
import Control.Concurrent
import Control.Exception
import Control.Monad
import Data.Array
import Data.Char
import Data.Either
import Data.HashTable
import Data.IORef
import Data.List
import Data.Maybe
import Data.Word
import Debug.Trace
import Foreign.C.String
import Foreign.Marshal.Alloc
import Foreign.Marshal.Array
import Foreign.Marshal.Pool
import Foreign.Marshal.Utils
import Foreign.Ptr
import Text.Regex
import System.IO.Unsafe

-
--- Bot launcher implementation -
-

main = do
  (sendToMain, receiveFromBots) <- createChannel
  bots <- foreach [1..10] $ createProcess . bot sendToMain
  mapM_ (`sendToProcess` "Wake up, Neo!") bots
  while receiveFromBots (/= "I want to stop the Matrix!") print
  --
  mapM_ killProcess bots
  -- or, if you are more humane - mapM_ waitProcessDie bots :)


-
--- Bot implementation --
-

bot sendToMain n receiveMessagesForMe = do
  forever $ do
    x <- receiveMessagesForMe
    case x of
      "Wake up, Neo!"         -> sendToMain$ show n++": I'm not sleeping!"
      "Are you wanna coffee?" -> sendToMain$ show n++": Yes, it is!"
    yield
  sendToMain "I want to stop the Matrix!"


-
--- Process implementation details --
-

-- |Abstract type for all of 

[Haskell-cafe] FreeBSD: Max # of sockets opened

2005-12-13 Thread Joel Reymont

Folks,

I need some help from those of you with a FreeBSD box.

It looks like 'ulimit -n' on FreeBSD lets you have 10k+ file  
descriptors open per process. FD_SETSIZE is 1024 in the system  
headers, though. GHC relies on this value (see ghc/rts/Select.c).


Normally, you will get the EMFILE error if you try to open more  
sockets than what is allowed with 'ulimit -n'. If you allow yourself  
more than 1024 descriptors per process then you do not get this error  
but...


This seems to lead to a situation where you open more than 1024  
sockets and shortly afterwards get 'connection resets' for some or  
all of your sockets. Maybe just those above 1024, I have not  
determined this precisely.


My question is this: is it possible to get a higher number of open  
sockets by editing the system header files on FreeBSD and recompiling  
GHC? Has anyone tried this before? How high can you go?


Thanks, Joel

--
http://wagerlabs.com/





___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Top-level TVars

2005-12-13 Thread Joel Reymont

Can this be done now or is this a GHC 6.5 feature?

My combination of unsafePerformIO with atomically $ newTVar does not  
seem to be working.


Thanks, Joel

P.S. What is the ETA for 6.5?

On Mon, Dec 05, 2005 at 10:50:13AM -, Simon Peyton-Jones wrote:

 It turns out to be easy to provide

 newTVarIO :: a -> IO (TVar a)

 which you can call from inside 'unsafePerformIO'.  That means you can
 allocate top-level TVars without fuss.


--
http://wagerlabs.com/





___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] FreeBSD: Max # of sockets opened

2005-12-13 Thread Tony Finch
On Tue, 13 Dec 2005, Joel Reymont wrote:

 It looks like 'ulimit -n' on FreeBSD lets you have 10k+ file descriptors open
 per process. FD_SETSIZE is 1024 in the system headers, though. GHC relies on
 this value (see ghc/rts/Select.c).

FD_SETSIZE is actually dynamic on FreeBSD (at least from the kernel's
point of view - the macros are less so). You can re-set it to whatever
value you like at compile time (e.g. gcc -DFD_SETSIZE=10240).

Tony.
-- 
f.a.n.finch  [EMAIL PROTECTED]  http://dotat.at/
BISCAY: WEST 5 OR 6 BECOMING VARIABLE 3 OR 4. SHOWERS AT FIRST. MODERATE OR
GOOD.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] FreeBSD: Max # of sockets opened

2005-12-13 Thread Joel Reymont

So it's just a matter of recompiling GHC and have it pick up new values?

On Dec 13, 2005, at 6:11 PM, Tony Finch wrote:


FD_SETSIZE is actually dynamic on FreeBSD (at least from the kernel's
point of view - the macros are less so). You can re-set it to whatever
value you like at compile time (e.g. gcc -DFD_SETSIZE=10240).


--
http://wagerlabs.com/





___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] FreeBSD: Max # of sockets opened

2005-12-13 Thread Tony Finch
On Tue, 13 Dec 2005, Joel Reymont wrote:

 So it's just a matter of recompiling GHC and have it pick up new values?

Yes.

(It's a pity that the FD_SET macros aren't run-time configurable.)

Tony.
-- 
f.a.n.finch  [EMAIL PROTECTED]  http://dotat.at/
BISCAY: WEST 5 OR 6 BECOMING VARIABLE 3 OR 4. SHOWERS AT FIRST. MODERATE OR
GOOD.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Top-level TVars

2005-12-13 Thread Tomasz Zielonka
On Tue, Dec 13, 2005 at 06:08:23PM +, Joel Reymont wrote:
 Can this be done now or is this a GHC 6.5 feature?
 
 My combination of unsafePerformIO with atomically $ newTVar does not  
 seem to be working.

Here is an example of how you can initialize a top-level STM variable:
http://www.uncurry.com/repos/TimeVar/TimeVar.hs
It just forks a new thread inside unsafePerformIO, runs atomically
in it, and passes the result back through an ordinary MVar.
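
In code the trick looks roughly like this (globalCounter is just an
example name; TimeVar.hs does the same for its own variable):

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM (TVar, atomically, newTVar)
import System.IO.Unsafe (unsafePerformIO)

globalCounter :: TVar Int
globalCounter = unsafePerformIO $ do
  box <- newEmptyMVar
  -- run 'atomically' in a freshly forked thread and hand the TVar back
  forkIO (atomically (newTVar 0) >>= putMVar box)
  takeMVar box
{-# NOINLINE globalCounter #-}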

Best regards
Tomasz

-- 
I am searching for a programmer who is good at least in some of
[Haskell, ML, C++, Linux, FreeBSD, math] for work in Warsaw, Poland
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Announcing Djinn, new version 2004-12-13

2005-12-13 Thread Lennart Augustsson

There is a new version of Djinn available, with two notable
new features: Haskell data types can be defined and the
found functions are sorted (heuristically) to present the
best one first.

To play with Djinn do a
  darcs get http://darcs.augustsson.net/Darcs/Djinn
or get
  http://darcs.augustsson.net/Darcs/Djinn/Djinn.tar.gz
Then just type make.  (You need a Haskell 98 implementation and
some libraries.)  And then start djinn.

`-- Lennart
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe