I'm allergic to hex constants.  Surely, they are not necessary.

        -- Lennart

On Dec 18, 2006, at 18:14 , Chris Kuklewicz wrote:

Hi all (and Don!),

  I have some rewritten versions of readInt below...

Bulat Ziganshin wrote:
Hello Donald,

Monday, December 18, 2006, 3:51:48 AM, you wrote:
Haskell can't provide fast execution speed unless very low-level
programming style is used (which is much harder to do in Haskell than in C,
see one of my last messages for example) AND jhc compiler is used

I have to dispute this Bulat's characterisation here. We can solve lots
of nice problems and have high performance *right now*. Particularly
concurrency problems, and ones involving streams of bytestrings.
No need to leave the safety of GHC either, nor resort to low level evil
code.

let's go further in this long-term discussion. i've read Shootout problems
and concluded that there are only 2 tasks which speed is dependent on
code-generation abilities of compiler, all other tasks are dependent on speed of used libraries. just for example - in one test TCL was fastest language. why? because this test contained almost nothing but 1000 calls to the regex engine with very large strings and TCL regex engine was fastest

the same applies to the two above-mentioned areas - GHC wins in concurrency
tests just because only built-in libraries considered, and GHC has a
lightweight threads library built-in while C compilers don't

with ByteString library, i know at least one example, where you have added
function - readInt - to the library only to win in Shootout test

This obsession with mutable-variable, imperative code is unhealthy, Bulat ;)

so why your readInt routine is written in imperative way? ;)


ajb:
The PGP format is heavily character stream-based. We know how horrible the performance of character streams are in Haskell. On one hand, this would be an excellent test case. On the other hand, performance would
indeed suck now.

Unless you used a stream of lazy bytestrings!
As Duncan did for his pure gzip and bzip2 bindings:

these are binding to existing C libs. if you try to convince us that to get fast FPS routines one need to write them in C and then provide Haskell
binding, i will 100% agree

otherwise, please explain me why your own function, readInt, don't use
these fascinating stream fusion capabilities? ;)

P.S. The comments on this thread makes me think that the state of the
art high perf programming in Haskell isn't widely known. Bulat-style
imperative Haskell is rarely (ever?) needed in the GHC 6.6 Haskell code
I'm writing in these days. Solving large data problems is feasible
*right now* using bytestrings.

may be it's me who wrote FPS library? :) let's agree that high- performance code can be written in C and then called from Haskell. and when *you* want to have real performance, you write something very far from your optimistic
words:

readInt :: ByteString -> Maybe (Int, ByteString)
readInt as
    | null as   = Nothing
    | otherwise =
        case unsafeHead as of
            '-' -> loop True  0 0 (unsafeTail as)
            '+' -> loop False 0 0 (unsafeTail as)
            _   -> loop False 0 0 as

where loop :: Bool -> Int -> Int -> ByteString -> Maybe (Int, ByteString)
          STRICT4(loop)
          loop neg i n ps
              | null ps   = end neg i n ps
              | otherwise =
                  case B.unsafeHead ps of
                    w | w >= 0x30
                     && w <= 0x39 -> loop neg (i+1)
(n * 10 + (fromIntegral w - 0x30))
                                          (unsafeTail ps)
                      | otherwise -> end neg i n ps

          end _    0 _ _  = Nothing
          end True _ n ps = Just (negate n, ps)
          end _    _ n ps = Just (n, ps)


Trying to write code using an imperative style is likely to do more harm than good, and certainly not something to suggest to beginners on the
mailing list ;)

of course, if you want to fool beginners, you can continue to sing these
songs :D


That code for readInt has no mutable state or IO, and it may be imperative
style, it is not much like c-code.

I liked my readInt1 version below, which actually benchmarks faster than the GHC 6.6 Data.ByteString.Char8.readInt (on the little "test r" below). What a
pleasant surprise.

The readInt2 version below is even more functional, but takes about 1.7 times as long to run "test r". (gaged from running "time ./ReadInt.2" on Mac OS X on a
G4 cpu).  The latest fusion in darcs fps might be better...

module Main(main) where

import Control.Arrow((***))
import Data.Char(chr,ord)
import Data.ByteString(ByteString)
import qualified Data.ByteString as B(null,foldl',span)
import qualified Data.ByteString.Char8 as C (readInt,pack)
import qualified Data.ByteString.Base as B(unsafeHead,unsafeTail)
import Data.Ix(inRange)
import Data.Maybe(fromJust)
import Data.Word(Word8)

default ()

{-# INLINE chr8 #-}
chr8 :: Word8 -> Char
chr8 = chr . fromEnum
{-# INLINE ord8 #-}
ord8 :: Char -> Word8
ord8 = toEnum . ord

{-# INLINE decompose #-}
decompose :: a -> (Word8 -> ByteString -> a) -> ByteString -> a
decompose whenNull withHeadTail bs =
if B.null bs then whenNull else (withHeadTail $! (B.unsafeHead bs)) $! (B.unsafeTail bs)

-- This does not do any bound checking if the Int overflows
readInt1 :: ByteString -> Maybe (Int, ByteString)
readInt1 bs = decompose Nothing (\h t ->
    case h of
      0x2d -> fmap (negate *** id) $ first t -- '-'
      0x2b -> first t                        -- '+'
      _    -> first bs) bs
  where first :: ByteString -> Maybe (Int, ByteString)
        first = decompose Nothing (\h t ->
            if inRange (0x30,0x39) h
              then Just (loop t $! (fromIntegral h-0x30))
              else Nothing)
        loop :: ByteString -> Int -> (Int, ByteString)
        loop bs n = decompose (n,bs) (\h t ->
            if inRange (0x30,0x39) h
              then loop t $! (n*10+(fromIntegral h-0x30))
              else (n,bs)) bs


readInt2 :: ByteString -> Maybe (Int,ByteString)
readInt2 bs =  decompose Nothing (\h t ->
    case h of
      0x2d -> fmap (negate *** id) $ toInt t -- '-'
      0x2b -> toInt t                        -- '+'
      _    -> toInt bs) bs
  where toInt :: ByteString -> Maybe (Int,ByteString)
toInt bs = let (digits,rest) = B.span (inRange (0x30,0x39)) bs in if B.null digits then Nothing else Just (convert digits,rest)
        convert :: ByteString -> Int
        convert = B.foldl' (\n h -> n*10 + (fromIntegral h - 0x30)) 0

test r = let a = take 1000000 $ cycle [C.pack "13247897",C.pack "-13247896"]
         in (sum . map (fst . fromJust . r) $ a
            , take 4 a
            , map r $ take 4 a)

main = print $ test $ (readInt2)

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to