subject:"Re\: \[Haskell\-cafe\] bytestring vs. uvector"

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-14 Thread Duncan Coutts

On Mon, 2009-03-09 at 18:29 -0700, Alexander Dunlap wrote:

> Thanks for all of the responses!
> 
> So let me see if my summary is accurate here:
> 
> - ByteString is for just that: strings of bytes, generally read off of
> a disk. The Char8 version just interprets the Word8s as Chars but
> doesn't do anything special with that.

Right. So it's only suitable for binary or ASCII (or mixed) formats.

> - Data.Text/text library is a higher-level library that deals with
> "text," abstracting over Unicode details and treating each element as
> a potentially-multibye "character."

If you're writing about this on the wiki for people, it's best not to
confuse the issue by talking about multibyte anything. We do not
describe [Char] as a multibyte encoding of Unicode. We say it is a
Unicode string. The abstraction is at the level of Unicode code points.
The String type *is* a sequence of Unicode code points.

This is exactly the same for Data.Text. It is a sequence of Unicode code
points. It is not an encoding. It is not UTF-anything. It does not
abstract over Unicode.

The Text type is just like the String type but with different strictness
and performance characteristics. Both are just sequences of Unicode code
points.

There is a reasonably close correspondence between Unicode code points
and what people normally think of as characters.

> - utf8-string is a wrapper over ByteString that interprets the bytes
> in the bytestring as potentially-multibye unicode "characters."

This on the other hand is an encoding. ByteString is a sequence of bytes
and when we interpret that as UTF-8 then we are looking at an encoding
of a sequence of Unicode code points.

Clear as mud? :-)

Duncan

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-13 Thread Henning Thielemann



On Mon, 9 Mar 2009, Alexander Dunlap wrote:


- uvector, storablevector and vector are all designed for dealing with
arrays. They *can* be used for characters/word8s but are not
specialized for that purpose, do not deal with Unicode at all, and are
probably worse at it. They are better for dealing with things that you
would generally use arrays for.


Since the Storable instance of Char stores 32 bit values, I expect they 
store full Unicode codes and thus StorableVector can be used for Unicode 
strings without loss. They will require more memory than UTF8 strings, but 
whether they are faster or slower will depend on how expensive UTF8 
decoding/encoding is.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-13 Thread Don Stewart

manlio_perillo:
> Bryan O'Sullivan ha scritto:
>> [...]
>> text is not mature, and is based on the same modern fusion framework as 
>> uvector and vector. It uses unpinned arrays, but provides functions for 
>> dealing with foreign code. 
>
> What is the reason why you have decided to use unpinned arrays  
> (ByteArray#) instead of pinned arrays (Foreign.Ptr)?

They prevent heap fragmentation (and in general are faster).

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-13 Thread Manlio Perillo


Bryan O'Sullivan ha scritto:

[...]
text is not mature, and is based on the same modern fusion framework as 
uvector and vector. It uses unpinned arrays, but provides functions for 
dealing with foreign code. 


What is the reason why you have decided to use unpinned arrays 
(ByteArray#) instead of pinned arrays (Foreign.Ptr)?



> [...]



Thanks  Manlio
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-13 Thread Manlio Perillo


Don Stewart ha scritto:

[...]

You also have to add instance for UIO:

instance (RealFloat a, UIO a) => UIO (Complex a) where
hPutU h (UAComplex arr) = hPutU h arr
hGetU h = do arr <- hGetU h
 return (UAComplex arr)


With Storable, this should not be required; you just have to write an  
instance for the Storable class.




Though you get no IO operations with Storable... UIO is entirely
separate.



Yes, but using Storable and Foreign.Ptr, IO is rather simple.
From storablevector package:

-- | Outputs a 'Vector' to the specified 'Handle'.
hPut :: (Storable a) => Handle -> Vector a -> IO ()
hPut h v =
   if null v
 then return ()
 else
   let (fptr, s, l) = toForeignPtr v
   in  withForeignPtr fptr $ \ ptr ->
  let ptrS = advancePtr ptr s
  ptrE = advancePtr ptrS l
  -- use advancePtr and minusPtr in order to respect
  -- alignment
  in  hPutBuf h ptrS (minusPtr ptrE ptrS)

-- | Read a 'Vector' directly from the specified 'Handle'.  This
-- is far more efficient than reading the characters into a list
-- and then using 'pack'.
--
hGet :: (Storable a) => Handle -> Int -> IO (Vector a)
hGet _ 0 = return empty
hGet h i =
   createAndTrim i $ \p ->
  let elemType :: Ptr a -> a
  elemType _ = undefined
  sizeOfElem = sizeOf (elemType p)
  in  fmap (flip div sizeOfElem) $
  hGetBuf h p (i * sizeOfElem)


Regards  Manlio
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-13 Thread Don Stewart

manlio_perillo:
> Don Stewart ha scritto:
>> bulat.ziganshin:
>>> Hello Don,
>>>
>>> Wednesday, March 11, 2009, 12:12:07 AM, you wrote:
>>>
 Right, so my point stands: there's no difference now. If you can write a
 Storable instance, you can write a UA et al instance.
>>> yes, if there is some class provided for this and not just hard-coded
>>> 4 or so base types 
>>
>> That's right. For example (supporting even pairs):
>>
>> instance (RealFloat a, UA a) => UA (Complex a) where
>>
>>   newtype UArr  (Complex a)   = UAComplex  (UArr (a :*: a))
>>   newtype MUArr (Complex a) s = MUAComplex (MUArr (a :*: a) s)
>>
>
> You also have to add instance for UIO:
>
> instance (RealFloat a, UIO a) => UIO (Complex a) where
> hPutU h (UAComplex arr) = hPutU h arr
> hGetU h = do arr <- hGetU h
>  return (UAComplex arr)
>
>
> With Storable, this should not be required; you just have to write an  
> instance for the Storable class.
>



Though you get no IO operations with Storable... UIO is entirely
separate.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-13 Thread Manlio Perillo


Manlio Perillo ha scritto:

[...]
uvector package also suppors Complex and Rational, however the support 
for these type is "hard written", using a UAProd class, and requires 
some boiler plate code (IMHO).




Correction: UAProd is not a class, sorry.
It is the UA constructor overloaded for a:*:b, and Complex and Rational 
just reuse this specialization.



> [...]


Manlio
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-13 Thread Manlio Perillo


Don Stewart ha scritto:

bulat.ziganshin:

Hello Don,

Wednesday, March 11, 2009, 12:12:07 AM, you wrote:


Right, so my point stands: there's no difference now. If you can write a
Storable instance, you can write a UA et al instance.

yes, if there is some class provided for this and not just hard-coded
4 or so base types 


That's right. For example (supporting even pairs):

instance (RealFloat a, UA a) => UA (Complex a) where

  newtype UArr  (Complex a)   = UAComplex  (UArr (a :*: a))
  newtype MUArr (Complex a) s = MUAComplex (MUArr (a :*: a) s)



You also have to add instance for UIO:

instance (RealFloat a, UIO a) => UIO (Complex a) where
hPutU h (UAComplex arr) = hPutU h arr
hGetU h = do arr <- hGetU h
 return (UAComplex arr)


With Storable, this should not be required; you just have to write an 
instance for the Storable class.




Regards  Manlio
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-13 Thread Manlio Perillo


Don Stewart ha scritto:

[...]

I think uvector only works with certain types that can be
unboxed, while storablevector works with all types that
instantiate Foreign.Storable.Storable.  I don't know about
vector.  From the description of vector, I have the




One of the nice feature of uvector is the support for UArr (a :*: b).

An UArr (a :*: b) can be easily (with fstU and sndU) transformed in
UArr a and UArr b.

uvector package also suppors Complex and Rational, however the support 
for these type is "hard written", using a UAProd class, and requires 
some boiler plate code (IMHO).


I find StorableVector implementation much more simple; I would like to 
see it in the Haskell Platform.


As for Data.Parallel, uvector and vector, it seems there is some code 
duplication.


Both Data.Parallel and uvector, make us of a strict pair type.
Such a type is also implemented in the strict package [1].

The authors are the same, so I don't understand the reason of code 
replication.



There is also replication in the definition of the Stream data type.



[1] there seems to be an error in the documentation:
http://hackage.haskell.org/packages/archive/strict/0.3.2/doc/html/Data-Strict-Tuple.html

In the description, there is:
"Same as regular Haskell pairs, but (x :*: _|_) = (_|_ :*: y) = _|_"

but in the synopsis, the data constructor is :!:, not :*:.



Regards   Manlio Perillo
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-10 Thread Xiao-Yong Jin

Hi, thanks for the hint.  I'll see what I can do with it.

Xiao-Yong

Bulat Ziganshin  writes:

> Hello Xiao-Yong,
>
> Wednesday, March 11, 2009, 12:28:45 AM, you wrote:
>
>> It goes beyond my current knowledge, now.  How do you define
>> a custom data type as an instance of UA or Storable?
>
> just look at existing instances. basically, for complex data type, you
> just use instances for its basic types, plus you need to calculate
> offset of second and following fields (using sizeOf in Storable class)

-- 
c/*__o/*
<\ * (__
*/\  <
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-10 Thread Don Stewart

bulat.ziganshin:
> Hello Don,
> 
> Wednesday, March 11, 2009, 12:48:35 AM, you wrote:
> 
> >> unfortunately, Array library unboxed arrays still aren't based on any
> >> Unboxable *class*
> 
> > Hmm. Aren't all the array library types based on MArray and IArray?
> 
> > So I can define my own say, new STUArray element type by writing an 
> > instance of
> > MArray for it.  Like so:
> 
> yes, you can, just this definition duplicates too much code

Ah yes. Then in that case that is indeed true. Array / UArray aren't as
extensible as the newer systems modelled on Storable (like UA in
vector/uvector).
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-10 Thread Don Stewart

bulat.ziganshin:
> Hello Don,
> 
> Wednesday, March 11, 2009, 12:12:07 AM, you wrote:
> 
> > Right, so my point stands: there's no difference now. If you can write a
> > Storable instance, you can write a UA et al instance.
> 
> yes, if there is some class provided for this and not just hard-coded
> 4 or so base types 

That's right. For example (supporting even pairs):

instance (RealFloat a, UA a) => UA (Complex a) where

  newtype UArr  (Complex a)   = UAComplex  (UArr (a :*: a))
  newtype MUArr (Complex a) s = MUAComplex (MUArr (a :*: a) s)

> > And GHC 6.6 was released what, 11 October 2006? So this has been the
> > case for a long time.
> 
> unfortunately, Array library unboxed arrays still aren't based on any
> Unboxable *class*


Hmm. Aren't all the array library types based on MArray and IArray?

So I can define my own say, new STUArray element type by writing an instance of
MArray for it.  Like so:


{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
{-# LANGUAGE CPP   #-}

-- get at low level representation stuff

import Data.Array.Base
import GHC.IOBase
import GHC.ST   ( ST(..), runST )
import GHC.Prim
import GHC.Base
import GHC.Word
import GHC.Ptr
import GHC.Float
import GHC.Stable
import GHC.Int
import GHC.Word
import Data.Array.Unboxed

-- helpers

import Data.Bits
import Text.Printf
import System.Environment
import Control.Monad

-- portable to 32 bit or 64 bit

#include 

-- define a new data type we wish to store in unboxed arrays

data Boolean = IsTrue | IsFalse
deriving (Eq, Ord, Enum, Show, Bounded)

-- write a program using an unboxed array of these things

main = do
n <- getArgs >>= readIO . head :: IO Int
mapM_ (\i -> sieve (1 `shiftL` (n-i))) [0, 1, 2]

-- Nsieve with bit packing of a custom MyBool type.
sieve n = do
   let r = runST (do a <- newArray (2,n) IsTrue :: ST s (STUArray s Int 
Boolean)
 go a n 2 0)
   printf "Primes up to %8d %8d\n" (n::Int) (r::Int) :: IO ()

go !a !}m !n !c
| n == m= return c
| otherwise = do
  e <- unsafeRead a n
  if e == IsTrue
   then let loop j
  | j < m = do
  x <- unsafeRead a j
  when (x == IsTrue) $ unsafeWrite a j IsFalse
  loop (j+n)

  | otherwise = go a m (n+1) (c+1)
in loop (n `shiftL` 1)
   else go a m (n+1) c

--
-- Create a new unboxed representation for MyBool
-- We choose to use bit packing, storing them in a W#
--

instance MArray (STUArray s) Boolean (ST s) where

getBounds (STUArray l u _ _)  = return (l,u)
getNumElements (STUArray _ _ n _) = return n

newArray (l,u) initialValue = ST $ \s1# ->
case safeRangeSize (l,u)of { n@(I# n#) ->
case newByteArray# (bOOL_SCALE n#) s1# of { (# s2#, marr# #) ->
case bOOL_WORD_SCALE n# of { n'# ->
let loop i# s3# | i# ==# n'# = s3#
| otherwise  =
case writeWordArray# marr# i# e# s3# of { s4# ->
loop (i# +# 1#) s4# } in
case loop 0# s2#of { s3# ->
(# s3#, STUArray l u n marr# #) 
  where
W# e# = if initialValue == IsTrue then maxBound else 0

unsafeNewArray_ (l,u) = unsafeNewArraySTUArray_ (l,u) bOOL_SCALE

newArray_ arrBounds = newArray arrBounds IsFalse

unsafeRead (STUArray _ _ _ marr#) (I# i#) = ST $ \s1# ->
case readWordArray# marr# (bOOL_INDEX i#) s1# of { (# s2#, e# #) ->
(# s2#, case (e# `and#` bOOL_BIT i#) `neWord#` int2Word# 0# of True 
-> IsTrue ; _ -> IsFalse #) }

unsafeWrite (STUArray _ _ _ marr#) (I# i#) e = ST $ \s1# ->
case bOOL_INDEX i#  of { j# ->
case readWordArray# marr# j# s1# of { (# s2#, old# #) ->
case if e == IsTrue then old# `or#` bOOL_BIT i#
 else old# `and#` bOOL_NOT_BIT i# of { e# ->
case writeWordArray# marr# j# e# s2# of { s3# ->
(# s3#, () #) 

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-10 Thread Xiao-Yong Jin

Don Stewart  writes:

>> instance UA UserDefinedDataType
>> 
>> I'm not sure how to do that.  Can you give me some
>> clarification?
>
> Yes, you can do that. This is the case for most of the new array
> libraries.

It goes beyond my current knowledge, now.  How do you define
a custom data type as an instance of UA or Storable?

Xiao-Yong
-- 
c/*__o/*
<\ * (__
*/\  <
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-10 Thread Don Stewart

xj2106:
> Don Stewart  writes:
> 
> > And what is Storable limited to?
> >
> > Ultimately they're all limited to the primops for reading and writing,
> > and to what types we can encode in those. So:
> >
> > primop ReadOffAddrOp_Char "readCharOffAddr#" GenPrimOp
> > ...
> > {-
> > instance Storable Double
> > instance Storable Bool
> > instance Storable Char
> > instance Storable Int
> > instance Storable Float
> > ...
> > -}
> >
> > {-
> >
> > instance UA ()
> > instance (UA a, UA b) => UA (a :*: b)
> > instance UA Bool
> > instance UA Char
> > instance UA Int
> > instance UA Float
> > instance UA Double
> > ...
> > -}
> >
> > So what's a type that's Storable, but not writable in UA (or UArray or ..)
> 
> So it's me who understand it wrong.  If I want some high
> performance array with elements of custom data type, I'm
> stuck with Array, anyway?
> 
> Is it possible to make
> 
> instance UA UserDefinedDataType
> 
> I'm not sure how to do that.  Can you give me some
> clarification?

Yes, you can do that. This is the case for most of the new array
libraries.

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-10 Thread Don Stewart

bulat.ziganshin:
> Hello Don,
> 
> Tuesday, March 10, 2009, 11:01:31 PM, you wrote:
> 
> >> if uavector use ghc's built-in unboxed array operations (as
> >> Data.Array.Unboxed does) then it's necessarily bounded to types
> >> supported by those operations
> 
> > And what is Storable limited to?
> 
> > Ultimately they're all limited to the primops for reading and writing,
> 
> the full story:
> 
> ghc up to 6.6 has slow access to ForeignArrays, as you may recall
> 
> therefore, those primitives was added. ByteArra# plus those primitives
> was the only way to have unboxed arrays with fast access
> 
> starting with 6.6, ForeignArray access is no-op, so we can just use
> obvious Ptr operations (via Storable class) to get unboxed arrays fast
> access. so, no more need for those special ByteArray# access operations
> 
> but Array library still old, so any effort based on its spources, got
> the same restrictions
> 
> also, ByteArray# may be unpinned, but afaik, this isn't really
> important - it can be coerced to Ptr for the period of one operation


Right, so my point stands: there's no difference now. If you can write a
Storable instance, you can write a UA et al instance.

And GHC 6.6 was released what, 11 October 2006? So this has been the
case for a long time.

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-10 Thread Xiao-Yong Jin

Don Stewart  writes:

> And what is Storable limited to?
>
> Ultimately they're all limited to the primops for reading and writing,
> and to what types we can encode in those. So:
>
> primop ReadOffAddrOp_Char "readCharOffAddr#" GenPrimOp
> ...
> {-
> instance Storable Double
> instance Storable Bool
> instance Storable Char
> instance Storable Int
> instance Storable Float
> ...
> -}
>
> {-
>
> instance UA ()
> instance (UA a, UA b) => UA (a :*: b)
> instance UA Bool
> instance UA Char
> instance UA Int
> instance UA Float
> instance UA Double
> ...
> -}
>
> So what's a type that's Storable, but not writable in UA (or UArray or ..)

So it's me who understand it wrong.  If I want some high
performance array with elements of custom data type, I'm
stuck with Array, anyway?

Is it possible to make

instance UA UserDefinedDataType

I'm not sure how to do that.  Can you give me some
clarification?


Thanks,
Xiao-Yong
-- 
c/*__o/*
<\ * (__
*/\  <
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-10 Thread Don Stewart

bulat.ziganshin:
> Hello Don,
> 
> Tuesday, March 10, 2009, 10:40:30 PM, you wrote:
> 
> >> I think uvector only works with certain types that can be
> >> unboxed, while storablevector works with all types that
> >> instantiate Foreign.Storable.Storable.  I don't know about
> >> vector.  From the description of vector, I have the
> 
> > That's interesting. I'd expect Storable and UA to have the same set of
> > inhabitants. Is there any difference?
> 
> if uavector use ghc's built-in unboxed array operations (as
> Data.Array.Unboxed does) then it's necessarily bounded to types
> supported by those operations

And what is Storable limited to?

Ultimately they're all limited to the primops for reading and writing,
and to what types we can encode in those. So:

primop ReadOffAddrOp_Char "readCharOffAddr#" GenPrimOp
primop ReadOffAddrOp_WideChar "readWideCharOffAddr#" GenPrimOp

primop ReadOffAddrOp_Int "readIntOffAddr#" GenPrimOp
primop ReadOffAddrOp_Word "readWordOffAddr#" GenPrimOp

primop ReadOffAddrOp_Addr "readAddrOffAddr#" GenPrimOp

primop ReadOffAddrOp_Float "readFloatOffAddr#" GenPrimOp
primop ReadOffAddrOp_Double "readDoubleOffAddr#" GenPrimOp

primop ReadOffAddrOp_StablePtr "readStablePtrOffAddr#" GenPrimOp

primop ReadOffAddrOp_Int8 "readInt8OffAddr#" GenPrimOp
primop ReadOffAddrOp_Int16 "readInt16OffAddr#" GenPrimOp
primop ReadOffAddrOp_Int32 "readInt32OffAddr#" GenPrimOp
primop ReadOffAddrOp_Int64 "readInt64OffAddr#" GenPrimOp

primop ReadOffAddrOp_Word8 "readWord8OffAddr#" GenPrimOp
primop ReadOffAddrOp_Word16 "readWord16OffAddr#" GenPrimOp
primop ReadOffAddrOp_Word32 "readWord32OffAddr#" GenPrimOp
primop ReadOffAddrOp_Word64 "readWord64OffAddr#" GenPrimOp

{-
instance Storable Double
instance Storable Bool
instance Storable Char
instance Storable Int
instance Storable Float
...
-}

{-

instance UA ()
instance (UA a, UA b) => UA (a :*: b)
instance UA Bool
instance UA Char
instance UA Int
instance UA Float
instance UA Double
...
-}

So what's a type that's Storable, but not writable in UA (or UArray or ..)

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-10 Thread Don Stewart

xj2106:
> Alexander Dunlap  writes:
> 
> > - uvector, storablevector and vector are all designed for dealing with
> > arrays. They *can* be used for characters/word8s but are not
> > specialized for that purpose, do not deal with Unicode at all, and are
> > probably worse at it. They are better for dealing with things that you
> > would generally use arrays for.
> 
> I think uvector only works with certain types that can be
> unboxed, while storablevector works with all types that
> instantiate Foreign.Storable.Storable.  I don't know about
> vector.  From the description of vector, I have the

That's interesting. I'd expect Storable and UA to have the same set of
inhabitants. Is there any difference?

> impression that it is quite unstable.  How is it compared to
> uvector and storablevector?  I need one of those to work
> with my code to possibly improve the efficiency.  But I
> can't use uvector, because I can't use unboxed types.

Hmm? If you can write a Storable instance, you can write a UA instance.

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-09 Thread Xiao-Yong Jin

Alexander Dunlap  writes:

> - uvector, storablevector and vector are all designed for dealing with
> arrays. They *can* be used for characters/word8s but are not
> specialized for that purpose, do not deal with Unicode at all, and are
> probably worse at it. They are better for dealing with things that you
> would generally use arrays for.

I think uvector only works with certain types that can be
unboxed, while storablevector works with all types that
instantiate Foreign.Storable.Storable.  I don't know about
vector.  From the description of vector, I have the
impression that it is quite unstable.  How is it compared to
uvector and storablevector?  I need one of those to work
with my code to possibly improve the efficiency.  But I
can't use uvector, because I can't use unboxed types.
Should I use storablevector?  Or vector?

Thanks,
Xiao-Yong
-- 
c/*__o/*
<\ * (__
*/\  <
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-09 Thread Alexander Dunlap

On Mon, Mar 9, 2009 at 3:12 AM, Henning Thielemann
 wrote:
>
> On Mon, 9 Mar 2009, Claus Reinke wrote:
>
>> Given the close relationship between uvector and vector, it would
>> be very helpful if both package descriptions on hackage could point to a
>> common haskell wiki page, starting out with the text
>> and link above, plus a link to the stream fusion paper (I hadn't been
>> aware that vector incorporates the recycling work, and had often wondered
>> about the precise relationship between those
>> two packages). Apart from saving others from similar confusion,
>> that would also provide a place to record experience with those two
>> alternatives.
>
> I have at least started a page which mentions the existing alternatives:
>   http://www.haskell.org/haskellwiki/Storable_Vector
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>

Thanks for all of the responses!

So let me see if my summary is accurate here:

- ByteString is for just that: strings of bytes, generally read off of
a disk. The Char8 version just interprets the Word8s as Chars but
doesn't do anything special with that.
- Data.Text/text library is a higher-level library that deals with
"text," abstracting over Unicode details and treating each element as
a potentially-multibye "character."
- utf8-string is a wrapper over ByteString that interprets the bytes
in the bytestring as potentially-multibye unicode "characters."

- uvector, storablevector and vector are all designed for dealing with
arrays. They *can* be used for characters/word8s but are not
specialized for that purpose, do not deal with Unicode at all, and are
probably worse at it. They are better for dealing with things that you
would generally use arrays for.

If that seems accurate, I'll put it on the wiki.

Alex
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-09 Thread Henning Thielemann



On Mon, 9 Mar 2009, Claus Reinke wrote:


Given the close relationship between uvector and vector, it would
be very helpful if both package descriptions on hackage could point to a 
common haskell wiki page, starting out with the text
and link above, plus a link to the stream fusion paper (I hadn't been aware 
that vector incorporates the recycling work, and had often wondered about the 
precise relationship between those

two packages). Apart from saving others from similar confusion,
that would also provide a place to record experience with those two 
alternatives.


I have at least started a page which mentions the existing alternatives:
   http://www.haskell.org/haskellwiki/Storable_Vector
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-09 Thread Henning Thielemann



On Sat, 7 Mar 2009, Bryan O'Sullivan wrote:


On Sat, Mar 7, 2009 at 10:23 PM, Alexander Dunlap  
wrote:

  Hi all,

  For a while now, we have had Data.ByteString[.Lazy][.Char8] for our
  fast strings. Now we also have Data.Text, which does the same for
  Unicode. These seem to be the standard for dealing with lists of bytes
  and characters.

  Now we also have the storablevector, uvector, and vector packages.
  These seem to be also useful for unpacked data, *including* Char and
  Word8 values.

  What is the difference between bytestring and these new "fast array"
  libraries? Are the latter just generalizations of the former?

storablevector is not mature (I'm not even sure if it's actually used) and is a 
derivative of
an old version of the bytestring library, and so has similar characteristics 
for interacting
with foreign code. It contains some old fusion code that is sketchy in nature 
and somewhat
likely to be broken. I'm not sure I would recommend using this library.


As maintainer of storablevector I can tell that I use it for realtime 
audio signal processing. Indeed, I expected more of the fusion mechanism 
than it can do. It is hard to get correct fusion on lazy storablevectors 
at all, because of non-matching chunk sizes. Thus I didn't follow that 
path anymore. For audio signal processing I use a list type, like that of 
the Streams approach. When you convert those streams into storable vectors 
you get efficient inner loops without any fusion. It would certainly be 
worth to fuse with those Stream lists, but I haven't tried that so far.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-08 Thread Roman Leshchinskiy


On 09/03/2009, at 11:47, Claus Reinke wrote:


Btw, have any of the Haskell array optimization researchers
considered fixpoints yet?


This, for instance, is a very nice paper:

http://www.pllab.riec.tohoku.ac.jp/~ohori/research/OhoriSasanoPOPL07.pdf

However, in the context of high-performance array programming explicit  
recursion is bad because it is very hard if not impossible to  
parallelise automatically except in fairly trivial cases. And if your  
array program is not parallelisable then you don't really care about  
performance all that much :-)


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-08 Thread Claus Reinke


uvector is, if my memory serves me correctly, a fork of the vector library.
It uses modern stream fusion, but is under active development and is a
little scary. I'm a little unclear on the exact difference between uvector
and vector. Both use arrays that are not pinned, so they can't be readily
used with foreign code. If you want to use either library, understand that
you're embarking on a bracing adventure.


vector and uvector are roughly based on the same technology; uvector
is - as far as I remember - a fork of some of the old DPH code which
uses stream fusion which Don cleaned up and worked on (and it's proven
pretty useful, and people are still hacking on it.)

vector however, has the notion of 'recycling arrays' when it does
array operations. The technique is in fact quite similar to stream
fusion. Roman L. built this from scratch I think, so it's quite a bit
more unused and less stable than even uvector is maybe, but I suppose
you could say it's kind of a superset of uvector. Hopefully though
it should mature a little, and the plans are to have the technology
from both of these folded into the Data Parallel Haskell project so we
get fast array operations+automatic parallelisation.

For info, see Roman's paper, 'Recycle your arrays!'

http://www.cse.unsw.edu.au/~rl/publications/recycling.html


Given the close relationship between uvector and vector, it would
be very helpful if both package descriptions on hackage could 
point to a common haskell wiki page, starting out with the text
and link above, plus a link to the stream fusion paper (I hadn't 
been aware that vector incorporates the recycling work, and 
had often wondered about the precise relationship between those

two packages). Apart from saving others from similar confusion,
that would also provide a place to record experience with those two 
alternatives.

Btw, have any of the Haskell array optimization researchers
considered fixpoints yet? Both fusion and recycling are based 
on rewrite rules of the kind "in . out --> id". Now, given a loop 
like this:


   loop a = if c a then loop (out (action (in a))) else a
   loop a

these rules don't apply. Unrolling the loop a fixed number of
times would enable some rule applications, but still some would
remain in the loop body. But with a little rewriting

   loop a = if c a then loop (out (action (in a))) else out (id (in a))
   loop a

   loop a = if c a then loop (out (action (in a))) else out (id (in a))
   (if c a then loop (out (action (in a))) else out (id (in a)))

we can now push the out into the next iteration of the loop or,
if there is no next iteration, into the loop epilogue

   loop a = if c (out a) then loop (action (in (out a))) else id (in (out a))
   out (if c a then loop (action (in a)) else a)

making the rewrite rule applicable

   loop a = if c (out a) then loop (action a) else id a
   out (if c a then loop (action (in a)) else a)

leading (modulo bugs, omissions, and oversights;-) to a fused/
recycled loop body, with potentially substantial benefit.

Claus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-07 Thread Don Stewart

bos:
> On Sat, Mar 7, 2009 at 10:23 PM, Alexander Dunlap 
> wrote:
> 
> Hi all,
> 
> For a while now, we have had Data.ByteString[.Lazy][.Char8] for our
> fast strings. Now we also have Data.Text, which does the same for
> Unicode. These seem to be the standard for dealing with lists of bytes
> and characters.
> 
> Now we also have the storablevector, uvector, and vector packages.
> These seem to be also useful for unpacked data, *including* Char and
> Word8 values.
> 
> What is the difference between bytestring and these new "fast array"
> libraries? Are the latter just generalizations of the former?
> 
> 
> There are quite a few overlaps and differences among them.
> 
> bytestring is mature and useful for low-level byte buffer manipulations, and
> also for efficient I/O. This is in part because it uses pinned pointers that
> can interoperate easily with foreign code. It used to have an early fusion
> rewriting framework, but that was abandoned. So it will not fuse multiple
> ByteString traversals into single loops. This library is widely used, and also
> somewhat abused for text I/O.
> 
> storablevector is not mature (I'm not even sure if it's actually used) and is 
> a
> derivative of an old version of the bytestring library, and so has similar
> characteristics for interacting with foreign code. It contains some old fusion
> code that is sketchy in nature and somewhat likely to be broken. I'm not sure 
> I
> would recommend using this library.
> 
> uvector is, if my memory serves me correctly, a fork of the vector library. It
> uses modern stream fusion, but is under active development and is a little
> scary. I'm a little unclear on the exact difference between uvector and 
> vector.
> Both use arrays that are not pinned, so they can't be readily used with 
> foreign
> code. If you want to use either library, understand that you're embarking on a
> bracing adventure.
> 
> text is not mature, and is based on the same modern fusion framework as 
> uvector
> and vector. It uses unpinned arrays, but provides functions for dealing with
> foreign code. It uses a denser encoding than uvector for text, and provides
> text-oriented functions like splitting on word and line boundaries. Although
> it's intended for use with Unicode text, it does not yet provide proper
> Unicode-aware functions for things like case conversion. It interacts with
> bytestring to perform conversion to and from standard representations like
> UTF-8, and (via the text-icu package) ICU for others (SJIS, KOI-8, etc). If 
> you
> want to use this library, understand that you're embarking on a bracing
> adventure.

I endorse this message.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-07 Thread Austin Seipp

Excerpts from Bryan O'Sullivan's message of Sun Mar 08 00:45:03 -0600 2009:
> uvector is, if my memory serves me correctly, a fork of the vector library.
> It uses modern stream fusion, but is under active development and is a
> little scary. I'm a little unclear on the exact difference between uvector
> and vector. Both use arrays that are not pinned, so they can't be readily
> used with foreign code. If you want to use either library, understand that
> you're embarking on a bracing adventure.

vector and uvector are roughly based on the same technology; uvector
is - as far as I remember - a fork of some of the old DPH code which
uses stream fusion which Don cleaned up and worked on (and it's proven
pretty useful, and people are still hacking on it.)

vector however, has the notion of 'recycling arrays' when it does
array operations. The technique is in fact quite similar to stream
fusion. Roman L. built this from scratch I think, so it's quite a bit
more unused and less stable than even uvector is maybe, but I suppose
you could say it's kind of a superset of uvector. Hopefully though
it should mature a little, and the plans are to have the technology
from both of these folded into the Data Parallel Haskell project so we
get fast array operations+automatic parallelisation.

For info, see Roman's paper, 'Recycle your arrays!'

http://www.cse.unsw.edu.au/~rl/publications/recycling.html

Austin
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-07 Thread Austin Seipp

Excerpts from Alexander Dunlap's message of Sun Mar 08 00:23:01 -0600 2009:
> For a while now, we have had Data.ByteString[.Lazy][.Char8] for our
> fast strings. Now we also have Data.Text, which does the same for
> Unicode. These seem to be the standard for dealing with lists of bytes
> and characters.
> 
> Now we also have the storablevector, uvector, and vector packages.
> These seem to be also useful for unpacked data, *including* Char and
> Word8 values.
> 
> What is the difference between bytestring and these new "fast array"
> libraries? Are the latter just generalizations of the former?
> 
> Thanks for any insight anyone can give on this.
> 
> Alex


Data.Text provides functions for unicode over bytestrings, with several
encoding/decoding methods. So, I think that bytestring+text now solves
the general problem with the slow String type - we get various
international encodings, and fast, efficient packed strings.

(It's also worth mentioning utf8-string, which gives you utf8 over
bytestrings. text gives you more encodings and is probably still quite
efficient, however.)

But this is pretty much a separate effort to that of packages like
uvector/vector etc. etc.. To clarify, uvector and vector are likely to
be merged in the future I think - vector is based on the idea of
'recycling arrays' so that array operations are still very efficient,
while uvector only has the tested stream fusion technique behind it.

Actually, I think the inevitable plan is to merge the technology
behind both vector and uvector into the Data Parallel Haskell
project. Array recylcing and stream fusion goes into creating
extremely efficient sequential code, while the vectorisation pass
turns that into efficient multicore code at the same time.

In any case, I suppose that hypothetically if someone wanted to use a
package like uvector to create an efficient string type, they could,
but if they want that, why not just use bytestring? It's already
optimized, battle tested and in extremely wide use.

I think some library proliferation is good; in this case, the
libraries mentioned here are really for some different purposes, and
that's great, because they all lead to some nice, fast code with low
conceptual overhead when put together (hopefully...) But I'm not even
going to begin examining/comparing the different array interfaces or
anything, because that's been done many times here, so you best check
the archives if you want the 'in-depth' on the matter.

Austin
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

2009-03-07 Thread Bryan O'Sullivan

On Sat, Mar 7, 2009 at 10:23 PM, Alexander Dunlap <
alexander.dun...@gmail.com> wrote:

> Hi all,
>
> For a while now, we have had Data.ByteString[.Lazy][.Char8] for our
> fast strings. Now we also have Data.Text, which does the same for
> Unicode. These seem to be the standard for dealing with lists of bytes
> and characters.
>
> Now we also have the storablevector, uvector, and vector packages.
> These seem to be also useful for unpacked data, *including* Char and
> Word8 values.
>
> What is the difference between bytestring and these new "fast array"
> libraries? Are the latter just generalizations of the former?

There are quite a few overlaps and differences among them.

bytestring is mature and useful for low-level byte buffer manipulations, and
also for efficient I/O. This is in part because it uses pinned pointers that
can interoperate easily with foreign code. It used to have an early fusion
rewriting framework, but that was abandoned. So it will not fuse multiple
ByteString traversals into single loops. This library is widely used, and
also somewhat abused for text I/O.

storablevector is not mature (I'm not even sure if it's actually used) and
is a derivative of an old version of the bytestring library, and so has
similar characteristics for interacting with foreign code. It contains some
old fusion code that is sketchy in nature and somewhat likely to be broken.
I'm not sure I would recommend using this library.

uvector is, if my memory serves me correctly, a fork of the vector library.
It uses modern stream fusion, but is under active development and is a
little scary. I'm a little unclear on the exact difference between uvector
and vector. Both use arrays that are not pinned, so they can't be readily
used with foreign code. If you want to use either library, understand that
you're embarking on a bracing adventure.

text is not mature, and is based on the same modern fusion framework as
uvector and vector. It uses unpinned arrays, but provides functions for
dealing with foreign code. It uses a denser encoding than uvector for text,
and provides text-oriented functions like splitting on word and line
boundaries. Although it's intended for use with Unicode text, it does not
yet provide proper Unicode-aware functions for things like case conversion.
It interacts with bytestring to perform conversion to and from standard
representations like UTF-8, and (via the text-icu package) ICU for others
(SJIS, KOI-8, etc). If you want to use this library, understand that you're
embarking on a bracing adventure.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

Re: [Haskell-cafe] bytestring vs. uvector

28 matches

Site Navigation

Mail list logo

Footer information