Re: Staying alive

1999-10-06 Thread Sven Panne

Michael Weber wrote:
 [...] otherwise there would be a `marshalList' and `marshalListN' with
 same type signature, which I also regarded as confusing.

That was a problem with your approach, not mine...   :-)
 
 [...] But how else can you marshall an arbitrary length list? For
 Int lists, there is no zeroElem... [...]

A usual approach for this in many C-based APIs is that two arguments
are passed: a pointer and an integral of some kind, stating the number
of elements pointed to, see e.g. conformant arrays and the size_of
attribute in DCE-IDL (can't remember how MIDL calls this).

Cheers,
   Sven
-- 
Sven PanneTel.: +49/89/2178-2235
LMU, Institut fuer Informatik FAX : +49/89/2178-2211
LFE Programmier- und Modellierungssprachen  Oettingenstr. 67
mailto:[EMAIL PROTECTED]D-80538 Muenchen
http://www.informatik.uni-muenchen.de/~Sven.Panne



Re: Staying alive

1999-10-06 Thread Michael Weber

On Wed, Oct 06, 1999 at 12:04:03 +0200, Sven Panne wrote:
 Michael Weber wrote:
marshalListN :: Int - [a] - IO Addr
 
 I don't see a reason for the duplication of length information, e.g.
 what should `marshalListN 1000 [1,2,3]' mean? And
 `marshalListN 2 [1,2,3]' is equivalent to `marshalList (take 2 [1,2,3])'.

Yes... I thought about this, but otherwise there would be a `marshalList'
and `marshalListN' with same type signature, which I also regarded as
confusing.

 where `marshalList' stores the list length at the beginning of the buffer,
 like: [ marshalList code deleted ]

 This seems to be a rather special case, e.g. why should the length be
 marshaled as an Int? Some ancient Pascals on PCs used a byte length prefix
 for strings. And what about alignment? There could/must be some padding if
 the alignment constraints for the list element type are stronger than the
 ones for Int.

Yes, I know. I cursed on this stupid Pascal string limit in school :-) But
how else can you marshall an arbitrary length list? For Int lists, there is
no zeroElem...

BTW: This is what is done in XDR (eXternal Data Representation standard, see
RFC1832). The length is at the beginning, coded as big-endian 32Bit unsigned
int (we could use `writeInt32OffAddr . ccall_to_htonl' for this...)

 It seems that both spellings are possible and both "feel right", depending
 on the speaker and the dictionary.  :-)

That's what I expected... :-)


Cheers,
Michael



Re: Staying alive

1999-10-06 Thread Sven Panne

Michael Weber wrote:
 Maybe it's worth to know how many objects a list has (without explicit
 zeroElem), so I'd suggest to extend the class to:
 
marshalList   ::[a] - IO Addr

We agree here.

marshalListN  :: Int - [a] - IO Addr

I don't see a reason for the duplication of length information, e.g.
what should `marshalListN 1000 [1,2,3]' mean? And
`marshalListN 2 [1,2,3]' is equivalent to `marshalList (take 2 [1,2,3])'.

unmarshalList ::Addr - IO [a]

How many elements are there starting at the given Addr? Hmmm

unmarshalListN:: Int - Addr - IO [a]

This is my definition of unmarshalListN.

 where `marshalList' stores the list length at the beginning of the
 buffer, like:
 [ marshalList code deleted ]

This seems to be a rather special case, e.g. why should the length
be marshaled as an Int? Some ancient Pascals on PCs used a byte length
prefix for strings. And what about alignment? There could/must be some
padding if the alignment constraints for the list element type are
stronger than the ones for Int. Therefore this function does not belong
to a general purpose library IMHO.

I'll post a new proposal for a "better" Addr module later...

Cheers,
   Sven

P.S.: I've received a few personal mails regarding the spelling of
marshaling, but there is no real agreement on this. It seems that both
spellings are possible and both "feel right", depending on the speaker
and the dictionary.  :-)
-- 
Sven PanneTel.: +49/89/2178-2235
LMU, Institut fuer Informatik FAX : +49/89/2178-2211
LFE Programmier- und Modellierungssprachen  Oettingenstr. 67
mailto:[EMAIL PROTECTED]D-80538 Muenchen
http://www.informatik.uni-muenchen.de/~Sven.Panne



Re: Staying alive

1999-10-05 Thread Michael Weber

[My apologies: some day - I promise - I'll learn to choose the right mailing
list... 8-]

On Mon, Oct 04, 1999 at 13:45:43 +0200, Sven Panne wrote:
[micro-marshalling]
 I know Manuel's code already and the one H/Direct produces. Has anybody
 else some FFI-related code and/or suggestions? This could be a wonderful
 topic for the wish list, but it would be nice to see the problems and needs
 for APIs different from GTK+ and OpenGL first.

Maybe it's worth to know how many objects a list has (without explicit
zeroElem), so I'd suggest to extend the class to:

   marshalList   ::[a] - IO Addr
   marshalListN  :: Int - [a] - IO Addr
   unmarshalList ::Addr - IO [a]
   unmarshalListN:: Int - Addr - IO [a]

where `marshalList' stores the list length at the beginning of the buffer,
like:

   marshalList xs = do
  let len = length xs
  lenSize = sizeOf len
  buf - malloc (lenSize + len * sizeOf (head xs))
  writeIntOffAddr buf 0 len
  zipWithM_ (writeOffAddr (buf `plusAddr` lenSize)) [ 0 .. ] xs
  return buf
  
and `instance Marshalable Int {...}'

 P.S. for the native speakers: Which spelling is correct, "marshaling" or
 "marshalling"? Ispell says "marshaling", but this looks a bit odd to me.

dictd knows both:
--
From The Free On-line Dictionary of Computing (20Jul99) [foldoc]:
  marshalling
  communications (US -ll- or -l-) The process of packing one
  or more items of data into a {message buffer}, prior to
  transmitting that message buffer over a communication channel.
  The packing process not only collects together values which
  may be stored in non-consecutive memory locations but also
  converts data of different types into a standard
  representation agreed with the recipient of the message.
  marshaling
  spelling Alternative US spelling of "{marshalling}".
  (1998-03-16)
--

I'd vote for the "double-L" variant... :-)


Cheers,
Michael
-- 
"Don't worry about people stealing your ideas.  If your ideas are any
 good, you'll have to ram them down people's throats."
-- Howard Aiken



Re: Staying alive

1999-10-04 Thread Sven Panne

"Manuel M. T. Chakravarty" wrote:
 I wouldn't call autoconf a hack :-)

OK, MEGA-hack.   :-)   But it's m4 which is the real culprit...

 [...] Anyway, such a class seems to be a nice way for expressing
 the information about marshalable data types.

The class is actually more about "micro"-marshaling, e.g. there are no
methods which actually do some form of memory management. In my current
OpenGL-binding I actually use the following larger class definition:

-
class Eq a = Marshalable a where
   sizeOf:: a - Int
   zeroElem  :: a

   indexOffAddr  :: Addr - Int -a
   readOffAddr   :: Addr - Int - IO a
   writeOffAddr  :: Addr - Int - a - IO ()

   marshal   ::  a  - IO Addr
   marshalList   :: [a] - IO Addr
   marshalListZero   :: [a] - IO Addr

   unmarshal ::Addr - IO  a
   unmarshalList :: Int - Addr - IO [a]
   unmarshalListZero ::Addr - IO [a]

   marshal x = marshalList [x]

   marshalList xs = do
  buf - malloc (length xs * sizeOf (head xs))
  zipWithM_ (writeOffAddr buf) [ 0 .. ] xs
  return buf

   marshalListZero xs = do
  let numElements = length xs
  buf - malloc ((numElements+1) * sizeOf (head xs))
  zipWithM_ (writeOffAddr buf) [ 0 .. ] xs
  writeOffAddr buf numElements (zeroElem `asTypeOf` head xs)
  return buf

   unmarshal buf = liftM head $ unmarshalList 1 buf

   unmarshalList numElements buf =
  mapM (readOffAddr buf) [ 0 .. numElements-1 ]

   unmarshalListZero buf = loop 0 []
  where loop idx accu = do x - readOffAddr buf idx
   if x == zeroElem
  then return $ reverse accu
  else loop (idx+1) (x:accu)
-

This captures usual marshaling/unmarshaling patterns for

   * a single value,
   * a list of values,
   * and a list of values, terminated by a special value (zeroElem).

Two typical instances are:

-
instance Marshalable Char where
   sizeOf   = const @ac_cv_sizeof_char@
   zeroElem = '\0'
   indexOffAddr = indexCharOffAddr
   readOffAddr  = readCharOffAddr
   writeOffAddr = writeCharOffAddr

instance Marshalable Addr where
   sizeOf   = const @ac_cv_sizeof_void_p@
   zeroElem = nullAddr
   indexOffAddr = indexAddrOffAddr
   readOffAddr  = readAddrOffAddr
   writeOffAddr = writeAddrOffAddr
-

The @ac_cv_sizeof_...@ are substituted by the configure script.
Given the above class definition one can write convenient higher order
functions, e.g. for in/inout parameter passing (which actually
involve some kind of alloca):

-
inParamWith :: (a - IO Addr) - (Addr - IO b) - a - IO b
inParamWith marsh act x = do
   buf - marsh x
   ret - act buf
   free buf
   return ret

inParam :: Marshalable a = (Addr - IO b) - a - IO b
inParam = inParamWith marshal

inOutParamWith :: (a - IO Addr) - (Addr - IO a) - (Addr - IO ()) - a - IO a
inOutParamWith marsh unmarsh act x = do
   buf - marsh x
   act buf
   ret - unmarsh buf
   free buf
   return ret

inOutParam :: Marshalable a = (Addr - IO ()) - a - IO a
inOutParam = inOutParamWith marshal unmarshal
-

I know Manuel's code already and the one H/Direct produces. Has
anybody else some FFI-related code and/or suggestions? This could be
a wonderful topic for the wish list, but it would be nice to see the
problems and needs for APIs different from GTK+ and OpenGL first.

Cheers,
   Sven

P.S. for the native speakers: Which spelling is correct, "marshaling"
or "marshalling"? Ispell says "marshaling", but this looks a bit odd
to me.
-- 
Sven PanneTel.: +49/89/2178-2235
LMU, Institut fuer Informatik FAX : +49/89/2178-2211
LFE Programmier- und Modellierungssprachen  Oettingenstr. 67
mailto:[EMAIL PROTECTED]D-80538 Muenchen
http://www.informatik.uni-muenchen.de/~Sven.Panne



Re: Staying alive

1999-10-03 Thread Manuel M. T. Chakravarty

Sven Panne [EMAIL PROTECTED] wrote,

 Simon Marlow wrote:
  Manuel M. T. Chakravarty wrote:
  [...] we should really enforce a stack discipline and use a wrapper
   function:
 alloca :: Int - (Addr - IO a) - IO a
  
  Nice idea.  I'm not sure about the implementation though: did you
  have in mind using C's malloc/free?
 
 Perhaps I'm totally missing the point here, but alloca seems to be
 dead easy to me:
[delete version using malloc/free]
 It's getting tough if you want to avoid malloc/free though.

Which is exactly the reason, I didn't mention malloc and
free.  First of all, my thinking is more along the lines of
how can we make this already promising FFI design into
something that can eventually get into the Haskell
standard.  So, all GHC-specific hacks don't help us
anything.  Second, we need a design that is sufficiently
straight-forward to implement and still allows an efficient
implementation of the FFI across different systems.

So, the question is, given the general problem of having to
deal with (in)out parameters in C, how can we handle them
nicely in the FFI?  The byte array stuff doesn't cut it for
reasons already discussed.

The idea behind the `alloca' proposal is that it (1) allows
the compiler (or interpreter) to allocate a chunk of stable
memory in whatever way is most efficient in that
implementation and (2) it makes it clear when the memory
area can be freed again.  Any better ideas?

 But IMHO the main problem is the handling of the correct offsets for
 e.g. inout parameters in the buffer. Alloca doesn't help here, and if
 you do it by hand, you end up with Haskell code which look suspiciously
 like the one H/Direct generates...

Hmmm, that's truly a problem if coding on the FFI manually.
So, my thinking was too tool-centric.  So, another problem
is that we want to allow tools to use their knowlege of the
size of primitives datatypes etc to optimise the marshaling
process, but still, we want to allow a sufficiently
convenient manual use of the FFI.

So, how about providing functions that advance the `mem'
pointer for you or even have `writeXXXOffAddr' return the
next address after the datum just written?  I.e, something
like

  foo = alloca N $ \mem1 - do 
  mem1 - writeWord32OffAddr mem1 0 ...
  mem2 - writeInt16OffAddr mem2 0 ...
  ...
  bar mem1 mem2
  optionally take stuff out of `memX' again
 
  foreign import ... bar :: Addr - Addr ... - IO ()

This is, however, maybe not so helpful because of alignment
constraints that are harder to satisfy manually than in
generated code.  So how about

  foo = alloca N $ \mem1 - do
  writeWord16OffAddr mem 0 ...
  let mem2 = mem `plusAddr` 4
  mem2 - writeInt32OffAddr mem2 0 ...
  ...
  bar mem1 mem2
  optionally take stuff out of `memX' again
 
  foreign import ... bar :: Addr - Addr ... - IO ()

A tool can pack everything as dense as the architecture-
dependent alignment constraints permit and a human can play
save and, e.g., use multiples of 4 on a 32-bit architecture
(the exact number should of course be hidden in a config
file to allow easy migration to other architectures).  I
think, this is not more difficult than having some form of
`malloc' call for every in(out) parameter passed (which is
how I do it currently in Gtk+HS).

BTW, I recently noticed that `writeInt32OffAddr a k' and
friends access `a + k * 4' and not `a + k', as I always
thought.  I think that this is rather unfortunate, as for
marshaling, I usually want to give a byte offset and not
multiples of the value currently accessed (as the marshaled
data is usually not uniform).  Furthermore, I can always get 
the current behviour by explicitly multiplying the index
with the number of 4 byte elements, but it is not possible
the other way around.

Finally, it would be nice to have floating point numbers
with explicit sizes - in the above `alloca' example, how
would a human programmer know, how much memory to allocate
for a float?  I guess, they are not provided, because it
would be a pain to implement the float operations for sizes
that are not `natively' supported on the platform.  However,
the current situation, where you have to hope that `Float'
in Haskell and `float' in C match is also not very nice and
definitely not appropriate for standardisation - unless
Haskell guarantees that `Float' and `float' as well as
`Double' and `double' always use the same representation
(and `long double' is completely missing so far).  So, what
is less pain?  Guarantee the use of the standard float sizes
in Haskell or `Float's of explicit size?  Any other ideas?

In summary, WishList requests:
* `alloca' or something better :-)
* writeXXXOffAddr indexed with bytes offsets
* floats with explicit sizes or a any other form of
  guarantee about their size  support for `long double'

Cheers,
Manuel



Staying alive

1999-09-22 Thread Sven Panne

In a recent discussion with Manuel Chakravarty the following question
arose. Given the following code:

   foo = do
  ba - newMutableByteArray ...
  ...
  bar ba
  -- ba not used here anymore

  foreign import ... bar :: MutableByteArray ... - IO ()

Let's further assume that bar makes callbacks to Haskell-land where a
GC occurs. Now the subtle question: Is it *guaranteed* that ba is
considered alive until bar returns or not? Either way, this should be
clarified in the FFI documentation.

Background: A lot of APIs (e.g. GTK+ and OpenGL :-) have functions
expecting a pointer to some data. One possible way to use them from
Haskell is

   * malloc area
   * fill area with data
   * call API function with pointer to area
   * explicitly free malloced area

Using (Mutable)ByteArrays would simplify thing here, especially the
last step would not be needed.

Cheers,
   Sven
-- 
Sven PanneTel.: +49/89/2178-2235
LMU, Institut fuer Informatik FAX : +49/89/2178-2211
LFE Programmier- und Modellierungssprachen  Oettingenstr. 67
mailto:[EMAIL PROTECTED]D-80538 Muenchen
http://www.informatik.uni-muenchen.de/~Sven.Panne