Re: [Haskell] thread-local variables

2006-08-08 Thread Frederik Eaton
> Furthermore, can we move this thread from the Haskell mailing list
> (which should not have heavy traffic) to either Haskell-Café, or
> the libraries list?

Sure, moving to haskell-cafe.

Frederik

-- 
http://ofb.net/~frederik/
___
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell


Re: [Haskell] thread-local variables

2006-08-08 Thread Frederik Eaton
On Tue, Aug 08, 2006 at 04:21:06PM +0300, Einar Karttunen wrote:
> On 07.08 13:16, Frederik Eaton wrote:
> > > How would this work together with the FFI?
> > 
> > It wouldn't, at least I wouldn't care if it didn't.
> 
> Suddenly breaking libraries that happen to use FFI behind your
> back does not seem like a good conservative extension.

FFI already doesn't mix well with GHC's IO handles. What if I write to
file descriptor 1 before all data in stdout has been flushed? Is that
a reason not to allow FFI?

> I think we should move the discussion to the wiki as Simon
> suggested. I can create a wikipage if you don't want to.

http://haskell.org/haskellwiki/Thread_local_storage

I think the wiki is a good place for proposals, but not most
discussion.

> > What about my example:
> > 
> > newMain host environment program_args
> > network_config locale terminal_settings
> > stdin stdout stderr = do
> > ...
> > 
> > Now, let's see. We might want two threads to have the same network
> > configuration, but a different view of the filesystem; or the same
> > view of the filesystem, but a different set of environment variables;
> > or the same environment, but different command line arguments. All
> > three cases are pretty common in practice. We might also want to have
> > the same arguments but different IO handles - as in a multi-threaded
> > server application.
> 
> This won't be pretty even with TLS. Our fancy app will probably mix
> in STM and pass callback actions to the thread processing
> packets coming directly from the network interface. Quickly
> the TLS approach seems problematic - we need to know what actions
> depend on each other and how.

I don't understand. Does TLS make such design harder or easier?

> > And the part that implements the filesystem might want to access the
> > network (if there is a network filesystem). And the part that starts
> > processes with an environment might want to access the filesystem, for
> > instance to read the code for the process and for shared libraries;
> > and maybe it also wants to get the hostname from the network layer. 
> > And the part that starts programs with arguments might want to access
> > the environment (for instance, to get the current locale), as well as
> > the filesystem (for instance, to read locale configuration files). And
> > the part that accesses the IO handles might also want to access not
> > just the program arguments but the environment, and the filesystem,
> > and the network.
> 
> So we have the following dependencies:
> 
> 
> FileSystem  -> Network
> Environment -> FileSystem, Network
> Arguments   -> Environment

and Filesystem

> IO Handles  -> Arguments,Environment,FS,Network
> 
> With TLS every one of them has type IO. Now the programmer is supposed
> to know that he has to configure the network before using program
> arguments? So a programmer first wanting to process command line
> arguments and only then configuring network will probably have
> hidden bugs.

The running example is of an executable starting under an operating
system. So everything is already configured by the time it starts, as
you know.

My application will be no different - for instance, the
database-related parameter will be set; then a request thread will
start, and after parsing the request, a user-id parameter will be set,
and then the request-processing functions will be called. There is no
reason for the main server thread to call any of the
request-processing functions, because it doesn't have a request to
process.

> It becomes very hard to know what different components depend on.
> 
> Even if we had to define all those instances that would be
> 1+2+1+3 = 7 instance declarations. Not 5^2 = 25 instances.
> Or use small wrapper combinators (which I prefer).

O(x) doesn't mean "same as x".

> btw how would the TLS solution elegantly handle that I'd like
> separate network configurations for e.g.
> IO Handle -> Network(socket) and
> IO Handle -> FileSystem(NFS) -> Network
> ?

The filesystem could send its actions to be executed in a separate
thread, which has its own configuration?

> > So here is an example where we have nested layers, and each layer
> > accesses most of the layers below it.
> 
> And this will cause problems. A good API should not encourage
> going to the lower levels directly. If the lowest level changes
> then with your design one has to make O(layers) changes instead of
> O(1) if the layers are not available directly.

No, you just write a compatibility wrapper over the new
implementation.

> If one of the layers adds a new dependency then making sure it is
> initialized and used correctly seems very hard to check.

I disagree.

> > If we started with a library that dealt with OS devices such as the
> > network, and used a special monad for that; and then if we built upon
> > that a layer for keeping track of environment variables, with another
> > monad; and then a layer for invoking executables with arguments; and

Re: [Haskell] thread-local variables

2006-08-08 Thread Einar Karttunen
On 07.08 13:16, Frederik Eaton wrote:
> > How would this work together with the FFI?
> 
> It wouldn't, at least I wouldn't care if it didn't.
>

Suddenly breaking libraries that happen to use FFI behind your
back does not seem like a good conservative extension.

I think we should move the discussion to the wiki as Simon
suggested. I can create a wikipage if you don't want to.

> What about my example:
> 
> newMain host environment program_args
> network_config locale terminal_settings
> stdin stdout stderr = do
> ...
> 
> Now, let's see. We might want two threads to have the same network
> configuration, but a different view of the filesystem; or the same
> view of the filesystem, but a different set of environment variables;
> or the same environment, but different command line arguments. All
> three cases are pretty common in practice. We might also want to have
> the same arguments but different IO handles - as in a multi-threaded
> server application.

This won't be pretty even with TLS. Our fancy app will probably mix
in STM and pass callback actions to the thread processing
packets coming directly from the network interface. Quickly
the TLS approach seems problematic - we need to know what actions
depend on each other and how.

> And the part that implements the filesystem might want to access the
> network (if there is a network filesystem). And the part that starts
> processes with an environment might want to access the filesystem, for
> instance to read the code for the process and for shared libraries;
> and maybe it also wants to get the hostname from the network layer. 
> And the part that starts programs with arguments might want to access
> the environment (for instance, to get the current locale), as well as
> the filesystem (for instance, to read locale configuration files). And
> the part that accesses the IO handles might also want to access not
> just the program arguments but the environment, and the filesystem,
> and the network.

So we have the following dependencies:


FileSystem  -> Network
Environment -> FileSystem, Network
Arguments   -> Environment
IO Handles  -> Arguments,Environment,FS,Network

With TLS every one of them has type IO. Now the programmer is supposed
to know that he has to configure the network before using program
arguments? So a programmer first wanting to process command line
arguments and only then configuring network will probably have
hidden bugs.

It becomes very hard to know what different components depend on.

Even if we had to define all those instances that would be
1+2+1+3 = 7 instance declarations. Not 5^2 = 25 instances.
Or use small wrapper combinators (which I prefer).

btw how would the TLS solution elegantly handle that I'd like
separate network configurations for e.g.
IO Handle -> Network(socket) and
IO Handle -> FileSystem(NFS) -> Network
?

> So here is an example where we have nested layers, and each layer
> accesses most of the layers below it.

And this will cause problems. A good API should not encourage
going to the lower levels directly. If the lowest level changes
then with your design one has to make O(layers) changes instead of
O(1) if the layers are not available directly.

If one of the layers adds a new dependency then making sure it is
initialized and used correctly seems very hard to check.

> If we started with a library that dealt with OS devices such as the
> network, and used a special monad for that; and then if we built upon
> that a layer for keeping track of environment variables, with another
> monad; and then a layer for invoking executables with arguments; and
> then a layer for IO; all with monads - then we would have a good
> modular, extensible design, which, due to the interactions between
> layers, would, in Haskell, require code length which is quadratic in
> the number of layers.

The trick here is that most components should not talk with each
other. Composition and encapsulation are the keys to victory.

> (Of course, it's true that in real operating systems, each of these
> layers has its own set of interfaces to the other layers - so the
> monadic approach is actually not more verbose. But the point is that
> it's a reasonable design, with layers, and where each layer uses each
> of the ones below it. I want to write code which is designed the same
> way, but without the overhead)

Yes, the size of the code is dependent on the size of the API.
Making things explicit is more infrastructure at the start,
but makes things easier later on when they have to be changed.

> If you move it somewhere else, but forget to move the thread-local
> variables it refers to, then you'll get a compiler error.

I was meaning forgetting to initialize it - not omitting the whole
definition.

> db2 <- getIOParam db2Param
> withIOParam dbParam db2 $ ...

And one needs to make sure that the "..." part does not need the
other database connection(s). Makes composing things hard.

> I'm still not sure I understand why t

Re: [Haskell] thread-local variables

2006-08-08 Thread Frederik Eaton
Hi Simon,

It is good that you support thread-local variables.

I have initialized a wiki page:

http://haskell.org/haskellwiki/Thread_local_storage

The main difference between my proposal and yours, as I see it, is
that yours is based on "keys" which can be used for other
things.

I think that leads to an interface which is less natural. In my
proposal, the IOParam type is quite similar to an IORef - it has a
user-specified initial state, and the internal implementation is
hidden from the user - yours differs in both of these aspects.
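
To make the IORef comparison concrete, here is a rough emulation of such
an interface in plain Haskell. It is only a sketch under assumptions: the
names getIOParam and withIOParam come from earlier in this thread, but
newIOParam, the signatures, and the table-of-bindings representation are
guesses for illustration, not the proposal's implementation. In
particular, this emulation does not give forked threads their parent's
bindings, which a real implementation would need runtime support for.

```haskell
import Control.Concurrent (ThreadId, myThreadId)
import Control.Exception (bracket_)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- | Hypothetical representation: a default value plus a stack of
-- per-thread bindings, most recent first.
data IOParam a = IOParam a (IORef [(ThreadId, a)])

-- | Create a parameter with a user-specified initial state (assumed name).
newIOParam :: a -> IO (IOParam a)
newIOParam def = IOParam def <$> newIORef []

-- | Read the calling thread's current binding, or the initial state.
getIOParam :: IOParam a -> IO a
getIOParam (IOParam def ref) = do
  tid   <- myThreadId
  table <- readIORef ref
  return (maybe def id (lookup tid table))

-- | Run an action with the parameter rebound for the calling thread;
-- the previous binding is restored afterwards, even on exceptions.
withIOParam :: IOParam a -> a -> IO b -> IO b
withIOParam (IOParam _ ref) new body = do
  tid <- myThreadId
  let push = atomicModifyIORef' ref (\t -> ((tid, new) : t, ()))
      pop  = atomicModifyIORef' ref (\t -> (dropFirst tid t, ()))
  bracket_ push pop body
  where
    -- Remove only the most recent binding belonging to this thread.
    dropFirst t0 t = case break ((== t0) . fst) t of
      (before, _ : after) -> before ++ after
      (before, [])        -> before
```

With this sketch, `withIOParam p 5 (getIOParam p)` yields 5 inside the
block, and `getIOParam p` outside it yields the initial state again.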

> * I agree with Robert that a key issue is initialisation.  Maybe it
> should be possible to associate an initialiser with a key.  I have not
> thought this out.

I still don't understand this, so it is not mentioned on the wiki.

> *  A key issue is this: when forking a thread, does the new thread
> inherit the current thread's bindings, or does it get a
> freshly-initialised set.  Sometimes you want one, sometimes the other,
> alas.

I think the inheritance semantics are more useful and also more
general: If I wanted a freshly-initialized set of bindings, and I only
had inheritance semantics, then I could start a thread early on when
all the bindings are in their initial state, and have this thread read
actions from a channel and execute them in sub-threads of itself, and
implement a 'fork' variant based on this. More generally, I could do
the same thing from a sub-thread of the main thread - I could start a
thread with any set of bindings, and use it to launch other threads
with those bindings. In this way, the "initial" set of bindings is not
specially privileged over intermediate sets of bindings.
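
The launcher-thread construction described above can be sketched with
ordinary Control.Concurrent primitives. The names Launcher, newLauncher
and forkFrom are made up for this illustration. Since thread-local
bindings do not exist yet, the code only shows the plumbing: under
inheritance semantics, every thread started through forkFrom would
inherit whatever bindings the launcher thread had when it was created.

```haskell
import Control.Concurrent (ThreadId, forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forever)

-- | Handle to a long-lived launcher thread (hypothetical type).
-- Each request carries the action to run and a slot for the reply.
newtype Launcher = Launcher (Chan (IO (), MVar ThreadId))

-- | Start a launcher in the current thread's context. Call this early,
-- while the bindings are still in the state you want children to see.
newLauncher :: IO Launcher
newLauncher = do
  requests <- newChan
  _ <- forkIO . forever $ do
    (action, reply) <- readChan requests
    tid <- forkIO action      -- child is a sub-thread of the launcher
    putMVar reply tid
  return (Launcher requests)

-- | A 'fork' variant: run the action as a sub-thread of the launcher
-- rather than of the calling thread.
forkFrom :: Launcher -> IO () -> IO ThreadId
forkFrom (Launcher requests) action = do
  reply <- newEmptyMVar
  writeChan requests (action, reply)
  takeMVar reply
```

Creating the launcher at program start gives a "fresh bindings" fork;
creating it from any other thread gives a fork with that thread's
bindings, which is the generality argued for above.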

> On the GHC front, we're going to be busy with 6.6 etc until after ICFP,
> so nothing is going to happen fast -- which gives an opportunity to
> discuss it.  However it's just infeasible for the community at large to
> follow a long email thread like this one. My suggestion would be for the
> interested parties to proceed somewhat as we did with packages.
> (http://hackage.haskell.org/trac/ghc/wiki/GhcPackages)

I have put a page on the wiki summarizing the thread. However, I want
to say that I think that email is a better medium for most ongoing
discussions. (I'm not sure if I may have suggested the opposite
earlier) For those who are not interested in the discussion, it should
be easy in most mail readers to ignore or hide a long thread, or to
skip to the very end of it to get a rough idea of where things stand. 
I think it is a good idea to have proposals on a wiki, though, so that
the product of all agreed-upon amendments and alterations can be
easily referred to.

When discussions happen on a wiki, though, they often take the same
threaded form as email discussions (see Wikipedia) - but, they are
seen by fewer interested people, and the interface is clumsier (for
instance, I can subscribe to email notification when a wiki page
changes - thanks to whoever finally made this possible on
haskell.org, by the way - but I have to read the updated version to
figure out whether the modification was replying to me or another
poster; whereas my mail reader clearly flags messages where I appear
in the recipients list).

Frederik

-- 
http://ofb.net/~frederik/


[Haskell] deriving DeepSeq and deep strict fields proposals (Re[2]: [Haskell-cafe] How can we detect and fix memory leak due to lazyness?)

2006-08-08 Thread Bulat Ziganshin
Hello Ki,

Tuesday, August 8, 2006, 6:34:51 AM, you wrote:

> Unfortunately seq and the strict data declaration is not helpful in general.
> They are only helpful on base values such as Int or Bool.
> What they do is just making sure that it is not a thunk.
> That is if it was a list it would just evaluate to see the cons cell
> but no further.

> Someone wrote a deepSeq module for forcing deep evaluation, which is

There was a proposal to add deepSeq to the language itself (for
example, by letting the compiler derive it automatically). We could add
another proposal, for deep strict fields:

data T = C !![Int]
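
For readers following along, the difference between seq and a deep
variant can be shown with a small hand-written class. This is a minimal
sketch: the class DeepSeq, the helper survives and the value sample are
names invented here, not the module mentioned in the quoted message. The
`!![Int]` syntax above is a proposal and is not used in the code.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- | Hand-rolled deep forcing (illustrative class, written by hand here).
class DeepSeq a where
  deepSeq :: a -> b -> b

instance DeepSeq Int where
  deepSeq = seq                       -- a base value: seq is enough

instance DeepSeq a => DeepSeq [a] where
  deepSeq []       y = y
  deepSeq (x : xs) y = x `deepSeq` (xs `deepSeq` y)

-- | A list whose spine is fine but whose second element is bottom.
sample :: [Int]
sample = [1, undefined, 3]

-- | True if evaluating the argument to weak head normal form
-- completes without throwing.
survives :: a -> IO Bool
survives x = do
  r <- try (evaluate x)
  case r of
    Left e  -> const (return False) (e :: SomeException)
    Right _ -> return True
```

Here `survives (sample `seq` ())` returns True, because seq stops at the
first cons cell, while `survives (sample `deepSeq` ())` returns False,
because the deep traversal reaches the undefined element.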

-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]
