Re: Unsafe hGetContents

2010-01-23 Thread Florian Weimer
* Simon Marlow:

>> What about handles from System.Process?  Do they count as well?
>
> Sure - we hopefully don't consider System.Process to be unsafe.

Here's a demonstration that lazy input has an observable effect.  It
needs the Perl helper script included below.

Of course, this example is constructed, but there are similar issues
to consider when network IO is involved.  For instance, not reading
the lazy structure to its end causes the server to keep the connection
open longer than necessary.

--
-- Based on Oleg Kiselyov's example in:
-- 

module Main where

import System.IO (hGetContents)
import System.Process (runInteractiveProcess)

f1, f2:: String -> String -> String

f1 e1 e2 = e1 `seq` e2 `seq` e1
f2 e1 e2 = e2 `seq` e1 `seq` e1

f = head . tail . lines

spawn :: () -> IO String
spawn () = do
  (inp,out,err,pid) <-
    runInteractiveProcess "perl" ["magic.pl"] Nothing Nothing
  hGetContents out

main = do
   s1 <- spawn ()
   s2 <- spawn ()
   print $ f1 (f s1) (f s2)
   -- print $ f2 (f s1) (f s2)
--

#!/usr/bin/perl

# Magic program to demonstrate that lazy I/O leads to observable
# differences in behavior.

use strict;
use warnings;

use Fcntl ':flock';

open my $self, '<', $0 or die "opening $0: $!\n"; # use this file as lock
flock($self, LOCK_SH) or die "flock(LOCK_SH): $!\n";
print "x" x 100_000 . "\n"; # blocks if reader blocks
print flock($self, LOCK_EX | LOCK_NB) ? "locked\n" : "failed\n";
  # only succeeds if the other process has exited

___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime


Re: Unsafe hGetContents

2009-10-20 Thread Duncan Coutts
On Tue, 2009-10-20 at 15:45 +0100, Simon Marlow wrote:

> > I've not yet seen anyone put forward any practical programs that have
> > confusing behaviour but were not written deliberately to be as wacky as
> > possible and avoid all the safety mechanisms.
> >
> > The standard use case for hGetContents is reading a read-only file, or
> > stdin where it really does not matter when the read actions occur with
> > respect to other IO actions. You could do it in parallel rather than
> > on-demand and it'd still be ok.
> >
> > There's the beginner mistake where people don't notice that they're not
> > actually demanding anything before closing the file, that's nothing new
> > of course.
> 
> If the parallel runtime reads files eagerly, that might hide a resource 
> problem that would occur when the program is run on a sequential system, 
> for example.

That's true, but we have the same problem without doing any IO. There
are many ways of generating large amounts of data.

Duncan



Re: Unsafe hGetContents

2009-10-20 Thread Simon Marlow

On 20/10/2009 15:24, Duncan Coutts wrote:
> On Tue, 2009-10-20 at 13:58 +0100, Simon Marlow wrote:
>
>> Duncan has found a definition of hGetContents that explains why it has
>> surprising behaviour, and that's very nice because it lets us write the
>> compilers that we want to write, and we get to tell the users to stop
>> moaning because the strange behaviour they're experiencing is allowed
>> according to the spec.  :-)
>
> :-)
>
>> Of course, the problem is that users don't want the hGetContents that
>> has non-deterministic semantics, they want a deterministic one.  And for
>> that, they want to fix the evaluation order (or something).  The obvious
>> drawback with fixing the evaluation order is that it ties the hands of
>> the compiler developers, and makes a fundamental change to the language
>> definition.
>
> I've not yet seen anyone put forward any practical programs that have
> confusing behaviour but were not written deliberately to be as wacky as
> possible and avoid all the safety mechanisms.
>
> The standard use case for hGetContents is reading a read-only file, or
> stdin where it really does not matter when the read actions occur with
> respect to other IO actions. You could do it in parallel rather than
> on-demand and it'd still be ok.
>
> There's the beginner mistake where people don't notice that they're not
> actually demanding anything before closing the file, that's nothing new
> of course.

If the parallel runtime reads files eagerly, that might hide a resource 
problem that would occur when the program is run on a sequential system, 
for example.


Cheers,
Simon


Re: Unsafe hGetContents

2009-10-20 Thread Duncan Coutts
On Tue, 2009-10-20 at 13:58 +0100, Simon Marlow wrote:

> Duncan has found a definition of hGetContents that explains why it has 
> surprising behaviour, and that's very nice because it lets us write the 
> compilers that we want to write, and we get to tell the users to stop 
> moaning because the strange behaviour they're experiencing is allowed 
> according to the spec.  :-)

:-)

> Of course, the problem is that users don't want the hGetContents that 
> has non-deterministic semantics, they want a deterministic one.  And for 
> that, they want to fix the evaluation order (or something).  The obvious 
> drawback with fixing the evaluation order is that it ties the hands of 
> the compiler developers, and makes a fundamental change to the language 
> definition.

I've not yet seen anyone put forward any practical programs that have
confusing behaviour but were not written deliberately to be as wacky as
possible and avoid all the safety mechanisms.

The standard use case for hGetContents is reading a read-only file, or
stdin where it really does not matter when the read actions occur with
respect to other IO actions. You could do it in parallel rather than
on-demand and it'd still be ok.

There's the beginner mistake where people don't notice that they're not
actually demanding anything before closing the file, that's nothing new
of course.
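
A minimal sketch of that beginner mistake, for illustration only. The helper names readTooLate and readForced and the scratch file name are made up here. What readTooLate yields depends on the implementation: the Haskell 98 report fixes the list at whatever had been read when the handle was closed, while recent GHC versions, as far as I know, raise an exception when the string is forced after hClose. Only the second helper should be relied on.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper showing the mistake: the handle is closed before
-- anything has demanded the lazy string, so no data has been read yet.
readTooLate :: FilePath -> IO String
readTooLate path = do
  h <- openFile path ReadMode
  s <- hGetContents h
  hClose h
  return s

-- Hypothetical helper with the usual fix: force the whole string
-- (here via its length) before closing the handle.
readForced :: FilePath -> IO String
readForced path = do
  h <- openFile path ReadMode
  s <- hGetContents h
  _ <- evaluate (length s)
  hClose h
  return s

main :: IO ()
main = do
  let path = "lazy-io-demo.txt"  -- scratch file name, an assumption
  writeFile path "line one\nline two\n"
  s <- readForced path
  putStr s
```

Forcing the length is the bluntest fix; it gives up the streaming behaviour that makes lazy input attractive in the first place, so it is mainly suitable for small files.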

Duncan



Re: Unsafe hGetContents

2009-10-20 Thread Simon Marlow

On 10/10/2009 18:59, Iavor Diatchki wrote:
> Hello,
>
> well, I think that the fact that we seem to have a program context
> that can distinguish "f1" from "f2" is worth discussing because I
> would have thought that in a pure language they are interchangeable.
> The question is, does the context in Oleg's example really distinguish
> between "f1" and "f2"?  You seem to be saying that this is not the
> case:  in both cases you end up with the same non-deterministic
> program that reads two numbers from the standard input and subtracts
> them but you can't assume anything about the order in which the
> numbers are extracted from the input---it is merely an artifact of the
> GHC implementation that with "f1" the subtraction always happens the
> one way, and with "f2" it happens the other way.
>
> I can (sort of) buy this argument, after all, it is quite similar to
> what happens with asynchronous exceptions (f1 (error "1") (error "2")
> vs f2 (error "1") (error "2")).  Still, the whole thing does not
> "smell right":  there is some impurity going on here, and trying to
> offload the problem onto the IO monad only makes reasoning about IO
> computations even harder (and it is pretty hard to start with).  So,
> discussion and alternative solutions should be strongly encouraged, I
> think.

Duncan has found a definition of hGetContents that explains why it has 
surprising behaviour, and that's very nice because it lets us write the 
compilers that we want to write, and we get to tell the users to stop 
moaning because the strange behaviour they're experiencing is allowed 
according to the spec.  :-)


Of course, the problem is that users don't want the hGetContents that 
has non-deterministic semantics, they want a deterministic one.  And for 
that, they want to fix the evaluation order (or something).  The obvious 
drawback with fixing the evaluation order is that it ties the hands of 
the compiler developers, and makes a fundamental change to the language 
definition.


Things will get a lot worse in the future as we experiment with more 
elaborate compiler optimisations and evaluation strategies.  I predict 
that eventually we'll have to ditch hGetContents, at least in its 
current generality.


Cheers,
Simon




Re: Unsafe hGetContents

2009-10-12 Thread Simon Marlow

On 11/10/2009 09:26, Florian Weimer wrote:
> * Simon Marlow:
>
>>> Oleg's example is quite close, don't you think?
>>>
>>> URL: http://www.haskell.org/pipermail/haskell/2009-March/021064.html
>>
>> Ah yes, if you have two lazy input streams both referring to the same
>> underlying stream, that is enough to demonstrate a problem.  As for
>> whether Oleg's example is within the rules, it depends whether you
>> consider fdToHandle as "unsafe":
>
> Is relying on seq to show the difference allowed, according to your
> rules on an insecurity proof?

Absolutely.

> What about handles from System.Process?  Do they count as well?

Sure - we hopefully don't consider System.Process to be unsafe.

Cheers,
Simon


Re: Unsafe hGetContents

2009-10-11 Thread Isaac Dupree
Hmm, don't you think forkIO deserves some of the same complaints as
unsafeInterleaveIO?  Things happen in a nondeterministic order!


I think what irritates us about unsafeInterleaveIO is that it's IO that 
tinkers with the internals of the Haskell evaluation system.  The OS 
can't do it: in a C program it might, because there's libc and debuggers 
and all kinds of things that understand compiled C to some extent.  But 
the Haskell runtime system is pretty much obfuscated to anyone except 
ourselves.  This obfuscation maintains its conceptual purity to a 
greater extent than is really guaranteed by the standards.  This 
obfuscation is supported in our minds by the fact that functions (->) 
cannot be compared for equality or deconstructed or serialized in any 
way, only applied.


forkIO causes IO to happen in a nondeterministic order.  So does 
unsafeInterleaveIO.  But for unsafeInterleaveIO, the nondeterminism 
depends in part on how pure functions are written: partly because there 
is a compiler that makes arbitrary choices, and also partly affected by 
the strictness properties of the functions.  This feels disconcerting to 
us.  And worse: I am not sure if forkIO has a formal guarantee that its 
IO will complete, but we tend to assume that it will, sooner or later; 
unsafeInterleaveIO might not happen at all, and frequently does not, due 
to the observations of how pure functions are written.


It's disconcerting.  It can affect how we choose to write our pure code, 
the same way that efficiency and memory concerns can.  But if 'catch' 
can catch a different exception depending even, conceptually, on the 
phase of the moon, it is a similarly strange stretch to imagine 
unsafeInterleaveIO doing so.  It plays with chronology (like forkIO 
does) and with the ways Haskell functions are written (like 'catch' 
does) at the same time.


A result is that it makes a lot of people confused when they do 
something they didn't intend with it.  Also, it's a powerful enough tool 
that when you want to replace its formal nondeterminism with something 
more precise, you may have quite a bit of work cut out for you, 
restructuring your code (like Darcs did, IIRC).


-Isaac



Re: Unsafe hGetContents

2009-10-11 Thread Duncan Coutts
On Sat, 2009-10-10 at 10:59 -0700, Iavor Diatchki wrote:
> Hello,
> 
> well, I think that the fact that we seem to have a program context
> that can distinguish "f1" from "f2" is worth discussing because I
> would have thought that in a pure language they are interchangeable.

Crucially they are contexts in an IO program.

> The question is, does the context in Oleg's example really distinguish
> between "f1" and "f2"?  You seem to be saying that this is not the
> case:  in both cases you end up with the same non-deterministic
> program that reads two numbers from the standard input and subtracts
> them but you can't assume anything about the order in which the
> numbers are extracted from the input---it is merely an artifact of the
> GHC implementation that with "f1" the subtraction always happens the
> one way, and with "f2" it happens the other way.

Right.

> I can (sort of) buy this argument, after all, it is quite similar to
> what happens with asynchronous exceptions (f1 (error "1") (error "2")
> vs f2 (error "1") (error "2")).  Still, the whole thing does not
> "smell right":  there is some impurity going on here,

No, there's no impurity.

> and trying to offload the problem onto the IO monad only makes
> reasoning about IO computations even harder (and it is pretty hard to
> start with).

Sure, reasoning about non-deterministic IO programs is tricky. But then
nobody here is advocating writing non-deterministic IO programs. Lazy IO
is sensible and useful when the non-determinism doesn't make any
difference to the results.

Let's look at a simplified case: instead of general IO with the whole OS
API at our disposal, let's look at the case of a single thread of control
and mutable variables, specifically the ST monad.

We can construct a semantics for this based on a sequence of read /
write events for the mutable variables. The ST monad bind gives
guarantees about the ordering of the events. So the ST programs are
deterministic.

do x <- readSTRef v
   writeSTRef v (x+1)
   writeSTRef v (x+2)

The semantics of this ST program is the trace

read(v,x)
write(v,x+1)
write(v,x+2)
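
For this deterministic case the trace can be checked against the real ST monad. A small sketch, assuming v is created inside the computation with a caller-supplied initial value (the name `example` and the initial value are not from the message):

```haskell
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, readSTRef, writeSTRef)

-- Runs the three events read(v,x), write(v,x+1), write(v,x+2) in the
-- order fixed by bind and returns the final contents of v.
example :: Int -> Int
example start = runST $ do
  v <- newSTRef start
  x <- readSTRef v
  writeSTRef v (x + 1)
  writeSTRef v (x + 2)
  readSTRef v

main :: IO ()
main = print (example 0)  -- the last write wins, so this prints 2
```

Because bind fixes the order of the two writes, the final value is always x+2, which is what makes a pure runST sound here.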

We could introduce non-determinism to this system by allowing read /
write events to be arbitrarily interleaved with other subsequent events:

do x <- readSTRef v
   unsafeInterleaveST $ writeSTRef v (x+1)
   writeSTRef v (x+2)

now we can have two traces:

read(v,x)
write(v,x+1)
write(v,x+2)

or

read(v,x)
write(v,x+2)
write(v,x+1)

The semantics is the set of traces, in this case just the two.

Of course with this modified ST system we cannot allow a pure runST
because we've got non-deterministic ST programs (or we could make it
pure by returning the full set of traces). But it'd be ok for IO.

Now working with and reasoning about these non-deterministic ST programs
is tricky. Depending on the implementation choice for the interleaving
we'll get different results and under some implementation choices we'll
be able to influence the result by coding pure bits of the program
differently. None of this changes the semantics since the semantics just
says any possible interleaving is OK.

Another interesting thing to note is that we can limit the interleaving
somewhat by forcing deferred events to come before other subsequent
events:

do writeSTRef v 1
   x <- unsafeInterleaveST $ readSTRef v
   writeSTRef v 2
   evaluate x
   writeSTRef v 3

So in the traces for this program, x can have the value 1 or 2 but not 3
because of the partial order on events that we impose using evaluate.
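
The same shape can be tried in GHC's IO monad, with unsafeInterleaveIO standing in for the hypothetical unsafeInterleaveST and an IORef for v. This is only a sketch: the name `deferredRead` is made up, and since the semantics above allow either 1 or 2, the code checks for membership rather than for one specific value.

```haskell
import Control.Exception (evaluate)
import Data.IORef (newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

deferredRead :: IO Int
deferredRead = do
  v <- newIORef (0 :: Int)
  writeIORef v 1
  x <- unsafeInterleaveIO (readIORef v)  -- the read may happen any time later
  writeIORef v 2
  _ <- evaluate x                        -- ...but no later than this point
  writeIORef v 3
  return x

main :: IO ()
main = do
  x <- deferredRead
  putStrLn (if x `elem` [1, 2] then "x is 1 or 2" else "x is 3 (unexpected)")
```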

We can also do something like Oleg's example (simplified to only a
single getChar rather than reading the whole input stream)

do
  let fileContent = "hello"
  seekPoint <- newSTRef 0
  let getChar = do
        s <- readSTRef seekPoint
        writeSTRef seekPoint (s+1)
        return (fileContent !! s)

  s1 <- unsafeInterleaveST getChar
  s2 <- unsafeInterleaveST getChar

  --evaluate (f1 s1 s2)
  evaluate (f2 s1 s2)

Under some implementations of the interleaving we can expect to get
different event interleavings for the f1 program vs the f2 program. So
we apparently have a pure function influencing the event ordering. Of
course the semantics says we have both event orderings anyway.
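
A runnable approximation in IO, using unsafeInterleaveIO and an IORef in place of the hypothetical ST primitives. The names `run` and `nextChar`, and the pair-returning variants of f1 and f2, are illustrative assumptions. Which order a given compiler picks for each function is not guaranteed, so the sketch only checks that the result is one of the two allowed interleavings.

```haskell
import Control.Exception (evaluate)
import Data.IORef (newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Two pure functions that differ only in which argument they force first.
f1, f2 :: Char -> Char -> (Char, Char)
f1 a b = a `seq` b `seq` (a, b)
f2 a b = b `seq` a `seq` (a, b)

-- Two deferred reads share one seek pointer; the pure function passed in
-- decides (in practice) which deferred read runs first.
run :: (Char -> Char -> (Char, Char)) -> IO (Char, Char)
run f = do
  let fileContent = "hello"
  seekPoint <- newIORef (0 :: Int)
  let nextChar = do
        s <- readIORef seekPoint
        writeIORef seekPoint (s + 1)
        return (fileContent !! s)
  s1 <- unsafeInterleaveIO nextChar
  s2 <- unsafeInterleaveIO nextChar
  evaluate (f s1 s2)

main :: IO ()
main = do
  run f1 >>= print
  run f2 >>= print
```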

It is also possible to write ST programs that produce the same result
irrespective of the event interleaving. These programs might actually be
useful. For example:

do writeSTRef v 1
   x <- unsafeInterleaveST $ readSTRef v
   ...
   -- no more writes to v

So here we allow the read from v to be performed at any time. We still have
loads of different possible traces, but the value of x is the same in
each, because the v variable is never written to again.

In IO with the full OS API and other programs running concurrently it is
harder to reason about. But we can see similar possibilities for
non-deterministic primitives where we can still get a deterministic
result. One of those is if we read from a mutable variable (a file) and
can be sure that there are no other writes to 

Re: Unsafe hGetContents

2009-10-11 Thread Heinrich Apfelmus
Iavor Diatchki wrote:
> Hello,
> 
> well, I think that the fact that we seem to have a program context
> that can distinguish "f1" from "f2" is worth discussing because I
> would have thought that in a pure language they are interchangeable.
> The question is, does the context in Oleg's example really distinguish
> between "f1" and "f2"?  You seem to be saying that this is not the
> case:  in both cases you end up with the same non-deterministic
> program that reads two numbers from the standard input and subtracts
> them but you can't assume anything about the order in which the
> numbers are extracted from the input---it is merely an artifact of the
> GHC implementation that with "f1" the subtraction always happens the
> one way, and with "f2" it happens the other way.
>
> I can (sort of) buy this argument, after all, it is quite similar to
> what happens with asynchronous exceptions (f1 (error "1") (error "2")
> vs f2 (error "1") (error "2")).  Still, the whole thing does not
> "smell right":  there is some impurity going on here, and trying to
> offload the problem onto the IO monad only makes reasoning about IO
> computations even harder (and it is pretty hard to start with).  So,
> discussion and alternative solutions should be strongly encouraged, I
> think.

To put it in different words, here is an elaboration on what exactly the
non-determinism argument is:


Consider programs  foo1  and  foo2  defined as

import Control.Exception

foo :: (a -> b -> c) -> IO String
foo f = Control.Exception.catch
    (evaluate (f (error "1") (error "2")) >> return "3")
    (\(ErrorCall s) -> return s)

foo1  = foo f1  where  f1 x y = x `seq` y `seq` ()
foo2  = foo f2  where  f2 x y = y `seq` x `seq` ()

Knowing how exceptions and  seq  behave in GHC, it is straightforward to
prove that

foo1  = return "1"
foo2  = return "2"

which clearly violates referential transparency. This is bad, so the
idea is to disallow the proof.


In particular, the idea is that referential transparency can be restored
if we only allow proofs that work for all evaluation orders, which is
equivalent to introducing non-determinism. In other words, we are only
allowed to prove

foo1  = return "1"  or  return "2"
foo2  = return "1"  or  return "2"

Moreover, we can push the non-determinism into the IO type and pretend
that pure functions  A -> B  are semantically lifted to  Nondet A ->
Nondet B  with some kind of  fmap .


The same goes for  hGetContents : if you use it twice on the same
handle, you're only allowed to prove non-deterministic behavior, which
is not very useful if you want a deterministic program. But you are
allowed to prove deterministic results if you use it with appropriate
caution.


In other words, the language semantics guarantees less than GHC actually
does. In particular, the semantics only allows reasoning that is
independent of the evaluation order, and this means treating IO as
non-deterministic in certain cases.


Regards,
apfelmus

--
http://apfelmus.nfshost.com



Re: Unsafe hGetContents

2009-10-11 Thread Florian Weimer
* Simon Marlow:

>> Oleg's example is quite close, don't you think?
>>
>> URL: http://www.haskell.org/pipermail/haskell/2009-March/021064.html
>
> Ah yes, if you have two lazy input streams both referring to the same
> underlying stream, that is enough to demonstrate a problem.  As for
> whether Oleg's example is within the rules, it depends whether you
> consider fdToHandle as "unsafe":

Is relying on seq to show the difference allowed, according to your
rules on an insecurity proof?

What about handles from System.Process?  Do they count as well?


Re: Unsafe hGetContents

2009-10-10 Thread Iavor Diatchki
Hello,

well, I think that the fact that we seem to have a program context
that can distinguish "f1" from "f2" is worth discussing because I
would have thought that in a pure language they are interchangeable.
The question is, does the context in Oleg's example really distinguish
between "f1" and "f2"?  You seem to be saying that this is not the
case:  in both cases you end up with the same non-deterministic
program that reads two numbers from the standard input and subtracts
them but you can't assume anything about the order in which the
numbers are extracted from the input---it is merely an artifact of the
GHC implementation that with "f1" the subtraction always happens the
one way, and with "f2" it happens the other way.

I can (sort of) buy this argument, after all, it is quite similar to
what happens with asynchronous exceptions (f1 (error "1") (error "2")
vs f2 (error "1") (error "2")).  Still, the whole thing does not
"smell right":  there is some impurity going on here, and trying to
offload the problem onto the IO monad only makes reasoning about IO
computations even harder (and it is pretty hard to start with).  So,
discussion and alternative solutions should be strongly encouraged, I
think.

-Iavor

On Sat, Oct 10, 2009 at 7:38 AM, Duncan Coutts wrote:
> On Sat, 2009-10-10 at 02:51 -0700, o...@okmij.org wrote:
>
>> > The reason it's hard is that to demonstrate a difference you have to get
>> > the lazy I/O to commute with some other I/O, and GHC will never do that.
>>
>> The keyword here is GHC. I may well believe that GHC is able to divine
>> programmer's true intent and so it always does the right thing. But
>> writing in the language standard ``do what the version x.y.z of GHC
>> does'' does not seem very appropriate, or helpful to other
>> implementors.
>
> With access to unsafeInterleaveIO it's fairly straightforward to show
> that it is non-deterministic. These programs that bypass the safety
> mechanisms on hGetContents just get us back to having access to the
> non-deterministic semantics of unsafeInterleaveIO.
>
>> > Haskell's IO library is carefully designed to not run into this
>> > problem on its own.  It's normally not possible to get two Handles
>> > with the same FD...
>
>> Is this behavior specified somewhere, or is this just an artifact
>> of a particular GHC implementation?
>
> It is in the Haskell 98 report, in the design of the IO library. It does
> not mention FDs of course. The IO/Handle functions it provides give
> no (portable) way to obtain two read handles on the same OS file
> descriptor. The hGetContents behaviour of semi-closing is to stop you
> from getting two lazy lists of the same read Handle.
>
> There's nothing semantically wrong with you bypassing those restrictions
> (eg openFile "/dev/fd/0") it just means you end up with a
> non-deterministic IO program, which is something we typically try to
> avoid.
>
> I am a bit perplexed by this whole discussion. It seems to come down to
> saying that unsafeInterleaveIO is non-deterministic and that things
> implemented on top are also non-deterministic. The standard IO library
> puts up some barriers to restrict the non-determinism, but if you walk
> around the barrier then you can still find it. It's not clear to me what
> is supposed to be surprising or alarming here.
>
> Duncan


Re: Unsafe hGetContents

2009-10-10 Thread Duncan Coutts
On Sat, 2009-10-10 at 02:51 -0700, o...@okmij.org wrote:

> > The reason it's hard is that to demonstrate a difference you have to get
> > the lazy I/O to commute with some other I/O, and GHC will never do that.
> 
> The keyword here is GHC. I may well believe that GHC is able to divine
> programmer's true intent and so it always does the right thing. But
> writing in the language standard ``do what the version x.y.z of GHC
> does'' does not seem very appropriate, or helpful to other
> implementors.

With access to unsafeInterleaveIO it's fairly straightforward to show
that it is non-deterministic. These programs that bypass the safety
mechanisms on hGetContents just get us back to having access to the
non-deterministic semantics of unsafeInterleaveIO.

> > Haskell's IO library is carefully designed to not run into this
> > problem on its own.  It's normally not possible to get two Handles
> > with the same FD...

> Is this behavior specified somewhere, or is this just an artifact
> of a particular GHC implementation?

It is in the Haskell 98 report, in the design of the IO library. It does
not mention FDs of course. The IO/Handle functions it provides give
no (portable) way to obtain two read handles on the same OS file
descriptor. The hGetContents behaviour of semi-closing is to stop you
from getting two lazy lists of the same read Handle.

There's nothing semantically wrong with you bypassing those restrictions
(eg openFile "/dev/fd/0") it just means you end up with a
non-deterministic IO program, which is something we typically try to
avoid.

I am a bit perplexed by this whole discussion. It seems to come down to
saying that unsafeInterleaveIO is non-deterministic and that things
implemented on top are also non-deterministic. The standard IO library
puts up some barriers to restrict the non-determinism, but if you walk
around the barrier then you can still find it. It's not clear to me what
is supposed to be surprising or alarming here.

Duncan



Unsafe hGetContents

2009-10-10 Thread oleg

Simon Marlow wrote:
> Ah yes, if you have two lazy input streams both referring to the same
> underlying stream, that is enough to demonstrate a problem.  As for
> whether Oleg's example is within the rules, it depends whether you
> consider fdToHandle as "unsafe"

I wasn't aware of the rules. Fortunately, UNIX (FreeBSD and Linux)
give plenty of opportunities to shoot oneself. Here is the code from
the earlier message without the offending fdToHandle:

> {- Haskell98! -}
>
> module Main where
>
> import System.IO
>
> -- f1 and f2 are both pure functions, with the pure type.
> -- Both compute the result of the subtraction e1 - e2.
> -- The only difference between them is the sequence of
> -- evaluating their arguments, e1 `seq` e2 vs. e2 `seq` e1
> -- For really pure functions, that difference should not be observable
>
> f1, f2:: Int ->Int ->Int
>
> f1 e1 e2 = e1 `seq` e2 `seq` e1 - e2
> f2 e1 e2 = e2 `seq` e1 `seq` e1 - e2
>
> read_int s = read . head . words $ s
>
> main = do
>    let h1 = stdin
>    h2 <- openFile "/dev/stdin" ReadMode
>    s1 <- hGetContents h1
>    s2 <- hGetContents h2
>    -- print $ f1 (read_int s1) (read_int s2)
>    print $ f2 (read_int s1) (read_int s2)

It exhibits the same behavior that was described in
http://www.haskell.org/pipermail/haskell/2009-March/021064.html

I think Windows may have something similar.


> The reason it's hard is that to demonstrate a difference you have to get
> the lazy I/O to commute with some other I/O, and GHC will never do that.

The keyword here is GHC. I may well believe that GHC is able to divine
programmer's true intent and so it always does the right thing. But
writing in the language standard ``do what the version x.y.z of GHC
does'' does not seem very appropriate, or helpful to other
implementors.

> Haskell's IO library is carefully designed to not run into this
> problem on its own.  It's normally not possible to get two Handles
> with the same FD...
Is this behavior specified somewhere, or is this just an artifact
of a particular GHC implementation?



Re: Unsafe hGetContents

2009-10-06 Thread Duncan Coutts
On Tue, 2009-10-06 at 15:18 +0200, Nicolas Pouillard wrote:

> > The reason it's hard is that to demonstrate a difference you have to get 
> > the lazy I/O to commute with some other I/O, and GHC will never do that. 
> >   If you find a way to do it, then we'll probably consider it a bug in GHC.
> > 
> > You can get lazy I/O to commute with other lazy I/O, and perhaps with 
> > some cunning arrangement of pipes (or something) that might be a way to 
> > solve the puzzle.  Good luck!
> 
> Oleg's example is quite close, don't you think?
> 
> URL: http://www.haskell.org/pipermail/haskell/2009-March/021064.html


I didn't think that showed very much. He showed two different runs of
two different IO programs where he got different results after having
bypassed the safety switch on hGetContents.

It shows that lazy IO is non-deterministic, but then we knew that. It
didn't show anything was impure.

As a software engineering thing, it's recommended to use lazy IO in the
cases where the non-determinism has a low impact, ie where the order of
the actions with respect to other actions doesn't really matter. When it
does matter then your programs will probably be more comprehensible if
you do the actions more explicitly.

For example we have the shoot-yourself-in-the-foot restriction that you
can only use hGetContents on a handle a single time (this is the safety
mechanism that Oleg turned off) and after that you cannot write to the
same handle. That's not because it'd be semantically unsound if those
restrictions were not there, but it would let you write some jolly
confusing non-deterministic programs.
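
The single-use restriction can be observed directly. This is a small
sketch under the assumption of a GHC-style implementation, where a second
hGetContents on a semi-closed handle is reported as an IOException; the
helper name secondReadRejected and the file name are invented here.

```haskell
module Main where

import Control.Exception (IOException, try)
import System.Directory (removeFile)
import System.IO

-- | Call hGetContents twice on the same handle and report whether the
-- second call was rejected with an IOException.
-- (secondReadRejected is a hypothetical helper name.)
secondReadRejected :: FilePath -> IO Bool
secondReadRejected path = do
  h <- openFile path ReadMode
  _ <- hGetContents h  -- the handle is now semi-closed
  r <- try (hGetContents h) :: IO (Either IOException String)
  hClose h
  return (either (const True) (const False) r)

main :: IO ()
main = do
  let path = "semi-closed-demo.txt"  -- hypothetical scratch file
  writeFile path "hello\n"
  rejected <- secondReadRejected path
  putStrLn ("second hGetContents rejected: " ++ show rejected)
  removeFile path
```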

Duncan



Re: Unsafe hGetContents

2009-10-06 Thread Simon Marlow

On 06/10/2009 14:18, Nicolas Pouillard wrote:
> Excerpts from Simon Marlow's message of Tue Oct 06 14:59:06 +0200 2009:
>> On 03/10/2009 19:59, Florian Weimer wrote:
>>> * Nicolas Pouillard:
>>>
>>>> Excerpts from Florian Weimer's message of Wed Sep 16 22:17:08 +0200 2009:
>>>>> Are there any plans to get rid of hGetContents and the semi-closed
>>>>> handle state for Haskell Prime?
>>>>>
>>>>> (I call hGetContents unsafe because it adds side effects to pattern
>>>>> matching, strictly speaking invalidating most of the transformations
>>>>> which are expected to be valid in a pure language.)
>>>>
>>>> Would you consider something like [1] as an acceptable replacement?
>>>>
>>>> [1]: http://hackage.haskell.org/package/safe-lazy-io
>>>
>>> It only addresses two known issues with lazy I/O, doesn't it?  It
>>> still injects input operations into pure code not in the IO monad.
>>
>> While what you say is true, and I've complained about the same thing
>> myself in the past, it turns out to be quite difficult to demonstrate
>> the unsafety.
>>
>> Try it!  Here's the rules.
>>
>>  - write a program that gives different results when compiled with
>>    different optimisation flags only. (one exception: you're not
>>    allowed to take advantage of -fno-state-hack).
>>
>>  - Using exceptions is not allowed (they're non-deterministic).
>>
>>  - A difference caused by resources (e.g. stack overflow) doesn't
>>    count.
>>
>>  - The only "unsafe" operation you're allowed to use is hGetContents.
>>
>>  - You're allowed to use any other I/O operations, including from
>>    libraries, as long as they're not unsafe, and as long as the I/O
>>    itself is deterministic.
>>
>> The reason it's hard is that to demonstrate a difference you have to get
>> the lazy I/O to commute with some other I/O, and GHC will never do that.
>> If you find a way to do it, then we'll probably consider it a bug in GHC.
>>
>> You can get lazy I/O to commute with other lazy I/O, and perhaps with
>> some cunning arrangement of pipes (or something) that might be a way to
>> solve the puzzle.  Good luck!
>
> Oleg's example is quite close, don't you think?
>
> URL: http://www.haskell.org/pipermail/haskell/2009-March/021064.html

Ah yes, if you have two lazy input streams both referring to the same
underlying stream, that is enough to demonstrate a problem.  As for
whether Oleg's example is within the rules, it depends whether you
consider fdToHandle as "unsafe": Haskell's IO library is carefully
designed to not run into this problem on its own.  It's normally not
possible to get two Handles with the same FD; however,
GHC.IO.Handle.hDuplicate also lets you do this.
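
For readers who want to try the hDuplicate route, here is a hedged sketch
of the setup: two lazy streams obtained from one file via hDuplicate, and
forced in either order. The function readBoth and the file name are made
up for this illustration, and what each stream ends up containing depends
on the platform and on buffering, so no particular output is claimed.

```haskell
module Main where

import Control.Exception (evaluate)
import GHC.IO.Handle (hDuplicate)
import System.Directory (removeFile)
import System.IO

-- | Open a file, duplicate the handle, take a lazy stream from each
-- handle, and force the two streams in the order chosen by the flag.
-- (readBoth is a hypothetical helper name.)
readBoth :: Bool -> FilePath -> IO (String, String)
readBoth firstFirst path = do
  h1 <- openFile path ReadMode
  h2 <- hDuplicate h1
  s1 <- hGetContents h1
  s2 <- hGetContents h2
  _ <- if firstFirst
         then do { _ <- evaluate (length s1); evaluate (length s2) }
         else do { _ <- evaluate (length s2); evaluate (length s1) }
  return (s1, s2)

main :: IO ()
main = do
  let path = "dup-demo.txt"  -- hypothetical scratch file
  writeFile path "abc\n"
  -- Same pure-looking result type, two different forcing orders.
  readBoth True path >>= print
  readBoth False path >>= print
  removeFile path
```

If the two handles share a file position, the stream that is forced first
would be expected to consume the input, which is the kind of
order-dependence discussed above.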


Cheers,
Simon


Re: Unsafe hGetContents

2009-10-06 Thread Nicolas Pouillard
Excerpts from Simon Marlow's message of Tue Oct 06 14:59:06 +0200 2009:
> On 03/10/2009 19:59, Florian Weimer wrote:
> > * Nicolas Pouillard:
> >
> >> Excerpts from Florian Weimer's message of Wed Sep 16 22:17:08 +0200 2009:
> >>> Are there any plans to get rid of hGetContents and the semi-closed
> >>> handle state for Haskell Prime?
> >>>
> >>> (I call hGetContents unsafe because it adds side effects to pattern
> >>> matching, strictly speaking invalidating most of the transformations
> >>> which are expected to be valid in a pure language.)
> >>
> >> Would you consider something like [1] as an acceptable replacement?
> >>
> >> [1]: http://hackage.haskell.org/package/safe-lazy-io
> >
> > It only addresses two known issues with lazy I/O, doesn't it?  It
> > still injects input operations into pure code not in the IO monad.
> 
> While what you say is true, and I've complained about the same thing 
> myself in the past, it turns out to be quite difficult to demonstrate 
> the unsafety.
> 
> Try it!  Here's the rules.
> 
>  - write a program that gives different results when compiled with
>    different optimisation flags only. (one exception: you're not
>    allowed to take advantage of -fno-state-hack).
>
>  - Using exceptions is not allowed (they're non-deterministic).
>
>  - A difference caused by resources (e.g. stack overflow) doesn't
>    count.
>
>  - The only "unsafe" operation you're allowed to use is hGetContents.
>
>  - You're allowed to use any other I/O operations, including from
>    libraries, as long as they're not unsafe, and as long as the I/O
>    itself is deterministic.
> 
> The reason it's hard is that to demonstrate a difference you have to get 
> the lazy I/O to commute with some other I/O, and GHC will never do that. 
>   If you find a way to do it, then we'll probably consider it a bug in GHC.
> 
> You can get lazy I/O to commute with other lazy I/O, and perhaps with 
> some cunning arrangement of pipes (or something) that might be a way to 
> solve the puzzle.  Good luck!

Oleg's example is quite close, don't you think?

URL: http://www.haskell.org/pipermail/haskell/2009-March/021064.html

Cheers,

-- 
Nicolas Pouillard
http://nicolaspouillard.fr


Re: Unsafe hGetContents

2009-10-06 Thread Simon Marlow

On 03/10/2009 19:59, Florian Weimer wrote:
> * Nicolas Pouillard:
>
>> Excerpts from Florian Weimer's message of Wed Sep 16 22:17:08 +0200 2009:
>>> Are there any plans to get rid of hGetContents and the semi-closed
>>> handle state for Haskell Prime?
>>>
>>> (I call hGetContents unsafe because it adds side effects to pattern
>>> matching, strictly speaking invalidating most of the transformations
>>> which are expected to be valid in a pure language.)
>>
>> Would you consider something like [1] as an acceptable replacement?
>>
>> [1]: http://hackage.haskell.org/package/safe-lazy-io
>
> It only addresses two known issues with lazy I/O, doesn't it?  It
> still injects input operations into pure code not in the IO monad.

While what you say is true, and I've complained about the same thing
myself in the past, it turns out to be quite difficult to demonstrate
the unsafety.

Try it!  Here's the rules.

  - write a program that gives different results when compiled with
    different optimisation flags only. (one exception: you're not
    allowed to take advantage of -fno-state-hack).

  - Using exceptions is not allowed (they're non-deterministic).

  - A difference caused by resources (e.g. stack overflow) doesn't
    count.

  - The only "unsafe" operation you're allowed to use is hGetContents.

  - You're allowed to use any other I/O operations, including from
    libraries, as long as they're not unsafe, and as long as the I/O
    itself is deterministic.

The reason it's hard is that to demonstrate a difference you have to get
the lazy I/O to commute with some other I/O, and GHC will never do that.
If you find a way to do it, then we'll probably consider it a bug in GHC.

You can get lazy I/O to commute with other lazy I/O, and perhaps with
some cunning arrangement of pipes (or something) that might be a way to
solve the puzzle.  Good luck!


Cheers,
Simon


Re: Unsafe hGetContents

2009-10-03 Thread Florian Weimer
* Nicolas Pouillard:

> Excerpts from Florian Weimer's message of Wed Sep 16 22:17:08 +0200 2009:
>> Are there any plans to get rid of hGetContents and the semi-closed
>> handle state for Haskell Prime?
>> 
>> (I call hGetContents unsafe because it adds side effects to pattern
>> matching, strictly speaking invalidating most of the transformations
>> which are expected to be valid in a pure language.)
>
> Would you consider something like [1] as an acceptable replacement?
>
> [1]: http://hackage.haskell.org/package/safe-lazy-io

It only addresses two known issues with lazy I/O, doesn't it?  It
still injects input operations into pure code not in the IO monad.


Re: Unsafe hGetContents

2009-09-21 Thread Nicolas Pouillard
Excerpts from Simon Marlow's message of Mon Sep 21 11:52:41 +0200 2009:
> On 17/09/2009 13:58, Nicolas Pouillard wrote:
> > Excerpts from Florian Weimer's message of Wed Sep 16 22:17:08 +0200 2009:
> >> Are there any plans to get rid of hGetContents and the semi-closed
> >> handle state for Haskell Prime?
> >>
> >> (I call hGetContents unsafe because it adds side effects to pattern
> >> matching, strictly speaking invalidating most of the transformations
> >> which are expected to be valid in a pure language.)
> >
> > Would you consider something like [1] as an acceptable replacement?
> >
> > [1]: http://hackage.haskell.org/package/safe-lazy-io
> 
> I rather like this as a workaround for the most common practical problems 
> with lazy I/O, those of resource control.

> It doesn't address the deeper 
> concern that lazy I/O requires a particular evaluation order and is 
> therefore a bit warty as a language feature

When using safe-lazy-io we no longer rely (or at least rely a lot less) on
the evaluation order (assuming you mean the order of side effects), since
the way of combining the different inputs is chosen statically by the user.

> - implementing lazy I/O 
> properly in GHC's parallel mutator was somewhat tricky.  I'm not of the 
> opinion that we should throw out lazy I/O, but it's still a problematic 
> area in Haskell.

Maybe the 'unsafeGetContents' feature required by safe-lazy-io would be
less problematic; in particular, it does not have to ignore exceptions.

Best regards,

-- 
Nicolas Pouillard
http://nicolaspouillard.fr


Re: Unsafe hGetContents

2009-09-21 Thread Nicolas Pouillard
Excerpts from Simon Marlow's message of Mon Sep 21 11:41:38 +0200 2009:
> On 16/09/2009 21:17, Florian Weimer wrote:
> > Are there any plans to get rid of hGetContents and the semi-closed
> > handle state for Haskell Prime?
> >
> > (I call hGetContents unsafe because it adds side effects to pattern
> > matching, strictly speaking invalidating most of the transformations
> > which are expected to be valid in a pure language.)
> 
> There is no current proposal for this, no.  Feel free to start one; 
> information about the process for Haskell Prime proposals is here
> 
> http://hackage.haskell.org/trac/haskell-prime/wiki/Process

An alternative proposal (instead of removing it) would be to move it to
System.IO.Unsafe.

-- 
Nicolas Pouillard
http://nicolaspouillard.fr


Re: Unsafe hGetContents

2009-09-21 Thread Simon Marlow

On 17/09/2009 13:58, Nicolas Pouillard wrote:
> Excerpts from Florian Weimer's message of Wed Sep 16 22:17:08 +0200 2009:
>> Are there any plans to get rid of hGetContents and the semi-closed
>> handle state for Haskell Prime?
>>
>> (I call hGetContents unsafe because it adds side effects to pattern
>> matching, strictly speaking invalidating most of the transformations
>> which are expected to be valid in a pure language.)
>
> Would you consider something like [1] as an acceptable replacement?
>
> [1]: http://hackage.haskell.org/package/safe-lazy-io

I rather like this as a workaround for the most common practical problems
with lazy I/O, those of resource control.  It doesn't address the deeper
concern that lazy I/O requires a particular evaluation order and is
therefore a bit warty as a language feature - implementing lazy I/O
properly in GHC's parallel mutator was somewhat tricky.  I'm not of the
opinion that we should throw out lazy I/O, but it's still a problematic
area in Haskell.


Cheers,
Simon


Re: Unsafe hGetContents

2009-09-21 Thread Simon Marlow

On 16/09/2009 21:17, Florian Weimer wrote:
> Are there any plans to get rid of hGetContents and the semi-closed
> handle state for Haskell Prime?
>
> (I call hGetContents unsafe because it adds side effects to pattern
> matching, strictly speaking invalidating most of the transformations
> which are expected to be valid in a pure language.)

There is no current proposal for this, no.  Feel free to start one;
information about the process for Haskell Prime proposals is here:

http://hackage.haskell.org/trac/haskell-prime/wiki/Process

Cheers,
Simon


Re: Unsafe hGetContents

2009-09-17 Thread Nicolas Pouillard
Excerpts from Florian Weimer's message of Wed Sep 16 22:17:08 +0200 2009:
> Are there any plans to get rid of hGetContents and the semi-closed
> handle state for Haskell Prime?
> 
> (I call hGetContents unsafe because it adds side effects to pattern
> matching, strictly speaking invalidating most of the transformations
> which are expected to be valid in a pure language.)

Would you consider something like [1] as an acceptable replacement?

[1]: http://hackage.haskell.org/package/safe-lazy-io

-- 
Nicolas Pouillard
http://nicolaspouillard.fr


Re: Unsafe hGetContents

2009-09-16 Thread Florian Weimer
* Don Stewart:

> fw:
>> Are there any plans to get rid of hGetContents and the semi-closed
>> handle state for Haskell Prime?
>> 
>> (I call hGetContents unsafe because it adds side effects to pattern
>> matching, strictly speaking invalidating most of the transformations
>> which are expected to be valid in a pure language.)
>
> Isn't this a broader complaint about lazy IO in general?

Yes, sort of.  But doesn't lazy input derive its justification from
being present in the prelude?


Re: Unsafe hGetContents

2009-09-16 Thread Don Stewart
fw:
> Are there any plans to get rid of hGetContents and the semi-closed
> handle state for Haskell Prime?
> 
> (I call hGetContents unsafe because it adds side effects to pattern
> matching, strictly speaking invalidating most of the transformations
> which are expected to be valid in a pure language.)

Isn't this a broader complaint about lazy IO in general?

-- Don


Unsafe hGetContents

2009-09-16 Thread Florian Weimer
Are there any plans to get rid of hGetContents and the semi-closed
handle state for Haskell Prime?

(I call hGetContents unsafe because it adds side effects to pattern
matching, strictly speaking invalidating most of the transformations
which are expected to be valid in a pure language.)
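
To illustrate what "side effects in pattern matching" means here, the
sketch below measures a lazily read string either before or after the
handle is closed. It is an assumption-laden example, not part of the
original message: lengthSeen and the file name are invented, and what
happens in the "after" case (a shorter string, or an exception raised
while the list is being evaluated) may differ between implementations and
versions, so the result is captured in an Either rather than asserted.

```haskell
module Main where

import Control.Exception (SomeException, evaluate, try)
import System.Directory (removeFile)
import System.IO

-- | Take the length of the lazily read contents, forcing the list
-- either before or after hClose. Any exception raised while forcing
-- is returned as a Left. (lengthSeen is a hypothetical helper name.)
lengthSeen :: Bool -> FilePath -> IO (Either SomeException Int)
lengthSeen forceBeforeClose path = do
  h <- openFile path ReadMode
  s <- hGetContents h
  if forceBeforeClose
    then do
      n <- try (evaluate (length s))  -- the reads happen here
      hClose h
      return n
    else do
      hClose h
      try (evaluate (length s))       -- the reads are attempted only now

main :: IO ()
main = do
  let path = "lazy-close-demo.txt"  -- hypothetical scratch file
  writeFile path "hello\n"
  lengthSeen True path >>= print
  lengthSeen False path >>= print
  removeFile path
```

The expression length s is the same in both branches; only its position
relative to hClose differs, yet that position decides what it observes.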