Hi Conal, I'm aware of one case that violates semantic referential transparency, but it's a bug. Which pretty much proves your point as I understand it.
John On Tue, Aug 24, 2010 at 2:01 PM, Conal Elliott <[email protected]> wrote: > Hi John, > > Please note that I'm suggesting eliminating chunks from the semantics only > -- not from the implementation. > > For precise & simple chunk-less semantics, it's only important that the > iteratees map equivalent input streams to equivalent output streams, where > "equivalent" means equal after concatenating all of the chunks. In other > words, the chunk lists denote their concatenations, so semantically equal > inputs must lead to semantically equal outputs. (Assuming I understand the > intention of chunking as being an implementation issue only, i.e., present > only for optimization.) We could call this property "semantic referential > transparency". IIUC, 'drop' is semantically RT, since it's *specified* in > terms of elements (and only *implemented* in terms of chunks). > > Do you know of any iteratees in use that map (semantically) equal inputs to > (semantically) unequal outputs, i.e.,that violate semantic RT as I've > defined it? In the current APIs, one can easily define such iteratees, but > I'm hoping that the programming interfaces can be repaired to eliminate that > problem (the "abstraction leaks" I've been mentioning). > > - Conal > > > On Tue, Aug 24, 2010 at 9:32 PM, John Lato <[email protected]> wrote: > >> I think the big problem with chunking is that many useful iteratees need >> to be able to inspect the length of the chunk. The most obvious is "drop", >> but there are many others. Or if not inspect the length, have a new >> function on the stream "dropReport :: Int -> s -> (s, Int)" which reports >> how much was dropped. Either way, chunking adds a deal of implementation >> burden. >> >> I suspect that the proper vocabulary for iteratees wouldn't include chunks >> at all, only single elements. This discussion has prompted me to consider >> the implications of such an implementation, as it would be much simpler. I >> have one idea that I think will at least maintain performance for many >> operations, although there will be performance hits too. If the drawbacks >> are in areas that aren't particularly useful, though, it may be acceptable. >> >> John >> >> >>> From: Conal Elliott <[email protected]> >>> >>> >>> Here's a way I've been tinkering with to think about iteratees clearly. >>> >>> For simplicity, I'll stick with pure, error-free iteratees for now, and >>> take >>> chunks to be strings. Define a function that runs the iteratee: >>> >>> > runIter :: Iteratee a -> [String] -> (a, [String]) >>> >>> Note that chunking is explicit here. >>> >>> Next, a relation that an iteratee implements a given specification, >>> defined >>> by a state transformer: >>> >>> > sat :: Iteratee a -> State String a -> Bool >>> >>> Define sat in terms of concatenating chunks: >>> >>> > sat it st = >>> > second concat . runIter it == runState st . second concat >>> >>> where the RHS equality is between functions (pointwise/extensionally), >>> and >>> runState uses the representation of State directly >>> >>> > runState :: State s a -> s -> (a,s) >>> >>> (I think this sat definition is what Conrad was alluding to.) >>> >>> Now use sat to specify and verify operations on iteratees and to >>> *synthesize* those operations from their specifications. Some iteratees >>> might not satisfy *any* (State-based) specification. For instance, an >>> iteratee could look at the lengths or number of its chunks and produce >>> results accordingly. I think of such iteratees as abstraction leaks. >>> Can >>> the iteratee vocabulary be honed to make only well-behaved (specifiable) >>> iteratees possible to express? If so, can we preserve performance >>> benefits? >>> >>> If indeed the abstraction leaks can be fixed, I expect there will be a >>> simpler & more conventional semantics than sat above. >>> >>> - Conal >>> >>> >>> On Tue, Aug 24, 2010 at 2:55 PM, Conrad Parker <[email protected]> >>> wrote: >>> >>> > On 24 August 2010 14:47, Jason Dagit <[email protected]> wrote: >>> > > >>> > > >>> > > On Mon, Aug 23, 2010 at 10:37 PM, Conrad Parker < >>> [email protected]> >>> > > wrote: >>> > >> >>> > >> On 24 August 2010 14:14, Jason Dagit <[email protected]> wrote: >>> > >> > I'm not a semanticist, so I apologize right now if I say something >>> > >> > stupid or >>> > >> > incorrect. >>> > >> > >>> > >> > On Mon, Aug 23, 2010 at 9:57 PM, Conal Elliott <[email protected]> >>> > wrote: >>> > >> >>> >>> > >> >>> So perhaps this could be a reasonable semantics? >>> > >> >>> >>> > >> >>> Iteratee a = [Char] -> Maybe (a, [Char]) >>> > >> >> >>> > >> >> I've been tinkering with this model as well. >>> > >> >> >>> > >> >> However, it doesn't really correspond to the iteratee interfaces >>> I've >>> > >> >> seen, since those interfaces allow an iteratee to notice size and >>> > >> >> number of >>> > >> >> chunks. I suspect this ability is an accidental abstraction >>> leak, >>> > >> >> which >>> > >> >> raises the question of how to patch the leak. >>> > >> > >>> > >> > From a purely practical viewpoint I feel that treating the >>> chunking as >>> > >> > an >>> > >> > abstraction leak might be missing the point. If you said, you >>> wanted >>> > >> > the >>> > >> > semantics to acknowledge the chunking but be invariant under the >>> size >>> > or >>> > >> > number of the chunks then I would be happier. >>> > >> >>> > >> I think that's the point, ie. to specify what the invariants should >>> > >> be. For example (to paraphrase, very poorly, something Conal wrote >>> on >>> > >> the whiteboard behind me): >>> > >> >>> > >> run [concat [chunk]] == run [chunk] >>> > >> >>> > >> ie. the (a, [Char]) you maybe get from running an iteratee over any >>> > >> partitioning of chunks should be the same, ie. the same as from >>> > >> running it over the concatenation of all chunks, which is the whole >>> > >> input [Char]. >>> > > >>> > > I find this notation foreign. I get [Char], that's the Haskell >>> String >>> > > type, but what is [chunk]? I doubt you mean a list of one element. >>> > >>> > sorry, that was just my way of writing "the list of chunks" or perhaps >>> > "the stream of chunks that represents the input". >>> > >>> > Conrad. >>> > >>> > > >>> > >> >>> > >> > I use iteratees when I need to be explicit about chunking and when >>> I >>> > >> > don't >>> > >> > want the resources to "leak outside" of the stream processing. If >>> you >>> > >> > took >>> > >> > those properties away, I wouldn't want to use it anymore because >>> then >>> > it >>> > >> > would just be an inelegant way to do things. >>> > >> >>> > >> Then I suppose the model for Enumerators is different than that for >>> > >> Iteratees; part of the point of an Enumerator is to control the size >>> > >> of the chunks, so that needs to be part of the model. An Iteratee, >>> on >>> > >> the other hand, should not have to know the size of its chunks. So >>> you >>> > >> don't want to be able to know the length of a chunk (ie. a part of >>> the >>> > >> stream), but you do want to be able to, say, fold over it, and to be >>> > >> able to stop the computation at any time (these being the main point >>> > >> of iteratees ...). >>> > > >>> > > I think I agree with that. >>> > > Jason >>> > >>> >> >
_______________________________________________ Haskell-Cafe mailing list [email protected] http://www.haskell.org/mailman/listinfo/haskell-cafe
