Re: [Haskell-cafe] Copy on read
Dan, Let me first apologize for this late reply. Neil pointed you to Jurriaan Hage and Stefan Holdermans. Heap recycling for lazy languages. In John Hatcliff, Robert Glück, and Oege de Moor, editors, _Proceedings of the 2008 ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation_, PEPM'08, San Francisco, California, USA, January 7--8, 2008, pages 189--197. ACM Press, 2008. http://doi.acm.org/10.1145/1328408.1328436. and you commented: If I'm reading it aright, it works by tagging the types of values as unique or not but keeps that annotation secret from the programmer. The typing rules are incredibly complicated so to make things less scary, they are also hidden from the user. As a result, this idea makes it impossible for a developer to reason about whether their code will compile. That doesn't seem like a viable solution. Have I read this correctly? Well, let me first out point out that the paper by no means describes a complete solution. When we wrote it, we were mainly interested in the static semantics of pure nonstrict functional languages featuring a means to perform in-place updates explicitly. We use uniqueness types to ensure that such updates are safe, i.e., do not violate referential transparency, so that we can keep on enjoying the benefits of purity---equational reasoning being the most obvious example of such benefits. Our choice for denoting in-place updates in the source program explicitly is a very conscious one. Of course, having opportunities for in-place updates inferred by the compiler is highly desirable, but there's always a chance that the compiler does not find all the locations in the source program in which an update can be inserted. A programmer, having specific knowledge of the behaviour of the program and the opportunities for in-place updates that arise, may then expect updates to be inserted only to find out, when profiling the memory behaviour of the program, that certain updates where not inferred. What to do then? Restructuring the program, hoping that the optimization can be found if some pieces of code are altered, seems particularly undesirable; we then rather insert the updates by hand. Thus, explicit updates, at the very least, can be a valuable addition to a system that tries to infer updates automatically. As of now, the material in the paper is merely an exercise in static semantics: we have not implemented the system in a full-scale compiler. But to do so, ranks high on our wish list. Maybe it makes an interesting student project. By all means, we would like to get our hands on a mature set of benchmark results. As mentioned in the paper, the viability of a system of explicit updates may well depend on the particularities of the underlying memory model. This may further complicate the semantics and, worse, make the semantics backend-dependent. Note, that backend consideration also play a role in a system in which all updates are implicit, i.e, inferred. The Clean language, for example, features uniqueness typing as a means to deal with side-effects in general and the Clean compiler uses uniqueness information to insert in-place updates of memory cells. Opportunities for in-place updates, however, do not solely depend on the uniqueness of values but, indeed, also on the details of the backend involved as illustrated by a recent thread on the Clean Discussions mailing list: http://mailman.science.ru.nl/pipermail/clean-list/2008/003794.html (in particular: http://mailman.science.ru.nl/pipermail/clean-list/2008/003795.html) . With regard to your concerns: in a system with explicit updates there's always a chance that a programmer uses updates in an unsafe fashion, i.e., in such a way that referential transparency can no longer be guaranteed. Simply issuing a type-error message that involves richly annotates types may be a bit harsh then as these types tend to grow rather complex. We do not as much suggest to hide these types from the programmer altogether, but we think it's at the very least useful to come up with an error message that does a good job explaining what went wrong. (What comes to mind almost immediately is the PhD work of Bastiaan Heeren: http://people.cs.uu.nl/bastiaan/phdthesis/.) An interesting point here is that there are always at least two program points that are "to blame" here: the location of the actual explicit update and the point (or points) at the which the flow of the value to update may fork. So, in short: as far as we are concerned, there's no definite story on this subject yet. For those attending Fun in the Afternoon tomorrow at the University of Herfordshire: Jur will give a talk about this material, roughly based on the stuff we presented at PEPM in January. Then, Malcolm remarked: Uniqueness typing does not lead to in-place update. If a value is only used once, then the
Re: [Haskell-cafe] Copy on read
Hi Andrew, my probably dodgy reason for mentioning deforestation is that sharing of intermediate values is a major stumbling block; code that uses data linearly is possibly well suited for deforesting. See Frankau's SASL for a language that deforests all lists simply by not letting you copy them! (IIRC there is another constraint that forbids accumulating parameters too.) > Similarly, there are recursion patterns for which fusion isn't very > easy. Yep, I suspect you're right. > That's why most array-based code is explicitly in-place. wouldn't > it be nice if it didn't have to be? I agree. As an aside, DiffArray looks quite nice: http://www.haskell.org/haskellwiki/Modern_array_libraries ``if a diff array is used in a single-threaded style, ..., a!i takes O(1) time and a//d takes O(length d).'' Notice the use of "seq" in 2nd example to enforce a kind of single-threaded behaviour. Seems nasty! I wonder if this could be avoided by providing a (*!*) such that arr *!* i = seq x (arr, x) where x = arr ! i It returns a new array which the programmer should use if they want single-threaded behaviour. Matt. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Copy on read
On Sat, May 10, 2008 at 7:20 AM, Neil Mitchell <[EMAIL PROTECTED]> wrote: > Jurriaan Hage and Stefan Holdermans. Heap recycling for lazy > languages. In John Hatcliff, Robert Glück, and Oege de Moor, editors, > _Proceedings of the 2008 ACM SIGPLAN Symposium on Partial Evaluation > and Semantics-Based Program Manipulation_, PEPM'08, San Francisco, > California, USA, January 7--8, 2008, pages 189--197. ACM Press, 2008. > http://doi.acm.org/10.1145/1328408.1328436. If I'm reading it aright, it works by tagging the types of values as unique or not but keeps that annotation secret from the programmer. The typing rules are incredibly complicated so to make things less scary, they are also hidden from the user. As a result, this idea makes it impossible for a developer to reason about whether their code will compile. That doesn't seem like a viable solution. Have I read this correctly? Cute idea though. -- Dan ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Copy on read
Matthew Naylor wrote: I wonder to what extent deforestation supersedes such optimisations. This would be a good topic to raise with a Cleaner. The paper Neil mentions looks like a nice alternative to uniqueness typing -- and it appears that there will be a FitA talk about it, this Thursday. What if you have, say, a giant array, and you run a loop that updates the entire array. Rather expensive to keep copying and collecting all that space. That's why most array-based code is explicitly in-place. But wouldn't it be nice if it didn't have to be? Similarly, there are recursion patterns for which fusion isn't very easy. (How would you fuse, say, a Gaussian elimination algorithm?) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Copy on read
> Uniqueness typing does not lead to in-place update. If a value is > only used once, then there is no need to update it at all! my understanding is that if a value is uniquely-typed then it is statically known never to have more than one reference, thus it can be modified in-place. Some potential advantages seem to be: * Less heap is used, reducing the strain on the garbage collector. * Memory writes can sometimes be avoided. Imagine changing the value of a leaf in a large unique tree. Only one memory write is required rather than N, where N is the length of the path to that leaf from the root. Such paths can obviously be quite long when working with lists. Also, the cost of rebuilding constructors with many fields -- only to change a single field -- could be expensive. I wonder to what extent deforestation supersedes such optimisations. This would be a good topic to raise with a Cleaner. The paper Neil mentions looks like a nice alternative to uniqueness typing -- and it appears that there will be a FitA talk about it, this Thursday. Matt. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Copy on read
If Haskell had uniqueness typing, maybe the compiler could be made to infer when values are used in a unique way. And maybe if the compiler can detect data being used in a unique way, it could code for in-place updates instead of copying, and thereby gain a performance advantage. Uniqueness typing does not lead to in-place update. If a value is only used once, then there is no need to update it at all! In fact, I believe ghc does this analysis, and has a minor optimisation that omits the thunk-update. That is, if an unevaluated thunk is forced, but it has a unique usage, the original thunk is not overwritten with an indirection to the reduced value (as it normally would be), but the reduced value is just used directly. I believe that rather than _uniqueness_, you are really trying to describe _linear_ usage, that is, multiple uses, but in a single non- branching sequence. Single-threaded usage of a data structure might permit in-place modification of its parts. The analysis for linear usage is much more difficult than for uniqueness, and David Wakeling's PhD Thesis "Linearity and Laziness" (circa 1990) did some experiments, but was ultimately pessimistic about the value. Regards, Malcolm ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Copy on read
Hi You're not the first: Jurriaan Hage and Stefan Holdermans. Heap recycling for lazy languages. In John Hatcliff, Robert Glück, and Oege de Moor, editors, _Proceedings of the 2008 ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation_, PEPM'08, San Francisco, California, USA, January 7--8, 2008, pages 189--197. ACM Press, 2008. http://doi.acm.org/10.1145/1328408.1328436. Like I said to them in private mail - nice idea, but without benchmarks its only an idea. You have to consider effects of cache, generational garbage collection, complexity etc. I think they are going to do the benchmarks at some point, when we'll know if the idea was a good one. Thanks Neil On Sat, May 10, 2008 at 1:42 PM, Andrew Coppin <[EMAIL PROTECTED]> wrote: > I just had a random idea, which I thought I'd share with you all. > > I've heard about systems that do "copy on write". That is, all users share a > single copy of some structure, until somebody tries to write on it. At that > moment they get a personal copy to modify so they don't upset everybody > else. This way, you don't have to go to all the expense of copying a > potentially large structure unless it's actually necessary to do so. > > Then I started thinking about Clean's "uniqueness typing". If I'm > understanding this correctly, it lets you do things like mutate data > in-place without requiring a monad. The restriction is that the compiler has > to be satisfied, at compile-time, that you will never try to access the old > data you just overwrote. > > Although I once held more optimistic beliefs, as far as I can tell no > current, real-world Haskell implementation ever mutates the fields of > algebraic data types in-place. (Ignoring for a moment the issue of > overwriting a thunk with it's result.) This strikes me as rather > pessimistic. I was thinking... If Haskell had uniqueness typing, maybe the > compiler could be made to infer when values are used in a unique way. And > maybe if the compiler can detect data being used in a unique way, it could > code for in-place updates instead of copying, and thereby gain a performance > advantage. > > I don't know how viable this would be - the thing that immediately jumps out > at me is that in practice there might be comparatively few places where you > can *guarantee* the transformation is safe, and so it might not get applied > very much. And considering all this would likely take a fair bit of work on > the compiler, that wouldn't be a great payoff. Still, the idea of the > compiler transparently rewriting your pure functional code to efficient > mutable updates (when safe) is very appealing for performance reasons. > > Thoughts, people? [I'd be surprised if I'm the first person ever to have > this idea...] > > ___ > Haskell-Cafe mailing list > Haskell-Cafe@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe > ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Copy on read
I just had a random idea, which I thought I'd share with you all. I've heard about systems that do "copy on write". That is, all users share a single copy of some structure, until somebody tries to write on it. At that moment they get a personal copy to modify so they don't upset everybody else. This way, you don't have to go to all the expense of copying a potentially large structure unless it's actually necessary to do so. Then I started thinking about Clean's "uniqueness typing". If I'm understanding this correctly, it lets you do things like mutate data in-place without requiring a monad. The restriction is that the compiler has to be satisfied, at compile-time, that you will never try to access the old data you just overwrote. Although I once held more optimistic beliefs, as far as I can tell no current, real-world Haskell implementation ever mutates the fields of algebraic data types in-place. (Ignoring for a moment the issue of overwriting a thunk with it's result.) This strikes me as rather pessimistic. I was thinking... If Haskell had uniqueness typing, maybe the compiler could be made to infer when values are used in a unique way. And maybe if the compiler can detect data being used in a unique way, it could code for in-place updates instead of copying, and thereby gain a performance advantage. I don't know how viable this would be - the thing that immediately jumps out at me is that in practice there might be comparatively few places where you can *guarantee* the transformation is safe, and so it might not get applied very much. And considering all this would likely take a fair bit of work on the compiler, that wouldn't be a great payoff. Still, the idea of the compiler transparently rewriting your pure functional code to efficient mutable updates (when safe) is very appealing for performance reasons. Thoughts, people? [I'd be surprised if I'm the first person ever to have this idea...] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe