The update analysis developed by Keith Wansbrough was very complicated, and 
although rather beautiful (read his thesis and our papers), it didn't pay its 
way in terms of complexity.

Coincidentally, Ilya Sergey is here on an internship and is now working on a 
"cheap and cheerful" update analysis that may get much of the benefit for much 
less cost. We'll see.

Simon

| -----Original Message-----
| From: glasgow-haskell-users-boun...@haskell.org [mailto:glasgow-
| haskell-users-boun...@haskell.org] On Behalf Of Joachim Breitner
| Sent: 01 August 2012 11:38
| To: glasgow-haskell-users
| Subject: Non-updateable thunks
| 
| Hello,
| 
| I’m still working on issues of performance vs. sharing; I must assume
| some of the people here on the list must have seen my "dup"-paper¹ as
| referees.
| 
| I’m now wondering about a approach where the compiler (either
| automatically or by user annotation; I’ll leave that question for
| later) would mark some thunks as reentrant, i.e. simply skip the
| blackholing and update frame pushing. A quick test showed that this
| should work quite well, take the usual example:
| 
|         import System.Environment
|         main = do
|             a <- getArgs
|             let n = length a
|             print n
|             let l = [n..30000000]
|             print $ last l + last l
| 
| This obviously leaks memory:
| 
|         $ ./Test +RTS -t
|         0
|         60000000
|         <<ghc: 2400054760 bytes, 4596 GCs, 169560494/935354240 avg/max
|         bytes residency (11 samples), 2121M in use, 0.00 INIT (0.00
|         elapsed), 0.63 MUT (0.63 elapsed), 4.28 GC (4.29 elapsed)
| :ghc>>
| 
| 
| I then modified the the assembly (a crude but effective way of testing
| this ;-)) to not push a stack frame:
| 
| $ diff -u Test.s Test-modified.s
| --- Test.s    2012-08-01 11:30:00.000000000 +0200
| +++ Test-modified.s   2012-08-01 11:29:40.000000000 +0200
| @@ -56,20 +56,20 @@
|       leaq -40(%rbp),%rax
|       cmpq %r15,%rax
|       jb .LcpZ
| -     addq $16,%r12
| -     cmpq 144(%r13),%r12
| -     ja .Lcq1
| -     movq $stg_upd_frame_info,-16(%rbp)
| -     movq %rbx,-8(%rbp)
| +     //addq $16,%r12
| +     //cmpq 144(%r13),%r12
| +     //ja .Lcq1
| +     //movq $stg_upd_frame_info,-16(%rbp)
| +     //movq %rbx,-8(%rbp)
|       movq $ghczmprim_GHCziTypes_Izh_con_info,-8(%r12)
|       movq $30000000,0(%r12)
|       leaq -7(%r12),%rax
| -     movq %rax,-24(%rbp)
| +     movq %rax,-8(%rbp)
|       movq 16(%rbx),%rax
| -     movq %rax,-32(%rbp)
| -     movq $stg_ap_pp_info,-40(%rbp)
| +     movq %rax,-16(%rbp)
| +     movq $stg_ap_pp_info,-24(%rbp)
|       movl $base_GHCziEnum_zdfEnumInt_closure,%r14d
| -     addq $-40,%rbp
| +     addq $-24,%rbp
|       jmp base_GHCziEnum_enumFromTo_info
|  .Lcq1:
|       movq $16,192(%r13)
| 
| Now it runs fast and slim (and did not crash on the first try, which I
| find surprising after hand-modifying the assembly code):
| 
|         $ ./Test +RTS -t
|         0
|         60000000
|         <<ghc: 4800054840 bytes, 9192 GCs, 28632/28632 avg/max bytes
|         residency (1 samples), 1M in use, 0.00 INIT (0.00 elapsed),
| 0.73
|         MUT (0.73 elapsed), 0.04 GC (0.04 elapsed) :ghc>>
| 
| 
| My question is: Has anybody worked in that direction? And are there any
| fundamental problems with the current RTS implementation and such
| closures?
| 
| Greetings,
| Joachim
| 
| 
| ¹ http://arxiv.org/abs/1207.2017
| currently not about to appear anywhere else, but I have not given up
| hope yet :-)
| 
| 
| --
| Dipl.-Math. Dipl.-Inform. Joachim Breitner Wissenschaftlicher
| Mitarbeiter http://pp.info.uni-karlsruhe.de/~breitner
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Reply via email to