>> This is a terribly written program. It uses 3-element lists as vectors >> (including higher-order "vector" arithmetic using "map") and allocates >> like hell. The compiler can not do much with this code, and it >> produces CPS calls everywhere. > > I take it you are referring to the {add, sub, scale, mul, dot, > squared-length, normal} functions? It seems clear to me that there's > plenty of room for optimization in them, but I wonder if you're being a > bit hasty to dismiss the naive implementation so readily. IMHO, use of > `map` to transform lists is sort of idiomatically lisp.
Using map to transform lists is indeed, but using lists to store fixed-size collections that are expected to be heavily used is probably not, I'd say. On the other hand, this benchmark appears to be intentionally inefficient to stress the GC by allocating a lot. > > That is to say, in this case there are alternatives for numerical > computation, but transforming a list in a similar manner ought to be > commonplace in lispy programmes. If it is this sort of behaviour that > drives the GC into agony then I expect that there are many Chicken > programmes that may benefit from any form of optimization that could be > had from `map`. That's right, of course. Nobody would complain if we tried to catch this special case. But every additional optimization complicates the optimizer (which is already pretty hairy) and increases the possibility of bugs. Still, "(map <op> ...)" must allocate, unless one can infer some lifetime information about the argument (or result) lists. CHICKEN may pay an extra penalty by producing relatively bad code for certain loops, like those introduced by an already existing optimization of "map". So, for example: (define (foo x y) (map + x y)) (foo (read) (read)) gets translated and optimized to: csc -debug 7 x.scm -O5 [optimized] (lambda (k205) (let ((k207 (##core#lambda (t33) (let ((k209 (##core#lambda (t34) (let ((k260 (##core#lambda (t36) (let ((k262 (##core#lambda (t37) (k205 (##core#undefined))))) (let ((k264 (##core#lambda (r265) (r265 k262)))) (##sys#implicit-exit-handler k264)))))) (let ((k268 (##core#lambda (a267) (let ((k271 (##core#lambda (a270) (let ((x1 a267)) (let ((g919 '())) (let ((g1020 #f)) (let ((map-loop524 (##core#undefined))) (let ((t218 (set! map-loop524 (lambda (k220 g1725 g1826) (let ((r255 (##core#inline "C_i_pairp" g1725))) (let ((r225 (##core#cond r255 (##core#inline "C_i_pairp" g1826) #f))) (if r225 (let ((a248 (##core#inline "C_slot" g1725 0))) (let ((a251 (##core#inline "C_slot" g1826 0))) (let ((a245 (##core#inline_allocate ("C_a_i_plus" 4) a248 a251))) (let ((g628 (##core#inline_allocate ("C_a_i_cons" 3) a245 '()))) (let ((k229 (##core#lambda (t29) (let ((t231 (set! g1020 g628))) (let ((a235 (##core#inline "C_slot" g1725 1))) (let ((a238 (##core#inline "C_slot" g1826 1))) (map-loop524 k220 a235 a238))))))) (if g1020 (k229 (##core#inline "C_i_setslot" g1020 1 g628)) (let ((t244 (set! g919 g628))) (k229 t244)))))))) (let ((f_221274 g919)) (k220 f_221274))))))))) (map-loop524 k260 x1 a270))))))))) (read k271))))) (read k268)))))) (##core#callunit "eval" k209))))) (##core#callunit "library" k207))) Note the unnecessary lambda "k229", which could be beta-reduced in the conditional following it (and the conditional simplified), with some effort, removing the need for a CPS call and the related allocating of a continuation. Any takers? Moreover, code produced by CHICKEN already allocates a lot (activation-frames are heap-allocated), so allocation-heavy code will increase this amount, in other words, CHICKEN allocates more than non-CPS implementations. Allocation is already quite fast, which compensates the cost in code that allocates in a "normal" manner (yes, I know: whatever that means...) cheers felix _______________________________________________ Chicken-users mailing list Chicken-users@nongnu.org https://lists.nongnu.org/mailman/listinfo/chicken-users