Re: [Chicken-hackers] [PATCH] Unboxing optimization for flonums

Thank you Felix and Mario, I've applied this.

All the best,

Evan

___
Chicken-hackers mailing list
Chicken-hackers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-hackers

Re: [Chicken-hackers] [PATCH] Unboxing optimization for flonums

Hi!

Indeed, all you say is right. It needed a full recompile with the HEAD
to trigger this error by generating a use of the unboxed fXX accessors
in the SRFI-4 runtime system.

See attached patch, the reason is quite clear: the unboxed accessors
assumed unboxed fixnum index arguments.

felix

From 0b5ba854f5211d266964c22d1e082bfba47046a7 Mon Sep 17 00:00:00 2001
From: felix
Date: Sat, 1 Dec 2018 23:00:04 +0100
Subject: [PATCH] Unboxed variants fXX SRFI-4 vector accessors assumed umboxed fixnum index operand.

---
 chicken.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/chicken.h b/chicken.h
index 8bf588a3..598a3cff 100644
--- a/chicken.h
+++ b/chicken.h
@@ -1609,10 +1609,10 @@ typedef void (C_ccall *C_proc)(C_word, C_word *) C_noret;
 #define C_u_i_f32vector_set(v, i, x) ((((float *)C_data_pointer(C_block_item((v), 1)))[ C_unfix(i) ] = C_flonum_magnitude(x)), C_SCHEME_UNDEFINED)
 #define C_u_i_f64vector_set(v, i, x) ((((double *)C_data_pointer(C_block_item((v), 1)))[ C_unfix(i) ] = C_flonum_magnitude(x)), C_SCHEME_UNDEFINED)
 
-#define C_ub_i_f32vector_ref(b, i) (((float *)C_data_pointer(C_block_item((b), 1)))[ i ])
-#define C_ub_i_f64vector_ref(b, i) (((double *)C_data_pointer(C_block_item((b), 1)))[ i ])
-#define C_ub_i_f32vector_set(v, i, x) ((((float *)C_data_pointer(C_block_item((v), 1)))[ i ] = (x)), 0)
-#define C_ub_i_f64vector_set(v, i, x) ((((double *)C_data_pointer(C_block_item((v), 1)))[ i ] = (x)), 0)
+#define C_ub_i_f32vector_ref(b, i) (((float *)C_data_pointer(C_block_item((b), 1)))[ C_unfix(i) ])
+#define C_ub_i_f64vector_ref(b, i) (((double *)C_data_pointer(C_block_item((b), 1)))[ C_unfix(i) ])
+#define C_ub_i_f32vector_set(v, i, x) ((((float *)C_data_pointer(C_block_item((v), 1)))[ C_unfix(i) ] = (x)), 0)
+#define C_ub_i_f64vector_set(v, i, x) ((((double *)C_data_pointer(C_block_item((v), 1)))[ C_unfix(i) ] = (x)), 0)
 
 #define C_a_i_flonum_sin(ptr, c, x) C_flonum(ptr, C_sin(C_flonum_magnitude(x)))
 #define C_a_i_flonum_cos(ptr, c, x) C_flonum(ptr, C_cos(C_flonum_magnitude(x)))
-- 
2.16.2

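For readers who want to see why the missing C_unfix matters, here is a small stand-alone model of the bug. It is only a sketch: it assumes a fixnum is stored as the machine word (n << 1) | 1, and the names model_fix, model_unfix, ref_tagged_as_offset, ref_untagged and show_index_bug are made up for this illustration. They are not the macros from chicken.h.

```c
#include <assert.h>
#include <stdio.h>

typedef long word;

/* Illustrative stand-ins, NOT the chicken.h macros: a non-negative
   fixnum n is modelled as the machine word (n << 1) | 1. */
word model_fix(long n) { return (word)((n << 1) | 1); }
long model_unfix(word x) { return (long)(x >> 1); }

/* Before the fix: the tagged word is used directly as the offset. */
double ref_tagged_as_offset(const double *data, word index)
{
    return data[index];
}

/* After the fix: the index is untagged before it is used. */
double ref_untagged(const double *data, word index)
{
    return data[model_unfix(index)];
}

/* Reads "element 6" both ways from a 16-element buffer. */
void show_index_bug(void)
{
    double data[16];
    for (int i = 0; i < 16; i++)
        data[i] = i * 10.0;

    word six = model_fix(6);            /* the tagged word is 13 */
    assert(model_unfix(six) == 6);
    printf("untagged index: %.1f\n", ref_untagged(data, six));
    printf("tagged index:   %.1f\n", ref_tagged_as_offset(data, six));
}
```

Calling show_index_bug() from a main function prints 60.0 for the untagged read and 130.0 for the tagged one. In the failing test the vectors have far fewer than 13 elements, so under this model the tagged read would land outside the vector, which fits the (eqv? ...) assertion failing.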
Re: [Chicken-hackers] [PATCH] Unboxing optimization for flonums

Sorry, I sent an empty mail. I'm trying to reproduce this.

felix

Re: [Chicken-hackers] [PATCH] Unboxing optimization for flonums

Hi Felix,

On Sat, 01 Dec 2018 08:42:40 +0100 felix.winkelm...@bevuta.com wrote:

>> No problem. Unfortunately, now "make check" breaks:
>>
>> Error: assertion failed: (eqv? (f32vector-ref old 6) (f32vector-ref new 0))
>
> Ouch. I'm running make check with something based on the current
> HEAD all the time, on what platform is this?

I initially got the error on an x86-64 system, but it looks like it is
failing on all the salmonella machines (x86, x86-64 and arm64). I also
tested on my Raspberry Pi (arm) and it fails there as well.

> Is the error consistently appearing?

It happened every time I ran "make check".

What I use to reproduce the problem is:

$ make PLATFORM=... PREFIX=... CHICKEN=/path/to/chicken-5.0.0/bin/chicken spotless boot-chicken
$ make PLATFORM=... PREFIX=... CHICKEN=./chicken-boot spotless install
$ make PLATFORM=... PREFIX=... CHICKEN=./chicken-boot check

(On the tip of chicken-core's master.)

> If you comment out the f32vector test (lolevel-tests.scm), does the
> f64vector test fail, too?

I only tested that case on my x86-64 system, and it fails as well:

Error: assertion failed: (eqv? (f64vector-ref old 6) (f64vector-ref new 0))

    Call history:

    (=598 i596 7)
    (make-locative601 old593 i596)
    (make-locative601 new594 (|-603| 7 i596 1))
    (|-603| 7 i596 1)
    (locative?605 loc-src600)
    (locative?605 loc-dst602)
    (locative-set!606 loc-dst602 (locative-ref607 loc-src600))
    (locative-ref607 loc-src600)
    (doloop612 (add1597 i596))
    (add1597 i596)
    (=598 i596 7)
    (printf608 "\nold: ~S\nnew: ~S\n" old593 new594)
    (eqv?609 (f64vector-ref old593 6) (f64vector-ref new594 0))
    (f64vector-ref old593 6)
    (f64vector-ref new594 0)
    (##sys#error "assertion failed" (##core#quote (eqv? (f64vector-ref old 6) (f64vector-ref new 0<--
rules.make:975: recipe for target 'check' failed

> Is perhaps some float-equality-precision issue at work here?

No idea. :-)

All the best.
Mario
-- 
http://parenteses.org/mario

Re: [Chicken-hackers] [PATCH] Unboxing optimization for flonums

> > No problem. Unfortunately, now "make check" breaks:
> >
> > Error: assertion failed: (eqv? (f32vector-ref old 6) (f32vector-ref new 0))
>

Ouch. I'm running make check with something based on the current
HEAD all the time, on what platform is this?

Is the error consistently appearing?

If you comment out the f32vector test (lolevel-tests.scm), does the
f64vector test fail, too?

Is perhaps some float-equality-precision issue at work here?

felix

Re: [Chicken-hackers] [PATCH] Unboxing optimization for flonums

Hi,

On Thu, 29 Nov 2018 11:42:36 +0100 felix.winkelm...@bevuta.com wrote:

>> On Wed, Nov 28, 2018 at 07:25:33PM +0100, Mario Domenech Goulart wrote:
>> > It looks like this patch (79cf7427, master) has broken "make
>> > bootstrap". Log attached (using CHICKEN 5.0.0 as CHICKEN).
>>
>> Right you are. The reason is that lfa2 is trying to unbox the arguments
>> to {f32,f64}_vector_ref, which are not flonums but srfi-4 vector and
>> integer arguments. The return value is a flonum.
>>
>> Attached is a patch to avoid unboxing the argument to accessor functions.
>
> Sorry about this. Pushed.

No problem. Unfortunately, now "make check" breaks:

Error: assertion failed: (eqv? (f32vector-ref old 6) (f32vector-ref new 0))

    Call history:

    (=598 i596 7)
    (make-locative601 old593 i596)
    (make-locative601 new594 (|-603| 7 i596 1))
    (|-603| 7 i596 1)
    (locative?605 loc-src600)
    (locative?605 loc-dst602)
    (locative-set!606 loc-dst602 (locative-ref607 loc-src600))
    (locative-ref607 loc-src600)
    (doloop612 (add1597 i596))
    (add1597 i596)
    (=598 i596 7)
    (printf608 "\nold: ~S\nnew: ~S\n" old593 new594)
    (eqv?609 (f32vector-ref old593 6) (f32vector-ref new594 0))
    (f32vector-ref old593 6)
    (f32vector-ref new594 0)
    (##sys#error "assertion failed" (##core#quote (eqv? (f32vector-ref old 6) (f32vector-ref new 0<--
rules.make:975: recipe for target 'check' failed
make: *** [check] Error 70

All the best.
Mario
-- 
http://parenteses.org/mario

Re: [Chicken-hackers] [PATCH] Unboxing optimization for flonums

> On Wed, Nov 28, 2018 at 07:25:33PM +0100, Mario Domenech Goulart wrote:
> > It looks like this patch (79cf7427, master) has broken "make
> > bootstrap". Log attached (using CHICKEN 5.0.0 as CHICKEN).
>
> Right you are. The reason is that lfa2 is trying to unbox the arguments
> to {f32,f64}_vector_ref, which are not flonums but srfi-4 vector and
> integer arguments. The return value is a flonum.
>
> Attached is a patch to avoid unboxing the argument to accessor functions.
>

Sorry about this. Pushed.

felix

Re: [Chicken-hackers] [PATCH] Unboxing optimization for flonums

On Wed, Nov 28, 2018 at 07:25:33PM +0100, Mario Domenech Goulart wrote:
> It looks like this patch (79cf7427, master) has broken "make
> bootstrap". Log attached (using CHICKEN 5.0.0 as CHICKEN).

Right you are. The reason is that lfa2 is trying to unbox the arguments
to {f32,f64}_vector_ref, which are not flonums but srfi-4 vector and
integer arguments. The return value is a flonum.

Attached is a patch to avoid unboxing the argument to accessor functions.

Cheers,
Peter

From f7fa7d3655ce7b8587beaa63e6f671ae5b06c597 Mon Sep 17 00:00:00 2001
From: Peter Bex
Date: Wed, 28 Nov 2018 21:50:18 +0100
Subject: [PATCH] Do not float-unbox arguments to srfi-4 vector accessors

The arguments aren't flonums, only the return value is!
---
 lfa2.scm | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lfa2.scm b/lfa2.scm
index dffaee6f..1fba207c 100644
--- a/lfa2.scm
+++ b/lfa2.scm
@@ -563,7 +563,10 @@
                      (set! count (add1 count))
                      (let ((n (make-node '##core#inline
                                          (list ub)
-                                         (map walk/unbox subs))))
+                                         (map (if (eq? type 'acc)
+                                                  walk
+                                                  walk/unbox)
+                                              subs))))
                        (case type
                          ((pred) n)
                          (else (make-node '##core#box_float '()
-- 
2.11.0

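The rule this patch establishes can be restated as a tiny decision table. The sketch below is only a model of what the visible hunk does: the enum and function names are hypothetical, OP_OTHER stands in for every operation kind that is neither 'pred nor 'acc, and nothing here is taken from lfa2.scm beyond those two symbols.

```c
#include <assert.h>

/* Hypothetical operation kinds; 'pred and 'acc appear in the patch,
   OP_OTHER is a catch-all introduced for this illustration. */
enum op_kind { OP_OTHER, OP_PRED, OP_ACC };

/* Accessors take a vector and an index, not flonums, so their
   arguments are walked normally; everything else gets walk/unbox. */
int unbox_arguments(enum op_kind kind) { return kind != OP_ACC; }

/* Predicates return a boolean, so their result is left alone; any
   other unboxed result is wrapped in a box_float node. */
int box_result(enum op_kind kind) { return kind != OP_PRED; }

/* Spells out the combinations for the three kinds. */
void check_op_rules(void)
{
    assert(unbox_arguments(OP_OTHER) && box_result(OP_OTHER));
    assert(unbox_arguments(OP_PRED) && !box_result(OP_PRED));
    assert(!unbox_arguments(OP_ACC) && box_result(OP_ACC));
}
```

The accessor row is the one the patch changes: before it, the arguments of f32vector-ref and f64vector-ref were treated like flonum operands.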
Re: [Chicken-hackers] [PATCH] Unboxing optimization for flonums

Hi,

On Mon, 26 Nov 2018 12:32:19 +0100 felix.winkelm...@bevuta.com wrote:

> Thanks, your suggestions seem to be correct, I applied the patch and removed
> the last call to sub-boxed!. I also added a (very simple) test.
>
>> - Finally: there are still quite some remnants of the old boxing/unboxing
>>   code around to mark variables as 'boxed, and there's still ##core#box
>>   and ##core#unbox in the intermediate language.
>>
>>   Is that still relevant, or can we delete that too? As far as I can
>>   tell, that code is still active and used; could you tell me more about
>>   how it works and how it relates (or not) to the lfa2 boxing and
>>   unboxing step, especially why the patch introduces a new box_float
>>   operation rather than re-using the old intermediate language box/unbox
>>   operations?
>
> ##core#box/##core#unbox are unrelated, they access boxed variables
> in closures (1-element vectors).

It looks like this patch (79cf7427, master) has broken "make
bootstrap". Log attached (using CHICKEN 5.0.0 as CHICKEN).

All the best.
Mario
-- 
http://parenteses.org/mario

"make" PLATFORM=linux PREFIX=/nowhere CONFIG= \
  CHICKEN=/home/mario/local/chicken-5.0.0/bin/chicken PROGRAM_SUFFIX=-boot-stage1 STATICBUILD=1 \
  C_COMPILER_OPTIMIZATION_OPTIONS="-Os -fomit-frame-pointer" BUILDING_CHICKEN_BOOT=1 \
  confclean chicken-boot-stage1
make[1]: Entering directory '/home/mario/src/chicken-core'
rm -f \
  chicken-config.h chicken-defaults.h chicken-install.rc chicken-uninstall.rc
echo '#define STATICBUILD 1' >> chicken-defaults.h
echo '#define C_CHICKEN_PROGRAM "chicken-boot-stage1"' >> chicken-defaults.h
echo '#ifndef C_INSTALL_CC' >> chicken-defaults.h
echo '# define C_INSTALL_CC "gcc"' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_INSTALL_CXX' >> chicken-defaults.h
echo '# define C_INSTALL_CXX "g++"' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_INSTALL_POSTINSTALL_PROGRAM' >> chicken-defaults.h
echo '# define C_INSTALL_POSTINSTALL_PROGRAM "true"' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_INSTALL_RC_COMPILER' >> chicken-defaults.h
echo '# define C_INSTALL_RC_COMPILER ""' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_INSTALL_CFLAGS' >> chicken-defaults.h
echo '# define C_INSTALL_CFLAGS "-fno-strict-aliasing -fwrapv -DHAVE_CHICKEN_CONFIG_H -DC_ENABLE_PTABLES -Os -fomit-frame-pointer"' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_INSTALL_LDFLAGS' >> chicken-defaults.h
echo '# define C_INSTALL_LDFLAGS " "' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_INSTALL_PREFIX' >> chicken-defaults.h
echo '# define C_INSTALL_PREFIX "/nowhere"' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_INSTALL_SHARE_HOME' >> chicken-defaults.h
echo '# define C_INSTALL_SHARE_HOME "/nowhere/share/chicken-boot-stage1"' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_INSTALL_BIN_HOME' >> chicken-defaults.h
echo '# define C_INSTALL_BIN_HOME "/nowhere/bin"' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_INSTALL_EGG_HOME' >> chicken-defaults.h
echo '# define C_INSTALL_EGG_HOME "/nowhere/lib/chicken-boot-stage1/9"' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_INSTALL_LIB_HOME' >> chicken-defaults.h
echo '# define C_INSTALL_LIB_HOME "/nowhere/lib"' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_INSTALL_LIB_NAME' >> chicken-defaults.h
echo '# define C_INSTALL_LIB_NAME "chicken-boot-stage1"' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_INSTALL_STATIC_LIB_HOME' >> chicken-defaults.h
echo '# define C_INSTALL_STATIC_LIB_HOME "/nowhere/lib"' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_INSTALL_INCLUDE_HOME' >> chicken-defaults.h
echo '# define C_INSTALL_INCLUDE_HOME "/nowhere/include/chicken-boot-stage1"' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_INSTALL_MORE_LIBS' >> chicken-defaults.h
echo '# define C_INSTALL_MORE_LIBS "-lm -ldl"' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_INSTALL_MORE_STATIC_LIBS' >> chicken-defaults.h
echo '# define C_INSTALL_MORE_STATIC_LIBS "-lm -ldl"' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_STACK_GROWS_DOWNWARD' >> chicken-defaults.h
echo '# define C_STACK_GROWS_DOWNWARD 1' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_TARGET_MORE_LIBS' >> chicken-defaults.h
echo '# define C_TARGET_MORE_LIBS "-lm -ldl"' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_TARGET_MORE_STATIC_LIBS' >> chicken-defaults.h
echo '# define C_TARGET_MORE_STATIC_LIBS "-lm -ldl"' >> chicken-defaults.h
echo '#endif' >> chicken-defaults.h
echo '#ifndef C_TARGET_CC' >> chicken-defaults.h
echo '# define C_TARGET_CC "gcc"' >> chicken-de

Re: [Chicken-hackers] [PATCH] Unboxing optimization for flonums

Thanks, your suggestions seem to be correct, I applied the patch and removed
the last call to sub-boxed!. I also added a (very simple) test.

> - Finally: there are still quite some remnants of the old boxing/unboxing
>   code around to mark variables as 'boxed, and there's still ##core#box
>   and ##core#unbox in the intermediate language.
>
>   Is that still relevant, or can we delete that too? As far as I can
>   tell, that code is still active and used; could you tell me more about
>   how it works and how it relates (or not) to the lfa2 boxing and
>   unboxing step, especially why the patch introduces a new box_float
>   operation rather than re-using the old intermediate language box/unbox
>   operations?

##core#box/##core#unbox are unrelated, they access boxed variables
in closures (1-element vectors).

felix

Re: [Chicken-hackers] [PATCH] Unboxing optimization for flonums

On Thu, Nov 22, 2018 at 11:46:44AM +0100, felix.winkelm...@bevuta.com wrote:
> This patch adds an additional optimization pass to the "lfa2"
> compiler stage, which attempts to remove unnecessary
> boxing and unboxing of floating point numbers. Specifically,
> calls to floating point inline operations that have a variant that
> accepts unboxed arguments are replaced with a faster version,
> omitting the unboxing of arguments, and possibly also the
> boxing of results.

Hi Felix,

I had a look and it looks quite simple but effective. I have made a few
small modifications to your patch:

- "utype" was no longer used in c-backend.scm after removing the old
  unboxing code, so I've removed that procedure as well.
- The patch introduced several lines with trailing whitespace, I've
  cleaned it up a bit so git doesn't complain as much when applying.
- I noticed you introduced a second entry for fix_to_flo in the
  constructor map, so I've removed that. There were also some additional
  entries with the (invalid) "flonum" type that you missed, so I've
  updated those too.
- In lfa2, sub-boxed was always called just before calling extinguish!
  to decrement variable nodes in floatvars, but this seems not entirely
  justified; extinguish! will do some checks to see if it is droppable
  at all, and only drop it in that case. Additionally, it will traverse
  the sub-node tree first to see if it can drop any of those.
  Therefore I think it is better to move the sub-boxed calls into drop!,
  to ensure the counts will match up with the actual number of nodes
  that remain; if there are multiple references to the variable in the
  sub-nodes and the entire node is dropped, you want to decrement the
  counters by the total number of (sub)nodes referring to the variable
  that were dropped, not just once for the main node.

  I kept the one remaining call to sub-boxed in the case where we look
  up the call in +ffi-type-check-map+ to "raise" the subexpression, but
  I'm not 100% sure it is correct; if we raise the subexpression, the
  sub-expressions remain the same, no? So, for example, when we make
  this replacement:

    (##core#inline "C_i_foreign_flonum_argumentp" foo) => foo

  AFAIK that does not remove any references to variables in the foo
  sub-expression, so if I understand it correctly, it should not be
  decremented from the list of floatvars. I've kept it because I'm not
  certain enough about this to remove it, but if you agree, please
  remove that call after applying this modified patch.
- Finally: there are still quite a few remnants of the old boxing/unboxing
  code around to mark variables as 'boxed, and there's still ##core#box
  and ##core#unbox in the intermediate language.

  Is that still relevant, or can we delete that too? As far as I can
  tell, that code is still active and used; could you tell me more about
  how it works and how it relates (or not) to the lfa2 boxing and
  unboxing step, especially why the patch introduces a new box_float
  operation rather than re-using the old intermediate language box/unbox
  operations?

See attachment for the updated patch.

Cheers,
Peter

From 226623a7b9fb5c3f083fa1245f656cd909c751c8 Mon Sep 17 00:00:00 2001
From: felix
Date: Wed, 21 Nov 2018 18:48:48 +0100
Subject: [PATCH] Add unboxing pass to lfa2

After the lfa2 pass another pass is executed to eliminate unnecessary
boxing + unboxing of floating point intermediate values.

The process is roughly this: identify variables that are unassigned and
are known to contain flonums, count all accesses, then count all accesses
of these variables that are in direct operator position of an intrinsic
that has an unboxed variant and, if the number of accesses in unboxed
position is the same as the number of total accesses, then the variable
can be let-bound using a specialized construct (##core#let_float) and all
accesses be direct accesses (without any boxing/unboxing).

Results of unboxable intrinsics are boxed automatically (using
##core#box_float), uses of ##core#inline_allocate on unboxable intrinsics
are converted to ##core#inline forms.

The lfa2 pass is now enabled at optimization levels 2 or higher.

Signed-off-by: Peter Bex
---
 NEWS | 4 +
 batch-driver.scm | 9 +-
 c-backend.scm | 57
 chicken.h | 18 ++-
 chicken.scm | 7 +-
 core.scm | 56 +---
 lfa2.scm | 331 --
 manual/Using the compiler | 2 +-
 support.scm | 4 +-
 9 files changed, 347 insertions(+), 141 deletions(-)

diff --git a/NEWS b/NEWS
index f59d72ec..f3be786d 100644
--- a/NEWS
+++ b/NEWS
@@ -59,6 +59,10 @@
     with the same version of the compiler.
   - the "-consult-type-file" and "-emit-type-file" options have been renamed
     to "-consult-types-file" and "-emit-types-file", respectively.
+  - Added an optimizat

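The bookkeeping point about drop! can be illustrated with a small model: when a whole node is dropped, the live-reference counter for a variable should go down by the number of references inside the dropped subtree. This is only a sketch under assumed data structures; struct node, count_refs and drop_subtree are hypothetical names and do not mirror the node representation in lfa2.scm.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical expression node: 'var' is the id of the variable this
   node references, or -1 if it is not a variable reference. */
struct node {
    int var;
    size_t nkids;
    const struct node *kids[4];
};

/* Counts every reference to 'var' in the subtree rooted at n. */
int count_refs(const struct node *n, int var)
{
    int total = (n->var == var) ? 1 : 0;
    for (size_t i = 0; i < n->nkids; i++)
        total += count_refs(n->kids[i], var);
    return total;
}

/* Returns the counter value after the whole subtree is dropped:
   one decrement per reference removed, not one per dropped root. */
int drop_subtree(const struct node *n, int var, int live_refs)
{
    int removed = count_refs(n, var);
    assert(removed <= live_refs);
    return live_refs - removed;
}
```

With a dropped subtree that mentions the same variable twice, a counter of 5 ends at 3. Decrementing only once for the main node would leave it at 4, and the later comparison of total versus unboxed accesses would no longer reflect the nodes that remain.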
[Chicken-hackers] [PATCH] Unboxing optimization for flonums

This patch adds an additional optimization pass to the "lfa2"
compiler stage, which attempts to remove unnecessary
boxing and unboxing of floating point numbers. Specifically,
calls to floating point inline operations that have a variant that
accepts unboxed arguments are replaced with a faster version,
omitting the unboxing of arguments, and possibly also the
boxing of results.

This also enables -lfa2 at optimization levels 2 or higher.

The performance improvement for flonum-intensive code is quite
considerable, but only really takes effect in unsafe code (removing error
checks and therefore making it possible to pass arguments to numeric ops
directly), so either compile your code in unsafe mode or use the "unsafe"
egg (http://wiki.call-cc.org/eggref/5/unsafe) which provides replacement
modules for some core libraries. The core libraries don't seem to take
much advantage of this optimization, this is mainly intended for
speed-critical, tight code (tests/fft.scm is a good example).

Here are a few timings, using 5.0.0 as a baseline:

baseline:            0m17.51s real   0m17.47s user   0m00.06s system
unsafe (baseline):   0m08.69s real   0m08.68s user   0m00.01s system
unboxing:            0m16.77s real   0m16.74s user   0m00.06s system
unsafe (unboxing):   0m06.02s real   0m05.94s user   0m00.05s system
baseline, -O5:       0m07.45s real   0m07.44s user   0m00.00s system
unboxing, -O5:       0m05.10s real   0m05.08s user   0m00.00s system

Results will of course vary wildly, depending on whatever code you
throw at it.

felix

From 899d3db27d1a2c1317543499f1762fad62ffd144 Mon Sep 17 00:00:00 2001
From: felix
Date: Wed, 21 Nov 2018 18:48:48 +0100
Subject: [PATCH] Add unboxing pass to lfa2

After the lfa2 pass another pass is executed to eliminate unnecessary
boxing + unboxing of floating point intermediate values.

The process is roughly this: identify variables that are unassigned and
are known to contain flonums, count all accesses, then count all accesses
of these variables that are in direct operator position of an intrinsic
that has an unboxed variant and, if the number of accesses in unboxed
position is the same as the number of total accesses, then the variable
can be let-bound using a specialized construct (##core#let_float) and all
accesses be direct accesses (without any boxing/unboxing).

Results of unboxable intrinsics are boxed automatically (using
##core#box_float), uses of ##core#inline_allocate on unboxable intrinsics
are converted to ##core#inline forms.

The lfa2 pass is now enabled at optimization levels 2 or higher.
---
 NEWS | 4 +
 batch-driver.scm | 9 +-
 c-backend.scm | 47 +++
 chicken.h | 18 ++-
 chicken.scm | 7 +-
 core.scm | 60 +
 lfa2.scm | 324 --
 manual/Using the compiler | 2 +-
 support.scm | 4 +-
 9 files changed, 349 insertions(+), 126 deletions(-)

diff --git a/NEWS b/NEWS
index c643784b..3eba6d6a 100644
--- a/NEWS
+++ b/NEWS
@@ -46,6 +46,10 @@
     with the same version of the compiler.
   - the "-consult-type-file" and "-emit-type-file" options have been renamed
     to "-consult-types-file" and "-emit-types-file", respectively.
+  - Added an optimization pass for reducing the amount of boxing of
+    intermediate floating point values, enabled by the "-lfa2" compiler
+    option.
+  - The "lfa2" pass is now enabled at optimization levels 2 or higher.
 
 - Tools
   - The new "-link" option to csc allows linking with objects from extensions.
diff --git a/batch-driver.scm b/batch-driver.scm
index fc7afb04..4a4a370e 100644
--- a/batch-driver.scm
+++ b/batch-driver.scm
@@ -802,8 +802,13 @@
              (when do-lfa2
                (begin-time)
                (debugging 'p "doing lfa2")
-               (perform-secondary-flow-analysis node2 db)
-               (end-time "secondary flow analysis"))
+               (let ((floatvars (perform-secondary-flow-analysis node2 db)))
+                 (end-time "secondary flow analysis")
+                 (unless (null? floatvars)
+                   (begin-time)
+                   (debugging 'p "doing unboxing")
+                   (set! node2 (perform-unboxing node2 floatvars)))
+                 (end-time "unboxing")))
              (print-node "optimized" '|7| node2)
              ;; inlining into a file with interrupts enabled would
              ;; change semantics
diff --git a/c-backend.scm b/c-backend.scm
index babb2ac3..952fa8ea 100644
--- a/c-backend.scm
+++ b/c-backend.scm
@@ -124,6 +124,9 @@
 (if (vector? lit)
 (gen "((C_word)li" (vector-ref lit 0) ")")
 (gen
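
The eligibility rule from the commit message ("unassigned, known flonum, every access in unboxed position") can be written down as a single predicate. The sketch below only restates that sentence; struct float_var and can_unbox are hypothetical names, and the real pass keeps this information in its own analysis tables rather than in a C struct.

```c
#include <assert.h>

/* Hypothetical per-variable record gathered by a counting pass. */
struct float_var {
    int assigned;          /* nonzero if the variable is ever assigned */
    int known_flonum;      /* nonzero if it is known to hold a flonum */
    int total_accesses;    /* every reference to the variable */
    int unboxed_accesses;  /* references used as a direct operand of an
                              intrinsic that has an unboxed variant */
};

/* A variable may be bound unboxed only if it is never assigned, is
   known to be a flonum, and is never seen outside unboxed position. */
int can_unbox(const struct float_var *v)
{
    return !v->assigned
        && v->known_flonum
        && v->total_accesses == v->unboxed_accesses;
}

/* Two quick cases: one eligible variable, one that escapes once. */
void check_can_unbox(void)
{
    struct float_var all_unboxed = { 0, 1, 3, 3 };
    struct float_var escapes_once = { 0, 1, 3, 2 };
    assert(can_unbox(&all_unboxed));
    assert(!can_unbox(&escapes_once));
}
```

A single access in boxed position (for example, passing the value to an ordinary procedure) is enough to disqualify the variable, which is consistent with the remark above that the gain shows up mainly in tight, unsafe numeric code.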