Hi, the performance increase looks nice, but see few comment below please.
"Dr. Arne Babenhauserheide" <[email protected]> writes: > Hi, > > While optimizing an implementation of levenshtein distance, I found that > (min ...) took up a significant part of the profile (16%), so I tried > replacing it with a simple implementation in Scheme. That worked out > surprisingly well (shaved off about 25% of the total runtime) and when I > benchmarked it directly it turned out to be between 1.5 and 3 times > faster than the min in numbers.c. There are few difference in behavior --8<---------------cut here---------------start------------->8--- scheme@(guile-user)> (min -0.0 0.0) $8 = -0.0 scheme@(guile-user)> (mins -0.0 0.0) $9 = 0.0 scheme@(guile-user)> (min (nan) 1) $15 = +nan.0 scheme@(guile-user)> (mins (nan) 1) $16 = 1 --8<---------------cut here---------------end--------------->8--- I would also expect `min' to not care about the order of arguments. The original implementation does not, but yours does --8<---------------cut here---------------start------------->8--- scheme@(guile-user)> (min (nan) 1) $21 = +nan.0 scheme@(guile-user)> (min 1 (nan)) $22 = +nan.0 scheme@(guile-user)> (mins (nan) 1) $19 = 1 scheme@(guile-user)> (mins 1 (nan)) $20 = +nan.0 --8<---------------cut here---------------end--------------->8--- One last observation is that your version does not preserve exactness of the arguments, which seems to be a requirement in both R6RS (11.7.4, Arithmetic operations) and R7RS (6.2.2) --8<---------------cut here---------------start------------->8--- scheme@(guile-user)> (min +inf.0 1) $29 = 1.0 scheme@(guile-user)> (mins +inf.0 1) $30 = 1 --8<---------------cut here---------------end--------------->8--- As for the performance, while your version is a lot faster for integers --8<---------------cut here---------------start------------->8--- scheme@(guile-user)> (define x (iota #_1_000_000)) scheme@(guile-user)> ,t (let loop ((n 100)) (if (zero? n) (when #f #f) (begin (apply min x) (loop (1- n))))) ;; 2.208355s real time, 2.208266s run time. 0.000000s spent in GC. scheme@(guile-user)> ,t (let loop ((n 100)) (if (zero? n) (when #f #f) (begin (apply mins x) (loop (1- n))))) ;; 1.044301s real time, 1.044260s run time. 0.000000s spent in GC. --8<---------------cut here---------------end--------------->8--- it is at the cost of performance regressions elsewhere --8<---------------cut here---------------start------------->8--- scheme@(guile-user)> (define x (map (cut / <> 3) (iota #_1_000_000))) scheme@(guile-user)> ,t (let loop ((n 100)) (if (zero? n) (when #f #f) (begin (apply min x) (loop (1- n))))) ;; 2.720850s real time, 2.720736s run time. 0.000000s spent in GC. scheme@(guile-user)> ,t (let loop ((n 100)) (if (zero? n) (when #f #f) (begin (apply mins x) (loop (1- n))))) ;; 3.476514s real time, 3.476266s run time. 0.000000s spent in GC. scheme@(guile-user)> (define x (map (cut + <> 0.1) (iota #_1_000_000))) scheme@(guile-user)> ,t (let loop ((n 100)) (if (zero? n) (when #f #f) (begin (apply min x) (loop (1- n))))) ;; 2.144861s real time, 2.144699s run time. 0.000000s spent in GC. scheme@(guile-user)> ,t (let loop ((n 100)) (if (zero? n) (when #f #f) (begin (apply mins x) (loop (1- n))))) ;; 2.719851s real time, 2.719682s run time. 0.000000s spent in GC. --8<---------------cut here---------------end--------------->8--- Please correct me if I have messed up somewhere in the above text. Tomas -- There are only two hard things in Computer Science: cache invalidation, naming things and off-by-one errors.
