Re: [racket-dev] [plt] Push #27909: master branch updated

Neil Toronto Tue, 14 Jan 2014 10:39:47 -0800

I don't think so. The flonum functions are Flonum -> Flonum, and`magnitude*' is Number -> Number.


The best candidate from the math library is probably `mean'.

Neil ⊥

On 01/14/2014 11:24 AM, Robby Findler wrote:

Do any of these functions turn out to have case-> contracts by the time
TR gets done with them?

Robby



On Tue, Jan 14, 2014 at 12:22 PM, Neil Toronto <neil.toro...@gmail.com
<mailto:neil.toro...@gmail.com>> wrote:

    An update on my math-centered tests. The first is the built-in
    `magnitude' vs. a TR near-transliteration of its C implementation,
    called from *untyped* Racket. The numbers are in milliseconds, for 5
    million calls, by input type:


    Function         Flonum  Rational  Fixnum  Integer  Float-Complex
    ------------------------------__------------------------------__-------
    Pre-static contracts:

    magnitude*         385      419      378     414         686
    magnitude           59       44       40      40         390
    Post-static contracts:
    magnitude*         175      196      164     195         475
    magnitude           68       43       36      40         406

    All but the Float-Complex case just return the absolute value of the
    argument, so there's not much computation. They're dominated by
    contract checking, and the speedup is 0.45 to 0.5. Nice. The
    Float-Complex case does something substantive, and is close to C.
    That's awesome.

    So for really small functions (i.e. just a test and a call to
    `abs'), writing them in C is still better. But for anything much
    larger (that part of `magnitude*' is about 10 lines and 7-13 numeric
    ops), a TR implementation isn't much slower than C (about 17%).

    --

    1 million iterations of some math library flonum functions, in TR,
    untyped before Robby's changes, untyped after Robby's 12/12 changes,
    and untyped after Eric's static contract changes, in milliseconds:

    Function         TR     Untyped    Robby's   Robby's+Eric's
    ------------------------------__------------------------------__--
    flrational?       5       322        98           37
    flsinh           55       343       121           86
    fllog1p          47       351       117           80
    lg+              61       384       154          115
    flgamma         165       521       262          234

    So calling TR floating-point functions from untyped Racket takes
    only 0.11 to 0.45 times the amount of time it took only a month ago,
    with typical cases being about 0.25. That's awesome.

    Neil ⊥


    On 01/14/2014 10:27 AM, Eric Dobson wrote:

        The changes to TR contract generation are now in at HEAD.

        If you can find any cases where contracts are still slowing programs
        down by a lot I'd like to take a look at them. (Excepting the
        case of
        structs where I know it is still an issue).

        On Thu, Dec 12, 2013 at 10:52 AM, Robby Findler
        <ro...@eecs.northwestern.edu
        <mailto:ro...@eecs.northwestern.edu>> wrote:

            Re-reading your message I see that you're not actually
            asserting something
            different from what I said, but just for some precision here
            I wish to point
            out that I wasn't basing my opinion on intuition from the
            code, but on some
            microbenchmark timings. (There was a much more substantial
            difference
            yesterday because the loop inside any-wrap/c wasn't as cheap
            as it could
            have been.)

            I'd be interested to see if your improvements to
            type->contract improve the
            situation any! I expect they will make things better again
            for the Number
            case, but at the moment, there isn't a big difference.

            Program 1:


            #lang racket/base
            (module m typed/racket/base
                (: f (Any -> Any))
                (define (f x) 1)
                (provide f))
            (require 'm)
            (time
               (for ([x (in-range 20000)])
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)))

            Timings:

            cpu time: 142 real time: 142 gc time: 8
            cpu time: 144 real time: 144 gc time: 7
            cpu time: 144 real time: 143 gc time: 6
            cpu time: 142 real time: 142 gc time: 6
            cpu time: 142 real time: 142 gc time: 7
            cpu time: 146 real time: 146 gc time: 6

            Program 2:


            #lang racket/base
            (module m typed/racket/base
                (: f (Any -> Integer))

                (define (f x) 1)
                (provide f))
            (require 'm)
            (time
               (for ([x (in-range 20000)])
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)))


            Timings:

            cpu time: 139 real time: 138 gc time: 7
            cpu time: 145 real time: 144 gc time: 7
            cpu time: 140 real time: 140 gc time: 6
            cpu time: 151 real time: 150 gc time: 6
            cpu time: 139 real time: 138 gc time: 6
            cpu time: 139 real time: 139 gc time: 8








            On Thu, Dec 12, 2013 at 12:33 PM, Eric Dobson
            <eric.n.dob...@gmail.com <mailto:eric.n.dob...@gmail.com>>
            wrote:


                any-wrap/c still requires the check for one value, while
                any (which is
                from Number not Any) does not. So I would still guess at
                Number being
                faster, but Robby's changes may make it so that inlining
                and dead code
                elimination can see through everything and turn it into
                the same code.

                On Thu, Dec 12, 2013 at 10:27 AM, Robby Findler
                <ro...@eecs.northwestern.edu
                <mailto:ro...@eecs.northwestern.edu>> wrote:

                    FWIW, my push speeds up the any-wrap/c
                    implementation a bunch. Those two
                    should have similar speeds after you get that, I guess.

                    Robby


                    On Thu, Dec 12, 2013 at 11:03 AM, Neil Toronto
                    <neil.toro...@gmail.com <mailto:neil.toro...@gmail.com>>
                    wrote:


                        I tried your branch that implements it and saw
                        about 3.5x speedup for
                        the
                        `magnitude*' test. This is of course without
                        Robby's recent first-order
                        contract changes.

                        (I think it's about 3.5x: I tried with
                        magnitude* : Number -> Any first
                        and got 2400ms on the easy tests. I changed it
                        to magnitude* : Number
                        ->
                        Number and got 690ms. Apparently, for an `Any'
                        return type, an
                        `any-wrap/c'
                        contract is generated instead of nothing. If
                        that's much faster to
                        check
                        than `number?', though, the speedup is even better.)

                        I'd love to see this with Robby's recent
                        changes. Hint? Nudge? Please?

                        I didn't see very much speedup with arrays
                        (about 1.2x). Speed tests on
                        the math library's distribution objects were
                        very interesting, though,
                        and
                        indicate why the arrays might not be much
                        faster. Here's my test
                        program:


                        #lang racket

                        (require math/distributions)

                        (define d (normal-dist 0 1))

                        (printf "pdf d 0~n")
                        (for ([_  (in-range 5)])
                            (time (for ([_  (in-range 100000)])
                                    (pdf d 0))))
                        (newline)

                        (define p (distribution-pdf d))
                        (printf "p 0~n")
                        (for ([_  (in-range 5)])
                            (time (for ([_  (in-range 100000)])
                                    (p 0))))
                        (newline)


                        The two tests are equivalent, as `pdf' just
                        pulls the pdf function out
                        of
                        the distribution struct and applies it. In TR,
                        the tests are exactly
                        the
                        same speed (extremely fast). In untyped Racket,
                        on the main branch, the
                        second test is 16x faster, and on your branch,
                        it's 44x faster. (It's
                        still
                        10x slower than TR on your branch, so again...
                        I'd love to see your
                        changes
                        and Robby's together. :D)

                        Neil ⊥


                        On 12/12/2013 12:40 AM, Eric Dobson wrote:


                            Removing the return value checking is in the
                            works. It actually is
                            removing all of the checks that would blame
                            typed code, so higher
                            order functions/datastructure get
                            improvements too. It is actually
                            functional the last time I checked, but
                            lacking documentation which is
                            what is holding up merging with mainline.

                            https://github.com/plt/racket/__pull/453
                            <https://github.com/plt/racket/pull/453>

                            On Wed, Dec 11, 2013 at 7:57 PM, Robby Findler
                            <ro...@eecs.northwestern.edu
                            <mailto:ro...@eecs.northwestern.edu>> wrote:


                                I see that TR's type->contract returns

                                    (-> (flat-named-contract (quote
                                Float) flonum?)
                                (flat-named-contract
                                (quote
                                Float) flonum?))

                                for the type (Float -> Float), but it
                                could return

                                    (-> (flat-named-contract (quote
                                Float) flonum?) any)

                                which wouldn't do any result value
                                checking (this being different
                                from
                                any/c
                                as the range of the arrow contract).

                                Robby


                                On Wed, Dec 11, 2013 at 6:18 PM, Neil
                                Toronto
                                <neil.toro...@gmail.com
                                <mailto:neil.toro...@gmail.com>>
                                wrote:



                                    On 12/11/2013 02:49 PM, Neil Toronto
                                    wrote:



                                        On 12/11/2013 01:55 PM, Stephen
                                        Bloch wrote:



                                                On Dec 11, 2013, at 2:36
                                                PM, Neil Toronto wrote:

                                                    numeric primitives
                                                    implemented in Typed
                                                    Racket are faster than
                                                    the
                                                    same primitives
                                                    implemented in C.




                                            Whoa!  How did that happen?




                                        Whoa! That's not what I meant! O_o

                                        I said "we might be getting
                                        close" to that. I haven't tried
                                        porting
                                        a
                                        numeric C primitive to TR yet,
                                        but I have a hunch that it'll still
                                        be
                                        slower. I'll try one now and
                                        report what I find.

                                        Neil ⊥




                                    I can't figure out why `flsinh' is
                                    faster to call from untyped
                                    Racket
                                    than
                                    `sinh'. All my tests with a Typed
                                    Racket `magnitude' show calls from
                                    untyped
                                    code are significantly slower,
                                    except in the one case that it
                                    computes
                                    Euclidean distance. That case is
                                    only twice as slow.

                                    I've attached the benchmark program.
                                    The `magnitude*' function is
                                    more
                                    or
                                    less a direct translation of
                                    `magnitude' from "number.c" into Typed
                                    Racket.
                                    Here's a summary of the results I
                                    get on my computer, in
                                    milliseconds,
                                    for 5
                                    million calls from untyped Racket,
                                    by data type.


                                    Function         Flonum  Rational
                                      Fixnum  Integer  Float-Complex
                                    
------------------------------__------------------------------__-------
                                    magnitude*         385      419
                                      378     414         686
                                    magnitude           59       44
                                       40      40         390


                                    The only one that's close in
                                    relative terms is Float-Complex. The
                                    others
                                    just call `abs'. The decompiled code
                                    doesn't show any inlining of
                                    `magnitude', so this comparison
                                    should be good.

                                    I'll bet checking the return value
                                    contract (which is unnecessary)
                                    is
                                    the
                                    main slowdown. It has to check for
                                    number of values.

                                    For comparison, here are the timings
                                    for running the benchmarks in
                                    TR
                                    with
                                    #:no-optimize:


                                    Function         Flonum  Rational
                                      Fixnum  Integer  Float-Complex
                                    
------------------------------__------------------------------__-------
                                    magnitude*          45       70*
                                      37     102*       318
                                    magnitude           61       45
                                       39      91*       394


                                                 * = unexpectedly high


                                    Here's what I understand from
                                    comparing the numbers:

                                        * Except for non-fixnum
                                    integers, calling `magnitude' in TR is
                                    just
                                    as
                                    fast as in untyped Racket. I have no
                                    idea why it would be slower on
                                    big
                                    integers. That's just weird.

                                        * Calling `abs' in Racket is
                                    faster than calling `scheme_abs' in
                                    C,
                                    except on rationals and big integers.

                                        * Operating on flonums in Typed
                                    Racket, using generic numeric
                                    functions,
                                    is faster than doing the same in C.

                                    Overall, it looks like the TR code
                                    is within the same order of
                                    magnitude
                                    (pun not intended) as the C code. I
                                    would love to try this benchmark
                                    with
                                    either 1) a `magnitude*' with an
                                    `AnyValues' return type; or 2) a
                                    contract
                                    boundary that doesn't check TR's
                                    return types for first-order
                                    functions.

                                    (I managed to make a `magnitude*'
                                    with type Number -> AnyValues, but
                                    TR
                                    couldn't make a contract for it.)

                                    Neil ⊥


                                    _________________________
                                         Racket Developers list:
                                    http://lists.racket-lang.org/__dev
                                    <http://lists.racket-lang.org/dev>


                                _________________________
                                     Racket Developers list:
                                http://lists.racket-lang.org/__dev
                                <http://lists.racket-lang.org/dev>


_________________________
 Racket Developers list:
 http://lists.racket-lang.org/dev

Re: [racket-dev] [plt] Push #27909: master branch updated

Reply via email to