(Changing the thread title to be a little more relevant than
        "Fast computation of binomial coefficients".)
        
        I now have a Tesla C1060 plugged into a Dell T7400 box running
        RHEL5 and am learning how to use CUDA for non-trivial
        computations.  I'm starting to have a little free time again, as
        well.  Can't promise much time just yet, but hope to have more
        later in the year.
        
        Anyway, Marc Glisse indicated one very useful aspect of nails
        when running on GPUs: constant-time add/sub.  Another is that,
        for the version 1 hardware at least (which includes all shipped
        products), integer multiplication on NVIDIA cards is markedly
        faster for 24*24-bit products than for full-word (i.e. 32-bit)
        products.
        An implementation with 8-bit nails allows a large number of
        carry propagation steps to be postponed.  I've not yet taken
        measurements but expect the saving in carry propagation to more
        than compensate for the smaller effective word size.  Note that
        I'm not saying that the high-level MPIR representation should be
        24+8 bits, only that the GPU would work with that representation
        internally.
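
A minimal sketch of the deferred-carry idea, written as plain host-side C rather than as a CUDA kernel. It assumes 24 payload bits stored in each 32-bit word, so the 8-bit nail can absorb the overflow from up to 256 normalized summands before any carry has to be propagated. The names `nail_add_lazy`, `nail_normalize` and `MAX_PENDING` are hypothetical and are not part of MPIR or CUDA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LIMB_BITS   24u                               /* payload bits per 32-bit word */
#define LIMB_MASK   ((UINT32_C(1) << LIMB_BITS) - 1u)
#define MAX_PENDING 256u                              /* 2^8 summands fit in the nail */

/* acc += x, limb by limb, with no carry propagation. Each limb is
 * independent, so on a GPU every limb could be handled by its own thread.
 * *pending counts how many normalized operands acc currently holds. */
static void nail_add_lazy(uint32_t *acc, const uint32_t *x, size_t n,
                          unsigned *pending)
{
    assert(*pending < MAX_PENDING);   /* otherwise a limb could overflow 32 bits */
    for (size_t i = 0; i < n; i++)
        acc[i] += x[i];
    ++*pending;
}

/* Propagate all deferred carries in one sequential pass, leaving every
 * limb below 2^24. Returns the carry out of the top limb. */
static uint32_t nail_normalize(uint32_t *a, size_t n, unsigned *pending)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)a[i] + carry;
        a[i] = (uint32_t)(t & LIMB_MASK);
        carry = t >> LIMB_BITS;
    }
    *pending = 1;                     /* acc is a single normalized operand again */
    return (uint32_t)carry;
}
```

A caller would start with `pending = 1` for an already normalized accumulator, call `nail_add_lazy` repeatedly, and call `nail_normalize` only once at the end or when the assertion limit is reached. Whether this beats full 32-bit limbs on real hardware is exactly the measurement that has not been taken yet.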
        
        My interests, primarily integer factorization, also lead me to
        consider CRT representations, where multiplication is a
        linear-time operation.  However, that is almost certainly a step
        too far for an MPIR implementation!
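
To illustrate why multiplication is linear-time in a CRT (residue number system) representation, here is a small C sketch. The three moduli 251, 253 and 255 are an arbitrary pairwise-coprime choice made for this example, and the names `rns_mod`, `rns_range`, `rns_encode` and `rns_mul` are hypothetical. Converting back to positional form (for example with Garner's algorithm) is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RNS_K 3
/* Pairwise coprime: 251 is prime, 253 = 11*23, 255 = 3*5*17. */
static const uint32_t rns_mod[RNS_K] = {251u, 253u, 255u};

/* Product of the moduli: values below this have a unique residue vector. */
static uint64_t rns_range(void)
{
    uint64_t m = 1;
    for (size_t i = 0; i < RNS_K; i++)
        m *= rns_mod[i];
    return m;
}

/* Convert x into its residue vector r[i] = x mod rns_mod[i]. */
static void rns_encode(uint32_t x, uint32_t r[RNS_K])
{
    assert(x < rns_range());
    for (size_t i = 0; i < RNS_K; i++)
        r[i] = x % rns_mod[i];
}

/* Multiply two residue vectors: one small modular product per modulus,
 * with no carries between components, so the cost is linear in RNS_K. */
static void rns_mul(const uint32_t a[RNS_K], const uint32_t b[RNS_K],
                    uint32_t out[RNS_K])
{
    for (size_t i = 0; i < RNS_K; i++)
        out[i] = (a[i] * b[i]) % rns_mod[i];   /* residues < 256, so no overflow */
}
```

The result is only meaningful when the true product is below `rns_range()`. Because the components never interact, each one could be assigned to a separate GPU thread.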
        
        Paul



