(Changing the thread title to be a little more relevant than "Fast computation of binomial coefficients".)

I now have a Tesla C1060 plugged into a Dell T7400 box running RHEL5 and am learning how to use CUDA for non-trivial computations. I'm starting to have a little free time again as well. I can't promise much time just yet, but hope to have more later in the year.

Anyway, Marc Glisse indicated one very useful aspect of nails when running on GPUs: constant-time add/sub. Another is that, for the version 1 hardware at least (which includes all shipped products), integer multiplication on NVidia cards is markedly faster for 24*24-bit products than for full-word (i.e. 32-bit) products. An implementation with 8-bit nails allows a large number of carry propagation steps to be postponed. I've not yet taken measurements, but I expect the carry saving to more than compensate for the smaller effective word size.

Note that I'm not saying that the high-level MPIR representation should be 24+8 bits, only that the GPU would work with that representation internally.

My interests, primarily integer factorization, also lead me to consider CRT representations, where multiplication is a linear-time operation. However, that is almost certainly a step too far for an MPIR implementation!

Paul
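
To make the carry-postponement point concrete, here is a minimal host-side sketch in plain C (not CUDA, and not MPIR code) of limbs stored as 24 value bits plus 8 nail bits in a 32-bit word. The names `nail_add`, `nail_accumulate` and `nail_normalize` are made up for illustration. It assumes every operand starts out normalized (each limb below 2^24), in which case up to 255 additions into a normalized accumulator can be done limb by limb, with no carry chain, before the nails must be flushed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LIMB_BITS 24                 /* value bits per 32-bit word        */
#define LIMB_MASK 0xFFFFFFu          /* low 24 bits                       */
#define MAX_DEFERRED_ADDS 255u       /* assumption: normalized operands   */

/* Add b into acc limb by limb. No carry moves between limbs, so every
 * limb is independent (one GPU thread per limb would work). Overflow of
 * a 24-bit limb simply accumulates in that word's 8 nail bits. */
static void nail_add(uint32_t *acc, const uint32_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        acc[i] += b[i];
}

/* Flush the nails: push each word's bits above bit 23 into the next
 * limb. Returns the carry out of the most significant limb. */
static uint32_t nail_normalize(uint32_t *a, size_t n)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t t = a[i] + carry;   /* cannot wrap within the bound above */
        a[i] = t & LIMB_MASK;
        carry = t >> LIMB_BITS;
    }
    return carry;
}

/* Add `count` normalized n-limb operands (stored back to back in ops)
 * into a normalized accumulator, then do a single carry pass. */
static uint32_t nail_accumulate(uint32_t *acc, const uint32_t *ops,
                                size_t count, size_t n)
{
    assert(count <= MAX_DEFERRED_ADDS);
    for (size_t k = 0; k < count; k++)
        nail_add(acc, ops + k * n, n);
    return nail_normalize(acc, n);
}
```

The trade-off is the one described above: a third more words are needed to hold the same number, in exchange for one sequential carry pass per batch of additions instead of one per addition. Whether that wins on real hardware is something only measurement can settle.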