2009/6/6 Jason Moxham <ja...@njkfrudils.plus.com>:
>
> On Saturday 06 June 2009 00:14:58 Bill Hart wrote:
>> MPIR 1.2 looks to be passing all its tests, so should be out
>> officially over the weekend, so time to think about the next
>> release....
>>
>> There will be an MPIR 1.2.1 which will merge Robert Gerbicz'
>> contributed code, "turn on" some improvements we have for the FFT, and
>> clean up some longstanding issues we have lurking in our trac server
>> (see those issues flagged for mpir-1.2.1:
>> http://trac.mpir.org/mpir_trac/report/1 ).
>>
>> Here are some of the things we've been musing about for the next
>> release(s) (MPIR 1.3 and following), roughly in order of importance in
>> each section:
>>
>> Multiplication:
>>
>> * Further improvements to assembly language on x86_64 (yes, we think
>> even for K8 :-))
>> * Move more of karatsuba into assembly language (there should be two
>> new assembly functions for this)
>> * Factor Karatsuba code out into separate files (kara_mul_n.c and
>> kara_mul.c)
>> * Unbalanced version of karatsuba
>
> Assuming I get the time etc , then I volunteer for the above assembler and
> karatsuba mul.
>
>> * Implement an improvement to Toom 3 suggested by Thorsten Kleinjung
>> * Toom 4 variants for unbalanced operands based on the code of Marco
>> Bodrato that we have
>> * Toom 7 variants for "slightly unbalanced" operands
>> * Factor out some of the lower level FFT helper functions into
>> separate files
>> * Implement some interesting strategies for speeding up pointwise
>> multiplications in the FFT code
>
> Looking at the existing toom code there are a lot of
> mpn_addmul_1(....,2^k)
> we should change this to
> mpn_addlsh_n(....,k)
> and write asm functions for this , for K8 this would be a macro to call
> addmul_1 , and for core2,K10 a new function
> Various other combinations can be done faster than we do at the moment , we
> just need to choose the most common or the easiest.
>
>> * Work on left and right truncated products and middle products
>>
>> Division:
>>
>> * Asymptotically fast division code based on Newton iteration -
>> Fredrick Johansson has python code which demonstrates an asymptotic
>> improvement
>
> Again , assuming I have the time , then I would like to do the division below
>
>> * Implement Peter Montgomery's mod_1 algorithm
>> * Further assembly improvements to divrem_1 and divrem_2
>>
>> GCD:
>>
>> * Strassen multiplication for the 2x2 matrix multiplication used by
>> the asymptotically fast GCD code of Niels Moller
>> * A best of breed implementation at the mpn level making use of the
>> different strengths of the currently implemented GCD variants in Niels
>> Moller's code
>> * A super fast implementation of GCD for 1 and 2 limbs - I have some
>> code for the < 1 limb case already, but this is not merged
>>
>
> Is this for GCD(x,y) where x and y have <=2 limbs ? branch free??

The current code is not branch free, but is fast. However we probably
need multiple versions for different processors.

>
>> XGCD:
>>
>> * An XGCD implementation of at least one of the GCD algorithms which
>> is fast for smaller operands (currently we only have good
>> asymptotically fast code, after I finally got around to writing it)
>>
>> Benchmarketing:
>>
>> * The new GMP benchmark is better than the old one, however Brian
>> Gladman has written some code for a new benchmark program entirely in
>> C which will become a base for a new mpir benchmark. As we aren't
>> legally allowed to report gmpbench results for MPIR, we may as well
>> write our own benchmark to prevent people inadvertently doing this.
>>
>
> Jeff Gilchrist has a real world benchmark , ie something thats run for
> purposes other than generating benchmarks. I would like to see a few of these
> in our benchmark suite.
>
>> mpn layer:
>>
>> * If we are serious about maintaining compatibility with GMP, it looks
>> as if we have to think about catching up with some of the new mpn
>> functions they have implemented - it seems they plan to expose many to
>> the user in a later version - some of them I think we already have,
>> but perhaps with different names, but others I don't have a clue what
>> they do. We'll probably do this on a case by case basis for now,
>> essentially looking to implement the more canonical looking ones
>> before we implement ones which may be internal only - e.g. I think
>> mpn_neg is a safe bet for a user exposed function.
>>
>> Then we can think towards MPIR 2.0. Immediately that MPIR 1.2 is out I
>> will create mpir-mt, mpir-cuda and mpir-cell branches in svn. Glenn
>> reports the cuda1 machine is back up with working CUDA 2.0 drivers,
>> though there may still be some more packages we need to install from
>> NVIDIA for maximum joy.
>>
>
> I know nothing about cuda/cell , but it would great to work on them.
>
>> There are other things in the works, but these depend on us getting
>> written contracts and will be announced once we have signed copies in
>> our hands. :-)
>>
>> Of course all of the above is open for discussion (high level ideas
>> and algorithms can be discussed here, implementation details and
>> debugging will be thrashed out on mpir-dev). Further code
>> contributions and ideas are most welcome!!
>>
>> Bill.
>>
>>
>>
>
>
> >
>

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"mpir-devel" group.
To post to this group, send email to mpir-devel@googlegroups.com
To unsubscribe from this group, send email to 
mpir-devel+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/mpir-devel?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to