On Tuesday 10 January 2012 14:23:35 Bill Hart wrote:
> I just tried removing mpn_sumdiff_n references from my code, and this
> slowed it down substantially. So this function is really important for
> the speed of the FFT.

Given that the speedup from sumdiff itself is pretty small, I would guess that
the measured speedup comes from better use of the L1 cache. Why don't we just
document sumdiff in the manual and export it properly? The only question is
whether to allow both destinations to alias both sources. The asm versions
handle this case with no problem, and the existing C implementation can too,
although it has to allocate temporary space to do so. The new, simpler C
implementation can't handle this case. As the measured speedup comes from the
better use of the L1 cache, this is a pain.

However, if HAVE_NATIVE_mpn_sumdiff_n is defined then we know we can alias.
So, as in the case where sumdiff was "not present" on some arch, we can keep
the present code (the new FFT code) and use sumdiff only under
HAVE_NATIVE_mpn_sumdiff_n, where we have this special case. Otherwise we must
use a temp var and sumdiff, or even separate add and sub (as that's how it's
written now).
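
To make the aliasing issue concrete, here is a rough sketch of what the
non-native fallback could look like. It is not the MPIR code: limb_t and the
sketch_* names are made up for illustration, the 2*carry + borrow return value
is an assumption about the convention, and operands are assumed to be either
identical or completely disjoint (no partial overlap).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint64_t limb_t;   /* hypothetical stand-in for mp_limb_t */

/* r = a + b over n limbs; returns the carry out. In-place use (r == a or
   r == b) is fine because each limb is read before it is written. */
static limb_t sketch_add_n(limb_t *r, const limb_t *a, const limb_t *b, size_t n)
{
    limb_t cy = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t x = a[i], y = b[i];
        limb_t s = x + y;
        limb_t c1 = s < x;       /* carry from x + y */
        limb_t t = s + cy;
        limb_t c2 = t < s;       /* carry from adding the incoming carry */
        r[i] = t;
        cy = c1 | c2;
    }
    return cy;
}

/* r = a - b over n limbs; returns the borrow out. Same aliasing rule. */
static limb_t sketch_sub_n(limb_t *r, const limb_t *a, const limb_t *b, size_t n)
{
    limb_t bw = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t x = a[i], y = b[i];
        limb_t d = x - y;
        limb_t b1 = x < y;       /* borrow from x - y */
        limb_t t = d - bw;
        limb_t b2 = d < bw;      /* borrow from subtracting the incoming borrow */
        r[i] = t;
        bw = b1 | b2;
    }
    return bw;
}

/* s = a + b and d = a - b. Where HAVE_NATIVE_mpn_sumdiff_n is defined, the
   asm routine would be called here instead; this is the fallback path. */
static limb_t sketch_sumdiff_n(limb_t *s, limb_t *d,
                               const limb_t *a, const limb_t *b, size_t n)
{
    limb_t cy, bw;

    assert(s != d);              /* the two destinations must differ */
    if (n == 0)
        return 0;

    if (s != a && s != b) {
        /* Writing the sum does not clobber a source, so no temp is needed. */
        cy = sketch_add_n(s, a, b, n);
        bw = sketch_sub_n(d, a, b, n);
    } else {
        /* s aliases a source: keep the sum in a temp until d is computed. */
        limb_t *tmp = malloc(n * sizeof *tmp);
        if (tmp == NULL)
            abort();
        cy = sketch_add_n(tmp, a, b, n);
        bw = sketch_sub_n(d, a, b, n);
        memcpy(s, tmp, n * sizeof *tmp);
        free(tmp);
    }
    return 2 * cy + bw;
}
```

The temp path is exactly the cost we would like to avoid in the FFT
butterflies, which is why keeping the native version where it exists matters.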

> 
> Unfortunately it is not exported by MPIR and even though it is defined
> for all processors, it is mpn_sumdiff_n in some libraries and
> _gmpn_sumdiff_n in others and __gmpn_sumdiff_n in others. So this is a
> total pain in the neck. I'm not sure what the best solution is.
> 
> Bill.
> 
> On 9 January 2012 18:01, Bill Hart <goodwillh...@googlemail.com> wrote:
> > I wouldn't worry about it. It is possible I overwrote my timings file
> > and that the times are not affected after all.
> >
> > On 9 January 2012 17:56, Jason <ja...@njkfrudils.plus.com> wrote:
> >> On Sunday 08 January 2012 11:30:05 Bill Hart wrote:
> >>> I decided to try the FFT without addsub_n and it seems to actually go
> >>> consistently about 3% faster, which is totally mysterious. So I have
> >>> removed it from the two files it is defined in:
> >>
> >>
> >> That's very strange. I assume this is on a K10; what kind of sizes are we
> >> talking about?
> >>
> >>>
> >>> ifft_mfa_truncate_sqrt2.c
> >>> ifft_truncate_sqrt2.c
> >>>
> >>> As sumdiff_n seems to be defined for all platforms as far back as the
> >>> MPIR 2.1 series (even if it is not explicitly exported), I can just
> >>> extern this in my flint code (and of course it won't be a problem in
> >>> mpir).
> >>>
> >>> So the fft should build on all systems now.
> >>>
> >>> Bill.
> >>>
> >>>
> 
> 
