On Saturday 23 July 2011 21:17:25 Jason wrote:
> There's sandybridge addsub to do

The gain was extremely small, so I won't bother.

> and that will finish the asm part of toom
> evaluation for +-1 , I'll write the C bit , and then do the asm for
> evaluation at +-2 +-1/2 +-4 etc , this will consist of inclsh's/incrsh's

I'll get onto this now
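
As a rough illustration of what an inclsh-style primitive does for the
evaluation at +-2 and +-4, here is a plain C reference sketch. The name
ref_inclsh_n, the limb_t typedef and the 64-bit limb size are assumptions
made for illustration only, not MPIR's actual interface; the real routines
are hand-written asm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t; /* stand-in for a 64-bit limb (assumption) */

/* Hypothetical reference: r[0..n-1] += s[0..n-1] << c in a single pass,
   for 0 < c < 64. Returns the limb that falls off the top, i.e. the
   bits shifted out of s[n-1] plus the final addition carry. */
limb_t ref_inclsh_n(limb_t *r, const limb_t *s, size_t n, unsigned c)
{
    assert(c > 0 && c < 64);
    limb_t shifted_in = 0; /* bits shifted out of the previous limb */
    limb_t carry = 0;      /* carry out of the previous addition, 0 or 1 */
    for (size_t i = 0; i < n; i++) {
        limb_t t = (s[i] << c) | shifted_in;
        shifted_in = s[i] >> (64 - c);
        limb_t sum = r[i] + t;
        limb_t c1 = sum < t;       /* overflow of r[i] + t */
        limb_t sum2 = sum + carry;
        limb_t c2 = sum2 < carry;  /* overflow when adding the carry */
        r[i] = sum2;
        carry = c1 | c2;           /* c1 and c2 are never both set */
    }
    return shifted_in + carry;
}
```

With a helper like this, a value such as a0 + 2*a1 + 4*a2 can be built by
copying a0 into the result and then calling the helper twice, with c = 1
and c = 2, instead of doing separate shift and add passes.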

> and then perhaps do a toom23 interpolation combined with a  reconstitution
> if it seems worth while(karasub/add was the toom22 version of this). 

The above is complicated by the number of different ways we can combine the
two phases of Toom, interpolation and recomposition. It definitely looks like
it's worth doing, but it will have to wait for MPIR 2.6: until I get all the
basic asm functions written, there is no way to know which way is best.
Previous mathematical analyses are no use because they assume only basic
operations (add, sub, shift) and ignore the large savings from rsh1add,
addsub, etc., and from some strange ones such as dualsub. Note: combining
these two phases and using specific functions was key to karasub/add speed.
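
To make the saving from a combined operation concrete, here is a plain C
reference sketch of an addsub-style function that computes r = a + b - c in
one pass over the limbs rather than an add pass followed by a sub pass. The
name ref_addsub_n, the limb_t typedef and the return convention are
assumptions for illustration, not the signatures of the MPIR asm routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t; /* stand-in for a 64-bit limb (assumption) */

/* Hypothetical reference: r = a + b - c over n limbs in one pass.
   Returns the net carry: +1, 0 or -1, so that the true value is
   r + (return value) * 2^(64*n). */
int ref_addsub_n(limb_t *r, const limb_t *a, const limb_t *b,
                 const limb_t *c, size_t n)
{
    limb_t cy = 0; /* carry chain of the addition */
    limb_t bw = 0; /* borrow chain of the subtraction */
    for (size_t i = 0; i < n; i++) {
        limb_t s = a[i] + b[i];
        limb_t c1 = s < a[i];
        limb_t s2 = s + cy;
        limb_t c2 = s2 < cy;
        cy = c1 | c2;

        limb_t d = s2 - c[i];
        limb_t b1 = s2 < c[i];
        limb_t d2 = d - bw;
        limb_t b2 = d < bw;
        bw = b1 | b2;

        r[i] = d2;
    }
    return (int)cy - (int)bw;
}
```

Each limb of a, b and c is loaded once and each limb of r is stored once,
whereas separate add and sub passes would write and re-read the
intermediate sum.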

> That
> will be all for now on toom from me. Then I want to try to improve our
> *_basecase's , that will probably be all I get to do before our next release.

I also want to improve the way I do the inner loops. At the moment I look for
loops whose timings (over all alignments) are below some limit. Instead, I
think taking the loop with the least sum of timings over all alignments would
give a faster average speed. Also, I try to do an exhaustive search, which is
not really possible, so I need to find some way to overcome this; one thought
is to use a genetic algorithm.
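
The two selection rules can be compared with a small C sketch. The table
layout, the function names and the use of integer cycle counts are all
assumptions for illustration; this is not the actual optimizer code.

```c
#include <assert.h>
#include <stddef.h>

/* t[k * n_align + a] holds the measured cycle count of candidate loop k
   at alignment a (hypothetical layout). */

/* Current rule: the first candidate whose timing is below `limit` at
   every alignment, or -1 if no candidate qualifies. */
int pick_under_limit(const unsigned *t, size_t n_cand, size_t n_align,
                     unsigned limit)
{
    for (size_t k = 0; k < n_cand; k++) {
        int ok = 1;
        for (size_t a = 0; a < n_align; a++)
            if (t[k * n_align + a] >= limit)
                ok = 0;
        if (ok)
            return (int)k;
    }
    return -1;
}

/* Proposed rule: the candidate with the least sum of timings over all
   alignments (the first such candidate on ties). */
size_t pick_min_sum(const unsigned *t, size_t n_cand, size_t n_align)
{
    assert(n_cand > 0);
    size_t best = 0;
    unsigned long best_sum = 0;
    for (size_t k = 0; k < n_cand; k++) {
        unsigned long sum = 0;
        for (size_t a = 0; a < n_align; a++)
            sum += t[k * n_align + a];
        if (k == 0 || sum < best_sum) {
            best = k;
            best_sum = sum;
        }
    }
    return best;
}
```

The rules can disagree: a loop that is slightly over the limit at one
alignment but faster everywhere else is rejected by the first rule yet may
have the smallest total, which is what matters for average speed.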

> How's the FFT going? Let me know what linear asm stuff might be needed so
> I can start thinking about it , even if I write no code.
> 
> On Saturday 23 July 2011 18:25:57 Bill Hart wrote:
> > There is a way in Ubuntu. It may be similar in Debian. However, I've
> > forgotten what the command is and I think you need sudo powers.
> 
> Possibly skynet may be able to accommodate this if the kernel has the right
> Capability handling in it , ie you can restrict a particular sudo to only
> one aspect of the system.
> 
> > Bill.
> > 
> > On 23 July 2011 16:32, Jason <ja...@njkfrudils.plus.com> wrote:
> > > On Saturday 23 July 2011 15:42:08 Jason wrote:
> > >> On Saturday 23 July 2011 14:18:09 Jason wrote:
> > >> > On Friday 22 July 2011 23:11:29 Jason wrote:
> > >> > > On Friday 22 July 2011 17:39:55 Jason wrote:
> > >> > > > On Friday 22 July 2011 12:36:21 Jason wrote:
> > >> > > > > On Friday 22 July 2011 12:25:42 Bill Hart wrote:
> > >> > > > > > That's fantastic! Thanks for all your hard work on these
> > >> > > > > > Jason.
> > >> > > > > > 
> > >> > > > > > Bill.
> > >> > > > > > 
> > >> > > > > > On 22 July 2011 11:13, jason <ja...@njkfrudils.plus.com> wrote:
> > >> > > > > > > Hi
> > >> > > > > > > 
> > >> > > > > > > New assembler for the nehalem mpn_addadd mpn_addsub
> > >> > > > > > > mpn_subadd , used to run at 3.5c/l now at 3.0c/l therefore
> > >> > > > > > > optimal. I'll check out how they run on the other intel
> > >> > > > > > > chips.
> > >> > > > > 
> > >> > > > > it's also an improvement on core2/penryn and nearly optimal ,
> > >> > > > > probably just needs a shuffle. Sandybridge doesn't benefit
> > >> > > > > though.
> > >> > > > > 
> > >> > > > > > > Note mpn_subadd really should be called mpn_subsub , I'll
> > >> > > > > > > change it later.
> > >> > > > > > > 
> > >> > > > > > > Jason
> > >> > > > > > > 
> > >> > > > > > > --
> > >> > > > > > > You received this message because you are subscribed to
> > >> > > > > > > the Google Groups "mpir-devel" group. To post to this
> > >> > > > > > > group, send email to mpir-devel@googlegroups.com. To
> > >> > > > > > > unsubscribe from this group, send email to
> > >> > > > > > > mpir-devel+unsubscr...@googlegroups.com. For more options,
> > >> > > > > > > visit this group at
> > >> > > > > > > http://groups.google.com/group/mpir-devel?hl=en.
> > >> > > > 
> > >> > > > I tried to do an mpn_addaddadd, i.e. x=y+z+u+v, but on Intel
> > >> > > > chips the scheduler can't really cope with it. Also, with 5
> > >> > > > pointers you get so many L1 data cache bank conflicts that the
> > >> > > > code runs at different speeds depending on the relative
> > >> > > > differences between the pointers mod 64. But if we are mainly
> > >> > > > using it for toom then we could perhaps guarantee the relative
> > >> > > > differences. On the AMD chips I have a strange problem with my
> > >> > > > optimizer where it reports silly numbers for some functions,
> > >> > > > e.g. sumdiff and addadd. No idea why it's happening; I even
> > >> > > > reverted to an earlier svn version where I found the original
> > >> > > > fast addadd code, but it still gave silly figures, yet karaadd
> > >> > > > was fine?
> > >> > > > 
> > >> > > > Jason
> > >> > > 
> > >> > > New mpn_sumdiff for the nehalem. We didn't have one before, so we
> > >> > > can say it would have run at 4.0c/w, but it now runs at 3.6c/w
> > >> > > (lost a little bit with the feed-in). This code also benefits the
> > >> > > core2 but not penryn or sandybridge; probably just another
> > >> > > trivial shuffle needed :)
> > >> > > 
> > >> > > Jason
> > >> > 
> > >> > I've shuffled the sumdiff for the core2; it's a bit faster at 3.5c/w
> > >> > 
> > >> > Jason
> > >> 
> > >> I've shuffled the sumdiff again for penryn and it runs at 3.7c/w
> > >> 
> > >> and for netburst/atom I'll just choose the fastest from what we
> > >> already have (just before release). The only one left to do is
> > >> westmere on the gcc farm; I don't know how loaded it is, so I might
> > >> not be able to make any meaningful comparisons.
> > >> 
> > >> Jason
> > > 
> > > Well, gcc20 (westmere) is very lightly loaded at the moment, but it has
> > > MHz throttling to save power and turbo boost, which means I can't use
> > > it for timings. I'll have to find a software way to temporarily turn
> > > these off; on my machines (nehalem, sandybridge, bobcat) I just turned
> > > them off in the BIOS.
> > > 
> > > Jason