Ping... The latest version of this patch was here: https://gcc.gnu.org/ml/gcc-patches/2017-04/msg01567.html
Thanks
Bernd.

On 06/14/17 14:34, Bernd Edlinger wrote:
> Ping...
>
> On 06/01/17 18:01, Bernd Edlinger wrote:
>> Ping...
>>
>> On 05/12/17 18:49, Bernd Edlinger wrote:
>>> Ping...
>>>
>>> On 04/29/17 19:45, Bernd Edlinger wrote:
>>>> Ping...
>>>>
>>>> I attached a rebased version since there was a merge conflict in
>>>> the xordi3 pattern, otherwise the patch is still identical.
>>>> It splits adddi3, subdi3, anddi3, iordi3, xordi3 and one_cmpldi2
>>>> early when the target has no neon or iwmmxt.
>>>>
>>>> Thanks
>>>> Bernd.
>>>>
>>>> On 11/28/16 20:42, Bernd Edlinger wrote:
>>>>> On 11/25/16 12:30, Ramana Radhakrishnan wrote:
>>>>>> On Sun, Nov 6, 2016 at 2:18 PM, Bernd Edlinger
>>>>>> <bernd.edlin...@hotmail.de> wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> This improves the stack usage on the sha512 test case for the case
>>>>>>> without hardware fpu and without iwmmxt by splitting all di-mode
>>>>>>> patterns right while expanding which is similar to what the
>>>>>>> shift-pattern does. It does nothing in the case iwmmxt and
>>>>>>> fpu=neon or vfp as well as thumb1.
>>>>>>>
>>>>>>
>>>>>> I would go further and do this in the absence of Neon, the VFP unit
>>>>>> being there doesn't help with DImode operations i.e. we do not have
>>>>>> 64 bit integer arithmetic instructions without Neon. The main reason
>>>>>> why we have the DImode patterns split so late is to give a chance for
>>>>>> folks who want to do 64 bit arithmetic in Neon a chance to make this
>>>>>> work as well as support some of the 64 bit Neon intrinsics which IIRC
>>>>>> map down to these instructions. Doing this just for soft-float
>>>>>> doesn't improve the default case only. I don't usually test iwmmxt
>>>>>> and I'm not sure who has the ability to do so, thus keeping this
>>>>>> restriction for iwMMX is fine.
>>>>>>
>>>>>
>>>>> Yes I understand, thanks for pointing that out.
>>>>>
>>>>> I was not aware that iwmmxt exists at all, but I noticed that most
>>>>> 64bit expansions work completely differently, and would break if we
>>>>> split the pattern early.
>>>>>
>>>>> I can however only look at the assembler output for iwmmxt, and make
>>>>> sure that the stack usage does not get worse.
>>>>>
>>>>> Thus the new version of the patch keeps only thumb1, neon and
>>>>> iwmmxt as it is: around 1570 (thumb1), 2300 (neon) and 2200 (iwmmxt)
>>>>> bytes stack for the test cases, and vfp and soft-float at around
>>>>> 270 bytes stack usage.
>>>>>
>>>>>>> It reduces the stack usage from 2300 to near optimal 272 bytes (!).
>>>>>>>
>>>>>>> Note this also splits many ldrd/strd instructions and therefore I
>>>>>>> will post a followup-patch that mitigates this effect by enabling
>>>>>>> the ldrd/strd peephole optimization after the necessary reg-testing.
>>>>>>>
>>>>>>>
>>>>>>> Bootstrapped and reg-tested on arm-linux-gnueabihf.
>>>>>>
>>>>>> What do you mean by arm-linux-gnueabihf - when folks say that I
>>>>>> interpret it as --with-arch=armv7-a --with-float=hard
>>>>>> --with-fpu=vfpv3-d16 or (--with-fpu=neon).
>>>>>>
>>>>>> If you've really bootstrapped and regtested it on armhf, doesn't this
>>>>>> patch as it stands have no effect there i.e. no change ?
>>>>>> arm-linux-gnueabihf usually means to me someone has configured with
>>>>>> --with-float=hard, so there are no regressions in the hard float ABI
>>>>>> case,
>>>>>>
>>>>>
>>>>> I know it proves little. When I say arm-linux-gnueabihf
>>>>> I do in fact mean --enable-languages=all,ada,go,obj-c++
>>>>> --with-arch=armv7-a --with-tune=cortex-a9 --with-fpu=vfpv3-d16
>>>>> --with-float=hard.
>>>>>
>>>>> My main interest in the stack usage is of course not because of linux,
>>>>> but because of eCos where we have very small task stacks and in fact
>>>>> no fpu support by the O/S at all, so that patch is exactly what we
>>>>> need.
>>>>>
>>>>>
>>>>> Bootstrapped and reg-tested on arm-linux-gnueabihf
>>>>> Is it OK for trunk?
>>>>>
>>>>>
>>>>> Thanks
>>>>> Bernd.
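
For readers following the thread, here is a rough C sketch of what splitting a DImode operation into two SImode halves amounts to for the operations named above (add, sub, and, ior, xor, one's complement). It is only an illustration under stated assumptions: the names di_pair, di_from_u64, di_to_u64 and split_* are hypothetical and do not appear in the patch, which works on the ARM machine-description patterns rather than on C code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: a 64-bit (DImode) value held as two 32-bit (SImode)
   halves, as it would occupy a register pair on a 32-bit ARM core. */
typedef struct { uint32_t lo, hi; } di_pair;

static di_pair di_from_u64(uint64_t v)
{
    di_pair p = { (uint32_t)v, (uint32_t)(v >> 32) };
    return p;
}

static uint64_t di_to_u64(di_pair p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* 64-bit add as two 32-bit adds: the low add produces a carry
   that the high add consumes (compare adds/adc). */
static di_pair split_add(di_pair a, di_pair b)
{
    di_pair r;
    r.lo = a.lo + b.lo;
    uint32_t carry = r.lo < a.lo;      /* unsigned wrap-around means carry out */
    r.hi = a.hi + b.hi + carry;
    return r;
}

/* 64-bit subtract as two 32-bit subtracts with a borrow (compare subs/sbc). */
static di_pair split_sub(di_pair a, di_pair b)
{
    di_pair r;
    uint32_t borrow = a.lo < b.lo;
    r.lo = a.lo - b.lo;
    r.hi = a.hi - b.hi - borrow;
    return r;
}

/* The logical operations need no carry: each half is independent. */
static di_pair split_and(di_pair a, di_pair b)
{
    di_pair r = { a.lo & b.lo, a.hi & b.hi };
    return r;
}

static di_pair split_ior(di_pair a, di_pair b)
{
    di_pair r = { a.lo | b.lo, a.hi | b.hi };
    return r;
}

static di_pair split_xor(di_pair a, di_pair b)
{
    di_pair r = { a.lo ^ b.lo, a.hi ^ b.hi };
    return r;
}

static di_pair split_not(di_pair a)
{
    di_pair r = { ~a.lo, ~a.hi };
    return r;
}

/* Small self-check comparing the split forms with native 64-bit arithmetic. */
static void check_split_ops(void)
{
    uint64_t x = 0x00000001FFFFFFFFULL, y = 0x0000000000000001ULL;
    assert(di_to_u64(split_add(di_from_u64(x), di_from_u64(y))) == x + y);
    assert(di_to_u64(split_sub(di_from_u64(y), di_from_u64(x))) == y - x);
    assert(di_to_u64(split_xor(di_from_u64(x), di_from_u64(y))) == (x ^ y));
    assert(di_to_u64(split_not(di_from_u64(x))) == ~x);
}
```

Once the halves are visible as separate 32-bit operations early on, the register allocator can treat each half independently, which is presumably where the stack-usage improvement reported above comes from.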