From: David Miller
Date: Sun, 14 Apr 2013 21:35:27 -0400 (EDT)
> This will unpack into a directory named "sol2_test", just 'cd' into
> there and run "make" on the Solaris machine that showed all of these
> problems. After the target objects are all made please tar up the
> result and send it to
From: Torbjorn Granlund
Date: Sun, 14 Apr 2013 23:26:31 +0200
> David Miller writes:
>
> Sure, let's revert v9/sqr_diagonal.asm and sparc64/gcd_1.asm back to
> their previous state for now, and try to work from that. Here's a
> patch.
>
> 2013-04-14 David S. Miller
>
> *
David Miller writes:
Since using rdpc avoids the whole issue of corrupting the return
address stack, it seems pretty desirable to move over to it.
Let's do it.
We'll see a slight slowdown for T3, but its general slowness will
probably make this new slowdown almost unnoticeable.
--
Torbjö
David Miller writes:
Sure, let's revert v9/sqr_diagonal.asm and sparc64/gcd_1.asm back to
their previous state for now, and try to work from that. Here's a
patch.
2013-04-14 David S. Miller
* mpn/sparc32/v9/sqr_diagonal.asm: Revert LEA and INT32 changes.
* mpn/sp
From: Torbjorn Granlund
Date: Sun, 14 Apr 2013 19:21:36 +0200
> T3 and T4 are of course quite relevant, so we should take these into
> account. If they run rdpc no slower than the thunk call, then we should
> use rdpc unconditionally.
>
> I used this test program:
Ok, on T4, %pc reads are defi
From: Torbjorn Granlund
Date: Sun, 14 Apr 2013 17:55:31 +0200
> I think we need to consider backing out some of the changes, to restore
> GMP's function on sparc to non-GNU/Linux systems (and perhaps to
> obsolete GNU/Linux systems). We need to keep in mind the symbol
> reference code was tried
From: Torbjorn Granlund
Date: Sun, 14 Apr 2013 19:21:36 +0200
> I tried some timing of call to a pc loading thunk versus an rdpc
> instruction. Approximate cycle counts:
>
>      rdpc  thunk
> US2     5      2
> US3     6      6
> T1      6     10
>
> I assume US1=US2, US3=US4, and T
I tried some timing of call to a pc loading thunk versus an rdpc
instruction. Approximate cycle counts:
     rdpc  thunk
US2     5      2
US3     6      6
T1      6     10
I assume US1=US2, US3=US4, and T1=T2. US1, US2 are the least relevant
machines, and the only ones where I could s
David Miller writes:
So here is what I have right now.
My guess for the sqr_diagonal.asm failures on 32-bit
Solaris/Sparc is that something is wrong with INT32
or W32 (which INT32 uses) on Solaris.
Please give it a go.
2013-04-13 David S. Miller
* mpn/sparc32/v9/sq