On Sun, 31 Jul 2016, Konstantin Belousov wrote:
On Sun, Jul 31, 2016 at 11:11:25PM +1000, Bruce Evans wrote:
I said that I didn't replace (sse2) pagecopy() by bcopy() on amd64 for
Haswell. Actually I do, for a small improvement on makeworld. i386
doesn't have (sse*) pagecopy() except in
On Sun, Jul 31, 2016 at 11:11:25PM +1000, Bruce Evans wrote:
> On Haswell, "rep stos" takes about 25 cycles to start up, and the function
> call overhead is in the noise. 25 cycles is a lot. Haswell can move
> 32 bytes/cycle from L2 to L2, so it misses moving 800 bytes or 1/5 of a
> page in its
On Sun, Jul 31, 2016 at 06:26:29PM +0300, Slawa Olhovchenkov wrote:
> On Mon, Aug 01, 2016 at 12:30:14AM +1000, Bruce Evans wrote:
>
> > On Sun, 31 Jul 2016, Slawa Olhovchenkov wrote:
> >
> > > On Sun, Jul 31, 2016 at 11:11:25PM +1000, Bruce Evans wrote:
> > >
> > >> Misalignment of this loop
On Mon, Aug 01, 2016 at 12:30:14AM +1000, Bruce Evans wrote:
> On Sun, 31 Jul 2016, Slawa Olhovchenkov wrote:
>
> > On Sun, Jul 31, 2016 at 11:11:25PM +1000, Bruce Evans wrote:
> >
> >> Misalignment of this loop made it almost twice as slow on old Turion2 with
> >> slow DDR2 memory. It made no
On Sun, 31 Jul 2016, Slawa Olhovchenkov wrote:
On Sun, Jul 31, 2016 at 11:11:25PM +1000, Bruce Evans wrote:
Misalignment of this loop made it almost twice as slow on old Turion2 with
slow DDR2 memory. It made no difference on Haswell. I added an extra
movnti, but that makes little or no
On Sun, Jul 31, 2016 at 11:11:25PM +1000, Bruce Evans wrote:
> Misalignment of this loop made it almost twice as slow on old Turion2 with
> slow DDR2 memory. It made no difference on Haswell. I added an extra
> movnti, but that makes little or no difference. 2 more movnti's wouldn't
> fit in
On Sun, 31 Jul 2016, Mateusz Guzik wrote:
Log:
  amd64: implement pagezero using rep stos

  The current implementation uses non-temporal writes. This turns out to
  be detrimental to performance if the page is used shortly after, which
  is the typical case with page faults.

  Switch to rep stos.
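
For readers unfamiliar with the instruction, here is a minimal userland sketch of zeroing one page with rep stos. It is not the committed kernel code. The name pagezero_sketch and the constant SKETCH_PAGE_SIZE (assumed to be 4096) are hypothetical, and the inline-asm path assumes GCC or Clang on x86-64, falling back to memset() elsewhere.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 4 KiB pages, as on amd64 with the default page size. */
#define SKETCH_PAGE_SIZE 4096

/*
 * Hypothetical helper: zero one page using "rep stosq".
 * Unlike movnti, these are ordinary (temporal) stores, so the zeroed
 * lines can remain in the cache for a consumer that touches the page
 * soon afterwards.
 */
static void
pagezero_sketch(void *page)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
	void *dst = page;                       /* %rdi: destination */
	size_t count = SKETCH_PAGE_SIZE / 8;    /* %rcx: number of 8-byte stores */

	/* The ABI guarantees the direction flag is clear on function entry. */
	__asm__ volatile("rep stosq"
	    : "+D" (dst), "+c" (count)
	    : "a" ((uint64_t)0)                 /* %rax: value to store */
	    : "memory");
#else
	/* Portable fallback for other compilers and architectures. */
	memset(page, 0, SKETCH_PAGE_SIZE);
#endif
}
```

Whether this beats a non-temporal loop depends on the CPU and on how soon the page is reused, which is what the rest of the thread argues about.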
Author: mjg
Date: Sun Jul 31 11:34:08 2016
New Revision: 303583
URL: https://svnweb.freebsd.org/changeset/base/303583
Log:
  amd64: implement pagezero using rep stos

  The current implementation uses non-temporal writes. This turns out to
  be detrimental to performance if the page is used