Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-03-22 Thread Ben RUBSON
> On 01 Mar 2017, at 21:57, Conrad Meyer wrote: > > Hi Bruce, > > On my laptop (Intel(R) Core(TM) i5-3320M CPU — Ivy Bridge) I still see > a little worse performance with this patch. Hi Bruce & Conrad, I gave both patches a try. It's a real use case, iSCSI throughput. Both

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-03-02 Thread Bruce Evans
On Wed, 1 Mar 2017, Conrad Meyer wrote: On Wed, Mar 1, 2017 at 9:27 PM, Bruce Evans wrote: On Wed, 1 Mar 2017, Conrad Meyer wrote: On my laptop (Intel(R) Core(TM) i5-3320M CPU -- Ivy Bridge) I still see a little worse performance with this patch. Please excuse the

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-03-01 Thread Conrad Meyer
On Wed, Mar 1, 2017 at 9:27 PM, Bruce Evans wrote: > On Wed, 1 Mar 2017, Conrad Meyer wrote: > >> On my laptop (Intel(R) Core(TM) i5-3320M CPU — Ivy Bridge) I still see >> a little worse performance with this patch. Please excuse the ugly >> graphs, I don't have a better

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-03-01 Thread Ben RUBSON
> On 02 Mar 2017, at 06:27, Bruce Evans wrote: > > On Wed, 1 Mar 2017, Conrad Meyer wrote: > >> On my laptop (Intel(R) Core(TM) i5-3320M CPU — Ivy Bridge) I still see >> a little worse performance with this patch. Please excuse the ugly >> graphs, I don't have a better

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-03-01 Thread Bruce Evans
On Wed, 1 Mar 2017, Conrad Meyer wrote: On my laptop (Intel(R) Core(TM) i5-3320M CPU -- Ivy Bridge) I still see a little worse performance with this patch. Please excuse the ugly graphs, I don't have a better graphing tool set up at this time:

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-03-01 Thread Conrad Meyer
Hi Bruce, On my laptop (Intel(R) Core(TM) i5-3320M CPU — Ivy Bridge) I still see a little worse performance with this patch. Please excuse the ugly graphs, I don't have a better graphing tool set up at this time: https://people.freebsd.org/~cem/crc32/sse42_bde.png

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-02-27 Thread Bruce Evans
On Mon, 27 Feb 2017, Conrad Meyer wrote: On Thu, Feb 2, 2017 at 12:29 PM, Bruce Evans wrote: I've almost finished fixing and optimizing this. I didn't manage to fix all the compiler pessimizations, but the result is within 5% of optimal for buffers larger than a few K.

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-02-27 Thread Conrad Meyer
On Thu, Feb 2, 2017 at 12:29 PM, Bruce Evans wrote: > I've almost finished fixing and optimizing this. I didn't manage to fix > all the compiler pessimizations, but the result is within 5% of optimal > for buffers larger than a few K. Hi Bruce, Did you ever get to a final

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-02-23 Thread Ben RUBSON
Hi guys, Conrad, Bruce, May I ask for news regarding this, please? For more than 3 weeks now I have been running Conrad's commit on 2 CRC32C-digest-enabled iSCSI initiators/targets without issue :) Thank you very much again for this! Shall we then think about "fixing" the last one or two remaining

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-02-02 Thread Bruce Evans
On Thu, 2 Feb 2017, Konstantin Belousov wrote: On Tue, Jan 31, 2017 at 03:26:32AM +, Conrad E. Meyer wrote: + compile-with "${CC} -c ${CFLAGS:N-nostdinc} ${WERROR} ${PROF} -msse4 ${.IMPSRC}" \ BTW, new gcc has -mcrc32 option, but clang 3.9.1 apparently does not. I've almost

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-02-02 Thread Konstantin Belousov
On Tue, Jan 31, 2017 at 03:26:32AM +, Conrad E. Meyer wrote: > + compile-with "${CC} -c ${CFLAGS:N-nostdinc} ${WERROR} ${PROF} -msse4 ${.IMPSRC}" \ BTW, new gcc has -mcrc32 option, but clang 3.9.1 apparently does not.

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-02-01 Thread Bruce Evans
On Tue, 31 Jan 2017, Conrad Meyer wrote: On Tue, Jan 31, 2017 at 7:16 PM, Bruce Evans wrote: Another reply to this... On Tue, 31 Jan 2017, Conrad Meyer wrote: On Tue, Jan 31, 2017 at 7:36 AM, Bruce Evans wrote: On Tue, 31 Jan 2017, Bruce Evans

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-01-31 Thread Conrad Meyer
On Tue, Jan 31, 2017 at 7:16 PM, Bruce Evans wrote: > Another reply to this... > > On Tue, 31 Jan 2017, Conrad Meyer wrote: > >> On Tue, Jan 31, 2017 at 7:36 AM, Bruce Evans wrote: >>> >>> On Tue, 31 Jan 2017, Bruce Evans wrote: >>> I >>> think there

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-01-31 Thread Bruce Evans
Another reply to this... On Tue, 31 Jan 2017, Conrad Meyer wrote: On Tue, Jan 31, 2017 at 7:36 AM, Bruce Evans wrote: On Tue, 31 Jan 2017, Bruce Evans wrote: Unrolling (or not) may be helpful or harmful for entry and exit code. Helpful, per my earlier benchmarks. I

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-01-31 Thread Bruce Evans
On Tue, 31 Jan 2017, Conrad Meyer wrote: On Tue, Jan 31, 2017 at 7:36 AM, Bruce Evans wrote: On Tue, 31 Jan 2017, Bruce Evans wrote: Unrolling (or not) may be helpful or harmful for entry and exit code. Helpful, per my earlier benchmarks. I think there should be no

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-01-31 Thread Conrad Meyer
On Tue, Jan 31, 2017 at 7:36 AM, Bruce Evans wrote: > On Tue, 31 Jan 2017, Bruce Evans wrote: > Unrolling (or not) may be helpful or harmful for entry and exit code. Helpful, per my earlier benchmarks. > I > think there should be no alignment on entry -- just assume the

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-01-31 Thread Bruce Evans
On Tue, 31 Jan 2017, Bruce Evans wrote: On Mon, 30 Jan 2017, Conrad Meyer wrote: On Mon, Jan 30, 2017 at 9:26 PM, Bruce Evans wrote: On Tue, 31 Jan 2017, Conrad E. Meyer wrote: Log: calculate_crc32c: Add SSE4.2 implementation on x86 This breaks building with

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-01-31 Thread Bruce Evans
On Mon, 30 Jan 2017, Conrad Meyer wrote: On Mon, Jan 30, 2017 at 9:26 PM, Bruce Evans wrote: On Tue, 31 Jan 2017, Conrad E. Meyer wrote: Log: calculate_crc32c: Add SSE4.2 implementation on x86 This breaks building with gcc-4.2.1, gcc-4.2.1 is an ancient compiler.

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-01-30 Thread Conrad Meyer
Hi Bruce, On Mon, Jan 30, 2017 at 9:26 PM, Bruce Evans wrote: > On Tue, 31 Jan 2017, Conrad E. Meyer wrote: > >> Log: >> calculate_crc32c: Add SSE4.2 implementation on x86 > > > This breaks building with gcc-4.2.1, gcc-4.2.1 is an ancient compiler. Good riddance. >>

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-01-30 Thread Bruce Evans
On Tue, 31 Jan 2017, Conrad E. Meyer wrote: Log: calculate_crc32c: Add SSE4.2 implementation on x86 This breaks building with gcc-4.2.1, and depends on using non-kernel clang headers for clang. Modified: head/sys/conf/files.amd64

Re: svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-01-30 Thread Conrad Meyer
On Mon, Jan 30, 2017 at 7:26 PM, Conrad E. Meyer wrote: > (The CRC instruction takes 1 cycle but has 2-3 cycles of latency.) My mistake, it's not 2 anywhere. It's just 3 cycles on all workstation/server CPUs since Nehalem. Different on Atom chips and AMD. Best, Conrad
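The latency figures above are the reason the committed fast loop runs three CRCs at once: with a 3-cycle latency and one instruction issued per cycle, three independent dependency chains can keep the CRC unit busy. Below is a minimal portable sketch of that structure, not the code from r313006. It uses a bitwise software CRC32C in place of the SSE4.2 instruction and shifts by feeding zero bytes rather than using precomputed tables. The names crc32c_raw, crc32c_shift_zeros, crc32c_simple and crc32c_three_way are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected CRC32C (Castagnoli) polynomial. */
#define CRC32C_POLY 0x82F63B78u

/* Advance the raw CRC register over len bytes (no init or final inversion). */
static uint32_t
crc32c_raw(uint32_t crc, const unsigned char *buf, size_t len)
{
	while (len--) {
		crc ^= *buf++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (CRC32C_POLY & (0u - (crc & 1u)));
	}
	return (crc);
}

/* Advance the register as if len zero bytes had been appended. */
static uint32_t
crc32c_shift_zeros(uint32_t crc, size_t len)
{
	while (len--)
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (CRC32C_POLY & (0u - (crc & 1u)));
	return (crc);
}

/* Reference: one dependency chain over the whole buffer. */
uint32_t
crc32c_simple(const void *buf, size_t len)
{
	return (~crc32c_raw(~0u, buf, len));
}

/*
 * Three-way variant: split the buffer into three equal blocks plus a tail.
 * crc0 continues the running CRC over block 0; crc1 and crc2 start from
 * zero over blocks 1 and 2.  The three updates in the loop do not depend
 * on each other, which is what lets a pipelined CRC unit overlap them.
 */
uint32_t
crc32c_three_way(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t n = len / 3;
	uint32_t crc0 = ~0u, crc1 = 0, crc2 = 0;

	for (size_t i = 0; i < n; i++) {
		crc0 = crc32c_raw(crc0, p + i, 1);
		crc1 = crc32c_raw(crc1, p + n + i, 1);
		crc2 = crc32c_raw(crc2, p + 2 * n + i, 1);
	}
	/* CRC is linear over GF(2): shift past a block, then XOR its CRC in. */
	crc0 = crc32c_shift_zeros(crc0, n) ^ crc1;
	crc0 = crc32c_shift_zeros(crc0, n) ^ crc2;
	/* Finish the 0-2 leftover bytes on the single chain. */
	crc0 = crc32c_raw(crc0, p + 3 * n, len - 3 * n);
	return (~crc0);
}
```

Both functions should return the same value for any buffer. The zero-byte shift here costs as much as the CRC itself, so the sketch only shows why the combination is correct; a practical version would need a fast shift (for example table lookups) for the split to pay off.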

svn commit: r313006 - in head: sys/conf sys/libkern sys/libkern/x86 sys/sys tests/sys/kern

2017-01-30 Thread Conrad E. Meyer
Author: cem
Date: Tue Jan 31 03:26:32 2017
New Revision: 313006
URL: https://svnweb.freebsd.org/changeset/base/313006
Log: calculate_crc32c: Add SSE4.2 implementation on x86
Derived from an implementation by Mark Adler. The fast loop performs three simultaneous CRCs over subsets of