Re: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator
On Thu, 2020-03-26 at 08:26 -0500, Eric Blake wrote:
> On 3/25/20 9:09 PM, Hu, Robert wrote:
> > (Don't know why my Linux-Evolution missed this mail.)
> > > -----Original Message-----
> >
> > > Long line; it's nice to wrap commit messages around column 70 or
> > > so (because reading 'git log' in an 80-column window adds
> > > indentation).
> > >
> >
> > [Hu, Robert]
> > I think I set my vim on wrap. This probably escaped by paste.
> > I ran checkpatch.pl on the patches before sending. It escaped check
> > but didn't escaped your eagle eye😊 Thank you.
>
> checkpatch doesn't flag commit message long lines. Maybe it could be
> patched to do so, but it's not at the top of my list to write that
> patch.
>
> > >
> > > > I just fix a boudary case on his original patch.
> > >
> > > boundary
> >
> > [Hu, Robert]
> > Emm... again spell error. Usually I would paste descriptions into
> > some editors with spell check, but forgot this time.
> > Vim doesn't have spell check I think. What editor would you suggest
> > me to integrate with git editing?
>
> I'm an emacs user, so I have no suggestions for vim, but I'd be very
> surprised if there were not some vim expert online that could figure
> out how to wire in a spell-checker to vim. Google quickly finds:
> https://www.ostechnix.com/use-spell-check-feature-vim-text-editor/
>

nice, thanks:)
Re: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator
On 3/25/20 9:09 PM, Hu, Robert wrote:
> (Don't know why my Linux-Evolution missed this mail.)
> > -----Original Message-----
>
> > Long line; it's nice to wrap commit messages around column 70 or so
> > (because reading 'git log' in an 80-column window adds indentation).
>
> [Hu, Robert]
> I think I set my vim on wrap. This probably escaped by paste.
> I ran checkpatch.pl on the patches before sending. It escaped check
> but didn't escaped your eagle eye😊 Thank you.

checkpatch doesn't flag commit message long lines. Maybe it could be
patched to do so, but it's not at the top of my list to write that
patch.

> > > I just fix a boudary case on his original patch.
> >
> > boundary
>
> [Hu, Robert]
> Emm... again spell error. Usually I would paste descriptions into some
> editors with spell check, but forgot this time.
> Vim doesn't have spell check I think. What editor would you suggest me
> to integrate with git editing?

I'm an emacs user, so I have no suggestions for vim, but I'd be very
surprised if there were not some vim expert online that could figure out
how to wire in a spell-checker to vim. Google quickly finds:
https://www.ostechnix.com/use-spell-check-feature-vim-text-editor/

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
Re: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator
On 26/03/20 03:09, Hu, Robert wrote:
> BTW, do I need to resend these 2 patches?

No, thanks! I have queued them.

Paolo
RE: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator
(Don't know why my Linux-Evolution missed this mail.)

> -----Original Message-----
> From: Eric Blake
> Sent: Wednesday, March 25, 2020 20:54
> To: Robert Hoo ; qemu-devel@nongnu.org;
> pbonz...@redhat.com; richard.hender...@linaro.org
> Cc: Hu, Robert
> Subject: Re: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator
>
> On 3/25/20 1:50 AM, Robert Hoo wrote:
> > By increasing avx2 length_to_accel to 128, we can simplify its logic
> > and reduce a branch.
> >
> > The authorship of this patch actually belongs to Richard Henderson
> > ,
>
> Long line; it's nice to wrap commit messages around column 70 or so (because
> reading 'git log' in an 80-column window adds indentation).
>

[Hu, Robert]
I think I set my vim on wrap. This probably escaped by paste.
I ran checkpatch.pl on the patches before sending. It escaped check but didn't
escaped your eagle eye😊 Thank you.

> > I just fix a boudary case on his original patch.
>
> boundary

[Hu, Robert]
Emm... again spell error. Usually I would paste descriptions into some editors
with spell check, but forgot this time.
Vim doesn't have spell check I think. What editor would you suggest me to
integrate with git editing?

BTW, do I need to resend these 2 patches?

> >
> > Suggested-by: Richard Henderson
> > Signed-off-by: Robert Hoo
> > ---
> > util/bufferiszero.c | 26 +++++++++-----------------
> > 1 file changed, 9 insertions(+), 17 deletions(-)
> >
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc. +1-919-301-3226
> Virtualization: qemu.org | libvirt.org
Re: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator
On 3/25/20 1:50 AM, Robert Hoo wrote:
> By increasing avx2 length_to_accel to 128, we can simplify its logic
> and reduce a branch.
>
> The authorship of this patch actually belongs to Richard Henderson ,

Long line; it's nice to wrap commit messages around column 70 or so
(because reading 'git log' in an 80-column window adds indentation).

> I just fix a boudary case on his original patch.

boundary

>
> Suggested-by: Richard Henderson
> Signed-off-by: Robert Hoo
> ---
> util/bufferiszero.c | 26 +++++++++-----------------
> 1 file changed, 9 insertions(+), 17 deletions(-)
>

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
[PATCH 2/2] util/bufferiszero: improve avx2 accelerator
By increasing avx2 length_to_accel to 128, we can simplify its logic and reduce a branch.

The authorship of this patch actually belongs to Richard Henderson , I just fix a boudary case on his original patch.

Suggested-by: Richard Henderson
Signed-off-by: Robert Hoo
---
 util/bufferiszero.c | 26 +++++++++-----------------
 1 file changed, 9 insertions(+), 17 deletions(-)

diff --git a/util/bufferiszero.c b/util/bufferiszero.c
index b801253..695bb4c 100644
--- a/util/bufferiszero.c
+++ b/util/bufferiszero.c
@@ -158,27 +158,19 @@ buffer_zero_avx2(const void *buf, size_t len)
     __m256i *p = (__m256i *)(((uintptr_t)buf + 5 * 32) & -32);
     __m256i *e = (__m256i *)(((uintptr_t)buf + len) & -32);
 
-    if (likely(p <= e)) {
-        /* Loop over 32-byte aligned blocks of 128. */
-        do {
-            __builtin_prefetch(p);
-            if (unlikely(!_mm256_testz_si256(t, t))) {
-                return false;
-            }
-            t = p[-4] | p[-3] | p[-2] | p[-1];
-            p += 4;
-        } while (p <= e);
-    } else {
-        t |= _mm256_loadu_si256(buf + 32);
-        if (len <= 128) {
-            goto last2;
+    /* Loop over 32-byte aligned blocks of 128. */
+    while (p <= e) {
+        __builtin_prefetch(p);
+        if (unlikely(!_mm256_testz_si256(t, t))) {
+            return false;
         }
-    }
+        t = p[-4] | p[-3] | p[-2] | p[-1];
+        p += 4;
+    } ;
 
     /* Finish the last block of 128 unaligned. */
     t |= _mm256_loadu_si256(buf + len - 4 * 32);
     t |= _mm256_loadu_si256(buf + len - 3 * 32);
- last2:
     t |= _mm256_loadu_si256(buf + len - 2 * 32);
     t |= _mm256_loadu_si256(buf + len - 1 * 32);
 
@@ -263,7 +255,7 @@ static void init_accel(unsigned cache)
     }
     if (cache & CACHE_AVX2) {
         fn = buffer_zero_avx2;
-        length_to_accel = 64;
+        length_to_accel = 128;
     }
 #endif
 #ifdef CONFIG_AVX512F_OPT
-- 
1.8.3.1
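
For readers following the pointer arithmetic in the hunk above, here is a rough scalar model of the simplified routine. It is only a sketch, not the QEMU code: or32() and model_buffer_zero() are hypothetical names, each 256-bit load is replaced by a byte-wise OR over 32 bytes, and the initial value of t (the first 32 bytes of the buffer, loaded unaligned) is assumed from context because that line is outside the diff. The point it tries to show is why the caller must guarantee len >= 128: the four unaligned tail loads start at buf + len - 128, which would fall before buf for a shorter buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one 256-bit load: OR of the 32 bytes at addr. */
static unsigned or32(uintptr_t addr)
{
    const unsigned char *q = (const unsigned char *)addr;
    unsigned acc = 0;
    for (int i = 0; i < 32; i++) {
        acc |= q[i];
    }
    return acc;
}

/* Scalar model of the simplified loop; illustrative only. */
static bool model_buffer_zero(const void *buf, size_t len)
{
    uintptr_t b = (uintptr_t)buf;

    /* Mirrors length_to_accel = 128: shorter buffers must not get here. */
    assert(len >= 128);

    /* Assumed starting state: the first 32 bytes, loaded unaligned. */
    unsigned t = or32(b);

    /* Same rounding as "& -32" in the patch, in byte addresses. */
    uintptr_t p = (b + 5 * 32) & ~(uintptr_t)31;
    uintptr_t e = (b + len) & ~(uintptr_t)31;

    /* Aligned 128-byte blocks; may run zero times when len is small. */
    while (p <= e) {
        if (t) {
            return false;
        }
        t = or32(p - 128) | or32(p - 96) | or32(p - 64) | or32(p - 32);
        p += 128;
    }

    /* Last 128 bytes, unaligned; in bounds only because len >= 128. */
    t |= or32(b + len - 4 * 32);
    t |= or32(b + len - 3 * 32);
    t |= or32(b + len - 2 * 32);
    t |= or32(b + len - 1 * 32);

    return t == 0;
}
```

When the loop body never runs, the first 32 bytes plus the 128-byte tail still cover the whole buffer, since the tail then starts at or before buf + 32. That appears to be why the removed else branch and the last2 label are no longer needed once short buffers are routed away from this function.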