Re: PING^3 [PATCH] x86: Add cmpmemsi for -minline-all-stringops

2020-10-23 Thread H.J. Lu via Gcc-patches
On Sun, Oct 18, 2020 at 8:16 AM Jan Hubicka  wrote:
>
> > On Fri, Oct 2, 2020 at 6:21 AM H.J. Lu  wrote:
> > >
> > > On Wed, Sep 16, 2020 at 10:07 PM H.J. Lu  wrote:
> > > >
> > > > On Wed, Aug 19, 2020 at 6:09 AM H.J. Lu  wrote:
> > > > >
> > > > > On Tue, May 19, 2020 at 5:14 AM H.J. Lu  wrote:
> > > > > >
> > > > > > On Tue, May 19, 2020 at 1:48 AM Uros Bizjak  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Sun, May 17, 2020 at 7:06 PM H.J. Lu  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > Duplicate the cmpstrn pattern for cmpmem.  The only difference 
> > > > > > > > is that
> > > > > > > > the length argument of cmpmem is guaranteed to be less than or 
> > > > > > > > equal to
> > > > > > > > lengths of 2 memory areas.  Since "repz cmpsb" can be much 
> > > > > > > > slower than
> > > > > > > > memcmp function implemented with vector instruction, see
> > > > > > > >
> > > > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> > > > > > > >
> > > > > > > > expand cmpmem to "repz cmpsb" only with -mgeneral-regs-only.
> > > > > > >
> > > > > > > If there is no benefit compared to the library implementation, 
> > > > > > > then
> > > > > > > enable these patterns only when -minline-all-stringops is used.
> > > > > >
> > > > > > Fixed.
> > > > > >
> > > > > > > Eventually these should be reimplemented with SSE4 string 
> > > > > > > instructions.
> > > > > > >
> > > > > > > Honza is the author of the block handling x86 system, I'll leave 
> > > > > > > the
> > > > > > > review to him.
> > > > > >
> > > > > > We used to expand memcmp to "repz cmpsb" via cmpstrnsi.  It was 
> > > > > > changed
> > > > > > by
> > > > > >
> > > > > > commit 9b0f6f5e511ca512e4faeabc81d2fd3abad9b02f
> > > > > > Author: Nick Clifton 
> > > > > > Date:   Fri Aug 12 16:26:11 2011 +
> > > > > >
> > > > > > builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi 
> > > > > > pattern.
> > > > > >
> > > > > > * builtins.c (expand_builtin_memcmp): Do not use 
> > > > > > cmpstrnsi
> > > > > > pattern.
> > > > > > * doc/md.texi (cmpstrn): Note that the comparison stops 
> > > > > > if both
> > > > > > fetched bytes are zero.
> > > > > > (cmpstr): Likewise.
> > > > > > (cmpmem): Note that the comparison does not stop if 
> > > > > > both of the
> > > > > > fetched bytes are zero.
> > > > > >
> > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95151
> > > > > >
> > > > > > is a regression.
> > > > > >
> > > > > > Honza, can you take a look at this?
> > > > > >
> > > > >
> > > > > PING:
> > > > >
> > > > > https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546921.html
> > > > >
> > > >
> > > > PING.
> > > >
> > >
> > > PING.
> > >
> >
> > I'd like to check it in next Tuesday if there are no comments.
>
> I still plan to introduce the two-level optimize_for_size predicates.
> Will try to do that by Tuesday.

Any updates on this?  My patch is still needed to generate cmpmemsi
with -minline-all-stringops.  If there are no comments, I will check it in
next Monday.

--
H.J.


Re: PING^3 [PATCH] x86: Add cmpmemsi for -minline-all-stringops

2020-10-18 Thread H.J. Lu via Gcc-patches
On Sun, Oct 18, 2020 at 8:16 AM Jan Hubicka  wrote:
>
> > On Fri, Oct 2, 2020 at 6:21 AM H.J. Lu  wrote:
> > >
> > > On Wed, Sep 16, 2020 at 10:07 PM H.J. Lu  wrote:
> > > >
> > > > On Wed, Aug 19, 2020 at 6:09 AM H.J. Lu  wrote:
> > > > >
> > > > > On Tue, May 19, 2020 at 5:14 AM H.J. Lu  wrote:
> > > > > >
> > > > > > On Tue, May 19, 2020 at 1:48 AM Uros Bizjak  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Sun, May 17, 2020 at 7:06 PM H.J. Lu  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > Duplicate the cmpstrn pattern for cmpmem.  The only difference 
> > > > > > > > is that
> > > > > > > > the length argument of cmpmem is guaranteed to be less than or 
> > > > > > > > equal to
> > > > > > > > lengths of 2 memory areas.  Since "repz cmpsb" can be much 
> > > > > > > > slower than
> > > > > > > > memcmp function implemented with vector instruction, see
> > > > > > > >
> > > > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> > > > > > > >
> > > > > > > > expand cmpmem to "repz cmpsb" only with -mgeneral-regs-only.
> > > > > > >
> > > > > > > If there is no benefit compared to the library implementation, 
> > > > > > > then
> > > > > > > enable these patterns only when -minline-all-stringops is used.
> > > > > >
> > > > > > Fixed.
> > > > > >
> > > > > > > Eventually these should be reimplemented with SSE4 string 
> > > > > > > instructions.
> > > > > > >
> > > > > > > Honza is the author of the block handling x86 system, I'll leave 
> > > > > > > the
> > > > > > > review to him.
> > > > > >
> > > > > > We used to expand memcmp to "repz cmpsb" via cmpstrnsi.  It was 
> > > > > > changed
> > > > > > by
> > > > > >
> > > > > > commit 9b0f6f5e511ca512e4faeabc81d2fd3abad9b02f
> > > > > > Author: Nick Clifton 
> > > > > > Date:   Fri Aug 12 16:26:11 2011 +
> > > > > >
> > > > > > builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi 
> > > > > > pattern.
> > > > > >
> > > > > > * builtins.c (expand_builtin_memcmp): Do not use 
> > > > > > cmpstrnsi
> > > > > > pattern.
> > > > > > * doc/md.texi (cmpstrn): Note that the comparison stops 
> > > > > > if both
> > > > > > fetched bytes are zero.
> > > > > > (cmpstr): Likewise.
> > > > > > (cmpmem): Note that the comparison does not stop if 
> > > > > > both of the
> > > > > > fetched bytes are zero.
> > > > > >
> > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95151
> > > > > >
> > > > > > is a regression.
> > > > > >
> > > > > > Honza, can you take a look at this?
> > > > > >
> > > > >
> > > > > PING:
> > > > >
> > > > > https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546921.html
> > > > >
> > > >
> > > > PING.
> > > >
> > >
> > > PING.
> > >
> >
> > I'd like to check it in next Tuesday if there are no comments.
>
> I still plan to introduce the two-level optimize_for_size predicates.
> Will try to do that by Tuesday.
>

Thanks.

BTW, this patch is about inlining memcmp with -minline-all-stringops.
It is very important for user interrupt (UINTR) code not to call memcmp,
since memcmp in glibc uses vector registers, which shouldn't be used in
user interrupt code.
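
A minimal sketch of the kind of code this is about.  compare_blocks is
a made-up name and this is not one of the pr95151 tests; it only shows a
memcmp call with a non-constant length, which is the case that currently
ends up as a library call:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example, not taken from the patch or its testsuite.
   The length is not a compile-time constant, so the compiler cannot
   fold the comparison and has to either call memcmp or expand it
   inline.  */
int
compare_blocks (const void *a, const void *b, size_t n)
{
  return memcmp (a, b, n);
}
```

Assuming the patch is applied, compiling this with something like
"gcc -O2 -minline-all-stringops -S" is expected to produce "repz cmpsb"
rather than a call to memcmp.  Code that must stay out of vector
registers would presumably also be built with -mgeneral-regs-only.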

-- 
H.J.


Re: PING^3 [PATCH] x86: Add cmpmemsi for -minline-all-stringops

2020-10-18 Thread Jan Hubicka
> On Fri, Oct 2, 2020 at 6:21 AM H.J. Lu  wrote:
> >
> > On Wed, Sep 16, 2020 at 10:07 PM H.J. Lu  wrote:
> > >
> > > On Wed, Aug 19, 2020 at 6:09 AM H.J. Lu  wrote:
> > > >
> > > > On Tue, May 19, 2020 at 5:14 AM H.J. Lu  wrote:
> > > > >
> > > > > On Tue, May 19, 2020 at 1:48 AM Uros Bizjak  wrote:
> > > > > >
> > > > > > On Sun, May 17, 2020 at 7:06 PM H.J. Lu  wrote:
> > > > > > >
> > > > > > > Duplicate the cmpstrn pattern for cmpmem.  The only difference is 
> > > > > > > that
> > > > > > > the length argument of cmpmem is guaranteed to be less than or 
> > > > > > > equal to
> > > > > > > lengths of 2 memory areas.  Since "repz cmpsb" can be much slower 
> > > > > > > than
> > > > > > > memcmp function implemented with vector instruction, see
> > > > > > >
> > > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> > > > > > >
> > > > > > > expand cmpmem to "repz cmpsb" only with -mgeneral-regs-only.
> > > > > >
> > > > > > If there is no benefit compared to the library implementation, then
> > > > > > enable these patterns only when -minline-all-stringops is used.
> > > > >
> > > > > Fixed.
> > > > >
> > > > > > Eventually these should be reimplemented with SSE4 string 
> > > > > > instructions.
> > > > > >
> > > > > > Honza is the author of the block handling x86 system, I'll leave the
> > > > > > review to him.
> > > > >
> > > > > We used to expand memcmp to "repz cmpsb" via cmpstrnsi.  It was 
> > > > > changed
> > > > > by
> > > > >
> > > > > commit 9b0f6f5e511ca512e4faeabc81d2fd3abad9b02f
> > > > > Author: Nick Clifton 
> > > > > Date:   Fri Aug 12 16:26:11 2011 +
> > > > >
> > > > > builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi pattern.
> > > > >
> > > > > * builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi
> > > > > pattern.
> > > > > * doc/md.texi (cmpstrn): Note that the comparison stops 
> > > > > if both
> > > > > fetched bytes are zero.
> > > > > (cmpstr): Likewise.
> > > > > (cmpmem): Note that the comparison does not stop if both 
> > > > > of the
> > > > > fetched bytes are zero.
> > > > >
> > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95151
> > > > >
> > > > > is a regression.
> > > > >
> > > > > Honza, can you take a look at this?
> > > > >
> > > >
> > > > PING:
> > > >
> > > > https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546921.html
> > > >
> > >
> > > PING.
> > >
> >
> > PING.
> >
> 
> I'd like to check it in next Tuesday if there are no comments.

I still plan to introduce the two-level optimize_for_size predicates.
Will try to do that by Tuesday.

Honza
> 
> -- 
> H.J.


Re: PING^3 [PATCH] x86: Add cmpmemsi for -minline-all-stringops

2020-10-17 Thread H.J. Lu via Gcc-patches
On Fri, Oct 2, 2020 at 6:21 AM H.J. Lu  wrote:
>
> On Wed, Sep 16, 2020 at 10:07 PM H.J. Lu  wrote:
> >
> > On Wed, Aug 19, 2020 at 6:09 AM H.J. Lu  wrote:
> > >
> > > On Tue, May 19, 2020 at 5:14 AM H.J. Lu  wrote:
> > > >
> > > > On Tue, May 19, 2020 at 1:48 AM Uros Bizjak  wrote:
> > > > >
> > > > > On Sun, May 17, 2020 at 7:06 PM H.J. Lu  wrote:
> > > > > >
> > > > > > Duplicate the cmpstrn pattern for cmpmem.  The only difference is 
> > > > > > that
> > > > > > the length argument of cmpmem is guaranteed to be less than or 
> > > > > > equal to
> > > > > > lengths of 2 memory areas.  Since "repz cmpsb" can be much slower 
> > > > > > than
> > > > > > memcmp function implemented with vector instruction, see
> > > > > >
> > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> > > > > >
> > > > > > expand cmpmem to "repz cmpsb" only with -mgeneral-regs-only.
> > > > >
> > > > > If there is no benefit compared to the library implementation, then
> > > > > enable these patterns only when -minline-all-stringops is used.
> > > >
> > > > Fixed.
> > > >
> > > > > Eventually these should be reimplemented with SSE4 string 
> > > > > instructions.
> > > > >
> > > > > Honza is the author of the block handling x86 system, I'll leave the
> > > > > review to him.
> > > >
> > > > We used to expand memcmp to "repz cmpsb" via cmpstrnsi.  It was changed
> > > > by
> > > >
> > > > commit 9b0f6f5e511ca512e4faeabc81d2fd3abad9b02f
> > > > Author: Nick Clifton 
> > > > Date:   Fri Aug 12 16:26:11 2011 +
> > > >
> > > > builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi pattern.
> > > >
> > > > * builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi
> > > > pattern.
> > > > * doc/md.texi (cmpstrn): Note that the comparison stops if 
> > > > both
> > > > fetched bytes are zero.
> > > > (cmpstr): Likewise.
> > > > (cmpmem): Note that the comparison does not stop if both of 
> > > > the
> > > > fetched bytes are zero.
> > > >
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95151
> > > >
> > > > is a regression.
> > > >
> > > > Honza, can you take a look at this?
> > > >
> > >
> > > PING:
> > >
> > > https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546921.html
> > >
> >
> > PING.
> >
>
> PING.
>

I'd like to check it in next Tuesday if there are no comments.

-- 
H.J.


PING^3 [PATCH] x86: Add cmpmemsi for -minline-all-stringops

2020-10-02 Thread H.J. Lu via Gcc-patches
On Wed, Sep 16, 2020 at 10:07 PM H.J. Lu  wrote:
>
> On Wed, Aug 19, 2020 at 6:09 AM H.J. Lu  wrote:
> >
> > On Tue, May 19, 2020 at 5:14 AM H.J. Lu  wrote:
> > >
> > > On Tue, May 19, 2020 at 1:48 AM Uros Bizjak  wrote:
> > > >
> > > > On Sun, May 17, 2020 at 7:06 PM H.J. Lu  wrote:
> > > > >
> > > > > Duplicate the cmpstrn pattern for cmpmem.  The only difference is that
> > > > > the length argument of cmpmem is guaranteed to be less than or equal 
> > > > > to
> > > > > lengths of 2 memory areas.  Since "repz cmpsb" can be much slower than
> > > > > memcmp function implemented with vector instruction, see
> > > > >
> > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> > > > >
> > > > > expand cmpmem to "repz cmpsb" only with -mgeneral-regs-only.
> > > >
> > > > If there is no benefit compared to the library implementation, then
> > > > enable these patterns only when -minline-all-stringops is used.
> > >
> > > Fixed.
> > >
> > > > Eventually these should be reimplemented with SSE4 string instructions.
> > > >
> > > > Honza is the author of the block handling x86 system, I'll leave the
> > > > review to him.
> > >
> > > We used to expand memcmp to "repz cmpsb" via cmpstrnsi.  It was changed
> > > by
> > >
> > > commit 9b0f6f5e511ca512e4faeabc81d2fd3abad9b02f
> > > Author: Nick Clifton 
> > > Date:   Fri Aug 12 16:26:11 2011 +
> > >
> > > builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi pattern.
> > >
> > > * builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi
> > > pattern.
> > > * doc/md.texi (cmpstrn): Note that the comparison stops if 
> > > both
> > > fetched bytes are zero.
> > > (cmpstr): Likewise.
> > > (cmpmem): Note that the comparison does not stop if both of 
> > > the
> > > fetched bytes are zero.
> > >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95151
> > >
> > > is a regression.
> > >
> > > Honza, can you take a look at this?
> > >
> >
> > PING:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546921.html
> >
>
> PING.
>

PING.

-- 
H.J.


PING^2 [PATCH] x86: Add cmpmemsi for -minline-all-stringops

2020-09-16 Thread H.J. Lu via Gcc-patches
On Wed, Aug 19, 2020 at 6:09 AM H.J. Lu  wrote:
>
> On Tue, May 19, 2020 at 5:14 AM H.J. Lu  wrote:
> >
> > On Tue, May 19, 2020 at 1:48 AM Uros Bizjak  wrote:
> > >
> > > On Sun, May 17, 2020 at 7:06 PM H.J. Lu  wrote:
> > > >
> > > > Duplicate the cmpstrn pattern for cmpmem.  The only difference is that
> > > > the length argument of cmpmem is guaranteed to be less than or equal to
> > > > lengths of 2 memory areas.  Since "repz cmpsb" can be much slower than
> > > > memcmp function implemented with vector instruction, see
> > > >
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> > > >
> > > > expand cmpmem to "repz cmpsb" only with -mgeneral-regs-only.
> > >
> > > If there is no benefit compared to the library implementation, then
> > > enable these patterns only when -minline-all-stringops is used.
> >
> > Fixed.
> >
> > > Eventually these should be reimplemented with SSE4 string instructions.
> > >
> > > Honza is the author of the block handling x86 system, I'll leave the
> > > review to him.
> >
> > We used to expand memcmp to "repz cmpsb" via cmpstrnsi.  It was changed
> > by
> >
> > commit 9b0f6f5e511ca512e4faeabc81d2fd3abad9b02f
> > Author: Nick Clifton 
> > Date:   Fri Aug 12 16:26:11 2011 +
> >
> > builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi pattern.
> >
> > * builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi
> > pattern.
> > * doc/md.texi (cmpstrn): Note that the comparison stops if both
> > fetched bytes are zero.
> > (cmpstr): Likewise.
> > (cmpmem): Note that the comparison does not stop if both of the
> > fetched bytes are zero.
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95151
> >
> > is a regression.
> >
> > Honza, can you take a look at this?
> >
>
> PING:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546921.html
>

PING.

-- 
H.J.


PING [PATCH] x86: Add cmpmemsi for -minline-all-stringops

2020-08-19 Thread H.J. Lu via Gcc-patches
On Tue, May 19, 2020 at 5:14 AM H.J. Lu  wrote:
>
> On Tue, May 19, 2020 at 1:48 AM Uros Bizjak  wrote:
> >
> > On Sun, May 17, 2020 at 7:06 PM H.J. Lu  wrote:
> > >
> > > Duplicate the cmpstrn pattern for cmpmem.  The only difference is that
> > > the length argument of cmpmem is guaranteed to be less than or equal to
> > > lengths of 2 memory areas.  Since "repz cmpsb" can be much slower than
> > > memcmp function implemented with vector instruction, see
> > >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> > >
> > > expand cmpmem to "repz cmpsb" only with -mgeneral-regs-only.
> >
> > If there is no benefit compared to the library implementation, then
> > enable these patterns only when -minline-all-stringops is used.
>
> Fixed.
>
> > Eventually these should be reimplemented with SSE4 string instructions.
> >
> > Honza is the author of the block handling x86 system, I'll leave the
> > review to him.
>
> We used to expand memcmp to "repz cmpsb" via cmpstrnsi.  It was changed
> by
>
> commit 9b0f6f5e511ca512e4faeabc81d2fd3abad9b02f
> Author: Nick Clifton 
> Date:   Fri Aug 12 16:26:11 2011 +
>
> builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi pattern.
>
> * builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi
> pattern.
> * doc/md.texi (cmpstrn): Note that the comparison stops if both
> fetched bytes are zero.
> (cmpstr): Likewise.
> (cmpmem): Note that the comparison does not stop if both of the
> fetched bytes are zero.
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95151
>
> is a regression.
>
> Honza, can you take a look at this?
>

PING:

https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546921.html

-- 
H.J.


[PATCH] x86: Add cmpmemsi for -minline-all-stringops

2020-05-19 Thread H.J. Lu via Gcc-patches
On Tue, May 19, 2020 at 1:48 AM Uros Bizjak  wrote:
>
> On Sun, May 17, 2020 at 7:06 PM H.J. Lu  wrote:
> >
> > Duplicate the cmpstrn pattern for cmpmem.  The only difference is that
> > the length argument of cmpmem is guaranteed to be less than or equal to
> > lengths of 2 memory areas.  Since "repz cmpsb" can be much slower than
> > memcmp function implemented with vector instruction, see
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> >
> > expand cmpmem to "repz cmpsb" only with -mgeneral-regs-only.
>
> If there is no benefit compared to the library implementation, then
> enable these patterns only when -minline-all-stringops is used.

Fixed.
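
For reference, the result that a cmpmem expansion has to produce can be
modeled in plain C as a byte-at-a-time loop, which is roughly the work
"repz cmpsb" does.  This is only a sketch of the semantics under that
assumption; bytewise_memcmp is a made-up name and this is not the RTL
the pattern emits:

```c
#include <stddef.h>

/* Hypothetical C model of a byte-wise memory comparison.  Bytes are
   compared as unsigned char, and the loop stops at the first mismatch
   or when all n bytes have been compared.  */
int
bytewise_memcmp (const void *s1, const void *s2, size_t n)
{
  const unsigned char *p1 = s1;
  const unsigned char *p2 = s2;

  for (; n != 0; n--, p1++, p2++)
    if (*p1 != *p2)
      return *p1 < *p2 ? -1 : 1;

  return 0;
}
```

A library memcmp gives the same answer but may process many bytes per
step with vector instructions, which is why the inline form is only
worth enabling on request.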

> Eventually these should be reimplemented with SSE4 string instructions.
>
> Honza is the author of the block handling x86 system, I'll leave the
> review to him.

We used to expand memcmp to "repz cmpsb" via cmpstrnsi.  It was changed
by

commit 9b0f6f5e511ca512e4faeabc81d2fd3abad9b02f
Author: Nick Clifton 
Date:   Fri Aug 12 16:26:11 2011 +

builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi pattern.

* builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi
pattern.
* doc/md.texi (cmpstrn): Note that the comparison stops if both
fetched bytes are zero.
(cmpstr): Likewise.
(cmpmem): Note that the comparison does not stop if both of the
fetched bytes are zero.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95151

is a regression.
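
To illustrate why cmpstrnsi could not simply be reused for memcmp, here
is a small sketch of the semantic difference noted in doc/md.texi.  The
buffer and function names are made up for this example:

```c
#include <stddef.h>
#include <string.h>

/* Two 4-byte buffers that are equal up to and including an embedded
   NUL and differ only in the last byte.  */
static const char buf_a[4] = { 'a', 'b', '\0', 'x' };
static const char buf_b[4] = { 'a', 'b', '\0', 'y' };

/* String semantics: the comparison stops once both fetched bytes are
   zero, so the trailing 'x' and 'y' are never looked at.  */
int
compare_as_strings (void)
{
  return strncmp (buf_a, buf_b, sizeof buf_a);
}

/* Memory semantics: all 4 bytes are compared, so the difference in
   the last byte is seen.  */
int
compare_as_memory (void)
{
  return memcmp (buf_a, buf_b, sizeof buf_a);
}
```

compare_as_strings returns zero while compare_as_memory returns a
negative value, so a memcmp expanded with string semantics would give
the wrong answer for buffers with embedded zero bytes.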

Honza, can you take a look at this?

Thanks.

--
H.J.
From ec91a57f3f91168034f7f5eb391949c301741680 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Thu, 14 May 2020 13:06:23 -0700
Subject: [PATCH] x86: Add cmpmemsi for -minline-all-stringops
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

We used to expand memcmp to "repz cmpsb" via cmpstrnsi.  It was changed
by

commit 9b0f6f5e511ca512e4faeabc81d2fd3abad9b02f
Author: Nick Clifton 
Date:   Fri Aug 12 16:26:11 2011 +

builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi pattern.

* builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi
pattern.
* doc/md.texi (cmpstrn): Note that the comparison stops if both
fetched bytes are zero.
(cmpstr): Likewise.
(cmpmem): Note that the comparison does not stop if both of the
fetched bytes are zero.

Duplicate the cmpstrn pattern for cmpmem.  The only difference is that
the length argument of cmpmem is guaranteed to be less than or equal to
the lengths of the two memory areas.  Since "repz cmpsb" can be much
slower than a memcmp function implemented with vector instructions, see

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052

expand cmpmem to "repz cmpsb" only for -minline-all-stringops.

gcc/

	PR target/95151
	* config/i386/i386-expand.c (ix86_expand_cmpstrn_or_cmpmem): New
	function.
	* config/i386/i386-protos.h (ix86_expand_cmpstrn_or_cmpmem): New
	prototype.
	* config/i386/i386.md (cmpmemsi): New pattern.

gcc/testsuite/

	PR target/95151
	* gcc.target/i386/pr95151-1.c: New test.
	* gcc.target/i386/pr95151-2.c: Likewise.
	* gcc.target/i386/pr95151-3.c: Likewise.
	* gcc.target/i386/pr95151-4.c: Likewise.
---
 gcc/config/i386/i386-expand.c | 80 +++
 gcc/config/i386/i386-protos.h |  1 +
 gcc/config/i386/i386.md   | 80 ++-
 gcc/testsuite/gcc.target/i386/pr95151-1.c | 17 +
 gcc/testsuite/gcc.target/i386/pr95151-2.c | 10 +++
 gcc/testsuite/gcc.target/i386/pr95151-3.c | 18 +
 gcc/testsuite/gcc.target/i386/pr95151-4.c | 11 
 7 files changed, 158 insertions(+), 59 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr95151-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr95151-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr95151-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr95151-4.c

diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 79f827fd653..d3f4280ad58 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -7656,6 +7656,86 @@ ix86_expand_set_or_cpymem (rtx dst, rtx src, rtx count_exp, rtx val_exp,
   return true;
 }
 
+/* Expand cmpstrn or memcmp.  */
+
+bool
+ix86_expand_cmpstrn_or_cmpmem (rtx result, rtx src1, rtx src2,
+			   rtx length, rtx align, bool is_cmpstrn)
+{
+  if (optimize_insn_for_size_p () && !TARGET_INLINE_ALL_STRINGOPS)
+    return false;
+
+  /* Can't use this if the user has appropriated ecx, esi or edi.  */
+  if (fixed_regs[CX_REG] || fixed_regs[SI_REG] || fixed_regs[DI_REG])
+    return false;
+
+  if (is_cmpstrn)
+    {
+      /* For strncmp, length is the maximum length, which can be larger
+	 than actual string lengths.