On Wed, Sep 18, 2013 at 1:39 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
>
> When the generic model was introduced, 32bit-only CPUs were still common on the
> market.  It would be stupid to tune 64bit code for CPUs that will never run it.
> We thus introduced two models - generic32, which considered the needs
> of 32bit CPUs (Centrinos in particular), and generic64, which didn't.
>
>  /* Generic32 should produce code tuned for PPro, Pentium4, Nocona,
>     Athlon and K8.  */
>  /* Generic64 should produce code tuned for Nocona and K8.  */
>
> These were the original definitions, and they are still in the source.
>
> Today the 32bit only CPUs are no longer important.  This patch thus
> drops 32bit generic.  This has the effect of enabling the following flags
> for generic at -m32:
>  use_leave, avoid_vector_decode, slow_imul_imm32_mem, slow_imul_imm8
> that are currently enabled for generic64 only.  This was to accommodate
> earlier AMD chips that are no longer relevant too.
>
> I also updated comment:
> ! /* Generic64 should produce code tuned for Nocona and K8.  */
> to:
> ! /* Generic should produce code tuned for Core-i7 (and newer chips)
> !    and btver1 (and newer chips).  */
> This is what I think generic represents today (it also fares well on earlier
> cores and amdfam10, but we probably don't want to get too limited by these
> anymore).
>
> I would like to proceed with modernization of generic64 - in particular
> to switch it to 4 issue scheduling model and revisit individual flags
> incrementally.
>
> Bootstrapped/regtested x86_64-linux, will commit it tomorrow if there
> are no complaints.
>
> Honza
>
>         * i386.h (TARGET_GENERIC32, TARGET_GENERIC64): Remove.
>         (TARGET_GENERIC): Use PROCESSOR_GENERIC.
>         (enum processor_type): Unify generic32 and 64.
>         * i386.md (cpu): Likewise.
>         * x86-tune.def (use_leave): Enable for generic32.
>         (avoid_vector_decode, slow_imul_imm32_mem, slow_imul_imm8): Likewise.
>         * athlon.md: Change generic64 to generic in all occurrences.
>         * i386-c.c (ix86_target_macros_internal): Unify generic64 and 32.
>         (ix86_target_macros_internal): Likewise.
>         * driver-i386.c (host_detect_local_cpu): Likewise.
>         * i386.c (generic64_memcpy, generic64_memset, generic64_cost): Rename to ..
>         (generic_memcpy, generic_memset, generic_cost): This one.
>         (generic32_memcpy, generic32_memset, generic32_cost): Remove.
>         (m_GENERIC32, m_GENERIC64): Remove.
>         (m_GENERIC): Turn into one flag.
>         (processor_target): Unify generic tunings.
>         (ix86_option_override_internal): Replace generic32/64 by generic.
>         (ix86_issue_rate): Likewise.
>         (ix86_adjust_cost): Likewise.



I'm seeing infinite recursion in decide_alg on
x86_64-unknown-linux-gnu when compiling the C code in
libgo/runtime/go-append.c with -m32, and I suspect that this patch is
the culprit.  The file compiles fine without -m32.

#5  0x0000000000b6cc23 in decide_alg (count=count@entry=0, expected_size=2048,
    memset=memset@entry=false,
    dynamic_check=dynamic_check@entry=0x7fffffffd72c,
    noalign=noalign@entry=0x7fffffffd72b)
    at ../../trunk/gcc/config/i386/i386.c:22769
#6  0x0000000000b6cf3e in decide_alg (count=count@entry=0, expected_size=2048,
    memset=memset@entry=false,
    dynamic_check=dynamic_check@entry=0x7fffffffd72c,
    noalign=noalign@entry=0x7fffffffd72b)
    at ../../trunk/gcc/config/i386/i386.c:22871
#7  0x0000000000b6cf3e in decide_alg (count=count@entry=0, expected_size=2048,
    memset=memset@entry=false,
    dynamic_check=dynamic_check@entry=0x7fffffffd72c,
    noalign=noalign@entry=0x7fffffffd72b)
    at ../../trunk/gcc/config/i386/i386.c:22871
#8  0x0000000000b6cf3e in decide_alg (count=count@entry=0, expected_size=2048,
    memset=memset@entry=false,
    dynamic_check=dynamic_check@entry=0x7fffffffd72c,
    noalign=noalign@entry=0x7fffffffd72b)
    at ../../trunk/gcc/config/i386/i386.c:22871
#9  0x0000000000b6cf3e in decide_alg (count=count@entry=0, expected_size=2048,
    memset=memset@entry=false,
    dynamic_check=dynamic_check@entry=0x7fffffffd72c,
    noalign=noalign@entry=0x7fffffffd72b)
    at ../../trunk/gcc/config/i386/i386.c:22871
#10 0x0000000000b6cf3e in decide_alg (count=count@entry=0, expected_size=2048,
    memset=memset@entry=false,
    dynamic_check=dynamic_check@entry=0x7fffffffd72c,
    noalign=noalign@entry=0x7fffffffd72b)
    at ../../trunk/gcc/config/i386/i386.c:22871


decide_alg is being called from ix86_expand_movmem, from
expand_builtin_memcpy, for the call at line 61 of go-append.c.
      __builtin_memcpy (n, a.__values, a.__count * element_size);
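
For anyone who wants to poke at this outside of libgo, the call above could
probably be reduced to something like the sketch below.  This is an untested
guess at a reduction, not the actual go-append.c code: the struct layout and
the names open_array and copy_values are made up for illustration.  The point
it keeps is that the length passed to __builtin_memcpy is only known at run
time, which is presumably why the backtrace shows count=0.

```c
/* Hypothetical reduction of the memcpy call in go-append.c.
   The struct and function names here are assumptions for illustration,
   not the real libgo declarations.  */
#include <stddef.h>

struct open_array
{
  void *values;     /* Pointer to the first element.  */
  size_t count;     /* Number of elements in use.  */
  size_t capacity;  /* Number of elements allocated.  */
};

/* Copy the elements of A into N.  The byte count is a run-time value,
   so the compiler cannot see a constant size when it expands the
   builtin.  */
void *
copy_values (void *n, struct open_array a, size_t element_size)
{
  __builtin_memcpy (n, a.values, a.count * element_size);
  return n;
}
```

Compiling that with something like "gcc -m32 -O2 -c reduced.c" would be the
natural first experiment; the -O2 is a guess, only -m32 is known to matter.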

I'm continuing to look.

Ian
