ywkaras commented on issue #7546:
URL: https://github.com/apache/trafficserver/issues/7546#issuecomment-786971941


   I've not been able to reproduce the problem on the AMD proxy. Here are my results:
   ```
   -bash-4.2$ cat http2_benchmark.report 
   **http2load**
   finished in 10.17s, 98320.46 req/s, 1.50GB/s
   requests: 1000000 total, 1000000 started, 1000000 done, 1000000 succeeded, 0 failed, 0 errored, 0 timeout
   status codes: 1000000 2xx, 0 3xx, 0 4xx, 0 5xx
   traffic: 15.29GB (16419037931) total, 7.66MB (8031931) headers (space savings 96.63%), 15.26GB (16384000000) data
                        min         max         mean         sd        +/- sd
   time for request:      111us    130.27ms      1.51ms      4.90ms    92.15%
   time for connect:     3.38ms     87.15ms     38.33ms     20.31ms    69.00%
   time to 1st byte:     5.59ms    106.70ms     47.40ms     19.96ms    74.00%
   req/s           :     476.66      803.50      671.46       97.62    65.50%
   
   **dstat**
   You did not select any stats, using -cdngy by default.
   ----total-cpu-usage---- -dsk/total- ---net/lo-- -net/total- ---paging-- ---system--
   usr sys idl wai hiq siq| read  writ| recv  send: recv  send|  in   out | int   csw
     1   0  99   0   0   0|2597B  211k|   0     0 :   0     0 |   0     0 |  28k   84k
    13   5  81   0   0   1|   0    19M|1590M 1590M:2169B 2064B|   0     0 | 340k  295k
   **perf stat**
   
    Performance counter stats for process id '52642':
   
           167,926.60 msec task-clock                #   16.268 CPUs utilized
              644,529      context-switches          #    0.004 M/sec
               16,677      cpu-migrations            #    0.099 K/sec
                7,170      page-faults               #    0.043 K/sec
      292,679,346,215      cycles                    #    1.743 GHz                      (67.12%)
       41,677,240,403      stalled-cycles-frontend   #   14.24% frontend cycles idle     (67.21%)
       22,998,344,897      stalled-cycles-backend    #    7.86% backend cycles idle      (67.20%)
      212,494,852,375      instructions              #    0.73  insn per cycle
                                                     #    0.20  stalled cycles per insn  (67.15%)
       38,284,614,744      branches                  #  227.984 M/sec                    (67.13%)
        1,354,399,053      branch-misses             #    3.54% of all branches          (67.09%)
   
         10.322305247 seconds time elapsed
   
   **perf report**
   # Total Lost Samples: 0
   #
   # Samples: 1M of event 'cycles'
   # Event count (approx.): 448142823289
   #
   #   Overhead  Shared Object         Symbol
   # ..........  ....................  ..................................................
   #
          7.17%  libcrypto.so.1.1      [.] _aesni_ctr32_ghash_6x
          5.81%  libc-2.17.so          [.] __memcpy_ssse3
          3.51%  [kernel.kallsyms]     [k] copy_user_generic_string
          2.65%  libtscore.so.10.0.0   [.] freelist_new
          1.94%  [kernel.kallsyms]     [k] acpi_processor_ffh_cstate_enter
          1.60%  libc-2.17.so          [.] __memcmp_sse4_1
          1.04%  libtscpputil.so.10.0  [.] memcmp
          0.91%  traffic_server        [.] HpackIndexingTable::lookup
          0.80%  [kernel.kallsyms]     [k] _raw_spin_lock_irqsave
   ...
   ```
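As a rough sanity check on the figures above, the derived numbers can be recomputed from the raw counters. This is only a sketch: the variable names are made up for illustration, the values are copied by hand from the h2load and perf stat output, and the h2load duration is the rounded 10.17s, so the request rate will not match the reported value exactly.

```python
# Raw figures copied from the h2load and perf stat output above.
requests_done = 1_000_000
h2load_seconds = 10.17          # rounded by h2load, so req/s is approximate
data_bytes = 16_384_000_000     # response body bytes reported by h2load

task_clock_ms = 167_926.60
perf_elapsed_s = 10.322305247
cycles = 292_679_346_215
instructions = 212_494_852_375
branches = 38_284_614_744
branch_misses = 1_354_399_053

# Derived values, each of which perf or h2load also prints.
req_per_s = requests_done / h2load_seconds
body_bytes_per_req = data_bytes / requests_done
cpus_utilized = (task_clock_ms / 1000) / perf_elapsed_s
ipc = instructions / cycles
ghz = cycles / (task_clock_ms / 1000) / 1e9
branch_miss_pct = 100 * branch_misses / branches

print(f"req/s            ~ {req_per_s:.0f}")
print(f"body bytes/req   = {body_bytes_per_req:.0f}")
print(f"CPUs utilized    ~ {cpus_utilized:.3f}")
print(f"insn per cycle   ~ {ipc:.2f}")
print(f"effective clock  ~ {ghz:.3f} GHz")
print(f"branch miss rate ~ {branch_miss_pct:.2f}%")
```

The recomputed values agree with what the tools printed, and they show that traffic_server was keeping only about 16 of the 128 logical CPUs busy during the run.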
   I'm seeing much lower req/s on the AMD than on the Intel Xeon.
   Xeon:
   ```
   Architecture:          x86_64
   CPU op-mode(s):        32-bit, 64-bit
   Byte Order:            Little Endian
   CPU(s):                64
   On-line CPU(s) list:   0-63
   Thread(s) per core:    2
   Core(s) per socket:    16
   Socket(s):             2
   NUMA node(s):          2
   Vendor ID:             GenuineIntel
   CPU family:            6
   Model:                 79
   Model name:            Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz
   Stepping:              1
   CPU MHz:               1200.091
   CPU max MHz:           3000.0000
   CPU min MHz:           1200.0000
   BogoMIPS:              4199.63
   Virtualization:        VT-x
   L1d cache:             32K
   L1i cache:             32K
   L2 cache:              256K
   L3 cache:              40960K
   ```
   AMD:
   ```
   Architecture:          x86_64
   CPU op-mode(s):        32-bit, 64-bit
   Byte Order:            Little Endian
   CPU(s):                128
   On-line CPU(s) list:   0-127
   Thread(s) per core:    2
   Core(s) per socket:    64
   Socket(s):             1
   NUMA node(s):          1
   Vendor ID:             AuthenticAMD
   CPU family:            23
   Model:                 49
   Model name:            AMD EPYC 7742 64-Core Processor
   Stepping:              0
   CPU MHz:               2250.000
   CPU max MHz:           2250.0000
   CPU min MHz:           1500.0000
   BogoMIPS:              4491.55
   Virtualization:        AMD-V
   L1d cache:             32K
   L1i cache:             32K
   L2 cache:              512K
   L3 cache:              16384K
   ```



