Hello again,

Here are the statistics I collected during my tests. First, though,
another problem I hadn't mentioned before, because it doesn't always
happen, but it's quite strange. I have used pktgen on quite a few
machines, and on all of them setting clone_skb=1000 or so boosts the
performance. On the Pentium 4, however, it doesn't make any
difference. I mean:

clone_skb = 0       --> approx. 400 kpps
clone_skb = 100000  --> approx. 400 kpps

On the Pentium III, on the other hand, the boost is fully visible
(from 100 kpps with clone_skb=0 to 400 kpps with clone_skb=100000).
Why these results? From what you have already told me, I guess the
bottleneck on the Pentium 4 is the 33 MHz PCI bus, which can't push
packets any faster. Apparently the machine has enough time to allocate
new skbs before the packets are sent, so the only perceptible
difference between the two settings would be the idle time. Am I
right?


Now, on to the statistics:
Two injectors and a receiving machine are connected to a switch. Both
injectors send at the same time in order to achieve an aggregated
throughput, so if both send at 400 kpps we will get 800 kpps at the
receiver. There is a slice of time during which only one of them is
running (one finishes its count first), but that interval is
considerably shorter than the measured time.

Global pktgen parameters:
pkt_size=60
delay=0
clone_skb=1000000
count=20000000
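For reference, these parameters are normally written to pktgen's /proc
interface, one command per write. A minimal sketch (the device path
/proc/net/pktgen/eth0 is an assumption; adjust it to your kernel and
interface):

```shell
#!/bin/sh
# Minimal pktgen configuration sketch. PGDEV is an assumption; on your
# kernel the entry may be /proc/net/pktgen/ethX or a kpktgend thread file.
PGDEV=${PGDEV:-/proc/net/pktgen/eth0}

# Each write to the proc file is a single pktgen command.
pgset() {
    echo "$1" > "$PGDEV"
}

# Only configure when the pktgen proc entry is actually present.
if [ -w "$PGDEV" ]; then
    pgset "pkt_size 60"
    pgset "delay 0"
    pgset "clone_skb 1000000"
    pgset "count 20000000"
fi
```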

Injector A (Pentium 4 / e1000)
------------------------------------------------------------
pktgen.packet_count: 20000000
pktgen.packet_rate (pps): 411767
pktgen.throughput (Mbps): 197
pktgen.total_time (us): 48571152
pktgen.work_time (us): 35601946
pktgen.idle_time (us): 12969206


Injector B (Dual Pentium III / e1000)
------------------------------------------------------------
pktgen.packet_count: 20000000
pktgen.packet_rate (pps): 466700
pktgen.throughput (Mbps): 224
pktgen.total_time (us): 42854078
pktgen.work_time (us): 34770557
pktgen.idle_time (us): 8083521
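As a sanity check, the reported rates are consistent with
packet_count / total_time (and the throughput with rate x 60 bytes x
8 bits). For Injector B, using the figures above:

```shell
# Cross-check Injector B: rate = count / total_time, throughput in Mbps.
# Rounds to roughly the reported ~466700 pps and ~224 Mbps.
awk 'BEGIN {
    count    = 20000000
    total_us = 42854078
    pkt_size = 60
    pps  = count / (total_us / 1e6)
    mbps = pps * pkt_size * 8 / 1e6
    printf "%.0f pps, %.0f Mbps\n", pps, mbps
}'
```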



Receiver (Dual AMD Opteron / tg3)
------------------------------------------------------------
ifstats.uname: Linux bipt176 2.6.13ksensor #11 SMP Mon Dec 26 14:15:52 CET 2005 x86_64 GNU/Linux
ifstats.nr_cpus: 2
ifstats.cpu_speed (MHz): 1792.654
ifstats.arch_bits: 64-bit
ifstats.if_driver: tg3
ifstats.if_speed (Mbps): 1000Mb/s
ifstats.rx_octets: 2.56e+09
ifstats.rx_fragments: 0
ifstats.rx_ucast_packets: 40000003
ifstats.rx_mcast_packets: 0
ifstats.rx_bcast_packets: 2
ifstats.rx_fcs_errors: 0
ifstats.rx_align_errors: 0
ifstats.rx_xon_pause_rcvd: 0
ifstats.rx_xoff_pause_rcvd: 0
ifstats.rx_mac_ctrl_rcvd: 0
ifstats.rx_xoff_entered: 0
ifstats.rx_frame_too_long_errors: 0
ifstats.rx_jabbers: 0
ifstats.rx_undersize_packets: 0
ifstats.rx_in_length_errors: 0
ifstats.rx_out_length_errors: 0
ifstats.rx_64_or_less_octet_packets: 40000005
ifstats.rx_65_to_127_octet_packets: 0
ifstats.rx_128_to_255_octet_packets: 0
ifstats.rx_256_to_511_octet_packets: 0
ifstats.rx_512_to_1023_octet_packets: 0
ifstats.rx_1024_to_1522_octet_packets: 0
ifstats.rx_1523_to_2047_octet_packets: 0
ifstats.rx_2048_to_4095_octet_packets: 0
ifstats.rx_4096_to_8191_octet_packets: 0
ifstats.rx_8192_to_9022_octet_packets: 0
ifstats.tx_octets: 9024
ifstats.tx_collisions: 0
ifstats.tx_xon_sent: 0
ifstats.tx_xoff_sent: 0
ifstats.tx_flow_control: 0
ifstats.tx_mac_errors: 0
ifstats.tx_single_collisions: 0
ifstats.tx_mult_collisions: 0
ifstats.tx_deferred: 0
ifstats.tx_excessive_collisions: 0
ifstats.tx_late_collisions: 0
ifstats.tx_collide_2times: 0
ifstats.tx_collide_3times: 0
ifstats.tx_collide_4times: 0
ifstats.tx_collide_5times: 0
ifstats.tx_collide_6times: 0
ifstats.tx_collide_7times: 0
ifstats.tx_collide_8times: 0
ifstats.tx_collide_9times: 0
ifstats.tx_collide_10times: 0
ifstats.tx_collide_11times: 0
ifstats.tx_collide_12times: 0
ifstats.tx_collide_13times: 0
ifstats.tx_collide_14times: 0
ifstats.tx_collide_15times: 0
ifstats.tx_ucast_packets: 98
ifstats.tx_mcast_packets: 0
ifstats.tx_bcast_packets: 1
ifstats.tx_carrier_sense_errors: 0
ifstats.tx_discards: 0
ifstats.tx_errors: 0
ifstats.dma_writeq_full: 30066024
ifstats.dma_write_prioq_full: 0
ifstats.rxbds_empty: 0
ifstats.rx_discards: 13517210
ifstats.rx_errors: 0
ifstats.rx_threshold_hit: 5057812
ifstats.dma_readq_full: 0
ifstats.dma_read_prioq_full: 0
ifstats.tx_comp_queue_full: 0
ifstats.ring_set_send_prod_index: 99
ifstats.ring_status_update: 5404635
ifstats.nic_irqs: 141204
ifstats.nic_avoided_irqs: 5263431
ifstats.nic_tx_threshold_hit: 0

I hope these stats, combined with the information I provided in the
previous email, will let you determine whether the receiving machine
has hardware flow control (HW_FLOW) on or off.
Last question:
There are two stats of interest, dma_writeq_full and rx_discards
(these stats are specific to the tg3 card):
        ifstats.dma_writeq_full: 30066024
        ifstats.rx_discards: 13517210
As far as I can understand, dma_writeq_full means that the card found
the RX ring full and overwrote a previous packet (so that packet was
lost). How, then, can the rx_discards (packets discarded) counter be
lower than the dma_writeq_full counter?
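For what it's worth, these tg3 counters come from the driver's
ethtool statistics; a quick way to watch just the two of them (the
interface name eth0 is an assumption):

```shell
# Dump the NIC-specific statistics and keep only the two drop-related
# counters discussed above. Requires root and a tg3 interface; skips
# gracefully when ethtool or the interface is unavailable.
IFACE=${IFACE:-eth0}
if command -v ethtool >/dev/null 2>&1; then
    ethtool -S "$IFACE" 2>/dev/null | grep -E 'dma_writeq_full|rx_discards' || true
fi
```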

Thank you
Regards
Aritz
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
