Well, as a general rule anything over about 80 usecs for
InterruptThrottleRate is a waste.  One advantage of reducing the
interrupt throttle rate is that you can reduce the ring size, and you
might see a slight performance improvement.  One problem with using
4096 descriptors is that it greatly increases the cache footprint and
leads to more buffer-bloat and cache thrash, since you have to evict
old descriptors to pull in new ones.  I'm also fairly sure that if you
are running an intrusion detection system (I'm assuming that is what
IDS refers to), the users would appreciate it if you didn't add up to
a half dozen extra milliseconds of latency to their network (the worst
case being an elephant flow of 1514-byte frames).
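
To put a number on that worst case (back-of-the-envelope, ignoring
preamble and inter-frame gap): a full ring of max-size frames takes
about 5ms to drain at 10Gb/s.

  4096 descriptors * 1514 bytes * 8 bits ~= 49.6Mb
  49.6Mb / 10Gb/s                        ~= 4.96ms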

What size packets are you working with?  One limitation of the 82599
is that it can only handle an upper limit of somewhere around 12Mpps
if you are using something like 6 queues, and only a little over
2Mpps for a single queue.  If you exceed 12Mpps then the part will
start reporting rx_missed, because the PCIe overhead for moving
64-byte packets is great enough that it actually causes us to exceed
the limits of the x8 gen2 link.  If the memcpy is what I think it is,
then it allows us to avoid having to do two different atomic
operations that would have been more expensive otherwise.
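
Roughly where those numbers come from (very rough, ignoring flow
control and completion traffic): minimum-size frames arrive faster
than the bus can carry the per-packet overhead.

  64B frame on the wire = 64 + 8 preamble + 12 IFG = 84 bytes
  10Gb/s / (84 bytes * 8 bits)           ~= 14.88Mpps

  x8 gen2 = 8 lanes * 5GT/s * 8b/10b ~= 32Gb/s ~= 4GB/s raw

Each 64-byte packet also costs a descriptor fetch, TLP headers, and a
descriptor write-back, so the effective per-packet bus cost is several
times the packet size, which puts the practical ceiling right around
the 12Mpps where rx_missed starts showing up.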

On Fri, Sep 23, 2016 at 12:46 PM, Michał Purzyński
<michalpurzyns...@gmail.com> wrote:
> Here's what I did
>
> ethtool -A p1p1 rx off tx off
> ethtool -A p3p1 rx off tx off
>
> Both ethtool -a <interface> and Arista that's pumping data show that RX/TX
> pause are disabled.
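>
> For anyone double-checking the same thing, ethtool -a should report
> something like
>
> Pause parameters for p1p1:
> Autonegotiate:  on
> RX:             off
> TX:             off
>
> and if Autonegotiate is still on, "ethtool -A p1p1 autoneg off"
> keeps pause from being renegotiated back on.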
>
> I have two cards, each connected to a separate NUMA node, threads pinned,
> etc.
>
> One non-standard thing is that I use only a single queue, because any
> form of multiqueue leads to packet reordering and confuses the IDS,
> an issue that's been hidden for a while in the NSM community.
>
> The driver (from SourceForge) was loaded with MQ=0 DCA=2 RSS=1 VMDQ=0
> InterruptThrottleRate=956 FCoE=0 LRO=0 vxlan_rx=0 (each option's value
> given enough times so it applies to all cards in this system).
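>
> Concretely, the modprobe line looks roughly like this (one
> comma-separated value per port; shown here for two ports):
>
> modprobe ixgbe MQ=0,0 DCA=2,2 RSS=1,1 VMDQ=0,0 \
>     InterruptThrottleRate=956,956 FCoE=0,0 LRO=0,0 vxlan_rx=0,0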
>
> I could see the same issue sending traffic to just one card.
>
> Of course a single core is swamped with ACK-ing the hardware IRQ and
> then doing the softirq work (which seems to be mostly memcpy?). But
> then again, I don't see errors about lacking buffers (I run with 4096
> descriptors).
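>
> (If anyone wants to double-check the memcpy theory, something like
>
> perf top -C 4
>
> on whichever CPU the RX queue's IRQ is pinned to will show where the
> softirq cycles actually go; CPU 4 is just a placeholder here.)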
>
>
> On Fri, Sep 23, 2016 at 9:22 PM, Alexander Duyck <alexander.du...@gmail.com>
> wrote:
>>
>> When you say you disabled flow control did you disable it on the
>> interface that is dropping packets or the other end?  You might try
>> explicitly disabling it on the interface that is dropping packets,
>> that in turn should enable per-queue drop instead of putting
>> back-pressure onto the Rx FIFO.
>>
>> With flow control disabled on the local port you should see
>> rx_no_dma_resources start incrementing if the issue is that one of the
>> Rx rings is not keeping up.
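>>
>> Something like this makes it easy to spot which counter is actually
>> moving (the interface name is just an example):
>>
>> watch -d -n1 "ethtool -S p1p1 | egrep 'rx_missed|rx_no_dma|rx_no_buffer'"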
>>
>> - Alex
>>
>> On Fri, Sep 23, 2016 at 11:09 AM, Michał Purzyński
>> <michalpurzyns...@gmail.com> wrote:
>> > xoff was increasing so I disabled flow control.
>> >
>> > That's a HP DL360 Gen9, and lspci -vvv tells me the cards are
>> > connected to an x8 link, the speed is 5GT/s, and ASPM is disabled.
>> >
>> > Other error counters are still zero. When I compare rx_packets and
>> > rx_missed_errors, it looks like 38% (!!) of packets are getting lost.
>> >
>> > Unfortunately the HP documentation is a scam and they actively
>> > avoid publishing the motherboard layout.
>> >
>> > Any other place I could look for hints?
>> >
>> >
>> > On Fri, Sep 23, 2016 at 7:01 PM, Alexander Duyck
>> > <alexander.du...@gmail.com>
>> > wrote:
>> >>
>> >> On Fri, Sep 23, 2016 at 1:10 AM, Michał Purzyński
>> >> <michalpurzyns...@gmail.com> wrote:
>> >> > Hello.
>> >> >
>> >> > On my IDS workload with af_packet I can see rx_missed_errors growing
>> >> > while
>> >> > rx_no_buffer_count does not. Basically every other kind of rx_ error
>> >> > counter is 0, including rx_no_dma_resources. It's an 82599 based
>> >> > card.
>> >> >
>> >> > I don't know what to think about that. I went through the ixgbe
>> >> > source code and the 82599 datasheet, and it seems like
>> >> > rx_missed_errors means a new packet overwrote something already
>> >> > in the packet buffer (the FIFO queue on the card) because there
>> >> > was no more space in it.
>> >> >
>> >> > Now, that would happen if there is no place to DMA packets into - but
>> >> > that
>> >> > counter does not grow.
>> >> >
>> >> > Could you point me to where should I be looking for a problem?
>> >> >
>> >> > --
>> >> > Michal Purzynski
>> >>
>> >> The Rx missed count will increment if you are not able to receive a
>> >> packet because the Rx FIFO is full.  If you are not seeing any
>> >> rx_no_dma_resources problems it might indicate that the problem is not
>> >> with providing the DMA resources, but a problem on the bus itself.
>> >> You might want to double check the slot the device is connected to in
>> >> order to guarantee that there is a x8 link that supports 5GT/s all the
>> >> way through to the root complex.
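>> >>
>> >> For example, something along these lines for the device and every
>> >> bridge above it (the address is a placeholder):
>> >>
>> >> lspci -vvv -s 03:00.0 | grep -E 'LnkCap|LnkSta'
>> >>
>> >> LnkCap is what the device can do and LnkSta is what was actually
>> >> negotiated; a LnkSta showing 2.5GT/s or a width below x8 anywhere
>> >> in the chain would explain this kind of drop.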
>> >>
>> >> - Alex
>> >
>> >
>
>
