Good to hear you are making progress. I haven’t seen this problem on my test systems, but I’m mainly using netmap and DPDK without virtual interfaces, accessing the NIC queues directly and skipping the kernel altogether.
-Matias

> On 20 Feb 2017, at 16:48, Oriol Arcas <or...@starflownetworks.com> wrote:
>
> Small update: currently, we are having issues without ODP, just using the
> netmap example bridges. For kernel 4.1 it exhibits errors, for 4.9 it
> doesn't. We are trying to find a minimal working kernel.
>
> So our current hypothesis is that there is some kind of bug that appears
> under concurrent transactions at kernel level, triggered by socket mmap or
> netmap... I don't know if this matches your experience.
>
> --
> Oriol Arcas
> Software Engineer
> Starflow Networks
>
> On Mon, Feb 20, 2017 at 11:39 AM, Maxim Uvarov <maxim.uva...@linaro.org>
> wrote:
>
>> version from .travis file.
>>
>> On 20 February 2017 at 13:05, Oriol Arcas <or...@starflownetworks.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Thank you for your feedback Matias and Maxim, we really appreciate it.
>>>
>>> We are trying netmap, but sometimes it doesn't solve the problem. Could
>>> you share which versions (ODP, netmap, Linux) you are using that work
>>> fine? This would help us have a control group for our tests...
>>>
>>> --
>>> Oriol Arcas
>>> Software Engineer
>>> Starflow Networks
>>>
>>> On Fri, Feb 17, 2017 at 8:26 PM, Maxim Uvarov <maxim.uva...@linaro.org>
>>> wrote:
>>>
>>>> On 02/17/17 18:45, Oriol Arcas wrote:
>>>>> I tried setting the MAC addresses. In my local test, the problem
>>>>> disappeared, but I doubt that it's been fixed.
>>>>>
>>>>> On our larger testbed, with OpenVPN tunnels, the bug persists even with
>>>>> the MAC addresses. But our setup may be problematic, for instance in this
>>>>> interface chain:
>>>>>
>>>>> veth0 -|- veth1 <---> l2fwd <---> veth2 -|- veth3
>>>>>
>>>>> we set the addresses from the endpoints (veth0, veth3), while l2fwd is
>>>>> attached to middle interfaces (veth1, veth2).
>>>>>
>>>>> Do you think this is interfering with the network stack? It looks like a
>>>>> serious bug in the kernel, then...
>>>>>
>>>>> It seems that we'll have to try netmap.
>>>>>
>>>>
>>>>
>>>> That is the environment which we use in 'make check' testing, even for
>>>> dpdk or netmap. You can take the exact steps from the .travis.yml file.
>>>> It always runs in our CI. Maybe you have some issues related to promisc
>>>> mode and you get some additional files? Or it might be packet mmap fanout
>>>> problems. But that is very strange, because we would have seen this issue
>>>> before, since that env is brought up at each test run.
>>>>
>>>> Maxim.
>>>>
>>>>> --
>>>>> Oriol Arcas
>>>>> Software Engineer
>>>>> Starflow Networks
>>>>>
>>>>> On Fri, Feb 17, 2017 at 1:23 PM, Elo, Matias (Nokia - FI/Espoo) <
>>>>> matias....@nokia-bell-labs.com> wrote:
>>>>>
>>>>>>
>>>>>>> On 17 Feb 2017, at 14:03, Oriol Arcas <or...@starflownetworks.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Thanks for your reply Matias.
>>>>>>>
>>>>>>> I tried a simpler setup and the bug persists. With a linux bridge it
>>>>>>> works fine.
>>>>>>>
>>>>>>> My setup is the following:
>>>>>>>
>>>>>>> | nginx <---> veth0 -|- veth1 <---> l2fwd <---> veth2 -|- veth3 <---> wget |
>>>>>>>
>>>>>>> where the | delimiters mean a network namespace.
>>>>>>>
>>>>>>> I have tried it with ODP_PKTIO_DISABLE_SOCKET_MMAP, all the different
>>>>>>> scheduling modes and -c 1.
>>>>>>>
>>>>>>> Our next test would be using DPDK or netmap, if they can be used with
>>>>>>> veth interfaces.
>>>>>>
>>>>>> At least netmap should work with virtual interfaces.
>>>>>>
>>>>>> More things to try with odp_l2fwd arguments:
>>>>>>
>>>>>> Set the MAC addresses correctly. Using the same MAC from multiple
>>>>>> interfaces could potentially cause some issues with the host network
>>>>>> stack.
>>>>>> For example: odp_l2fwd -i if1,if2 -d 1 -s 1 -r <if1_mac,if2_mac2>
>>>>>>
>>>>>> Use direct pktio mode: -m 0
>>>>>>
>>>>>>
>>>>>> -Matias
>>>>
>>>>
>>>
>>
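For anyone reproducing the setup discussed in this thread, the odp_l2fwd suggestions (distinct MAC addresses passed with -r, direct pktio mode with -m 0, and -c 1) can be collected into one command line. The following is only a minimal sketch, not part of ODP: the helper name build_l2fwd_argv and the example MAC addresses are hypothetical, it uses only the flags that appear in the messages above, and the order in which -r expects the addresses is an assumption that should be checked against the odp_l2fwd usage text.

```python
import re
import shlex

# Matches colon-separated MAC addresses such as 02:00:00:00:00:01.
MAC_RE = re.compile(r"^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}$")


def build_l2fwd_argv(interfaces, dst_macs, direct_mode=True, count=None):
    """Build an odp_l2fwd argument list (hypothetical helper, not an ODP API).

    Only flags quoted in the thread are used: -i, -d 1, -s 1, -r, -m 0, -c.
    Assumption: -r takes one MAC per interface, in the same order as -i.
    """
    if len(interfaces) != len(dst_macs):
        raise ValueError("need exactly one MAC address per interface")
    for mac in dst_macs:
        if not MAC_RE.match(mac):
            raise ValueError(f"malformed MAC address: {mac!r}")
    # The thread warns that reusing one MAC on several interfaces may confuse
    # the host network stack, so duplicates are rejected here.
    if len({mac.lower() for mac in dst_macs}) != len(dst_macs):
        raise ValueError("MAC addresses must be distinct")

    argv = [
        "odp_l2fwd",
        "-i", ",".join(interfaces),
        "-d", "1",
        "-s", "1",
        "-r", ",".join(dst_macs),
    ]
    if direct_mode:
        argv += ["-m", "0"]  # direct pktio mode, as suggested in the thread
    if count is not None:
        argv += ["-c", str(count)]  # e.g. the "-c 1" mentioned in the thread
    return argv


if __name__ == "__main__":
    # Example values only: veth1/veth2 are the middle interfaces from the
    # thread's topology; the MAC addresses are made up for illustration.
    argv = build_l2fwd_argv(
        ["veth1", "veth2"],
        ["02:00:00:00:00:01", "02:00:00:00:00:02"],
        count=1,
    )
    print(shlex.join(argv))
```

The helper only assembles and prints the command, so it can be run without root and without ODP installed; the printed line can then be pasted into the namespace where l2fwd is attached.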