Re: how to cross-connect 2 interfaces

2023-11-26 Thread Vincenzo Maffione
I've never tried with vxlan interfaces... But in principle it should work
(emulated netmap mode enables netmap on any interface at reduced
performance).
However, in your case a good deal of packet processing already happens
within the kernel (vxlan encapsulation) so I would definitely go for a
kernel approach such as netgraph.

Vincenzo

On Sun, Nov 26, 2023, 10:36 PM Benoit Chesneau 
wrote:

> Thanks! I guess though it will only work for HW interfaces, not with vxlan
> interfaces?
>
> Benoît
> On Sunday, November 26th, 2023 at 21:43, Vincenzo Maffione <
> vmaffi...@freebsd.org> wrote:
>
> Or, the netmap(4) bridge example
>
> On Sun, Nov 26, 2023, 12:40 PM Benoit Chesneau 
> wrote:
>
>> Thanks, I didn't notice this one.
>>
>> Benoît Chesneau, Enki Multimedia
>> —
>> t. +33608655490
>>
>> On Saturday, November 25th, 2023 at 23:30, Jim Thompson 
>> wrote:
>>
>> ng_hub(4)
>>
>> On Nov 25, 2023, at 8:34 AM, Benoit Chesneau 
>> wrote:
>>
>> 
>> Is there a way to cross-connect 2 interfaces without using a bridge?
>> Something similar to the `l2 xconnect` command in VPP (or Cisco):
>> https://docs.fd.io/vpp/16.12/vnet_vnet_l2.html
>> This could be quite handy to create a patch between different machines in
>> the network.
>>
>> Benoît
>>
>>
>>
>


Re: how to cross-connect 2 interfaces

2023-11-26 Thread Vincenzo Maffione
Or, the netmap(4) bridge example

On Sun, Nov 26, 2023, 12:40 PM Benoit Chesneau 
wrote:

> Thanks, I didn't notice this one.
>
> Benoît Chesneau, Enki Multimedia
> —
> t. +33608655490
>
> On Saturday, November 25th, 2023 at 23:30, Jim Thompson 
> wrote:
>
> ng_hub(4)
>
> On Nov 25, 2023, at 8:34 AM, Benoit Chesneau 
> wrote:
>
> 
> Is there a way to cross-connect 2 interfaces without using a bridge?
> Something similar to the `l2 xconnect` command in VPP (or Cisco):
> https://docs.fd.io/vpp/16.12/vnet_vnet_l2.html
> This could be quite handy to create a patch between different machines in
> the network.
>
> Benoît
>
>
>


Re: About IFLIB compliant network device driver development

2023-08-31 Thread Vincenzo Maffione
Hi,
  I think it's pretty common nowadays to have NICs with completion
rings/queues. It is definitely appropriate to implement such drivers with
iflib, and indeed iflib provides separate callbacks for updating the TX/RX
"submission" and "completion" queues:

   - ift_txd_encap: submit a packet to a TX ring (or submission queue).
   - ift_txd_flush: flush submitted TX descriptors to the hardware.
   - ift_txd_credits_update: check the completion queue (or reclaim
   completed descriptors from the TX ring).
   - ift_rxd_refill: submit RX descriptors to an RX ring (or submission
   queue).
   - ift_rxd_flush: flush submitted RX descriptors to the NIC hardware.
   - ift_rxd_pkt_get: get a received packet from the completion queue (or
   reclaim completed descriptors from the RX ring).
   - ...

If you want to look at an example of an iflib driver using completion queues,
you can check out sys/dev/vmware/vmxnet3/.

It's actually a pretty complex example because vmxnet3 is a complex
paravirtualized device with multiple versions, etc.
However, the complexity does not come from the completion rings...

Cheers,
  Vincenzo



On Wed, Aug 30, 2023 at 15:58, PAVEL POPA wrote:

> I have a NIC (a SmartNIC actually) for which I have to implement a
> driver, which in addition to RX and TX rings exposes also completion
> CMPT rings. Due to this additional complication (the CMPT rings), I'm
> not sure how appropriate it is to implement such a driver via the
> IFLIB framework. Does anyone have any similar experience that they can
> share? Any suggestions at all, ideas, feedback?
>
> Thanks in advance,
> Pavel
>
>

-- 
Vincenzo


Re: Forwarding packets to the host stack in net map lb app

2022-09-26 Thread Vincenzo Maffione
I think you could avoid any modifications to lb(8) and take advantage of
the "multiple pipe groups" feature.
You open two groups,
# lb -i netmap:em0/R -p mon:$N -p fwd:$M [...]
Each group receives all the packets arriving on the RX (NIC) rings of em0.
(I'm pretty sure) this happens without packet copies, i.e. by swapping
netmap slots.

The first group (mon) is for your existing monitor process. The second one
(fwd) would be used for a separate process that handles the host stack:
 - It reads from fwd:$M pipes, selecting only the RX packets that should be
forwarded to the host stack. Selected packets will be forwarded to
netmap:em0^/T. All the other packets are just dropped.
 - It forwards all traffic from netmap:em0^/R to netmap:em0/T (i.e. from
the em0 host RX ring to the em0 TX rings). Keep in mind that lb does not
touch em0 TX rings, so there would not be conflicts. In any case, it is
good practice to have lb only open RX rings (netmap:em0/R).
This second process can probably be a modified version of the netmap
bridge, although you have asymmetric three-party forwarding here (fwd/R -->
netmap:em0^/T, netmap:em0^/R --> netmap:em0/T).
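
To illustrate only the selection step of that second process, here is a
minimal sketch in Python. It is not netmap code (a real implementation would
be C on top of the bridge example); it just shows how a frame read from a
fwd pipe could be classified as "send to netmap:em0^/T" or "drop". The policy
(ARP frames, plus IPv4 frames addressed to the host) and the names HOST_IP
and wants_host_stack are assumptions made for the example, and VLAN-tagged
frames are not handled.

```python
import socket
import struct

ETH_HDR_LEN = 14          # dst MAC (6) + src MAC (6) + ethertype (2)
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806

# Hypothetical address of the host stack on em0 (assumption for the example).
HOST_IP = socket.inet_aton("10.0.0.1")


def wants_host_stack(frame: bytes) -> bool:
    """Return True if the frame should be forwarded to the host stack,
    False if it should be dropped."""
    if len(frame) < ETH_HDR_LEN:
        return False
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == ETHERTYPE_ARP:
        return True
    if ethertype == ETHERTYPE_IPV4:
        # The IPv4 destination address sits at offset 16 of the IP header.
        dst = frame[ETH_HDR_LEN + 16:ETH_HDR_LEN + 20]
        return dst == HOST_IP
    return False
```

Everything coming from netmap:em0^/R would bypass this check, since that
direction is forwarded unconditionally to netmap:em0/T.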

The alternative (harder) option would be to actually modify lb(8). You
should probably:
 - open netmap:em0^/R and netmap:em0*/T with separate nmport_open() calls
 - parse the packet before pkt_hdr_hash() to select the RX packets that you
need to forward to the host TX ring, and modify the forwarding logic to
perform this task.
 - modify the logic of the lb poll() loop so that it also performs the
forwarding from host RX rings to NIC TX rings
I'm not sure that you would have any advantages by choosing this path.

Cheers,
  Vincenzo

On Sun, Sep 18, 2022 at 23:47, Kim Shrier wrote:

> I have a network monitoring program and I am using the lb app from
> netmap to distribute packets to netmap pipes.  The monitor processes
> are successfully receiving packets.
>
> I would like to modify lb to send some packets to the host stack and
> have packets coming from the host stack go out on the monitoring
> ethernet interface.
>
> I am relatively new to using netmap and it is not obvious to me how
> to properly send/receive some packets to the host tx/rx rings while
> still letting the netmap pipes forward packets to my monitoring
> application.
>
> I have looked at the bridge app from netmap which opens 2 netmap
> ports. It does not seem to me that that would be the right way to
> deal with the host stack in the context of lb.
>
> Should I just process the last tx/rx ring differently from the first ones
> that are forwarding packets to the netmap pipes?
>
> Thanks,
> Kim
>
>
>
>


Re: FBSD-13 - Vale maximum virtual switches.

2022-03-03 Thread Vincenzo Maffione
Hi,
  Yes, the maximum number of VALE bridges should definitely become a sysctl.
I'll try to implement the change asap.

Cheers,
  Vincenzo

On Thu, Mar 3, 2022 at 19:07, Santiago Martinez <
s...@codenetworks.net> wrote:

> Hi Everyone,
>
> The other day I had to simulate a network topology and I wanted to use VALE
> switches instead of in-kernel bridges.
>
> After creating a few switches I noticed that there was a hard limit of 8
> switches (that is clearly stated in the man page).
>
> For my simulation I needed 32 virtual switches, hence I increased the value
> of NM_BRIDGES from 8 to 64.
>
> After that I was able to create the bridges and they seem to work fine.
>
> My question is, do we need that hard limit of 8? Should this be changed to
> a dynamic value set with sysctl?
>
> Best regards.
> Santi
>
> diff --git a/sys/dev/netmap/netmap_bdg.h b/sys/dev/netmap/netmap_bdg.h
>
> index e4683885e66c..3afe1d9d5d99 100644
> --- a/sys/dev/netmap/netmap_bdg.h
> +++ b/sys/dev/netmap/netmap_bdg.h
> @@ -73,8 +73,8 @@ struct netmap_bdg_ops {
> int netmap_bwrap_attach(const char *name, struct netmap_adapter *, struct
> netmap_bdg_ops *);
> int netmap_bdg_regops(const char *name, struct netmap_bdg_ops *bdg_ops,
> void *private_data, void *auth_token);
>
> -#define NM_BRIDGES      8   /* number of bridges */
> -#define NM_BDG_MAXPORTS 254 /* up to 254 */
> +#define NM_BRIDGES      64  /* number of bridges */
> +#define NM_BDG_MAXPORTS 16  /* up to 254 */
>  #define NM_BDG_BROADCAST NM_BDG_MAXPORTS
>  #define NM_BDG_NOPORT   (NM_BDG_MAXPORTS+1)
>
>
>
>


Re: e1000 & igb if_vlan netmap header stripping issue after e1000-igb driver updates.

2021-11-28 Thread Vincenzo Maffione
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260068

On Sat, Nov 20, 2021, 3:19 PM Özkan KIRIK  wrote:

> Hello,
>
> I'm using stable/12 (aba2dc46dfa5, Oct 24 2021). I'm hitting some
> problems with if_vlan when netmap is used on the parent interface. It was
> working before the driver update. Maybe something is missing in the netmap
> implementation.
>
> The way to reproduce:
> [HostA] <> [HostB]
>
> HostA
> - ifconfig em1.110 create 10.10.10.2/24 up
> - ping 10.10.10.1
> - tcpdump -eni em1
> 17:05:11.393411 00:50:56:26:69:ea > 00:0c:29:84:5d:88, ethertype
> 802.1Q (0x8100), length 102: vlan 110, p 0, ethertype IPv4, 10.10.10.1
> > 10.10.10.2: ICMP echo reply, id 32844, seq 53, length 64
>
> HostB
> - ifconfig em1.110 create 10.10.10.1/24 up
> - ifconfig em1 promisc -tso -lro -rxcsum -txcsum -tso6 -rxcsum -txcsum
> -tso6 -rxcsum6 -txcsum6 -vlanhwtag -vlanhwcsum -vlanhwtso
> - ./bridge -i em1 -i em1^ &
> # tcpdump -eni em1
> 17:05:11.391215 00:0c:29:84:5d:88 > 00:50:56:26:69:ea, ethertype IPv4
> (0x0800), length 98: 10.10.10.2 > 10.10.10.1: ICMP echo request, id
> 32844, seq 53, length 64
>
> I am pinging from HostA to HostB through if_vlan. When the netmap bridge is
> closed, everything is okay: we can see the original packet in tcpdump.
> But when the netmap bridge is started, the packet's vlan header is lost, as
> you can see above. The netmap bridge app is the original
> tools/tools/netmap/bridge.c application.
> HostA and HostB are connected back to back directly with a patch cable.
> There is no switch between them.
>
> I tried this test on real hardware em, igb and vmware e1000 (em) nics.
> Problem is easy to reproduce.
> But there is no such problem on ix and ixl cards.
>
> Is it possible to check and fix?
> Best Regards,
> Özkan KIRIK
>


Re: vmxnet3: possible bug in vmxnet3_isc_rxd_pkt_get

2021-11-20 Thread Vincenzo Maffione
+1 for adding the sanity check in vmxnet3_isc_rxd_pkt_get().
This looks like a bug to me...
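
The check being discussed amounts to bounding the number of fragments
collected for a single packet. A minimal sketch of that logic, written in
Python rather than driver C, and assuming the fragment array holds 64 entries
(an assumption based on the kgdb dump quoted further down). MAX_RX_SEGS,
collect_fragments and the dict-based descriptors are illustrative names, not
vmxnet3 or iflib identifiers:

```python
MAX_RX_SEGS = 64  # assumed capacity of the per-packet fragment array


def collect_fragments(completions):
    """Gather the fragments of one packet from an iterable of completion
    descriptors (dicts with 'idx', 'len' and 'eop' keys).

    Returns the list of (idx, len) fragments, or None if the packet has
    more fragments than the array can hold (the caller should drop it).
    """
    frags = []
    overflow = False
    for cd in completions:
        if len(frags) < MAX_RX_SEGS:
            frags.append((cd["idx"], cd["len"]))
        else:
            overflow = True   # keep consuming until EOP, but store nothing
        if cd["eop"]:
            break
    return None if overflow else frags
```

In this sketch the descriptors are still consumed up to end-of-packet on
overflow, so that the completion ring position stays consistent. Whether the
real driver should drop the packet, count an error, or do something else is a
separate design decision.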

Cheers
  Vincenzo

On Fri, Nov 19, 2021 at 19:46, Andriy Gapon wrote:

> On 19/11/2021 20:19, Andriy Gapon wrote:
> > Here is some data to demonstrate the issue:
> > $1 = (iflib_rxq_t) 0xfe00ea9f6200
> > (kgdb) p $1->ifr_frags[0]
> > $2 = {irf_flid = 0 '\000', irf_idx = 1799, irf_len = 118}
> >
> > (kgdb) p $1->ifr_frags[1]
> > $3 = {irf_flid = 1 '\001', irf_idx = 674, irf_len = 0}
> > (kgdb) p $1->ifr_frags[2]
> > $4 = {irf_flid = 1 '\001', irf_idx = 675, irf_len = 0}
> >
> > ... elements 3..62 follow the same pattern ...
> >
> > (kgdb) p $1->ifr_frags[63]
> > $6 = {irf_flid = 1 '\001', irf_idx = 736, irf_len = 0}
> >
> > and then...
> >
> > (kgdb) p $1->ifr_frags[64]
> > $7 = {irf_flid = 1 '\001', irf_idx = 737, irf_len = 0}
> > (kgdb) p $1->ifr_frags[65]
> > $8 = {irf_flid = 1 '\001', irf_idx = 738, irf_len = 0}
> > ... the pattern continues ...
> > (kgdb) p $1->ifr_frags[70]
> > $10 = {irf_flid = 1 '\001', irf_idx = 743, irf_len = 0}
> >
> >
> > It seems like a start-of-packet completion descriptor referenced a
> > descriptor in the command ring zero (and apparently it didn't have the
> > end-of-packet bit). And there were another 70 zero-length completions
> > referencing the ring one until the end-of-packet.
> > So, in total 71 fragments were recorded.
> >
> > Or it's possible that those zero-length fragments were from the
> > penultimate pkt_get call and ifr_frags[0] was obtained after that...
>
>
> I think that this was the case and that I was able to find the
> corresponding descriptors in the completion ring.
>
> Please see https://people.freebsd.org/~avg/vmxnet3-fragment-overrun.txt
>
> $54 is the SOP, it has qid of 6.
> It is followed by many fragments with qid 14 (there are 8 queues / queue
> sets) and zero length.
> But not all of them are zero length, some have length of 4096, e.g. $77,
> $86, etc.
> $124 is the last fragment, it has eop = 1 and error = 1.
> So, there are 71 fragments in total.
>
> So, it is clear that VMWare produced 71 segments for a single packet
> before giving up on it.
>
> I wonder why it did that.
> Perhaps it's a bug, perhaps it does not count zero-length segments against
> the limit, maybe something else.
>
> In any case, it happens.
>
> Finally, the packet looks interesting: udp = 0, tcp = 0, ipcsum_ok = 0,
> ipv6 = 0, ipv4 = 0.  I wonder what kind of a packet it could be -- being
> rather large and not an IP packet.
>
> > I am not sure how that could happen.
> > I am thinking about adding a sanity check for the number of fragments.
> > Not sure yet what options there are for handling the overflow besides
> > panicking.
> >
> >
> > Also, some data from the vmxnet3's side of things:
> > (kgdb) p $15.vmx_rxq[6]
> > $18 = {vxrxq_sc = 0xf80002d9b800, vxrxq_id = 6, vxrxq_intr_idx = 6,
> >   vxrxq_irq = {ii_res = 0xf80002f23e00, ii_rid = 7,
> >     ii_tag = 0xf80002f23d80},
> >   vxrxq_cmd_ring = {
> >     {vxrxr_rxd = 0xfe00ead3c000, vxrxr_ndesc = 2048, vxrxr_gen = 0,
> >      vxrxr_paddr = 57917440, vxrxr_desc_skips = 1114,
> >      vxrxr_refill_start = 1799},
> >     {vxrxr_rxd = 0xfe00ead44000, vxrxr_ndesc = 2048, vxrxr_gen = 0,
> >      vxrxr_paddr = 57950208, vxrxr_desc_skips = 121,
> >      vxrxr_refill_start = 743}},
> >   vxrxq_comp_ring = {
> >     vxcr_u = {txcd = 0xfe00ead2c000, rxcd = 0xfe00ead2c000},
> >     vxcr_next = 0, vxcr_ndesc = 4096, vxcr_gen = 1,
> >     vxcr_paddr = 57851904, vxcr_zero_length = 1044,
> >     vxcr_pkt_errors = 128},
> >   vxrxq_rs = 0xf80002d78e00, vxrxq_sysctl = 0xf80004308080,
> >   vxrxq_name = "vmx0-rx6\000\000\000\000\000\000\000"}
> >
> > vxrxr_refill_start values are consistent with what is seen in
> > ifr_frags[].
> > vxcr_zero_length and vxcr_pkt_errors are both not zero, so maybe
> > something got the driver into a confused state or the emulated hardware
> > became confused.
>
>
> --
> Andriy Gapon
>
>


Re: Vector Packet Processing (VPP) portability on FreeBSD

2021-05-18 Thread Vincenzo Maffione
On Tue, May 18, 2021 at 09:32, Kevin Bowling <
kevin.bowl...@kev009.com> wrote:

>
>
> On Mon, May 17, 2021 at 10:20 AM Marko Zec  wrote:
>
>> On Mon, 17 May 2021 09:53:25 +
>> Francois ten Krooden  wrote:
>>
>> > On 2021/05/16 09:22, Vincenzo Maffione wrote:
>> >
>> > >
>> > > Hi,
>> > >   Yes, you are not using emulated netmap mode.
>> > >
>> > >   In the test setup depicted here
>> > > https://github.com/ftk-ntq/vpp/wiki/VPP-throughput-using-netmap-interfaces#test-setup
>> > > I think you should really try to replace VPP with the netmap
>> > > "bridge" application (tools/tools/netmap/bridge.c), and see what
>> > > numbers you get.
>> > >
>> > > You would run the application this way
>> > > # bridge -i ix0 -i ix1
>> > > and this will forward any traffic between ix0 and ix1 (in both
>> > > directions).
>> > >
>> > > These numbers would give you a better idea of where to look next
>> > > (e.g. VPP code improvements or system tuning such as NIC
>> > > interrupts, CPU binding, etc.).
>> >
>> > Thank you for the suggestion.
>> > I did run a test with the bridge this morning, and updated the
>> > results as well.
>> > +-------------+------------------+
>> > | Packet Size | Throughput (pps) |
>> > +-------------+------------------+
>> > |   64 bytes  |    7.197 Mpps    |
>> > |  128 bytes  |    7.638 Mpps    |
>> > |  512 bytes  |    2.358 Mpps    |
>> > | 1280 bytes  |  964.915 kpps    |
>> > | 1518 bytes  |  815.239 kpps    |
>> > +-------------+------------------+
>>
>> I assume you're on 13.0 where netmap throughput is lower compared to
>> 11.x due to migration of most drivers to iflib (apparently increased
>> overhead) and different driver defaults.  On 11.x I could move 10G line
>> rate from one ix to another at low CPU freqs, where on 13.x the CPU
>> must be set to max speed, and still can't do 14.88 Mpps.
>>
>
> I believe this issue is in the combined txrx interrupt filter.  It is
> causing a bunch of unnecessary tx re-arms.
>

Could you please elaborate on that?

TX completion is indeed the one thing that changed considerably with the
porting to iflib. And this could be a major contributor to the performance
drop.
My understanding is that TX interrupts are not really used anymore on
multi-gigabit NICs such as ix or ixl. Instead, "softirqs" are used, meaning
that a timer is used to perform TX completion. I don't know what the
motivations were for this design decision.
I had to decrease the timer period to 90us to ensure timely completion (see
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=248652). However, the
timer period is currently not adaptive.
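
To get a feel for why the period matters, here is a rough back-of-the-envelope
model in Python. It assumes, as a deliberate simplification and not as a
description of the actual iflib code, that TX slots are reclaimed only when
the timer fires. The function names are made up for the example.

```python
def tx_slots_needed(rate_pps: float, period_s: float) -> float:
    """Packets transmitted between two consecutive timer ticks."""
    return rate_pps * period_s


def max_rate_pps(ring_slots: int, period_s: float) -> float:
    """Highest rate a TX ring can sustain if its slots are only freed
    once per timer tick (simplifying assumption)."""
    return ring_slots / period_s


PERIOD = 90e-6  # 90us timer period
print(f"slots per tick at 14.88 Mpps: {tx_slots_needed(14.88e6, PERIOD):.0f}")
for slots in (1024, 2048):
    print(f"{slots} slots: up to {max_rate_pps(slots, PERIOD) / 1e6:.2f} Mpps")
```

Under that assumption, 14.88 Mpps would consume about 1339 slots per 90us
tick, which is more than a 1024-slot ring can hold. Real behavior also
depends on when the application calls txsync, so this should only be read as
an order-of-magnitude argument.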



>
>
>> #1 thing which changed: default # of packets per ring dropped down from
>> 2048 (11.x) to 1024 (13.x).  Try changing this in /boot/loader.conf:
>>
>> dev.ixl.0.iflib.override_nrxds=2048
>> dev.ixl.0.iflib.override_ntxds=2048
>> dev.ixl.1.iflib.override_nrxds=2048
>> dev.ixl.1.iflib.override_ntxds=2048
>> etc.
>>
>> For me this increases the throughput of
>> bridge -i netmap:ixl0 -i netmap:ixl1
>> from 9.3 Mpps to 11.4 Mpps
>>
>> #2: default interrupt moderation delays seem to be too long.  Combined
>> with increasing the ring sizes, reducing dev.ixl.0.rx_itr from 62
>> (default) to 40 increases the throughput further from 11.4 to 14.5 Mpps
>>
>> Hope this helps,
>>
>> Marko
>>
>>
>> > Apart from the 64-byte and 128-byte packets, the other sizes were
>> > matching the maximum rates possible on 10Gbps. This was when the
>> > bridge application was running on a single core, and the CPU core was
>> > maxing out at 100%.
>> >
>> > I think there might be a bit of system tuning needed, but I suspect
>> > most of the improvement would be needed in VPP.
>> >
>> > Regards
>> > Francois
>> ___
>> freebsd-net@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
>>
>


Re: Vector Packet Processing (VPP) portability on FreeBSD

2021-05-18 Thread Vincenzo Maffione
+1

Thanks,
  Vincenzo

On Mon, May 17, 2021 at 19:20, Marko Zec wrote:

> On Mon, 17 May 2021 09:53:25 +
> Francois ten Krooden  wrote:
>
> > On 2021/05/16 09:22, Vincenzo Maffione wrote:
> >
> > >
> > > Hi,
> > >   Yes, you are not using emulated netmap mode.
> > >
> > >   In the test setup depicted here
> > > https://github.com/ftk-ntq/vpp/wiki/VPP-throughput-using-netmap-interfaces#test-setup
> > > I think you should really try to replace VPP with the netmap
> > > "bridge" application (tools/tools/netmap/bridge.c), and see what
> > > numbers you get.
> > >
> > > You would run the application this way
> > > # bridge -i ix0 -i ix1
> > > and this will forward any traffic between ix0 and ix1 (in both
> > > directions).
> > >
> > > These numbers would give you a better idea of where to look next
> > > (e.g. VPP code improvements or system tuning such as NIC
> > > interrupts, CPU binding, etc.).
> >
> > Thank you for the suggestion.
> > I did run a test with the bridge this morning, and updated the
> > results as well.
> > +-------------+------------------+
> > | Packet Size | Throughput (pps) |
> > +-------------+------------------+
> > |   64 bytes  |    7.197 Mpps    |
> > |  128 bytes  |    7.638 Mpps    |
> > |  512 bytes  |    2.358 Mpps    |
> > | 1280 bytes  |  964.915 kpps    |
> > | 1518 bytes  |  815.239 kpps    |
> > +-------------+------------------+
>
> I assume you're on 13.0 where netmap throughput is lower compared to
> 11.x due to migration of most drivers to iflib (apparently increased
> overhead) and different driver defaults.  On 11.x I could move 10G line
> rate from one ix to another at low CPU freqs, where on 13.x the CPU
> must be set to max speed, and still can't do 14.88 Mpps.
>
> #1 thing which changed: default # of packets per ring dropped down from
> 2048 (11.x) to 1024 (13.x).  Try changing this in /boot/loader.conf:
>
> dev.ixl.0.iflib.override_nrxds=2048
> dev.ixl.0.iflib.override_ntxds=2048
> dev.ixl.1.iflib.override_nrxds=2048
> dev.ixl.1.iflib.override_ntxds=2048
> etc.
>
> For me this increases the throughput of
> bridge -i netmap:ixl0 -i netmap:ixl1
> from 9.3 Mpps to 11.4 Mpps
>
> #2: default interrupt moderation delays seem to be too long.  Combined
> with increasing the ring sizes, reducing dev.ixl.0.rx_itr from 62
> (default) to 40 increases the throughput further from 11.4 to 14.5 Mpps
>
> Hope this helps,
>
> Marko
>
>
> > Apart from the 64-byte and 128-byte packets, the other sizes were
> > matching the maximum rates possible on 10Gbps. This was when the
> > bridge application was running on a single core, and the CPU core was
> > maxing out at 100%.
> >
> > I think there might be a bit of system tuning needed, but I suspect
> > most of the improvement would be needed in VPP.
> >
> > Regards
> > Francois
>


Re: Vector Packet Processing (VPP) portability on FreeBSD

2021-05-16 Thread Vincenzo Maffione
Hi,
  Yes, you are not using emulated netmap mode.

  In the test setup depicted here
https://github.com/ftk-ntq/vpp/wiki/VPP-throughput-using-netmap-interfaces#test-setup
I think you should really try to replace VPP with the netmap "bridge"
application (tools/tools/netmap/bridge.c), and see what numbers you get.

You would run the application this way
# bridge -i ix0 -i ix1
and this will forward any traffic between ix0 and ix1 (in both directions).

These numbers would give you a better idea of where to look next (e.g. VPP
code improvements or system tuning such as NIC interrupts, CPU binding,
etc.).
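
As a reference point for those numbers, the theoretical packet rate of a
10 Gbit/s Ethernet link can be computed from the frame size. A small Python
sketch, assuming the frame size includes the FCS and adding the standard 20
bytes of per-frame overhead on the wire (preamble, start-of-frame delimiter
and inter-frame gap); line_rate_pps is just a name chosen for the example:

```python
ETH_OVERHEAD_BYTES = 20  # preamble (7) + SFD (1) + inter-frame gap (12)


def line_rate_pps(frame_bytes: int, link_bps: float = 10e9) -> float:
    """Theoretical maximum packets per second for frames of the given
    size (FCS included) on a link of link_bps bits per second."""
    return link_bps / ((frame_bytes + ETH_OVERHEAD_BYTES) * 8)


for size in (64, 128, 512, 1280, 1518):
    print(f"{size:5d} bytes: {line_rate_pps(size) / 1e6:7.3f} Mpps")
```

For 64-byte frames this gives the usual 14.88 Mpps figure. Measured rates
close to the computed value mean the link is the bottleneck for that size;
rates well below it point at the software path or at system tuning.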

Cheers,
  Vincenzo

On Thu, May 13, 2021 at 15:02, Francois ten Krooden <
f...@nanoteq.com> wrote:

> On Thursday, 13 May 2021 13:59 Jacques Fourie
>
> >
> > On Thu, May 13, 2021 at 7:27 AM Francois ten Krooden 
> > wrote:
> >
> > On Thursday, 13 May 2021 13:05 Luigi Rizzo wrote:
> > >
> > > On Thu, May 13, 2021 at 10:42 AM Francois ten Krooden
> > >  wrote:
> > > >
> > > > Hi
> > > >
> > > > Just for info I ran a test using TREX (https://trex-tgn.cisco.com/)
> > > > where I just sent traffic in one direction through the box running
> > > > FreeBSD with VPP using the netmap interfaces.
> > > > These were the results we found before significant packet loss
> > > > started occurring.
> > > > +-------------+------------------+
> > > > | Packet Size | Throughput (pps) |
> > > > +-------------+------------------+
> > > > |   64 bytes  |    1.008 Mpps    |
> > > > |  128 bytes  |  920.311 kpps    |
> > > > |  256 bytes  |  797.789 kpps    |
> > > > |  512 bytes  |  706.338 kpps    |
> > > > | 1024 bytes  |  621.963 kpps    |
> > > > | 1280 bytes  |  569.140 kpps    |
> > > > | 1440 bytes  |  547.139 kpps    |
> > > > | 1518 bytes  |  524.864 kpps    |
> > > > +-------------+------------------+
> > >
> > > Those numbers are way too low for netmap.
> > >
> > > I believe you are either using the emulated mode, or issuing a system
> > > call on every single packet.
> > >
> > > I am not up to date (Vincenzo may know better) but there used to be a
> > > sysctl variable to control the operating mode:
> > >
> > > https://www.freebsd.org/cgi/man.cgi?query=netmap&sektion=4
> > >
> > > SYSCTL VARIABLES AND MODULE PARAMETERS
> > >  Some aspects of the operation of netmap and VALE are controlled
> > >  through sysctl variables on FreeBSD (dev.netmap.*) and module
> > >  parameters on Linux (/sys/module/netmap/parameters/*):
> > >
> > >  dev.netmap.admode: 0
> > >  Controls the use of native or emulated adapter mode.
> > >
> > >  0 uses the best available option;
> > >
> > >  1 forces native mode and fails if not available;
> > >
> > >  2 forces emulated hence never fails.
> > >
> > > If it still exists, try setting it to 1. If the program fails, then you
> > > should figure out why native netmap support is not compiled in.
> >
> > Thank you.  I did set this to 1 specifically now and it still works.  So
> > then it should be running in native mode.
> >
> > I will dig a bit into the function that processes the incoming packets.
> > The code I currently use was added to VPP sometime before 2016, so it
> > might be that there is a bug in that code.
> >
> > Will try and see if I can find anything interesting there.
> >
> > >
> > > cheers
> > > luigi
> > >
> > A couple of questions / suggestions:
>
> Thank you for the suggestions.
>
> > Will it be possible to test using the netmap bridge app or a vale switch
> > instead of vpp?
> I did perform a test using netmap-fwd
> (https://github.com/Netgate/netmap-fwd).
> I did look at the code and it appears that the packets are processed as a
> batch in the application.  But each packet is passed through the complete
> IP stack in the application, before the next one is processed.
> With this application it was possible to reach about 1.4Mpps for 64-byte
> packets, and 812 kpps for 1518 byte packets
> I haven't done any other tweaking on the FreeBSD box yet.  It is running
> FreeBSD 13.0
>
> > Did you verify that the TREX setup can perform at line rate when
> > connected back to back?
> We did tests with TREX back to back yesterday and we reached the following.
> +-------------+------------------+
> | Packet Size | Throughput (pps) |
> +-------------+------------------+
> |   64 bytes  |   14.570 Mpps    |
> |  128 bytes  |    8.466 Mpps    |
> |  256 bytes  |    4.542 Mpps    |
> |  512 bytes  |    2.354 Mpps    |
> | 1024 bytes  |    1.200 Mpps    |
> | 1280 bytes  |  965.042 kpps    |
> | 1440 bytes  |  857.795 kpps    |
> | 1518 bytes  |  814.690 kpps    |
> +-------------+------------------+
>
> > Which NICs are you using?
> We are using Intel X552 10 GbE SFP+ NICs, which are part of the Intel Xeon
> D-1537 SoC, on a SuperMicro X10SDV-8C-TLN4F+ board.
>
> I will also put the results on the github repository
> https://github.com/ftk-ntq/vpp/wiki
> and will update as we get some more information
>
> Kind Regards
> 

Re: ixl netmap TX queue remains full

2021-03-31 Thread Vincenzo Maffione
Hi Özkan,
  I'm glad that worked.
Nevertheless, there must be an issue lurking around in the ixl driver code,
affecting the case enable_head_writeback==1.
It may be related to the fact that https://reviews.freebsd.org/D26896
causes issues, even though it looks like a legitimate change.

Cheers,
  Vincenzo

On Tue, Mar 30, 2021 at 08:56, Özkan KIRIK wrote:

> Hello Vincenzo,
>
> Before your email, hw.ixl.enable_head_writeback was set to 1. Following your
> suggestion, I set hw.ixl.enable_head_writeback = 0, and now it works
> properly.
>
> Thank you so much
>
> Cheers
> Özkan
>
> On Tue, Mar 30, 2021 at 9:22 AM Vincenzo Maffione 
> wrote:
>
>> Hi,
>>   Could this be related to
>> https://reviews.freebsd.org/D26896?
>>
>> Moreover, what happens if you switch the enable_head_writeback sysctl?
>>
>> Cheers,
>>   Vincenzo
>>
>> On Mon, Mar 29, 2021 at 10:36, Özkan KIRIK <
>> ozkan.ki...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I hit problems with the ixl driver's netmap support. I have no problems
>>> with ixgbe.
>>> The problem was reproduced on FreeBSD 12.2-p5 and FreeBSD 13.0-RC3.
>>>
>>> With ixl in netmap mode, it works with low throughput (about 2 Gbps) for
>>> 20-30 seconds, and then the TX queue remains full. poll() with POLLOUT and
>>> even ioctl(fd, NIOCTXSYNC) do not help, so the NIC stops working.
>>>
>>> Same netmap software with ixgbe has no problems.
>>>
>>> pciconf -lv output:
>>> ixl0@pci0:183:0:0: class=0x02 card=0x37d215d9 chip=0x37d28086
>>> rev=0x04
>>> hdr=0x00
>>> vendor = 'Intel Corporation'
>>> device = 'Ethernet Connection X722 for 10GBASE-T'
>>> class  = network
>>> subclass   = ethernet
>>> ixl1@pci0:183:0:1: class=0x02 card=0x37d215d9 chip=0x37d28086
>>> rev=0x04
>>> hdr=0x00
>>> vendor = 'Intel Corporation'
>>> device = 'Ethernet Connection X722 for 10GBASE-T'
>>> class  = network
>>> subclass   = ethernet
>>> ixl2@pci0:183:0:2: class=0x02 card=0x37d015d9 chip=0x37d08086
>>> rev=0x04
>>> hdr=0x00
>>> vendor = 'Intel Corporation'
>>> device = 'Ethernet Connection X722 for 10GbE SFP+'
>>> class  = network
>>> subclass   = ethernet
>>> ixl3@pci0:183:0:3: class=0x02 card=0x37d015d9 chip=0x37d08086
>>> rev=0x04
>>> hdr=0x00
>>> vendor = 'Intel Corporation'
>>> device = 'Ethernet Connection X722 for 10GbE SFP+'
>>> class  = network
>>> subclass   = ethernet
>>>
>>> Best regards
>>>
>>


Re: ixl netmap TX queue remains full

2021-03-30 Thread Vincenzo Maffione
Hi,
  Could this be related to
https://reviews.freebsd.org/D26896?

Moreover, what happens if you switch the enable_head_writeback sysctl?

Cheers,
  Vincenzo

On Mon, Mar 29, 2021 at 10:36, Özkan KIRIK wrote:

> Hello,
>
> I hit problems with the ixl driver's netmap support. I have no problems
> with ixgbe.
> The problem was reproduced on FreeBSD 12.2-p5 and FreeBSD 13.0-RC3.
>
> With ixl in netmap mode, it works with low throughput (about 2 Gbps) for
> 20-30 seconds, and then the TX queue remains full. poll() with POLLOUT and
> even ioctl(fd, NIOCTXSYNC) do not help, so the NIC stops working.
>
> Same netmap software with ixgbe has no problems.
>
> pciconf -lv output:
> ixl0@pci0:183:0:0: class=0x02 card=0x37d215d9 chip=0x37d28086 rev=0x04
> hdr=0x00
> vendor = 'Intel Corporation'
> device = 'Ethernet Connection X722 for 10GBASE-T'
> class  = network
> subclass   = ethernet
> ixl1@pci0:183:0:1: class=0x02 card=0x37d215d9 chip=0x37d28086 rev=0x04
> hdr=0x00
> vendor = 'Intel Corporation'
> device = 'Ethernet Connection X722 for 10GBASE-T'
> class  = network
> subclass   = ethernet
> ixl2@pci0:183:0:2: class=0x02 card=0x37d015d9 chip=0x37d08086 rev=0x04
> hdr=0x00
> vendor = 'Intel Corporation'
> device = 'Ethernet Connection X722 for 10GbE SFP+'
> class  = network
> subclass   = ethernet
> ixl3@pci0:183:0:3: class=0x02 card=0x37d015d9 chip=0x37d08086 rev=0x04
> hdr=0x00
> vendor = 'Intel Corporation'
> device = 'Ethernet Connection X722 for 10GbE SFP+'
> class  = network
> subclass   = ethernet
>
> Best regards
>


Re: Netmap bridge not working with 10G Ethernet ports

2020-11-21 Thread Vincenzo Maffione
Hi Rajesh,
  I think the issue here is simply that you have not enabled promiscuous
mode on your interfaces.
# ifconfig ix0 promisc
# ifconfig ix1 promisc

This is an additional requirement when using netmap bridge, because that is
not done automatically (differently from what happens with if_bridge(4)).
If promisc is not enabled, the NIC will drop any unicast packet that is not
directed to the NIC's address (e.g. the ARP reply in your case). Broadcast
packets will of course pass (e.g. the ARP request). This explains the
absence of IRQs and the head/tail pointers not being updated.
So no bugs AFAIK.
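
The filtering behavior described above can be summarized with a toy model,
sketched here in Python. It is a simplification: multicast filter tables are
ignored, and nic_accepts and the MAC addresses are made up for the example.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def nic_accepts(dst_mac: str, nic_mac: str, promisc: bool) -> bool:
    """Simplified NIC RX filter: accept everything in promiscuous mode,
    otherwise only broadcast frames and frames addressed to the NIC."""
    if promisc:
        return True
    dst = dst_mac.lower()
    return dst == BROADCAST or dst == nic_mac.lower()


IX0_MAC = "00:11:22:33:44:55"    # hypothetical address of the bridging NIC
PEER_MAC = "66:77:88:99:aa:bb"   # hypothetical address of the remote host

# ARP request (broadcast) passes even without promisc.
print(nic_accepts(BROADCAST, IX0_MAC, promisc=False))
# ARP reply addressed to the remote host is dropped unless promisc is set.
print(nic_accepts(PEER_MAC, IX0_MAC, promisc=False))
print(nic_accepts(PEER_MAC, IX0_MAC, promisc=True))
```

A bridging NIC mostly sees unicast frames addressed to other hosts, which is
why the second case matters for the netmap bridge.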

I figured it out the hard way, but it was actually also documented on the
github (https://github.com/luigirizzo/netmap#receiver-does-not-receive).
I will add it to the netmap bridge man page.

Cheers,
  Vincenzo


On Sat, Nov 21, 2020 at 17:06, Vincenzo Maffione <
vmaffi...@freebsd.org> wrote:

>
>
> On Fri, Nov 20, 2020 at 14:31, Rajesh Kumar wrote:
>
>> Hi Vincenzo,
>>
>> On Fri, Nov 20, 2020 at 3:20 AM Vincenzo Maffione 
>> wrote:
>>
>>>
>>> Ok, now it makes sense. Thanks for clarifying. I see that if_axp(4) uses
>>> iflib(4). This means that actually if_axp(4) has native netmap support,
>>> because iflib(4) has native netmap support.
>>>
>>>
>> It means that the driver has some modifications to allow netmap to
>>> directly program the NIC rings. These modifications are mostly the
>>> per-driver txsync and rxsyng routines.
>>> In case of iflib(4) drivers, these modifications are provided directly
>>> within the iflib(4) code, and therefore any driver using iflib will have
>>> native netmap support.
>>>
>>
>> Thanks for clarifying on the Native Netmap support.
>>
>> Ok, this makes sense, because also ix(4) uses iflib, and therefore you
>>> are basically hitting the same issue of if_axp(4)
>>> At this point I must think that there is still some issue with the
>>> interaction between iflib(4) and netmap(4).
>>>
>>
>> Ok. Let me know if any more debug info needed in this part.
>>
>> I see. This info may be useful. Have you tried to look at interrupts
>>> (e.g. `vmstat -i`), to see if "ix0" gets any RX interrupts (for the missing
>>> ARP replies)?
>>>
>>
>> It's interesting here. When I try with Intel NIC card. I see atleast 1
>> interrupt raised.  But not sure whether that is for ARP reply. Because,
>> when I try to dump the packet from "bridge"(modified) utility, I don't see
>> any ARP reply packet getting dumped.
>>
>>
>> *irq59: ix0:rxq01  0 (only 1 interrupt on
>> the opposite side)*irq67: ix0:aq  2  0
>>
>> *irq68: ix1:rxq03  0  (you can see 3
>> interrupts for 3 ARP requests from System 1)*irq76: ix1:aq
>>2  0
>>
>> The same experiment, when I try with AMD inbuilt ports, I don't see that
>> 1 interrupt also raised.
>>
>> irq81: ax0:dev_irq16  0
>> irq83: ax0  2541  4
>> irq93: ax1:dev_irq27  0
>> irq95: ax1  2371  3
>> *irq97: ax1:rxq03  0 (you can see 3
>> interrupts for 3 ARP requests from System 1, but no interrupt is seen from
>> "ax0:rxq0" for ARP reply from System 2)*
>>
>> I will do some more testing to see whether this behavior is consistent or
>> intermittent.
>>
>> Also the igb(4) driver is using iflib(4). So the involved netmap code is
>>> the same as ix(4) and if_axp(4).
>>> This is something that I'm not able to understand right now.
>>> It does not look like something related to offloads.
>>>
>>> Next week I will try to see if I can reproduce your issue with em(4),
>>> and report back. That's still an Intel driver using iflib(4).
>>>
>>
>> The "igb(4)" driver, with which things are working now is related to
>> em(4) driver (may be for newer hardware version).  Initially we faced
>> similar issue with igb(4) driver as well. After reverting the following
>> commits, things started to work.  Thanks to Stephan Dewt (copied) for
>> pointing this.  But it still fails with ix(4) driver and if_axp(4) driver.
>>
>>
>> https://github.com/freebsd/freebsd/commit/e12efc2c9e434075d0740e2e2e9e2fca2ad5f7cf
>>
>> Thanks for providing your inputs on this issue Vincenzo.  Let me know for
>> any more

Re: Netmap bridge not working with 10G Ethernet ports

2020-11-21 Thread Vincenzo Maffione
On Fri, Nov 20, 2020 at 14:31, Rajesh Kumar wrote:

> Hi Vincenzo,
>
> On Fri, Nov 20, 2020 at 3:20 AM Vincenzo Maffione 
> wrote:
>
>>
>> Ok, now it makes sense. Thanks for clarifying. I see that if_axp(4) uses
>> iflib(4). This means that actually if_axp(4) has native netmap support,
>> because iflib(4) has native netmap support.
>>
>>
>> It means that the driver has some modifications to allow netmap to
>> directly program the NIC rings. These modifications are mostly the
>> per-driver txsync and rxsync routines.
>> In case of iflib(4) drivers, these modifications are provided directly
>> within the iflib(4) code, and therefore any driver using iflib will have
>> native netmap support.
>>
>
> Thanks for clarifying on the Native Netmap support.
>
>> Ok, this makes sense, because ix(4) also uses iflib, and therefore you are
>> basically hitting the same issue as with if_axp(4).
>> At this point I must think that there is still some issue with the
>> interaction between iflib(4) and netmap(4).
>>
>
> Ok. Let me know if any more debug info needed in this part.
>
>> I see. This info may be useful. Have you tried to look at interrupts (e.g.
>> `vmstat -i`), to see if "ix0" gets any RX interrupts (for the missing ARP
>> replies)?
>>
>
> It's interesting here. When I try with the Intel NIC card, I see at least 1
> interrupt raised, but I am not sure whether that is for the ARP reply, because
> when I try to dump the packets from the (modified) "bridge" utility, I don't
> see any ARP reply packet getting dumped.
>
>
> irq59: ix0:rxq0    1  0   (only 1 interrupt on the opposite side)
> irq67: ix0:aq      2  0
> irq68: ix1:rxq0    3  0   (you can see 3 interrupts for 3 ARP requests from System 1)
> irq76: ix1:aq      2  0
>
> In the same experiment with the AMD inbuilt ports, I don't see even that 1
> interrupt raised.
>
> irq81: ax0:dev_irq   16  0
> irq83: ax0         2541  4
> irq93: ax1:dev_irq   27  0
> irq95: ax1         2371  3
> irq97: ax1:rxq0       3  0   (you can see 3 interrupts for 3 ARP requests
> from System 1, but no interrupt is seen from "ax0:rxq0" for the ARP reply
> from System 2)
>
> I will do some more testing to see whether this behavior is consistent or
> intermittent.
>
>> Also the igb(4) driver is using iflib(4). So the involved netmap code is
>> the same as ix(4) and if_axp(4).
>> This is something that I'm not able to understand right now.
>> It does not look like something related to offloads.
>>
>> Next week I will try to see if I can reproduce your issue with em(4), and
>> report back. That's still an Intel driver using iflib(4).
>>
>
> The igb(4) driver, with which things are working now, is related to the em(4)
> driver (maybe for a newer hardware version).  Initially we faced a similar
> issue with the igb(4) driver as well. After reverting the following commits,
> things started to work.  Thanks to Stephan Dewt (copied) for pointing
> this out.  But it still fails with the ix(4) and if_axp(4) drivers.
>
>
> https://github.com/freebsd/freebsd/commit/e12efc2c9e434075d0740e2e2e9e2fca2ad5f7cf
>
> Thanks for providing your inputs on this issue Vincenzo.  Let me know for
> any more details that you need.
>
>
I was able to reproduce your issue on FreeBSD-CURRENT running within a QEMU
VM, with two em(4) devices and the netmap bridge running between them.
I see the ARP request packet received on em0 (with the associated IRQ) and
forwarded on em1. However, the ARP reply coming in on em1 does not trigger an
IRQ on em1, and indeed the NIC RX head/tail pointers are not incremented as
they should be (`sysctl -a | grep em.1 | grep queue_rx`). That is weird,
and makes me think that the issue is more likely driver-related than
netmap/iflib-related.
In any case, would you mind filing the issue in Bugzilla, so that we
can track it properly?
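
To compare interrupt counts before and after sending a test packet, a small
helper along these lines may be handy. It is only a sketch: irq_total is a
made-up name, and it assumes the usual `vmstat -i` layout in which the last
two columns are the total and the rate.

```shell
# irq_total NAME: read `vmstat -i` style output on stdin and print the
# "total" column of the first line whose interrupt name ends in NAME
# (e.g. "em1:rxq0"). Prints 0 when no such line exists.
irq_total() {
    awk -v name="$1" '
        # The last two fields are total and rate; the field before them
        # is the last word of the interrupt name.
        NF >= 3 && $(NF-2) == name { print $(NF-1); found = 1; exit }
        END { if (!found) print 0 }
    '
}
```

On a live system this would be used as `vmstat -i | irq_total em1:rxq0`,
once before and once after the ARP reply is expected.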

Thanks,
  Vincenzo


> Thanks,
> Rajesh.
>


Re: Netmap bridge not working with 10G Ethernet ports

2020-11-19 Thread Vincenzo Maffione
On Thu, Nov 19, 2020 at 12:28, Rajesh Kumar wrote:

> Hi Vincenzo,
>
> Thanks for your reply.
>
> On Thu, Nov 19, 2020 at 3:16 AM Vincenzo Maffione 
> wrote:
>
>>
>> This looks like if_axe(4) driver, and therefore there's no native netmap
>> support, which means you are falling back on
>> the emulated netmap adapter. Are these USB dongles? If so, how can they
>> be 10G?
>>
>
> The driver I am working with is "if_axp" (sys/dev/axgbe).  This is the AMD
> 10 Gigabit Ethernet driver. It was recently committed upstream. Yes, it
> doesn't have native netmap support, but uses the existing netmap
> stack.  These are inbuilt SFP ports on our test board and not
> USB dongles.
>

Ok, now it makes sense. Thanks for clarifying. I see that if_axp(4) uses
iflib(4). This means that actually if_axp(4) has native netmap support,
because iflib(4) has native netmap support.


> Does Native netmap mean the hardware capability which needs to be
> programmed appropriately from driver side?  Any generic documentation
> regarding the same?
>

It means that the driver has some modifications to allow netmap to directly
program the NIC rings. These modifications are mostly the per-driver txsync
and rxsync routines.
In the case of iflib(4) drivers, these modifications are provided directly
within the iflib(4) code, and therefore any driver using iflib has
native netmap support.
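
Whether netmap is allowed to fall back to the emulated adapter is governed by
the dev.netmap.admode sysctl documented in netmap(4). The sketch below maps
its value to a short description; the function name describe_admode and the
wording of the descriptions are made up for this example.

```shell
# Map a dev.netmap.admode value to the adapter-selection policy
# described in netmap(4). The description strings are ours.
describe_admode() {
    case "$1" in
        0) echo "prefer native, fall back to emulated" ;;
        1) echo "native only" ;;
        2) echo "emulated only" ;;
        *) echo "unknown" ;;
    esac
}
```

On a FreeBSD host it could be called as
`describe_admode "$(sysctl -n dev.netmap.admode)"`. Setting the sysctl to 1
is one way to check that a driver really is using the native path, since
opening the port should then fail instead of silently using emulation.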


>
>> In this kind of configuration it is mandatory to disable all the NIC
>> offloads, because netmap does not program the NIC
>> to honor them, e.g.:
>>
>> # ifconfig ax0 -txcsum -rxcsum -tso4 -tso6 -lro -txcsum6 -rxcsum6
>> # ifconfig ax1 -txcsum -rxcsum -tso4 -tso6 -lro -txcsum6 -rxcsum6
>>
>
> Earlier, I hadn't tried disabling the offload capabilities.  I have tried
> it now, but it still behaves the same way.  ARP replies don't seem to reach
> the bridge to be forwarded (or they are dropped).  I will collect the details
> for the AMD driver. The same test with another 10G card (Intel "ix" driver)
> exhibits similar behavior.  Details below.
>

Ok, this makes sense, because ix(4) also uses iflib, and therefore you are
basically hitting the same issue as with if_axp(4).
At this point I have to assume that there is still some issue with the
interaction between iflib(4) and netmap(4).


>
>
>>> a) I tried with another vendor's 10G NIC card. It behaves the same way. So
>>> this issue seems to be generic and not hardware specific.
>>>
>>
>> Which driver are those NICs using? That makes the difference. I guess
>> it's still a driver with no native netmap support, hence
>> you are using the same emulated adapter
>>
>
> I am using the "ix" driver (Intel 10G NIC adapter).  I guess this driver
> also doesn't support native netmap.  Please correct me if I am wrong.  I
> disabled the offload capabilities on this device/driver, tested again,
> and still observed that netmap bridging fails.
>

As I stated above, ix(4) has netmap support, like any iflib(4) driver.


> root@fbsd_cur# sysctl dev.ix.0 | grep tx_packets
> dev.ix.0.queue7.tx_packets: 0
> dev.ix.0.queue6.tx_packets: 0
> dev.ix.0.queue5.tx_packets: 0
> dev.ix.0.queue4.tx_packets: 0
> dev.ix.0.queue3.tx_packets: 0
> dev.ix.0.queue2.tx_packets: 0
> dev.ix.0.queue1.tx_packets: 0
> *dev.ix.0.queue0.tx_packets: 3*
> root@fbsd_cur# sysctl dev.ix.0 | grep rx_packets
> dev.ix.0.queue7.rx_packets: 0
> dev.ix.0.queue6.rx_packets: 0
> dev.ix.0.queue5.rx_packets: 0
> dev.ix.0.queue4.rx_packets: 0
> dev.ix.0.queue3.rx_packets: 0
> dev.ix.0.queue2.rx_packets: 0
> dev.ix.0.queue1.rx_packets: 0
> dev.ix.0.queue0.rx_packets: 0
> root@fbsd_cur # sysctl dev.ix.1 | grep tx_packets
> dev.ix.1.queue7.tx_packets: 0
> dev.ix.1.queue6.tx_packets: 0
> dev.ix.1.queue5.tx_packets: 0
> dev.ix.1.queue4.tx_packets: 0
> dev.ix.1.queue3.tx_packets: 0
> dev.ix.1.queue2.tx_packets: 0
> dev.ix.1.queue1.tx_packets: 0
> dev.ix.1.queue0.tx_packets: 0
> root@fbsd_cur # sysctl dev.ix.1 | grep rx_packets
> dev.ix.1.queue7.rx_packets: 0
> dev.ix.1.queue6.rx_packets: 0
> dev.ix.1.queue5.rx_packets: 0
> dev.ix.1.queue4.rx_packets: 0
> dev.ix.1.queue3.rx_packets: 0
> dev.ix.1.queue2.rx_packets: 0
> dev.ix.1.queue1.rx_packets: 0
> *dev.ix.1.queue0.rx_packets: 3*
>
> You can see "ix1" received 3 packets (ARP requests) from system 1 and
> transmitted 3 packets to system 2 via "ix0". But the ARP reply from system 2
> is not captured or forwarded properly.
>

I see. This info may be useful. Have you tried to look at interrupts (e.g.
`vmstat -i`), to see if "ix0" gets any RX interrupts (for the missing ARP
replies)?
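
When a NIC has many queues, summing the per-queue counters makes it easier
to compare runs. The following is a sketch that assumes the
"dev.<driver>.<unit>.queueN.<counter>: <value>" layout shown in the output
above; sum_queue_counter is a made-up name.

```shell
# sum_queue_counter COUNTER: read `sysctl dev.<drv>.<unit>` output on stdin
# and print the sum of COUNTER (e.g. rx_packets) over all queues.
sum_queue_counter() {
    awk -v counter="$1" '
        {
            # Split "dev.ix.1.queue0.rx_packets: 3" into name and value.
            n = split($0, kv, ": ")
            if (n != 2) next
            # Keep only per-queue entries for the requested counter.
            if (kv[1] ~ ("\\.queue[0-9]+\\." counter "$")) sum += kv[2]
        }
        END { print sum + 0 }
    '
}
```

For example, `sysctl dev.ix.0 | sum_queue_counter rx_packets` gives the
total number of packets received on ix0 across all queues.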



Re: Netmap bridge not working with 10G Ethernet ports

2020-11-18 Thread Vincenzo Maffione
Hi,

On Wed, Nov 18, 2020 at 08:13, Rajesh Kumar wrote:

> Hi,
>
> I am testing a 10G network driver with the netmap "bridge" utility, where it
> doesn't seem to work. Here are my setup details.
>
> *System under Test:*  Running FreeBSD CURRENT.  Has two inbuilt 10G NIC
> ports.
> *System 1:* Running Ubuntu, whose network port is connected to Port1 of
> System Under Test
> *System 2:* Running FreeBSD CURRENT, whose network port is connected to
> Port 0 of System Under Test.
>
> Bridged Port 0 and Port 1 of the System Under Test using the netmap "bridge"
> utility. I am able to see the interfaces coming up active with link up.
> # bridge -c -v -i netmap:ax0 -i netmap:ax1
>

This looks like the if_axe(4) driver, and therefore there's no native netmap
support, which means you are falling back on
the emulated netmap adapter. Are these USB dongles? If so, how can they be
10G?


> Then tried pinging from System 1 to System 2. It fails.
>
> *Observations:*
> 1. ARP request from System 1 goes to bridge port 1 (netmap_rxsync) and then
> forwarded to port 0 (netmap_txsync)
> 2. ARP request is received in System 2 (via bridge port 0) and ARP reply is
> being sent from System 2.
> 3. ARP reply from System 2 does not seem to reach bridge port 0 to get
> forwarded to bridge port 1 and hence to System 1.
> 4. Above 3 steps happen 3 times for ARP resolution cycle and then fails.
> Hence the ping fails.
>
> On debugging, when the ARP reply is being sent from System 2, I don't see
> any interrupt triggered on bridge port 0 in the system under test.
>

In this kind of configuration it is mandatory to disable all the NIC
offloads, because netmap does not program the NIC
to honor them, e.g.:

# ifconfig ax0 -txcsum -rxcsum -tso4 -tso6 -lro -txcsum6 -rxcsum6
# ifconfig ax1 -txcsum -rxcsum -tso4 -tso6 -lro -txcsum6 -rxcsum6
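
With more than a couple of interfaces, the same commands can be generated
rather than typed. The sketch below only prints the commands so they can be
reviewed first; OFFLOAD_FLAGS and offload_off_cmds are made-up names, and the
flag list is simply the one used in the commands above.

```shell
# Offload flags taken from the ifconfig commands above.
OFFLOAD_FLAGS="-txcsum -rxcsum -tso4 -tso6 -lro -txcsum6 -rxcsum6"

# Print (not run) one ifconfig command per interface name given.
offload_off_cmds() {
    for ifname in "$@"; do
        echo "ifconfig $ifname $OFFLOAD_FLAGS"
    done
}
```

Once the output looks right, `offload_off_cmds ax0 ax1 | sh` run as root
would apply it.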


> Netstat in the system under test doesn't show any receive or drop counters
> incremented. But as I understand it, netstat captures the stats above the
> netmap stack, hence it does not reflect these counts.
>

Correct.


>
> *Note:*
> a) I tried with another vendor's 10G NIC card. It behaves the same way. So
> this issue seems to be generic and not hardware specific.
>

Which driver are those NICs using? That makes the difference. I guess it's
still a driver with no native netmap support, hence
you are using the same emulated adapter.


> b) Trying with another vendor's 1G NIC card, things are working.  So I am not
> sure what makes the difference here.  The ports in System 1 and System 2 are
> USB-attached Ethernet, capable of a maximum speed of 1G.  Does connecting
> 1G ports to 10G bridge ports have any impact?
>

I don't think so. On each p2p link the NICs will negotiate 1G speed.
In any case, what driver was this one?


> c) We have verified the same 10G driver with the pkt-gen utility and things
> are working. We face the issue only when using the "bridge" utility.
>

That may be because pkt-gen does not care about checksums, whereas the
TCP/IP stack does.
Hence the need to disable offloads (see above).

Cheers,
  Vincenzo


> So, wondering how the ARP reply packet is getting lost here. Any ideas to
> debug?
>
> Thanks,
> Rajesh.


Re: Netmap - Vale switch - tcp problem

2020-06-07 Thread Vincenzo Maffione
On Sat, Jun 6, 2020 at 20:32, Anthony Arnaud <
antho.arnaudi...@gmail.com> wrote:

> This works!!
> Good news!
> It works even if I attach the NIC to the VALE switch directly from ssh; the
> connection does not drop as it did before your patch.
>

Yes, attaching vtnet0 to the VALE switch is what I meant.


> Can you list the steps to set up the bridge in your tests? I think
> "ifconfig bridge create" etc.
>

I was doing my tests on a Linux host (KVM hypervisor), but it should be the
same if you run on FreeBSD.
The steps are:

# ifconfig bridge0 create
# ifconfig bridge0 up 192.168.1.1/24
# ifconfig tap0 create
# ifconfig tap0 up
# ifconfig bridge0 addm tap0

And then run your VM in such a way that your vtnet0 is backed by tap0.

These are basically the same instructions you find here:
https://www.freebsd.org/doc/handbook/virtualization-host-bhyve.html
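
For repeated test setups, the five steps above can be generated from a few
parameters. This is a sketch under the assumption that only the bridge name,
its address and the tap name vary; bridge_tap_cmds is a made-up helper that
prints the commands instead of running them.

```shell
# Print the host-side setup commands for a bridge with one tap member.
# Usage: bridge_tap_cmds BRIDGE ADDR/PREFIX TAP
bridge_tap_cmds() {
    bridge=$1 addr=$2 tap=$3
    cat <<EOF
ifconfig $bridge create
ifconfig $bridge up $addr
ifconfig $tap create
ifconfig $tap up
ifconfig $bridge addm $tap
EOF
}
```

`bridge_tap_cmds bridge0 192.168.1.1/24 tap0` reproduces the listing above;
piping the output to sh as root would apply it.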

> Now I have to understand why it doesn't work in my previous case.
>

At this point I think if you still don't see TCP traffic you are likely to
have some misconfiguration, rather than a VALE/vtnet issue.

It may be useful to know that netmap has an intercept mechanism, "netmap
monitor ports", that you can use to sniff on any netmap port, while the
netmap port is being used by another netmap application.

For instance, if you run
# pkt-gen -XX -i netmap:vtnet0/r
you will sniff all the packets that are being received on the vtnet0 port.
Otherwise you can sniff packets being transmitted:
# pkt-gen -XX -i netmap:vtnet0/t
The same goes for VALE ports:
# pkt-gen -XX -i vale0:1/r

This is very similar to tcpdump (although pkt-gen does not do packet decode
and pretty printing).
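
The port names used in the three commands above follow one pattern: the
normal port name plus "/r" or "/t". A tiny helper makes that explicit; the
name monitor_port and the rx/tx keywords are made up for this sketch.

```shell
# monitor_port PORT DIRECTION: append the monitor suffix used in the
# examples above ("/r" for received packets, "/t" for transmitted ones).
monitor_port() {
    case "$2" in
        rx) echo "$1/r" ;;
        tx) echo "$1/t" ;;
        *)  echo "usage: monitor_port PORT rx|tx" >&2; return 1 ;;
    esac
}
```

It would be used as `pkt-gen -XX -i "$(monitor_port vale0:1 rx)"`.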

Cheers,
  Vincenzo


> Thanks,
> Antho
>

Re: Netmap - Vale switch - tcp problem

2020-06-06 Thread Vincenzo Maffione
On Fri, Jun 5, 2020 at 18:16, Anthony Arnaud <
antho.arnaudi...@gmail.com> wrote:

> Hi Vincenzo,
> Thank you for your answers.
> I upgraded a guest machine in a Proxmox environment to FreeBSD 13.0-CURRENT #0
> r361830.
> After that I compiled the tools as usual from /src/tools/tools/netmap.
> I configured 2 vtnet NICs:
>
> vtnet0 with an IP 192.168.1.x
> vtnet1 without an IP and with csum disabled, as a mirror of vtnet0 (using Open
> vSwitch)
> TCP traffic is generated by an ssh connection from my host to the guest
> machine.
>

It is not clear where you run OVS: in the host or in the guest?


> "tcpdump -i vtnet1 tcp": at each keypress in ssh shell connected to
> 192.168.1.x I see a few TCP packets sniffed in the guest machine.
> But if I attach vtnet1 to the VALE switch, tcpdump no longer works as before.
> No TCP traffic is shown.
> Can I perform any other test about this?
>

First, have you tried to leave vtnet1 alone and run:

# ifconfig vtnet0 up 192.168.1.x -txcsum -rxcsum -tso4 -tso6
# vale-ctl -h vale0:vtnet0

and then verified that you can ssh into 192.168.1.x from the host?

That way you at least verify that vale-ctl -h also works in your environment.

Cheers,
  Vincenzo


> Cheers,
> Antho
>

Re: Netmap - Vale switch - tcp problem

2020-06-03 Thread Vincenzo Maffione
Hi Anthony,
  I fixed more bugs in the vtnet driver (in FreeBSD-CURRENT). As of
r361760, I'm now able to run the following steps in a VM with a vtnet device:

# ifconfig vtnet0 -txcsum -rxcsum -tso4 192.168.100.2/24
# vale-ctl -h vale:vtnet0
# netperf -H 192.168.100.1  # Run a netperf TCP_STREAM test to the host bridge interface (br0)

Since TCP works correctly at a reasonable speed, I'm confident that most of
the existing problems are gone.
Let me know if you have any questions about this.

Cheers,
  Vincenzo



Re: Netmap - Vale switch - tcp problem

2020-06-01 Thread Vincenzo Maffione
Hi Anthony,
  I think there is more than one driver-related bug that shows up when you
attach the interface to a VALE switch.
I've found and fixed some related to if_vtnet (see below). In any case, in
my tests there is no difference between TCP traffic and other traffic (UDP,
ICMP, STP, ...).
The issues are not related to LRO, contrary to what I initially thought.
There are still more bugs in vtnet and I'm trying to chase them down.
In the meantime it would help if you applied the patches below and tried again
with vtnet to see if the situation improves. They apply cleanly to the 12.1
release.

Regarding your problem with em devices, it is probably a different
issue altogether. It may or may not be related to the iflib transition. It
would help to try the same setup on stable/11 (which does not have iflib). I
don't have an em device, but I will try with an emulated em in QEMU/KVM.

Cheers,
  Vincenzo


r361698 | vmaffione | 2020-06-01 16:14:29 + (Mon, 01 Jun 2020) | 8 lines

netmap: if_vtnet: avoid netmap ring wraparound

netmap assumes the one "slot" is left unused to distinguish
the empty ring and full ring conditions. This assumption was
violated by vtnet_netmap_rxq_populate().

MFC after:  1 week
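
The "one slot left unused" convention mentioned in this commit message is the
classic way to tell an empty ring from a full one using only two indices. The
sketch below shows the idea on a generic ring; the names head and tail and the
helper functions are made up here and are not netmap's own fields or code.

```shell
# A ring of N slots described by two indices: head (next slot to consume)
# and tail (next slot to fill). One slot always stays unused, so:
#   empty: head == tail
#   full:  (tail + 1) % N == head
ring_is_empty() { [ "$1" -eq "$2" ]; }                  # args: head tail
ring_is_full()  { [ $(( ($2 + 1) % $3 )) -eq "$1" ]; }  # args: head tail N
ring_used()     { echo $(( ($2 - $1 + $3) % $3 )); }    # args: head tail N
```

If all N slots could be filled, tail would wrap around to equal head again
and a full ring would be indistinguishable from an empty one.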


r361697 | vmaffione | 2020-06-01 16:12:09 + (Mon, 01 Jun 2020) | 8 lines

netmap: if_vtnet: replace vtnet_free_used()

The functionality contained in this function is duplicated,
as it is already available in vtnet_txq_free_mbufs()
and vtnet_rxq_free_mbufs().

MFC after:  1 week


r361696 | vmaffione | 2020-06-01 16:10:44 + (Mon, 01 Jun 2020) | 13 lines

netmap: vtnet: fix RX virtqueue initialization bug

The vtnet_netmap_rxq_populate() function erroneously assumed
that kring->nr_hwcur = 0, i.e. the kring was in the initial
state. However, this is not always the case: for example,
when a vtnet reinit is triggered by some changes in the
interface flags or capenable.
This patch changes the behaviour of vtnet_netmap_kring_refill()
so that it always starts publishing the netmap buffers starting
from the current value of kring->nr_hwcur.

MFC after:  1 week
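
To picture what "starting from the current value of kring->nr_hwcur" means,
the sketch below lists which slot indices would be published for a ring of N
slots. It is an illustration only and not the driver code: refill_slots is a
made-up name, and publishing exactly N-1 slots is an assumption based on the
one-unused-slot rule from r361698.

```shell
# refill_slots HWCUR N: print the N-1 slot indices to publish, starting at
# HWCUR and wrapping around the ring, so that one slot stays unused.
refill_slots() {
    hwcur=$1 n=$2 i=0 out=""
    while [ "$i" -lt $(( n - 1 )) ]; do
        out="$out$(( (hwcur + i) % n )) "
        i=$(( i + 1 ))
    done
    # Drop the trailing space before printing.
    echo "${out% }"
}
```

With a 4-slot ring, starting from 0 gives slots 0 1 2, while starting from 2
(as could happen after a reinit) gives 2 3 0. Always starting from 0 would
get the second case wrong.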

On Mon, Jun 1, 2020 at 15:19, Anthony Arnaud <
antho.arnaudi...@gmail.com> wrote:

> Hi Vincenzo,
>
> To simplify the scenario I have installed FreeBSD 12.1 from scratch on a new
> machine, without any virtualization environment.
> I have encountered the same problem: when I attach an Ethernet interface
> to the VALE switch (in this case an Intel card, em5), the TCP traffic
> disappears and tcpdump shows only UDP, ICMP6 and STP packets.
> If I detach the NIC from vale0, tcpdump shows all TCP traffic.
> I'm using the netmap version included in FreeBSD 12.1, and I have compiled
> the vale-ctl present in the kernel sources (/src/tools/tools/netmap/).
> I executed your steps.
> There is something strange about that behaviour...
>
> Cheers
> Anthon
>
>


Re: Netmap - Vale switch - tcp problem

2020-05-30 Thread Vincenzo Maffione
On Fri, May 29, 2020 at 19:01, Anthony Arnaud <
antho.arnaudi...@gmail.com> wrote:

> Hi Vincenzo,
>
> Thanks for your hints!
> I rebooted my guest FreeBSD 12.1 machine, and I have performed your steps:
>
> #ifconfig vtnet1 -txcsum -rxcsum -tso4 -tso6 up promisc
>
> vtnet1: flags=28943 metric 0 mtu 1500
>
> options=6c04b8
> ether 0e:bd:ec:7a:08:06
> media: Ethernet 10Gbase-T 
> status: active
> nd6 options=29
>
> tcpdump is ok.
>

Does it mean you see ICMP, UDP and TCP traffic?


> But after:
>
> #vale-ctl
>
> 446.196827 bdg_ctl [149] bridge:0 port:0 vale0:vtnet1
> 446.196855 bdg_ctl [149] bridge:0 port:1 vale0:vtnet1^
>
> tcpdump does not work anymore.
>

Do you see ICMP/UDP only and not TCP?


> I don't understand why the TCP traffic disappears.
> In my configuration vtnet1 is a mirror port created by Open vSwitch, but I
> don't think that's the reason.
>

No, I don't think that's relevant.

In my setup, vtnet0 is a guest interface backed by a host tap device
(attached to a linux bridge), and hypervisor is QEMU/KVM.
Here are the steps I follow in the VM (in this order):

# ifconfig vtnet0 -txcsum -rxcsum -tso4 -tso6 up 192.168.100.2/24
# vale-ctl -h vale0:vtnet0

# nc 192.168.100.1  # connect to listening netcat in the host.
hello
abc
[...]
# tcpdump -ni vtnet0 # This shows the TCP traffic.

I start to see problems when I change the offloads:
# ifconfig vtnet0 -lro

Cheers,
  Vincenzo


> Below some info about my configuration:
>
> dev.netmap.iflib_rx_miss_bufs: 0
> dev.netmap.iflib_rx_miss: 0
> dev.netmap.iflib_crcstrip: 1
> dev.netmap.bridge_batch: 1024
> dev.netmap.default_pipes: 0
> dev.netmap.priv_buf_num: 4098
> dev.netmap.priv_buf_size: 2048
> dev.netmap.buf_curr_num: 163840
> dev.netmap.buf_num: 163840
> dev.netmap.buf_curr_size: 2048
> dev.netmap.buf_size: 2048
> dev.netmap.priv_ring_num: 4
> dev.netmap.priv_ring_size: 20480
> dev.netmap.ring_curr_num: 200
> dev.netmap.ring_num: 200
> dev.netmap.ring_curr_size: 36864
> dev.netmap.ring_size: 36864
> dev.netmap.priv_if_num: 2
> dev.netmap.priv_if_size: 1024
> dev.netmap.if_curr_num: 100
> dev.netmap.if_num: 100
> dev.netmap.if_curr_size: 1024
> dev.netmap.if_size: 1024
> dev.netmap.ptnet_vnet_hdr: 1
> dev.netmap.generic_rings: 1
> dev.netmap.generic_ringsize: 1024
> dev.netmap.generic_mit: 10
> dev.netmap.generic_hwcsum: 0
> dev.netmap.admode: 0
> dev.netmap.fwd: 0
> dev.netmap.txsync_retry: 2
> dev.netmap.no_pendintr: 1
> dev.netmap.no_timestamp: 0
> dev.netmap.verbose: 0
>
>
> dev.vtnet.1.txq0.rescheduled: 0
> dev.vtnet.1.txq0.tso: 0
> dev.vtnet.1.txq0.csum: 0
> dev.vtnet.1.txq0.omcasts: 0
> dev.vtnet.1.txq0.obytes: 0
> dev.vtnet.1.txq0.opackets: 0
> dev.vtnet.1.rxq0.rescheduled: 0
> dev.vtnet.1.rxq0.csum_failed: 0
> dev.vtnet.1.rxq0.csum: 66
> dev.vtnet.1.rxq0.ierrors: 0
> dev.vtnet.1.rxq0.iqdrops: 0
> dev.vtnet.1.rxq0.ibytes: 11904780
> dev.vtnet.1.rxq0.ipackets: 40984
> dev.vtnet.1.tx_task_rescheduled: 0
> dev.vtnet.1.tx_tso_offloaded: 0
> dev.vtnet.1.tx_csum_offloaded: 0
> dev.vtnet.1.tx_defrag_failed: 0
> dev.vtnet.1.tx_defragged: 0
> dev.vtnet.1.tx_tso_not_tcp: 0
> dev.vtnet.1.tx_tso_bad_ethtype: 0
> dev.vtnet.1.tx_csum_bad_ethtype: 0
> dev.vtnet.1.rx_task_rescheduled: 0
> dev.vtnet.1.rx_csum_offloaded: 0
> dev.vtnet.1.rx_csum_failed: 0
> dev.vtnet.1.rx_csum_bad_proto: 0
> dev.vtnet.1.rx_csum_bad_offset: 0
> dev.vtnet.1.rx_csum_bad_ipproto: 0
> dev.vtnet.1.rx_csum_bad_ethtype: 0
> dev.vtnet.1.rx_mergeable_failed: 0
> dev.vtnet.1.rx_enq_replacement_failed: 0
> dev.vtnet.1.rx_frame_too_large: 0
> dev.vtnet.1.mbuf_alloc_failed: 0
> dev.vtnet.1.act_vq_pairs: 1
> dev.vtnet.1.requested_vq_pairs: 0
> dev.vtnet.1.max_vq_pairs: 1
> dev.vtnet.1.%parent: virtio_pci4
> dev.vtnet.1.%pnpinfo:
> dev.vtnet.1.%location:
> dev.vtnet.1.%driver: vtnet
> dev.vtnet.1.%desc: VirtIO Networking Adapter
> dev.vtnet.0.txq0.rescheduled: 0
>
> Cheers,
> Anthony
>
> On Thu, May 28, 2020 at 21:38 Vincenzo Maffione <
> vmaffi...@freebsd.org> wrote:
>
>> Hi,
>>   I was trying to reproduce your problem (same FreeBSD release as yours).
>> It looks like there is some sort of bad interaction with LRO.
>>
>> Starting from a fresh boot, if you keep lro enabled, e.g.
>>   # ifconfig vtnet0 -txcsum -rxcsum -tso4 -tso6
>>   # vale-ctl 
>> then I experience no problem (TCP works between vtnet0 and the host,
>> tcpdump on vtnet0 works as expected).
>>
>> As soon as you disable LRO:
>>   # ifconfig vtnet0 -lro
>> both TCP and tcpdump stop working.
>> If I en

Re: Netmap - Vale switch - tcp problem

2020-05-28 Thread Vincenzo Maffione
Hi,
  I was trying to reproduce your problem (same FreeBSD release as yours).
It looks like there is some sort of bad interaction with LRO.

Starting from a fresh boot, if you keep lro enabled, e.g.
  # ifconfig vtnet0 -txcsum -rxcsum -tso4 -tso6
  # vale-ctl 
then I experience no problem (TCP works between vtnet0 and the host,
tcpdump on vtnet0 works as expected).

As soon as you disable LRO:
  # ifconfig vtnet0 -lro
both TCP and tcpdump stop working.
If I enable LRO again, TCP restarts working, but tcpdump doesn't. I need to
reboot the machine to fix it.

Btw, creating vi0 (persistent VALE port) is not relevant for this test. You
may as well use ephemeral VALE ports (e.g. run pkt-gen -i vale0:1 -f rx).

I will have a look at the LRO issue asap. In the meantime you could avoid
disabling LRO and see if that works for you.

Cheers,
  Vincenzo

On Thu, May 28, 2020 at 17:16 Anthony Arnaud <
antho.arnaudi...@gmail.com> wrote:

> I already disabled the checksum, the vtnet config is:
>
> ifconfig vtnet1 -txcsum -rxcsum -tso4 -tso6 -lro -txcsum6 -rxcsum6 -vlanmtu
> -vlanhwtag -vlanhwfilter -vlanhwtso -vlanhwcsum up promisc
>
> vtnet1: flags=28943 metric 0 mtu 1500
> options=1800a8
> ether 0e:bd:ec:7a:08:06
> media: Ethernet 10Gbase-T
> status: active
> nd6 options=29
>
> Sorry for not having posted vtnet config before.
> PS: VLAN_HWCSUM cannot be switched off for some reason,
> but that is not the problem.
>
> Cheers
> Anthony
>
>
>
>
> On Thu, May 28, 2020 at 16:05 Luigi Rizzo
> wrote:
>
> >
> >
> > On Thursday, May 28, 2020, Anthony Arnaud 
> > wrote:
> >
> >> Hi everyone!
> >> I would like to create a VALE switch with an interface attached to the
> >> host stack and some virtual ports.
> >> My env is a VM with FBSD-12.1 12.1-RELEASE FreeBSD 12.1-RELEASE r354233
> >> GENERIC  amd64
> >> and VirtIO support.
> >>
> >> I performed:
> >>
> >> vale-ctl -h vale0:vtnet1
> >> vale-ctl -n vi0
> >> vale-ctl -a vale0:vi0
> >>
> >> 615.925514 bdg_ctl [149] bridge:0 port:0 vale0:vtnet1
> >> 615.925559 bdg_ctl [149] bridge:0 port:1 vale0:vtnet1^
> >> 615.925572 bdg_ctl [149] bridge:0 port:2 vale0:vi0
> >>
> >> vtnet1 is configured as mirror port.
> >> But if:
> >>
> >> tcpdump -i vtnet1
> >> or
> >> tcpdump -i vale0:vi0
> >>
> >> why can't I see any TCP packets?
> >> UDP and ICMP packets are ok.
> >>
> >> Without the VALE switch, tcpdump shows all TCP packets correctly.
> >
> >
> > You have to disable checksum offloading on vtnet1.
> >
> > Cheers
> > Luigi
> >
> >> Is it a bug?
> >> Thanks to all!
> >> 
> >>
> >
> >
> > --
> > -+---
> >  Prof. Luigi RIZZO, ri...@iet.unipi.it  . Dip. di Ing. dell'Informazione
> >  http://www.iet.unipi.it/~luigi/. Universita` di Pisa
> >  TEL  +39-050-2217533   . via Diotisalvi 2
> >  Mobile   +39-338-6809875   . 56122 PISA (Italy)
> > -+---
> >
> >


Re: netmap/ixl and crc addition..

2020-03-26 Thread Vincenzo Maffione
On Thu, Mar 26, 2020 at 11:39 Alexandre Snarskii <
s...@snar.spb.ru> wrote:

> On Wed, Mar 25, 2020 at 11:31:30PM +0100, Vincenzo Maffione wrote:
> > Hi Alexandre,
> >   Thanks. Your patch looks good to me. I assume you have tested it?
>
> Sure, this patch already handles production traffic.
>
> > I will commit that to stable/11.
>
> Thanks.
>

> >
> > The issue you report on stable/12 is more worrisome. The 'no space in TX ring'
> > condition (head==cur==tail) is ok per-se: on a subsequent poll() wakeup (e.g.
> > TX interrupt) or explicit ioctl(NIOCTXSYNC) you should see tail moving forward,
> > therefore freeing some space to be used in the ring.
>
> Unfortunately, further NIOCTXSYNC calls did not free the queue.
>
> Let me explain my observations in detail: this issue was reproducible
> even under really light load (say, hundreds of pps per queue). As the load
> was light, I was able to insert debugging output before and after any calls
> to ioctl and poll; the code is like:
>
> while (1) {
>         if (nm_tx_pending(txring)) {
>                 printf("%s: pre-txsync cur: %d, head: %d, tail: %d\n",
>                     thread_name, txring->cur, txring->head,
>                     txring->tail);
>                 if (ioctl(pa->fd, NIOCTXSYNC) == -1) {
>                         /* log event */
>                 }
>                 printf("%s: post-txsync cur: %d, head: %d, tail: %d\n",
>                     thread_name, txring->cur, txring->head,
>                     txring->tail);
>         }
>         /* setup poll descriptors with POLLIN only */
>         printf("%s: pre-poll cur: %d, head: %d, tail: %d\n",
>             thread_name, txring->cur, txring->head, txring->tail);
>         ret = poll(fds, 2, 1000);
>         printf("%s: post-poll cur: %d, head: %d, tail: %d\n",
>             thread_name, txring->cur, txring->head, txring->tail);
>
>
Ok, but I have some preliminary questions and notes:
 - You show the code that syncs the tx ring to collect completed
transmissions (and thus advance tail). However, I do not see the code that
submits new packets to be transmitted (and thus advances head/cur). Where
is it? If this is happening in a separate thread, that leads to undefined
behaviour, e.g. ring resets (although you can't crash the kernel).


> Logs in normal situation (only single thread shown):
>
> Notice:nz(ixl0:2): pre-txsync cur: 649, head: 649, tail: 647
> Notice:nz(ixl0:2): post-txsync cur: 649, head: 649, tail: 648
> Notice:nz(ixl0:2): pre-poll cur: 649, head: 649, tail: 648
> Notice:nz(ixl0:2): post-poll cur: 649, head: 649, tail: 648
>
> in pre-txsync: cur = head = tail + 2 (sometimes tail + 3), queue
> is nearly empty.
> in post-txsync: cur = head = tail + 1, queue is flushed.
>
> however, after some time, a txsync call advanced tail not to head - 1,
> but to head:
>
> Notice:nz(ixl0:2): pre-txsync cur: 654, head: 654, tail: 652
> Notice:nz(ixl0:2): post-txsync cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): pre-poll cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): post-poll cur: 654, head: 654, tail: 654
>

I see. This is wrong and must not happen. Once you get here, the tx ring
becomes unusable, so we may as well ignore what happens afterwards.
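To make the index arithmetic in these logs easier to follow, here is a minimal sketch of how free space and pending work can be derived from head and tail. It is only an illustration: `struct ring_idx` and the three functions are hypothetical names, loosely modeled on the helpers that ship in netmap_user.h, and the ring size of 1024 used in the note below is taken from the "netmap queues/slots" line in the dmesg output quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the indices of a netmap TX ring. */
struct ring_idx {
    uint32_t head;      /* first slot the application may fill */
    uint32_t tail;      /* first slot still owned by the kernel */
    uint32_t num_slots; /* ring size */
};

/* Next index, wrapping at the end of the ring. */
uint32_t ring_next(const struct ring_idx *r, uint32_t i)
{
    return (i + 1 == r->num_slots) ? 0 : i + 1;
}

/* Slots the application may fill right now: distance from head to tail. */
uint32_t ring_space(const struct ring_idx *r)
{
    int32_t n = (int32_t)r->tail - (int32_t)r->head;

    if (n < 0)
        n += (int32_t)r->num_slots;
    return (uint32_t)n;
}

/* Nonzero while some submitted slots have not been returned by the kernel. */
int tx_pending(const struct ring_idx *r)
{
    return ring_next(r, r->tail) != r->head;
}
```

With the healthy post-txsync state from the logs (head 649, tail 648, 1024 slots) this reports 1023 free slots and nothing pending. With the stuck state (head 654, tail 654) it reports zero free slots while still looking "pending", which matches the busy wait described above.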

>
> and subsequent txsyncs were not able to free the (already empty) queue:
>
> Notice:nz(ixl0:2): pre-txsync cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): post-txsync cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): pre-poll cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): post-poll cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): pre-txsync cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): post-txsync cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): pre-poll cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): post-poll cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): pre-txsync cur: 654, head: 654, tail: 654
> Notice:nz(ixl0:2): post-txsync cur: 654, head: 654, tail: 654
>
> The application entered a busy wait (in my case most packets are just zero-copied
> from rxring to txring, so if there is no space in txring, packets are not
> consumed from rx).
>
> > However, the ring_reinit means that something is going wrong: either your
> > application is using the TX ring incorrectly, or there is a bug in the netmap
>
> ring_reinit was indeed caused by my incorrect use of txring: I
> tried to ignore the head == tail situation and inject a packet anyway..
>

Yes, ring reinit actually resets the ring variables to a sane value.

In case your issue is not caused by unsafe access to the tx ring from
multiple threads (see above), it would help to
know what value do you have i

Re: netmap/ixl and crc addition..

2020-03-25 Thread Vincenzo Maffione
Hi Alexandre,
  Thanks. Your patch looks good to me. I assume you have tested it?
I will commit that to stable/11.

The issue you report on stable/12 is more worrisome. The 'no space in TX
ring' condition (head==cur==tail) is ok per-se: on a subsequent poll()
wakeup (e.g. TX interrupt) or explicit ioctl(NIOCTXSYNC) you should see
tail moving forward, therefore freeing some space to be used in the ring.
However, the ring_reinit means that something is going wrong: either your
application is using the TX ring incorrectly, or there is a bug in the
netmap iflib code. Since FreeBSD 12, netmap support is provided by iflib,
while before netmap support was provided directly by the ixl driver.
In any case, it would probably help if you could provide some more detailed
info (how to reproduce the problem).
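For readers trying to decode the nm_txsync_prologue failure in the kernel log quoted below, the following is a rough sketch of the kind of sanity check that message refers to. It is an assumption-laden illustration, not the kernel code: `head_is_valid` is a hypothetical name, and the wrap-around branch is my reading of how a circular range check has to work, since the log only shows the non-wrapping condition.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 if 'head' lies in the circular interval [rhead, rtail],
 * i.e. among the slots the kernel last handed to the application.
 * rhead and rtail are the kernel's own copies of head and tail.
 */
int head_is_valid(uint32_t head, uint32_t rhead, uint32_t rtail)
{
    if (rtail >= rhead) {
        /* Interval does not wrap: need rhead <= head <= rtail. */
        return head >= rhead && head <= rtail;
    }
    /* Interval wraps: invalid only if strictly between rtail and rhead. */
    return !(head > rtail && head < rhead);
}
```

With the values from the log (h 136, rh 135, rt 135) the check fails because head has moved past rtail, meaning the application advanced head into a slot it did not own.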

Cheers,
  Vincenzo

On Tue, Mar 24, 2020 at 15:12 Alexandre Snarskii <
s...@snar.spb.ru> wrote:

> On Tue, Mar 24, 2020 at 03:37:36PM +0300, Alexandre Snarskii wrote:
> >
> > Hi!
> >
> > Long story short: looks like intel x722 does not by default add CRC to
> > outbound frames, so with FreeBSD 11-stable netmap-generated traffic is
> > dropped on the next port.. Fix is simple, attached.
>
> ... add missing attach :(
>
> >
> > The same behaviour of 'unconditionally ask card to compute crc' can
> > be found in both if_ixl:
> >
> https://svnweb.freebsd.org/base/stable/11/sys/dev/ixl/ixl_txrx.c?view=markup#l408
> > and in DPDK i40e driver:
> >
> https://github.com/DPDK/dpdk/blob/master/drivers/net/i40e/i40e_rxtx.c#L1105
> > so, I guess, it's safe.
> >
> > PS: of course, my first idea was to upgrade to FreeBSD 12-stable, but while
> > this upgrade solved the crc problem, this version shows a 'stalled tx queue'
> > problem: after CTXSYNC tail == head == cur, 'no space in ring' condition.
> > Attempts to ignore this condition led to continuous ring resets in txsync:
> >
> > Mar 17 20:21:08 host kernel: 668.224836 [1679] nm_txsync_prologue ixl1 TX3: fail 'head < kring->rhead || head > kring->rtail' h 136 c 136 t 135 rh 135 rc 135 rt 135 hc 135 ht 135
> > Mar 17 20:21:08 host kernel: 668.238300 [1787] netmap_ring_reinit called for ixl1 TX3
> >
> > PPS: hardware details: Dell VEP4600, based on Xeon D-2100 with
> > two onboard X722 ports (actually, four, but two of them are not
> > wired).
> >
> > CPU: Intel(R) Xeon(R) D-2187NT CPU @ 2.00GHz (2000.06-MHz K8-class CPU)
> >   Origin="GenuineIntel"  Id=0x50654  Family=0x6  Model=0x55  Stepping=4
> >
> > ixl0:  1.11.9-k> mem 0xfa00-0xfaff,0xfb018000-0xfb01 irq 11 at device
> 0.0 numa-domain 0 on pci12
> > ixl0: using 1024 tx descriptors and 1024 rx descriptors
> > ixl0: fw 3.1.57069 api 1.5 nvm 3.33 etid 80001007 oem 1.263.0
> > ixl0: PF-ID[0]: VFs 32, MSIX 129, VF MSIX 5, QPs 384, I2C
> > ixl0: Using MSIX interrupts with 9 vectors
> > ixl0: Allocating 8 queues for PF LAN VSI; 8 queues active
> > ixl0: Ethernet address: 3c:2c:30:30:59:85
> > ixl0: SR-IOV ready
> > ixl0: netmap queues/slots: TX 8/1024, RX 8/1024
> >
> > ixl0@pci0:184:0:0:class=0x02 card=0x8086 chip=0x37d38086
> rev=0x04 hdr=0x00
> > vendor = 'Intel Corporation'
> > device = 'Ethernet Connection X722 for 10GbE SFP+'
> > class  = network
> > subclass   = ethernet
> >
> >


Re: Intel NETMAP performance and packet type

2020-02-28 Thread Vincenzo Maffione
On Fri, Feb 28, 2020 at 12:26 Slawa Olhovchenkov
wrote:

> On Thu, Feb 27, 2020 at 11:16:50PM +0300, Slawa Olhovchenkov wrote:
>
> > On Thu, Feb 27, 2020 at 06:51:54PM +0100, Vincenzo Maffione wrote:
> >
> > > Hi,
> > >   So, the issue is not the payload.
> > > If you look at the avg_batch statistics reported by pkt-gen, you'll see
> > > that in the ACK-flood experiment you have 4.92, whereas in the SYN-flood
> > > case you have 17.5. The batch is the number of packets (well, actually
> > > netmap descriptors, but in this case it's the same) that you receive (or
> > > transmit) for each poll() invocation.
> > > So in the first case you end up doing many more poll() calls, hence the
> > > higher per-packet overhead and the lower packet-rate.
> > >
> > > Why is the poll() called more frequently? That depends on packet timing and
> > > interrupt rate. There must be something different on your packet generator
> > > that produces this effect (e.g. different burstiness, or maybe the packet
> > > generator is not able to saturate the 10G link)?
> >
> > No, I captured the netstat output -- the raw packet rate is the same.
> > Also, I changed the card to a Chelsio T5 and don't see the issue.
> >
> > This is a payload issue, at the driver level.
> >
> > > In any case, I would suggest measuring the RX interrupt rate, and check
> > > that it's higher in the ACK-flood case. Then you can try to lower the
> > > interrupt rate by tuning the interrupt moderation features of the Intel NIC
> > > (e.g. limit hw.ix.max_interrupt_rate and disable hw.ix.enable_aim or
> > > similar).
> > > By playing with the interrupt moderation you should be able to increase the
> > > avg_batch, and then increase throughput.
> >
> > Already limited.
>
> Also, is this normal (rxd_tail == rxd_head):
>
> dev.ix.0.queue0.rx_discarded: 0
> dev.ix.0.queue0.rx_copies: 0
> dev.ix.0.queue0.rx_bytes: 612041623304
> dev.ix.0.queue0.rx_packets: 9563149414
> dev.ix.0.queue0.rxd_tail: 1120
> dev.ix.0.queue0.rxd_head: 1120
> dev.ix.0.queue0.irqs: 40154885
> dev.ix.0.queue0.interrupt_rate: 16129
> dev.ix.0.queue0.tx_packets: 553897984
> dev.ix.0.queue0.tso_tx: 0
> dev.ix.0.queue0.txd_tail: 0
> dev.ix.0.queue0.txd_head: 0
>
> I see that this RX queue is stopped.
>

Yes, (rxd_tail == rxd_head) means that the NIC ran out of RX buffers.
rxd_head is the next descriptor that the NIC will use. rxd_tail is the next
descriptor that the driver will replenish. RX buffers are replenished by
the netmap NIOCRXSYNC routine, which is called on poll().
However, rx_discarded is 0, which means that the NIC is not dropping
packets. So the problem should not be that poll() is not called frequently
enough.
You should check rx_discarded for all the queues.

Another thing you need to check is how the load is balanced across the
receive queues. How many have you configured? Maybe the two workloads
(SYN-flood and ACK-flood) load different queues in different ways.


Re: Intel NETMAP performance and packet type

2020-02-28 Thread Vincenzo Maffione
On Thu, Feb 27, 2020 at 21:17 Slawa Olhovchenkov
wrote:

> On Thu, Feb 27, 2020 at 06:51:54PM +0100, Vincenzo Maffione wrote:
>
> > Hi,
> >   So, the issue is not the payload.
> > If you look at the avg_batch statistics reported by pkt-gen, you'll see
> > that in the ACK-flood experiment you have 4.92, whereas in the SYN-flood
> > case you have 17.5. The batch is the number of packets (well, actually
> > netmap descriptors, but in this case it's the same) that you receive (or
> > transmit) for each poll() invocation.
> > So in the first case you end up doing many more poll() calls, hence the
> > higher per-packet overhead and the lower packet-rate.
> >
> > Why is the poll() called more frequently? That depends on packet timing and
> > interrupt rate. There must be something different on your packet generator
> > that produces this effect (e.g. different burstiness, or maybe the packet
> > generator is not able to saturate the 10G link)?
>
> No, I captured the netstat output -- the raw packet rate is the same.
> Also, I changed the card to a Chelsio T5 and don't see the issue.
>
> This is a payload issue, at the driver level.
>

That's not possible, since netmap does not even look into the payload.
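The batching argument quoted above can be put into numbers with a small sketch. This is only a back-of-the-envelope model: the function names are hypothetical, and the fixed per-poll() and per-packet costs passed in are made-up placeholders, not measurements from either setup. Only the two avg_batch values (4.92 and 17.5) come from the thread.

```c
#include <assert.h>

/*
 * Per-packet cost when a fixed per-poll() cost is shared by all the
 * packets of a batch, plus a constant cost for handling each packet.
 */
double per_packet_cost_ns(double poll_cost_ns, double pkt_cost_ns,
                          double avg_batch)
{
    return poll_cost_ns / avg_batch + pkt_cost_ns;
}

/* Packets per second a single core could sustain at that cost. */
double max_rate_pps(double cost_ns)
{
    return 1e9 / cost_ns;
}
```

For instance, assuming 1000 ns per poll() and 50 ns per packet, a batch of 4.92 costs roughly 253 ns per packet, while a batch of 17.5 costs roughly 107 ns, so the achievable rate more than doubles without the payload ever being inspected.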

Can you please report the per-queue interrupt rate in both cases (ACK-flood
and SYN-flood)?
You can use something like `vmstat -i -w1 | grep ix` to monitor the
interrupt rate.
Or probably you can also use `sysctl -a dev.ix | grep interrupt_rate`


> > In any case, I would suggest measuring the RX interrupt rate, and check
> > that it's higher in the ACK-flood case. Then you can try to lower the
> > interrupt rate by tuning the interrupt moderation features of the Intel NIC
> > (e.g. limit hw.ix.max_interrupt_rate and disable hw.ix.enable_aim or
> > similar).
> > By playing with the interrupt moderation you should be able to increase the
> > avg_batch, and then increase throughput.
>
> Already limited.
>

Limited to which value? Have you tried to decrease max_interrupt_rate even
more?

>
> > Cheers,
> >   Vincenzo
> >
>


Re: vmx: strange issue, related to to tso?

2019-12-28 Thread Vincenzo Maffione
I think you are correct. Good catch!
We should file a bug and/or create a review on Phabricator (if you are
busy I could do that).

Thanks,
  Vincenzo

On Sat, Dec 28, 2019 at 05:44 Patrick Kelsey
wrote:

>
> On Fri, Dec 27, 2019 at 5:01 PM Andriy Gapon  wrote:
>
>> On 27/12/2019 15:34, Vincenzo Maffione wrote:
>> > It may be useful to check what happens if you replace the vmx0 interface
>> > with an em0.
>> > In this way you would know if the issue is vmx-specific or not.
>>
>> I'll put this on my to-do, can't test right now.
>>
>> But one thing I noticed when comparing the TCP control block of the connection
>> before and after the "TSO dance" is that TF_TSO gets cleared after any outgoing
>> traffic while TSO is disabled on the interface.  And the flag does not come back
>> after TSO is reenabled.  Any new connections get the flag, of course.
>>
>> So, I indeed suspect that there is a problem with vmx TSO.
>> As another data point, an older system from before vmx->iflib conversion does
>> not exhibit the problem.
>>
>> > On Thu, Dec 26, 2019 at 20:04 Andriy Gapon <a...@freebsd.org> wrote:
>> >
>> >
>> > Maybe someone would have any pointers for me with the following problem.
>> > This happens with CURRENT as of the beginning of September.
>> > I connect via ssh to a VM running on VMware, it has a single vmx0 interface.
>> > The problem is that when I print a moderately large amount of text to the
>> > terminal (e.g., tail -100 /var/log/messages) I literally see it printed in
>> > chunks with noticeable pauses between chunks.  It takes several seconds for all
>> > lines to get shown.  This happens every time I do it.
>> > There is an interesting twist.  If I disable TSO with ifconfig vmx0 -tso and
>> > print the same output in the same ssh session, then the output is smooth and
>> > fast as I would expect it.  The lines scroll by almost instantly.
>> > If then I re-enable TSO and again produce the same output in the same ssh, then
>> > it is still fast.
>> >
>> > It appears that the TCP connection gets tuned to some very sub-optimal
>> > parameters when TSO is enabled.  When I disable TSO, the parameters get re-tuned
>> > to better values and the values stick when I re-enable TSO.
>> > This is just a conjecture, of course.
>> >
>> > I have some tcpdump captures, but I do not see anything that would really stand
>> > out.  One difference is that in the slow case only "full sized" packets are sent
>> > while in the fast case there are shorter packets with push flag.
>> >
>> > Some packets for the slow case:
>> >  00:00:00.453202 IP 10.180.106.180.22 > 10.180.1.29.25490: Flags
>> [.], seq
>> > 37:1485, ack 36, win 128, options [nop,nop,TS val 1403195134 ecr
>> 4966311],
>> > length 1448
>> >  00:00:00.096859 IP 10.180.1.29.25490 > 10.180.106.180.22: Flags
>> [.], ack 1485,
>> > win 1026, options [nop,nop,TS val 4966864 ecr 1403195134], length 0
>> >  00:00:00.442963 IP 10.180.106.180.22 > 10.180.1.29.25490: Flags
>> [.], seq
>> > 1485:2933, ack 36, win 128, options [nop,nop,TS val 1403195664 ecr
>> 4966864],
>> > length 1448
>> >  00:00:00.092677 IP 10.180.1.29.25490 > 10.180.106.180.22: Flags
>> [.], ack 2933,
>> > win 1026, options [nop,nop,TS val 4967400 ecr 1403195664], length 0
>> >  00:00:00.437336 IP 10.180.106.180.22 > 10.180.1.29.25490: Flags
>> [.], seq
>> > 2933:4381, ack 36, win 128, options [nop,nop,TS val 1403196194 ecr
>> 4967400],
>> > length 1448
>> >  00:00:00.097190 IP 10.180.1.29.25490 > 10.180.106.180.22: Flags
>> [.], ack 4381,
>> > win 1026, options [nop,nop,TS val 4967934 ecr 1403196194], length 0
>> >
>> > Some packets after the TSO dance:
>> >  00:00:00.000450 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags
>> [.], seq
>> > 4077:5525, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr
>> 21706510],
>> > length 1448
>> >  00:00:00.16 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags
>> [P.], seq
>> > 5525:6097, ack 36, win 128, options [nop,

Re: vmx: strange issue, related to to tso?

2019-12-27 Thread Vincenzo Maffione
It may be useful to check what happens if you replace the vmx0 interface
with an em0.
In this way you would know if the issue is vmx-specific or not.

Cheers,
  Vincenzo

On Thu, Dec 26, 2019 at 20:04 Andriy Gapon
wrote:

>
> Maybe someone would have any pointers for me with the following problem.
> This happens with CURRENT as of the beginning of September.
> I connect via ssh to a VM running on VMware, it has a single vmx0 interface.
> The problem is that when I print a moderately large amount of text to the
> terminal (e.g., tail -100 /var/log/messages) I literally see it printed in
> chunks with noticeable pauses between chunks.  It takes several seconds for all
> lines to get shown.  This happens every time I do it.
> There is an interesting twist.  If I disable TSO with ifconfig vmx0 -tso and
> print the same output in the same ssh session, then the output is smooth and
> fast as I would expect it.  The lines scroll by almost instantly.
> If then I re-enable TSO and again produce the same output in the same ssh, then
> it is still fast.
>
> It appears that the TCP connection gets tuned to some very sub-optimal
> parameters when TSO is enabled.  When I disable TSO, the parameters get re-tuned
> to better values and the values stick when I re-enable TSO.
> This is just a conjecture, of course.
>
> I have some tcpdump captures, but I do not see anything that would really stand
> out.  One difference is that in the slow case only "full sized" packets are sent
> while in the fast case there are shorter packets with push flag.
>
> Some packets for the slow case:
>  00:00:00.453202 IP 10.180.106.180.22 > 10.180.1.29.25490: Flags [.], seq
> 37:1485, ack 36, win 128, options [nop,nop,TS val 1403195134 ecr 4966311],
> length 1448
>  00:00:00.096859 IP 10.180.1.29.25490 > 10.180.106.180.22: Flags [.], ack
> 1485,
> win 1026, options [nop,nop,TS val 4966864 ecr 1403195134], length 0
>  00:00:00.442963 IP 10.180.106.180.22 > 10.180.1.29.25490: Flags [.], seq
> 1485:2933, ack 36, win 128, options [nop,nop,TS val 1403195664 ecr
> 4966864],
> length 1448
>  00:00:00.092677 IP 10.180.1.29.25490 > 10.180.106.180.22: Flags [.], ack
> 2933,
> win 1026, options [nop,nop,TS val 4967400 ecr 1403195664], length 0
>  00:00:00.437336 IP 10.180.106.180.22 > 10.180.1.29.25490: Flags [.], seq
> 2933:4381, ack 36, win 128, options [nop,nop,TS val 1403196194 ecr
> 4967400],
> length 1448
>  00:00:00.097190 IP 10.180.1.29.25490 > 10.180.106.180.22: Flags [.], ack
> 4381,
> win 1026, options [nop,nop,TS val 4967934 ecr 1403196194], length 0
>
> Some packets after the TSO dance:
>  00:00:00.000450 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [.], seq
> 4077:5525, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr
> 21706510],
> length 1448
>  00:00:00.16 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [P.], seq
> 5525:6097, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr
> 21706510],
> length 572
>  00:00:00.09 IP 10.180.1.29.25369 > 10.180.106.180.22: Flags [.], ack
> 5525,
> win 1003, options [nop,nop,TS val 21706510 ecr 2124310129], length 0
>  00:00:00.000303 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [.], seq
> 6097:7545, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr
> 21706510],
> length 1448
>  00:00:00.19 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [P.], seq
> 7545:8117, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr
> 21706510],
> length 572
>  00:00:00.13 IP 10.180.1.29.25369 > 10.180.106.180.22: Flags [.], ack
> 7545,
> win 1003, options [nop,nop,TS val 21706510 ecr 2124310129], length 0
>  00:00:00.000162 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [.], seq
> 8117:9565, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr
> 21706510],
> length 1448
>  00:00:00.12 IP 10.180.106.180.22 > 10.180.1.29.25369: Flags [P.], seq
> 9565:10137, ack 36, win 128, options [nop,nop,TS val 2124310129 ecr
> 21706510],
> length 572
>  00:00:00.07 IP 10.180.1.29.25369 > 10.180.106.180.22: Flags [.], ack
> 9565,
> win 1003, options [nop,nop,TS val 21706510 ecr 2124310129], length 0
>
> What else can I examine to debug the problem further?
> Thank you!
> --
> Andriy Gapon


Re: ix0 and ix1 ifconfig options different on Supermicro board

2019-11-23 Thread Vincenzo Maffione
Hi,
  TSO/LRO (for IPv4 and/or IPv6) will increase TCP bulk throughput on
machine X for those TCP connections where X is one of the two endpoints,
that is TCP connections that are local to X. That's why you are seeing iperf
achieving higher throughput with TSO/LRO enabled.
TSO means that your local TCP stack will pass down large (e.g. 32K) packets
to the NIC driver, and the NIC will take care of segmentation. This is
beneficial for two reasons: (1) the segmentation work is done in hardware
rather than in the CPU, and this is typically faster (and also, you save
the CPU time for other stuff); (2) the per-packet cost of protocol
processing (TCP, IP, Ethernet) is amortized over a large amount of bytes,
which means that your total per-byte CPU time will be way lower. Most of
the gain actually comes from (2).
LRO is similar, but in the receive direction.
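As a rough illustration of point (2), the sketch below counts how many wire segments one large send turns into, which is also how many separate trips through the TCP/IP/Ethernet layers the stack would need without TSO. The function name is hypothetical, and the MSS of 1448 bytes is an assumption (a common value for a 1500-byte MTU with TCP timestamps enabled); the 32K figure is the example size mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of wire segments needed for one large send of 'bytes' payload
 * bytes, i.e. the ceiling of bytes / mss. With TSO the NIC produces
 * these segments; without it the stack builds each one itself.
 */
uint32_t tso_segments(uint32_t bytes, uint32_t mss)
{
    return (bytes + mss - 1) / mss;
}
```

Under these assumptions a 32768-byte send becomes 23 segments on the wire, so the per-packet protocol cost is paid once instead of 23 times.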

However, if your device is a router it means that it forwards packets.
Therefore the local TCP stack is not involved, so TSO simply does not apply
(at least in FreeBSD).
I think LRO applies, but there is a latency hit, as suggested by the wiki
page you pointed to.

So no, enabling TSO/LRO will not increase the forwarding rate, and it may
increase latency. You should keep it disabled.

Cheers,
  Vincenzo

On Fri, Nov 22, 2019 at 22:47 BulkMailForRudy <
cra...@monkeybrains.net> wrote:

>
> I just did another test to a machine with a Chelsio card.
>
>   Server D (cxl3) -> Server A = 3.5Gbps
>
> Turning on flags lro tso4 tso6 vlanhwtso , yields
>
>   Server D (cxl3) -> Server A = 9.1 Gbps
>
> Oddly, this was an ipv4 iperf, but tso6 seems to help.
>
> I had settings turned off per
> https://wiki.freebsd.org/10gFreeBSD/Router#Disabling_LRO_and_TSO
>
> Servers A,B, and C are all running services.  Server D is acting as a
> router.  Are the LRO and TSO only for TCP to the box, or will it
> increase speeds for forwarding if I enable it?
>
>
> Thanks,
>
> Rudy
>
>
> On 11/22/19 1:30 PM, BulkMailForRudy wrote:
> >
> > I have nearly identical setups, but ix0 and ix1 are getting different
> > options at boot.  This seems to be the only difference I see between
> > machines and I am trying to answer the question, Why can Server A
> > iperf close to line rate while the other servers can not?
> >
> > The Test:  iperf -P 3 -c REMOTE_ADDR
> >
> > Server A (ix1) -> Server C (ix0)  = 9.4Gbps
> > Server B (ix0)-> Server C (ix0) = 5.6Gbps
> > Server C (ix0)-> A (ix1) or B (ix0)  = 5.0Gbps
> >
> >
> > The motherboards are identical between A,B and C and the configs very
> > similar.  The only difference is that Server A is plugged into ix1
> > while Server B and C are using ix0.
> >
> >
> > I am not modifying the flags at boot (eg ifconfig -tso), yet ix0 lacks
> > TXCSUM,TSO4,TSO6,LRO,WOL.
> >
> > ix0: flags=8943 metric 0 mtu 1500
> >     options=a538b9
> >     ether ac:1f:6b:6a:14:64
> >     media: Ethernet autoselect (10Gbase-T )
> > ix1: flags=8843 metric 0 mtu 1500
> >     options=e53fbb
> >     ether ac:1f:6b:6a:14:65
> >     media: Ethernet autoselect (10Gbase-T )
> >
> > I did try adding some flags to ix0 and -- not sure if this was the
> > reason -- the box started acting oddly and I ended up rebooting it.
> >
> >
> > My hunch is that there is something with the TSO4.
> >
> >
> > Rudy
> >
> > ___
> > freebsd-net@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-net
> > To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
> >
>
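
The two `options=` masks quoted above (a538b9 on ix0, e53fbb on ix1) can be compared bit by bit to see exactly which capabilities differ. The sketch below is a minimal illustration; the bit-to-name table is an assumption based on the IFCAP_* constants in FreeBSD's net/if.h and should be checked against your own source tree.

```python
IX0_OPTIONS = 0xA538B9
IX1_OPTIONS = 0xE53FBB

# Assumed names for a few capability bits (from the IFCAP_* constants).
ASSUMED_NAMES = {1: "TXCSUM", 8: "TSO4", 9: "TSO6", 10: "LRO", 22: "TXCSUM_IPV6"}


def differing_bits(a: int, b: int) -> list:
    """Return the positions of the bits that differ between two masks."""
    diff = a ^ b
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


# Bits that are set on ix1 but clear on ix0.
only_ix1 = [b for b in differing_bits(IX0_OPTIONS, IX1_OPTIONS)
            if (IX1_OPTIONS >> b) & 1]

for bit in only_ix1:
    print(bit, ASSUMED_NAMES.get(bit, "bit %d" % bit))
```

Under that assumed naming, the result lines up with the TXCSUM, TSO4, TSO6 and LRO flags reported missing on ix0.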


[Differential] D20824: Fix netmap + vlan panics

2019-07-01 Thread vmaffione (Vincenzo Maffione)
vmaffione accepted this revision.
vmaffione added a comment.
This revision is now accepted and ready to land.


  Looks good, thanks.
  The netmap unit tests and integration tests still pass with these changes.
  
  As an aside, using the netmap emulated adapter (aka generic) is rather slow
because of the translation layer between the netmap buffers and the mbufs. The
right way to let netmap interact with VLANs would be to implement the VLAN
strip/push operations in your netmap application, and always open physical
ports (e.g. ix0). I know this is hard to do here, because the VALE switch is
not easy to extend in kernel-space. In theory VALE could be ported to
userspace, so that you can easily add custom logic for VLAN handling.
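
As an illustration of what "VLAN strip/push in the application" amounts to, here is a minimal sketch that operates on raw Ethernet frames as byte strings. It is not netmap code: a real application would do the same edits in place on netmap buffers. The function names are hypothetical; the 0x8100 TPID and the 4-byte tag placed after the two MAC addresses follow IEEE 802.1Q.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def vlan_push(frame: bytes, vid: int, pcp: int = 0) -> bytes:
    """Insert an 802.1Q tag right after the destination and source MACs."""
    if not 0 <= vid <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    tci = ((pcp & 0x7) << 13) | vid
    return frame[:12] + struct.pack("!HH", TPID_8021Q, tci) + frame[12:]


def vlan_strip(frame: bytes):
    """Return (vid, untagged_frame); vid is None if the frame has no tag."""
    if len(frame) >= 16 and struct.unpack("!H", frame[12:14])[0] == TPID_8021Q:
        (tci,) = struct.unpack("!H", frame[14:16])
        return tci & 0xFFF, frame[:12] + frame[16:]
    return None, frame


# Destination MAC, source MAC, EtherType 0x0800, then a dummy payload.
untagged = bytes(6) + bytes([2, 0, 0, 0, 0, 1]) + b"\x08\x00" + b"payload"
tagged = vlan_push(untagged, vid=100)
print(vlan_strip(tagged)[0])
```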

CHANGES SINCE LAST ACTION
  https://reviews.freebsd.org/D20824/new/

REVISION DETAIL
  https://reviews.freebsd.org/D20824

EMAIL PREFERENCES
  https://reviews.freebsd.org/settings/panel/emailpreferences/

To: aleksandr.fedorov_itglobal.com, vmaffione, jhb, bz
Cc: olevole_olevole.ru, krion, evgueni.gavrilov_itglobal.com, freebsd-net-list


Re: Is netmap jumbo frames broken in STABLE?

2019-03-19 Thread Vincenzo Maffione
Hi,
  It's not supported yet.
But I think it is reasonably feasible to add jumbo frames support on head
and stable/12, by improving iflib netmap support.
I hope to find the time to do this soon.

Cheers,
  Vincenzo

On Mon, Mar 18, 2019 at 02:52, Andrew Vylegzhanin <
avv...@gmail.com> wrote:

> Hi,
>
> After some time I want to return to testing the netmap+jumbo_frames+ixgbe
> (and ixl later) setup.
> What is the current situation with this bundle (stable/11 and stable/12)?
>
> --
> Andrew
>
> Thu, Dec 6, 2018 at 11:20, Vincenzo Maffione :
>
>> Hi,
>>   Actually I just realized that this patch is suitable for stable/11,
>> whereas on stable/12 ixgbe is served by iflib, and therefore
>> we need a different patch.
>>
>> I'll keep you updated then.
>>
>> Cheers,
>>   Vincenzo
>>
>> On Wed, Dec 5, 2018 at 20:45, Andrew Vylegzhanin <
>> avv...@gmail.com> wrote:
>>
>>> Hi,
>>> of course I want to test.
>>> But it will take up to 1-2 weeks, since I need to set up a HEAD
>>> environment in the lab and modify my code for NS_MOREFRAG.
>>>
>>> When can we expect an MFC to stable/12 (or stable/11)?
>>>
>>> --
>>> Andrew
>>>
>>> Sun, Dec 2, 2018 at 13:30, Vincenzo Maffione :
>>>
>>>> Hi,
>>>>   I prepared a patch (against FreeBSD-HEAD) to support jumbo frames
>>>> in ixgbe.
>>>> https://reviews.freebsd.org/D18402
>>>> Would you be able to test it?
>>>>
>>>> Thanks,
>>>>   Vincenzo
>>>>
>>>> On Thu, Nov 22, 2018 at 13:37, Andrew Vylegzhanin <
>>>> avv...@gmail.com> wrote:
>>>>
>>>>> Thu, Nov 22, 2018 at 13:42, Vincenzo Maffione :
>>>>> >
>>>>> > Hi,
>>>>> >   Yes, absolutely, I'm currently working on aligning netmap on
>>>>> > FreeBSD (head, stable/12 and stable/11) to the same status it has
>>>>> > on Linux (more features, more bugfixes, continuous integration
>>>>> > infrastructure ... ).
>>>>>
>>>>> Great!
>>>>>
>>>>> > In particular, on Linux jumbo frames are already supported on ixgbe,
>>>>> > e1000, igb, e1000e, etc.
>>>>>
>>>>> BTW, what is the situation with the ixl driver and Chelsio?
>>>>>
>>>>> > I have some netmap patches already in the queue (see
>>>>> > https://reviews.freebsd.org/differential/query/Ol8MNtAi2AIs/#R),
>>>>> > so I can address the ixgbe-jumbo-frames item as soon as the queue
>>>>> > drains.
>>>>> > If you want to give it a try in the meantime, and/or test ixgbe on
>>>>> > FreeBSD, that would be great.
>>>>>
>>>>> I will look forward to ixgbe-jumbo-frames.
>>>>> Of course, I'm ready to test on both stable branches.
>>>>>
>>>>> > Cheers,
>>>>> >   Vincenzo
>>>>> >
>>>>> WBR,
>>>>> --
>>>>> Andrew
>>>>>
>>>>> > On Thu, Nov 22, 2018 at 11:23, Andrew Vylegzhanin <
>>>>> > avv...@gmail.com> wrote:
>>>>> >>
>>>>> >> Hi,
>>>>> >>
>>>>> >> Coming back to this subject after two years.
>>>>> >> I would like to clarify the situation with jumbo frames in the ixgbe
>>>>> >> driver.
>>>>> >>
>>>>> >> I've looked at
>>>>> >> https://github.com/luigirizzo/netmap/blob/master/LINUX/ixgbe_netmap_linux.h
>>>>> >> and see a lot of changes compared to the 11/12-STABLE version of
>>>>> >> ixgbe_netmap.h.
>>>>> >> Is it possible to backport it?
>>>>> >>
>>>>> >> In general, is there a chance to get jumbo frames working on ixgbe?
>>>>> >>
>>>>> >> --
>>>>> >> Andrew
>>>>> >>
>>>>> >> Wed, Jun 8, 2016 at 14:28, :
>>>>> >>
>>>>> >> > Support for fragmented packets with ixgbe was recently added to the
>>>>> >> > Linux version of netmap:
>>>>> >> >
>>>>> >> > https://github.com/luigirizzo/netmap/commit/fc1e77560a8a8ea93cc3594de5fae94334debcd3
>>>>> >> >
>>>>> >> > I think the change for FreeBSD would be much the same, looking at
>>>>> >> > https://github.com/freebsd/freebsd/blob/master/sys/dev/netmap/ixgbe_netmap.h#L396
>>>>> >> >
>>>>> >> > After that, your userspace application simply has to check for the
>>>>> >> > NS_MOREFRAG flag in the receive ring; if it is set, the end of the
>>>>> >> > packet will follow in the next buf.
>>>>> >> >
>>>>> >> > Tom
>>>>> >> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> > Vincenzo
>>>>>
>>>>
>>>>
>>>> --
>>>> Vincenzo
>>>>
>>>
>>
>> --
>> Vincenzo
>>
>

-- 
Vincenzo


Re: Is netmap jumbo frames broken in STABLE?

2018-12-06 Thread Vincenzo Maffione
Hi,
  Actually I just realized that this patch is suitable for stable/11,
whereas on stable/12 ixgbe is served by iflib, and therefore
we need a different patch.

I'll keep you updated then.

Cheers,
  Vincenzo

On Wed, Dec 5, 2018 at 20:45, Andrew Vylegzhanin
wrote:

> Hi,
> of course I want to test.
> But it will take up to 1-2 weeks, since I need to set up a HEAD environment
> in the lab and modify my code for NS_MOREFRAG.
>
> When can we expect an MFC to stable/12 (or stable/11)?
>
> --
> Andrew
>
> Sun, Dec 2, 2018 at 13:30, Vincenzo Maffione :
>
>> Hi,
>>   I prepared a patch (against FreeBSD-HEAD) to support jumbo frames in
>> ixgbe.
>> https://reviews.freebsd.org/D18402
>> Would you be able to test it?
>>
>> Thanks,
>>   Vincenzo
>>
>> On Thu, Nov 22, 2018 at 13:37, Andrew Vylegzhanin <
>> avv...@gmail.com> wrote:
>>
>>> Thu, Nov 22, 2018 at 13:42, Vincenzo Maffione :
>>> >
>>> > Hi,
>>> >   Yes, absolutely, I'm currently working on aligning netmap on FreeBSD
>>> > (head, stable/12 and stable/11) to the same status it has on Linux
>>> > (more features, more bugfixes, continuous integration
>>> > infrastructure ... ).
>>>
>>> Great!
>>>
>>> > In particular, on Linux jumbo frames are already supported on ixgbe,
>>> > e1000, igb, e1000e, etc.
>>>
>>> BTW, what is the situation with the ixl driver and Chelsio?
>>>
>>> > I have some netmap patches already in the queue (see
>>> > https://reviews.freebsd.org/differential/query/Ol8MNtAi2AIs/#R),
>>> > so I can address the ixgbe-jumbo-frames item as soon as the queue
>>> > drains.
>>> > If you want to give it a try in the meantime, and/or test ixgbe on
>>> > FreeBSD, that would be great.
>>>
>>> I will look forward to ixgbe-jumbo-frames.
>>> Of course, I'm ready to test on both stable branches.
>>>
>>> > Cheers,
>>> >   Vincenzo
>>> >
>>> WBR,
>>> --
>>> Andrew
>>>
>>> > On Thu, Nov 22, 2018 at 11:23, Andrew Vylegzhanin <
>>> > avv...@gmail.com> wrote:
>>> >>
>>> >> Hi,
>>> >>
>>> >> Coming back to this subject after two years.
>>> >> I would like to clarify the situation with jumbo frames in the ixgbe
>>> >> driver.
>>> >>
>>> >> I've looked at
>>> >> https://github.com/luigirizzo/netmap/blob/master/LINUX/ixgbe_netmap_linux.h
>>> >> and see a lot of changes compared to the 11/12-STABLE version of
>>> >> ixgbe_netmap.h.
>>> >> Is it possible to backport it?
>>> >>
>>> >> In general, is there a chance to get jumbo frames working on ixgbe?
>>> >>
>>> >> --
>>> >> Andrew
>>> >>
>>> >> Wed, Jun 8, 2016 at 14:28, :
>>> >>
>>> >> > Support for fragmented packets with ixgbe was recently added to the
>>> >> > Linux version of netmap:
>>> >> >
>>> >> > https://github.com/luigirizzo/netmap/commit/fc1e77560a8a8ea93cc3594de5fae94334debcd3
>>> >> >
>>> >> > I think the change for FreeBSD would be much the same, looking at
>>> >> > https://github.com/freebsd/freebsd/blob/master/sys/dev/netmap/ixgbe_netmap.h#L396
>>> >> >
>>> >> > After that, your userspace application simply has to check for the
>>> >> > NS_MOREFRAG flag in the receive ring; if it is set, the end of the
>>> >> > packet will follow in the next buf.
>>> >> >
>>> >> > Tom
>>> >> >
>>> >
>>> >
>>> >
>>> > --
>>> > Vincenzo
>>>
>>
>>
>> --
>> Vincenzo
>>
>

-- 
Vincenzo


Re: Is netmap jumbo frames broken in STABLE?

2018-12-02 Thread Vincenzo Maffione
Hi,
  I prepared a patch (against FreeBSD-HEAD) to support jumbo frames in
ixgbe.
https://reviews.freebsd.org/D18402
Would you be able to test it?

Thanks,
  Vincenzo

On Thu, Nov 22, 2018 at 13:37, Andrew Vylegzhanin <
avv...@gmail.com> wrote:

> Thu, Nov 22, 2018 at 13:42, Vincenzo Maffione :
> >
> > Hi,
> >   Yes, absolutely, I'm currently working on aligning netmap on FreeBSD
> > (head, stable/12 and stable/11) to the same status it has on Linux
> > (more features, more bugfixes, continuous integration
> > infrastructure ... ).
>
> Great!
>
> > In particular, on Linux jumbo frames are already supported on ixgbe,
> > e1000, igb, e1000e, etc.
>
> BTW, what is the situation with the ixl driver and Chelsio?
>
> > I have some netmap patches already in the queue (see
> > https://reviews.freebsd.org/differential/query/Ol8MNtAi2AIs/#R),
> > so I can address the ixgbe-jumbo-frames item as soon as the queue drains.
> > If you want to give it a try in the meantime, and/or test ixgbe on
> > FreeBSD, that would be great.
>
> I will look forward to ixgbe-jumbo-frames.
> Of course, I'm ready to test on both stable branches.
>
> > Cheers,
> >   Vincenzo
> >
> WBR,
> --
> Andrew
>
> > On Thu, Nov 22, 2018 at 11:23, Andrew Vylegzhanin <
> > avv...@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> Coming back to this subject after two years.
> >> I would like to clarify the situation with jumbo frames in the ixgbe
> >> driver.
> >>
> >> I've looked at
> >> https://github.com/luigirizzo/netmap/blob/master/LINUX/ixgbe_netmap_linux.h
> >> and see a lot of changes compared to the 11/12-STABLE version of
> >> ixgbe_netmap.h.
> >> Is it possible to backport it?
> >>
> >> In general, is there a chance to get jumbo frames working on ixgbe?
> >>
> >> --
> >> Andrew
> >>
> >> Wed, Jun 8, 2016 at 14:28, :
> >>
> >> > Support for fragmented packets with ixgbe was recently added to the
> >> > Linux version of netmap:
> >> >
> >> > https://github.com/luigirizzo/netmap/commit/fc1e77560a8a8ea93cc3594de5fae94334debcd3
> >> >
> >> > I think the change for FreeBSD would be much the same, looking at
> >> > https://github.com/freebsd/freebsd/blob/master/sys/dev/netmap/ixgbe_netmap.h#L396
> >> >
> >> > After that, your userspace application simply has to check for the
> >> > NS_MOREFRAG flag in the receive ring; if it is set, the end of the
> >> > packet will follow in the next buf.
> >> >
> >> > Tom
> >> >
> >
> >
> >
> > --
> > Vincenzo
>


-- 
Vincenzo


Re: Is netmap jumbo frames broken in STABLE?

2018-11-23 Thread Vincenzo Maffione
On Thu, Nov 22, 2018, 1:37 PM Andrew Vylegzhanin wrote:
>
> Thu, Nov 22, 2018 at 13:42, Vincenzo Maffione :
> >
> > Hi,
> >   Yes, absolutely, I'm currently working on aligning netmap on FreeBSD
> > (head, stable/12 and stable/11) to the same status it has on Linux
> > (more features, more bugfixes, continuous integration
> > infrastructure ... ).
>
> Great!
>
> > In particular, on Linux jumbo frames are already supported on ixgbe,
> > e1000, igb, e1000e, etc.
>
> BTW, what is the situation with the ixl driver and Chelsio?
>

The situation for ixl is the same as for ixgbe. I plan to update both. I
don't know about Chelsio; you should ask the cxgbe maintainers.

> > I have some netmap patches already in the queue (see
> > https://reviews.freebsd.org/differential/query/Ol8MNtAi2AIs/#R),
> > so I can address the ixgbe-jumbo-frames item as soon as the queue drains.
> > If you want to give it a try in the meantime, and/or test ixgbe on
> > FreeBSD, that would be great.
>
> I will look forward to ixgbe-jumbo-frames.
> Of course, I'm ready to test on both stable branches.
>

Ok thanks.

Vincenzo

> > Cheers,
> >   Vincenzo
> >
> WBR,
> --
> Andrew
>
> > On Thu, Nov 22, 2018 at 11:23, Andrew Vylegzhanin <
> > avv...@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> Coming back to this subject after two years.
> >> I would like to clarify the situation with jumbo frames in the ixgbe
> >> driver.
> >>
> >> I've looked at
> >> https://github.com/luigirizzo/netmap/blob/master/LINUX/ixgbe_netmap_linux.h
> >> and see a lot of changes compared to the 11/12-STABLE version of
> >> ixgbe_netmap.h.
> >> Is it possible to backport it?
> >>
> >> In general, is there a chance to get jumbo frames working on ixgbe?
> >>
> >> --
> >> Andrew
> >>
> >> Wed, Jun 8, 2016 at 14:28, :
> >>
> >> > Support for fragmented packets with ixgbe was recently added to the
> >> > Linux version of netmap:
> >> >
> >> > https://github.com/luigirizzo/netmap/commit/fc1e77560a8a8ea93cc3594de5fae94334debcd3
> >> >
> >> > I think the change for FreeBSD would be much the same, looking at
> >> > https://github.com/freebsd/freebsd/blob/master/sys/dev/netmap/ixgbe_netmap.h#L396
> >> >
> >> > After that, your userspace application simply has to check for the
> >> > NS_MOREFRAG flag in the receive ring; if it is set, the end of the
> >> > packet will follow in the next buf.
> >> >
> >> > Tom
> >> >
> >
> >
> >
> > --
> > Vincenzo
>


Re: Is netmap jumbo frames broken in STABLE?

2018-11-22 Thread Vincenzo Maffione
Hi,
  Yes, absolutely, I'm currently working on aligning netmap on FreeBSD
(head, stable/12 and stable/11) to
the same status it has on Linux (more features, more bugfixes, continuous
integration infrastructure ... ).
In particular, on Linux jumbo frames are already supported on ixgbe, e1000,
igb, e1000e, etc.

I have some netmap patches already in the queue (see
https://reviews.freebsd.org/differential/query/Ol8MNtAi2AIs/#R),
so I can address the ixgbe-jumbo-frames item as soon as the queue drains.
If you want to give it a try in the meantime, and/or test ixgbe on FreeBSD,
that would be great.

Cheers,
  Vincenzo

On Thu, Nov 22, 2018 at 11:23, Andrew Vylegzhanin <
avv...@gmail.com> wrote:

> Hi,
>
> Coming back to this subject after two years.
> I would like to clarify the situation with jumbo frames in the ixgbe
> driver.
>
> I've looked at
> https://github.com/luigirizzo/netmap/blob/master/LINUX/ixgbe_netmap_linux.h
> and see a lot of changes compared to the 11/12-STABLE version of
> ixgbe_netmap.h.
> Is it possible to backport it?
>
> In general, is there a chance to get jumbo frames working on ixgbe?
>
> --
> Andrew
>
> Wed, Jun 8, 2016 at 14:28, :
>
> > Support for fragmented packets with ixgbe was recently added to the
> > Linux version of netmap:
> >
> > https://github.com/luigirizzo/netmap/commit/fc1e77560a8a8ea93cc3594de5fae94334debcd3
> >
> > I think the change for FreeBSD would be much the same, looking at
> > https://github.com/freebsd/freebsd/blob/master/sys/dev/netmap/ixgbe_netmap.h#L396
> >
> > After that, your userspace application simply has to check for the
> > NS_MOREFRAG flag in the receive ring; if it is set, the end of the
> > packet will follow in the next buf.
> >
> > Tom
> >
>
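
The NS_MOREFRAG handling described in the quoted 2016 message can be sketched as a small simulation. This is not the netmap C API: `Slot` and `reassemble` are hypothetical stand-ins for a receive-ring slot and the application's receive loop, and the flag value is assumed to match the one defined in netmap's headers.

```python
from dataclasses import dataclass

NS_MOREFRAG = 0x0020  # assumed to match the flag value in netmap.h


@dataclass
class Slot:
    """Stand-in for one RX slot: the buffer contents plus the flags word."""
    data: bytes
    flags: int = 0


def reassemble(slots):
    """Yield complete frames, joining slots chained with NS_MOREFRAG."""
    parts = []
    for slot in slots:
        parts.append(slot.data)
        if not (slot.flags & NS_MOREFRAG):
            # Flag clear: this slot holds the end of the current frame.
            yield b"".join(parts)
            parts = []
    # Anything left in parts is a frame whose last slot has not arrived yet.


# A 5000-byte jumbo frame split over three 2048-byte buffers, then a
# small frame that fits in a single slot.
ring = [
    Slot(b"A" * 2048, NS_MOREFRAG),
    Slot(b"A" * 2048, NS_MOREFRAG),
    Slot(b"A" * 904),
    Slot(b"B" * 60),
]
frames = list(reassemble(ring))
print([len(f) for f in frames])
```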


-- 
Vincenzo


Re: What is status of `pkt-gen' on FreeBSD?

2018-11-06 Thread Vincenzo Maffione
Hi,
  There are two separate issues here.
The first is updating pkt-gen. This is what I'm trying to do right now
(https://reviews.freebsd.org/D17698). The update will land in HEAD and then,
after 3 days, in stable/12. I will also update stable/11 later.

The second issue looks like a problem related to the em driver. As far as I
know, netmap support for the em driver is now provided by iflib
(sys/net/iflib.c). Maybe there is an issue in iflib_netmap_txsync() that
prevents progress?
Or maybe the interface is down while TX is stuck (thus preventing progress)?
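
For benchmarking scripts, the stuck-TX symptom in the quoted log (the same tail and head reported over and over) can be detected mechanically so the run can be aborted. The sketch below parses lines in the format shown in the quoted output; the function name and the threshold of five identical reports are illustrative assumptions, not pkt-gen features.

```python
import re

PENDING = re.compile(r"pending tx tail (\d+) head (\d+) on ring (\d+)")


def stalled_rings(log_lines, threshold=5):
    """Return the rings whose (tail, head) pair repeated `threshold` times in a row."""
    last = {}      # ring -> last (tail, head) seen
    repeats = {}   # ring -> number of consecutive identical reports
    stalled = set()
    for line in log_lines:
        match = PENDING.search(line)
        if not match:
            continue
        tail, head, ring = (int(g) for g in match.groups())
        if last.get(ring) == (tail, head):
            repeats[ring] += 1
        else:
            last[ring] = (tail, head)
            repeats[ring] = 1
        if repeats[ring] >= threshold:
            stalled.add(ring)
    return stalled


sample = ["237.00 sender_body [1687] pending tx tail 446 head 448 on ring 0"] * 5
print(stalled_rings(sample))
```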

Cheers,
  Vincenzo

On Mon, Nov 5, 2018 at 22:21, Lev Serebryakov
wrote:

> Hello Freebsd-net,
>
>  Is `pkt-gen' (for netmap) supported on FreeBSD?
>
>  ${SRCTOP}/tools/tools/netmap/pkt-gen.c is very old and could not be built
> (I've checked stable/11, stable/12 and head).
>
>  ${PORTS}/net/pkt-gen is not so old, but more than year old + patches
>
>  pkt-gen from github could be built on CURRENT, but can not finish
>  transmission:
>
> 236.545902 main_thread [2605] 0 pps (0 pkts 0 bps in 1031501 usec) 0.00
> avg_batch 9 min_space
> 237.00 sender_body [1687] pending tx tail 446 head 448 on ring 0
> 237.07 sender_body [1687] pending tx tail 446 head 448 on ring 0
> 237.11 sender_body [1687] pending tx tail 446 head 448 on ring 0
> 237.16 sender_body [1687] pending tx tail 446 head 448 on ring 0
> 237.20 sender_body [1687] pending tx tail 446 head 448 on ring 0
> 237.607508 main_thread [2605] 0 pps (0 pkts 0 bps in 1061606 usec) 0.00
> avg_batch 9 min_space
> 238.00 sender_body [1687] pending tx tail 446 head 448 on ring 0
> 238.05 sender_body [1687] pending tx tail 446 head 448 on ring 0
> 238.09 sender_body [1687] pending tx tail 446 head 448 on ring 0
> 238.13 sender_body [1687] pending tx tail 446 head 448 on ring 0
> 238.17 sender_body [1687] pending tx tail 446 head 448 on ring 0
> (forever)
>
>  and can not be used in benchmarking scripts...
>
>  Which one should I use? How can I be sure that transmission of a given
> number of packets will finish in finite time no matter what? Is this
> problem with the tx queue really a driver problem or a netmap problem?
>
>
> --
> Best regards,
>  Lev  mailto:l...@freebsd.org
>
>


-- 
Vincenzo


Re: 11.2-STABLE: netmap/pkt-gen can not allocate memory

2018-11-02 Thread Vincenzo Maffione
Hi,
  It looks like there is not enough memory for netmap to allocate its data
structures.
What is the output of

# sysctl dev.netmap

?

Cheers,
  Vincenzo

On Fri, Nov 2, 2018 at 14:02, Lev Serebryakov
wrote:

> On 02.11.2018 14:31, Lev Serebryakov wrote:
>
>
> > $ sudo ./pkt-gen -f rx -i igb1
>  and pkt-gen from ports complains about invalid interface:
>
> 622.603767 main [2699] interface is igb1
> 622.603783 main [2824] using default burst size: 512
> 622.603786 main [2832] running on 1 cpus (have 4)
> 622.603841 extract_ip_range [465] range is 10.0.0.1:1234 to 10.0.0.1:1234
> 622.603846 extract_ip_range [465] range is 10.1.0.1:1234 to 10.1.0.1:1234
> 622.603909 nm_open [920] NIOCREGIF failed: Invalid argument igb1
> 622.603912 main [2913] Unable to open netmap:igb1: Invalid argument
> 622.603914 main [2994] aborting
>
>
> --
> // Lev Serebryakov
>
>

-- 
Vincenzo


Re: vale and netmap module questions

2018-10-04 Thread Vincenzo Maffione
I created a new patch on reviews.freebsd.org:
https://reviews.freebsd.org/D17411
I need to wait for my mentor's approval before writing to re@.

Thanks,
  Vincenzo



On Thu, Oct 4, 2018 at 17:25, Rodney W. Grimes <
freebsd-...@pdx.rh.cn85.dnsmgr.net> wrote:

> > Hi Rodney,
> >   Sure. Is there a specific procedure for this, or should I just send the
> > .diff to r...@freebsd.org?
>
> Yes, there is a procedure:
> https://wiki.freebsd.org/Releng/ChangeRequestGuidelines
>
> the specific details for when the release is ^head is
> towards the bottom.
>
> Thanks again,
> >
> > Thanks,
> >   Vincenzo
> >
> > On Thu, Oct 4, 2018 at 17:03, Rodney W. Grimes <
> > freebsd-...@pdx.rh.cn85.dnsmgr.net> wrote:
> >
> > > > Hi,
> > > >   I fixed this upstream in this commit
> > > > https://github.com/luigirizzo/netmap/commit/282f404ccb4fc777056a591127060e29657c9ab7
> > > > I will apply the change into FreeBSD as soon as HEAD thaws.
> > >
> > > This appears to be a man page change only,
> > > please do submit that to r...@freebsd.org to be
> > > committed now.
> > >
> > > Thanks,
> > > Rod
> > >
> > > > Cheers,
> > > >   Vincenzo
> > > >
> > > > On Thu, Oct 4, 2018 at 00:36, K. Macy wrote:
> > > >
> > > > > On Fri, Aug 31, 2018 at 6:50 PM John-Mark Gurney wrote:
> > > > > >
> > > > > > First, does vale work for anyone?  At least one of the documented
> > > > > > commands in vale(4) does not work.
> > > > > >
> > > > > The documentation with respect to naming is wrong. I didn't have a bit
> > > > > when I was using it so never got around to fixing it.
> > > > >
> > > > >
> > > > > > After manually building the netmap module and loading it:
> > > > > > # tcpdump -ni vale-a:1
> > > > > > 313.748851 nm_open [947] invalid bridge name vale-a:1
> > > > > > tcpdump: netmap open: cannot access vale-a:1: Invalid argument
> > > > > >
> > > > > > If I run tcpdump with a more correct looking name of vale1:a, I get a
> > > > > > null deref panic in ifunit_ref.  Full trace is at the end.
> > > > > >
> > > > > > Second, is there a good reason why the netmap module is still
> > > > > > disconnected from being built as a module?  I guess not working
> > > > > > would be one, but I figure the above might be an aarch64 specific
> > > > > > problem, and not a general issue.
> > > > > >
> > > > > > FreeBSD generic 12.0-ALPHA3 FreeBSD 12.0-ALPHA3 #0 r338296M: Sat Aug 25 06:17:25 UTC 2018 freebsd@generic:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC arm64
> > > > > >
> > > > > > netmap: loaded module
> > > > > > Fatal data abort:
> > > > > >   x0: 5894f900
> > > > > >   x1: 40351028
> > > > > >   x2:5
> > > > > >   x3: 405e4090
> > > > > >   x4: 4000
> > > > > >   x5:0
> > > > > >   x6:0
> > > > > >   x7: 40351cb0
> > > > > >   x8:0
> > > > > >   x9: 00a09960
> > > > > >  x10: 5894f908
> > > > > >  x11:0
> > > > > >  x12:1
> > > > > >  x13: fcffc000
> > > > > >  x14:3
> > > > > >  x15:1
> > > > > >  x16: 5cd35580
> > > > > >  x17: 00476de0
> > > > > >  x18: 40351010
> > > > > >  x19: 40351160
> > > > > >  x20: fd000739798e
> > > > > >  x21: 00a02000
> > > > > >  x22:1
> > > > > >  x23:0
> > > > > >  x24: fd0007397988
> > > > > >  x25:0
> > > > > >  x26:1
> > > > > >  x27:0
> > > > > >  x28: fd0007397982
> > > > > >  x29: 40351050
> > > > > >   sp: 40351010
> > > > > >   lr: 00476e0c
> > > > > >  elr: 00476e1c
> > > > > > spsr: 6345
> > > > > >  far:   28
> > > > > >  esr: 9607
> > > > > > [ thread pid 802 tid 100096 ]
> > > > > > Stopped at  ifunit_ref+0x3c:ldr x8, [x8, #40]
> > > > > > db> bt
> > > > > > Tracing pid 802 tid 100096 td 0xfd0007422000
> > > > > > db_trace_self() at db_stack_trace+0xf0
> > > > > >  pc = 0x0069b81c  lr = 0x000d6b28
> > > > > >  sp = 0x40350920  fp = 0x40350950
> > > > > >
> > > > > > db_stack_trace() at db_command+0x220
> > > > > >  pc = 0x000d6b28  lr = 0x000d67ac
> > > > > >  sp = 0x40350960  fp = 0x40350a40
> > > > > >
> > > > > > db_command() at db_command_loop+0x60
> > > > > >  pc = 0x000d67ac  lr = 0x000d6570
> > > > > >  sp = 0x40350a50  fp = 0x40350a70
> > > > > >
> > > > > > db_command_loop() at db_trap+0xf4
> > > > > >  pc = 0x000d6570  lr = 0x000d9694
> > > > > >  sp = 0x40350a80  fp = 0x40350ca0
> > > > > >
> > > > > > db_trap() at kdb_trap+0x1d8
> > > > > >  pc = 0x000d9694  lr = 

Re: vale and netmap module questions

2018-10-04 Thread Vincenzo Maffione
Hi Rodney,
  Sure. Is there a specific procedure for this, or should I just send the
.diff to r...@freebsd.org?

Thanks,
  Vincenzo

On Thu, Oct 4, 2018 at 17:03, Rodney W. Grimes <
freebsd-...@pdx.rh.cn85.dnsmgr.net> wrote:

> > Hi,
> >   I fixed this upstream in this commit
> > https://github.com/luigirizzo/netmap/commit/282f404ccb4fc777056a591127060e29657c9ab7
> > I will apply the change into FreeBSD as soon as HEAD thaws.
>
> This appears to be a man page change only,
> please do submit that to r...@freebsd.org to be
> committed now.
>
> Thanks,
> Rod
>
> > Cheers,
> >   Vincenzo
> >
> > On Thu, Oct 4, 2018 at 00:36, K. Macy wrote:
> >
> > > On Fri, Aug 31, 2018 at 6:50 PM John-Mark Gurney wrote:
> > > >
> > > > First, does vale work for anyone?  At least one of the documented
> > > > commands in vale(4) does not work.
> > > >
> > > The documentation with respect to naming is wrong. I didn't have a bit
> > > when I was using it so never got around to fixing it.
> > >
> > >
> > > > After manually building the netmap module and loading it:
> > > > # tcpdump -ni vale-a:1
> > > > 313.748851 nm_open [947] invalid bridge name vale-a:1
> > > > tcpdump: netmap open: cannot access vale-a:1: Invalid argument
> > > >
> > > > If I run tcpdump with a more correct looking name of vale1:a, I get a
> > > > null deref panic in ifunit_ref.  Full trace is at the end.
> > > >
> > > > Second, is there a good reason why the netmap module is still
> > > > disconnected from being built as a module?  I guess not working
> > > > would be one, but I figure the above might be an aarch64 specific
> > > > problem, and not a general issue.
> > > >
> > > > FreeBSD generic 12.0-ALPHA3 FreeBSD 12.0-ALPHA3 #0 r338296M: Sat Aug 25 06:17:25 UTC 2018 freebsd@generic:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC arm64
> > > >
> > > > netmap: loaded module
> > > > Fatal data abort:
> > > >   x0: 5894f900
> > > >   x1: 40351028
> > > >   x2:5
> > > >   x3: 405e4090
> > > >   x4: 4000
> > > >   x5:0
> > > >   x6:0
> > > >   x7: 40351cb0
> > > >   x8:0
> > > >   x9: 00a09960
> > > >  x10: 5894f908
> > > >  x11:0
> > > >  x12:1
> > > >  x13: fcffc000
> > > >  x14:3
> > > >  x15:1
> > > >  x16: 5cd35580
> > > >  x17: 00476de0
> > > >  x18: 40351010
> > > >  x19: 40351160
> > > >  x20: fd000739798e
> > > >  x21: 00a02000
> > > >  x22:1
> > > >  x23:0
> > > >  x24: fd0007397988
> > > >  x25:0
> > > >  x26:1
> > > >  x27:0
> > > >  x28: fd0007397982
> > > >  x29: 40351050
> > > >   sp: 40351010
> > > >   lr: 00476e0c
> > > >  elr: 00476e1c
> > > > spsr: 6345
> > > >  far:   28
> > > >  esr: 9607
> > > > [ thread pid 802 tid 100096 ]
> > > > Stopped at  ifunit_ref+0x3c:ldr x8, [x8, #40]
> > > > db> bt
> > > > Tracing pid 802 tid 100096 td 0xfd0007422000
> > > > db_trace_self() at db_stack_trace+0xf0
> > > >  pc = 0x0069b81c  lr = 0x000d6b28
> > > >  sp = 0x40350920  fp = 0x40350950
> > > >
> > > > db_stack_trace() at db_command+0x220
> > > >  pc = 0x000d6b28  lr = 0x000d67ac
> > > >  sp = 0x40350960  fp = 0x40350a40
> > > >
> > > > db_command() at db_command_loop+0x60
> > > >  pc = 0x000d67ac  lr = 0x000d6570
> > > >  sp = 0x40350a50  fp = 0x40350a70
> > > >
> > > > db_command_loop() at db_trap+0xf4
> > > >  pc = 0x000d6570  lr = 0x000d9694
> > > >  sp = 0x40350a80  fp = 0x40350ca0
> > > >
> > > > db_trap() at kdb_trap+0x1d8
> > > >  pc = 0x000d9694  lr = 0x003cdf70
> > > >  sp = 0x40350cb0  fp = 0x40350d60
> > > >
> > > > kdb_trap() at data_abort+0x1e0
> > > >  pc = 0x003cdf70  lr = 0x006b5ca4
> > > >  sp = 0x40350d70  fp = 0x40350e20
> > > >
> > > > data_abort() at do_el1h_sync+0x11c
> > > >  pc = 0x006b5ca4  lr = 0x006b59c0
> > > >  sp = 0x40350e30  fp = 0x40350e60
> > > >
> > > > do_el1h_sync() at handle_el1h_sync+0x74
> > > >  pc = 0x006b59c0  lr = 0x0069d874
> > > >  sp = 0x40350e70  fp = 0x40350f80
> > > >
> > > > handle_el1h_sync() at ifunit_ref+0x28
> > > >  pc = 0x0069d874  lr = 0x00476e08
> > > >  sp = 0x40350f90  fp = 0x40351050
> > > >
> > > > ifunit_ref() at netmap_get_bdg_na+0x194
> > > >  pc = 0x00476e08  lr = 0x5cd20b24

Re: vale and netmap module questions

2018-10-04 Thread Vincenzo Maffione
Hi,
  I fixed this upstream in this commit
https://github.com/luigirizzo/netmap/commit/282f404ccb4fc777056a591127060e29657c9ab7
I will apply the change into FreeBSD as soon as HEAD thaws.

Cheers,
  Vincenzo

On Thu, Oct 4, 2018 at 00:36, K. Macy wrote:

> On Fri, Aug 31, 2018 at 6:50 PM John-Mark Gurney wrote:
> >
> > First, does vale work for anyone?  At least one of the documented
> > commands in vale(4) does not work.
> >
> The documentation with respect to naming is wrong. I didn't have a bit
> when I was using it so never got around to fixing it.
>
>
> > After manually building the netmap module and loading it:
> > # tcpdump -ni vale-a:1
> > 313.748851 nm_open [947] invalid bridge name vale-a:1
> > tcpdump: netmap open: cannot access vale-a:1: Invalid argument
> >
> > If I run tcpdump with a more correct looking name of vale1:a, I get a
> > null deref panic in ifunit_ref.  Full trace is at the end.
> >
> > Second, is there a good reason why the netmap module is still
> > disconnected from being built as a module?  I guess not working
> > would be one, but I figure the above might be an aarch64 specific
> > problem, and not a general issue.
> >
> > FreeBSD generic 12.0-ALPHA3 FreeBSD 12.0-ALPHA3 #0 r338296M: Sat Aug 25
> 06:17:25 UTC 2018 
> freebsd@generic:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC
> arm64
> >
> > netmap: loaded module
> > Fatal data abort:
> >   x0: 5894f900
> >   x1: 40351028
> >   x2:5
> >   x3: 405e4090
> >   x4: 4000
> >   x5:0
> >   x6:0
> >   x7: 40351cb0
> >   x8:0
> >   x9: 00a09960
> >  x10: 5894f908
> >  x11:0
> >  x12:1
> >  x13: fcffc000
> >  x14:3
> >  x15:1
> >  x16: 5cd35580
> >  x17: 00476de0
> >  x18: 40351010
> >  x19: 40351160
> >  x20: fd000739798e
> >  x21: 00a02000
> >  x22:1
> >  x23:0
> >  x24: fd0007397988
> >  x25:0
> >  x26:1
> >  x27:0
> >  x28: fd0007397982
> >  x29: 40351050
> >   sp: 40351010
> >   lr: 00476e0c
> >  elr: 00476e1c
> > spsr: 6345
> >  far:   28
> >  esr: 9607
> > [ thread pid 802 tid 100096 ]
> > Stopped at  ifunit_ref+0x3c:ldr x8, [x8, #40]
> > db> bt
> > Tracing pid 802 tid 100096 td 0xfd0007422000
> > db_trace_self() at db_stack_trace+0xf0
> >  pc = 0x0069b81c  lr = 0x000d6b28
> >  sp = 0x40350920  fp = 0x40350950
> >
> > db_stack_trace() at db_command+0x220
> >  pc = 0x000d6b28  lr = 0x000d67ac
> >  sp = 0x40350960  fp = 0x40350a40
> >
> > db_command() at db_command_loop+0x60
> >  pc = 0x000d67ac  lr = 0x000d6570
> >  sp = 0x40350a50  fp = 0x40350a70
> >
> > db_command_loop() at db_trap+0xf4
> >  pc = 0x000d6570  lr = 0x000d9694
> >  sp = 0x40350a80  fp = 0x40350ca0
> >
> > db_trap() at kdb_trap+0x1d8
> >  pc = 0x000d9694  lr = 0x003cdf70
> >  sp = 0x40350cb0  fp = 0x40350d60
> >
> > kdb_trap() at data_abort+0x1e0
> >  pc = 0x003cdf70  lr = 0x006b5ca4
> >  sp = 0x40350d70  fp = 0x40350e20
> >
> > data_abort() at do_el1h_sync+0x11c
> >  pc = 0x006b5ca4  lr = 0x006b59c0
> >  sp = 0x40350e30  fp = 0x40350e60
> >
> > do_el1h_sync() at handle_el1h_sync+0x74
> >  pc = 0x006b59c0  lr = 0x0069d874
> >  sp = 0x40350e70  fp = 0x40350f80
> >
> > handle_el1h_sync() at ifunit_ref+0x28
> >  pc = 0x0069d874  lr = 0x00476e08
> >  sp = 0x40350f90  fp = 0x40351050
> >
> > ifunit_ref() at netmap_get_bdg_na+0x194
> >  pc = 0x00476e08  lr = 0x5cd20b24
> >  sp = 0x40351060  fp = 0x403510c0
> >
> > netmap_get_bdg_na() at netmap_get_na+0x1b0
> >  pc = 0x5cd20b24  lr = 0x5cd15994
> >  sp = 0x403510d0  fp = 0x40351120
> >
> > netmap_get_na() at netmap_ioctl+0xcd0
> >  pc = 0x5cd15994  lr = 0x5cd18160
> >  sp = 0x40351130  fp = 0x403511f0
> >
> > netmap_ioctl() at netmap_ioctl_legacy+0x4b8
> >  pc = 0x5cd18160  lr = 0x5cd2b188
> >  sp = 0x40351200  fp = 0x403515b0
> >
> > netmap_ioctl_legacy() at netmap_ioctl+0x15c
> >  pc = 0x5cd2b188  lr = 0x5cd175ec
> >  sp = 0x403515c0  fp = 0x40351680
> >
> > netmap_ioctl() at freebsd_netmap_ioctl+0x4c
> >  pc = 

Re: vale and netmap module questions

2018-09-06 Thread Vincenzo Maffione
On Thu, Sep 6, 2018 at 08:35, Marko Zec wrote:

> On Wed, 5 Sep 2018 17:42:06 -0700
> John-Mark Gurney  wrote:
>
> > Marko Zec wrote this message on Wed, Sep 05, 2018 at 12:47 +0200:
> > > On Wed, 5 Sep 2018 12:36:38 +0200
> > > Vincenzo Maffione  wrote:
> > >
> > > > Hi Marko,
> > > >   Thanks a lot for identifying the problem.
> > > > If I understand correctly, simply adding -D VIMAGE here
> > > >
> https://github.com/luigirizzo/netmap/blob/master/sys/modules/netmap/Makefile#L11
> > > > would at least mitigate the issue.
> > > > If you think I'm right I'll just add it.
> > >
> > > Right, go for it...
> >
> > Why not just hook up netmap to the build?
>
> I have no idea why it is on by default only on amd64...  Perhaps due to
> the lack of adopters / testers on other platforms?  Perhaps the code has
> some assumptions re. memory coherence model, or unaligned accesses,
> which hold the water on amd64 but not necessarily elsewhere?
>

We used netmap on aarch64 (on Linux) with no problems.
There can certainly be bugs on untested platforms, but there are no
assumptions about the cache coherence
model. And we do not use unaligned access, of course, which would slow down
the whole thing.


>
> > Because if you add -DVIMAGE to the Makefile, you'll now break people
> > who have kernels w/o VIMAGE..
>
> No, and this is certifiable.  At least in this particular case, we have
> only two macro instances which set / restore curvnet prior to calling
> into the network stack.  If the network stack is of the non-VNET kind,
> and thus doesn't care about curvnet, that won't do any harm.
>

Yes indeed. I just tried on FreeBSD 10.3 (with VIMAGE off) and netmap works
when compiled with -DVIMAGE.
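
To make the point about the set/restore macros concrete, here is a small
user-space model, not kernel code. All demo_* names are hypothetical
stand-ins: demo_curvnet plays the role of curvnet, and the two callers model
a module built with and without the VIMAGE option. The real kernel would
fault on the NULL dereference; the model returns -1 instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-vnet network stack instance. */
struct demo_vnet { int n_ifs; };

/* Models the "current vnet" context pointer (NULL when nobody set it). */
static struct demo_vnet *demo_curvnet = NULL;

/* Models a network-stack function that trusts the caller to have set
 * the context. The kernel would dereference blindly and panic; this
 * model reports the problem with -1 instead. */
static int
demo_lookup(void)
{
	if (demo_curvnet == NULL)
		return -1;
	return demo_curvnet->n_ifs;
}

/* Caller built "with VIMAGE": sets the context, calls, then restores. */
static int
demo_call_with_context(struct demo_vnet *v)
{
	struct demo_vnet *saved = demo_curvnet;
	int r;

	demo_curvnet = v;
	r = demo_lookup();
	demo_curvnet = saved;
	return r;
}

/* Caller built "without VIMAGE": the set/restore steps compile to
 * nothing, so the stack function runs with no context at all. */
static int
demo_call_without_context(struct demo_vnet *v)
{
	(void)v;
	return demo_lookup();
}
```

On a non-VNET stack the lookup would simply not consult the context, which is
why setting it unconditionally is harmless there.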


>
> > And the only reason to build netmap module manually is because it's not
> hooked up.
>
> There's still a reason to retain the possibility to build netmap as
> module, as this speeds up the development cycles for netmap-internal
> hackers / experimenters.
>

> But beyond that, running netmap without native support in device
> drivers is pretty much pointless.
>

Not really. Netmap exports software-only ports, like pipes, monitor, VALE
ports, etc, that you can use with profit even
without native support.
The real problem, if I recall correctly, is that FreeBSD modules introduce
a significant overhead every time you need
to invoke a module function from the kernel core image. I guess it is a
matter of reference counters and locks.
So running netmap as a module leads to significantly lower performance. But
it is ok for regression testing and
experimentation.

-- 
Vincenzo
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: vale and netmap module questions

2018-09-05 Thread Vincenzo Maffione
Hi Marko,
  Thanks a lot for identifying the problem.
If I understand correctly, simply adding -D VIMAGE here
https://github.com/luigirizzo/netmap/blob/master/sys/modules/netmap/Makefile#L11
would at least mitigate the issue.
If you think I'm right I'll just add it.

Thanks,
  Vincenzo

On Wed, Sep 5, 2018 at 10:44, Marko Zec wrote:

> On Tue, 4 Sep 2018 20:18:29 +0200
> Vincenzo Maffione  wrote:
>
> > Hi,
> >   I don't think the panic depends on the architecture. Also, it does
> > not depend on using emulated mode or not.
> > We have seen this happening on x86_64 on FreeBSD 12 head.
> >
> > As John-Mark says, kdloading netmap.ko is not enough. You need to
> > open a netmap port corresponding
> > to any network interface, e.g.
> >  # pkt-gen -i eth0
> > because this will trigger ifunit_ref() in the kernel (netmap tries to
> > grab the interface called "eth0").
> > The ifunit_ref() panics and we still need to figure out why.
>
> After more carefully rereading jmg's original report, on amd64 I
> reproduced exactly the same panic he reported on arm64, the summary
> follows:
>
> - the problem can be observed as VNET related, and can be triggered only
> on kernels built with "options VIMAGE" but without "device netmap".
>
> - netmap.ko built using make buildkernel works fine.
>
> - netmap.ko built separately (cd sys/modules/netmap; make) panics on a
>   first reference to vale interface, such as "tcpdump -ni vale1:a"
>
> - the problem is that the separately built module is compiled without
>   "options VIMAGE", and as such does not set curvnet prior to calling
>   into network stack functions, therefore we have a null pointer
>   dereference in ifunit_ref() and voila...
>
> - hence, there's nothing to fix in the netmap code, and there's nothing
>   to fix in the network stack either related to this particular problem,
>   but rather we need a general mechanism which would prevent kldloading
>   non-VNETized .ko modules into a VNETized kernel.  I have no idea how
>   to approach this, besides adding bz@ to the cc: list, perhaps he
>   could chime in with some thoughts?
>
> Cheers,
>
> Marko
>


-- 
Vincenzo


Re: vale and netmap module questions

2018-09-04 Thread Vincenzo Maffione
Hi,
  I don't think the panic depends on the architecture. Also, it does not
depend on using emulated mode or not.
We have seen this happening on x86_64 on FreeBSD 12 head.

As John-Mark says, kldloading netmap.ko is not enough. You need to open a
netmap port corresponding
to any network interface, e.g.
 # pkt-gen -i eth0
because this will trigger ifunit_ref() in the kernel (netmap tries to grab
the interface called "eth0").
The ifunit_ref() panics and we still need to figure out why.

Cheers,
  Vincenzo


On Tue, Sep 4, 2018 at 18:28, John-Mark Gurney wrote:

> Marko Zec wrote this message on Tue, Sep 04, 2018 at 16:43 +0200:
> > On Sat, 1 Sep 2018 14:11:23 -0700
> > John-Mark Gurney  wrote:
> > > Vincenzo Maffione wrote this message on Sat, Sep 01, 2018 at 22:25
> > > +0200:
> > ...
> > > > On x86_64 netmap is not built as a module, so everything works
> > > > fine. I don't see any reason why it should be a module in aarch64.
> > >
> > > Well, sys/modules/netmap exists... If it isn't planned on ever being
> > > made to work, it should be removed so people don't get confused, or
> > > at least marked broken so it doesn't get built...
> > >
> > > I built it manually because it was quicker than recompiling an entire
> > > kernel and rebooting...
> >
> > Hi John-Mark,
> >
> > out of curiosity I tested all four kernel config combinations with
> > "device netmap" and "options VIMAGE" being on and off (both are on by
> > default now in GENERIC) on amd64 @r338446, and found that kldloading
> > netmap.ko can't provoke a panic.  Was there a particular sequence of
> > commands issued after kldloading netmap which led to the crash you
> > reported earlier?
>
> Nope.  I would kldload netmap, and then run the tcpdump command listed
> in the original report, and it would just panic...
>
> Also note that my panic was on arm64, NOT amd64.. so it could be
> something platform specific...
>
> > Nevertheless, note that building the kernel without "device netmap" is
> > borderline pointless even if netmap core built as a kld module works, as
> > this will result in all the drivers being built without the required
> > netmap bits, which means they will only work in the painfully slow
> > "emulation" mode with netmap.
>
> I was only doing it to test something out quickly..
>
> > Perhaps the panic you stepped into was related to the emulation mode
> > being used with netmap, instead of using the native netmap hooks in
> > device drivers?  Or maybe was it vale + VNET related?
>
> All I know was that it was the arm64 GENERIC kernel + module netmap
> + running tcpdump w/ that command...  Nothing special configured, just
> a single ethernet interface configured w/ DHCP.  No firewall configured,
> just sshd and ntpd enabled..
>
> --
>   John-Mark Gurney  Voice: +1 415 225 5579
>
>  "All that I will do, has been done, All that I have, has not."
>


-- 
Vincenzo


Re: vale and netmap module questions

2018-09-02 Thread Vincenzo Maffione
On Sat, Sep 1, 2018 at 23:11, John-Mark Gurney wrote:

> Vincenzo Maffione wrote this message on Sat, Sep 01, 2018 at 22:25 +0200:
> > On Sat, Sep 1, 2018 at 03:50, John-Mark Gurney <j...@funkthat.com> wrote:
> >
> > > First, does vale work for anyone?  At least one of the documented
> > > commands in vale(4) does not work.
> > >
> > > After manually building the netmap module and loading it:
> > > # tcpdump -ni vale-a:1
> > > 313.748851 nm_open [947] invalid bridge name vale-a:1
> > > tcpdump: netmap open: cannot access vale-a:1: Invalid argument
> >
> > That name is invalid. See netmap(4).
>
> Are there plans to update the documentation then?  It seems like
> vale(4) should be a more authoritative reference for vale's naming
> than netmap(4)...
>

You are right. I just updated the man page in the netmap github. I will
commit this to FreeBSD soon.


>
> > > If I run tcpdump with a more correct looking name of vale1:a, I get a
> > > null deref panic in ifunit_ref.  Full trace is at the end.
> >
> > Yes, this is a known bug, already posted to this mailing list. Don't
> build
> > netmap as a module, but link it in the kernel image and it will work.
> > (Add "dev netmap" to the kernel config).
> >
> > > Second, is there a good reason why the netmap module is still
> > > disconnected from being built as a module?  I guess not working
> > > would be one, but I figure the above might be an aarch64 specific
> > > problem, and not a general issue.
> >
> > On x86_64 netmap is not built as a module, so everything works fine. I
> > don't see any reason why it should be a module in aarch64.
>
> Well, sys/modules/netmap exists... If it isn't planned on ever being
> made to work, it should be removed so people don't get confused, or
> at least marked broken so it doesn't get built...
>
> I built it manually because it was quicker than recompiling an entire
> kernel and rebooting...
>

That's right. It used to work fine, then something must have changed in the
way ifunit_ref()
must be used from modules (CURVNET related?). I have not had the time to
dig into this,
so if anyone has any suggestion please tell me.
Is there a standard way to mark a module as broken so that it does not get
built as a module,
but it still gets built statically?


>
> --
>   John-Mark Gurney  Voice: +1 415 225 5579
>
>  "All that I will do, has been done, All that I have, has not."
>


-- 
Vincenzo


Re: vale and netmap module questions

2018-09-01 Thread Vincenzo Maffione
On Sat, Sep 1, 2018 at 03:50, John-Mark Gurney wrote:

> First, does vale work for anyone?  At least one of the documented
> commands in vale(4) does not work.
>
> After manually building the netmap module and loading it:
> # tcpdump -ni vale-a:1
> 313.748851 nm_open [947] invalid bridge name vale-a:1
> tcpdump: netmap open: cannot access vale-a:1: Invalid argument
>

That name is invalid. See netmap(4).


>
> If I run tcpdump with a more correct looking name of vale1:a, I get a
> null deref panic in ifunit_ref.  Full trace is at the end.
>

Yes, this is a known bug, already posted to this mailing list. Don't build
netmap as a module, but link it in the kernel image and it will work.
(Add "device netmap" to the kernel config).


>
> Second, is there a good reason why the netmap module is still
> disconnected from being built as a module?  I guess not working
> would be one, but I figure the above might be an aarch64 specific
> problem, and not a general issue.
>

On x86_64 netmap is not built as a module, so everything works fine. I
don't see any reason why it should be a module in aarch64.

Cheers,
  Vincenzo


>
> FreeBSD generic 12.0-ALPHA3 FreeBSD 12.0-ALPHA3 #0 r338296M: Sat Aug 25
> 06:17:25 UTC 2018 
> freebsd@generic:/usr/obj/usr/src/arm64.aarch64/sys/GENERIC
> arm64
>
> netmap: loaded module
> Fatal data abort:
>   x0: 5894f900
>   x1: 40351028
>   x2:5
>   x3: 405e4090
>   x4: 4000
>   x5:0
>   x6:0
>   x7: 40351cb0
>   x8:0
>   x9: 00a09960
>  x10: 5894f908
>  x11:0
>  x12:1
>  x13: fcffc000
>  x14:3
>  x15:1
>  x16: 5cd35580
>  x17: 00476de0
>  x18: 40351010
>  x19: 40351160
>  x20: fd000739798e
>  x21: 00a02000
>  x22:1
>  x23:0
>  x24: fd0007397988
>  x25:0
>  x26:1
>  x27:0
>  x28: fd0007397982
>  x29: 40351050
>   sp: 40351010
>   lr: 00476e0c
>  elr: 00476e1c
> spsr: 6345
>  far:   28
>  esr: 9607
> [ thread pid 802 tid 100096 ]
> Stopped at  ifunit_ref+0x3c:ldr x8, [x8, #40]
> db> bt
> Tracing pid 802 tid 100096 td 0xfd0007422000
> db_trace_self() at db_stack_trace+0xf0
>  pc = 0x0069b81c  lr = 0x000d6b28
>  sp = 0x40350920  fp = 0x40350950
>
> db_stack_trace() at db_command+0x220
>  pc = 0x000d6b28  lr = 0x000d67ac
>  sp = 0x40350960  fp = 0x40350a40
>
> db_command() at db_command_loop+0x60
>  pc = 0x000d67ac  lr = 0x000d6570
>  sp = 0x40350a50  fp = 0x40350a70
>
> db_command_loop() at db_trap+0xf4
>  pc = 0x000d6570  lr = 0x000d9694
>  sp = 0x40350a80  fp = 0x40350ca0
>
> db_trap() at kdb_trap+0x1d8
>  pc = 0x000d9694  lr = 0x003cdf70
>  sp = 0x40350cb0  fp = 0x40350d60
>
> kdb_trap() at data_abort+0x1e0
>  pc = 0x003cdf70  lr = 0x006b5ca4
>  sp = 0x40350d70  fp = 0x40350e20
>
> data_abort() at do_el1h_sync+0x11c
>  pc = 0x006b5ca4  lr = 0x006b59c0
>  sp = 0x40350e30  fp = 0x40350e60
>
> do_el1h_sync() at handle_el1h_sync+0x74
>  pc = 0x006b59c0  lr = 0x0069d874
>  sp = 0x40350e70  fp = 0x40350f80
>
> handle_el1h_sync() at ifunit_ref+0x28
>  pc = 0x0069d874  lr = 0x00476e08
>  sp = 0x40350f90  fp = 0x40351050
>
> ifunit_ref() at netmap_get_bdg_na+0x194
>  pc = 0x00476e08  lr = 0x5cd20b24
>  sp = 0x40351060  fp = 0x403510c0
>
> netmap_get_bdg_na() at netmap_get_na+0x1b0
>  pc = 0x5cd20b24  lr = 0x5cd15994
>  sp = 0x403510d0  fp = 0x40351120
>
> netmap_get_na() at netmap_ioctl+0xcd0
>  pc = 0x5cd15994  lr = 0x5cd18160
>  sp = 0x40351130  fp = 0x403511f0
>
> netmap_ioctl() at netmap_ioctl_legacy+0x4b8
>  pc = 0x5cd18160  lr = 0x5cd2b188
>  sp = 0x40351200  fp = 0x403515b0
>
> netmap_ioctl_legacy() at netmap_ioctl+0x15c
>  pc = 0x5cd2b188  lr = 0x5cd175ec
>  sp = 0x403515c0  fp = 0x40351680
>
> netmap_ioctl() at freebsd_netmap_ioctl+0x4c
>  pc = 0x5cd175ec  lr = 0x5cd25f94
>  sp = 0x40351690  fp = 0x403516b0
>
> freebsd_netmap_ioctl() at devfs_ioctl+0xc4
>  pc = 0x5cd25f94  lr = 0x0025f1ac
>  sp = 0x403516c0  fp = 

Re: Getting netmap to co-exist with user-space processes that use sockets

2018-08-17 Thread Vincenzo Maffione
Hi,
  What you want to do is definitely possible using the "host rings", aka
"sw rings".
The idea is that netmap intercepts all the packets arriving from the NIC RX
"hardware" ring(s). Your netmap program should then look at the packets and
decide which ones should be forwarded to the host kernel (e.g. to sockets),
and which ones are instead to be processed by your netmap program.
All the packets to be forwarded to the host kernel can be transmitted to
the "host TX ring". The host TX ring is a special ring that simply injects
packets into the kernel.

A similar thing happens for the egress side. Netmap intercepts all the
packets that the host kernel tries to transmit on eth0. Those packets will
show up in the "host RX ring", which is again a special ring. Your netmap
program can then process those, for instance simply forwarding them to the
hardware TX ring(s), so that they exit the eth0 interface.

On a NIC with just 1 RX/TX hardware ring, you basically have 4 netmap
rings. TX0 and RX0 are the hardware rings for transmission and reception.
TX1 and RX1 are host rings, as explained above.
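
The per-packet decision described above can be sketched as a plain C
function. This is only an illustration under stated assumptions: the policy
(keep IPv4/UDP traffic for one destination port, hand everything else to the
host stack) and all demo_* names are hypothetical, IP fragments and VLAN tags
are ignored, and the host ring index helper generalizes the single-ring case
described above to N hardware rings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum demo_dest { DEMO_TO_APP, DEMO_TO_HOST };

/* Assumption: with n hardware rings numbered 0..n-1, the host ring
 * is the next index (n), as in the TX0/TX1 example with n = 1. */
static unsigned
demo_host_ring_index(unsigned n_hw_rings)
{
	return n_hw_rings;
}

/* Decide whether a received Ethernet frame is handled by the netmap
 * program or forwarded to the host TX ring (i.e. into the kernel). */
static enum demo_dest
demo_classify(const uint8_t *frame, size_t len, uint16_t app_udp_port)
{
	const uint8_t *ip, *udp;
	size_t ihl;
	uint16_t dport;

	assert(frame != NULL);
	if (len < 14 + 20 + 8)                      /* too short for UDP */
		return DEMO_TO_HOST;
	if (frame[12] != 0x08 || frame[13] != 0x00) /* not IPv4 */
		return DEMO_TO_HOST;
	ip = frame + 14;
	if ((ip[0] >> 4) != 4)
		return DEMO_TO_HOST;
	ihl = (size_t)(ip[0] & 0x0f) * 4;           /* IP header length */
	if (ihl < 20 || len < 14 + ihl + 8)
		return DEMO_TO_HOST;
	if (ip[9] != 17)                            /* not UDP */
		return DEMO_TO_HOST;
	udp = ip + ihl;
	dport = (uint16_t)((udp[2] << 8) | udp[3]); /* network byte order */
	return dport == app_udp_port ? DEMO_TO_APP : DEMO_TO_HOST;
}
```

A real program would call something like this for each slot of the hardware
RX ring and move DEMO_TO_HOST packets to the host TX ring.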

I highly recommend having a look at the netmap tutorial here
https://github.com/netmap-unipi/netmap-tutorial
Host rings are explained there. There is also a codelab with examples and
solutions you can play with to learn the netmap API.
From the netmap API point of view, host rings are not different from
hardware rings.
Also having a look at the netmap man page can help.
Remember to disable NIC offloads such as checksum offload and TSO, or things
won't work.

Cheers,
  Vincenzo

On Fri, Aug 17, 2018 at 11:33, VO Ipfix wrote:

> Hello there, I would like to use netmap with pptk (emulated driver) to
> generate send traffic from an interface, but still allow rx/tx to get to
> the the kernel so that other user-space networking processes function as
> normal. Currently, if I open an interface eg netmap:eth0, other user space
> processes are unable to perform any networking via sockets. How could I go
> about solving this?
>
> Thanks,
> Victor
> ___
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
>


-- 
Vincenzo


Re: Using netmap on Windows

2018-05-31 Thread Vincenzo Maffione
Hi,
  Unfortunately the support for Windows is still very experimental, and the
developer that did it (a couple years ago) is not working on that anymore.

Also, in that first prototype there was no support for "native" drivers,
i.e. patched drivers for Intel or other NICs. Only VALE, netmap pipes and
monitors were fully functional.
For NIC access, emulated netmap was used.

I think the prototype was developed for Windows 8.1.
All the information we have is in this README file
https://github.com/luigirizzo/netmap/blob/master/WINDOWS/README.txt


2018-05-31 14:29 GMT+02:00 Ramandeep Sandhu :

> Hello /netmap /Experts,
>
> I am new to using /netmap /and want to use it for capturing IP(UDP/RTP)
> packets @ 10 Gbps line rate on Windows. I saw that /netmap /has been
> recently supported on Windows. Answers to following queries will be of
> great help for me to get started:
>
> 1. Which Windows OS is supported ?
>

I think Win7/8/8.1

> 2. Is there a list of cards that support netmap ? I have access to
>    Intel X520-DA2, X540-T2 and Mellanox ConnectX-4 series. Will these
>    work with netmap ?
>

None, just emulated netmap.


> 3. Is there any sample code available for Windows ?
>

The API is the same as FreeBSD/Linux. So any application is good:
https://github.com/luigirizzo/netmap/tree/master/apps

Another thing you may do is to look for Windows-related issues (
https://github.com/luigirizzo/netmap/issues).
There are other Github users that tried to revive this piece of code and
may help you.

Cheers,
  Vincenzo


>
> --
> Thanks & Regards
>
> Ramandeep Sandhu
> Digital Media Group, Interra Systems.
> Email : rsan...@interrasystems.com
> Ph : +91-9810980200, skype : san.raman
> http://www.interrasystems.com
>
> ___
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
>



-- 
Vincenzo Maffione


Re: Support for Dual NIC

2018-04-10 Thread Vincenzo Maffione
Hi,
  In general, if a driver (e.g. ixgbe) supports DEV_NETMAP, then all the
NIC models sharing the same driver will have native netmap support.
Netmap simply does not know about the specific models.

If you are really seeing an issue, it may be related to the recent iflib
refactoring, which now collects driver functionalities common to Intel 1G
drivers (em, lem, igb, ...).
And iflib has support for netmap, so that the same netmap functions are
reused by all the 1G drivers.
From my understanding, both 82575 (dual) and 82576 (quad) are served by
igb, so they should be supported.
One way to check whether netmap kicks in is to inspect the log for lines like these:

net eth0: netmap queues/slots: TX 1/256, RX 1/256
net eth1: netmap queues/slots: TX 1/256, RX 1/256
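
If you want to check such lines from a program, a minimal parsing sketch
could look like the following. It assumes the exact line format of the two
samples above, which may differ between netmap versions and platforms; the
struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record for one "netmap queues/slots" log line. */
struct demo_nm_log {
	char ifname[32];
	unsigned txq, txslots, rxq, rxslots;
};

/* Parse a line such as
 *   net eth0: netmap queues/slots: TX 1/256, RX 1/256
 * Returns 0 on success, -1 if the line does not match. */
static int
demo_parse_nm_log(const char *line, struct demo_nm_log *out)
{
	int n;

	assert(line != NULL && out != NULL);
	n = sscanf(line,
	    "net %31[^:]: netmap queues/slots: TX %u/%u, RX %u/%u",
	    out->ifname, &out->txq, &out->txslots,
	    &out->rxq, &out->rxslots);
	return n == 5 ? 0 : -1;
}
```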


Maybe Matt wants to add something related to iflib.

The log you are seeing is a sanity check: it seems you are seeing packets
larger than 1500 bytes in the host rings. Are you using jumbo frames with
the netmap "bridge" program?

Cheers,
  Vincenzo


Re: netmap: How the buf_num of buffers is used by driver and monitor app

2018-03-30 Thread Vincenzo Maffione
Hi,


2018-03-30 6:00 GMT+02:00 Ming Fu <ming...@esentire.com>:

> Hi,
>
> I am wondering about the netmap module parameters buf_num. The default
> buf_num 163840 seems fairly big compare to intel 10G ixgbe of max 4096
> queue size. A full queue of the interface can only use a very small portion
> of the buf_num. Can the netmap enabled driver like ixgbe use more buffers
> than the "ethtool -G" configured rx/tx queue size?
>

No, netmap will just use the same number of buffers as configured with ethtool
-G. The default buf_num is just a default that covers many things that you
can do.
First of all you would need 4096 for each TX and RX ring (so typically
2*8*4096). Then you need additional 2*4096 buffers for the host rings.
If you create netmap pipes like 'netmap:eth0{1' you need more buffers for
the pipe TX and RX ring.
If you know how many netmap buffers you really need for your purpose you
can just lower buf_num to the minimum number that won't cause your
application to fail with ENOMEM.
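
The arithmetic above can be written down as a small helper. This is a rough
estimate only: the function name is hypothetical, it counts one buffer per
slot for the hardware rings plus one host TX/RX ring pair of the same size,
and it does not include buffers for pipes or anything else.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical estimate of the netmap buffers one interface needs.
 * hw_ring_pairs: number of hardware TX/RX ring pairs
 * slots:         slots per ring (the "ethtool -G" setting) */
static uint64_t
demo_buffers_needed(uint32_t hw_ring_pairs, uint32_t slots)
{
	uint64_t hw, host;

	assert(slots > 0);
	hw = 2ULL * hw_ring_pairs * slots; /* TX + RX hardware rings */
	host = 2ULL * slots;               /* host TX + host RX ring */
	return hw + host;
}
```

For 8 ring pairs of 4096 slots this gives 2*8*4096 + 2*4096 = 73728 buffers,
which is below the default buf_num of 163840 mentioned in the question.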


>
> I need to feed packages captured from netmap device to a few traffic
> monitors on the same box. The monitor application attach to the device as
> netmap monitor with the /r at the end of device name. My question is if the
> primary read of the device calls poll(), the netmap buffer is synced with
> the kernel. Can the other monitor application still access the packet that
> the primary read just returned to the kernel? If one of the monitor on the
> netmap device is slow, will it cause trouble for the primary reader and
> other monitors? How can I cache a lot of packets in the buffer in case one
> of the monitor application had a temporary slow down?
>

If you use copy-based monitor (no 'z') each monitor will receive an
independent copy of the traffic going through monitored rings. So I think
if one of the monitoring applications slows down there should
be no effect on the others. If you use zerocopy monitor, the same
intercepted buffer is passed between the various monitoring applications
(and the main application), so I think if one
stops or slows down, then all the monitors will stop.
There is some explanation in the comment here
https://github.com/luigirizzo/netmap/blob/master/sys/dev/netmap/netmap_monitor.c#L30-L69
Feel free to open a ticket on github (
https://github.com/luigirizzo/netmap/issues) if you have a more specific
question.

Cheers,
  Vincenzo


>
> Thanks,
> Ming
> ___
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
>



-- 
Vincenzo Maffione


Re: Netmap on Linux nm_open() fail when receive ring size is set to 4096

2018-03-28 Thread Vincenzo Maffione
Hi,
  You need to increase the value of some netmap module parameter. At least
ring_size, maybe also buf_num. Then restart your netmap applications.

Keep in mind that performance could worsen with more slots, because of
increased cache thrashing.

Cheers,
  Vincenzo


On Wed, Mar 28, 2018, 3:50 PM Ming Fu  wrote:

> Hi,
>
> I was trying netmap on a Linux box with 128G of ram (64G per numa node).
> If I set ixgbe interface to 4096 ring size, the nm_open will fail with
> error "Cannot allocate memory". What can I tweak to make the card use
> larger ring size? The following test was run after fresh reboot.
>
> $ ethtool -g enp5s0f0
> Ring parameters for enp5s0f0:
> Pre-set maximums:
> RX:4096
> RX Mini:   0
> RX Jumbo:   0
> TX:4096
> Current hardware settings:
> RX:512
> RX Mini:   0
> RX Jumbo:   0
> TX:512
>
> $ ethtool -G enp5s0f0 rx 1024
> $ ./nmtest -i enp5s0f0
> ^C
> $ ethtool -G enp5s0f0 rx 2048
> $ nmtest -i enp5s0f0
> ^C
> $ ethtool -G enp5s0f0 rx 4096
> $ nmtest -i enp5s0f0
> 816.039684 nm_open [945] NIOCREGIF failed: Cannot allocate memory
> netmap:enp5s0f0
> fail to nm_open(netmap:enp5s0f0 ... ): Cannot allocate memory
>
> Thanks,
> Ming
> ___
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
>
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: netmap ixgbevf mtu

2018-03-27 Thread Vincenzo Maffione
Hi,
  This commit (fe13476b106ed1f4b517b1590e1dfb3f268b6e78) in the upstream
netmap should have fixed the NS_MOREFRAG issue for ixgbe.
If you happen to give it a try, let us know.

Cheers,
  Vincenzo

2018-03-21 21:40 GMT+01:00 Vincenzo Maffione <v.maffi...@gmail.com>:

> I see. Unfortunately this breaks the API, so I don't think we can accept
> it.
> We should probably sum up the fragment lengths, remember which one was the
> first descriptor and write the olinfo field when we process the last
> descriptor ...
> I hope this does not slow down the simpler case where NS_MOREFRAG is not
> used.
>
> In any, case we should move this discussion to the github, if possible (so
> that the issue gets tracked).
>
> Cheers,
>   Vincenzo
>
> 2018-03-20 20:54 GMT+01:00 Joe Buehler <as...@cox.net>:
>
>> Attached is a patch that allows fragmented TX with the ixgbevf driver.
>>
>> For the first TX buffer set the slot length to the full length of the
>> frame and make sure that the slot buffer is fully filled.  For succeeding
>> slots just set the length to the amount of the buffer filled.
>>
>> Not intended as the perfect solution but it works fine for my situation.
>>
>> Joe Buehler
>>
>>
>
>
> --
> Vincenzo Maffione
>



-- 
Vincenzo Maffione


Re: netmap ixgbevf mtu

2018-03-21 Thread Vincenzo Maffione
I see. Unfortunately this breaks the API, so I don't think we can accept it.
We should probably sum up the fragment lengths, remember which one was the
first descriptor and write the olinfo field when we process the last
descriptor ...
I hope this does not slow down the simpler case where NS_MOREFRAG is not
used.

In any case, we should move this discussion to the github, if possible (so
that the issue gets tracked).

Cheers,
  Vincenzo

2018-03-20 20:54 GMT+01:00 Joe Buehler <as...@cox.net>:

> Attached is a patch that allows fragmented TX with the ixgbevf driver.
>
> For the first TX buffer set the slot length to the full length of the
> frame and make sure that the slot buffer is fully filled.  For succeeding
> slots just set the length to the amount of the buffer filled.
>
> Not intended as the perfect solution but it works fine for my situation.
>
> Joe Buehler
>
>


-- 
Vincenzo Maffione


Re: netmap ixgbevf mtu

2018-03-20 Thread Vincenzo Maffione
2018-03-19 20:49 GMT+01:00 Joe Buehler <as...@cox.net>:

> Vincenzo Maffione wrote:
>
> > To receive a frame larger than the RX buffer size you need multiple
> > netmap slots (as multiple descriptors are
> > used by the hardware), looking at the NS_MOREFRAG flag.
> > See the example code in utils/functional.c::rx_one().
>
> This works fine -- thanks.
>
> > Also TX may have per-slot limitations (e.g. due to the size of the NIC
> > TX fifo), but this is usually > 9K, so using a single descriptor per
> > packet should always
> > be ok. However, you can also use multiple slots on the TX side (see
> > utils/functional.c::tx_one()).
>
> Trying to split TX frames into multiple buffers does not work, the NIC is
> sending 2048 byte frames (the buf_size I am using).
>
> I will re-check my code.  Do I need a particular version of ixgbevf
> perhaps?
>
>
I don't think so, but you need to use the latest netmap from github.
The NS_MOREFRAG support for ixgbe/ixgbevf is here
https://github.com/luigirizzo/netmap/blob/master/LINUX/ixgbe_netmap_linux.h#L344-L345

The problem is that nobody really tried to use NS_MOREFRAG on ixgbevf
transmission so far.
So there may be a bug on how we set the flags in the hardware descriptor.
We should look at what the driver does. Here
https://elixir.bootlin.com/linux/v3.8/source/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c#L2903
I see that the olinfo_status field is set with the total frame length (and
not just the fragment length).
In the netmap code we set it to the fragment length, so that's probably why
you see that behaviour.
Here
https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/82599-10-gbe-controller-datasheet.pdf
in sec. 7.2.3.2.4 I read that we need to properly set the olinfo_status
field on the first TX descriptor, while the others are irrelevant.

Cheers,
  Vincenzo





> Joe Buehler
>



-- 
Vincenzo Maffione


Re: netmap ixgbevf mtu

2018-03-17 Thread Vincenzo Maffione
First, please make sure to use the latest netmap from github.

Yes, drivers in general use 2K or 4K RX buffers regardless of the MTU or
netmap buffer size.
To receive a frame larger than the RX buffer size you need multiple netmap
slots (as multiple descriptors are
used by the hardware), looking at the NS_MOREFRAG flag.
See the example code in utils/functional.c::rx_one().
Also TX may have per-slot limitations (e.g. due to the size of the NIC TX
fifo), but this is usually > 9K, so using a single descriptor per packet
should always
be ok. However, you can also use multiple slots on the TX side (see
utils/functional.c::tx_one()).
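
As a rough illustration of the RX side (this is not the code from
utils/functional.c), the sketch below reassembles one frame from a chain of
slots. struct demo_rx_slot, SLOT_MOREFRAG and rx_one_frame() are hypothetical
names; in a real program the buffer pointer would come from NETMAP_BUF() and
the flag would be NS_MOREFRAG.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_MOREFRAG 0x0020u   /* stand-in for netmap's NS_MOREFRAG flag */

/* Hypothetical, simplified RX slot: buffer pointer, length and flags. */
struct demo_rx_slot { const uint8_t *buf; uint16_t len; uint16_t flags; };

/*
 * Copy one frame starting at slots[*idx] into out, following the chain of
 * fragments until a slot without SLOT_MOREFRAG is found.
 * Returns the frame length and advances *idx past the consumed slots.
 * Returns 0 (and leaves *idx untouched) if the frame does not fit in out
 * or if its last fragment is not available yet.
 */
static size_t rx_one_frame(const struct demo_rx_slot *slots, size_t nslots,
                           size_t *idx, uint8_t *out, size_t out_cap)
{
    size_t i = *idx;
    size_t used = 0;

    assert(slots != NULL && out != NULL);
    while (i < nslots) {
        const struct demo_rx_slot *s = &slots[i++];

        if (s->len > out_cap - used)
            return 0;                   /* frame larger than the output buffer */
        memcpy(out + used, s->buf, s->len);
        used += s->len;
        if (!(s->flags & SLOT_MOREFRAG)) {
            *idx = i;                   /* whole frame consumed */
            return used;
        }
    }
    return 0;                           /* chain incomplete, try again later */
}
```

Code that only inspects the headers in the first slot would silently drop the
trailing fragments, which matches the 2048 byte frames seen on RX.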

You need to set the buf_size parameter to the RX buffer size.
Currently we lack a mechanism for netmap to get the actual RX buffer size
from the NIC driver, so we assume 4K.
We need to check that buf_size is >= RX buffer size. This mechanism will be
added soon.

Cheers,
  Vincenzo

2018-03-16 23:52 GMT+01:00 Joe Buehler <as...@cox.net>:

> Sorry, I should have added, this is LINUX if it matters.
>
> Joe Buehler wrote:
> > I am having difficulties with netmap over top of ixgbevf when attempting
> to use a large MTU (say 9000 bytes).
> >
> > Does the ixgbevf driver use 2048 byte buffers for RX regardless of the
> MTU or netmap buffer size?
> >
> > I can send large frames just fine but inbound frames are passed via
> netmap as 2048 bytes max.  It is possible that netmap is passing frames in
> multiple pieces I suppose, I haven't checked that yet -- my code is looking
> at the frame headers only at the moment so would toss trailing pieces.
> >
> > Joe Buehler
> >
> >
>
>



-- 
Vincenzo Maffione


Re: netmap memory config query

2018-03-14 Thread Vincenzo Maffione
Hi,
  The parameter to increase is "ring_size". FreeBSD or Linux does not
matter (except for how you modify the parameter, i.e. sysctl on FreeBSD and
sysfs in Linux).
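
To pick a value, the arithmetic behind the "request size 65792" message can be
sketched as below. The 16 bytes per slot and the 256 byte ring header are
assumptions inferred from the log line itself (4096 * 16 + 256 = 65792), and
ring_object_bytes() is a hypothetical helper, not part of netmap.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions inferred from the log: 65792 = 256 + 4096 * 16. */
#define DEMO_RING_HDR_BYTES 256u    /* assumed size of the ring header */
#define DEMO_SLOT_BYTES      16u    /* assumed size of one netmap slot */

/* Bytes netmap would need for one ring object with num_slots slots. */
static uint32_t ring_object_bytes(uint32_t num_slots)
{
    assert(num_slots > 0);
    return DEMO_RING_HDR_BYTES + num_slots * DEMO_SLOT_BYTES;
}
```

Under those assumptions a 4096-slot ring needs ring_size to be at least 65792,
while a 1024-slot ring needs only 16640.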

Cheers,
  Vincenzo

2018-03-14 20:59 GMT+01:00 Joe Buehler <as...@cox.net>:

> When I increase the ixgbevf rx ring size from 1024 to 4096 I get this:
>
> [ 5526.651780] 851.135009 [ 931] netmap_obj_malloc netmap_ring
> request size 65792 too large
> [ 5526.652779] 851.136008 [1802] netmap_mem2_rings_create  Cannot allocate
> RX_ring
>
> What netmap module parameters do I need to tweak to get past this?
>
> I am using LINUX should it matter.
>
> Joe Buehler
>



-- 
Vincenzo Maffione


fix for some netmap drivers

2018-02-19 Thread Vincenzo Maffione
Hello,
  Can anyone please apply the attached patch? It follows up the removal of
the nkr_slot_flags in the upstream netmap.
The change fixes compilation issues and has no effect on functionality.

Thanks,
  Vincenzo

-- 
Vincenzo Maffione
diff --git a/sys/dev/cxgbe/t4_netmap.c b/sys/dev/cxgbe/t4_netmap.c
index fa3bbb9fc8f..45083fb9a39 100644
--- a/sys/dev/cxgbe/t4_netmap.c
+++ b/sys/dev/cxgbe/t4_netmap.c
@@ -974,7 +974,7 @@ t4_nm_intr(void *arg)
 			case CPL_RX_PKT:
 ring->slot[fl_cidx].len = G_RSPD_LEN(lq) -
 sc->params.sge.fl_pktshift;
-ring->slot[fl_cidx].flags = kring->nkr_slot_flags;
+ring->slot[fl_cidx].flags = 0;
 fl_cidx += (lq & F_RSPD_NEWBUF) ? 1 : 0;
 fl_credits += (lq & F_RSPD_NEWBUF) ? 1 : 0;
 if (__predict_false(fl_cidx == nm_rxq->fl_sidx))
diff --git a/sys/net/iflib.c b/sys/net/iflib.c
index ba7d2547ed1..44a276e67d7 100644
--- a/sys/net/iflib.c
+++ b/sys/net/iflib.c
@@ -1068,7 +1068,6 @@ iflib_netmap_rxsync(struct netmap_kring *kring, int flags)
 	if (netmap_no_pendintr || force_update) {
 		int crclen = iflib_crcstrip ? 0 : 4;
 		int error, avail;
-		uint16_t slot_flags = kring->nkr_slot_flags;
 
 		for (i = 0; i < rxq->ifr_nfl; i++) {
 			fl = &rxq->ifr_fl[i];
@@ -1084,7 +1083,7 @@ iflib_netmap_rxsync(struct netmap_kring *kring, int flags)
 
 error = ctx->isc_rxd_pkt_get(ctx->ifc_softc, &ri);
 ring->slot[nm_i].len = error ? 0 : ri.iri_len - crclen;
-ring->slot[nm_i].flags = slot_flags;
+ring->slot[nm_i].flags = 0;
 if (fl->ifl_sds.ifsd_map)
 	bus_dmamap_sync(fl->ifl_ifdi->idi_tag,
 			fl->ifl_sds.ifsd_map[nic_i], BUS_DMASYNC_POSTREAD);


Re: Linux netmap memory allocation

2018-01-03 Thread Vincenzo Maffione
ze = 8
>
> And again with 2.976Mpps (2 x 1Gbps) input and no packet loss observed:
> 1 thread - CPU usage = 100%, batch size = 12
> 2 thread - CPU usage = 68% (34% x 2), batch size = 21
> 4 thread - CPU usage = 100% (25% x 4), batch size = 17
> 6 thread - CPU usage = 105% (18% x 6), batch size = 16
>
> These results seem excellent and demonstrate that netmap is scaling as
> expected with both threads and packet volume. The higher thread count will
> be more beneficial when I am doing more processing on each packet.
>

Yes, as you can see a larger batch size is very beneficial to CPU utilization
and packet rate, because poll/ioctl are fairly expensive. You could try to
achieve larger batches, which may give better results. If you don't mind adding
a controlled latency you could experiment with adding something like
"usleep(30)" in your forwarding loop: this should lead to larger batches.

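A back-of-the-envelope estimate of the batch size such a sleep would produce is
sketched below. expected_batch() is a hypothetical helper; it ignores the time
spent processing packets and the fact that a sleep usually lasts somewhat
longer than requested, so it is a lower bound rather than a prediction.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rough number of packets that accumulate while the loop sleeps.
 * pps: offered load in packets per second.
 * sleep_us: sleep time per loop iteration, in microseconds.
 */
static uint32_t expected_batch(uint64_t pps, uint32_t sleep_us)
{
    assert(sleep_us <= 1000000u);   /* keep the sketch to sub-second sleeps */
    return (uint32_t)(pps * sleep_us / 1000000u);
}
```

At the 2.976 Mpps quoted above, a 30 microsecond sleep gives about 89 packets
per iteration, against the 12 to 21 reported without it. The rings need enough
free slots to absorb that many packets between syncs.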

>
>
>> I hope this all makes sense, and again, I hope I have simply missed
>> something from the nmreq i pass to NIOCREGIF.
>>
>> It is worth mentioning that with the exception of this problem /
>> confusion, I am getting extremely good results from this code and netmap in
>> general.
>>
>
> That's nice to hear :)
> Your program looks simple enough that we could even add it to the examples
> (as an example of routing logic).
>
> I'd be very happy to contribute to the documentation in any way that may
> be helpful. I have added a permissive licence to my Github repository just
> in case my code is of use to anyone else. It is currently somewhat
> incomplete as an IPv4 router as it doesn't update MAC addresses on frames
> before forwarding them, and because the interface names are hardcoded, but
> when it's more complete I'd be very happy for it to be contributed to the
> examples. Of course anyone is free to use my code for any purpose too.
>
> Thanks for all your assistance! I'm happy enough with this that I will
> move on to looking at my IP routing code.
>

Ok, thanks!

Vincenzo

>
> Charlie
>
>
>
> *Charlie Smurthwaite*
> Technical Director
>
> *tel.* *email.* charlie@atech.media *web.* https://atech.media
>
> *This e-mail has been sent by aTech Media Limited (or one of its
> assoicated group companys, Dial 9 Communications Limited or Viaduct Hosting
> Limited). Its contents are confidential therefore if you have received this
> message in error, we would appreciate it if you could let us know and
> delete the message. aTech Media Limited is a UK limited company,
> registration number 5523199. Dial 9 Communications Limited is a UK limited
> company, registration number 7740921. Viaduct Hosting Limited is a UK
> limited company, registration number 8514362. All companies are registered
> at Unit 9 Winchester Place, North Street, Poole, Dorset, BH15 1NX.*
>



-- 
Vincenzo Maffione


Re: Linux netmap memory allocation

2018-01-02 Thread Vincenzo Maffione
2018-01-01 23:05 GMT+01:00 Charlie Smurthwaite <charlie@atech.media>:

>
> On 01/01/18 21:05, Vincenzo Maffione wrote:
>
>
>
> 2018-01-01 17:14 GMT+01:00 Charlie Smurthwaite <charlie@atech.media>:
>
>> Hi,
>>
>> Thank you for your reply. I was able to resolve this.
>>
>> 1) I do indeed open one FD per NIC
>> 2) I no longer specify nr_arg1, nr_arg2 nor nr_arg3. Instead I just
>> verify that all NICs return with identical nr_arg2 so that the memory is
>> shared between them.
>> 3) I properly initialized my memory, my failure to do so was causing me a
>> lot of confusion,
>>
>> The resulting memory space is large enough for all the NICs, and
>> everything works perfectly with zero-copy forwarding, great!
>>
>> The only thing I am still having trouble with is the ability to
>> simultaneously trigger a TX and an RX sync on all NICs. I have tried
>> select, poll, and epoll, and in all cases, RX rings are updated but TX
>> rings are not and TX packets are not pushed out (this occurs using both
>> native and emulated netmap modes). I notice the documentation says "Note
>> that on epoll and kqueue, NETMAP_NO_TX_POLL and NETMAP_DO_RX_POLL only have
>> an effect when some event is posted for the file descriptor.", but the
>> behaviour seems the same on poll and select as well as epoll, perhaps this
>> is a linux-specific implementation detail?
>>
>> I have also found that all of these mechanisms seem to incur a very high
>> cost in terms of CPU time (making them no more efficient than busy waiting
>> at 1Mpps+). My current approach is as follows, but I feel like there should
>> be a better option:
>>
>> for (int n = 0; n < NIC_COUNT; n++) {
>>   // usleep(10); // More CPU time seems to be saved with a careful
>>   // sleep than with select/poll/epoll
>>   ioctl(fds[n], NIOCTXSYNC);
>>   ioctl(fds[n], NIOCRXSYNC);
>>   rxring = rxrings[n];
>>   while (!nm_ring_empty(rxring)) {
>>     // Forward any packets waiting in this NIC's RX ring to the
>>     // appropriate TX ring
>>   }
>> }
>>
>
> If you are using poll() or select() you should not use ioctl(NIOC*XSYNC),
> as the txsync/rxsync operations are automatically performed within the
> poll()/select() syscall (at least assuming you did not specify
> NETMAP_NO_TX_POLL).
> Also, whether netmap calls or does not call txsync/rxsync on certain rings
> depends on the parameters passed to nm_open().
> Make sure you check for nm_ring_space(txring) when forwarding.
>
> Cheers,
>   Vincenzo
>
>
>
> Hi Vincenzo,
>
> Thanks again for your assistance. You state the following (as does the
> manual):
>
> "If you are using poll() or select() you should not use ioctl(NIOC*XSYNC),
> as the txsync/rxsync operations are automatically performed within the
> poll()/select() syscall (at least assuming you did not specify
> NETMAP_NO_TX_POLL)."
>
> However, this is not happening for me :(
>
> I am using poll(), and I am not specifying NETMAP_NO_TX_POLL, and have
> found that sometimes frames are sent only when the TX buffer is full, and
> sometimes they are not sent at all. They are never sent as expected on
> every invocation of poll(). If I run ioctl(NIOCTXSYNC) manually, everything
> works correctly. I assume I have simply missed something from my nmreq.
>

I don't think you have missed anything within nmreq.  I see that you are
waiting for POLLIN only (and this is right in your router case), so poll()
will actually invoke txsync on interface #i only when netmap intercepts an
RX or TX interrupt on interface #i. This means that packets may stall for a
long time in the TX rings if you don't call ioctl(TXSYNC). The manual is
not wrong, however. You can look at the apps/bridge/bridge.c example to
understand where this "poll automatically calls txsync" thing is useful.


> You also mentioned: "whether netmap calls or does not call txsync/rxsync
> on certain rings depends on the parameters passed to nm_open()". I do not
> use the nm_open helper method, but I am extremely interested to know what
> parameters would affect this behaviour, as this would seem very relevant to
> my problem.
>

Yes, we do not normally use the low level interface (ioctl(REGIF)), because
it's just simpler to use the nm_open() interface. Within the first
parameter of nm_open() you can specify to open just one pair of RX/TX rings,
e.g. with "enp1f0s1-3". Then you usually want to mmap() just once (as you
do in your program); with nm_open(), you do that with the NM_OPEN_NO_MMAP
flag.

>
> If you are interested or if it helps explain my question, my complete co

Re: Linux netmap memory allocation

2018-01-01 Thread Vincenzo Maffione
2018-01-01 17:14 GMT+01:00 Charlie Smurthwaite <charlie@atech.media>:

> Hi,
>
> Thank you for your reply. I was able to resolve this.
>
> 1) I do indeed open one FD per NIC
> 2) I no longer specify nr_arg1, nr_arg2 nor nr_arg3. Instead I just verify
> that all NICs return with identical nr_arg2 so that the memory is shared
> between them.
> 3) I properly initialized my memory, my failure to do so was causing me a
> lot of confusion,
>
> The resulting memory space is large enough for all the NICs, and
> everything works perfectly with zero-copy forwarding, great!
>
> The only thing I am still having trouble with is the ability to
> simultaneously trigger a TX and an RX sync on all NICs. I have tried
> select, poll, and epoll, and in all cases, RX rings are updated but TX
> rings are not and TX packets are not pushed out (this occurs using both
> native and emulated netmap modes). I notice the documentation says "Note
> that on epoll and kqueue, NETMAP_NO_TX_POLL and NETMAP_DO_RX_POLL only have
> an effect when some event is posted for the file descriptor.", but the
> behaviour seems the same on poll and select as well as epoll, perhaps this
> is a linux-specific implementation detail?
>
> I have also found that all of these mechanisms seem to incur a very high
> cost in terms of CPU time (making them no more efficient than busy waiting
> at 1Mpps+). My current approach is as follows, but I feel like there should
> be a better option:
>
> for (int n = 0; n < NIC_COUNT; n++) {
>   // usleep(10); // More CPU time seems to be saved with a careful
>   // sleep than with select/poll/epoll
>   ioctl(fds[n], NIOCTXSYNC);
>   ioctl(fds[n], NIOCRXSYNC);
>   rxring = rxrings[n];
>   while (!nm_ring_empty(rxring)) {
>     // Forward any packets waiting in this NIC's RX ring to the
>     // appropriate TX ring
>   }
> }
>

If you are using poll() or select() you should not use ioctl(NIOC*XSYNC),
as the txsync/rxsync operations are automatically performed within the
poll()/select() syscall (at least assuming you did not specify
NETMAP_NO_TX_POLL).
Also, whether netmap calls or does not call txsync/rxsync on certain rings
depends on the parameters passed to nm_open().
Make sure you check for nm_ring_space(txring) when forwarding.

Cheers,
  Vincenzo


> Thanks again,
>
> Charlie
>
>
> On 01/01/18 15:40, Vincenzo Maffione wrote:
>
> Hi,
>   If you have 32 NICs you should open 32 netmap file descriptors (and you
> should not specify 64 in nr_arg1 or 256 in nr_arg3; those are for different
> use cases). Also, since you want to do zero-copy you must not specify a
> separate memory area (nr_arg2), but use the same one.
> You may want to use the high level API nm_open()
> https://github.com/luigirizzo/netmap/blob/master/sys/net/netmap_user.h#L307
>
> You may also want to look at the netmap tutorial to get a better idea of
> how the API works (https://github.com/vmaffione/netmap-tutorial).
>
> Cheers,
>   Vincenzo
>
> 2017-12-28 18:34 GMT+01:00 Charlie Smurthwaite <charlie@atech.media>:
>
>> Hi,
>>
>> I'm just starting to use netmap and it is my intention to do zero-copy
>> forwarding of frames between a large number of NICs. I am using Intel
>> i350 (igb) on Linux. I therefore require a large memory area for rings
>> and buffers.
>>
>> My calculation:
>> 32 NICs * 2 rings (TX+RX) * 256 frames * 2048 bytes = 32MB
>>
>> I am currently having a problem (or perhaps just a misunderstanding)
>> regarding allocation of this memory. I am attempting to use the
>> following code:
>>
>> void thread_main(int thread_id) {
>>   struct nmreq req; // A struct for the netmap request
>>   int fd;   // File descriptor for netmap socket
>>   void * mem;   // Pointer to allocated memory area
>>
>>   fd = open("/dev/netmap", 0); // Open a generic netmap socket
>>   strcpy(req.nr_name, "enp8s0f0"); // Copy NIC name into request
>>   req.nr_version = NETMAP_API; // Set version number
>>   req.nr_flags = NR_REG_ONE_NIC;   // We will be using a single hw ring
>>
>>   // Select ring 0, disable TX on poll
>>   req.nr_ringid = NETMAP_NO_TX_POLL | NETMAP_HW_RING | 0;
>>
>>   // Ask for 64 additional rings to be allocated (32 * (TX+RX))
>>   req.nr_arg1 = 64;
>>
>>   // Allocate a separate memory area for each thread
>>   req.nr_arg2 = 10 + thread_id;
>>
>>   // Ask for additional buffers (256 per ring)
>>   req.nr_arg3 = 64*256;
>>
>>   // Initialize port
>>   ioctl(fd, NIOCREGIF, &req);
>>
>>   // Check the allocated memory size
>>   printf

Re: Linux netmap memory allocation

2018-01-01 Thread Vincenzo Maffione
Hi,
  If you have 32 NICs you should open 32 netmap file descriptors (and you
should not specify 64 in nr_arg1 or 256 in nr_arg3; those are for different
use cases). Also, since you want to do zero-copy you must not specify a
separate memory area (nr_arg2), but use the same one.
You may want to use the high level API nm_open()
https://github.com/luigirizzo/netmap/blob/master/sys/net/netmap_user.h#L307

You may also want to look at the netmap tutorial to get a better idea of
how the API works (https://github.com/vmaffione/netmap-tutorial).

Cheers,
  Vincenzo

2017-12-28 18:34 GMT+01:00 Charlie Smurthwaite <charlie@atech.media>:

> Hi,
>
> I'm just starting to use netmap and it is my intention to do zero-copy
> forwarding of frames between a large number of NICs. I am using Intel
> i350 (igb) on Linux. I therefore require a large memory area for rings
> and buffers.
>
> My calculation:
> 32 NICs * 2 rings (TX+RX) * 256 frames * 2048 bytes = 32MB
>
> I am currently having a problem (or perhaps just a misunderstanding)
> regarding allocation of this memory. I am attempting to use the
> following code:
>
> void thread_main(int thread_id) {
>   struct nmreq req; // A struct for the netmap request
>   int fd;   // File descriptor for netmap socket
>   void * mem;   // Pointer to allocated memory area
>
>   fd = open("/dev/netmap", 0); // Open a generic netmap socket
>   strcpy(req.nr_name, "enp8s0f0"); // Copy NIC name into request
>   req.nr_version = NETMAP_API; // Set version number
>   req.nr_flags = NR_REG_ONE_NIC;   // We will be using a single hw ring
>
>   // Select ring 0, disable TX on poll
>   req.nr_ringid = NETMAP_NO_TX_POLL | NETMAP_HW_RING | 0;
>
>   // Ask for 64 additional rings to be allocated (32 * (TX+RX))
>   req.nr_arg1 = 64;
>
>   // Allocate a separate memory area for each thread
>   req.nr_arg2 = 10 + thread_id;
>
>   // Ask for additional buffers (256 per ring)
>   req.nr_arg3 = 64*256;
>
>   // Initialize port
>   ioctl(fd, NIOCREGIF, &req);
>
>   // Check the allocated memory size
>   printf("memsize: %u\n", req.nr_memsize);
>   // Check the allocated memory area
>   printf("nr_arg2: %u\n", req.nr_arg2);
> }
>
> The output is as follows:
>
> memsize: 4206859
> nr_arg2: 10
>
> This is far short of the amount of memory I am hoping to be allocated.
> Am I doing something wrong, or is this simply an indication that the
> driver is unwilling to allocate more than 4MB?
>
> A secondary (related) problem is that if I don't set arg1,arg2,arg3 in
> my code (ie they will be zero), then I get varying output (it varies
> between each of the following):
>
> memsize: 4206843
> nr_arg2: 0
>
> memsize: 343019520
> nr_arg2: 1
>
> Any pointers would be appreciated. Thanks!
>
> Charlie
>
>
> Charlie Smurthwaite
> Technical Director
>
> tel.  email. charlie@atech.media web. https://atech.media
>
> This e-mail has been sent by aTech Media Limited (or one of its assoicated
> group companys, Dial 9 Communications Limited or Viaduct Hosting Limited).
> Its contents are confidential therefore if you have received this message
> in error, we would appreciate it if you could let us know and delete the
> message. aTech Media Limited is a UK limited company, registration number
> 5523199. Dial 9 Communications Limited is a UK limited company,
> registration number 7740921. Viaduct Hosting Limited is a UK limited
> company, registration number 8514362. All companies are registered at Unit
> 9 Winchester Place, North Street, Poole, Dorset, BH15 1NX.
>



-- 
Vincenzo Maffione


Re: Netmap: Build a network SPAN/TAP from netmap

2017-12-14 Thread Vincenzo Maffione
Yes, or if you prefer you can simply extend the "bridge" forwarding logic to
copy every packet to an additional TAP port. Copying a packet in netmap is
just a matter of initializing the next struct netmap_slot in the destination
(TAP) netmap ring, memcpy()ing the packet payload and incrementing
ring->cur/ring->head (then you need to TXSYNC or poll() at the end of the
batch).
In any case the application will work on both FreeBSD and Linux, as the API
is the same.
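
The copy step can be sketched with a small self-contained model of a TX ring.
struct demo_ring, struct demo_slot and tap_copy() are hypothetical; with the
real API the slot would be ring->slot[ring->head], the buffer would come from
NETMAP_BUF(), and the index would be advanced with nm_ring_next().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NUM_SLOTS 4u
#define DEMO_BUF_SIZE  2048u

/* Hypothetical, simplified model of a netmap TX ring with inline buffers. */
struct demo_slot { uint16_t len; uint16_t flags; };
struct demo_ring {
    uint32_t head, cur, tail;
    struct demo_slot slot[DEMO_NUM_SLOTS];
    uint8_t buf[DEMO_NUM_SLOTS][DEMO_BUF_SIZE];
};

/*
 * Copy one packet into the next free slot of the TAP ring.
 * Returns 1 on success, 0 if the ring is full or the packet is too large.
 * The caller still has to TXSYNC or poll() at the end of the batch.
 */
static int tap_copy(struct demo_ring *tap, const uint8_t *pkt, uint16_t len)
{
    uint32_t i;

    assert(tap != NULL && pkt != NULL);
    i = tap->head;
    if (i == tap->tail || len > DEMO_BUF_SIZE)
        return 0;                           /* no space left, or oversized */
    memcpy(tap->buf[i], pkt, len);          /* copy the payload */
    tap->slot[i].len = len;                 /* initialize the slot */
    tap->slot[i].flags = 0;
    i = (i + 1 == DEMO_NUM_SLOTS) ? 0 : i + 1;
    tap->head = tap->cur = i;               /* hand the slot to the kernel */
    return 1;
}
```

The original packet can still be forwarded to the bridge peer by swapping
buffer indices as bridge.c does; only the TAP copy needs the memcpy().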

You may also find it useful to look at the netmap tutorial, to see more
examples and explanations: https://github.com/vmaffione/netmap-tutorial

Cheers,
  Vincenzo

2017-12-15 6:58 GMT+01:00 Jim Thompson <j...@netgate.com>:

>
>
> > On Dec 14, 2017, at 12:00 PM, Ming Fu <ming...@esentire.com> wrote:
> >
> > Hi,
> >
> > I am trying to explore the possibility to build a network SPAN/TAP from
> netmap. Similar to the bridge sample, but all packet going through the
> bridge also get copied to a SPAN port. How do I duplicate or clone an
> incoming packet and send the original to bridge peer and the cloned one to
> the SPAN port? Is there an API like FreeBsd m_copypacket() for netmap?
> Would it work for Linux as well?
> >
> > Thanks
> > Ming
>
> Ming,
>
> I’d look at adapting netmap monitors.
>
> https://github.com/luigirizzo/netmap/blob/master/sys/dev/netmap/netmap_monitor.c
>
> For the rest of the solution, look at netmap_user.h, where it explains how
> to open a port in monitor mode.
>
> https://github.com/luigirizzo/netmap/blob/master/sys/net/netmap_user.h
>
> Essentially, once you have an active netmap port e.g. netmap:ix0, you can
> sniff the traffic by opening additional netmap ports
> named netmap:ix0/r (for rx traffic) or netmap:ix0/t (for tx) or even
> netmap:ix0/rt  (for both tx and rx)
>
> The rest of the code (to inject frames back down another interface) can be
> lifted from the bridge sample.
>
> You could also look at SF-TAP. http://sf-tap.github.io
>
> Jim
>
>



-- 
Vincenzo Maffione


Re: Netmap ouch double free

2017-12-03 Thread Vincenzo Maffione
Hi,
  It may be related to the "extra buffers feature" of netmap, that lb uses.
When the netmap port is opened, some additional buffers (not bound to any
netmap ring) are allocated to be used by the application for slot swapping.
They are provided through the ni_bufs_head field as a linked list (see
updated netmap manual
https://github.com/luigirizzo/netmap/blob/master/share/man/man4/netmap.4 ).
To free them, the free list must be returned when closing the netmap file
descriptor, using the same ni_bufs_head field.
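
A self-contained model of that free list is sketched below. It assumes, as the
manual is understood here, that the buffers are chained through their first
32-bit word and that index 0 terminates the list. The pool array and the
extra_*() helpers are hypothetical; in a real program the head would be
nifp->ni_bufs_head and the buffers would be reached through NETMAP_BUF().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BUF_SIZE 2048u

/* Read the "next" index stored in the first 4 bytes of buffer idx. */
static uint32_t buf_next(const uint8_t *pool, uint32_t idx)
{
    uint32_t next;

    memcpy(&next, pool + (size_t)idx * DEMO_BUF_SIZE, sizeof next);
    return next;
}

/* Take one buffer off the list; returns 0 if the list is empty. */
static uint32_t extra_pop(const uint8_t *pool, uint32_t *head)
{
    uint32_t idx = *head;

    if (idx != 0)
        *head = buf_next(pool, idx);
    return idx;
}

/* Put a buffer back on the list, so that it is returned on close. */
static void extra_push(uint8_t *pool, uint32_t *head, uint32_t idx)
{
    assert(idx != 0);               /* index 0 is the list terminator */
    memcpy(pool + (size_t)idx * DEMO_BUF_SIZE, head, sizeof *head);
    *head = idx;
}

/* Count the buffers currently on the list. */
static uint32_t extra_count(const uint8_t *pool, uint32_t head)
{
    uint32_t n = 0;

    for (; head != 0; head = buf_next(pool, head))
        n++;
    return n;
}
```

Every buffer index should appear on the list at most once when the descriptor
is closed; pushing the same index twice, or leaving on the list a buffer that
is also still referenced by a ring slot, is one way a double free could arise.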

To check if this is true you could make sure that you are using "-B 0"
option, which means no extra buffers are used.

Are you using the latest netmap code from github? Are you using FreeBSD or
Linux?

Cheers,
  Vincenzo

2017-12-01 15:21 GMT+01:00 Martina Balintova <balint.mart...@gmail.com>:

> Hi,
> I am currently playing with lb app in netmap. Every time I kill/close it,
> the app hangs for some time and does not end immediately. In syslogs I am
> getting:
> '
> Netmap_do_unregif   deleting last instance for myapp{1
> Netmap_do_deref active=5
> Netnap_obj_free   ouch, double free on buffer 2
> Netmap_extra_free   freed 0 buffers
> '
>
> This happens when I have some consumer on the pipe and at some point during
> whole lb lifetime, it did not consume all packets (resulting in oq being
> filled or packets being dropped). If the pipe did not ever have a consumer,
> then it will not end up in the double free.
> I am finding it quite hard to debug in gdb, as this is happening at the
> shutdown.
> Could someone pls point me to reason?
>
> I am running lb with one interface and 2 groups, different numb of pipes
> per group and this happens even with no  extra buffers.
>
> Martina
>



-- 
Vincenzo Maffione


Re: netmap / LINUX realtime / ixgbevf: huge RX latencies

2017-12-02 Thread Vincenzo Maffione
HEAD/11 is the FreeBSD version.
Your question is about netmap on Linux, so this is the wrong mailing list.
You should open an issue on the official netmap
https://github.com/luigirizzo/netmap/issues

Cheers
  Vincenzo

2017-12-02 13:20 GMT+01:00 Joe Buehler <as...@cox.net>:

> K. Macy wrote:
>
> > HEAD or 11?
>
> I'm not quite sure what the question means but there is this in the
> netmap code:
>
> ./net/netmap.h:42:#define   NETMAP_API  11  /* current API version */
>
> LINUX kernels nowadays can timestamp frames when they arrive from the
> NIC.  I made a trivial patch to the netmap driver to turn this on and
> also pass the timestamp to user space, and will pass on the changes.  I
> am doing frame latency measurements and this simple change eliminated a
> *whole* lot of noise in the measurements.
>
> Joe Buehler
>



-- 
Vincenzo Maffione


Re: swaping ring slots between NIC ring and Host ring does not always success

2017-11-27 Thread Vincenzo Maffione
Hi,

  If you think it's a bug can you please open an issue on the github (
https://github.com/luigirizzo/netmap/issues)?

2017-11-24 22:11 GMT+01:00 Xiaoye Sun <xiaoye@rice.edu>:

> Hi Vincenzo,
>
> Let me clarify my problem. (please ignore the previous incomplete email)
>
> I have a program, which is an extension of bridge.c
> https://github.com/luigirizzo/netmap/blob/master/apps/bridge/bridge.c
> The only difference is that my program also generates customized packets
> sent to the NIC directly.
> These customized packets have increasing sequence numbers.
> So, this program not only sends these customized packets but also forwards
> packets between NIC and host stack using zerocopy.
> The program only takes one NIC queue and there is only one thread.
>
> I think the problem is that there is a chance where netmap does not update
> the pointer to the buffer even when NS_BUF_CHANGED is set (buf_idx is
> changed).
>

Can you disable zerocopy in bridge.c to see if the problem goes away? This
would be useful information.


>
> Let's say the NIC tx ring has 4096 slots. The customized packet sequence
> 16 is filled in the buffer of slot 2057.
> The customized packets keep filling the slots until the next available
> slot is 2056.
>

Do you mean that your program fills the TX ring slots 2057,2058...2054,2055
with custom packets? This would mean you filled all the available slots,
since one slot is left empty.


> Now the customised packet sequence 4111 is filled to 2056.
>

You cannot fill slot 2056 if 2055 has not been NIOCTXSYNC'd. Aren't
you using the nm_ring_empty() and nm_ring_space() functions to check
for available space in the TX ring (assuming you update ring->head/ring->cur
before calling those functions)?

Cheers,
  Vincenzo


> Then the netmap program is notified that there is a packet from the host
> stack sent to the NIC.
> The netmap program swaps the buf_idx between slot 2057 and the
> corresponding slot in the host rx ring and set the NS_BUF_CHANGED flag of
> both slots.
> Then the netmap program fills sequence 4112 to slot 2058.
> However, the buffer swap does not seem to succeed, so the original content of
> slot 2057 (sequence 16) is sent out.
> So that at the receiver side, the receiver sees two sequence
> 16s.(16,17...4110,4111,16,4112,4113).
>
> So I think the root of the problem is that the buffer pointer is not always
> successfully/timely updated even after the NS_BUF_CHANGED flag is set and
> the buf_idx is updated.
>
> Best,
> Xiaoye
>
>
>
> On Wed, Nov 22, 2017 at 7:39 AM, Vincenzo Maffione <v.maffi...@gmail.com>
> wrote:
>
>> Hi,
>>
>> 2017-11-21 7:51 GMT+01:00 Xiaoye Sun <xiaoye@rice.edu>:
>>
>>> Hi,
>>>
>>> Recently I found another problem with netmap. I think this new problem
>>> could be related to the problems in this thread, so I am posting the new
>>> problem here.
>>>
>>> In my setup, I have a sender program having a netmap ring (a pair of
>>> RX/TX ring) for the NIC and a ring for the host stack. The sender program
>>> puts customized packets (each packet has a unique sequence number and the
>>> sender sends the packet in a sequence number increasing order) to the NIC
>>> TX ring directly and also forwards the packets from the host RX ring to
>>> the NIC TX ring using "zerocopy" by swapping the buffer indices.
>>> However, the receiver sees duplicated customized packets. For example, in
>>> the case where the ring size is 32 (32 slots in a ring) the order of the
>>> sequence numbers the receiver sees is 1,2,3,4,5,...,68,69,*70*,
>>> 71,72,73,...,99,100,*70*,101,102,103,... . An interesting thing I found is
>>> that the "gaps" between these two duplicated packets (70 in the example)
>>> are always a number very close to the ring size, 32 in this example. In my
>>> experiment, I use a ring with 4096 slots and the gap is always more than
>>> 4090 and close to 4096. I verified that this duplication happens due to
>>> the sender, not the receiver. Assuming my sender's implementation is correct,
>>> then this duplication may happen in netmap and the NIC driver (ixgbe).
>>>
>>
>> Netmap itself doesn't do any duplication nor takes a look at the packets.
>> It just passes
>> down ring->cur/ring->head to the ixgbe driver (after validation).
>> The ixgbe driver datapath is bypassed and replaced with a netmap-enabled
>> datapath (see https://github.com/luigirizzo/
>> netmap/blob/master/LINUX/ixgbe_netmap_linux.h#L294-L461);
>> no duplication sh

Re: [SOLVED] Re: bridge0 not working when cable disconnected

2017-11-24 Thread Vincenzo Maffione
Hi,
  The VM IP is assigned to the emulated interface inside the guest OS (e.g.
vtnet0).
It would not make sense to assign an IP to tap0, and I'm quite sure bhyve
doesn't do that.
If tap0 is attached to bridge0 (which is normally the case, and I guess
it's your case), there is no reason for tap0 to have an IP (because it's a
data port of an L2 switch).

If you give an IP to tap0 it's not dangerous, but there is no need to do
that. (The only use-case for giving an IP to tap0 I can think of is when
you don't attach tap0 to any bridge, and you use tap0 as a peer-to-peer
link for applications on your host to communicate with applications in the
VM. But nobody does that since you can do the same with the bridge0
interface, which can be used to talk to multiple VMs, not just one).

Cheers,
  Vincenzo

2017-11-24 8:24 GMT+01:00 Andrea Venturoli <m...@netfence.it>:

> On 11/16/17 19:01, Eugene Grosbein wrote:
>
> If you add an interface to a bridge, you should remove all IP addresses
>> from it
>> and assign them to the bridge itself instead. And you will be fine.
>>
>
> Thanks.
>
> In fact, assigning the base IP and all the jails to bridge0, instead of
> re0, solved it.
> I still think bhyve assigns the VM's IP to tap0, but that doesn't seem to
> be a problem.
>
>  bye
> av.
> ___
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
>



-- 
Vincenzo Maffione


Re: swaping ring slots between NIC ring and Host ring does not always success

2017-11-22 Thread Vincenzo Maffione
Hi,

2017-11-21 7:51 GMT+01:00 Xiaoye Sun <xiaoye@rice.edu>:

> Hi,
>
> Recently I found another problem with netmap. I think this new problem
> could be related to the problems in this thread, so I am posting the new
> problem here.
>
> In my setup, I have a sender program having a netmap ring (a pair of
> RX/TX ring) for the NIC and a ring for the host stack. The sender program
> puts customized packets (each packet has a unique sequence number and the
> sender sends the packet in a sequence number increasing order) to the NIC
> TX ring directly and also forwards the packets from the host RX ring to the
> NIC TX ring using "zerocopy" by swapping the buffer indices.
> However, the receiver sees duplicated customized packets. For example, in
> the case where the ring size is 32 (32 slots in a ring) the order of the
> sequence numbers the receiver sees is 1,2,3,4,5,...,68,69,*70*,
> 71,72,73,...,99,100,*70*,101,102,103,... . An interesting thing I found is
> that the "gaps" between these two duplicated packets (70 in the example)
> are always a number very close to the ring size, 32 in this example. In my
> experiment, I use a ring with 4096 slots and the gap is always more than
> 4090 and close to 4096. I verified that this duplication happens due to the
> sender, not the receiver. Assuming my sender's implementation is correct,
> then this duplication may happen in netmap and the NIC driver (ixgbe).
>

Netmap itself does not duplicate packets, nor does it look at them. It just
passes ring->cur/ring->head down to the ixgbe driver (after validation).
The ixgbe driver datapath is bypassed and replaced with a netmap-enabled
datapath (see
https://github.com/luigirizzo/netmap/blob/master/LINUX/ixgbe_netmap_linux.h#L294-L461);
no duplication should happen there, as each netmap slot (1 TX packet) is used
only once.
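
For reference, the zero-copy forwarding described in the quoted message
(swap the buffer indices of two slots, then flag both) can be modelled in a
few lines. This is only a sketch with slots represented as plain dicts; it is
not the netmap API, and the flag value below is an assumption made for the
model (check net/netmap.h for the real definition).

```python
NS_BUF_CHANGED = 0x0001  # assumed flag value for this model; check net/netmap.h


def swap_slots(rx_slot, tx_slot):
    """Zero-copy forward: exchange buffer indices instead of copying data."""
    rx_slot["buf_idx"], tx_slot["buf_idx"] = tx_slot["buf_idx"], rx_slot["buf_idx"]
    tx_slot["len"] = rx_slot["len"]      # the TX slot now describes the packet
    rx_slot["flags"] |= NS_BUF_CHANGED   # mark both slots as having a new buffer
    tx_slot["flags"] |= NS_BUF_CHANGED


# Hypothetical slots: one from the host RX ring, one from the NIC TX ring.
host_rx = {"buf_idx": 7, "len": 60, "flags": 0}
nic_tx = {"buf_idx": 42, "len": 0, "flags": 0}
swap_slots(host_rx, nic_tx)
print(nic_tx)
```

The model says nothing about when a driver picks up the new buffer address;
it only shows the bookkeeping the application is expected to do.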

>
>
> Thinking back to the original problem in this post, I think these problems
> may be related. It seems to me that there could be multiple threads pulling
> the packets from the NIC TX ring (or the thread moved to other CPUs when
> the problem occurs) and these threads may run on different cores so that
> the outdated content in the buffer may be sent out when new content is
> written to the buffer.
>
>
There are no such threads pulling from the NIC TX ring. Your application
directly puts new packets to be transmitted in the netmap buffers referenced
in the netmap TX ring. When you then call NIOCTXSYNC or poll(), all the new TX
buffers (i.e. all the ones from the previous value of ring->head (included) to
the new value of ring->head (excluded)) are moved to the NIC TX ring. This
happens in the context of your application thread; no worker threads are used.
Then the NIC hardware starts the transmission.
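
To make the "previous head (included) to new head (excluded)" range
concrete, here is a small model of the index arithmetic, including
wraparound at the end of the ring. The function name and the 32-slot ring
are made up for illustration; this is not netmap code.

```python
def slots_submitted(prev_head, new_head, num_slots):
    """Return the slot indices handed over by one sync.

    Covers prev_head (included) up to new_head (excluded), wrapping
    around at num_slots.
    """
    count = (new_head - prev_head) % num_slots
    return [(prev_head + i) % num_slots for i in range(count)]


# A 32-slot ring where the application advanced head from 30 to 2.
print(slots_submitted(30, 2, 32))  # [30, 31, 0, 1]
```

If head did not move between two syncs, the list is empty and nothing new
is submitted.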


> I am wondering if there is a way to pin the NIC driver of the netmap module
> to a specific core. or is there a way to know the root of such problem?
>

The only threads are the ones of your application.
Maybe your problem comes from concurrent accesses to the netmap TX ring
from different threads? Only one thread at a given time should update a
netmap TX/RX ring. Otherwise the behaviour is unspecified.

Cheers,
  Vincenzo


>
> Best,
> Xiaoye
>
>
-- 
Vincenzo Maffione


Re: netmap/vale periodic deadlock

2017-11-22 Thread Vincenzo Maffione
Hi,
  $ bridge -h
for usage.

Yes, bridge is just a simple example program, and stays in foreground.
Probably using daemon(8) you can turn it into a daemon.

Cheers,
  Vincenzo

2017-11-22 10:18 GMT+01:00 Harry Schmalzbauer <free...@omnilan.de>:

> Regarding Harry Schmalzbauer's message of 22.11.2017 09:39 (localtime):
> > Regarding Vincenzo Maffione's message of 22.11.2017 09:04 (localtime):
> >>
> >> 2017-11-21 21:48 GMT+01:00 Harry Schmalzbauer <free...@omnilan.de
> >> <mailto:free...@omnilan.de>>:
> >>
> >> Regarding Vincenzo Maffione's message of 21.11.2017 09:39
> >> (localtime):
> >> …
> >> >
> >> > If this is the case, although you are allowed to do that, I don't think
> >> > it's a convenient way to use netmap.
> >> > Since VLAN interfaces like vlan0 do not have (and cannot have) native
> >> > netmap support, you are falling back to emulated netmap adapters (which
> >> > are probably buggy on FreeBSD, especially when combined with VALE).
> >> > Apart from bugs I think that with this setup you can't get decent
> >> > performance that would justify using netmap rather than the standard
> >> > kernel bridge and TAP devices.
> >>
> >> Hello,
> >>
> >> lockup happened earlier than expected.
> >> This time 'vale-ctl' still reported (-l) the configuration.
> >> One guest, using if_vtnet(4)-virtio-net#vale2:korso, showed:
> >> dmz: watchdog timeout on queue 0
> >> (dmz is the renamed if_vtnet(4))
> >>
> >> I could attach tcpdump to the uplink interface and also to all vlan
> >> children.
> >> Complete silence everywhere.  So it seems the nic stopped processing
> >> anything.
> >>
> >> Do you think that symptom could be caused by my special vale
> >> integration, so that bugs in netmap emulation could crash the NIC?
> >> Or is it unlikely that this is related.
> >>
> >> I hadn't prepared a debug kernel for the host, so the machine rebooted
> >> without again.
> >> I think I'll have to start with replacing vale first, to narrow down
> >> possible causes.  Today I was lucky, the lockup happened after business
> >> hours, but I won't rely on that.
> >> At least I know if I really need to look for a debug netmap kernel, or
> >> possibly there's something else...
> >>
> >> Thanks,
> >>
> >> -harry
> >>
> >>
> >>
> >> I can't really say anything without a stack trace or meaningful logs.
> >> There is a thing that you may do to see if the bug comes out of a bad
> >> interaction between
> >> emulated netmap and VALE.
> >> Instead of attaching the vlan interfaces to VALE you can connect VALE to
> >> the vlan interface
> >> through the "bridge" program. In this way nothing changes from the
> >> functional point of view,
> >> but you are not attaching anymore the VLAN interface to VALE (and you
> >> are using an additional process).
> >>
> >> So instead of
> >>
> >>   # vale-ctl -a vale0:vlan0
> >>
> >> you would have
> >>
> >>   # bridge netmap:vlan0 vale0:vv  # "vv" can be anything
> > Hello Vincenzo,
> >
> > thank you very much for that interesting hint.
> > I prepared a netgraph setup yesterday evening, but I'll try your
> > suggestion first.  Unfortunately I don't have time to prepare a debug
>
> Since this doesn't need a reboot and I'm in adventure mood, I just tried
> it at runtime.
> Unfortunately I can't find bridge documentation besides the source code.
> It doesn't detach from terminal here:
> bridge built Oct  8 2017 12:59:57
> 060.359974 main [244] --- zerocopy NOT supported
> 060.359987 main [251] Wait 4 secs for link to come up...
> 064.365872 main [255] Ready to go, nic1_egn 0x0/1 <-> vale4:nic1egn 0x0/1.
> 068.084364 main [306] poll timeout [0] ev 1 0 rx 0@33 tx 1022, [1] ev 1 0 rx 0@34 tx 1023
> 072.565559 main [306] poll timeout [0] ev 1 0 rx 0@34 tx 1022, [1] ev 1 0 rx 0@35 tx 1023
> …
>
> In general, things are working.
>
> Is bridge staying in the foreground by design?
>
> Thanks,
>
> -harry
>



-- 
Vincenzo Maffione


Re: netmap/vale periodic deadlock

2017-11-22 Thread Vincenzo Maffione
2017-11-21 21:48 GMT+01:00 Harry Schmalzbauer <free...@omnilan.de>:

> Regarding Vincenzo Maffione's message of 21.11.2017 09:39 (localtime):
> …
> >
> > If this is the case, although you are allowed to do that, I don't think
> > it's a convenient way to use netmap.
> > Since VLAN interfaces like vlan0 do not have (and cannot have) native
> > netmap support, you are falling back to emulated netmap adapters (which
> > are probably buggy on FreeBSD, especially when combined with VALE).
> > Apart from bugs I think that with this setup you can't get decent
> > performance that would justify using netmap rather than the standard
> > kernel bridge and TAP devices.
>
> Hello,
>
> lockup happened earlier than expected.
> This time 'vale-ctl' still reported (-l) the configuration.
> One guest, using if_vtnet(4)-virtio-net#vale2:korso, showed:
> dmz: watchdog timeout on queue 0
> (dmz is the renamed if_vtnet(4))
>
> I could attach tcpdump to the uplink interface and also to all vlan
> children.
> Complete silence everywhere.  So it seems the nic stopped processing
> anything.
>
> Do you think that symptom could be caused by my special vale
> integration, so that bugs in netmap emulation could crash the NIC?
> Or is it unlikely that this is related.
>
> I hadn't prepared a debug kernel for the host, so the machine rebooted
> without again.
> I think I'll have to start with replacing vale first, to narrow down
> possible causes.  Today I was lucky, the lockup happened after business
> hours, but I won't rely on that.
> At least I know if I really need to look for a debug netmap kernel, or
> possibly there's something else...
>
> Thanks,
>
> -harry
>


I can't really say anything without a stack trace or meaningful logs.
There is one thing you can do to see whether the bug comes from a bad
interaction between emulated netmap and VALE.
Instead of attaching the vlan interfaces to VALE, you can connect VALE to
the vlan interface through the "bridge" program. In this way nothing changes
from the functional point of view, but you are no longer attaching the VLAN
interface to VALE (and you are using an additional process).

So instead of

  # vale-ctl -a vale0:vlan0

you would have

  # bridge netmap:vlan0 vale0:vv  # "vv" can be anything

Cheers,
  Vincenzo

-- 
Vincenzo Maffione


Re: netmap/vale periodic deadlock

2017-11-21 Thread Vincenzo Maffione
tion which, like you described,
> utilizes netmap to enable minimalistic SDN features, would be a great
> solution.  But I would need really a lot of time, since my C skills are
> lousy, and I really don't have any time, not even one more day.
>

I see. But just FYI, there isn't that much to implement :)

Cheers,
  Vincenzo


>
>
> I'll see if I can get any useful information with the kernel deadlock
> debugging feature you suggested
> (https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/
> kerneldebug-deadlocks.html),
> as soon as the problem shows up again.
> Since I forgot to add all production-RAM, I had to shutdown yesterday,
> so the lockup counter was reset ;-)
> Another last-minute change was with netmap ring size: I changed the
> vale-uplink interface.  The one I used for passthrough had 2 queues
> (with EM_MULTIQUEUE support) and the one for the vale uplink only one,
> and during evaluation phase I reduced rx/tx descriptors to make netmap's
> default ring size work.
> Now I use the 2-queue NIC with vale uplink and increased ring size to
> 81920 while leaving the hardware default of 4096 rx/tx descriptors.
>
> But my wording wasn't technically correct I think, because I guess what
> I'm suffering isn't a real deadlock in terms of locking, but some
> netmap-internal lockup/overflow/limit/whatever.  Just guessing here!  I
> don't know netmap code!  I only link symptoms, and since that setup is
> working really nicely for some limited time, I hoped you or any other
> netmap expert could teach me how to find the root cause.
> Your sentence about FreeBSD's netmap-interface-emulation gives me a bad
> feeling...
>
> Thank you very much,
>
> -harry
>



-- 
Vincenzo Maffione


Re: netmap/vale periodic deadlock

2017-11-21 Thread Vincenzo Maffione
Hi,
  It's hard to say, especially because it happens after two days of normal
use.
Can you enable the deadlock debugging features in your kernel?
https://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

However, if I understand correctly you have created some VLAN interfaces
vlan0, vlan1, vlan2, ... on top of a NIC (say em0). And you have attached
each VLAN interface to a vale switch:

# vale-ctl -a vale0:vlan0
# vale-ctl -a vale1:vlan1
# vale-ctl -a vale2:vlan2

and each VALE switch is attached to a different set of bhyve guests.

If this is the case, although you are allowed to do that, I don't think
it's a convenient way to use netmap.
Since VLAN interfaces like vlan0 do not have (and cannot have) native
netmap support, you are falling back to emulated netmap adapters (which are
probably buggy on FreeBSD, especially when combined with VALE).
Apart from bugs, I think that with this setup you can't get decent
performance that would justify using netmap rather than the standard kernel
bridge and TAP devices.

The right way to do it imho would be to write your own (userspace) netmap
application that forwards packets between your bhyve guests and the NIC,
prepending/stripping VLAN headers according to configuration (e.g. guest A
is configured to be on VLAN 100, guest B on VLAN 200), etc.
I think this would be a very interesting netmap application in general, and
more importantly you would get the performance that you can't get with your
setup.
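
As a rough idea of the per-packet work such an application would do, the
sketch below adds and strips an 802.1Q tag on a raw Ethernet frame held in
a bytes object. It only illustrates the header manipulation; the function
names are invented here, and a real netmap application would do the
equivalent in place on netmap buffers (typically in C).

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def add_vlan_tag(frame, vlan_id, priority=0):
    """Insert an 802.1Q tag after the destination and source MAC addresses."""
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits")
    tci = (priority << 13) | vlan_id  # PCP in the top 3 bits, DEI left at 0
    return frame[:12] + struct.pack("!HH", TPID_8021Q, tci) + frame[12:]


def strip_vlan_tag(frame):
    """Return (vlan_id, untagged_frame); vlan_id is None if no tag is present."""
    (tpid,) = struct.unpack("!H", frame[12:14])
    if tpid != TPID_8021Q:
        return None, frame
    (tci,) = struct.unpack("!H", frame[14:16])
    return tci & 0x0FFF, frame[:12] + frame[16:]


# Guest A on VLAN 100: tag on the way to the NIC, strip on the way back.
untagged = bytes(6) + bytes([1] * 6) + b"\x08\x00" + b"payload"
tagged = add_vlan_tag(untagged, 100)
print(strip_vlan_tag(tagged)[0])  # 100
```

A forwarding loop would then pick the VLAN ID from a per-guest table when
sending towards the NIC, and pick the destination guest from the stripped
VLAN ID in the other direction.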

Cheers,
  Vincenzo

2017-11-17 17:56 GMT+01:00 Harry Schmalzbauer <free...@omnilan.de>:

>  Hello,
>
> sorry for annoying with another question/problem.
>
> I'm using netmap's vale (on stable/11) for bhyve(8) virtio-net backed SDN.
>
> The guests – unfortunately in production already – quit network services
> (resp. are not able to transceive any packets anymore) after about 2
> days; repeatedly and most likely not load related, since there is no
> significant load.
> Each guest is running fine, the host also runs without any other
> problem, no network problem elsewhere (different NICs; I use one
> dedicated NIC with vlan(4) children, each child connected to one vale
> switch).
>
> At some point, the complete netmap subsystem seems to deadlock:
> 'vale-ctl' hangs uninterruptibly.
> Trying to attach a tcpdump to a vale switch also hangs uninterruptibly.
> Stopping (shutting down from inside) bhyve guests works up to the point
> where the vale port should be destroyed.
> I could continue the list of symptoms, but that doesn't help in any way
> I guess.
>
> My question is, where can I start finding out what happens with the
> netmap subsystem?
>
> There were no kernel messages right before or during the deadlock!
>
> The only userland tool I'm familiar with (vale-ctl) isn't usable at all
> in that situation.
> Any hints what to try?
>
>
> Here's an excerpt of the processes running when the locked-up netmap host has
> all guests shut down, just before I rebooted.
> Snipped a lot; the interesting ones are those in state "netmap_g":
> …
> 0 14213 1 0 20 0 5864 0 wait IW 3 0:00,00 (sh)
> 0 14214 14213 0 -92 0 5358120 3586232 nm_kn_lo TC 3 148:02,02 bhyve:
> kallisto (bhyve)
> 0 14976 2522 0 20 0 6976 0 wait IW 3 0:00,00 su
> 0 14981 14976 0 20 0 8256 0 pause IW 3 0:00,00 _su (csh)
> 0 61615 14981 0 20 0 5864 0 wait IW 3 0:00,00 (sh)
> 0 61616 61615 0 52 0 2180648 1973252 netmap_g DEC 3 286:11,91 bhyve:
> preed (bhyve)
> 0 62845 14981 0 20 0 11624 3328 bdg lock L+ 3 0:00,01 tcpdump -n -e -s
> 150 -i vale1:test
> …
> 0 1390 1388 0 -92 0 2330024 767756 nm_kn_lo TC v0- 94:01,90 bhyve: styx0
> (bhyve)
> 0 1401 1 0 52 0 5784 0 wait IW v0- 0:00,00 (sh)
> 0 1403 1401 0 20 0 368328 43444 - TC v0- 3:35,66 bhyve: korso (bhyve)
> …
>



-- 
Vincenzo Maffione


Re: [netmap] when does a packet in the netmap ring send out exactly

2017-11-20 Thread Vincenzo Maffione
Hi,
  Netmap version (tag v11.3 vs master) should be unrelated.
Are you using ixgbe? In this case it may depend on the ixgbe driver version
that netmap is using for patching.
On github master you can change the version by modifying the ixgbe line in
LINUX/default-config.mak.in_.
Valid versions are for instance 4.5.4 or 5.1.3.

Cheers,
  Vincenzo

2017-11-21 4:43 GMT+01:00 Xiaoye Sun <xiaoye@rice.edu>:

> Hi Luigi,
>
> Thanks!
> I was using the most recent netmap on Github and I believe the tail
> pointer only moves forward when there are less than half of the total slots
> available in the netmap ring.
> Then I switch to the version of v11.3
> <https://github.com/luigirizzo/netmap/tree/v11.3>, it behaves as what you
> described.
>
> Linux Kernel: 3.16.0-4-amd64
>
> Best,
> Xiaoye
>
>
> On Mon, Nov 20, 2017 at 6:11 PM, Luigi Rizzo <ri...@iet.unipi.it> wrote:
>
>> Hi,
>> I think if you call the TXSYNC ioctl without advancing the head
>> pointer, then the tail is advanced
>> as much as possible.
>>
>> Cheers
>> luigi
>>
>> On Mon, Nov 20, 2017 at 3:35 PM, Xiaoye Sun <xiaoye@rice.edu> wrote:
>> > Hi,
>> >
>> > I found that the tail pointer only moves when the ring has less than
>> half
>> > of the slots available. This prevents me from knowing the accurate time
>> > when the packet in a slot is processed. Is there a way to move the tail
>> > pointer as long as the packet in the slot is processed? Is this a
>> > configurable feature?
>> >
>> > Best,
>> > Xiaoye
>> >
>> > On Fri, Oct 27, 2017 at 11:52 AM, Vincenzo Maffione <
>> v.maffi...@gmail.com>
>> > wrote:
>> >
>> >> Hi,
>> >>   This is actually a limitation of the netmap API: ring->tail is
>> exposed
>> >> to the user so that it knows it can use the slots in the range
>> >> "[ring->head..ring->tail[" for new transmissions (note that head is
>> >> included, tail excluded, to prevent wraparound). However, there is no
>> >> explicit indication of "up to what slots packets were transmitted".
>> >> For hw NICs, however, ring->tail is an indication of where transmission
>> >> was completed.
>> >> Example:
>> >> 1) at the beginning ring->tail = ring->head = ring->cur = 0
>> >> 2) then your program moves head/cur forward: head = cur = 10
>> >> 3) you call TXSYNC, to submit the packets to the NIC.
>> >> 4) after the TXSYNC call, is very likely that tail is still 0, i.e.
>> >> because no transmission has been completed by the NIC (and no interrupt
>> >> generated).
>> >> 5) say after 20 us you issue another TXSYNC,  and in the meanwhile 6
>> >> packets had completed. In this case after TXSYNC you will find tail==5,
>> >> meaning that packets in the slots 0,1,2,3,4 and 5 have been completed.
>> Note
>> >> that also the slot pointed by tail has been completed.
>> >>
>> >> But you are right that there is no way to receive completion
>> notification
>> >> if the queue is not full. You must use TXSYNC to check (by sleeping or
>> busy
>> >> wait) when tail moves forward.
>> >>
>> >> Cheers,
>> >>   Vincenzo
>> >>
>> >>
>> >> 2017-10-27 3:06 GMT+02:00 Xiaoye Sun <xiaoye@rice.edu>:
>> >>
>> >>> Hi
>> >>>
>> >>> I write a netmap program that sends packets to the network. my program
>> >>> uses one netmap ring and fills the ring slots with packets.
>> >>> My program needs to do something (action A) after a particular packet
>> >>> (packet P) in the ring slot is sent to the network. so the program
>> tracks
>> >>> the position of the tail point and checks if the tail point has moved
>> >>> across the slot I used to put that packet P.
>> >>> However, I found that the tail pointer may not move forward even
>> seconds
>> >>> after the receiver side got packet P.
>> >>> Sometimes the tail pointer never moves forward until the TX ring is
>> full.
>> >>> I try ioctl(NIOCTXSYNC), however, it cannot 100% solve the problem.
>> >>>
>> >>> My question is that is there a way to make the TX ring empty as early
>> as
>> >>> possible so that I can know when my packet is sent out. or is there
>> >>> another

Re: netmap scatter/gather?

2017-11-08 Thread Vincenzo Maffione
I see, but it seems there is a bug anyway. Feel free to open a bug report
here https://github.com/luigirizzo/netmap/issues if you wish.

2017-11-07 21:13 GMT+01:00 Joe Buehler <as...@cox.net>:

> Decreasing buf_num to 32768 eliminated the allocation failure.
>
> Joe Buehler
>



-- 
Vincenzo Maffione


Re: netmap scatter/gather?

2017-11-07 Thread Vincenzo Maffione
Hi,
  In general netmap adapters (i.e. netmap ports) may support NS_MOREFRAG.
But in practice this is mainly supported on VALE ports.
So if you don't want to add the missing support by yourself you can simply
change the netmap buffer size by tuning the sysctl dev.netmap.buf_size, and
increase it to 9600.
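
To see what the two options mean for a 9600-byte frame, the sketch below
splits a frame into buffer-sized chunks and marks every chunk except the
last as "more fragments follow", which is how NS_MOREFRAG chaining is
documented to work. The function is purely illustrative and not part of any
netmap API.

```python
def split_frame(frame_len, buf_size):
    """Return (chunk_len, more_frag) pairs for one frame.

    more_frag is True for every chunk except the last one, mirroring the
    role of NS_MOREFRAG on the slots of a multi-slot packet.
    """
    chunks = []
    remaining = frame_len
    while remaining > 0:
        chunk = min(remaining, buf_size)
        remaining -= chunk
        chunks.append((chunk, remaining > 0))
    return chunks


print(len(split_frame(9600, 2048)))  # 5 slots with the default buffer size
print(len(split_frame(9600, 9600)))  # 1 slot once buf_size is raised
```

With the default 2048-byte buffers the frame needs five slots (four full
ones plus 1408 bytes); with buf_size raised to 9600 it fits in one slot and
no fragment support is needed.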

Cheers,
  Vincenzo

2017-11-07 18:28 GMT+01:00 Joseph H. Buehler <j...@cox.net>:

> Does NS_MOREFRAG work when using netmap with network adaptors (e.g.
> virtio_net)?
>
> I need to send and receive large frames -- 9600 bytes -- but the netmap
> buffer size is only 2048.
>
> Joe Buehler
>



-- 
Vincenzo Maffione


Re: virtio_net / netmap RX dropping frames

2017-11-07 Thread Vincenzo Maffione
I understand. My point was simply to make sure your performance problems
here are not netmap's fault.

Cheers,
  Vincenzo

2017-11-07 18:25 GMT+01:00 Joe Buehler <as...@cox.net>:

> I believe the frame drop is due to the nature of my KVM setup.  There
> are large latencies in processing incoming frames due to the vagaries of
> the LINUX kernel.  Moving to the RT kernel helped, host tuning is also
> needed to eliminate large latencies in processing frames.  There is good
> information on real-time KVM on the net that explains how to achieve
> what I want.
>
> Joe Buehler
>
>


-- 
Vincenzo Maffione


Re: FreeBSD 11.1 vmx + netmap queues

2017-11-02 Thread Vincenzo Maffione
Hi,
  With the vmx driver, netmap will use the emulated netmap adapter. On FreeBSD,
netmap still has no way to find out how many rings an interface has, so
by default it assumes a single TX/RX ring pair for an emulated adapter. You
can, however, change this with the sysctl dev.netmap.generic_rings.

Cheers,
  Vincenzo

On 2 Nov 2017 6:22 PM, "Santiago Martinez" wrote:

> Hi list, hope you guys are doing well.
>
> I have a basic question. Do you know if multiple TX queues are
> supported  for vmx + netmap ?
>
> Basically I'm using pkt-gen to generate bulk traffic @10Gbps and it's OK
> with packet size >~1000b.
>
> For small packets I should use multiple cores/processes to be able to
> generate the required pps, but pkt-gen complains that I have only one queue.
>
> I tried adding multiple queues for vmx on loader.conf (can verify with
> sysctl) but netmap is still complaining there is only one queue.
>
> sysctl -a | grep vmx.1:
> dev.vmx.1.mbuf_load_failed: 0
> dev.vmx.1.mgetcl_failed: 0
> dev.vmx.1.defrag_failed: 0
> dev.vmx.1.defragged: 0
> dev.vmx.1.nrxqueues: 8
> dev.vmx.1.ntxqueues: 4
> dev.vmx.1.max_nrxqueues: 8
> dev.vmx.1.max_ntxqueues: 4
> dev.vmx.1.%parent: pci4
> dev.vmx.1.%pnpinfo: vendor=0x15ad device=0x07b0 subvendor=0x15ad
> subdevice=0x07b0 class=0x02
> dev.vmx.1.%location: slot=0 function=0 dbsf=pci0:11:0:0
> handle=\_SB_.PCI0.PE50.S1F0
> dev.vmx.1.%driver: vmx
> dev.vmx.1.%desc: VMware VMXNET3 Ethernet Adapter
>
> pkt-gen is still reporting one queue for vmx:
>
> Sending on netmap:vmx1: 1 queues, 2 threads and 4 cpus.
>
>
> Thanks in advance.
>
> Santiago
>


Re: [netmap] when does a packet in the netmap ring send out exactly

2017-10-27 Thread Vincenzo Maffione
Hi,
  This is actually a limitation of the netmap API: ring->tail is exposed to
the user so that it knows it can use the slots in the range
"[ring->head..ring->tail[" for new transmissions (note that head is
included, tail excluded, to prevent wraparound). However, there is no
explicit indication of "up to what slots packets were transmitted".
For hw NICs, however, ring->tail is an indication of where transmission was
completed.
Example:
1) at the beginning ring->tail = ring->head = ring->cur = 0
2) then your program moves head/cur forward: head = cur = 10
3) you call TXSYNC, to submit the packets to the NIC.
4) after the TXSYNC call, it is very likely that tail is still 0, because
no transmission has been completed by the NIC (and no interrupt generated).
5) say after 20 us you issue another TXSYNC, and in the meantime 6
packets have completed. In this case after TXSYNC you will find tail==5,
meaning that packets in the slots 0,1,2,3,4 and 5 have been completed. Note
that the slot pointed to by tail has also been completed.

But you are right that there is no way to receive a completion notification
if the queue is not full. You must use TXSYNC to check (by sleeping or busy
waiting) when tail moves forward.
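
The polling pattern can be sketched as follows. The read_tail callable is a
stand-in invented for this example: in a real program it would issue the
TXSYNC ioctl and then return the current ring->tail. The timeout and
interval values are arbitrary.

```python
import time


def wait_for_tail_change(read_tail, last_tail, timeout_s=1.0, interval_s=0.0001):
    """Poll until the tail index differs from last_tail or the timeout expires.

    read_tail is a hypothetical callable standing in for "sync, then read
    ring->tail". Returns the new tail, or None on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        tail = read_tail()
        if tail != last_tail:
            return tail
        time.sleep(interval_s)  # set interval_s=0 for a pure busy wait
    return None


# Fake ring for demonstration: tail stays at 0 for three polls, then moves to 5.
fake_tails = iter([0, 0, 0, 5])
print(wait_for_tail_change(lambda: next(fake_tails), last_tail=0))  # 5
```

Sleeping between polls trades completion-time accuracy for CPU usage, so
the interval bounds how precisely the completion time can be known.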

Cheers,
  Vincenzo


2017-10-27 3:06 GMT+02:00 Xiaoye Sun <xiaoye@rice.edu>:

> Hi
>
> I wrote a netmap program that sends packets to the network. My program
> uses one netmap ring and fills the ring slots with packets.
> My program needs to do something (action A) after a particular packet
> (packet P) in the ring slot is sent to the network, so the program tracks
> the position of the tail pointer and checks whether the tail pointer has moved
> past the slot I used for packet P.
> However, I found that the tail pointer may not move forward even seconds
> after the receiver side got packet P.
> Sometimes the tail pointer never moves forward until the TX ring is full.
> I tried ioctl(NIOCTXSYNC); however, it does not fully solve the problem.
>
> My question is that is there a way to make the TX ring empty as early as
> possible so that I can know when my packet is sent out. or is there another
> way to know when the packet in the slot is sent to the network/NIC physical
> queue?
>
> I am using Linux 3.16.0-4-amd64.
>
> Thanks!
>
> Best,
> Xiaoye
>



-- 
Vincenzo Maffione


Re: virtio_net / netmap RX dropping frames

2017-10-27 Thread Vincenzo Maffione
2017-10-26 20:04 GMT+02:00 Joe Buehler <as...@cox.net>:

> Vincenzo Maffione wrote:
>
> > You can have how many threads and processes you want. The constraint is
> > that there must not be two threads accessing the same ring at the same
> > time. In this case each pktgen is accessing different rings.
>
> Thanks that was very useful info.  I had run this before and got large
> frame drop so assumed it was a violation of threading constraints.  So
> now I can remove a mutex from my app, which has an RX and TX thread.
>
> Running the two pkt-gen instances, I am getting a lot of RX frame drop.
>  Based on counters, the TX frames are making it to the external loopback
> device, about 1 million frames/sec, which is looping them all back, but
> the macvtap interface on the host that feeds into the
> ixgbe/vhost/virtio_net/netmap interface in the VM shows about 80-90% of
> them as dropped.  CPU usage is low in the VM, very roughly 25% for the
> TX thread and 5% for the RX thread.  The frame rate displayed by pkt-gen
> and the CPU displayed by top is bouncing around.
>
>
Yes, mutexes are only needed for concurrent access to the same ring, which
is not your case.
From this description it seems that your problem is not netmap or pkt-gen.
Your TX pkt-gen transmits 1Mpps with 0.25 CPUs, which is ok given the
limitations of virtio-net.
Your RX pkt-gen is not really doing that much work (0.05 CPU), which means
that the virtio-net RX ring is almost always empty.

You need to figure out who is dropping the RX packets and why. This seems
to happen before the packets really make their way to the virtio-net RX
ring. So it must be some queue overflow in the macvtap or your ixgbe device.
I think the received packets are handled by this function

https://elixir.free-electrons.com/linux/latest/source/drivers/net/tap.c#L317

which is registered here

https://elixir.free-electrons.com/linux/latest/source/drivers/net/macvtap.c#L101

so if you see the macvtap interface drop counters increasing, it must be
tap_handle_frame() dropping. If this is true, it means that your problem is
macvtap.
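
To confirm where the drops accumulate, it helps to sample the interface drop counter twice and look at the difference. The following is only a minimal sketch, assuming a Linux host that exposes per-interface counters under /sys/class/net/<ifname>/statistics/; the interface name "macvtap0" and the choice of counter are assumptions (the macvtap driver may account these drops under rx_errors as well, so the counter name is a parameter).

```python
import time
from pathlib import Path


def read_counter(ifname, counter="rx_dropped", sysfs_root="/sys/class/net"):
    """Return the cumulative value of one statistics counter for ifname."""
    path = Path(sysfs_root) / ifname / "statistics" / counter
    return int(path.read_text().strip())


def counter_rate(ifname, interval=1.0, counter="rx_dropped",
                 sysfs_root="/sys/class/net"):
    """Sample the counter twice and return its growth per second."""
    before = read_counter(ifname, counter, sysfs_root)
    time.sleep(interval)
    after = read_counter(ifname, counter, sysfs_root)
    return (after - before) / interval
```

Running counter_rate("macvtap0") while the test traffic is flowing gives the drop rate on that interface; a rate close to the loss seen by the RX pkt-gen would point at the macvtap queue.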

Cheers,
  Vincenzo


> Joe Buehler
>



-- 
Vincenzo Maffione
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: virtio_net / netmap RX dropping frames

2017-10-26 Thread Vincenzo Maffione
You can have as many threads and processes as you want. The constraint is
that there must not be two threads accessing the same ring at the same
time. In this case each pkt-gen is accessing different rings.
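
The constraint can be checked on paper before removing any lock. Below is a small sketch, not netmap code: it assumes you describe each thread by the set of rings it touches, with a ring identified by a hypothetical (direction, index) tuple, and it reports the rings that would need protection.

```python
from collections import defaultdict


def shared_rings(assignments):
    """Return the rings that more than one thread would touch.

    assignments maps a thread name to the set of rings it accesses;
    a ring is identified here by a (direction, index) tuple.
    """
    users = defaultdict(set)
    for thread, rings in assignments.items():
        for ring in rings:
            users[ring].add(thread)
    # Only rings with two or more users need a mutex (or a redesign).
    return {ring: sorted(t) for ring, t in users.items() if len(t) > 1}
```

With one TX thread on ("tx", 0) and one RX thread on ("rx", 0) the result is empty, which is the situation of the two pkt-gen instances: no mutex is needed.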

On 26 Oct 2017 7:01 PM, "Joe Buehler" <as...@cox.net> wrote:

> Vincenzo Maffione wrote:
> > So you are using netmap only in the guest (and not in the host).
> > And you are running a sender and a receiver inside the VM, both on the
> > VM interface.
> > Something like this
> >
> > # pkt-gen -i eth1 -f rx
> > # pkt-gen -i eth1 -f tx
>
> Yes that's the basic idea.
>
> >
> > ?
> > What happens if you use pkt-gen rather than your application?
>
> I was under the impression that I can't have two threads in the netmap
> kernel code at the same time so can't do that.
>
> Joe Buehler
>


Re: virtio_net / netmap RX dropping frames

2017-10-26 Thread Vincenzo Maffione
So you are using netmap only in the guest (and not in the host).
And you are running a sender and a receiver inside the VM, both on the VM
interface.
Something like this

# pkt-gen -i eth1 -f rx
# pkt-gen -i eth1 -f tx

?
What happens if you use pkt-gen rather than your application?

2017-10-26 16:31 GMT+02:00 Joe Buehler <as...@cox.net>:

> Vincenzo Maffione wrote:
> > I guess you are using a FreeBSD guest. Is this the case? If you have the
>
> Sorry, I am using LINUX, ubuntu 16.04 LTS for both host and VM.  I am
> posting here at standing request of netmap driver author.
>
> The host has 24 CPUs @ 2.5 GHz and 128G of memory and is *idle* so I am
> a bit disappointed
>
> > chance, try a linux guest to check if virtio-net works better there
> > (I've used netmap on the netmap-patched virtio-net in Linux guests,
> > never tried on FreeBSD).
> > The netmap ring size is just the NIC ring size. If you change the
> > virtio-net NIC ring size (sysctl on FreeBSD, I guess).
>
> OK I'll look into that.  I increased the ring size on the host ixgbe but
> that had no effect so I guess it must be virtio_net.
>
> > Anyway, for your specific use-case (VM accessing the physical 10G NIC)
> > there is a way better solution, which is the netmap passthrough.
>
> Unfortunately I don't have control of the host, just the VM, so pt
> netmap is not an option.
>
> My initial query regarded frame drops but the latency is also pretty
> bad.  The LINUX ping utility inside the VM says 0.2 mS consistently
> without netmap in use.  My app sees that value for almost all frames but
> it spikes (up to 0.6 mS!) for a few frames, which is not acceptable for
> this application -- was expecting much better due to network stack
> bypass.  And this is just 100 frames/sec...
>
> Joe Buehler
>



-- 
Vincenzo Maffione


Re: virtio_net / netmap RX dropping frames

2017-10-26 Thread Vincenzo Maffione
I guess you are using a FreeBSD guest. Is this the case? If you have the
chance, try a Linux guest to check if virtio-net works better there (I've
used netmap on the netmap-patched virtio-net in Linux guests, never tried
on FreeBSD).
The netmap ring size is just the NIC ring size, so if you change the
virtio-net NIC ring size (via sysctl on FreeBSD, I guess), the netmap ring
size changes with it.

Anyway, for your specific use-case (VM accessing the physical 10G NIC)
there is a way better solution, which is the netmap passthrough.
Check out the virtualization.pdf in this tutorial
https://github.com/vmaffione/netmap-tutorial.
You basically need to run QEMU (with KVM enabled), saying that you want to
pass through a netmap port (e.g. netmap:ethX in your case) to a VM. Then in
the FreeBSD VM you will see a "ptnet0" interface, where you can use pkt-gen.
You should get a 10x improvement if properly configured.

Cheers,
  Vincenzo

2017-10-25 23:02 GMT+02:00 Joe Buehler <as...@cox.net>:

> I am running virtio_net (netmap-modified) on top of netmap (latest) in a
> KVM virtual machine.  The host adapter is Intel 82599ES 10G and the VM
> is connected to it via macvtap.
>
> My test setup is a small program in the VM sending frames out to an
> external loopback device and watching what comes back.
>
> I am running at fairly low frame rates (200k frames / sec) and seeing RX
> frame drops and high latency (a few milliseconds).  The TX frames are
> all making it to the external loopback device (based on device counters)
> but the macvtap device in the RX path is reporting dropped frames, the
> count agreeing with what the test program observes.
>
> I guess my first question has to do with ring sizes.  The netmap API is
> reporting 255 buffers in the RX and TX rings.  How do I increase this
> substantially?
>
> Joe
>



-- 
Vincenzo Maffione


Re: NULL pointer dereference bug triggered by netmap

2017-07-18 Thread Vincenzo Maffione
Hi,
  Looks good to me, although I'm not sure whether if_transmit should
assert(mbuf == NULL). Couldn't we just drop the mbuf if we receive it?

Thanks,
  Vincenzo

2017-07-18 10:43 GMT-07:00 Luiz Otavio O Souza <lists...@gmail.com>:

> On 12 July 2017 at 02:19, Vincenzo Maffione wrote:
> > Yes.
> >
> > Actually, we would also need one of the following two options:
> > 1) Implementing a dummy if_start() for if_loop.c
> > 2) Prevent netmap from using if_loop.
>
> Hi,
>
> Please, check the attached patches.
>
> Luiz
>
> >
> > 2017-07-11 22:05 GMT+02:00 Marius Strobl <mar...@freebsd.org>:
> >
> >> On Thu, Jul 06, 2017 at 02:19:42PM -0700, Vincenzo Maffione wrote:
> >> > Sure, can anyone commit this?
> >>
> >> The addition of KASSERTs like the below one to if_handoff() and
> >> if_start()? Sure.
> >>
> >> Marius
> >>
> >> >
> >> > On 5 Jul 2017 4:05 AM, "Marius Strobl" <mar...@freebsd.org> wrote:
> >> >
> >> > > On Mon, Jul 03, 2017 at 05:08:09PM +0200, Vincenzo Maffione wrote:
> >> > > > Details here:
> >> > > >
> >> > > > https://github.com/luigirizzo/netmap/issues/322
> >> > > >
> >> > > > Is it acceptable to commit the proposed patch?
> >> > >
> >> > > As suggested by hselasky@, the outliner problem at hand is better
> >> solved
> >> > > by a dummy if_start method in order to not hurt the fast-path.
> Thus, if
> >> > > anything at all, a KASSERT(ifp->if_start != NULL, "no if_start
> method")
> >> > > should be added to if_handoff() and if_start().
>



-- 
Vincenzo Maffione


Re: NULL pointer dereference bug triggered by netmap

2017-07-11 Thread Vincenzo Maffione
Yes.

Actually, we would also need one of the following two options:
1) Implementing a dummy if_start() for if_loop.c
2) Prevent netmap from using if_loop.
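
To illustrate option 1, here is a toy model in Python rather than kernel C. It is only a sketch under stated assumptions: the Ifnet class, dummy_if_start() and handoff() are hypothetical stand-ins for struct ifnet, a no-op start routine and a caller such as if_handoff(); None plays the role of the NULL pointer.

```python
class Ifnet:
    """Toy stand-in for the kernel's struct ifnet (only if_start is modelled)."""

    def __init__(self, name, if_start=None):
        self.name = name
        self.if_start = if_start  # None plays the role of a NULL pointer


def dummy_if_start(ifp):
    """Option 1: a do-nothing start routine, so the pointer is never NULL."""
    return None


def handoff(ifp):
    """Simplified caller that invokes if_start unconditionally."""
    return ifp.if_start(ifp)
```

Calling handoff(Ifnet("lo0")) fails (the Python analogue of the NULL dereference), while handoff(Ifnet("lo0", dummy_if_start)) returns quietly, without adding any check to the caller's fast path.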

Cheers,
  Vincenzo

2017-07-11 22:05 GMT+02:00 Marius Strobl <mar...@freebsd.org>:

> On Thu, Jul 06, 2017 at 02:19:42PM -0700, Vincenzo Maffione wrote:
> > Sure, can anyone commit this?
>
> The addition of KASSERTs like the below one to if_handoff() and
> if_start()? Sure.
>
> Marius
>
> >
> > On 5 Jul 2017 4:05 AM, "Marius Strobl" <mar...@freebsd.org> wrote:
> >
> > > On Mon, Jul 03, 2017 at 05:08:09PM +0200, Vincenzo Maffione wrote:
> > > > Details here:
> > > >
> > > > https://github.com/luigirizzo/netmap/issues/322
> > > >
> > > > Is it acceptable to commit the proposed patch?
> > >
> > > As suggested by hselasky@, the outliner problem at hand is better
> solved
> > > by a dummy if_start method in order to not hurt the fast-path. Thus, if
> > > anything at all, a KASSERT(ifp->if_start != NULL, "no if_start method")
> > > should be added to if_handoff() and if_start().
> > >
>



-- 
Vincenzo Maffione


Re: NULL pointer dereference bug triggered by netmap

2017-07-06 Thread Vincenzo Maffione
Sure, can anyone commit this?

On 5 Jul 2017 4:05 AM, "Marius Strobl" <mar...@freebsd.org> wrote:

> On Mon, Jul 03, 2017 at 05:08:09PM +0200, Vincenzo Maffione wrote:
> > Details here:
> >
> > https://github.com/luigirizzo/netmap/issues/322
> >
> > Is it acceptable to commit the proposed patch?
>
> As suggested by hselasky@, the outliner problem at hand is better solved
> by a dummy if_start method in order to not hurt the fast-path. Thus, if
> anything at all, a KASSERT(ifp->if_start != NULL, "no if_start method")
> should be added to if_handoff() and if_start().
>
> Marius
>
>


NULL pointer dereference bug triggered by netmap

2017-07-03 Thread Vincenzo Maffione
Details here:

https://github.com/luigirizzo/netmap/issues/322

Is it acceptable to commit the proposed patch?

Thanks,
  Vincenzo


Re: Netmap with bonded interfaces

2017-06-29 Thread Vincenzo Maffione
Bypass what?

2017-06-28 22:59 GMT-07:00 Paras Jha <dreadisc...@gmail.com>:

> It's possible to bypass this by unloading and reloading the patched
> network driver
>
> On Thu, Jun 29, 2017 at 12:39 AM, Vincenzo Maffione <v.maffi...@gmail.com>
> wrote:
>
>> Hi,
>>   It is expected behaviour that you cannot open eth4 and eth5 if they
>> are bonded, as the devices are being used by the lagg pseudo-driver.
>> Since this driver does not have netmap support for the time being,
>> there is no way you can get the native mode performance if you use lagg.
>>
>> If you just need some failover in your application, you could just
>> implement a simple failover mechanism in your application (e.g. the
>> application opens both netmap:eth4 and netmap:eth5, and decides which one
>> to use for transmission depending on which one is up...).
>>
>> Cheers,
>>   Vincenzo
>>
>> 2017-06-29 4:16 GMT+02:00 Paras Jha <dreadisc...@gmail.com>:
>>
>>> Hi all,
>>>
>>> I have a bonded interface bond0 which enslaves eth4 and eth5. When trying
>>> to open the devices eth4 or eth5 via netmap, I get a "device in use"
>>> error.
>>> Opening the bond0 interface directly in netmap works, however it is in
>>> emulated mode (as expected of a pseudointerface)
>>>
>>> What is the idiomatic way to proceed in such a situation, without
>>> compromising on speed?
>>>
>>> Thanks
>>
>>
>>
>> --
>> Vincenzo Maffione
>>
>
>


-- 
Vincenzo Maffione


Re: Netmap with bonded interfaces

2017-06-28 Thread Vincenzo Maffione
Hi,
  It is expected behaviour that you cannot open eth4 and eth5 if they
are bonded, as the devices are being used by the lagg pseudo-driver.
Since this driver does not have netmap support for the time being, there
is no way you can get the native mode performance if you use lagg.

If you just need some failover in your application, you could just
implement a simple failover mechanism in your application (e.g. the
application opens both netmap:eth4 and netmap:eth5, and decides which one
to use for transmission depending on which one is up...).
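
The selection logic for such a failover can be very small. The sketch below is only an illustration of the idea; pick_tx_port() and the is_up callback are hypothetical names, and how link state is obtained (for example by polling the interface flags) is left to the application.

```python
def pick_tx_port(ports, is_up):
    """Return the first port, in preference order, whose link is up.

    ports is a list such as ["netmap:eth4", "netmap:eth5"]; is_up is a
    callable supplied by the application that reports link state.
    Returns None when no port is usable.
    """
    for port in ports:
        if is_up(port):
            return port
    return None
```

The application would keep both ports open and call this before each transmit batch, so that traffic moves to eth5 as soon as eth4 goes down.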

Cheers,
  Vincenzo

2017-06-29 4:16 GMT+02:00 Paras Jha <dreadisc...@gmail.com>:

> Hi all,
>
> I have a bonded interface bond0 which enslaves eth4 and eth5. When trying
> to open the devices eth4 or eth5 via netmap, I get a "device in use" error.
> Opening the bond0 interface directly in netmap works, however it is in
> emulated mode (as expected of a pseudointerface)
>
> What is the idiomatic way to proceed in such a situation, without
> compromising on speed?
>
> Thanks



-- 
Vincenzo Maffione


Re: state of packet forwarding in FreeBSD?

2017-06-16 Thread Vincenzo Maffione
Also, there is a more basic problem. The bridge program requires physical
interfaces to be specified with the "netmap:" prefix. For example:

# ./bridge netmap:ix0 netmap:em2   # OK
# ./bridge ix0 em2                 # WRONG
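
A quick way to catch this mistake in a wrapper script is to classify the port names before launching anything. This is a hedged sketch of the naming convention only (a "netmap:" prefix for physical interfaces, a "valeN:port" form for VALE ports); classify_port() is a hypothetical helper and not netmap's own parser.

```python
def classify_port(name):
    """Classify a port name by the prefix convention used in this thread."""
    if name.startswith("netmap:"):
        return "netmap"   # physical (or emulated) interface in netmap mode
    if name.startswith("vale") and ":" in name:
        return "vale"     # port of a VALE switch, e.g. vale0:1
    return "bare"         # plain interface name, missing the prefix
```

A script could refuse to start bridge when either argument comes back as "bare".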

2017-06-16 14:47 GMT+02:00 Vincenzo Maffione <v.maffi...@gmail.com>:

> Some comments inline:
>
> 2017-06-16 14:13 GMT+02:00 Jordan Caraballo <jordancaraball...@gmail.com>:
>
>> Hi guys,
>>
>> We tried multiple attempts to implement netmap vale-ctl and bridge in
>> chelsio vcxl* interfaces. The most interesting attempts, are mentioned
>> below.
>>
>> First Attempt: Ran "./bridge vcxl0 vcxl1"; but it would complain about
>> having a 0 burst size. Added "-b 1024" to the command as recommended by
>> the log from the script, but the issue was still present.
>>
>
> If
>
> # ./bridge ifname1 ifname2
>
> doesn't work it means there is some problem with the driver the interfaces
> are using. Default burst size used by "bridge" is already 1024.
> You may try to repeat this experiment after setting "sysctl
> dev.netmap.admode=2", to use the emulated mode, that is the legacy driver
> (of course performance would be limited, but at least you check
> functionality).
> Check netmap(4) for more info.
>
>
>>
>> Second Attempt: Tried to combine vale-ctl and bridge by:
>> # ./vale -h vale0:vcxl0
>> # ./vale -h vale0:vcxl1
>> # ./bridge vale0:1 vale0:3
>
>
>> There was no error, however, traffic did not flow at the time of
>> shooting packets to the interfaces.
>>
>
> What did you mean to do here? This setup creates a forwarding loop
> involving the VALE switch "vale0" and the userspace hub implemented by
> "bridge".
> I'm not surprised that you see nothing..
>
>
>>
>> Third Attempt: By following this email thread
>> https://lists.openinfosecfoundation.org/pipermail/oisf-users/2015-October/005310.html
>> we ran:
>> # ./vale-ctl -n b0
>> # ./vale-ctl -n b1
>> # ./vale-ctl -a vale0:b0
>> # ./vale-ctl -a vale0:vcxl0
>> # ./vale-ctl -a vale1:b1
>> # ./vale-ctl -a vale1:vcxl1
>> # ./bridge -i netmap:vcxl0 -i netmap:vcxl1
>>
>> Same result as before, no errors, yet no traffic in the interfaces.
>>
>>
> This should throw errors, as you are first attaching vcxl0 to a VALE
> switch, and then opening vcxl0 in netmap mode with the bridge program:
> netmap forbids you to do that, because when an interface is attached to a
> VALE switch it is "busy", and cannot be opened again from another netmap
> application (like bridge).
> Same reasoning for vcxl1.
>
> Have you tried with the basic TX/RX tests first, without involving VALE?
>
> # pkt-gen -i netmap:vcxl0 -f tx
> # pkt-gen -i netmap:vcxl0 -f rx
>
>
> Regards,
>   Vincenzo
>
>
>> Any feedback or advice on why traffic is not flowing?
>>
>> We would like to note that throwing packets to the vcxl interfaces
>> without any netmap aware application ranges from 1.1M to 1.2M pps.
>>
>> Is this supposed to happen? (We consider that still, the number is quite
>> low)
>>
>> - Jordan
>>
>>
>> 2017-06-15 17:15 GMT-04:00 Navdeep Parhar <npar...@gmail.com>:
>>
>> > On 06/14/2017 10:42, Olivier Cochard-Labbé wrote:
>> >
>> >> On Wed, Jun 14, 2017 at 7:36 PM, Navdeep Parhar <npar...@gmail.com
>> >> <mailto:npar...@gmail.com>> wrote:
>> >>
>> >>
>> >> I think I fixed this a long time back.  Have you tried recently?
>> We
>> >> moved the netmap functionality to the vcxl interfaces and it should
>> >> just work.
>> >>
>> >> ​
>> >> It still panics with a -head build today.
>> >>
>> >>
>> > Fixed in r319986.
>> >
>> >
>>
>>
>>
>> --
>> Jordan
>
>
>
> --
> Vincenzo Maffione
>



-- 
Vincenzo Maffione

Re: state of packet forwarding in FreeBSD?

2017-06-16 Thread Vincenzo Maffione
Some comments inline:

2017-06-16 14:13 GMT+02:00 Jordan Caraballo <jordancaraball...@gmail.com>:

> Hi guys,
>
> We tried multiple attempts to implement netmap vale-ctl and bridge in
> chelsio vcxl* interfaces. The most interesting attempts, are mentioned
> below.
>
> First Attempt: Ran "./bridge vcxl0 vcxl1"; but it would complain about
> having a 0 burst size. Added "-b 1024" to the command as recommended by
> the log from the script, but the issue was still present.
>

If

# ./bridge ifname1 ifname2

doesn't work it means there is some problem with the driver the interfaces
are using. Default burst size used by "bridge" is already 1024.
You may try to repeat this experiment after setting "sysctl
dev.netmap.admode=2", to use the emulated mode, that is the legacy driver
(of course performance would be limited, but at least you check
functionality).
Check netmap(4) for more info.
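
For reference, the three values of dev.netmap.admode described in netmap(4) can be captured in a tiny helper. This is just a sketch; admode_command() is a hypothetical convenience function that builds the sysctl command line, it does not run it.

```python
ADMODE_MEANING = {
    0: "use the best available option (native if supported, else emulated)",
    1: "force native mode, fail if the driver has no netmap support",
    2: "force emulated mode, which never fails",
}


def admode_command(mode):
    """Return the sysctl command line that selects the given adapter mode."""
    if mode not in ADMODE_MEANING:
        raise ValueError(f"dev.netmap.admode must be 0, 1 or 2, got {mode!r}")
    return f"sysctl dev.netmap.admode={mode}"
```

For the functional check suggested above, admode_command(2) gives the command to run before repeating the bridge test.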


>
> Second Attempt: Tried to combine vale-ctl and bridge by:
> # ./vale -h vale0:vcxl0
> # ./vale -h vale0:vcxl1
> # ./bridge vale0:1 vale0:3


> There was no error, however, traffic did not flow at the time of
> shooting packets to the interfaces.
>

What did you mean to do here? This setup creates a forwarding loop
involving the VALE switch "vale0" and the userspace hub implemented by
"bridge".
I'm not surprised that you see nothing..
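
One way to see the loop is to draw every forwarding element (the VALE switch, the bridge process, each NIC) as a node and every attachment as a link; two links between the same pair of elements already form a cycle. The following is a minimal sketch of that check using union-find; the element names are labels taken from this setup, nothing here talks to netmap.

```python
def has_forwarding_loop(links):
    """Return True if the undirected link list contains a cycle.

    links is an iterable of (element_a, element_b) pairs, one per
    attachment; parallel links between the same two elements count.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return True  # a and b were already connected: this link closes a loop
        parent[root_a] = root_b
    return False
```

In the second attempt, "bridge" opens two ports on vale0, so the link list contains ("vale0", "bridge") twice and the function reports a loop.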


>
> Third Attempt: By following this email thread
> https://lists.openinfosecfoundation.org/pipermail/oisf-users/2015-October/005310.html
> we ran:
> # ./vale-ctl -n b0
> # ./vale-ctl -n b1
> # ./vale-ctl -a vale0:b0
> # ./vale-ctl -a vale0:vcxl0
> # ./vale-ctl -a vale1:b1
> # ./vale-ctl -a vale1:vcxl1
> # ./bridge -i netmap:vcxl0 -i netmap:vcxl1
>
> Same result as before, no errors, yet no traffic in the interfaces.
>
>
This should throw errors, as you are first attaching vcxl0 to a VALE
switch, and then opening vcxl0 in netmap mode with the bridge program:
netmap forbids you to do that, because when an interface is attached to a
VALE switch it is "busy", and cannot be opened again from another netmap
application (like bridge).
Same reasoning for vcxl1.
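
The "busy" rule amounts to exclusive ownership of an interface. Here is a toy model of it, only to make the expected error explicit; PortTable and PortBusy are hypothetical names and do not correspond to netmap data structures.

```python
class PortBusy(Exception):
    """Raised when an interface is already owned by someone else."""


class PortTable:
    """Tracks which user (a VALE switch or an application) owns an interface."""

    def __init__(self):
        self._owner = {}

    def claim(self, ifname, owner):
        current = self._owner.get(ifname)
        if current is not None:
            raise PortBusy(f"{ifname} is already in use by {current}")
        self._owner[ifname] = owner
```

In the third attempt, the claim made by "vale-ctl -a vale0:vcxl0" comes first, so the later claim by bridge on the same interface is the one that should be refused.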

Have you tried with the basic TX/RX tests first, without involving VALE?

# pkt-gen -i netmap:vcxl0 -f tx
# pkt-gen -i netmap:vcxl0 -f rx


Regards,
  Vincenzo


> Any feedback or advice on why traffic is not flowing?
>
> We would like to note that throwing packets to the vcxl interfaces
> without any netmap aware application ranges from 1.1M to 1.2M pps.
>
> Is this supposed to happen? (We consider that still, the number is quite
> low)
>
> - Jordan
>
>
> 2017-06-15 17:15 GMT-04:00 Navdeep Parhar <npar...@gmail.com>:
>
> > On 06/14/2017 10:42, Olivier Cochard-Labbé wrote:
> >
> >> On Wed, Jun 14, 2017 at 7:36 PM, Navdeep Parhar <npar...@gmail.com
> >> <mailto:npar...@gmail.com>> wrote:
> >>
> >>
> >> I think I fixed this a long time back.  Have you tried recently?  We
> >> moved the netmap functionality to the vcxl interfaces and it should
> >> just work.
> >>
> >> ​
> >> It still panics with a -head build today.
> >>
> >>
> > Fixed in r319986.
> >
> >
>
>
>
> --
> Jordan



-- 
Vincenzo Maffione

Re: state of packet forwarding in FreeBSD?

2017-06-14 Thread Vincenzo Maffione
On 14 Jun 2017 5:21 PM, "Olivier Cochard-Labbé" wrote:

On Wed, Jun 14, 2017 at 4:48 PM, John Jasen  wrote:

>
> b) On the negative side, between the various releases, netmap appeared
> to be unstable with the Chelsio cards -- sometimes supported, sometimes
> broken. Also, we're still trying to figure out netmap utilities, such as
> vale-ctl and bridge, so any advice would be appreciated.
>

I confirm that mixing netmap and Chelsio has been broken on -current for about
6 months.
We can't start 2 netmap pkt-gen instances simultaneously, for example.

cf my report:
https://lists.freebsd.org/pipermail/svn-src-head/2016-December/094418.html


>
> b.1) netmap-fwd is admittedly single-threaded and does not support IPv6.
> These clearly showed in our tests, as we were unable to achieve over 2.5
> mpps, saturating a single CPU and letting the others fall asleep.
> However, bumping a single CPU queue from around 0.6 mpps to 2.5 mpps is
> nothing to ignore, so it could be useful in some cases.
>

Software using netmap is not easy to use:
- netmap-fwd (https://github.com/Netgate/netmap-fwd) has not been updated since
Dec 2015.
- And I can't manage to compile netmap-ipfw either (
https://github.com/luigirizzo/netmap-ipfw).


Yes, these two are unmaintained afaik.



> c) The routing improvement project USB stick performed incredibly,
> achieving 8.5 mpps out of the box. However, it appears
> (https://wiki.freebsd.org/ProjectsRoutingProposal/ConversionStatus),
> that many of the changes are still pending review, and that things have
> not moved much in the last 18 months
> (https://svnweb.freebsd.org/base/projects/routing/)
>

Yes, projects/routing still gives the best performance after 18 months,
but the maintainer hasn't had time to work on FreeBSD since.

To sum up: there are 3 alpha-stage but very promising projects, but
they seem stuck because there is not enough manpower to finish them.

Regards,

Re: state of packet forwarding in FreeBSD?

2017-06-14 Thread Vincenzo Maffione
Hi,
  To test netmap raw forwarding performance using just one core, use the
bridge program between two netmap supported NICs, like ix or ixl

# ./bridge ix0 ix1

You could implement your own multicore software router by extending the
bridge example to implement the protocols you need.

vale-ctl is a different story: you can use it to attach netmap-enabled
NICs to the VALE software bridge. See the VALE paper for details.

Cheers,
  Vincenzo

On 14 Jun 2017 4:48 PM, "John Jasen" wrote:

> Our goal was to test whether or not FreeBSD currently is viable, as the
> operating system platform for high speed routers and firewalls, in the
> 40 to 100 GbE range.
>
> In our investigations, we tested 10.3, 11.0/-STABLE, -CURRENT, and a USB
> stick from BSDRP using the FreeBSD routing improvements project
> enhancements (https://wiki.freebsd.org/ProjectsRoutingProposal).
>
> We've tried stock and netmap-fwd, have played around a little with
> netmap itself and dpdk, with the results summarized below. The current
> testing platform is a Dell PowerEdge R530 with a Chelsio T580-LP-CR dual
> port 40GbE card.
>
> Suggestions, examples for using netmap, etc, all warmly welcomed.
>
> Further questions cheerfully answered to the best of our abilities.
>
> a) On the positive side, it appears that 11.0 is much faster than 10.0,
> which we tested several years ago. With appropriate cpuset tuning, 5.5
> mpps is achievable using modern hardware. Using slightly older hardware,
> (such as a Dell R720 with v3 xeons), around 5.2-5.3 mpps can be obtained.
>
> b) On the negative side, between the various releases, netmap appeared
> to be unstable with the Chelsio cards -- sometimes supported, sometimes
> broken. Also, we're still trying to figure out netmap utilities, such as
> vale-ctl and bridge, so any advice would be appreciated.
>
> b.1) netmap-fwd is admittedly single-threaded and does not support IPv6.
> These clearly showed in our tests, as we were unable to achieve over 2.5
> mpps, saturating a single CPU and letting the others fall asleep.
> However, bumping a single CPU queue from around 0.6 mpps to 2.5 mpps is
> nothing to ignore, so it could be useful in some cases.
>
> c) The routing improvement project USB stick performed incredibly,
> achieving 8.5 mpps out of the box. However, it appears
> (https://wiki.freebsd.org/ProjectsRoutingProposal/ConversionStatus),
> that many of the changes are still pending review, and that things have
> not moved much in the last 18 months
> (https://svnweb.freebsd.org/base/projects/routing/)
>
> d) We've not figured out dpdk  (dpdk.org) yet. Our first foray into the
> test examples, and we're stuck trying to get the interfaces online.
>
> -- John Jasen (jja...@gmail.com)


Re: ovs-netmap forgotten?

2017-06-07 Thread Vincenzo Maffione
You can play with NIC interrupt coalescing settings to keep the interrupt
rate under control (I don't know how in FreeBSD, but in Linux you would do
that with e.g. "ethtool -C ethX rx-usecs 100").
As written in the last line of netmap(4), you must disable all the
offloads when playing with netmap physical ports.
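
As a rough guide for choosing the value: if the NIC waits at least rx-usecs microseconds between two RX interrupts, the per-queue interrupt rate is bounded by about one million divided by that value. The sketch below just does this arithmetic; it is an approximation (real NICs may use adaptive moderation), and the function names are made up for the example.

```python
def max_interrupt_rate(rx_usecs):
    """Upper bound on interrupts per second for a given rx-usecs setting."""
    if rx_usecs <= 0:
        raise ValueError("rx_usecs must be positive")
    return 1_000_000 / rx_usecs


def rx_usecs_for_rate(target_rate):
    """rx-usecs value that caps the rate at target_rate interrupts per second."""
    if target_rate <= 0:
        raise ValueError("target_rate must be positive")
    return 1_000_000 / target_rate
```

With rx-usecs 100 the bound is 10000 interrupts per second per queue, which is in the range of the ~10ki/s reported for the host-only copy test.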

By the way, there is a person working on adding some offloading support to
netmap, but this is still in progress and it is not even in the netmap
github yet.

Cheers,
  Vincenzo

2017-06-06 11:59 GMT+02:00 Harry Schmalzbauer <free...@omnilan.de>:

>  Regarding Harry Schmalzbauer's message of 05.06.2017 20:25 (localtime):
> > Regarding Vincenzo Maffione's message of 05.06.2017 16:06 (localtime):
> >
> …
> > First quick test shows you're right and this tiny diff solves a decent
> > share of my (ESXi-replacing) problems:
> >
> > --- src/sys/net/if_vlan.c.orig  2017-06-05 17:39:27.770574000 +0200
> > +++ src/sys/net/if_vlan.c   2017-06-05 17:39:21.550278000 +0200
> > @@ -1234,7 +1234,7 @@
> > if_inc_counter(ifv->ifv_ifp, IFCOUNTER_IPACKETS, 1);
> >
> > /* Pass it back through the parent's input routine. */
> > -   (*ifp->if_input)(ifv->ifv_ifp, m);
> > +   (*ifv->ifv_ifp->if_input)(ifv->ifv_ifp, m);
> >  }
> >
> >  static int
> >
> > Will do real-world tests tomorrow.
>
> To share my observations:
>
> Attaching if_vlan(4) to vale(4) works with the above modification, as
> long as vlanhwtag is _not_ disabled, at least with igb(4) and em(4).
> Having other offloadings enabled or disabled (regardless if it's on
> parent or vlan-clone) doesn't matter, disabling vlanhwtag on the parent
> leads to congested parent if there's more traffic than console... I
> haven't done any tracking if it's caused by TCP windows scaling e.g. nor
> tried to ask the code, because I do want vlanhwtagging enabled and
> that's what works so far :-)
> This is also true for if_vlan(4) interfaces which have if_lagg(4) as
> parent, and also for both types of vale(4) attaching, hoststack-detached
> (-a) or hoststack-attached (-h).
> So far very nice :-)
>
> But there's a interrupt multiplication noticeable (at the host).
>
> My simple NFS-copying test causes ~10ki/s at one igb(4) queue when
> invoked on the host, with mtu 1500.
> Same invocation in the guest, with vlan-vale setup, causes 30ki/s
> average (with high discrepancy, 20-40k).
>
> Might it be possible that if_vlan(4) influences interrupt moderation
> capabilities?
>
> Vincenzo, thanks for your answers to my questions, which I read during
> writing of this - stripping them here.
>
> -harry
>



-- 
Vincenzo Maffione

Re: panic after LOR, 2nd netmap_mem2.c, 1st vm_fault.c

2017-06-07 Thread Vincenzo Maffione
5 0x00080122813a in ?? ()
> …
> > Does anybody have a quick idea if it's easily fixable, or a complicated
> > issue, possibly caused due to MFC botch?
>
> Similar panic happens with stable/11 native netmap code, see
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219846
>
> Hope this can be fixed before 11.1-RELEASE?
>
> Thanks,
>
> -harry
>



-- 
Vincenzo Maffione

Re: ovs-netmap forgotten?

2017-06-06 Thread Vincenzo Maffione
319182 was available here too:
> ftp://ftp.omnilan.de/pub/FreeBSD/OmniLAN/misc/), bhyve(8) doesn't
> support ptnet yet.


ptnet driver is already in HEAD.
Support for bhyve is not yet in HEAD, but available here

https://github.com/vmaffione/freebsd/tree/ptnet-head

in the ptnet-head branch.



> Is there any specific reason why ptnetmap-memdev
> (https://svnweb.freebsd.org/socsvn/soc2016/vincenzo/head/
> usr.sbin/bhyve/pci_ptnetmap_netif.c)
> hasn't been commited to HEAD?
>

That's a very good question. bhyve code for ptnet has been ready for a
year, but I'm still waiting for the bhyve maintainers to commit it. I'll
raise the issue again at BSDCan over the week-end (
https://www.bsdcan.org/2017/schedule/events/814.en.html). I hope I'll find
people willing to commit this!



>
> Does anybody have an idea if there is any vmnet/vtnet companion (in
> development stage) providing offloading features, reducing interrupt
> wastings?
>
> Another question, better addressed to virtualization@ but I remember
> cross-posting is to avoid:
> I never tried to understand why vmx3f seems to work without using
> interrupts at all, as opposed to vmx(4), but maybe it is possible to do
> the same for vtnet(4)?
>

The only way to avoid interrupts altogether is busy waiting or polling,
but nobody does that for general-purpose networking because it either
wastes CPU or artificially increases latency.
So vmx* does use interrupts.
The way to optimize TCP performance between a VM and the external
physical network is to follow the QEMU virtio-net + vhost-net approach on
Linux (
http://blog.vmsplice.net/2011/09/qemu-internals-vhost-architecture.html),
which is similar to what ptnet does.
However, offloading support in if_tap is also needed (Linux does that).

Cheers,
  Vincenzo


> Thanks,
>
> -harry
>
>


-- 
Vincenzo Maffione

Re: ovs-netmap forgotten?

2017-06-05 Thread Vincenzo Maffione
Hi Harry,
  I've done some investigation on this issue (just for fun), and I think I
may have found the cause.

When using vlan interfaces, netmap uses the emulated adapter, as the "vlan"
driver is not netmap-enabled (and it cannot be).
To intercept RX packets, netmap replaces the "if_input" function pointer
field in the kernel "struct ifnet" (the struct representing a network
interface).
Note that you have an instance of "struct ifnet" for em0 (physical NIC),
and a different instance for each VLAN cloned interface (e.g. "vlan100") on
em0.
If you put vlan100 in netmap mode, netmap will replace the if_input of
vlan100, and not the if_input of em0. So far, this is an expected behaviour.

Unfortunately, I see in the code here

https://github.com/freebsd/freebsd/blob/master/sys/net/if_vlan.c#L1244-L1245

that when the VLAN driver intercepts an RX packet coming from the underlying
interface (e.g. em0 in our example), the em0 if_input is used rather than
the vlan100 if_input.

In terms of code, we have
  (*ifp->if_input)(ifv->ifv_ifp, m);
rather than
  (*ifv->ifv_ifp->if_input)(ifv->ifv_ifp, m);
Since the em0 if_input is not replaced, netmap does not intercept the packet
and you don't see it in your application, e.g.

# pkt-gen -i vlan100 -f rx

will see nothing.

Now, I think that normally ifv->ifv_ifp->if_input == ifp->if_input, which
may explain why the code is written like that (to avoid the additional
pointer dereference).
This does not hold with netmap, where ifv->ifv_ifp->if_input !=
ifp->if_input when exactly one of em0 and vlan100 is in netmap mode.
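
To make the dispatch problem concrete, here is a minimal userspace sketch
(not kernel code). The struct below is a hypothetical, stripped-down
stand-in for the kernel "struct ifnet", and stack_input, netmap_input and
netmap_hits_on_vlan are names invented for the illustration. It only models
which function pointer gets called in the two forms quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for the kernel's struct ifnet. */
struct ifnet {
	void (*if_input)(struct ifnet *, void *);
	int stack_hits;		/* packets delivered to the normal stack path */
	int netmap_hits;	/* packets delivered to the netmap hook */
};

/* Stand-in for the regular input routine of an interface. */
static void
stack_input(struct ifnet *ifp, void *m)
{
	(void)m;
	ifp->stack_hits++;
}

/* Stand-in for the handler netmap installs in emulated mode. */
static void
netmap_input(struct ifnet *ifp, void *m)
{
	(void)m;
	ifp->netmap_hits++;
}

/*
 * Put only the vlan interface in "netmap mode", deliver one packet,
 * and report how many packets the netmap hook saw.
 */
int
netmap_hits_on_vlan(int use_vlan_pointer)
{
	struct ifnet em0 = { stack_input, 0, 0 };
	struct ifnet vlan100 = { stack_input, 0, 0 };
	struct ifnet *ifp = &em0;		/* the parent interface */
	struct ifnet *ifv_ifp = &vlan100;	/* the vlan clone */

	/* netmap replaces the pointer of vlan100 only, not of em0 */
	vlan100.if_input = netmap_input;

	if (use_vlan_pointer)
		(*ifv_ifp->if_input)(ifv_ifp, NULL);	/* proposed form */
	else
		(*ifp->if_input)(ifv_ifp, NULL);	/* current form */

	/* exactly one of the two paths must have received the packet */
	assert(vlan100.stack_hits + vlan100.netmap_hits == 1);
	return vlan100.netmap_hits;
}
```

With the current form the packet goes through the parent's pointer and the
hook on vlan100 never fires; with the proposed form it does.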

You may try to recompile the kernel with that change and see if you can see
packets coming in on vlan100 with pkt-gen.
I recommend always testing with pkt-gen before trying to use
vale-ctl -a.

Cheers,
  Vincenzo


2017-06-01 9:45 GMT+02:00 Harry Schmalzbauer <free...@omnilan.de>:

> Regarding Vincenzo Maffione's message of 01.06.2017 00:39 (localtime):
> > Hi Harry,
> >   OVS integration with netmap is very patchy and Linux only. Most
> > importantly, it is not the right way to go, for a number of reasons.
> > The real way to integrate netmap into OVS would be to
> > follow the DPDK-OVS approach: this means implementing the switching
> > logic completely in userspace, in this case using the netmap API. This
> > has been neither implemented nor sketched so far.
> >
> > `vale-ctl -n valeXXX:YYY` just creates a persistent VALE port (YYY)
> > attached to the VALE switch XXX.
> > There is no difference with an ephemeral VALE port, apart from the fact
> > that the persistent one is visible with ifconfig.
> >
> > It does not really make sense to attach a VLAN interface to VALE, since
> > the VLAN driver does not have netmap support, so you lose all the
> > advantage of using netmap and VALE.
> > In your case the best solution I see is to write a custom netmap
> > application that forwards the packets between a netmap-supported NIC and
> > one or more VMs, doing the VLAN stripping in software.
>
> Thanks again, Vincenzo, for your highly appreciated support!
>
> I can only agree with your proposed solution.  The problem is, I don't
> speak any programming language well (besides sh maybe) and have absolutely
> no budget/time to work on my C skills (which I'd love to do) ;-)
>
> So ovs-netmap wasn't the right direction, but the difference between
> em0+if_bridge+vmnet|virtio-net+vtnet and vale:em0|vale:guest+vtnet is
> noticeable. I haven't done any measuring, but typical admin jobs via
> cli (ssh into the bhyve guest, with resources via NFS (v4)) behave
> completely differently – noticeably snappier to a human in the
> vale:guest case!
> I don't think this enormous efficiency advantage is solely caused by the
> em0-netmap/ring connection; more importantly, in the
> vale:em0|vale:guest+vtnet case, I gain excellent inter-VM efficiency
> (and much higher attainable performance of course, which is not crucial
> at the moment; but efficiency is!).
> Now replacing vale:em0 by vale:vlan0 will surely destroy one big
> efficiency advantage, but I still benefit from excellent inter-VM
> efficiency and most likely a small efficiency advantage over the
> if_bridge setup.
> Also, ptnet is a very interesting area of optimization which is easy to
> explore with the vale:vlan scenario.
>
> In another post I described that the vale:vlan path doesn't work, while
> vale:em0 (the parent) with everything else untouched does work.
> Do you think it's possible to fix the vale:vlan coupling without netmap
> experts setting up a test environment?
>
> Thanks,
>
> -harry
>
>


-- 
Vincenzo Maffione

Re: vale uplink via vlan-if [Was: Re: Are ./valte-ctl and ./bridge friends or competitors?]

2017-05-31 Thread Vincenzo Maffione
If VLAN interfaces do not work properly with VALE/netmap, that's a bug in the
emulated adapter code in FreeBSD.
But in any case the approach is not the right one, since using the emulated
netmap adapter with VALE will make you lose most of the advantage of using
netmap in the first place
(see my previous email).

Cheers,
  Vincenzo

2017-05-25 17:24 GMT+02:00 Harry Schmalzbauer <free...@omnilan.de>:

> Regarding Vincenzo Maffione's message of 21.03.2017 19:05 (localtime):
> > 2017-03-20 19:41 GMT+01:00 Harry Schmalzbauer <free...@omnilan.de>:
> >
> >> Regarding Vincenzo Maffione's message of 20.03.2017 12:50 (localtime):
> >> …
> >>>> So to summarize for newbies exploring netmap(4) world in combination
> >>>> with physical uplinks and virtual interfaces, it's important to do the
> >>>> following uplink NIC configuration (ifconfig(8)):
> >>>> -rxcsum -txcsum -rxcsum6 -txcsum6 -tso -lro promisc
> >>>>
> >>>
> >>> Exactly. This is mentioned at the very end of netmap(4):
> >>>
> >>> "netmap does not use features such as checksum offloading, TCP
> >>> segmentation offloading, encryption, VLAN encapsulation/decapsulation,
> >>> etc.  When using netmap to exchange packets with the host stack, make
> >>> sure to disable these features."
> >>>
> >>> But it is probably a good idea to add these example ifconfig
> >>> instructions somewhere (man page or at least the README in the netmap
> >>> repo).
> >>>
> >>>
> >>>>
> >>>> I guess vlanhwtag, vlanhwfilter and vlanhwtso don't interfere, do
> >>>> they?
> >>>>
> >>>
> >>> Well, I think they interfere: if you receive a tagged packet and the
> >>> NIC strips the tag and puts it in the packet descriptor, then with
> >>> netmap you will see the untagged packet, and you wouldn't have a way
> >>> to see the tag.
>
> Sorry for picking this up again, but I'm stuck getting vale(4) productive :-(
>
> I took lagg(4) out of the game and configured my desired setup using
> if_bridge(4) at first.
>
> The physical uplink NIC is em0.
> The bridge/vale uplink is em0.232.
>
> hostB --switch-- em0-hostA
>   |
>   '- em0.232
>   |
> bridge5-vmnet0
>   |
>  vtnet0-GUESTa <-tcpdump:
>
> 17:07:28.423768 00:a0:98:73:9f:42 > ff:ff:ff:ff:ff:ff, ethertype ARP
> (0x0806), length 42: Request who-has 172.21.34.10 tell 172.21.35.1,
> length 28
> 17:07:28.424208 00:0c:29:40:3a:dd > 00:a0:98:73:9f:42, ethertype ARP
> (0x0806), length 60: Reply 172.21.34.10 is-at 00:0c:29:40:3a:dd, length 46
>
> The same is visible on vmnet0 and em0 of course.
>
> Now if I replace bridge5 by vale, leaving everything else unchanged
> besides using netmap-vtnet with bhyve, I don't get the ARP is-at answer.
> I can see the who-has on all interfaces involved, and also the is-at
> answer up to em0.232, but not at vtnet0 (the guest, connected via vale).
>
> To draw the same picture as with the bridge:
>
> hostB --switch-- em0-hostA
>   |
>   '- em0.232
>   |
>   vale232:em0.232-'
>   vale232:GUESTa--vtnet0-GUESTa <-tcpdump:
>
> 17:16:00.111868 00:a0:98:73:9f:42 > ff:ff:ff:ff:ff:ff, ethertype ARP
> (0x0806), length 42: Request who-has 172.21.34.10 tell 172.21.35.1,
> length 28
> ... no reply
>
> While tcpdump of em0.232 shows:
> 17:16:01.119537 00:a0:98:73:9f:42 > ff:ff:ff:ff:ff:ff, ethertype ARP
> (0x0806), length 60: Request who-has 172.21.34.10 tell 172.21.35.1,
> length 46
> 17:16:01.119849 00:0c:29:40:3a:dd > 00:a0:98:73:9f:42, ethertype ARP
> (0x0806), length 60: Reply 172.21.34.10 is-at 00:0c:29:40:3a:dd, length 46
>
> The reply made it up to vale's uplink, but not through vale.  Am I
> missing something?
> Tagging, checksum-disabling etc. seems to be right, since utilizing
> if_bridge(4) gives the expected result, but I have no idea why I can't
> get packets via vale(4).
>
> Important note: Using the parent of em0.232 (vlandev em0) as the vale
> uplink does work!
> So I guess if_em(4)'s native netmap support interferes with the vlan clone.
> I'm out at this point, with far too little knowledge of the code paths...
> Can anybody else help here?
>
> Thanks,
>
> -harry
>



-- 
Vincenzo Maffione

Re: ovs-netmap forgotten?

2017-05-31 Thread Vincenzo Maffione
Hi Harry,
  OVS integration with netmap is very patchy and Linux only. Most
importantly, it is not the right way to go, for a number of reasons.
The real way to integrate netmap into OVS would be to follow
the DPDK-OVS approach: this means implementing the switching logic
completely in userspace, in this case using the netmap API. This has been
neither implemented nor sketched so far.

`vale-ctl -n valeXXX:YYY` just creates a persistent VALE port (YYY)
attached to the VALE switch XXX.
There is no difference with an ephemeral VALE port, apart from the fact that
the persistent one is visible with ifconfig.

It does not really make sense to attach a VLAN interface to VALE, since the
VLAN driver does not have netmap support, so you lose all the advantage of
using netmap and VALE.
In your case the best solution I see is to write a custom netmap
application that forwards the packets between a netmap-supported NIC and
one or more VMs, doing the VLAN stripping in software.
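
As a rough idea of what "VLAN stripping in software" could look like, here
is a minimal sketch that removes an 802.1Q tag from an Ethernet frame held
in a plain buffer. The function name vlan_strip and the constants are
invented for the illustration; the netmap part (walking the RX ring,
updating each slot's length, forwarding to the VM ports) is deliberately
left out, so this is only the per-frame step, not a forwarding application.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDRS_LEN	12	/* destination MAC + source MAC */
#define VLAN_TAG_LEN	4	/* TPID (2 bytes) + TCI (2 bytes) */
#define ETHERTYPE_8021Q	0x8100

/*
 * If the frame carries an 802.1Q tag, remove it in place.
 * Returns the new frame length. *vid receives the 12-bit VLAN ID,
 * or -1 if the frame was untagged (in which case it is left untouched).
 */
size_t
vlan_strip(uint8_t *frame, size_t len, int *vid)
{
	uint16_t tpid, tci;

	assert(frame != NULL && vid != NULL);
	*vid = -1;

	/* need both MACs, the tag, and the inner EtherType */
	if (len < ETH_ADDRS_LEN + VLAN_TAG_LEN + 2)
		return len;

	tpid = (uint16_t)((frame[12] << 8) | frame[13]);
	if (tpid != ETHERTYPE_8021Q)
		return len;

	tci = (uint16_t)((frame[14] << 8) | frame[15]);
	*vid = tci & 0x0fff;	/* low 12 bits of the TCI are the VLAN ID */

	/* slide the inner EtherType and payload over the 4-byte tag */
	memmove(frame + ETH_ADDRS_LEN,
	    frame + ETH_ADDRS_LEN + VLAN_TAG_LEN,
	    len - ETH_ADDRS_LEN - VLAN_TAG_LEN);
	return len - VLAN_TAG_LEN;
}
```

The returned VLAN ID is what such an application would use to pick the
destination VM; this only works if the NIC leaves the tag in the frame
(i.e. hardware VLAN tag stripping is disabled).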

Cheers,
  Vincenzo

2017-05-31 21:59 GMT+02:00 Harry Schmalzbauer <free...@omnilan.de>:

> Regarding Harry Schmalzbauer's message of 25.05.2017 18:01 (localtime):
> >  Hello,
> >
> > I found lots of interesting papers about research and improvements
> > regarding Open vSwitch and netmap (on FreeBSD, e.g.
> > http://changeofelia.info.ucl.ac.be/pmwiki/uploads/SummerSchool/Program/poster_001.pdf)
> >
> > Again, the University of Pisa, with a famous team around Luigi Rizzo, did
> > some highly appreciated coding and presentation; the paper in the link
> > is from Gaetano Catalli (cc'd).
> >
> > But it seems that this work got lost in space...
> > openvswitch in ports is quite an old codebase without any
> > netmap integration, and a provided patch isn't in our netmap tree:
> > https://github.com/cnplab/ovs-netmap/blob/master/0001-datapath-Add-support-for-netmap-VALE.patch
> >
> > So I guess nobody uses ports/net/openvswitch these days anymore.
> >
> > I also found that a FreeBSD kernel module was written back in 2014. But
> > that also seems to have got lost, most likely due to the ovs-netmap
> > replacement?
> >
> > Thanks for any hints,
>
> I made a little progress answering my questions myself and still
> appreciate any hints, hopefully providing some hints for others getting
> into OVS/netmap.
>
> I found the ovs-netmap work; it's part of the standard netmap github
> repository
> (https://github.com/luigirizzo/netmap) under utils/switch-modules.
>
> In case somebody is interested, I prepared a very rough net/openvswitch
> port, updated to 2.6.1 and extended by the ovs-netmap patch. It compiles
> and runs, but the result isn't fully tested yet, nor is the port itself
> really correct (I haven't fired up portlint) or complete (I wanted to
> integrate DPDK too...)
>
> Unfortunately it doesn't seem to be what I expected (a
> datapath/dataplane implementation).
> It seems that on FreeBSD we still have to run Open vSwitch in userspace
> (which is described in INSTALL.userspace.md – all other documentation
> seems to assume you're running Linux and have the kernel module loaded).
>
> The description in INSTALL.NETMAP, part of the mentioned patch, raises
> more questions than answers for me, most likely due to my still
> considerable lack of knowledge.
>
> I don't get the idea of 'vale-ctl -n', which provides a netmap-only(?)
> interface, but even if I imagine one side of the interface connecting to
> an arbitrary netmap app, I have no idea what the other side connects to.
>
> And that's my problem with the Open vSwitch integration. I can access an
> ovs bridge via a netmap app. But this doesn't improve the severe OVS
> limitations when running on FreeBSD (doing frame copies in userspace). Hope
> someone can confirm I'm wrong?
>
> Thanks,
>
> -harry
>
> P.S.: As partly mentioned in multiple other questions I posted
> recently, I need a way to fire up bhyve with '-s
> 2,virtio-net,vale0:guestport', while vale0 only distributes frames
> filtered by 802.1Q tag. I don't care much where the filtering happens,
> but there's not much choice.
> Unfortunately, plugging a vlan(4) clone into vale doesn't work; frames
> never make it into vale (while using the parent shows results as
> expected, so the basic setup should be ok).
> The next idea was to utilize Open vSwitch, but that's the most painful
> solution to get vlan tags filtered (which the NIC itself could do)... I
> hoped vale(4) could be utilized by Open vSwitch, but it seems that the
> ovs-netmap approach is a completely different thing...




-- 
Vincenzo Maffione

Re: [panic] netmap(4) and if_lagg(4)

2017-05-28 Thread Vincenzo Maffione
Hi Harry,


2017-05-26 14:22 GMT+02:00 Harry Schmalzbauer <free...@omnilan.de>:

> Regarding Vincenzo Maffione's message of 26.05.2017 11:06 (localtime):
> > Yes, it should integrate and compile out of the box, I've done that
> > several times with FreeBSD-11.0 and 10.3.
>
> Impressive, it needed just a small addition to sys/conf/files to make
> the linker happy :-)
>

Exactly!


>
> I also recompiled vale-ctl, but I get the following error when trying to
> add em0 (lagg-unrelated):
> 389.083433 [ 835] netmap_obj_malloc netmap_ring request size 65792 too large
> 389.091593 [1693] netmap_mem2_rings_create  Cannot allocate RX_ring
>

It means that you are giving too many slots to each RX ring, probably 4096.
Try to use fewer slots, or increase the dev.netmap.ring_size sysctl
(the default value is less than 65792).
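
For what it's worth, the 65792 figure is consistent with 4096 slots if one
assumes 16 bytes per slot plus a 256-byte ring header. Those two sizes are
assumptions picked because they reproduce the number in the error message,
not values quoted from the netmap headers, so treat the sketch below as
back-of-the-envelope arithmetic only. The function names are invented.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes (not taken from this thread or verified against headers). */
#define ASSUMED_RING_HDR_BYTES	256	/* fixed part of a ring object */
#define ASSUMED_SLOT_BYTES	16	/* one slot descriptor */

/* Bytes requested from the ring allocator for a ring with nslots slots. */
size_t
ring_request_bytes(size_t nslots)
{
	return ASSUMED_RING_HDR_BYTES + nslots * ASSUMED_SLOT_BYTES;
}

/* Largest slot count that fits in a given dev.netmap.ring_size value. */
size_t
max_slots_for(size_t ring_size)
{
	assert(ring_size >= ASSUMED_RING_HDR_BYTES);
	return (ring_size - ASSUMED_RING_HDR_BYTES) / ASSUMED_SLOT_BYTES;
}
```

Under these assumptions a ring of 4096 slots needs exactly 65792 bytes, so
dev.netmap.ring_size would have to be at least that large; a ring_size of,
say, 36864 would only leave room for 2288 slots. Check the actual value on
your system with sysctl before relying on these numbers.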


>
> Adding lagg still results in a panic, but with your latest
> "return()"-patch, it's different:
>
> 0x8042aefb is in freebsd_generic_rx_handler
> (/usr/local/share/deploy-tools/RELENG_11/src/sys/dev/netmap/netmap_freebsd.c:276).
> 271 struct netmap_generic_adapter *gna =
> 272 (struct netmap_generic_adapter *)NA(ifp);
> 273 int stolen = generic_rx_handler(ifp, m);
> 274
> 275 if (!stolen) {
> 276 gna->save_if_input(ifp, m);
> 277 }
> 278 }
> 279
> 280 /*
>
> KDB: stack backtrace:
> #0 0x805e4a17 at kdb_backtrace+0x67
> #1 0x805a34b6 at vpanic+0x186
> #2 0x805a3323 at panic+0x43
> #3 0x808a49b2 at trap_fatal+0x322
> #4 0x808a4a09 at trap_pfault+0x49
> #5 0x808a4246 at trap+0x286
> #6 0x8088a521 at calltrap+0x8
> #7 0x806aa3e0 at vlan_input+0x1f0
> #8 0x8069b298 at ether_demux+0x128
> #9 0x8069bf3b at ether_nh_input+0x31b
> #10 0x806b7c00 at netisr_dispatch_src+0xa0
> #11 0x8069b546 at ether_input+0x26
> #12 0x803a0278 at igb_rxeof+0x738
> #13 0x8039f63f at igb_msix_que+0x10f
> #14 0x8056ae1c at intr_event_execute_handlers+0xec
> #15 0x8056b106 at ithread_loop+0xd6
> #16 0x80568475 at fork_exit+0x85
> #17 0x8088aa5e at fork_trampoline+0xe
>
>
Yeah, the same bug just slipped back. Discard the previous patch and
replace it with the attached one.

Cheers,
  Vincenzo


> Any hints on how I can add em0 (lagg- and vlan-less) via vale-ctl
> again? (netmap_obj_malloc netmap_ring request size 65792 too large)
>
> Thanks,
>
> -harry
>
>
>


-- 
Vincenzo Maffione
diff --git a/sys/dev/netmap/netmap_freebsd.c b/sys/dev/netmap/netmap_freebsd.c
index 66d0637f..fa6427e0 100644
--- a/sys/dev/netmap/netmap_freebsd.c
+++ b/sys/dev/netmap/netmap_freebsd.c
@@ -270,8 +270,13 @@ freebsd_generic_rx_handler(struct ifnet *ifp, struct mbuf *m)
 {
 	struct netmap_generic_adapter *gna =
 			(struct netmap_generic_adapter *)NA(ifp);
-	int stolen = generic_rx_handler(ifp, m);
+	int stolen;
 
+	if (!NM_NA_VALID(ifp)) {
+		return;
+	}
+
+	stolen = generic_rx_handler(ifp, m);
 	if (!stolen) {
 		gna->save_if_input(ifp, m);
 	}

Re: [panic] netmap(4) and if_lagg(4)

2017-05-26 Thread Vincenzo Maffione
Yes, it should integrate and compile out of the box, I've done that several
times with FreeBSD-11.0 and 10.3.
And yes, HEAD contains recent code, you can also use that.

Cheers,
  Vincenzo

2017-05-26 11:01 GMT+02:00 Harry Schmalzbauer <free...@omnilan.de>:

> Regarding Vincenzo Maffione's message of 26.05.2017 10:41 (localtime):
> > Ok, so you should try to completely replace the code in your
> > /usr/src/sys with the code in the upstream netmap
> > repository https://github.com/luigirizzo/netmap (sys directory).
>
> Sorry for being so complicated, but is there a real chance this integrates
> and compiles well out of the box?
> My build machine (which is not the test machine) doesn't have internet
> access and additionally I'm not familiar with git, so I'd have to find a
> way of getting the source first.  And if I need to adapt/port anything
> to match stable/11, I won't be able to do so (far too little knowledge
> about all that code).
>
> I guess HEAD already has an updated codebase.
> Maybe it's better to start with HEAD?
> But on the other hand, my test equipment is semi-productive with a quite
> special setup (memory-rootfs with additional ZFS boot and sys-pool)...
> Very complicated I am...
>
> Is the stable/11 attempt worth it?
>
> thanks,
>
> -harry
>
>
>


-- 
Vincenzo Maffione

Re: [panic] netmap(4) and if_lagg(4)

2017-05-26 Thread Vincenzo Maffione
  ^
> /usr/local/share/deploy-tools/RELENG_11/src/sys/dev/netmap/
> netmap_kern.h:90:25:
> note: expanded from macro 'WNA'
> #define WNA(_ifp)   (_ifp)->if_netmap
>   ^
> /usr/local/share/deploy-tools/RELENG_11/src/sys/dev/netmap/
> netmap_kern.h:1290:24:
> note: to match this '('
> /usr/local/share/deploy-tools/RELENG_11/src/sys/dev/netmap/
> netmap_kern.h:1254:18:
> note: expanded from macro 'NA'
> #define NA(_ifp)((struct netmap_adapter *)WNA(_ifp))
>
> thanks,
>
> -harry
>



-- 
Vincenzo Maffione

Re: [panic] netmap(4) and if_lagg(4)

2017-05-26 Thread Vincenzo Maffione
Is lagg0 the only interface attached to vale0?
Is lagg0 aggregating a VLAN interface?

You can try this trivial patch

diff --git a/sys/dev/netmap/netmap_generic.c b/sys/dev/netmap/netmap_generic.c
index f148b228..46a3c2c6 100644
--- a/sys/dev/netmap/netmap_generic.c
+++ b/sys/dev/netmap/netmap_generic.c
@@ -950,6 +950,10 @@ generic_rx_handler(struct ifnet *ifp, struct mbuf *m)
 	u_int work_done;
 	u_int r = MBUF_RXQ(m); /* receive ring number */
 
+	if (!NM_NA_VALID(ifp)) {
+		return 0;
+	}
+
 	if (r >= na->num_rx_rings) {
 		r = r % na->num_rx_rings;
 	}



2017-05-26 9:21 GMT+02:00 Harry Schmalzbauer <free...@omnilan.de>:

> Regarding Vincenzo Maffione's message of 26.05.2017 09:14 (localtime):
> > Hi,
> >   Your stack trace reports this:
> >
> > #7 0x8069dc50 at vlan_input+0x1f0
> >
> > which means VLANs are involved, in some way. Is that the correct trace?
>
> The trace is from the panic after doing 'vale-ctl -a vale0:lagg0' (while
> lagg0 can have various names, but I'm not using a vlan clone).
>
> It might be that existing, but to my understanding uninvolved, vlan clones
> interfere here...
> The lagg0 does have (lots of) vlan clones defined.
> Unfortunately I can't take them out of the game for testing...
>
> Does that picture match the trace?
>
> thanks,
>
> -harry
>
>
>


-- 
Vincenzo Maffione
