Re: Enable to send packets on if_loop via bpf
On Tue, Nov 22, 2022 at 8:25 PM Ryota Ozaki wrote:
>
> On Tue, Nov 22, 2022 at 8:00 PM Ryota Ozaki wrote:
> > I think introducing DLT_AF is a bit of a tough task because DLT_* definitions
> > are managed by us.
>   ^ are NOT managed, I meant to say...
>
>   ozaki-r

https://www.netbsd.org/~ozaki-r/loop-bpf2.patch

Anyway, this is the latest patch. It has been adjusted to make sure
input validation is applied (pointed out by ryo@).

  ozaki-r
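For readers following along, here is a minimal userland sketch of what a sender would have to build for the if_loop case discussed in this thread: a 32-bit protocol family in host byte order followed by the payload. The function name build_null_frame is hypothetical and does not come from the patch; the sketch only assumes the DLT_NULL layout described in the thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Build a DLT_NULL-style frame: a 32-bit protocol family in host byte
 * order, followed by the payload.  Returns the total frame length, or 0
 * if the output buffer is too small.  (Hypothetical helper, not part of
 * the patch.)
 */
static size_t
build_null_frame(uint32_t af, const void *payload, size_t plen,
    uint8_t *out, size_t outsz)
{
	/* Reject buffers that cannot hold the header plus the payload. */
	if (outsz < sizeof(af) || plen > outsz - sizeof(af))
		return 0;
	memcpy(out, &af, sizeof(af));		/* host order, no htonl() */
	memcpy(out + sizeof(af), payload, plen);
	return sizeof(af) + plen;
}
```

On a kernel with the patch applied, the resulting buffer would presumably be passed to write(2) on a bpf descriptor bound to the loopback interface with BIOCSETIF. That step needs privileges and a NetBSD system, so it is left out here.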
Re: Enable to send packets on if_loop via bpf
On Tue, Nov 22, 2022 at 8:00 PM Ryota Ozaki wrote:
> I think introducing DLT_AF is a bit of a tough task because DLT_* definitions
> are managed by us.
      ^ are NOT managed, I meant to say...

  ozaki-r
Re: Enable to send packets on if_loop via bpf
On Tue, Nov 22, 2022 at 12:49 AM Greg Troxel wrote:
>
> Ryota Ozaki writes:
>
> > In the specification, DLT_NULL assumes a protocol family in host
> > byte order followed by a payload. Interfaces of DLT_NULL use
> > bpf_mtap_af to pass an mbuf with a protocol family prepended. All
> > interfaces follow the spec and work well.
> >
> > OTOH, bpf_write to interfaces of DLT_NULL is a bit of a sad situation.
> > Data written to an interface of DLT_NULL is treated as raw data
> > (I don't know why); the data is passed to the interface's output routine
> > as is with dst (sa_family=AF_UNSPEC). tun seems to be able
> > to handle such raw data but the others can't handle it (the data
> > will probably be dropped, as with if_loop).
>
> Summarizing and commenting to make sure I'm not confused
>
>   on receive/read, DLT_NULL prepends AF in host byte order
>   on transmit/write, it just sends with AF_UNSPEC
>
> This seems broken as it is asymmetric, and is bad because it throws
> away information that is hard to reliably recreate. On the other hand
> this is for link-layer formats, and it seems that some interfaces have
> an AF that is not really part of what is transmitted, even though
> really it is. For example tun is using an IP proto byte to specify AF
> and really this is part of the link protocol. Except we pretend it
> isn't.

I found the following sentence in bpf.4:

     A packet can be sent out on the network by writing to a bpf file
     descriptor.  The writes are unbuffered, meaning only one packet can be
     processed per write.  Currently, only writes to Ethernets and SLIP links
     are supported.

So bpf_write to interfaces of DLT_NULL may be simply unsupported on
NetBSD...

> > Correcting bpf_write to assume a prepended protocol family will
> > save some interfaces like gif and gre but won't save others like stf
> > and wg. Even worse, the change may break existing users of tun
> > that want to treat data as is (though I don't know if such users exist).
> >
> > BTW, prepending a protocol family on tun is a different protocol from
> > DLT_NULL of bpf. tun has three protocol modes and doesn't always prepend
> > a protocol family. (Also, network byte order is used on tun,
> > as gert says, while DLT_NULL assumes host byte order.)
>
> wow.
>
> > So my fix will:
> > - keep DLT_NULL for if_loop so as not to break bpf_mtap_af, and
> > - leave DLT_NULL handling in bpf_write unchanged except for if_loop,
> >   so as not to bother existing users.
> > The patch looks like this:
> >
> > @@ -447,6 +448,14 @@ bpf_movein(struct uio *uio, int linktype, uint64_t mtu, struct mbuf **mp,
> >  		m0->m_len -= hlen;
> >  	}
> >
> > +	if (linktype == DLT_NULL && ifp->if_type == IFT_LOOP) {
> > +		uint32_t af;
> > +		memcpy(&af, mtod(m0, void *), sizeof(af));
> > +		sockp->sa_family = af;
> > +		m0->m_data += sizeof(af);
> > +		m0->m_len -= sizeof(af);
> > +	}
> > +
> >  	*mp = m0;
> >  	return (0);
>
> That seems ok to me.

Thanks.

> I think the long-term right fix is to define DLT_AF which has an AF word
> in host order on receive and transmit always, and to modify interfaces
> to use it whenever they are AF aware at all. In this case tun would
> fill in the AF word from the IP proto field, and you'd get a
> transformed/regularized AF word when really the "link layer packet" had
> the IP proto field. But that's ok as it's just cleanup and reversible.

I think introducing DLT_AF is a bit of a tough task because DLT_* definitions
are managed by us.

  ozaki-r
Re: Enable to send packets on if_loop via bpf
Ryota Ozaki writes:

> In the specification, DLT_NULL assumes a protocol family in host
> byte order followed by a payload. Interfaces of DLT_NULL use
> bpf_mtap_af to pass an mbuf with a protocol family prepended. All
> interfaces follow the spec and work well.
>
> OTOH, bpf_write to interfaces of DLT_NULL is a bit of a sad situation.
> Data written to an interface of DLT_NULL is treated as raw data
> (I don't know why); the data is passed to the interface's output routine
> as is with dst (sa_family=AF_UNSPEC). tun seems to be able
> to handle such raw data but the others can't handle it (the data
> will probably be dropped, as with if_loop).

Summarizing and commenting to make sure I'm not confused

  on receive/read, DLT_NULL prepends AF in host byte order
  on transmit/write, it just sends with AF_UNSPEC

This seems broken as it is asymmetric, and is bad because it throws
away information that is hard to reliably recreate. On the other hand
this is for link-layer formats, and it seems that some interfaces have
an AF that is not really part of what is transmitted, even though
really it is. For example tun is using an IP proto byte to specify AF
and really this is part of the link protocol. Except we pretend it
isn't.

> Correcting bpf_write to assume a prepended protocol family will
> save some interfaces like gif and gre but won't save others like stf
> and wg. Even worse, the change may break existing users of tun
> that want to treat data as is (though I don't know if such users exist).
>
> BTW, prepending a protocol family on tun is a different protocol from
> DLT_NULL of bpf. tun has three protocol modes and doesn't always prepend
> a protocol family. (Also, network byte order is used on tun,
> as gert says, while DLT_NULL assumes host byte order.)

wow.

> So my fix will:
> - keep DLT_NULL for if_loop so as not to break bpf_mtap_af, and
> - leave DLT_NULL handling in bpf_write unchanged except for if_loop,
>   so as not to bother existing users.
> The patch looks like this:
>
> @@ -447,6 +448,14 @@ bpf_movein(struct uio *uio, int linktype, uint64_t mtu, struct mbuf **mp,
>  		m0->m_len -= hlen;
>  	}
>
> +	if (linktype == DLT_NULL && ifp->if_type == IFT_LOOP) {
> +		uint32_t af;
> +		memcpy(&af, mtod(m0, void *), sizeof(af));
> +		sockp->sa_family = af;
> +		m0->m_data += sizeof(af);
> +		m0->m_len -= sizeof(af);
> +	}
> +
>  	*mp = m0;
>  	return (0);

That seems ok to me.

I think the long-term right fix is to define DLT_AF which has an AF word
in host order on receive and transmit always, and to modify interfaces
to use it whenever they are AF aware at all. In this case tun would
fill in the AF word from the IP proto field, and you'd get a
transformed/regularized AF word when really the "link layer packet" had
the IP proto field. But that's ok as it's just cleanup and reversible.
Re: Enable to send packets on if_loop via bpf
On Wed, Nov 9, 2022 at 9:21 PM Greg Troxel wrote:
>
> Ryota Ozaki writes:
>
> > NetBSD can't do this because a loopback interface
> > registers itself to bpf as DLT_NULL and bpf treats
> > packets being sent over the interface as AF_UNSPEC.
> > Packets of AF_UNSPEC are just dropped by loopback
> > interfaces.
> >
> > FreeBSD and OpenBSD make this possible by letting users
> > prepend a protocol family to the data being sent. bpf (or if_loop)
> > extracts it and handles the packet as belonging to the extracted
> > protocol family. The following patch follows them (the implementation
> > is inspired by OpenBSD).
> >
> >   http://www.netbsd.org/~ozaki-r/loop-bpf.patch
> >
> > The patch changes if_loop to register itself to bpf
> > as DLT_LOOP and bpf to handle a prepended protocol
> > family on bpf_write if the sending interface is DLT_LOOP.
>
> I am surprised that there is not already a DLT_foo that already has this
> concept, an AF word followed by data. But I guess every interface
> already has a more-specific format.
>
> Looking at if_tun.c, I see DLT_NULL. This should have the same ability
> to write. I have forgotten the details of how tun encodes AF when
> transmitting, but I know you can have v4 or v6 inside, and tcpdump works
> now. So obviously I must be missing something.
>
> My suggestion is to look at the rest of the drivers that register
> DLT_NULL and see if they are amenable to the same fix, and choose a new
> DLT_SOMETHING that accommodates the broader situation.
>
> I am not demanding that you add features to the rest of the drivers. I
> am only asking that you think about the architectural issue of how the
> rest of them would be updated, so we don't end up with DLT_LOOP,
> DLT_TUN, and so on, where they all do almost the same thing, when they
> could be the same.
>
> I don't really have an opinion on host vs network for AF, but I think
> your choice of aligning with FreeBSD is reasonable.

Thank you for your suggestion and I'm sorry for my late reply.

I've investigated the DLT specification(*), the users of DLT_NULL
including tun, and the implementation of bpf and others.

(*) https://www.tcpdump.org/linktypes.html

First, my patch was wrong because DLT_LOOP assumes a protocol family
in network byte order. So prepending a protocol family in host byte
order was wrong, and changing to DLT_LOOP also broke mbuf tapping on
if_loop (i.e., tcpdump).

In the specification, DLT_NULL assumes a protocol family in host
byte order followed by a payload. Interfaces of DLT_NULL use
bpf_mtap_af to pass an mbuf with a protocol family prepended. All
interfaces follow the spec and work well.

OTOH, bpf_write to interfaces of DLT_NULL is a bit of a sad situation.
Data written to an interface of DLT_NULL is treated as raw data
(I don't know why); the data is passed to the interface's output routine
as is with dst (sa_family=AF_UNSPEC). tun seems to be able
to handle such raw data but the others can't handle it (the data
will probably be dropped, as with if_loop).

Correcting bpf_write to assume a prepended protocol family will
save some interfaces like gif and gre but won't save others like stf
and wg. Even worse, the change may break existing users of tun
that want to treat data as is (though I don't know if such users exist).

BTW, prepending a protocol family on tun is a different protocol from
DLT_NULL of bpf. tun has three protocol modes and doesn't always prepend
a protocol family. (Also, network byte order is used on tun,
as gert says, while DLT_NULL assumes host byte order.)

So my fix will:
- keep DLT_NULL for if_loop so as not to break bpf_mtap_af, and
- leave DLT_NULL handling in bpf_write unchanged except for if_loop,
  so as not to bother existing users.

The patch looks like this:

@@ -447,6 +448,14 @@ bpf_movein(struct uio *uio, int linktype, uint64_t mtu, struct mbuf **mp,
 		m0->m_len -= hlen;
 	}

+	if (linktype == DLT_NULL && ifp->if_type == IFT_LOOP) {
+		uint32_t af;
+		memcpy(&af, mtod(m0, void *), sizeof(af));
+		sockp->sa_family = af;
+		m0->m_data += sizeof(af);
+		m0->m_len -= sizeof(af);
+	}
+
 	*mp = m0;
 	return (0);

If we want to support another interface, we can add it to the condition.

Any comments?

  ozaki-r
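The header stripping done by the hunk above can be tried outside the kernel with a small stand-in. This is only a sketch: struct pkt and strip_null_header are hypothetical names that imitate the m_data/m_len handling, and the length check is an addition that the posted hunk does not contain.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userland stand-in for the mbuf fields the hunk touches. */
struct pkt {
	const uint8_t *data;	/* plays the role of m0->m_data */
	size_t len;		/* plays the role of m0->m_len */
};

/*
 * Strip a DLT_NULL header (a 32-bit protocol family in host byte order)
 * from the front of the packet and return the family through *af_out.
 * Returns 0 on success, -1 if the buffer cannot hold the header.
 */
static int
strip_null_header(struct pkt *p, uint32_t *af_out)
{
	uint32_t af;

	/* Assumed validation; not part of the hunk as posted. */
	if (p->len < sizeof(af))
		return -1;
	/* memcpy instead of a cast, so unaligned data is not a problem. */
	memcpy(&af, p->data, sizeof(af));
	*af_out = af;
	p->data += sizeof(af);
	p->len -= sizeof(af);
	return 0;
}
```

After a successful call, the family would be what ends up in sockp->sa_family and the remaining bytes are the packet handed to the output routine. A short buffer is left untouched.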
Re: Enable to send packets on if_loop via bpf
	hello.  Just as a matter of consistency, it seems like using network
byte order would be a better choice, since it would match other interfaces
on the system.

-thanks
-Brian
Re: Enable to send packets on if_loop via bpf
Hi,

On Wed, Nov 09, 2022 at 07:21:44AM -0500, Greg Troxel wrote:
> Looking at if_tun.c, I see DLT_NULL. This should have the same ability
> to write. I have forgotten the details of how tun encodes AF when
> transmitting, but I know you can have v4 or v6 inside, and tcpdump works
> now. So obviously I must be missing something.

On the user side of if_tun (not bpf), a 4-byte value consisting of
htonl(AF_INET6) / htonl(AF_INET) is prepended before the actual packet.

gert
-- 
"If was one thing all people took for granted, was conviction that if you
 feed honest figures into a computer, honest figures come out. Never doubted
 it myself till I met a computer with a sense of humor."
                             Robert A. Heinlein, The Moon is a Harsh Mistress

Gert Doering - Munich, Germany                             g...@greenie.muc.de
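To make the byte-order difference concrete: htonl() returns a 32-bit value, so the tun-style header is four bytes with the family in network byte order, whereas a DLT_NULL header holds the same 32-bit family in host byte order. The helper names below are hypothetical and exist only for this sketch.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* tun-style header (also what DLT_LOOP expects): network byte order. */
static void
encode_af_network(uint32_t af, uint8_t out[4])
{
	uint32_t v = htonl(af);

	memcpy(out, &v, sizeof(v));
}

/* DLT_NULL-style header: host byte order, copied as is. */
static void
encode_af_host(uint32_t af, uint8_t out[4])
{
	memcpy(out, &af, sizeof(af));
}

/* Read a network-order header back into a host-order family. */
static uint32_t
decode_af_network(const uint8_t in[4])
{
	uint32_t v;

	memcpy(&v, in, sizeof(v));
	return ntohl(v);
}
```

On a little-endian machine the two encodings of AF_INET differ (the family ends up in the last byte for network order and in the first byte for host order), which is why a header written for one format cannot simply be fed to the other.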
Re: Enable to send packets on if_loop via bpf
Ryota Ozaki writes:

> NetBSD can't do this because a loopback interface
> registers itself to bpf as DLT_NULL and bpf treats
> packets being sent over the interface as AF_UNSPEC.
> Packets of AF_UNSPEC are just dropped by loopback
> interfaces.
>
> FreeBSD and OpenBSD make this possible by letting users
> prepend a protocol family to the data being sent. bpf (or if_loop)
> extracts it and handles the packet as belonging to the extracted
> protocol family. The following patch follows them (the implementation
> is inspired by OpenBSD).
>
>   http://www.netbsd.org/~ozaki-r/loop-bpf.patch
>
> The patch changes if_loop to register itself to bpf
> as DLT_LOOP and bpf to handle a prepended protocol
> family on bpf_write if the sending interface is DLT_LOOP.

I am surprised that there is not already a DLT_foo that already has this
concept, an AF word followed by data. But I guess every interface
already has a more-specific format.

Looking at if_tun.c, I see DLT_NULL. This should have the same ability
to write. I have forgotten the details of how tun encodes AF when
transmitting, but I know you can have v4 or v6 inside, and tcpdump works
now. So obviously I must be missing something.

My suggestion is to look at the rest of the drivers that register
DLT_NULL and see if they are amenable to the same fix, and choose a new
DLT_SOMETHING that accommodates the broader situation.

I am not demanding that you add features to the rest of the drivers. I
am only asking that you think about the architectural issue of how the
rest of them would be updated, so we don't end up with DLT_LOOP,
DLT_TUN, and so on, where they all do almost the same thing, when they
could be the same.

I don't really have an opinion on host vs network for AF, but I think
your choice of aligning with FreeBSD is reasonable.