Re: Performance issues with vnet jails + epair + bridge
On Sun, 15 Sept 2024 at 18:56, Sad Clouds wrote:
> On Sun, 15 Sep 2024 18:01:07 +0100 Doug Rabson wrote:
>
> > I just did a throughput test with iperf3 client on a FreeBSD 14.1 host
> > with an Intel 10Gb NIC connecting to an iperf3 server running in a vnet
> > jail on a truenas host (13.something) also with an Intel 10Gb NIC and I
> > get full 10Gb throughput in this setup. In the past, I had to disable
> > LRO on the truenas host for this to work properly.
> >
> > Doug.
>
> Hello Doug, can you please confirm that you are NOT using if_epair(4)? I
> imagine you dedicate one of the Intel 10Gb ports to a jail. This is not
> an option for some of us, so a virtual NIC of some sort is the only
> option with vnet jails. Other people also mentioned that vnet by itself
> is not an issue and your test confirms this, however I'm observing poor
> scalability specifically with the epair virtual NIC.
>
> I will be trying netgraph when I have some more time. If there are
> other alternatives to if_epair then I would be interested to learn
> about them.

I am using epair on the server side of that test. On the truenas server, I have an if_bridge instance which has one vlan of the physical Intel NIC as a member, along with one side of an epair for each of the several jails running on the host. As I mentioned, disabling LRO on the physical NIC was helpful in reaching this throughput.

Doug.
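For anyone reproducing this, the LRO change is usually made persistent in rc.conf. A sketch, assuming the Intel NIC shows up as ix0 (the interface name is a placeholder) - LRO-coalesced frames cannot be forwarded back out through if_bridge, which is why it has to go on a bridge member:

```
# /etc/rc.conf on the jail host (ix0 is a placeholder name)
ifconfig_ix0="up -lro"
```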
Re: Performance issues with vnet jails + epair + bridge
I just did a throughput test with iperf3 client on a FreeBSD 14.1 host with an Intel 10Gb NIC connecting to an iperf3 server running in a vnet jail on a truenas host (13.something) also with an Intel 10Gb NIC and I get full 10Gb throughput in this setup. In the past, I had to disable LRO on the truenas host for this to work properly.

Doug.

On Sat, 14 Sept 2024 at 11:25, Sad Clouds wrote:
> On Sat, 14 Sep 2024 10:45:03 +0800 Zhenlei Huang wrote:
>
> > The overhead of vnet jail should be negligible, compared to legacy jail
> > or no-jail. Bear in mind when the VIMAGE option is enabled, there is a
> > default vnet 0. It is not visible via jls and can not be destroyed. So
> > when you see bottlenecks, for example this case, it is mostly caused by
> > other components such as if_epair, but not the vnet jail itself.
>
> Perhaps this needs a correction - the vnet itself may be OK, but due to
> a single physical NIC on this appliance, I cannot use vnet jails
> without virtualised devices like if_epair(4) and if_bridge(4). I think
> there may be other scalability bottlenecks.
>
> I have a similar setup on Solaris. Here devel is a Solaris zone with
> exclusive IP configuration, which I think may be similar to FreeBSD
> vnet. It has a virtual NIC devel/net0 which operates over the physical
> NIC, also called net0, in the global zone:
>
> $ dladm
> LINK         CLASS  MTU   STATE  OVER
> net0         phys   1500  up     --
> net1         phys   1500  up     --
> net2         phys   1500  up     --
> net3         phys   1500  up     --
> pkgsrc/net0  vnic   1500  up     net0
> devel/net0   vnic   1500  up     net0
>
> If I run a TCP bulk data benchmark with 64 concurrent threads, 32
> threads with the server process in the global zone and 32 threads with
> the client process in the devel zone, then the system evenly spreads the
> load across all CPU cores and none of them are sitting idle:
>
> $ mpstat -A core 1
> COR minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys st idl sze
> 0 0 0 2262 25614 4744 2085 209 72710 747842 272 528 0 0 8
> 1 0 0 3187 42092 9102 3768 514 10605 0 597012 221 579 0 0 8
> 2 0 0 2091 32517 6768 2884 307 95570 658124 244 556 0 0 8
> 3 0 0 1745 1786 16 3494 1520 176 88470 746373 273 527 0 0 8
> 4 0 0 2797 27673 5908 2414 371 78490 692873 253 547 0 0 8
> 5 0 0 2782 23595 4857 2012 324 94310 684840 251 549 0 0 8
> 6 0 0 4324 41330 9138 3592 538 12525 0 516342 191 609 0 0 8
> 7 0 0 2180 32490 6960 2926 321 88250 697861 257 543 0 0 8
>
> With FreeBSD I tried "options RSS" and increasing "net.isr.maxthreads",
> however this resulted in some really flaky kernel behavior. So I'm
> thinking that if_epair(4) may be OK for some low-bandwidth use cases,
> i.e. testing firewall rules, etc, but not suitable for things like
> file/object storage servers, etc.
Re: OCI image compatibility spec - FYI
A while ago I drafted https://github.com/dfr/opencontainers-tob/tree/freebsd but neither I nor Samuel Karp had enough time to take this forward. Since then, we have resolved one of the trickier differences between the podman/buildah port and containerd/nerdctl around network configuration and I think this would be a good time to revive this proposal.

Doug.

On Mon, 9 Oct 2023 at 16:26, Greg Wallace wrote:
> Hi Doug,
>
> I have followed your work with great interest, though I have to admit
> that, because I am not a developer or DevOps practitioner, my
> understanding is incomplete.
>
> I am in 100% agreement with you that the PR I shared is less important
> than the runtime spec. I just wanted to bring it to the list's attention
> since the author has said he would welcome FreeBSD involvement and they
> plan a vote tomorrow.
>
> Several others, representing developers and end users, are also
> interested in helping with the runtime spec. I would love to connect
> them with you and see how we may be able to work together.
>
> Thanks!
>
> Greg
>
> On Mon, Oct 9, 2023 at 11:19 AM Doug Rabson wrote:
> > On Mon, 9 Oct 2023 at 13:51, Greg Wallace wrote:
> > > Hi all,
> > >
> > > I have been trying to stay tuned in to all the efforts to get a
> > > native OCI runtime on FreeBSD. There are a lot of people interested
> > > in this and several efforts underway.
> > >
> > > In the course of listening in on some of the OCI community developer
> > > calls, I learned about this effort to create image compatibility
> > > specification
> > >
> > > https://github.com/opencontainers/tob/pull/128
> > >
> > > I asked if they planned to include FreeBSD as a supported platform
> > > and they have been very open to the idea but they need FreeBSD
> > > developers to express interest and get involved.
> > >
> > > If this interests you, you can jump into the PR or ping me and I'd
> > > be happy to connect with the engineers heading this up.
> >
> > I am very interested in the area of adding FreeBSD extensions to the
> > OCI specification(s). Your PR covers the image spec - I actually think
> > that it might be better to start trying to define a FreeBSD extension
> > for the runtime spec.
> >
> > Doug.
>
> --
> Greg Wallace
> Director of Partnerships & Research
> M +1 919-247-3165
> Schedule a meeting <https://calendly.com/greg-freebsdfound/30min>
> Get your FreeBSD Gear <https://freebsd-foundation.myshopify.com/>
Re: OCI image compatibility spec - FYI
On Mon, 9 Oct 2023 at 13:51, Greg Wallace wrote:
> Hi all,
>
> I have been trying to stay tuned in to all the efforts to get a native
> OCI runtime on FreeBSD. There are a lot of people interested in this and
> several efforts underway.
>
> In the course of listening in on some of the OCI community developer
> calls, I learned about this effort to create image compatibility
> specification
>
> https://github.com/opencontainers/tob/pull/128
>
> I asked if they planned to include FreeBSD as a supported platform and
> they have been very open to the idea but they need FreeBSD developers to
> express interest and get involved.
>
> If this interests you, you can jump into the PR or ping me and I'd be
> happy to connect with the engineers heading this up.

I am very interested in the area of adding FreeBSD extensions to the OCI specification(s). Your PR covers the image spec - I actually think that it might be better to start trying to define a FreeBSD extension for the runtime spec.

Doug.
Netlink and vnet
In Linux container runtimes, netlink is typically used together with network namespaces to manage the interfaces and addresses for a container. This usually involves briefly joining the network namespace to perform actions like socket(AF_NETLINK, ...). It would be nice to find a similar approach on FreeBSD to replace the 'jexec ifconfig ...' approach which I'm using now. Is there any way to get a netlink socket that connects to a specific vnet? This would be cleaner, more efficient and would simplify porting the Linux runtimes to FreeBSD.
Re: Import dhcpcd(8) into FreeBSD base
On Sun, 7 Aug 2022 at 09:04, Franco Fichtner wrote:
> > On 7. Aug 2022, at 9:38 AM, Doug Rabson wrote:
> >
> > I'm not sure what the problem is here? I'm using the dhcpcd client in
> > my home lab with pfsense acting as dhcp and dhcp6 server and it works
> > great, including prefix delegation. Choosing a new dhcp client in
> > FreeBSD certainly doesn't require {pf,opn}sense to use that client.
>
> Good, but keep in mind that your home lab is not millions of downstream
> users. ;)

Of course, but this argument is confusing - we are talking about the DHCP client, not the server.

> > Main thing that's missing for me is dynamic dns - my dhcp server
> > updates my local DNS using ddns. This works well for ipv4 and I've
> > been using it this way for years. For ipv6, rtsold is limited to
> > handling advertisement of the local prefix. Using dhcpcd for both
> > means I get both A and AAAA records in my local DNS which makes me
> > happy.
>
> Dynamic records for client leases is a problem, but isn't that also a
> general issue with isc-dhcpd? What's your main DHCP server for IPv6?

I'm using the pfSense default DHCP server for both IPv4 and IPv6 - as far as I remember, this is isc-dhcpd and in a previous iteration of my home infra, I had isc-dhcpd working (with dynamic DNS) for both v4 and v6.

> > Again, not seeing the harm for either OPNsense or pfSense - these
> > distributions are free to choose another client.
>
> If you want to say "not my work, not my harm" that's possibly fine, but
> not well-rounded in a real world setting as indicated by your former
> status.

I'm saying that the base system's choice of DHCP client has little bearing on pfSense or OPNsense. I don't understand the comment on 'former status'.

> It is still a lot of work to get it working mostly like it did before
> and at least one FreeBSD major release will suffer from the inferiority
> of switching to a new integration. I'm sure disrupting basic IPv4 DHCP
> capability which was always working prior will come as a surprise to
> people involved in green lighting this, but this is likely an
> unavoidable consequence of the proposal.

Of course, whatever solution we choose for DHCP needs to be integrated properly. To be honest, all I want is a DHCPv6 client integrated in base - I don't care if it's dhcpcd or something else but until we have that, IPv6 is a second class citizen (IMO).

Doug.
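For reference, the dual-stack behaviour being described needs only a few lines of dhcpcd configuration - a sketch, with option names per dhcpcd.conf(5) and the interface-specific details left out:

```
# /etc/dhcpcd.conf (sketch)
hostname        # send our hostname so the server can register ddns records
ipv6rs          # solicit router advertisements
ia_na 1         # request a stateful DHCPv6 address
ia_pd 2         # also request a delegated prefix
```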
Re: Import dhcpcd(8) into FreeBSD base
On Sun, 7 Aug 2022 at 08:08, Franco Fichtner wrote:
> Hi Ben,
>
> > On 7. Aug 2022, at 7:31 AM, Ben Woods wrote:
> >
> > Reason: ensure fresh installs of FreeBSD support using DHCPv6 and
> > prefix delegation to obtain an IP address (not supported by dhclient
> > or rtsold). Having it in ports/packages could be problematic if people
> > cannot obtain an IPv6 address to download it.
> >
> > Why dhcpcd vs other DHCPv6 clients? It’s well supported, full
> > featured, included in NetBSD and DragonflyBSD base, and is now
> > sandboxed with capsicum. The other DHCP clients tend to either not
> > support DHCPv6 (dhcpleased) or are no longer actively maintained
> > (wide-dhcpv6-client).
>
> Having worked on dhclient and rtsold in FreeBSD and worked with it for
> years in pfSense/OPNsense, the proposal here seems to be to throw away
> all progress that would definitely have to be rebuilt in the years to
> follow for the all-in-one (?) replacement.

I'm not sure what the problem is here? I'm using the dhcpcd client in my home lab with pfsense acting as dhcp and dhcp6 server and it works great, including prefix delegation. Choosing a new dhcp client in FreeBSD certainly doesn't require {pf,opn}sense to use that client.

> For OPNsense we did fork, strip down and improve wide-dhcpv6 over the
> years:
>
> https://github.com/opnsense/dhcp6c
>
> It could use more work and cleanups, but basically all that is required
> is to bring it into FreeBSD and use it to skip a long trail of said
> future work both in dhcpcd and putting back existing perks of the
> current dhclient and rtsold.
>
> The basic question is: what's not working in dhclient? How is rtsold
> inferior?

Main thing that's missing for me is dynamic dns - my dhcp server updates my local DNS using ddns. This works well for ipv4 and I've been using it this way for years. For ipv6, rtsold is limited to handling advertisement of the local prefix. Using dhcpcd for both means I get both A and AAAA records in my local DNS which makes me happy.

> It seems like "It’s well supported, full featured, included in NetBSD
> and DragonflyBSD base" incorporates none of the real world concerns for
> migratory work, so for the time being I don't think it's a solid
> proposal, also because it will cause heavy downstream disruption in
> OPNsense/pfSense in a few years as well.

Again, not seeing the harm for either OPNsense or pfSense - these distributions are free to choose another client.
Re: Container Networking for jails
I think it's important that configuring the container network does not rely on any utilities from inside the container - for one thing, there are no guarantees that these utilities even exist inside the container and, as you note, local versions may be incompatible.

On the subject of risk, with the current jail infrastructure, the only user which can create and modify containers is root. Certain users may have delegated authority, e.g. by using setuid on a daemon-less setup like podman or by adjusting permissions on a unix domain socket, but this is clearly a huge risk and should be strongly discouraged (IMO). Rootless containers using something similar to linux user namespaces would be nice but it is probably a higher priority to get containers working well for root first.

My concern for supporting an alternative 'tooling' image for network utilities is that it adds complexity to the infrastructure for very little gain. You could even make a weak argument that it adds a threat vector, e.g. if the network utilities image is fetched from a compromised repository (pretty far fetched IMO but possible).

On Sun, 3 Jul 2022 at 17:29, Gijs Peskens wrote:
> I went with exactly the same design for the Docker port I started a
> while ago. The reason I went with that design is that there weren't any
> facilities to modify a jail's vnet network configuration from outside of
> the jail. So it's needed to enter the jail, run ifconfig et al.
> Linux jails will lack a compatible ifconfig. So having a parent FreeBSD
> based vnet jail ensures that networking can be configured for Linux
> children.
>
> There is a risk to using the / filesystem: users that might be allowed
> to set up and configure containers run standard system tools as root on
> the root filesystem, even if they might not have root permission
> themselves. If an exploit was ever to be found in any of those tools to
> modify files, that could be used as a step in a privilege escalation.
>
> Imho, that risk is acceptable in a first port, but should be documented.
> And ideally an option should be provided to use an alternative root if
> the user deems the risk unacceptable.
>
> On 30 June 2022 09:04:24 CEST, Doug Rabson wrote:
> > I wanted to get a quick sanity check for my current approach to
> > container networking with buildah and podman. These systems use CNI
> > (https://www.cni.dev) to set up the network. This uses a sequence of
> > 'plugins' which are executables that perform successive steps in the
> > process - a very common setup uses a 'bridge' plugin to add one half
> > of an epair to a bridge and put the other half into the container's
> > vnet. IP addresses are managed by an 'ipam' plugin and an optional
> > 'portmap' plugin can be used to advertise container service ports on
> > the host. All of these plugins run on the host with root privileges.
> >
> > In kubernetes and podman, it is possible for more than one container
> > to share a network namespace in a 'pod'. Each container in the pod can
> > communicate with its peers directly via localhost and they all share a
> > single IP address.
> >
> > Mapping this over to jails, I am using one vnet jail to manage the
> > network namespace and child jails of this to isolate the containers.
> > The vnet jail uses '/' as its root path and the only things which run
> > inside this jail are the CNI plugins. Using the host root means that a
> > plugin can safely call host utilities such as ifconfig and route
> > without having to trust the container's version of them. An important
> > factor here is that the CNI plugins will only be run strictly before
> > the container (to set up) or strictly after (to tear down) - at no
> > point will CNI plugins be executed at the same time as container
> > executables.
> >
> > The child jails use ip4/6=inherit to share the vnet and each will use
> > a root path to the container's contents in the same way as a normal
> > non-hierarchical jail.
> >
> > Can anyone see any potential security problems here, particularly
> > around the use of nested jails? I believe that the only difference
> > between this setup and a regular non-nested jail is that the vnet
> > outlives the container briefly before it is torn down.
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
Container Networking for jails
I wanted to get a quick sanity check for my current approach to container networking with buildah and podman. These systems use CNI (https://www.cni.dev) to set up the network. This uses a sequence of 'plugins' which are executables that perform successive steps in the process - a very common setup uses a 'bridge' plugin to add one half of an epair to a bridge and put the other half into the container's vnet. IP addresses are managed by an 'ipam' plugin and an optional 'portmap' plugin can be used to advertise container service ports on the host. All of these plugins run on the host with root privileges.

In kubernetes and podman, it is possible for more than one container to share a network namespace in a 'pod'. Each container in the pod can communicate with its peers directly via localhost and they all share a single IP address.

Mapping this over to jails, I am using one vnet jail to manage the network namespace and child jails of this to isolate the containers. The vnet jail uses '/' as its root path and the only things which run inside this jail are the CNI plugins. Using the host root means that a plugin can safely call host utilities such as ifconfig and route without having to trust the container's version of them. An important factor here is that the CNI plugins will only be run strictly before the container (to set up) or strictly after (to tear down) - at no point will CNI plugins be executed at the same time as container executables.

The child jails use ip4/6=inherit to share the vnet and each will use a root path to the container's contents in the same way as a normal non-hierarchical jail.

Can anyone see any potential security problems here, particularly around the use of nested jails? I believe that the only difference between this setup and a regular non-nested jail is that the vnet outlives the container briefly before it is torn down.
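Concretely, what the bridge and ipam plugins end up doing on the host is roughly this sequence (a hand-run sketch; the interface, jail and address names are made up, and a real plugin adds error handling and teardown):

```sh
# bridge plugin: host side of the epair joins the bridge,
# the other side moves into the pod's vnet jail
epair=$(ifconfig epair create)      # prints e.g. "epair0a"
ifconfig bridge0 addm "${epair}"
ifconfig "${epair%a}b" vnet mypod

# ipam plugin: address and default route, set from the host
# using host binaries via jexec
jexec mypod ifconfig "${epair%a}b" inet 10.88.0.2/16 up
jexec mypod route add default 10.88.0.1
```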
Re: nfs buildworld blocked by rpc.lockd ?
On 28 May 2008, at 20:57, Arno J. Klaassen wrote:
> Hello,
>
> my buildworld on a 7-stable-amd64 blocks on the following line:
>
>   TERM=dumb TERMCAP=dumb: ex - /files/bsd/src7/share/termcap/termcap.src < /files/bsd/src7/share/termcap/reorder
>
> ex(1) stays in lockd state, and is unkillable, either by Ctl-C or kill -9.
>
> /files/bsd is nfs-mounted as follows:
>
>   push:/raid1/bsd/files/bsd nfs rw,bg,soft,nfsv3,intr,noconn,noauto,-r=32768,-w=32768 0 0
>
> I can provide tcpdumps on server and client if helpful.

I would very much like to see tcpdumps (from either client or server). This problem is often caused by the fact that unless you use the '-p' flag, rpc.lockd isn't wired down to any particular port number. Since it is started at boot time, it will usually end up with the same one each time but the new kernel implementation in 7-stable typically ends up with a different port number to the old userland implementation. Quirks of the locking protocol make it difficult for the server to notice this without a lengthy timeout. Workarounds include using '-p' to wire it down to a consistent port (port 4045 is reserved for this) or restarting rpc.lockd on the server.

___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "[EMAIL PROTECTED]"
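To make the '-p' workaround stick across reboots, the flags go in rc.conf on the server - a sketch, with the flag per rpc.lockd(8) and the statd line included since the two usually travel together:

```
# /etc/rc.conf on the NFS server
rpc_lockd_enable="YES"
rpc_lockd_flags="-p 4045"   # wire lockd to the reserved port
rpc_statd_enable="YES"
```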
Re: fwe -> fwip in GENERIC?
On 18 Oct 2005, at 13:21, Norikatsu Shigemura wrote:
> On Mon, 17 Oct 2005 10:12:18 +0100 Doug Rabson <[EMAIL PROTECTED]> wrote:
> > The fwip implementation should be fully compatible with the RFC
> > standard. I'm happy for fwip to replace fwe in GENERIC unless anyone
> > else has an objection.
>
> I disagree. Because fwip and fwe can exist together. So I think that
> fwip should be added to GENERIC.

Sure - both drivers are tiny and they don't step on each other's toes. Longer term, I think we should try to phase out the fwe driver since it doesn't interoperate with any other systems (except Df, I guess).
Re: fwe -> fwip in GENERIC?
The fwip implementation should be fully compatible with the RFC standard. I'm happy for fwip to replace fwe in GENERIC unless anyone else has an objection.

On Saturday 15 October 2005 03:35, Katsushi Kobayashi wrote:
> Hi,
>
> Although I don't know the detail of fwe technology, I understand the
> technology is a proprietary one. It is better to provide a
> compatibility with the RFC standard firewire over IP, if some
> volunteers are there.
>
> On 2005/10/15, at 9:58, Cai, Quanqing wrote:
> > Hi guys,
> >
> > When I was fixing bug kern/82727:
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/82727, I found we use
> > fwe (Ethernet over FireWire) in the GENERIC kernel, not fwip (IP over
> > FireWire). But we all know that IP over FireWire is more widely used
> > on other OSes, and now this bug is fixed, do we need change fwe to
> > fwip?
> >
> > I talked it over with Tai-hwa Liang, he agrees with me. But he
> > suggests me to post here for more advice, since there might be some
> > considerations such like backward compatibility or code size that
> > made re@ make this decision.
> >
> > Please give your advice or opinion.
> >
> > Best
> > Cai, Quanqing
Re: new arp code snapshot for review...
On Tue, 2004-05-18 at 17:21, Harti Brandt wrote:
> On Tue, 18 May 2004, Luigi Rizzo wrote:
>
> LR> On Tue, May 18, 2004 at 02:00:28PM +0100, Doug Rabson wrote:
> LR> > On Tue, 2004-05-18 at 09:48, Luigi Rizzo wrote:
> LR> > > I will try to remove as many assumptions as possible.
> LR> > > thanks for the feedback.
> LR> >
> LR> > I think that in your prototype, the only assumption was in struct
> LR> > llentry. I would suggest defining it as something like:
> LR>
> LR> to be really flexible, both l3_addr and ll_addr should be
> LR> variable size (v4,v6,v8 over 802.x,firewire,appletalk,snail-mail),
> LR> then things rapidly become confusing and inefficient.
> LR> I would like to keep the ipv4 over ethernet case simple and quick,
> LR> even if this means replicating the code for the generic case (and
> LR> this is one of the reasons i have stalled a bit on this code -- i
> LR> want to make up my mind on what is a reasonable approach).
>
> The most common use of that table is to have an l3_addr and search for
> the ll_addr, right? In that case making ll_addr variable shouldn't have
> a measurable influence on speed. Variable l3_addr could be different
> though.

Well it seems to me that IPv6 neighbour discovery is different enough from ARP that it makes sense to have IPv4-specialised ARP and IPv6-specialised ND. The only other variable is the size of the LL address and that doesn't add any significant complexity since it's just moved around with bcopy.
Re: new arp code snapshot for review...
On Tue, 2004-05-18 at 09:48, Luigi Rizzo wrote:
> I will try to remove as many assumptions as possible.
> thanks for the feedback.

I think that in your prototype, the only assumption was in struct llentry. I would suggest defining it as something like:

	struct llentry {
		struct llentry	*lle_next;
		struct mbuf	*la_hold;
		uint16_t	flags;		/* see values in if_ether.h */
		uint8_t		la_preempt;
		uint8_t		la_asked;
		time_t		expire;
		struct in_addr	l3_addr;
		uint8_t		ll_addr[0];
	};

Where the allocation of them uses something like 'malloc(sizeof(struct llentry) + ifp->if_addrlen)'.
Re: new arp code snapshot for review...
On Sunday 25 April 2004 17:49, Luigi Rizzo wrote:
> Here is a snapshot of the new arp code that i have been working on
> lately, based on Andre's ideas. (I say 'ARP' for brevity, what i
> mean is the layer3-to-layer2 address translation code -- arp, aarp,
> nd6 all fit in the category).

Sorry for the delay but I've only just had reason to look at the arp code since I've recently been working on an implementation of rfc2734 IP over firewire.

In your patch, you assume that the size of the link-level address is always six bytes. This assumption is not valid - from the looks of the existing arp code, people went to great lengths to avoid making this assumption throughout the networking code. For IP over firewire, the link-level address is sixteen bytes. Other link types have various sizes. You must use ifp->if_addrlen in the generic code to cope with this correctly.
Re: Will rfc2734 be supported?
On Mon, 2004-02-02 at 05:50, Hidetoshi Shimokawa wrote:
> At Sat, 31 Jan 2004 15:27:03 +0100, Dario Freni wrote:
> >
> > Hi guys,
> > I was wondering if the standard implementation of IPoFW is planning to
> > be implemented. I'm not expert on device writing, I was also looking
> > for some workarounds, like attaching the fwe0:lower netgraph hook to a
> > virtual interface, but reading the rfc I realized that the normal IP
> > packet needs an encapsulation before it's sent on the wire.
>
> I have no plan to implement rfc2734 by myself in the near future.
> IEEE1394 is somewhat complicated, compared with Ethernet.
> Because there are some types of packets, fwe and IPoFW use very
> different packet types and formats, so you don't have an easy
> workaround using netgraph.
>
> If you are interested in implementing rfc2734, you need several steps.
>
> - Implement rfc2734 encapsulation as /sys/net/if_ethersubr.c does for
>   ethernet. rfc2734 uses a very different packet format from ethernet.
>
> - Implement a generic GASP receive routine in the firewire driver.
>   You need this service for multicast/broadcast packets such as an arp
>   packet.
>
> - Implement if_fw.c for the interface device.
>
> Though I'm not sure it actually worked, the firewire driver for
> FreeBSD-4.0 seems to have support for IPoFW.
> See ftp://ftp.uec.ac.jp/pub/firewire/ for the patch.

I spent a little time recently thinking about what would be needed for this and came to similar conclusions. The most interesting part is implementing generic GASP receive. I think the nicest way of doing that would be to implement a new network protocol for firewire, allowing userland programs to do something like:

	struct sockaddr_firewire a;

	s = socket(PF_FIREWIRE, SOCK_DGRAM, 0);
	a.sof_address = 0x12345000;
	...;
	bind(s, &a, sizeof a);
	...;
	len = recv(s, buf, sizeof buf, 0);

Internally, this probably means arranging for all asynchronous packets to be DMA'd directly into mbufs and would probably change the firewire code a great deal. Still, it might be worth it to gain a familiar socket-based user api.
Re: finishing the if.h/if_var.h split
On Tue, 2003-09-30 at 09:22, Bruce Evans wrote:
> That's one alternative. (Far too) many places already use the simple
> alternative of just using "struct device *". Grep shows 68 lines
> containing "struct device" in *.h and 32 in *.c. For "device_t", the
> numbers are 2140 in *.h and 5089 in *.c. This is in a sys tree with
> about 1000 matches of "device_t" in generated files. There are
> non-bogus uses of "struct device" to avoid namespace pollution. Most
> other uses are just bogus (modulo the existence of device_t being
> non-bogus -- its opaqueness is negative since anything that wants to
> use it must include the header and thus can see its internals.
> style(9) says to not use negatively opaque typedefs).

The internals of struct device are not contained in any header - it is completely opaque to users outside subr_bus.c. The main 'bug' here is the idea that it's a good thing to export kernel data structures (struct ifnet) to userland. The layout of struct ifnet is an implementation detail - it shouldn't form part of the userland api.