Re: Performance issues with vnet jails + epair + bridge

2024-09-16 Thread Doug Rabson
On Sun, 15 Sept 2024 at 18:56, Sad Clouds wrote:

> On Sun, 15 Sep 2024 18:01:07 +0100
> Doug Rabson  wrote:
>
> > I just did a throughput test with an iperf3 client on a FreeBSD 14.1
> > host with an Intel 10Gb NIC connecting to an iperf3 server running in a
> > vnet jail on a TrueNAS host (13.something), also with an Intel 10Gb
> > NIC, and I get full 10Gb throughput in this setup. In the past, I had
> > to disable LRO on the TrueNAS host for this to work properly.
> >
> > Doug.
>
> Hello Doug, can you please confirm that you are NOT using if_epair(4)? I
> imagine you dedicate one of the Intel 10Gb ports to a jail. This is not
> an option for some of us, so a virtual NIC of some sort is the only
> option with vnet jails. Other people also mentioned that vnet by itself
> is not an issue, and your test confirms this; however, I'm observing poor
> scalability specifically with the epair virtual NIC.
>
> I will be trying netgraph when I have some more time. If there are
> other alternatives to if_epair then I would be interested to learn
> about them.
>

I am using epair on the server side of that test. On the TrueNAS server, I
have an if_bridge instance which has a vlan on the physical Intel NIC as a
member, along with one side of an epair for each of the several jails
running on the host. As I mentioned, disabling LRO on the physical NIC was
needed to reach this throughput. A rough sketch of the topology is below.
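
This is only a sketch: ix0, the vlan tag, the epair unit and the jail name
are placeholders, not my actual configuration.

ifconfig vlan42 create vlan 42 vlandev ix0    # vlan on the physical NIC
ifconfig bridge0 create
ifconfig epair create                         # creates epair0a + epair0b
ifconfig bridge0 addm vlan42 addm epair0a up  # bridge the vlan and epair
ifconfig epair0b vnet myjail                  # move the other half into the jail
ifconfig ix0 -lro                             # LRO off on the physical NIC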

Doug.


Re: Performance issues with vnet jails + epair + bridge

2024-09-15 Thread Doug Rabson
I just did a throughput test with an iperf3 client on a FreeBSD 14.1 host
with an Intel 10Gb NIC connecting to an iperf3 server running in a vnet
jail on a TrueNAS host (13.something), also with an Intel 10Gb NIC, and I
get full 10Gb throughput in this setup. In the past, I had to disable LRO
on the TrueNAS host for this to work properly.
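
Roughly what I ran, as a sketch (ix0, the jail name and the address are
placeholders):

ifconfig ix0 -lro            # on the TrueNAS host: LRO off on the NIC
jexec iperf-jail iperf3 -s   # iperf3 server inside the vnet jail
iperf3 -c 10.0.0.10 -t 30    # iperf3 client on the FreeBSD 14.1 host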

Doug.



On Sat, 14 Sept 2024 at 11:25, Sad Clouds wrote:

> On Sat, 14 Sep 2024 10:45:03 +0800
> Zhenlei Huang  wrote:
>
> > The overhead of a vnet jail should be negligible, compared to a legacy
> > jail or no jail. Bear in mind that when the VIMAGE option is enabled,
> > there is a default vnet 0. It is not visible via jls and cannot be
> > destroyed. So when you see bottlenecks, as in this case, they are
> > mostly caused by other components such as if_epair, not by the vnet
> > jail itself.
>
> Perhaps this needs a correction - the vnet itself may be OK, but due to
> a single physical NIC on this appliance, I cannot use vnet jails
> without virtualised devices like if_epair(4) and if_bridge(4). I think
> there may be other scalability bottlenecks.
>
> I have a similar setup on Solaris.
>
> Here devel is a Solaris zone with exclusive IP configuration, which I
> think may be similar to FreeBSD vnet. It has a virtual NIC devel/net0
> which operates over the physical NIC also called net0 in the global
> zone:
>
> $ dladm
> LINK         CLASS  MTU   STATE  OVER
> net0         phys   1500  up     --
> net1         phys   1500  up     --
> net2         phys   1500  up     --
> net3         phys   1500  up     --
> pkgsrc/net0  vnic   1500  up     net0
> devel/net0   vnic   1500  up     net0
>
> If I run a TCP bulk data benchmark with 64 concurrent threads, 32
> threads with server process in the global zone and 32 threads with
> client process in the devel zone, then the system evenly spreads the
> load across all CPU cores and none of them are sitting idle:
>
> $ mpstat -A core 1
>  COR minf mjf xcal intr ithr  csw icsw migr  smtx srw  syscl usr sys st idl sze
>    0    0   0 2262 2561    4 4744 2085  209  7271   0 747842 272 528  0   0   8
>    1    0   0 3187 4209    2 9102 3768  514 10605   0 597012 221 579  0   0   8
>    2    0   0 2091 3251    7 6768 2884  307  9557   0 658124 244 556  0   0   8
>    3    0   0 1745 1786   16 3494 1520  176  8847   0 746373 273 527  0   0   8
>    4    0   0 2797 2767    3 5908 2414  371  7849   0 692873 253 547  0   0   8
>    5    0   0 2782 2359    5 4857 2012  324  9431   0 684840 251 549  0   0   8
>    6    0   0 4324 4133    0 9138 3592  538 12525   0 516342 191 609  0   0   8
>    7    0   0 2180 3249    0 6960 2926  321  8825   0 697861 257 543  0   0   8
>
> With FreeBSD I tried "options RSS" and increasing "net.isr.maxthreads",
> but this resulted in some really flaky kernel behavior. So I'm
> thinking that if_epair(4) may be OK for some low-bandwidth use cases,
> e.g. testing firewall rules, but not suitable for things like
> file/object storage servers.
>
>


Re: OCI image compatibility spec - FYI

2023-10-09 Thread Doug Rabson
A while ago I drafted https://github.com/dfr/opencontainers-tob/tree/freebsd
but neither I nor Samuel Karp had enough time to take this forward. Since
then, we have resolved one of the trickier differences between the
podman/buildah port and containerd/nerdctl around network configuration and
I think this would be a good time to revive this proposal.

Doug.


On Mon, 9 Oct 2023 at 16:26, Greg Wallace wrote:

> Hi Doug,
>
> I have followed your work with great interest, though I have to admit
> that, because I am not a developer or DevOps practitioner, my understanding
> is incomplete.
>
> I am in 100% agreement with you that the PR I shared is less important
> than the runtime spec. I just wanted to bring it to the list's attention
> since the author has said he would welcome FreeBSD involvement and they
> plan a vote tomorrow.
>
> Several others, representing developers and end users, are also interested
> in helping with the runtime spec. I would love to connect them with you and
> see how we may be able to work together.
>
> Thanks!
>
> Greg
>
>
>
> On Mon, Oct 9, 2023 at 11:19 AM Doug Rabson  wrote:
>
>>
>>
>>
>> On Mon, 9 Oct 2023 at 13:51, Greg Wallace wrote:
>>
>>> Hi all,
>>>
>>> I have been trying to stay tuned in to all the efforts to get a native
>>> OCI runtime on FreeBSD. There are a lot of people interested in this and
>>> several efforts underway.
>>>
>>> In the course of listening in on some of the OCI community developer
>>> calls, I learned about this effort to create an image compatibility
>>> specification:
>>>
>>> https://github.com/opencontainers/tob/pull/128
>>>
>>> I asked if they planned to include FreeBSD as a supported platform and
>>> they have been very open to the idea but they need FreeBSD developers to
>>> express interest and get involved.
>>>
>>> If this interests you, you can jump into the PR or ping me and I'd be
>>> happy to connect with the engineers heading this up.
>>>
>>
>> I am very interested in the area of adding FreeBSD extensions to the OCI
>> specification(s). Your PR covers the image spec - I actually think that it
>> might be better to start trying to define a FreeBSD extension for the
>> runtime spec.
>>
>> Doug.
>>
>>
>>>
>
> --
> Greg Wallace
> Director of Partnerships & Research
> M +1 919-247-3165
> Schedule a meeting <https://calendly.com/greg-freebsdfound/30min>
> Get your FreeBSD Gear <https://freebsd-foundation.myshopify.com/>
>


Re: OCI image compatibility spec - FYI

2023-10-09 Thread Doug Rabson
On Mon, 9 Oct 2023 at 13:51, Greg Wallace wrote:

> Hi all,
>
> I have been trying to stay tuned in to all the efforts to get a native OCI
> runtime on FreeBSD. There are a lot of people interested in this and
> several efforts underway.
>
> In the course of listening in on some of the OCI community developer
> calls, I learned about this effort to create an image compatibility
> specification:
>
> https://github.com/opencontainers/tob/pull/128
>
> I asked if they planned to include FreeBSD as a supported platform and
> they have been very open to the idea but they need FreeBSD developers to
> express interest and get involved.
>
> If this interests you, you can jump into the PR or ping me and I'd be
> happy to connect with the engineers heading this up.
>

I am very interested in the area of adding FreeBSD extensions to the OCI
specification(s). Your PR covers the image spec - I actually think that it
might be better to start trying to define a FreeBSD extension for the
runtime spec.

Doug.


>


Netlink and vnet

2022-10-17 Thread Doug Rabson
In Linux container runtimes, netlink is typically used with network
namespaces to manage the interfaces and addresses for a container. This
involves briefly joining the network namespace to perform actions like
socket(AF_NETLINK, ...).

It would be nice to find a similar approach on FreeBSD to replace the
'jexec ifconfig ...' approach that I'm using now. Is there any way to get
a netlink socket that connects to a specific vnet? This would be cleaner
and more efficient, and would simplify porting the Linux runtimes to
FreeBSD. For reference, the Linux pattern looks roughly like the sketch
below.
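
This is only a sketch of the Linux side, not a FreeBSD API; the netns path
is a typical 'ip netns' mount point and error handling is trimmed. The key
property is that the netlink socket stays attached to the namespace it was
created in, even after the thread hops back out.

#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

int
netlink_for_netns(const char *netns_path)   /* e.g. "/run/netns/ctr0" */
{
        int self = open("/proc/self/ns/net", O_RDONLY);
        int target = open(netns_path, O_RDONLY);
        int s = -1;

        if (self >= 0 && target >= 0 && setns(target, CLONE_NEWNET) == 0) {
                /* Created inside the container's netns... */
                s = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
                /* ...and still usable after we return to our own. */
                setns(self, CLONE_NEWNET);
        }
        if (self >= 0)
                close(self);
        if (target >= 0)
                close(target);
        return (s);
}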


Re: Import dhcpcd(8) into FreeBSD base

2022-08-07 Thread Doug Rabson
On Sun, 7 Aug 2022 at 09:04, Franco Fichtner  wrote:

>
> > On 7. Aug 2022, at 9:38 AM, Doug Rabson  wrote:
> >
> > I'm not sure what the problem is here. I'm using the dhcpcd client in
> > my home lab with pfSense acting as the DHCP and DHCPv6 server, and it
> > works great, including prefix delegation. Choosing a new DHCP client in
> > FreeBSD certainly doesn't require {pf,opn}sense to use that client.
>
> Good, but keep in mind that your home lab is not millions of downstream
> users.  ;)
>

Of course, but this argument is confusing - we are talking about the DHCP
client, not the server.


> > The main thing that's missing for me is dynamic DNS - my DHCP server
> > updates my local DNS using DDNS. This works well for IPv4 and I've been
> > using it this way for years. For IPv6, rtsold is limited to advertising
> > the local prefix. Using dhcpcd for both means I get both A and AAAA
> > records in my local DNS, which makes me happy.
>
>
> Dynamic AAAA records for client leases are a problem, but isn't that also
> a general issue with isc-dhcpd?  What's your main DHCP server for IPv6?
>

I'm using the pfSense default DHCP server for both IPv4 and IPv6 - as far
as I remember, this is isc-dhcpd, and in a previous iteration of my home
infra I had isc-dhcpd working (with dynamic DNS) for both v4 and v6.


>
> > Again, not seeing the harm for either OPNsense or pfSense - these
> distributions are free to choose another client.
>
> If you want to say "not my work, not my harm" that's possibly fine, but not
> well-rounded in a real world setting as indicated by your former status.
>

I'm saying that the base system's choice of DHCP client has little bearing
on pfSense or OPNsense. I don't understand the comment on 'former status'.


>
> It is still a lot of work to get it working mostly like it did before,
> and at least one FreeBSD major release will suffer from the inferiority
> of switching to a new integration.  I'm sure disrupting basic IPv4 DHCP
> capability which was always working prior will come as a surprise to
> people involved in green-lighting this, but this is likely an unavoidable
> consequence of the proposal.
>

Of course, whatever solution we choose for DHCP needs to be integrated
properly. To be honest, all I want is a DHCPv6 client integrated in base -
I don't care if it's dhcpcd or something else, but until we have that,
IPv6 is a second-class citizen (IMO).

Doug.


Re: Import dhcpcd(8) into FreeBSD base

2022-08-07 Thread Doug Rabson
On Sun, 7 Aug 2022 at 08:08, Franco Fichtner  wrote:

> Hi Ben,
>
> > On 7. Aug 2022, at 7:31 AM, Ben Woods  wrote:
> >
> > Reason: ensure fresh installs of FreeBSD support using DHCPv6 and
> > prefix delegation to obtain an IP address (not supported by dhclient or
> > rtsold). Having it in ports/packages could be problematic if people
> > cannot obtain an IPv6 address to download it.
> >
> > Why dhcpcd vs other DHCPv6 clients? It’s well supported, full featured,
> > included in NetBSD and DragonFly BSD base, and is now sandboxed with
> > Capsicum. The other DHCP clients tend to either not support DHCPv6
> > (dhcpleased) or are no longer actively maintained (wide-dhcpv6-client).
>
> Having worked on dhclient and rtsold in FreeBSD, and worked with them for
> years in pfSense/OPNsense, the proposal here seems to be to throw away
> all progress, which would definitely have to be rebuilt in the years to
> follow for the all-in-one (?) replacement.
>

I'm not sure what the problem is here. I'm using the dhcpcd client in my
home lab with pfSense acting as the DHCP and DHCPv6 server, and it works
great, including prefix delegation. Choosing a new DHCP client in FreeBSD
certainly doesn't require {pf,opn}sense to use that client.


>
> For OPNsense we did fork, strip down and improve wide-dhcpv6 over the years:
>
> https://github.com/opnsense/dhcp6c
>
> It could use more work and cleanups, but basically all that is required
> is to bring it into FreeBSD and use it, to skip a long trail of said
> future work both in dhcpcd and in putting back existing perks of the
> current dhclient and rtsold.
>
> The basic question is: what's not working in dhclient? How is rtsold
> inferior?
>

The main thing that's missing for me is dynamic DNS - my DHCP server
updates my local DNS using DDNS. This works well for IPv4 and I've been
using it this way for years. For IPv6, rtsold is limited to advertising
the local prefix. Using dhcpcd for both means I get both A and AAAA
records in my local DNS, which makes me happy.


>
> It seems like "It’s well supported, full featured, included in NetBSD and
> DragonFly BSD base" incorporates none of the real-world concerns for
> migratory work, so for the time being I don't think it's a solid
> proposal, also because it will cause heavy downstream disruption in
> OPNsense/pfSense in a few years as well.
>

Again, not seeing the harm for either OPNsense or pfSense - these
distributions are free to choose another client.


Re: Container Networking for jails

2022-07-04 Thread Doug Rabson
I think it's important that configuring the container network does not rely
on any utilities from inside the container - for one thing, there are no
guarantees that these utilities even exist inside the container and, as you
note, local versions may be incompatible.

On the subject of risk, with the current jail infrastructure, the only user
which can create and modify containers is root. Certain users may have
delegated authority, e.g. by using setuid on a daemon-less setup like
podman or by adjusting permissions on a unix domain socket, but this is
clearly a huge risk and should be strongly discouraged (IMO). Rootless
containers using something similar to Linux user namespaces would be nice,
but it is probably a higher priority to get containers working well for
root first.

My concern for supporting an alternative 'tooling' image for network
utilities is that it adds complexity to the infrastructure for very little
gain. You could even make a weak argument that it adds a threat vector,
e.g. if the network utilities image is fetched from a compromised
repository (pretty far-fetched IMO, but possible).



On Sun, 3 Jul 2022 at 17:29, Gijs Peskens  wrote:

> I went with exactly the same design for the Docker port I started a while
> ago.
> The reason I went with that design is that there weren't any facilities
> to modify a jail's vnet network configuration from outside of the jail,
> so it's necessary to enter the jail and run ifconfig et al.
> Linux jails will lack a compatible ifconfig.
> So having a parent FreeBSD based vnet jail ensures that networking can be
> configured for Linux children.
>
> There is a risk to using the / filesystem: users that might be allowed to
> set up and configure containers run standard system tools as root on the
> root filesystem, even if they might not have root permission themselves.
> If an exploit were ever found in any of those tools that let them modify
> files, it could be used as a step in a privilege escalation.
>
> Imho, that risk is acceptable in a first port, but should be documented.
> And ideally an option should be provided to use an alternative root if the
> user deems the risk unacceptable.
>
>
>
>
> On 30 June 2022 09:04:24 CEST, Doug Rabson  wrote:
>>
>> I wanted to get a quick sanity check for my current approach to container
>> networking with buildah and podman. These systems use CNI (
>> https://www.cni.dev) to set up the network. This uses a sequence of
>> 'plugins' which are executables that perform successive steps in the
>> process - a very common setup uses a 'bridge' plugin to add one half of an
>> epair to a bridge and put the other half into the container's vnet. IP
>> addresses are managed by an 'ipam' plugin and an optional 'portmap' plugin
>> can be used to advertise container service ports on the host. All of these
>> plugins run on the host with root privileges.
>>
>> In Kubernetes and podman, it is possible for more than one container to
>> share a network namespace in a 'pod'. Each container in the pod can
>> communicate with its peers directly via localhost and they all share a
>> single IP address.
>>
>> Mapping this over to jails, I am using one vnet jail to manage the
>> network namespace and child jails of this to isolate the containers. The
>> vnet jail uses '/' as its root path and the only things which run inside
>> this jail are the CNI plugins. Using the host root means that a plugin can
>> safely call host utilities such as ifconfig and route without having to
>> trust the container's version of them. An important factor here is that the
>> CNI plugins will only be run strictly before the container (to set up) or
>> strictly after (to tear down) - at no point will CNI plugins be executed at
>> the same time as container executables.
>>
>> The child jails use ip4/6=inherit to share the vnet and each will use a
>> root path to the container's contents in the same way as a normal
>> non-hierarchical jail.
>>
>> Can anyone see any potential security problems here, particularly around
>> the use of nested jails? I believe that the only difference between this
>> setup and a regular non-nested jail is that the vnet outlives the container
>> briefly before it is torn down.
>>
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>


Container Networking for jails

2022-06-30 Thread Doug Rabson
I wanted to get a quick sanity check for my current approach to container
networking with buildah and podman. These systems use CNI (
https://www.cni.dev) to set up the network. This uses a sequence of
'plugins' which are executables that perform successive steps in the
process - a very common setup uses a 'bridge' plugin to add one half of an
epair to a bridge and put the other half into the container's vnet. IP
addresses are managed by an 'ipam' plugin and an optional 'portmap' plugin
can be used to advertise container service ports on the host. All of these
plugins run on the host with root privileges.
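
As an illustration, a CNI network config of that shape might look roughly
like this (the names, subnet and version are placeholder values, not a
FreeBSD-specific recommendation):

{
    "cniVersion": "0.4.0",
    "name": "podman-net",
    "plugins": [
        {
            "type": "bridge",
            "bridge": "cni0",
            "ipam": {
                "type": "host-local",
                "ranges": [ [ { "subnet": "10.88.0.0/16" } ] ]
            }
        },
        { "type": "portmap" }
    ]
}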

In Kubernetes and podman, it is possible for more than one container to
share a network namespace in a 'pod'. Each container in the pod can
communicate with its peers directly via localhost and they all share a
single IP address.

Mapping this over to jails, I am using one vnet jail to manage the network
namespace and child jails of this to isolate the containers. The vnet jail
uses '/' as its root path and the only things which run inside this jail
are the CNI plugins. Using the host root means that a plugin can safely
call host utilities such as ifconfig and route without having to trust the
container's version of them. An important factor here is that the CNI
plugins will only be run strictly before the container (to set up) or
strictly after (to tear down) - at no point will CNI plugins be executed at
the same time as container executables.

The child jails use ip4/6=inherit to share the vnet and each will use a
root path to the container's contents in the same way as a normal
non-hierarchical jail. A rough jail.conf(5) sketch of this shape is below.
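
This is only a sketch; the names, paths and the epair interface are
placeholders, and the exact parameter spellings should be checked against
jail(8).

pod0 {
        vnet;
        vnet.interface = "epair0b";     # plumbed into the bridge by CNI
        path = "/";                     # host root, so plugins use host tools
        children.max = 8;               # allow nested container jails
        persist;
}
pod0.ctr0 {
        ip4 = inherit;                  # share the parent's vnet
        ip6 = inherit;
        path = "/var/containers/ctr0";  # the container's root filesystem
        persist;
}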

Can anyone see any potential security problems here, particularly around
the use of nested jails? I believe that the only difference between this
setup and a regular non-nested jail is that the vnet outlives the container
briefly before it is torn down.


Re: nfs buildworld blocked by rpc.lockd ?

2008-05-28 Thread Doug Rabson


On 28 May 2008, at 20:57, Arno J. Klaassen wrote:



Hello,

my buildworld on a 7-stable-amd64 blocks on the following line:

TERM=dumb TERMCAP=dumb: ex - /files/bsd/src7/share/termcap/termcap.src < /files/bsd/src7/share/termcap/reorder


ex(1) stays in the lockd state and is unkillable, either by Ctrl-C or
kill -9.

/files/bsd is nfs-mounted as follows:

 push:/raid1/bsd  /files/bsd  nfs  rw,bg,soft,nfsv3,intr,noconn,noauto,-r=32768,-w=32768  0  0


I can provide tcpdumps on server and client if helpful.


I would very much like to see tcpdumps (from either client or server).  
This problem is often caused by the fact that unless you use the '-p'  
flag, rpc.lockd isn't wired down to any particular port number. Since  
it is started at boot time, it will usually end up with the same one  
each time, but the new kernel implementation in 7-stable typically ends
up with a different port number to the old userland implementation.  
Quirks of the locking protocol make it difficult for the server to  
notice this without a lengthy timeout.


Workarounds include using '-p' to wire it down to a consistent port  
(port 4045 is reserved for this) or restarting rpc.lockd on the server;
a sketch of the rc.conf approach is below.
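
For example, something like this on the server (a sketch; the rc.conf
variable names are from memory and worth checking against rc.conf(5)):

/etc/rc.conf:
    rpc_lockd_enable="YES"
    rpc_lockd_flags="-p 4045"

then restart it with '/etc/rc.d/lockd restart'.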




Re: fwe -> fwip in GENERIC?

2005-10-18 Thread Doug Rabson


On 18 Oct 2005, at 13:21, Norikatsu Shigemura wrote:


On Mon, 17 Oct 2005 10:12:18 +0100
Doug Rabson <[EMAIL PROTECTED]> wrote:


The fwip implementation should be fully compatible with the RFC
standard. I'm happy for fwip to replace fwe in GENERIC unless anyone
else has an objection.



I disagree, because fwip and fwe can exist together.
So I think that fwip should be added to GENERIC.


Sure - both drivers are tiny and they don't step on each other's toes.
Longer term, I think we should try to phase out the fwe driver since
it doesn't interoperate with any other systems (except Df, I guess).





Re: fwe -> fwip in GENERIC?

2005-10-17 Thread Doug Rabson
The fwip implementation should be fully compatible with the RFC 
standard. I'm happy for fwip to replace fwe in GENERIC unless anyone 
else has an objection.

On Saturday 15 October 2005 03:35, Katsushi Kobayashi wrote:
> Hi,
>
> Although I don't know the details of the fwe technology, I understand
> the technology is a proprietary one. It would be better to provide
> compatibility with the RFC-standard firewire over IP, if there are
> volunteers.
>
> On 2005/10/15, at 9:58, Cai, Quanqing wrote:
> > Hi guys,
> >
> > When I was fixing bug kern/82727:
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/82727, I found we
> > use fwe (Ethernet over FireWire) in the GENERIC kernel, not fwip
> > (IP over FireWire). But we all know that IP over FireWire is more
> > widely used on other OSes, and now that this bug is fixed, do we
> > need to change fwe to fwip?
> >
> > I talked about it with Tai-hwa Liang and he agrees with me. But he
> > suggested I post here for more advice, since there might be some
> > considerations, such as backward compatibility or code size, that
> > made re@ make this decision.
> >
> > Please give your advice or opinion.
> >
> > Best
> > Cai, Quanqing


Re: new arp code snapshot for review...

2004-05-18 Thread Doug Rabson
On Tue, 2004-05-18 at 17:21, Harti Brandt wrote:
> On Tue, 18 May 2004, Luigi Rizzo wrote:
> 
> LR>On Tue, May 18, 2004 at 02:00:28PM +0100, Doug Rabson wrote:
> LR>> On Tue, 2004-05-18 at 09:48, Luigi Rizzo wrote:
> LR>> > I will try to remove as many assumptions as possible.
> LR>> > thanks for the feedback.
> LR>> 
> LR>> I think that in your prototype, the only assumption was in struct
> LR>> llentry. I would suggest defining it as something like:
> LR>
> LR>to be really flexible, both l3_addr and ll_addr should be
> LR>variable size (v4,v6,v8 over 802.x,firewire,appletalk,snail-mail),
> LR>then things rapidly become confusing and inefficient.
> LR>I would like to keep the ipv4 over ethernet case simple and quick, even
> LR>if this means replicating the code for the generic case (and this
> LR>is one of the reasons i have stalled a bit on this code -- i want
> LR>to make up my mind on what is a reasonable approaxch).
> 
> The most common use of that table is to have an l3_addr and search the 
> ll_addr, right? In that case making ll_addr variable shouldn't have a 
> measurable influence on speed. Variable l3_addr could be different though.

Well it seems to me that IPv6 neighbour discovery is different enough
from ARP that it makes sense to have IPv4-specialised ARP and
IPv6-specialised ND. The only other variable is the size of the LL
address and that doesn't add any significant complexity since it's just
moved around with bcopy.




Re: new arp code snapshot for review...

2004-05-18 Thread Doug Rabson
On Tue, 2004-05-18 at 09:48, Luigi Rizzo wrote:
> I will try to remove as many assumptions as possible.
> thanks for the feedback.

I think that in your prototype, the only assumption was in struct
llentry. I would suggest defining it as something like:

struct llentry {
        struct llentry  *lle_next;
        struct mbuf     *la_hold;
        uint16_t        flags;          /* see values in if_ether.h */
        uint8_t         la_preempt;
        uint8_t         la_asked;
        time_t          expire;
        struct in_addr  l3_addr;
        uint8_t         ll_addr[0];
};

Where the allocation of them uses something like 'malloc(sizeof(struct
llentry) + ifp->if_addrlen)', along the lines of the sketch below.
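
Spelled out (a sketch; M_LLTABLE is a hypothetical malloc(9) type and
error handling is omitted):

static struct llentry *
lle_alloc(struct ifnet *ifp)
{
        struct llentry *lle;

        /* One variable-sized entry, with room for this ifp's LL address. */
        lle = malloc(sizeof(*lle) + ifp->if_addrlen, M_LLTABLE,
            M_NOWAIT | M_ZERO);
        return (lle);
}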





Re: new arp code snapshot for review...

2004-05-16 Thread Doug Rabson
On Sunday 25 April 2004 17:49, Luigi Rizzo wrote:
> Here is a snapshot of the new arp code that i have been working on
> lately, based on Andre's ideas. (I say 'ARP' for brevity, what i
> mean is the layer3-to-layer2 address translation code -- arp, aarp,
> nd6 all fit in the category).

Sorry for the delay but I've only just had reason to look at the arp 
code since I've recently been working on an implementation of rfc2734 
IP over firewire. In your patch, you assume that the size of the 
link-level address is always six bytes. This assumption is not valid - 
from the looks of the existing arp code, people went to great lengths 
to avoid making this assumption throughout the networking code.

For IP over firewire, the link-level address is sixteen bytes. Other 
link types have various sizes. You must use ifp->if_addrlen in the 
generic code to cope with this correctly, as in the sketch below.
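
For instance, something along these lines in the generic path (a sketch;
'lle' and 'ha' are hypothetical names for the table entry and the
resolved hardware address):

/* Copy whatever size of link-level address the interface uses:
 * six bytes for ethernet, sixteen for firewire, and so on. */
bcopy(ha, lle->ll_addr, ifp->if_addrlen);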


Re: Will rfc2734 be supported?

2004-02-03 Thread Doug Rabson
On Mon, 2004-02-02 at 05:50, Hidetoshi Shimokawa wrote:
> At Sat, 31 Jan 2004 15:27:03 +0100,
> Dario Freni wrote:
> > 
> > Hi guys,
> > I was wondering if the standard implementation of IPoFW is planned to
> > be implemented. I'm no expert on driver writing; I was also looking for
> > some workarounds, like attaching the fwe0:lower netgraph hook to a
> > virtual interface, but reading the RFC I realized that the normal IP
> > packet needs encapsulation before it's sent on the wire.
> 
> I have no plan to implement rfc2734 myself in the near future.
> IEEE 1394 is somewhat complicated, compared with Ethernet.
> Because there are several types of packets, and fwe and IPoFW use very
> different packet types and formats, you don't have an easy
> workaround using netgraph.
> 
> If you are interested in implementing rfc2734, you need several steps.
> 
> - Implement rfc2734 encapsulation, as /sys/net/if_ethersubr.c does for
> ethernet. rfc2734 uses a very different packet format from ethernet.
> 
> - Implement a generic GASP receive routine in the firewire driver.
> You need this service for multicast/broadcast packets such as arp
> packets.
> 
> - Implement if_fw.c for the interface device.
> 
> Though I'm not sure it actually worked, the firewire driver for
> FreeBSD-4.0 seems to have support for IPoFW.
> See ftp://ftp.uec.ac.jp/pub/firewire/ for the patch.

I spent a little time recently thinking about what would be needed for
this and came to similar conclusions. The most interesting part is
implementing generic GASP receive. I think the nicest way of doing that
would be to implement a new network protocol for firewire, allowing
userland programs to do something like:

struct sockaddr_firewire a;
s = socket(PF_FIREWIRE, SOCK_DGRAM, 0);
a.sof_address = 0x12345000;
...;
bind(s, (struct sockaddr *)&a, sizeof a);
...;
len = recv(s, buf, sizeof buf, 0);

Internally, this probably means arranging for all asynchronous packets
to be DMA'd directly into mbufs and would probably change the firewire
code a great deal. Still, it might be worth it to gain a familiar
socket-based user api.





Re: finishing the if.h/if_var.h split

2003-09-30 Thread Doug Rabson
On Tue, 2003-09-30 at 09:22, Bruce Evans wrote:

> That's one alternative.  (Far too) many places already use the simple
> alternative of just using "struct device *".  Grep shows 68 lines
> containing "struct device" in *.h and 32 in *.c.  For "device_t", the
> numbers are 2140 in *.h and 5089 in *.c.  This is in a sys tree with
> about 1000 matches of "device_t" in generated files.  There are non-bogus
> uses of "struct device" to avoid namespace pollution in .
> Most other uses are just bogus (modulo the existence of device_t being
> non-bogus -- its opaqueness is negative since anything that wants to
> use it must include  and thus can see its internals.  style(9)
> says to not use negatively opaque typedefs).

The internals of struct device are not contained in <sys/bus.h> - it is
completely opaque to users outside subr_bus.c. The main 'bug' here is
the idea that it's a good thing to export kernel data structures (struct
ifnet) to userland. The layout of struct ifnet is an implementation
detail - it shouldn't form part of the userland API.

