Hello James,

sorry for the long graphomaniac post again. It seems that my opinions often differ from yours, and the foundations seem weak. Maybe one of us can bring the other around to a more correct point of view ;)

> In any event, I think we're way off the path.  It's recognized as a
> bug (so no more argument is really needed) and it's something that we
> know we want to work on.

I'm afraid we (I) can go on and on about this. Maybe we should continue if my ideas, opinions and examples are of any use. But it takes time from both of us, so I'd rather stop, too :)

Besides,

> You're always welcome and encouraged to contribute code instead of complaint.
> ;-}

So far I'm limited to ideas. But the idea of becoming a respected kernel hacker like thou is attractive; I'll give it some thought and time once I straighten out the ideas and come to terms ;)

>> However this matching interface is not necessarily used to push the packet
>> onto the network. This kinda makes sense in ipmp, lacp and other cases of
>> many-to-many interface-to-address relations, as well as for virtual
>> interfaces (i.e. a public IP hosted on the loopback); but this seems
>> flawed for one-to-one relations of an [aliased] interface and an IP address.
> I don't follow that part.

What I meant is: the current algorithm uses only the destination address when choosing a route, and doesn't use the source address to pick the gateway and the outgoing interface. Good or bad, it's a fact. It's not the point I'm arguing about.

This is okay in many scenarios (in 10 years I had never seen it be a problem), and it may even be desired in scenarios where there is not exactly one (zero, or more than one) physical interface bound to the source IP address.

It happens to be a problem in a case like mine, where not all default routes are created equal, and the choice of the gateway and of the physical interface depends on the source IP address (the problem is indeed valid for both a "mere host" and a "router" in the terms used here).

> it occurs in the same way that routing lookups occur -- by looking for the
> longest prefix match.
> ...local addresses configured on the interfaces are represented as /32 (host
> route) entries in the forwarding table.
> Thus, if they match at all, they're
> always the best match possible (longest prefix length).
> ... In fact, interface subnets are represented in the routing
> table with specially marked routing table entries...

Correct :-} If the packet's destination IP field is matched as being in some interface's connected subnet, the routing table lookup returns a "specially marked entry" and the packet goes out *from the correct interface* (I hope I'm not mistaken in the last part), packed into an L2 (ethernet) frame with a MAC address found by the ARP table lookup of the destination host's IP address.

Dammit, I even think this piece works correctly when the packet is sent to a previously chosen gateway for a remote destination.

I kind of expected "route add ... -ifp" to be the flag I need; however, it only seems to enforce sending of packets to a certain gateway through a certain interface (that's not the problem I had). Again, it does so with no relation to the source addresses.

    # netstat -rn | grep default
    Routing Table: IPv4
      Destination           Gateway           Flags  Ref     Use     Interface
    -------------------- -------------------- ----- ----- ---------- ---------
    default              194.67.186.65        UG        1        817 e1000g0
    default              81.5.113.1           UGS       1       1142 e1000g81000

> The missing part (and the part we've been belaboring here) is that
> Solaris doesn't make any intelligent use of multiple equivalent
> matches.  If there are equal matches on the destination address, we
> just pick one arbitrarily rather than using the source address to add
> some "flavor."
> That's arguably a bug, and is at the root of what you're seeing.

The choice among equivalent gateways (and thus output interfaces) is what needs to be enhanced, in my opinion. And from your posts I see that the need for some enhancement in this area is also something evident/accepted, and does not need to be argued about either :}

> (Of course, there are other scenarios that would be harder to deal with.
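
To make the tie-break idea concrete, here is a rough sketch in Python. It is not Solaris kernel code and does not claim to show how the stack works internally. Only the two default gateways and interface names come from the netstat output above; the interface netmasks, the sample source and destination addresses, and the names Route and pick_route are my assumptions for illustration. It does a longest-prefix match on the destination, and only among equally long matches does it prefer the route whose interface subnet contains the source address.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: ipaddress.IPv4Network  # destination prefix; 0.0.0.0/0 is "default"
    gateway: str
    ifname: str


# The two default routes taken from the netstat output.
ROUTES = [
    Route(ipaddress.ip_network("0.0.0.0/0"), "194.67.186.65", "e1000g0"),
    Route(ipaddress.ip_network("0.0.0.0/0"), "81.5.113.1", "e1000g81000"),
]

# ASSUMPTION: subnets configured on each interface (netmasks are made up).
IF_SUBNETS = {
    "e1000g0": ipaddress.ip_network("194.67.186.64/26"),
    "e1000g81000": ipaddress.ip_network("81.5.113.0/24"),
}


def pick_route(routes, if_subnets, src, dst):
    """Longest-prefix match on dst; break ties using the source address."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)

    # Step 1: destination-only lookup, as the current model does.
    matches = [r for r in routes if dst_ip in r.prefix]
    if not matches:
        return None
    best_len = max(r.prefix.prefixlen for r in matches)
    candidates = [r for r in matches if r.prefix.prefixlen == best_len]

    # Step 2 (the proposed "flavor"): among equivalent matches, prefer the
    # route whose outgoing interface has a subnet containing the source.
    for r in candidates:
        if src_ip in if_subnets[r.ifname]:
            return r

    # No interface matches the source: fall back to an arbitrary (first) pick.
    return candidates[0]


# Hypothetical source addresses, one per ISP subnet, same remote destination.
print(pick_route(ROUTES, IF_SUBNETS, "81.5.113.10", "198.51.100.7").gateway)
print(pick_route(ROUTES, IF_SUBNETS, "194.67.186.70", "198.51.100.7").gateway)
```

Note that a longer destination prefix still wins regardless of the source address, so this sketch only changes the choice among equivalent matches and leaves the "wrong interface" case from the quoted text untouched.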

Scenarios, variations and brainstorming for possible/best solutions are another interesting direction, though maybe it shouldn't be done in this very thread. But am I wrong to think that OpenSolaris "networking-discuss" is the wrong forum to toss and kick these ideas around? :)

> For instance, if a better route [longer prefix] matches and
> puts the packet on the "wrong" interface given the source address,
> what do you do?  I would argue that this is exactly what's supposed to
> happen based on the kernel forwarding table, and if you want a
> different result, then you need something like finer-grained routes,
> or even something exotic like source-based routing.)

It depends. This may be redirected traffic for transparent proxy situations. Or a route to some hidden NAT/IDS/LB system, where the admin wants the traffic to get there regardless of the "non-matching" interface address. For an L3 router there would be no "matching" interface anyway.

I think clever use of "route -ifp" can solve the issue of picking an interface for the gateway in a particular admin's situation. At least in the current routing model, where the destination alone determines the route selection ;)

> For what it's worth, I think of this in completely different terms.
> "Routers" are just hosts that happen to be configured to forward.

I don't think so. The "routers" are specifically configured for their job. More attention is paid to the networking configuration of "routers", and when they break, it's everybody's problem. Unlike "hosts", which are jumpstarted by the dozens or thousands, all with the same template, and possibly use a routing setup received via DHCP. It is inefficient to pay as much attention (time -> salary) to the configuration of each host in the legion (and because of this simplicity they should best work with default or easily assignable settings).

Configuration to forward is just one aspect of a router.
They may also have weirder setups of ipfilter (NAT, redirections, as well as plain firewalls), strange routing algorithms and routing software (in.routed, quagga/zebra, gated are just a few), they can implement VPN software endpoints, etc.

> The fact that this issue happens to affect this one host that you're
> working with makes it visible to you as a host-related problem, but I
> don't think that makes it less interesting for routers.
> The same issue is visible with L3 routers, because the very same
> decision process is going on internally -- if you have a choice of
> interfaces to send on, picking one with a subnet match for the source
> address of the packet would be a good idea, if such a choice is possible.

True, my complaint so far involved the "host" scenario, where gateway selection based on the packet's source address can be easily implemented by comparing this source address with the system's currently configured addresses, and picking the output interface (and gateway) as a result.

However, on a "router" one can arguably afford going through the trouble of devising, testing and maintaining more exotic configurations. Unlike a "host" referencing its configured (and UP'ed) interfaces, on a "router" you'd have to store subnet-specific configuration somewhere, at the very least.

Let me elaborate on an example, so that, hopefully, I can explain my idea of the real-life differences between a "mere host" and a "router".

As one pointer which I haven't yet thoroughly checked myself, the quagga/zebra routing software suite seems to support the syntax of Cisco PBR (policy-based routing). It is mentioned in the project's documentation, but I haven't used it under Solaris, so I don't know if this feature is actually implemented and backed by the OS kernel (we use it successfully on one of our Cisco 3750s, though with serious usage limitations imposed by the vendor in the firmware).
This feature allows matching IP ACLs against source addresses and taking some action as a result: setting a next-hop gateway, setting an output interface, modifying the QoS/ToS bits, or whatever in general (limited by firmware and/or hardware in practice).

On one hand, zebra.conf could be quite an acceptable place to store all these subnet-specific configurations to pick the "more correct" next-hop gateways and/or interfaces.

On the other hand, I'd hate to maintain thousands of copies of this file on each and every host, and change them all whenever some specific subnet is added. This should all be taken care of centrally, on one router (or a small manageable number of them in failover, etc).

> They're otherwise identical in terms of IP behavior, and that
> identical behavior is a good thing: it means that IP's algorithms
> work the same everywhere.  In a datagram network, special == bad.

When you put it this way, I find it hard to come up with counterarguments. But something remains "fishy". I hope my arguments above uncover some of that ;)

Nonetheless, I hold on to my opinion that, due to any number of causes not limited to poor planning and lack of unlimited resources, there happen to be special cases out there, even in networking. It is a heterogeneous world, for better or worse.

When this happens, these special cases are best processed at some centralized point (even at the price of non-default configuration and less common software stacks). Let's call this centralized weird-configurations point a "router" and contrast it with the standardized, rarely-changing, simple-configuration "hosts". And keep in mind that "hosts" can be unmodifiable by their nature (i.e. you just can't influence the TCP/IP stack of a smartphone, but you may still have to provide it with multi-ISP connectivity through your Solaris router and a wi-fi access point).

> At a base level, I just reject the distinction between "host" and
> "router."
> It's never been terribly useful, and has actually
> contributed a lot of harm to the Internet.  (For instance, ICMP
> redirects and router discovery, two completely horrible mechanisms
> that ape routing protocols poorly, are based on this false
> distinction.)

I wouldn't be so hard on them. I have working examples where ICMP redirects help keep the end-host configuration simple (one default router and the local IP+subnet are the only predefined entries), while several routers on the same subnet are actually used to connect to different special routes. That is, these connections don't pass "through" the default router all the time; it refers the sending hosts to the correct specific local routers.

Arguably, local RIPv2 (or OSPF, whatever) would be better, but:
1) not all routers (including wi-fi access points) can send/process RIP messages;
2) not all host devices' OSes have RIPv2 clients at all. None have it enabled by default. Enabling them wouldn't solve the problem for all hosts, and would be a large undertaking in itself.

Evidently, maintaining static routes on each host (if possible at all) is even less practical. It is not their job.

> The lines are much blurrier than I think you're suggesting.

While this discussion is limited to Solaris software, I'd like to point out that "routers" often differ from "hosts" not only in their software (tasks, tunings), but also in hardware, which is better suited for certain networking tasks (and may be less suited for general processing).

And when the difference is some quirk of a general-purpose CPU architecture (like longer processing queues and arguably inefficient processing of "quick short" interrupts), this becomes important for a Solaris-related discussion. One can dedicate a server better suited for networking to be a router, and use other servers optimized for heavy number-crunching to heat the planet.
And they would all run the same build of Sun Solaris - the favorite OS ;) And with their different roles they may still have to handle similar-looking tasks very differently.

//Jim
--
This message posted from opensolaris.org
_______________________________________________
networking-discuss mailing list
