Darren Reed wrote:
> On 27/07/10 04:38 AM, James Carlson wrote:
>> Darren Reed wrote:
>>> Finally, given that you feel so strongly about this and given that
>>> this is OpenSolaris, feel free to file a bug in bugzilla along with
>>> a new design and code that fixes this issue. Nothing speaks louder
>>> in an open source community than contributions of new working code.
>>
>> The only "new design" I would offer would be to back this errant fix out
>> of the gate. The code was better the way it was before, and the change
>> does not actually fix any real problem that anyone encountered. It's
>> gratuitous, and I'm surprised that a reviewer didn't object.
>>
>> Given that this is OpenSolaris, I was hoping that we could have a
>> meaningful conversation about this change on a mailing list that's
>> intended for that purpose. Obviously, that's not to be, because I'm
>> instead getting weird demands that I name other "multitasking operating
>> systems" that implement standardized protocols.
>
> Let me summarise it like this: there was a feature present that
> nobody used and if they did, they would need to reboot their
> system after they did so in order to restore proper operation.
> Is that the kind of feature we need/want in Solaris?

Given that we have in fact used this feature as part of SolarMAX stress testing, I very much doubt that this is a complete accounting of the issues.

The only thing that's really affected by this problem is the old BSD routing socket interface. In the case of adding a route, you can always use the interface name instead of the ifIndex (sockaddr_dl supports both), and that works fine. In the case of receiving a "surprising" truncated rtm_index value, you can do the smart thing and validate all potentially matching interface information you've got, as dhcpagent already does by doing a comparison under a 0xffff mask.

And the previous code had the nice property that the 32-bit index numbers (which _are_ widely used both in ON code and in other projects like Quagga) are "almost never" reused, because even a busy system is unlikely to get 2^32 plumb/unplumb sequences in a single OS instance, while 2^16 plumb/unplumb events are certainly not impossible, and can even be downright likely if you have some kind of dynamic process involved -- such as with tunnels or PPP.

And, to the best of my knowledge, there is no known application that was hindered in any particular way by the existing (albeit admittedly "bad") BSD compatibility interfaces. This means the fix wasn't really fixing any specific problem, but rather addressing some perceived issue.

And the roll-over event, which is now more likely, is toxic to SNMP, because it violates a constraint of the Interfaces MIB, so there's really no way of knowing how wide the blast radius might be.

Thus my argument is simple: there is no need to "fix" this old interface by breaking an existing interface. If there's some compelling need to address the deficiencies in the old BSD interfaces, I believe those should be addressed head-on rather than hobbling the underlying mechanisms to fit.
This would mean figuring out a good way to version the existing interface and addressing the numerous defects in this area -- including the notable lack of sockaddr.sa_len on Solaris and truncated (16-bit-only) rtm_flags. Forcing the underlying ifIndex numbers to stay in a 16-bit range is no real fix at all, because it addresses only a tiny part of the problem, and a part that (at least historically) has not really been all that interesting or limiting.

-- 
James Carlson         42.703N 71.076W         <[email protected]>
_______________________________________________
networking-discuss mailing list
[email protected]
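
As an aside, the comparison "under a 0xffff mask" mentioned above can be sketched in C roughly as follows. This is only an illustration of the idea: the function names are hypothetical and are not taken from dhcpagent or any other ON source. The only detail drawn from the discussion is that a 32-bit ifIndex is compared with a 16-bit rtm_index after masking with 0xffff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Truncate a 32-bit ifIndex the way a 16-bit rtm_index field would. */
static uint16_t
ifindex_truncate(uint32_t ifindex)
{
	return ((uint16_t)(ifindex & 0xffff));
}

/* True if a full 32-bit ifIndex is consistent with a truncated rtm_index. */
static bool
ifindex_matches(uint32_t ifindex, uint16_t rtm_index)
{
	return (ifindex_truncate(ifindex) == rtm_index);
}

/*
 * Count the known interfaces whose index matches a truncated rtm_index.
 * A result greater than 1 means the index alone is ambiguous, so the
 * caller has to validate other interface information as well.
 */
static size_t
ifindex_count_matches(const uint32_t *known, size_t n, uint16_t rtm_index)
{
	size_t i, hits = 0;

	assert(known != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (ifindex_matches(known[i], rtm_index))
			hits++;
	}
	return (hits);
}
```

With known indices 0x00002345 and 0x00012345, for example, a received rtm_index of 0x2345 matches both, which is why the truncated value has to be checked against whatever other interface information is available rather than trusted by itself.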
